<H1>Example 1: Finding Antibodies for Gating a Specific Cell Population</H1>

In this tutorial we demonstrate how to use the ImmunoPheno library to identify commercial antibodies that can be used to gate a specific cell population. Specifically, we are going to look for antibodies that can be used to separate immature B cells from other B cells.

In [None]:
import pandas as pd
from immunopheno.data_processing import ImmunoPhenoData
from immunopheno.connect import ImmunoPhenoDB_Connect
from immunopheno.plots import plot_UMAP
from immunopheno.models import plot_all_fits

We first create an instance of `ImmunoPhenoDB_Connect` that will allow us to make queries to the ImmunoPheno database.

In [3]:
cxn = ImmunoPhenoDB_Connect("http://ec2-44-222-198-92.compute-1.amazonaws.com")

Loading necessary files...
Connecting to database...
Connected to database.


Let us now search the database for cell ontologies whose label contains "B cell":

In [4]:
cxn.which_celltypes("B cell")

Unnamed: 0,idCL,label,idExperiment_used
0,CL:0000236,B cell,5
1,CL:0000787,memory B cell,"1, 2, 3, 4, 5"
2,CL:0000788,naive B cell,"1, 2, 3, 4, 5"
3,CL:0000816,immature B cell,"1, 2, 3, 4, 5"
4,CL:0000955,pre-B-II cell,5


We see that the OBO Foundry Cell Ontology ID for immature B cells is `CL:0000816` and that the ImmunoPheno database currently contains data for this cell type from 5 CITE-seq experiments. We can find more information about these experiments using `find_experiments()`. For example:

In [7]:
cxn.find_experiments(idCL=['CL:0000816'])

Unnamed: 0,idExperiment,nameExp,typeExp,pmid,doi,idBTO,tissue
0,1,"Human PBMCs under 4 titration levels, 0.04x",CITE,36460735,https://doi.org/10.1038/s41598-022-24371-7,BTO:0001025,peripheral blood mononuclear cell
1,2,"Human PBMCs under 4 titration levels, 0.2x",CITE,36460735,https://doi.org/10.1038/s41598-022-24371-7,BTO:0001025,peripheral blood mononuclear cell
2,3,"Human PBMCs under 4 titration levels, 1x",CITE,36460735,https://doi.org/10.1038/s41598-022-24371-7,BTO:0001025,peripheral blood mononuclear cell
3,4,"Human PBMCs under 4 titration levels, 2x",CITE,36460735,https://doi.org/10.1038/s41598-022-24371-7,BTO:0001025,peripheral blood mononuclear cell
4,5,PBMC from influenza vaccination,CITE,32094927,https://doi.org/10.1038/s41591-020-0769-8,BTO:0001025,peripheral blood mononuclear cell


The Cell Ontology ID of B cells is `CL:0000236`. This ontology contains immature B cells as a descendant ontology.
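
To make the notion of a descendant ontology concrete, the sketch below walks a toy `is_a` hierarchy. Only the fact that `CL:0000816` descends from `CL:0000236` is taken from this tutorial. The `PARENTS` map, the direct edge between the two terms, and the `is_descendant` helper are illustrative assumptions; they are not part of ImmunoPheno and do not reproduce the real Cell Ontology structure, which may contain intermediate terms.

```python
# Toy child -> parents map (hypothetical and heavily simplified).
PARENTS = {
    "CL:0000816": ["CL:0000236"],  # immature B cell -> B cell (direct edge is an assumption)
    "CL:0000236": [],              # B cell is the root of this toy hierarchy
}


def is_descendant(term, ancestor, parents=PARENTS):
    """Return True if `ancestor` can be reached from `term` by following parent links."""
    stack = list(parents.get(term, []))
    seen = set()
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return False


print(is_descendant("CL:0000816", "CL:0000236"))  # immature B cell under B cell
print(is_descendant("CL:0000236", "CL:0000816"))  # the reverse does not hold
```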

We now look for antibodies that distinguish immature B cells from other B cells in PBMCs:

In [11]:
result_df1, plot_dict1 = cxn.find_antibodies(id_CLs=["CL:0000816"], background_id_CLs=["CL:0000236"], idBTO=["BTO:0001025"])
result_df1

Unnamed: 0,target,coeff,stderr,p_val,q_val,CL:0000816,CL:0000236
AB_2800813,CD24,1.637,0.861,0.057292,0.354170,0.954545,0.890756
AB_2800817,CD10,1.590,0.977,0.103631,0.532689,0.428571,0.338384
AB_2814295,CD303,1.517,1.039,0.144322,0.623109,0.586207,0.450000
AB_2810570,CD195,1.372,1.047,0.190070,0.654128,0.409091,0.260504
AB_2800745,CD25,1.273,1.034,0.218241,0.674797,0.409091,0.285714
...,...,...,...,...,...,...,...
AB_2800853,CD39,-1.060,0.730,0.146441,0.623109,0.863636,0.915966
AB_2800851,CD337,-1.118,0.846,0.185972,0.654128,0.172414,0.287500
AB_2810481,Integrin beta7,-1.122,0.809,0.165373,0.642591,0.500000,0.623431
AB_2800770,CD62L,-1.613,0.775,0.037346,0.282172,0.404762,0.564854


`find_antibodies()` runs a linear mixed effects model to identify antibody levels that differ between the two populations. Positive (negative) coefficients indicate antibodies upregulated (downregulated) in the cell populations specified in `id_CLs` and their descendant cell ontologies, relative to the background populations given in `background_id_CLs`. The optional `idBTO=["BTO:0001025"]` argument restricts the analysis to data from PBMCs, which correspond to the BRENDA Tissue Ontology ID `BTO:0001025` (as can be seen from the output of `find_experiments()` above). If no tissue or list of tissues is specified, all tissues in the ImmunoPheno database are considered. We observe that among the 136 antibodies that were tested, the anti-CD73 antibody `AB_2800916` is significantly downregulated in immature B cells, being detected in only 40% of immature B cells. Moreover, by plotting the measured protein levels, we observe that when it is expressed in immature B cells, it is expressed at lower levels than in the general population of B cells.
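
A table like the one returned by `find_antibodies()` can be explored further with ordinary pandas operations. The following is a minimal sketch that uses only the nine rows displayed above (the elided rows are left out). It assumes that the `CL:0000816` and `CL:0000236` columns hold the fraction of cells in which each antibody is detected; the index name `idAntibody` and the derived columns `direction` and `detected_diff` are names introduced here for illustration, not ImmunoPheno output.

```python
import io

import pandas as pd

# Rows copied from the find_antibodies() output shown above (elided rows omitted).
csv_text = """idAntibody,target,coeff,stderr,p_val,q_val,CL:0000816,CL:0000236
AB_2800813,CD24,1.637,0.861,0.057292,0.354170,0.954545,0.890756
AB_2800817,CD10,1.590,0.977,0.103631,0.532689,0.428571,0.338384
AB_2814295,CD303,1.517,1.039,0.144322,0.623109,0.586207,0.450000
AB_2810570,CD195,1.372,1.047,0.190070,0.654128,0.409091,0.260504
AB_2800745,CD25,1.273,1.034,0.218241,0.674797,0.409091,0.285714
AB_2800853,CD39,-1.060,0.730,0.146441,0.623109,0.863636,0.915966
AB_2800851,CD337,-1.118,0.846,0.185972,0.654128,0.172414,0.287500
AB_2810481,Integrin beta7,-1.122,0.809,0.165373,0.642591,0.500000,0.623431
AB_2800770,CD62L,-1.613,0.775,0.037346,0.282172,0.404762,0.564854
"""
table = pd.read_csv(io.StringIO(csv_text), index_col="idAntibody")

# The sign of the coefficient gives the direction relative to the background.
table["direction"] = table["coeff"].gt(0).map({True: "up", False: "down"})

# Assumption: the two CL columns are detected fractions, so their difference
# shows how much more (or less) often the antibody is seen in immature B cells.
table["detected_diff"] = table["CL:0000816"] - table["CL:0000236"]

# Rank the displayed antibodies by raw p-value, smallest first.
ranked = table.sort_values("p_val")
print(ranked[["target", "coeff", "q_val", "direction", "detected_diff"]])
```

In the nine displayed rows the sign of `detected_diff` agrees with the sign of `coeff`, and none of these rows has a `q_val` below 0.05.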

In [27]:
result_dict, plot_dict_ct = cxn.find_celltypes(["AB_2800916"], idBTO=["BTO:0001025"])
plot_dict_ct["AB_2800916"]

`find_celltypes()` uses a linear mixed effects model to identify cell populations in which a given antibody or set of antibodies is upregulated or downregulated in comparison to all other cell populations. We can look at the results of the test in the table returned by `find_celltypes()`:

In [28]:
result_dict["AB_2800916"]

Unnamed: 0,cellType,coeff,stderr,p_val,q_val,expressed
CL:0000788,naive B cell,7.03,0.372,1.13992e-79,1.1399199999999998e-78,0.84
CL:0000787,memory B cell,6.794,0.69,6.640001e-23,2.213334e-22,0.789474
CL:0000913,"effector memory CD8-positive, alpha-beta T cell",2.944,0.878,0.0007946236,0.001324373,0.541667
CL:0000816,immature B cell,1.851,0.728,0.01100597,0.01572282,0.4
CL:0000625,"CD8-positive, alpha-beta T cell",0.799,0.514,0.1200805,0.1334228,0.323944
CL:0000895,"naive thymus-derived CD4-positive, alpha-beta ...",0.713,0.155,3.9908e-06,7.9816e-06,0.291905
CL:0000905,"effector memory CD4-positive, alpha-beta T cell",0.279,0.303,0.3568406,0.3568406,0.266355
CL:0000576,monocyte,-0.967,0.554,0.08108739,0.1013592,0.163934
CL:0000904,"central memory CD4-positive, alpha-beta T cell",-1.008,0.166,1.2e-09,3e-09,0.170282
CL:0000623,natural killer cell,-2.549,0.196,8.886501000000001e-39,4.4432509999999995e-38,0.034545


From this table and the plot above, we see that in addition to B cells, approximately 54% of effector memory CD8 T cells are also positive for `AB_2800916`.
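
The `find_celltypes()` table can also be checked and filtered with pandas. The sketch below does two things with the ten rows shown above. First, it recomputes Benjamini-Hochberg adjusted p-values from the `p_val` column and compares them with `q_val`; the two agree for these rows, which suggests (but does not confirm) that this is how the q-values were obtained. Second, it lists the cell types with a positive coefficient and `q_val` below 0.05. The `benjamini_hochberg` helper, the index name `idCL`, and the 0.05 cutoff are assumptions made for this illustration and are not part of the ImmunoPheno API.

```python
import io

import numpy as np
import pandas as pd

# Rows copied from the find_celltypes() output shown above.
csv_text = '''idCL,cellType,coeff,stderr,p_val,q_val,expressed
CL:0000788,naive B cell,7.03,0.372,1.13992e-79,1.1399199999999998e-78,0.84
CL:0000787,memory B cell,6.794,0.69,6.640001e-23,2.213334e-22,0.789474
CL:0000913,"effector memory CD8-positive, alpha-beta T cell",2.944,0.878,0.0007946236,0.001324373,0.541667
CL:0000816,immature B cell,1.851,0.728,0.01100597,0.01572282,0.4
CL:0000625,"CD8-positive, alpha-beta T cell",0.799,0.514,0.1200805,0.1334228,0.323944
CL:0000895,"naive thymus-derived CD4-positive, alpha-beta ...",0.713,0.155,3.9908e-06,7.9816e-06,0.291905
CL:0000905,"effector memory CD4-positive, alpha-beta T cell",0.279,0.303,0.3568406,0.3568406,0.266355
CL:0000576,monocyte,-0.967,0.554,0.08108739,0.1013592,0.163934
CL:0000904,"central memory CD4-positive, alpha-beta T cell",-1.008,0.166,1.2e-09,3e-09,0.170282
CL:0000623,natural killer cell,-2.549,0.196,8.886501000000001e-39,4.4432509999999995e-38,0.034545
'''
table = pd.read_csv(io.StringIO(csv_text), index_col="idCL")


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, starting from the largest p-value.
    adjusted = np.minimum.accumulate(scaled[::-1])[::-1]
    q = np.empty(n)
    q[order] = np.minimum(adjusted, 1.0)
    return q


# Compare recomputed q-values with the q_val column (relative tolerance only).
recomputed = benjamini_hochberg(table["p_val"])
matches = np.allclose(recomputed, table["q_val"], rtol=1e-5, atol=0.0)
print("q_val consistent with Benjamini-Hochberg:", matches)

# Cell types where the antibody is upregulated at the assumed 0.05 cutoff.
upregulated = table[(table["q_val"] < 0.05) & (table["coeff"] > 0)]
print(upregulated[["cellType", "coeff", "q_val", "expressed"]])
```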

Let us now look for more information about `AB_2800916`:

In [31]:
cxn.which_antibodies("AB_2800916")

Unnamed: 0,idAntibody,abName,abTarget,clonality,citation,comments,cloneID,host,vendor,catalogNum,idExperiment_used
0,AB_2800916,TotalSeq(TM)-C0577 anti-human CD73 (Ecto-5'-nu...,CD73,monoclonal,"(BioLegend Cat# 344031, RRID:AB_2800916)",Applications: PG,AD2,mouse,BioLegend,344031,"2, 3, 4"


We see that it has been used in 3 experiments in the ImmunoPheno database:

In [24]:
cxn.find_experiments(ab=["AB_2800916"])

Unnamed: 0,idExperiment,nameExp,typeExp,pmid,doi,idBTO,tissue
0,2,"Human PBMCs under 4 titration levels, 0.2x",CITE,36460735,https://doi.org/10.1038/s41598-022-24371-7,BTO:0001025,peripheral blood mononuclear cell
1,3,"Human PBMCs under 4 titration levels, 1x",CITE,36460735,https://doi.org/10.1038/s41598-022-24371-7,BTO:0001025,peripheral blood mononuclear cell
2,4,"Human PBMCs under 4 titration levels, 2x",CITE,36460735,https://doi.org/10.1038/s41598-022-24371-7,BTO:0001025,peripheral blood mononuclear cell


A different approach to finding antibodies that separate immature B cells from other B cells is to compare immature B cells and B cells, each against all other cell populations in PBMCs, and then look for differences between the two sets of results.

In [25]:
result_df1, plot_dict1 = cxn.find_antibodies(id_CLs=["CL:0000816"])
plot_dict1["CL:0000816"]

In [26]:
result_df2, plot_dict2 = cxn.find_antibodies(id_CLs=["CL:0000236"])
plot_dict2["CL:0000236"]

We can now compare the coefficients that resulted from these two sets of comparisons to identify antibodies that differ between immature B cells and the general B cell population:

In [29]:
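import plotly.express as px

# Antibodies that appear in both result tables
c = list(set(result_df2.index) & set(result_df1.index))
# One row per shared antibody, with its coefficient from each comparison
df = pd.DataFrame({"Immature B cell": result_df1["coeff"].loc[c], "B cell": result_df2["coeff"].loc[c], "name": c}, index=c)
# Points far from the diagonal are antibodies whose coefficients differ most
fig = px.scatter(df, x="Immature B cell", y="B cell", hover_data=["name"])
fig.layout.height = 500
fig.show()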

Hovering the mouse over the plot, we again find `AB_2800916` as the main difference between the B cell and immature B cell populations, consistent with our previous results.
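
Instead of reading the scatter plot by eye, the same comparison can be ranked numerically. This is a minimal sketch, assuming that the gap between the two coefficients is a reasonable measure of how much an antibody differs between the two populations. The antibody labels and coefficient values are hypothetical placeholders standing in for `result_df1["coeff"]` and `result_df2["coeff"]`; they are not ImmunoPheno results.

```python
import pandas as pd

# Hypothetical coefficients standing in for result_df1["coeff"] and result_df2["coeff"].
immature = pd.Series(
    {"ab_1": 2.1, "ab_2": 0.3, "ab_3": 1.0, "ab_4": -0.5}, name="Immature B cell"
)
b_cell = pd.Series(
    {"ab_1": 2.0, "ab_2": 3.5, "ab_3": 1.2, "ab_5": 0.7}, name="B cell"
)

# The inner join keeps only antibodies present in both result tables.
df = pd.concat([immature, b_cell], axis=1, join="inner")

# Signed gap between the two coefficients; large absolute values lie far
# from the diagonal of the scatter plot.
df["difference"] = df["Immature B cell"] - df["B cell"]

# Order antibodies by the size of the gap, largest first.
order = df["difference"].abs().sort_values(ascending=False).index
ranked = df.reindex(order)
print(ranked)
```

With real results, the top rows of `ranked` would be the candidates to inspect first, keeping in mind that this ranking ignores the standard errors and q-values of the individual coefficients.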