Tutorial 1: Finding Antibodies for Gating a Specific Cell Population
====================================================================

In this tutorial, we demonstrate how to use the ImmunoPheno library to identify commercial antibodies that can be used to gate a specific cell population. Specifically, we perform a data-driven analysis to identify antibodies that can separate transitional stage B cells from other B cells. Transitional stage B cells are an intermediate population that occurs after the pro-B cell stage but before the mature B cell stage of B cell development. Due to their low abundance and transitory character, they are relatively difficult to isolate in cell sorting experiments.

Note that you may get slightly different results when running this tutorial, as the reference information in the ImmunoPhenoDB database is frequently updated.

In [1]:
# Choose the adequate plotly renderer for visualizing plotly graphs in your system
import plotly.io as pio
pio.renderers.default = 'notebook_connected'

In [23]:
import pandas as pd
import plotly.express as px
from immunopheno.connect import ImmunoPhenoDB_Connect

We first create an instance of `ImmunoPhenoDB_Connect` that will allow us to make queries to the ImmunoPheno database.

In [3]:
cxn = ImmunoPhenoDB_Connect("http://www.immunopheno.org")

Loading necessary files...
Connecting to database...
Connected to database.


We can see a summary of what is currently in the database using the command `db_stats()`:

In [4]:
cxn.db_stats()

Database Statistics
Number of experiments: 18
Number of tissues: 6
Number of cells: 115979
Number of antibodies: 646
Number of antibody targets: 294
Number of antibody clones: 390
Average number of experiments per antibody: 2.68


Let us now search for cell ontologies for which there is information in the database and whose label contains the phrase "B cell":

In [5]:
cxn.which_celltypes("B cell")

Unnamed: 0,idCL,label,idExperiment_used
0,CL:0000236,B cell,10
1,CL:0000787,memory B cell,"1, 2, 3, 4, 5, 6, 7, 8, 11, 12, 13, 14, 15, 16"
2,CL:0000788,naive B cell,"1, 2, 3, 4, 5, 6, 7, 8, 11, 12, 15, 16, 17, 18"
3,CL:0000816,immature B cell,"5, 17, 18"
4,CL:0000817,precursor B cell,"10, 12, 18"
5,CL:0000818,transitional stage B cell,"10, 12"
6,CL:0000826,pro-B cell,12
7,CL:0000955,pre-B-II cell,5
8,CL:0000970,unswitched memory B cell,"17, 18"
9,CL:0000972,class switched memory B cell,"17, 18"


We see that the [OBO Foundry Cell Ontology](https://www.ebi.ac.uk/ols4/ontologies/cl) ID for transitional stage B cells is `CL:0000818` and that the ImmunoPheno database currently contains information from 2 experiments for this population. We can find more information about these experiments using the command `find_experiments()`:

In [6]:
cxn.find_experiments(idCL=['CL:0000818'])

Unnamed: 0,idExperiment,nameExp,typeExp,pmid,doi,idBTO,tissue
0,10,An immunophenotype-coupled transcriptomic atla...,CITE,38514887,https://doi.org/10.1038/s41590-024-01782-4,BTO:0000141,bone marrow
1,12,Comprehensive Integration of Single-Cell Data,CITE,31178118,https://doi.org/10.1016/j.cell.2019.05.031,BTO:0000141,bone marrow
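
The queries in this tutorial take ontology identifiers as plain strings, so a mistyped ID (for example, one with a missing digit) is easy to overlook. The following is a minimal sketch of a format check, assuming that Cell Ontology and BRENDA tissue ontology IDs consist of a prefix, a colon, and exactly seven digits, as in the outputs above. `ID_PATTERNS` and `is_valid_id` are hypothetical helpers and are not part of the ImmunoPheno library.

```python
import re

# Assumed formats, inferred from the identifiers shown in this tutorial:
# a prefix, a colon, and exactly seven digits.
ID_PATTERNS = {
    "CL": re.compile(r"CL:\d{7}"),
    "BTO": re.compile(r"BTO:\d{7}"),
}


def is_valid_id(identifier: str, kind: str) -> bool:
    """Return True if `identifier` matches the assumed format for `kind`."""
    return ID_PATTERNS[kind].fullmatch(identifier) is not None


print(is_valid_id("CL:0000818", "CL"))    # transitional stage B cell
print(is_valid_id("BTO:0000141", "BTO"))  # bone marrow
print(is_valid_id("BTO:000141", "BTO"))   # one digit short
```

A check like this only validates the shape of the string; it does not confirm that the ID exists in the ontology or in the database.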


As expected, all datasets containing information about transitional stage B cells are from the bone marrow. 

The cell ontology ID of B cells is `CL:0000236`. This ontology contains transitional stage B cells (`CL:0000818`) as a descendant ontology:

In [7]:
cxn.plot_db_graph(root="CL:0000236")

In this graph, the nodes in red indicate cell ontologies for which the ImmunoPhenoDB database contains data explicitly annotated with those ontologies. Nodes in blue indicate cell ontologies whose annotations are derived from the explicit annotations in the ImmunoPhenoDB database.

We now look for antibodies that distinguish transitional B cells from other B cells present in the bone marrow:

In [8]:
result_df1, plot_dict1 = cxn.find_antibodies(id_CLs=["CL:0000818"], background_id_CLs=["CL:0000236"], idBTO=["BTO:0000141"])
result_df1

Unnamed: 0,target,coeff,stderr,p_val,q_val,CL:0000818,CL:0000236
AB_2750556,CD38,7.109,0.977,0.000000e+00,0.000000e+00,1.000000,0.440799
AB_2750381,CD102,1.106,0.208,1.007000e-07,2.135600e-06,0.354232,0.241144
AB_2734286,CD10,0.790,0.139,1.410000e-08,4.967000e-07,0.952978,0.684974
AB_2783249,CD63,0.781,0.141,3.260000e-08,8.639000e-07,0.227273,0.089002
AB_2734267,CD45RA,0.687,0.131,1.535000e-07,2.712700e-06,0.806886,0.915740
...,...,...,...,...,...,...,...
AB_2750357,CD197,-1.271,0.546,1.998851e-02,9.444172e-02,0.066667,0.103809
AB_2749971,CD11c,-1.280,0.703,6.844322e-02,2.418327e-01,0.000000,0.192551
AB_2750000,CD27,-2.912,0.732,6.983760e-05,8.225319e-04,0.000000,0.677479
AB_2749972,CD34,-3.383,0.800,2.359320e-05,3.126100e-04,0.066667,0.290628


In [9]:
result_df1[result_df1["CL:0000818"]>0.8]

Unnamed: 0,target,coeff,stderr,p_val,q_val,CL:0000818,CL:0000236
AB_2750556,CD38,7.109,0.977,0.0,0.0,1.0,0.440799
AB_2734286,CD10,0.79,0.139,1.41e-08,4.967e-07,0.952978,0.684974
AB_2734267,CD45RA,0.687,0.131,1.535e-07,2.7127e-06,0.806886,0.91574
AB_2832712,HLA-DR DP DQ,0.491,0.099,7.618e-07,1.15355e-05,0.998433,0.958333
AB_2734256,CD19,-0.622,0.282,0.0274945,0.1040863,1.0,0.986301
AB_2750001,HLA-DR,-1.189,0.336,0.0004054752,0.003907306,0.866667,0.622432


In [10]:
plot_dict1["CL:0000818"]

`find_antibodies()` runs a linear mixed effects model to identify antibody levels that differ between the two populations. Positive (negative) coefficients indicate antibodies upregulated (downregulated) in the cell populations specified in `id_CLs` and their descendant cell ontologies, relative to the populations specified in `background_id_CLs`. The optional `idBTO=["BTO:0000141"]` argument restricts the analysis to data from bone marrow, which corresponds to the [BRENDA tissue ontology](https://www.ebi.ac.uk/ols4/ontologies/bto) ID `BTO:0000141` (as can be seen from the output of `find_experiments()` above). If a tissue or list of tissues is not specified, all tissues in the ImmunoPheno database are considered. We observe that among the 106 antibodies that were tested, anti-CD38 `AB_2750556` and anti-CD10 `AB_2734286` are significantly upregulated in transitional stage B cells and detected in >95% of this population, compared to the general B-cell population in the bone marrow.
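
The selection described above can be sketched as an explicit filter. The rows below are copied from the filtered `find_antibodies()` output shown earlier. The three cutoffs and the `candidate_markers` helper are illustrative assumptions, not library defaults or library functions. The gap between the two detection fractions is included because a high detection fraction in the target population alone does not make an antibody discriminative.

```python
# Rows copied from the filtered `find_antibodies()` output shown above:
# (antibody ID, target, coeff, q_val, fraction in CL:0000818, fraction in CL:0000236)
rows = [
    ("AB_2750556", "CD38", 7.109, 0.0, 1.0, 0.440799),
    ("AB_2734286", "CD10", 0.79, 4.967e-07, 0.952978, 0.684974),
    ("AB_2734267", "CD45RA", 0.687, 2.7127e-06, 0.806886, 0.91574),
    ("AB_2832712", "HLA-DR DP DQ", 0.491, 1.15355e-05, 0.998433, 0.958333),
    ("AB_2734256", "CD19", -0.622, 0.1040863, 1.0, 0.986301),
    ("AB_2750001", "HLA-DR", -1.189, 0.003907306, 0.866667, 0.622432),
]

# Illustrative cutoffs (assumptions chosen for this sketch).
MAX_Q = 0.05                # significance threshold on the q-value
MIN_TARGET_FRACTION = 0.95  # detected in more than 95% of target cells
MIN_FRACTION_GAP = 0.2      # target fraction exceeds background by at least 0.2


def candidate_markers(rows):
    """Return (antibody ID, target) pairs that pass all cutoffs."""
    selected = []
    for ab_id, target, coeff, q_val, frac_target, frac_background in rows:
        upregulated = coeff > 0 and q_val < MAX_Q
        widely_detected = frac_target > MIN_TARGET_FRACTION
        discriminative = frac_target - frac_background >= MIN_FRACTION_GAP
        if upregulated and widely_detected and discriminative:
            selected.append((ab_id, target))
    return selected


print(candidate_markers(rows))
```

With these cutoffs, only the anti-CD38 and anti-CD10 antibodies remain; the HLA-DR DP DQ antibody is significant and widely detected but is excluded because it is detected almost as often in the background population.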

We can now look at which cell populations in the bone marrow are positive for these two antibodies:

In [11]:
result_dict, plot_dict_ct = cxn.find_celltypes(["AB_2750556", "AB_2734286"], idBTO=["BTO:0000141"])
plot_dict_ct["AB_2750556"]

In [12]:
plot_dict_ct["AB_2734286"]

`find_celltypes()` uses a linear mixed effects model to identify cell populations in which a given antibody or set of antibodies is upregulated or downregulated in comparison to all the other cell populations. We can look at the results of the test in the table returned by `find_celltypes()`:

In [13]:
result_dict["AB_2750556"][result_dict["AB_2750556"]["expressed"]>0.8]

Unnamed: 0,cellType,coeff,stderr,p_val,q_val,expressed
CL:0000980,plasmablast,12.555,1.253,1.2621520000000001e-23,3.0291640000000005e-23,1.0
CL:0001054,CD14-positive monocyte,6.667,0.127,0.0,0.0,0.993635
CL:0000818,transitional stage B cell,6.598,0.92,0.0,0.0,1.0
CL:0000817,precursor B cell,6.369,1.4,5.3709e-06,8.5935e-06,1.0
CL:0000557,granulocyte monocyte progenitor cell,5.488,0.346,1.111304e-56,5.334258e-56,1.0
CL:0000549,basophilic erythroblast,4.453,2.916,0.1268099,0.1521719,1.0
CL:0002032,hematopoietic oligopotent progenitor cell,4.384,1.909,0.02166646,0.03058794,1.0
CL:0000826,pro-B cell,4.141,1.909,0.030117,0.040156,0.857143
CL:0000623,natural killer cell,3.553,0.251,2.337678e-45,8.014897e-45,0.824645


In [14]:
result_dict["AB_2734286"][result_dict["AB_2734286"]["expressed"]>0.8]

Unnamed: 0,cellType,coeff,stderr,p_val,q_val,expressed
CL:0010001,stromal cell of bone marrow,9.564,0.122,0.0,0.0,0.968978
CL:0000818,transitional stage B cell,8.165,0.115,0.0,0.0,0.952978
CL:0000817,precursor B cell,7.513,0.106,0.0,0.0,0.886968
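
To see which populations pass the 0.8 cutoff for both antibodies, the two filtered tables can be intersected. This is a minimal sketch using the cell types copied from the two outputs above; the dictionary names are hypothetical.

```python
# Cell types with `expressed` > 0.8, copied from the two tables above.
positive_cd38 = {  # AB_2750556
    "CL:0000980": "plasmablast",
    "CL:0001054": "CD14-positive monocyte",
    "CL:0000818": "transitional stage B cell",
    "CL:0000817": "precursor B cell",
    "CL:0000557": "granulocyte monocyte progenitor cell",
    "CL:0000549": "basophilic erythroblast",
    "CL:0002032": "hematopoietic oligopotent progenitor cell",
    "CL:0000826": "pro-B cell",
    "CL:0000623": "natural killer cell",
}
positive_cd10 = {  # AB_2734286
    "CL:0010001": "stromal cell of bone marrow",
    "CL:0000818": "transitional stage B cell",
    "CL:0000817": "precursor B cell",
}

# Keep only the cell types present in both filtered tables.
shared = {
    cl_id: label
    for cl_id, label in positive_cd38.items()
    if cl_id in positive_cd10
}

for cl_id in sorted(shared):
    print(cl_id, shared[cl_id])
```

At this cutoff the intersection contains precursor B cells and transitional stage B cells. Pro-B cells pass the cutoff for `AB_2750556` only, so a population that is partly positive for `AB_2734286` would not appear here.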


We observe that among B cell populations in the bone marrow, pro-B cells and precursor B cells are also positive for `AB_2750556` and `AB_2734286`. We would therefore like to identify another antibody that we can combine with these antibodies to fully separate transitional B cells from other B cell populations in the bone marrow. We can achieve this by running `find_antibodies()` again, this time using pro-B cells and precursor B cells as background cell populations.

In [15]:
result_df1, plot_dict1 = cxn.find_antibodies(id_CLs=["CL:0000818"], background_id_CLs=["CL:0000826", "CL:0000817"], idBTO=["BTO:0000141"])
result_df1

Unnamed: 0,target,coeff,stderr,p_val,q_val,CL:0000818,CL:0000826,CL:0000817
AB_2734256,CD19,3.496,0.817,1.881360e-05,1.552120e-04,1.000000,0.000000,1.000000
AB_2750001,HLA-DR,3.292,1.164,4.679777e-03,2.316489e-02,0.866667,0.142857,0.769231
AB_2734267,CD45RA,2.345,0.232,4.725649e-24,2.339196e-22,0.806886,0.428571,0.588235
AB_2750381,CD102,1.877,0.225,7.333399e-17,0.000000e+00,0.354232,,0.162234
AB_2750347,CD79b,1.554,1.015,1.257598e-01,2.895400e-01,0.200000,0.142857,0.000000
...,...,...,...,...,...,...,...,...
AB_2800911,CD305,-0.940,0.230,4.528130e-05,3.448343e-04,0.683386,,0.764628
AB_2750000,CD27,-1.325,0.588,2.427334e-02,8.900226e-02,0.000000,0.142857,0.153846
AB_2734247,CD4,-2.238,0.962,2.003938e-02,7.630377e-02,0.066667,0.428571,0.230769
AB_2734366,CD127,-2.385,0.772,2.005962e-03,1.103279e-02,0.000000,0.714286,0.000000


From this analysis, we observe that the anti-CD34 antibody `AB_2749972` is detected in >85% of the precursor B cells and pro-B cells, but in only 7% of the transitional stage B cells. Plotting the distribution of normalized expression levels confirms this observation:

In [16]:
result_dict, plot_dict_ct = cxn.find_celltypes(["AB_2749972"], idBTO=["BTO:0000141"])
plot_dict_ct["AB_2749972"]
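
The detection fractions above suggest a gate that requires CD38 and CD10 and excludes CD34. The following is a minimal sketch of that gating logic, assuming each cell already has a positive or negative call for every marker. The `Cell` class, the `in_gate` function, and the three example cells are hypothetical and do not come from the database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Per-cell positivity calls for the three markers (hypothetical data)."""
    name: str
    cd38: bool
    cd10: bool
    cd34: bool


def in_gate(cell: Cell) -> bool:
    """CD38+ CD10+ CD34- gate suggested by the analysis above."""
    return cell.cd38 and cell.cd10 and not cell.cd34


# Made-up cells that illustrate the three cases of interest.
cells = [
    Cell("transitional-like", cd38=True, cd10=True, cd34=False),
    Cell("precursor-like", cd38=True, cd10=True, cd34=True),
    Cell("mature-like", cd38=False, cd10=False, cd34=False),
]

gated = [cell.name for cell in cells if in_gate(cell)]
print(gated)
```

In a real sorting experiment the positivity calls depend on thresholds chosen from the measured intensity distributions, which this sketch does not address.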

We therefore conclude that using a combination of the antibodies `AB_2750556`, `AB_2734286`, and `AB_2749972` is an effective strategy to isolate transitional B cells in the bone marrow. Let us now look for some more information about these antibodies:

In [17]:
cxn.which_antibodies("AB_2750556")

Unnamed: 0,idAntibody,abName,abTarget,clonality,citation,comments,cloneID,host,vendor,catalogNum,idExperiment_used
0,AB_2750556,TotalSeq(TM)-A0557 anti-mouse CD38,CD38,monoclonal,"(BioLegend Cat# 102733, RRID:AB_2750556)",Applications: PG,90,rat,BioLegend,102733,12


In [18]:
cxn.which_antibodies("AB_2734286")

Unnamed: 0,idAntibody,abName,abTarget,clonality,citation,comments,cloneID,host,vendor,catalogNum,idExperiment_used
0,AB_2734286,TotalSeq(TM)-A0062 anti-human CD10,CD10,monoclonal,"(BioLegend Cat# 312231, RRID:AB_2734286)",Applications: PG,HI10a,mouse,BioLegend,312231,"7, 10"


In [19]:
cxn.which_antibodies("AB_2749972")

Unnamed: 0,idAntibody,abName,abTarget,clonality,citation,comments,cloneID,host,vendor,catalogNum,idExperiment_used
0,AB_2749972,TotalSeq(TM)-A0054 anti-human CD34,CD34,monoclonal,"(BioLegend Cat# 343537, RRID:AB_2749972)",Applications: PG,581,mouse,BioLegend,343537,"7, 12"
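
The `citation` column packs the vendor, catalog number, and RRID into one string. When these fields are needed separately (for example, to build an ordering list), they can be extracted with a regular expression. This is a minimal sketch that assumes every citation follows the `(Vendor Cat# NUMBER, RRID:AB_NUMBER)` layout seen in the three outputs above; `parse_citation` is a hypothetical helper, not a library function.

```python
import re

# Assumed layout: "(<vendor> Cat# <catalog>, RRID:<AB_ identifier>)"
CITATION_PATTERN = re.compile(
    r"\((?P<vendor>.+?) Cat# (?P<catalog>[^,]+), RRID:(?P<rrid>AB_\d+)\)"
)


def parse_citation(citation: str) -> dict:
    """Split a citation string into vendor, catalog number, and RRID."""
    match = CITATION_PATTERN.fullmatch(citation.strip())
    if match is None:
        raise ValueError(f"Unexpected citation format: {citation!r}")
    return match.groupdict()


# Citation strings copied from the `which_antibodies()` outputs above.
citations = [
    "(BioLegend Cat# 102733, RRID:AB_2750556)",
    "(BioLegend Cat# 312231, RRID:AB_2734286)",
    "(BioLegend Cat# 343537, RRID:AB_2749972)",
]

for citation in citations:
    print(parse_citation(citation))
```

Raising an error on an unexpected layout, rather than returning partial fields, makes it obvious when a citation from another vendor does not fit the assumed pattern.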


Finally, we can find in which experiments of the ImmunoPhenoDB database these antibodies have been used:

In [20]:
cxn.find_experiments(ab=["AB_2750556"])

Unnamed: 0,idExperiment,nameExp,typeExp,pmid,doi,idBTO,tissue
0,12,Comprehensive Integration of Single-Cell Data,CITE,31178118,https://doi.org/10.1016/j.cell.2019.05.031,BTO:0000141,bone marrow


In [21]:
cxn.find_experiments(ab=["AB_2734286"])

Unnamed: 0,idExperiment,nameExp,typeExp,pmid,doi,idBTO,tissue
0,7,PBMC from influenza vaccination,CITE,32094927,https://doi.org/10.1038/s41591-020-0769-8,BTO:0001025,peripheral blood mononuclear cell
1,10,An immunophenotype-coupled transcriptomic atla...,CITE,38514887,https://doi.org/10.1038/s41590-024-01782-4,BTO:0000141,bone marrow


In [22]:
cxn.find_experiments(ab=["AB_2749972"])

Unnamed: 0,idExperiment,nameExp,typeExp,pmid,doi,idBTO,tissue
0,7,PBMC from influenza vaccination,CITE,32094927,https://doi.org/10.1038/s41591-020-0769-8,BTO:0001025,peripheral blood mononuclear cell
1,12,Comprehensive Integration of Single-Cell Data,CITE,31178118,https://doi.org/10.1016/j.cell.2019.05.031,BTO:0000141,bone marrow
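
The three outputs above can be combined to check where each antibody was profiled and whether any single experiment contains all of them. This is a minimal sketch using the experiment IDs and tissues copied from those outputs; the variable names are hypothetical.

```python
# Experiment IDs per antibody, copied from the `find_experiments()` outputs above.
experiments_by_antibody = {
    "AB_2750556": {12},
    "AB_2734286": {7, 10},
    "AB_2749972": {7, 12},
}

# Tissue of each experiment, copied from the same outputs.
tissue_by_experiment = {
    7: "peripheral blood mononuclear cell",
    10: "bone marrow",
    12: "bone marrow",
}

# Experiments in which all three antibodies were used together.
shared_experiments = set.intersection(*experiments_by_antibody.values())

# Bone marrow experiments available for each antibody.
bone_marrow_experiments = {
    antibody: sorted(
        exp for exp in experiments
        if tissue_by_experiment[exp] == "bone marrow"
    )
    for antibody, experiments in experiments_by_antibody.items()
}

print("Experiments containing all three antibodies:", sorted(shared_experiments))
print("Bone marrow experiments per antibody:", bone_marrow_experiments)
```

In the outputs shown here, each antibody has bone marrow data, but no single experiment contains all three, which suggests the combination has not been measured together in one of these datasets.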
