# Reconstruction and analysis of B-cell lineage trees from single cell data using Immcantation


![](assets/dowser-tutorial-cover.png)



Human B cells play a fundamental role in the adaptive immune response to infection and vaccination, as well as in the pathology of allergies and many autoimmune diseases. Central to all of these processes is the fact that B cells form an evolutionary system: they undergo rapid somatic hypermutation and antigen-driven selection as part of the adaptive immune response. The similarities between this B cell response and evolution by natural selection have made phylogenetic methods a powerful means of characterizing important processes, such as immunological memory formation. Recent methodological work has produced phylogenetic methods that adjust for the unique features of B cell evolution. Further, advances in single cell sequencing now provide information at unprecedented resolution, including linked heavy and light chain data and the associated transcriptional states of individual B cells. In this tutorial, we show how single cell information can be integrated into B cell phylogenetic analysis using the Immcantation suite (http://immcantation.org).

**This tutorial covers:**

Beginning with processed single cell RNA-seq (scRNA-seq) + BCR data from 10x Genomics, we will show:

- how cell type annotations can be associated with BCR sequences,
- how clonal clusters can be identified, and 
- how B cell phylogenetic trees can be built and visualized using these data sources.

## Resources

- You can email [immcantation@googlegroups.com](mailto:immcantation@googlegroups.com) with any questions or issues.
- Documentation: http://immcantation.org
- Source code and bug reports: https://bitbucket.org/kleinstein/immcantation
- Docker/Singularity container for this lab: https://hub.docker.com/r/immcantation/lab

## How to use the notebook

Jupyter Notebook documentation: https://jupyter-notebook.readthedocs.io/en/stable/

**Ctrl+Enter** will run the code in the selected cell and **Shift+Enter** will run the code and move to the following cell.

## Inside this container

This container comes with software and example data that are ready to use. The commands `versions report` and `builds report` show, respectively, the versions of the installed tools and the date and changesets used to build the image.

### Software versions
Use this command to list the software versions.

In [None]:
%%bash
versions report

### Build versions
Use this command to list the date and changesets used during the image build.

In [None]:
%%bash
builds report

### Example data used in the tutorial

- `../data/BCR.data.tsv`: B cell receptor (BCR) data. A tab-separated file in Adaptive Immune Receptor Repertoire (AIRR) format containing BCR sequences already aligned to IMGT V, D, and J genes. To learn more, visit https://immcantation.readthedocs.io/en/stable/tutorials/tutorials.html


- `../data/GEX.data.rds`: Gene expression (GEX) data. This file contains a Seurat object with RNA-seq data that have already been processed and annotated. For an example of this kind of processing, see https://satijalab.org/seurat/articles/pbmc3k_tutorial.html
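
Before loading the BCR file into R, it can help to check what it contains. The following is a minimal shell sketch, not part of the original tutorial, that assumes the file is tab-delimited with a single header line, as the AIRR description above suggests. The helper names `airr_columns` and `airr_row_count` are hypothetical. In the notebook, the block would run inside a `%%bash` cell.

```bash
# Hypothetical helpers for a quick look at a tab-delimited AIRR file.
# Assumption: the first line is a header and every later line is one rearrangement.

# Print the column names, one per line.
airr_columns() {
    head -n 1 "$1" | tr '\t' '\n'
}

# Count the data rows (every line after the header).
airr_row_count() {
    awk 'NR > 1 { n++ } END { print n + 0 }' "$1"
}

bcr_file="../data/BCR.data.tsv"   # path as given in the list above
if [ -f "$bcr_file" ]; then
    airr_columns "$bcr_file"
    echo "rearrangements: $(airr_row_count "$bcr_file")"
else
    echo "not found: $bcr_file" >&2
fi
```

The column listing is a quick way to confirm which fields are available before combining the BCR data with the gene expression data.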


## Outline of tutorial

1. B cell phylogenetics background.

1. Combining gene expression and BCR sequences.

1. Identifying clonal clusters and reconstructing germlines.

1. Building and visualizing trees.

1. Analyzing trees and detecting ongoing evolution.




## B cells underlie both immune function and pathology

![](assets/dowser-tutorial/dowser-tutorial-bcells.png)

## BCRs are first produced by random recombination


## Each B cell has a single type of receptor


## B cell affinity maturation



## Adaptive Immune Receptor Repertoire (AIRR) sequencing



## B cell phylogenetic inference


## Trees link sources of B cell diversity


## Read in data to R session


## What’s in the box?



## Add BCR data to Seurat object



## Add GEX data to BCR object



## Identifying clonal clusters




## Picking a threshold using shazam




## Performing clustering using scoper




## Reconstruct germlines using dowser




## Formatting clones with dowser



## Constructing trees



## Tree building with dowser




## Plotting trees with dowser and ggtree




## More elaborate tree plots




## Reconstruct intermediate sequences




## Are lineages measurably evolving?




## Detecting measurable evolution




## Correlation tests with dowser




# References

## B cell phylogenetics

Hoehn, K. B. et al. (2016) The diversity and molecular evolution of B-cell receptors during infection. Molecular Biology and Evolution. https://doi.org/10.1093/molbev/msw015

Hoehn, K. B. et al. (2019) Repertoire-wide phylogenetic models of B cell molecular evolution reveal evolutionary signatures of aging and vaccination. PNAS 201906020.

Hoehn, K. B. et al. (2020) Phylogenetic analysis of migration, differentiation, and class switching in B cells. bioRxiv. https://doi.org/10.1101/2020.05.30.124446

Hoehn, K. B. et al. (2021) Human B cell lineages engaged by germinal centers following influenza vaccination are measurably evolving. bioRxiv. https://doi.org/10.1101/2021.01.06.425648


## BCR analysis

Gupta, N. T. et al. (2017) Hierarchical clustering can identify B cell clones with high confidence in Ig repertoire sequencing data. The Journal of Immunology, 1601850.

Gupta, N. T. et al. (2015) Change-O: A toolkit for analyzing large-scale B cell immunoglobulin repertoire sequencing data. Bioinformatics, 31, 3356–3358.

Nouri, N. and Kleinstein, S. H. (2018a) A spectral clustering-based method for identifying clones from high-throughput B cell repertoire sequencing data. Bioinformatics, 34, i341–i349.

Nouri, N. and Kleinstein, S. H. (2018b) Optimized threshold inference for partitioning of clones from high-throughput B cell repertoire sequencing data. Frontiers in Immunology, 9.

Stern, J. N. et al. (2014) B cells populating the multiple sclerosis brain mature in the draining cervical lymph nodes. Science Translational Medicine, 6, 248ra107.

Vander Heiden, J. A. et al. (2017) Dysregulation of B cell repertoire formation in myasthenia gravis patients revealed through deep sequencing. The Journal of Immunology, 1601415.

Yaari, G. et al. (2012) Quantifying selection in high-throughput immunoglobulin sequencing data sets. Nucleic Acids Research, 40, e134.

Yaari, G. et al. (2013) Models of somatic hypermutation targeting and substitution based on synonymous mutations from high-throughput immunoglobulin sequencing data. Frontiers in Immunology, 4, 358.
