
Barcode recovery and clone relationship calculation


mana-W/virus_barcode


Barlin

If your data include UMIs, cell barcodes, and exogenous virus barcodes,
you can use Barlin to extract all of these tags and calculate the intergroup similarity.

Installation

Barlin runs only on typical Linux systems.

git clone https://github.com/mana-W/virus_barcode.git

Dependencies

starcode : https://github.com/gui11aume/starcode
umitools : https://github.com/CGATOxford/UMI-tools

R

# "parallel" ships with base R, so it does not need to be installed from CRAN
pcks <- c("stringr","stringdist","ggplot2","jaccard","reshape2","tidyr","pheatmap","ggalluvial")
install.packages(pcks)
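
Before running the pipeline it can help to confirm that the command-line dependencies are on your PATH. The following is a minimal sketch, not part of Barlin: `check_deps` is a hypothetical helper, and the executable names in the usage comment (`starcode`, `umi_tools`, `Rscript`) are assumptions about how the tools are installed on your system.

```shell
# Print "missing: <name>" for every command that is not found on PATH.
# Returns 0 if all commands were found, 1 otherwise.
check_deps() {
    status=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing: $cmd"
            status=1
        fi
    done
    return "$status"
}

# Usage (executable names are assumptions, adjust to your installation):
#   check_deps starcode umi_tools Rscript
```

The helper only tests that each command can be found; it does not check versions.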

Usage

Data preparation

  1. FASTQ files (with virus barcodes):
    R1.fastq.gz
    R2.fastq.gz
  2. FASTA file of the barcode template sequence:
    virus.fa
  3. Cell annotation (tsv):
    celltype.tsv
    Values in the 'Cluster' column must have the form: group_annotation
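
The 'Cluster' requirement above can be checked before running the pipeline. This is a sketch under stated assumptions rather than Barlin's own validation: it assumes celltype.tsv is tab-separated with a header row containing a column named exactly `Cluster`, and it reads "group_annotation" as a group name without underscores, an underscore, and a non-empty annotation (a hypothetical value would be `E10_neuron`). `check_cluster_column` is a hypothetical helper name.

```shell
# Print every 'Cluster' value that does not look like group_annotation.
# Exit status: 0 = all rows pass, 1 = at least one bad row,
#              2 = no column named 'Cluster' in the header.
check_cluster_column() {
    awk -F '\t' '
        NR == 1 {
            # Locate the Cluster column in the header row.
            for (i = 1; i <= NF; i++) if ($i == "Cluster") col = i
            if (!col) { print "no Cluster column"; status = 2; exit }
            next
        }
        # Assumed pattern: <group>_<annotation>, both parts non-empty.
        $col !~ /^[^_]+_.+$/ { print "line " NR ": " $col; status = 1 }
        END { exit status + 0 }
    ' "$1"
}

# Usage:
#   check_cluster_column celltype.tsv
```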


Running

Step1: Extract cell barcodes and UMIs, and prepare the input file for the next step.

sh CB_UMI.sh path/R1.fastq.gz path/R2.fastq.gz

Step2: Recover the virus barcodes of cells and calculate the relationship between each pair of clusters.

Rscript find_virusBC.R UMI_CB_umitools/CB_UMI.tsv virus.fa celltype.tsv 0.5

The output of this step is written to the directory res.
The most important file is res/clone_final.tsv, which contains the barcode information.

Step3 (optional): Calculate cell relationships spanning multiple groups and visualize the results.

Rscript similarity.R 0.5 group1/res/clone_final.tsv group2/res/clone_final.tsv 0.6

The output of this step is written to the directory spanres.
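
The output name spanres/all_jac.csv and the `jaccard` dependency suggest that the similarity is based on the Jaccard index, |A ∩ B| / |A ∪ B|. The sketch below only illustrates that index on two plain lists of barcodes. It is not the implementation in similarity.R, and the input format (one barcode per line) and the helper name `jaccard` are assumptions made for the example.

```shell
# Jaccard index of the barcode sets in two files (one barcode per line).
# Duplicate lines are counted once; prints "NA" if both files are empty.
jaccard() {
    awk '
        FNR == NR { a[$0] = 1; next }   # lines of the first file
        { b[$0] = 1 }                   # lines of the second file
        END {
            for (k in a) { union++; if (k in b) inter++ }
            for (k in b) if (!(k in a)) union++
            if (union == 0) print "NA"
            else printf "%.3f\n", inter / union
        }
    ' "$1" "$2"
}

# Usage (file names are hypothetical):
#   jaccard group1_barcodes.txt group2_barcodes.txt
```

A value of 1 means the two sets are identical, and 0 means they share no barcodes.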

After this step you can also create a Sankey plot with:

Rscript alluvial_plot.R group1_celltype.tsv group2_celltype.tsv spanres/all_jac.csv spanres/all_pvalue.csv