
scRNA-seq utilities for deconvoluting transplant samples




scTx is a method developed to simulate transplant scRNA-seq samples and to identify donor and recipient cells in transplant scRNA-seq samples. At the moment, scTx relies on the annotated SNV output from scSNV. It also relies on the index file generated by scSNV for annotation data.

## Demultiplexing transplant samples

#Build the python module for demultiplexing

git clone --recurse-submodules
cd sctx
pip install .

The only required input is the annotated pileup file produced by scSNV. The example folder contains an example file and a notebook that demultiplexes a kidney transplant sample.

Simulating transplant samples with ambient RNA and doublets

The code for mixing is now part of scSNV to simplify compiling. The instructions below assume the scsnv binary is on your PATH.

scTx requires the HDF5 C and C++ Libraries to compile the simulation framework.

Simulating Transplant Samples

To simulate a mixture of two scRNA-seq samples we require three files.

  1. A tab separated barcode mapping file listing which singlets and doublets are to be included in the pileup output.

The first column, name, is the barcode that will be assigned to the cell. The second column, barcodes, lists the file indices and barcodes used to generate the cell. The file index follows the order of the BAM files given on the command line. In the example table below, the scRNA-seq lung sample would be the first BAM file and the PBMC sample would be the second. Doublets can be generated by specifying two barcodes instead of one.

name    barcodes
  2. The two collapsed BAM files from scSNV to mix. For the table above, we would need the lung and PBMC BAM files, specified in that order.

The mixture command can be run as follows:

scsnv mixture -a 0 -i scsnv_index_path_prefix -r genome.fa -o mixture/pileup -m barcode_map.txt -t 4 lung_collapsed.bam pbmc_collapsed.bam
Important Arguments:

| Option | Argument | Function | Required |
|---|---|---|---|
| -a, --ambient | float | Ambient RNA contamination level to simulate; for example, 0.10 for 10%, default 0.0 | No |
| -i, --txidx | path | scSNV transcriptome index | Yes |
| -r, --reference | path | Genome reference fasta | Yes |
| -o, --out | path | Output prefix | Yes |
| -m, --mixture | path | Barcode mapping file | Yes |
| -l, --library | str | Library type (see below) | |
| -h, --help | None | Print other command line options | No |

There are also some optional arguments that control SNV filtering; they can be viewed with the scsnv mixture -h command. The default values were used for all of the manuscript work.

Library types:

10X V2 3-prime:   -l V2
10X V3 3-prime:   -l V3
10X V2/V1 5-prime:   -l V2_5P
10X V3 5-prime:   -l V3_5P
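
As a convenience, the options above can be assembled programmatically before the command is launched. The following is a minimal Python sketch, not part of scTx or scSNV: the function name `build_mixture_command` and its parameters are hypothetical, it uses only the flags listed in this README, and it assumes that `-t` (which appears in the example command but not in the table) sets the thread count.

```python
import shlex

# Library type values listed in this README.
LIBRARY_TYPES = ("V2", "V3", "V2_5P", "V3_5P")


def build_mixture_command(index_prefix, reference, out_prefix, barcode_map,
                          bams, ambient=0.0, library=None, threads=None):
    """Return the argument list for an `scsnv mixture` run (hypothetical helper)."""
    # The ambient level is a fraction, for example 0.10 for 10%.
    if not 0.0 <= ambient <= 1.0:
        raise ValueError("ambient must be a fraction between 0 and 1")
    if library is not None and library not in LIBRARY_TYPES:
        raise ValueError(f"unknown library type: {library}")
    # The README describes mixing two samples, so require at least two BAM files.
    if len(bams) < 2:
        raise ValueError("at least two collapsed BAM files are needed")

    cmd = ["scsnv", "mixture",
           "-a", str(ambient),
           "-i", index_prefix,
           "-r", reference,
           "-o", out_prefix,
           "-m", barcode_map]
    if library is not None:
        cmd += ["-l", library]
    if threads is not None:
        # Assumption: -t is the thread count, as suggested by the example command.
        cmd += ["-t", str(threads)]
    # BAM order matters: it defines the file indices used in the barcode map.
    cmd += list(bams)
    return cmd


if __name__ == "__main__":
    command = build_mixture_command(
        "scsnv_index_path_prefix", "genome.fa", "mixture/pileup",
        "barcode_map.txt", ["lung_collapsed.bam", "pbmc_collapsed.bam"],
        ambient=0.1, library="V3", threads=4)
    print(shlex.join(command))
```

The returned list can be passed to `subprocess.run(command, check=True)`. Keeping the BAM files as an ordered list makes the link between file order and the file indices in the barcode mapping file explicit.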
Output files:

| File | Contents |
|---|---|
| mixture/pileup_barcode_matrices.h5 | Pileup data matrices (same as scSNV pileup) |
| mixture/pileup.txt.gz | Summary data for each SNV that passed the filtering criteria (same as scSNV pileup) |
| mixture/pileup_mix_counts.txt | The number of molecules used from each of the BAM files |
| mixture/pileup_barcodes.txt.gz | Barcode-specific molecule counts (i.e. those lost and gained as ambient RNA molecules) and the total molecules from each genotype |
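
Before moving on to annotation, it can be useful to confirm that a run produced every file listed above. The sketch below is a hypothetical helper, not part of scTx: it derives the four expected paths from the output prefix using the suffixes shown in this README and reports any that are missing. It makes no assumption about the contents of the files.

```python
from pathlib import Path

# Suffixes appended to the -o prefix, taken from the output file list in this README.
OUTPUT_SUFFIXES = (
    "_barcode_matrices.h5",  # pileup data matrices
    ".txt.gz",               # per-SNV summary data
    "_mix_counts.txt",       # molecules used from each BAM file
    "_barcodes.txt.gz",      # barcode-specific molecule counts
)


def expected_outputs(out_prefix):
    """Return the output paths expected for a given -o prefix."""
    return [Path(str(out_prefix) + suffix) for suffix in OUTPUT_SUFFIXES]


def missing_outputs(out_prefix):
    """Return the expected output paths that do not exist on disk."""
    return [path for path in expected_outputs(out_prefix) if not path.is_file()]


if __name__ == "__main__":
    missing = missing_outputs("mixture/pileup")
    if missing:
        print("Missing output files:")
        for path in missing:
            print(f"  {path}")
    else:
        print("All expected output files are present.")
```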

The pileup output files can then be annotated using the scSNV annotate command to identify potential RNA edits etc.

The annotated pileup output can be converted to files suitable for Vireo and Souporcell using the sctxmisc script included with the sctx Python package:

sctxmisc snv2vcfmtx -r ref_lengths.txt -f genome.fa -o mixture/vireo -e -m mixture/pileup_annotated.h5

This will write the output files necessary for vireo and souporcell.

The -e option removes edits. There are some additional filtering options that can be viewed with sctxmisc snv2vcfmtx -h.

The ref_lengths file is a tab-delimited file with the reference name, length, and comment:

1       248956422       dna:chromosome chromosome:GRCh38:1:1:248956422:1 REF
10      133797422       dna:chromosome chromosome:GRCh38:10:1:133797422:1 REF
11      135086622       dna:chromosome chromosome:GRCh38:11:1:135086622:1 REF
12      133275309       dna:chromosome chromosome:GRCh38:12:1:133275309:1 REF
13      114364328       dna:chromosome chromosome:GRCh38:13:1:114364328:1 REF
14      107043718       dna:chromosome chromosome:GRCh38:14:1:107043718:1 REF
15      101991189       dna:chromosome chromosome:GRCh38:15:1:101991189:1 REF
16      90338345        dna:chromosome chromosome:GRCh38:16:1:90338345:1 REF
17      83257441        dna:chromosome chromosome:GRCh38:17:1:83257441:1 REF
18      80373285        dna:chromosome chromosome:GRCh38:18:1:80373285:1 REF
19      58617616        dna:chromosome chromosome:GRCh38:19:1:58617616:1 REF
2       242193529       dna:chromosome chromosome:GRCh38:2:1:242193529:1 REF

This file is automatically generated in the scSNV index folder with the suffix _lengths.txt.
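
To inspect or sanity-check a lengths file, it can be read into a dictionary. This is a minimal Python sketch assuming only the three-column, tab-delimited layout described above (reference name, length, comment); the function name `read_ref_lengths` is hypothetical and not part of scTx or scSNV.

```python
def read_ref_lengths(path):
    """Read a tab-delimited lengths file into {name: (length, comment)}."""
    lengths = {}
    with open(path) as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.rstrip("\n")
            if not line:
                continue  # skip blank lines
            # Split into at most three fields so the comment keeps any spaces.
            fields = line.split("\t", 2)
            if len(fields) < 2:
                raise ValueError(
                    f"line {line_number}: expected at least name and length")
            name = fields[0]
            length = int(fields[1])
            comment = fields[2] if len(fields) > 2 else ""
            lengths[name] = (length, comment)
    return lengths


if __name__ == "__main__":
    for name, (length, comment) in read_ref_lengths("ref_lengths.txt").items():
        print(name, length, comment)
```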

