This repo contains data and Python code associated with the Rytlewski et al. manuscript on the beta-binomial model for differential abundance analysis of T cell clones.
Immunosequencing data for the healthy donors is stored in this repo in the healthy_donor_files folder.
Immunosequencing data for the urothelial cancer patients can be freely downloaded from immuneACCESS: https://clients.adaptivebiotech.com/pub/snyder-2017-plosmedicine
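Compatible with Python 2.7. The following third-party packages must be installed: scipy 0.17.1, matplotlib 1.5.1 (> 2.0 recommended), numpy 1.14.2, pandas 0.22.0. The scripts also use the standard-library modules sys, os, math, copy (for deepcopy), gzip, optparse, traceback, and multiprocessing, which ship with Python and need no separate installation.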
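Before running the scripts, it can help to confirm which versions of the third-party packages are actually installed. The following is a minimal sketch and not part of the repo: the helper name `installed_versions` is hypothetical, and the version strings are copied from the list above.

```python
from __future__ import print_function
import importlib

# Versions listed in this README (matplotlib > 2.0 is recommended).
README_VERSIONS = {
    "scipy": "0.17.1",
    "matplotlib": "1.5.1",
    "numpy": "1.14.2",
    "pandas": "0.22.0",
}


def installed_versions(packages):
    """Return {package: version string, or None if it cannot be imported}."""
    found = {}
    for name in packages:
        try:
            module = importlib.import_module(name)
        except Exception:
            # Any import failure means the package is not usable here.
            found[name] = None
        else:
            found[name] = getattr(module, "__version__", "unknown")
    return found


if __name__ == "__main__":
    versions = installed_versions(sorted(README_VERSIONS))
    for name in sorted(versions):
        print("%-12s installed: %-10s README: %s"
              % (name, versions[name], README_VERSIONS[name]))
```

A value of None in the "installed" column means the package could not be imported in the current environment.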
The following example commands can be run on the data in the healthy_donor_files folder. To reproduce the results, do not change the configuration.ini settings unless specified.
Example file for --batchfile (three tab-separated columns):
subject_id    TSV 1            TSV 2
subjectA      fileA1.tsv.gz    fileA2.tsv.gz
subjectB      fileB1.tsv.gz    fileB2.tsv.gz
subjectC      fileC1.tsv.gz    fileC2.tsv.gz
Columns must be labeled exactly as shown in the example above. Include ".gz" in a file name only when that file is gzipped; the script accepts both .tsv and .tsv.gz immunoSEQ files. Batch files that specify the sample pairs for the healthy donors and cancer patients analyzed in this manuscript are provided in the repo.
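For a new set of sample pairs, a batch file can be generated rather than typed by hand. The sketch below is illustrative only and not part of the repo: it assumes the batch file is tab-separated (as the .tsv extension of the provided batch files suggests), and the function name `format_batchfile`, the sample pairs, and the output file name are all hypothetical.

```python
from __future__ import print_function

# Column labels must match the example above exactly.
HEADER = ("subject_id", "TSV 1", "TSV 2")


def format_batchfile(pairs):
    """Build batch-file text from (subject_id, first_tsv, second_tsv) tuples."""
    lines = ["\t".join(HEADER)]
    for subject_id, first_tsv, second_tsv in pairs:
        for name in (first_tsv, second_tsv):
            # Only .tsv and .tsv.gz files are accepted by the script.
            if not name.endswith((".tsv", ".tsv.gz")):
                raise ValueError("unexpected file extension: %r" % name)
        lines.append("\t".join((subject_id, first_tsv, second_tsv)))
    return "\n".join(lines) + "\n"


# Hypothetical sample pairs; replace with real subject IDs and file names.
pairs = [
    ("subjectA", "fileA1.tsv.gz", "fileA2.tsv.gz"),
    ("subjectB", "fileB1.tsv", "fileB2.tsv"),
]

text = format_batchfile(pairs)
print(text, end="")
# To save it (hypothetical file name):
# with open("my_batchfile.tsv", "w") as handle:
#     handle.write(text)
```

The extension check rejects misspelled names such as "fileA2.ts.gz" before they reach the analysis script.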
To run the binomial model, set method = binomial in the configuration.ini file, then run:
python2.7 differential_abundance/rundiffabBatch_2017_09_24.py --batchfile batchfile_healthy.tsv --config differential_abundance/configuration.ini --tsvDir healthy_donor_files/ --outDir diffab_results --parallel
To run the beta-binomial model, set method = betabinomial in the configuration.ini file, then run the following command, which also passes a training file via --train:
python2.7 differential_abundance/rundiffabBatch_2017_09_24.py --batchfile batchfile_healthy.tsv --config differential_abundance/configuration.ini --train differential_abundance/TrainingTSVs/replicates_Subject1_Standard.csv --tsvDir healthy_donor_files/ --outDir diffab_results --parallel
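When the same run has to be repeated for several batch files, the two commands above can be assembled programmatically. This is a rough sketch under stated assumptions, not a script shipped with the repo: the function `build_command` and its parameter names are hypothetical, while the paths and flags are copied verbatim from the commands above. It does not edit configuration.ini, so the method setting still has to be changed by hand before each run.

```python
from __future__ import print_function

# Paths copied from the example commands above.
SCRIPT = "differential_abundance/rundiffabBatch_2017_09_24.py"
CONFIG = "differential_abundance/configuration.ini"
TRAIN = "differential_abundance/TrainingTSVs/replicates_Subject1_Standard.csv"


def build_command(batchfile, tsv_dir, out_dir, train=None, parallel=True):
    """Return the argument list for one differential-abundance run."""
    cmd = ["python2.7", SCRIPT, "--batchfile", batchfile, "--config", CONFIG]
    if train is not None:
        # Only the beta-binomial example passes a training file.
        cmd += ["--train", train]
    cmd += ["--tsvDir", tsv_dir, "--outDir", out_dir]
    if parallel:
        cmd.append("--parallel")
    return cmd


binomial_cmd = build_command(
    "batchfile_healthy.tsv", "healthy_donor_files/", "diffab_results")
betabinomial_cmd = build_command(
    "batchfile_healthy.tsv", "healthy_donor_files/", "diffab_results",
    train=TRAIN)

print(" ".join(binomial_cmd))
print(" ".join(betabinomial_cmd))
# To execute one of them from Python:
# import subprocess
# subprocess.check_call(binomial_cmd)
```

Returning a list instead of a single string lets the command be passed to subprocess without shell quoting.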
For Research Use Only.