Testing your own isolates against our model

There are two ways to run the model on your own isolates, one in each of the directories fastq_pipeline and assembly_pipeline. The fastq pipeline is recommended because the choice of assembly method can affect the results.

All of the code you will need to run this analysis is located in the appropriate directory. The pipelines require the following to run:

HMMER3
bowtie2
bcftools
samtools
fastaq (only required for the assembly pipeline) (pip3 install pyfastaq)
BBMap (can be installed with Bioconda)
R v. 4.1 or above
R package randomForest
R package ROCR (conda install -c bioconda r-rocr)
R package DMwR, which can be installed from within R using: remotes::install_github("cran/DMwR") (requires the remotes package, which is installed alongside devtools)
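
Before running either pipeline, it can help to confirm that the command-line tools above are on your PATH. The following is a minimal sketch, not part of the repository: the function name check_tools is hypothetical, and the executable names (hmmsearch for HMMER3, bbmap.sh for BBMap, Rscript for R) are assumptions about a typical installation, so adjust them to match yours.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report which of the given executables are not on PATH.
# Prints "missing: <name>" for each one and returns non-zero if any are missing.
check_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Assumed executable names for the requirements listed above.
check_tools hmmsearch bowtie2 bcftools samtools bbmap.sh Rscript \
    || echo "Install the tools listed above before running the pipeline."
```

For the assembly pipeline, fastaq can be added to the list of names. The R packages can be checked separately by loading each one from an R session, for example with library(randomForest).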

The fastq pipeline assumes a suffix of _1.fastq and _2.fastq for paired-end reads for each sample.
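
Because the pipeline relies on this naming convention, a quick check that both files exist for a sample can catch naming problems early. This is a sketch under that assumption only; has_read_pair is a hypothetical helper, not a script shipped with the pipeline.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if both paired-end read files exist
# for the given sample prefix (the path without _1.fastq / _2.fastq).
has_read_pair() {
    local prefix="$1"
    [ -f "${prefix}_1.fastq" ] && [ -f "${prefix}_2.fastq" ]
}
```

For example, `has_read_pair path_to_samples/sampleID1 || echo "reads not found for sampleID1"` prints a warning when either file is absent.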

To run the pipeline on multiple samples, create a file listing the path to each sample's read files, one sample per line, excluding the _1.fastq and _2.fastq suffixes. Then loop over the file:

while read -r i; do ./run_invasiveness_index.sh "$i"; done < samples.txt

Where samples.txt may contain:

path_to_samples/sampleID1
path_to_samples/sampleID2
path_to_samples/sampleID3
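
If all of your reads sit in one directory, a file in this format can be generated rather than typed by hand. The sketch below assumes the _1.fastq naming convention described above; list_sample_prefixes is a hypothetical helper and is not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one sample prefix per line for every
# *_1.fastq file found directly inside the given directory.
list_sample_prefixes() {
    local dir="$1" r1
    for r1 in "$dir"/*_1.fastq; do
        [ -e "$r1" ] || continue      # the glob matched no files
        echo "${r1%_1.fastq}"         # strip the suffix, keep the path
    done
}
```

Running `list_sample_prefixes path_to_samples > samples.txt` would then produce a file like the one shown above, one line per sample.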
