Files

R_scripts
__pycache__
helper_files
notebooks
test
.gitattributes
CHANGELOG.md
LICENSE
README.md
_version.py
calculate_null.py
combine_null.py
filter_reads.py
gene_analysis.py
joint_snp_calling.py
node2vec.py
pNpS.py
strainRep2.py
strainRepMultiSample.py
strains - Training Data.ipynb

README.md

strains_analysis

Code for analyzing population genomics in genome-resolved metagenomes

NOTE: The latest and maintained version of this software is available at: https://github.com/MrOlm/inStrain

Requires: pysam, tqdm, BioPython.

Usage

python strainRep2.py -s 5 -f 0.05 -l 0.96 sorted_indexed_bam.bam scaffolds.fasta -o output --log log.txt
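To run the same command over several BAM files, the invocation above can be assembled from Python. This is only a convenience sketch: the function name build_strainrep_command and its parameter names are hypothetical, and it uses only the flags shown in the usage line above (-s, -f, -l, -o, --log).

```python
import subprocess
import sys


def build_strainrep_command(bam, fasta, output, log,
                            min_snp_reads=5, min_snp_freq=0.05,
                            min_pair_identity=0.96,
                            script="strainRep2.py"):
    """Assemble the argument list for the usage line shown in this README.

    The helper and its parameter names are illustrative; only the flags
    themselves come from the README.
    """
    return [
        sys.executable, script,
        "-s", str(min_snp_reads),
        "-f", str(min_snp_freq),
        "-l", str(min_pair_identity),
        bam, fasta,
        "-o", output,
        "--log", log,
    ]


if __name__ == "__main__":
    cmd = build_strainrep_command(
        "sorted_indexed_bam.bam", "scaffolds.fasta", "output", "log.txt"
    )
    print(" ".join(cmd))
    # Uncomment to actually run the script (requires pysam, tqdm, BioPython):
    # subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when file names contain spaces.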

Running python strainRep2.py -h prints the help text, which is fairly detailed and is currently the only documentation.

Output: three tables (and a large Python object).

Linkage table: shows SNP linkage.

Frequency table: shows SNPs and their frequencies.

Clonality table: shows the clonality and coverage of each position. From this, gene clonality can be calculated and compared to the genome average.

Options used in the example above:

-s 5 requires 5 reads to confirm a SNP; adjust this depending on your coverage.

-f 0.05 sets the minimum SNP frequency to 5%.

-l 0.96 means that read pairs must have 96% identity to the reference.

The statistics reported in the log file are also very useful.
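To make the three thresholds concrete, here is a minimal sketch of how such cutoffs could be applied to per-position base counts and to a read pair. It is not taken from strainRep2.py. The function names are hypothetical, the rule that a position counts as a SNP when at least two alleles pass both cutoffs is an assumption, and so is the definition of identity as 1 minus mismatches over aligned length.

```python
def alleles_passing(base_counts, min_reads=5, min_freq=0.05):
    """Return {base: frequency} for alleles meeting both cutoffs.

    base_counts maps a base ("A", "C", "G", "T") to its read count at one
    position. Mirrors the meaning of -s (min_reads) and -f (min_freq) as
    described above; the exact rule used by strainRep2.py may differ.
    """
    coverage = sum(base_counts.values())
    if coverage == 0:
        return {}
    return {
        base: count / coverage
        for base, count in base_counts.items()
        if count >= min_reads and count / coverage >= min_freq
    }


def looks_like_snp(base_counts, min_reads=5, min_freq=0.05):
    """Assumption: a position is variable if two or more alleles pass."""
    return len(alleles_passing(base_counts, min_reads, min_freq)) >= 2


def pair_passes_identity(mismatches, aligned_length, min_identity=0.96):
    """Assumption: identity = 1 - mismatches / aligned_length (cf. -l)."""
    if aligned_length <= 0:
        return False
    return 1 - mismatches / aligned_length >= min_identity


if __name__ == "__main__":
    print(looks_like_snp({"A": 90, "G": 10}))   # both alleles pass
    print(looks_like_snp({"A": 96, "G": 4}))    # G has fewer than 5 reads
    print(looks_like_snp({"A": 190, "G": 6}))   # G is below 5% frequency
    print(pair_passes_identity(5, 300))         # about 98.3% identity
    print(pair_passes_identity(15, 300))        # 95% identity
```

The second and third calls show why both cutoffs matter: at low coverage the read-count floor dominates, while at high coverage the frequency floor does.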
