zhaoc1/coreSNPs

coreSNPs

In the pitfalls paper, we demonstrated the benefits of using positive controls in shotgun metagenomic sequencing for high-volume analysis. In particular, we used a marine Vibrio species (Vibrio campbellii) as the positive control for all of our HiSeq runs at PCMP. Because the positive control samples have high sequencing coverage and low diversity, we are able to assemble the Vibrio campbellii genome de novo and study the distribution of SNPs across repeated sequencing runs of the same genome.

MAG: metagenome-assembled genomes

In this repo, we start with assembled and taxonomically annotated contigs from the sunbeam pipeline:

  • extract V. campbellii contigs using sbx_contigs.

  • assess the quality of the draft assemblies using checkm.

  • pangenome analysis using roary.

  • extract core genes using in-house R script

  • calculate SNPs for each core gene using snp-sites

  • muscle (?) or at least subset muscle

  • calculate hamming distance
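The last step above can be sketched as follows. This is a minimal illustration, not the code used in this repo: the function names `hamming_distance` and `pairwise_hamming` are hypothetical, the input is assumed to be a set of already aligned sequences of equal length (for example, one aligned core gene per sequencing run), and skipping gap and `N` positions is an assumption rather than a documented choice of this pipeline.

```python
from itertools import combinations


def hamming_distance(seq_a, seq_b, ignore="-N"):
    """Count positions where two aligned sequences differ.

    Positions where either sequence has a gap or an ambiguous base
    (characters listed in `ignore`) are skipped. That is an assumption
    of this sketch, not necessarily what the repo does.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a not in ignore and b not in ignore
    )


def pairwise_hamming(alignment):
    """Return {(name_a, name_b): distance} for every pair of sequences."""
    return {
        (name_a, name_b): hamming_distance(alignment[name_a], alignment[name_b])
        for name_a, name_b in combinations(sorted(alignment), 2)
    }


if __name__ == "__main__":
    # Toy alignment with made-up run names and sequences.
    toy = {"run1": "ACGTACGT", "run2": "ACGTACGA", "run3": "ACG-ACTA"}
    for pair, dist in pairwise_hamming(toy).items():
        print(pair, dist)
```

For repeated sequencing of the same positive control, small pairwise distances are what one would hope to see; larger values would point at assembly or sequencing differences between runs.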

Install coresnps Conda environment

conda env update --name=coresnps --quiet --file env.yml

Assess assemblies quality

Checkm requires python2.7; the checkm_dataset rule also takes care of the precalculated checkm database.

snakemake --configfile config.yml _run_checkm --use-conda --cores 8

Pangenome analysis

snakemake --configfile config.yml run_roary --cores 8
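Once roary has finished, core genes can be selected from its gene presence/absence table. The repo does this with an in-house R script; the Python sketch below is only a stand-in to illustrate the idea. It assumes the table is roary's `gene_presence_absence.csv` with `Gene` and `No. isolates` columns, and the function name `core_genes`, the `n_isolates` argument, and the `min_fraction` threshold are all hypothetical.

```python
import csv
import io


def core_genes(presence_absence_csv, n_isolates, min_fraction=1.0):
    """Return gene names present in at least min_fraction of the isolates.

    presence_absence_csv: text of a roary-style presence/absence table,
        assumed to have "Gene" and "No. isolates" columns.
    n_isolates: total number of genomes given to roary.
    min_fraction: 1.0 keeps only genes found in every isolate.
    """
    threshold = min_fraction * n_isolates
    reader = csv.DictReader(io.StringIO(presence_absence_csv))
    return [
        row["Gene"]
        for row in reader
        if int(row["No. isolates"]) >= threshold
    ]


if __name__ == "__main__":
    # Toy table with made-up gene and run names.
    table = (
        "Gene,No. isolates,run1,run2,run3\n"
        "gyrB,3,a_001,b_001,c_001\n"
        "hypX,2,a_002,,c_002\n"
        "recA,3,a_003,b_003,c_003\n"
    )
    print(core_genes(table, n_isolates=3))
```

With real data, the text would be read from the roary output directory, and the resulting gene list would feed the per-gene SNP calling step.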
