SNPanalysis

The SNP pipeline generates a vcf file from the raw data (fastq files) of the given samples. The script scripts/SNPpipeline.py takes a json file as input (for example examples/SNP_data_B8441.json) containing all information about the genomes and the tools used for the analysis. Longshot (Edge et al. 2019) is used for long sequencing reads (for example Nanopore), and gatk (Van der Auwera & O'Connor 2020) is used for short sequencing reads (for example Illumina).
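The pairing of tools with read types can be summarised as a small lookup. This is only an illustrative sketch: the names TOOLS_BY_READ_TYPE and choose_tools and the labels "long" and "short" are hypothetical and are not taken from scripts/SNPpipeline.py or from the json format.

```python
# Hypothetical helper, not part of scripts/SNPpipeline.py.
# It only restates which mapper and variant caller the README pairs
# with each kind of sequencing read.
TOOLS_BY_READ_TYPE = {
    "long": {"mapper": "minimap2", "caller": "longshot"},   # e.g. Nanopore
    "short": {"mapper": "bwa", "caller": "gatk"},           # e.g. Illumina
}


def choose_tools(read_type):
    """Return the mapper and caller names for 'long' or 'short' reads."""
    key = read_type.strip().lower()
    if key not in TOOLS_BY_READ_TYPE:
        raise ValueError("unknown read type: %r (expected 'long' or 'short')" % read_type)
    # Return a copy so callers cannot modify the lookup table.
    return dict(TOOLS_BY_READ_TYPE[key])


if __name__ == "__main__":
    for kind in ("long", "short"):
        print(kind, choose_tools(kind))
```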

Dependencies (see examples/SNP_data_B8441.json for example):

longshot, used for long reads
minimap2, used to map long reads to the reference genome
gatk, used for short reads
bwa, used to map short reads to the reference genome
sratoolkit, optional, for downloading data from SRA
picard, optional, for marking duplicates in short reads
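Before running the pipeline it can help to check which of these tools are installed. The sketch below assumes the tools are available on PATH under the executable names shown. That is an assumption: the json file may instead point to explicit tool locations, and picard and sratoolkit are often installed under other names. The helper find_missing is hypothetical and not part of this repository.

```python
import shutil

# Assumed executable names; adjust to match the local installation.
REQUIRED = ["longshot", "minimap2", "gatk", "bwa"]
OPTIONAL = ["picard", "fasterq-dump"]  # fasterq-dump ships with sratoolkit


def find_missing(tools, which=shutil.which):
    """Return the tools from `tools` that `which` cannot locate."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing_required = find_missing(REQUIRED)
    missing_optional = find_missing(OPTIONAL)
    if missing_required:
        print("Missing required tools:", ", ".join(missing_required))
    if missing_optional:
        print("Missing optional tools:", ", ".join(missing_optional))
    if not missing_required and not missing_optional:
        print("All listed tools were found on PATH.")
```

Only the tools for the read type being analysed are needed, so a missing longshot is harmless when all samples are short reads.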

How to create a vcf file

scripts/SNPpipeline.py -i examples/SNP_data_B8441.json -o B8441_vcf -prefix allsnps
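The same command can be launched from Python, for instance when processing several json files in a loop. This is a minimal sketch that uses only the -i, -o and -prefix options shown above. The function names build_command and run_pipeline are hypothetical, and running the script with the current interpreter (sys.executable) is an assumption.

```python
import subprocess
import sys


def build_command(json_path, output, prefix, script="scripts/SNPpipeline.py"):
    """Build the argument list for one pipeline run."""
    return [sys.executable, script, "-i", json_path, "-o", output, "-prefix", prefix]


def run_pipeline(json_path, output, prefix):
    """Run the pipeline and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_command(json_path, output, prefix), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so this sketch has no side effects.
    cmd = build_command("examples/SNP_data_B8441.json", "B8441_vcf", "allsnps")
    print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.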

References

Duong Vu (2023). https://github.com/vuthuyduong/SNPanalysis. DOI: 10.5281/zenodo.8046747
