{"note":"Don't delete this file! It's used internally to help with page regeneration.","google":"","body":"What is ShoRAH\r\n======\r\nShoRAH is an open source project for the analysis of next-generation sequencing\r\ndata. It is designed to analyse genetically heterogeneous samples. Its tools\r\nare written in different programming languages and provide error correction,\r\nhaplotype reconstruction and estimation of the frequency of the different\r\ngenetic variants present in a mixed sample.\r\n\r\n---\r\n\r\nThe software suite ShoRAH (Short Reads Assembly into Haplotypes) consists of\r\nseveral programs, the most important of which are:\r\n> `amplian.py` - amplicon based analysis\r\n\r\n> `dec.py` - local error correction based on diri_sampler\r\n\r\n> `diri_sampler` - Gibbs sampling for error correction via Dirichlet\r\n>process mixture\r\n\r\n> `contain` - removal of redundant reads\r\n\r\n> `mm.py` - maximum matching haplotype construction\r\n\r\n> `freqEst` - EM algorithm for haplotype frequency\r\n\r\n> `snv.py` - detects single nucleotide variants, taking strand bias into\r\n>account\r\n\r\n> `shorah.py` - wrapper for everything\r\n\r\n## Citation\r\nIf you use ShoRAH, please cite the application note paper _Zagordi et al._ on\r\n[BMC Bioinformatics](http://www.biomedcentral.com/1471-2105/12/119).\r\n\r\n## General usage\r\n\r\n### Dependencies and installation\r\nPlease download and install:\r\n\r\n- [Biopython](http://biopython.org/wiki/Download), following the online\r\n instructions.\r\n- [GNU scientific library GSL](http://www.gnu.org/software/gsl/),\r\n installation is described in the included README and INSTALL files.\r\n- ncurses is required by samtools. It is usually included in Linux/Mac OS X.\r\n\r\nPlease note that these dependencies can also be satisfied using the package\r\nmanager of many operating systems. 
For example,\r\n[MacPorts](http://www.macports.org/) on Mac OS X,\r\n[yum](http://yum.baseurl.org/) on several Linux installations, and so on.\r\n\r\n\r\nType `make` to build the C++ programs. This should be enough in most cases. If\r\nyour GSL installation is not standard, you might need to edit the relevant\r\nlines in the `Makefile` (location `/opt/local/` is already included).\r\n\r\n### Run\r\n\r\nThe input is a sorted BAM file. Analysis can be performed in local or global\r\nmode.\r\n\r\n#### Local analysis\r\n\r\nThe local analysis alone can be run by invoking `dec.py` or `amplian.py` (the\r\nprogram for amplicon mode). They work by cutting windows from the multiple\r\nsequence alignment, invoking `diri_sampler` on each window, and calling\r\n`snv.py` for SNV calling.\r\n\r\n#### Global analysis\r\n\r\nThe whole global reconstruction consists of the following steps:\r\n\r\n1. error correction (*i.e.* local haplotype reconstruction);\r\n2. SNV calling;\r\n3. removal of redundant reads;\r\n4. global haplotype reconstruction;\r\n5. frequency estimation.\r\n\r\nThese can be run one after the other, or one can invoke `shorah.py`, which runs\r\nthe whole process from BAM file to frequency estimation and SNV calling.\r\n","name":"Shorah","tagline":"Short Reads Assembly into Haplotypes"}