Experimental read phasing from HapCUT output

readphaser -- separate reads based on mapping results and HapCUT data

Caveat emptor: readphaser was tested for use in Krasileva et al., 2013 and benchmarked using data from that project. In our tests, it phases many contigs where the variant density and coverage are sufficient. However, it has not been widely tested in other species, and your mileage may vary. As with any bioinformatics program, check that your results are consistent with your own validation procedures. Also, I am quite busy nowadays, so (as with all free software) there is no guarantee of support, but I will try my best.

pr.py takes phased variants from HapCUT and a corresponding BAM file of alignments to produce:

  • FASTA file of phased reads (with phasing block and haplotype in the header).

  • FASTA file of reads unused in phasing (for example, because of a sequencing error at a variant position).

  • FASTA file of reads from unphased contigs.
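
The separation of reads by haplotype can be pictured with a small sketch. This is not the code in pr.py: the function name `assign_haplotype`, the dictionary layouts, and the simple majority-vote rule are all assumptions made for illustration. It only shows how a read's bases at phased variant positions could decide which haplotype it supports, and why a read carrying a base that matches neither allele might end up in the "unused" file.

```python
from collections import Counter


def assign_haplotype(read_alleles, phased_variants):
    """Return 0 or 1 for the haplotype a read supports, or None if unusable.

    Both argument layouts are hypothetical, chosen for this sketch:
      read_alleles:    {reference position: base observed in the read}
      phased_variants: {reference position: (haplotype 0 allele, haplotype 1 allele)}
    """
    votes = Counter()
    for pos, base in read_alleles.items():
        if pos not in phased_variants:
            continue  # not a phased variant position, so not informative
        hap0_allele, hap1_allele = phased_variants[pos]
        if base == hap0_allele:
            votes[0] += 1
        elif base == hap1_allele:
            votes[1] += 1
        else:
            return None  # matches neither allele, e.g. a sequencing error
    if not votes or votes[0] == votes[1]:
        return None  # no informative sites, or evenly conflicting evidence
    return 0 if votes[0] > votes[1] else 1


# Two phased variants in one block (made-up positions and alleles).
variants = {100: ("A", "G"), 150: ("C", "T")}
first = assign_haplotype({100: "A", 150: "C"}, variants)
second = assign_haplotype({100: "G", 150: "T"}, variants)
unusable = assign_haplotype({100: "T", 150: "C"}, variants)
```

Here `first` and `second` fall on opposite haplotypes, while `unusable` is `None` because the base at position 100 matches neither phased allele.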

Running readphaser:

$ python pr.py -u unphased.fq -p phased.fq hapcut.out aln.bam

Requirements

  • pysam

Citation

Please cite Krasileva et al., 2013.

Limitations

  • Indels are not handled: pr.py currently ignores reads with indels, because (1) they displace other variants and (2) variants around indels are often incorrectly called.
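
To see which reads this limitation affects, one option is to look for insertion (`I`) or deletion (`D`) operations in each alignment's CIGAR string. The helper below is a minimal sketch, not taken from pr.py; the name `has_indel` is hypothetical, and it works on plain SAM-style CIGAR strings so it runs without pysam.

```python
import re

# One CIGAR operation: a length followed by an operation letter from the SAM spec.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def has_indel(cigar):
    """Return True if a SAM CIGAR string contains an insertion (I) or deletion (D)."""
    return any(op in ("I", "D") for _length, op in CIGAR_OP.findall(cigar))


# Made-up CIGAR strings for illustration.
clean = has_indel("50M")
with_insertion = has_indel("20M2I28M")
with_deletion = has_indel("10M1D40M")
soft_clipped = has_indel("5S45M")
```

When working with pysam directly, the same information is available from `AlignedSegment.cigartuples`, where operation codes 1 and 2 stand for insertion and deletion.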