readphaser -- separate reads based on mapping results and HapCUT data
readphaser was tested for use in
Krasileva et al., 2013
and benchmarked using data from this project. In our tests, it phases
many contigs where there is a sufficient density of variants and
coverage. However, it has not been widely tested in other species and
your mileage may vary. As with all bioinformatics programs, check that
your results are consistent with your own validation procedures. Also,
I am quite busy nowadays so (as with all free software) there is no
guarantee of support, but I will try my best.
pr.py takes phased variants from
HapCUT and a
corresponding BAM file of alignments to produce:
- FASTA file of phased reads (with phasing block and haplotype in the header).
- FASTA file of reads unused in phasing (due to sequencing errors at the same location of a variant, for example).
- FASTA file of reads from unphased contigs.
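
To make the split between "phased" and "unused" reads more concrete, here is a minimal sketch of how a read could be assigned to a haplotype. It is not the code in pr.py. The function name `assign_haplotype`, the `(position, hap0_allele, hap1_allele)` variant tuples, and the assumption of an ungapped alignment with 0-based positions are all hypothetical and chosen only for illustration.

```python
def assign_haplotype(read_start, read_seq, variants):
    """Return 0 or 1 if the read supports one haplotype, else None.

    Illustrative only (hypothetical names and data layout):
      read_start -- 0-based reference position of the first read base
      read_seq   -- read bases, assumed aligned without gaps
      variants   -- list of (pos, hap0_allele, hap1_allele) for one phase block
    """
    supported = set()
    for pos, hap0_allele, hap1_allele in variants:
        offset = pos - read_start
        if not 0 <= offset < len(read_seq):
            continue  # variant lies outside this read
        base = read_seq[offset]
        if base == hap0_allele:
            supported.add(0)
        elif base == hap1_allele:
            supported.add(1)
        else:
            return None  # base matches neither allele, e.g. a sequencing error
    if len(supported) == 1:
        return supported.pop()
    return None  # no informative variant, or conflicting evidence


variants = [(3, "A", "G"), (7, "C", "T")]
print(assign_haplotype(0, "TTTATTTCTT", variants))
print(assign_haplotype(0, "TTTGTTTTTT", variants))
```

In this sketch, a read whose base at a variant position matches neither allele is left unassigned, which corresponds to the "unused in phasing" case described above.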
$ python pr.py -u unphased.fq -p phased.fq hapcut.out aln.bam
Please cite Krasileva et al., 2013.
- Indels are not handled: pr.py currently ignores reads with indels, because (1) they displace other variants and (2) variants around indels are often incorrectly called.
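
As a rough illustration of the indel limitation, the sketch below flags alignments whose CIGAR string contains an insertion (`I`) or deletion (`D`). This is an assumption about how such reads could be detected, not a description of how pr.py does it; the helper name `has_indel` is hypothetical. An indel changes the offset between read and reference coordinates for every base after it, which is one way variants further along the read end up displaced.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM format).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def has_indel(cigar):
    """Return True if the CIGAR string contains an insertion or deletion.

    Hypothetical helper for illustration; pr.py may detect indels differently.
    """
    return any(op in "ID" for _length, op in CIGAR_OP.findall(cigar))


print(has_indel("100M"))      # False
print(has_indel("50M2I48M"))  # True
```

If you want to check how many reads are affected in your own data, a filter like this can be applied to the CIGAR column of the alignments before running the phasing step.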