Close assembly gaps using long reads, with a focus on correctness.
git clone https://github.com/a-ludi/djunctor.git
cd djunctor
dub build
# run the program for development
dub run -- ARGS...
# run tests
dub test
# clean up
dub clean
djunctor takes two inputs, a reference assembly and a set of long reads
(PacBio), and tries to close scaffold gaps or reduce their size.
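To make "scaffold gap" concrete, here is a minimal sketch that locates gaps in a single scaffold sequence. It assumes the common convention that gaps are stored as runs of `N`. The source does not say how djunctor detects gaps, and the name `find_gaps` and its `min_length` parameter are hypothetical.

```python
import re
from typing import List, Tuple


def find_gaps(sequence: str, min_length: int = 1) -> List[Tuple[int, int]]:
    """Return half-open (start, end) intervals of N runs in a scaffold.

    Assumption: a gap is a run of at least `min_length` N/n characters.
    """
    pattern = re.compile(r"[Nn]{%d,}" % min_length)
    return [(m.start(), m.end()) for m in pattern.finditer(sequence)]


if __name__ == "__main__":
    scaffold = "ACGTNNNNACGTNACGT"
    print(find_gaps(scaffold))                # [(4, 8), (12, 13)]
    print(find_gaps(scaffold, min_length=2))  # [(4, 8)]
```

A gap counts as closed when its interval is replaced by sequence, and as reduced when the interval shrinks.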
program djunctor(ReferenceAssembly, LongReads)
begin
    FindAlignments()
    repeat
        FilterUselessReads()
        BuildPileUpsFromAlignments()
        for each pileUp do
            SelectGoodReads()
            BuildConsensus()
            InsertHit()
    until (numHits == 0 or maxIterationsReached)
    OutputResult()
end
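The control flow of the pseudocode above can be sketched in Python as follows. This illustrates only the repeat-until structure and is not the actual implementation: the three callables stand in for the named steps, and their names and signatures are hypothetical.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

Read = TypeVar("Read")
PileUp = TypeVar("PileUp")


def run_gap_closing(
    filter_reads: Callable[[], List[Read]],
    build_pile_ups: Callable[[List[Read]], Iterable[PileUp]],
    process_pile_up: Callable[[PileUp], bool],
    max_iterations: int,
) -> Tuple[int, int]:
    """Iterate until an iteration yields no hits or the cap is reached.

    Returns (total number of hits, number of iterations performed).
    """
    iteration = 0
    total_hits = 0
    while True:  # repeat ... until: the body always runs at least once
        iteration += 1
        reads = filter_reads()
        num_hits = 0
        for pile_up in build_pile_ups(reads):
            # Stands in for SelectGoodReads, BuildConsensus and InsertHit;
            # returns True when a consensus was inserted (a "hit").
            if process_pile_up(pile_up):
                num_hits += 1
        total_hits += num_hits
        if num_hits == 0 or iteration >= max_iterations:
            break
    return total_hits, iteration


if __name__ == "__main__":
    # Toy model: three open gaps, at most two pile-ups handled per iteration.
    open_gaps = [1, 2, 3]

    def close(gap: int) -> bool:
        open_gaps.remove(gap)
        return True

    print(run_gap_closing(lambda: list(open_gaps), lambda r: r[:2], close, 10))
    # (3, 3): two hits, then one hit, then an empty iteration that stops the loop
```

The loop stops as soon as one full pass inserts nothing, so the iteration limit only matters when every pass keeps producing hits.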
- Low Complexity Regions
ccgcacctcaaatcgtcaccgttgtgtatcgaggggacttatagtgc tcctgtgacatgtcactgttgcggtcgaaccggtcgtgcaatccgac gtcccaatgcccgccgcattaacggtagccatAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAcgc atcaccgatcggggtcggtaataaaaggacaaagttagtgttggcca cgaacttctcacgaataagttccctggttttgcgagggaatgcatct gctaggcgtcactggacacagtgggaaagctgccgggggcga
- Low Complexity Regions
ccgcacctcaaatcgtcaccgttgtgtatcgaggggacttatagtgc tcctgtgacatgtcAGTAGTAGTAGTAGTAGTAGTAGTAGTAGTAGT AGTAGTAGTAGTAGTAGTAGTAGTAGTAGTAGTAGTAGTAGTAGTAG TAGTAGTAGTAGTAGTAGTAGTAGTAGTAGTAGTccacgagctggag cctaaaacaattccatgagactggtctaggttacgcagtgtagccgc atcaccgatcggggtcggtaataaaaggacaaagttagtgttggcca cgaacttctcacgaataagttccctggttttgcgagggaatgcatct gctaggcgtcactggacacagtgggaaagctgccgggggcga
- Tandem Repeats
ccgcacctcaaatcgtcaccgttgtgtatcgaggggacttatagtgc tcctgtgacatgtcactgttgcggtcgTTGTGTATCGAGGGGACTTA TAGTGCTCCTGTTTGTGTATCGAGGGGACTTATAGTGCTCCTGTTTG TGTATCGAGGGGACTTATAGTGCTCCTGTTTGTGTATCGAGGGGACT TATAGTGCTCCTGTTTGTGTATCGAGGGGACTTATAGTGCTCCTGTT TGTGTATCGAGGGGACTTATAGTGCTCCTGTTTGTGTATCGAGGGGA CTTATAGTGCTCCTGTaagttccctggttttgcgagggaatgcatct gctaggcgtcactggacacagtgggaaagctgccgggggcga
- Transposable elements
ccgcacctcaaatcgtcaccgttgtgtatcgaggggacttatagtgc tcctgtGACATGTCACTGTTGCGGTCGAACCGGTCGTGcaatccgac gtcccaatgcccgccgcattaacggtagccataGACATGTCACTGTT GCGGTCGAACCGGTCGTGtagcgcgacaaaaaccccacgagctggag cctaaaacaattccatgagactggtctaggGACATGTCACTGTTGCG GTCGAACCGGTCGTGtcgtaaaggtctgtcatagtttgtgtgtgtga gcggaagtataaacgaaaagaggaccagaaaaGACATGTCACTGTTG CGGTCGAACCGGTCGTGacagtgggaaagctgccgggggcga
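In the example sequences above, the repetitive parts are written in upper case. A short helper can pull those runs out and report their smallest repeat period. This is only a sketch for reading the examples, not part of djunctor, and both function names are hypothetical. Whitespace is dropped first because the sequences are wrapped with spaces.

```python
import re
from typing import List, Tuple


def uppercase_runs(sequence: str) -> List[Tuple[int, int, str]]:
    """Return (start, end, text) for each maximal upper-case run.

    Coordinates are half-open and refer to the sequence with all
    whitespace removed.
    """
    compact = "".join(sequence.split())
    return [(m.start(), m.end(), m.group())
            for m in re.finditer(r"[ACGTN]+", compact)]


def smallest_period(run: str) -> int:
    """Return the smallest p such that run[i] == run[i - p] for all i >= p."""
    for period in range(1, len(run) + 1):
        if all(run[i] == run[i - period] for i in range(period, len(run))):
            return period
    return len(run)  # only reached for the empty string


if __name__ == "__main__":
    toy = "acgtAAAA cgtAGTAGT ac"
    for start, end, text in uppercase_runs(toy):
        print(start, end, text, smallest_period(text))
    # 4 8 AAAA 1
    # 11 17 AGTAGT 3
```

A period of 1 corresponds to a homopolymer stretch and a small period to a short tandem repeat. For dispersed copies such as the transposable-element example, the period of a single run says little, since the repetition is between runs rather than within one.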
This project is licensed under the MIT License.