Commit

Merge branch 'master' into ct-add-bwa-depletion
tomkinsc committed Jan 24, 2018
2 parents 1bd3b70 + 06b6cd6 commit dee3e8b
Showing 6 changed files with 147 additions and 6 deletions.
2 changes: 1 addition & 1 deletion Dockerfile
@@ -1,4 +1,4 @@
-FROM quay.io/broadinstitute/viral-baseimage:0.1.6
+FROM quay.io/broadinstitute/viral-baseimage:0.1.7

LABEL maintainer "viral-ngs@broadinstitute.org"

Binary file not shown.
119 changes: 119 additions & 0 deletions test/input/TestOrderAndOrient/expected.lasv.ambig.fasta
@@ -0,0 +1,119 @@
>KM821998.1|LASV-ISTH2376-S-NG-2012H_contigs_ordered_and_oriented
ACTCGGACTCTCTCTGATTCTGAAGGTAATGAAACACCAGGTGGGTATTGTCTAACAAGA
TGGATGCTGATTGAGGCCGAGTTGAAGTGCTTTGGGAACACAGCAGTTGCAAAATGCAAT
GAGAAACATGATGAAGAGTTTTGTGACATGTTAAGATTGTTTGACTTTAACAAACAAGCC
ATAAGCAGGCTGAAGACAGAAGCACAAATGAGTATCCAACTGATAAACAAAGCAGTGAAC
GCACTGATCAATGACCAGCTGATTATGAAGAATCATTTGAGAGATATCATGGGAATTCCT
TACTGCAATTACAGCAAGTACTGGTATCTGAATCACACTGTAACAGGGAAGACATCATTG
CCAAGGTGCTGGCTTGTATCAAATGGTTCCTACCTAAATGAAACACACTTTTCTGATGAT
ATTGAACAACAAGCAGACAACATGATAACAGAAATGTTGCAAAAGGAGTACATAGACAGG
CAAGGGAAGACACCCTTGGGGTTAGTGGATCTTTTTGTCTTCAGCACCAGCTTTTACCTC
ATTAGCATCTTCCTCCACCTGGTTAAAATCCCAACCCACAGACATGTCATAGGAAAACCC
TGCCCCAAACCACACAGACTCAACCATATGGGCATATGCTCATGTGGTTTATACAAACAT
CCTGGTGTACCAGTCAAGTGGAAAAGATAAGAGATAGACCCACCCATGGGCCCCCGTGAC
CCACCGCCGAAAGGCGGTGGGTCACGGGGGCGTCCATTTACAGGACGACCTTGGGGCTTG
AGGTTCTAAACACCATGTCCCTGGGGAGAACTGCCCTCAAAACTGGTATATTGAGTCCTC
CTGACACAGCTGCATCATACATTATGCAATCCATTAGAGCACAGTGCGGGGTGATTTCCT
CTTTGCCTCCTCTTTTCTTCTTTTCGACAACCACTCCAGTATGCATATGGCATAGATCTT
TGCACTGATCCCAAACAGCATTTTCAAATCTCCTAGAGTCAACCTTGCTCAGTGCAATGT
CAATAAGCTTTATGTCCTTTCTTCCTTGGGAGTCCAAGAGTTTCTTGATATCATCTGAAC
CTTGACAGGTCAGTACCATATTGCGAGGGAGGGCTTCAATGACTGCGCTGGTCAAACCAG
GCTGAGCAGGGAAGAGATCTGCCACATCAATCCCATGAGAGTATTTGGCATCCTGCTTGA
ACTGTTTTAGGTCTGTTGGTTCTCTGAAGAAGTGTATGTAACAGCCTGACATAGGTTGGT
AAAGAGCTATTTCCACAGGGTCTTCTGGACGACCTTCGATGTCTATCCAGGTTTTGGCGC
TTGGGTCAAGCTGCATCATTGAGTCTTTGAGTGTCATTAATTGAGAATAGGTCAGCCCTG
TTGGAAACCCTGCTGACTGTAAAGACTTGTTAGACCCAGCAATGCCCACTTTTTGCGGTT
TTCCATCTGACTCGAGATCCACAGTAGTGTTCTCCCAAGCTCTACCCACAATAGAAGTCC
TTGAAGCTATGTAGGGCCAGCCGTCACCGGAGAGACAGATCTTGTAAAGTATATTCTCAT
AAGGGTTCCTCTCACCAGGTGTATCTGAAACAAACATTCCCAAAGATCTTTTTACTTTTA
AAATAGACTTCAAGATGCCATCCATTGTCTGAGGTGTAACCTTGATGGTCTCCAACATGT
TCCCCCCATCAAGCATACAAGCTCCGGCTTTCACTGCAGCTCCTAAGCTGAAGTTGTAAC
CAGAAATGTTCAAGGAGCTCTTCTTAGTGTCCACCATATTCAGTATAGGGTGGCTTTGGG
AAAGTCTGTCCAGGTCGGAGCTATTCGGGTACTTAGCTGTGTATATCAGGCCCAGGTCTG
TTAGTGCTAAGACGGCATCATTCAAATCAACTTGACCTTGTTTGGTGAGACATGCTAATG
TCAGACTAGGCATGGTGCCGAACTGGTTGTTCAAAAGGTCCGGATTCTTGACATCCCACA
CTCTAACGACTCCGTCTCTTCCAGGTTGGGTGCCCTGAGCCCCACCAACCATACCTATCA
TGTTCAGCAATGCCCTTCTCTGCTCAAGCTGCTGTGTACTCAAATTCCCCATATAAACAC
CTGAGCTTAGTGGCCTTTCTGTTCTAATGACCTTCGACTTTAGTTTTTCTAAGTCAGCTG
CCAGTGTCAGCAGATCATCTGAGGTCAAGGTCCCAACTCTCAAGATACTCTTTTGTTGTG
TCGATTTGAGTTCTACAAGATTGTTGACAGCCTGATTCAAATCTCTGAGACGTTTTAAGT
CACTATCATCCCTTTTTTGTTTACGCATTAAGCGCTGCACATTGCTGACTTCAGAAAAAT
CAAGACCATGAAGGAGAGCCTGAGCATCCTTAACAACTTGTAGTTTTATATTGGAGCAAT
AACCTGAAAGCTCCCTCCTCAATGACTGTGTCCAGAGAAATGACTTCACTTCCTTAGAAG
CACTCATCTTGGTCTGGACACTCGGGGACTTACTGGTTGAAAGTATCACTCAGTGAGTC
>KM821997.1|LASV-ISTH2376-L-NG-2012H_contigs_ordered_and_oriented
TTATTGAACAATATCCTAAGCATTGACCTCATCTTTTTACATGCATCAGGAATCAGCCTC
TCAATTTGCCTCATTATTCTATAACTACGATTCCCATCCACCCAATCTCTTACATCAGTC
TCACAGTTCAAAAGGAATGGATCAATAGGATACTTAGCATAACATAACAAACTCAAAGTC
CTCTTCTGAATTTGATTACACAGATCGACTGGGACACCATTAGCCACAGATTGATCAATT
ATGGTGTCAATTGTTTCTGCCAATTGATGGGGTTCTTTGCACTTTATGTTATGTAAAGCC
GCAGCAACAAATTTTGTTAGCAGAGGTACTTCGTCACCCCAGACATAGAACCTTGATTTG
AACTCTGCGACAAACCTGCCAATTGCACTCTTTGGACTCACAAACTTATTTAGCTGATCA
CTCATGTAGTAATGAAATTCTAACAATGTTTTGAATTCCTCTGTATCTCTTGAGAGTAAC
TCTGTTAGGTTCTGATCAAATAAAGATATCTGATCGTCACTGGACGTGTAAGCATCAATT
TGTCCTCCAGATACACACTTCACTGCAAAATTTATAAATCTTTCAGAAATCAATGCATAA
AAGTCTGAGGTGTTGTGCAGGATACCCTGTCCCATATCAAGGATTGAACTAATGTGTGAA
GGGACTATCCCAATTTGAAAGTTGGAATTAAAGAAGTCCTCAGTCATTGATTGTGTTGTC
TTCTTCTTTAATCCAAGTTGGGATTTGATGTATGACTTCATCATTGCTGATACAACATTA
AATGGGATCTCAACCATTTTGTGCATATGCCATGTTAATAATGTTGATAGGTAATCTTTA
CCTTTTAAATCAGCTTGTAAATCATCAGAAAGTAAGATTAGGTTTTGAATTATTGTTAGA
AATAAGAAAGGACACATCATAGGGCCCCACTTACTGTGGTCCATACTGTATGATACATGT
GCCGATGAGACATTTAGCTTCATTGACAAAATGGCATTGTCAAACTCCTTCTCATTATTT
AGACAGCTGCCTGCTAATTGTGATGTGAGCGCTTCAAAGTAATCCTCAATAAGCCTTGTG
AACATCTTTGTCCTTAAATCTCCGATGTACAACTCTCTATTGCCTCCCACTTGTTCCTTG
TAAGATAAAGAGAATTTTAGTCTACCAGTGTCAGGGCCAGTTGAATTATAAGACTGAGGT
GATTCTTGACTGTAAAAACACAGGTTTTTCAACATTGCTGTAGTACAGTTAGTTAGGGAC
AAGGCCTTGCTAAGTGCCTCAGAATTACTCTCACGCTCACTTATCCTGACATCATCAGCC
AGTCTTCCTGTATCAAATTTAAAGTTCAGACACTTACTCTTATAATGGGAGTACCTTCCC
ATCAGCTTGTTCCCATTCATCACTAACAAAATTGATTTGAAGCACTGGAAATATTCTTGG
TCCAAAAAAGTCCTAGTCACTACTGCTTTTGTCAGCTCACTAATAGGACATGAATTCATC
GGCCCTACATAAAAATACTTACTTCTCAAAGTAGCATCATTGTAGATATTCTCACAAATT
TTGATGTATTGATCCTCAGACAGGATTTCTCGATCAAAATCTTCTATCATATGGTTCGAA
AGTTCACATTTTATCAACCTTATTAAGAGTTTTTCTTCCACCACTTTGTCAAGATCAGAT
ATTGAGGGAGCATTGCAGTCTGCCTGCTGCTGTCTCTCTGGTTGCTCTCTATATTGACTG
ACAATATTGTCAACAGTACACCTCAGTTGATCAAAGTATTCTGATGCTCCTCCATCCAAT
AGTATCTCATCTAGATCAGCATTGTTGGTGTCACCACCCCTTCCTTTGGACCCCAGAACA
AGATTACTCATTGCTTGCTGTATCTTGTAATCATAATCCTGCTTATTGAGTAAGAATTTA
CCTTTACGTGCAAATACTTCAGTTAACTGTGTGACAGCCAAAGCTGTAAGCTTGTTGAAG
TCATAATCTAGAACACGGGACCCATCTGTATACTTATTCACAACGACACTCTTATTACTG
GCCAAGTCCAAAGCTGTCGCACAGCCGCTTGTAATAAGAGGGTCTCGCAGTGTCTTCTTT
ACTTCCTTTTCTTTAAACAATGAACCATTATTAAAACTAGACACTAGGAGTGAAATAAAT
TTCTTGGAAACACCAGGCTTTTTATACCTAACCTTTTCCGCATCTGCACAACATTCTTTG
CCTAGAAACTTTCTTGCATTATAAACCATTTCAGCAAGCTCTTCCTCCAATGCATCATCT
TTCGGGTTCACACTTACATGCCCAAATTCCAATTTAGGCTCAAGGAATTTTTCAAAGCAT
TTTATCTGGTCAGTCAGTCTGTCAGGTGTTTCTTTTGTAATGAAATGACACATATAAGAG
ATATTTAAAACAAACTTAAATCTATTGGTCATCATACTGACTACTTCTTCAGACAGAATT
ATTTTTAGTAGGTGACGAACAAGTTTGTAGAGCAGATATTCAACTTCTGTGATTAAATCC
TCCCTGATCTTCTCAATCAAATCCTTGTGATAAAAATCGGATACAAAGGCCATTACAAAA
TACCTAATGTTTTGTAAAAACTTCTGACACCTTTTACTAGGATGTGTTAGTATCAAAACC
AAAATCATCTTGGTTAACAATTTTATAGTAATCAATTGTTCAGATAGCTCTGTGCACTCT
TCTATCCAGCTGACCATAACATCGACTACTTTCTGTAACACATCACTGGAGAATACTGAA
GGGAAAAACCGTTTGGGATCAGCATAAAAGGAGCAAACCTCTCCAACTACATTGTTGTTA
ATAGCATAACACTTTGAGCATTCACCTGTCTTTTGATATAGTAGTTTGAATCCCATCCTA
CTGATGAAGAACTCTTGGCAATAGGCCTCTTTGCACCTTACAACCTGGTATCTAGAGGGT
CCAAATTCATTTTGTCTTAGTTTCACTGTGGAAGACGTTTTCATTGAGTTGATCAGAGCC
AGGCTCAATGATGATAGTCTTTCTAAGTCTGTCACATCTAGAATGTCAATAACACCAAGA
GTATAAGGGAACATGTCTTCCTTAGTTTTCTGGTAACTTATAGTGGGCGTGATACCGGAG
ATCTCAATGTCCATGATGGCATCATAAACATCACAATCAAGGATCTTGAGATCAATTCCA
TAAGAGTTGGTTTCTATGCCAACCTCCCTCAAAGCATTAGTGGCCCGAGTTAAAACTCTG
TTAATGATGGCAAGGAGCTCACAATTTCGTTCTTCTTTGAGTGATAGAACACCTGCATTC
CTGTCTTTAGCTCTCTTCCTACTGTCTATCCACCGTTTNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNTCAGTTTTAACAAATTGACTTTCTATGTTCCCTTGCCTTAGTTTGTTCCTA
AAGATTTGATACTCCTCTTCAATTAAACTCTTGACTTCATGAGAGGTGAGCTTGTTATTA
ATGCCCTGATGACAACAAGAAATTATTTCCTCAAAGTGTTTTGCACGCTTGTCAGTTAGA
ACATTAATGCTTTCAATCCCGGAGAGCCTCCCAGACGTTAAGGCTAAAGATTCACAAAGT
CTTGAATACTCAGATTCTTCGAACAATGCATTACTTTCTTGTGAGTATTTTAATAAAGTG
AACAAGGTGTCCCTGAGTTTGTCATTCACCCAATCAGGTATCTGCTCATTGTAAAAAGTT
GTCCTACCGTCTATGAGAGGGATAAGATTGATGCCAACGGATTCCAAGTCATTCTTTAAT
TGTTCTAGCTTCTTGTGATCTTCTAAATACTTCTGTTCAAAATTGGCAGGGGATGATCTG
ACAAAGCACTCAAGTAAGATAAGTACATTACCAGTAAGTTTGTAGCCGTCAGGCACCACA
AAGCATAAAGCTGGAGTGAGAATCCCATGATCGCTCAGAATGGATTCAACCGTTTTGTCT
TCAGTGTTATGTTCACATCCATTGGCTTTGCAAGAGTCAATCTCAATGCAAAGAGATAGT
AGTTTCAATCCCTCCATGAGAAGCATTCTGGGTTCGGTTTGGACCAAAAAGGCTAGCTTT
TGTCTCGATAGCTTCTCAT
18 changes: 18 additions & 0 deletions test/unit/test_assembly.py
@@ -373,6 +373,24 @@ def test_multi_overlap(self):
             str(Bio.SeqIO.read(outFasta, 'fasta').seq),
             str(Bio.SeqIO.read(expected, 'fasta').seq))

+    def test_ambig_align(self):
+        inDir = util.file.get_test_input_path(self)
+        contigs_gz = os.path.join(inDir, 'contigs.lasv.ambig.fasta.gz')
+        contigs = util.file.mkstempfname('.fasta')
+        with util.file.open_or_gzopen(contigs_gz, 'rb') as f_in:
+            with open(contigs, 'wb') as f_out:
+                shutil.copyfileobj(f_in, f_out)
+        expected = os.path.join(inDir, 'expected.lasv.ambig.fasta')
+        outFasta = util.file.mkstempfname('.fasta')
+        assembly.order_and_orient(
+            contigs,
+            os.path.join(inDir, 'ref.lasv.ISTH2376.fasta'),
+            outFasta)
+        def get_seqs(fasta):
+            return [str(s.seq) for s in Bio.SeqIO.parse(fasta, 'fasta')]
+        self.assertEqual(get_seqs(outFasta), get_seqs(expected))
+
+
 class TestGap2Seq(TestCaseWithTmp):
     '''Test gap-filling tool Gap2Seq'''
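
The new test_ambig_align test compares only the sequence strings of the two FASTA files, so differences in record names or line wrapping do not affect the result. A minimal sketch of that comparison follows. It uses a small hand-written parser in place of Bio.SeqIO so that it runs without Biopython; read_fasta_seqs and the sample records are hypothetical and not part of this commit.

```python
import io


def read_fasta_seqs(handle):
    """Return the sequences of a FASTA stream in file order, ignoring headers."""
    seqs = []
    chunks = None  # None until the first '>' header is seen
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            # A new record starts: flush the previous one, if any.
            if chunks is not None:
                seqs.append("".join(chunks))
            chunks = []
        elif line and chunks is not None:
            chunks.append(line)
    if chunks is not None:
        seqs.append("".join(chunks))
    return seqs


# Two made-up files: different record names and wrapping, same sequences.
actual = io.StringIO(">seg_S\nACGT\nNNAC\n>seg_L\nTTGA\n")
expected = io.StringIO(">other_name_S\nACGTNNAC\n>other_name_L\nTTGA\n")
assert read_fasta_seqs(actual) == read_fasta_seqs(expected)
```

Comparing lists rather than sets keeps the check sensitive to segment order, which matters for an order-and-orient step.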

12 changes: 8 additions & 4 deletions tools/mummer.py
@@ -284,10 +284,14 @@ def scaffold_contigs_custom(self, refFasta, contigsFasta, outFasta,
     seq = []
     for _, left, right, n_features, features in fs.get_intervals(c):
         # get all proposed sequences for this specific region
-        alt_seqs = list(
-            alnReaders[(c, f[-1][0])].retrieve_alt_by_ref(left, right, aln_start=f[1], aln_stop=f[2])
-            for f in features
-        )
+        alt_seqs = []
+        for f in features:
+            try:
+                alt_seqs.append(alnReaders[(c, f[-1][0])].retrieve_alt_by_ref(left, right, aln_start=f[1], aln_stop=f[2]))
+            except AmbiguousAlignmentException:
+                log.warn("dropping ambiguous alignment to ref seq {} at [{},{}]".format(c, f[1], f[2]))
+                pass
+
         # pick the "right" one and glue together into a chromosome
         ranked_unique_seqs = contig_chooser(alt_seqs, right-left+1, "%s:%d-%d" % (c, left, right))
         seq.append(ranked_unique_seqs[0])
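
The hunk above replaces a single generator expression with an explicit loop, so that one ambiguous alignment is logged and skipped instead of aborting the whole interval. A minimal, self-contained sketch of that pattern follows. The exception class here is only a local stand-in for the one named in the diff, and collect_alt_seqs and fake_retrieve are hypothetical helpers, not functions from tools/mummer.py.

```python
import logging

log = logging.getLogger(__name__)


class AmbiguousAlignmentException(Exception):
    """Local stand-in for the exception of the same name used in the diff."""


def collect_alt_seqs(features, retrieve):
    """Call retrieve(f) for every feature, dropping those that are ambiguous."""
    alt_seqs = []
    for f in features:
        try:
            alt_seqs.append(retrieve(f))
        except AmbiguousAlignmentException:
            # Log and continue, so the remaining features are still considered.
            log.warning("dropping ambiguous alignment at [%s,%s]", f[1], f[2])
    return alt_seqs


def fake_retrieve(feature):
    # Hypothetical retriever: a feature named "ambig" is treated as ambiguous.
    name, _start, repeat = feature
    if name == "ambig":
        raise AmbiguousAlignmentException(feature)
    return "ACGT" * repeat


features = [("ok", 1, 2), ("ambig", 3, 4), ("ok", 5, 1)]
print(collect_alt_seqs(features, fake_retrieve))
```

This sketch calls log.warning rather than log.warn because the latter is a deprecated alias in Python's logging module, and it omits the trailing pass, which has no effect after the logging call.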
2 changes: 1 addition & 1 deletion travis/install-wdl.sh
@@ -18,7 +18,7 @@ cached_fetch_jar_from_github () {

 cached_fetch_jar_from_github broadinstitute wdltool 0.14
 cached_fetch_jar_from_github broadinstitute cromwell 29
-cached_fetch_jar_from_github dnanexus dxWDL 0.57
+cached_fetch_jar_from_github dnanexus dxWDL 0.58.1

TGZ=dx-toolkit-v0.240.1-ubuntu-14.04-amd64.tar.gz
if [ ! -f $CACHE_DIR/$TGZ ]; then