HybPiper version 2.2.0
- Add option `--end_with` to command `hybpiper assemble`. Allows the user to end the assembly pipeline at a chosen step (`map_reads`, `distribute_reads`, `assemble_reads`, `exonerate_contigs`).
- Add option `--exonerate_skip_hits_with_frameshifts` to command `hybpiper assemble`. If provided, skip Exonerate hits where the SPAdes contig contains frameshifts when considering hits for assembly of an `*.FNA` sequence. Default behaviour in HybPiper v2.2.0 is to include these hits; previous versions excluded them automatically.
- Add option `--exonerate_skip_hits_with_internal_stop_codons` to command `hybpiper assemble`. If provided, skip Exonerate hits where the SPAdes contig contains internal in-frame stop codon(s) when considering hits for assembly of an `*.FNA` sequence. A single terminal stop codon is allowed. Default behaviour in HybPiper v2.2.0 is to include these hits; previous versions excluded them automatically.
- Add option `--exonerate_skip_hits_with_terminal_stop_codons` to command `hybpiper assemble`. If provided, skip Exonerate hits where the SPAdes contig contains a single terminal stop codon. Only applies when option `--exonerate_skip_hits_with_internal_stop_codons` is also provided. Only use this flag if your target file exclusively contains protein-coding genes with no stop codons included, and you would like to prevent any in-frame stop codons in the output sequences. Default behaviour in HybPiper v2.2.0 is to include these hits; previous versions excluded them automatically.
- Add option `--chimeric_stitched_contig_check` to command `hybpiper assemble`. If provided, HybPiper will attempt to determine whether a stitched contig is a potential chimera of contigs from multiple paralogs. Default behaviour in HybPiper v2.2.0 is to skip this check; previous versions performed the check automatically. Skipping this check significantly speeds up the final `exonerate_contigs` step of the pipeline.
- Add option `--no_pad_stitched_contig_gaps_with_n` to command `hybpiper assemble`. If provided, when constructing stitched contigs, do not pad any gaps between hits (with respect to the "best" protein reference) with a number of Ns corresponding to the reference gap multiplied by 3. Default behaviour in HybPiper v2.2.0 is to pad gaps with Ns; previous versions did this automatically.
- Add option `--skip_targetfile_checks` to command `hybpiper assemble`. Skip the target file checks. Can be used if you are confident that your target file has no issues (e.g. if you have previously run `hybpiper check_targetfile`).
- Add option `--no_spades_eta` to command `hybpiper assemble`. When SPAdes is run concurrently using GNU parallel, the `--eta` flag can result in many "sh: /dev/tty: Device not configured" errors written to stderr. Using this option removes the `--eta` flag from the GNU parallel call, silencing both the ETA output and the error message.
- Fixed a bug in `exonerate_hits.py` that could (rarely) result in a duplicated region in the output `*.FNA` sequence.
- Fixed a bug in `exonerate_hits.py` that occurred when more than two Exonerate hits had identical query ranges and similarity scores; this could result in a sequence not being returned for the given gene.
- Added `tests` folder containing initial unit tests. Some tests require the Python package `pyfakefs` to run.
- Refactor of the hybpiper package. New module `hybpiper_main.py` with entry point (moved from `assemble.py`), and some `assemble.py` functions moved to `utils.py`. Target file checking functionality has been consolidated.
- HybPiper now logs to `stdout` rather than `stderr`.
- Commands `hybpiper check_targetfile` and `hybpiper assemble` now write a report file when checking the target file (`check_targetfile_report-<target file name>.txt`), rather than logging details to the main sample log. Command `hybpiper check_targetfile` writes the report to the current working directory, whereas command `hybpiper assemble` writes it to the sample directory.
- If the option `--cpu` is not specified for `hybpiper assemble`, HybPiper will now use all available CPUs minus one, rather than all available CPUs.
- Command `hybpiper assemble` now checks for output from previous runs for the pipeline steps selected via `--start_from` and `--end_with` (default is to select all steps). If previous output is found, HybPiper will exit with an error unless the option `--force_overwrite` is provided.
- Corrected the reading frame of sequence `Artocarpus-gene660` in the test dataset target file.
- Command `hybpiper assemble` now writes the file `<prefix>_chimera_check_performed.txt` to the sample directory. This is a text file containing 'True' or 'False' depending on whether the option `--chimeric_stitched_contig_check` was provided to command `hybpiper assemble`. Used by `hybpiper retrieve_sequences` and `hybpiper paralog_retriever`.
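
The step selection controlled by the start and end options can be pictured as slicing an ordered list of pipeline steps. The following is a minimal sketch only, not HybPiper's actual implementation: the function name `select_steps` is hypothetical, and the step names are the four listed in these notes.

```python
# Ordered pipeline steps, as named in the release notes.
PIPELINE_STEPS = [
    "map_reads",
    "distribute_reads",
    "assemble_reads",
    "exonerate_contigs",
]


def select_steps(start_from="map_reads", end_with="exonerate_contigs"):
    """Return the steps from start_from to end_with, inclusive.

    Hypothetical helper for illustration; the defaults select every step.
    """
    start = PIPELINE_STEPS.index(start_from)  # ValueError for an unknown step
    end = PIPELINE_STEPS.index(end_with)
    if start > end:
        raise ValueError(f"{start_from} comes after {end_with} in the pipeline")
    return PIPELINE_STEPS[start:end + 1]
```

Under this model, ending at `distribute_reads` selects only the first two steps, and only the output of the selected steps would need to be checked for results from a previous run.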
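
The gap-padding rule for stitched contigs (reference gap in amino acids multiplied by 3) can be illustrated with a short sketch. This is an assumption-laden example rather than HybPiper code: the function `stitch_hits` and its input format (tuples of reference start, reference end and nucleotide sequence, with 0-based end-exclusive protein coordinates) are hypothetical, and overlapping hits are simply concatenated without trimming.

```python
def stitch_hits(hits, pad_gaps_with_n=True):
    """Join hit sequences in reference order, optionally padding gaps with Ns.

    hits: list of (ref_start, ref_end, nucleotide_seq) tuples, where the
    coordinates are amino-acid positions on the protein reference
    (hypothetical input format chosen for this example).
    """
    ordered = sorted(hits, key=lambda hit: hit[0])
    parts = []
    previous_end = None
    for ref_start, ref_end, seq in ordered:
        if previous_end is not None and pad_gaps_with_n:
            # One missing reference residue corresponds to three nucleotides.
            gap_in_residues = max(0, ref_start - previous_end)
            parts.append("N" * (gap_in_residues * 3))
        parts.append(seq)
        previous_end = ref_end
    return "".join(parts)
```

For two hits covering reference residues 0 to 2 and 4 to 6, the two-residue gap becomes six Ns with padding enabled and disappears with padding disabled.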
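
The interaction between the two stop-codon options can be sketched as a simple filter. This is a rough illustration under stated assumptions, not the logic in `exonerate_hits.py`: the function names are hypothetical, the sequence is assumed to be in frame from its first base, and the standard stop codons (TAA, TAG, TGA) are assumed.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumption)


def _codons(seq):
    """Split an in-frame nucleotide sequence into complete codons."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3].upper() for i in range(0, usable, 3)]


def has_internal_stop(seq):
    """True if any codon other than the last one is a stop codon."""
    return any(codon in STOP_CODONS for codon in _codons(seq)[:-1])


def has_terminal_stop(seq):
    """True if the final complete codon is a stop codon."""
    codons = _codons(seq)
    return bool(codons) and codons[-1] in STOP_CODONS


def keep_hit(seq, skip_internal=False, skip_terminal=False):
    """Decide whether a hit is kept (hypothetical helper).

    skip_terminal only has an effect when skip_internal is also set,
    mirroring the behaviour described in the notes.
    """
    if not skip_internal:
        return True
    if has_internal_stop(seq):
        return False
    if skip_terminal and has_terminal_stop(seq):
        return False
    return True
```

With neither option set, every hit is kept, which corresponds to the v2.2.0 default described in the notes.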