Allows rerunning with blast and pfam results, re-executing only the steps needed to take those data into account.
The test runner now includes a small pfam and blast on-the-fly execution step as well.
Better support for different genetic codes as described here:
Also, UTR records are no longer reported for cases where the coding region starts at a partial codon.
-algorithm updates: an orf's frame score must be > 0 and the maximum among the first 3 reading frames (instead of all 6); the orf with the highest frame score is chosen, allowing for minimal overlap among selected predictions.
-option --single_best_only provides the single longest of the selected orfs per contig.
-long orfs unlikely to appear in random sequence are automatically selected as candidates; the minimum length for such long orfs is set dynamically according to GC content.
-orf score and blast or pfam info are propagated to gff3 output
-single best orf now selected by default. If more than the single best orf is wanted, use the --all_good_orfs parameter.
-start codon refinement is now done by default. To turn it off and get the original behavior of extending to the longest orf position, use parameter: --no_refine_starts
-cdhit has been removed and replaced by our own fast method for removing redundancies.
-selection of coding regions is strictly governed by Markov-based likelihood scores across reading frames. No auto-retention of long orfs by default, but can be activated by parameter: --retain_long_orfs_length
-rigorously tested and benchmarked for prediction accuracy
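
The frame-score rule in the algorithm update above (score > 0, best among the first 3 reading frames, highest-scoring orf chosen with minimal overlap) can be sketched roughly as follows. This is an illustrative sketch only, not the tool's actual code: the `Orf` class, the function names, and the `max_overlap` threshold of 10% are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Orf:
    """Hypothetical container for one candidate orf (not the tool's own type)."""
    contig: str
    start: int                      # 1-based, inclusive
    end: int                        # inclusive
    frame_scores: Sequence[float]   # six scores; index 0 is the orf's own frame


def passes_frame_test(orf: Orf) -> bool:
    """Score in the orf's own frame must be > 0 and the max of the first 3 frames."""
    own = orf.frame_scores[0]
    return own > 0 and own >= max(orf.frame_scores[:3])


def overlap_fraction(a: Orf, b: Orf) -> float:
    """Shared length as a fraction of the shorter orf (0.0 if on different contigs)."""
    if a.contig != b.contig:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start) + 1
    if shared <= 0:
        return 0.0
    shorter = min(a.end - a.start + 1, b.end - b.start + 1)
    return shared / shorter


def select_orfs(orfs: List[Orf], max_overlap: float = 0.1) -> List[Orf]:
    """Greedy selection: best frame score first, skipping orfs that overlap
    an already selected orf by more than max_overlap (an assumed threshold)."""
    candidates = [o for o in orfs if passes_frame_test(o)]
    candidates.sort(key=lambda o: o.frame_scores[0], reverse=True)
    selected: List[Orf] = []
    for orf in candidates:
        if all(overlap_fraction(orf, kept) <= max_overlap for kept in selected):
            selected.append(orf)
    return selected
```

Only the first three scores are compared, so a high score in one of the remaining three frames does not disqualify an orf in this sketch.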
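
To illustrate why a minimum long-orf length would depend on GC content, here is one simple way such a threshold could be derived. It is not taken from the tool itself. It assumes independent bases with P(G) = P(C) and P(A) = P(T), the standard stop codons TAA, TAG and TGA, and a hypothetical significance level `alpha`; the function names are also invented for this example.

```python
import math


def stop_codon_probability(gc: float) -> float:
    """Chance that a random codon is TAA, TAG or TGA at the given GC fraction."""
    at = (1.0 - gc) / 2.0   # P(A) = P(T)
    g = gc / 2.0            # P(G) = P(C)
    # TAA contributes at^3; TAG and TGA each contribute at^2 * g
    return at ** 3 + 2.0 * at ** 2 * g


def min_long_orf_codons(gc: float, alpha: float = 1e-3) -> int:
    """Smallest codon count n with (1 - p_stop)^n <= alpha, i.e. an open
    stretch this long is unlikely to occur in random sequence."""
    if not 0.0 <= gc < 1.0:
        raise ValueError("gc must be in [0, 1)")
    p_stop = stop_codon_probability(gc)
    return math.ceil(math.log(alpha) / math.log(1.0 - p_stop))
```

Because all three stop codons are AT-rich, they become rarer as GC content rises, so under these assumptions the minimum length grows with GC content.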