Improves %id filtering, and adds `easel filter` miniapp. · EddyRivasLab/easel@bd21cdd

Commit

Improves %id filtering, and adds easel filter miniapp.

Tom suggested improving %id filtering so that it doesn't select a
fragment over a full-length sequence. The previous rule was to keep
the lower-indexed sequence (the one earlier in the file); I now call
this the "origorder" preference rule.
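
To illustrate why origorder can keep a fragment over a full-length sequence, here is a minimal standalone sketch, not the Easel code. The names pair_identity() and filter_origorder() are hypothetical. It assumes identity is the number of identical aligned residue pairs divided by the shorter sequence's residue count, and that a sequence is dropped when its identity to an already-kept sequence is >= maxid; both are assumptions of this sketch.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: any alphabetic character counts as a residue. */
static int is_residue(char c) { return isalpha((unsigned char) c); }

/* Fractional identity of two aligned rows of equal length:
 * identical residue pairs / MIN(residues in a, residues in b).
 * This denominator is an assumption of the sketch.
 */
double pair_identity(const char *a, const char *b)
{
  int idents = 0, len1 = 0, len2 = 0;

  for (size_t i = 0; a[i] != '\0' && b[i] != '\0'; i++) {
    if (is_residue(a[i])) len1++;
    if (is_residue(b[i])) len2++;
    if (is_residue(a[i]) && is_residue(b[i]) &&
        toupper((unsigned char) a[i]) == toupper((unsigned char) b[i]))
      idents++;
  }
  int minlen = (len1 < len2) ? len1 : len2;
  return (minlen == 0) ? 0.0 : (double) idents / (double) minlen;
}

/* Greedy filter with the "origorder" preference: walk the rows in file
 * order and keep row i unless it is >= maxid identical to a row that
 * was already kept. Sets keep[i] to 1 or 0; returns the number kept.
 */
int filter_origorder(const char **aseq, int nseq, double maxid, int *keep)
{
  int nkept = 0;

  for (int i = 0; i < nseq; i++) {
    keep[i] = 1;
    for (int j = 0; j < i; j++)
      if (keep[j] && pair_identity(aseq[i], aseq[j]) >= maxid) {
        keep[i] = 0;   /* an earlier, already-kept row wins */
        break;
      }
    nkept += keep[i];
  }
  return nkept;
}
```

With the rows "--ACD-----", "MKACDEFGHI", "MKTWYEFGHI" and maxid 0.8, the three-residue fragment comes first, matches the second row at 3/3 identity, and so displaces the full-length sequence.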

Reimplemented esl_msaweight_IDFilter(), adding
esl_msaweight_IDFilter_adv() and esl_msaweight_IDFilter_txt(), along
the same lines as the revised PB weights. The _adv() version, for
digital mode alignments, implements a "conscover" preference rule:
within the "span" of an aseq (from its first to its last residue), how
many consensus columns does it cover? The intent of the rule is to
favor full-length sequences without introducing much bias into
insertion/deletion statistics. _adv() can optionally be configured to
use the "origorder" preference rule or random preference instead.
_txt(), for text mode alignments, stays with the original
implementation and the origorder rule.
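
The conscover score described above can be sketched in a few lines of standalone C. This is an illustration under stated assumptions, not the code in esl_msaweight: the function name conscover() is hypothetical, the alignment row is a text string rather than a digital sequence, and consensus columns are given as a mask string of the same length with 'x' marking a consensus column.

```c
#include <ctype.h>
#include <stddef.h>

/* Count the consensus columns that lie within the span of one aligned
 * row, where the span runs from the row's first residue to its last.
 * Gaps inside the span still count as covered, so internal deletions
 * are not penalized; only missing ends lower the score.
 *
 * Assumes consmask has the same length as aseq (an assumption of this
 * sketch), with 'x' marking a consensus column.
 */
int conscover(const char *aseq, const char *consmask)
{
  long first = -1, last = -1;

  /* Locate the first and last residue columns of this row. */
  for (long i = 0; aseq[i] != '\0'; i++)
    if (isalpha((unsigned char) aseq[i])) {
      if (first < 0) first = i;
      last = i;
    }
  if (first < 0) return 0;   /* no residues at all: covers nothing */

  /* Count consensus columns from first to last, inclusive. */
  int ncover = 0;
  for (long i = first; i <= last; i++)
    if (consmask[i] == 'x') ncover++;
  return ncover;
}
```

When two sequences exceed the identity threshold, a filter using this rule would prefer the one with the higher score. With the mask "xx.xxx.xxx", the fragment "--ACD-----" covers 2 consensus columns and the full-length "MKACDEFGHI" covers 8, so the full-length sequence wins.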

Wrote the `easel filter` miniapp in cmd_filter.c, with man page
documentation in cmd_filter.md -- in Markdown instead of nroff, which
seems like a good long-term switch to start making.

Reorganized unit tests in esl_msaweight, and added utest_idfilter().

cryptogenomicon committed Apr 2, 2019

1 parent f952523 commit bd21cdd
