Order of operations for applying filters. #161

@rbdavid

Major question/concern: the order in which the filters are applied affects the results. Currently, the filters are applied in the order Fraction, Fragment, then Taxonomy. Which sequences pass the Fraction filter is essentially arbitrary and has nothing to do with the biology; that filter blindly removes sequences based on a modulo check. The sequences that remain may still fail the other filters. This has the potential to really hamstring an analysis.

Applying the Fraction filter last would let the other filters first remove the biologically uninteresting sequences, leaving a set of sequences that we know the user is interested in. The Fraction filter can then subsample that set in its arbitrary, biology-agnostic fashion.

However, changing the order of the filtering operations may return different results from those of the EFI v1 code base.

Originally posted by @rbdavid in #151 (comment)

I would put this as a low-priority question to ask John/Remi later this summer. It can be added as a low-priority issue if you want.

Originally posted by @nilsoberg in #151 (comment)
