Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Major changes to filter_reads #141

Merged
merged 3 commits into from
Jan 30, 2019
Merged

Major changes to filter_reads #141

merged 3 commits into from
Jan 30, 2019

Conversation

polyatail
Copy link
Contributor

This pull request addresses #79 by:

  • Ignoring taxonomic assignments not passing filter
  • Letting user override this functionality by passing --include-lowconf

In addition, this PR might break existing code by:

  • Producing output that's different because of ignoring assignments not passing filter
  • Renaming filter_reads to subset_reads

@boydgreenfield boydgreenfield added the in progress not yet ready for review label Dec 4, 2018
@polyatail polyatail added done ready to be merged and removed in progress not yet ready for review labels Dec 6, 2018
@polyatail polyatail added in progress not yet ready for review and removed done ready to be merged labels Jan 8, 2019
@polyatail polyatail self-assigned this Jan 8, 2019
@coveralls
Copy link

coveralls commented Jan 24, 2019

Coverage Status

Coverage increased (+0.3%) to 84.174% when pulling 239760e on roo-filter into 00bf2e2 on master.

@polyatail polyatail force-pushed the roo-filter branch 4 times, most recently from ec37f0b to ab73b67 Compare January 24, 2019 21:11
* Renamed to subset_reads which more closely reflects its actual function
* Now ignores taxonomic assignments not passing filter
* Added --include-lowconf which allows users to include taxonomic assignments not passing filter
* Modified test TSV to make some taxonomic assignments fail filter
* Recalculated test files and hashes manually
* Switch from FASTXIterator to skbio.io (slow)
* Passing --no-validate only verifies reads begin with '@'
* Restructure iterators to reduce statements in innermost loop
* Add support for bzip'd FASTX files
* Catch mismatch of TSV and FASTX file lengths
@polyatail polyatail added needs review need code review before merging and removed in progress not yet ready for review labels Jan 24, 2019
@polyatail polyatail added this to the Sprint Lava milestone Jan 24, 2019
@polyatail polyatail added needs changes and removed needs review need code review before merging labels Jan 29, 2019
Copy link
Contributor

@boydgreenfield boydgreenfield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's alias to filter_reads too with a deprecation note. We've sent that command to users.

We should also update our canned Intercom response using this.

(False, False, True, False), # --with-children
(False, False, True, False), # --exclude-reads
(False, False, True, True), # --with-children --exclude-reads
(False, False, False, False, False),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is insane, but OK 😱

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just well-tested.

@click.option('--split-pairs', default=False, is_flag=True,
help='By default, if either read in a pair matches, both will match. Choose this '
'option to consider each paired-end read separately. Resulting files may *not* '
'have the same number of reads!')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's call it --subset-pairs-independently and have the following help message:

By default, if either read in a pair matches, both will be retained in the subset file. With this option, R1 and R2 files will be evaluated independently. Note that the subset output FASTQs are *not* guaranteed to have the same number of reads!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed as requested.

@polyatail polyatail added done ready to be merged and removed needs changes labels Jan 30, 2019
@polyatail polyatail merged commit 0ff6ccf into master Jan 30, 2019
@polyatail polyatail deleted the roo-filter branch February 5, 2019 21:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
done ready to be merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants