Major changes to filter_reads #141

polyatail · 2018-12-04T17:47:43Z

This pull request addresses #79 by:

* Ignoring taxonomic assignments that do not pass the filter
* Letting users override this behavior by passing --include-lowconf

In addition, this PR might break existing code by:

* Producing different output, because assignments that do not pass the filter are now ignored
* Renaming filter_reads to subset_reads
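
To illustrate the behavior change described above, here is a minimal sketch, not the code in this PR. The column names `tax_id` and `passed_filter`, the `T`/`F` encoding, and the function name `matching_read_indices` are assumptions made for the example; the real per-read TSV layout may differ.

```python
import csv
import io


def matching_read_indices(tsv_text, tax_ids, include_lowconf=False):
    """Return 0-based indices of reads assigned to one of tax_ids.

    Hypothetical columns: 'tax_id' and 'passed_filter' ('T' or 'F').
    """
    wanted = {str(t) for t in tax_ids}
    indices = []
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for idx, row in enumerate(reader):
        passed = row["passed_filter"] == "T"
        # Default: assignments that fail the filter are ignored entirely.
        if not passed and not include_lowconf:
            continue
        if row["tax_id"] in wanted:
            indices.append(idx)
    return indices


# Three reads: the second is assigned to 562 but fails the filter.
EXAMPLE_TSV = "tax_id\tpassed_filter\n562\tT\n562\tF\n1280\tT\n"
```

With the example data, the default call selects only the first read, while `include_lowconf=True` also selects the second. This is why output from existing pipelines can change after upgrading.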

coveralls · 2019-01-24T03:15:48Z

Coverage increased (+0.3%) to 84.174% when pulling 239760e on roo-filter into 00bf2e2 on master.

* Renamed to subset_reads which more closely reflects its actual function
* Now ignores taxonomic assignments not passing filter
* Added --include-lowconf which allows users to include taxonomic assignments not passing filter
* Modified test TSV to make some taxonomic assignments fail filter
* Recalculated test files and hashes manually

* Switch from FASTXIterator to skbio.io (slow)
* Passing --no-validate only verifies reads begin with '@'
* Restructure iterators to reduce statements in innermost loop
* Add support for bzip'd FASTX files
* Catch mismatch of TSV and FASTX file lengths
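
The last item in the commit message above, catching a length mismatch between the TSV and FASTX inputs, can be sketched roughly as follows. This is an illustration of one common approach using `itertools.zip_longest`, not the implementation in this PR; the function name `paired_records` and the error message are hypothetical.

```python
from itertools import zip_longest

# Sentinel that cannot collide with a real row or record.
_MISSING = object()


def paired_records(tsv_rows, fastx_records):
    """Yield (row, record) pairs, raising if one input ends before the other."""
    for row, record in zip_longest(tsv_rows, fastx_records, fillvalue=_MISSING):
        if row is _MISSING or record is _MISSING:
            raise ValueError(
                "TSV and FASTX files have different numbers of records"
            )
        yield row, record
```

Plain `zip` would silently stop at the shorter input, so a truncated file would go unnoticed; padding with a sentinel makes the mismatch detectable at the point where it occurs.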

boydgreenfield

Let's alias to filter_reads too with a deprecation note. We've sent that command to users.

We should also update our canned Intercom response using this.
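
A minimal sketch of the kind of alias being asked for here, using only the standard library. The function bodies are stand-ins: the real commands are click commands with their own options, and the warning text is an assumption.

```python
import warnings


def subset_reads(*args, **kwargs):
    """Stand-in for the renamed command; the real one takes CLI options."""
    return ("subset_reads", args, kwargs)


def filter_reads(*args, **kwargs):
    """Deprecated alias so previously shared commands keep working."""
    warnings.warn(
        "filter_reads is deprecated; use subset_reads instead",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not this wrapper
    )
    return subset_reads(*args, **kwargs)
```

Note that Python hides `DeprecationWarning` by default outside of `__main__` and test runners, so a CLI alias may prefer to print the notice to stderr instead so that users actually see it.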

boydgreenfield · 2019-01-29T18:32:10Z

tests/test_scripts.py

-    (False, False, True, False),  # --with-children
-    (False, False, True, False),  # --exclude-reads
-    (False, False, True, True),   # --with-children --exclude-reads
+    (False, False, False, False, False),


This is insane, but OK 😱

It's just well-tested.

boydgreenfield · 2019-01-29T18:44:47Z

onecodex/scripts/subset_reads.py

+@click.option('--split-pairs', default=False, is_flag=True,
+              help='By default, if either read in a pair matches, both will match. Choose this '
+                   'option to consider each paired-end read separately. Resulting files may *not* '
+                   'have the same number of reads!')


Let's call it --subset-pairs-independently and have the following help message:

By default, if either read in a pair matches, both will be retained in the subset file. With this option, R1 and R2 files will be evaluated independently. Note that the subset output FASTQs are *not* guaranteed to have the same number of reads!

Changed as requested.
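
The difference between the default behavior and the independent mode discussed in this thread can be sketched as below. This is an illustration under assumptions, not the PR's code: `reads_to_keep` and its boolean-list inputs (one entry per read, true if the read matches) are hypothetical.

```python
def reads_to_keep(r1_matches, r2_matches, independently=False):
    """Return (keep_r1, keep_r2) boolean lists for paired-end reads."""
    if len(r1_matches) != len(r2_matches):
        raise ValueError("R1 and R2 must have the same number of reads")
    if independently:
        # Each file is evaluated on its own; output sizes may differ.
        return list(r1_matches), list(r2_matches)
    # Default: if either mate matches, both mates are retained.
    either = [a or b for a, b in zip(r1_matches, r2_matches)]
    return either, list(either)
```

In the default mode both returned lists are identical, so the R1 and R2 outputs stay in sync. In independent mode they can contain different numbers of true entries, which is the caveat the help message warns about.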

polyatail requested a review from boydgreenfield December 4, 2018 17:47

boydgreenfield added the in progress not yet ready for review label Dec 4, 2018

polyatail added done ready to be merged and removed in progress not yet ready for review labels Dec 6, 2018

polyatail added in progress not yet ready for review and removed done ready to be merged labels Jan 8, 2019

polyatail self-assigned this Jan 8, 2019

boydgreenfield force-pushed the master branch from 2455ddc to c922835 Compare January 15, 2019 01:18

polyatail force-pushed the roo-filter branch from 9a208a7 to d78623a Compare January 24, 2019 03:12

polyatail force-pushed the roo-filter branch 4 times, most recently from ec37f0b to ab73b67 Compare January 24, 2019 21:11

polyatail added 2 commits January 24, 2019 13:26

polyatail force-pushed the roo-filter branch from ab73b67 to 97cab20 Compare January 24, 2019 21:27

polyatail added needs review need code review before merging and removed in progress not yet ready for review labels Jan 24, 2019

polyatail added this to the Sprint Lava milestone Jan 24, 2019

polyatail added needs changes and removed needs review need code review before merging labels Jan 29, 2019

boydgreenfield requested changes Jan 29, 2019

boydgreenfield reviewed Jan 29, 2019

Rename --split-pairs, issue deprecation warning, add filter_reads alias

239760e

polyatail added done ready to be merged and removed needs changes labels Jan 30, 2019

polyatail merged commit 0ff6ccf into master Jan 30, 2019

polyatail deleted the roo-filter branch February 5, 2019 21:49
