Add on the fly demux and speed up barcode matching. #1649
Conversation
Hmm, I can't seem to get these tests to fail locally. @gbggrant would you mind testing as well?
@gbggrant I think it makes the most sense to concentrate on the other PRs right now. Once they are merged, we can come back to this one after a rebase, since it will be changed significantly by the others.
@gbggrant This one is a little bit more substantive, but it is ready for review.
Looks good overall, a few minor comments
Looks good to me. A couple of stylistic notes and a question about a comment.
Even without on-the-fly demux, this saves me a full day when demultiplexing a single NovaSeq lane. Would love to be able to use an official release with this feature!
A couple of nits.
@jacarey is this ready to go?
@gbggrant Just updated to latest. Once tests are green, it is good to go.
Description
The purpose of this PR is twofold:
`IlluminaBasecallsToSam` and `IlluminaBasecallsToFastq`
Sample input parameters are parsed in a central place now which can parse any one of the following types of files:
A single barcode extractor is used for all 3 programs to do demultiplexing and metrics collection/output.
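As a rough illustration of the kind of matching a shared barcode extractor performs, here is a minimal sketch of Hamming-distance barcode assignment. It is not the code from this PR: the class name `BarcodeMatchSketch`, the method names, and the `maxMismatches` / `minDelta` parameters are hypothetical and chosen only for the example.

```java
import java.util.List;

// Hypothetical sketch, NOT Picard's implementation: assigns a read barcode to
// the closest expected barcode by Hamming distance.
public class BarcodeMatchSketch {

    /** Counts the positions at which two equal-length barcodes differ. */
    static int hamming(String a, String b) {
        if (a.length() != b.length()) {
            throw new IllegalArgumentException("Barcodes must have equal length");
        }
        int distance = 0;
        for (int i = 0; i < a.length(); i++) {
            if (a.charAt(i) != b.charAt(i)) {
                distance++;
            }
        }
        return distance;
    }

    /**
     * Returns the index of the best-matching barcode, or -1 when the best match
     * has more than maxMismatches mismatches or is not at least minDelta
     * mismatches better than the runner-up (both thresholds are assumptions).
     */
    static int bestMatch(String read, List<String> barcodes, int maxMismatches, int minDelta) {
        int best = -1;
        int bestDist = Integer.MAX_VALUE;
        int secondDist = Integer.MAX_VALUE;
        for (int i = 0; i < barcodes.size(); i++) {
            int d = hamming(read, barcodes.get(i));
            if (d < bestDist) {
                secondDist = bestDist;
                bestDist = d;
                best = i;
            } else if (d < secondDist) {
                secondDist = d;
            }
        }
        boolean accepted = best >= 0
                && bestDist <= maxMismatches
                && secondDist - bestDist >= minDelta;
        return accepted ? best : -1;
    }

    public static void main(String[] args) {
        List<String> barcodes = List.of("ACGTACGT", "TTGCAAGC");
        // One mismatch against the first barcode, so it is assigned to index 0.
        System.out.println(bestMatch("ACGTACGA", barcodes, 1, 1));
    }
}
```

Sharing one routine like this across programs means the demultiplexing decision and the per-barcode metrics come from the same comparison, so they cannot drift apart.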
Test data was changed for the following reason:
The old test data was using static barcode files that were created before `HAMMING` was introduced. To minimize the number of test data files needed, the barcode files were regenerated using `HAMMING`, and the test data was updated to reflect this change.
Checklist (never delete this)
Never delete this, it is our record that procedure was followed. If you find that for whatever reason one of the checklist points doesn't apply to your PR, you can leave it unchecked but please add an explanation below.
Content
Review
For more detailed guidelines, see https://github.com/broadinstitute/picard/wiki/Guidelines-for-pull-requests