Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add on the fly demux and speed up barcode matching. #1649

Conversation

jacarey
Copy link
Collaborator

@jacarey jacarey commented Mar 2, 2021

Description

The purpose of this PR is twofold:

  1. Speed up the barcode matching code.
  2. Add inline demux to IlluminaBasecallsToSam and IlluminaBasecallsToFastq

Sample input parameters are parsed in a central place now which can parse any one of the following types of files:

  •   ExtractIlluminaBarcodes      BARCODE_FILE
    
  •   IlluminaBasecallsToFastq     MULTIPLEX_PARAMS
    
  •   IlluminaBasecallsToSam       LIBRARY_PARAMS
    

A single barcode extractor is used for all 3 programs to do demultiplexing and metrics collection/output.

Test data was changed for the following reason:

The old test data was using static barcode files that were created before HAMMING was introduced. In order to minimize the number of test data files needed the barcode files were regenerated using HAMMING and the test data was updated to reflect this change.


Checklist (never delete this)

Never delete this, it is our record that procedure was followed. If you find that for whatever reason one of the checklist points doesn't apply to your PR, you can leave it unchecked but please add an explanation below.

Content

  • Added or modified tests to cover changes and any new functionality
  • Edited the README / documentation (if applicable)
  • All tests passing on Travis

Review

  • Final thumbs-up from reviewer
  • Rebase, squash and reword as applicable

For more detailed guidelines, see https://github.com/broadinstitute/picard/wiki/Guidelines-for-pull-requests

@jacarey
Copy link
Collaborator Author

jacarey commented Mar 3, 2021

Hmm I can't seem to get these tests to fail locally. @gbggrant would you mind testing as well?

@jacarey
Copy link
Collaborator Author

jacarey commented Mar 4, 2021

@gbggrant I think it makes the most sense to concentrate on the other PRs right now and then once they are merged come back to this one after a rebase since it will be changed significantly by the others.

@jacarey jacarey force-pushed the tf_speedup_extract_illumina_barcodes branch 2 times, most recently from 995a98a to 106ae5a Compare April 2, 2021 19:20
@jacarey
Copy link
Collaborator Author

jacarey commented Apr 5, 2021

@gbggrant This one is a little bit more substantive but it is ready for review.

Copy link
Member

@jessicaway jessicaway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall, a few minor comments

src/main/java/picard/illumina/BarcodeExtractor.java Outdated Show resolved Hide resolved
src/main/java/picard/illumina/BarcodeMetric.java Outdated Show resolved Hide resolved
src/main/java/picard/illumina/BarcodeMetric.java Outdated Show resolved Hide resolved
src/main/java/picard/illumina/BarcodeMetric.java Outdated Show resolved Hide resolved
src/main/java/picard/illumina/BarcodeMetric.java Outdated Show resolved Hide resolved
Copy link
Contributor

@gbggrant gbggrant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. A couple of stylistic notes and a question about a comment.

src/main/java/picard/illumina/BarcodeExtractor.java Outdated Show resolved Hide resolved
src/main/java/picard/illumina/BarcodeExtractor.java Outdated Show resolved Hide resolved
src/main/java/picard/illumina/BarcodeExtractor.java Outdated Show resolved Hide resolved
src/main/java/picard/illumina/BarcodeMetric.java Outdated Show resolved Hide resolved
src/main/java/picard/illumina/BarcodeMetric.java Outdated Show resolved Hide resolved
src/main/java/picard/illumina/ExtractIlluminaBarcodes.java Outdated Show resolved Hide resolved
src/main/java/picard/illumina/SampleInputParameters.java Outdated Show resolved Hide resolved
@gbggrant
Copy link
Contributor

Hi @jacarey we've reviewed this one. Just checking in to see if this is the highest priority PR of yours (I see you have another (#1646) but it's still in draft.

@jacarey
Copy link
Collaborator Author

jacarey commented May 27, 2021

Hi @jacarey we've reviewed this one. Just checking in to see if this is the highest priority PR of yours (I see you have another (#1646) but it's still in draft.

@gbggrant I'm hoping to get to this early next week. I've moved 1646 from draft to PR. Thanks!

@jacarey jacarey force-pushed the tf_speedup_extract_illumina_barcodes branch from faf2a46 to b3e1053 Compare June 4, 2021 20:01
@mjhipp
Copy link
Contributor

mjhipp commented Jul 19, 2021

Even without on the fly demux, this saves me a full day when demultiplexing a single NovaSeq lane. Would love to be able to use an official release with this feature!

Copy link
Contributor

@gbggrant gbggrant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of nits.

src/main/java/picard/illumina/BarcodeExtractor.java Outdated Show resolved Hide resolved
src/main/java/picard/illumina/BarcodeExtractor.java Outdated Show resolved Hide resolved
src/main/java/picard/illumina/BarcodeExtractor.java Outdated Show resolved Hide resolved
src/main/java/picard/illumina/BasecallsConverter.java Outdated Show resolved Hide resolved
src/main/java/picard/illumina/ExtractBarcodesProgram.java Outdated Show resolved Hide resolved
src/main/java/picard/illumina/ExtractBarcodesProgram.java Outdated Show resolved Hide resolved
src/main/java/picard/illumina/ExtractBarcodesProgram.java Outdated Show resolved Hide resolved
@gbggrant
Copy link
Contributor

gbggrant commented Aug 4, 2021

@jacarey is this ready to go?

@jacarey
Copy link
Collaborator Author

jacarey commented Aug 4, 2021

@gbggrant Just updated to latest. Once tests are green it is good to go.

@gbggrant gbggrant merged commit 1577067 into broadinstitute:master Aug 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants