Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactoring #1868

Merged
merged 5 commits into from Jun 13, 2023
Merged

Refactoring #1868

merged 5 commits into from Jun 13, 2023

Conversation

LadDeep
Copy link
Contributor

@LadDeep LadDeep commented Apr 7, 2023

Description

Commit Id: f691e4b
Extracted file writing redundant code from buildWriter() in IlluminaBasecallsToFastq. Performed following refactoring: Extract method and Introduce variables to enhance readability and reduce redundancy.

Commit Id: 50e9439
Extracted complex conditional to find out whether the adapters are present or not by introducing hasEitherAdapter() and hasBothAdapter() methods. Performed following refactoring: Decompose Conditional to enhance readability and understandability.

Commit Id: 2816540
MarkDuplicates and MarkDuplicatesForFlowHelper has method implementation for generateDuplicateIndexes() signatured in their super class. They have similar implementation and hence extracted common code is moved to MarkDuplicates and used directly from there. Performed following refactoring: Move Method to enhance readability, understandability and reduce redundancy.

Commit Id: c147cbb
AbstractWsgMetricsCollector has multiple responsibilities for collecting metrics and histograms which were seperated by creating WsgHistogramCollector class. Performed following refactoring: Extract Class to enhance readability, understandability and resolving Multifaceted Abstraction smell. **Note: Test case needs to be modified for this commit


Checklist (never delete this)

Never delete this, it is our record that procedure was followed. If you find that for whatever reason one of the checklist points doesn't apply to your PR, you can leave it unchecked but please add an explanation below.

Content

  • Added or modified tests to cover changes and any new functionality
  • Edited the README / documentation (if applicable)
  • All tests passing on github actions

Review

  • Final thumbs-up from reviewer
  • Rebase, squash and reword as applicable

For more detailed guidelines, see https://github.com/broadinstitute/picard/wiki/Guidelines-for-pull-requests

@droazen droazen self-requested a review April 11, 2023 17:47
Copy link
Contributor

@droazen droazen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LadDeep The changes to the metrics collection code in this PR cause tests to fail. It looks like there are now two copies of the histogram arrays, one set in AbstractWgsMetricsCollector, and another set in the new WgsHistogramCollector class, and some code is using the AbstractWgsMetricsCollector arrays, while other code is using the WgsHistogramCollector arrays. If you fix this we can revisit this PR.

Was this an AI-generated patch, out of curiosity?

@LadDeep
Copy link
Contributor Author

LadDeep commented Apr 12, 2023

@droazen Yes, I'm aware that the tests are failing and I'm working on that. About the data duplicacy of the histogram, thanks for bringing it to light. They seemed to be tighly coupled and that led me to split the responsibility. I'll once again look through the issue.

No, these were done manually. I'm new to code refactoring and this is my first open source contribution.

Can you me help me in better understanding the project so that I can contribute effectively? Some guidance/roadmap would be highly appreciated!

@droazen
Copy link
Contributor

droazen commented Apr 12, 2023

@LadDeep Ah, I see -- well, the refactorings here are mostly clear improvements, it's just the changes to the metrics classes that are problematic. Many of the child classes of AbstractWgsMetricsCollector are still using the old histogram arrays in some places, hence the test failures. Since the metrics class hierarchy is fairly complex, you may wish to revert the metrics changes and just try to get the other refactorings merged.

If you wish to learn more about this project, the best resources are the tool documentation here (search for "Picard" or for the tool you're interested in) and the tutorials here. This tutorial, for example, should give a good sense of the kinds of "data pre-processing" tasks in genomics for which Picard is typically used. There are also recordings of workshop videos such as this one, which discusses genomic data pre-processing in general, and the well-known Picard tool MarkDuplicates in particular.

@LadDeep
Copy link
Contributor Author

LadDeep commented Apr 14, 2023

@droazen I'll try it out once to decouple both the classes considering all of the side effects that are occuring. Just wanted to know, if in case for merging without last commit, I should create another pull request after reverting it, right?

Thanks for the resources 🙌.

@droazen
Copy link
Contributor

droazen commented Apr 14, 2023

Hi @LadDeep, if you revert the metrics commit, it's fine to do so in this PR rather than opening a separate PR.

@LadDeep
Copy link
Contributor Author

LadDeep commented Apr 18, 2023

Alright @droazen. I'll let you know once I'm done.

@LadDeep
Copy link
Contributor Author

LadDeep commented Apr 18, 2023

@droazen I've reverted last commit. I'll work on AbstractWgsMetricsCollector again and create another PR.

@LadDeep LadDeep requested a review from droazen April 22, 2023 20:26
@cmnbroad
Copy link
Contributor

cmnbroad commented May 9, 2023

@droazen Pinging for review.

Copy link
Contributor

@droazen droazen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good now!

@droazen droazen merged commit 802089a into broadinstitute:master Jun 13, 2023
3 checks passed
@LadDeep
Copy link
Contributor Author

LadDeep commented Jun 15, 2023

Thanks @droazen🙌🏼

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants