New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HDDS-2476. Share more code between metadata and data scanners. #3727
HDDS-2476. Share more code between metadata and data scanners. #3727
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the refactor @Galsza. Just some minor comments but looks good overall. One thing I think we can clean up here as well is the scanner/scrubber mixed terminology in the classes and comments. IMO scanner is the term we should use. Scrubber terminology is already used on SCM for the pipeline scrubber, which deletes pipelines that have been allocated for too long. Since these tasks to do not delete anything I think scanner is a better name for them.
...vice/src/main/java/org/apache/hadoop/ozone/container/ozoneimpl/ContainerMetadataScanner.java
Show resolved
Hide resolved
...vice/src/main/java/org/apache/hadoop/ozone/container/ozoneimpl/ContainerMetadataScanner.java
Outdated
Show resolved
Hide resolved
...vice/src/main/java/org/apache/hadoop/ozone/container/ozoneimpl/ContainerMetadataScanner.java
Outdated
Show resolved
Hide resolved
...-service/src/main/java/org/apache/hadoop/ozone/container/ozoneimpl/ContainerDataScanner.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the updates @Galsza. LGTM but let's give Stephen a chance to review as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Galsza for working on this. LGTM overall. I only have a few minor comments.
public ContainerDataScannerMetrics getMetrics() { | ||
return this.metrics; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently metrics
is stored in both parent and subclasses. We could eliminate the former by making getMetrics()
abstract, and using it instead of metrics
variable. It would also allow getting rid of the cast in subclass constructors.
LOG.info("Completed an iteration of " + this.getClass().toString() + | ||
" in {} minutes." + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actual output:
datanode_1 | 2022-09-07 15:48:00,816 [ContainerDataScanner(/data/hdds/hdds)] INFO ozoneimpl.AbstractContainerScanner: Completed an iteration of class org.apache.hadoop.ozone.container.ozoneimpl.ContainerDataScanner in 0 minutes. Number of iterations (since the data-node restart) : 13, Number of containers scanned in this iteration : 0, Number of unhealthy containers found in this iteration : 0
datanode_1 | 2022-09-07 15:48:00,831 [ContainerMetadataScanner] INFO ozoneimpl.AbstractContainerScanner: Completed an iteration of class org.apache.hadoop.ozone.container.ozoneimpl.ContainerMetadataScanner in 0 minutes. Number of iterations (since the data-node restart) : 25, Number of containers scanned in this iteration : 1, Number of unhealthy containers found in this iteration : 0
The part class org.apache.hadoop.ozone.container.ozoneimpl.ContainerDataScanner
looks a bit odd in the message. I think instance name would be better than class name. And since it is already included due to thread name pattern, we might omit it.
LOG.info("Completed an iteration of " + this.getClass().toString() + | |
" in {} minutes." + | |
LOG.info("Completed an iteration in {} minutes" + |
@Metrics(about = "DataNode container data scrubber metrics", context = "dfs") | ||
public final class ContainerMetadataScrubberMetrics { | ||
@Metrics(about = "Datanode container scanner metrics", context = "dfs") | ||
public abstract class AbstractContainerScannerMetric { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: this is a collection of metrics, should be plural: AbstractContainerScannerMetrics
.
…ntainerDataScrubberMetrics
…t between ContainerDataScanner and ContainerMetadataScanner.
Rename scrubber to scanner.
Remove metric field from AbstractContainerScanner.
Remove warnings about raw use of parameterized class
327096a
to
67ab432
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Galsza for updating the patch.
Please try to avoid force-push when updating the PR, as it, among other disadvantages, makes it harder to check for changes since last review. Here are some great articles that explain in more detail:
https://developers.mattermost.com/blog/submitting-great-prs/#4-avoid-force-pushing
https://www.freecodecamp.org/news/optimize-pull-requests-for-reviewer-happiness#request-a-review
if (stopping) { | ||
return; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is already checked in run()
right before calling runIteration()
. Original code was also checking after the iteration, before logging and updating the metrics. I think the intention was to prevent incomplete iteration being logged as "Completed an iteration ...".
Therefore I think we should move it back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, pending CI.
What changes were proposed in this pull request?
Clean up container scanner code so it contains less duplication. No functional change.
What is the link to the Apache JIRA
HDDS-2476: Share more code between metadata and data scanners
How was this patch tested?
No functional changes, unit/integration tests are green.