Simplify idle check used for idle process metric in Compactor #4740
Working on a test for the compactor metric to make sure it goes back to busy when a compaction is running, then back to idle afterwards.

In 04ed08b I added a test case that checks that the compactor idle metric tracks what we expect before, during, and after a compaction. I am able to get the test to pass using the old logic in the compactor, but with the new logic the idle value is not returning to 0 as I expect. I am not sure why that is happening at the moment, so I was hoping someone might be able to spot a flaw in the new logic that sets the idle metric value.
keith-turner
left a comment
> In 04ed08b I added a test case that checks that the compactor idle metric correctly tracks what we expect it to be before, during and after a compaction. I am able to get the test to pass using the old logic in compactor but with the new logic, the idle value is not returning to 0 as I expect. I am not sure why that is happening at the moment so was hoping someone might be able to spot a flaw in the new logic that sets the idle metric value.
Made some comments on what I think might fix things. Looking at the way the code works, I am thinking the following method in `AbstractServer`

`protected void idleProcessCheck(Supplier<Boolean> idleCondition) {`

should be changed to

`protected void updateIdleStatus(boolean isIdle) {`

The supplier that is currently passed in is not cached and is only called once. It seems like the method needs something to call it periodically, because the idle status is only updated when it is called. If it is never called again then AFAICT the status will never change.
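To illustrate the suggestion above, here is a minimal sketch of what a periodically called `updateIdleStatus(boolean)` could look like. It is not the actual `AbstractServer` code: the `IdleTracker` class, its fields, the injectable clock, and the convention that the metric is 1 when idle and 0 when busy are all assumptions made for illustration.

```java
import java.util.function.LongSupplier;

/** Hypothetical stand-in for the idle tracking discussed above; not Accumulo's AbstractServer. */
public class IdleTracker {
  private final long idleReportingPeriodNanos;
  // Injectable clock (assumption) so the behavior can be checked deterministically.
  private final LongSupplier clock;
  private long idleStartNanos;
  private boolean wasIdle = false;
  // Assumed convention: 1 = idle for at least the reporting period, 0 = busy.
  private volatile int idleMetric = 0;

  public IdleTracker(long idleReportingPeriodNanos, LongSupplier clock) {
    this.idleReportingPeriodNanos = idleReportingPeriodNanos;
    this.clock = clock;
  }

  /** Intended to be called on every pass of the server's main loop with the current state. */
  public void updateIdleStatus(boolean isIdle) {
    long now = clock.getAsLong();
    if (!isIdle) {
      // Any busy report resets the idle window and the metric immediately.
      wasIdle = false;
      idleMetric = 0;
      return;
    }
    if (!wasIdle) {
      // First idle report after being busy: start timing the idle window.
      wasIdle = true;
      idleStartNanos = now;
    }
    // Compare the difference of the two readings, never the raw values.
    if (now - idleStartNanos >= idleReportingPeriodNanos) {
      idleMetric = 1;
    }
  }

  public int getIdleMetric() {
    return idleMetric;
  }
}
```

Because the caller passes the current state on each pass, the metric can flip back to 0 as soon as work starts, which is the transition the new test is looking for.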
server/compactor/src/main/java/org/apache/accumulo/compactor/Compactor.java
I added another test case to check that the scan server idle metric behaves as expected (very similar to the compactor test case).

Since the scope of this PR has changed slightly since it was created, here is an updated description of changes in this PR:
Edit: The most critical portion to review here is probably the changes to the Compactor idle check. Just making sure that it makes sense to mark it as idle or not where I have it marked.
dlmarion
left a comment
LGTM
This PR simplifies the check the Compactor uses to decide when to change its idle process metric. The new code just returns true when no compaction is running and false when one is. It defers the time-checking portion to the upstream logic in `AbstractServer`.

The old method was to check how much time has passed since the last compaction was completed. There are at least two issues with this:
- It compares the elapsed time against `idleReportingPeriodNanos`. This check also happens upstream when the `Supplier<Boolean>` is read in `AbstractServer`, so we are getting rid of that duplicate check in the new code.
- It checks that the `System.nanoTime()` value of the last completed compaction is not negative. This is not a safe check to make since `System.nanoTime()` may provide a negative value. (This bug was identified here)
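As a small illustration of the second issue, the sketch below shows an elapsed-time check that does not depend on the sign of a `System.nanoTime()` reading. The class and method names are hypothetical and are not taken from the Compactor code; only `System.nanoTime()` itself is a real API.

```java
/** Hypothetical helper; illustrates comparing nanoTime readings by difference only. */
public class NanoTimeElapsed {

  /**
   * Returns true if at least periodNanos passed between two System.nanoTime() readings.
   * The subtraction stays correct when either reading is negative, and also when the
   * counter wraps around, as long as the true elapsed time is below about 292 years.
   */
  public static boolean hasElapsed(long startNanos, long nowNanos, long periodNanos) {
    return nowNanos - startNanos >= periodNanos;
  }

  public static void main(String[] args) throws InterruptedException {
    long start = System.nanoTime(); // origin is arbitrary, so this may be negative
    Thread.sleep(5);
    long now = System.nanoTime();
    // Unsafe: "start > 0" says nothing about whether a reading was ever taken.
    // Safe: only the difference between two readings is meaningful.
    System.out.println(hasElapsed(start, now, 1_000_000L));
  }
}
```

If a "no compaction has completed yet" state is needed, a separate boolean flag or an `Optional` is one way to represent it, rather than a sentinel value of the timestamp.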