Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ZOOKEEPER-3243: Add server-side request throttling #986

Closed
wants to merge 9 commits into from

Conversation

jhuan31
Copy link

@jhuan31 jhuan31 commented Jun 12, 2019

No description provided.

@jhuan31
Copy link
Author

jhuan31 commented Jun 13, 2019

close for now. will refactor and have a new PR

@jhuan31 jhuan31 closed this Jun 13, 2019
@jhuan31 jhuan31 reopened this Jun 14, 2019
@asfgit
Copy link

asfgit commented Jun 14, 2019

FAILURE

--none--

@eolivelli
Copy link
Contributor

That 'FAILURE' maybe related to the fact that I have disable ant job for master branch

@jhuan31
Copy link
Author

jhuan31 commented Jun 14, 2019

That 'FAILURE' maybe related to the fact that I have disable ant job for master branch

It actually ran

@jhuan31
Copy link
Author

jhuan31 commented Jun 15, 2019

retest ant build

1 similar comment
@jhuan31
Copy link
Author

jhuan31 commented Jun 15, 2019

retest ant build

Copy link
Contributor

@hanm hanm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very much needed feature for running ZK at scale, thanks for putting this up. Did first pass review, left some comments.

}
}

if (killed) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could be elevated to the top of the first while block, so we can break early and start draining the queue sooner. consider one case - we have a big queue of submitted requests and most of them must be dropped. If we leave the check here, we'll not break the loop until processing the last request (the request of death.).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check for killed is also in the inner while loop for each request in the queue.
So it won't delay the shutdown.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assume while we are shutting down (kill is set to true), we have a large queue of requests, where each request is stale and must be dropped. With current code, the outer most while (true) { loop will try loop through every request in the queue, because L145 if (request.mustDrop()) { is evaluated as true for every request (which is stale). In this case, the check for killed in the inner while loop will not get executed. This was the reason I was suggesting to move the if (killed) { check before the if (request.mustDrop()) {.

Does this example make sense? Albeit this example somewhat hypothetical, having if (killed) { check earlier sounds no harm either.

Copy link
Author

@jhuan31 jhuan31 Jul 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, get it! Thank you for your detailed explanation! Fix on the way... Actually I decide to add a check at the beginning of the outer loop instead of moving the existing check. If we remove the check after the inner while, then the throttler will submit one last request when killed is set. What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jhuan31 agree, add instead of moving is better.

@jhuan31
Copy link
Author

jhuan31 commented Jul 1, 2019

retest maven build

Copy link
Contributor

@hanm hanm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks great, thanks for update patch @jhuan31

There are two remaining issues I'd like to further discuss, please take a look.

}
}

if (killed) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assume while we are shutting down (kill is set to true), we have a large queue of requests, where each request is stale and must be dropped. With current code, the outer most while (true) { loop will try loop through every request in the queue, because L145 if (request.mustDrop()) { is evaluated as true for every request (which is stale). In this case, the check for killed in the inner while loop will not get executed. This was the reason I was suggesting to move the if (killed) { check before the if (request.mustDrop()) {.

Does this example make sense? Albeit this example somewhat hypothetical, having if (killed) { check earlier sounds no harm either.

Copy link
Contributor

@hanm hanm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I only one remaining question (none blocking though). lgtm +1.

@hanm
Copy link
Contributor

hanm commented Jul 8, 2019

retest this please

Copy link
Contributor

@hanm hanm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update @jhuan31. The patch looks great to me.

Just one final thing to resolve before I can merge this: there is one unit test testAddSessionAfterSessionExpiry consistently failing on CI and locally. Would you please check this out and fix it? The test log output for reference:

2019-07-08 17:30:51,517 [myid:] - INFO [Time-limited test:JUnit4ZKTestRunner$LoggedInvokeMethod@99] - TEST METHOD FAILED testAddSessionAfterSessionExpiry java.lang.AssertionError: Duplicate session expiry request has been generated expected:<1> but was:<0> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:834) at org.junit.Assert.assertEquals(Assert.java:645) at org.apache.zookeeper.server.SessionTrackerTest.testAddSessionAfterSessionExpiry(SessionTrackerTest.java:83) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:80) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.lang.Thread.run(Thread.java:748) 2019-07-08 17:30:51,521 [myid:] - INFO [main:ZKTestCase$1@75] - FAILED testAddSessionAfterSessionExpiry java.lang.AssertionError: Duplicate session expiry request has been generated expected:<1> but was:<0> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:834) at org.junit.Assert.assertEquals(Assert.java:645) at org.apache.zookeeper.server.SessionTrackerTest.testAddSessionAfterSessionExpiry(SessionTrackerTest.java:83) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:80) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.lang.Thread.run(Thread.java:748) 2019-07-08 17:30:51,522 [myid:] - INFO [main:ZKTestCase$1@65] - FINISHED testAddSessionAfterSessionExpiry

@jhuan31
Copy link
Author

jhuan31 commented Jul 8, 2019

The broken unit test has been fixed :) It was broken because we removed that enable-request-throttle flag so need to update the test setup.

@asfgit asfgit closed this in 7b3de52 Jul 8, 2019
@hanm
Copy link
Contributor

hanm commented Jul 8, 2019

awesome, thanks for the fix @jhuan31. merged to master after all check is green.

stickyhipp pushed a commit to stickyhipp/zookeeper that referenced this pull request Aug 2, 2019
Author: Jie Huang <jiehuang@fb.com>
Author: Joseph Blomstedt <jdb@fb.com>

Reviewers: Michael Han <hanm@apache.org>

Closes apache#986 from jhuan31/ZOOKEEPER-3243
RokLenarcic pushed a commit to RokLenarcic/zookeeper that referenced this pull request Aug 31, 2022
Author: Jie Huang <jiehuang@fb.com>
Author: Joseph Blomstedt <jdb@fb.com>

Reviewers: Michael Han <hanm@apache.org>

Closes apache#986 from jhuan31/ZOOKEEPER-3243
RokLenarcic pushed a commit to RokLenarcic/zookeeper that referenced this pull request Sep 3, 2022
Author: Jie Huang <jiehuang@fb.com>
Author: Joseph Blomstedt <jdb@fb.com>

Reviewers: Michael Han <hanm@apache.org>

Closes apache#986 from jhuan31/ZOOKEEPER-3243
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants