STORM-2039: backpressure refactoring in worker and executor #1627
@@ -743,7 +736,7 @@
       (fn [& args]
         (check-credentials-changed)
         (if ((:storm-conf worker) TOPOLOGY-BACKPRESSURE-ENABLE)
-          (check-throttle-changed))))
+          (topology-backpressure-callback))))
This feels like the wrong place for this.
For some reason the backpressure is tacked onto the credentials timer?
I know this was already the way it worked, but maybe there's a better place for it now? I don't think it's good that the value of the task.credentials.poll.secs config affects the frequency of backpressure polling.
yeah, agreed. I'll add another timer for this.
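To illustrate the point of this exchange, here is a minimal sketch, not code from this PR, of two recurring jobs that each own their period, so changing the credentials poll interval cannot change how often backpressure is checked. `SimulatedTimers`, `scheduleRecurring`, and `advance` are hypothetical names, and time is simulated so the behavior is easy to reason about; a real worker would use its own timer facility.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: every recurring job carries its own period. */
final class SimulatedTimers {
    private static final class Job {
        final long periodSecs;
        final Runnable action;

        Job(long periodSecs, Runnable action) {
            this.periodSecs = periodSecs;
            this.action = action;
        }
    }

    private final List<Job> jobs = new ArrayList<>();
    private long nowSecs = 0;

    /** Register a job that fires every periodSecs of simulated time. */
    void scheduleRecurring(long periodSecs, Runnable action) {
        if (periodSecs <= 0) {
            throw new IllegalArgumentException("period must be positive");
        }
        jobs.add(new Job(periodSecs, action));
    }

    /** Advance simulated time one second at a time, firing each job on multiples of its own period. */
    void advance(long secs) {
        for (long i = 0; i < secs; i++) {
            nowSecs++;
            for (Job job : jobs) {
                if (nowSecs % job.periodSecs == 0) {
                    job.action.run();
                }
            }
        }
    }
}
```

For example, registering a credentials check with a period of 10 and a backpressure check with a period of 2, then advancing 20 simulated seconds, runs the first job twice and the second ten times. Changing the first period leaves the second count untouched.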
+1 for the changes. My comments are suggestions for a couple more.
a4fa289 to 24990d3
@knusbaum I made the changes suggested. Also removed an extra fn wrapper on that function triggered by the timer.
+1
@abellina looks good to me, just one extra suggestion. To strike a tradeoff and avoid possibly stalling a topology by mistake, could you add a random number generation here? E.g., generate a number between 0 and 9; if it is zero, we call the try block whether or not "prev-backpressure-flag" and "curr-backpressure-flag" are equal. Thanks a lot!
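The suggestion above could be sketched roughly as follows. This is an illustration only, not code from this PR: `BackpressurePublishPolicy` and `shouldPublish` are hypothetical names, and the random draw is injected so the decision can be checked deterministically.

```java
import java.util.function.IntSupplier;

/** Hypothetical sketch of "publish on change, plus an occasional forced refresh". */
final class BackpressurePublishPolicy {
    // Assumed to yield a value in 0..9, e.g. ThreadLocalRandom.current().nextInt(10).
    private final IntSupplier draw;

    BackpressurePublishPolicy(IntSupplier draw) {
        this.draw = draw;
    }

    /** True when the flag changed, or when the draw comes up zero (about 1 time in 10). */
    boolean shouldPublish(boolean prevFlag, boolean currFlag) {
        if (prevFlag != currFlag) {
            return true;
        }
        return draw.getAsInt() == 0;
    }
}
```

The tradeoff is that the forced refresh adds extra writes even when nothing changed, in exchange for eventually correcting a stale published value.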
@zhuoliu thanks for the comment. I am taking a look at STORM-1949 to try and reproduce it, and will comment there. I think that is a separate issue from this PR, so we should see a potential fix for STORM-1949 in a different PR. Are you ok with this going in as is?
Sure, Alex. I'd really appreciate it if you could submit a separate PR for STORM-1949.
With this refactoring we read the backpressure status directly from the disruptor queue instead of keeping an additional flag in the executor data. That flag was error-prone when flipping, which is very possibly the root cause of the halt in STORM-1949. So we may NOT need to add the additional ZK pressure that I mentioned above.
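As a rough sketch of the idea in the comment above, the queue itself can own the throttle state, and callers ask the queue instead of maintaining a second copy that can drift. This is not Storm's DisruptorQueue: `WatermarkQueue`, `WorkerBackpressure`, and the high/low watermark parameters are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.List;

/** Hypothetical queue that tracks its own throttle state from high/low watermarks. */
final class WatermarkQueue<T> {
    private final ArrayDeque<T> items = new ArrayDeque<>();
    private final int highWatermark;
    private final int lowWatermark;
    private boolean throttleOn = false;

    WatermarkQueue(int highWatermark, int lowWatermark) {
        if (lowWatermark < 0 || lowWatermark >= highWatermark) {
            throw new IllegalArgumentException("need 0 <= low < high");
        }
        this.highWatermark = highWatermark;
        this.lowWatermark = lowWatermark;
    }

    /** Add an item; turn throttling on once the size reaches the high watermark. */
    synchronized void publish(T item) {
        items.addLast(item);
        if (items.size() >= highWatermark) {
            throttleOn = true;
        }
    }

    /** Remove an item (null if empty); turn throttling off once the size drops to the low watermark. */
    synchronized T poll() {
        T item = items.pollFirst();
        if (items.size() <= lowWatermark) {
            throttleOn = false;
        }
        return item;
    }

    /** The single source of truth for this queue's backpressure status. */
    synchronized boolean isThrottled() {
        return throttleOn;
    }
}

/** Hypothetical worker-level view: throttled if any queue reports throttling. */
final class WorkerBackpressure {
    static boolean anyThrottled(List<? extends WatermarkQueue<?>> queues) {
        for (WatermarkQueue<?> queue : queues) {
            if (queue.isThrottled()) {
                return true;
            }
        }
        return false;
    }
}
```

Because the status is always read from the queue, there is no separate executor-side flag to flip at the wrong moment.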
+1
This is for the 1.x branch; I will submit a separate PR for the 2.x branch.
@zhuoliu worked on the initial refactor. I added some changes due to code review comments.