ISPN-7172 Total order caches can hang during join #4663
CompletableFuture<Void> joinFuture = new CompletableFuture<>();
cacheStatus.getTopologyUpdatesExecutor().executeAsync(() -> joinFuture);
is this change really needed?
I'm not sure if it's strictly necessary for correctness at this point or it just makes tests more predictable, but I think it's safer to implement it as promised in the comments above.
I don't see how initializing runningCaches before or after sending joinFuture to the executor could affect anything...
My point is: I'm not seeing any difference in the logic executed. So, what side effect am I missing?
Submitting joinFuture will prevent the executor from running any other task until joinFuture is completed. If the cache exists in runningCaches and its LimitedExecutor has a free spot, it will process topology updates, and those have a chance of doing "stuff" before we have properly joined.
I think the initial stuff I was worried about was just blocking for a new view, which could overwhelm the OOB thread pool given enough caches. In this bug, the topology update was exposing the bug in LimitedExecutor itself.
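To illustrate the gating described above, here is a minimal sketch. It assumes a single-permit executor; `SinglePermitExecutor` and `JoinGateSketch` are hypothetical stand-ins written for this example and are not Infinispan's `LimitedExecutor`. Only the `executeAsync(() -> joinFuture)` call shape is taken from the diff.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.function.Supplier;

class JoinGateSketch {
    /** Hypothetical single-permit executor (assumption, not the real LimitedExecutor). */
    static final class SinglePermitExecutor {
        private final Queue<Supplier<CompletionStage<Void>>> queue = new ArrayDeque<>();
        private boolean permitTaken;

        synchronized void executeAsync(Supplier<CompletionStage<Void>> task) {
            queue.add(task);
            tryRun();
        }

        // Start queued tasks while the permit is free. The permit is held
        // until the stage returned by the task completes.
        private synchronized void tryRun() {
            while (!permitTaken && !queue.isEmpty()) {
                permitTaken = true;
                queue.poll().get().whenComplete((v, t) -> release());
            }
        }

        private synchronized void release() {
            permitTaken = false;
            tryRun();
        }
    }

    static List<String> demo() {
        List<String> log = new ArrayList<>();
        SinglePermitExecutor executor = new SinglePermitExecutor();

        // Take the only permit before the cache becomes visible to others.
        CompletableFuture<Void> joinFuture = new CompletableFuture<>();
        executor.executeAsync(() -> joinFuture);

        // A topology update arriving during the join is queued, not run.
        executor.executeAsync(() -> {
            log.add("topology update");
            return CompletableFuture.completedFuture(null);
        });

        log.add("join response handled");
        joinFuture.complete(null); // releases the permit; the update runs now
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch the queued update only runs once `joinFuture` completes. If the update were submitted before `joinFuture` took the permit, it would run immediately, which is the ordering the comment above is guarding against.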
makes sense.
a134462 to bd4d79c
* Fix WithinThreadExecutor handling in LimitedExecutor
* Remove LimitedExecutor permit before putting the LocalCacheStatus in the runningCaches map.
bd4d79c to fc7108e
Integrated! Thanks, @danberindei!
https://issues.jboss.org/browse/ISPN-7172