[FLINK-25819][runtime] Reordered requesting and recycling buffers in order to avoid race condition in testIsAvailableOrNotAfterRequestAndRecycleMultiSegments #18865
Please pay attention to the last commit, where the global test timeout was removed.
Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
Automated Checks
Last check on commit 31fce89 (Mon Feb 21 15:53:54 UTC 2022)
Warnings:
Mention the bot in a comment to re-run the automated checks.
Review Progress
Please see the Pull Request Review Guide for a full explanation of the review process. The bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.
Bot commands
The @flinkbot bot supports the following commands:
@flinkbot run azure
try {
    // request another 5 segments
    globalPool.requestUnpooledMemorySegments(numberOfSegmentsToRequest);
Wouldn't it be good to check that requesting even a single segment fails, to make sure the pool is empty? Or maybe check both one and five?
Yes, perhaps using one instead of five is a good idea.
// Starting to requesting next segments only after previous segments were release
// otherwise the race condition between requesting new segments and releasing old ones
// are possible:
asyncRequest.start();
Does the async request actually make sense here? I'm not sure it adds anything relevant to the test case.
In addition, isn't the use of ArrayList in a concurrent thread non-thread-safe? (CopyOnWriteArrayList would be a thread-safe alternative.)
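As a rough illustration of the thread-safety point above, the following sketch collects items from several writer threads into a CopyOnWriteArrayList. The class name ConcurrentCollectSketch and the method collect are hypothetical and are not part of the PR or of Flink; they only show the general pattern.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch, not code from the PR: several threads add to one shared list.
class ConcurrentCollectSketch {

    /** Starts `threads` writers that each add `perThread` items and returns the shared list. */
    static List<Integer> collect(int threads, int perThread) {
        // CopyOnWriteArrayList makes each add() atomic, so no elements are lost.
        // A plain ArrayList gives no such guarantee under concurrent writes.
        List<Integer> shared = new CopyOnWriteArrayList<>();
        Thread[] writers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            writers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    shared.add(i);
                }
            });
            writers[t].start();
        }
        try {
            for (Thread writer : writers) {
                writer.join(); // after join(), the writer's additions are visible here
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for writers", e);
        }
        return shared;
    }

    public static void main(String[] args) {
        System.out.println(collect(4, 1000).size());
    }
}
```

CopyOnWriteArrayList copies the backing array on every write, so it suits small lists like those in a test; Collections.synchronizedList would be another option for write-heavy cases.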
Actually, I now see that this test was incorrect. It assumed that requestUnpooledMemorySegments would block until enough buffers became available, but that is simply not true. I don't know whether this was a mistake in the original test or whether the old behavior was different. Either way, I will remove the extra thread, since it no longer makes sense.
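To make the non-blocking behavior described above concrete, here is a minimal sketch using a toy pool. ToySegmentPool, requestFails, and the choice of IOException as the failure signal are assumptions made for illustration; this is not Flink's NetworkBufferPool and does not reproduce its API.

```java
import java.io.IOException;

// Hypothetical sketch: a request on an exhausted pool fails at once instead of waiting.
class NonBlockingRequestSketch {

    /** Toy stand-in for a fixed-size segment pool (an assumption, not Flink code). */
    static class ToySegmentPool {
        private int available;

        ToySegmentPool(int total) {
            this.available = total;
        }

        /** Takes n segments or fails immediately; it never waits for a recycle. */
        synchronized void request(int n) throws IOException {
            if (n > available) {
                throw new IOException(
                        "Insufficient segments: requested " + n + ", available " + available);
            }
            available -= n;
        }

        synchronized void recycle(int n) {
            available += n;
        }

        synchronized boolean isAvailable() {
            return available > 0;
        }
    }

    /** Returns true if the request was rejected, false if it succeeded. */
    static boolean requestFails(ToySegmentPool pool, int n) {
        try {
            pool.request(n);
            return false;
        } catch (IOException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        ToySegmentPool pool = new ToySegmentPool(10);
        requestFails(pool, 10);                     // drain the pool
        System.out.println(requestFails(pool, 1));  // rejected while the pool is empty
        pool.recycle(10);                           // recycle first ...
        System.out.println(requestFails(pool, 1));  // ... then request again, in the same thread
    }
}
```

Because the request never waits, a sequential recycle-then-request order on a single thread is enough here, and no helper thread is needed.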
See comments in code.
@smattheis, thanks for the review. I made a couple of changes within the existing commits. I think that should not be a problem, since the commits are small.
LGTM.
@@ -757,7 +746,7 @@ public void testBlockingRequestFromMultiLocalBufferPool()
// wait until all available buffers are requested
while (segmentsRequested.size() + segments.size() + exclusiveSegments.size()
        < numBuffers) {
    Thread.sleep(100);
Is this a necessary change? Is it in the right commit?
Not necessary at all. I just noticed that 100 ms is too long here, considering that it sits inside a loop. But perhaps it is indeed not a good idea to have this change in this commit. I can remove it entirely.
We can leave it the way it is. Just wanted to check.
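For readers weighing the sleep interval discussed in this thread, one common way to write such a wait loop is a small polling helper with a short interval and its own deadline. The sketch below is only an illustration under that assumption; PollingSketch and waitUntil are hypothetical names and this is not what the PR itself does.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical sketch: poll a condition with a short sleep and an explicit deadline.
class PollingSketch {

    /** Returns true once the condition holds, or false on timeout or interruption. */
    static boolean waitUntil(BooleanSupplier condition, Duration timeout, Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // give up instead of looping forever
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        // The condition becomes true roughly 50 ms after start.
        boolean reached = waitUntil(
                () -> System.nanoTime() - start > 50_000_000L,
                Duration.ofSeconds(5),
                Duration.ofMillis(10));
        System.out.println(reached);
    }
}
```

A shorter poll interval shortens the test when the condition is met quickly, and the explicit deadline bounds the wait when it never is.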
...-runtime/src/test/java/org/apache/flink/runtime/io/network/buffer/NetworkBufferPoolTest.java
…order to avoid race condition in testIsAvailableOrNotAfterRequestAndRecycleMultiSegments
…work buffers' scenario into NetworkBufferPoolTest
@akalash Do you mind creating a backport for 1.14?
@dawidwys, good idea. I will do that.
What is the purpose of the change
This PR fixes NetworkBufferPoolTest.testIsAvailableOrNotAfterRequestAndRecycleMultiSegments
Brief change log
Verifying this change
This change is itself a test.
Does this pull request potentially affect one of the following parts:
@Public(Evolving): (yes / no)
Documentation