[ISSUE-456] Avoid removeResources for multiple times #459
Codecov Report
@@ Coverage Diff @@
## master #459 +/- ##
============================================
+ Coverage 58.74% 58.75% +0.01%
- Complexity 1664 1666 +2
============================================
Files 199 199
Lines 11236 11239 +3
Branches 999 1000 +1
============================================
+ Hits 6601 6604 +3
Misses 4243 4243
Partials 392 392
The fix looks good to me. If I understand the issue and this fix correctly, the reason why some appId is called multiple times is that:
Overall LGTM, left minor comment
@@ -753,7 +807,8 @@ public void checkAndClearLeakShuffleDataTest(@TempDir File tempDir) throws Excep
    assertTrue(appIdsOnDisk.contains(appId));

    // make sure heartbeat timeout and resources are removed
    Thread.sleep(5000);
    Awaitility.await().timeout(10, TimeUnit.SECONDS).until(
Why is this included in this PR?
If this is not related to this PR, could you open another PR to fix it?
In the original logic, shuffleTaskInfos.remove(appId) was invoked after all resources had been removed, but in this PR it is invoked before they are removed. With the fixed sleep, that ordering makes this test flaky in this PR.
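A minimal sketch of the idea behind replacing the fixed sleep: poll for the condition the test actually cares about, with an upper bound on the wait. The test in this PR uses Awaitility for this; the hand-rolled `awaitUntil` helper and the `PollUntilSketch` class below are hypothetical stand-ins so that the example runs without extra dependencies.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of the project under review.
class PollUntilSketch {

    // Polls `condition` every `pollMillis` until it holds or `timeoutMillis` elapses.
    // Returns true if the condition was observed to hold, false otherwise.
    static boolean awaitUntil(long timeoutMillis, long pollMillis, BooleanSupplier condition) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (System.nanoTime() - deadline < 0) {
            if (condition.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        // One last check at the deadline.
        return condition.getAsBoolean();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for "the files on disk are gone"; set by a slow background cleanup.
        AtomicBoolean filesDeleted = new AtomicBoolean(false);
        Thread cleaner = new Thread(() -> {
            try {
                Thread.sleep(200); // simulated slow resource removal
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            filesDeleted.set(true);
        });
        cleaner.start();

        // Wait on the real condition instead of sleeping for a fixed time.
        boolean observed = awaitUntil(10_000, 50, filesDeleted::get);
        cleaner.join();
        System.out.println("cleanup observed: " + observed);
    }
}
```

The wait ends as soon as the cleanup is visible, so the test no longer depends on whether the map entry or the on-disk data is removed first.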
Got it
Done. Thanks for the explanation.
LGTM
Merged. Thanks @xianjingfeng @advancedxy @jerqi
What changes were proposed in this pull request?
If Resource had already been removed, avoid removing it twice.
Why are the changes needed?
When removeResource took too long for some appIds, the expiredAppCleanupExecutorService in ShuffleTaskManager would check and detect that the same appId had expired multiple times. Therefore, the same appId might be added to expiredAppIdQueue multiple times. This PR fixes #456.
Does this PR introduce any user-facing change?
No
How was this patch tested?
UT
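
The guard described under "What changes were proposed in this pull request?" can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the patch: the `RemoveOnceSketch` class, its `register` and `cleanupRuns` methods, and the `Object` placeholder for per-app state are all hypothetical. Only the names `shuffleTaskInfos` and `removeResources` come from the discussion above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the task manager; only the "remove once" guard is shown.
class RemoveOnceSketch {
    // Per-app state keyed by appId (the value type is a placeholder).
    private final Map<String, Object> shuffleTaskInfos = new ConcurrentHashMap<>();
    // Counts how often the expensive cleanup actually ran.
    private final AtomicInteger cleanupRuns = new AtomicInteger();

    void register(String appId) {
        shuffleTaskInfos.put(appId, new Object());
    }

    // Returns true if this call performed the cleanup, false if the appId was already gone.
    boolean removeResources(String appId) {
        // ConcurrentHashMap.remove is atomic, so at most one caller sees a non-null value.
        Object info = shuffleTaskInfos.remove(appId);
        if (info == null) {
            // Already removed, e.g. the same appId was queued more than once.
            return false;
        }
        cleanupRuns.incrementAndGet(); // placeholder for the slow resource removal
        return true;
    }

    int cleanupRuns() {
        return cleanupRuns.get();
    }

    public static void main(String[] args) {
        RemoveOnceSketch manager = new RemoveOnceSketch();
        manager.register("app-1");
        // Simulate the same expired appId being dequeued twice.
        System.out.println(manager.removeResources("app-1"));
        System.out.println(manager.removeResources("app-1"));
        System.out.println(manager.cleanupRuns());
    }
}
```

Removing the map entry first turns it into the "already handled" marker, so a duplicate entry in the queue becomes a cheap no-op. The trade-off, as noted in the review thread, is that the entry disappears before the slow cleanup finishes.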