[native] Fix TaskManagerTest.buildSpillDirectoryFailure #23098

spershin
Contributor

Description

The test itself has a race, which this PR fixes:

  • taskManager_->cleanOldTasks() can be called before the Driver is destroyed.
  • Since the Driver still holds the Task, cleanOldTasks() considers the Task a zombie and does NOT remove it from the map.
  • So the Task's destructor is not called.
  • Then we call waitForAllTasksToBeDeleted(), which simply compares how many ctors and dtors have been called. They don't match.
  • The test fails.
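
For readers unfamiliar with the code, the sequence above can be reproduced in a small standalone sketch. Everything in it is a hypothetical stand-in, not the actual Presto or Velox code: `ToyTaskManager`, the simplified `Task` and `Driver` structs, the global counters, and the `cleanOldTasksWithRetry` helper are all assumptions made for illustration, and the "zombie" check is approximated with `shared_ptr::use_count()`. The idea is only to show why a single cleanOldTasks() call can miss a Task that the Driver still holds, and how repeating the call for a bounded time gets past that window.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <thread>
#include <utility>

// Hypothetical stand-ins for illustration; not the real Presto/Velox classes.
std::atomic<int> gTaskCtors{0};
std::atomic<int> gTaskDtors{0};

struct Task {
  Task() { ++gTaskCtors; }
  ~Task() { ++gTaskDtors; }
};

// The driver keeps its task alive for as long as the driver itself exists.
struct Driver {
  std::shared_ptr<Task> task;
};

class ToyTaskManager {
 public:
  std::shared_ptr<Task> createTask(const std::string& id) {
    auto task = std::make_shared<Task>();
    tasks_[id] = task;
    return task;
  }

  // Erases tasks that only the map references and returns how many were
  // erased. A task still held elsewhere (e.g. by a Driver) is treated as a
  // zombie and stays in the map. use_count() is only a snapshot when other
  // threads hold references, which is exactly why one call can miss a task.
  std::size_t cleanOldTasks() {
    std::size_t removed = 0;
    for (auto it = tasks_.begin(); it != tasks_.end();) {
      if (it->second.use_count() == 1) {
        it = tasks_.erase(it);
        ++removed;
      } else {
        ++it;
      }
    }
    return removed;
  }

 private:
  std::map<std::string, std::shared_ptr<Task>> tasks_;
};

// Assumed shape of the fix: keep calling cleanOldTasks() until the expected
// number of tasks has been removed or the timeout expires.
bool cleanOldTasksWithRetry(
    ToyTaskManager& manager,
    std::size_t expected,
    std::chrono::milliseconds timeout) {
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  std::size_t removed = 0;
  while (true) {
    removed += manager.cleanOldTasks();
    if (removed >= expected) {
      return true;
    }
    if (std::chrono::steady_clock::now() >= deadline) {
      return false;
    }
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
  }
}

// Plays out the race: one early cleanup call removes nothing because the
// driver is still alive; the retry loop succeeds once the driver is gone.
bool runRaceSketch() {
  ToyTaskManager manager;
  auto driver = std::make_unique<Driver>();
  driver->task = manager.createTask("t1");

  const std::size_t removedEarly = manager.cleanOldTasks();

  // The driver is destroyed a little later, on another thread.
  std::thread driverThread([d = std::move(driver)]() mutable {
    std::this_thread::sleep_for(std::chrono::milliseconds(20));
    d.reset();
  });

  const bool cleaned =
      cleanOldTasksWithRetry(manager, 1, std::chrono::milliseconds(5000));
  driverThread.join();

  return removedEarly == 0 && cleaned &&
      gTaskCtors.load() == gTaskDtors.load();
}
```

In this sketch, `runRaceSketch()` returns true only if the early call removed nothing, the retry loop eventually removed the task, and the constructor and destructor counts match afterwards, which is the same condition the bullets describe waitForAllTasksToBeDeleted() checking.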

Fixes #23062

Test Plan

Stress testing.
buck2 test @mode/opt github/presto-trunk/presto-native-execution/presto_cpp/main/tests:presto_main_test -- TaskManagerTest.buildSpillDirectoryFailure --stress-runs 100

== NO RELEASE NOTE ==

@spershin spershin requested a review from a team as a code owner June 27, 2024 23:32
@tanjialiang
Contributor

Thanks for the fix. Does it make sense to just extend the test from velox's OperatorTestBase and reuse the rich utilities there?

@spershin
Contributor Author

@tanjialiang
This fix is about the test's use of TaskManager's cleanOldTasks(). TaskManager is not in Velox, so we cannot use the rich utilities from Velox.

@@ -54,6 +54,26 @@ namespace facebook::presto {

namespace {

// Repeatedly calls for cleanOldTasks() for a while to ensure that we overcome a
Contributor

for cleanOldTasks() --> cleanOldTasks()

Contributor Author

Thanks!
Let's fix this some other time.
Now it is important to land this to remove the noise.

Contributor

@amitkdutta amitkdutta left a comment

Thanks for the fix @spershin

@spershin spershin merged commit dd42f26 into prestodb:master Jun 28, 2024
59 checks passed
Development

Successfully merging this pull request may close these issues.

[native] Flaky test TaskManagerTest.buildSpillDirectoryFailure
4 participants