Handle stopped task queue in table task cleanup tests #17720

Open
xiangfu0 wants to merge 1 commit into apache:master from xiangfu0:codex/fix-flaky-table-task-cleanup

xiangfu0 commented Feb 18, 2026

Summary

  • Harden PinotTableRestletResourceTest against TaskSchedulingInfo.getScheduledTaskNames() == null when scheduling fails.
  • Add a getOrScheduleTask(...) helper that resumes the SegmentGenerationAndPushTask queue and retries once when a queue-stopped scheduling error is returned.
  • Ensure testTableTasksCleanupWithActiveTasks resumes the task queue in a finally block so queue state does not leak into subsequent tests.
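
The retry-once behavior described in the summary could look roughly like the sketch below. This is not the PR's code: SchedulingResult is a hypothetical stand-in for TaskSchedulingInfo, the Supplier and Runnable parameters stand in for the controller REST calls, and matching on the "task queue may be stopped" text (taken from the failure message reported on this PR) is an assumption about how the stopped-queue case is detected.

```java
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical stand-in for the scheduling result; not Pinot's TaskSchedulingInfo. */
final class SchedulingResult {
  final List<String> scheduledTaskNames; // may be null when scheduling fails
  final List<String> schedulingErrors;

  SchedulingResult(List<String> scheduledTaskNames, List<String> schedulingErrors) {
    this.scheduledTaskNames = scheduledTaskNames;
    this.schedulingErrors = schedulingErrors;
  }
}

final class RetryOnceAfterResume {
  private RetryOnceAfterResume() {
  }

  /**
   * Returns the first scheduled task name. If the first attempt yields no task
   * names and reports a stopped queue, resumes the queue and retries exactly once.
   */
  static String getOrScheduleTask(Supplier<SchedulingResult> schedule, Runnable resumeQueue) {
    SchedulingResult result = schedule.get();
    if (isNullOrEmpty(result.scheduledTaskNames) && reportsStoppedQueue(result)) {
      resumeQueue.run();
      result = schedule.get();
    }
    if (isNullOrEmpty(result.scheduledTaskNames)) {
      // Fail with the scheduling errors instead of a NullPointerException.
      throw new IllegalStateException(
          "No scheduled task names returned; schedulingErrors: " + result.schedulingErrors);
    }
    return result.scheduledTaskNames.get(0);
  }

  private static boolean isNullOrEmpty(List<String> names) {
    return names == null || names.isEmpty();
  }

  // Assumption: the stopped-queue case is recognized by its error text.
  private static boolean reportsStoppedQueue(SchedulingResult result) {
    if (result.schedulingErrors == null) {
      return false;
    }
    for (String error : result.schedulingErrors) {
      if (error != null && error.contains("task queue may be stopped")) {
        return true;
      }
    }
    return false;
  }
}
```

Bounding the retry to a single attempt keeps a genuinely broken scheduler from looping, and the explicit null check turns a missing task list into a readable failure message.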

Validation

Not run (not requested); this is a test-only stability change for flaky test coverage.

Copilot AI left a comment

Pull request overview

This PR updates PinotTableRestletResourceTest to reduce flakiness in testTableTasksCleanupWithActiveTasks by making minion-task state checks and deletion cleanup more race-tolerant when tables are deleted with ignoreActiveTasks=true.

Changes:

  • Added helpers to wait for a task to reach a state or be missing, and to force-delete tasks with retries.
  • Updated the active-task cleanup test to stop the minion task queue before waiting/deleting the task.
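
A "wait for a state or a missing task" helper of the kind listed above might be sketched as follows. The names and signature here are hypothetical and do not come from the PR; the state lookup is abstracted as a Supplier, with Optional.empty() standing for "task not found".

```java
import java.util.Optional;
import java.util.function.Supplier;

final class TaskPolling {
  private TaskPolling() {
  }

  /**
   * Polls until the task reports the expected state or is missing.
   * Returns false if the timeout elapses first or the thread is interrupted.
   */
  static boolean waitForStateOrMissing(Supplier<Optional<String>> fetchState, String expected,
      long timeoutMs, long intervalMs) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      Optional<String> state = fetchState.get();
      // A missing task counts as success: it may already have been cleaned up.
      if (!state.isPresent() || state.get().equals(expected)) {
        return true;
      }
      if (System.nanoTime() - deadline >= 0) {
        return false;
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt flag for callers
        return false;
      }
    }
  }
}
```

Treating "missing" as a terminal outcome is what makes such a helper tolerant of the race where the table deletion removes the task before the test observes the target state.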

Comment on lines +1293 to +1296
    sendPutRequest(DEFAULT_INSTANCE.getControllerRequestURLBuilder()
        .forStopMinionTaskQueue(MinionConstants.SegmentGenerationAndPushTask.TASK_TYPE));
    waitForTaskStateOrTaskMissing(taskName, TaskState.STOPPED);
    deleteMinionTaskWithRetry(taskName);

Copilot AI Feb 18, 2026

The test stops the minion task queue but never resumes it. This can leak shared state into subsequent tests in this class/suite (note the earlier cleanup test explicitly resumes the queue “to avoid affecting other tests”). Please ensure the queue is resumed (ideally in a finally block or in @AfterMethod cleanup) even if assertions fail.
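
One way to guarantee the resume, sketched here with hypothetical names (QueueGuard, TaskQueue, and Body are illustrations, not part of the PR or of Pinot), is to wrap the stop/resume pair around the test body with try/finally:

```java
final class QueueGuard {
  private QueueGuard() {
  }

  /** Hypothetical abstraction over the stop/resume REST calls. */
  interface TaskQueue {
    void stop() throws Exception;

    void resume() throws Exception;
  }

  /** A test body that may throw, including assertion failures. */
  interface Body {
    void run() throws Exception;
  }

  /** Stops the queue, runs the body, and always resumes the queue afterwards. */
  static void withStoppedQueue(TaskQueue queue, Body body) throws Exception {
    queue.stop();
    try {
      body.run();
    } finally {
      // Runs even when the body throws, so later tests see a resumed queue.
      queue.resume();
    }
  }
}
```

An @AfterMethod cleanup would achieve the same effect at the class level; the try/finally form keeps the stop and the resume visibly paired inside the one test that needs them.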

Comment on lines +1250 to +1254
    private static boolean isTaskStateNotFound(IOException e) {
      String message = e.getMessage();
      return message != null && (message.contains("status code: 404") || message.contains("Not Found")
          || message.contains("does not exist"));
    }

Copilot AI Feb 18, 2026

isTaskStateNotFound detects 404s by substring-matching IOException.getMessage(), which is brittle and may misclassify errors if message formats change. Since the request helpers wrap HttpErrorStatusException as the IOException cause, prefer checking e.getCause() for HttpErrorStatusException and using its status code (or a dedicated helper that returns status codes) to reliably detect 404s.
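
A cause-based check along the lines suggested here could be sketched as below. StatusCodeException is a hypothetical stand-in for HttpErrorStatusException so that the example is self-contained; whether the real class exposes its status code through the same accessor name is an assumption to verify against the Pinot source.

```java
import java.io.IOException;

/** Hypothetical stand-in for an HTTP error exception that carries a status code. */
final class StatusCodeException extends Exception {
  private final int statusCode;

  StatusCodeException(String message, int statusCode) {
    super(message);
    this.statusCode = statusCode;
  }

  int getStatusCode() {
    return statusCode;
  }
}

final class NotFoundCheck {
  private NotFoundCheck() {
  }

  /** Returns true only when the cause chain carries an HTTP 404 status code. */
  static boolean isNotFound(IOException e) {
    for (Throwable t = e.getCause(); t != null; t = t.getCause()) {
      if (t instanceof StatusCodeException) {
        return ((StatusCodeException) t).getStatusCode() == 404;
      }
    }
    // No status-bearing cause: do not guess from the message text.
    return false;
  }
}
```

With this shape, an IOException whose message happens to contain "Not Found" but whose cause reports a 500 is not misclassified as a missing task.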

codecov-commenter commented Feb 18, 2026

❌ 5 Tests Failed:

Tests completed: 9352 | Failed: 5 | Passed: 9347 | Skipped: 48
View the top 3 failed test(s) by shortest run time
org.apache.pinot.controller.api.PinotTableRestletResourceTest::testTableTasksValidationWithDanglingTasks
Stack Traces | 0.031s run time
No scheduled task names returned from task scheduling. taskInfo: {SegmentGenerationAndPushTask=TaskSchedulingInfo{scheduledTaskNames='null', generationErrors='[]', schedulingErrors='[Unable to start scheduling for task type SegmentGenerationAndPushTask as task queue may be stopped. Please check the task queue status.]'}}
org.apache.pinot.controller.api.PinotTableRestletResourceTest::testTableTasksCleanupWithNonActiveTasks
Stack Traces | 0.049s run time
No scheduled task names returned from task scheduling. taskInfo: {SegmentGenerationAndPushTask=TaskSchedulingInfo{scheduledTaskNames='null', generationErrors='[]', schedulingErrors='[Unable to start scheduling for task type SegmentGenerationAndPushTask as task queue may be stopped. Please check the task queue status.]'}}
org.apache.pinot.controller.api.PinotTableRestletResourceTest::testTableTasksValidationWithDanglingTasks
Stack Traces | 0.06s run time
No scheduled task names returned from task scheduling. taskInfo: {SegmentGenerationAndPushTask=TaskSchedulingInfo{scheduledTaskNames='null', generationErrors='[]', schedulingErrors='[Unable to start scheduling for task type SegmentGenerationAndPushTask as task queue may be stopped. Please check the task queue status.]'}}
org.apache.pinot.controller.api.PinotTableRestletResourceTest::testTableTasksCleanupWithNonActiveTasks
Stack Traces | 0.086s run time
No scheduled task names returned from task scheduling. taskInfo: {SegmentGenerationAndPushTask=TaskSchedulingInfo{scheduledTaskNames='null', generationErrors='[]', schedulingErrors='[Unable to start scheduling for task type SegmentGenerationAndPushTask as task queue may be stopped. Please check the task queue status.]'}}
org.apache.pinot.integration.tests.WindowResourceAccountingTest::setUp
Stack Traces | 10s run time
java.lang.RuntimeException: java.io.IOException: Failed to bind to address 0.0.0.0/0.0.0.0:39295

xiangfu0 force-pushed the codex/fix-flaky-table-task-cleanup branch from afba597 to 2aca9c1 on February 18, 2026 09:11
xiangfu0 changed the title from "Stabilize table task cleanup test in active task deletion" to "Handle stopped task queue in table task cleanup tests" on Feb 18, 2026