@DonalEvans
Contributor

This commit fixes two separate issues in the test.

Firstly, the assertion that search_count stats should be equal to zero may fail the second time the openAndRunJob runnable is invoked if the stats haven't been updated since the previous datafeed with the same ID was deleted. Wrapping the assertion in an assertBusy() call prevents this.

Secondly, deleting the datafeed may fail when the node that processes the datafeed stats request already reports the datafeed as stopped, but the master node that processes the delete datafeed action has not yet finished updating its own state to match. Wrapping the datafeed deletion in an assertBusy() call allows it to be retried if this race condition is encountered.
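
The retry pattern behind both fixes can be sketched in isolation. The helper below is a hypothetical stand-in written for illustration, not Elasticsearch's assertBusy() implementation; the class name, the 100ms sleep cap, the time budgets, and the simulated "stale stats" and "rejected delete" conditions are all assumptions.

```java
import java.time.Duration;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a test framework's assertBusy(); not the real Elasticsearch code.
class BusyRetrySketch {

    /** A check that may throw while cluster state is still catching up. */
    @FunctionalInterface
    interface CheckedRunnable {
        void run() throws Exception;
    }

    /**
     * Re-runs the check, sleeping for a doubling interval between attempts,
     * until it passes. Once the time budget is spent, the last failure is rethrown.
     */
    static void assertBusy(CheckedRunnable check, Duration maxWait) throws Exception {
        long deadline = System.nanoTime() + maxWait.toNanos();
        long sleepMillis = 1;
        while (true) {
            try {
                check.run();
                return; // the check passed, stop retrying
            } catch (AssertionError | Exception e) {
                if (System.nanoTime() >= deadline) {
                    throw e; // out of time: surface the most recent failure
                }
            }
            Thread.sleep(sleepMillis);
            sleepMillis = Math.min(sleepMillis * 2, 100); // assumed cap of 100ms
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulated stale stat: the first two reads still show the old datafeed's count.
        AtomicInteger reads = new AtomicInteger();
        assertBusy(() -> {
            long searchCount = reads.incrementAndGet() < 3 ? 7 : 0;
            if (searchCount != 0) {
                throw new AssertionError("search_count was " + searchCount);
            }
        }, Duration.ofSeconds(5));

        // Simulated delete that is rejected until the master's state catches up.
        AtomicInteger deleteAttempts = new AtomicInteger();
        assertBusy(() -> {
            if (deleteAttempts.incrementAndGet() < 3) {
                throw new IllegalStateException("datafeed still reported as started");
            }
        }, Duration.ofSeconds(5));

        System.out.println("reads=" + reads.get() + ", deleteAttempts=" + deleteAttempts.get());
    }
}
```

In this sketch the assertion and the delete are retried the same way: a failure before the deadline is swallowed and the action is attempted again, so a condition that only holds briefly no longer fails the test.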

Closes #137207

To verify the fix for the second issue, I first modified assertBusy() to poll every 5ms instead of using an exponential backoff, which made the race condition behind the failure much easier to hit and raised the test's failure rate to ~40%. After adding the assertBusy() call around the datafeed deletion request, no failures were observed in 100 repeats of the test.
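
One plausible reading of why fixed 5ms polling exposes the race more often is that it notices the stopped datafeed sooner, so the delete request is sent while the master is more likely to still be behind. The sketch below compares the two polling schedules under assumed numbers; the 1ms first sleep, the 100ms budget, and the event time of 41ms are illustrative only and are not taken from assertBusy() itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical comparison of polling schedules; all timings are assumptions.
class PollScheduleSketch {

    /** Offsets (ms since start) at which a fixed-interval poller runs its check. */
    static List<Long> fixedInterval(long intervalMillis, long budgetMillis) {
        List<Long> offsets = new ArrayList<>();
        for (long t = 0; t <= budgetMillis; t += intervalMillis) {
            offsets.add(t);
        }
        return offsets;
    }

    /** Offsets for a poller whose sleep doubles after every attempt. */
    static List<Long> exponentialBackoff(long firstSleepMillis, long budgetMillis) {
        List<Long> offsets = new ArrayList<>();
        long t = 0;
        long sleep = firstSleepMillis;
        while (t <= budgetMillis) {
            offsets.add(t);
            t += sleep;
            sleep *= 2;
        }
        return offsets;
    }

    /** Delay between an event and the first poll at or after it, or -1 if no poll follows. */
    static long detectionLag(List<Long> pollOffsets, long eventMillis) {
        for (long offset : pollOffsets) {
            if (offset >= eventMillis) {
                return offset - eventMillis;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Long> fixed = fixedInterval(5, 100);
        List<Long> backoff = exponentialBackoff(1, 100);
        long stoppedAt = 41; // assumed moment the stats node first reports "stopped"
        System.out.println("fixed polls: " + fixed.size() + ", lag: " + detectionLag(fixed, stoppedAt) + "ms");
        System.out.println("backoff polls: " + backoff.size() + ", lag: " + detectionLag(backoff, stoppedAt) + "ms");
    }
}
```

Under these assumptions the fixed schedule reacts within a few milliseconds of the stop, while the backoff schedule leaves a longer gap in which the master can catch up, which would hide the race in most runs.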

@DonalEvans DonalEvans added >test Issues or PRs that are addressing/adding tests :ml Machine learning Team:ML Meta label for the ML team auto-backport Automatically create backport pull requests when merged v9.3.0 branch:9.2 branch:9.1 labels Nov 10, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

@DonalEvans DonalEvans merged commit 2bb729b into elastic:main Nov 11, 2025
34 checks passed
@DonalEvans DonalEvans deleted the fix-datafeed-integration-test branch November 11, 2025 18:55
@elasticsearchmachine
Collaborator

💔 Backport failed

Status Branch Result
9.2 Commit could not be cherrypicked due to conflicts
9.1 Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 137856

DonalEvans added a commit to DonalEvans/elasticsearch that referenced this pull request Nov 11, 2025
…#137856)

(cherry picked from commit 2bb729b)

# Conflicts:
#	muted-tests.yml
@DonalEvans
Contributor Author

💚 All backports created successfully

Status Branch Result
9.2
9.1

Questions ?

Please refer to the Backport tool documentation

DonalEvans added a commit to DonalEvans/elasticsearch that referenced this pull request Nov 11, 2025
…#137856)

(cherry picked from commit 2bb729b)

# Conflicts:
#	muted-tests.yml
Labels

auto-backport Automatically create backport pull requests when merged :ml Machine learning Team:ML Meta label for the ML team >test Issues or PRs that are addressing/adding tests v9.1.8 v9.2.2 v9.3.0

Development

Successfully merging this pull request may close these issues.

[CI] DatafeedJobsIT testDatafeedTimingStats_DatafeedRecreated failing
