Skip to content

Conversation

@gmarouli
Copy link
Contributor

This PR stabilises testILMDownsampleRollingRestart and closes #136585.

This test started failing more often after #135834 because this change increased the chance of manifesting #137422.

To reduce this chance we wait for ILM to finish the phase before we end the test.

Closes #136585

@gmarouli gmarouli requested a review from nielsbauman October 31, 2025 07:28
@gmarouli gmarouli added the >test Issues or PRs that are addressing/adding tests label Oct 31, 2025
@elasticsearchmachine elasticsearchmachine added needs:triage Requires assignment of a team area label v9.3.0 labels Oct 31, 2025
@gmarouli gmarouli added the :StorageEngine/Downsampling Downsampling (replacement for rollups) - Turn fine-grained time-based data into coarser-grained data label Oct 31, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

Copy link
Contributor

@nielsbauman nielsbauman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the investigation, Mary!!

And sorry that this ended up being ILM/Data Management after all...

Comment on lines 176 to 177
// We assert that ILM to successfully completed the phase
assertBusy(() -> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we do this with ClusterServiceUtils#addTemporaryStateListener rather than polling any API? We can check the API response afterwards ofc but every assertBusy adds a little extra idle time to tests and in total that's a pretty big contributor to overall test runtime.

}, 60, TimeUnit.SECONDS);
// We assert that ILM successfully completed the phase
logger.info("Waiting for ILM to complete the phase for index [{}]", targetIndex);
ClusterServiceUtils.addTemporaryStateListener(clusterState -> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This just adds (and returns) the listener - you have to safeAwait on it too :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

face palm.....

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could also use awaitClusterState instead :)

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

ensureGreen(targetIndex);
// We assert that ILM successfully completed the phase
logger.info("Waiting for ILM to complete the phase for index [{}]", targetIndex);
awaitClusterState(clusterState -> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohh yes I'd completely forgotten we had an awaitClusterState to do this 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Me too, credits to @nielsbauman

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have less of an excuse tho 😇

$ git annotate -- test/framework/src/main/java/org/elasticsearch/test/ClusterServiceUtils.java | grep awaitClusterState
1337476bc9efc	(David Turner	2025-08-26 12:54:45 +0100	241)    public static void awaitClusterState(Predicate<ClusterState> statePredicate, ClusterService clusterService) {

…ticsearch/xpack/downsample/ILMDownsampleDisruptionIT.java

Co-authored-by: David Turner <david.turner@elastic.co>
@gmarouli gmarouli added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Oct 31, 2025
@elasticsearchmachine elasticsearchmachine merged commit 85d6e07 into elastic:main Oct 31, 2025
34 checks passed
@gmarouli gmarouli deleted the test-fix-136585 branch October 31, 2025 11:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) :StorageEngine/Downsampling Downsampling (replacement for rollups) - Turn fine-grained time-based data into coarser-grained data Team:StorageEngine >test Issues or PRs that are addressing/adding tests v9.3.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[CI] ILMDownsampleDisruptionIT testILMDownsampleRollingRestart failing

4 participants