tzulitai
Contributor

It was reported that the Elasticsearch 1.x tests can fail with the following exception, thrown by the embedded ES node used in the integration tests:
ProcessClusterEventTimeoutException[failed to process cluster event (acquire index lock) within 1m].

After some searching, this appears to be a potential deadlock in Elasticsearch 1.x when creating indices.

Judging from recent Travis runs, this flakiness is rare, so I think a simple solution is to rerun failed tests for Elasticsearch 1.x only, and not for newer versions.

If it also shows up for 2.x and 5.x, we might need to find another solution.
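
For readers unfamiliar with the rerun-on-failure idea, here is a minimal, generic sketch in plain Java. It is not the mechanism this pull request uses (the conversation does not show how the rerun is configured), and `RetrySketch`, `runWithRetries`, and the simulated failure are hypothetical names made up for illustration. The sketch runs an action up to a fixed number of times and only gives up if every attempt fails.

```java
import java.util.function.Supplier;

/** Hypothetical helper for illustration; not part of Flink or Elasticsearch. */
final class RetrySketch {

    /**
     * Runs the action up to maxAttempts times and returns the first successful result.
     * If every attempt fails, the failure from the last attempt is rethrown.
     */
    static <T> T runWithRetries(int maxAttempts, Supplier<T> action) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                lastFailure = e; // remember the failure and try again
            }
        }
        throw lastFailure;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated flaky step (an assumption): fails on the first call, succeeds on the second.
        String result = runWithRetries(3, () -> {
            if (++calls[0] < 2) {
                throw new IllegalStateException("simulated index lock timeout");
            }
            return "index created";
        });
        System.out.println(result + " after " + calls[0] + " attempt(s)");
    }
}
```

A rerun like this only hides a rare, non-deterministic failure; a test that fails on every attempt still fails, which is why the approach is limited to the 1.x tests here.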

…n failure

This is allowed because Elasticsearch 1.x has a potential deadlock when
creating indices. Since this flakiness rarely happens, this commit
allows rerunning the Elasticsearch 1.x tests to try to mitigate this
problem instead of just failing them.
@rmetzger
Contributor

+1 to merge

@tzulitai
Contributor Author

Merging to master.

@asfgit asfgit closed this in 72f56d1 Feb 27, 2017
@tzulitai tzulitai deleted the FLINK-5772 branch February 27, 2017 10:38
p16i pushed a commit to p16i/flink that referenced this pull request Apr 16, 2017
This closes apache#3410.