Fix testRunStateChangePolicyWithAsyncActionNextStep race condition #40707

Merged 1 commit on Apr 2, 2019


dakrone
Member

@dakrone dakrone commented Apr 1, 2019

Previously we only set the latch with `nextStep.setLatch` after the
cluster state change had already been counted down. However, execution of the
next step could already have started by then, so `MockAsyncActionStep` could
run before its latch was set and the countdown would be missed.

This change moves the `nextStep.setLatch` call before the call to
`runPolicyAfterStateChange`, so the latch is always set by the time
`MockAsyncActionStep` is executed.
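
To make the ordering issue concrete, here is a minimal standalone sketch. It is not the actual test code: `SketchAsyncStep` and `LatchOrderingSketch` are hypothetical stand-ins for `MockAsyncActionStep` and the test body, and handing the step to an executor stands in for `runPolicyAfterStateChange`. The real failure was intermittent; the sketch deliberately forces the losing interleaving so the effect is reproducible.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for the test's MockAsyncActionStep (not the real class).
class SketchAsyncStep {
    private volatile CountDownLatch latch; // null until the test sets it

    void setLatch(CountDownLatch latch) {
        this.latch = latch;
    }

    // Runs on another thread once the step is triggered.
    void performAction() {
        CountDownLatch current = latch;
        if (current != null) {
            current.countDown();
        }
        // If no latch has been set yet, the signal is silently lost.
    }
}

class LatchOrderingSketch {
    // Returns true if the latch observed the step's execution.
    static boolean run(boolean setLatchBeforeTrigger) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            SketchAsyncStep step = new SketchAsyncStep();
            CountDownLatch stepExecuted = new CountDownLatch(1);
            CountDownLatch stepFinished = new CountDownLatch(1);

            if (setLatchBeforeTrigger) {
                // Fixed ordering: the latch is in place before anything can run.
                step.setLatch(stepExecuted);
            }

            // Assumed stand-in for runPolicyAfterStateChange: run the step asynchronously.
            executor.execute(() -> {
                step.performAction();
                stepFinished.countDown();
            });

            if (!setLatchBeforeTrigger) {
                // Old ordering, with the losing interleaving forced:
                // the step has already run by the time the latch is set.
                stepFinished.await();
                step.setLatch(stepExecuted);
            }

            stepFinished.await();
            return stepExecuted.getCount() == 0;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the step", e);
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("latch set after trigger, observed:  " + run(false));
        System.out.println("latch set before trigger, observed: " + run(true));
    }
}
```

With the latch set after the trigger, the step finds no latch and the countdown never happens, so a test waiting on it would time out. With the latch set first, the countdown is always observed regardless of how quickly the step runs.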

Before this change I could reproduce the failure roughly once every 30-40
runs. With this change, the test passed on every one of 2000+ runs.

Resolves #40018

@dakrone added the >test, v7.0.0, :Data Management/ILM+SLM, v8.0.0, v7.2.0, v6.6.3, and v6.7.2 labels Apr 1, 2019
@dakrone dakrone requested a review from gwbrown April 1, 2019 21:18
@elasticmachine
Collaborator

Pinging @elastic/es-core-features

Contributor

@gwbrown gwbrown left a comment


LGTM, nice find!

@dakrone dakrone merged commit 9fa74a1 into elastic:master Apr 2, 2019
dakrone added a commit that referenced this pull request Apr 2, 2019
dakrone added a commit that referenced this pull request Apr 2, 2019
dakrone added a commit that referenced this pull request Apr 2, 2019
dakrone added a commit that referenced this pull request Apr 2, 2019
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Apr 2, 2019
* master:
  add reason to DataFrameTransformState and add hlrc protocol tests (elastic#40736)
  Remove timezone validation on rollup range queries (elastic#40647)
  Fix testRunStateChangePolicyWithAsyncActionNextStep race condition (elastic#40707)
  Don't mark shard as refreshPending on stats fetching (elastic#40458)
  Name Snapshot Data Blobs by UUID (elastic#40652)
  SQL: [TEST] Mute TIME related failing tests
  [TEST] RecoveryWithConcurrentIndexing test (elastic#40733)
@colings86 colings86 added v6.7.1 and removed v6.7.2 labels Apr 3, 2019
gurkankaymak pushed a commit to gurkankaymak/elasticsearch that referenced this pull request May 27, 2019
Labels
:Data Management/ILM+SLM, >test, v6.6.3, v6.7.1, v7.0.0-rc2, v7.2.0, v8.0.0-alpha1
Development

Successfully merging this pull request may close these issues.

[CI] IndexLifecycleRunnerTests.testRunStateChangePolicyWithAsyncActionNextStep failed
5 participants