Tests: Revamp static bwc test framework to use dangling indexes #10247

rjernst · 2015-03-25T01:13:43Z

The static old index tests currently take a long time to run because
each index version essentially recreates the cluster and spins up
new nodes. This PR instead loads each old version into the existing
cluster as a dangling index. It also removes the intermediate
"StaticIndexBackwardCompatibilityTest", an extra layer that served
no purpose, and moves a commonly duplicated function for getting an
http client into a shared location.

The test now takes between 40 and 60 seconds for me. I also ran it
"under stress" by running all ES tests in one shell, while
simultaneously running 10 iterations of the old index tests. Each
iteration took on average about 90 seconds, which is much better
than the 20+ minutes we see in master on jenkins.
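
For readers unfamiliar with the approach, the "load as a dangling index" step can be pictured as copying an unpacked old index directory into the indices directory of a node that is already running, so the cluster discovers it on disk instead of being rebuilt. The following is only a minimal sketch under that assumption: `DanglingIndexLoader`, `copyIndexInto`, and both path parameters are hypothetical names, not code from this PR, and the actual on-disk layout is left to the caller.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

/** Hypothetical helper; illustrates the idea only, not the PR's test code. */
final class DanglingIndexLoader {

    /**
     * Copies an unpacked index directory into a running node's indices
     * directory (both locations are assumptions supplied by the caller)
     * and returns the path of the copied index.
     */
    static Path copyIndexInto(Path unpackedIndexDir, Path nodeIndicesDir) throws IOException {
        Path target = nodeIndicesDir.resolve(unpackedIndexDir.getFileName().toString());
        // Files.walk visits parents before children, so each directory
        // exists before the files inside it are copied.
        try (Stream<Path> paths = Files.walk(unpackedIndexDir)) {
            paths.forEach(src -> {
                Path dst = target.resolve(unpackedIndexDir.relativize(src).toString());
                try {
                    if (Files.isDirectory(src)) {
                        Files.createDirectories(dst);
                    } else {
                        Files.createDirectories(dst.getParent());
                        Files.copy(src, dst);
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
        return target;
    }
}
```

A test would then wait for the cluster to pick up the copied directory before running its assertions; how that wait is implemented is outside this sketch.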

bleskes · 2015-03-25T08:54:04Z

src/test/java/org/elasticsearch/bwcompat/OldIndexBackwardsCompatibilityTests.java


-@TimeoutSuite(millis = 40 * TimeUnits.MINUTE)
-public class OldIndexBackwardsCompatibilityTests extends StaticIndexBackwardCompatibilityTest {
+@LuceneTestCase.SuppressCodecs({"Lucene3x", "MockFixedIntBlock", "MockVariableIntBlock", "MockSep", "MockRandom", "Lucene40", "Lucene41", "Appending", "Lucene42", "Lucene45", "Lucene46", "Lucene49"})


out of curiosity - why are all these suppresses needed?

I don't remember, but they were copied from the static index superclass. I believe @s1monw added them originally on the first static bwc test, so maybe he can comment.

bleskes · 2015-03-25T09:22:22Z

This is much cleaner! Thx. Left some comments.

rjernst · 2015-03-26T07:15:57Z

@bleskes I pushed another commit with some changes based on your feedback.

bleskes · 2015-03-26T15:30:49Z

Thx @rjernst. I replied to the comments.

rjernst · 2015-03-31T06:49:26Z

@bleskes I pushed a new commit. I believe this addresses your concern over using hard coded paths.

bleskes · 2015-03-31T06:54:42Z

Awesome. LGTM. I will work on the dangling request issue today, so we can get this in.

bleskes · 2015-03-31T06:55:48Z

Scratch the waiting for the dangling indices request issue. I missed the master:false on the loading node. No need to wait indeed.

In several places in the code we need to notify a node that it needs to do something (typically the master). When that node is the local node, we have an optimization in several places that runs the execution code immediately instead of sending the request over the wire to itself. This is a shame, as we need to implement the same pattern again and again. On top of that, we may forget to do so (see note below), and we might have to write some cruft if the code needs to run under another thread pool. This commit folds the optimization into the TransportService, shortcutting wire serialization if the target node is local. Note: this was discovered by elastic#10247, which tries to import a dangling index quickly after the cluster forms. When sending an import dangling request to the master, the code didn't take into account the fact that the local node may be the master. If this happens quickly enough, one would get a NodeNotConnected exception, causing the dangling indices not to be imported. This will succeed after 10s, when InternalClusterService.ReconnectToNodes runs and actively connects the local node to itself (which is not needed), potentially after another cluster state update.
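
The shortcut described in this commit message can be sketched as follows. This is a simplified illustration, not the Elasticsearch TransportService API: `LocalShortcutTransport`, its string-based requests, the `wireLog` field, and the `sendOverWire` stub are all hypothetical. The point it shows is that a request addressed to the local node is handed straight to the registered handler on the supplied executor, so no connection to self and no serialization are needed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Executor;
import java.util.function.Consumer;

/** Hypothetical sketch of a transport layer with a local-execution shortcut. */
final class LocalShortcutTransport {
    private final String localNodeId;
    private final Executor executor;
    private final Map<String, Consumer<String>> handlers = new HashMap<>();
    /** Records requests that would have gone over the wire (stand-in for real networking). */
    final List<String> wireLog = new ArrayList<>();

    LocalShortcutTransport(String localNodeId, Executor executor) {
        this.localNodeId = localNodeId;
        this.executor = executor;
    }

    void registerHandler(String action, Consumer<String> handler) {
        handlers.put(action, handler);
    }

    void sendRequest(String targetNodeId, String action, String request) {
        if (localNodeId.equals(targetNodeId)) {
            // Local target: skip the wire and run the handler on the
            // configured executor (which may be a different thread pool).
            Consumer<String> handler = handlers.get(action);
            if (handler == null) {
                throw new IllegalStateException("no handler registered for " + action);
            }
            executor.execute(() -> handler.accept(request));
            return;
        }
        sendOverWire(targetNodeId, action, request);
    }

    private void sendOverWire(String targetNodeId, String action, String request) {
        // A real implementation would serialize the request and need an
        // open connection to the target node; here it is only logged.
        wireLog.add(targetNodeId + ":" + action + ":" + request);
    }
}
```

Because the check lives in one place, callers no longer need their own "is the target the local node?" branch, which is the duplication the commit message complains about.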




rjernst added the >test, v2.0.0-beta1, v1.6.0, and v1.5.1 labels Mar 25, 2015

bleskes reviewed Mar 25, 2015

rjernst mentioned this pull request Mar 26, 2015

Improve static backwards index tests #9143

Closed

bleskes mentioned this pull request Mar 31, 2015

Transport: shortcut local execution #10350

Closed

rjernst force-pushed the fix/slow-static-bwc branch from e9f716d to 3948a84 Compare April 1, 2015 07:48

s1monw assigned bleskes Apr 2, 2015

rjernst force-pushed the fix/slow-static-bwc branch from 3948a84 to c3011ce Compare April 4, 2015 06:22

rjernst merged commit c3011ce into elastic:master Apr 4, 2015

rjernst deleted the fix/slow-static-bwc branch September 18, 2020 03:31
