-
Notifications
You must be signed in to change notification settings - Fork 24.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix backwards compatibility tests in 2.x #13477
Conversation
This is very much a work in progress. It'll take a bit more banging around to fix the remaining BWC tests. |
|
||
--------------------------------------------------------------------------- | ||
mvn test -Dtests.filter="@backwards" -Dtests.bwc.version=x.y.z -Dtests.security.manager=false | ||
$ mkdir -p core/backwards && pushd core/backwards | ||
$ curl -O https://download.elasticsearch.org/elasticsearch/release/org/elasticsearch/distribution/tar/elasticsearch/2.0.0-beta1/elasticsearch-2.0.0-beta1.tar.gz |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use the provided dev-tools script for this? get-bwc-version.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because I forgot about it. I'll update.
Thanks @nik9000! I left a couple comments. The changes so far look fine, I think. |
@@ -62,7 +61,7 @@ public void testSimpleFunctionScoreParsingWorks() throws IOException, ExecutionE | |||
.endObject() | |||
.endObject() | |||
.endObject() | |||
.endObject())); | |||
.endObject()).setSettings("number_of_shards", 1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, that was burried in a commit message:
FunctionScoreBackwardCompatibilityIT
relied on tf/idf information but didn't
set the number of shards to 1 or use dfs_query_then_fetch. Now it uses one
shard.
I'll add it to a comment here if you approve.
I'm dropping 2.0.0 from this because its not strictly required for 2.0.0 - its needed for 2.0.1 and 2.1.0. We'll get this soon but I don't want it to be a blocker. |
f8c8a5c
to
c26a648
Compare
Ok - I've finally got this ready for review. It doesn't fix all the tests but I was sinking hour and hours and hours per busted test and I didn't want each one to block getting the rest fixed. I did squash and rebase this during the review and for that I'm sorry - it was getting pretty sprawling and had some twists and turns that ended up not being good ways to fix the tests. But its really really ready for review now. |
Added 3.0 label. This should be merged into 2.0, 2.x, and master branches even though we don't expect master to be backwards compatible with anything. |
return settings; | ||
} | ||
Settings.Builder builder = Settings.settingsBuilder().put(settings); | ||
builder.remove(IndexMetaData.SETTING_DATA_PATH); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you should log something here to make it apparent to anyone reading that the setting was removed? Maybe at warn
level?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
||
|
||
1. Go to | ||
https://oss.sonatype.org/content/repositories/snapshots/org/elasticsearch/distribution/tar/elasticsearch/2.0.0-beta2-SNAPSHOT/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can use rc1 now, but I certainly hope that by the time this PR gets, GA is released
f46c290
to
a5b0344
Compare
Quick update for anyone following along at home: as of the last patch the BWC tests that are not After that I think we have two options:
|
Then you can run the backwards compatibility tests: | ||
|
||
--------------------------------------------------------------------------- | ||
mvn verify -pl core -Dskip.unit.tests -Pdev -Dtests.filter="@backwards and not @awaitsfix" -Dtests.bwc.version=2.0.0-rc1 -Dtests.bwc.path="$(pwd)/backwards" -Dtests.security.manager=false -Dtests.seccomp=false -Dtests.jvms=1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are -Pdev and -Dtests.jvms=1 required?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-Pdev isn't really required, but it does speed up the process.
I had a quick look and left some comments. Also ran tests on my machine, UpgradeIT failed with |
Awesome.
Booo! It didn't fail for me but let me try again. |
// enable secure computing mode | ||
if (seccomp) { | ||
Natives.trySeccomp(tmpFile); | ||
} else { | ||
logger.warn("Not enabling seccomp because it was explicitly disabled"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remove the whitespace changes and the warning? I would also not like to document this in TESTING.md or anywhere else. It defeats the whole point of it. I also don't think we should support allowing tests to do this for any reason, but I know I stand alone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remove the whitespace changes and the warning?
I can.
I would also not like to document this in TESTING.md or anywhere else. It defeats the whole point of it. I also don't think we should support allowing tests to do this for any reason, but I know I stand alone.
I'm happy to spend the next few days working on getting it running under seccomp.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just think we can launch a little perl script in pre-integration-test, kill it in post-integration-test, that sits on a socket and starts up nodes. Its a perl script for testing. Then the rest of the stuff can run without us having to defeat every security mechanism we are trying to add...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah - it looks like some kind of remote executor helper is required for this. I feel like it'd be easier to yank some of the code we already use to start the bwc nodes into a java process started into pre-integration-test than to write it in perl and fight with windows.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, i dont really think it is. I guess i don't understand why we are running backwards tests this way, versus doing real integration tests with the rest suite and similar (like we do in qa). All the stuff needed for that is published as maven artifacts. One of these approaches is a "fantasy world" that isn't quite realistic, it requires manual setup and is difficult to run, and flaky as far as jenkins goes. the other is automatic, realistic and reliable... what am I missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that the dependency resolution is funky here. I think we'd be better off with dependency resolution handled by maven/gradle/whatever then how it works now. I suspect the reason the resolution isn't done that way now is because these are left over from single module days.
I think giving the test process some degree of control of the bwc nodes is useful though - it wants to be able to shut them down and spin up new nodes to replace them to simulate rolling upgrades which seems pretty fair.
The way I see this we need these tests working as soon as we want to make 2.0.1 so I don't know that we want to do major surgery to them. I'm all for figuring a way to get them working under seccomp and I'll work on it. With the goal of eventually moving them into the qa project with real resolution. And no elasticsearch nodes running in process with the tests.
I'm really really concerned about the execution here. I do not think it should be allowed, even in tests. If these tests must really run this way (which i dont believe), we should start up some other background test listening on a socket or similar responsible for launching them, instead of allowing execution from the ES environment. (i already see this "setting a precendent" for more leniency). |
the reason why we run this way is a historic one. This is how I added it back in 1.2 when we had no BWC tests at all. From the todays perspective I think we should move away from how it's done. The question is if it makes sense to make the test work again and then work from there to cut over to a new system that is controlled from the outside ie. basic tests that replace the once that we have?
both can relatively simple be done without starting / stopping nodes from within ES and maybe thats a good start? I still wonder if we should make the ones we have work first and then go from there? |
This patch as it stands gets them working in the same way they worked before. I'm going to work on a little daemon we can start pre-integration-test that can shim the start/stop of There are lots of great things we could do here - move to the qa module and get proper maven dependency resolution so they run on every run - invoke current version of elasticsearch as an |
OK - I've build a service that can be forked pre-integration-test to run these the external nodes. Right now it forks in core's build only if I'm in no way proud of the service's implementation - it gets the job done but I'm sure I've done tons of non normal stuff so feel free to complain about it. |
Another update: the tests now pass with the security manager enabled so I've removed the recommendation to disable it. The external nodes and the internal nodes run under the security manager. Only ExternalNodeService runs without it and it runs in the maven JVM, at least it does the way I've got it rigged right now. |
Yet another update: the tests now no longer need the external setup and download step, they resolve their dependencies using maven. |
* Key used to set the path for the elasticsearch executable used to run backwards compatibility tests from | ||
* via the commandline -D{@value #TESTS_BACKWARDS_COMPATIBILITY} | ||
*/ | ||
public static final String TESTS_BACKWARDS_COMPATIBILITY = "tests.bwc"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hrm, can we rather allow customizing repro info from subclasses of ESIntegTestCase? I don't like having to add everything here...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah - I agree - I think this is nasty. I did it without a whole lot of thought trying to get the move working.
LGTM, let's get this into 2.x/2.1. It won't work on master until we port to gradle, but maybe by then we can re-evaluate how to do bwc tests to simplify this (and use all real nodes instead of some mock and some external). |
Moves backwards compatibility tests from core to qa/backwards and uses a service running inside of maven to fork the elasticsearch nodes so the tests can continue to run in seccomp. Because the tests were moved ReproduceInfoPrinter now cannot hardcode which parameters it prints - it scans them using a new annotation. `CompositeTestCluster#client` now remakes its client with new node addresses after every node is upgraded. That makes it possiblie for it to survive a rolling restart where the new nodes come up using different ports. `CompositeTestCluster#upgradeOneNode` now brings up the upgraded node with the same data directory as the old node which prevents the loss of data when there are no replicas. `ExternalNode` now uses a `unicastHosts` list to find the other running nodes rather than relying on multicast which was moved to a plugin in 2.0. `unicastHosts` is carefully setup by `CompositeTestCluster` to list all the other nodes that are currently running. `FunctionScoreBackwardCompatibilityIT` now forces a single shard so the tf/idf parts of the score don't get juggled. `ReproduceInfoPrinter` now appends properties required to properly reproduce the backwards compatibility tests and properly quotes parameters like `-Dtests.filter="@Backwards and not @awaitsfix`. About 2/3 of the backwards compatibility tests are passing and the rest are marked with `@AwaitsFix(bugUrl="elastic#13522 That issue will track all work getting the backwards compatibility tests back online. The following tests are so marked: - BasicAnalysisBackwardCompatibilityIT - BasicBackwardsCompatibilityIT - RecoveryBackwardsCompatibilityIT - SignificantTermsBackwardCompatibilityIT - SnapshotBackwardsCompatibilityIT Closes elastic#13425
d1ca167
to
285bc37
Compare
Squashed and rebased. |
Fix backwards compatibility tests in 2.x
Merged to 2.x. Cherry-picking to 2.1 now. |
And merged into 2.1: 10315d8 |
These tests have gone stale because 2.0 hasn't needed to be protocol compatible with anything. But 2.0.1 and 2.1.0 will have to be backwards compatible with 2.0.0. Since we don't have 2.0.0 released but we still want to work on features for 2.1.0 that have to be backwards compatible with 2.0.0 we should use 2.0.0-beta1 as a stand in for the eventual 2.0.0 release.
Closes #13425
Closes #13484