New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rationalise ML native multi node test base classes, setup and teardown #49582
Comments
Pinging @elastic/ml-core (:ml) |
Why prefer I suppose another option would be to use the HLRC if we go with REST. Do we have that option? |
I guess at the time the tests were first written that was a good reason and probably explains why most tests indirectly extend But that is using the node client which is not possible for real users. It seems like it would be desirable now the HLRC is feature complete to switch to it. And it's supposed to be easy to migrate to because all the method signatures are identical between the transport client and HLRC. So migrating to that should just involve changing The only problem will come if there are places in the native multi node tests where we call actions that are meant to be internal only, and don't have corresponding REST endpoints. Another thing we could do if switching all the tests to use the same base class is problematic would be to split the tests into two parts: |
Also, there have been multiple times when working on a new project when the HLRC actions have yet to be written. BUT I am writing a new API that needs integration testing across multiple nodes. Manually creating REST requests is a pain in this scenario. |
Since this issue was first opened things have changed because the transport client no longer exists. These ML tests are now the only significant part of the code still using the node client to automate external integration tests - see #101808. We should start to plan the migration strategy to move all the ML native multi node tests to be REST tests. |
`ExternalTestCluster` doesn't really make sense now that the transport client is removed. We only use it in the ML integ test suite and it'd be good to avoid expanding its usage further, so this commit deprecates it and removes the functionality in `ESIntegTestCase` that might quietly switch to using it in a new test suite if running with certain system properties. Relates elastic#49582
Pinging @elastic/ml-core (Team:ML) |
`ExternalTestCluster` doesn't really make sense now that the transport client is removed. We only use it in the ML integ test suite and it'd be good to avoid expanding its usage further, so this commit deprecates it and removes the functionality in `ESIntegTestCase` that might quietly switch to using it in a new test suite if running with certain system properties. Relates #49582
The ML native multi node tests now contain a large number of test classes.
Some of these extend
MlNativeIntegTestCase
which in turn extendsESIntegTestCase
. Others extendESRestTestCase
.ESIntegTestCase
andESRestTestCase
both extendESTestCase
and both add extra setup/teardown code and hooks, but this is different between the two.All the ML native multi node tests share the same external test cluster. The differences in base classes between the various test classes makes it extremely hard to reason about exactly what state the cluster is in when a particular test runs, as the test that ran previously could have been in a different test class running different teardown code. In particular, it is not clear whether the previous test removed all the ML index templates or not. This can lead to very intermittent failures. For example, we suspect that https://gradle-enterprise.elastic.co/s/qbotz2vnmsija/tests/zcquf3hc3eoda-2zjlbn7rqx7gc was caused by this, but that is not a common failure.
As a proposal to make the state of the external multi node test cluster easier to reason about, I propose that:
MlNativeIntegTestCase
, and that should be changed to extendESRestTestCase
MlNativeIntegTestCase
,MlNativeAutodetectIntegTestCase
orMlNativeDataFrameAnalyticsIntegTestCase
The text was updated successfully, but these errors were encountered: