[CI] Accounting Breaker not reset in ML InferenceIngestIT #51201

Closed
tvernum opened this issue Jan 20, 2020 · 5 comments · Fixed by #51267
Labels
:Core/Infra/Circuit Breakers Track estimates of memory consumption to prevent overload :ml Machine learning >test-failure Triaged test failures from CI

Comments

@tvernum
Contributor

tvernum commented Jan 20, 2020

https://gradle-enterprise.elastic.co/s/aqzoul36sqk4e/tests/zcquf3hc3eoda-n22ziczro7cng

:x-pack:plugin:ml:qa:native-multi-node-tests:integTestRunner

org.elasticsearch.xpack.ml.integration.InferenceIngestIT » testSimulate (3.864s)

Accounting breaker not reset to 9318 on node: {integTest-0}{TCh922uPQE6AAizDTivIZQ}{YeirhHa4S7a_gOaTshPgJA}{127.0.0.1}{127.0.0.1:60779}{dilm}{testattr=test, ml.machine_memory=103078793216, ml.max_open_jobs=20, xpack.installed=true}

Expected: <9318L>
but: was <8574L>

[2020-01-19T21:03:18,690][INFO ][o.e.x.m.i.InferenceIngestIT] [testSimulate] before test
[2020-01-19T21:03:18,691][INFO ][o.e.x.m.i.InferenceIngestIT] [testSimulate] [InferenceIngestIT#testSimulate]: setting up test
[2020-01-19T21:03:18,694][INFO ][o.e.p.PluginsService     ] [testSimulate] no modules loaded
[2020-01-19T21:03:18,694][INFO ][o.e.p.PluginsService     ] [testSimulate] loaded plugin [org.elasticsearch.index.reindex.ReindexPlugin]
[2020-01-19T21:03:18,694][INFO ][o.e.p.PluginsService     ] [testSimulate] loaded plugin [org.elasticsearch.transport.Netty4Plugin]
[2020-01-19T21:03:18,694][INFO ][o.e.p.PluginsService     ] [testSimulate] loaded plugin [org.elasticsearch.xpack.core.XPackClientPlugin]
[2020-01-19T21:03:18,807][INFO ][o.e.t.ExternalTestCluster] [testSimulate] Setup ExternalTestCluster [integTest] made of [3] nodes
[2020-01-19T21:03:19,913][INFO ][o.e.x.m.i.InferenceIngestIT] [testSimulate] [InferenceIngestIT#testSimulate]: all set up test
[2020-01-19T21:03:20,757][INFO ][o.e.x.m.i.InferenceIngestIT] [testSimulate] [InferenceIngestIT#testSimulate]: cleaning up after test
[2020-01-19T21:03:22,512][INFO ][o.e.x.m.i.InferenceIngestIT] [testSimulate] after test

Reproduction

./gradlew ':x-pack:plugin:ml:qa:native-multi-node-tests:integTestRunner' \
  --tests "org.elasticsearch.xpack.ml.integration.InferenceIngestIT.testSimulate" \
-Dtests.seed=56DD403935C93EDE -Dtests.security.manager=true \
-Dtests.locale=es-GT -Dtests.timezone=America/North_Dakota/New_Salem \
-Dcompiler.java=13 -Druntime.java=8

It doesn't reproduce for me.

@tvernum tvernum added >test-failure Triaged test failures from CI :Core/Infra/Circuit Breakers Track estimates of memory consumption to prevent overload :ml Machine learning labels Jan 20, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-core-infra (:Core/Infra/Circuit Breakers)

@elasticmachine
Collaborator

Pinging @elastic/ml-core (:ml)

@benwtrent
Member

benwtrent commented Jan 20, 2020

Some ExternalTestCluster integration tests I wrote over the past week have been failing in a way I have never seen before.

All of these failures are due to

java.lang.AssertionError: Accounting breaker not reset to X on node <NODE>
Expected: X
     but: was Y

Y is always less than X by a couple of hundred. I am not sure what ensureEstimatedStats() is trying to determine.

https://github.com/elastic/elasticsearch/blob/5ba4f5fb3c98d7034cb307fd7f6774a33dcd9c94/test/framework/src/main/java/org/elasticsearch/test/ExternalTestCluster.java#L195..L198
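
Judging only from the assertion message, the check appears to compare each node's accounting breaker estimate against an expected value after the test. The sketch below is a minimal, self-contained model of that comparison, not the Elasticsearch code: the class `AccountingBreakerCheckSketch` and the method `describeMismatch` are hypothetical names, and since this thread does not say how the expected value is derived, it is passed in as a plain number. The only real data are the two values from the report.

```java
public class AccountingBreakerCheckSketch {

    /**
     * Hypothetical helper (not part of Elasticsearch): returns null when the
     * breaker estimate matches the expected value, otherwise a message shaped
     * like the one in the failure report, plus the size of the gap.
     */
    static String describeMismatch(String node, long expectedBytes, long breakerEstimateBytes) {
        if (expectedBytes == breakerEstimateBytes) {
            return null;
        }
        long difference = expectedBytes - breakerEstimateBytes;
        return "Accounting breaker not reset to " + expectedBytes + " on node: " + node
            + " (was " + breakerEstimateBytes + ", difference " + difference + ")";
    }

    public static void main(String[] args) {
        // Expected and actual values copied from the failure report in this issue.
        System.out.println(describeMismatch("integTest-0", 9318L, 8574L));
    }
}
```

For the reported failure, the gap between the expected value and the breaker estimate is 9318 - 8574 = 744.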

Both of the failing tests call _ingest/pipeline/simulate and they are using the new(ish) asynchronous ingest document handling. They are also both using the same processor.

Before and after my tests, I create and delete some documents.
In one test I query an index.
In the other, I do not.

I am sometimes able to re-create this scenario in Kibana console using other async processors.
But other times it does not happen. I am not sure what to make of this :/

@droberts195
Contributor

The meaning of "Accounting breaker not reset" is discussed in #30290 (comment).

It probably means that the cleanup steps that run between tests do not account for something related to ingest pipelines, while the tests are written on the assumption that they do. For example, maybe there's an assumption that the generic cleanup code that runs between tests will delete all ingest pipelines, and it doesn't. Or maybe there's an assumption that the generic cleanup code will wait for ingest pipelines that are in the middle of doing something to complete, and it doesn't.
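
The second possibility, a check that runs while asynchronous work is still in flight, can be illustrated in isolation. The following is a toy model under stated assumptions, not a reproduction of this failure and not Elasticsearch code: `AwaitBeforeCheckSketch`, its two counters, and the value `100` are all hypothetical. It only shows that a consistency check between two counters can fail if it runs before an async task finishes, and pass once the task has been awaited.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;

public class AwaitBeforeCheckSketch {
    // Two hypothetical counters that should agree once all work has finished.
    private final AtomicLong trackedBytes = new AtomicLong();
    private final AtomicLong breakerEstimate = new AtomicLong();

    /** Starts async work that updates the counters in two separate steps. */
    CompletableFuture<Void> startAsyncWork(long bytes, CountDownLatch halfway, CountDownLatch release) {
        return CompletableFuture.runAsync(() -> {
            trackedBytes.addAndGet(bytes);    // step 1: first counter updated
            halfway.countDown();              // signal that the task is mid-flight
            awaitQuietly(release);            // stay "in the middle of doing something"
            breakerEstimate.addAndGet(bytes); // step 2: second counter catches up
        });
    }

    boolean inSync() {
        return trackedBytes.get() == breakerEstimate.get();
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Returns {result of checking mid-flight, result of checking after waiting}. */
    static boolean[] demo() {
        AwaitBeforeCheckSketch sketch = new AwaitBeforeCheckSketch();
        CountDownLatch halfway = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        CompletableFuture<Void> work = sketch.startAsyncWork(100L, halfway, release);

        awaitQuietly(halfway);                      // task is now between its two steps
        boolean checkedTooEarly = sketch.inSync();  // counters disagree at this point

        release.countDown();
        work.join();                                // wait for the async work to complete
        boolean checkedAfterWait = sketch.inSync(); // counters agree again
        return new boolean[] {checkedTooEarly, checkedAfterWait};
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("in sync while work in flight: " + result[0]);
        System.out.println("in sync after waiting:        " + result[1]);
    }
}
```

The latches are there only to make the ordering deterministic. In a real test the equivalent step would be to wait for outstanding work to finish before the post-test check runs.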

droberts195 added a commit that referenced this issue Jan 21, 2020
@droberts195
Contributor

I muted the two simulate tests in InferenceIngestIT on master in d64e115

@benwtrent benwtrent self-assigned this Jan 21, 2020
benwtrent added a commit that referenced this issue Jan 22, 2020
Converts InferenceIngestIT into a `ESRestTestCase`.

closes #51201
benwtrent added a commit to benwtrent/elasticsearch that referenced this issue Jan 22, 2020
Converts InferenceIngestIT into a `ESRestTestCase`.

closes elastic#51201
debadair pushed a commit to debadair/elasticsearch that referenced this issue Jan 28, 2020
Converts InferenceIngestIT into a `ESRestTestCase`.

closes elastic#51201