Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML] Model should defend against times earlier than epoch 0 #394

Closed
droberts195 opened this issue Feb 15, 2019 · 2 comments · Fixed by #512
Closed

[ML] Model should defend against times earlier than epoch 0 #394

droberts195 opened this issue Feb 15, 2019 · 2 comments · Fixed by #512
Assignees

Comments

@droberts195
Copy link
Contributor

I saw this in a Java test log:

[2019-02-13T11:42:03,268][WARN ][o.e.x.m.d.DatafeedManager] [node-2] [test-lookback-only-given-empty-index] Datafeed lookback retrieved no data
... SKIPPED 9 LINES ...
[2019-02-13T11:42:03,648][WARN ][o.e.t.OutboundHandler    ] [node-2] send message failed [channel: Netty4TcpChannel{localAddress=/127.0.0.1:34349, remoteAddress=/127.0.0.1:57094}]
java.lang.IllegalStateException: Negative longs unsupported, use writeLong or writeZLong for negative numbers [-3600000]
	at org.elasticsearch.common.io.stream.StreamOutput.writeVLong(StreamOutput.java:274) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
	at org.elasticsearch.xpack.core.ml.job.process.autodetect.state.ModelSizeStats.writeTo(ModelSizeStats.java:156) ~[?:?]
	at org.elasticsearch.common.io.stream.StreamOutput.writeOptionalWriteable(StreamOutput.java:854) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
	at org.elasticsearch.xpack.core.ml.action.GetJobsStatsAction$Response$JobStats.writeTo(GetJobsStatsAction.java:277) ~[?:?]
	at org.elasticsearch.common.io.stream.StreamOutput.lambda$writeCollection$28(StreamOutput.java:1056) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
	at org.elasticsearch.common.io.stream.StreamOutput.writeCollection(StreamOutput.java:1075) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
	at org.elasticsearch.common.io.stream.StreamOutput.writeCollection(StreamOutput.java:1056) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
	at org.elasticsearch.common.io.stream.StreamOutput.writeList(StreamOutput.java:1063) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
	at org.elasticsearch.xpack.core.ml.action.util.QueryPage.writeTo(QueryPage.java:57) ~[?:?]
	at org.elasticsearch.action.support.tasks.TransportTasksAction$NodeTasksResponse.writeTo(TransportTasksAction.java:441) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]

However, this did not actually fail any test.

It shows that a model size stats document contained an epoch time of -3600 seconds = -3600000 milliseconds. The Java side cannot cope with this. We should change the C++ code so that it does not create such documents.

A request to generate results for times before the epoch could fail the job, which would then cause a test failure on the Java side that we could fix.

@davidkyle
Copy link
Member

I saw the same error as above it's reproducible with the following:

./gradlew :x-pack:plugin:ml:qa:native-multi-node-tests:integTestRunner -Dtests.seed=DF62D7F71CC2EDB6 -Dtests.class=org.elasticsearch.xpack.ml.integration.DatafeedJobsRestIT -Dtests.method=testLookbackWithoutPermissionsAndRollup

@droberts195
Copy link
Contributor Author

I've tracked down the problem. It's that we unconditionally call CAnomalyJob::refreshMemoryAndReport() when the autodetect process exits even if it didn't see any input data. It's very similar to the problem that was noted for quantiles in #393 and fixed in #437. I'll add a similar fix for model size stats.

@droberts195 droberts195 self-assigned this Jun 26, 2019
droberts195 added a commit to droberts195/ml-cpp that referenced this issue Jun 27, 2019
Similar to the fix for state and quantiles in elastic#437,
if no input is received and time is not advanced then
there is no need to write model size stats when the
autodetect process exits. Doing this can actually
cause a problem for a job that has never ever seen
any input, as the unnecessary model size stats were
written with a negative timestamp. This change also
adds an extra defensive check to prevent that ever
happening, although the only situation when it is
thought to be possible should be prevented by the
first change.

Fixes elastic#394
droberts195 added a commit that referenced this issue Jun 27, 2019
Similar to the fix for state and quantiles in #437,
if no input is received and time is not advanced then
there is no need to write model size stats when the
autodetect process exits. Doing this can actually
cause a problem for a job that has never ever seen
any input, as the unnecessary model size stats were
written with a negative timestamp. This change also
adds an extra defensive check to prevent that ever
happening, although the only situation when it is
thought to be possible should be prevented by the
first change.

Fixes #394
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants