[ML] Categorizer state is not background persisted #1136

@droberts195

The intention is that we persist both anomaly detector state and categorizer state in the background periodically while a job is running (every 3-4 hours by default).

However, since version 7.4.0 the categorizer state has not been persisted in the background. Instead an error message like this is logged:

[2020-03-23T14:01:06,813][ERROR][org.elasticsearch.xpack.ml.process.logging.CppLogMessageHandler] [instance-0000000001] [my_job] [autodetect/15732] [CFieldDataTyper.cc@414] NULL persistence manager

The problem was introduced by the changes in #550. Before that, the categorizer only needed a pointer to the background persister when it was unchained, i.e. running as part of a dedicated categorizer program. #550 changed it to use a new persistence manager class regardless of whether it was chained or not, but when chained (which is always the case in the currently shipped product) it was not passed a pointer to the persistence manager. Hence in production it never does background persistence. (The unit tests cannot detect this because the bug is in the main() function of a production program. Finding it would need an integration test that runs the complete program.)
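
To illustrate the failure mode described above, here is a minimal, self-contained sketch. It is not the actual ml-cpp code: `PersistenceManager`, `Categorizer`, `periodicPersistState` and `runBackgroundPersist` are hypothetical stand-ins, and the `passManagerWhenChained` flag exists only to contrast the buggy wiring with one possible fix.

```cpp
#include <iostream>
#include <string>
#include <vector>

// Hypothetical stand-in for the persistence manager class added in #550.
class PersistenceManager {
public:
    void schedulePersist(const std::string& state) { m_Persisted.push_back(state); }
    const std::vector<std::string>& persisted() const { return m_Persisted; }

private:
    std::vector<std::string> m_Persisted;
};

// Hypothetical stand-in for the categorizer. It holds a non-owning pointer
// that may be null if the caller never supplied a persistence manager.
class Categorizer {
public:
    explicit Categorizer(PersistenceManager* persistenceManager)
        : m_PersistenceManager(persistenceManager) {}

    // Returns true if background persistence was scheduled.
    bool periodicPersistState() {
        if (m_PersistenceManager == nullptr) {
            // Mirrors the error seen in the log: nothing gets persisted.
            std::cerr << "NULL persistence manager\n";
            return false;
        }
        m_PersistenceManager->schedulePersist("categorizer state");
        return true;
    }

private:
    PersistenceManager* m_PersistenceManager;
};

// Mimics the wiring done in a program's main(). With chained == true and
// passManagerWhenChained == false the categorizer gets a null pointer,
// which is the situation this issue describes.
bool runBackgroundPersist(bool chained,
                          bool passManagerWhenChained,
                          PersistenceManager& manager) {
    PersistenceManager* forCategorizer =
        (!chained || passManagerWhenChained) ? &manager : nullptr;
    Categorizer categorizer{forCategorizer};
    return categorizer.periodicPersistState();
}
```

In this sketch, `runBackgroundPersist(true, false, manager)` returns `false` and records nothing, while passing the manager in the chained case makes the same call succeed. Because the fault lies in the wiring function rather than in `Categorizer` itself, a unit test that constructs `Categorizer` with a valid pointer would pass either way.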
