[ML] Need to cope with .ml-config being an alias #38796

Closed
droberts195 opened this issue Feb 12, 2019 · 5 comments

@droberts195
Contributor

The code to use .ml-config for storing configurations currently assumes that .ml-config will be a concrete index, not an alias.

Until now we had assumed that we would have until 7.last to sort this out, as that is when people will potentially do a rolling upgrade to 8.x. However, this assumption is wrong. The migration assistant that can reindex indices in the 6.x format into 7.x format will exist in 7.0. There is nothing to stop somebody reindexing a .ml-config index created in 6.x into 7.x format as soon as they install 7.0.

Currently, doing this causes a problem. I reindexed a 6.x .ml-config index in 7.0.0-beta1 using the migration assistant and got an index called .reindexed-v7-ml-config with an alias .ml-config pointing at it. Some parts of ML still worked in this state, but creating a new job did not:

[2019-02-12T16:13:30,043][DEBUG][o.e.a.b.TransportShardBulkAction] [Davids-MacBook-Pro-7.local] [.reindexed-v7-ml-config][0] failed to execute bulk item (create) index {[.ml-config][doc][anomaly_detector-farequote7], source[{"job_id":"farequote7","job_type":"anomaly_detector","job_version":"7.0.0","create_time":1549988009886,"analysis_config":{"bucket_span":"1h","detectors":[{"detector_description":"metric(responsetime) by airline partitionfield=sourcetype","function":"metric","field_name":"responsetime","by_field_name":"airline","partition_field_name":"sourcetype"}],"influencers":["airline","sourcetype"]},"analysis_limits":{"model_memory_limit":"1024mb","categorization_examples_limit":4},"data_description":{"format":"delimited","time_field":"time","time_format":"yyyy-MM-dd HH:mm:ssX","field_delimiter":",","quote_character":"\""},"model_snapshot_retention_days":1,"results_index_name":"custom-foo"}]}
java.lang.IllegalArgumentException: Rejecting mapping update to [.reindexed-v7-ml-config] as the final mapping would have more than 1 type: [_doc, doc]
        at org.elasticsearch.index.mapper.MapperService.internalMerge(MapperService.java:449) ~[elasticsearch-7.0.0-beta1.jar:7.0.0-beta1]
        at org.elasticsearch.index.mapper.MapperService.internalMerge(MapperService.java:398) ~[elasticsearch-7.0.0-beta1.jar:7.0.0-beta1]
        at org.elasticsearch.index.mapper.MapperService.merge(MapperService.java:331) ~[elasticsearch-7.0.0-beta1.jar:7.0.0-beta1]
        at org.elasticsearch.cluster.metadata.MetaDataMappingService$PutMappingExecutor.applyRequest(MetaDataMappingService.java:315) ~[elasticsearch-7.0.0-beta1.jar:7.0.0-beta1]
        at org.elasticsearch.cluster.metadata.MetaDataMappingService$PutMappingExecutor.execute(MetaDataMappingService.java:238) ~[elasticsearch-7.0.0-beta1.jar:7.0.0-beta1]
        at org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:687) ~[elasticsearch-7.0.0-beta1.jar:7.0.0-beta1]
        at org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:310) ~[elasticsearch-7.0.0-beta1.jar:7.0.0-beta1]
        at org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:210) ~[elasticsearch-7.0.0-beta1.jar:7.0.0-beta1]
        at org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:142) ~[elasticsearch-7.0.0-beta1.jar:7.0.0-beta1]
        at org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150) ~[elasticsearch-7.0.0-beta1.jar:7.0.0-beta1]
        at org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188) ~[elasticsearch-7.0.0-beta1.jar:7.0.0-beta1]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:681) ~[elasticsearch-7.0.0-beta1.jar:7.0.0-beta1]
        at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252) ~[elasticsearch-7.0.0-beta1.jar:7.0.0-beta1]
        at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215) ~[elasticsearch-7.0.0-beta1.jar:7.0.0-beta1]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at java.lang.Thread.run(Thread.java:834) [?:?]

Our 7.0 code needs to be made robust to the possibility that .ml-config is an alias to some other index before 7.0.0 GA.

droberts195 added the >bug, v7.0.0, and :ml Machine learning labels on Feb 12, 2019
@elasticmachine
Collaborator

Pinging @elastic/ml-core

@droberts195
Contributor Author

> but creating a new job did not

I stopped looking when I found this, but that doesn't mean it's the only place that fails to work when .ml-config is an alias. We need to find usages of the relevant constant and review each place where it's used to ensure the code will work whether .ml-config is a concrete index or an alias.
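
To make "works whether .ml-config is a concrete index or an alias" concrete, here is a minimal sketch using a toy in-memory model of cluster metadata. It is not the Elasticsearch API and not the ML plugin's code; the helper name `resolve_concrete_index` and the dict/set layout are hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Set


def resolve_concrete_index(name: str, indices: Set[str], aliases: Dict[str, List[str]]) -> str:
    """Return the single concrete index behind `name` (hypothetical helper).

    `indices` holds concrete index names; `aliases` maps an alias name to the
    concrete indices it points at. Both are toy stand-ins for cluster metadata.
    """
    if name in indices:
        # Already a concrete index: nothing to resolve.
        return name
    targets = aliases.get(name, [])
    if not targets:
        raise KeyError(f"{name} is neither a concrete index nor an alias")
    if len(targets) != 1:
        # Code that assumes a single config index cannot pick one safely here.
        raise ValueError(f"alias {name} points at {len(targets)} indices; expected exactly one")
    return targets[0]


# State described in this issue: after reindexing, .ml-config is an alias.
indices = {".reindexed-v7-ml-config"}
aliases = {".ml-config": [".reindexed-v7-ml-config"]}
print(resolve_concrete_index(".ml-config", indices, aliases))
```

Any code path that reads settings or mappings by looking up the literal name as a concrete index would need an equivalent resolution step first.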

@droberts195
Contributor Author

#38821 shows the similar change that was made for results indices.

@davidkyle
Member

The error arises because the ML code uses doc as its single mapping type, whereas after reindexing the new indices have the mapping type _doc.

> Rejecting mapping update to [.reindexed-v7-ml-config] as the final mapping would have more than 1 type: [_doc, doc]

#39256 describes the problem and closes this.
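
The type clash can be illustrated with a toy model of a single-type-per-index check. This is only a sketch under stated assumptions: `merge_mapping_type` is a hypothetical function, not Elasticsearch's `MapperService`, and the error text is modeled on the log message quoted in this issue.

```python
from typing import Iterable, Set


def merge_mapping_type(index_name: str, existing_types: Iterable[str], incoming_type: str) -> Set[str]:
    """Toy single-type check: an index may end up with at most one mapping type."""
    final_types = set(existing_types) | {incoming_type}
    if len(final_types) > 1:
        listed = ", ".join(sorted(final_types))
        raise ValueError(
            f"Rejecting mapping update to [{index_name}] as the final mapping "
            f"would have more than 1 type: [{listed}]"
        )
    return final_types


# The reindexed index carries type _doc; the ML code still writes with type doc.
try:
    merge_mapping_type(".reindexed-v7-ml-config", ["_doc"], "doc")
except ValueError as err:
    print(err)

# Writing with the type the index already has is accepted.
print(merge_mapping_type(".reindexed-v7-ml-config", ["_doc"], "_doc"))
```

In this model the alias itself is not the cause of the rejection; the write fails because the type used by the writer differs from the type of the index behind the alias.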

@davidkyle
Member

Closed by #39256
