action.auto_create_index evaluated incorrectly for date_index_name pipeline #50015

Open
m9aertner opened this issue Dec 10, 2019 · 4 comments
Labels
:Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP :Distributed/CRUD A catch all label for issues around indexing, updating and getting a doc by id. Not search. Team:Data Management Meta label for data/management team Team:Distributed Meta label for distributed team

Comments

@m9aertner
Contributor

Elasticsearch version (bin/elasticsearch --version):

7.5.0 (also observed with 5.6.16 and 6.8.5)

Plugins installed: []

plain OSS install

JVM version (java -version):

1.8.0_211

OS version (uname -a if on a Unix-like system):

CentOS6

Description of the problem including expected versus actual behavior:

The automatic index creation whitelist (action.auto_create_index) does not appear to be honoured properly when the new index name is computed / set as part of an ingest pipeline.

Steps to reproduce:

# Install a fresh Elasticsearch 7.5.0
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-oss-7.5.0-x86_64.rpm
rpm --install elasticsearch-oss-7.5.0-x86_64.rpm
systemctl start elasticsearch.service

# Limit auto-creation of indices to those that start with "abc-"
http --body PUT :9200/_cluster/settings <<<'{ "persistent": { "action.auto_create_index": "+abc-*" } }'

# Set up a pipeline that creates monthly indexes, named "abc-YYYY-MM"
http --body PUT ":9200/_ingest/pipeline/abc-per-month" <<END
{
  "processors" : [
    {
      "date_index_name" : {
        "field" : "SOME_TIMESTAMP_MS",
        "date_rounding" : "M",
        "index_name_prefix" : "abc-",
        "index_name_format" : "uuuu-MM",
        "date_formats": [ "UNIX_MS" ]
      }
    }
  ]
}
END

# Send some sample data with pipeline (timestamp is 2017-11-19T17:05:11.111Z)
http --body POST ":9200/abc/_bulk" "pipeline==abc-per-month" <<END
{ "index" : { "_id" : "PipelineTest:1" } }
{ "SOME_TIMESTAMP_MS" : 1511111111111, "some": "thing" }
END

# This request fails unexpectedly, and the error message looks inaccurate:
# the computed index name does match: "abc-2017-11" matches "+abc-*".
# Maybe the condition check is simply reversed? Or the pattern matching is inaccurate?
{
    "errors": true, 
    "ingest_took": 11, 
    "items": [
        {
            "index": {
                "_id": "PipelineTest:1", 
                "_index": "<abc-{2017-11||/M{uuuu-MM|UTC}}>", 
                "_type": "_doc", 
                "error": {
                    "index": "<abc-{2017-11||/M{uuuu-MM|UTC}}>", 
                    "index_uuid": "_na_", 
                    "reason": "no such index [<abc-{2017-11||/M{uuuu-MM|UTC}}>] and [action.auto_create_index] ([+abc-*]) doesn't match", 
                    "type": "index_not_found_exception"
                }, 
                "status": 404
            }
        }
    ], 
    "took": 5
}

# Note that normal implicit index creation does work (without pipeline):
# Index abc-something gets created implicitly, OK.
http --body POST ":9200/abc-something/_bulk" <<END
{ "index" : { "_id" : "PipelineTest:1" } }
{ "SOME_TIMESTAMP_MS" : 1511111111111, "some": "thing" }
END


# Workaround: disable the action.auto_create_index whitelist (set it to true, i.e. the default setting)
http --body PUT :9200/_cluster/settings <<<'{ "persistent": { "action.auto_create_index": "true" } }'

# Now index "abc-2017-11" gets created correctly, even with the pipeline.
http --body POST ":9200/abc/_bulk" "pipeline==abc-per-month" <<END
{ "index" : { "_id" : "PipelineTest:1" } }
{ "SOME_TIMESTAMP_MS" : 1511111111111, "some": "thing" }
END

{
    "errors": false, 
    "ingest_took": 1, 
    "items": [
        {
            "index": {
                "_id": "PipelineTest:1", 
                "_index": "abc-2017-11", 
                "_primary_term": 1, 
                "_seq_no": 0, 
                "_shards": {
                    "failed": 0, 
                    "successful": 1, 
                    "total": 2
                }, 
                "_type": "_doc", 
                "_version": 1, 
                "result": "created", 
                "status": 201
            }
        }
    ], 
    "took": 412
}
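
One possible reading of the error in the failing request above, offered as an assumption rather than a confirmed diagnosis: the name reported in the response is still the date math expression `<abc-{2017-11||/M{uuuu-MM|UTC}}>`, which starts with `<` and so cannot match `abc-*` literally, whereas the resolved name `abc-2017-11` would. The Python sketch below illustrates this with a simplified model of the allow-list. The helpers `expected_index_name` and `allowed_by_setting` are hypothetical names, and `fnmatchcase` only approximates Elasticsearch's wildcard matching.

```python
from datetime import datetime, timezone
from fnmatch import fnmatchcase


def expected_index_name(prefix: str, epoch_ms: int) -> str:
    """Monthly index name the pipeline is expected to produce (prefix + uuuu-MM, UTC)."""
    ts = datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc)
    return f"{prefix}{ts:%Y-%m}"


def allowed_by_setting(setting: str, index_name: str) -> bool:
    """Simplified, assumed model of action.auto_create_index:
    the first matching +/- pattern decides; no match means deny."""
    for raw in setting.split(","):
        pattern = raw.strip()
        allow = not pattern.startswith("-")
        if pattern.startswith(("+", "-")):
            pattern = pattern[1:]
        if fnmatchcase(index_name, pattern):
            return allow
    return False


# Timestamp and prefix taken from the reproduction steps.
resolved = expected_index_name("abc-", 1511111111111)
# Index name copied verbatim from the error response.
unresolved = "<abc-{2017-11||/M{uuuu-MM|UTC}}>"

print(resolved, allowed_by_setting("+abc-*", resolved))
print(unresolved, allowed_by_setting("+abc-*", unresolved))
```

If this reading is right, the whitelist check may be running against the name before the date math expression is resolved, but that has not been verified against the Elasticsearch source.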

Provide logs (if relevant):

Logs do not show errors, just the cluster settings changes, including

[2019-12-10T08:59:12,405][INFO ][o.e.c.s.ClusterSettings  ] [...] updating [action.auto_create_index] from [true] to [+abc-*]
[2019-12-10T09:01:23,664][INFO ][o.e.c.s.ClusterSettings  ] [...] updating [action.auto_create_index] from [+abc-*] to [true]

[2019-12-10T09:01:30,664][INFO ][o.e.c.m.MetaDataCreateIndexService] [...] [abc-2017-11] creating index, cause [auto(bulk api)], templates [], shards [1]/[1], mappings []
[2019-12-10T09:01:30,928][INFO ][o.e.c.m.MetaDataMappingService] [...] [abc-2017-11/5j7hpltaTsK0KsD79YafKg] create_mapping [_doc]

[2019-12-10T09:07:30,674][INFO ][o.e.c.m.MetaDataCreateIndexService] [...] [abc-something] creating index, cause [auto(bulk api)], templates [], shards [1]/[1], mappings []
[2019-12-10T09:07:30,761][INFO ][o.e.c.m.MetaDataMappingService] [...] [abc-something/PBdRe703TLmzaNKo9C6TGQ] create_mapping [_doc]

Some References
#20640, #22435

@cbuescher cbuescher added :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP :Distributed/CRUD A catch all label for issues around indexing, updating and getting a doc by id. Not search. labels Dec 10, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (:Distributed/CRUD)

@elasticmachine
Collaborator

Pinging @elastic/es-core-features (:Core/Features/Ingest)

@rjernst rjernst added Team:Data Management Meta label for data/management team Team:Distributed Meta label for distributed team labels May 4, 2020
@de846

de846 commented Aug 11, 2021

I'm also observing this behavior with 7.10.2.

@cwright-onshape

cwright-onshape commented Jan 13, 2022

Still happening in 7.13.3. Any plan to fix this?

Encountered this while setting up legacy monitoring:

Caused by: org.elasticsearch.index.IndexNotFoundException: no such index [.monitoring-es-7-2022.01.13] and [action.auto_create_index] ([[.*]]) doesn't match
	at org.elasticsearch.action.support.AutoCreateIndex.shouldAutoCreate(AutoCreateIndex.java:104) ~[elasticsearch-7.13.3.jar:7.13.3]
	at org.elasticsearch.action.admin.indices.create.AutoCreateAction$TransportAction$1.execute(AutoCreateAction.java:147) ~[elasticsearch-7.13.3.jar:7.13.3]
	at org.elasticsearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:48) ~[elasticsearch-7.13.3.jar:7.13.3]
	at org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:691) ~[elasticsearch-7.13.3.jar:7.13.3]
	at org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:313) ~[elasticsearch-7.13.3.jar:7.13.3]
	at org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:208) ~[elasticsearch-7.13.3.jar:7.13.3]
	at org.elasticsearch.cluster.service.MasterService.access$000(MasterService.java:62) ~[elasticsearch-7.13.3.jar:7.13.3]
	at org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:140) ~[elasticsearch-7.13.3.jar:7.13.3]
	at org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:139) ~[elasticsearch-7.13.3.jar:7.13.3]
	at org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:177) ~[elasticsearch-7.13.3.jar:7.13.3]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:673) ~[elasticsearch-7.13.3.jar:7.13.3]
	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:241) ~[elasticsearch-7.13.3.jar:7.13.3]
	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:204) ~[elasticsearch-7.13.3.jar:7.13.3]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) ~[?:?]
	at java.lang.Thread.run(Thread.java:831) ~[?:?]
