Create system indices on startup if configured #61656

Closed
Tracked by #50251
jaymode opened this issue Aug 27, 2020 · 11 comments
Labels
:Core/Infra/Core Core issues without another label >enhancement Team:Core/Infra Meta label for core/infra team

Comments

@jaymode
Member

jaymode commented Aug 27, 2020

System indices are necessary for Elasticsearch and the stack to function and should be available as soon as possible. Upon introduction of a new system index or startup of a cluster, a mechanism should be provided so that system indices can be automatically created if configured to do so.

Some system indices that belong to the stack, such as the Kibana indices, will not have their configuration stored in the system index plugin, so these are examples of system indices that we would not want to be created automatically.

For system indices that we know the configuration for ahead of time, we should enable automated creation so that multiple components do not each need to maintain their own methods of creating these indices and later updating them. Configuration in this context means the mappings, settings, and any aliases necessary for the system indices.

@jaymode jaymode added >enhancement :Core/Infra/Core Core issues without another label labels Aug 27, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-core-infra (:Core/Infra/Core)

@elasticmachine elasticmachine added the Team:Core/Infra Meta label for core/infra team label Aug 27, 2020
@jaymode jaymode mentioned this issue Aug 27, 2020
23 tasks
@pugnascotia
Contributor

For system indices that we know the configuration for ahead of time

What would be an example of a system index for which we don't know the configuration ahead of time?

@jaymode
Member Author

jaymode commented Oct 14, 2020

Kibana is an example since plugins can change some of their mappings.

@pugnascotia
Contributor

I spent a while looking for prior art in the codebase, and the closest thing seems to be IndexTemplateRegistry. Most of that class concerns templates and policies, but its cluster state listener implements logic to look for required templates, and loads them if they are missing. We can build something similar for system indices, taking as much care as we can to check that the cluster is in a suitable state.
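
The listener approach described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not Elasticsearch code: `MissingResourceListenerSketch`, `ClusterView`, and the `recovered` and `localNodeIsMaster` flags are hypothetical stand-ins for the real cluster state and listener types, and the "suitable state" checks are assumptions about what such a guard might look like.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: none of these types are Elasticsearch classes.
final class MissingResourceListenerSketch {

    // Simplified stand-in for cluster state.
    static final class ClusterView {
        final boolean recovered;         // assumption: state is recovered and safe to act on
        final boolean localNodeIsMaster; // assumption: only the elected master creates resources
        final Set<String> existing;      // names of resources that already exist

        ClusterView(boolean recovered, boolean localNodeIsMaster, Set<String> existing) {
            this.recovered = recovered;
            this.localNodeIsMaster = localNodeIsMaster;
            this.existing = existing;
        }
    }

    private final List<String> required;
    final List<String> created = new ArrayList<>(); // what this sketch has "created" so far

    MissingResourceListenerSketch(List<String> required) {
        this.required = required;
    }

    // Called on every cluster state change.
    void clusterChanged(ClusterView view) {
        if (!view.recovered || !view.localNodeIsMaster) {
            return; // cluster is not in a suitable state yet
        }
        for (String name : required) {
            if (!view.existing.contains(name) && !created.contains(name)) {
                created.add(name); // a real implementation would submit a create request here
            }
        }
    }

    public static void main(String[] args) {
        MissingResourceListenerSketch listener =
            new MissingResourceListenerSketch(Arrays.asList(".example-a", ".example-b"));
        // First change: cluster not recovered, so nothing is created.
        listener.clusterChanged(new ClusterView(false, true, new HashSet<>()));
        // Second change: recovered, and only .example-a exists.
        listener.clusterChanged(new ClusterView(true, true, new HashSet<>(Arrays.asList(".example-a"))));
        System.out.println(listener.created);
    }
}
```

The sketch tracks what it has already requested so that repeated cluster state updates do not trigger duplicate creation attempts.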

@jaymode
Member Author

jaymode commented Oct 19, 2020

There is also a MetadataIndexTemplateService that we should look at as prior art. I think we can model something after either one to pre-create the system indices. (As an aside, at some point I'd like to have only a single implementation for index template installation and upgrades.)

@pugnascotia
Contributor

It's a shame we don't have a single implementation for creating templates. I have some experimental code to declare which templates a system index relies upon, wait until they exist, and then create the system index. However, that doesn't work for e.g. .async-search, because that index is created programmatically, including settings and mappings.

@jaymode
Member Author

jaymode commented Oct 21, 2020

I think we should move away from templates for system indices that we are going to create; the values should be handled in code like async search does. I have been thinking about this some more today and one thought that I had relates to the auto creation of indices; we could defer creating until time of write. We would just hook into the creation of indices so that write requests will create the system index with all of the proper mappings/settings. This has a downside that get and search requests would still need to handle IndexNotFoundException.

@gwbrown @williamrandolph thoughts?
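
To make the create-on-write idea concrete, here is a minimal in-memory sketch. It is not Elasticsearch code: `CreateOnWriteSketch`, `MissingIndexException` (standing in for `IndexNotFoundException`), the `.example-system` index name, and the placeholder mapping string are all hypothetical. It only illustrates that writes create the index from predefined configuration, while reads have to cope with the index not existing yet.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory sketch of create-on-write; not Elasticsearch code.
final class CreateOnWriteSketch {

    // Stand-in for Elasticsearch's IndexNotFoundException.
    static final class MissingIndexException extends RuntimeException {
        MissingIndexException(String index) {
            super("no such index [" + index + "]");
        }
    }

    // Configuration known ahead of time, keyed by system index name (value is a placeholder).
    private final Map<String, String> knownMappings = new HashMap<>();
    // Indices that currently exist, each holding documents by id.
    private final Map<String, Map<String, String>> indices = new HashMap<>();

    CreateOnWriteSketch() {
        knownMappings.put(".example-system", "{\"dynamic\":\"strict\"}");
    }

    // Writes create the index on demand, using the predefined configuration.
    void index(String index, String id, String doc) {
        if (!indices.containsKey(index)) {
            if (!knownMappings.containsKey(index)) {
                throw new IllegalArgumentException("no predefined configuration for [" + index + "]");
            }
            indices.put(index, new HashMap<>()); // a real system would apply mappings and settings here
        }
        indices.get(index).put(id, doc);
    }

    // Reads never create the index, so callers must handle its absence.
    String get(String index, String id) {
        Map<String, String> docs = indices.get(index);
        if (docs == null) {
            throw new MissingIndexException(index);
        }
        return docs.get(id);
    }

    // One way for a caller to treat "index not created yet" as "no document".
    String getOrNull(String index, String id) {
        try {
            return get(index, id);
        } catch (MissingIndexException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        CreateOnWriteSketch store = new CreateOnWriteSketch();
        System.out.println(store.getOrNull(".example-system", "1")); // index does not exist yet
        store.index(".example-system", "1", "{\"field\":\"value\"}");
        System.out.println(store.get(".example-system", "1"));
    }
}
```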

@jaymode
Member Author

jaymode commented Oct 22, 2020

I'll add a bit more on why it might be best to avoid creating system indices that may go unused: each shard has overhead, and our docs state a guideline of 20 shards per GB of heap. For small nodes and clusters, pre-creating unused indices consumes already scarce resources.
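
As a rough worked example of that overhead, the sketch below applies the 20-shards-per-GB guideline quoted above. The node size and the number of pre-created system indices are purely illustrative assumptions, and `ShardBudgetSketch` is a hypothetical name.

```java
// Hypothetical arithmetic sketch; the index counts below are illustrative, not measured.
final class ShardBudgetSketch {

    // Guideline quoted in the comment above: 20 shards per GB of heap.
    static final int SHARDS_PER_GB_HEAP = 20;

    // Total number of shards the guideline allows for a given heap size.
    static int shardBudget(int heapGb) {
        return heapGb * SHARDS_PER_GB_HEAP;
    }

    // Percentage of the budget consumed by pre-created system index shards.
    static double percentUsed(int heapGb, int systemShards) {
        return 100.0 * systemShards / shardBudget(heapGb);
    }

    public static void main(String[] args) {
        int heapGb = 1;       // a small node
        int systemShards = 5; // assumption: five single-shard system indices, no replicas
        System.out.println(shardBudget(heapGb));               // shards allowed by the guideline
        System.out.println(percentUsed(heapGb, systemShards)); // share taken by system indices
    }
}
```

Under these assumptions, a 1 GB node has a budget of 20 shards, and five unused single-shard system indices would take a quarter of it.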

@pugnascotia
Contributor

That was indeed my concern too about pre-creating indices. So perhaps we can define some common patterns for dealing with possibly-not-existing-yet indices, and creating them on-demand. Then our plugins and 3rd-party plugin authors could just hook into the common framework.

@rudolf
Contributor

rudolf commented Nov 18, 2020

Just summarising some earlier email discussions:

Since Kibana's indices have dynamic mappings, they can't be created on startup. However, indices are still sometimes auto-created due to user error, which creates a situation that is hard to recover from.

Kibana would like the Elasticsearch Kibana plugin to disable auto-create for all its system indices: elastic/kibana#81790

pugnascotia added a commit to pugnascotia/elasticsearch that referenced this issue Nov 30, 2020
Part of elastic#61656.

Add the necessary support for automatically creating and updating system
indices. This works by making it possible to create a system index
descriptor with all the information needed to manage the mappings,
settings and aliases.

Follow-up work will opt existing indices into this framework.
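
To illustrate what a descriptor carrying "all the information needed to manage the mappings, settings and aliases" might look like, here is a minimal sketch. `DescriptorSketch`, its builder, and every field and method name are hypothetical and are not taken from the actual commit; the index name, alias, and mapping JSON are placeholders.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: field and method names are illustrative, not the real descriptor API.
final class DescriptorSketch {
    final String primaryIndex;          // concrete index name to create
    final String aliasName;             // optional alias, may be null
    final String mappings;              // mappings as a JSON string
    final Map<String, String> settings; // index settings

    private DescriptorSketch(Builder b) {
        // Without a name and mappings the index cannot be created automatically.
        if (b.primaryIndex == null || b.mappings == null) {
            throw new IllegalStateException("primary index and mappings are required");
        }
        this.primaryIndex = b.primaryIndex;
        this.aliasName = b.aliasName;
        this.mappings = b.mappings;
        this.settings = Collections.unmodifiableMap(new LinkedHashMap<>(b.settings));
    }

    static Builder builder() {
        return new Builder();
    }

    static final class Builder {
        private String primaryIndex;
        private String aliasName;
        private String mappings;
        private final Map<String, String> settings = new LinkedHashMap<>();

        Builder primaryIndex(String name) { this.primaryIndex = name; return this; }
        Builder aliasName(String name) { this.aliasName = name; return this; }
        Builder mappings(String json) { this.mappings = json; return this; }
        Builder setting(String key, String value) { settings.put(key, value); return this; }
        DescriptorSketch build() { return new DescriptorSketch(this); }
    }

    public static void main(String[] args) {
        DescriptorSketch descriptor = DescriptorSketch.builder()
            .primaryIndex(".example-system-1")
            .aliasName(".example-system")
            .mappings("{\"_meta\":{\"version\":1},\"dynamic\":\"strict\"}")
            .setting("index.number_of_shards", "1")
            .build();
        System.out.println(descriptor.primaryIndex + " -> " + descriptor.settings);
    }
}
```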
@rjernst rjernst added the needs:triage Requires assignment of a team area label label Dec 3, 2020
pugnascotia added a commit that referenced this issue Dec 4, 2020
Part of #61656.

Add the necessary support for automatically creating and updating system
indices. This works by making it possible to create a system index
descriptor with all the information needed to manage the mappings,
settings and aliases.

Follow-up work will opt existing indices into this framework.
mark-vieira pushed a commit to mark-vieira/elasticsearch that referenced this issue Dec 4, 2020
Part of elastic#61656.

Add the necessary support for automatically creating and updating system
indices. This works by making it possible to create a system index
descriptor with all the information needed to manage the mappings,
settings and aliases.

Follow-up work will opt existing indices into this framework.
pugnascotia added a commit to pugnascotia/elasticsearch that referenced this issue Dec 7, 2020
Part of elastic#61656.

Add the necessary support for automatically creating and updating system
indices. This works by making it possible to create a system index
descriptor with all the information needed to manage the mappings,
settings and aliases.

Follow-up work will opt existing indices into this framework.
pugnascotia added a commit that referenced this issue Dec 9, 2020
Backport of #65604.

Part of #61656.

Add the necessary support for automatically creating and updating system
indices. This works by making it possible to create a system index
descriptor with all the information needed to manage the mappings,
settings and aliases.

Follow-up work will opt existing indices into this framework.
pugnascotia added a commit to pugnascotia/elasticsearch that referenced this issue Dec 10, 2020
Part of elastic#61656. Auto-create the .logstash index using the system index
infrastructure.
pugnascotia added a commit that referenced this issue Jan 5, 2021
Backport of #66190.

Part of #61656. Auto-create the `.logstash` index using the system index
infrastructure.
pugnascotia added a commit that referenced this issue Jan 5, 2021
Backport of #65959.

Part of #61656. Change the `.tasks` system index descriptor so that
the index can be automatically managed by Elasticsearch e.g. created
on-demand, mapping kept up-to-date, etc.

Also add an integration test to exercise the `SystemIndexManager`
end-to-end, and cherry-pick #66605 to add more system index tests.
pugnascotia added a commit that referenced this issue Jan 6, 2021
Backport of #66276.

Part of #61656. Use the system indices auto-creation infrastructure
for the searchable snapshots plugin.
pugnascotia added a commit to pugnascotia/elasticsearch that referenced this issue Jan 7, 2021
Part of elastic#61656. Managing `.tasks` via the system index infrastructure is
causing a BWC issue, so this commit reverts the index back to being
created via a template until we can figure out the problem.
pugnascotia added a commit that referenced this issue Jan 7, 2021
Part of #61656. Managing `.tasks` via the system index infrastructure is
causing a BWC issue, so this commit reverts the index back to being
created via a template until we can figure out the problem.
pugnascotia added a commit to pugnascotia/elasticsearch that referenced this issue Jan 7, 2021
Part of elastic#61656. Partial revert of elastic#66640.

Managing `.tasks` via the system index infrastructure is
causing a BWC issue, so this commit reverts the index back to being
created via a template until we can figure out the problem.
pugnascotia added a commit that referenced this issue Jan 8, 2021
Part of #61656. Partial revert of #66640.

Managing .tasks via the system index infrastructure is
causing a BWC issue, so this commit reverts the index back to being
created via a template until we can figure out the problem.
pugnascotia added a commit that referenced this issue Feb 2, 2021
Part of #61656.

Change the Security plugin so that its system indices are managed automatically
by the system indices infrastructure.

Also add an `origin` field to `CreateIndexRequest` and `UpdateSettingsRequest`.
pugnascotia added a commit to pugnascotia/elasticsearch that referenced this issue Feb 2, 2021
Backport of elastic#67114. Part of elastic#61656.

Change the Security plugin so that its system indices are managed automatically
by the system indices infrastructure.

Also add an `origin` field to `CreateIndexRequest` and `UpdateSettingsRequest`.
pugnascotia added a commit that referenced this issue Feb 9, 2021
Backport of #67114. Part of #61656.

Change the Security plugin so that its system indices are managed automatically
by the system indices infrastructure.

Also add an `origin` field to `CreateIndexRequest` and `UpdateSettingsRequest`.
pugnascotia added a commit that referenced this issue Feb 22, 2021
Part of #61656.

Migrate the `.watches` and `.triggered_watches` system indices to use the auto-create infrastructure. The watcher history indices are left alone.

As part of this work, a `SystemIndexDescriptor` now inspects its mappings to determine whether it has any dynamic mappings. This influences how strict Elasticsearch is with enforcing the descriptor's mappings, since ES cannot know in advance what all the mappings will be.

This PR also fixes the `SystemIndexManager` so that (1) it doesn't fall over when attempting to inspect the state of an index that hasn't been created yet, and (2) it does update mappings if there's no version field in the mapping metadata.
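
The two checks mentioned in this commit message could look roughly like the sketch below. This is an assumption-laden illustration, not the actual implementation: `MappingInspectionSketch` is a hypothetical class, mappings are modelled as nested `Map<String, Object>` values, "dynamic" is treated as enabled only when set to `true`, and the version is assumed to be an integer stored under a hypothetical `_meta.version` key.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the two checks; not the actual Elasticsearch implementation.
final class MappingInspectionSketch {

    // Returns true if any level of the mapping sets "dynamic" to true.
    @SuppressWarnings("unchecked")
    static boolean hasDynamicMappings(Map<String, Object> mapping) {
        for (Map.Entry<String, Object> entry : mapping.entrySet()) {
            Object value = entry.getValue();
            if ("dynamic".equals(entry.getKey()) && (Boolean.TRUE.equals(value) || "true".equals(value))) {
                return true;
            }
            if (value instanceof Map && hasDynamicMappings((Map<String, Object>) value)) {
                return true;
            }
        }
        return false;
    }

    // Assumption: the mapping version is an Integer under _meta.version.
    static boolean needsMappingUpdate(Map<String, Object> currentMapping, int descriptorVersion) {
        Object meta = currentMapping.get("_meta");
        if (!(meta instanceof Map)) {
            return true; // no metadata at all: treat the mapping as out of date
        }
        Object version = ((Map<?, ?>) meta).get("version");
        if (!(version instanceof Integer)) {
            return true; // no version field: update rather than skip
        }
        return (Integer) version < descriptorVersion;
    }

    public static void main(String[] args) {
        Map<String, Object> meta = new HashMap<>();
        meta.put("version", 1);
        Map<String, Object> mapping = new HashMap<>();
        mapping.put("_meta", meta);
        mapping.put("dynamic", "strict");
        System.out.println(hasDynamicMappings(mapping));
        System.out.println(needsMappingUpdate(mapping, 2));
    }
}
```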
pugnascotia added a commit to pugnascotia/elasticsearch that referenced this issue Feb 22, 2021
Part of elastic#61656.

Migrate the `.watches` and `.triggered_watches` system indices to use
the auto-create infrastructure. The watcher history indices are left
alone.

As part of this work, a `SystemIndexDescriptor` now inspects its
mappings to determine whether it has any dynamic mappings. This
influences how strict Elasticsearch is with enforcing the descriptor's
mappings, since ES cannot know in advance what all the mappings will
be.

This PR also fixes the `SystemIndexManager` so that (1) it doesn't fall
over when attempting to inspect the state of an index that hasn't been
created yet, and (2) does update mappings if there's no version field in
the mapping metadata.
pugnascotia added a commit that referenced this issue Feb 23, 2021
Backport of #67588. Part of #61656.

Migrate the `.watches` and `.triggered_watches` system indices to use
the auto-create infrastructure. The watcher history indices are left
alone.

As part of this work, a `SystemIndexDescriptor` now inspects its
mappings to determine whether it has any dynamic mappings. This
influences how strict Elasticsearch is with enforcing the descriptor's
mappings, since ES cannot know in advance what all the mappings will
be.

This PR also fixes the `SystemIndexManager` so that (1) it doesn't fall
over when attempting to inspect the state of an index that hasn't been
created yet, and (2) does update mappings if there's no version field in
the mapping metadata.
@pugnascotia
Contributor

We have now migrated all system indices that Elasticsearch can manage itself to the system indices framework. As such, this issue can now be closed.

alyokaz pushed a commit to alyokaz/elasticsearch that referenced this issue Mar 10, 2021
Part of elastic#61656.

Add the necessary support for automatically creating and updating system
indices. This works by making it possible to create a system index
descriptor with all the information needed to manage the mappings,
settings and aliases.

Follow-up work will opt existing indices into this framework.
alyokaz pushed a commit to alyokaz/elasticsearch that referenced this issue Mar 10, 2021
Part of elastic#61656. Use the system indices auto-creation infrastructure
for the searchable snapshots plugin.
alyokaz pushed a commit to alyokaz/elasticsearch that referenced this issue Mar 10, 2021
Part of elastic#61656. Auto-create the .logstash index using the system index
infrastructure.
alyokaz pushed a commit to alyokaz/elasticsearch that referenced this issue Mar 10, 2021
Part of elastic#61656.

Migrate async search to use an auto-created system index. This does
change the behaviour of `AsyncTaskIndexService` - previously, it
would ensure the index existed before carrying out any operation,
whereas now the index is only created when a document is created.
For any other operation, the wrapped `IndexNotFoundException` will
be allowed to bubble up.
alyokaz pushed a commit to alyokaz/elasticsearch that referenced this issue Mar 10, 2021
Part of elastic#61656. Change the `.tasks` system index descriptor so that
the index can be automatically managed by Elasticsearch e.g. created
on-demand, mapping kept up-to-date, etc.

Also add an integration test to exercise the `SystemIndexManager`
end-to-end.
alyokaz pushed a commit to alyokaz/elasticsearch that referenced this issue Mar 10, 2021
Part of elastic#61656. Managing `.tasks` via the system index infrastructure is
causing a BWC issue, so this commit reverts the index back to being
created via a template until we can figure out the problem.
alyokaz pushed a commit to alyokaz/elasticsearch that referenced this issue Mar 10, 2021
Part of elastic#61656.

Change the Security plugin so that its system indices are managed automatically
by the system indices infrastructure.

Also add an `origin` field to `CreateIndexRequest` and `UpdateSettingsRequest`.
easyice pushed a commit to easyice/elasticsearch that referenced this issue Mar 25, 2021
Part of elastic#61656.

Change the Security plugin so that its system indices are managed automatically
by the system indices infrastructure.

Also add an `origin` field to `CreateIndexRequest` and `UpdateSettingsRequest`.
easyice pushed a commit to easyice/elasticsearch that referenced this issue Mar 25, 2021
Part of elastic#61656.

Migrate the `.watches` and `.triggered_watches` system indices to use the auto-create infrastructure. The watcher history indices are left alone.

As part of this work, a `SystemIndexDescriptor` now inspects its mappings to determine whether it has any dynamic mappings. This influences how strict Elasticsearch is with enforcing the descriptor's mappings, since ES cannot know in advance what all the mappings will be.

This PR also fixes the `SystemIndexManager` so that (1) it doesn't fall over when attempting to inspect the state of an index that hasn't been created yet, and (2) it does update mappings if there's no version field in the mapping metadata.
5 participants