-
Notifications
You must be signed in to change notification settings - Fork 24.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create system indices on-demand #64533
Create system indices on-demand #64533
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking that it might be best to take a different approach and hook into the index creation mechanisms. My thought was mappings and settings get defined on a system index descriptor and we know by the name of the index what mappings and settings to apply to it, so when we get a create index request for a system index we know exactly what to do (at least for the ones we intend to have the system manage creation for).
@jaymode that's an interesting option. I assume we'd hook into That made me wonder - do we have any guards against a user creating e.g. Actually, I wonder if we can expand that do any auto-create index request that matches the system index patterns e.g. |
No we do not. This a goal for the project and I view this as a step to getting us there.
I'm not sure if who matters as much as the index being created correctly at the moment. We can implement security restrictions so that only the In my mind, I see us doing something like:
@gwbrown @williamrandolph thoughts? |
This is pretty much what I had in mind as well. Something like this to ensure the index has been created prior to every indexing operation might be necessary as things stand today, because it's very possible for the index to unexpectedly get deleted. But in the brave new world of protected system indices, we won't have to really worry about that case - we can either make sure that the index is created at startup (if that's necessary) or ensure that the right settings/mapping is applied when the index is auto-created. |
I've pushed a swathe of changes to create system indices on-demand. I also changed the security index to use this, since it's a complicated example. A whole bunch of tests are going to fail, I expect, but I want to see what happens 😄 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took a quick look and left a couple of things I think we need to work on.
|
||
/** | ||
* This class holds the {@link SystemIndexDescriptor} objects that represent system indices the | ||
* node knows about. Methods for determining if an index should be a system index are also provided | ||
* to reduce the locations within the code that need to deal with {@link SystemIndexDescriptor}s. | ||
*/ | ||
public class SystemIndices { | ||
public class SystemIndices implements ClusterStateListener { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we separate the "SystemIndexManager" out of the SystemIndices
class? I like that SystemIndices
is fairly lightweight and compact. The manager (and other services) will depend on SystemIndices
, like SystemIndexMetadataUpgradeService
.
if (descriptor.getAliasName() == null) { | ||
aliases = Set.of(); | ||
} else { | ||
concreteIndexName = descriptor.getIndexPattern(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we might need a method to get the current concrete index name? Maybe someone defines a pattern with a * to cover a variety of system indices but we should know what the concrete value to create is (ie security_6 vs security_7).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jaymode @gwbrown Right now I feel that there's a conflict between defining a system index pattern with a wildcard, and being able to automatically create it, essentially because we have to define the concrete / underlying index name somewhere.
At the moment, this branch assumes that if descriptor.isAutomaticallyManaged()
returns true
, then descriptor.getIndexPattern()
can be used as the concrete index name. Partly for that reason, the Security
plugin doesn't define a descriptor for .security-*
, but instead defines a .security-6
descriptor and a .security-7
descriptor, but only the latter has the .security
alias defined.
That being said, I'm not sure how this scheme would work if we needed to introduce a .security-8
for some reason. Clearly in a new cluster, .security-8
should be auto-created with the .security
alias pointing to it, but what about in an upgraded deployment?
I wonder whether a descriptor should be able to define:
- An index pattern
- One or more concrete index names, with optional mappings and settings
- An alias-to-index mapping, which also represents what the latest concrete index name is
- Perhaps a method to migrate between indices, e.g. convert
.foo-7
to.foo-8
, and re-point the alias?
So the for Security, we'd have one descriptor containing:
- An index pattern of
.security*
- An index definition for
.security-7
(it wouldn't be necessary to define one for.security-6
any more) - The alias
.security
, pointing at.security-7
.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should be able to separately have a reserved pattern and also have the ability to define concrete indices with aliases, so I agree with your proposal.
There will be follow-on work involved since we should prevent creation of aliases that match the index pattern, unless the alias definition matches that which the system knows about.
I have updated the PR after going over the mappings upgrade code more carefully, and adding test coverage. |
Currently the ML system indices are having the correct mappings applied via index templates. Will the cease to work as soon as this PR is merged? I know we need to switch over the ML system index mappings from index templates to the new system index descriptor mappings mechanism at some point before 8.0 release, but the timing depends on the effects of this PR. If index templates will stop being considered for system indices when this PR is merged then the switch needs to happen in this PR. Someone from the ML team can help with this by pushing to the PR branch when the basic framework is complete. Or if there will still be a period when index templates are considered for system indices then we can do the ML switchover as a separate followup PR. |
@droberts195 there's no hard requirement for ML to switch in this PR. I migrated various plugins so I could see how they would look when being automatically managed, but we can easily do that work later. So if it would be more convenient for ML to switch over at a later point, I'm happy to make sure that happens. I had thought I'd break up the PR before merging anyway, since it affects various Security classes too. |
@droberts195 actually, sorry, I looked again at the ML changes and ML won't be affected by this - its system index descriptors don't define any of the fields that would mark them as automatically managed. |
OK, great, thanks @pugnascotia. At some point before 8.0.0-alpha1 we'll create a followup PR that says the ML system indices are automatically managed and moves their mapping definitions over to the system index descriptors. It's great that it doesn't all have to be done in one big PR. |
I've split out the infrastructure parts of this PR into #65604. |
Shall we close this one @pugnascotia? |
Sure can. |
Part of #61656.
This PR introduces a service that makes it straightforward to automatically create an index on-demand. This pattern already existed in the codebase, but varied in its implementation.
Notes to reviewers:
BlobStoreCacheService