[meta] Index Lifecycle Management Plan #29823
Comments
Original comment by @talevy: Example Pipeline for reference (this comment may be updated as changes occur) Lifecycle Policy
template
Create Index
|
Original comment by @ppf2: I would like to suggest also setting |
Original comment by @colings86: @ppf2 yes, this is a good point and something we have already thought about (though it's not detailed here explicitly) in the context of the forcemerge and shrink actions specifically. We decided not to automatically add a write block that persists for all time after the start of the warm phase, because users might find this surprising or annoying. Instead, the forcemerge and shrink actions enable the write block as their first step and disable it when they are finished. We could also add an explicit write block action that you could enable in the warm/cold phase to keep the index write block enabled outside of these specific actions, but this isn't currently on the plan. If we decide to add this explicit action, the UI could support it and potentially set it by default in the policy UI, with the option for the user to disable it. Also, if we go down the route where beats/logstash define a default policy for their indices, I would expect those policies to enable this action too. |
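The enable-then-disable behaviour described above can be sketched as an ordered list of REST calls. This is only an illustration, not the ILM implementation: the helper name `forcemerge_with_write_block` and the tuple layout are assumptions, while `index.blocks.write` and the `_forcemerge` endpoint are the standard Elasticsearch setting and API.

```python
from typing import Any, Dict, List, Optional, Tuple

# (HTTP method, path, JSON body or None). Hypothetical representation.
Request = Tuple[str, str, Optional[Dict[str, Any]]]


def forcemerge_with_write_block(index: str, max_num_segments: int = 1) -> List[Request]:
    """Return the REST calls, in order, for a force merge guarded by a write block."""
    return [
        # Step 1: block writes so no new segments appear during the merge.
        ("PUT", f"/{index}/_settings", {"index.blocks.write": True}),
        # Step 2: run the force merge itself.
        ("POST", f"/{index}/_forcemerge?max_num_segments={max_num_segments}", None),
        # Step 3: lift the write block again once the action has finished.
        ("PUT", f"/{index}/_settings", {"index.blocks.write": False}),
    ]


calls = forcemerge_with_write_block("logs-000001")
```

An explicit write block action, as floated in the comment, would amount to issuing only the first call and leaving the block in place.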
Original comment by @PhaedrusTheGreek: @pickypg I bet it would be useful for the |
Original comment by @colings86: Let's stick to implementing the feature first and then we can look into what we can/should add to monitoring. It's still early enough in the implementation that the design is in flux and things may change. Also note that in the current design the |
Pinging @elastic/es-core-infra |
This option is only settable while the index is closed, and doesn't make sense for a force merge. Relates to elastic#29823
These are only ever set internally during regular ILM execution, they don't need to be set otherwise. A subsequent PR will work on adding a dedicated endpoint for the `LIFECYCLE_NAME` setting so it can be changed by a user (and then marked as `InternalIndex` as well) Relates to elastic#29823
This adds HLRC support for the ILM operation of setting an index's lifecycle policy. It also includes extracting and renaming a number of classes (like the request and response objects) as well as the addition of a new `IndexLifecycleClient` for the HLRC. This is a prerequisite to making the `index.lifecycle.name` setting internal only, because we require a dedicated REST endpoint to change the policy, and our tests currently set this setting with the REST client in multiple places. A subsequent PR will change the setting to be internal and move those uses over to this new API. This misses some links to the documentation because I don't think ILM has any documentation available yet. Relates to elastic#29827 and elastic#29823
This commit makes the `index.lifecycle.name` setting an internal index setting. This means that the policy can only be set at index creation, or with the specialized `RestSetIndexLifecyclePolicy` action. Relates to elastic#29823
This commit removes the hacks associated with mocking Response objects. Rather than parse a wrapped byte array, the constructors for `IndicesAliasesResponse` and `ResizeResponse` are made public. Relates to #29823
* Remove RolloverIndexTestHelper This removes the `RolloverIndexTestHelper` class in favor of making a couple of getters publicly accessible as well as custom building a response object using JSON parsing. Relates to #29823
* Remove UpdateSettingsTestHelper class By making the `settings()` method public on `UpdateSettingsRequest` (I think it should have been in the first place) we can get rid of this class entirely. Mock response objects are now constructed by parsing JSON without making the constructor public. Relates to #29823
* Store phase steps for index in PolicyStepsRegistry This changes the way that steps are retrieved from `PolicyStepsRegistry` to store the steps on a per-index basis (in memory for now, though that will change in subsequent PRs). These steps are rebuilt as the index changes phases. This also fixes a bug where an action with the same phase and name was not being considered changed (and thus updated) in the compiled steps list. These are now correctly considered as "upsert" diffs. Relates to #29823
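The "upsert" fix described above can be illustrated with a small diff function. This is a sketch under assumptions, not the `PolicyStepsRegistry` code: the `Policy` shape (a map keyed by phase and action name) and the function name `diff_policy` are hypothetical. The point it shows is that comparing keys alone would miss an action whose phase and name are unchanged but whose configuration differs.

```python
from typing import Dict, List, Tuple

# Hypothetical model: (phase, action name) -> action configuration.
Policy = Dict[Tuple[str, str], dict]


def diff_policy(old: Policy, new: Policy) -> Dict[str, List[Tuple[str, str]]]:
    """Classify actions as upserts (new or changed) or deletes."""
    # An action counts as an upsert if it is new OR if its configuration
    # changed while keeping the same phase and name.
    upserts = sorted(k for k, v in new.items() if k not in old or old[k] != v)
    deletes = sorted(k for k in old if k not in new)
    return {"upsert": upserts, "delete": deletes}


old_policy = {
    ("warm", "forcemerge"): {"max_num_segments": 1},
    ("warm", "shrink"): {"number_of_shards": 1},
}
new_policy = {
    ("warm", "forcemerge"): {"max_num_segments": 2},  # same key, new config
    ("cold", "allocate"): {},
}
result = diff_policy(old_policy, new_policy)
```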
* Remove canSetPolicy, canUpdatePolicy and canRemovePolicy Since we now store a pre-compiled list of steps for an index's phase in the `PolicyStepsRegistry`, we no longer need to worry about updating policies, as any updates won't affect the current phase and will only be picked up on phase transitions. This also removes the tests for these methods. Relates to #29823
This commit removes PhaseAfterStep and all the plumbing associated with it. Instead, we rely on the LifecyclePolicyRunner to police itself for advancing phases. This also modifies the settings that are exposed for the current phase. Instead of returning the current phase/step/action as-is in the `index.lifecycle.phase` (etc) setting, these are now split into `index.lifecycle.current_phase|action|step` (the currently executing phase/action/step, which may or may not have completed) and `index.lifecycle.next_phase|action|step` (the next phase/action/step to which we will be proceeding). While I don't think these will cause much issue (especially since nothing is being broken for users here), these changes were required to have the `phase_time` correctly updated now that we don't have a "shim" step between phases. Without these it would be confusing, as the index would advance to have an `index.lifecycle.phase` setting that was potentially one phase in the future. Relates to elastic#29823
This removes `PhaseAfterStep` in favor of a new `PhaseCompleteStep`. This step is only a marker that the `LifecyclePolicyRunner` needs to halt until the time indicated for entering the next phase. This also fixes a bug where phase times were encapsulated into the policy instead of dynamically adjusting to policy changes. Supersedes #33140. Relates to #29823
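A rough sketch of the marker-step idea, assuming a much simplified model: the runner halts on a marker step and, on each check, reads the next phase's age threshold from the policy as it currently stands, so a policy update takes effect without recompiling anything. The names `PHASE_COMPLETE`, `IndexState`, and `next_phase_if_due` are hypothetical and do not come from the Elasticsearch code base.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

PHASE_COMPLETE = "phase-complete"  # hypothetical name for the marker step


@dataclass
class IndexState:
    phase: str
    step: str
    origin_millis: int  # assumed reference time from which phase ages are measured


def next_phase_if_due(
    state: IndexState,
    phase_order: List[str],
    policy_age_millis: Dict[str, int],
    now_millis: int,
) -> Optional[str]:
    """Return the next phase if the index is parked on the marker and old enough."""
    if state.step != PHASE_COMPLETE:
        return None  # the current phase still has work to do
    position = phase_order.index(state.phase)
    if position + 1 >= len(phase_order):
        return None  # no later phase exists
    next_phase = phase_order[position + 1]
    # The threshold is looked up at check time, not baked into the stored step.
    required_age = policy_age_millis[next_phase]
    if now_millis - state.origin_millis >= required_age:
        return next_phase
    return None


state = IndexState(phase="hot", step=PHASE_COMPLETE, origin_millis=0)
order = ["hot", "warm", "cold", "delete"]
too_early = next_phase_if_due(state, order, {"warm": 1000}, now_millis=500)
on_time = next_phase_if_due(state, order, {"warm": 1000}, now_millis=1000)
after_policy_update = next_phase_if_due(state, order, {"warm": 400}, now_millis=500)
```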
This moves away from caching a list of steps for a current phase, instead rebuilding the necessary step from the phase JSON stored in the index's metadata. Relates to elastic#29823
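As an illustration of rebuilding a step on demand rather than caching a compiled list, the sketch below parses a phase definition from a JSON string and extracts only the requested action. The JSON layout (`name`, `actions`) and the function name `rebuild_step` are assumptions made for this example, not the format ILM stores in index metadata.

```python
import json
from typing import Any, Dict, Optional


def rebuild_step(phase_json: str, action_name: str) -> Optional[Dict[str, Any]]:
    """Rebuild a single step description from the stored phase definition."""
    phase = json.loads(phase_json)
    actions = phase.get("actions", {})
    if action_name not in actions:
        return None  # the action is not part of the stored phase
    return {
        "phase": phase["name"],
        "action": action_name,
        "config": actions[action_name],
    }


# Hypothetical phase definition as it might be kept alongside the index.
stored_phase = json.dumps(
    {"name": "warm", "actions": {"forcemerge": {"max_num_segments": 1}}}
)
step = rebuild_step(stored_phase, "forcemerge")
missing = rebuild_step(stored_phase, "shrink")
```

Because nothing is held in memory between calls, the result always reflects whatever phase definition is stored at the time of the call.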
This commit changes the way that step execution flows. Rather than have any step run when the cluster state changes or the periodic scheduler fires, this now runs the different types of steps at different times. `AsyncWaitStep` is run periodically, i.e., every 10 minutes by default. `ClusterStateActionStep` and `ClusterStateWaitStep` are run every time the cluster state changes. `AsyncActionStep` is now run only after the cluster state has been transitioned into a new step. This prevents these non-idempotent steps from running at the same time. In addition to being run when transitioned into, this is also run when a node is newly elected master (only if set as the current step) so that master failover does not fail to run the step. This also changes the `RolloverStep` from an `AsyncActionStep` to an `AsyncWaitStep` so that it can run periodically. Relates to elastic#29823
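The dispatch rules in this commit message can be summarised as a lookup from step type to the triggers that run it. The step type names come from the text above; the `Trigger` enum, the `RUNS_ON` table, and `should_run` are hypothetical names used only to make the rules concrete.

```python
from enum import Enum


class Trigger(Enum):
    PERIODIC = "periodic"                      # scheduler tick (10 minutes by default)
    CLUSTER_STATE_CHANGE = "cluster_state"     # any cluster state update
    STEP_TRANSITION = "step_transition"        # index has just moved into this step
    MASTER_ELECTED = "master_elected"          # this node has just become master


RUNS_ON = {
    "AsyncWaitStep": {Trigger.PERIODIC},
    "ClusterStateActionStep": {Trigger.CLUSTER_STATE_CHANGE},
    "ClusterStateWaitStep": {Trigger.CLUSTER_STATE_CHANGE},
    # Non-idempotent: run once on entry, and again after master failover.
    "AsyncActionStep": {Trigger.STEP_TRANSITION, Trigger.MASTER_ELECTED},
}


def should_run(step_type: str, trigger: Trigger, is_current_step: bool = True) -> bool:
    """Decide whether a step of the given type runs for the given trigger."""
    if trigger is Trigger.MASTER_ELECTED and not is_current_step:
        return False  # failover only re-runs the step the index is currently on
    return trigger in RUNS_ON.get(step_type, set())
```

Under this table a rollover check modelled as an `AsyncWaitStep` runs on every scheduler tick, which matches the stated reason for changing `RolloverStep` to that type.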
All blockers for the initial beta release are now merged so I'm closing this out since we will track bugs and tasks for GA in separate issues with the |
Tasks
`snapshot` (will implement snapshotting as a separate solution)
`timeseries`, which will allow the following phases (in order): EMAIL REDACTED LINK REDACTED LINK REDACTED LINK REDACTED)
`IndexMetaData.getCreationDate` and use a custom setting so that it can be inherited across shrink and other operations EMAIL REDACTED LINK REDACTED)
`index.lifecycle.phase_time` and `index.lifecycle.action_time` to help track
tracking Steps progress
Remaining Tasks
Completed
`warm`) @colings86 Changes PhaseAfterStep to take the name of the previous phase #30756
`index.lifecycle.skip` setting to allow indexes to be put into a "maintenance" mode where index lifecycle will not touch them @talevy add index.lifecycle.skip setting for skipping policy execution #30766
`index.lifecycle.date` in Rollover for rolled over indices @talevy inherit [index.lifecycle.date] from rolled-over time #30853
`index.lifecycle.name`: @colings86
`index.lifecycle.name` using the Update Settings API @jasontedor Add notion of internal index settings #31286
`PUT {index}/_lifecycle/{policy}` and `POST {index}/_lifecycle/{policy}`) @colings86 Adds API to assign or change the policy for an index #31277
`DELETE {index}/_lifecycle`) @colings86 Adds an API to remove ILM from an index completely #31358
`PUT {index}/_lifecycle/{policy}`) @colings86 Adds API to assign or change the policy for an index #31277
`is_write_index`
Update ForceMergeAction to update max segment size before merging due to: Add ways to force-merge down to 1 segment #31742 (for 6.x there is nothing to do here as we will maintain the current behaviour in the ForceMerge API
`is_write_index` for managing indices/aliases in ILM: [meta] Alias is_write_index feature tracking #31959 @talevy ***
Handle policy updates and changes to `index.lifecycle.name`: @colings86
Allow PUT index lifecycle API to update an existing policy
If current step no longer exists in the policy and instead move to the next action that exists (or the next phase after step) @colings86 ***
Add change policy for index API (REST endpoint `PUT {index}/_lifecycle/{policy}`)
If current step no longer exists in the policy and instead move to the next action that exists (or the next phase after step) Skips to next available action on missing step #32283 @colings86 ***
`after` @colings86 ILM: fix `after` rendering to xcontent #33282
Blockers to merging into master in priority order from most to least (items are marked in difficulty using *, **, ***)
`Phase.toXContent()`). Instead of holding on to all of the compiled steps PolicyStepsRegistry should have a map of index to the steps of the current phase. Store phase steps for index in PolicyStepsRegistry #32926
Blockers to first release in priority order from most to least (items are marked in difficulty using *, **, ***)
`minimnum_age` @dakrone Rename "after" field #32624 *
`:Core/Features/ILM` and `blocker`
Optional (but would be really good to have)