[meta] Index Lifecycle Management Plan #29823

elasticmachine · 2017-10-30T04:20:16Z

Tasks

tracking Steps progress

Remaining Tasks

Completed

Blockers to merging into master in priority order from most to least (items are marked in difficulty using *, **, ***)

Blockers to first release in priority order from most to least (items are marked in difficulty using *, **, ***)

Write Documentation **** @colings86 @talevy
High-level REST client support (Java high-level REST client completeness for ILM APIs #33100) **** (remaining task: @gwbrown)
Clean up the experience when an invalid policy is used by an index (ILM: IndexLifecycleRunner should not spam logs with warnings when policy is misconfigured #33074) * @gwbrown
Rolling upgrade tests @talevy (add ILM rolling upgrade tests #32828) **
Rename "after" field to minimum_age @dakrone (Rename "after" field #32624) *
Issues labelled :Core/Features/ILM and blocker
Manual Testing @dakrone

Optional (but would be really good to have)

ILM usage stats (using xpack usage API) @colings86 Adds usage data for ILM #33377 **

elasticmachine · 2018-01-12T00:56:21Z

Original comment by @talevy:

Example Pipeline for reference (this comment may be updated as changes occur)

Lifecycle Policy

PUT /_xpack/index_lifecycle/my_lifecycle
{
   "policy": {
     "type": "timeseries",
     "phases": {
       "hot": {
         "after": "0s",
         "actions": {
           "rollover": {
             "alias": "logs-write",
             "max_age": "5s"
           }
         }
       },
       "warm": {
         "after": "10s",
         "actions": {
           "allocate": {
             "require": { "_name": "node-1" },
             "include": {},
             "exclude": {}
           },
           "shrink": {
             "number_of_shards": 1
           },
           "forcemerge": {
             "max_num_segments": 1000
           }
         }
       },
       "cold": {
         "after": "20s",
         "actions": {
           "replicas": {
             "number_of_replicas": 0
           }
         }
       },
       "delete": {
         "after": "30s",
         "actions": {
           "delete": {}
         }
       }
     }
   }
}
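
To make the phase timings in the policy above concrete, here is a minimal Python sketch of how the `after` thresholds could map an index's age to a phase. It is not how ILM is implemented: `parse_duration` and `phase_for_age` are hypothetical helpers, and the sketch assumes every `after` value is measured from one reference time (for example index creation), which this thread does not spell out.

```python
import re

# Seconds per unit for the small set of suffixes this sketch understands.
_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(value):
    """Convert a duration string such as "10s" or "5m" to seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", value)
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def phase_for_age(phases, age_seconds):
    """Return the phase with the largest "after" threshold already reached."""
    eligible = [
        (parse_duration(body["after"]), name)
        for name, body in phases.items()
        if parse_duration(body["after"]) <= age_seconds
    ]
    return max(eligible)[1] if eligible else None


# Same thresholds as the example policy; actions are omitted for brevity.
phases = {
    "hot": {"after": "0s"},
    "warm": {"after": "10s"},
    "cold": {"after": "20s"},
    "delete": {"after": "30s"},
}

print(phase_for_age(phases, 12))  # warm
```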

template

PUT _template/my_template
{
  "index_patterns": ["logs-*"],
  "settings": {
    "number_of_shards": 4,
    "number_of_replicas": 1,
    "index.lifecycle.name": "my_lifecycle"
  },
  "aliases": {
    "logs-read": {}
  },
  "mappings": {
    "_doc": {
      "_source": {
        "enabled": false
      }
    }
  }
}

Create Index

PUT logs-000001
{
  "aliases": {
    "logs-write": {}
  }
}
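
When the rollover action fires for `logs-000001`, the rollover API names the new index by incrementing the numeric suffix and zero-padding it to six digits. The Python sketch below only illustrates that naming rule; `next_rollover_name` is a hypothetical helper, not part of Elasticsearch or any client library.

```python
import re


def next_rollover_name(index_name):
    """Return the index name a rollover would produce from index_name."""
    match = re.fullmatch(r"(.+)-(\d+)", index_name)
    if match is None:
        raise ValueError(f"{index_name!r} does not end with -<number>")
    prefix, counter = match.groups()
    # The counter is incremented and always padded to six digits.
    return f"{prefix}-{int(counter) + 1:06d}"


print(next_rollover_name("logs-000001"))  # logs-000002
```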

elasticmachine · 2018-02-27T22:43:52Z

Original comment by @ppf2:

I would also like to suggest setting "index.blocks.write": true on the warm indices once they have moved to the warm tier, so that when force merge runs, it runs against an index that receives no further writes. This could even be the very first step of the warm processing, for example a step right before the allocation filtering that moves the index to warm.

elasticmachine · 2018-03-01T15:55:16Z

Original comment by @colings86:

@ppf2 yeah, this is a good point and something we have already thought about (though it's not detailed here explicitly) in the context of the forcemerge and shrink actions specifically. We decided that we didn't want to automatically add a write block that persists for all time after the start of the warm phase, because users might find this surprising or annoying. Instead, the forcemerge and shrink actions enable the write block as their first step and then disable it when they are finished.

We could also potentially add an explicit write block action that you could enable in the warm/cold phase to keep the index write block enabled outside of these specific actions, but this isn't something that we currently have on the plan. If we decide to add this explicit action, the UI could support it and potentially set it by default in the policy UI, with the option for the user to disable it if they wish. Also, if we go down the route where Beats/Logstash define a default policy for their indices, I would expect those policies to enable this action too.
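
As a rough illustration of the behaviour described above (write block on as the first step, off again when the action finishes), here is a minimal Python sketch. `RecordingClient` and `forcemerge_with_write_block` are hypothetical names used only for this example; the real actions are implemented as ILM steps inside Elasticsearch, and using `try`/`finally` to lift the block is an assumption of this sketch rather than something stated in the thread.

```python
class RecordingClient:
    """Hypothetical stand-in for a cluster client; it only records calls."""

    def __init__(self):
        self.calls = []

    def put_settings(self, index, settings):
        self.calls.append(("put_settings", index, dict(settings)))

    def forcemerge(self, index, max_num_segments):
        self.calls.append(("forcemerge", index, max_num_segments))


def forcemerge_with_write_block(client, index, max_num_segments):
    # First step: block writes so the merge sees a stable index.
    client.put_settings(index, {"index.blocks.write": True})
    try:
        client.forcemerge(index, max_num_segments)
    finally:
        # Last step: lift the block so it does not persist after the action.
        client.put_settings(index, {"index.blocks.write": False})


client = RecordingClient()
forcemerge_with_write_block(client, "logs-000001", 1)
for call in client.calls:
    print(call)
```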

elasticmachine · 2018-04-02T14:05:39Z

Original comment by @PhaedrusTheGreek:

@pickypg I bet it would be useful for the index_stats monitoring to capture the index.lifecycle.name here, so we can agg on it, e.g., to see which index series have the worst search latency.

elasticmachine · 2018-04-03T08:50:08Z

Original comment by @colings86:

Let's stick to implementing the feature first and then we can look into what we can/should add to monitoring. It's still early enough in the implementation that the design is in flux and things may change. Also note that in the current design the index.lifecycle.name is in the index settings so would already be exported in monitoring I think.

elasticmachine · 2018-04-25T07:25:40Z

Pinging @elastic/es-core-infra

These are only ever set internally during regular ILM execution; they don't need to be set otherwise. A subsequent PR will work on adding a dedicated endpoint for the `LIFECYCLE_NAME` setting so it can be changed by a user (and then marked as `InternalIndex` as well). Relates to #29823

This adds HLRC support for the ILM operation of setting an index's lifecycle policy. It also includes extracting and renaming a number of classes (like the request and response objects) as well as the addition of a new `IndexLifecycleClient` for the HLRC. This is a prerequisite to making the `index.lifecycle.name` setting internal only, because we require a dedicated REST endpoint to change the policy, and our tests currently set this setting with the REST client in multiple places. A subsequent PR will change the setting to be internal and move those uses over to this new API. This misses some links to the documentation because I don't think ILM has any documentation available yet. Relates to #29827 and #29823

This commit makes the `index.lifecycle.name` setting an internal index setting, which means that the policy can only be set at index creation or with the specialized `RestSetIndexLifecyclePolicy` action. Relates to #29823

This commit removes the hacks associated with mocking Response objects. Rather than parse a wrapped byte array, the constructors for `IndicesAliasesResponse` and `ResizeResponse` are made public. Relates to #29823

* Remove RolloverIndexTestHelper: This removes the `RolloverIndexTestHelper` class in favor of making a couple of getters publicly accessible as well as custom building a response object using JSON parsing. Relates to #29823

* Remove UpdateSettingsTestHelper class: By making the `settings()` method public on `UpdateSettingsRequest` (I think it should have been in the first place) we can get rid of this class entirely. Mock response objects are now constructed by parsing JSON without making the constructor public. Relates to #29823

* Store phase steps for index in PolicyStepsRegistry: This changes the way that steps are retrieved from `PolicyStepsRegistry` to store the steps on a per-index basis (in memory for now, though that will change in subsequent PRs). These steps are rebuilt as the index changes phases. This also fixes a bug where an action with the same phase and name was not being considered changed (and thus updated) in the compiled steps list. These are now correctly considered as "upsert" diffs. Relates to #29823

* Remove canSetPolicy, canUpdatePolicy and canRemovePolicy: Since we now store a pre-compiled list of steps for an index's phase in the `PolicyStepsRegistry`, we no longer need to worry about updating policies, as any updates won't affect the current phase and will only be picked up on phase transitions. This also removes the tests for these methods. Relates to #29823
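
The "upsert" diff mentioned for `PolicyStepsRegistry` above can be pictured with a small Python sketch. This is only an illustration of the idea, not the registry's code: `diff_actions` is a hypothetical function, and keying each action by a `(phase, action)` tuple is an assumption made for the example. The point is that an entry whose key is unchanged but whose configuration differs still counts as an upsert.

```python
def diff_actions(old, new):
    """Compare two {(phase, action): config} mappings.

    Returns the entries to upsert (new or changed) and the keys to remove.
    """
    upserts = {
        key: config
        for key, config in new.items()
        if key not in old or old[key] != config
    }
    removed = [key for key in old if key not in new]
    return upserts, removed


old_policy = {
    ("warm", "forcemerge"): {"max_num_segments": 1000},
    ("cold", "replicas"): {"number_of_replicas": 0},
}
new_policy = {
    # Same phase and action name, different configuration: still an upsert.
    ("warm", "forcemerge"): {"max_num_segments": 1},
    ("delete", "delete"): {},
}

upserts, removed = diff_actions(old_policy, new_policy)
print(upserts)
print(removed)
```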

This commit removes PhaseAfterStep and all the plumbing associated with it. Instead, we rely on the LifecyclePolicyRunner to police itself for advancing phases. This also modifies the settings that are exposed for the current phase. Instead of returning the current phase/step/action as-is in the `index.lifecycle.phase` (etc.) setting, these are now split into:

* `index.lifecycle.current_phase|action|step` - the currently executing phase/action/step, which may or may not have completed
* `index.lifecycle.next_phase|action|step` - the next phase/action/step to which we will be proceeding

While I don't think these will cause much issue (especially since nothing is being broken for users here), these changes were required to have the `phase_time` correctly updated now that we don't have a "shim" step between phases. Without them it would be confusing, as the index would advance to have an `index.lifecycle.phase` setting that was potentially one phase in the future. Relates to elastic#29823

This removes `PhaseAfterStep` in favor of a new `PhaseCompleteStep`. This step is only a marker that the `LifecyclePolicyRunner` needs to halt until the time indicated for entering the next phase. This also fixes a bug where phase times were encapsulated into the policy instead of dynamically adjusting to policy changes. Supersedes #33140. Relates to #29823

This moves away from caching a list of steps for a current phase, instead rebuilding the necessary step from the phase JSON stored in the index's metadata. Relates to #29823

This commit changes the way that step execution flows. Rather than have any step run when the cluster state changes or the periodic scheduler fires, this now runs the different types of steps at different times:

* `AsyncWaitStep` is run periodically, i.e., every 10 minutes by default.
* `ClusterStateActionStep` and `ClusterStateWaitStep` are run every time the cluster state changes.
* `AsyncActionStep` is now run only after the cluster state has been transitioned into a new step. This prevents these non-idempotent steps from running at the same time. In addition to being run when transitioned into, this is also run when a node is newly elected master (only if set as the current step) so that master failover does not fail to run the step.

This also changes the `RolloverStep` from an `AsyncActionStep` to an `AsyncWaitStep` so that it can run periodically. Relates to #29823
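
The dispatch rules in this commit message can be summarised as a lookup from step type to the events that trigger it. The Python sketch below is an illustration under assumptions: the `Trigger` names and the `should_run` helper are hypothetical, and only the step type names come from the commit message. The actual logic lives in the Java ILM code and is not reproduced here.

```python
from enum import Enum


class StepType(Enum):
    ASYNC_WAIT = "AsyncWaitStep"
    CLUSTER_STATE_ACTION = "ClusterStateActionStep"
    CLUSTER_STATE_WAIT = "ClusterStateWaitStep"
    ASYNC_ACTION = "AsyncActionStep"


class Trigger(Enum):
    PERIODIC = "periodic"                    # scheduler tick (10 minutes by default)
    CLUSTER_STATE_CHANGE = "cluster_state_change"
    STEP_TRANSITION = "step_transition"      # index just moved into this step
    MASTER_ELECTED = "master_elected"        # node newly elected master


# Which events cause each step type to run, per the description above.
RUNS_ON = {
    StepType.ASYNC_WAIT: {Trigger.PERIODIC},
    StepType.CLUSTER_STATE_ACTION: {Trigger.CLUSTER_STATE_CHANGE},
    StepType.CLUSTER_STATE_WAIT: {Trigger.CLUSTER_STATE_CHANGE},
    StepType.ASYNC_ACTION: {Trigger.STEP_TRANSITION, Trigger.MASTER_ELECTED},
}


def should_run(step_type, trigger):
    """Return True if a step of step_type should execute on this trigger."""
    return trigger in RUNS_ON[step_type]


print(should_run(StepType.ASYNC_ACTION, Trigger.PERIODIC))  # False
```

Keeping `AsyncActionStep` off the periodic and cluster-state triggers is what prevents a non-idempotent action from being started twice.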

colings86 · 2018-12-13T12:53:51Z

All blockers for the initial beta release are now merged so I'm closing this out since we will track bugs and tasks for GA in separate issues with the :Core/Features/ILM label

elasticmachine added >feature Meta labels Apr 25, 2018

elasticmachine assigned colings86 Apr 25, 2018

colings86 added the :Data Management/ILM+SLM Index and Snapshot lifecycle management label Apr 25, 2018

eskibars mentioned this issue Apr 26, 2018

Curator should ignore indices with index.lifecycle.name set elastic/curator#1207

Closed

chrisronline mentioned this issue Jun 11, 2018

[ILM] Use new updated policy APIs elastic/kibana#19807

Closed

jasontedor mentioned this issue Jun 13, 2018

Add notion of internal index settings #31286

Merged

This was referenced Jun 15, 2018

[ILM] Update index management to show index lifecycle phase elastic/kibana#19949

Closed

[ILM] Ensure the client is handling the recent index creation changes related to aliasing elastic/kibana#20081

Closed

elasticmachine mentioned this issue Jul 25, 2018

Java high-level REST client completeness for xpack APIs #29827

Closed

98 tasks

dakrone added a commit to dakrone/elasticsearch that referenced this issue Jul 25, 2018

Remove "best_compression" option from the ForceMergeAction

1da0742

This option is only settable while the index is closed, and doesn't make sense for a force merge. Relates to elastic#29823

dakrone mentioned this issue Jul 25, 2018

Remove "best_compression" option from the ForceMergeAction #32373

Merged

dakrone added a commit that referenced this issue Jul 25, 2018

Remove "best_compression" option from the ForceMergeAction (#32373)

3daefe6

This option is only settable while the index is closed, and doesn't make sense for a force merge. Relates to #29823

dakrone mentioned this issue Jul 25, 2018

Make various LifecycleSettings Settings internal #32381

Merged

dakrone mentioned this issue Jul 27, 2018

Add high level rest client support for SetIndexLifecyclePolicy #32443

Merged

dakrone mentioned this issue Jul 31, 2018

Make index.lifecycle.name setting internal #32518

Merged

talevy self-assigned this Jul 31, 2018

jasontedor assigned dakrone Aug 1, 2018

dakrone mentioned this issue Aug 21, 2018

Remove canSetPolicy, canUpdatePolicy, and canRemovePolicy #33037

Merged

polyfractal mentioned this issue Aug 22, 2018

[Rollup] Managing index lifecycle #33065

Closed

colings86 mentioned this issue Aug 23, 2018

Java high-level REST client completeness for ILM APIs #33100

Closed

13 tasks

dakrone mentioned this issue Aug 24, 2018

Remove PhaseAfterStep #33140

Closed

dakrone mentioned this issue Sep 4, 2018

Replace PhaseAfterStep with PhaseCompleteStep #33398

Merged

dakrone mentioned this issue Sep 17, 2018

Rebuild step on PolicyStepsRegistry.getStep #33780

Merged

robbavey mentioned this issue Sep 24, 2018

Index Lifecycle Management support for Logstash logstash-plugins/logstash-output-elasticsearch#798

Closed

dakrone mentioned this issue Sep 27, 2018

Change step execution flow to be deliberate about type #34126

Merged

colings86 closed this as completed Dec 13, 2018
