
ILM policy for cold tier configured by Kibana UI fails at "migrate" action #69347

Closed
yukshimizu opened this issue Feb 22, 2021 · 4 comments
Labels: >bug, :Data Management/ILM+SLM (Index and Snapshot lifecycle management), Team:Data Management (Meta label for data/management team)

Comments

@yukshimizu

When the cold tier is configured with a single cold node, an ILM policy created through the Kibana UI fails at the "migrate" action, as follows:

{
  "indices" : {
    "filebeat-7.7.0-2021.02.22-000364" : {
      "index" : "filebeat-7.7.0-2021.02.22-000364",
      "managed" : true,
      "policy" : "filebeat",
      "lifecycle_date_millis" : 1613986069200,
      "age" : "3.65h",
      "phase" : "cold",
      "phase_time_millis" : 1613996874352,
      "action" : "migrate",
      "action_time_millis" : 1613997004872,
      "step" : "check-migration",
      "step_time_millis" : 1613997005174,
      "step_info" : {
        "message" : "[filebeat-7.7.0-2021.02.22-000364] lifecycle action [migrate] waiting for [1] shards to be moved to the [data_cold] tier (tier migration preference configuration is [data_cold,data_warm,data_hot])",
        "shards_left_to_allocate" : 1,
        "all_shards_active" : true,
        "number_of_replicas" : 1
      },
      "phase_execution" : {
        "policy" : "filebeat",
        "phase_definition" : {
          "min_age" : "3h",
          "actions" : {
            "searchable_snapshot" : {
              "snapshot_repository" : "found-snapshots",
              "force_merge_index" : true
            },
            "set_priority" : {
              "priority" : 0
            }
          }
        },
        "version" : 11,
        "modified_date_in_millis" : 1613992153367
      }
    }
  }
}

After this message appears, if I change the number of replicas to 0, the index proceeds normally from then on.
This may be caused by the migration from 1 primary and 1 replica in the warm tier to 1 primary and 0 replicas in the cold tier. With 2 cold nodes configured, this error does not happen. An "allocate" action should solve this, but the Kibana UI does not allow configuring an "allocate" action in the cold phase.
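For reference, the per-index workaround described above (dropping the replica count to 0 so the single cold node can hold all shard copies) can be applied with an index settings update; this is a sketch using the index name from the explain output above:

```
PUT filebeat-7.7.0-2021.02.22-000364/_settings
{
  "index" : {
    "number_of_replicas" : 0
  }
}
```

This only unblocks the one stuck index; each subsequent rollover index hits the same "migrate" step again unless the policy itself is changed.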

Here is the ILM policy generated by the Kibana UI, as returned by the API. There is no "allocate" action in the cold phase:

{
  "filebeat" : {
    "version" : 11,
    "modified_date" : "2021-02-22T11:09:13.367Z",
    "policy" : {
      "phases" : {
        "hot" : {
          "min_age" : "0ms",
          "actions" : {
            "rollover" : {
              "max_size" : "40gb",
              "max_age" : "1h"
            },
            "set_priority" : {
              "priority" : 100
            }
          }
        },
        "warm" : {
          "min_age" : "1h",
          "actions" : {
            "forcemerge" : {
              "max_num_segments" : 1
            },
            "set_priority" : {
              "priority" : 50
            }
          }
        },
        "cold" : {
          "min_age" : "3h",
          "actions" : {
            "searchable_snapshot" : {
              "snapshot_repository" : "found-snapshots",
              "force_merge_index" : true
            },
            "set_priority" : {
              "priority" : 0
            }
          }
        }
      }
    }
  }
}

If I add an "allocate" action to the policy via the API, the index passes the "migrate" step successfully:

{
  "indices" : {
    "filebeat-7.7.0-2021.02.22-000365" : {
      "index" : "filebeat-7.7.0-2021.02.22-000365",
      "managed" : true,
      "policy" : "filebeat",
      "lifecycle_date_millis" : 1613990269694,
      "age" : "3.08h",
      "phase" : "cold",
      "phase_time_millis" : 1614000686240,
      "action" : "searchable_snapshot",
      "action_time_millis" : 1614001294528,
      "step" : "wait-for-shard-history-leases",
      "step_time_millis" : 1614001294530,
      "phase_execution" : {
        "policy" : "filebeat",
        "phase_definition" : {
          "min_age" : "2h",
          "actions" : {
            "allocate" : {
              "number_of_replicas" : 0,
              "include" : { },
              "exclude" : { },
              "require" : { }
            },
            "searchable_snapshot" : {
              "snapshot_repository" : "found-snapshots",
              "force_merge_index" : true
            },
            "set_priority" : {
              "priority" : 0
            }
          }
        },
        "version" : 12,
        "modified_date_in_millis" : 1614000685981
      }
    }
  }
}
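The policy update I used can be sketched as a single API request. This is a reconstruction: the hot and warm phases are copied from the version-11 policy shown earlier, and the cold phase follows the version-12 phase_definition in the explain output above (note the whole policy body must be resent; the API replaces the policy rather than patching it):

```
PUT _ilm/policy/filebeat
{
  "policy" : {
    "phases" : {
      "hot" : {
        "min_age" : "0ms",
        "actions" : {
          "rollover" : { "max_size" : "40gb", "max_age" : "1h" },
          "set_priority" : { "priority" : 100 }
        }
      },
      "warm" : {
        "min_age" : "1h",
        "actions" : {
          "forcemerge" : { "max_num_segments" : 1 },
          "set_priority" : { "priority" : 50 }
        }
      },
      "cold" : {
        "min_age" : "2h",
        "actions" : {
          "allocate" : { "number_of_replicas" : 0 },
          "searchable_snapshot" : {
            "snapshot_repository" : "found-snapshots",
            "force_merge_index" : true
          },
          "set_priority" : { "priority" : 0 }
        }
      }
    }
  }
}
```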

Elasticsearch version: 7.11.1

Steps to reproduce: this happens in an ESS environment.

@yukshimizu yukshimizu added >bug :Data Management/ILM+SLM Index and Snapshot lifecycle management needs:triage Requires assignment of a team area label labels Feb 22, 2021
@elasticmachine elasticmachine added the Team:Data Management Meta label for data/management team label Feb 22, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-core-features (Team:Core/Features)

@mayya-sharipova mayya-sharipova removed the needs:triage Requires assignment of a team area label label Feb 22, 2021
@andreidan
Contributor

@yukshimizu I've tried this in Kibana and I think it works correctly (the number-of-replicas configuration is separate from data allocation, even though they configure the same underlying ILM action).

[screenshot]

I believe this works as expected. I hope you don't mind me closing this issue. If you have more questions, there's an active community in the forum that should be able to help get an answer to your question.

@yukshimizu
Author

Thanks @andreidan
But I can't find that configuration option (replicas in the cold phase) in the UI. Which version did you try?
Kibana 7.11.1 on ESS does not show that option. If 7.12 fixes this, that's fine.

[Screenshot 2021-02-24 9.14.49]

@andreidan
Contributor

andreidan commented Feb 24, 2021

@yukshimizu thanks for the screenshot. It's the searchable_snapshot action configuration that hides the replicas setting. Thanks for this report! I've opened an issue in the Kibana repository: elastic/kibana#92590. Until that is fixed, as you pointed out, there's the option of using the Elasticsearch API to update the policy and add the number-of-replicas configuration:

"allocate" : {
  "number_of_replicas" : 0
},
