
Shrink method hangs on ES 7.x #1494

Open
sbalaram opened this issue Dec 9, 2019 · 5 comments
sbalaram commented Dec 9, 2019

Discuss topic: https://discuss.elastic.co/t/using-py-curator-shrink-method-hangs-on-es-7-x-aws-managed-service/210791

To submit a bug or report an issue

Shrink via the Curator Python API never completes, and the cluster ends up with unassigned shards.

AWS support explanation:

Cause of the issue: From ES 7.0, Elastic deprecated the "copy_settings" parameter of the shrink API and made "copy_settings=true" the default; the parameter can no longer be modified. Reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/breaking-changes-7.0.html#copy-settings-deprecated-shrink-split-apis

"copy_settings" let users specify whether the target (shrunken) index should get the same index settings as the source index. With "copy_settings" effectively true on 7.x clusters, all index settings from the source index, including "index.routing.allocation.require._name" and "index.blocks.read_only", are copied to the target index.

"index.routing.allocation.require._name" forces ES to assign the target index's shards to the specific node named in that setting, but with replicas enabled, ES cannot allocate a replica onto the same data node as its primary. So cluster settings prevent ES from allocating replicas onto the same data node as primaries, while the index settings prevent the shards from being allocated to any node other than the one named in "index.routing.allocation.require._name". Previously, these settings were never inherited by the shrunken indices, as they were not copied by default.

The workaround is to:

Step 1. Update the source index routing setting to target a specific node and block all writes.
Step 2. Perform the shrink operation.
Step 3. Remove the routing and write-block settings from the shrunken index.
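The three steps above can be sketched with the official elasticsearch-py client. This is only an illustration of the workaround, not Curator's implementation; the endpoint, index names, node name, and shard count below are placeholders.

```python
# Sketch of the three-step workaround with elasticsearch-py.
# All names (indices, node, endpoint) are placeholders.

def shrink_with_workaround(es, source, target, node, shards=1):
    """Shrink `source` into `target` on ES 7.x, then strip the
    routing/write-block settings the shrink copies to the target."""
    # Step 1: pin the source index to one node and block all writes.
    es.indices.put_settings(index=source, body={
        "index.routing.allocation.require._name": node,
        "index.blocks.write": True,
    })
    # Step 2: shrink. On 7.x the settings above are copied to `target`.
    es.indices.shrink(index=source, target=target, body={
        "settings": {"index.number_of_shards": shards},
    })
    # Step 3: null the inherited settings on the shrunken index so its
    # replicas can be allocated to other nodes.
    es.indices.put_settings(index=target, body={
        "index.routing.allocation.require._name": None,
        "index.blocks.write": None,
    })

# Usage (placeholder endpoint and names):
#   from elasticsearch import Elasticsearch
#   es = Elasticsearch("http://localhost:9200")
#   shrink_with_workaround(es, "logs-2019.12", "logs-2019.12-shrink", "data-node-1")
```

Setting a value to None in a settings update removes it from the index, which is what Step 3 relies on.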

The workaround works, but I lose the benefit of Curator's DETERMINISTIC shrink-node option (https://www.elastic.co/guide/en/elasticsearch/client/curator/current/option_shrink_node.html), which automatically picks the node with the most free space.

Has anyone successfully run shrink with Curator on a 7.x AWS ES cluster without any workarounds?

Expected Behavior

A single call to the shrink method worked on 6.x versions of ES, but the same call fails on ES 7.x.

Actual Behavior

The method hangs and the cluster ends up with unassigned shards.

Steps to Reproduce the Problem

  1. Call shrink with the DETERMINISTIC option; the operation never completes.
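For reference, a minimal Curator action file along these lines reproduces the hang. The index pattern and shard/replica counts are placeholders; option names follow the Curator shrink documentation.

```yaml
actions:
  1:
    action: shrink
    description: Shrink matching indices to one shard on the emptiest node
    options:
      shrink_node: DETERMINISTIC   # pick the data node with the most free space
      number_of_shards: 1
      number_of_replicas: 1
      shrink_suffix: '-shrink'
      wait_for_completion: True    # on ES 7.x this wait never finishes
    filters:
      - filtertype: pattern
        kind: prefix
        value: logs-               # placeholder index pattern
```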

Specifications

  • Version: 7.x
  • Platform: AWS managed service
  • Subsystem:

Context (Environment)

We have a shrink-and-merge Python script built on top of Curator; it works on ES 6.x, but shrink breaks on 7.x.


@untergeek untergeek changed the title Shrink method hangs on ES 7.x AWS managed service Shrink method hangs on ES 7.x Dec 10, 2019
@untergeek
Member

This is not limited to AWS ES.

@untergeek
Member

Changing the shrink action in Curator to permit stripping of routing tags and other settings and then re-apply or apply others afterwards will be a bit of work.
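The strip-and-re-apply idea could look roughly like the helper below. This is a hypothetical sketch, not what Curator actually ships; the list of settings to null and the "post_allocation" merge are assumptions.

```python
def post_shrink_settings(post_allocation=None):
    """Hypothetical helper: build a PUT _settings body that removes the
    settings a 7.x shrink copies from the source index, then layers on
    whatever settings the user wants on the shrunken index."""
    # Setting a value to None in a settings update removes it.
    body = {
        "index.routing.allocation.require._name": None,
        "index.blocks.write": None,
        "index.blocks.read_only": None,
    }
    body.update(post_allocation or {})  # re-apply user-requested settings
    return body
```

Curator could apply a body like this to the shrunken index immediately after the shrink completes, restoring normal replica allocation.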

@sbalaram
Author

Thanks for the response @untergeek. I'll make an attempt to understand the shrink code and see if I can contribute; if I get to that stage, I'll send a PR following the how-to-contribute guidelines.

What do you think?

@untergeek
Member

All submissions are accepted and reviewed. Thank you for your willingness.

@breml
Contributor

breml commented Aug 7, 2020

@untergeek We are suffering from this issue as well. It looks like it was fixed in #1528, but there is not yet a released version of Curator that contains the fix. Would you mind creating a new release? (I need the RPM packages in order to install it in our production environment.)
