4.0.1 Compatibility with AWS ES Service #685

Closed
IlyaSukhanov opened this issue Jul 8, 2016 · 21 comments

@IlyaSukhanov

I get the following error when calling curator.IndexList(es_client):

elasticsearch.exceptions.AuthenticationException: TransportError(401, u'{"Message":"Your request: \'/_cluster/state/metadata/.[... snip ...]' is not allowed."}')

The equivalent operation works with releases earlier than 4.0.0. I suspect the 4.0.0 release regressed issue #491.
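
The rejected request in the error is a cluster state metadata call, so one way to find out early whether a cluster permits it is to probe that endpoint before building an IndexList. This is a minimal sketch, assuming an elasticsearch-py style client whose transport errors expose a `status_code` attribute; `metadata_endpoint_allowed` is a hypothetical helper and not part of Curator.

```python
def metadata_endpoint_allowed(client):
    """Return True if the cluster answers a cluster state metadata request.

    `client` is assumed to behave like an elasticsearch-py client, where
    client.cluster.state(metric="metadata") requests /_cluster/state/metadata.
    """
    try:
        client.cluster.state(metric="metadata")
    except Exception as exc:
        # Treat HTTP 401/403 as "endpoint blocked"; re-raise anything else.
        if getattr(exc, "status_code", None) in (401, 403):
            return False
        raise
    return True
```

Note that the probe fetches the full metadata document, so on a large cluster it is not free; it is meant as a one-time check at startup.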

@untergeek
Member

untergeek commented Jul 8, 2016

This isn't a regression. Curator 4.0 only supports ES versions 2.x and up. See Versioning in the README. It seems you're using the API and creating your own client connection, as the get_client method supplied by Curator would have detected this and disallowed it.
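
For anyone building their own client connection, the version rule above can be approximated with a small check. This is only a sketch under assumptions: `major_version` and `supported_by_curator4` are hypothetical names, and this is not how Curator's own get_client performs its detection.

```python
import re


def major_version(version_string):
    """Extract the major version from a string such as '2.3.4' or '5.0.0-alpha1'."""
    match = re.match(r"\s*(\d+)\.", version_string)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version_string)
    return int(match.group(1))


def supported_by_curator4(version_string):
    # Curator 4 only supports Elasticsearch 2.x and up, per the comment above.
    return major_version(version_string) >= 2
```

With an elasticsearch-py client, the version string would typically come from `client.info()["version"]["number"]`. A passing version check does not guarantee that a hosted service allows every API call Curator needs.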

The longer reason is that there is a ton of new functionality that is simply not supported by older versions of Elasticsearch. Much of this new functionality depends on using the cluster metadata calls, which are disallowed by AWS ES.

Curator 3.x persists for those who are compelled, for whatever reason, to stay in that space. Trying to straddle major versions, and document & instruct users which actions and filters do or do not work based on Elasticsearch versions is a support nightmare. As a result, a decision was made to cut off support for older versions of ES with the newer version of Curator.

If you're using AWS ES, you'll need to use the Curator 3.x branch. You can install this via pip with pip install -U elasticsearch-curator==3.5.1

@IlyaSukhanov
Author

Thank you for the clarification!

@flybd5

flybd5 commented Aug 30, 2016

Hmm. "Curator 3.x persists for those who are compelled, for whatever reason, to stay in that space." Seems to me that "whatever reason" is that this doesn't work on AWS ES. I just wasted three hours before I found out about this. Maybe you want to put this severe incompatibility up front in the docs rather than wait until people find this issue.

@untergeek
Member

@flybd5 When I released Curator 4, it was completely incompatible with AWS ES because they had not yet released their version of Elasticsearch 2.x. I'm sorry you've had a bad experience, but it's not my fault that AWS ES doesn't support the API calls necessary for Curator to work. That's 100% on Amazon for releasing an incomplete version of Elasticsearch. Please see the conversations in #717 regarding this, including the links to AWS forum posts on the subject, and the request for help from a core developer of the AWS ES product.

I'll see about adding a blurb to the README.

@flybd5

flybd5 commented Aug 30, 2016

> I'll see about adding a blurb to the README.

This is really all that you had to say.

@dzavalkinolx

@untergeek What is the point of having aws key and secret options in the config if Curator is not compatible with AWS ES at all? First I had to find out that I have to install requests-aws4auth manually (hilarious), and then I spent 3 hours playing with IAM role permissions, only to find out that you decided for some reason to make an optional API required and that you refuse to support the way Curator 3 worked? The worst documentation and attitude I've ever seen...

@untergeek
Member

> What is the point to have aws key and secret options in the config if Curator is not compatible with AWS ES at all?

They were added before Curator 4 was released, as part of feature work contributed by others. Having no way to properly mock the calls, or to do proper integration testing, I merged pull requests submitted by others after they reported a functional outcome. The get_client method was carried over from 3.x with little modification, which is partly why those options persist, even if they've been renamed in places.

> you decided for some reason to make optional API required and you refuse to support the way Curator 3 worked?

The /_cluster/state/metadata endpoint is not optional for Curator 4. Curator 4 is a major version release and, in many ways, a full API rewrite. Many of the things Curator 3 did were suboptimal: it frequently repeated calls that would not have needed repeating had the data been stored once. As stated in a previous comment on this ticket, when Curator 4 was being developed there was no such thing as AWS ES 2.x, so I had no way of knowing that Amazon would release a 2.x version, nor that, if they did, it still would not support the metadata endpoint. I developed for the unaltered, open-source releases of Elasticsearch, with the 2.x and 5.x APIs in mind.

Another important note is that earlier versions of Curator 3 did not support AWS ES 1.x either, for the same reason. A user submitted a method that would pull some necessary details using the _cat API, and thereby make Curator work (for that call) with AWS ES. #717 has some interesting details here, including one of the AWS ES developers reaching out to ask how they can make their version compatible with Curator. Another developer tried to hack a way to make it work, but found that there were no substitute calls for the metadata one, which also covers index state (open/closed) and routing information, both of which are considered essential for Curator.

The IndexList object in Curator 4 pulls the full list of indices, and all of their associated metadata at object instantiation time. It escapes the loops and frequent API calls that were in Curator 3.x. To suggest that I'm refusing to support the way Curator 3 worked is to miss the mark. It was, and is, a rewrite for optimization and to support newer API calls, rather than a deliberate maneuver to hurt 3.x or AWS ES users.
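
The "fetch everything once at instantiation" idea can be illustrated with a small sketch. It is not Curator's IndexList: `IndexMetadataCache` is a hypothetical class, the client is assumed to follow elasticsearch-py conventions, and the response layout (`metadata.indices.<name>.state` with the values "open" and "close") is written from memory of the cluster state API and should be checked against the Elasticsearch documentation.

```python
class IndexMetadataCache:
    """Hypothetical illustration of loading index metadata once, up front."""

    def __init__(self, client):
        # A single cluster state request at construction time; every later
        # lookup is answered from memory instead of another API call.
        state = client.cluster.state(metric="metadata")
        self._indices = state["metadata"]["indices"]

    def names(self):
        """Return all index names known at construction time, sorted."""
        return sorted(self._indices)

    def is_closed(self, name):
        """Report whether the named index was closed when the cache was built."""
        return self._indices[name]["state"] == "close"
```

The trade-off of this design is that the cached view is a snapshot: indices created or closed after construction are not reflected until a new object is built.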

Please also remember that I wrote the documentation before AWS ES 2.x was released. This has clearly resulted in some frustration for you, and I'm sorry for that. I will alter the docs for all references to IAM credentials to point out that AWS ES is not supported, or perhaps remove them altogether from the documentation.

@trompx

trompx commented Nov 24, 2016

Hello @untergeek,

I stumbled upon this issue a bit late. I was using ES 2.x, essentially for the completion suggester, which had one major issue that was solved in ES 5.x, so I had to upgrade. However, I had implemented multiple scripts to automate the backup/restore of all my indices with Curator 3.x, and I just found out that Curator 3.x does not support ES 5.x. So I am kind of stuck: either with a buggy ES 2.x, or with ES 5.x but without any backup support. As making Curator 4 work with AWS ES seems complicated for now, what would it take to make Curator 3.x work with ES 5.x?

It's a frustrating situation. I cannot really compromise on either of these issues, nor do I have the money or time right now to implement a new backup system (with ES cloud, for instance, as was clumsily promoted in another post). Still, I am very grateful for all the work you and all the Elastic coders have done to make these great tools available; I am just hoping that I did not invest so much time only to need to rebuild a different system.

@untergeek
Member

Curator 4.2.3 is the current release. It re-introduced 3.x style singletons via the curator_cli. Check the current documentation for more information.

In either case, you will need to make some adjustments, as the filter syntax for the new singleton actions is very different from the many flags in 3.x.

@untergeek
Member

If AWS doesn't open the cluster state metadata endpoint, it still won't be supported by Curator when/if they release a 5.x version.

@trompx

trompx commented Nov 24, 2016

Wow that was fast :) Thanks a bunch!

Before I dig deep into this: does it mean I can run a command like I used to, such as:

curator $ES_OPTS snapshot --repository $S3_REPOSITORY --prefix logs- indices --prefix logs- --newer-than $LOGS_BACKUP_DAYS --time-unit days --timestring '%Y.%m.%d' ?

@untergeek
Member

Somewhat similarly, yes. Run curator_cli --help and curator_cli snapshot --help for more information. Specifically, the --filter_list flag replaces --prefix, --time-unit, --newer-than, --timestring, etc.

You'll need to create a filter list that is a JSON representation of the YAML filters block from the new configuration style. There are some simple examples in the online documentation. You can test your --filter_list with curator_cli show_indices --filter_list ... to ensure it works as expected.
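
A filter list can be generated rather than typed by hand, which avoids quoting mistakes in the shell. The sketch below is an assumption-laden illustration: `build_filter_list` and `shell_argument` are hypothetical helpers, and the keys of the age filter (`source`, `direction`, `timestring`, `unit`, `unit_count`) are written from memory of the Curator 4 filter documentation, so they should be verified there and with `curator_cli show_indices` before use.

```python
import json
import shlex


def build_filter_list(prefix, newer_than_days, timestring="%Y.%m.%d"):
    """Return a JSON filter list: indices with `prefix`, newer than N days."""
    filters = [
        # Rough counterpart of the 3.x --prefix flag.
        {"filtertype": "pattern", "kind": "prefix", "value": prefix},
        # Rough counterpart of --newer-than / --time-unit / --timestring.
        {
            "filtertype": "age",
            "source": "name",
            "direction": "younger",
            "timestring": timestring,
            "unit": "days",
            "unit_count": newer_than_days,
        },
    ]
    return json.dumps(filters)


def shell_argument(prefix, newer_than_days):
    """Return the --filter_list flag with the JSON safely quoted for a shell."""
    return "--filter_list " + shlex.quote(build_filter_list(prefix, newer_than_days))
```

Printing `shell_argument("logs-", 7)` yields a string that can be pasted after `curator_cli show_indices` to check which indices match.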

@trompx

trompx commented Nov 25, 2016

Thanks, I was already converting my scripts and managed to make my first backup with curator 4.2.3.post1:
curator_cli $ES_OPTS snapshot --repository $S3_REPOSITORY --name my-nice-name-hourly-%Y%m%d%H%M%S --filter_list '{"filtertype":"pattern","kind":"prefix","value":"myindice"}'

First, with Curator 3.x, the snapshot on the S3 repository had the same name as the one specified on the command line with --prefix. With the new version I get names like meta-0UH9cMZsR1Df5v68IaDPEQ.dat. Is it still possible to get the nicely formatted name specified with --name my-nice-name-hourly-%Y%m%d%H%M%S?
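
The date codes in the --name value look like strftime directives. Assuming Curator expands them with strftime-style rules (the thread does not say which clock or time zone it uses), the expansion can be reproduced as follows; `render_snapshot_name` is a hypothetical helper, and the timestamp is chosen to match the show_snapshots output quoted in this comment.

```python
from datetime import datetime


def render_snapshot_name(pattern, moment):
    """Expand strftime directives in a snapshot name pattern."""
    return moment.strftime(pattern)


name = render_snapshot_name(
    "my-nice-name-hourly-%Y%m%d%H%M%S",
    datetime(2016, 11, 25, 0, 26, 50),
)
print(name)  # my-nice-name-hourly-20161125002650
```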

Second, after doing the first backup, I ran show_snapshots right away and got the following output:

my-nice-name-hourly-20161125002650
my-nice-name-hourly-20161125002650

Is it normal that it lists the snapshot twice?

Then I indexed some more data, did another snapshot and show_snapshots again, I got:

my-nice-name-hourly-20161125010047
my-nice-name-hourly-20161125010047

While a new .dat file has been added to the S3 repository, show_snapshots shows no trace of the previous backups...

The whole point (at least in my case) of the backup is to be able to restore it, however the restore singleton is not available. Will it be implemented anytime soon? Otherwise I will need to look for other solutions :(

Anyway, thank you very much for making those singletons available!

@untergeek
Member

untergeek commented Nov 25, 2016 via email

@trompx

trompx commented Nov 25, 2016

I will open new issues tomorrow (it's 3am here and I am on mobile...). But just so I know, what is the point of a backup if it is not possible to restore it? Once we have a Curator backup on S3, can we restore it with other tools, or will it be useless until a restore singleton is available?

@untergeek
Member

Restore works great in regular Curator 4 (YAML configuration file). There are tons of options to configure, so I haven't made a restore singleton. If someone wants to add it, I'll merge. Curator 3 never had restore due to its complexity.

@trompx

trompx commented Nov 25, 2016

You lost me here. As stated at https://www.elastic.co/guide/en/elasticsearch/client/curator/current/faq_aws_iam.html and as you mentioned, in order to use Curator 4 with AWS, the cluster state metadata endpoint has to be open.

I managed to back up to AWS S3 thanks to the singleton, which uses the 3.x way of doing things and doesn't use the cluster state metadata.

Now that I have my backup on S3, are you saying that Curator 4 needs that endpoint only to back up, but that any backup can be restored from AWS S3 with the Curator 4 restore action, so we can set the AWS keys in the YAML file and it works?

I used to restore with (thanks to the Elasticsearch AWS plugin):
curl -s -S -o /dev/stderr -w "%{http_code}" -XPOST "${ESHOST}/_snapshot/${S3REPO}/${snapshot_to_restore}/_restore?pretty=true&wait_for_completion=true"
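
The same restore request can be built in Python with only the standard library. This is a sketch of the curl call above, not a Curator feature: `build_restore_request` is a hypothetical helper, and the host, repository, and snapshot values are supplied by the caller.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_restore_request(es_host, repository, snapshot):
    """Build a POST request for the snapshot restore endpoint."""
    # Percent-encode the path segments so unusual names cannot alter the URL.
    path = "/_snapshot/%s/%s/_restore" % (
        quote(repository, safe=""),
        quote(snapshot, safe=""),
    )
    query = urlencode({"pretty": "true", "wait_for_completion": "true"})
    return Request(es_host.rstrip("/") + path + "?" + query, method="POST")
```

Passing the returned object to `urllib.request.urlopen` sends the request; because of `wait_for_completion=true`, that call blocks until the restore finishes, so a generous timeout is advisable.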

@untergeek
Member

Snapshot to s3 is not the same as using Amazon AWS ES (i.e. their hosted Elasticsearch offering). S3 repositories are available to any and all ES installations.

The metadata endpoint is only a problem with Amazon's own hosted AWS ES service, which disallows those calls.
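
To tell the two cases apart in a script, one rough heuristic is to look at the Elasticsearch endpoint's hostname. The sketch below assumes that hosted AWS ES domains use hostnames ending in `.es.amazonaws.com`; that suffix and the helper name `looks_like_aws_es` are assumptions for illustration, and a self-managed cluster that merely snapshots to S3 would not match.

```python
from urllib.parse import urlparse


def looks_like_aws_es(endpoint):
    """Guess whether an endpoint belongs to the hosted AWS ES service."""
    # Accept bare hostnames as well as full URLs.
    target = endpoint if "://" in endpoint else "//" + endpoint
    host = urlparse(target).hostname or ""
    return host.endswith(".es.amazonaws.com")
```

This is only a hint for early warnings; probing the metadata endpoint itself remains the reliable test.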

@untergeek
Member

The singletons do use the metadata endpoint, by the way.

@trompx

trompx commented Nov 25, 2016

Thanks for the clarifications.
