Support AWS Elasticsearch Auth #60

Open
thomasdziedzic opened this issue Jan 3, 2017 · 14 comments

Comments

@thomasdziedzic

AWS Elasticsearch implements a custom signature method to authenticate users [0].

It would be nice to be able to use this connector to move data into AWS Elasticsearch clusters that require authentication.

[0] - https://aws.amazon.com/blogs/security/how-to-control-access-to-your-amazon-elasticsearch-service-domain/

@zzbennett

This would be really useful. I'm wondering how to implement this without adding dependencies on a bunch of AWS libraries that most people will not need or want on their classpath.

Perhaps with a new property that specifies the class name of a request interceptor? This property could then be set to the class name of an AWS request interceptor (like this one: https://github.com/inreachventures/aws-signing-request-interceptor), which adds the required AWS authentication to the ES requests. If you are using AWS's ES, you can drop the required jars onto your classpath and specify the request interceptor in your ES connector config. It's a little cumbersome, so I'm open to other ideas.
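To make the idea concrete, here is a minimal sketch of loading an interceptor by class name from the connector config. Everything in it is an assumption for illustration: the `request.interceptor.class` property, the `RequestInterceptor` interface, and the `FakeSigningInterceptor` placeholder do not exist in the connector. A real implementation would presumably use Apache HttpClient's `HttpRequestInterceptor` and a real signer rather than a headers map.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical hook, standing in for something like HttpClient's HttpRequestInterceptor. */
interface RequestInterceptor {
    void process(Map<String, String> headers);
}

/** Placeholder for an AWS signing interceptor that a user drops onto the classpath. */
final class FakeSigningInterceptor implements RequestInterceptor {
    @Override
    public void process(Map<String, String> headers) {
        // A real interceptor would compute a Signature Version 4 header here.
        headers.put("Authorization", "PLACEHOLDER-SIGNATURE");
    }
}

final class InterceptorConfig {
    // Hypothetical property name, not an existing connector config.
    static final String INTERCEPTOR_CLASS_CONFIG = "request.interceptor.class";

    /** Builds request headers, applying the configured interceptor if one is set. */
    static Map<String, String> buildHeaders(Map<String, String> config) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Content-Type", "application/json");
        String className = config.get(INTERCEPTOR_CLASS_CONFIG);
        if (className == null) {
            return headers; // default: no extra dependencies, no signing
        }
        try {
            RequestInterceptor interceptor = Class.forName(className)
                    .asSubclass(RequestInterceptor.class)
                    .getDeclaredConstructor()
                    .newInstance();
            interceptor.process(headers);
            return headers;
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalStateException("Cannot load interceptor: " + className, e);
        }
    }
}
```

With the property unset, the connector behaves as it does today; only users who set it need the extra jars on their classpath.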

I have forked this repo in order to add AWS request signing, but I would like to contribute a solution upstream so I don't need to maintain a separate fork just for AWS's auth stuff.

@ewencp
Contributor

ewencp commented Apr 10, 2017

@thomasdziedzic @zzbennett Definitely seems like a good idea -- I think this will be a matter of exposing a few more configs that are specific to AWS and then wiring up the auth pieces. There's an example of how to do the auth steps in this Jest issue and #77 is working on adding basic authentication support. If anyone is interested in taking a stab, I'd be happy to guide development and review a PR!

@zzbennett

@ewencp I'd be happy to take a stab at this. I've got the code already and it is running well so far in our prototype connect deployment. I'll just productionalize it a bit and put up a PR for discussion.

@zzbennett

zzbennett commented May 31, 2017

Okay, so I'm back to working on the ES connector. I've been mulling this over and although the modifications involved for supporting the AWS authentication are simple, implementing them in a "pluggable" way is somewhat trickier.

Inspired by the pluggable partitioners and formatters in the S3/HDFS connector, this is a possible solution:

Abstract the ES client logic. Currently the connector depends directly on the JestClient and the JestClientFactory. Rather than depending directly on the JestClient for executing ES requests, we could add an ESClient interface and a default implementation that will use the current JestClient logic. A config would be added containing the classname of the ESClient implementation, which would get instantiated using reflection. Most people would use the default for this config, but for people needing the AWS auth (or any kind of special logic around querying ES), they could plop an implementation of the ESClient on their classpath that provides the AWS authentication and change the ESClient classname config. The downsides are it requires a new config that most people won't need to touch, and handling pluggability this way can get a bit unwieldy. It does give users complete control over how the connector queries ES, which could be useful, like if they are doing something fancy like routing to different ES clusters.
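A rough sketch of what that abstraction could look like follows. The `ESClient` interface, the `DefaultESClient` stand-in, and the `es.client.class` config key are all hypothetical names chosen for illustration; the real default implementation would wrap the existing JestClient logic rather than return a string.

```java
import java.util.Map;

/** Hypothetical abstraction over the calls the connector makes to Elasticsearch. */
interface ESClient {
    /** Sends a bulk payload and returns a short status string. */
    String bulk(String payload);
}

/** Stand-in for the current Jest-based logic; a real one would delegate to JestClient. */
final class DefaultESClient implements ESClient {
    @Override
    public String bulk(String payload) {
        return "default:" + payload.length();
    }
}

final class ESClientLoader {
    // Hypothetical config key, not an existing connector property.
    static final String CLIENT_CLASS_CONFIG = "es.client.class";

    /** Instantiates the configured ESClient via reflection, defaulting to DefaultESClient. */
    static ESClient load(Map<String, String> config) {
        String className =
                config.getOrDefault(CLIENT_CLASS_CONFIG, DefaultESClient.class.getName());
        try {
            return Class.forName(className)
                    .asSubclass(ESClient.class)   // rejects classes that are not ESClients
                    .getDeclaredConstructor()     // requires a no-argument constructor
                    .newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException("Cannot instantiate ESClient: " + className, e);
        }
    }
}
```

Requiring a no-argument constructor keeps the loader simple, but an AWS-aware client would then need some other way to receive its settings, for example a `configure(Map)` method on the interface.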

Honestly though, for this particular issue it might make more sense to stand up a reverse proxy that will handle the authentication. AWS's ES can do IP based access control, so you could just set up a vanilla nginx reverse proxy and whitelist its IP. Or you could set the proxy up with this.

I guess it boils down to whether it is worth abstracting the ESClient or not. If the ESClient abstraction makes sense for purposes besides AWS authentication, then handling authentication that way could be easier; otherwise, the reverse proxy is probably the way to go.

@jdsiddon

jdsiddon commented Mar 1, 2018

Have there been any updates on this issue?

@zzbennett

Since this issue was created, AWS has released VPC-based Elasticsearch clusters, which don't require request signing, so there isn't as much of a need for this feature anymore.

@elarib

elarib commented Mar 22, 2018

We're using a secured Elasticsearch in PROD, and now we need to sink some topics on IT using Kafka Connect (we've been doing it using Spark Streaming).
@zzbennett I think this feature is needed.
How can I help, @ewencp @thomasdziedzic @jdsiddon @zzbennett, so we can move forward with this PR?

@zzbennett

@elarib What exactly do you mean by secure Elasticsearch? Is your Elasticsearch cluster deployed in AWS? If so, is it deployed in a VPC, or do you access it over the public internet?

@elarib

elarib commented Mar 23, 2018

@zzbennett Yesterday, I created a pull request with a description of this use case: #185
There are use cases for securing ES with X-Pack or Search Guard, for example to get multitenancy.

@rhauch
Member

rhauch commented Jun 7, 2018

#216 implements basic auth via the JEST client. Does that satisfy this request? If so, we can close this issue.

@joncourt

joncourt commented Oct 15, 2018

Anyone working on a PR for this? Planning to do so myself if not...

We have a company policy that requires signing as per https://docs.aws.amazon.com/general/latest/gr/signature-v4-examples.html

Perhaps a fork specific to AWS Elasticsearch, to avoid adding AWS dependencies to this connector in general? Seems a bit heavyweight either way.

@purbon
Member

purbon commented Jan 9, 2019

Hi,
regarding AWS-specific security support, I agree with @joncourt's approach here. AWS has a lot of options (including the necessary dependencies) that are very specific to AWS.

The important bit, as @joncourt already commented, is that you have to issue Signature Version 4 signed requests, basically wrapping all your interaction with the search engine. This operation is of no benefit for any other Elasticsearch installation.

Access control is done with IAM policies, basically allowing or denying HTTP verbs against resources. These policies let you authorise based on identity, but also on source, etc. This is where the signature and the policies together do the work of authorisation, at least to my understanding.

From their blog:

A note about authentication, which applies to both types of policies: you can use two strategies to authenticate Amazon ES requests. The first is based on the originating IP address. You can omit the Principal from your policy and specify an IP Condition. In this case, and barring a conflicting policy, any call from that IP address will be allowed access or be denied access to the resource in question. The second strategy is based on the originating Principal. In this case, you are required to include information that AWS can use to authenticate the requestor as part of every request to your Amazon ES endpoint, which you accomplish by signing the request using Signature Version 4. Later in this post, I provide an example of how you can sign a simple request against Amazon ES using Signature Version 4.
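For a sense of what signing involves, here is a minimal sketch of one step of Signature Version 4, the signing-key derivation, using only the JDK's `javax.crypto`. The class and method names are made up for this example. It is deliberately incomplete: building the canonical request, the string to sign, and the final `Authorization` header is left out, and in practice an AWS SDK signer or an existing interceptor library would do all of this.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Illustrative only: derives the SigV4 signing key, not a full request signature. */
final class SigV4KeySketch {

    private static byte[] hmacSha256(byte[] key, String data) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(data.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable", e);
        }
    }

    /**
     * Chains four HMACs over date, region, service, and the fixed "aws4_request" string.
     * dateStamp is in yyyyMMdd form; for Amazon ES the service name is "es".
     */
    static byte[] signingKey(String secretKey, String dateStamp, String region, String service) {
        byte[] kSecret = ("AWS4" + secretKey).getBytes(StandardCharsets.UTF_8);
        byte[] kDate = hmacSha256(kSecret, dateStamp);
        byte[] kRegion = hmacSha256(kDate, region);
        byte[] kService = hmacSha256(kRegion, service);
        return hmacSha256(kService, "aws4_request");
    }

    /** Lower-case hex encoding, handy for comparing against published examples. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }
}
```

A call such as `SigV4KeySketch.signingKey(secret, "20190109", "us-east-1", "es")` yields a 32-byte key that is only valid for that date, region, and service, which is part of why this logic is of no use to other Elasticsearch installations.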

I would recommend doing it in a way where people not using AWS do not have to carry the heavy weight of AWS deps, for example using a fork.

We should also not forget that Elasticsearch supports security through X-Pack, which is another way of adding security on top of it. In addition, a smaller number of people use https://search-guard.com/ as a security solution for Elasticsearch.

For me, all of this calls for a solution that is portable and lets people use their own module for security and auth.

I hope it makes sense.

@chadwickthebold

Linking for anyone else who comes across this: it looks like there's now a PR for this, #330.

The complication we ran into when trying to use Elasticsearch on AWS with the IP range restriction suggested above is that it also limits requests to the Kibana instance that AWS gives you out of the box. It might not be a big problem depending on your use case, but it's worth noting.

@rldeep2889

Hello All,

A bit curious. I am trying to move data out of AWS MSK into AWS ES via the connector. Can anyone shed some light on how I can configure the signer, or on any other way to index into AWS ES?

PS: I am able to connect to AWS MSK; I just want some help indexing into ES.

10 participants