Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TLS support #4270

Merged
merged 1 commit into from
Jul 7, 2017
Merged

TLS support #4270

merged 1 commit into from
Jul 7, 2017

Conversation

pjain1
Copy link
Member

@pjain1 pjain1 commented May 12, 2017

Open for discussion.

Configurations -

  1. druid.server.http.plainText - enable/disable http connector
  2. druid.server.http.tls- enable/disable https connector
  3. druid.plainTextPort - port for http connector
  4. druid.tlsPort - port for https connector
  5. druid.server.https.* - Jetty server TLS configs

Behavior of configuring http connector using druid.host and druid.port is kept backwards compatible. However, if druid.server.http.tls is set then druid.tlsPort config will be used for TLS port and if druid.host contains port then that port will be ignored, this also means that druid.tlsPort should be a non-negative Integer . Please see DruidNodeTest unit tests for all the different configurations that will work and ones that will not.

High Level Implementation details -

  1. Changed Internal Discovery metadata (DruidServerMetadata) to include hostAndTlsPort in addition to host which is combination of host and plain text port. If druid.server.http.plainText is not set then announced host will be null, similarly if druid.server.http.tlsis not set then announced hostAndTlsPort will be null.
  2. Changed Curator Announcer to announce sslPort in addition to port. sslPort will be -1 if tls is disabled, similarly port will be -1 if plainText is disabled.
  3. Changed TaskLocation to include tlsPort.
  4. Changed Worker to include scheme which will be https if tlsPort is greater than or equal to 0 otherwise http.
  5. Changed ListenerResourceAnnouncer to announce HostAndPortWithScheme instead of HostAndPort. If tlsPort is valid then host and tls port with https scheme will be used otherwise plainText port and http will be used.
  6. Modified PasswordProvider to include getPassword(key) to support getting other passwords than just Metadata Storage Connector password. More details in comment in PasswordProvider class.
  7. There are decent number of ways to configure SSLContext to be used with HttpClient. As per the code in master if an implementation of SSLContext is bound then that would be used with the client. Therefore, keeping it the same way but providing a sample implementation module (in extensions-core) that creates an example context, good enough for most simple cases.

TODO -

  1. Tranquility support - May not be required immediately as added support for simultaneous HTTP and HTTPS connector.
  2. Unit tests - Added few tests
    3. Documentation
  3. Possibly change integration tests to support TLS

I have tested the changes locally.

@gianm
Copy link
Contributor

gianm commented May 12, 2017

@pjain1 very cool! Will fit in well with work we've been doing to add basic authentication, which requires https to be effective…

How would a transition from http to https look for a cluster? For outwards-facing APIs, could they be served on both http and https for some transition period, before the user decides to cut over to https completely? I think something like that is necessary to avoid disrupting clients. (i.e. a three part migration: first enable both http/https on Druid, then switch clients to https, then turn off http on Druid)

For inwards-facing APIs, like historical, middleManager, task announcement, etc, what's the idea there? The fact that "scheme" is added but there's still only one "port" makes me think that they won't support both http and https at the same time. How would we do a rolling update in that case?

@pjain1
Copy link
Member Author

pjain1 commented May 12, 2017

We did not wanted to support two ports simultaneously on nodes to avoid complexity and code which will not be useful in future. This is what we thought about how rolling upgrade will work -

  1. Upgrade code on all nodes and put all other necessary infrastructure for certificates and keys. By default all service will keep on running using http.
  2. Provide an implementation of SSLContext at all nodes (through an extension). Now the HttpClient on all nodes can talk to https enabled services if they are announcing scheme as https.
  3. Now all the nodes can be upgraded one by one since nodes always look at the scheme to decide whether to talk on http or https.
  4. External client have to handle https at some point so they should somehow handle this transition, probably they can have some logic where they can try connecting on https and if it doesn't work then fall back to http.

We thought of the three phase approach but thought it might a lot of unnecessary complexity to code which is not so nice. What do you think ?

@pjain1
Copy link
Member Author

pjain1 commented May 12, 2017

For the 4. point about external clients, actually they should also the external discovery to figure out the scheme and use that and then there won't be any disruptions.

@gianm
Copy link
Contributor

gianm commented May 12, 2017

For the internally-facing APIs I think it's workable to only listen/announce one scheme at a time.

But for the externally-facing APIs I think users will really appreciate it if we listen on two ports for a while. That's because a lot of peoples' clients don't use Curator service discovery to find Druid -- they use some other mechanism, like load balancers, VIPs, or even hard-coded URLs.

External client have to handle https at some point so they should somehow handle this transition, probably they can have some logic where they can try connecting on https and if it doesn't work then fall back to http.

Some clients don't have the ability to use such custom logic, such as if your client is actually a load balancer or VIP where you can't configure a fallback like that. And even for clients that do have the ability to use custom logic, we'd be asking users to write code to implement that, which is a burden on them.

@pjain1
Copy link
Member Author

pjain1 commented May 12, 2017

Sounds like a valid concern. @himanshug @cheddar Do you have any opinions on Gian's comment.

@himanshug
Copy link
Contributor

@gianm it was discussed internally whether to enable same process to run both http and https simultaneously , but with scheme discovery via zookeeper it was possible to support druid's rolling upgrade and become https. so we decided not to go that route.
i think external clients can be taken care of if we add the feature on druid, so that if anyone sends http://host:port/xxx then server responds with "HTTP 301 Redirect" to https://host:port/xxx . Now, user would first upgrade the software one by one on all the external clients to handle https and the redirect... then enable https on druid side.... then change configuration on all the external clients to explicitly use https url.
does that sound ok ?

@gianm
Copy link
Contributor

gianm commented May 12, 2017

Sorry but it still sounds too user-unfriendly to me.

Redirects aren't always easy or even possible for clients to follow. Consider the case where the brokers are fronted by a load balancer, which is super common (amazon ELB in particular is quite common, as are other kinds of load balancers). In many cases the user doesn't have access to the brokers directly and has to go through the balancer. If the load balancer doesn't follow redirects -- and many don't -- the client will get redirected to a url it can't access.

Also, many http libraries don't follow redirects by default, if they support it at all (our own http-client doesn't, and I've worked with others that also don't). And some libraries will follow for GET, but not for POST.

I think asking users to get all their clients to be able to deal with this kind of stuff before they can switch to https is going to be a pain for them, if it is even possible at all.

@himanshug
Copy link
Contributor

hmmm, in the case when brokers are behind a load balancer and all external clients talk to that ... then rolling upgrade is pretty simple ... remove http://broker-1 from LB, then enable https on broker-1 and then add https://broker-1 to LB ... and repeat same for all brokers.

that said, in the most general case only option would be to have broker accept both http and https. @cheddar any additional thoughts?

@gianm
Copy link
Contributor

gianm commented May 12, 2017

remove http://broker-1 from LB, then enable https on broker-1 and then add https://broker-1 to LB ... and repeat same for all brokers.

That wouldn't work for Amazon ELBs, at least, which don't let you specify different schemes or ports for different instances behind a load balancer. Those have to match across all of them.

I think in this case it's best for us to take the complexity hit in order to make life simpler for users.

@himanshug
Copy link
Contributor

yeah, may be, so user would probably configure additional druid.sslPort which will be used to start additional jetty https connector.

@gianm
Copy link
Contributor

gianm commented May 13, 2017

Yeah, that makes sense to me.

@nishantmonu51 nishantmonu51 self-assigned this May 15, 2017
@nishantmonu51 nishantmonu51 self-requested a review May 15, 2017 15:28
@gianm
Copy link
Contributor

gianm commented May 16, 2017

We might want additional configs too to determine what ports to listen on. Something to make it possible to disable http when we don't want it anymore.

Would we also add 'standard' SSL ports for Druid services, meant to replace the 808x and 81xx ports we currently use?

@pjain1
Copy link
Member Author

pjain1 commented May 16, 2017

@gianm Ignoring the current configs in this PR, moving forward something like this can be done-

  1. If tlsPort is set and is different from port then two connectors will be opened one for http and other for https.
  2. If tlsPort is set and is same as port or port is not set then use tlsPort to open https connector.
  3. Otherwise use port to open http connector.
  4. Fail in any other situation
    In this world if users want https then they must set tlsPort.

Another design would be to have a config like druid.server.https.enabled -

  1. Have same default value of tlsPort and port
  2. If the config is set and tlsPort and port value is equal then open an https connector.
  3. If the config is set and both ports values are different then open both http and https connector.
  4. If the config is not set then use port value to open an http connector.
    In this world, if users just want https then they must either set port to -1 or set equal value for port and tlsPort

There can be multiple combinations with these 3 configs. What do you think is better ?

@pjain1
Copy link
Member Author

pjain1 commented May 16, 2017

One more option is to have a config like druid.server.http.disable (defaults to false) and druid.server.https.enable (defaults to false) . These two properties directly correlate to port and tlsPort on DruidNode. HTTP connector always uses port and HTTPS always uses tlsPort. If we want standard TLS ports then they can be the default port number + 100 or something like that.

@gianm
Copy link
Contributor

gianm commented May 16, 2017

Both of those seem a bit magical to me. If I'm a user I don't want to have to worry about whether port and tlsPort are the same and what implications that has.

Instead how about the following?

  • set default tlsPort for each service to existing plaintext ports + 200; so broker is at 8282, historical is at 8283, MM tlsStartPort is at 8300, etc.
  • have a config druid.server.http.tls which defaults to false, but if enabled, turns on the tls ports.
  • have a config druid.server.http.plaintext which defaults to true, but if disabled, turns off the plaintext ports.
  • users can customize port/tlsPort/startPort/tlsStartPort if they want, but we expect that most users won't.
  • maybe deprecate the "port" configs and rename them to something like druid.plaintextPort?

@gianm
Copy link
Contributor

gianm commented May 16, 2017

oops, should have said druid.server.http.plaintext would default to true. Edited the last comment.

@pjain1
Copy link
Member Author

pjain1 commented May 16, 2017

Interesting, my last comment looks similar to what you are proposing.

@gianm
Copy link
Contributor

gianm commented May 16, 2017

Yes it does look similar :)

I like druid.server.http.tls and druid.server.http.plaintext -- to me it's easier to scan and understand something like http.plaintext=true than http.disable=false (the double negative is weird).

@pjain1
Copy link
Member Author

pjain1 commented May 16, 2017

OK lets go with the properties name you suggested. I'll make the changes soon. Thanks for the comments.

@pjain1
Copy link
Member Author

pjain1 commented May 16, 2017

If port is deprecated and if user was setting port using config then do we expect user to change it to plaintextPort or should the code internally map port to plaintextPort.

@jon-wei
Copy link
Contributor

jon-wei commented May 16, 2017

@pjain1 @gianm

Having two types of ports and a separate configuration to enable each type sounds good to me.

I think we could keep port as-is without deprecating (I feel like it should be sufficient to make that distinction in the docs without needing users to make configuration changes for all plaintext ports), but if we do, I'd favor treating port and plaintextPort as equivalent.

If both plaintext/TLS ports are enabled, I think having port and tlsPort be the same should be an error condition instead of having TLS silently take precedence, I feel like behavior is more clear then.

@leventov leventov removed this from the 0.10.2 milestone Jun 26, 2017
@jon-wei
Copy link
Contributor

jon-wei commented Jun 26, 2017

@pjain1 This patch mostly LGTM now, I'm okay with merging this once the discussions around the PasswordProvider are resolved.

I agree with @gianm and @drcrallen on keeping the existing password handling for the metadata store unchanged in this PR, and left my thoughts on @drcrallen's comment regarding the PasswordProvider(or SecretProvider) interface.

@pjain1 pjain1 closed this Jun 26, 2017
@pjain1 pjain1 reopened this Jun 26, 2017
@@ -14,7 +14,8 @@ The indexing service uses several of the global configs in [Configuration](../co
|Property|Description|Default|
|--------|-----------|-------|
|`druid.host`|The host for the current node. This is used to advertise the current processes location as reachable from another node and should generally be specified such that `http://${druid.host}/` could actually talk to this process|InetAddress.getLocalHost().getCanonicalHostName()|
|`druid.port`|This is the port to actually listen on; unless port mapping is used, this will be the same port as is on `druid.host`|8090|
|`druid.plaintextPort`|This is the port to actually listen on; unless port mapping is used, this will be the same port as is on `druid.host`|8090|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should note that the MiddleManager defaults to ports 8091/8291

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added new section for MiddleManager configs

@pjain1
Copy link
Member Author

pjain1 commented Jun 27, 2017

@gianm @jon-wei @drcrallen to keep things simple I have kept PasswordProvider interface same and changed the code so that at each place where a password is required, a new object of PasswordProvider is instantiated. Thus each PasswordProvider can be individually set using runtime properties in similar way as it is done for the metadata store password.

Added separate documentation pages for how to use and implement PasswordProvider, please have a look.

@pjain1 pjain1 closed this Jun 28, 2017
@pjain1 pjain1 reopened this Jun 28, 2017
@pjain1 pjain1 closed this Jun 28, 2017
@pjain1 pjain1 reopened this Jun 28, 2017
@gianm
Copy link
Contributor

gianm commented Jun 29, 2017

@pjain1 the PasswordProvider approach looks good to me -- it's definitely simpler and looks like it should work well.

@jon-wei could you take another look at the overall patch please?

And @pjain1 could you please resolve the conflicts?

@jon-wei
Copy link
Contributor

jon-wei commented Jun 30, 2017

@pjain1 Reviewed the patch again, didn't have any more comments, looks good to me. The new password provider design looks good.

I'll approve after resolving conflicts.

@pjain1 pjain1 closed this Jun 30, 2017
@pjain1 pjain1 reopened this Jun 30, 2017
@pjain1
Copy link
Member Author

pjain1 commented Jun 30, 2017

I am going to do some basic testing again tomorrow after which I think the PR will be mergeable.

@pjain1
Copy link
Member Author

pjain1 commented Jun 30, 2017

The PR is mergeable now (if there are no more comments), note that I have added toString for TLSServerConfig and SSLClientConfig meaning that things like keyStorePath, keyStoreType etc. (excluding PasswordProviders) will be printed in logs.

@jon-wei
Copy link
Contributor

jon-wei commented Jun 30, 2017

@drcrallen @nishantmonu51 Did you have any more comments on this patch?

@jon-wei
Copy link
Contributor

jon-wei commented Jul 6, 2017

@nishantmonu51 @drcrallen @jihoonson @gianm

Does anyone else want to take a look at this patch again? If not, I'm planning on merging this tomorrow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants