Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support other TLS modes than mutual auth in Client #3156

Merged
merged 4 commits into from Jan 29, 2021

Conversation

simonswine
Copy link
Contributor

@simonswine simonswine commented Sep 10, 2020

What this PR does:

Increase TLS test coverage and support other TLS modes than mutual auth in Client by implementing explicit TLS enable flag.

This also adds a tls-server-name flag to require a specific server name instead of the hostname on the certificate.

Which issue(s) this PR fixes:

Fixes #3062 #3063

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@simonswine simonswine force-pushed the tls-server-name branch 2 times, most recently from e5b0a04 to c85522e Compare September 15, 2020 08:42
@simonswine simonswine force-pushed the tls-server-name branch 2 times, most recently from 6c4719b to 8a4e74c Compare September 15, 2020 10:02
@simonswine simonswine changed the title WIP: Support other TLS modes than mutual auth in Client Support other TLS modes than mutual auth in Client Sep 15, 2020
@simonswine simonswine marked this pull request as ready for review September 15, 2020 10:03

// TODO: Investigate why the client doesn't really receive the
// error about the bad certificate from the server side and just
// see connection closed instead
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyone having an idea how we can propagate the server error of "bad certificate" in the GRPC case?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

grpc/grpc-go#622 (comment) suggests that you need to use a WithBlock() call option to get the certificate error. This might not be possible with the existing clients.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will have to investigate a bit more but I have found that issue and tried WithBlock() and had the request being blocked for a while rather than having any change of error behaviour. I will take another look tomorrow

Copy link
Contributor

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing, thanks! Changes LGTM. I just left a couple of nits and we should be good to go 🚀

pkg/util/tls/tls.go Outdated Show resolved Hide resolved
pkg/util/tls/tls.go Outdated Show resolved Hide resolved
pkg/util/tls/tls_integration_test.go Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
@simonswine simonswine force-pushed the tls-server-name branch 2 times, most recently from 90ab965 to 098c8ce Compare September 15, 2020 16:40
CHANGELOG.md Outdated Show resolved Hide resolved
pkg/util/tls/tls.go Outdated Show resolved Hide resolved
pkg/chunk/gcp/bigtable_index_client.go Outdated Show resolved Hide resolved
@@ -74,12 +72,27 @@ func (cfg *ClientConfig) GetTLSConfig() (*tls.Config, error) {
return config, nil
}

type ClientConfigWithEnabled struct {
ClientConfig `yaml:",inline"`
Enabled bool `yaml:"tls_enabled"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expected to see Enabled flag in grpcclient.Config, and then using that instead of current ClientConfigWithEnabled. WDYT? (It may require passing tlsEnabled flag to GetGRPCDialOptions)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually have implemented it that way the first time around.

It ended up being quite tedious in terms of duplicated work and passing the boolean around.

Esp. in places where there is not a full grpcclient.Config used, like here:

https://github.com/cortexproject/cortex/blob/342e17c592e6c63c3d01227732a7ea0ebdb27111/pkg/querier/querier.go#L63
or there:
https://github.com/cortexproject/cortex/blob/342e17c592e6c63c3d01227732a7ea0ebdb27111/pkg/querier/blocks_store_replicated_set.go#L40

Wdyt if we maybe call it tls.GRPCClientConfig instead of tls.ClientConfigWithEnabled as this is more accurate in the way we use it

pkg/util/tls/tls.go Show resolved Hide resolved
pkg/util/tls/tls.go Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
f.BoolVar(&cfg.InsecureSkipVerify, prefix+".tls-insecure-skip-verify", false, "Skip validating server certificate.")
}

// GetTLSConfig initialises tls.Config from config options
func (cfg *ClientConfig) GetTLSConfig() (*tls.Config, error) {
// no tls config given at all
if cfg.CertPath == "" && cfg.KeyPath == "" && cfg.CAPath == "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this a breaking change? What's the impact of this change for components that are directly using ClientConfig? Before we were retuning nil (TLS disabled) while now we do return tls.Config{} (TLS enabled).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that for clients using tls.Dial which includes HTTP clients, the behaviour is the exact same if it is set to nil or &tls.Config{}:

https://github.com/golang/go/blob/1095dd6339dbaf8d7c92214396c0a4dbcfa38521/src/crypto/tls/tls.go#L201

pkg/util/tls/tls.go Outdated Show resolved Hide resolved
@@ -27,6 +27,8 @@ type Config struct {

BackoffOnRatelimits bool `yaml:"backoff_on_ratelimits"`
BackoffConfig util.BackoffConfig `yaml:"backoff_config"`

TLS tls.ClientConfigWithEnabled `yaml:",inline"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is non an innocuous refactoring. The problem is that doing this change we add TLS to any gRPC client, even the ones for which we don't support it. For example BigTable client (see generated doc).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was done on on purpose as an outcome of this thread (#3156 (comment)), so there's an unified grpcclient.Config. We ensure that we enable TLS on the bigtable client by default, as TLS is on for bigtable by default in Cortex today.

I am not too sure if I caught every usage, but we should respect the TLS arguments (but I am not too sure if we want to support them according to your comment).

@simonswine simonswine force-pushed the tls-server-name branch 3 times, most recently from c00866b to 05ce847 Compare October 28, 2020 08:47
CHANGELOG.md Outdated Show resolved Hide resolved
This allows to override the expected server name during TLS server
validation. This simplifies the TLS setup as a ServerName can be more
predictable than for example IP addresses. Fixes cortexproject#3063

Improve TLS client test coverage

Add integration tests that spin up a HTTP/GRPC server and verify that
the client options behave in the expected way.

Allow configuration of non-mutual TLS

Explicitly enable TLS in the client with the flag
`-<prefix>.tls-enabled`. This flag is implicitly enabled when any other
TLS flag is set.

This flag will only be respected by the GRPC client, as for the
HTTP client the scheme used in the URL will take precedence.

Signed-off-by: Christian Simon <simon@swine.de>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
@pstibrany
Copy link
Contributor

pstibrany commented Jan 28, 2021

I've rebased this patch on top of current master, and did the changes suggested in review comments:

  • we should remove "tls_enabled" flag from tls.ClientConfig struct.
  • for grpc clients, we should put "tls_enabled" bool flag into grpcclient.Config, and merge grpcclient.ConfigWithTLS with grpcclient.Config
  • etcd client which has its own config should embed tls.ClientConfig instead, but keep tls_enabled flag.

I haven't updated ruler to use tls when communicating with AM, since #3752 is already doing that.

Copy link
Contributor

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot @pstibrany for taking over this work 👏 A part form a couple of non blocking nits, I've the feeling we're introducing a regression in the configs client. Could you check it, please?

pkg/querier/blocks_store_replicated_set.go Outdated Show resolved Hide resolved
pkg/querier/store_gateway_client.go Outdated Show resolved Hide resolved
pkg/ring/kv/etcd/etcd.go Outdated Show resolved Hide resolved
@@ -29,18 +30,15 @@ func (cfg *ClientConfig) RegisterFlagsWithPrefix(prefix string, f *flag.FlagSet)
f.StringVar(&cfg.CertPath, prefix+".tls-cert-path", "", "Path to the client certificate file, which will be used for authenticating with the server. Also requires the key path to be configured.")
f.StringVar(&cfg.KeyPath, prefix+".tls-key-path", "", "Path to the key file for the client certificate. Also requires the client certificate to be configured.")
f.StringVar(&cfg.CAPath, prefix+".tls-ca-path", "", "Path to the CA certificates file to validate server certificate against. If not set, the host's root CA certificates are used.")
f.StringVar(&cfg.ServerName, prefix+".tls-server-name", "", "Override the expected name on the server certificate.")
f.BoolVar(&cfg.InsecureSkipVerify, prefix+".tls-insecure-skip-verify", false, "Skip validating server certificate.")
}

// GetTLSConfig initialises tls.Config from config options
func (cfg *ClientConfig) GetTLSConfig() (*tls.Config, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is also used by pkg/configs/client/client.go. Before it was returning nil if TLS was disabled, while now it always returns a config. Could you check the impact on the configs client? I've the feeling we're introducing a regression here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If operator doesn't set any TLS options, then I think that returned structure will work fine even when using TLS in configs.
Defaults we are setting are good for client-side use. (For server-side, Certificates shouldn't be nil).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just tested http client with no Transport and Transport: &http.Transport{TLSClientConfig: tlsCfg} (where tlsCfg is result of calling GetTLSConfig on empty cortex_tls.ClientConfig) against public site using https, and I'm getting the same 200 response in both cases.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think integration test in tls_integration_test.go covers this too.

Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Copy link
Contributor

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Peter for checking it. LGTM.

@pstibrany pstibrany merged commit eb6a6f5 into cortexproject:master Jan 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support other TLS modes than mutual auth in Client
4 participants