Add TLS support to HTTP/GRPC clients #2502

Merged
merged 38 commits into from May 14, 2020
Changes from 32 commits
Commits
38 commits
67ac979
Checkpoint
annanay25 Apr 9, 2020
828e968
Add tls options to grpc client
annanay25 Apr 13, 2020
e85437b
Merge branch 'master' into add-tls-support
annanay25 Apr 13, 2020
1ee9ddb
Add new httpclient util package for use in all client configs
annanay25 Apr 14, 2020
ef58c41
Merge branch 'master' into add-tls-support
annanay25 Apr 15, 2020
745299d
Merge branch 'master' into add-tls-support
annanay25 Apr 22, 2020
b93af11
Change all grpc clients to use grpcclient
annanay25 Apr 22, 2020
8a71ce9
Fix build, add docs
annanay25 Apr 23, 2020
bab4c82
Fix tests
annanay25 Apr 23, 2020
8fee619
Fix lint, add tls to store-gw-client
annanay25 Apr 23, 2020
2ff0a73
Merge branch 'master' into add-tls-support
annanay25 Apr 23, 2020
d24a44d
Rename config parameters
annanay25 Apr 23, 2020
485f8ba
Lint
annanay25 Apr 23, 2020
5561a0f
Nit fix
annanay25 Apr 28, 2020
598b00f
Merge branch 'master' into add-tls-support
annanay25 Apr 28, 2020
548ddfa
Checkpoint
annanay25 Apr 29, 2020
1a185b5
Checkpoint
annanay25 Apr 29, 2020
1ccefbf
Checkpoint
annanay25 Apr 30, 2020
5fd056c
Merge branch 'master' into add-tls-support
annanay25 May 2, 2020
0a8c4eb
Add integration tests for TLS
annanay25 May 4, 2020
b6aafa9
Merge branch 'master' into add-tls-support
annanay25 May 4, 2020
d28a3d6
Correct package names, fix config file reference
annanay25 May 4, 2020
480308e
Fix cert paths
annanay25 May 6, 2020
ca7a6d9
Fix lint, add sample tls config file
annanay25 May 7, 2020
db64cd4
Crash quickly if certs are bad
annanay25 May 7, 2020
ee48ed3
Fixed linter and doc generation
pracucci May 8, 2020
b9325bd
Cleaned white noise
pracucci May 8, 2020
721fed1
Merge commit 'refs/pull/2502/head' of github.com:cortexproject/cortex…
annanay25 May 11, 2020
af4935d
Address review comments
annanay25 May 11, 2020
8f9f2e7
Fix docs, flags
annanay25 May 11, 2020
668e988
Fix test
annanay25 May 11, 2020
ef761f5
Fix lint, docs
annanay25 May 11, 2020
24919ee
Do not use TLS options with GCP clients
annanay25 May 11, 2020
9939303
Add client auth type, go mod tidy/vendor
annanay25 May 12, 2020
6eb9331
Address comments
annanay25 May 13, 2020
5aa65d1
Fix lint, add new integration test
annanay25 May 13, 2020
ae5f6f7
Revert logging level to warn, add CHANGELOG entry
annanay25 May 13, 2020
d645572
Merge branch 'master' into add-tls-support
annanay25 May 13, 2020
110 changes: 110 additions & 0 deletions docs/configuration/config-file-reference.md
@@ -646,6 +646,56 @@ The `querier_config` configures the Cortex querier.
# instances form a ring and addresses are picked from the ring).
# CLI flag: -experimental.querier.store-gateway-addresses
[store_gateway_addresses: <string> | default = ""]

store_gateway_client_config:
  # gRPC client max receive message size (bytes).
  # CLI flag: -experimental.querier.store-gateway-client.grpc-max-recv-msg-size
  [max_recv_msg_size: <int> | default = 104857600]

  # gRPC client max send message size (bytes).
  # CLI flag: -experimental.querier.store-gateway-client.grpc-max-send-msg-size
  [max_send_msg_size: <int> | default = 16777216]

  # Use compression when sending messages.
  # CLI flag: -experimental.querier.store-gateway-client.grpc-use-gzip-compression
  [use_gzip_compression: <boolean> | default = false]

  # Rate limit for gRPC client; 0 means disabled.
  # CLI flag: -experimental.querier.store-gateway-client.grpc-client-rate-limit
  [rate_limit: <float> | default = 0]

  # Rate limit burst for gRPC client.
  # CLI flag: -experimental.querier.store-gateway-client.grpc-client-rate-limit-burst
  [rate_limit_burst: <int> | default = 0]

  # Enable backoff and retry when we hit ratelimits.
  # CLI flag: -experimental.querier.store-gateway-client.backoff-on-ratelimits
  [backoff_on_ratelimits: <boolean> | default = false]

  backoff_config:
    # Minimum delay when backing off.
    # CLI flag: -experimental.querier.store-gateway-client.backoff-min-period
    [min_period: <duration> | default = 100ms]

    # Maximum delay when backing off.
    # CLI flag: -experimental.querier.store-gateway-client.backoff-max-period
    [max_period: <duration> | default = 10s]

    # Number of times to backoff and retry before failing.
    # CLI flag: -experimental.querier.store-gateway-client.backoff-retries
    [max_retries: <int> | default = 10]

  # TLS cert path for the client
  # CLI flag: -experimental.querier.store-gateway-client.tls-cert-path
  [tls_cert_path: <string> | default = ""]

  # TLS key path for the client
  # CLI flag: -experimental.querier.store-gateway-client.tls-key-path
  [tls_key_path: <string> | default = ""]

  # TLS CA path for the client
  # CLI flag: -experimental.querier.store-gateway-client.tls-ca-path
  [tls_ca_path: <string> | default = ""]
```

### `query_frontend_config`
@@ -757,6 +807,18 @@ The `ruler_config` configures the Cortex ruler.
# CLI flag: -ruler.external.url
[external_url: <url> | default = ]

# TLS cert path for the client
# CLI flag: -ruler.client.tls-cert-path
[tls_cert_path: <string> | default = ""]

# TLS key path for the client
# CLI flag: -ruler.client.tls-key-path
[tls_key_path: <string> | default = ""]

# TLS CA path for the client
# CLI flag: -ruler.client.tls-ca-path
[tls_ca_path: <string> | default = ""]

# How frequently to evaluate rules
# CLI flag: -ruler.evaluation-interval
[evaluation_interval: <duration> | default = 1m]
@@ -1584,6 +1646,18 @@ bigtable:
# CLI flag: -bigtable.backoff-retries
[max_retries: <int> | default = 10]

# TLS cert path for the client
# CLI flag: -bigtable.tls-cert-path
[tls_cert_path: <string> | default = ""]

# TLS key path for the client
# CLI flag: -bigtable.tls-key-path
[tls_key_path: <string> | default = ""]

# TLS CA path for the client
# CLI flag: -bigtable.tls-ca-path
[tls_ca_path: <string> | default = ""]

# If enabled, once a tables info is fetched, it is cached.
# CLI flag: -bigtable.table-cache.enabled
[table_cache_enabled: <boolean> | default = true]
@@ -1960,6 +2034,18 @@ grpc_client_config:
# Number of times to backoff and retry before failing.
# CLI flag: -ingester.client.backoff-retries
[max_retries: <int> | default = 10]

# TLS cert path for the client
# CLI flag: -ingester.client.tls-cert-path
[tls_cert_path: <string> | default = ""]

# TLS key path for the client
# CLI flag: -ingester.client.tls-key-path
[tls_key_path: <string> | default = ""]

# TLS CA path for the client
# CLI flag: -ingester.client.tls-ca-path
[tls_ca_path: <string> | default = ""]
```

### `frontend_worker_config`
@@ -2021,6 +2107,18 @@ grpc_client_config:
# Number of times to backoff and retry before failing.
# CLI flag: -querier.frontend-client.backoff-retries
[max_retries: <int> | default = 10]

# TLS cert path for the client
# CLI flag: -querier.frontend-client.tls-cert-path
[tls_cert_path: <string> | default = ""]

# TLS key path for the client
# CLI flag: -querier.frontend-client.tls-key-path
[tls_key_path: <string> | default = ""]

# TLS CA path for the client
# CLI flag: -querier.frontend-client.tls-ca-path
[tls_ca_path: <string> | default = ""]
```

### `etcd_config`
@@ -2526,6 +2624,18 @@ The `configstore_config` configures the config database storing rules and alerts
# Timeout for requests to Weave Cloud configs service.
# CLI flag: -<prefix>.configs.client-timeout
[client_timeout: <duration> | default = 5s]

# TLS cert path for the client
# CLI flag: -<prefix>.configs.tls-cert-path
[tls_cert_path: <string> | default = ""]

# TLS key path for the client
# CLI flag: -<prefix>.configs.tls-key-path
[tls_key_path: <string> | default = ""]

# TLS CA path for the client
# CLI flag: -<prefix>.configs.tls-ca-path
[tls_ca_path: <string> | default = ""]
```

### `tsdb_config`
100 changes: 100 additions & 0 deletions docs/configuration/single-process-config-blocks-tls.yaml
@@ -0,0 +1,100 @@

Review comment (Contributor):

For each pre-cooked config we provide, we have an integration test in `integration/getting_started_single_process_config_test.go`. Would be great if you could add a test there for this file (a new test, which can be an existing one you copy, paste and modify as needed).

# Configuration for running Cortex in single-process mode.
# This should not be used in production. It is only for getting started
# and development.

# Disable the requirement that every request to Cortex has a
# X-Scope-OrgID header. `fake` will be substituted in instead.
auth_enabled: false

server:
  http_listen_port: 9009

  # Configure the server to allow messages up to 100MB.
  grpc_server_max_recv_msg_size: 104857600
  grpc_server_max_send_msg_size: 104857600
  grpc_server_max_concurrent_streams: 1000
  grpc_tls_config:
    cert_file: "server.crt"
    key_file: "server.key"
    client_auth_type: "RequireAndVerifyClientCert"
    client_ca_file: "root.crt"


distributor:
  shard_by_all_labels: true
  pool:
    health_check_ingesters: true

ingester_client:
  grpc_client_config:
    # Configure the client to allow messages up to 100MB.
    max_recv_msg_size: 104857600
    max_send_msg_size: 104857600
    use_gzip_compression: true
    tls_cert_path: "client.crt"
    tls_key_path: "client.key"
    tls_ca_path: "root.crt"

ingester:
  # Disable blocks transfers on ingesters shutdown or rollout.
  max_transfer_retries: 0

  lifecycler:
    # The address to advertise for this ingester. Will be autodiscovered by
    # looking up address on eth0 or en0; can be specified if this fails.
    # address: 127.0.0.1

    # We want to start immediately and flush on shutdown.
    join_after: 0
    min_ready_duration: 0s
    final_sleep: 0s
    num_tokens: 512

    # Use an in memory ring store, so we don't need to launch a Consul.
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1

storage:
  engine: tsdb

tsdb:
  dir: /tmp/cortex/tsdb
  bucket_store:
    sync_dir: /tmp/cortex/tsdb-sync

  # You can choose between local storage and Amazon S3, Google GCS and Azure storage. Each option requires additional configuration
  # as shown below. All options can be configured via flags as well which might be handy for secret inputs.
  backend: s3 # s3, gcs, azure or filesystem are valid options
  s3:
    bucket_name: cortex
    endpoint: s3.dualstack.us-east-1.amazonaws.com
    # Configure your S3 credentials below.
    # secret_access_key: "TODO"
    # access_key_id: "TODO"
  # gcs:
  #   bucket_name: cortex
  #   service_account: # if empty or omitted Cortex will use your default service account as per Google's fallback logic
  # azure:
  #   account_name:
  #   account_key:
  #   container_name:
  #   endpoint_suffix:
  #   max_retries: # Number of retries for recoverable errors (defaults to 20)
  # filesystem:
  #   dir: ./data/tsdb

compactor:
  data_dir: /tmp/cortex/compactor
  sharding_ring:
    kvstore:
      store: inmemory

frontend_worker:
  match_max_concurrent: true
  grpc_client_config:
    tls_cert_path: "client.crt"
    tls_key_path: "client.key"
    tls_ca_path: "root.crt"
108 changes: 108 additions & 0 deletions docs/production/tls.md
@@ -0,0 +1,108 @@
---
title: "Securing communication between Cortex components with TLS"
linkTitle: "Securing communication between Cortex components with TLS"
weight: 5
slug: tls
---

Cortex is a distributed system with significant traffic between its services.
To allow for secure communication, Cortex supports TLS between all its
components. This guide describes the process of setting up TLS.

### Generation of certs to configure TLS

The first step to securing inter-service communication in Cortex with TLS is
generating certificates. A Certificate Authority (CA) is used for this
purpose. The CA should be private to the organization, because any certificate
signed by this CA will be accepted for communication with the cluster.

We will use the following script to generate a self-signed root CA and the
client and server certs signed by it:

```
# Refer: https://github.com/joe-elliott/cert-exporter/blob/69d3d7230378325a1de4fa313432d3d6ced4a518/test/files/genCerts.sh

# keys
openssl genrsa -out root.key
openssl genrsa -out client.key
openssl genrsa -out server.key

# root cert / certifying authority
openssl req -x509 -new -nodes -key root.key -subj "/C=US/ST=KY/O=Org/CN=root" -sha256 -days 100000 -out root.crt

# csrs - certificate signing requests
openssl req -new -sha256 -key client.key -subj "/C=US/ST=KY/O=Org/CN=client" -out client.csr
openssl req -new -sha256 -key server.key -subj "/C=US/ST=KY/O=Org/CN=localhost" -out server.csr

# certificates
openssl x509 -req -in client.csr -CA root.crt -CAkey root.key -CAcreateserial -out client.crt -days 100000 -sha256
openssl x509 -req -in server.csr -CA root.crt -CAkey root.key -CAcreateserial -out server.crt -days 100000 -sha256
```

Note that the above script generates certificates that are valid for 100000 days.
This can be changed by adjusting the `-days` option in the above commands.
It is recommended that the certs be replaced at least once every 2 years.

The above script generates the keys `client.key` and `server.key` and the certs
`client.crt` and `server.crt` for the client and the server respectively. The
CA cert is generated as `root.crt`.
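
The relationship between the root CA and the certs it signs can also be sketched in Go. This is a minimal illustration using only the standard library, not part of Cortex: the helper names (`signCert`, `newRootCA`, `newServerCert`, `verifyServer`) are hypothetical, and the certs live in memory for one day instead of being written to disk. Unlike the openssl script, the sketch also sets a DNS SAN on the server cert, because Go 1.15 and later ignore the Common Name when checking hostnames.

```go
package main

import (
	"crypto/rand"
	"crypto/rsa"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// signCert issues a certificate for pub from tmpl, signed by parent and
// parentKey, and returns it parsed. Passing tmpl as parent self-signs it.
func signCert(tmpl, parent *x509.Certificate, pub *rsa.PublicKey, parentKey *rsa.PrivateKey) (*x509.Certificate, error) {
	der, err := x509.CreateCertificate(rand.Reader, tmpl, parent, pub, parentKey)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

// newRootCA creates a self-signed CA certificate, playing the role of root.crt.
func newRootCA(cn string) (*x509.Certificate, *rsa.PrivateKey, error) {
	key, err := rsa.GenerateKey(rand.Reader, 2048)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: cn},
		NotBefore:             time.Now().Add(-time.Minute),
		NotAfter:              time.Now().Add(24 * time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign | x509.KeyUsageDigitalSignature,
	}
	cert, err := signCert(tmpl, tmpl, &key.PublicKey, key)
	return cert, key, err
}

// newServerCert issues a server certificate for host, signed by the CA.
func newServerCert(host string, ca *x509.Certificate, caKey *rsa.PrivateKey) (*x509.Certificate, error) {
	key, err := rsa.GenerateKey(rand.Reader, 2048)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(2),
		Subject:      pkix.Name{CommonName: host},
		DNSNames:     []string{host}, // SAN used for hostname verification
		NotBefore:    time.Now().Add(-time.Minute),
		NotAfter:     time.Now().Add(24 * time.Hour),
		KeyUsage:     x509.KeyUsageDigitalSignature | x509.KeyUsageKeyEncipherment,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
	}
	return signCert(tmpl, ca, &key.PublicKey, caKey)
}

// verifyServer checks that cert chains to root and is valid for host.
func verifyServer(cert, root *x509.Certificate, host string) error {
	pool := x509.NewCertPool()
	pool.AddCert(root)
	_, err := cert.Verify(x509.VerifyOptions{Roots: pool, DNSName: host})
	return err
}

func main() {
	root, rootKey, err := newRootCA("root")
	if err != nil {
		panic(err)
	}
	server, err := newServerCert("localhost", root, rootKey)
	if err != nil {
		panic(err)
	}
	other, _, err := newRootCA("other-root")
	if err != nil {
		panic(err)
	}
	fmt.Println("verified against signing CA:", verifyServer(server, root, "localhost") == nil)
	fmt.Println("rejected by unrelated CA:", verifyServer(server, other, "localhost") != nil)
}
```

The second check is the reason the CA should stay private: a peer trusts exactly the certs that chain to the CA it was configured with.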

### Load certs into the HTTP/GRPC server/client

Every HTTP/GRPC link between Cortex components supports TLS configuration
through the following config parameters:

#### Server flags

```
# Path to the TLS Cert for the HTTP Server
-server.http-tls-cert-path=/path/to/server.crt

# Path to the TLS Key for the HTTP Server
-server.http-tls-key-path=/path/to/server.key

# Type of Client Auth for the HTTP Server
-server.http-tls-client-auth="RequireAndVerifyClientCert"

# Path to the Client CA Cert for the HTTP Server
-server.http-tls-ca-path=/path/to/root.crt

# Path to the TLS Cert for the GRPC Server
-server.grpc-tls-cert-path=/path/to/server.crt

# Path to the TLS Key for the GRPC Server
-server.grpc-tls-key-path=/path/to/server.key

# Type of Client Auth for the GRPC Server
-server.grpc-tls-client-auth="RequireAndVerifyClientCert"

# Path to the Client CA Cert for the GRPC Server
-server.grpc-tls-ca-path=/path/to/root.crt
```

#### Client flags

Client flags are component specific.

For an HTTP client in the Alertmanager:
```
# Path to the TLS Cert for the HTTP Client
-alertmanager.configs.tls-cert-path=/path/to/client.crt

# Path to the TLS Key for the HTTP Client
-alertmanager.configs.tls-key-path=/path/to/client.key

# Path to the TLS CA for the HTTP Client
-alertmanager.configs.tls-ca-path=/path/to/root.crt
```

For a GRPC client in the Querier:
```
# Path to the TLS Cert for the GRPC Client
-querier.frontend-client.tls-cert-path=/path/to/client.crt

# Path to the TLS Key for the GRPC Client
-querier.frontend-client.tls-key-path=/path/to/client.key

# Path to the TLS CA for the GRPC Client
-querier.frontend-client.tls-ca-path=/path/to/root.crt
```
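
As a rough illustration of what a client can do with these three paths, the sketch below builds a `tls.Config` from them with the Go standard library. It is not the Cortex implementation: the `clientTLSConfig` helper is hypothetical, and treating empty paths as "leave that part unset" is an assumption. Errors are returned immediately, so a caller can refuse to start when the certs are bad.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
	"os"
)

// clientTLSConfig builds a tls.Config from optional file paths. An empty
// cert/key pair means no client cert is presented; an empty CA path means
// the system roots are used. Both behaviours are assumptions of this sketch.
func clientTLSConfig(certPath, keyPath, caPath string) (*tls.Config, error) {
	if (certPath == "") != (keyPath == "") {
		return nil, errors.New("cert path and key path must be set together")
	}
	cfg := &tls.Config{}
	if certPath != "" {
		// The cert and key are presented to servers that require client auth.
		cert, err := tls.LoadX509KeyPair(certPath, keyPath)
		if err != nil {
			return nil, fmt.Errorf("loading client key pair: %w", err)
		}
		cfg.Certificates = []tls.Certificate{cert}
	}
	if caPath != "" {
		// The CA is used to verify the server's cert.
		caPEM, err := os.ReadFile(caPath)
		if err != nil {
			return nil, fmt.Errorf("reading CA file: %w", err)
		}
		pool := x509.NewCertPool()
		if !pool.AppendCertsFromPEM(caPEM) {
			return nil, fmt.Errorf("no certificates found in %s", caPath)
		}
		cfg.RootCAs = pool
	}
	return cfg, nil
}

func main() {
	// Placeholder paths from the flags above; loading fails unless they exist.
	cfg, err := clientTLSConfig("/path/to/client.crt", "/path/to/client.key", "/path/to/root.crt")
	if err != nil {
		fmt.Println("refusing to start:", err)
		return
	}
	fmt.Println("client certificates loaded:", len(cfg.Certificates))
}
```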