Update Cortex to latest master #1869

Merged 8 commits on Apr 14, 2020
6 changes: 3 additions & 3 deletions docs/clients/promtail/configuration.md
@@ -207,13 +207,13 @@ tls_config:
# For a total time of 511.5s (8.5m) before logs are lost
backoff_config:
# Initial backoff time between retries
[minbackoff: <duration> | default = 500ms]
[min_period: <duration> | default = 500ms]

# Maximum backoff time between retries
[maxbackoff: <duration> | default = 5m]
[max_period: <duration> | default = 5m]

# Maximum number of retries to do
[maxretries: <int> | default = 10]
[max_retries: <int> | default = 10]

# Static labels to add to all logs being sent to Loki.
# Use map like {"foo": "bar"} to add a label foo with
12 changes: 6 additions & 6 deletions docs/clients/promtail/troubleshooting.md
@@ -102,24 +102,24 @@ batched together before getting pushed to Loki, based on the max batch duration
In case of any error while sending a batch of log entries, `promtail` adopts a
"retry then discard" strategy:

- `promtail` retries to send log entry to the ingester up to `maxretries` times
- `promtail` retries to send log entry to the ingester up to `max_retries` times
- If all retries fail, `promtail` discards the batch of log entries (_which will
be lost_) and proceeds with the next one

You can configure the `maxretries` and the delay between two retries via the
You can configure the `max_retries` and the delay between two retries via the
`backoff_config` in the promtail config file:

```yaml
clients:
- url: INGESTER-URL
backoff_config:
minbackoff: 100ms
maxbackoff: 10s
maxretries: 10
min_period: 100ms
max_period: 10s
max_retries: 10
```

The following table shows an example of the total delay applied by the backoff algorithm
with `minbackoff: 100ms` and `maxbackoff: 10s`:
with `min_period: 100ms` and `max_period: 10s`:

| Retry | Min delay | Max delay | Total min delay | Total max delay |
| ----- | --------- | --------- | --------------- | --------------- |
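
To make the delay arithmetic behind that table concrete (the table body is truncated in this diff), here is a minimal Go sketch that accumulates capped exponential delays for the example values `min_period: 100ms`, `max_period: 10s`, `max_retries: 10`. It is an illustration of the documented min/max bounds, not the vendored Cortex backoff code, which may also apply jitter within each bound.

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	// Assumed example values, matching the troubleshooting doc above.
	minPeriod := 100 * time.Millisecond
	maxPeriod := 10 * time.Second
	maxRetries := 10

	delay := minPeriod
	var totalMax time.Duration
	for retry := 1; retry <= maxRetries; retry++ {
		fmt.Printf("retry %2d: wait up to %v\n", retry, delay)
		totalMax += delay
		// Double the delay each retry, capping it at max_period.
		delay *= 2
		if delay > maxPeriod {
			delay = maxPeriod
		}
	}
	fmt.Printf("total max delay before the batch is discarded: %v\n", totalMax)
}
```
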
55 changes: 24 additions & 31 deletions docs/configuration/README.md
@@ -243,13 +243,13 @@ The `grpc_client_config` block configures a client connection to a gRPC service.
# Configures backoff when enabled.
backoff_config:
# Minimum delay when backing off.
[minbackoff: <duration> | default = 100ms]
[min_period: <duration> | default = 100ms]

# The maximum delay when backing off.
[maxbackoff: <duration> | default = 10s]
[max_period: <duration> | default = 10s]

# Number of times to backoff and retry before failing.
[maxretries: <int> | default = 10]
[max_retries: <int> | default = 10]
```

## ingester_config
@@ -344,9 +344,6 @@ ring.
# conditions with ingesters exiting and updating the ring.
[min_ready_duration: <duration> | default = 1m]

# Store tokens in a normalised fashion to reduce the number of allocations.
[normalise_tokens: <boolean> | default = false]

# Name of network interfaces to read addresses from.
interface_names:
- [<string> ... | default = ["eth0", "en0"]]
@@ -375,13 +372,13 @@ kvstore:
[host: <string> | default = "localhost:8500"]

# The ACL Token used to interact with Consul.
[acltoken: <string>]
[acl_token: <string>]

# The HTTP timeout when communicating with Consul
[httpclienttimeout: <duration> | default = 20s]
[http_client_timeout: <duration> | default = 20s]

# Whether or not consistent reads to Consul are enabled.
[consistentreads: <boolean> | default = true]
[consistent_reads: <boolean> | default = true]

# Configuration for an ETCD v3 client. Only applies if
# store is "etcd"
@@ -424,56 +421,52 @@ aws:
[s3forcepathstyle: <boolean> | default = false]

# Configure the DynamoDB connection
dynamodbconfig:
dynamodb:
# URL for DynamoDB with escaped Key and Secret encoded. If only region is specified as a
# host, the proper endpoint will be deduced. Use inmemory:///<bucket-name> to
# use a mock in-memory implementation.
dynamodb: <string>
dynamodb_url: <string>

# DynamoDB table management requests per-second limit.
[apilimit: <float> | default = 2.0]
[api_limit: <float> | default = 2.0]

# DynamoDB rate cap to back off when throttled.
[throttlelimit: <float> | default = 10.0]

# Application Autoscaling endpoint URL with escaped Key and Secret
# encoded.
[applicationautoscaling: <string>]
[throttle_limit: <float> | default = 10.0]

# Metrics-based autoscaling configuration.
metrics:
# Use metrics-based autoscaling via this Prometheus query URL.
[url: <string>]

# Queue length above which we will scale up capacity.
[targetqueuelen: <int> | default = 100000]
[target_queue_length: <int> | default = 100000]

# Scale up capacity by this multiple
[scaleupfactor: <float64> | default = 1.3]
[scale_up_factor: <float64> | default = 1.3]

# Ignore throttling below this level (rate per second)
[minthrottling: <float64> | default = 1]
[ignore_throttle_below: <float64> | default = 1]

# Query to fetch ingester queue length
[queuelengthquery: <string> | default = "sum(avg_over_time(cortex_ingester_flush_queue_length{job="cortex/ingester"}[2m]))"]
[queue_length_query: <string> | default = "sum(avg_over_time(cortex_ingester_flush_queue_length{job="cortex/ingester"}[2m]))"]

# Query to fetch throttle rates per table
[throttlequery: <string> | default = "sum(rate(cortex_dynamo_throttled_total{operation="DynamoDB.BatchWriteItem"}[1m])) by (table) > 0"]
[write_throttle_query: <string> | default = "sum(rate(cortex_dynamo_throttled_total{operation="DynamoDB.BatchWriteItem"}[1m])) by (table) > 0"]

# Query to fetch write capacity usage per table
[usagequery: <string> | default = "sum(rate(cortex_dynamo_consumed_capacity_total{operation="DynamoDB.BatchWriteItem"}[15m])) by (table) > 0"]
[write_usage_query: <string> | default = "sum(rate(cortex_dynamo_consumed_capacity_total{operation="DynamoDB.BatchWriteItem"}[15m])) by (table) > 0"]

# Query to fetch read capacity usage per table
[readusagequery: <string> | default = "sum(rate(cortex_dynamo_consumed_capacity_total{operation="DynamoDB.QueryPages"}[1h])) by (table) > 0"]
[read_usage_query: <string> | default = "sum(rate(cortex_dynamo_consumed_capacity_total{operation="DynamoDB.QueryPages"}[1h])) by (table) > 0"]

# Query to fetch read errors per table
[readerrorquery: <string> | default = "sum(increase(cortex_dynamo_failures_total{operation="DynamoDB.QueryPages",error="ProvisionedThroughputExceededException"}[1m])) by (table) > 0"]
[read_error_query: <string> | default = "sum(increase(cortex_dynamo_failures_total{operation="DynamoDB.QueryPages",error="ProvisionedThroughputExceededException"}[1m])) by (table) > 0"]

# Number of chunks to group together to parallelise fetches (0 to disable)
[chunkgangsize: <int> | default = 10]
[chunk_gang_size: <int> | default = 10]

# Max number of chunk get operations to start in parallel.
[chunkgetmaxparallelism: <int> | default = 32]
[chunk_get_max_parallelism: <int> | default = 32]

# Configures storing chunks in Bigtable. Required fields only required
# when bigtable is defined in config.
@@ -560,7 +553,7 @@ filesystem:

# Cache validity for active index entries. Should be no higher than
# the chunk_idle_period in the ingester settings.
[indexcachevalidity: <duration> | default = 5m]
[index_cache_validity: <duration> | default = 5m]

# The maximum number of chunks to fetch per batch.
[max_chunk_batch_size: <int> | default = 50]
@@ -900,7 +893,7 @@ and how to provision tables when DynamoDB is used as the backing store.
[retention_period: <duration> | default = 0s]

# Period with which the table manager will poll for tables.
[dynamodb_poll_interval: <duration> | default = 2m]
[poll_interval: <duration> | default = 2m]

# How long before a table is needed that it will be created.
[creation_grace_period: <duration> | default = 10m]
@@ -919,7 +912,7 @@ The `provision_config` block configures provisioning capacity for DynamoDB.
```yaml
# Enables on-demand throughput provisioning for the storage
# provider, if supported. Applies only to tables which are not autoscaled.
[provisioned_throughput_on_demand_mode: <boolean> | default = false]
[enable_ondemand_throughput_mode: <boolean> | default = false]

# DynamoDB table default write throughput.
[provisioned_write_throughput: <int> | default = 3000]
@@ -929,7 +922,7 @@ The `provision_config` block configures provisioning capacity for DynamoDB.

# Enables on-demand throughput provisioning for the storage provider,
# if supported. Applies only to tables which are not autoscaled.
[inactive_throughput_on_demand_mode: <boolean> | default = false]
[enable_inactive_throughput_on_demand_mode: <boolean> | default = false]

# DynamoDB table write throughput for inactive tables.
[inactive_write_throughput: <int> | default = 1]
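
For orientation, here is a minimal `storage_config` sketch using the renamed DynamoDB keys from the `aws` hunk above. The S3/DynamoDB URLs and the Prometheus address are placeholders, and the nesting follows the indentation shown in the diff; treat it as an illustrative sketch, not a complete configuration.

```yaml
storage_config:
  aws:
    s3: s3://access_key:secret_access_key@region/bucket_name
    dynamodb:
      # Renamed: was `dynamodb` under the old `dynamodbconfig` block.
      dynamodb_url: dynamodb://access_key:secret_access_key@region
      api_limit: 2.0
      throttle_limit: 10.0
      metrics:
        # Placeholder Prometheus URL for metrics-based autoscaling.
        url: http://prometheus.monitoring.svc:9090
        target_queue_length: 100000
        scale_up_factor: 1.3
```
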
8 changes: 4 additions & 4 deletions docs/configuration/examples.md
@@ -135,8 +135,8 @@ schema_config:
storage_config:
aws:
s3: s3://access_key:secret_access_key@region/bucket_name
dynamodbconfig:
dynamodb: dynamodb://access_key:secret_access_key@region
dynamodb:
dynamodb_url: dynamodb://access_key:secret_access_key@region
```

If you don't wish to hard-code S3 credentials, you can also configure an EC2
@@ -146,8 +146,8 @@ instance role by changing the `storage_config` section:
storage_config:
aws:
s3: s3://region/bucket_name
dynamodbconfig:
dynamodb: dynamodb://region
dynamodb:
dynamodb_url: dynamodb://region
```

### S3-compatible APIs
4 changes: 2 additions & 2 deletions docs/configuration/query-frontend.md
@@ -55,7 +55,7 @@ data:

frontend:
log_queries_longer_than: 5s
downstream: querier.<namespace>.svc.cluster.local:3100
downstream_url: querier.<namespace>.svc.cluster.local:3100
compress_responses: true
```

@@ -140,5 +140,5 @@ Once you've deployed these, you'll need your grafana datasource to point to the

the query frontend operates in one of two fashions:

1) with `--frontend.downstream-url` or its yaml equivalent `frontend.downstream`. This simply proxies requests over http to said url.
1) with `--frontend.downstream-url` or its yaml equivalent `frontend.downstream_url`. This simply proxies requests over http to said url.
2) without (1) it defaults to a pull service. In this form, the frontend instantiates per-tenant queues that downstream queriers pull queries from via grpc. When operating in this mode, queriers need to specify `-querier.frontend-address` or its yaml equivalent `frontend_worker.frontend_address`.
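
A hedged sketch of the two modes, using only keys that appear in this file (the addresses are placeholders; in mode 2 the `frontend_worker` block is set on the queriers, not the frontend):

```yaml
# Mode 1: the frontend proxies queries over HTTP to a single downstream querier.
frontend:
  log_queries_longer_than: 5s
  downstream_url: querier.<namespace>.svc.cluster.local:3100
  compress_responses: true

# Mode 2: pull-based; each querier registers with the frontend over gRPC.
frontend_worker:
  frontend_address: query-frontend.<namespace>.svc.cluster.local:9095
```
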
2 changes: 1 addition & 1 deletion docs/operations/storage/table-manager.md
@@ -155,7 +155,7 @@ read/write capacity units and autoscaling.

| DynamoDB | Active table | Inactive table |
| ------------------- | --------------------------------------- | ------------------------------------ |
| Capacity mode | `provisioned_throughput_on_demand_mode` | `inactive_throughput_on_demand_mode` |
| Capacity mode | `enable_ondemand_throughput_mode` | `enable_inactive_throughput_on_demand_mode` |
| Read capacity unit | `provisioned_read_throughput` | `inactive_read_throughput` |
| Write capacity unit | `provisioned_write_throughput` | `inactive_write_throughput` |
| Autoscaling | Enabled (if configured) | Always disabled |
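
To make the mapping in the table above concrete, here is a hedged sketch of a `provision_config` block. The throughput numbers are illustrative, and where the block nests inside the table-manager configuration follows the `provision_config` section of the configuration reference.

```yaml
# Active table capacity (left column of the table above).
enable_ondemand_throughput_mode: false
provisioned_write_throughput: 3000
provisioned_read_throughput: 300          # illustrative value
# Inactive table capacity (right column).
enable_inactive_throughput_on_demand_mode: false
inactive_write_throughput: 1
inactive_read_throughput: 100             # illustrative value
```
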
10 changes: 10 additions & 0 deletions docs/operations/upgrade.md
@@ -6,6 +6,16 @@ Unfortunately Loki is software and software is hard and sometimes things are not

On this page we will document any upgrade issues/gotchas/considerations we are aware of.

## 1.5.0
> Review comment (Member): I'm torn - Cortex included many breaking changes in its major (1.0) release. Should we increment to 2.0 here?

Loki 1.5.0 vendors Cortex v1.0.0 (congratulations!), which has a [massive list of changes](https://cortexmetrics.io/docs/changelog/#1-0-0-2020-04-02).

While changes to the command line flags affect Loki as well, we usually recommend using the configuration file instead.

Cortex has done a lot of cleanup in its configuration file, and you are strongly urged to take a look at the [annotated diff for the config file](https://cortexmetrics.io/docs/changelog/#config-file-breaking-changes) before upgrading to Loki 1.5.0.

The following fields were removed from the YAML configuration completely: `claim_on_rollout` (always true), `normalise_tokens` (always true).

## 1.4.0

Loki 1.4.0 vendors Cortex v0.7.0-rc.0 which contains [several breaking config changes](https://github.com/cortexproject/cortex/blob/v0.7.0-rc.0/CHANGELOG.md).
7 changes: 4 additions & 3 deletions go.mod
@@ -10,7 +10,7 @@ require (
github.com/containerd/containerd v1.3.2 // indirect
github.com/containerd/fifo v0.0.0-20190226154929-a9fb20d87448 // indirect
github.com/coreos/go-systemd v0.0.0-20190321100706-95778dfbb74e
github.com/cortexproject/cortex v0.7.1-0.20200316184320-acc42abdf56c
github.com/cortexproject/cortex v1.0.0
github.com/davecgh/go-spew v1.1.1
github.com/docker/distribution v2.7.1+incompatible // indirect
github.com/docker/docker v0.7.3-0.20190817195342-4760db040282
@@ -39,20 +39,21 @@
github.com/mwitkow/go-conntrack v0.0.0-20190716064945-2f068394615f
github.com/opentracing/opentracing-go v1.1.1-0.20200124165624-2876d2018785
github.com/pierrec/lz4 v2.3.1-0.20191115212037-9085dacd1e1e+incompatible
github.com/pkg/errors v0.8.1
github.com/pkg/errors v0.9.1
github.com/prometheus/client_golang v1.5.0
github.com/prometheus/client_model v0.2.0
github.com/prometheus/common v0.9.1
github.com/prometheus/prometheus v1.8.2-0.20200213233353-b90be6f32a33
github.com/shurcooL/httpfs v0.0.0-20190707220628-8d4bc4ba7749
github.com/shurcooL/vfsgen v0.0.0-20181202132449-6a9ea43bcacd
github.com/stretchr/testify v1.5.1
github.com/thanos-io/thanos v0.11.0 // indirect
github.com/tonistiigi/fifo v0.0.0-20190226154929-a9fb20d87448
github.com/uber/jaeger-client-go v2.20.1+incompatible
github.com/ugorji/go v1.1.7 // indirect
github.com/weaveworks/common v0.0.0-20200310113808-2708ba4e60a4
go.etcd.io/etcd v0.0.0-20190815204525-8f85f0dc2607 // indirect
golang.org/x/net v0.0.0-20191112182307-2180aed22343
golang.org/x/net v0.0.0-20200226121028-0de0cce0169b
google.golang.org/grpc v1.25.1
gopkg.in/alecthomas/kingpin.v2 v2.2.6
gopkg.in/fsnotify.v1 v1.4.7