Pull release-0.23 into main #4738

Merged
merged 9 commits into main from release-0.23 on Oct 6, 2021
40 changes: 23 additions & 17 deletions CHANGELOG.md
@@ -24,34 +24,40 @@ We use *breaking :warning:* to mark changes that are not backward compatible (re

- [#4508](https://github.com/thanos-io/thanos/pull/4508) Adjust and rename `ThanosSidecarUnhealthy` to `ThanosSidecarNoConnectionToStartedPrometheus`; Remove `ThanosSidecarPrometheusDown` alert; Remove unused `thanos_sidecar_last_heartbeat_success_time_seconds` metrics.

## v0.23.0 - In Progress
## [v0.23.1](https://github.com/thanos-io/thanos/tree/release-0.23) - 2021.10.1

- [#4714](https://github.com/thanos-io/thanos/pull/4714) EndpointSet: Do not use the new, not yet implemented InfoAPI to obtain metadata (avoids an unnecessary HTTP roundtrip, instrumentation/alerts spam and logs).

## [v0.23.0](https://github.com/thanos-io/thanos/tree/release-0.23) - 2021.09.23

### Added

- [#4453](https://github.com/thanos-io/thanos/pull/4453) Tools: Add flag `--selector.relabel-config-file` / `--selector.relabel-config` / `--max-time` / `--min-time` to filter served blocks.
- [#4482](https://github.com/thanos-io/thanos/pull/4482) COS: Add http_config for cos object store client.
- [#4487](https://github.com/thanos-io/thanos/pull/4487) Query: Add memcached auto discovery support.
- [#4444](https://github.com/thanos-io/thanos/pull/4444) UI: Add search block UI.
- [#4509](https://github.com/thanos-io/thanos/pull/4509) Logging: Adds duration_ms in int64 to the logs.
- [#4462](https://github.com/thanos-io/thanos/pull/4462) UI: Add find overlap block UI.
- [#4469](https://github.com/thanos-io/thanos/pull/4469) Compact: Add flag `compact.skip-block-with-out-of-order-chunks` to skip blocks with out-of-order chunks during compaction instead of halting
- [#4506](https://github.com/thanos-io/thanos/pull/4506) `Baidu BOS` object storage, see [documents](docs/storage.md#baidu-bos) for further information.
- [#4552](https://github.com/thanos-io/thanos/pull/4552) Compact: Adds `thanos_compact_downsample_duration_seconds` histogram.
- [#4594](https://github.com/thanos-io/thanos/pull/4594) reloader: Expose metrics in config reloader to give info on the last operation.
- [#4623](https://github.com/thanos-io/thanos/pull/4623) query-frontend: made HTTP downstream tripper (client) configurable via parameters `--query-range.downstream-tripper-config` and `--query-range.downstream-tripper-config-file`. If your downstream URL is localhost or 127.0.0.1 then it is strongly recommended to bump `max_idle_conns_per_host` to at least 100 so that `query-frontend` could properly use HTTP keep-alive connections and thus reduce the latency of `query-frontend` by about 20%.
- [#4636](https://github.com/thanos-io/thanos/pull/4636) Azure: Support authentication using user-assigned managed identity
- [#4453](https://github.com/thanos-io/thanos/pull/4453) Tools `thanos bucket web`: Add flag `--selector.relabel-config-file` / `--selector.relabel-config` / `--max-time` / `--min-time` to filter served blocks.
- [#4482](https://github.com/thanos-io/thanos/pull/4482) Store: Add `http_config` option for COS object store client.
- [#4487](https://github.com/thanos-io/thanos/pull/4487) Query/Store: Add memcached auto discovery support for all caching clients.
- [#4444](https://github.com/thanos-io/thanos/pull/4444) UI: Add search to the Block UI.
- [#4509](https://github.com/thanos-io/thanos/pull/4509) Logging: Add `duration_ms` in int64 to the logs for easier log filtering.
- [#4462](https://github.com/thanos-io/thanos/pull/4462) UI: Highlighting blocks overlap in the Block UI.
- [#4469](https://github.com/thanos-io/thanos/pull/4469) Compact: Add flag `compact.skip-block-with-out-of-order-chunks` to skip blocks with out-of-order chunks during compaction instead of halting.
- [#4506](https://github.com/thanos-io/thanos/pull/4506) Store: Add `Baidu BOS` object storage, see [documents](docs/storage.md#baidu-bos) for further information.
- [#4552](https://github.com/thanos-io/thanos/pull/4552) Compact: Add `thanos_compact_downsample_duration_seconds` histogram metric.
- [#4594](https://github.com/thanos-io/thanos/pull/4594) Reloader: Expose metrics in config reloader to give info on the last operation.
- [#4619](https://github.com/thanos-io/thanos/pull/4619) Tracing: Add consistent tags to the Series call from Querier covering a number of important series statistics: `processed.series`, `processed.samples`, `processed.samples` and `processed.bytes`. This gives admins an idea of how much data each component processes per query.
- [#4623](https://github.com/thanos-io/thanos/pull/4623) Query-frontend: Make HTTP downstream tripper (client) configurable via parameters `--query-range.downstream-tripper-config` and `--query-range.downstream-tripper-config-file`. If your downstream URL is localhost or 127.0.0.1 then it is strongly recommended to bump `max_idle_conns_per_host` to at least 100 so that `query-frontend` could properly use HTTP keep-alive connections and thus reduce the latency of `query-frontend` by about 20%.

### Fixed

- [#4468](https://github.com/thanos-io/thanos/pull/4468) Rule: Fix temporary rule filename composition issue.
- [#4476](https://github.com/thanos-io/thanos/pull/4476) UI: fix incorrect html escape sequence used for '>' symbol.
- [#4532](https://github.com/thanos-io/thanos/pull/4532) Mixin: Fixed "all jobs" selector in thanos mixin dashboards.
- [#4607](https://github.com/thanos-io/thanos/pull/4607) Azure: Fix Azure MSI Rate Limit
- [#4476](https://github.com/thanos-io/thanos/pull/4476) UI: Fix incorrect html escape sequence used for '>' symbol.
- [#4532](https://github.com/thanos-io/thanos/pull/4532) Mixin: Fix "all jobs" selector in thanos mixin dashboards.
- [#4607](https://github.com/thanos-io/thanos/pull/4607) Azure: Fix Azure MSI Rate Limit.

### Changed

- [#4519](https://github.com/thanos-io/thanos/pull/4519) Query: switch to miekgdns DNS resolver as the default one.
- [#4519](https://github.com/thanos-io/thanos/pull/4519) Query: Switch to miekgdns DNS resolver as the default one.
- [#4586](https://github.com/thanos-io/thanos/pull/4586) Update Prometheus/Cortex dependencies and implement LabelNames() pushdown as a result; provides massive speed-up for the labels API in Thanos Query.
- [#4421](https://github.com/thanos-io/thanos/pull/4421) *breaking :warning:*: `--store` (in the future, to be renamed to `--endpoints`) now supports passing any of the Thanos gRPC APIs: StoreAPI, MetadataAPI, RulesAPI, TargetsAPI and ExemplarsAPI (as opposed to the past, when you had to put them in hidden `--targets`, `--rules` etc. flags). `--store` now automatically detects which APIs a server exposes.
- [#4669](https://github.com/thanos-io/thanos/pull/4669) Moved Prometheus dependency to v2.30.

## [v0.22.0](https://github.com/thanos-io/thanos/tree/release-0.22) - 2021.07.22

2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
0.24.0-dev
0.23.1
bwplotka marked this conversation as resolved.
5 changes: 3 additions & 2 deletions docs/release-process.md
@@ -23,8 +23,9 @@ Release shepherd responsibilities:

| Release | Time of first RC | Shepherd (GitHub handle) |
|---------|----------------------|-----------------------------|
| v0.24.0 | (planned) 2021.09.28 | No one ATM |
| v0.23.0 | 2021.09.01 | `@bwplotka` |
| v0.25.0 | (planned) 2021.11.26 | No one ATM |
| v0.24.0 | (planned) 2021.10.14 | No one ATM |
Member Author

BTW looking for volunteers (probably we should move this forward too)

Member

🐧i volunteer

Member Author

<3

| v0.23.0 | 2021.09.02 | `@bwplotka` |
| v0.22.0 | 2021.07.06 | `@GiedriusS` |
| v0.21.0 | 2021.05.28 | `@metalmatze` and `@onprem` |
| v0.20.0 | 2021.04.23 | `@kakkoyun` |
39 changes: 29 additions & 10 deletions pkg/query/endpointset.go
@@ -31,7 +31,8 @@ import (
)

const (
unhealthyEndpointMessage = "removing endpoint because it's unhealthy or does not exist"
unhealthyEndpointMessage = "removing endpoint because it's unhealthy or does not exist"
noMetadataEndpointMessage = "cannot obtain metadata: neither info nor store client found"

// Default minimum and maximum time values used by Prometheus when they are not passed as query parameter.
MinTime = -9223309901257974
@@ -76,17 +77,27 @@ func (es *grpcEndpointSpec) Addr() string {
// Metadata method for gRPC endpoint tries to call InfoAPI exposed by Thanos components until context timeout. If we are unable to get metadata after
// that time, we assume that the host is unhealthy and return error.
func (es *grpcEndpointSpec) Metadata(ctx context.Context, client *endpointClients) (*endpointMetadata, error) {
resp, err := client.info.Info(ctx, &infopb.InfoRequest{}, grpc.WaitForReady(true))
if err != nil {
// Call Info method of StoreAPI, this way querier will be able to discover old components not exposing InfoAPI.
metadata, merr := es.getMetadataUsingStoreAPI(ctx, client.store)
if merr != nil {
return nil, errors.Wrapf(merr, "fallback fetching info from %s after err: %v", es.addr, err)
// TODO(@matej-g): Info client should not be used due to https://github.com/thanos-io/thanos/issues/4699
// Uncomment this after it is implemented in https://github.com/thanos-io/thanos/pull/4282.
// if client.info != nil {
// resp, err := client.info.Info(ctx, &infopb.InfoRequest{}, grpc.WaitForReady(true))
// if err != nil {
// return nil, errors.Wrapf(err, "fetching info from %s", es.addr)
// }

// return &endpointMetadata{resp}, nil
// }

// Call Info method of StoreAPI, this way querier will be able to discover old components not exposing InfoAPI.
if client.store != nil {
metadata, err := es.getMetadataUsingStoreAPI(ctx, client.store)
if err != nil {
return nil, errors.Wrapf(err, "fallback fetching info from %s", es.addr)
}
return metadata, nil
}

return &endpointMetadata{resp}, nil
return nil, errors.New(noMetadataEndpointMessage)
}

func (es *grpcEndpointSpec) getMetadataUsingStoreAPI(ctx context.Context, client storepb.StoreClient) (*endpointMetadata, error) {
@@ -493,7 +504,9 @@ func (e *EndpointSet) getActiveEndpoints(ctx context.Context, endpoints map[stri
logger: e.logger,
StoreClient: storepb.NewStoreClient(conn),
clients: &endpointClients{
info: infopb.NewInfoClient(conn),
// TODO(@matej-g): Info client should not be used due to https://github.com/thanos-io/thanos/issues/4699
// Uncomment this after it is implemented in https://github.com/thanos-io/thanos/pull/4282.
// info: infopb.NewInfoClient(conn),
store: storepb.NewStoreClient(conn),
},
}
@@ -667,6 +680,10 @@ func (er *endpointRef) ComponentType() component.Component {
er.mtx.RLock()
defer er.mtx.RUnlock()

if er.metadata == nil {
return component.UnknownStoreAPI
}

return component.FromString(er.metadata.ComponentType)
}
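
The nil guard added to `ComponentType` protects readers that run before any metadata has been fetched. A minimal sketch of the same pattern follows, assuming a plain string in place of `component.Component`; `unknownComponent` and `setMetadata` are hypothetical names introduced only for this example.

```go
package main

import (
	"fmt"
	"sync"
)

// unknownComponent is a hypothetical placeholder for component.UnknownStoreAPI.
const unknownComponent = "unknown"

type endpointMetadata struct {
	ComponentType string
}

type endpointRef struct {
	mtx      sync.RWMutex
	metadata *endpointMetadata
}

// ComponentType returns a safe default until metadata has been populated,
// instead of dereferencing a nil pointer.
func (er *endpointRef) ComponentType() string {
	er.mtx.RLock()
	defer er.mtx.RUnlock()

	if er.metadata == nil {
		return unknownComponent
	}
	return er.metadata.ComponentType
}

// setMetadata is a hypothetical writer that takes the exclusive lock.
func (er *endpointRef) setMetadata(m *endpointMetadata) {
	er.mtx.Lock()
	defer er.mtx.Unlock()
	er.metadata = m
}

func main() {
	er := &endpointRef{}
	fmt.Println(er.ComponentType()) // Metadata not yet set.

	er.setMetadata(&endpointMetadata{ComponentType: "sidecar"})
	fmt.Println(er.ComponentType())
}
```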

@@ -785,13 +802,15 @@ func (er *endpointRef) apisPresent() []string {
return apisPresent
}

// TODO(@matej-g): Info client should not be used due to https://github.com/thanos-io/thanos/issues/4699
// Uncomment the nolint directive after https://github.com/thanos-io/thanos/pull/4282.
type endpointClients struct {
store storepb.StoreClient
rule rulespb.RulesClient
metricMetadata metadatapb.MetadataClient
exemplar exemplarspb.ExemplarsClient
target targetspb.TargetsClient
info infopb.InfoClient
info infopb.InfoClient //nolint:structcheck,unused
}

type endpointMetadata struct {
70 changes: 66 additions & 4 deletions pkg/query/endpointset_test.go
@@ -19,6 +19,7 @@ import (
"github.com/thanos-io/thanos/pkg/info/infopb"
"github.com/thanos-io/thanos/pkg/store"
"github.com/thanos-io/thanos/pkg/store/labelpb"
"github.com/thanos-io/thanos/pkg/store/storepb"
"github.com/thanos-io/thanos/pkg/testutil"
)

@@ -58,7 +59,11 @@ var (
}
ruleInfo = &infopb.InfoResponse{
ComponentType: component.Rule.String(),
Rules: &infopb.RulesInfo{},
Store: &infopb.StoreInfo{
MinTime: math.MinInt64,
MaxTime: math.MaxInt64,
},
Rules: &infopb.RulesInfo{},
}
storeGWInfo = &infopb.InfoResponse{
ComponentType: component.Store.String(),
@@ -93,6 +98,28 @@ func (c *mockedEndpoint) Info(ctx context.Context, r *infopb.InfoRequest) (*info
return &c.info, nil
}

type mockedStoreSrv struct {
	infoDelay time.Duration
	info      storepb.InfoResponse
}

func (s *mockedStoreSrv) Info(context.Context, *storepb.InfoRequest) (*storepb.InfoResponse, error) {
	if s.infoDelay > 0 {
		time.Sleep(s.infoDelay)
	}

	return &s.info, nil
}
func (s *mockedStoreSrv) Series(*storepb.SeriesRequest, storepb.Store_SeriesServer) error {
	return nil
}
func (s *mockedStoreSrv) LabelNames(context.Context, *storepb.LabelNamesRequest) (*storepb.LabelNamesResponse, error) {
	return nil, nil
}
func (s *mockedStoreSrv) LabelValues(context.Context, *storepb.LabelValuesRequest) (*storepb.LabelValuesResponse, error) {
	return nil, nil
}

type APIs struct {
store bool
metricMetadata bool
@@ -113,6 +140,25 @@ type testEndpoints struct {
exposedAPIs map[string]*APIs
}

func componentTypeToStoreType(componentType string) storepb.StoreType {
	switch componentType {
	case component.Query.String():
		return storepb.StoreType_QUERY
	case component.Rule.String():
		return storepb.StoreType_RULE
	case component.Sidecar.String():
		return storepb.StoreType_SIDECAR
	case component.Store.String():
		return storepb.StoreType_STORE
	case component.Receive.String():
		return storepb.StoreType_RECEIVE
	case component.Debug.String():
		return storepb.StoreType_DEBUG
	default:
		return storepb.StoreType_STORE
	}
}

func startTestEndpoints(testEndpointMeta []testEndpointMeta) (*testEndpoints, error) {
e := &testEndpoints{
srvs: map[string]*grpc.Server{},
@@ -130,6 +176,19 @@ func startTestEndpoints(testEndpointMeta []testEndpointMeta) (*testEndpoints, er
srv := grpc.NewServer()
addr := listener.Addr().String()

storeSrv := &mockedStoreSrv{
info: storepb.InfoResponse{
LabelSets: meta.extlsetFn(listener.Addr().String()),
StoreType: componentTypeToStoreType(meta.ComponentType),
},
infoDelay: meta.infoDelay,
}

if meta.Store != nil {
storeSrv.info.MinTime = meta.Store.MinTime
storeSrv.info.MaxTime = meta.Store.MaxTime
}

endpointSrv := &mockedEndpoint{
info: infopb.InfoResponse{
LabelSets: meta.extlsetFn(listener.Addr().String()),
@@ -143,6 +202,7 @@ func startTestEndpoints(testEndpointMeta []testEndpointMeta) (*testEndpoints, er
infoDelay: meta.infoDelay,
}
infopb.RegisterInfoServer(srv, endpointSrv)
storepb.RegisterStoreServer(srv, storeSrv)
go func() {
_ = srv.Serve(listener)
}()
@@ -859,7 +919,7 @@ func TestEndpointSet_APIs_Discovery(t *testing.T) {
}
return endpointSpec
},
expectedStores: 4, // sidecar + querier + receiver + storeGW
expectedStores: 5, // sidecar + querier + receiver + storeGW + ruler
expectedRules: 3, // sidecar + querier + ruler
expectedTarget: 2, // sidecar + querier
expectedMetricMetadata: 2, // sidecar + querier
@@ -895,7 +955,7 @@ func TestEndpointSet_APIs_Discovery(t *testing.T) {
NewGRPCEndpointSpec(endpoints.orderAddrs[1], false),
}
},
expectedStores: 1, // sidecar
expectedStores: 2, // sidecar + ruler
expectedRules: 2, // sidecar + ruler
expectedTarget: 1, // sidecar
expectedMetricMetadata: 1, // sidecar
@@ -908,7 +968,8 @@ func TestEndpointSet_APIs_Discovery(t *testing.T) {
NewGRPCEndpointSpec(endpoints.orderAddrs[1], false),
}
},
expectedRules: 1, // ruler
expectedStores: 1, // ruler
expectedRules: 1, // ruler
},
},
},
@@ -1106,6 +1167,7 @@ func exposedAPIs(c string) *APIs {
}
case component.Rule.String():
return &APIs{
store: true,
rules: true,
}
case component.Store.String():
2 changes: 1 addition & 1 deletion tutorials/katacoda/thanos/1-globalview/courseBase.sh
@@ -1,4 +1,4 @@
#!/usr/bin/env bash

docker pull quay.io/prometheus/prometheus:v2.16.0
docker pull quay.io/thanos/thanos:v0.22.0
docker pull quay.io/thanos/thanos:v0.23.1
8 changes: 4 additions & 4 deletions tutorials/katacoda/thanos/1-globalview/step2.md
@@ -10,7 +10,7 @@ component and can be invoked in a single command.
Let's take a look at all the Thanos commands:

```
docker run --rm quay.io/thanos/thanos:v0.22.0 --help
docker run --rm quay.io/thanos/thanos:v0.23.1 --help
```{{execute}}

You should see multiple commands that solve different purposes.
@@ -53,7 +53,7 @@ docker run -d --net=host --rm \
-v $(pwd)/prometheus0_eu1.yml:/etc/prometheus/prometheus.yml \
--name prometheus-0-sidecar-eu1 \
-u root \
quay.io/thanos/thanos:v0.22.0 \
quay.io/thanos/thanos:v0.23.1 \
sidecar \
--http-address 0.0.0.0:19090 \
--grpc-address 0.0.0.0:19190 \
@@ -68,7 +68,7 @@ docker run -d --net=host --rm \
-v $(pwd)/prometheus0_us1.yml:/etc/prometheus/prometheus.yml \
--name prometheus-0-sidecar-us1 \
-u root \
quay.io/thanos/thanos:v0.22.0 \
quay.io/thanos/thanos:v0.23.1 \
sidecar \
--http-address 0.0.0.0:19091 \
--grpc-address 0.0.0.0:19191 \
@@ -81,7 +81,7 @@ docker run -d --net=host --rm \
-v $(pwd)/prometheus1_us1.yml:/etc/prometheus/prometheus.yml \
--name prometheus-1-sidecar-us1 \
-u root \
quay.io/thanos/thanos:v0.22.0 \
quay.io/thanos/thanos:v0.23.1 \
sidecar \
--http-address 0.0.0.0:19092 \
--grpc-address 0.0.0.0:19192 \
2 changes: 1 addition & 1 deletion tutorials/katacoda/thanos/1-globalview/step3.md
@@ -28,7 +28,7 @@ Click below snippet to start the Querier.
```
docker run -d --net=host --rm \
--name querier \
quay.io/thanos/thanos:v0.22.0 \
quay.io/thanos/thanos:v0.23.1 \
query \
--http-address 0.0.0.0:29090 \
--query.replica-label replica \
2 changes: 1 addition & 1 deletion tutorials/katacoda/thanos/2-lts/courseBase.sh
@@ -2,7 +2,7 @@

docker pull minio/minio:RELEASE.2019-01-31T00-31-19Z
docker pull quay.io/prometheus/prometheus:v2.20.0
docker pull quay.io/thanos/thanos:v0.22.0
docker pull quay.io/thanos/thanos:v0.23.1
docker pull quay.io/thanos/thanosbench:v0.2.0-rc.1

mkdir /root/editor
4 changes: 2 additions & 2 deletions tutorials/katacoda/thanos/2-lts/step1.md
@@ -117,7 +117,7 @@ Similar to previous course, let's setup global view querying with sidecar:
docker run -d --net=host --rm \
--name prometheus-0-eu1-sidecar \
-u root \
quay.io/thanos/thanos:v0.22.0 \
quay.io/thanos/thanos:v0.23.1 \
sidecar \
--http-address 0.0.0.0:19090 \
--grpc-address 0.0.0.0:19190 \
@@ -130,7 +130,7 @@ so we will make sure we point the Querier to the gRPC endpoints of the sidecar:
```
docker run -d --net=host --rm \
--name querier \
quay.io/thanos/thanos:v0.22.0 \
quay.io/thanos/thanos:v0.23.1 \
query \
--http-address 0.0.0.0:9091 \
--query.replica-label replica \
2 changes: 1 addition & 1 deletion tutorials/katacoda/thanos/2-lts/step2.md
@@ -79,7 +79,7 @@ docker run -d --net=host --rm \
-v /root/prom-eu1:/prometheus \
--name prometheus-0-eu1-sidecar \
-u root \
quay.io/thanos/thanos:v0.22.0 \
quay.io/thanos/thanos:v0.23.1 \
sidecar \
--tsdb.path /prometheus \
--objstore.config-file /etc/thanos/minio-bucket.yaml \