Draft release notes for 2.11 #6702

Merged
merged 6 commits into from
Nov 28, 2023

@leizor leizor commented Nov 22, 2023

What this PR does

This PR adds release notes for Mimir 2.11.0.

Which issue(s) this PR fixes or relates to

#6670

Checklist

  • Tests updated.
  • Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • about-versioning.md updated with experimental features.

@leizor leizor marked this pull request as ready for review November 22, 2023 00:23
@leizor leizor requested review from a team as code owners November 22, 2023 00:23
Comment on lines 77 to 78
- The CLI flag `querier.prefer-streaming-chunks-from-ingesters`.
- The CLI flag `querier.minimize-ingester-requests`.
Contributor

These flags are still considered experimental. Only their default values have changed in this release, so the features are now enabled by default.

The plan is to remove the flags entirely in a future release, so that these features are always enabled.

Contributor Author

Got it, I'll remove them from here.

- **Sampled logging of errors in the ingester.** A high-traffic Mimir cluster can occasionally become bogged down logging high volumes of repeated errors. You can now reduce the number of errors written to the logs by setting a sample rate via the `-ingester.error-sample-rate` CLI flag.
- **Add total request size instance limit for ingesters.** This limit protects the ingesters against requests that together may cause an OOM. Enable this feature by setting the `-ingester.instance-limits.max-inflight-push-requests-bytes` CLI flag.
- **Reduce the resolution of incoming native histogram samples** if the incoming sample has more buckets than allowed by `-validation.max-native-histogram-buckets`. This is enabled by default but can be turned off by setting the `-validation.reduce-native-histogram-over-max-buckets` CLI flag to `false`.
- **Include a `Retry-After` header in recoverable error responses from the distributor.** This can protect your Mimir cluster from clients that default to retrying very quickly. Enable this feature by setting the `-distributor.retry-after-header.enabled` CLI flag.
Contributor

Is it worth mentioning some examples of clients that respect this header?
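
As an illustration of what "respecting this header" means on the client side, here is a minimal Python sketch of a retry delay that prefers the server's `Retry-After` hint and otherwise falls back to exponential backoff. The names `parse_retry_after` and `next_delay`, and the `base` and `cap` values, are hypothetical and chosen for this example only; this is not how Mimir, Prometheus, or any particular client implements it.

```python
import email.utils
from datetime import datetime, timezone
from typing import Optional


def parse_retry_after(value: str, now: Optional[datetime] = None) -> Optional[float]:
    """Return the delay in seconds requested by a Retry-After header value.

    The header carries either a non-negative integer number of seconds or an
    HTTP date. Returns None when the value cannot be parsed.
    """
    value = value.strip()
    if value.isascii() and value.isdigit():
        return float(value)
    try:
        when = email.utils.parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        # Treat a date without an offset as UTC (an assumption of this sketch).
        when = when.replace(tzinfo=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    # A date in the past means the client may retry immediately.
    return max(0.0, (when - now).total_seconds())


def next_delay(retry_after: Optional[str], attempt: int,
               base: float = 0.5, cap: float = 30.0) -> float:
    """Pick the wait before the next retry (hypothetical helper).

    Prefer the server's hint when it is present and parseable; otherwise use
    capped exponential backoff based on the attempt number (0-based).
    """
    if retry_after is not None:
        hinted = parse_retry_after(retry_after)
        if hinted is not None:
            return hinted
    return min(cap, base * (2 ** attempt))
```

With these definitions, `next_delay("7", attempt=0)` returns `7.0`, while `next_delay(None, attempt=3)` falls back to `4.0`.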

In Grafana Mimir 2.11 the following behavior has changed:

- The distributor `Push()` endpoint will now return the following gRPC codes instead of HTTP status codes:
Contributor

As a user, I don't know what the distributor's `Push()` endpoint is. Is that where I send my HTTP requests?

Contributor Author

I think I was unintentionally a little vague because I wasn't 100% sure, but this is the distributor's gRPC endpoint, right?

Contributor

Hmm, I think it's just misleading here. The distributor doesn't have a documented gRPC API as far as I can see. This change is internal only: the endpoint is called from the HTTP wrapper, and these errors are translated to the same HTTP status codes as before.

I would just remove this section.

cc @duricanikolic for more context.

leizor and others added 2 commits November 22, 2023 10:10
Co-authored-by: Charles Korn <charleskorn@users.noreply.github.com>
## Features and enhancements

- **Sampled logging of errors in the ingester.** A high-traffic Mimir cluster can occasionally become bogged down logging high volumes of repeated errors. You can now reduce the number of errors written to the logs by setting a sample rate via the `-ingester.error-sample-rate` CLI flag.
- **Add total request size instance limit for ingesters.** This limit protects the ingesters against requests that together may cause an OOM. Enable this feature by setting the `-ingester.instance-limits.max-inflight-push-requests-bytes` CLI flag.
Member

To make this work efficiently (without reading the request into memory first), this needs to be used together with `-ingester.limit-inflight-requests-using-grpc-method-limiter`.

Contributor Author

Got it, thanks!
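
For readers following the thread, the idea behind a total in-flight request size limit can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Mimir's implementation: the class `InflightBytesLimiter` and its methods are hypothetical, and the sketch assumes the request size is known up front so that a request can be rejected before its body is read into memory, which is the efficiency point raised in the comment above.

```python
import threading


class InflightBytesLimiter:
    """Tracks the combined size of in-flight requests against a byte limit.

    Hypothetical helper for illustration only. A max_bytes of 0 disables the
    limit in this sketch.
    """

    def __init__(self, max_bytes: int) -> None:
        self._max_bytes = max_bytes
        self._inflight = 0
        self._lock = threading.Lock()

    @property
    def inflight(self) -> int:
        with self._lock:
            return self._inflight

    def try_acquire(self, size: int) -> bool:
        """Reserve `size` bytes; return False if the limit would be exceeded.

        Call this with the declared request size before reading the body, so
        an over-limit request is rejected without being buffered.
        """
        with self._lock:
            if self._max_bytes > 0 and self._inflight + size > self._max_bytes:
                return False
            self._inflight += size
            return True

    def release(self, size: int) -> None:
        """Return `size` bytes to the budget once the request has finished."""
        with self._lock:
            self._inflight -= size
```

A caller would pair each successful `try_acquire(size)` with a `release(size)` in a `finally` block, and respond with an overload error when `try_acquire` returns `False`.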

Comment on lines 57 to 60
- The distributor gRPC push endpoint will now return the following gRPC codes instead of HTTP status codes:
- 202 (accepted) code is replaced with 6 (`ALREADY_EXISTS`).
- 400 (bad request) code is replaced with 9 (`FAILED_PRECONDITION`).
- 429 (too many requests) and the non-standard 529 (service is overloaded) codes are replaced with 8 (`RESOURCE_EXHAUSTED`).
Member

Distributor gRPC push endpoints are not part of the public API. I don't think this needs to be mentioned in the release notes.

Contributor

@dimitarvdimitrov dimitarvdimitrov left a comment

thanks for writing this. I see you've gone through a lot of PRs to get this done 😅

- **Reduce the resolution of incoming native histogram samples** if the incoming sample has more buckets than allowed by `-validation.max-native-histogram-buckets`. This is enabled by default but can be turned off by setting the `-validation.reduce-native-histogram-over-max-buckets` CLI flag to `false`.
- **Include a `Retry-After` header in recoverable error responses from the distributor.** This can protect your Mimir cluster from clients including Prometheus that default to retrying very quickly. Enable this feature by setting the `-distributor.retry-after-header.enabled` CLI flag.
- **Improved query-scheduler performance under load.** This is particularly apparent for clusters with large numbers of queriers.
- **Ingester to querier chunks streaming** reduces the memory utilization of queriers and reduces the likelihood of OOMs.
Contributor

is this a new feature or was it just enabled by default now?

Contributor

It's enabled by default now.

Contributor

we can mention it if you want

leizor and others added 2 commits November 27, 2023 16:00
Co-authored-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Contributor

@dimitarvdimitrov dimitarvdimitrov left a comment

LGTM, thanks for writing this!

Contributor Author

leizor commented Nov 28, 2023

Thanks for all the input, everybody!

@leizor leizor merged commit 0da3ce0 into main Nov 28, 2023
28 checks passed
@leizor leizor deleted the leizor/2.11-release-notes branch November 28, 2023 19:36
leizor added a commit that referenced this pull request Dec 1, 2023
* Draft release notes for 2.11

* Apply suggestions from code review

Co-authored-by: Charles Korn <charleskorn@users.noreply.github.com>

* More code review responses

* More mode code review responses

* Update docs/sources/mimir/release-notes/v2.11.md

Co-authored-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* More more more code review responses

---------

Co-authored-by: Charles Korn <charleskorn@users.noreply.github.com>
Co-authored-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
5 participants