Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Documenting recording rules per-tenant WAL #4566

Merged
merged 8 commits into from
Nov 3, 2021

Conversation

dannykopping
Copy link
Contributor

@dannykopping dannykopping commented Oct 27, 2021

What this PR does / why we need it:
Adds documentation for recording rules improvements (including per-tenant WAL).

Which issue(s) this PR fixes:
N/A

Special notes for your reviewer:
I will be adding alerts and dashboards to production/loki-mixin in a separate this PR.

Checklist

  • Documentation added
  • Tests updated

Danny Kopping added 2 commits October 25, 2021 18:02
WIP
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
@dannykopping dannykopping marked this pull request as ready for review October 27, 2021 14:14
Copy link
Contributor

@sandeepsukhani sandeepsukhani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! I just left a comment but other than that it LGTM.

docs/sources/operations/recording-rules.md Outdated Show resolved Hide resolved
Danny Kopping added 2 commits October 28, 2021 15:20
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Copy link
Member

@owen-d owen-d left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All in all, looks good. I've got a few questions, but LGTM

wal:
# The directory in which to write tenant WAL files. Each tenant will have its own
# directory one level below this directory.
[dir: <string> | default = "wal"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I suspect we'll want to have the default ruler WAL dir not conflict with the default ingester WAL dir.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lol fuck, you suspect right 😆 how did I miss this? Thanks!

docs/sources/configuration/_index.md Outdated Show resolved Hide resolved
@@ -20,7 +20,7 @@ import (
// Default settings for the WAL cleaner.
const (
DefaultCleanupAge = 12 * time.Hour
DefaultCleanupPeriod = 30 * time.Minute
DefaultCleanupPeriod = 0 * time.Second // disabled by default
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remind me why we choose to disable this by default?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a feature we've tested extensively, and we inherited the feature from the code we pulled in from the agent.
The feature deletes WALs, so I want to be 100% sure it works as expected, plus it's an optimisation ~most people won't need.

Danny Kopping and others added 3 commits October 29, 2021 07:52
Co-authored-by: Owen Diehl <ow.diehl@gmail.com>
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
…opping/loki into dannykopping/recording-rules-docs
Copy link
Collaborator

@kavirajk kavirajk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great doc! 👍 LGTM!

left few nits

so a `Persistent Volume` should be utilised.

## Remote-Write

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Would be nice to small statement about what is remote-write here!. As a reader, I see Remote-write and suddenly changes to details like Per-tenant limits and Tuning etc.!

label, so be aware that cardinality could begin to be a concern if the number of tenants grows sufficiently large.

Some key metrics to note are:
- `loki_ruler_wal_appender_ready`: whether a WAL appender is ready to accept samples (1) or not (0)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good set of metrics 👍


It can be determined by subtracting
`loki_ruler_wal_prometheus_remote_storage_queue_highest_sent_timestamp_seconds` from
`loki_ruler_wal_prometheus_remote_storage_highest_timestamp_in_seconds`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants