Upgrade Sloth to v0.16.0 with source_tenants support #1

Merged
evelynq3661 merged 331 commits into nb-main from nb-v0.16.0
Apr 22, 2026

@evelynq3661 evelynq3661 commented Apr 15, 2026

Summary

  • Upgrade Sloth from v0.11.0 to v0.16.0 (upstream tag)
  • Re-apply --source-tenants CLI flag to the new codebase (3 files)
  • Add new --rule-group-interval CLI flag to explicitly set interval: on rule groups
  • Enables built-in sum_over_time optimization for 30d recording rules

Why

PR slok#463 in sre-prometheus-rules fixed 3 SLO files that caused Mimir 429 errors at midnight UTC. An audit found 20 more files with the same raw 30d query issue. Sloth v0.16.0 has a native sum_over_time optimization (available since v0.13.0), so upgrading eliminates the manual post-generation patching step, which was easy to forget.

Changes

--source-tenants (ported from v0.11.0 fork)

Upstream Sloth has no source_tenants support, so the fork's change is re-applied to v0.16.0's refactored codebase:

  • cmd/sloth/commands/generate.go — add --source-tenants CLI flag, wire through to storage
  • internal/storage/io/std_prometheus.go — add SourceTenants to repo struct and YAML output struct
  • pkg/lib/gen.go — update WriteResultAsPrometheusStd signature
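
For reviewers unfamiliar with repeatable flags, here is a minimal sketch of how a flag like --source-tenants can collect several values. It uses Go's standard flag package only to stay self-contained; the CLI library and wiring in generate.go may differ, and tenantList and parseSourceTenants are hypothetical names, not identifiers from this PR.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// tenantList collects every occurrence of a repeated flag.
// It implements flag.Value. (Hypothetical type, for illustration only.)
type tenantList []string

// String renders the collected values; it must tolerate a nil receiver.
func (t *tenantList) String() string {
	if t == nil {
		return ""
	}
	return strings.Join(*t, ",")
}

// Set is called once per flag occurrence and appends the value.
func (t *tenantList) Set(v string) error {
	v = strings.TrimSpace(v)
	if v == "" {
		return fmt.Errorf("tenant name must not be empty")
	}
	*t = append(*t, v)
	return nil
}

// parseSourceTenants parses args and returns the tenants in the order given.
// (Hypothetical helper; not the signature used in the fork.)
func parseSourceTenants(args []string) ([]string, error) {
	fs := flag.NewFlagSet("generate", flag.ContinueOnError)
	var tenants tenantList
	fs.Var(&tenants, "source-tenants", "tenant to read metrics from (repeatable)")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	return tenants, nil
}

func main() {
	tenants, err := parseSourceTenants([]string{
		"--source-tenants", "platform",
		"--source-tenants", "anonymous",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(strings.Join(tenants, "|")) // prints "platform|anonymous"
}
```

The collected slice is what would then be handed to the storage layer so it can emit source_tenants: on each rule group.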

--rule-group-interval (new)

v0.16.0 no longer sets interval: on generated rule groups (v0.11.0 did). Without it, Mimir falls back to the global evaluation interval (~1m), which causes unnecessary query load.

  • cmd/sloth/commands/generate.go — add --rule-group-interval CLI flag
  • internal/storage/io/std_prometheus.go — add intervalOverride field, resolveInterval() helper
  • pkg/lib/gen.go — pass through to storage layer

Usage

sloth generate --disable-alerts \
    -i slos/platform/redis-api.yaml \
    -o mimir/recording/prod/slo-redis-api.yaml \
    --source-tenants platform --source-tenants anonymous \
    --rule-group-interval 5m

Test plan

  • Built Docker image registry.n.newsbreak.com/sloth:v0.16.0
  • Generated redis-api.yaml with both flags
  • Verified: source_tenants present, interval: 5m present, 30d uses sum_over_time
  • Verified: mimirtool rules prepare -i adds namespace and by (cluster) as expected
  • CI: unit tests, integration tests, K8s tests all pass
  • Canary deploy 1 SLO file to prod, monitor 24h
  • Batch rollout remaining 20 unfixed files
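
The manual verification of the generated file could be scripted for the batch rollout. The sketch below only checks that the three markers named in the test plan appear in the output text; missingMarkers is a hypothetical helper, the sample rule group in main is illustrative rather than real Sloth output, and a substring check is weaker than parsing the YAML.

```go
package main

import (
	"fmt"
	"strings"
)

// missingMarkers reports which expected markers are absent from a generated
// rule file. wantInterval is the value passed to --rule-group-interval.
// (Hypothetical helper; a plain substring check, not a YAML parser.)
func missingMarkers(generated, wantInterval string) []string {
	required := []string{
		"source_tenants:",
		"interval: " + wantInterval,
		"sum_over_time(",
	}
	var missing []string
	for _, marker := range required {
		if !strings.Contains(generated, marker) {
			missing = append(missing, marker)
		}
	}
	return missing
}

func main() {
	// Illustrative fragment only; names and expression are placeholders.
	sample := `groups:
- name: example-slo-group
  interval: 5m
  source_tenants: [platform, anonymous]
  rules:
  - record: example:ratio_rate30d
    expr: sum_over_time(example:ratio_rate5m[30d])
`
	fmt.Println(len(missingMarkers(sample, "5m"))) // prints "0"
}
```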

🤖 Generated with Claude Code

slok and others added 30 commits April 10, 2025 09:20
Add support for SLO plugins on default (non-k8s) sloth prometheus service level spec
Signed-off-by: Xabier Larrakoetxea <me@slok.dev>
Move SLO validation to an SLO plugin
Signed-off-by: Xabier Larrakoetxea <me@slok.dev>
Add the original SLO group source information to the plugin request
solves issues with go downloading

https://stackoverflow.com/q/78519711

```
go: downloading go1.24 (linux/amd64)
go: download go1.24 for linux/amd64: toolchain not available
```
fix: update go version go.mod
…overs both and change flags to improve usage

Signed-off-by: Xabier Larrakoetxea <me@slok.dev>
Aggregate SLO and SLI plugin file loader into a single repo that discovers both and change flags to improve usage
Signed-off-by: Xabier Larrakoetxea <me@slok.dev>
Signed-off-by: Xabier Larrakoetxea <me@slok.dev>
Bumps [sigs.k8s.io/structured-merge-diff/v4](https://github.com/kubernetes-sigs/structured-merge-diff) from 4.6.0 to 4.7.0.
- [Release notes](https://github.com/kubernetes-sigs/structured-merge-diff/releases)
- [Changelog](https://github.com/kubernetes-sigs/structured-merge-diff/blob/master/RELEASE.md)
- [Commits](kubernetes-sigs/structured-merge-diff@v4.6.0...v4.7.0)

---
updated-dependencies:
- dependency-name: sigs.k8s.io/structured-merge-diff/v4
  dependency-version: 4.7.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
…io/structured-merge-diff/v4-4.7.0

build(deps): bump sigs.k8s.io/structured-merge-diff/v4 from 4.6.0 to 4.7.0
Add support for SLO plugins on Kubernetes API
Signed-off-by: Xabier Larrakoetxea <me@slok.dev>
Fix naming on k8s plugin example
Signed-off-by: Xabier Larrakoetxea <me@slok.dev>
Add overridePrevious as a way of resetting the declared previous chain
…pkg/client

Bumps [github.com/prometheus-operator/prometheus-operator/pkg/client](https://github.com/prometheus-operator/prometheus-operator) from 0.81.0 to 0.82.0.
- [Release notes](https://github.com/prometheus-operator/prometheus-operator/releases)
- [Changelog](https://github.com/prometheus-operator/prometheus-operator/blob/main/CHANGELOG.md)
- [Commits](prometheus-operator/prometheus-operator@v0.81.0...v0.82.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus-operator/prometheus-operator/pkg/client
  dependency-version: 0.82.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
…m/prometheus-operator/prometheus-operator/pkg/client-0.82.0

build(deps): bump github.com/prometheus-operator/prometheus-operator/pkg/client from 0.81.0 to 0.82.0
Signed-off-by: Xabier Larrakoetxea <me@slok.dev>
Signed-off-by: Xabier Larrakoetxea <me@slok.dev>
Remove not optimized SLI rule generation flag
Signed-off-by: Xabier Larrakoetxea <me@slok.dev>
Pass default plugins as a chain
Signed-off-by: Xabier Larrakoetxea <me@slok.dev>
Support env vars on SLO plugins
Signed-off-by: Xabier Larrakoetxea <me@slok.dev>
slok and others added 28 commits December 22, 2025 17:09
…ns/upload-artifact-6

build(deps): bump actions/upload-artifact from 5 to 6
Signed-off-by: Xabier Larrakoetxea <me@slok.dev>
Signed-off-by: Xabier Larrakoetxea <me@slok.dev>
Signed-off-by: Xabier Larrakoetxea <me@slok.dev>
Signed-off-by: Xabier Larrakoetxea <me@slok.dev>
Add basic auth to prometheus client
Bumps [github.com/prometheus/common](https://github.com/prometheus/common) from 0.67.4 to 0.67.5.
- [Release notes](https://github.com/prometheus/common/releases)
- [Changelog](https://github.com/prometheus/common/blob/main/CHANGELOG.md)
- [Commits](prometheus/common@v0.67.4...v0.67.5)

---
updated-dependencies:
- dependency-name: github.com/prometheus/common
  dependency-version: 0.67.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
…m/prometheus/common-0.67.5

build(deps): bump github.com/prometheus/common from 0.67.4 to 0.67.5
Signed-off-by: Xabier Larrakoetxea <me@slok.dev>
Add AI agent instructions for development
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 6 to 7.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](actions/upload-artifact@v6...v7)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-version: '7'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps [azure/setup-helm](https://github.com/azure/setup-helm) from 4 to 5.
- [Release notes](https://github.com/azure/setup-helm/releases)
- [Changelog](https://github.com/Azure/setup-helm/blob/main/CHANGELOG.md)
- [Commits](Azure/setup-helm@v4...v5)

---
updated-dependencies:
- dependency-name: azure/setup-helm
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps [codecov/codecov-action](https://github.com/codecov/codecov-action) from 5 to 6.
- [Release notes](https://github.com/codecov/codecov-action/releases)
- [Changelog](https://github.com/codecov/codecov-action/blob/main/CHANGELOG.md)
- [Commits](codecov/codecov-action@v5...v6)

---
updated-dependencies:
- dependency-name: codecov/codecov-action
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Xabier Larrakoetxea <me@slok.dev>
…ns/upload-artifact-7

build(deps): bump actions/upload-artifact from 6 to 7
…ov/codecov-action-6

build(deps): bump codecov/codecov-action from 5 to 6
…/setup-helm-5

build(deps): bump azure/setup-helm from 4 to 5
Signed-off-by: Dmitriy S. Sinyavskiy <contact@r3code.ru>
Bumps [github.com/VictoriaMetrics/metricsql](https://github.com/VictoriaMetrics/metricsql) from 0.85.0 to 0.86.0.
- [Release notes](https://github.com/VictoriaMetrics/metricsql/releases)
- [Commits](VictoriaMetrics/metricsql@v0.85.0...v0.86.0)

---
updated-dependencies:
- dependency-name: github.com/VictoriaMetrics/metricsql
  dependency-version: 0.86.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
…m/VictoriaMetrics/metricsql-0.86.0

build(deps): bump github.com/VictoriaMetrics/metricsql from 0.85.0 to 0.86.0
feat: added slo duplicates check during validate
Signed-off-by: Xabier Larrakoetxea <me@slok.dev>
Prepare for sloth version v0.16.0
Re-apply the --source-tenants CLI flag from the v0.11.0 fork (commit 82bb326)
to the refactored v0.16.0 codebase. This writes source_tenants: into generated
YAML rule groups, required for Mimir multi-tenant rule evaluation.

Touch points:
- cmd/sloth/commands/generate.go: add CLI flag, pass through to writer
- internal/storage/io/std_prometheus.go: add field to repo + YAML struct
- pkg/lib/gen.go: update WriteResultAsPrometheusStd signature

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Resolve merge conflicts:
- cmd/sloth/commands/generate.go: keep v0.16.0 version (source_tenants already applied)
- internal/prometheus/storage.go: removed (replaced by internal/storage/io/std_prometheus.go)
- docker/prod/Dockerfile: keep upstream distroless final image

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sloth v0.16.0 no longer sets interval: on generated rule groups.
Add a --rule-group-interval CLI flag (e.g. '5m') that overrides the
interval on all output rule groups. If not set, falls back to
whatever Sloth computed (which is 0/omitted in v0.16.0).

Usage:
  sloth generate --rule-group-interval 5m --source-tenants platform ...

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@evelynq3661 evelynq3661 merged commit 2382d9b into nb-main Apr 22, 2026
24 checks passed