proposal: Add scalable rule storage proposal #2661

Merged
merged 1 commit into from Jun 25, 2020

brancz
Member

@brancz brancz commented May 26, 2020

  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

Add proposal for scalable rule storage.

When I started writing this, I expected it to be much more intricate. It still changes the architecture of the rule component significantly in the long term, so I think it's important to have it as a proposal first and, if accepted, as a record that helps people understand how the component changed over time.

@thanos-io/thanos-maintainers

@bwplotka
Member

fyi as for test failure: #2663

Contributor

@pracucci pracucci left a comment

From an architectural perspective, this makes sense to me. Cortex does the same (the ruler writes through the remote write path).

Member

@bwplotka bwplotka left a comment

Amazing! As discussed offline I love this idea.

Wonder what @gouthamve and @SuperQ would say (:

LGTM, just some comments.

A few large rules can create a significant amount of resulting time-series,
which should not be limited to what we can scale to with a single TSDB embedded
in the rule component.
Member

Suggested change
in the rule component.
in the rule component.
Additionally, scaling out the Ruler in terms of rule evaluations causes further fragmentation of TSDB blocks: we end up with as many TSDB "streams" as we have rules. While this is doable with vertical compaction, it might cause some operational complexity and unnecessary load on the system.

Member Author

Added some of this but rewrote it slightly. Could you double-check that it's what you were trying to say?

Member

LGTM! thanks

@brancz brancz force-pushed the scalable-ruler branch 2 times, most recently from 61e828d to efc8ef5 Compare June 24, 2020 13:10
@brancz
Member Author

brancz commented Jun 24, 2020

I think I addressed all comments :)

Member

@bwplotka bwplotka left a comment

LGTM, just last nits, plus approved status (:

Signed-off-by: Frederic Branczyk <fbranczyk@gmail.com>
@bwplotka bwplotka merged commit 7a35375 into thanos-io:master Jun 25, 2020
@brancz brancz deleted the scalable-ruler branch June 25, 2020 14:32
paulfantom added a commit to paulfantom/thanos that referenced this pull request Jul 9, 2020
* upstream/release-0.14: (46 commits)
  Cut release v0.14.0-rc.1 (thanos-io#2853)
  Query: correctly marshal errors to JSON and ignore if nil (thanos-io#2848)
  ci: Manually download promu in crossbuild stage (thanos-io#2828)
  Cut release v0.14.0-rc.0 (thanos-io#2826)
  Soft cut changelog on master to indicate v0.14.0 being in progress (thanos-io#2824)
  Update ThanosReceiveNoUpload to select sum == 0 (thanos-io#2819)
  receive: Added more observability, fixed leaktest, to actually check leaks ): (thanos-io#2817)
  Query: always return a string in the `lastError` field (thanos-io#2809)
  Added missing CHANGELOG entry for PR 2613 (thanos-io#2820)
  receive: Fixed small options race; Removed unused StartTime feature. (thanos-io#2816)
  go.mod: Bump Prometheus to current latest (thanos-io#2814)
  Implement CLI Flags page in React UI (thanos-io#2796)
  Improve ThanosReceiveNoUpload to only alert on current instances
  store: Preallocate output buffer when encoding postings. (thanos-io#2812)
  compact: introduce flag --block-viewer.global.sync-block-interval (thanos-io#2752)
  docs: compact: add blurb about how retention policy works (thanos-io#2808)
  Reduced memory allocations in readIndexRange() (thanos-io#2807)
  ui: Add Stores page to React UI (thanos-io#2754)
  Added Kemal to Maintainer Role; Kemal is volounteering to be next release shephard (thanos-io#2804)
  proposal: Add scalable rule storage proposal (thanos-io#2661)
  ...
4 participants