Implement Stateless Ruler Proposal #3761

Closed
bwplotka opened this issue Feb 1, 2021 · 13 comments
Comments

@bwplotka
Member

bwplotka commented Feb 1, 2021

Tracking ticket for implementing https://github.com/thanos-io/thanos/blob/master/docs/proposals/202005_scalable-rule-storage.md#proposal

Open question about this proposal: is the push required by the ruler really remote-write? Remote write is meant for batches, whereas here we write individual samples, no? That looks more like the Pushgateway API 🤔 cc @brancz

@bwplotka
Member Author

bwplotka commented Feb 1, 2021

Ongoing work: #3743

@metalmatze
Member

The way this is implemented in my current PR is that the samples are added to a list of series and they are only sent in a batch once commit is called. So it's not that every series is sent on its own.
Not sure if that's what you're asking for or how we want to have this done in the end.
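For illustration, the batching behaviour described in this comment could look roughly like the Go sketch below. It is not the code from the PR: `sample`, `batchAppender` and the `send` callback are hypothetical names, and `send` stands in for whatever remote-write client is actually used.

```go
package main

// sample is a hypothetical stand-in for one evaluated rule result.
type sample struct {
	series    string // series identity, e.g. the flattened label set
	timestamp int64  // milliseconds since epoch
	value     float64
}

// batchAppender buffers samples and hands them to send in one call on Commit.
type batchAppender struct {
	pending []sample
	send    func([]sample) error // hypothetical transport, e.g. a remote-write client
}

// Append only buffers the sample; nothing is sent yet.
func (a *batchAppender) Append(series string, ts int64, v float64) {
	a.pending = append(a.pending, sample{series: series, timestamp: ts, value: v})
}

// Commit sends everything buffered so far as a single batch.
// The buffer is cleared only if the send succeeds.
func (a *batchAppender) Commit() error {
	if len(a.pending) == 0 {
		return nil
	}
	if err := a.send(a.pending); err != nil {
		return err
	}
	a.pending = nil
	return nil
}
```

With this shape, three `Append` calls followed by one `Commit` result in a single call to `send` carrying three samples, rather than one request per series.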

@brancz
Member

brancz commented Feb 15, 2021

I don't see how this is more like the pushgateway API. I agree this should be remote-write, as @metalmatze reiterated. (not necessarily a good argument but Cortex does the same)

@jaybatra26

Hi! Can I take this up as part of the LFX programme?

@bwplotka
Member Author

bwplotka commented Mar 9, 2021

Anyone can help! (: For the LFX program we have @idoqo this spring. Still, there is a lot of work, so anyone can work on a piece.

@jaybatra26

Thanks, @bwplotka. I will start with the good first issues then.

@idoqo
Contributor

idoqo commented Apr 22, 2021

After 1:1 discussions, it seems like the implementation for this is sort of a choice between full and partial statelessness.

Full statelessness is based on Frederic’s comment (#3743 (comment)) and would have us write rule results to the remote storage as soon as the rules are evaluated. We could then record the last successful remote write and back-fill the failed ones (say, by the ruler asking “when was the last time I successfully wrote data?”). It also means no persistence (be it local storage or a WAL for buffering data). This makes sense because we don’t know exactly how much data rule evaluations will produce, which makes it harder to scale the WAL or local storage.

One issue with full statelessness, though, is that we might end up spamming the remote-write storage with requests, since we would be sending a request for each sample that the ruler produces.

The alternative would be partial statelessness, where we send the remote-write requests in batches. To batch, though, we need some kind of WAL that persists the data for a while and drops it once it has been successfully sent to the remote storage.

@brancz
Member

brancz commented Apr 23, 2021

Whether it's batches or not, what I meant by "record" is in fact a sort of persistence, but just a marker of where we left off, as opposed to the data itself. As I mentioned in my last comment on the issue linked, I don't think there will be one remote-write request per sample, but rather per rule evaluation.

@stale

stale bot commented Jun 22, 2021

Hello 👋 Looks like there was no activity on this issue for the last two months.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there will be no activity in the next two weeks, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use remind command if you wish to be reminded at some point in future.

@stale stale bot added the stale label Jun 22, 2021
@stale

stale bot commented Jul 8, 2021

Closing for now as promised, let us know if you need this to be reopened! 🤗

@stale stale bot closed this as completed Jul 8, 2021
@yeya24 yeya24 reopened this Jul 8, 2021
@stale stale bot removed the stale label Jul 8, 2021
@stale

stale bot commented Sep 6, 2021

Hello 👋 Looks like there was no activity on this issue for the last two months.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there will be no activity in the next two weeks, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use remind command if you wish to be reminded at some point in future.

@stale stale bot added the stale label Sep 6, 2021
@m-messiah
Contributor

Shall we continue with it? I think remote-write from Thanos Ruler could be a great thing.

@stale stale bot removed the stale label Sep 16, 2021
@GiedriusS
Member

Implemented in #4731 👍 thank you all for your work!
