Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alerting: Create new state history "fanout" backend that dispatches to multiple other backends at once #64774

Merged
merged 16 commits into from
Mar 17, 2023

Conversation

alexweav
Copy link
Contributor

What is this feature?

This PR adds a new state history backend called fanout, that composes multiple other backends.

It allows you to write to multiple state history backends at once, and select one of them for reads. In this PR, the backend selected for read traffic is called the primary, all others are called secondaries. Write traffic is dispatched to all configured backends.

Why do we need this feature?

This makes it much easier for operators to enable the new Loki-based state history on an active system. They can move to fanout, verify that Loki is operating correctly, then switch to loki.

Which issue(s) does this PR fix?:

Fixes #64363
Fixes #64364

Special notes for your reviewer:

@alexweav alexweav added this to the 9.5.0 milestone Mar 14, 2023
@alexweav alexweav requested a review from a team March 14, 2023 21:49
@alexweav alexweav requested a review from a team as a code owner March 14, 2023 21:49
@alexweav alexweav requested review from sakjur, papagian and suntala and removed request for a team March 14, 2023 21:49
@armandgrillet
Copy link
Contributor

Backport is product-approved.

pkg/services/ngalert/ngalert.go Outdated Show resolved Hide resolved
pkg/services/ngalert/state/historian/fanout.go Outdated Show resolved Hide resolved
pkg/setting/setting_unified_alerting.go Outdated Show resolved Hide resolved
@armandgrillet armandgrillet added the backport v9.4.x Mark PR for automatic backport to v9.4.x label Mar 14, 2023
@armandgrillet armandgrillet modified the milestones: 9.5.0, 9.4.6 Mar 14, 2023
@grafanabot
Copy link
Contributor

Hello @armandgrillet!
Backport pull requests need to be either:

  • Pull requests which address bugs,
  • Urgent fixes which need product approval, in order to get merged,
  • Docs changes.

Please, if the current pull request addresses a bug fix, label it with the type/bug label.
If it already has the product approval, please add the product-approved label. For docs changes, please add the type/docs label.
If none of the above applies, please consider removing the backport label and target the next major/minor release.
Thanks!

@alexweav alexweav added product-approved Pull requests that are approved by product/managers and are allowed to be backported and removed no-backport Skip backport of PR labels Mar 14, 2023
Copy link
Contributor

@yuri-tceretian yuri-tceretian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
I wonder if we should consider a situation when on write one of the inputs could behave slower than the other (this is also applicable to the single input case), and probably cancel the operation by a timeout. Currently, historian uses the same context as the rule evaluation routine but operates asynchronously, so requests are will be piling up.

BackendTypeLoki BackendType = "loki"
BackendTypeNoop BackendType = "noop"
BackendTypeSQL BackendType = "sql"
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought you would put this enum to the settings package and will validate the mode when it parses settings which will make Grafana crash faster, during settings parsing. But I am ok with this too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other consistency checks (are required keys for each backend provided? are urls valid? etc) also already happen in ngalert package. I added it there to keep all validation in one place.

I'm not opposed to moving the validation as a whole to config package, though! I'd just rather not split validation for now

@grafanabot grafanabot removed this from the 9.4.6 milestone Mar 16, 2023
@grafanabot
Copy link
Contributor

This pull request was removed from the 9.4.6 milestone because 9.4.6 is currently being released.

Copy link
Contributor

@gotjosh gotjosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, my comments are nits - I don't need to see this again.

pkg/services/ngalert/ngalert.go Outdated Show resolved Hide resolved
pkg/services/ngalert/state/historian/annotation.go Outdated Show resolved Hide resolved
pkg/services/ngalert/state/historian/fanout.go Outdated Show resolved Hide resolved
pkg/services/ngalert/state/historian/fanout.go Outdated Show resolved Hide resolved
pkg/services/ngalert/state/historian/fanout.go Outdated Show resolved Hide resolved
pkg/services/ngalert/state/historian/fanout.go Outdated Show resolved Hide resolved
@alexweav alexweav added this to the 9.4.8 milestone Mar 17, 2023
@alexweav alexweav merged commit a31672f into main Mar 17, 2023
@alexweav alexweav deleted the alexweav/fanout-backend branch March 17, 2023 17:41
grafanabot pushed a commit that referenced this pull request Mar 17, 2023
…o multiple other backends at once (#64774)

* Rename RecordStatesAsync to Record

* Rename QueryStates to Query

* Implement fanout writes

* Implement primary queries

* Simplify error joining

* Add test for query path

* Add tests for writes and error propagation

* Allow fanout backend to be configured

* Touch up log messages and config validation

* Consistent documentation for all backend structs

* Parse and normalize backend names more consistently against an enum

* Touch-ups to documentation

* Improve clarity around multi-record blocking

* Keep primary and secondaries more distinct

* Rename fanout backend to multiple backend

* Simplify config keys for multi backend mode

(cherry picked from commit a31672f)
alexweav added a commit that referenced this pull request Mar 17, 2023
…patches to multiple other backends at once (#64983)

Alerting: Create new state history "fanout" backend that dispatches to multiple other backends at once (#64774)

* Rename RecordStatesAsync to Record

* Rename QueryStates to Query

* Implement fanout writes

* Implement primary queries

* Simplify error joining

* Add test for query path

* Add tests for writes and error propagation

* Allow fanout backend to be configured

* Touch up log messages and config validation

* Consistent documentation for all backend structs

* Parse and normalize backend names more consistently against an enum

* Touch-ups to documentation

* Improve clarity around multi-record blocking

* Keep primary and secondaries more distinct

* Rename fanout backend to multiple backend

* Simplify config keys for multi backend mode

(cherry picked from commit a31672f)

Co-authored-by: Alexander Weaver <weaver.alex.d@gmail.com>
@zerok zerok modified the milestones: 9.4.x, 9.4.8, 9.5.0 Apr 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
add to changelog area/alerting Grafana Alerting area/backend backport v9.4.x Mark PR for automatic backport to v9.4.x kata:state-history product-approved Pull requests that are approved by product/managers and are allowed to be backported
Projects
Archived in project
6 participants