Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alerting: Add endpoint to revert to a previous alertmanager configuration #65751

Merged
merged 12 commits into from
Apr 5, 2023

Conversation

JacobsonMT
Copy link
Member

@JacobsonMT JacobsonMT commented Mar 31, 2023

What is this feature?

This endpoint is meant to be used in conjunction with /api/alertmanager/grafana/config/history to revert to a previously applied alertmanager configuration. This is done by ID instead of raw config string in order to avoid secure field complications.

Why do we need this feature?

This endpoint will be used by the frontend to revert to a previous alertmanager configuration in case the current one is broken.

Which issue(s) does this PR fix?:

Partial #54505

Special notes for your reviewer:

image

Peek 2023-03-31 19-05

…tion

This endpoint is meant to be used in conjunction with /api/alertmanager/grafana/config/history to
revert to a previously applied alertmanager configuration. This is done by ID instead of raw config
string in order to avoid secure field complications.
@JacobsonMT JacobsonMT added this to the 9.5.0 milestone Mar 31, 2023
Copy link
Contributor

@yuri-tceretian yuri-tceretian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Copy link
Contributor

@santihernandezc santihernandezc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just tested the following flow:

  • Breaking the current AM configuration.
  • Trying to reset state from a broken historic configuration (it fails, expected).
  • Resetting state using a valid historic configuration (success).
    Everything works as expected!

I added a few comments about typos, tests, naming, and the API spec.

@@ -408,6 +414,16 @@ func (api *API) RegisterAlertmanagerApiEndpoints(srv AlertmanagerApi, m *metrics
m,
),
)
group.Post(
toMacaronPath("/api/alertmanager/grafana/config/history/{id}/_activate"),
api.authorize(http.MethodPost, "/api/alertmanager/grafana/config/history/{id}/_activate"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have other endpoints prefixed by underscores? I assume it's because "activate" refers to an action rather than a resource, but what's the motivation behind this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume it's because "activate" refers to an action rather than a resource

Correct. @yuri-tceretian and I briefly discussed what format the endpoint should be in and came to this compromise. It has the benefits of mirroring the related endpoint /api/alertmanager/grafana/config/history and referencing the historical config by resource id. The drawback being the, somewhat strange, underscore action.

The other options we discussed were:

  • POST /api/alertmanager/grafana/config/api/v1/alerts with {id:<int>}. The benefits of this being no extra endpoint needed. The drawbacks are that it conflates this new use-case with an existing core endpoint.
  • POST /api/alertmanager/grafana/config/1 with body { active : true } or something similar. This doesn't have _activate but poses the question of what does this endpoint do if active=false.

I'm open to discussing alternatives if there's a compelling reason.

// Responses:
// 202: Ack
// 400: ValidationError
// 404: NotFound
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're missing status codes 409 and 500.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

409 would be from ErrAlertmanagerNotReady which actually shouldn't ever be thrown by this endpoint. I know I added a specific condition in the API to catch this, but it's mostly defensive programming and to mirror the errors of RoutePostAlertingConfig (although I can't see where that endpoint can throw this error either)

500 is a catch-all "unknown" error, it doesn't feel right to document it in the api specs to me. The other endpoints (mostly) leave it out as well.

pkg/services/ngalert/store/alertmanager_test.go Outdated Show resolved Hide resolved
@grafanabot grafanabot removed this from the 9.5.0 milestone Apr 4, 2023
@grafanabot
Copy link
Contributor

This pull request was removed from the 9.5.0 milestone because 9.5.0 is currently being released.

pkg/services/ngalert/store/alertmanager.go Outdated Show resolved Hide resolved
@@ -166,6 +166,8 @@ func (api *API) authorize(method, path string) web.Handler {
case http.MethodPost + "/api/alertmanager/grafana/config/api/v1/alerts":
// additional authorization is done in the request handler
eval = ac.EvalAny(ac.EvalPermission(ac.ActionAlertingNotificationsWrite))
case http.MethodPost + "/api/alertmanager/grafana/config/history/{id}/_activate":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How come this is _activate instead of something like activate?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had the same question and here's the answer. I'm not very convinced by that underscore.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Context is here: #65751 (comment)

pkg/services/ngalert/api/api_alertmanager.go Outdated Show resolved Hide resolved
@grobinson-grafana
Copy link
Contributor

Tested and confirmed it works! 🚀

@JacobsonMT JacobsonMT merged commit 85f738c into main Apr 5, 2023
@JacobsonMT JacobsonMT deleted the jacobsonmt/endpoint_activate_previous_config branch April 5, 2023 18:10
@zerok zerok modified the milestones: 10.0.0, 10.0.0-preview May 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

None yet

6 participants