Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alerting: Fix inconsistent AM raw config when applied via sync vs API #81655

Merged
merged 1 commit into from
Jan 31, 2024

Conversation

JacobsonMT
Copy link
Member

@JacobsonMT JacobsonMT commented Jan 31, 2024

What is this feature?

AM config applied via API would use the PostableUserConfig as the AM raw config and also the hash used to decide when the AM config has changed. However, when applied via the periodic sync the PostableApiAlertingConfig would be used instead.

This changes makes all means of applying the AM config use PostableApiAlertingConfig.

Why do we need this feature?

This issue causes:

  • Inconsistent hash comparisons when modifying the AM causing redundant applies.
  • GetStatus (api/alertmanager/grafana/api/v2/status) assumed the raw config was PostableUserConfig causing the endpoint to return correctly after a new config is applied via API and then nothing once the periodic sync runs.

Who is this feature for?

UA users.

Special notes for your reviewer:

Technically, the upstream GrafanaAlertamanger GetStatus shouldn't be returning PostableUserConfig or PostableApiAlertingConfig, but instead GettableStatus. However, this issue required changes elsewhere and is out of scope.

@JacobsonMT JacobsonMT added this to the 10.4.x milestone Jan 31, 2024
@JacobsonMT JacobsonMT requested review from a team, rwwiv, yuri-tceretian and grobinson-grafana and removed request for a team January 31, 2024 15:44
Copy link
Contributor

@yuri-tceretian yuri-tceretian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@grobinson-grafana grobinson-grafana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine to me! 👍

@grobinson-grafana
Copy link
Contributor

GetStatus (api/alertmanager/grafana/api/v2/status) assumed the raw config was PostableUserConfig causing the endpoint to return correctly after a new config is applied via API and then nothing once the periodic sync runs.

So this failed until a configuration was applied via the API?

@JacobsonMT
Copy link
Member Author

GetStatus (api/alertmanager/grafana/api/v2/status) assumed the raw config was PostableUserConfig causing the endpoint to return correctly after a new config is applied via API and then nothing once the periodic sync runs.

So this failed until a configuration was applied via the API?

Yes, and then quickly started failing again once the periodic sync happened

@grobinson-grafana
Copy link
Contributor

GetStatus (api/alertmanager/grafana/api/v2/status) assumed the raw config was PostableUserConfig causing the endpoint to return correctly after a new config is applied via API and then nothing once the periodic sync runs.

So this failed until a configuration was applied via the API?

Yes, and then quickly started failing again once the periodic sync happened

More so out of curiosity, how would I replicate that on main branch?

@JacobsonMT
Copy link
Member Author

GetStatus (api/alertmanager/grafana/api/v2/status) assumed the raw config was PostableUserConfig causing the endpoint to return correctly after a new config is applied via API and then nothing once the periodic sync runs.

So this failed until a configuration was applied via the API?

Yes, and then quickly started failing again once the periodic sync happened

More so out of curiosity, how would I replicate that on main branch?

/api/alertmanager/grafana/api/v2/status should return the config right after saving a new AM config. Then wait a minute until the sync happens, the endpoint will stop returning the config.

After save:
Screenshot from 2024-01-31 13-33-54

After sync:
Screenshot from 2024-01-31 13-35-39

@JacobsonMT JacobsonMT enabled auto-merge (squash) January 31, 2024 18:42
AM config applied via API would use the PostableUserConfig as the AM raw
config and also the hash used to decide when the AM config has changed.
However, when applied via the periodic sync the PostableApiAlertingConfig would
be used instead.

This leads to two issues:
- Inconsistent hash comparisons when modifying the AM causing redundant applies.
- GetStatus assumed the raw config was PostableUserConfig causing the endpoint
to return correctly after a new config is applied via API and then nothing once
 the periodic sync runs.

Note: Technically, the upstream GrafanaAlertamanger GetStatus shouldn't be
returning PostableUserConfig or PostableApiAlertingConfig, but instead
GettableStatus. However, this issue required changes elsewhere and is out of
scope.
@JacobsonMT JacobsonMT force-pushed the jacobsonmt/fix-inconsistent-AM-raw-config branch from e9e05ab to 251c0e2 Compare January 31, 2024 18:48
@JacobsonMT JacobsonMT merged commit 0ce1ccd into main Jan 31, 2024
12 checks passed
@JacobsonMT JacobsonMT deleted the jacobsonmt/fix-inconsistent-AM-raw-config branch January 31, 2024 19:05
Ukochka pushed a commit that referenced this pull request Feb 14, 2024
…#81655)

AM config applied via API would use the PostableUserConfig as the AM raw
config and also the hash used to decide when the AM config has changed.
However, when applied via the periodic sync the PostableApiAlertingConfig would
be used instead.

This leads to two issues:
- Inconsistent hash comparisons when modifying the AM causing redundant applies.
- GetStatus assumed the raw config was PostableUserConfig causing the endpoint
to return correctly after a new config is applied via API and then nothing once
 the periodic sync runs.

Note: Technically, the upstream GrafanaAlertamanger GetStatus shouldn't be
returning PostableUserConfig or PostableApiAlertingConfig, but instead
GettableStatus. However, this issue required changes elsewhere and is out of
scope.
@aangelisc aangelisc modified the milestones: 10.4.x, 10.4.0 Mar 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

None yet

4 participants