Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alerting: In migration improve deduplication of title and group #78351

Merged
merged 6 commits into from
Nov 29, 2023

Conversation

JacobsonMT
Copy link
Member

@JacobsonMT JacobsonMT commented Nov 17, 2023

What is this feature?

This change improves alert titles generated in the legacy migration that occur when we need to deduplicate titles. Now when duplicate titles are detected we will first attempt to append a sequential index, falling back to a random uid if none are unique within 10 attempts. This should cause shorter and more easily readable deduplicated titles in most cases.

In addition, groups are no longer deduplicated. Instead we set them to a combination of truncated dashboard name and humanized alert frequency. This way, alerts from the same dashboard share a group if they have the same evaluation interval. In the event that truncation causes overlap, it won't be a big issue as all alerts will still be in a group with the correct evaluation interval.

Who is this feature for?

Users migrating from legacy alerting to Grafana Alerting (UA)

alertA in panel1 in dashboard1 in folder1 - 60s interval
alertA in panel2 in dashboard1 in folder1 - 60s interval
alertB in panel1 in dashboard2 in folder1 - 300s interval

These would migrate to as follows.

Before this change (example uid):
alertA in group dashboard1 - 1 in folder1
alertA_c976a621-4f7a-4963-82c3-0e58e2e5311f in group dashboard1 - 2 in folder1
alertB in group dashboard2 - 1 in folder1

After this change:
alertA in group dashboard1 - 1m in folder1
alertA #2 in group dashboard1 - 1m in folder1
alertB in group dashboard2 - 5m in folder1

Now when duplicate titles are detected we will first attempt to use the
dashboard and panel ids to deduplicate, falling back to a random uid.

Groups are no longer deduplicated, instead we set them to a combination of
truncated dashboard name and alert frequency. This way, alerts from the same
dashboard share a group if they have the same evaluation interval. In the event
that truncation causes overlap, it won't be a big issue as all alerts will still
 be in a group with the correct evaluation interval.
@JacobsonMT JacobsonMT requested a review from a team as a code owner November 17, 2023 17:38
@JacobsonMT JacobsonMT requested review from rwwiv, yuri-tceretian and grobinson-grafana and removed request for a team November 17, 2023 17:38
@grafana-delivery-bot grafana-delivery-bot bot added this to the 10.3.x milestone Nov 17, 2023
@JacobsonMT JacobsonMT added area/alerting Grafana Alerting add to changelog no-backport Skip backport of PR labels Nov 17, 2023
Copy link
Contributor

@yuri-tceretian yuri-tceretian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. The overall change is really good. If my suggestions make sense, it would be great if you can address them in this PR.

@JacobsonMT JacobsonMT added the area/alerting/migration Issues relating to legacy alerting migration label Nov 24, 2023
Copy link
Contributor

@yuri-tceretian yuri-tceretian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Excellent job. My comments are optional.


// None of the simple suffixes worked, so we generate a new uid. We try a few times, just in case but this should
// almost always create a unique string on the first try.
for i := 0; i < 5; i++ {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we can reuse constant MaxIncrementDeduplicationAttempts here by slightly renaming it. Not a blocker at all. Just for the sake of not using magic literals in code :)

}

func Test_shortUIDCaseInsensitiveConflicts(t *testing.T) {
s := Deduplicator{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it can be simplified to just two lines with predefined UIDs like:

  1. "abc"
  2. "Abc"

and then just count number of keys when caseInsensitive is true and false

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, this test doesn't really make much sense anymore since we're now using uuid.NewRandom. It was testing issues with the algorithm when creating uids in quick succession. I'd rather leave it as is, or remove it entirely.

@@ -61,8 +61,6 @@ type Store interface {
SetOrgMigrationState(ctx context.Context, orgID int64, summary *migmodels.OrgMigrationState) error

RevertAllOrgs(ctx context.Context) error

CaseInsensitive() bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I think this is a good thing.

silences: make([]*pb.MeshSilence, 0),
titleDeduplicatorForFolder: func(folderUID string) *migmodels.Deduplicator {
if _, ok := titlededuplicatorPerFolder[folderUID]; !ok {
titlededuplicatorPerFolder[folderUID] = migmodels.NewDeduplicator(ms.store.GetDialect().SupportEngine(), store.AlertDefinitionMaxTitleLength)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think ms.migrationStore.CaseInsensitive() was better because it hid the store's implementation details

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add it back, I don't feel strongly here. Removed it because this was the only place it's used and we have access to the base store.

@JacobsonMT JacobsonMT merged commit 2b51f0e into main Nov 29, 2023
17 checks passed
@JacobsonMT JacobsonMT deleted the jacobsonmt/migration-improve-dedup branch November 29, 2023 15:05
gabor pushed a commit that referenced this pull request Nov 29, 2023
* Alerting: In migration improve deduplication of title and group

This change improves alert titles generated in the legacy migration 
that occur when we need to deduplicate titles. Now when duplicate 
titles are detected we will first attempt to append a sequential index, 
falling back to a random uid if none are unique within 10 attempts. 
This should cause shorter and more easily readable deduplicated 
titles in most cases.

In addition, groups are no longer deduplicated. Instead we set them 
to a combination of truncated dashboard name and humanized alert 
frequency. This way, alerts from the same dashboard share a group 
if they have the same evaluation interval. In the event that truncation 
causes overlap, it won't be a big issue as all alerts will still be in a 
group with the correct evaluation interval.
@aangelisc aangelisc modified the milestones: 10.3.x, 10.2.3 Dec 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
add to changelog area/alerting/migration Issues relating to legacy alerting migration area/alerting Grafana Alerting area/backend no-backport Skip backport of PR
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

None yet

3 participants