-
Notifications
You must be signed in to change notification settings - Fork 11.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Alerting: Move legacy alert migration from sqlstore migration to service #72702
Conversation
This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 2 weeks if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions! |
d36812c
to
73a06ff
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some review notes
NamespaceUID string `xorm:"namespace_uid"` | ||
UID string `xorm:"uid"` | ||
NamespaceUID string `xorm:"namespace_uid"` | ||
DashboardUID *string `xorm:"dashboard_uid"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added because AddDashboardUIDPanelIDMigration
used to be conditional on migration (runs and reverts with migration). Since we cna't rely on it anymore the migration service needs to populate the fields. See 2fdf1d0
|
||
// FixEarlyMigration fixes UA configs created before 8.2 with org_id=0 and moves some files like __default__.tmpl. | ||
// The only use of this migration is when a user enabled ng-alerting before 8.2. | ||
func FixEarlyMigration(mg *migrator.Migrator) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This used to be called RerunDashAlertMigration
which was wholly confusing as it didn't actually rerun the migration. Renamed to better represent what it was doing, and removed the condition on migration.
The code is not added to the migration service as it's no longer relevant. See 60d35fb
@@ -492,9 +492,6 @@ func addAlertImageMigrations(mg *migrator.Migrator) { | |||
} | |||
|
|||
func extractAlertmanagerConfigurationHistoryMigration(mg *migrator.Migrator) { | |||
if !mg.Cfg.UnifiedAlerting.IsEnabled() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This always ran before the migration, so it never did anything to new
legacy migrations anyways.
|
||
// UpdateRuleGroupIndexMigration updates a new field rule_group_index for alert rules that belong to a group with more than 1 alert. | ||
func UpdateRuleGroupIndexMigration(mg *migrator.Migrator) { | ||
mg.AddMigration("update group index for alert rules", &updateRulesOrderInGroup{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved as-is from ualert along with the vendored structs it needed from elsewhere in the migration model.
The only change is here is removing the condition to only add if UA is enabled.
} | ||
|
||
// CreateDefaultFoldersForAlertingMigration creates a folder dedicated for alerting if no folders exist | ||
func CreateDefaultFoldersForAlertingMigration(mg *migrator.Migrator) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This migration was removed entirely as it's no longer necessary.
- The migration service will create the general alerting folder when ran.
- Creating a new first folder via the rule creation page is simpler than when this migration was created.
@@ -0,0 +1,102 @@ | |||
package fakes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved as-is from notifier.testing so it can be used for migration tests.
var migratedKey = "migrated" | ||
|
||
// MigrationServiceMigration moves the legacy alert migration status from the migration log to kvstore. | ||
func MigrationServiceMigration(mg *migrator.Migrator) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is necessary so that we retain the current migration state between when sqlstore was in charge of UA migration and now. See 79781ae
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review notes part 2
@@ -0,0 +1,300 @@ | |||
package migration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is moved nearly 100% as-is from sqstore, not sure why it's not detected as a move. The only changes are to use the session through the context instead of through folderHelper
. Browser diff: https://www.diffchecker.com/ldxQCY9E/
type secureJsonData map[string][]byte | ||
|
||
// getEncryptedJsonData returns map where all keys are encrypted. | ||
func getEncryptedJsonData(sjd map[string]string, log log.Logger) secureJsonData { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leftover from securejsondata.go
Leaves custom ACL logic for followup as it needs to be completely rewritten for rbac.
This is a needed for upcoming work, so we do it separately for clarity.
Setting dashboard.created_by to -8 as a way to track which folders were created by the migration is no longer possible. Instead, we store the newly created folders uids in the kvstore and use it on revert.
If migrationLock feature flag was enabled and the max db connections was 2 or less, a deadlock would occur. This is because the GetMigrationLog() method spawns a new session and when migrationLock is used, the migration is executed in a nested transaction. Together, these deplete the connection pool.
117aa81
to
137e6bd
Compare
/deploy-to-hg |
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This PR moves the legacy alert migration out of the sqlstore db migrations and into ngalert.
There are multiple reasons behind this change, I'll do my best to explain them here:
Legacy alert migration doesn't fit the definition of a sqlstore migration
Initially, this was a natural place for the migration code to live. It does, in fact, migrate database schema and contents. However, it breaks two key requirements of sqlstore migrations:
These two differences are caused by the fact that the legacy migration is currently optional. Users may decide when to migrate, or even rollback the migration and try again later. This optional status also means we were able to improve the migration over time, thus causing the migration to change.
This has contributed to numerous bugs in the past, either directly and indirectly. To name a few recent ones:
Legacy provisioned alerts cannot currently be migrated in stateless grafana deployments
Since sqlstore migrations run before any provisioning, this means that the migration code does not see that legacy provisioned alerts exist.
By moving this into
ngalert.Run()
we allow a followup PR to run dashboard provisioning before we start the legacy migration.The migration code is saving vendored models to the database
This makes some sense if the migration were a true sqlstore migration where followup migrations would update the vendored models to more current ngalert versions. This is not currently what is happening. Instead, we periodically update the vendored models in the main migration in an attempt to keep the vendor in sync with ngalert models.
This is prone to drift and is not obvious in the context of sqlstore migrations.
With the migration code part of the ngalert package, a followup PR can ensure it uses the real ngalert models and drift should be caught alongside any other ngalert code changes we do.
This unlocks features that were not easily done before
If the migration code is in ngalert as a service, then we are free to use it in new/existing endpoints. This opens up interesting feature possibilities. For example (not saying we will do all of these and some are mutually exclusive):
legacy_migration
or something). As opposed to most alerts, theselegacy_migration
provenance alerts can be overwritten on a subsequent migration run. This could allow us to let users keep their as-code provisioned alerts in legacy format for a while if desired.Edit:
For the purposes of keeping the change self-contained in a single commit, this PR has been combined with a 2nd PR to replace the vendored models in the migration with their equivalent ngalert models. It also replaces the raw SQL selects and inserts with service calls:
alertRule
->ngmodels.AlertRule
PostableUserConfig
,PostableApiAlertingConfig
,PostableApiReceiver
,Route
,ObjectMatchers
,PostableGrafanaReceiver
to theirapimodels
equivalents of the same name.util.Encrypt
->secrets.Service.Encrypt
notificationChannel
->legacymodels.AlertNotification
dashboard
->dashboards.Dashboard
d023023dsUIDLookup
->datasources.CacheService
Some gaps in the testing suite are filled:
Replacing the checks for custom dashboard ACLs will be done in a separate targeted PR as it adds functionality instead of moving it around.