
An one time job to init send-pulse trigger and migration down to clean up send-pulse triggers #42316

Merged
merged 21 commits into from
May 15, 2024

Conversation

qnkhuat
Contributor

@qnkhuat qnkhuat commented May 7, 2024

A follow-up of #41772.

In that PR we added a method called update-send-pulse-trigger-if-needed! that runs as part of task/init! for the SendPulses job.

It was meant to be a migration to create SendPulse triggers for existing triggers that don't have one.

Ideally it runs only once, so this PR moves it into a job that is triggered exactly once, inspired by the BackfillQueryField job from @piranha.

This also adds 2 migrations that run only on downgrade, to remove the SendPulse job and InitSendPulseTriggers. We need this to avoid the problem Cal described here.

By having these 2 migrations, if users downgrade and then upgrade again, InitSendPulseTriggers will be triggered again and all SendPulse jobs will be scheduled correctly.

This PR also includes a typo fix for DeleteSendPulsesTask migration.
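The run-once-plus-downgrade-cleanup idea above can be sketched roughly as follows. This is a minimal Python sketch, not the actual Metabase/Quartz code; names like `run_once` and `ran_once_markers` are hypothetical stand-ins for Quartz's record of completed jobs:

```python
# Hypothetical sketch of a "run exactly once" job plus the downgrade
# migration that clears its marker so it re-runs on the next upgrade.

ran_once_markers = set()  # stand-in for Quartz's completed-job bookkeeping

def init_send_pulse_triggers(pulses, triggers):
    """Idempotently create a trigger for every pulse that lacks one."""
    for pulse in pulses:
        triggers.setdefault(pulse, "scheduled")

def run_once(job_name, job, *args):
    """Run `job` only if it has never run before (the one-time job)."""
    if job_name in ran_once_markers:
        return False
    job(*args)
    ran_once_markers.add(job_name)
    return True

def downgrade_migration():
    """The down-migration: forget the marker so that a subsequent
    upgrade schedules InitSendPulseTriggers again."""
    ran_once_markers.discard("InitSendPulseTriggers")
```

Deleting the marker on downgrade is the key move: it is what lets the one-time job fire a second time after the user upgrades again.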

@qnkhuat qnkhuat requested a review from camsaul as a code owner May 7, 2024 08:44
@qnkhuat qnkhuat requested a review from calherries May 7, 2024 08:44
@metabase-bot metabase-bot bot added the .Team/BackendComponents also known as BEC label May 7, 2024
Copy link

replay-io bot commented May 7, 2024

Status: Complete
Commit: 4284e2a
Results: ⚠️ 3 Flaky, 2505 Passed

@@ -1275,6 +1275,6 @@
(set-jdbc-backend-properties!)
(let [scheduler (qs/initialize)]
(qs/start scheduler)
(qs/delete-trigger scheduler (triggers/key "metabase.task.send-pulses.job"))
Contributor Author


urgh, it was swapped.

the fact that we have to do this every time we remove a job class is also bad. I'll explore a way to make this happen automatically.

Contributor

@calherries calherries May 9, 2024


Triggers get deleted with the job, as you said here, so we don't need line 1266.

@@ -6603,7 +6603,7 @@ databaseChangeLog:
DELETE FROM data_permissions where perm_type = 'perms/data-access' OR perm_type = 'perms/native-query-editing';

- changeSet:
id: v50.2024-04-25T01:04:04
id: v50.2024-04-25T01:04:05
Contributor Author


Increased so stats will also run this migration.

@qnkhuat qnkhuat requested a review from a team May 8, 2024 08:00
@qnkhuat qnkhuat added the no-backport Do not backport this PR to any branch label May 8, 2024
Contributor

@calherries calherries left a comment


Aside from the existing issue with pulse triggers and rollbacks I posted about here, this doesn't seem like it will work correctly for customers who roll back a major version upgrade.

Imagine this sequence of events:

  • start on 49, with many pulses
  • upgrade to 50
  • roll back to 49
  • add some more pulses
  • upgrade to 50 again
  • the InitSendPulseTriggers job won't execute again, because it's already run once in the past

That means that any pulses created between the rollback and the subsequent upgrade will fail to have any triggers created for them.

What if we paused this instead, and did it in 51, when we don't need to worry about rollbacks from 51 to 49?

(Btw, this is also an issue with #42279 @piranha, but maybe it doesn't matter as much?)
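The failure mode in that sequence can be illustrated with a toy simulation (hypothetical Python names, not Metabase code): without downgrade cleanup, the one-time job refuses its second run, and pulses added between the rollback and the re-upgrade end up with no triggers:

```python
# Toy walk-through of the rollback sequence described above.

triggers = {}
job_already_ran = False

def init_triggers(pulses):
    """One-time job with no downgrade cleanup: never runs twice."""
    global job_already_ran
    if job_already_ran:
        return  # skipped: the marker from the first upgrade survives
    for p in pulses:
        triggers[p] = "scheduled"
    job_already_ran = True

pulses = ["pulse-1", "pulse-2"]  # start on 49, with pulses
init_triggers(pulses)            # upgrade to 50: triggers created
# roll back to 49, add another pulse
pulses.append("pulse-3")
init_triggers(pulses)            # upgrade to 50 again: job is skipped
missing = [p for p in pulses if p not in triggers]
# missing == ["pulse-3"]: the pulse created after the rollback never
# gets a trigger
```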

@piranha
Contributor

piranha commented May 8, 2024

@calherries it indeed seems like a problem, I'd love to solve that, but not sure if quartz has any good solution for us here? We can of course just make "once a day" cron (provided the job is idempotent)... feels a bit dirty, but will do its job.

@qnkhuat
Contributor Author

qnkhuat commented May 8, 2024

how about making quartz auto-delete triggers that don't have a job class? this should make this type of problem error-proof going forward.

@calherries
Contributor

calherries commented May 8, 2024

how about making quartz auto-delete triggers that don't have a job class

seems sensible, but doesn't help sanya's case now, since 49 is already released

@piranha I don't have any better ideas right now, but I would think a once-a-day solution would do the job! Even if it feels dirty. We can remove it after one release cycle too
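The once-a-day fallback is safe precisely because the job is idempotent: re-running it never duplicates triggers, so most daily runs are no-ops. A rough sketch (hypothetical names, not the actual implementation):

```python
# Hypothetical sketch: an idempotent init job that is safe to run on a
# daily schedule because repeat runs create nothing new.

triggers = {}

def init_triggers_idempotent(pulses):
    """Create triggers only for pulses that don't have one yet.
    Returns how many triggers this run created."""
    created = 0
    for p in pulses:
        if p not in triggers:
            triggers[p] = "scheduled"
            created += 1
    return created

pulses = ["pulse-1", "pulse-2"]
first = init_triggers_idempotent(pulses)   # day 1: creates 2
second = init_triggers_idempotent(pulses)  # day 2: creates 0, a no-op
pulses.append("pulse-3")                   # a pulse added in between
third = init_triggers_idempotent(pulses)   # day 3: picks up only the new one
```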

@qnkhuat
Contributor Author

qnkhuat commented May 8, 2024

ok I have #42383 to do that, I'll backport it to 49 as well so we can still get this in.

@calherries
Contributor

Backporting that to 49 doesn't solve the issue for already released 49 versions. You'll have to wait a full release cycle

@qnkhuat qnkhuat changed the title Add a job to init send pulse trigger only once An one time job to init send-pulse trigger and migration down to clean up send-pulse triggers May 9, 2024
@qnkhuat qnkhuat requested a review from calherries May 9, 2024 09:07
(pulse-channel-test/with-send-pulse-setup!
(let [user-id (:id (new-instance-with-default :core_user))
pulse-id (:id (new-instance-with-default :pulse {:creator_id user-id}))
pc (new-instance-with-default :pulse_channel {:pulse_id pulse-id})]
Contributor

@calherries calherries May 9, 2024


Why use new-instance-with-default here, rather than inlining the data in the migration? I don't think the DRY principle should apply here. We don't want to couple migrations to each other in case the underlying schema or defaults for the tables change over time.

Contributor Author


that's the idea of new-instance-with-default being defined in this namespace, otherwise we could use t2.with-temp/with-temp-defaults.

new-instance-with-default should only have fields that are less likely to be changed, like core_user.email, created_at ...

maybe schedule_type and schedule_hour are not the best candidates, as we plan to remove them. Nevertheless I think some amount of DRY is valuable here as these migrations are often very long and take some amount of setup to test.

Contributor


some amount of DRY is valuable here

agreed, DRYing up things to keep code shorter is good, except for when it can cause issues like here

new-instance-with-default should only have fields that are less likely to be changed

true, it was maybe okay for users. but it becomes a problem when we use it for things that are more likely to change, so I think it should be discouraged. Not a strong opinion and I see the upsides too. Just thought I might point it out

@@ -138,26 +137,6 @@
(pulse-channel-test/send-pulse-triggers pulse-1)
(pulse-channel-test/send-pulse-triggers pulse-2))))))))

(deftest init-will-schedule-triggers-test
Contributor


Why did you remove this test? As far as I can see we still need it for the new init-send-pulse-triggers! function

Contributor Author


Because it's no longer triggered by calling (task/init! ::task.send-pulses/SendPulses); it still runs, but on a different thread.

Also this should be covered by the new migration test.

Contributor

@calherries calherries May 9, 2024


this should be covered by the new migration test.

They're not exactly equivalent, so I'm just checking we're okay with weakening the test coverage. This test covers the property that the pulse trigger's info includes the schedule. The migration test is weaker in that it only tests that triggers are created for the pulse.

Contributor Author


I updated it: instead of checking the count, we now check the trigger info in the migration test: 0f75110

@qnkhuat qnkhuat requested a review from noahmoss as a code owner May 9, 2024 11:53
@qnkhuat qnkhuat requested a review from calherries May 9, 2024 11:54
(defn- init-send-pulse-triggers!
[]
(let [trigger-slot->pc-ids (as-> (t2/select :model/PulseChannel :enabled true) results
(group-by #(select-keys % [:pulse_id :schedule_type :schedule_day :schedule_hour :schedule_frame]) results)
Contributor


Can we group simply by pulse_id instead? I'm concerned that we'll do a migration on the :schedule_ columns in the future and leaving them here will be confusing

Suggested change
(group-by #(select-keys % [:pulse_id :schedule_type :schedule_day :schedule_hour :schedule_frame]) results)
(group-by :pulse_id results)

Contributor Author


I don't think this removes the confusion from a modeling perspective. Once we migrate to a cron string we can group by that instead.

Contributor

@calherries calherries left a comment


We need to be really careful with test coverage for these pseudo-migrations that we schedule as quartz jobs. The tests need to cover not only existing version upgrades, but also all future version upgrades, e.g. going straight from 49 to 51.

init-send-pulse-triggers! runs once after all migrations finish, and that will remain the case forever into the future. Currently I only see a test covering the case of upgrading from 49 to 50. That's not enough, given that in the future we'll have more major versions, and the things the migration depends on (like the schema of pulse channels) can change over time.

@qnkhuat
Contributor Author

qnkhuat commented May 10, 2024

I've updated the migration test so it migrates up to the latest migration, so it should work for future migrations: 1dd523d

@qnkhuat qnkhuat requested review from calherries and a team May 10, 2024 11:09
Contributor

@calherries calherries left a comment


LGTM

@qnkhuat
Contributor Author

qnkhuat commented May 14, 2024

@calherries updated to make sure we don't schedule triggers for archived dashboards: 1ffe69f

@qnkhuat qnkhuat enabled auto-merge (squash) May 15, 2024 05:08
@qnkhuat qnkhuat merged commit fe1f595 into master May 15, 2024
111 checks passed
@qnkhuat qnkhuat deleted the init-trigger-once branch May 15, 2024 05:42

@qnkhuat Did you forget to add a milestone to the issue for this PR? When and where should I add a milestone?
