fix: migrate existing deadline rows in migration 0080 upgrade and downgrade#66016
Conversation
6bd9897 to
db962e1
Compare
a3771b5 to
435af24
Compare
…ngrade
Both the upgrade and downgrade paths of migration 0080
(808787349f22 - Modify deadline callback schema) added NOT NULL columns
to the deadline table without first populating them from the existing
data, causing:
* upgrade: NotNullViolation when adding callback JSON NOT NULL to a
non-empty table (existing rows have no value for the new
column).
* downgrade: NotNullViolation on PostgreSQL / silent NULL on MySQL when
adding callback VARCHAR(500) NOT NULL after dropping the
JSON column, crashing the subsequent 0094 upgrade with
json.loads(None).
Fix both paths with the same pattern used by migration 0094:
1. Read the existing data before any schema change.
2. Add the new column(s) as nullable so the DDL succeeds on all
supported databases.
3. Back-fill the column using a typed SA table clause (so each dialect
handles JSON serialisation correctly).
4. Enforce NOT NULL only after every row has a valid value.
Upgrade serialises the old (path, kwargs) pair into the
{"__data__": {"path": ..., "kwargs": ...}, "__classname__": ...,
"__version__": 0} format expected by migration 0094.
Downgrade extracts path/kwargs back from that same JSON envelope
before restoring the VARCHAR callback column.
Made-with: Cursor
Co-authored-by: Cursor <cursoragent@cursor.com>
435af24 to
36eb93a
Compare
ephraimbuddy
left a comment
There was a problem hiding this comment.
This still misses databases that already ran the old 0080. If MySQL created existing deadline.callback = NULL rows and Alembic is already stamped past 808787349f22, this edited 0080 upgrade will not run again, so 0094 can still hit json.loads(row.callback) with None.
I think we need either a repair path in 0094 or a follow-up migration that handles already-applied bad 0080 rows, plus a regression test that starts from a post-0080 schema with callback set to NULL.
Legacy MySQL deployments that ran the original (pre-#66016) 0080 migration silently wrote NULL into deadline.callback. Alembic is already stamped past 808787349f22 on those deployments, so the fixed 0080 won't re-run -- 0094 then crashes on json.loads(None). Defensive handling in 0094: - _upgrade_mysql_sqlite: detect raw_cb is None, log warning, default to empty envelope. - _upgrade_postgresql: COALESCE on cb_path / cb_kwargs so NULL jsonb doesn't propagate into callback.data. Regression test starts from a post-0080 schema with callback=NULL and verifies 0094 produces an empty envelope without crashing, plus that a mixed NULL+valid batch migrates both rows correctly. Addresses ephraimbuddy's review comment on #66016.
- 0094 PG path: wrap kwargs COALESCE with NULLIF so JSON-literal null
(e.g. from hand-edited DBs) is also normalized to {} -- matches the
MySQL/SQLite defensive normalization.
- 0094 MySQL/SQLite: aggregate per-row NULL-callback WARNING into a
single summary line after the loop to avoid log spam on deployments
with many legacy NULL rows.
- 0080 downgrade: log a WARNING when an existing row's kwargs is not a
dict (instead of silently resetting), mirroring the VARCHAR truncate
warning pattern.
- 0080 tests: add test_upgrade_exact_batch_boundary covering the case
where rows == batch_size to exercise the loop-continuation path.
d6620f3 to
058d7e9
Compare
… mismatch `_upgrade_mysql_sqlite` declares the deadline `id` column as `sa.Uuid()`. On SQLite the SQLAlchemy write path serializes UUID objects as a hex string without dashes. The previous test inserted ids via `str(uuid.uuid4())` (dashed form), so the SELECT correctly parsed them back to UUID objects but the subsequent UPDATE's WHERE clause produced the hex form -- no rows matched, the WHERE callback_id IS NULL filter never narrowed, and the loop spun until the 60s execution timeout. Insert ids via `uuid.uuid4().hex` so read and write round-trip cleanly.
Without this guard, an empty deadline table causes executemany with an empty parameter list, which SQLAlchemy rejects with 'A value is required for bind parameter'. Reverts the removal from e50f106.
Backport successfully created: v3-2-testNote: As of Merging PRs targeted for Airflow 3.X In matter of doubt please ask in #release-management Slack channel.
|
…rade and downgrade (apache#66016) * fix: migrate existing deadline rows in migration 0080 upgrade and downgrade Both the upgrade and downgrade paths of migration 0080 (808787349f22 - Modify deadline callback schema) added NOT NULL columns to the deadline table without first populating them from the existing data, causing: * upgrade: NotNullViolation when adding callback JSON NOT NULL to a non-empty table (existing rows have no value for the new column). * downgrade: NotNullViolation on PostgreSQL / silent NULL on MySQL when adding callback VARCHAR(500) NOT NULL after dropping the JSON column, crashing the subsequent 0094 upgrade with json.loads(None). Fix both paths with the same pattern used by migration 0094: 1. Read the existing data before any schema change. 2. Add the new column(s) as nullable so the DDL succeeds on all supported databases. 3. Back-fill the column using a typed SA table clause (so each dialect handles JSON serialisation correctly). 4. Enforce NOT NULL only after every row has a valid value. Upgrade serialises the old (path, kwargs) pair into the {"__data__": {"path": ..., "kwargs": ...}, "__classname__": ..., "__version__": 0} format expected by migration 0094. Downgrade extracts path/kwargs back from that same JSON envelope before restoring the VARCHAR callback column. Made-with: Cursor Co-authored-by: Cursor <cursoragent@cursor.com> * fix(deadline): repair NULL callback rows in 0094 upgrade Legacy MySQL deployments that ran the original (pre-apache#66016) 0080 migration silently wrote NULL into deadline.callback. Alembic is already stamped past 808787349f22 on those deployments, so the fixed 0080 won't re-run -- 0094 then crashes on json.loads(None). Defensive handling in 0094: - _upgrade_mysql_sqlite: detect raw_cb is None, log warning, default to empty envelope. - _upgrade_postgresql: COALESCE on cb_path / cb_kwargs so NULL jsonb doesn't propagate into callback.data. Regression test starts from a post-0080 schema with callback=NULL and verifies 0094 produces an empty envelope without crashing, plus that a mixed NULL+valid batch migrates both rows correctly. Addresses ephraimbuddy's review comment on apache#66016. * fix(deadline): address review nits - 0094 PG path: wrap kwargs COALESCE with NULLIF so JSON-literal null (e.g. from hand-edited DBs) is also normalized to {} -- matches the MySQL/SQLite defensive normalization. - 0094 MySQL/SQLite: aggregate per-row NULL-callback WARNING into a single summary line after the loop to avoid log spam on deployments with many legacy NULL rows. - 0080 downgrade: log a WARNING when an existing row's kwargs is not a dict (instead of silently resetting), mirroring the VARCHAR truncate warning pattern. - 0080 tests: add test_upgrade_exact_batch_boundary covering the case where rows == batch_size to exercise the loop-continuation path. * test(deadline): insert deadline ids in hex form to avoid SA Uuid type mismatch `_upgrade_mysql_sqlite` declares the deadline `id` column as `sa.Uuid()`. On SQLite the SQLAlchemy write path serializes UUID objects as a hex string without dashes. The previous test inserted ids via `str(uuid.uuid4())` (dashed form), so the SELECT correctly parsed them back to UUID objects but the subsequent UPDATE's WHERE clause produced the hex form -- no rows matched, the WHERE callback_id IS NULL filter never narrowed, and the loop spun until the 60s execution timeout. Insert ids via `uuid.uuid4().hex` so read and write round-trip cleanly. * fix(deadline): drop redundant empty-rows break in 0080 batch loops * fix(deadline): restore empty-rows break in 0080 batch loops Without this guard, an empty deadline table causes executemany with an empty parameter list, which SQLAlchemy rejects with 'A value is required for bind parameter'. Reverts the removal from e50f106. --------- (cherry picked from commit c8a6c55) Co-authored-by: Rahul Vats <43964496+vatsrahul1001@users.noreply.github.com> Co-authored-by: Cursor <cursoragent@cursor.com>
…rade and downgrade (apache#66016) * fix: migrate existing deadline rows in migration 0080 upgrade and downgrade Both the upgrade and downgrade paths of migration 0080 (808787349f22 - Modify deadline callback schema) added NOT NULL columns to the deadline table without first populating them from the existing data, causing: * upgrade: NotNullViolation when adding callback JSON NOT NULL to a non-empty table (existing rows have no value for the new column). * downgrade: NotNullViolation on PostgreSQL / silent NULL on MySQL when adding callback VARCHAR(500) NOT NULL after dropping the JSON column, crashing the subsequent 0094 upgrade with json.loads(None). Fix both paths with the same pattern used by migration 0094: 1. Read the existing data before any schema change. 2. Add the new column(s) as nullable so the DDL succeeds on all supported databases. 3. Back-fill the column using a typed SA table clause (so each dialect handles JSON serialisation correctly). 4. Enforce NOT NULL only after every row has a valid value. Upgrade serialises the old (path, kwargs) pair into the {"__data__": {"path": ..., "kwargs": ...}, "__classname__": ..., "__version__": 0} format expected by migration 0094. Downgrade extracts path/kwargs back from that same JSON envelope before restoring the VARCHAR callback column. Made-with: Cursor Co-authored-by: Cursor <cursoragent@cursor.com> * fix(deadline): repair NULL callback rows in 0094 upgrade Legacy MySQL deployments that ran the original (pre-apache#66016) 0080 migration silently wrote NULL into deadline.callback. Alembic is already stamped past 808787349f22 on those deployments, so the fixed 0080 won't re-run -- 0094 then crashes on json.loads(None). Defensive handling in 0094: - _upgrade_mysql_sqlite: detect raw_cb is None, log warning, default to empty envelope. - _upgrade_postgresql: COALESCE on cb_path / cb_kwargs so NULL jsonb doesn't propagate into callback.data. Regression test starts from a post-0080 schema with callback=NULL and verifies 0094 produces an empty envelope without crashing, plus that a mixed NULL+valid batch migrates both rows correctly. Addresses ephraimbuddy's review comment on apache#66016. * fix(deadline): address review nits - 0094 PG path: wrap kwargs COALESCE with NULLIF so JSON-literal null (e.g. from hand-edited DBs) is also normalized to {} -- matches the MySQL/SQLite defensive normalization. - 0094 MySQL/SQLite: aggregate per-row NULL-callback WARNING into a single summary line after the loop to avoid log spam on deployments with many legacy NULL rows. - 0080 downgrade: log a WARNING when an existing row's kwargs is not a dict (instead of silently resetting), mirroring the VARCHAR truncate warning pattern. - 0080 tests: add test_upgrade_exact_batch_boundary covering the case where rows == batch_size to exercise the loop-continuation path. * test(deadline): insert deadline ids in hex form to avoid SA Uuid type mismatch `_upgrade_mysql_sqlite` declares the deadline `id` column as `sa.Uuid()`. On SQLite the SQLAlchemy write path serializes UUID objects as a hex string without dashes. The previous test inserted ids via `str(uuid.uuid4())` (dashed form), so the SELECT correctly parsed them back to UUID objects but the subsequent UPDATE's WHERE clause produced the hex form -- no rows matched, the WHERE callback_id IS NULL filter never narrowed, and the loop spun until the 60s execution timeout. Insert ids via `uuid.uuid4().hex` so read and write round-trip cleanly. * fix(deadline): drop redundant empty-rows break in 0080 batch loops * fix(deadline): restore empty-rows break in 0080 batch loops Without this guard, an empty deadline table causes executemany with an empty parameter list, which SQLAlchemy rejects with 'A value is required for bind parameter'. Reverts the removal from e50f106. --------- (cherry picked from commit c8a6c55) Co-authored-by: Rahul Vats <43964496+vatsrahul1001@users.noreply.github.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Fixes both the
upgradeanddowngradepaths of migration0080_3_1_0_modify_deadline_callback_schema.py(808787349f22), which addedNOT NULLcolumns to thedeadlinetable without migrating existing row data, causingIntegrityErroron non-empty databases.Root cause
Upgrade (
3bda03debd04 → 808787349f22): drops the oldcallback VARCHAR(500)/callback_kwargs JSONcolumns and then doesADD COLUMN callback JSON NOT NULL. PostgreSQL raisesNotNullViolationimmediately because existing rows have no value for the new column.Downgrade (
808787349f22 → 3bda03debd04): drops the new JSONcallbackcolumn and then doesADD COLUMN callback VARCHAR(500) NOT NULL. SameNotNullViolationon PostgreSQL; MySQL silently writesNULL, which then crashes migration0094's upgrade step withjson.loads(None).Fix
Both paths now follow the same safe-migration pattern already used by migration
0094:nullable=Trueso the DDL succeeds on all supported databases.table()clause (dialect-aware JSON serialisation).NOT NULLonly after all rows have valid values.The upgrade serialises the old
(callback VARCHAR, callback_kwargs JSON)pair into the{"__data__": {"path": ..., "kwargs": ...}, "__classname__": ..., "__version__": 0}envelope that migration0094's upgrade expects.The downgrade extracts
pathandkwargsfrom that same envelope before restoring the old VARCHAR column.How to reproduce
0080(i.e. >= 3.1.0) with at least onedeadlinerow in the database.airflow db downgradepast808787349f22(orairflow db upgradefrom a pre-3.1.0 snapshot) – observeIntegrityError: NotNullViolationon PostgreSQL.Was generative AI tooling used to co-author this PR?
Claude