Data migration 0078_populate_carrier_snapshots is not production-safe (per-row save loop over 5 tables) #1123
Unanswered
mgradalska asked this question in Q&A
Replies: 0 comments
What's happening
The data migration `manager.0078_populate_carrier_snapshots` is not safe to run against a production-scale database.
Each of its five table loops runs `Model.objects.all()` followed by a per-row `.save()`, with no batching, no `bulk_update`, no chunked iterator, and no progress logging. On databases of any realistic size the migration loads entire result sets into memory and then issues one SQL `UPDATE` per row across millions of rows. Operators running real workloads cannot apply this migration cleanly without invasive workarounds.
The migration code
`modules/manager/karrio/server/manager/migrations/0078_populate_carrier_snapshots.py#L48-L76`
The Pickup block (shortest of the five, representative of all of them):
The same shape repeats for `Tracking`, `DocumentUploadRecord`, `Manifest`, and `Shipment` - five sequential O(N) loops, no shared checkpoint.
Why this is a problem
- `.all()` materializes the entire result set in memory before iteration begins. On large tables this is unbounded memory growth.
- One `UPDATE` per row. Each statement carries its own commit overhead, so wall time is dominated by transaction bookkeeping rather than the actual updates. On a live database it also competes with concurrent traffic for the same locks.
- No `bulk_update`, no `.iterator()`. The standard Django data-migration tooling is bypassed entirely.
Suggested direction
Django provides the standard tooling for migrations of this shape: `.iterator(chunk_size=N)` to avoid loading the whole table, and `bulk_update(rows, ["carrier"], batch_size=N)` to collapse N round-trips into a small number of `CASE`-mapped UPDATEs. For migrations that touch tables likely to be large in production, those are worth applying here.
Beyond this migration
The bigger concern is the pattern, not this one migration. Per-row `.save()` inside `RunPython` is the recognizable shape - if future migrations land with the same structure, they'll fail the same way. Worth treating as a class of issue rather than a one-off.
Even when a heavy data migration is written well, operators benefit from knowing it's coming. A changelog note flagging migrations that touch large tables (with a rough sense of expected runtime or resource needs) would let operators plan downtime windows, scale resources up beforehand, and avoid being surprised mid-upgrade.
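For concreteness, here is a rough sketch of the `.iterator(chunk_size=N)` plus `bulk_update` direction suggested earlier. It is not the actual migration code: `build_carrier_snapshot` is a hypothetical placeholder for whatever the real migration computes per row, the batch size of 1000 is an arbitrary guess, and the `"manager"` / `"Pickup"` names passed to `apps.get_model` are assumed from the migration path and the block named above. The helpers are kept free of Django imports so the batching logic can be exercised on its own.

```python
from itertools import islice


def batched(iterable, size):
    """Yield lists of at most `size` items without materializing the input."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def backfill(rows, bulk_update, build_snapshot, batch_size=1000, log=print):
    """Set `carrier` on each row and flush one bulk_update call per batch."""
    done = 0
    for chunk in batched(rows, batch_size):
        for row in chunk:
            row.carrier = build_snapshot(row)
        # One bulk_update call per batch replaces batch_size separate save() calls.
        bulk_update(chunk, ["carrier"])
        done += len(chunk)
        log(f"carrier snapshots: {done} rows updated")
    return done


def build_carrier_snapshot(row):
    # Hypothetical placeholder: the real snapshot logic lives in the migration.
    raise NotImplementedError


def forwards(apps, schema_editor):
    # Meant to be wired up as migrations.RunPython(forwards, ...).
    # App label and model name are assumptions, not taken from the source file.
    Pickup = apps.get_model("manager", "Pickup")
    rows = Pickup.objects.all().iterator(chunk_size=1000)
    backfill(rows, Pickup.objects.bulk_update, build_carrier_snapshot)
```

The same `backfill` call could be repeated for each of the other four models. The periodic log line gives operators at least a coarse progress signal, though it does not by itself provide a resumable checkpoint.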