Allow custom RevisionMap subclass for pluggable revision ordering #1807
Replies: 11 comments
-
|
Hi, I'm not sure you use case can work by just changing the topological sort since there are other instances what depend on the revision.down_revision attribute, but we can look at the pull request. Note that your proposed strategy is not safe in general, since you generally cannot automatically order your migrations using their git commit data, since you may miss migrations if a revision A took one day to merge while another created before A took a week to merge. @zzzeek what do you think? |
Beta Was this translation helpful? Give feedback.
-
|
Thanks for the thoughtful response, @CaselIT. You raise a good point about Our current problem:
What we want to build (the full solution, not just this PR): The
On your concern about ordering safety: You're right that git commit date ordering is not universally safe. A migration that took a week to merge could end up ordered after one that was created later but merged sooner. However, in our case these are independent, non-conflicting migrations. If migration A adds a The ordering would always be fully dynamic. Developers would never set CI as a safety net: We also run the full migration suite in CI both in pre-merge pipelines and in the merge queue. So in the rare case where PR 1's migration actually breaks PR 2's migration when run in sequence, PR 2 would get kicked from the merge queue with a meaningful migration error, the same way any other real conflict surfaces. The difference from today is that PRs only get ejected when there's an actual problem, not just because two independent migrations happened to land concurrently. Does that clarify the use case? We fully understand the risks and accept that the ordering strategy is our responsibility. We just need the hook to plug it in. |
Beta Was this translation helpful? Give feedback.
-
|
I think you misunderstood what I was saying. The ordering is not deterministic using git commit date: State 0: B If you deployed to a db state 1, then you go to update to state 2 alembic would see no change is need, but migration Y wan never applied. That said, if you are fine with it that's ok ok my part. |
Beta Was this translation helpful? Give feedback.
-
|
I think I understand your concern now, and it's a valid one if the ordering were based on the original commit date from the developer's branch. Let me clarify what we'd actually use. We use merge commits on GitHub. When a PR merges (or enters the merge queue), the merge commit on In your example:
The merge commit timestamps on I should have been more precise in my earlier comment. When I said "the date each migration file was first committed" I meant the date it first appears in |
Beta Was this translation helpful? Give feedback.
-
|
if there are no force pushes in main I guess it should be ok |
Beta Was this translation helpful? Give feedback.
-
|
Yeah, we don't allow any force pushes, period. Everything is governed via GitHub rulesets and even "forces" must go through GitHub which still re-stamps via squash merge commit. |
Beta Was this translation helpful? Give feedback.
-
|
though of a drawback: you really just have the final ordering when you merge in main, if you ever need a feature branch then it doesn't work |
Beta Was this translation helpful? Give feedback.
-
|
100% agree but that's even true with typical alembic migrations, ie if a feature branch is out of date it has the latest migrations from when it was first branched plus the newest migrations. It's pretty similar with commit based ordering. There may be some nuance or details I'm missing but I agree that the edge case exists. Our goal is to use git commit history as the source of truth and that is per branch. For example if we were to create a long lived branch and deploy off of it that would have a distinct and separate order from any other branch (from the time of the branch that is). |
Beta Was this translation helpful? Give feedback.
-
|
i have a lot of thoughts on this and the first one is that it seems like you are proposing a system where your code would not be able to run without being present inside of a git checkout. isnt that a serious issue? beyond that this seems to be more appropriately addressed using merge hooks to automate the renumbering of alembic revision files to be based on the apparent git hashes in the final source tree. just because developers are doing this manually now doesnt mean it cant be automated and this automation doesnt seem any more complex than doing it at runtime based on what's present in .git, and in fact it seems vastly safer to ensure the source tree has a fixed ordering rather than one that can change dynamically if any kind of manipulations to the .git history occur. |
Beta Was this translation helpful? Give feedback.
-
|
ive made this general suggestion to Claude (opus) and it proposes three approaches based on github actions so far. I think an approach that creates a safe, static migration directory based on GH actions adapted to your workflow would be the way to go here. see claude output below Issue #1804 — Automated Migration Ordering in Merge QueuesThe ProblemIn a high-velocity monorepo with a GitHub merge queue, two PRs that each Why a Custom
|
Beta Was this translation helpful? Give feedback.
-
|
Thanks for the detailed response @zzzeek. I want to address a few things. On the Totally fair point. Our plan was never to require On the GitHub Actions solutions: I appreciate the suggestions, but unfortunately I don't think any of them actually solve the problem in a merge queue context. I think the LLM you used was being too agreeable here and not pressure-testing these against the specific constraints of a high-velocity merge queue. Let me walk through why: Solution 1 Part A (fix working tree in merge queue CI): The merge queue creates a temporary merge commit, runs CI on it, and then merges the original PR branch into main. Modifications to the working tree during CI are thrown away. The merged code on main still has the wrong Solution 1 Part B (post-merge fixup commit): This pushes directly to main via Solution 2 (auto-merge heads post-merge): Same problems as Solution 1 Part B. Direct push to main, race windows, concurrent conflicts. Solution 3 (PR bot that pre-rebases): This doesn't help with the merge queue. Two PRs can both be updated to point at the same head on main, both enter the merge queue, and the second one still creates two heads. This is exactly the problem we already have today. The core issue: All three solutions try to maintain a static Dynamic ordering at the I understand if this isn't something you want to support in Alembic core. But I want to make sure the problem is clearly understood, because the suggested alternatives don't work for our use case. |
Beta Was this translation helpful? Give feedback.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
-
Problem
We run a high-velocity monorepo where multiple teams land migrations concurrently. In our merge queue, when two PRs each add a migration, the second one to merge discovers there are now two Alembic heads. This kicks the PR out of the merge queue and requires the author to manually rebase and update their migration's
down_revisionto point at the new head, re-push, and re-enter the queue. With many developers contributing migrations, this compounds — the more PRs in flight, the more frequently authors get ejected and have to rebase, creating a frustrating cycle of churn.We want to solve this by using git history to automatically determine migration ordering — when concurrent migrations create multiple heads, order them by when they were first introduced in git, so that the merge queue can resolve the ordering deterministically without human intervention. This requires being able to swap in a custom
RevisionMapsubclass that overrides_topological_sort().Today there is no way to customize the ordering behavior without monkey-patching internals. The
RevisionMapclass is instantiated directly insideScriptDirectory.__init__()with no way to substitute a subclass.Proposed Solution
Add a
revision_map_classconfiguration option toScriptDirectorythat accepts a dotted Python path to aRevisionMapsubclass. This would allow external packages (e.g. a hypotheticalalembic-git-ordering) to provide custom ordering logic by overriding_topological_sort()or other methods, without any changes to Alembic core.This is a minimal, non-breaking change: a single new keyword argument with a
Nonedefault, and a config option that is entirely opt-in.Use Cases
down_revisioneach time, which compounds with team size. A git-history-awareRevisionMapcould order unrelated migrations deterministically, eliminating this manual churn.Current Workarounds
The only current options are:
RevisionMaporScriptDirectoryinenv.py— fragile and breaks across Alembic upgradesdown_revisionevery time a PR gets ejected — the status quo we're trying to escapeWe have a PR ready at #1805 that implements this.
Beta Was this translation helpful? Give feedback.
All reactions