New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed #24529 - Allow double squashing of migrations #14380
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @rtpg, thanks for this. I left some comments to move this forward. The main thing at this point is that we will need some tests that show the necessity of your changes in the loader. I would expect some changes to the loader to be necessary, but since the test passes without them, the test coverage looks lacking. You might look at the original failure case from ticket-23090 where the restriction against double-squashing was introduced.
Make sure to keep the ticket flags updated (uncheck "Needs ..." flags to become visible for re-review.) Happy to help out as you iterate--thanks again for the patch.
if migration.replaces: | ||
replaces.extend(migration.replaces) | ||
else: | ||
replaces.append((migration.app_label, migration.name)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked your regression test; it fails on main
, as it should. However, when applying only your changes to the squashmigrations.py
command (here) and not your changes to the migration loader, your test case passes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, I looked into this a bit today.
So my main dilemma is that, at least running locally, the order of migration loading (for disk migrations) is non-deterministic in the old version of the loader. In attempting to make a failing test for this, I found that the existing test does fail sometimes with the old code without the migration loader. But only sometimes!
It's dependent on the ordering of the iteration of the migrations when the disk migrations are loaded in load_disk
. In particular the names are loaded into a set (that is where the non-determinism comes I think, since dictionary iteration should in theory be stable over runs).
So I think it's not really possible to make a test case that would fail consistently with the old code, while being a correct migration graph in the new model.
Would it be good to say that the added test_squashmigrations_squashes_already_squashed
test (which fails non-deterministically with the old logic) covers the main logic and the "happy path", but that I still should add one test for the cycle tracking in the replacement logic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm just getting up to speed again, but I do see that reverting the loader changes now leads to a consistent failure (good):
django.db.migrations.exceptions.NodeNotFoundError: Unable to find replacement node ('migrations', '3_squashed_5'). It was either never added to the migration graph, or has been removed.
@rtpg Do you have time to keep working on this? |
Hey @felixxm, thanks for the ping, this one fell by the wayside. Will find some time to work on this (unless someone else wants to pick it up and run with it, of course. I just want the feature in place) |
django/db/migrations/loader.py
Outdated
memo[arg] = [True, result] | ||
return result | ||
|
||
return wrapped_func |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wrote this but now feel like it's a bit "engineered" (especially given that it's harder to write nice error messages this way), thinking now that I should just put in memoization and cycle tracking into the has_been_applied
helper function directly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also this sort of function probably belongs in utils (if it belongs at all)
django/db/migrations/loader.py
Outdated
|
||
def do_replacement(key): | ||
# Toggle here to test between old code path and new one | ||
NEW_WAY = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a helper boolean for me to toggle between the old and new ways when exploring how to write a regression test (turns out the existing test would fail sporadically in practice)
tests/migrations/test_commands.py
Outdated
|
||
with open(squashed_migration_file, "r", encoding="utf-8") as fp: | ||
content = fp.read() | ||
# HACK Really I would just like to import the migration and do a Real Python List Comparison |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was looking around and it looks like most of the migration tests are doing these sorts of string comparisons (along with the WITH_BLACK
toggles to make the formatting-dependent expected cases)
Alright I spent some time looking at this (mainly identifying that the existing test does indeed fail with the old code sometimes), would just like some confirmation about whether my game plan (to add one test for the cycle detection code, clean up the lint issues) sounds like a good game plan here. I also am a bit unsatisfied with various little lines of code here, though it's not so much correctness as it is legibility. Searching for strings in the generated migrations feels weird to be honest. Also a bit lost as to how to make that memoization code very clean... In short, a bit of guidance would be appreciated |
Added the missing test and tried to clean up the code to the best of my abilities. Will look at changelog/doc editing tomorrow, a cursory search didn't mead me to any obvious documentation changes, but I know there has to be something. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the updates. I haven't looked at the memoization code in detail, but did leave some comments to keep this moving.
Will look at changelog/doc editing tomorrow, a cursory search didn't mead me to any obvious documentation changes, but I know there has to be something.
The admonition in the docs needs re-writing, if it no longer applies (in full):
.. note::
Once you've squashed a migration, you should not then re-squash that squashed
migration until you have fully transitioned it to a normal migration.
Also, a new feature earns a small note in docs/releases/4.1.txt
.
django/db/migrations/loader.py
Outdated
if visited: | ||
# we visited this node but have not finished the replacement | ||
# this means we have a circular dependency | ||
raise ValueError( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be CommandError
?
tests/migrations/test_commands.py
Outdated
|
||
# we expect to hit a squash replacement cycle check error, but the actual | ||
# failure is dependent on the order in which the files are read on disk. | ||
self.assertTrue( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: maybe assertIn
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, maybe assertRaisesRegex()
would help you specify the small ambiguity (squashed|auto
) and collapse all of this into one assertion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
went with assertRaisesRegex
tests/migrations/test_commands.py
Outdated
# HACK Really I would just like to import | ||
# the migration and do a Real Python List Comparison | ||
# | ||
# Check the replaces list, while trying to normalize the text | ||
# independently of whether Black is in place. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you considered using the MigrationLoader? See e.g. test_loading_squashed()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
went with that, thanks for the pointer!
if migration.replaces: | ||
replaces.extend(migration.replaces) | ||
else: | ||
replaces.append((migration.app_label, migration.name)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm just getting up to speed again, but I do see that reverting the loader changes now leads to a consistent failure (good):
django.db.migrations.exceptions.NodeNotFoundError: Unable to find replacement node ('migrations', '3_squashed_5'). It was either never added to the migration graph, or has been removed.
5267297
to
9ca0edf
Compare
In order to support multi-level squashing, we need to be a bit smarter about how we traverse replacements. The solution here introduces some extra checks on squashed migrations (mainly a lookup for whether its replacements need to be squashed first), but the performance hit shouldn't be very large. Allow squashed migrations to also be squashed In order to support multi-level squashing, we need to be a bit smarter about how we traverse replacements. The solution here introduces some extra checks on squashed migrations (mainly a lookup for whether its replacements need to be squashed first), but the performance hit shouldn't be very large. Try out some changes for discussion Allow squashed migrations to also be squashed In order to support multi-level squashing, we need to be a bit smarter about how we traverse replacements. The solution here introduces some extra checks on squashed migrations (mainly a lookup for whether its replacements need to be squashed first), but the performance hit shouldn't be very large. Try out some changes for discussion Add a test confirming loop handling Fix line length Add documentation for double squashing of migrations Fix up isort
I believe to have answered all the outstanding questions on this branch, so it should be ready for a re-review! |
Co-authored-by: Jacob Walls <jacobtylerwalls@gmail.com>
@rtpg Thanks for this patch 👍 I have an issue in the following scenario:
Sample project: ticket_24529.zip. |
Closing due to inactivity. |
Related Trac ticket
In order to support multi-level squashing, we need to be a bit smarter about how we traverse replacements. The solution here introduces some extra checks on squashed migrations (mainly a lookup for whether its replacements need to be squashed first), but the performance hit shouldn't be very large.
This is missing documentation updates and changelog entries, but I would like to get a first look at the implementation strategy here, as well as field any extra testing requests before going further down this path.