Fixed #24529 - Allow double squashing of migrations #14380

Closed
wants to merge 2 commits into from

Conversation

@rtpg (Contributor) commented May 11, 2021

Related Trac ticket

In order to support multi-level squashing, we need to be a bit smarter about how we traverse replacements. The solution here introduces some extra checks on squashed migrations (mainly a lookup for whether its replacements need to be squashed first), but the performance hit shouldn't be very large.

This is missing documentation updates and changelog entries, but I would like to get a first look at the implementation strategy here, as well as field any extra testing requests before going further down this path.
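As a rough illustration of "a lookup for whether its replacements need to be squashed first", the sketch below orders squashed migrations so that an inner squash is always handled before the squash that replaces it. This is not the code in this PR: the `REPLACEMENTS` mapping, the `1_squashed_5` name, and the `replacement_order` helper are hypothetical, and cycle reporting is left out.

```python
# Hypothetical data: each squashed migration maps to the keys it replaces.
# Plain (unsquashed) migrations never appear as keys.
REPLACEMENTS = {
    ("migrations", "3_squashed_5"): [
        ("migrations", "3"),
        ("migrations", "4"),
        ("migrations", "5"),
    ],
    ("migrations", "1_squashed_5"): [
        ("migrations", "1"),
        ("migrations", "2"),
        ("migrations", "3_squashed_5"),
    ],
}


def replacement_order(replacements):
    """Return squashed-migration keys with inner squashes first."""
    ordered = []
    seen = set()

    def visit(key):
        # Skip plain migrations and anything already handled.
        if key in seen or key not in replacements:
            return
        seen.add(key)
        # Handle squashed migrations that this one replaces before itself.
        for child in replacements[key]:
            visit(child)
        ordered.append(key)

    # Sort the starting keys so the result does not depend on dict order.
    for key in sorted(replacements):
        visit(key)
    return ordered


print(replacement_order(REPLACEMENTS))
```

With this data, `3_squashed_5` comes out before `1_squashed_5`, even though the outer squash is visited first.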

@rtpg changed the title from "#24529 - Allow double squashing of migrations" to "Fixed #24529 - Allow double squashing of migrations" on May 17, 2021
@jacobtylerwalls (Member) left a comment

Hi @rtpg, thanks for this. I left some comments to move this forward. The main thing at this point is that we will need some tests that show the necessity of your changes in the loader. I would expect some changes to the loader to be necessary, but since the test passes without them, the test coverage looks lacking. You might look at the original failure case from ticket-23090 where the restriction against double-squashing was introduced.

Make sure to keep the ticket flags updated (uncheck "Needs ..." flags to become visible for re-review.) Happy to help out as you iterate--thanks again for the patch.

if migration.replaces:
    replaces.extend(migration.replaces)
else:
    replaces.append((migration.app_label, migration.name))

Member:

I checked your regression test; it fails on main, as it should. However, when applying only your changes to the squashmigrations.py command (here) and not your changes to the migration loader, your test case passes.
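For readers following the thread, here is a self-contained sketch of what the snippet under review does when one of the migrations being squashed is itself a squashed migration. The `Migration` dataclass and the `collect_replaces` wrapper are hypothetical stand-ins for illustration; only the `if`/`else` body mirrors the diff above.

```python
from dataclasses import dataclass, field


@dataclass
class Migration:
    """Hypothetical stand-in carrying only the attributes used below."""

    app_label: str
    name: str
    replaces: list = field(default_factory=list)


def collect_replaces(migrations_to_squash):
    """Build the `replaces` list for a new squashed migration."""
    replaces = []
    for migration in migrations_to_squash:
        if migration.replaces:
            # Already a squashed migration: carry over what it replaced.
            replaces.extend(migration.replaces)
        else:
            # Plain migration: it is replaced directly.
            replaces.append((migration.app_label, migration.name))
    return replaces


squashed = Migration("app", "0001_squashed_0002", [("app", "0001"), ("app", "0002")])
plain = Migration("app", "0003")
print(collect_replaces([squashed, plain]))
```

In this sketch the earlier squashed migration contributes the migrations it replaced rather than its own key.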

Contributor Author:

Alright, I looked into this a bit today.

So my main dilemma is that, at least running locally, the order of migration loading (for disk migrations) is non-deterministic in the old version of the loader. In attempting to make a failing test for this, I found that the existing test does fail sometimes with the old code without the migration loader. But only sometimes!

It depends on the iteration order of the migrations when the disk migrations are loaded in load_disk. In particular, the names are loaded into a set (that is where the non-determinism comes from, I think, since dictionary iteration should in theory be stable over runs).

So I think it's not really possible to make a test case that would fail consistently with the old code, while being a correct migration graph in the new model.

Would it be good to say that the added test_squashmigrations_squashes_already_squashed test (which fails non-deterministically with the old logic) covers the main logic and the "happy path", but that I still should add one test for the cycle tracking in the replacement logic?
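To make the set-ordering point concrete: Python randomizes `str` hashes per process unless `PYTHONHASHSEED` is fixed, so iterating a set of names can produce a different order from one run to the next. A minimal sketch, with hypothetical migration names and no claim about what the loader should do:

```python
# Hypothetical migration names, collected into a set as described above.
migration_names = {"0001_initial", "0002_second", "3_squashed_5"}

# Iteration order of a set of strings may change between interpreter runs,
# because str hashing is randomized unless PYTHONHASHSEED is fixed.
unordered = list(migration_names)

# Sorting first yields the same order on every run.
ordered = sorted(migration_names)
print(ordered)
```

Running a test with a fixed `PYTHONHASHSEED` is another way to reproduce one particular ordering locally.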

Member:

I'm just getting up to speed again, but I do see that reverting the loader changes now leads to a consistent failure (good):

django.db.migrations.exceptions.NodeNotFoundError: Unable to find replacement node ('migrations', '3_squashed_5'). It was either never added to the migration graph, or has been removed.

@felixxm (Member) commented Mar 3, 2022

@rtpg Do you have time to keep working on this?

@rtpg (Contributor Author) commented Mar 3, 2022

Hey @felixxm, thanks for the ping; this one fell by the wayside. I will find some time to work on this, unless someone else wants to pick it up and run with it, of course. I just want the feature in place.

        memo[arg] = [True, result]
        return result

    return wrapped_func

Contributor Author:

I wrote this but now feel like it's a bit "engineered", especially given that it's harder to write nice error messages this way. I now think I should just put memoization and cycle tracking into the has_been_applied helper function directly.

Contributor Author:

Also this sort of function probably belongs in utils (if it belongs at all)
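One possible shape for putting memoization and cycle tracking directly into the helper is sketched below. This is not the code in this PR: the `make_has_been_applied` factory, the `replacements` and `applied` inputs, the rule that a squashed migration counts as applied when everything it replaces is applied, and the error wording are all assumptions made for illustration.

```python
def make_has_been_applied(replacements, applied):
    """Build a memoized checker with cycle detection.

    `replacements` maps a squashed key to the keys it replaces and
    `applied` is the set of keys recorded as applied (both hypothetical).
    """
    memo = {}
    in_progress = set()

    def has_been_applied(key):
        if key in memo:
            return memo[key]
        if key in in_progress:
            # Reached again before finishing: the replacements form a cycle.
            raise ValueError(f"Cyclical squash replacement found at {key!r}")
        if key in applied:
            memo[key] = True
            return True
        children = replacements.get(key)
        if not children:
            # Plain migration that is not recorded as applied.
            memo[key] = False
            return False
        in_progress.add(key)
        try:
            result = all(has_been_applied(child) for child in children)
        finally:
            in_progress.discard(key)
        memo[key] = result
        return result

    return has_been_applied


check = make_has_been_applied(
    {"1_squashed_5": ["1", "2", "3_squashed_5"], "3_squashed_5": ["3", "4", "5"]},
    applied={"1", "2", "3", "4", "5"},
)
print(check("1_squashed_5"))
```

Keeping the bookkeeping inside the helper means the error message can name the migration where the cycle was noticed, which a generic decorator makes harder.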


def do_replacement(key):
    # Toggle here to test between old code path and new one
    NEW_WAY = True

Contributor Author:

A helper boolean for me to toggle between the old and new code paths while exploring how to write a regression test (it turns out the existing test fails sporadically in practice).


with open(squashed_migration_file, "r", encoding="utf-8") as fp:
    content = fp.read()
    # HACK Really I would just like to import the migration and do a Real Python List Comparison

Contributor Author:

I was looking around and it looks like most of the migration tests are doing these sorts of string comparisons (along with the WITH_BLACK toggles to make the formatting-dependent expected cases)

@rtpg (Contributor Author) commented Mar 14, 2022

Alright, I spent some time looking at this (mainly confirming that the existing test does indeed fail sometimes with the old code). I would just like confirmation that my game plan (add one test for the cycle detection code, clean up the lint issues) sounds good.

I also am a bit unsatisfied with various little lines of code here, though it's not so much correctness as it is legibility. Searching for strings in the generated migrations feels weird to be honest. Also a bit lost as to how to make that memoization code very clean...

In short, a bit of guidance would be appreciated

@rtpg (Contributor Author) commented Mar 16, 2022

Added the missing test and tried to clean up the code to the best of my abilities. Will look at changelog/doc editing tomorrow, a cursory search didn't lead me to any obvious documentation changes, but I know there has to be something.

@jacobtylerwalls (Member) left a comment

Thanks for the updates. I haven't looked at the memoization code in detail, but did leave some comments to keep this moving.

> Will look at changelog/doc editing tomorrow, a cursory search didn't lead me to any obvious documentation changes, but I know there has to be something.

The admonition in the docs needs re-writing, if it no longer applies (in full):

.. note::
    Once you've squashed a migration, you should not then re-squash that squashed
    migration until you have fully transitioned it to a normal migration.

Also, a new feature earns a small note in docs/releases/4.1.txt.

if visited:
    # we visited this node but have not finished the replacement
    # this means we have a circular dependency
    raise ValueError(

Member:

Can this be CommandError?


# we expect to hit a squash replacement cycle check error, but the actual
# failure is dependent on the order in which the files are read on disk.
self.assertTrue(

Member:

style: maybe assertIn?

Member:

Also, maybe assertRaisesRegex() would help you specify the small ambiguity (squashed|auto) and collapse all of this into one assertion.

Contributor Author:

went with assertRaisesRegex
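For reference, a minimal sketch of how a single assertRaisesRegex() call can absorb the "(squashed|auto)" ambiguity. The `run_squash` function, the error message wording, and the migration names are hypothetical stand-ins; the exception type follows the `ValueError` shown in the diff, not a final decision on `CommandError`.

```python
import unittest


def run_squash():
    """Hypothetical stand-in for the call under test."""
    raise ValueError(
        "Cyclical squash replacement found, starting at ('migrations', '2_auto')"
    )


class CycleTests(unittest.TestCase):
    def test_cycle_is_reported(self):
        # The alternation accepts whichever migration is reached first,
        # so the test does not depend on file read order.
        pattern = r"\('migrations', '2_(squashed|auto)'\)"
        with self.assertRaisesRegex(ValueError, pattern):
            run_squash()


if __name__ == "__main__":
    unittest.main()
```

The pattern is applied as a regular expression search against the exception message, so the parentheses in the tuple representation need escaping.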

Comment on lines 2615 to 2619
# HACK Really I would just like to import
# the migration and do a Real Python List Comparison
#
# Check the replaces list, while trying to normalize the text
# independently of whether Black is in place.

Member:

Have you considered using the MigrationLoader? See e.g. test_loading_squashed().

Contributor Author:

went with that, thanks for the pointer!

@rtpg force-pushed the main branch 3 times, most recently from 5267297 to 9ca0edf on April 6, 2022, 02:00

Allow squashed migrations to also be squashed

 In order to support multi-level squashing, we need to be a bit smarter
 about how we traverse replacements. The solution here introduces some
 extra checks on squashed migrations (mainly a lookup for whether its
 replacements need to be squashed first), but the performance hit
 shouldn't be very large.

Try out some changes for discussion

Add a test confirming loop handling

Fix line length

Add documentation for double squashing of migrations

Fix up isort

@rtpg (Contributor Author) commented Apr 20, 2022

I believe I have answered all the outstanding questions on this branch, so it should be ready for a re-review!

Co-authored-by: Jacob Walls <jacobtylerwalls@gmail.com>

@felixxm (Member) commented May 16, 2022

@rtpg Thanks for this patch 👍 I have an issue in the following scenario:

  • app with 3 migrations:
    • 0001_initial.py
    • 0002_mymodel1_field_1_mymodel2_field_2_and_more.py
    • 0003_alter_mymodel2_unique_together.py
  • steps:
    • apply migrations 0001 and 0002: python manage.py migrate test_one 0002
    • squash migrations 0001-0003: python manage.py squashmigrations test_one 0001 0003
    • make a change in the models definition and create a new migration file 0004_remove_mymodel1_field_1_mymodel1_field_3_and_more.py: python manage.py makemigrations
    • squash migrations 0001-0004: python manage.py squashmigrations test_one 0001_initial_squashed 0004
      Traceback (most recent call last):
       File "manage.py", line 22, in <module>
         main()
       File "manage.py", line 18, in main
         execute_from_command_line(sys.argv)
       File "/django/django/core/management/__init__.py", line 446, in execute_from_command_line
         utility.execute()
       File "/django/django/core/management/__init__.py", line 440, in execute
         self.fetch_command(subcommand).run_from_argv(self.argv)
       File "/django/django/core/management/base.py", line 402, in run_from_argv
         self.execute(*args, **cmd_options)
       File "/django/django/core/management/base.py", line 448, in execute
         output = self.handle(*args, **options)
       File "/django/django/core/management/commands/squashmigrations.py", line 100, in handle
         start = loader.get_migration(
       File "/django/django/db/migrations/loader.py", line 144, in get_migration
         return self.graph.nodes[app_label, name_prefix]
      KeyError: ('test_one', '0001_initial_squashed_0003_alter_mymodel2_unique_together')
      

Sample project: ticket_24529.zip.

@felixxm (Member) commented Mar 15, 2024

Closing due to inactivity.

@felixxm closed this on Mar 15, 2024