Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add StreamField migration support #4755

Closed
wants to merge 1 commit into from

Conversation

ErwinJunge
Copy link

I've made an attempt at adding support for StreamField block changes to django migrations. This should fix #2110

The idea is to provide automatically generated boilerplate for the developer to specify how to modify the existing data from the old structure to the new structure.

Comments very welcome :)

@gasman
Copy link
Collaborator

gasman commented Aug 29, 2018

Excellent, thanks @ErwinJunge! Great to see this - migrations for StreamField have been a bugbear for lots of people.

Looks like it might take a while to fully review this, but I'll try not to keep you hanging too long :-)

@chosak
Copy link
Member

chosak commented Sep 12, 2018

@ErwinJunge thanks very much for opening this. Migrating StreamFields has long been a pain point.

I worry that wrapping the Django makemigrations command with a custom one as you've done here would be hard to maintain and prone to conflict if/when the Django code changes in future. Wrapping it like this seems to require duplicating quite a bit of the Django code.

It's too bad that there aren't any hooks in Django for automatically generating custom Operation steps, but the fact that this doesn't exist makes me think it's probably not a good idea.

What if, instead, we provided a separate Wagtail management command explicitly for handling the data migration required by a StreamField change? Say, manage.py makesfmigration. This command could be invoked by developers independently from, and after, a call to makemigrations. The output of this command would ideally be a second migration containing one or more RunPython steps that made the necessary data changes.

A somewhat simple approach might be: new makesfmigration command that programmatically calls makemigrations --empty to generate an empty migration, then picks up and edits that migration file to add one or more RunPython steps depending on the logic.

As a possible first step I'd be curious how clever we can get with processing the field schema diff directly. While, ideally, it'd be great if the logic could detect what kind of change is being made: an addition, a deletion, a rename, etc, that might be hard to do. One possible benefit of using a separate command is that we could provide logic hints via command-line parameter if needed/useful. So one might invoke, say, manage.py makesfmigration myapp.mymodel.myfield --delete parent_block.sub_block.

Unfortunately, even the practical step of generating a migration programmatically seems to necessarily involve some hacking. There's no easy way right now to interact with makemigrations programmatically, for example to get back the migration file that was created (see Django ticket #29026 that discusses this).

@ErwinJunge
Copy link
Author

ErwinJunge commented Sep 18, 2018

Hi @chosak,

First, thank you for the feedback.

I agree that overriding makemigrations requires quite a bit of code duplication from core django. However, getting rid of the duplication would require modifications to core django. Getting changes to core django merged is a complicated process that is really helped by having a good real-world example of how it would help developers. If we merge this, a good next step would be to make the required changes in django, get those merged and then we can refactor this. In the meantime, makemigrations is quite a stable function so the expected maintenance burden is low.

You suggestion of using a seperate command is actually where I started building this. Sadly, that is a dead end. If we first create a regular schema migration and then an empty datamigration, the datamigration's from_state will already have the modified streamfield schema information. Any db calls to those will not return the old data that needs to be reshaped, but will return the data as interpreted by the new streamfield schema (with a lot of missing stuff).

Regarding building a streamfield schema diff: I've already put before/after/diff information in the generated code to help guide the developer in what to do. Writing code that pregenerates the required data modifications based on the detected changes (like django does on a field rename) could be a good feature for future development. Let's start with the basics.

Long story short: I know this method has shortcomings, but I think that at this time it's the only way that we can support this functionality. I propose to merge this and use it as a starting point for future improvements.

@chosak
Copy link
Member

chosak commented Sep 18, 2018

@ErwinJunge thanks for the reply. Regarding this:

Any db calls to those will not return the old data that needs to be reshaped, but will return the data as interpreted by the new streamfield schema (with a lot of missing stuff).

Is this something that #4727 would help with? This would allow you to read "lazy" StreamField data directly as stored in the DB using get_prep_value, without first going through the Python conversion that drops deprecated fields.

@ErwinJunge
Copy link
Author

@ErwinJunge thanks for the reply. Regarding this:

Any db calls to those will not return the old data that needs to be reshaped, but will return the data as interpreted by the new streamfield schema (with a lot of missing stuff).

Is this something that #4727 would help with? This would allow you to read "lazy" StreamField data directly as stored in the DB using get_prep_value, without first going through the Python conversion that drops deprecated fields.

It probably would, that's a nice improvement to wagtail :)

However, from a developer usability perspective I think extending makemigrations is preferable as it provides the way of least resistance. The developers are already used to typing makemigrations and if we have to remind them in the docs somewhere to also type makesfmigrations or something similar they're bound to forget.

I think we could potentially meet in the middle. During development of this feature I played with the idea of having makemigrations create 2 files with one call. First the regular django one and then an automatically generated datamigration with some prefilled commands. That would make for a much cleaner way of dealing with django makemigrations (it'd just be a super call and then some added stuff for the second file) while still keeping the same developer-facing interface as regular django.

On the other hand, I've come to quite like the end-result of just having a single migration file that deals with this. I'd really like the code to be a bit cleaner though. I'd love to get some modifications into django that would allow us to do the things I've done with a much cleaner interface. Personally, that's my preferred outcome of all of this.

@gasman care to weigh in on the discussion?

@gasman gasman modified the milestones: 2.3, 2.4 Oct 10, 2018
@gasman gasman modified the milestones: 2.4, 2.5 Dec 6, 2018
@gasman gasman modified the milestones: 2.5, 2.6 Apr 5, 2019
Copy link

@auvipy auvipy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needed to be targetted for 2.6

@ErwinJunge
Copy link
Author

needed to be targetted for 2.6

I don't understand this comment. Is there something I need to do to get this merged?

@auvipy
Copy link

auvipy commented May 1, 2019

please rebase on top of master :)

@gasman
Copy link
Collaborator

gasman commented May 1, 2019

Hi @auvipy - just so you're aware, we don't usually expect pull request contributors to keep their PRs up to date with master. If there's been a delay in reviewing and merging the PR, as there has been here, then that's entirely the responsibility of the core team, and it's not fair to ask contributors to do extra work to make up for that failing :-)

@auvipy
Copy link

auvipy commented May 1, 2019

I understand :) I do agree with your point.

@chosak
Copy link
Member

chosak commented May 1, 2019

Before merging this I'd like to invite @gasman to weigh in on the discussion above regarding customizing makemigrations versus implementing as a separate command. My preference would still be the latter although I realize it's unfair to make that argument without having an alternate implementation to propose!

@ErwinJunge
Copy link
Author

Rebased on top of master :)

Copy link

@auvipy auvipy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might need doc before merge.

@gasman gasman added this to the 2.7 milestone Jul 9, 2019
@gasman gasman modified the milestones: 2.7, 2.8 Oct 28, 2019
@gasman gasman modified the milestones: 2.8, 2.9 Jan 20, 2020
@gasman gasman modified the milestones: 2.9, 2.10 Apr 20, 2020
@gasman gasman modified the milestones: 2.10, 2.11 Jul 15, 2020
@gasman gasman modified the milestones: 2.11, 2.12 Oct 21, 2020
@gasman gasman modified the milestones: 2.12, 2.13 Jan 15, 2021
Base automatically changed from master to main March 3, 2021 17:08
@gasman gasman modified the milestones: 2.13, 2.14 Apr 19, 2021
@gasman gasman modified the milestones: 2.14, 2.15 Jul 13, 2021
@gasman gasman modified the milestones: 2.15, 2.16 Oct 14, 2021
@gasman gasman modified the milestones: 2.16, 2.17 Jan 12, 2022
@auvipy
Copy link

auvipy commented Apr 12, 2022

what is needed to complete this? any hint?

@ErwinJunge
Copy link
Author

what is needed to complete this? any hint?

🤷

Summary: I don't know why this is still stuck 😅, but I'd be happy to push it over the finish line.

@gasman gasman modified the milestones: 3.0, 4.0 Apr 14, 2022
@gasman gasman modified the milestones: 4.0, 4.1 Aug 11, 2022
@jacobtoppm
Copy link
Member

jacobtoppm commented Oct 6, 2022

Hey @ErwinJunge, thanks for your work on this! This has been something we've built on and tackled further as part of Google Summer of Code this year, initially as a third party package: see @sandilsranasinghe's update #8156 (comment). This includes a separate management command to autogenerate basic streamfield migrations to avoid the necessity to override large parts of Django's. Currently it only tackles rename and delete operations automatically, but it includes manual operations for many more changes.

We're hoping to move it into Wagtail core over the next few releases, starting with the operations, and including autodetection as it becomes more fully featured. We'd really appreciate any testing and feedback of the third party package in the meantime though - it's now released on PyPI, so please give it a try. I'm going to close this one for now as I think the package's implementation has built on this, and it will eventually be moved into core that way, but thanks again for all your work here - it was definitely an inspiration for us tackling this.

@jacobtoppm jacobtoppm closed this Oct 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
Roadmap
Planned for 2.4
Development

Successfully merging this pull request may close these issues.

Migrating a Block within a StreamField
6 participants