New rules for migrations #341

Closed
ewjoachim opened this issue Oct 19, 2020 · 13 comments · Fixed by #342
Labels
Issue appropriate for: People up for a challenge 🤨 This issue probably will be challenging to tackle Issue contains: Exploration & Design decisions 🤯 We don't know how this will be implemented yet Issue contains: Some SQL 🐘 This features require changing the SQL model Issue type: Process ⚙️ Things that have to do with how procrastinate is maintained

Comments

@ewjoachim
Member

ewjoachim commented Oct 19, 2020

See #100 for our first iteration.

Why

We need to provide a way to run migrations while procrastinate runs, which means we need each version of the schema to be compatible with at least 2 versions of the code (and likewise, each version of the code to be compatible with at least 2 versions of the schema).

How

  • By setting clear rules on what version of the schema should work with what version of the code
  • By testing that it's the case
  • By documenting the upgrade strategy

We can be quite restrictive at the start (e.g. only upgrading versions 1 by 1 is supported).
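To make the "testing that it's the case" point concrete, here is a minimal Python sketch that lists the (code version, schema version) pairs a test suite would have to cover. It assumes that the two schema versions each code release must support are its own and the next one; the function name and release labels are hypothetical and not part of procrastinate.

```python
def pairs_to_test(releases):
    """Return the (code_version, schema_version) pairs to test.

    Assumption: each code release must work with its own schema
    and with the schema of the release that follows it.
    """
    pairs = []
    for index, code_version in enumerate(releases):
        # The code must work with the schema it was released with.
        pairs.append((code_version, releases[index]))
        # It must also keep working once the next schema is applied.
        if index + 1 < len(releases):
            pairs.append((code_version, releases[index + 1]))
    return pairs


matrix = pairs_to_test(["1", "2", "3"])
```

Each pair would then become one test run: apply the schema at `schema_version`, then run the test suite of `code_version` against it.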

Let's flesh out proposals below.

@ewjoachim
Member Author

ewjoachim commented Oct 19, 2020

Proposal 1: "one by one" & calendar based releases

We guarantee that each DB version is compatible with procrastinate's previous codebase. This means that while running any given version of the code, we can always run the migrations for the next version.

High availability upgrade strategy for X to Z:

  • Upgrade DB, then code from X to Y
  • Upgrade DB, then code from Y to Z

This means that if you need to upgrade 10 versions at once, you'll need 20 upgrade steps. To limit the number of steps, we should avoid releasing versions that modify the SQL schema too often. Thus we suggest adopting calendar-based releases, at most once a month.
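As a rough illustration of the step count, the one-by-one strategy can be sketched in Python as follows. The function name and the release labels are hypothetical; the only rule taken from the proposal is "upgrade DB, then code" for each intermediate release.

```python
def one_by_one_plan(releases, current, target):
    """Return the ordered (component, version) steps for Proposal 1."""
    start = releases.index(current)
    end = releases.index(target)
    steps = []
    for version in releases[start + 1 : end + 1]:
        # The DB always goes first: the running code is guaranteed
        # to be compatible with the next schema version.
        steps.append(("db", version))
        steps.append(("code", version))
    return steps


plan = one_by_one_plan(["X", "Y", "Z"], "X", "Z")
# plan == [("db", "Y"), ("code", "Y"), ("db", "Z"), ("code", "Z")]

# Upgrading across 10 releases takes 20 steps.
ten_releases = [str(number) for number in range(11)]
step_count = len(one_by_one_plan(ten_releases, "0", "10"))
```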

What it means in terms of code:

  • When we add a feature with a migration, we need to make sure that running this migration would not break the code on the currently released version
  • If it does, we split into 2 parts: adding the messy-but-compatible migration on the current working version X and the cleanup on X+1
  • When we release, we need to make sure to advertise the "cleanup" migrations

@ewjoachim
Member Author

ewjoachim commented Oct 19, 2020

Proposal 2: "one by one on major versions" with checkpoints

(by @elemoine)

We guarantee that each release within a major version X is compatible with the database at version X+1.0 (the "checkpoint" version).

High availability upgrade strategy for X.5 to Z.5:

  • Upgrade DB, then code from X.5 to Y.0
  • Upgrade DB, then code from Y.0 to Z.0
  • Upgrade DB, then code from Z.0 to Z.5

We can imagine that major versions are less frequent than "any release", thus this would limit the steps needed compared to proposal 1.
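A possible way to compute the stops for this strategy, as a Python sketch. Versions are modelled as `(major, minor)` tuples, which is an assumption made for the example, and `checkpoint_plan` is a hypothetical helper name. Each stop means "upgrade DB, then code" to that version.

```python
def checkpoint_plan(current, target):
    """Return the versions to stop at, in order, under Proposal 2."""
    if target < current:
        raise ValueError("downgrades are not covered by this sketch")
    current_major, _ = current
    target_major, _ = target
    # Stop at the first release (the .0 checkpoint) of every later major.
    stops = [(major, 0) for major in range(current_major + 1, target_major + 1)]
    # Then go to the final version, unless it is already the last stop.
    if target != current and target not in stops:
        stops.append(target)
    return stops


# With X=1, Y=2, Z=3: X.5 to Z.5 goes through Y.0 and Z.0.
path = checkpoint_plan((1, 5), (3, 5))
# path == [(2, 0), (3, 0), (3, 5)]
```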

What it means in terms of code:

  • When we add a feature with a migration, we need to make sure that running this migration would not break the code on the first version of our current major, X.0
  • If it does, we split into 2 parts: adding the messy-but-compatible migration on the current working version X and the cleanup on X+1.1
  • The X+1.0 DB needs to be still compatible with everything from version X, and we need X+1.1 to remove the mess

@ewjoachim
Member Author

ewjoachim commented Oct 19, 2020

Proposal 3: Proposal 2, but the checkpoint is X.99

We guarantee that each release within a major version X is compatible with the database at version X.99 (the "checkpoint" version).
We also guarantee that each X.99 release is compatible with the database at version X+1.99.

High availability upgrade strategy for X.5 to Z.5:

  • Upgrade DB, then code from X.5 to X.99
  • Upgrade DB, then code from X.99 to Y.99
  • Upgrade DB, then code from Y.99 to Z.5
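The same path can be sketched in Python, again modelling versions as `(major, minor)` tuples. The tuple representation, the `PIVOT_MINOR` constant and the `pivot_plan` name are assumptions made for the example. Each stop means "upgrade DB, then code" to that version.

```python
PIVOT_MINOR = 99  # the ".99" checkpoint release of each major


def pivot_plan(current, target):
    """Return the versions to stop at, in order, under Proposal 3."""
    if target < current:
        raise ValueError("downgrades are not covered by this sketch")
    current_major, _ = current
    target_major, _ = target
    # Stop at the .99 release of every major before the target major.
    stops = [(major, PIVOT_MINOR) for major in range(current_major, target_major)]
    # Nothing to do for a pivot we are already running.
    stops = [stop for stop in stops if stop != current]
    if target != current and target not in stops:
        stops.append(target)
    return stops


# With X=1, Y=2, Z=3: X.5 to Z.5 goes through X.99 and Y.99.
path = pivot_plan((1, 5), (3, 5))
# path == [(1, 99), (2, 99), (3, 5)]
```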

What it means in terms of code:

  • When we add a feature with a migration, we need to make sure that running this migration would not break the code on X-1.99
  • If it does, we split into 2 parts: adding the messy-but-compatible migration on the current working version X and the cleanup on X+1.0
  • The X.99 version doesn't need to introduce code, it can be identical to the previous one
  • The X+1.0 version also doesn't need to introduce code, it just removes DB compatibility with X.
  • We can also remove all migrations linked to X except X.99 from the codebase on any X+1.0.

@ewjoachim
Member Author

ewjoachim commented Oct 19, 2020

After discussing with @elemoine: we're going for proposal 3, but you have until 2020-10-23 08:00:00Z to express concerns before we start taking action :)

Ping @k4nar @mgu @marco44

@ewjoachim ewjoachim added Issue appropriate for: People up for a challenge 🤨 This issue probably will be challenging to tackle Issue contains: Some SQL 🐘 This features require changing the SQL model Issue contains: Exploration & Design decisions 🤯 We don't know how this will be implemented yet Issue type: Process ⚙️ Things that have to do with how procrastinate is maintained labels Oct 19, 2020
@marco44

marco44 commented Oct 21, 2020

2 or 3 for me, I'm fine with both. So I presume it's a vote for 3 too

@k4nar
Contributor

k4nar commented Oct 21, 2020

I think I'm ok with proposition 3, but I find the ".99" a bit confusing. Would formulating it this way work?

We guarantee that each release within a major version X is compatible with the database at any release within the next major version (X+1).

@ewjoachim
Member Author

ewjoachim commented Oct 21, 2020

I'm not sure this would be true.
Imagine we remove a field at version 1.5, the code stops using it (but the field stays in the db). At 1.99, there's still the field, but we know that it's not used anymore. At 2.0 we remove the field. This means the code at 1.99 is compatible with the db at 2.99, but the code at 1.4 is not.

We're talking about an actual .99 release that we'll create each time we make a major release, so that each major has a version you can be sure has the most up-to-date code. This means upgrading from this 1.99 release to any 2.X release will be safe.
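The field-removal scenario can be modelled with a small Python sketch. The column names (`id`, `legacy_field`) and the two lookup tables are invented for illustration; compatibility is reduced here to "every column the code uses exists in the schema", which is a simplification.

```python
# Columns present in the schema at each DB version (hypothetical).
DB_COLUMNS = {
    "1.99": {"id", "legacy_field"},  # field still present, no longer used
    "2.0": {"id"},                   # field dropped
    "2.99": {"id"},
}

# Columns each code version reads or writes (hypothetical).
CODE_NEEDS = {
    "1.4": {"id", "legacy_field"},   # still uses the field
    "1.5": {"id"},                   # stops using it
    "1.99": {"id"},
}


def is_compatible(code_version, db_version):
    """True if every column the code uses exists in that schema."""
    return CODE_NEEDS[code_version] <= DB_COLUMNS[db_version]


pivot_ok = is_compatible("1.99", "2.99")  # True: 1.99 no longer needs the field
older_ok = is_compatible("1.4", "2.99")   # False: 1.4 still needs the dropped field
```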

@k4nar
Contributor

k4nar commented Oct 22, 2020

Do you have examples of projects doing this? I find it very unusual. Plus, when we make the .99, what tells us that we won't have another minor version to make?

I feel like there is a simpler option, but I can't get my head around it. I'm not sure it's Option 2 either.

@ewjoachim
Member Author

> Plus, when we make the .99, what tells us that we won't have another minor version to make?

That would imply we would support older versions. I think the plan for now is just to support the latest one.

> Do you have examples of projects doing this? I find it very unusual.

Do you have examples of non-Django projects that are integrated as libraries into users' projects and that require SQL migrations?

@ewjoachim
Member Author

Another proposal could be to postpone the cleanup until X+2.0, which would lead to your suggestion above.

Note: proposal 2 is like proposal 3 without the .99, so that would work too.

@elemoine
Contributor

elemoine commented Nov 6, 2020

The advantage of proposal 3 (the "0.99" proposal) is that we don't need to wait for X+1.1 to remove the SQL code for backward compatibility. Instead we can do the clean-up directly in X+1.0.

So with proposal 3 we'd release X.99 just before releasing X+1.0. And we'd tell our users that they need to upgrade to the X.99 pivot version before upgrading to X+1.

With proposal 2 we'd release X+1.0, and X+1.1 right after for the SQL clean-up. And we'd tell our users that they need to upgrade to X+1.0 before upgrading to any other X+1 version. We can also recommend that they upgrade from X+1.0 to X+1.1 as soon as X+1.1 comes out, although that's not required.

With that in mind, I think I prefer proposal 2, because it'd be more natural and familiar to our users. With proposal 3, when they are ready to upgrade to the next major version (from X.n to X+1.0), we would force them to do 2 upgrades: from X.n to X.99, and then from X.99 to X+1.0. With proposal 2, they can upgrade from X.n to X+1.0, and then, optionally, from X+1.0 to X+1.1.

@ewjoachim, what do you think? I can go ahead and document the migration process when we agree on the proposal.

@ewjoachim
Member Author

Frankly, I'm ok with anything as long as it's documented.

@elemoine
Contributor

elemoine commented Nov 6, 2020

Here is my first attempt at documenting this: #342.

4 participants