This repository has been archived by the owner on Apr 29, 2024. It is now read-only.

Designing the "upgrade compatibility matrix" #18

Open
thoughtpolice opened this issue Aug 14, 2023 · 0 comments

In one of our talks, I discussed a potential design for how to:

  • Keep track of deployed versions,
  • Keep track of which versions can be upgraded to which others, and
  • Do both in a way that allows automation.

We loosely called this the "upgrade compatibility matrix" approach.

Keeping track of built PostgreSQL versions

The fundamental idea is simple:

  1. The full state of this repository can be uniquely identified by a Git commit.
  2. The output of the build is a unique path in /nix/store.
  3. Therefore, simply create some mapping, in some database, from Git commit hashes to /nix/store paths. This can be a one-to-many relationship.
    3.1. It could also be inverted: map store paths to Git hashes.
    3.2. You could use Git tags as well.
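
As a minimal sketch of the mapping in steps 1-3, assuming SQLite as the "some database" (any store would do): the table name `builds`, its columns, and all function names below are made up for illustration and are not part of this repository.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the database and create the commit -> store path table if needed."""
    db = sqlite3.connect(path)
    db.execute(
        """
        CREATE TABLE IF NOT EXISTS builds (
            git_commit TEXT NOT NULL,
            store_path TEXT NOT NULL,
            PRIMARY KEY (git_commit, store_path)
        )
        """
    )
    return db


def record_build(db, git_commit, store_paths):
    """Record every store path produced by one commit (one-to-many)."""
    db.executemany(
        "INSERT OR IGNORE INTO builds (git_commit, store_path) VALUES (?, ?)",
        [(git_commit, path) for path in store_paths],
    )
    db.commit()


def paths_for_commit(db, git_commit):
    """Forward lookup: Git commit -> sorted list of store paths."""
    rows = db.execute(
        "SELECT store_path FROM builds WHERE git_commit = ? ORDER BY store_path",
        (git_commit,),
    )
    return [row[0] for row in rows]


def commits_for_path(db, store_path):
    """Inverted lookup (3.1): store path -> sorted list of Git commits."""
    rows = db.execute(
        "SELECT git_commit FROM builds WHERE store_path = ? ORDER BY git_commit",
        (store_path,),
    )
    return [row[0] for row in rows]
```

Because the primary key is the (commit, path) pair, recording the same build twice is a no-op, so a scheduled job could safely re-run.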

This is one of the powers of Nix: it unifies your version control and package management system to make things reproducible, so you can get the same results back later and rebuild things. Once you have a Git version, you can get a build, and you can associate the two however you want, in whatever database of record you like. The source of truth is actually the source code itself, though.

A matrix for version compatibility

Let's say we have the following basic strategy. Every week, on Friday @ 00:00 UTC, we build the latest versions of PostgreSQL from this repository, and then we git tag that version with a date format: v{YYYYMMDD}. Something like:

austin@GANON:~/work/nix-postgres$ git show HEAD | head -1
commit d214aaa6d40f798f2de7ba2895294bb6bf100fb0

austin@GANON:~/work/nix-postgres$ readlink result*
/nix/store/jsbd0s1q5kyw96a78gfcbvrpirnxzand-postgresql-and-plugins-14.8
/nix/store/82p89d65l81446nfbxwaszg0pn9d2lss-postgresql-and-plugins-15.3

austin@GANON:~/work/nix-postgres$ ... run git tag ...

Now we have tagged versions we can deploy to the fleet, integrate and test with the migration-test tool, or what have you.
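
A small sketch of how such tag names could be generated and parsed. One assumption to flag loudly: the JSON example further down uses tags like v20230807-1, and I'm treating the trailing "-1" as a per-day build counter, which this issue never actually defines. The function names are hypothetical.

```python
from datetime import date, datetime


def snapshot_tag(day, build=1):
    """Format a snapshot tag such as 'v20230807-1'.

    ASSUMPTION: the '-N' suffix is a per-day build counter starting at 1.
    """
    if build < 1:
        raise ValueError("build counter starts at 1")
    return f"v{day:%Y%m%d}-{build}"


def parse_snapshot_tag(tag):
    """Split a tag back into (date, build counter).

    A tag without a suffix, e.g. 'v20230807', is treated as build 1.
    """
    if not tag.startswith("v"):
        raise ValueError(f"not a snapshot tag: {tag!r}")
    stamp, sep, build = tag[1:].partition("-")
    day = datetime.strptime(stamp, "%Y%m%d").date()
    return day, (int(build) if sep else 1)
```

Since the date is zero-padded and comes first, these tags also sort chronologically as plain strings (as long as the counter stays in single digits), which is handy for "find last week's tag".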

These Git tags can then serve as the basis for compatibility: which tags are compatible with which other tags? And how do migrations get tested between them?

We could imagine a single JSON file that has something like the following schema:

{
    "versions": {
        "v20230807-1": {
            "psql_14": {
                "version": "14.8",
                "path": "/nix/store/jsbd0s1q5kyw96a78gfcbvrpirnxzand-postgresql-and-plugins-14.8"
            },
            "psql_15": {
                "version": "15.3",
                "path": "/nix/store/82p89d65l81446nfbxwaszg0pn9d2lss-postgresql-and-plugins-15.3"
            }
        },

        "v20230814-1": {
            "psql_14": {
                "version": "14.8",
                "path": "/nix/store/...-postgresql-and-plugins-14.8"
            },
            "psql_15": {
                "version": "15.3",
                "path": "/nix/store/...-postgresql-and-plugins-15.3"
            }
        }
    },

    "compatibility": {
        "v20230807-1": {
            "can_upgrade_to": [
                "v20230814-1"
            ],
            "incompatible_with": [
                "v20230801-1"
            ]
        },

        "v20230814-1": {
            "can_upgrade_to": [ ... ],
            "incompatible_with": [ ... ]
        }
    }
}

This data model is very incomplete, but I think the idea is there:

  • The data model represents versions and their store paths in the binary cache, and
  • It records the compatibility guarantee: primarily, which tags can be migrated to which other tags successfully, e.g. with pg_upgrade.

There are a lot of specifics left out here, especially how to do major -> major upgrades. Ultimately though, I think the key point is that this data model should reflect the upgrade and migration policy.
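
To make the "matrix" part concrete, here is a rough sketch of how a tool might query the compatibility section once the JSON is loaded into a dict. It only uses the can_upgrade_to and incompatible_with keys from the schema above. The multi-hop search is my own assumption: nothing here establishes that tested upgrades compose (A -> B and B -> C implying A -> C via B), so treat `upgrade_path` as one possible policy, not the policy. Both function names are hypothetical.

```python
from collections import deque


def can_upgrade(matrix, src, dst):
    """True if src -> dst is listed as a tested, compatible direct upgrade."""
    entry = matrix.get("compatibility", {}).get(src, {})
    if dst in entry.get("incompatible_with", []):
        return False  # an explicit incompatibility always wins
    return dst in entry.get("can_upgrade_to", [])


def upgrade_path(matrix, src, dst):
    """Breadth-first search for the shortest chain of direct upgrades.

    ASSUMPTION: chaining direct upgrades is acceptable policy.
    Returns the list of tags from src to dst, or None if no chain exists.
    """
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        current = path[-1]
        if current == dst:
            return path
        entry = matrix.get("compatibility", {}).get(current, {})
        for nxt in entry.get("can_upgrade_to", []):
            if nxt not in seen and can_upgrade(matrix, current, nxt):
                seen.add(nxt)
                queue.append(path + [nxt])
    return None
```

A deploy tool could call `can_upgrade` as a hard gate before running pg_upgrade, and fall back to `upgrade_path` only if stepping through intermediate snapshots is something the policy allows.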

Automating this system

Here's the fun part: you don't need a database for this. You could literally just have a GitHub Action that uses the cron scheduling feature to do this for you, and you can just record the .json file above in git itself as well. Thus, you can imagine an update robot that:

  • Snapshots and tags the repo once a week, if the build succeeds.
  • Runs migration tests against other versions, e.g. last week's versions.
  • Automatically updates the compatibility matrix.
  • Posts failures, e.g. as GitHub issues.

Individuals can also change the compatibility matrix with their own commits (e.g. you test a specific upgrade path for a customer, and mark it as compatible).
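
The "automatically updates the compatibility matrix" step could be as small as the following sketch, which a scheduled job might run after each migration test before committing the .json file back to Git. The function names are hypothetical, and the rule that the most recent test result overrides an older one is my assumption, not something decided here.

```python
import json


def record_result(matrix, src, dst, ok):
    """Record the outcome of one src -> dst migration test, in place.

    ASSUMPTION: the latest result wins, so dst is moved between the
    'can_upgrade_to' and 'incompatible_with' lists as needed.
    """
    entry = matrix.setdefault("compatibility", {}).setdefault(src, {})
    add_to = "can_upgrade_to" if ok else "incompatible_with"
    remove_from = "incompatible_with" if ok else "can_upgrade_to"

    targets = entry.setdefault(add_to, [])
    if dst not in targets:
        targets.append(dst)
        targets.sort()  # stable ordering keeps Git diffs small

    stale = entry.setdefault(remove_from, [])
    if dst in stale:
        stale.remove(dst)
    return matrix


def dump_matrix(matrix):
    """Serialize deterministically so unchanged data produces no diff."""
    return json.dumps(matrix, indent=2, sort_keys=True) + "\n"
```

Sorting keys and list entries means a re-run with identical results leaves the file byte-for-byte unchanged, so the robot only creates a commit when something actually changed. A human override is then just a normal commit editing the same file.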

I think this is a promising approach: keeping track of breakages and compatibility requires a lot of attention to detail, so automating it as much as possible is probably going to be very valuable.

That said, the biggest part of this is actually figuring out the proper data model for the compatibility matrix/compatibility schema. Assuming that can be done and represented correctly, I think the automation is mostly a matter of "how fancy do you want to make it."
