Move to trunk-based development #3140

Closed
9 tasks done
Tracked by #2756
jdangerx opened this issue Dec 8, 2023 · 11 comments · Fixed by #3212 or #3216
@jdangerx
Member

jdangerx commented Dec 8, 2023

Earlier context in this discussion

We spend a bunch of time juggling the dev and main branches, which complicates some of our automation and development flows. It also means that main and dev are divergent. Which one is the authoritative history of the project? Who really knows?

The desired end state is to get rid of dev and branch everything off of main.

We want to keep building nightly builds, and have tagged release versions.

The RTD updates appear to be mostly about configuring the readthedocs project.

Success criteria

  1. 0 of 5 · release
  2. 0 of 5 · nightly-builds · zaneselvans
  3. 16 of 16 · cloud, docs, nightly-builds, release · zaneselvans

@zaneselvans
Member

On the archive to Zenodo on tag, I think you're talking about the data releases, but there will be 2 kinds of archives that get sent to Zenodo: software and data. The software archive-on-tag is working now, and is only triggered on v20* tags (because GitHub releases are only triggered on v20* tags, and that's what the GitHub to Zenodo integration watches for).

By unifying the scheduled and tagged builds, do you mean that we only ever build on a schedule, and every tagged release would just be adding an additional tag to the successful nightly build that we've decided to use as a new versioned release? So we could just package up that commit and its build outputs without needing to re-run anything? That sounds nice and clean to me! One issue that might come up there is that typically there's a post-build edit to the release notes that happens to close off the old section (which is changes that go into the release) and open up a new section. But I imagine lots of other software also has this problem so there must be some way around it.
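
A minimal sketch of the tag filter described above, assuming the release trigger behaves like a plain `v20*` glob. The function name `triggers_software_archive` and the example tags are hypothetical; the real workflow configuration isn't shown in this thread.

```python
from fnmatch import fnmatchcase

# Assumed glob, taken from the "v20*" description above. The actual
# workflow config may express this filter differently.
RELEASE_TAG_GLOB = "v20*"


def triggers_software_archive(tag: str) -> bool:
    """Return True if a pushed tag looks like a versioned release tag.

    Hypothetical helper: only tags matching the release glob would create
    a GitHub release, which is what the Zenodo integration watches for.
    """
    return fnmatchcase(tag, RELEASE_TAG_GLOB)
```

Under this assumption a date-style version tag would trigger the software archive, while a `nightly-XXXX-XX-XX` tag would not.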

@zaneselvans zaneselvans added the nightly-builds Anything having to do with nightly builds or continuous deployment. label Dec 15, 2023
@jdangerx
Member Author

jdangerx commented Dec 20, 2023

docs

@zaneselvans with RTD, we're currently set up to include these versions on our website:

  • latest corresponds to main;
  • stable corresponds to our new stable branch;
  • dev corresponds to dev;
  • tag builds

I think in the main-only future, we would have:

  • latest corresponding to main
  • stable corresponding to stable
  • tag builds

Which also happens to be the default RTD behavior, I think!

I think the nightly version in the docs wouldn't actually be helpful and instead would sow confusion about the difference between nightly and latest.

reasons to drop `nightly` in docs build

Thinking about use cases:

  • someone who is developing PUDL using the website as a reference: likely to be on the latest commit of `main` regardless of nightly build passing
  • someone who is debugging a specific nightly build failure: likely to be on some `nightly-XXXX-XX-XX` tag
  • someone who's using a pinned numeric version of PUDL: likely to be using that specific version
  • someone who's using an installed version of PUDL pinned to some git branch like `main`: likely to be on some random commit that is some amount behind the head of `main`

The first case is the most common one, and best served by latest pointing at the latest commit on main. The second case is somewhat likely, but also not that well-served by a nightly docs build, because it's hard to tell if nightly points at nightly-XXXX-XX-XX or nightly-YYYY-YY-YY. The third case is well served by the vXXXX versions, and the fourth case is completely hopeless / best served by latest pointing at main.

git branches

I think the nightly branch could also be dropped in favor of just using the nightly tags, but I can see it being useful still.

Pros of keeping the nightly git branch:

  • to find the latest code that ran a successful nightly build, you just have to check out nightly.
  • easy to link to most recent successful output: s3://pudl.catalyst.coop/nightly/ has the latest passing build

Cons of keeping the nightly git branch:

  • if the update process fails between uploading the output and pushing the git ref, or if it's just taking a long time, or if people just forget to pull the latest nightly from the remote, the correspondence between output <-> code will be subtly wrong
  • maintaining nightly branch from within Docker seems like a pain in the butt

Pros of dropping the nightly git branch:

  • avoids most of the subtly wrong states above: nightly-XXXX-XX-XX output will correspond to the nightly-XXXX-XX-XX code, unless we failed in the middle of our S3 upload.
  • easy to check out the code associated with a specific nightly build for debugging purposes
  • don't have to deal with maintaining the nightly branch from within the docker container

Cons of dropping nightly git branch:

  • harder to link to most recent successful output: s3://pudl.catalyst.coop/nightly/XXXX-XX-XX might have the latest passing build, but users would have to look at a list of what's in nightly/ and assess what the latest date there is.
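
To make that last con concrete, here's a rough sketch of what a user would have to do to find the most recent dated build. It assumes outputs live under keys shaped like `nightly/XXXX-XX-XX/...` and that the caller has already listed the keys somehow; `latest_nightly_date` and the key layout below the date prefix are hypothetical.

```python
import re
from datetime import date
from typing import Iterable, Optional

# Assumed key layout: "nightly/<YYYY-MM-DD>/<anything>".
_DATED_PREFIX = re.compile(r"^nightly/(\d{4})-(\d{2})-(\d{2})/")


def latest_nightly_date(keys: Iterable[str]) -> Optional[date]:
    """Return the newest build date found among the listed keys.

    Keys that don't start with a dated nightly prefix, or whose prefix is
    not a real calendar date, are ignored. Returns None if nothing matches.
    """
    dates = set()
    for key in keys:
        match = _DATED_PREFIX.match(key)
        if match is None:
            continue
        try:
            dates.add(date(int(match[1]), int(match[2]), int(match[3])))
        except ValueError:
            # Looks like a date but isn't one (e.g. month 13); skip it.
            continue
    return max(dates, default=None)
```

Note this only finds the newest date that has any output at all. It can't tell whether that build passed or whether its upload finished, which is part of why an always-current `nightly/` location is convenient.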

WDYT about dropping nightly in docs/git/both @zaneselvans ? I'm personally somewhat in favor of dropping nightly from the docs, but pretty on-the-fence for git. Maybe slightly pro dropping from git too, if I had to pick.

@zaneselvans
Member

main: the bleeding edge. May be broken. Has no outputs associated with it. main is only really interesting in a development context -- it's what we're all branching off of to make new code. If we were going to hide something from the docs, I think I would be most inclined for it to be this. The nightly builds are kind of our final CI test, and we don't really want the public working with / referring to that until we know the builds pass. Generally this will only be a day or two ahead of nightly.

nightly: the most recent known-to-work code. Has ephemeral outputs associated with it. nightly is useful for folks like RMI or others who prioritize freshness over long-term guarantees of accessibility. If we merge in a PR addressing e.g. RMI's needs and it takes a month or more for those changes to show up in stable, I think they will often prefer to access the nightly outputs, at least for the purpose of their own testing and development in the interim. And I think if the data is going to be out there in use by people, there should probably be publicly visible docs that go along with it. I would be inclined to connect nightly to latest as in "latest thing that you should probably be looking at, if you're not doing development." Dunno. We could make nightly/latest the default landing page for RTD, but also have main there if someone wants to see it for development.

stable: the most recent known-to-work code with persistent outputs available (via S3, GCS, Zenodo, etc.). If we get to where we're cutting monthly releases, I think that this is probably what we want to point most users at. Though I suspect that there will still be breaking changes between versions due to schema drift. So maybe we actually want most users to rely on a specific version if they need stuff to not go down.

v202Y.MM.DD: a particular past tagged stable release, so folks can pin and know nothing will change. We should definitely persist these docs builds to annotate the persistent outputs they're associated with.

@jdangerx
Member Author

jdangerx commented Dec 20, 2023

Ah! We could point latest at nightly instead of main, in RTD settings.

Then we have:

RTD latest -> git nightly
RTD stable -> git stable, duh
RTD main -> git main in case we really need it for some reason
RTD vXXXX -> git vXXXX

RTD default landing page is stable.

That all seems pretty straightforward to do!

And implies that we keep nightly around as a git branch, which helps make that decision 😄

We can set up the RTD stuff once nightly branch gets successfully pushed tonight 🤞

@jdangerx
Member Author

I think that would also imply that we want the AWS outputs to have s3://pudl.catalyst.coop/latest/pudl.sqlite instead of /nightly/pudl.sqlite - and maybe we want the nightly branch to be named latest instead? Just to make everything consistent?

@zaneselvans
Member

Hmm. I really like nightly because it communicates the frequency with which it's being potentially updated whereas latest could be three months ago without additional context (and nightly updates are very rare among the people that we're trying to serve). Can we not change the RTD naming? Or if we can't get rid of their latest have an identical alias that's nightly?

@zaneselvans
Member

zaneselvans commented Dec 20, 2023

It looks like we can set the default version to nightly under Advanced Settings, and let latest point at the default branch main which is the normal RTD behavior.

Which I think would replicate the default RTD behavior, except that there'd be an additional nightly -> nightly layer:

  • latest -> main (available, but not the docs landing page)
  • nightly -> nightly (this would be the landing page)
  • stable -> stable (available, but not the docs landing page)
  • tagged versions (for historical reference)
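
The mapping above, written out as a small lookup just to make the proposal explicit. This is only an illustration: RTD is configured through its project settings, not through code like this, and `DOCS_VERSIONS`, `DEFAULT_VERSION`, and `git_ref_for` are hypothetical names.

```python
from typing import Optional

# Docs version name -> git ref, as proposed above.
DOCS_VERSIONS = {
    "latest": "main",
    "nightly": "nightly",
    "stable": "stable",
}

# The version served when a reader doesn't specify one (the landing page).
DEFAULT_VERSION = "nightly"


def git_ref_for(docs_version: Optional[str] = None) -> str:
    """Return the git ref a docs version is built from.

    Tagged versions aren't listed explicitly; they are assumed to map to
    the git tag of the same name.
    """
    version = docs_version or DEFAULT_VERSION
    return DOCS_VERSIONS.get(version, version)
```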

@jdangerx
Member Author

jdangerx commented Dec 20, 2023 via email

@jdangerx
Member Author

OK, I went and fiddled with RTD settings. I think that if the nightly build works and pushes nightly branch, RTD should make a nightly version that will appear on the website. Once that docs build exists we can switch the default to nightly, move latest to main, and hide dev from view.

@zaneselvans
Member

I'm gonna give the build and all the new machinery like a 10% chance of succeeding. 🤞🏼

@zaneselvans
Member

I went ahead and updated the RTD configuration, so it is now building and serving the main (latest), nightly, and stable branches, with nightly as the default if you don't specify a version.
