Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BEP 6: Branching policy #10177

Closed
mattpap opened this issue Jun 16, 2020 · 22 comments
Closed

BEP 6: Branching policy #10177

mattpap opened this issue Jun 16, 2020 · 22 comments

Comments

@mattpap
Copy link
Contributor

mattpap commented Jun 16, 2020

Proposal:

  • Milestones:

    • future (or x.0) -- the next major, breaking release
    • next -- x.y or x.y.z release, whichever comes first
    • x.y -- major minor changes, non-user breaking
    • x.y.z -- minor patch changes, regression fixes from x.{y-1} or x.y.{z-1} release
  • Branches:

    • future -- aligns with future milestone
    • master -- the next x.y milestone
    • release_x.y -- branch off master after feature freeze (master becomes x.{y+1})
    • release_x.y.z -- the next x.y.z release

The main assumption behind this proposal is that the majority work is done on master for the next x.y milestone. I assume that most changes will require a "break-in" period that can't be (and shouldn't be guaranteed) by minor patch releases. I additionally assume that it will be easier for all contributors to target master branch, instead of targeting specific branches, which I leave for project/release maintainers. Thus I assume most external contributions will be made available in x.y releases, unless we (the maintainers) backport those changes to x.{y-1}.z releases.

Minor Patch releases (x.y.z) should not contain any new features and/or major changes of other kind, and thus shouldn't have more that a dozen (roughly speaking) PRs merged. This will ensure that they can be released quickly and without major breakage, and won't require extensive testing, which should be left for x.y releases.

Details will need to be figured out, e.g. when exactly release_x.y.z is branched of master, etc.

The alternative is to have only release_x, release_x.y and release_x.y.z branches, but I'm certain that this would cause chaos for external contributors, and make master branch an archive of old releases, instead of what it typically means in git, which is the current development branch.

@mattpap
Copy link
Contributor Author

mattpap commented Jun 16, 2020

future branch should be rebased onto master on regular basis (preferably after every PR or daily), until it is merged into master. Then future branch becomes bokeh {x+1}.0. Other branches must not be rebased.

@bryevdv
Copy link
Member

bryevdv commented Jun 16, 2020

This all seems roughly fine in very high level terms. A few notes:

  • The final result should be codified in a new BEP and linked and described in the dev guide as well
  • we probably need to start milestoning PRs to make it less easy to merge a PR when it is not appropriate or intended
  • "Other branches must not be rebased" surely only refers to the special branches described and not e.g. feature branches, that should just be made more clear
  • Tooling/script support would help ensure things happen correctly and with proper sanity checks
  • GitHub is going to drop master as a default main branch name and we should follow suit and switch to main

That said, one question I have is that I don't really see how point releases will really ever get squeezed in. If e.g. after a freeze and branch for current release, master is opened up for changes, and ends up with x.(y+1)level changes very soon, then the possibility of making an x.y.z release has already been foreclosed. This just seems like it will be common given the current pace and character of development, unless we either:

  • make more radical changes to let release branches start off somewhere besides master (e.g off the last x.y branch while there are already x.(y+1) changes on master
  • Make a point to plan for a point release cycle and intentionally hold back merging any x.(y+1) for a time after the latest freeze/release branch
  • have a release_x.(y+1) branch early that is longer running alongside master (so any x.y.z small changes can collect on master)

I'm mostly OK with any go these, just unclear what the suggestion is. Or maybe there is something that I've misunderstood. I would also say that:

  • To aid explanation and understanding the fine write-up should present and walk through concrete example scenarios with real version numbers, not "x.y.z"

cc @bokeh/dev

@mattpap
Copy link
Contributor Author

mattpap commented Jun 17, 2020

Let's start from the fact that I messed up the names of releases. I updated the initial comment, but here's what's important:

Releases:

  • x.0 -- a major release
  • x.y.0 -- a minor release
  • x.y.z -- a patch release (to a x.y.{z-1} release)

My assumption is that patch releases are not continued development (as in new features or any other non-trivial changes), but maintenance of a x.y branch. This is in opposition to how we currently perceive patch releases. I view patch releases literally, i.e. patch what was broken in x.y and leave any new developments for x.{y+1}. My intention is that master branch reflects continuous development towards the next minor release x.y (and rarely the next major release). At feature freeze a release_x.y branch is branched off master. master becomes future x.{y+1}. release_x.y branch is the basis for x.y.0 release and for future patch releases to x.y line. In this case there would be no release_x.y.z branches, as there would be no need for such, assuming we don't have a sudden need for x.y.z.t releases, which I think would be an overkill for this project.

I realize this is a bit (or perhaps significantly) different from what I initially proposed, but after formulating my initial ideas out loud, I had some more realizations about the release process, especially in the context of software releases (e.g. TypeScript's) that I closely follow.

It may seem at first that this approach would make it longer for simpler new additions to reach our users, and that would be the case in general, because minor releases are further apart than patch releases. However, that's actually good, because any new additions require the "break-in" period and proper testing, so that we don't push out half figured out ideas in patch releases and then wouldn't be able to amend them until a major release, due to backwards compatibility concerns. Additionally, this would give us a framework for actually having a release cadence, because patch releases wouldn't be blocked by insufficient testing of new additions, and minor releases would have a more stable cadence as well, because new additions would be merged/adopted early and there would be plenty of time to test them properly and refine if needed. Furthermore, given our limited resources, to actually properly test and refine contributions, we need to focus on a single "main-line" branch, i.e. master.

In practical terms, I was motivated to start this issue due to the current state of the master branch and 2.1.1 and 2.2 milestones. The situation has been that 2.1 release was delayed and there were quite a few contributions that were laying around unmerged for a week or two, due to code freeze. Now those were merged into master and they are too big changes to be considered for 2.1.1 release. 2.1.1 has to happen soon due to several important regressions that happened in 2.1.

How do we proceed with this now? Do we revert changes to master, fix regressions, push out 2.1.1 and restore the changes? That would mean that those changes would lay around for at least a month, slowing down my progress to a crawl. Perhaps they could have been merged into an independent branch, but then no one except myself would look at them for that month, which means I could just work on a huge single PR, which is definitively not what we want.

However, if the described process was adopted before 2.1 release, then the scenario would pan out as follows. At feature freeze for 2.1 release a branch would be created (release_2.1) and development towards 2.1 final release would continue there. master would become future 2.2 release and development would continue without delays and PRs would be merged continuously. Regressions in 2.1 release are revealed and fixes are applied to release_2.1 branch. Localized testing is performed and a quick patch release 2.1.1 follows. Relevant changes are merged back to master (alternatively changes from master are backported to fix those regressions). Perhaps new regressions are found out and another quick release is scheduled. No extensive testing is required and no delays happen. In the meantime, major development happens on master branch without affecting patch releases.

As bokeh grows, there may be demand for patching up older releases, e.g. due to security concerns. With the described process we should be able to push out such changes easily, well, with some caveats like would we be still able to CI test such releases. Right now the answer is no, e.g. we need to be able to pin down all software involved in testing.

I would expect a cadence of 5 weeks between minor releases. The cadence for patch and major releases would be defined by need or demand. No regressions means no need for patch releases.

@bryevdv
Copy link
Member

bryevdv commented Jun 17, 2020

@mattpap in general I like all of this. My only comments:

  • 5 weeks seems about right to target but I think we should be ok with anything in the 4-6 week range (and state that explicitly in the BEP)

  • prefer release-x.y to release_x.y

Lastly, would it be so bad to just branch to the next immediately after a release and use that for dev instead of master? I think we should consider it, and the reason is that there is already a flurry of prescribed activity around releases. Branching for the next x.y cycle then would be just one more small task to happen semi-automatically at that already ongoing time of activity. In contrast if we wait to branch until feature freeze:

  • there is now work and judgment and discussion needed to agree when exactly to branch
  • the mechanical work to branch and "switch gears" to make contributions on that branch now happens mid-cycle, which is (IMO) more complicated overall.

Under this process, master then collects the latest stable release, which has its own benefits such as people looking at examples not seeing future-versions that rely on yet-to-be released features

@bryevdv
Copy link
Member

bryevdv commented Jun 19, 2020

@mattpap I'd like ot start drafting a BEP, do you have any thoughts on the comment above (i.e. "branch immediately to new release-x.y after release") If we do that I might also say prefer branch-x.y as a branch name.

@mattpap
Copy link
Contributor Author

mattpap commented Jun 19, 2020

I will comment soon, though I agree with most of the suggestions.

@bryevdv
Copy link
Member

bryevdv commented Jun 19, 2020

Another thing to consider for the BEP and process is a few new GH labels to specifically identify PRs that are backports, forward ports, etc.

@bryevdv
Copy link
Member

bryevdv commented Jun 21, 2020

Another comment: if we are going to continue to automate issue/PR validation and changelog generation, I think we will have to add milestones to PRs. To be honest, there are many times I have wanted to search PRs by milestone so this seems like a welcome change anyway.

@bryevdv
Copy link
Member

bryevdv commented Jun 22, 2020

@mattpap FYI I have tagged this as BEP and linked it as the discussion issue in this draft document:

https://github.com/bokeh/bokeh/wiki/BEP-6:-Branching-and-Milestones

@bryevdv bryevdv changed the title Milestones and (release) branches BEP 6: Milestones and (release) branches Jun 22, 2020
@bryevdv
Copy link
Member

bryevdv commented Jun 22, 2020

@mattpap just a heads up I changed the branch name (and associated branch protections) to branch-2.1:

  • the existing automation remnants already use release-x.y want to try and keep as much the same there as possible
  • "branch" is shorter/fewer syllables than "release"
  • release is a specific event and set of actions, a branch for x.y may have multiple releases so branch-x.y is more descriptive

@bryevdv
Copy link
Member

bryevdv commented Jun 22, 2020

As an aside, don't we need to prohibit force pushes to the branch-x.y branches? (I noticed they are not currently) If the idea is that changes that go there are merged back in to the current line of development it seems that history rewriting on those branches could lead to bad situations. Or do you imagine doing that manually by cherry picking? If so we will really need some script/tool support to help with that.

@bryevdv
Copy link
Member

bryevdv commented Jun 22, 2020

FYI I added branch-* to GH action branch selection here: 30bbe4a

@bryevdv
Copy link
Member

bryevdv commented Jun 24, 2020

@mattpap I pushed a branch-2.2 branch and set it as the default. Can you check that updating one of your existing PRs is simple/straightforward? I will cherry pick the 2.1.1 on to a PR for merging. It seems to me having a PR to pull in patch release changes will leave a better paper trail than just cherry picking directly on the new minor release branch

@bryevdv
Copy link
Member

bryevdv commented Jun 30, 2020

Just some terminology I tentatively started to adopt in the first release automation PR, comments welcome:

  • release level - version up to minor release, e.g 2.2
  • base branch - for a release, the branch for the current release level, e.g. branch-2.2
  • release branch - an ephemeral branch created for release update, with the full version, e.g. release-2.2.1

@mattpap as another aside I would not be opposed to matching the PEP for versions for dev and rc packages, e.g with the extra dot: 2.2.1.dev1. It's a bit of a risk but having the different version values is a big nuisance and the dust has settled elsewhere and now our format is the odd one out. Thoughts? If you agree it's worth looking in to I can create an issue.

@bryevdv
Copy link
Member

bryevdv commented Jul 8, 2020

@mattpap I have updated the BEP with more content: https://github.com/bokeh/bokeh/wiki/BEP-6:-Branching-and-Milestones

Please review when you have a change. Do you have thoughts about the question immediately above?

Also note I did turn of force pushes to base branches.

@mattpap
Copy link
Contributor Author

mattpap commented Jul 11, 2020

To simplify local workflow, one might issue the following:

git symbolic-ref refs/heads/default refs/heads/branch-2.2

and redo this every time the base branch is changes. This way one can always refer to the current base branch as default symbolic reference, similarly to previous usage of master branch.

@p-himik
Copy link
Contributor

p-himik commented Jul 11, 2020

To simplify local workflow, one might issue the following

I definitely don't like the need of always having to monitor the current branch and constantly describing why this is needed to new contributors.

The discussion here and the text in the linked BEP seem to revolve around something that has already been invented.
With that being said, what are your thoughts on Gitflow?

@bryevdv
Copy link
Member

bryevdv commented Jul 11, 2020

If we want to have all active mainline (minor release) development on default dev branch, and then only create the release branch e.g. branch-2.2 automatically at the exact time of a minor release, then I think that could be OK. That would seem to address both mine and @mattpap concerns, namely:

  • @mattpap wants a defined, predictable, segregated place for non-mainline, patch changes to land in the future
  • I want to avoid "when to branch" being a people decision

Edit: just to summarize the options:

  1. create current default branch-2.2 right after 2.1 release, all work for all 2.2.x releases happens on this branch (this is the current proposal) .The branch has a record of all the development that went into the 2.2 release as well as any future 2.2.x release.
  2. work on permanent default dev but create branch-2.2 mid-stream in to 2.2. cycle at some arbitrary feature freeze point. Note that there is no way to avoid some portion of work on this branch after freeze because things like SRI hashes can only be created at package time. These would also have to be merged back to dev. This is complicated and requires people decisions ("when to branch") so I am -1 on this
  3. work on permanent default dev and only create branch-2.2 exactly at release time (automated). The branch starts with a record of the moment of the 2.2 release, and only future 2.2.x patch work goes on this branch.

I'm fine with either 1 or 3

@bryevdv
Copy link
Member

bryevdv commented Jul 11, 2020

As an aside, after a little trial with it, I am not very happy about milestoning regular PRs (i.e. non "issue PRs") however if we want to avoid that then we either,

  • completely abandon automated hygiene checking for PR labels, or
  • jsut check every single PR, every time (because there is no way to know which PRs are just for a given release)

As a reminder the reason things worked before is that we had a single line of development. "All PRs for this release" was the same as asking "All PRs since the date of the next highest version tag". That is no longer the case if we admit the possiblity of patch releases off old branches. (e.g. a 2.2.1 for a "critical fix" after 2.3. and 2.4 have already come out)

@bryevdv
Copy link
Member

bryevdv commented Jul 14, 2020

I guess after a little more thought I have a slight preference for a permanent default dev branch and an auto branch-x.y at release time.

@bryevdv bryevdv changed the title BEP 6: Milestones and (release) branches BEP 6: Branching policy Sep 28, 2020
@bryevdv
Copy link
Member

bryevdv commented Sep 28, 2020

@mattpap I have updated the Wiki document I think that restricted to only branching, things are simple enough to not need examples, so I have deleted that empty section. I also made a few tweaks re: prefer to land on current base and cherry-pick backports. Can you look and see if there are any other changes you would like to make before voting to adopt this BEP? I'd like to mark is as implemented by next week at the lastes.

Regarding "dev" branch, I think things have been going well just having branch-x.y be the current branch and switching it right at release. It's very clear and explicit where changes are landing and it's also the way some other (fairly larger) projects handle things. So I would propose to keep the new status quo.

@bryevdv
Copy link
Member

bryevdv commented Oct 8, 2020

Recording that this passed:

Screen Shot 2020-10-08 at 9 00 42 AM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants