Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC 0029] Backports team #29

Closed
wants to merge 7 commits into from
Closed

Conversation

@samueldr
Copy link
Member

samueldr commented Jul 28, 2018

Rendered


TODO

  • Finish the draft Detailed design section from the rough notes.
  • Find a co-author.
  • Rebase into one commit when finished. (Keeping history meanwhile for those playing at home)

Quick note, I may seem critical into the RFC text, but I am mainly terse. I know and understand that it's a lot of work. That is, especially true when considering how it involves the release managers.


Additional motivation

The tipping point making me write up this RFC is how stale current-stable seems to become. I was particularly incensed by spotting #43675, opened on July 17th 2018, while the issue was fixed on April 25th 2018 on master, but the fix not backported. It seems that the expectation that authors will backport their own fixes is not universally true. I hope to help by creating processes, tooling and a community around backports.

Copy link
Member

LnL7 left a comment

This is a pretty bad argument but, increasing will have a negative impact on the stability of the darwin channels. The fact that most packages don't change avoids the issue that most changes are not tested on darwin before ending up in master.

Don't backport if the patch is just for Darwin, they use nixpkgs-unstable not a
stable branch.

> `FIXME` Is this true?? I see nixpkgs-18.03-darwin in the channels list.

This comment has been minimized.

Copy link
@LnL7

LnL7 Jul 29, 2018

Member

No, darwin related changes should follow the same rules now that we have stable channels. Including darwin only fixes if applicable.

This comment has been minimized.

Copy link
@samueldr

samueldr Jul 29, 2018

Author Member

Exactly! I have taken this from the linked general guidelines and was a bit worried that this wasn't true. Removing.

@FRidh

This comment has been minimized.

Copy link
Member

FRidh commented Jul 29, 2018

### Examples

An example of broken software needing a major update to be backported is
Spotify, which regularly breaks with old versions.

This comment has been minimized.

Copy link
@timokau

timokau Jul 29, 2018

Member

I'm hoping to fix this with NixOS/nixpkgs#44123.

This comment has been minimized.

Copy link
@samueldr

samueldr Jul 29, 2018

Author Member

Awesome, said example was gleaned from the general guidelines linked earlier. I'm sure as a group we can either abstract away the exact software which break as any one of those, or think up a small list of "glassware" (often-breaking software).

This comment has been minimized.

Copy link
@Baughn

Baughn Nov 2, 2018

Anything that can't be hosted on nixos.org is a candidate for this. grep can find more than a couple. (It also finds things like tor-browser-bundle, which we probably could rehost but haven't, and which is a prime candidate for backporting.)

@vcunat

This comment has been minimized.

Copy link
Member

vcunat commented Jul 29, 2018

This isn't in any manual (yet), but I suggested a different approach (since 18.03) than forming a separate team, described as ## Maintenance workflow in https://groups.google.com/forum/#!topic/nix-devel/3KxPNwxDV9E.

In particular I would stress:

  • the efficiency point - knowledge of what to do isn't much about being a "backport person" but about knowing that particular package well (e.g. understanding how it's used, what exactly changed in a version);
  • motivation point - I suspect it might be hard to find many people who need bugfixes of everything on 18.03.

... maybe I'm wrong and a team catching everything will appear. In general I think we need more contributors that are more focused on some packages; years ago it was a necessity that contributors touched very many packages, but the community has been growing significantly, and hopefully we could afford to move in this direction.


Then there's the usual maintenance problem, regardless of process (and not NixPkgs-specific): most packages don't really maintain older versions. Very simple patches (bug/security) tend not to be too difficult to backport with some confidence, but not even all security fixes fall into that category and then you may:

  • keep the old buggy version; or
  • update to new version on stable as well (bringing some incompatibilities); or
  • somehow backport without understanding it well (including some other distribution doing it and us piggy-backing).

Fortunately our release cycle is short, which helps this.

@Zimmi48

This comment has been minimized.

Copy link
Member

Zimmi48 commented Jul 30, 2018

The tooling part of this RFC is very important. Without it, the rest of the RFC will fall down.

I'll share my perspective on backporting since I'm the person in charge of backporting in Coq (https://github.com/coq/coq) since we switched (one year ago) from pushing fixes to stable branches and merging stable branches into master to basing all PRs (including fixes) on master and backporting what needs to be backported. Something to note is that it was quite important not to miss backportable fixes since most of our users are using stable versions (whereas a lot of Nixers are using nixpkgs-unstable). Of course, another important difference is that we are ten times smaller than nixpkgs (only 100 PR merged last month, compared to the almost 1000 in nixpkgs).

What makes it easier not to miss anything is that all changes go through PRs (pushing to master directly is not permitted). Moreover, mergers are using a script that will check that a milestone has been set on the PR: if not, it will remind them to do so before merging. We generally have to choose between the next patch-level milestone and the next major release milestone. We have a bot that will automatically add merged PRs in the next patch-level milestone to the first column of this backporting GitHub project: https://github.com/coq/coq/projects/19. Additionally, the bot started recently automatically backporting what it can (cherry-picking goes through without conflicts) and pushing to a staging branch so that CI can run. While writing this comment, I checked what was in this staging branch. I was fine with everything and CI had passed so I merged into the stable branch and pushed. The bot then moved the four corresponding PRs from the "Request inclusion" column to the "Shipped" column of the GitHub project. There remains 2 PRs in this column which have conflicts and I will have to backport manually.

@7c6f434c

This comment has been minimized.

Copy link
Member

7c6f434c commented Aug 1, 2018

@vcunat I am by now completely out of ideas what is the proper stable-backport policy (mostly because I am not using stable and I am not going to). And I thing people who merge the PRs are very often not too knowledgeable in the package specifics. Maybe it is a good idea to add a question about the need for backport (and about a WIP PR with backpoty) in the PR template? A link to a written stable-backport policy would fit well there, too.

@vcunat

This comment has been minimized.

Copy link
Member

vcunat commented Aug 1, 2018

In nixpkgs almost everything will just pass CI when cherry-picked blindly (as it passes on unstable as well), but the SW may contain some kind of incompatible changes, including new bugs. EDIT: I think here the main burden is the decision what is "safe" to backport (+the "maintenance problem" above) and not the cherry-pick action itself.

Ah, yes, so what clearly seems missing:

  • add "what to backport" to manual
  • add link to that from the PR template (and yes/no/bordeline checkboxes), possibly even if we decide to choose the "backport team" way instead

And some more feedback/discussion about the backporting workflow.

@samueldr

This comment has been minimized.

Copy link
Member Author

samueldr commented Aug 1, 2018

In nixpkgs almost everything will just pass CI when cherry-picked blindly (as it passes on unstable as well).

I find this oddly entertaining how release-18.03 had broken eval after a backport merge where the failure was ignored.

the SW may contain some kind of incompatible changes, including new bugs.

This is also relevant on nixpkgs-unstable, and is (in my opinion) the one big issue with the automated upgrades some packages received from a friendly robot.

When software contains incompatible changes (let's ignore bugs now), unstable users may receive them at any moment, uncoordinated or not. This is not inherently bad, but should probably be added as a note somewhere about the channels "software will update through major revisions without notification, this could introduce new behaviours, features and issues".

Additionally, bugs are an issue, when introduced with updates. An update to software that also introduces bugs could go from mildly annoying to infuriating depending on the scope of the software and the bugs.

What those last paragraphs are about, is that I don't know how this situation should be handled. Software changes, it is expected. There is a dire lack of changelog updates with our contributions. Not just for big updates. It may be of interest (possibly out of scope, but touching this RFC) to also figure out a more in-depth update policy (and explore those of other distributions). Such a policy could include figuring out a way to include upstream CHANGELOG somehow, and force somehow all packages to include some update log. Not that the nixos manual should mirror upstream software's own changelogs, but that it is necessary to be a bit more comprehensive and try to be proactive into noting changes that are of interest to the end-users.

Then, there may be a need for package-level testing integration, even if not integrated with the complete nixos test suites. Packages opting-in into those could be more reliably automatically updated, while those not opting into those tests will need some eyballing. This is true even for unstable.

@vcunat

This comment has been minimized.

Copy link
Member

vcunat commented Aug 1, 2018

We might make some Borg checks more fatal/visible/mandatory.

  • We can explicitly state somewhere that nixos-unstable is... unstable? (Well, document what it means – the policies for both branches.) But in my experience people tend to be rather surprised that so many people are running on unstable, as presumably they're reminded of Debian or others.

Later we can think of improving the branch policies, but so far they seemed to have worked reasonably well, perhaps partially thanks to the ease of rolling back (and using other versions in general).

IIRC there's some WIP around package testing; there was even a presentation on the last NixCon... but in any case the main work there is to write those tests :-) except for the easy cases like just a missing doCheck = true;.

@Ekleog Ekleog mentioned this pull request Aug 2, 2018
3 of 9 tasks complete
@Ekleog

This comment has been minimized.

Copy link
Member

Ekleog commented Aug 2, 2018

I'd have a question about the “what to backport” instructions included here.

In NixOS/nixpkgs#43889 (comment) (see the linked post for full details), we wonder whether to backport an update to software that have been declared as unstable by upstream. Basically, upstream offers no support in debugging what could potentially go wrong with any version that is not the latest one, and doesn't care about backporting fixes.

Given the fact that upstream is officially unstable (as embodied by a v0.* version), I wonder whether exactly the same rules for backporting fixes should be applied, or whether we should be more lenient in our backporting of these, as users know they're using unstable software anyway and it'd take too much resources handling backport of all fixes.

For a comparison:

  • Debian has 0.33.0 in testing (but has no matrix-synapse in any stable version).
  • Fedora 28 has 0.31.2, last updated 2 months ago. I think the notable fact though is that they have upgraded from 0.27.3 to 0.31.2 (sorry for the web proxy link, pkgs.fedoraproject.org appears to use an invalid certificate with HSTS from here) during the lifecycle of Fedora 28.

So, I would think:

  • If we aim for the stability level of Debian, maybe we should just not accept declared-unstable-by-upstream packages into our stable version (though I'm not sure whether that was a conscious choice by Debian or whether they just had no stable version rollout since their having a package for synapse)
  • If we aim for the stability level of Fedora, maybe we should accept potentially-breaking update backports of declared-unstable-by-upstream packages

It may be important to make a distinction between unstable software used online (like synapse) and unstable software entirely offline, though, as unstable software used online can be broken by updates to other software on other machines (as has been my experience with synapse, see the linked post).

Anyway, I would think the guidelines could be made a bit more explicit around the “what to backport” decision, for on- and off-line declared-unstable-by-upstream software. :)

What do you think about it?

@Zimmi48

This comment has been minimized.

Copy link
Member

Zimmi48 commented Aug 2, 2018

If we aim for the stability level of Debian, maybe we should just not accept declared-unstable-by-upstream packages into our stable version

This is incompatible with the way branching the stable branch from the unstable one works in nixpkgs.

@Ekleog

This comment has been minimized.

Copy link
Member

Ekleog commented Aug 2, 2018

With the current way of branching stable from unstable, yes, indeed :) and I agree excluding v0.* packages at branch-off time would be painful to do.

@samueldr

This comment has been minimized.

Copy link
Member Author

samueldr commented Aug 2, 2018

I am open for every crazy, impossible and different comments about the branching model and backports. Even if they are not possible to implement, knowing about them is useful, and could ideas or concept that weren't thought of.

Thanks for all your comments.

@7c6f434c

This comment has been minimized.

Copy link
Member

7c6f434c commented Aug 3, 2018

@vcunat for interesting package tests we need a solution to «testing with X11 needs partial duplication of NixOS test functionality, and debugging NixOS tests with all their overhead is way too annoying».

@Ekleog well, software used offline but mainly for handling some maliciously developed data formats (like .doc) could be effectively broken if the format updates quickly gain use.


It could also be possible to only implement parts of the RFC. Either the team
or the tooling. Both are of equal value and generally independent. They would,
though, work best if working together.

This comment has been minimized.

Copy link
@Baughn

Baughn Nov 2, 2018

It would be possible to add "should be backported" as a check-box on PRs, automating the discovery part.

@globin

This comment has been minimized.

Copy link
Member

globin commented Feb 25, 2019

Closing for now due to no activity, after communicating with @samueldr.

@globin globin closed this Feb 25, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

10 participants
You can’t perform that action at this time.