Strengthen Cherry Pick Guidance #7634

jeremyrickard · 2023-12-01T03:42:06Z

Describe the issue

The current cherry pick guidance has fairly clear criteria on what kind of PRs are good for cherry picks, but it seems like we could maybe enhance our guidance, for release managers reviewing cherry picks, for contributors opening cherry picks, and for SIG Leads reviewing cherry picks.

At the wg-lts meeting on November 21st, @liggitt presented a pretty thorough analysis on regressions introduced into patch releases through cherry picks, which are available here:

Kubernetes patch release regression/bugfix rate

Analysis of Kubernetes regression rates, patterns, examples

There was a pretty important take away: Every single minor version has had a backport cause a regression.

In the wg-lts meeting, we discussed a few concrete things:

We should limit back ports to regression / security / data-loss fixes
Anything beyond that needs justification / careful risk analysis (size, entanglement with other changes, test coverage)
The level of scrutiny applied to changes to master get between code freeze and a .0 release is the level we should be applying to cherry picks as well

The first bullet point is basically already expressed in our existing guidelines, so perhaps we should investigate a new PR template for cherry picks that includes a self-attestation from someone opening a cherry pick? This could also include a section to better indicate when a bug was introduced to better help understand if a bug fix should actually be cherry picked back to a release if the bug already existed prior to the .0 of that minor release.

/sig release
/kind documentation

sdodson · 2023-12-11T18:31:12Z

I spent some time looking to see if there were industry standard defect classes like data-loss, security, performance hoping maybe we could put together a bit of taxonomy and decide which classes are valid for cherry picks. However, it seems like most alignment is around severity like critical, major, minor, etc where the highest severity issues generally include security, data-loss, and often performance regressions. I think the recommendations from the first bullet point are a great place to start.

Assuming we want labels for these, area/security kind/regression are available today but data-loss doesn't seem to have a relevant label, perhaps kind/data-loss?

YuikoTakada · 2023-12-13T12:46:06Z

Assuming we want labels for these, area/security kind/regression are available today but data-loss doesn't seem to have a relevant label, perhaps kind/data-loss?

does it mean that unless labels( like area/security ) added, the PR is not allowed to cherry-pick?

sdodson · 2023-12-14T15:30:40Z

does it mean that unless labels( like area/security ) added, the PR is not allowed to cherry-pick?

I'm not sure it becomes a requirement, initially, but I believe having the labels available to track the criteria that are part of the consideration in cherry-pick approval would be helpful. As written the description just says that the author would self affirm it's one of the approved fix classes or provide other reasoning why the change needs to be backported so that the release team can make more informed decisions.

neolit123 · 2023-12-14T17:19:41Z

note that sometimes a fix for a user blocking bug could be eligible for a backport, but today that is just a kind/bug. at the EOD everything that is backported is a bug fix.

instead of working with the current set of labels we could create a new family of labels foo/bar, where bar must always be the resson and classification provided by the backport author and foo will be the family specific to backports. these labels could be explaned by bots, and blocking if not present (ALA needs-foo).

at the same time, ETOOMANYLABELS.

liggitt · 2023-12-15T21:51:48Z

one other category of changes that seem acceptable for backports came up - 1.29.0 accidentally enabled an alpha capability by not applying the feature gate correctly... fix to apply the gate correctly is in kubernetes/kubernetes#122343

That's sort of the opposite of a regression... it's an accidental progression of functionality intended to be gated to roll out in a controlled way.

k8s-triage-robot · 2024-03-14T22:07:53Z

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue as fresh with /remove-lifecycle stale
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot · 2024-04-13T22:15:12Z

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue as fresh with /remove-lifecycle rotten
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

jeremyrickard · 2024-04-23T14:07:04Z

/remove-lifecycle rotten

k8s-ci-robot added sig/release Categorizes an issue or PR as relevant to SIG Release. kind/documentation Categorizes issue or PR as related to documentation. labels Dec 1, 2023

saschagrunert mentioned this issue Dec 4, 2023

Improvements to the cherry-pick-approved prow plugin kubernetes/sig-release#2385

Open

liggitt mentioned this issue Dec 5, 2023

Enabling InPlacePodVerticalScaling feature gate causes restart of containers within the cluster kubernetes/kubernetes#122028

Closed

liggitt mentioned this issue Dec 15, 2023

Automated cherry pick of #122329: Wire in field dropping for CRDs kubernetes/kubernetes#122343

Merged

k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 14, 2024

k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Apr 13, 2024

k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Apr 23, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Strengthen Cherry Pick Guidance #7634

Strengthen Cherry Pick Guidance #7634

jeremyrickard commented Dec 1, 2023

sdodson commented Dec 11, 2023

YuikoTakada commented Dec 13, 2023

sdodson commented Dec 14, 2023

neolit123 commented Dec 14, 2023

liggitt commented Dec 15, 2023

k8s-triage-robot commented Mar 14, 2024

k8s-triage-robot commented Apr 13, 2024

jeremyrickard commented Apr 23, 2024

Strengthen Cherry Pick Guidance #7634

Strengthen Cherry Pick Guidance #7634

Comments

jeremyrickard commented Dec 1, 2023

Describe the issue

sdodson commented Dec 11, 2023

YuikoTakada commented Dec 13, 2023

sdodson commented Dec 14, 2023

neolit123 commented Dec 14, 2023

liggitt commented Dec 15, 2023

k8s-triage-robot commented Mar 14, 2024

k8s-triage-robot commented Apr 13, 2024

jeremyrickard commented Apr 23, 2024