Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Strengthen Cherry Pick Guidance #7634

Open
jeremyrickard opened this issue Dec 1, 2023 · 8 comments
Open

Strengthen Cherry Pick Guidance #7634

jeremyrickard opened this issue Dec 1, 2023 · 8 comments
Labels
kind/documentation Categorizes issue or PR as related to documentation. sig/release Categorizes an issue or PR as relevant to SIG Release.

Comments

@jeremyrickard
Copy link
Contributor

Describe the issue

The current cherry pick guidance has fairly clear criteria on what kind of PRs are good for cherry picks, but it seems like we could maybe enhance our guidance, for release managers reviewing cherry picks, for contributors opening cherry picks, and for SIG Leads reviewing cherry picks.

At the wg-lts meeting on November 21st, @liggitt presented a pretty thorough analysis on regressions introduced into patch releases through cherry picks, which are available here:

Kubernetes patch release regression/bugfix rate

Analysis of Kubernetes regression rates, patterns, examples

There was a pretty important take away: Every single minor version has had a backport cause a regression.

In the wg-lts meeting, we discussed a few concrete things:

  • We should limit back ports to regression / security / data-loss fixes
  • Anything beyond that needs justification / careful risk analysis (size, entanglement with other changes, test coverage)
  • The level of scrutiny applied to changes to master get between code freeze and a .0 release is the level we should be applying to cherry picks as well

The first bullet point is basically already expressed in our existing guidelines, so perhaps we should investigate a new PR template for cherry picks that includes a self-attestation from someone opening a cherry pick? This could also include a section to better indicate when a bug was introduced to better help understand if a bug fix should actually be cherry picked back to a release if the bug already existed prior to the .0 of that minor release.

/sig release
/kind documentation

@sdodson
Copy link

sdodson commented Dec 11, 2023

I spent some time looking to see if there were industry standard defect classes like data-loss, security, performance hoping maybe we could put together a bit of taxonomy and decide which classes are valid for cherry picks. However, it seems like most alignment is around severity like critical, major, minor, etc where the highest severity issues generally include security, data-loss, and often performance regressions. I think the recommendations from the first bullet point are a great place to start.

Assuming we want labels for these, area/security kind/regression are available today but data-loss doesn't seem to have a relevant label, perhaps kind/data-loss?

@YuikoTakada
Copy link
Contributor

Assuming we want labels for these, area/security kind/regression are available today but data-loss doesn't seem to have a relevant label, perhaps kind/data-loss?

does it mean that unless labels( like area/security ) added, the PR is not allowed to cherry-pick?

@sdodson
Copy link

sdodson commented Dec 14, 2023

does it mean that unless labels( like area/security ) added, the PR is not allowed to cherry-pick?

I'm not sure it becomes a requirement, initially, but I believe having the labels available to track the criteria that are part of the consideration in cherry-pick approval would be helpful. As written the description just says that the author would self affirm it's one of the approved fix classes or provide other reasoning why the change needs to be backported so that the release team can make more informed decisions.

@neolit123
Copy link
Member

note that sometimes a fix for a user blocking bug could be eligible for a backport, but today that is just a kind/bug. at the EOD everything that is backported is a bug fix.

instead of working with the current set of labels we could create a new family of labels foo/bar, where bar must always be the resson and classification provided by the backport author and foo will be the family specific to backports. these labels could be explaned by bots, and blocking if not present (ALA needs-foo).

at the same time, ETOOMANYLABELS.

@liggitt
Copy link
Member

liggitt commented Dec 15, 2023

one other category of changes that seem acceptable for backports came up - 1.29.0 accidentally enabled an alpha capability by not applying the feature gate correctly... fix to apply the gate correctly is in kubernetes/kubernetes#122343

That's sort of the opposite of a regression... it's an accidental progression of functionality intended to be gated to roll out in a controlled way.

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 14, 2024
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Apr 13, 2024
@jeremyrickard
Copy link
Contributor Author

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Apr 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/documentation Categorizes issue or PR as related to documentation. sig/release Categorizes an issue or PR as relevant to SIG Release.
Projects
None yet
Development

No branches or pull requests

7 participants