Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pending UserNamespace beta]promote LocalStorageCapacityIsolationFSQuotaMonitoring to beta #112626

Closed
wants to merge 1 commit into from

Conversation

pacoxu
Copy link
Member

@pacoxu pacoxu commented Sep 21, 2022

This reverts commit 32a90f5.
This feature gate was promoted to beta only in v1.25.0 and was reverted in v1.25.1 for a regression issue.

See the history in #107329 and #112076

The KEP can be found in kubernetes/enhancements#2697

Action Items:

/kind feature
Xrefs kubernetes/enhancements#1029

promote LocalStorageCapacityIsolationFSQuotaMonitoring to beta

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Sep 21, 2022
@k8s-ci-robot k8s-ci-robot added area/test sig/node Categorizes an issue or PR as relevant to SIG Node. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. sig/testing Categorizes an issue or PR as relevant to SIG Testing. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Sep 21, 2022
@pacoxu
Copy link
Member Author

pacoxu commented Sep 21, 2022

/test pull-kubernetes-node-kubelet-serial-crio-cgroupv1
/test pull-kubernetes-node-kubelet-serial-crio-cgroupv2
/test pull-kubernetes-node-crio-cgrpv2-e2e
/test pull-kubernetes-node-crio-cgrpv2-e2e-kubetest2
/test pull-kubernetes-node-crio-e2e
/test pull-kubernetes-node-crio-e2e-kubetest2

@@ -939,7 +940,7 @@ var defaultKubernetesFeatureGates = map[featuregate.Feature]featuregate.FeatureS

LocalStorageCapacityIsolation: {Default: true, PreRelease: featuregate.GA, LockToDefault: true}, // remove in 1.27

LocalStorageCapacityIsolationFSQuotaMonitoring: {Default: false, PreRelease: featuregate.Alpha},
LocalStorageCapacityIsolationFSQuotaMonitoring: {Default: true, PreRelease: featuregate.Beta},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the isue fix in this pr: #112625? Is this PR just WIP than?

Copy link
Member Author

@pacoxu pacoxu Sep 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be pending until #112625 is merged to fix the regression.

I still need more verifying for the fix.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the fsquota related bug is fixed, I think we can promote it again. I will update the KEP as soon as possible in kubernetes/enhancements#3821

@SergeyKanzhelev SergeyKanzhelev moved this from Triage to PRs Waiting on Author in SIG Node CI/Test Board Sep 21, 2022
@SergeyKanzhelev SergeyKanzhelev moved this from Triage to Waiting on Author in SIG Node PR Triage Sep 21, 2022
@SergeyKanzhelev SergeyKanzhelev moved this from PRs Waiting on Author to Archive-it in SIG Node CI/Test Board Dec 7, 2022
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 21, 2022
@pacoxu
Copy link
Member Author

pacoxu commented Dec 26, 2022

/remove-lifecycle stale
pending on #112624 and maybe #114506

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 26, 2022
@SergeyKanzhelev
Copy link
Member

/assign @rphillips

Let's evaluate whether we can really promote it to beta as there seems to be bugs still that needs to be looked at.

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 4, 2023
@pacoxu
Copy link
Member Author

pacoxu commented Oct 15, 2023

/unhold
/triage accepted
/priority important-soon
/assign @dchen1107 @mrunalp

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 15, 2023
@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Oct 15, 2023
@pacoxu pacoxu moved this from Waiting on Author to Needs Approver in SIG Node PR Triage Oct 15, 2023
@pacoxu pacoxu moved this from Needs Approver to Needs Reviewer in SIG Node PR Triage Oct 15, 2023
@pacoxu
Copy link
Member Author

pacoxu commented Oct 20, 2023

Any further comments on this? @SergeyKanzhelev @rphillips as kubernetes/enhancements#3821 was merged for v1.29.

@rphillips @SergeyKanzhelev @derekwaynecarr Do you have any comments on this feature promotion?

@pacoxu
Copy link
Member Author

pacoxu commented Oct 20, 2023

/assign @derekwaynecarr

@pacoxu
Copy link
Member Author

pacoxu commented Oct 31, 2023

Any further comments on this? @SergeyKanzhelev @rphillips as kubernetes/enhancements#3821 was merged for v1.29.

@rphillips @SergeyKanzhelev @derekwaynecarr Do you have any comments on this feature promotion?

Ping again before the code freeze.

@pacoxu
Copy link
Member Author

pacoxu commented Dec 15, 2023

Kernel issue. Other bugs reported
#112626
Mrunal and Peter to check on kernel issue

@mrunalp @haircommander do you have the kernel issue link?

@haircommander
Copy link
Contributor

It's not so much an issue as expected behavior, and actually I may have been wrong. Any process in the root user namespace can change the project ID of a path, effectively nullifying the quota for a malicious user. In some of the documentation, users need CAP_SYS_RESOURCE to change the hard limit, and I'm not sure if that also applies to changing the project ID. there seems to have been a discussion on this but it's not clear where this went.

I am trying to investigate further. If a user does need CAP_SYS_RESOURCE to change a project ID, then we're good to move forward with this beta IMO. Otherwise, we will need to wait until user namespaces are more prevalent to move forward with this feature

@haircommander
Copy link
Contributor

Okay I confirmed with kernel engineer. It's not actually documented other than in the code, but a process just needs to own the file or have CAP_FOWNER. If it's in a user namespace it can't change the projid, so we would need to require user namespace support. Theoretically we could pursue beta for this in 1.30, since we're pushing for user namespace beta then, but we may want to wait for the feature to be on by default (maybe 1.31?)

@pacoxu
Copy link
Member Author

pacoxu commented Dec 21, 2023

Okay I confirmed with kernel engineer. It's not actually documented other than in the code, but a process just needs to own the file or have CAP_FOWNER. If it's in a user namespace it can't change the projid, so we would need to require user namespace support. Theoretically we could pursue beta for this in 1.30, since we're pushing for user namespace beta then, but we may want to wait for the feature to be on by default (maybe 1.31?)

I may update the KEP based on this.

  • If user namespace FG is off and LocalStorageCapacityIsolationFSQuotaMonitoring is on, should we give a warning to user about this?

As a risk, we'd better to wait for User Namespace beta before promoting this to beta in my opinion.

@pacoxu pacoxu changed the title promote LocalStorageCapacityIsolationFSQuotaMonitoring to beta [pending UserNamespace beta]promote LocalStorageCapacityIsolationFSQuotaMonitoring to beta Dec 21, 2023
@haircommander
Copy link
Contributor

If user namespace FG is off and LocalStorageCapacityIsolationFSQuotaMonitoring is on, should we give a warning to user about this?

yeah probably

As a risk, we'd better to wait for kubernetes/enhancements#127 beta before promoting this to beta in my opinion.

works for me!

@pacoxu pacoxu marked this pull request as draft February 18, 2024 06:34
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 18, 2024
@pacoxu
Copy link
Member Author

pacoxu commented Feb 28, 2024

/remove-priority important-soon

@k8s-ci-robot k8s-ci-robot added needs-priority Indicates a PR lacks a `priority/foo` label and requires one. and removed priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Feb 28, 2024
@bart0sh bart0sh moved this from Needs Reviewer to WIP in SIG Node PR Triage Apr 18, 2024
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 28, 2024
@pacoxu pacoxu closed this May 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/feature Categorizes issue or PR as related to a new feature. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/storage Categorizes an issue or PR as relevant to SIG Storage. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

None yet

10 participants