KEP-3746 - Initial KEP for specifying root-fs volume size for Windows containers #3951
marosset commented Apr 10, 2023 (edited)
- One-line PR description:
- Issue link: Specify root-fs volume size for Windows containers #3746
- Other comments:
…tainers Signed-off-by: Mark Rossetti <marosset@microsoft.com>
0c2db81
to
0c65a05
Compare
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: marosset The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
A user wants to run a workload that requires more than 20Gb of disk space on the root-fs volume.

#### Story 2
What about running with less than 20Gb?
-->

IO performance when writing to `emptyDir` volumes is worse than writing to the container root fs.
https://github.com/microsoft/Windows-Containers/issues/345
ah I see you mentioned it down here. Maybe could link to the various issues we've gotten related to this?
- name: container1
  resources:
    limits:
      storage: 30Gi
agree `storage` is ambiguous in how it interacts with mounted volumes... would a more explicit `rootfs-storage` be better?
also, is there a reason this would not apply to both linux and windows?
what was the resolution of the name `storage` being ambiguous?
CRI or CNI may require updating that component before the kubelet.
-->

N/A
I'm guessing this would be added to a lot of the places we include ephemeral storage, in terms of validation.
See use of EphemeralStorage in https://github.com/kubernetes/kubernetes/blob/master/pkg/apis/core/helper/helpers.go, which fans out to helper methods like IsStandardContainerResourceName, IsStandardResourceName, IsStandardQuotaResourceName, etc.
Any validation of resources and allowed values for resources would need to consider skew so we don't persist data an n-1 API server or controller would choke on.
Updated - PTAL
not seeing the update, still looks like N/A
I left a comment at #3951 (comment) with a sketch of how to relax validation safely and be sure you are considering the impact on all callers
Signed-off-by: Mark Rossetti <marosset@microsoft.com>
/api-review
### API server updates

Update API server validation to allow `resources.limits.storage` to be specified on containers. While the feature is in development, this validation will be gated by a feature flag.
The relevant code is in [ValidateContainerResourceName](https://github.com/kubernetes/kubernetes/blob/ad18954259eae3db51bac2274ed4ca7304b923c4/pkg/apis/core/v1/validation/validation.go#L73-L86)
Relaxing validation has to be done over two releases. Otherwise, we allow data to be persisted that will be considered invalid by the previous release of the API server (which could still be running in a multi-server cluster during rolling upgrade).
I would suggest doing the following:
- in `pkg/apis/core/validation/validation.go`, add an `AllowContainerStorageResource bool` field to `PodValidationOptions`
- make sure that options struct is passed through all validation functions down to ValidateContainerResourceName
  - this will reveal lots of call paths that lead to this function, all of which have to consider whether to allow the relaxed data or not
- in ValidateContainerResourceName, only allow the relaxed validation if the `AllowContainerStorageResource` field is true
- in `GetValidationOptionsFromPodSpecAndMeta`, compute the validation options based on the new and old object
  - in this release
    - on create (new object is non-nil, old object is nil), set the `AllowContainerStorageResource` field to `true` only if the feature gate is enabled
    - on update, set the `AllowContainerStorageResource` field to `true` only if the old object already has a container using the storage resource or the feature is enabled
  - in a future release, enable the feature gate
  - in an even more future release, graduate the feature gate and remove the option check
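
The create/update rule in the list above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the Kubernetes code: `Container`, `PodSpec`, `usesStorageLimit`, and `validationOptionsFor` are hypothetical stand-ins, the allowed-name list is deliberately simplified, and only the `AllowContainerStorageResource` field and the `storage` resource name are taken from the comment.

```go
package main

import "fmt"

// Container and PodSpec are hypothetical, heavily simplified stand-ins
// for the real Kubernetes API types.
type Container struct {
	Name   string
	Limits map[string]string // resource name -> quantity, e.g. "storage": "30Gi"
}

type PodSpec struct {
	Containers []Container
}

// PodValidationOptions models only the field proposed in the review comment.
type PodValidationOptions struct {
	AllowContainerStorageResource bool
}

// usesStorageLimit reports whether any container sets limits.storage.
// A nil spec (no old object, i.e. a create) never does.
func usesStorageLimit(spec *PodSpec) bool {
	if spec == nil {
		return false
	}
	for _, c := range spec.Containers {
		if _, ok := c.Limits["storage"]; ok {
			return true
		}
	}
	return false
}

// validationOptionsFor relaxes validation when the feature gate is on, or
// when the already-persisted object uses the resource, so that existing
// data stays updatable after the gate is turned off.
func validationOptionsFor(oldSpec *PodSpec, gateEnabled bool) PodValidationOptions {
	return PodValidationOptions{
		AllowContainerStorageResource: gateEnabled || usesStorageLimit(oldSpec),
	}
}

// validateContainerResourceName is a toy check: a few standard names are
// always accepted, and "storage" only when the option allows it.
func validateContainerResourceName(name string, opts PodValidationOptions) error {
	switch name {
	case "cpu", "memory", "ephemeral-storage":
		return nil
	case "storage":
		if opts.AllowContainerStorageResource {
			return nil
		}
	}
	return fmt.Errorf("unsupported container resource name %q", name)
}

func main() {
	stored := &PodSpec{Containers: []Container{
		{Name: "container1", Limits: map[string]string{"storage": "30Gi"}},
	}}
	// Create with the gate off: rejected.
	fmt.Println(validateContainerResourceName("storage", validationOptionsFor(nil, false)))
	// Update of an object that already uses the resource, gate off: accepted.
	fmt.Println(validateContainerResourceName("storage", validationOptionsFor(stored, false)))
}
```

Deriving the option from the old object is what makes the relaxation safe under version skew: nothing new is persisted while the gate is off, yet objects written while it was on can still be updated.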
-->

Yes
If the feature is disabled, then new pods admitted into the cluster and new containers started will always be created with a root-fs volume size of 20Gb (default for Windows containers) instead of the size specified in the Pod spec.
Just to clarify, any existing pods that have been admitted with a root-fs size other than 20Gb will continue to run at the configured size even if the feature is turned off?
Yes, I would expect a pod that has been admitted with a root-fs size other than 20Gb would continue to run at the configured size after the feature was turned off.
Would you add this detail to the KEP? Thanks!
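
The behavior described in this thread could be expressed along these lines. This is only a sketch under stated assumptions: `rootFsBytesForNewContainer` and `defaultRootFsBytes` are hypothetical names, and the 20Gb default is interpreted here as 20 GiB, which the KEP text does not specify.

```go
package main

import "fmt"

// defaultRootFsBytes is an assumption: the thread says "20Gb", which is
// read here as 20 GiB.
const defaultRootFsBytes int64 = 20 << 30

// rootFsBytesForNewContainer picks the size used when a container is
// started. With the feature disabled, or with no size requested, the
// default is used regardless of what the Pod spec asks for.
func rootFsBytesForNewContainer(gateEnabled bool, requestedBytes int64) int64 {
	if !gateEnabled || requestedBytes <= 0 {
		return defaultRootFsBytes
	}
	return requestedBytes
}

func main() {
	// Container started while the feature is enabled: gets the requested 30 GiB.
	existing := rootFsBytesForNewContainer(true, 30<<30)

	// The feature is then disabled. The size is only decided at container
	// start, so the running container keeps its volume, while a newly
	// started container with the same spec falls back to the default.
	replacement := rootFsBytesForNewContainer(false, 30<<30)

	fmt.Println(existing, replacement)
}
```

The size is chosen once, at container start, so flipping the gate affects only containers created afterwards.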
@marosset: The following tests failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
Please update with me as approver, and make the addition Joe asked for and I can approve PRR.
# of http://git.k8s.io/enhancements/OWNERS_ALIASES
kep-number: 3503
alpha:
  approver: TBD
please put me here
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten |
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /close |
@k8s-triage-robot: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |