Commit 3717372

Update kep-4876 Mutable CSINode Allocatable for Beta
1 parent ee25de8 commit 3717372

3 files changed

+90
-24
lines changed

Lines changed: 2 additions & 0 deletions
@@ -1,3 +1,5 @@
11
kep-number: 4876
22
alpha:
33
approver: "@deads2k"
4+
beta:
5+
approver: "@deads2k"

keps/sig-storage/4876-mutable-csinode-allocatable/README.md

Lines changed: 85 additions & 20 deletions
@@ -18,8 +18,12 @@
1818
- [API Changes](#api-changes)
1919
- [CSINode](#csinode)
2020
- [CSIDriver](#csidriver)
21+
- [VolumeError](#volumeerror)
2122
- [Validation Changes](#validation-changes)
22-
- [Volume Plugin Manager](#volume-plugin-manager)
23+
- [CSI Node Updater](#csi-node-updater)
24+
- [Implementation details](#implementation-details)
25+
- [Update behavior](#update-behavior)
26+
- [Error handling](#error-handling)
2327
- [NodeInfoManager Interface Extension](#nodeinfomanager-interface-extension)
2428
- [CSINode Update Behavior](#csinode-update-behavior)
2529
- [Pod Construction Changes](#pod-construction-changes)
@@ -186,21 +190,45 @@ type VolumeNodeResources struct {
186190

187191
#### CSIDriver
188192

189-
A new field, `NodeAllocatableUpdatePeriodSeconds`, will be added to the `CSIDriverSpec` struct. This field allows a CSI driver to specify the interval at which the Kubelet should periodically query a driver's `NodeGetInfo` RPC endpoint to update the `CSINode` object. If this field is not set, updates will only occur in response to volume attachment failures as a result of no capacity.
193+
A new field, `NodeAllocatableUpdatePeriodSeconds`, will be added to the `CSIDriverSpec` struct. This field allows a CSI driver to specify the interval at which the Kubelet should periodically query a driver's `NodeGetInfo` RPC endpoint to update the `CSINode` object. If this field is not set, no updates occur (neither periodic nor upon detecting capacity-related failures), and the allocatable count remains static.
190194

191195
```golang
192196
// CSIDriverSpec is the specification of a CSIDriver.
193197
type CSIDriverSpec struct {
194198
...
195-
// NodeAllocatableUpdatePeriodSeconds specifies the interval between periodic updates of
196-
// the CSINode allocatable capacity for this driver. If not set, periodic updates
197-
// are disabled, and updates occur only upon detecting capacity-related failures.
198-
// The minimum allowed value for this field is 10 seconds.
199-
// +optional
199+
// nodeAllocatableUpdatePeriodSeconds specifies the interval between periodic updates of
200+
// the CSINode allocatable capacity for this driver. When set, both periodic updates and
201+
// updates triggered by capacity-related failures are enabled. If not set, no updates
202+
// occur (neither periodic nor upon detecting capacity-related failures), and the
203+
// allocatable.count remains static. The minimum allowed value for this field is 10 seconds.
204+
//
205+
//
206+
// This field is mutable.
207+
//
208+
// +featureGate=MutableCSINodeAllocatableCount
209+
// +optional
200210
NodeAllocatableUpdatePeriodSeconds *int64
201211
}
202212
```
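The semantics described for this field (unset disables all updates, and a set value must be at least 10 seconds) can be sketched as a small helper. This is a minimal illustration only: `updatePeriod` and `minUpdatePeriodSeconds` are hypothetical names, not part of the KEP or the Kubernetes codebase.

```go
package main

import (
	"fmt"
	"time"
)

// minUpdatePeriodSeconds mirrors the 10-second minimum stated in the field comment.
const minUpdatePeriodSeconds int64 = 10

// updatePeriod is a hypothetical helper. It returns the update interval and
// whether allocatable updates are enabled for a driver.
func updatePeriod(periodSeconds *int64) (time.Duration, bool, error) {
	if periodSeconds == nil {
		// Field not set: neither periodic nor failure-triggered updates occur.
		return 0, false, nil
	}
	if *periodSeconds < minUpdatePeriodSeconds {
		return 0, false, fmt.Errorf("nodeAllocatableUpdatePeriodSeconds must be at least %d, got %d",
			minUpdatePeriodSeconds, *periodSeconds)
	}
	return time.Duration(*periodSeconds) * time.Second, true, nil
}

func main() {
	sixty := int64(60)
	period, enabled, err := updatePeriod(&sixty)
	fmt.Println(period, enabled, err)
}
```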
203213

214+
#### VolumeError
215+
216+
A new field, `ErrorCode`, will be added to the `VolumeError` struct to facilitate detection of capacity-related errors:
217+
218+
```golang
219+
// Captures an error encountered during a volume operation.
220+
type VolumeError struct {
221+
...
222+
// errorCode is a numeric gRPC code representing the error encountered during Attach or Detach operations.
223+
//
224+
// This optional field can be set only when the MutableCSINodeAllocatableCount feature gate is enabled.
225+
//
226+
// +featureGate=MutableCSINodeAllocatableCount
227+
// +optional
228+
ErrorCode *int32
229+
}
230+
```
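As a rough sketch of how a consumer might use the new field, the check below treats an error as capacity-related when its code equals the gRPC `RESOURCE_EXHAUSTED` status code (numeric value 8). The `volumeError` type here is a trimmed stand-in for the real API type, and `isResourceExhausted` is a hypothetical helper name.

```go
package main

import "fmt"

// grpcResourceExhausted is the numeric value of the gRPC RESOURCE_EXHAUSTED status code.
const grpcResourceExhausted int32 = 8

// volumeError is a trimmed, illustrative stand-in for the VolumeError API type.
type volumeError struct {
	Message   string
	ErrorCode *int32
}

// isResourceExhausted (hypothetical helper) reports whether the error carries
// a code indicating that the node has no remaining attachment capacity.
func isResourceExhausted(e *volumeError) bool {
	return e != nil && e.ErrorCode != nil && *e.ErrorCode == grpcResourceExhausted
}

func main() {
	code := grpcResourceExhausted
	withCode := &volumeError{Message: "attach limit reached", ErrorCode: &code}
	withoutCode := &volumeError{Message: "attach timed out"}
	fmt.Println(isResourceExhausted(withCode), isResourceExhausted(withoutCode))
}
```

Because the field is a pointer, an unset code (for example, when the feature gate is disabled) is distinguishable from the gRPC `OK` code 0.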
231+
204232
#### Validation Changes
205233

206234
The [ValidateCSINodeUpdate](https://github.com/kubernetes/kubernetes/blob/master/pkg/apis/storage/validation/validation.go#L304) function in the API validation code path will be modified to allow updates to the `Allocatable.Count`
@@ -226,20 +254,53 @@ func ValidateCSINodeUpdate(new, old *storage.CSINode) field.ErrorList {
226254

227255
This updated logic allows the `Allocatable.Count` field to be modified when the feature gate is enabled, while ensuring all other fields remain immutable. When the feature gate is disabled, it falls back to the existing validation logic for backward compatibility.
228256

229-
#### Volume Plugin Manager
257+
#### CSI Node Updater
258+
259+
A new plugin-level updater will be implemented in `kubernetes/pkg/volume/csi/csi_node_updater.go` to manage periodic updates of CSINode allocatable counts. This updater watches for changes to CSIDriver objects and manages per-driver update goroutines based on the `NodeAllocatableUpdatePeriodSeconds` setting.
260+
261+
##### Implementation details
262+
263+
```golang
264+
// csiNodeUpdater watches for changes to CSIDriver objects and manages the lifecycle
265+
// of per-driver goroutines that periodically update CSINodeDriver.Allocatable information
266+
type csiNodeUpdater struct {
267+
// Informer for CSIDriver objects
268+
driverInformer cache.SharedIndexInformer
269+
270+
// Map of driver names to stop channels for update goroutines
271+
driverUpdaters sync.Map
272+
273+
// Ensures the updater is only started once
274+
once sync.Once
275+
}
276+
```
277+
##### Update behavior
278+
279+
When a `CSIDriver` object is added or updated with `NodeAllocatableUpdatePeriodSeconds` set, the updater checks if the driver is installed on the node before running periodic updates.
230280

231-
A new goroutine will be started in VolumePluginMgr’s [Run()](https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/plugins.go#L953) func if the `NodeAllocatableUpdatePeriodSeconds` is set to a nonzero value. This goroutine will periodically trigger updates to the `CSINode` object based on the specified interval:
281+
When `NodeAllocatableUpdatePeriodSeconds` is modified, the updater automatically adjusts by stopping the old goroutine and starting a new one. Setting the period to 0 or nil stops updates entirely. Driver uninstallation or `CSIDriver` object deletion also stops the update goroutine for that specific driver.
232282

233283
```golang
234-
func (pm *VolumePluginMgr) Run(stopCh <-chan struct{}) {
235-
if pm.csiNodeUpdateInterval > 0 {
236-
go wait.Until(pm.updateCSINodeInfo, pm.csiNodeUpdateInterval, stopCh)
284+
func (u *csiNodeUpdater) runPeriodicUpdate(driverName string, period time.Duration, stopCh <-chan struct{}) {
285+
ticker := time.NewTicker(period)
286+
defer ticker.Stop()
287+
288+
for {
289+
select {
290+
case <-ticker.C:
291+
if err := updateCSIDriver(driverName); err != nil {
292+
klog.ErrorS(err, "Failed to update CSIDriver", "driver", driverName)
293+
}
294+
case <-stopCh:
295+
return
296+
}
237297
}
238298
}
239299
```
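The stop-and-restart behavior described above could look roughly like the following. This is a sketch under stated assumptions: it uses a mutex-protected map instead of the `sync.Map` in the struct above, and `driverUpdaters`, `setPeriod`, and `running` are hypothetical names chosen for illustration.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// driverUpdaters is an illustrative stand-in for the updater's per-driver state:
// one stop channel per driver with a running update goroutine.
type driverUpdaters struct {
	mu    sync.Mutex
	stops map[string]chan struct{}
}

func newDriverUpdaters() *driverUpdaters {
	return &driverUpdaters{stops: map[string]chan struct{}{}}
}

// setPeriod stops any existing goroutine for the driver and, if period is
// positive, starts a new one that calls update on every tick.
func (u *driverUpdaters) setPeriod(driver string, period time.Duration, update func(string)) {
	u.mu.Lock()
	defer u.mu.Unlock()
	if stop, ok := u.stops[driver]; ok {
		close(stop) // signal the old goroutine to exit
		delete(u.stops, driver)
	}
	if period <= 0 {
		return // period unset or zero: updates stay stopped
	}
	stop := make(chan struct{})
	u.stops[driver] = stop
	go func() {
		ticker := time.NewTicker(period)
		defer ticker.Stop()
		for {
			select {
			case <-ticker.C:
				update(driver)
			case <-stop:
				return
			}
		}
	}()
}

// running reports whether an update goroutine is registered for the driver.
func (u *driverUpdaters) running(driver string) bool {
	u.mu.Lock()
	defer u.mu.Unlock()
	_, ok := u.stops[driver]
	return ok
}

func main() {
	u := newDriverUpdaters()
	refresh := func(d string) { fmt.Println("refreshing allocatable count for", d) }
	u.setPeriod("example.csi.driver", 10*time.Millisecond, refresh)
	time.Sleep(35 * time.Millisecond)
	u.setPeriod("example.csi.driver", 0, nil) // stops updates entirely
	fmt.Println("still running:", u.running("example.csi.driver"))
}
```

Calling `setPeriod` again with a different period covers the modification case: the old goroutine is stopped before the new ticker starts, so at most one goroutine per driver is registered at a time.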
240300

241-
In case of a failure during the `updateCSINodeInfo` call, the `Allocatable.Count` will retain its current value and `updateCSINodeInfo` will be retried.
301+
##### Error handling
242302

303+
If `updateCSIDriver()` fails, the error is logged but the allocatable count retains its current value. Updates continue at the configured interval regardless of individual failures.
243304

244305
#### NodeInfoManager Interface Extension
245306

@@ -262,7 +323,7 @@ This table explains how updates to the `CSINode.Spec.Drivers[*].Allocatable.Coun
262323
| **Feature Flag Status** | **`NodeAllocatableUpdatePeriodSeconds`** | **Behavior** |
263324
|------------------------------------------|-------------------------------------|------------------------------------------------------------------------------------------------------------------------------------|
264325
| Enabled | Set | Periodic updates occur at the defined interval + when invalid state is detected (volume attachment failures due to `ResourceExhausted`)|
265-
| Enabled | Not set | Updates occur only in response to volume attachment failures (`ResourceExhausted` errors) |
326+
| Enabled | Not set | No updates occur; `Allocatable.Count` remains static |
266327
| Disabled | Set | `NodeAllocatableUpdatePeriodSeconds` is ignored; `Allocatable.Count` remains static and immutable |
267328
| Disabled | Not set | No updates occur; `Allocatable.Count` remains static and immutable |
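The four rows of the table reduce to a short decision function. The sketch below is illustrative only; `allocatableUpdateBehavior` and its return strings are hypothetical and simply restate the table.

```go
package main

import "fmt"

// allocatableUpdateBehavior (hypothetical helper) maps the feature gate state
// and the optional period field to the behavior listed in the table.
func allocatableUpdateBehavior(gateEnabled bool, periodSeconds *int64) string {
	switch {
	case !gateEnabled:
		// The field is ignored when the gate is off.
		return "no updates; count is static and immutable"
	case periodSeconds == nil:
		return "no updates; count is static"
	default:
		return "periodic updates plus updates on ResourceExhausted attachment failures"
	}
}

func main() {
	period := int64(60)
	fmt.Println(allocatableUpdateBehavior(true, &period))
	fmt.Println(allocatableUpdateBehavior(true, nil))
	fmt.Println(allocatableUpdateBehavior(false, &period))
	fmt.Println(allocatableUpdateBehavior(false, nil))
}
```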
268329

@@ -271,7 +332,7 @@ This table explains how updates to the `CSINode.Spec.Drivers[*].Allocatable.Coun
271332

272333
To address race conditions where the scheduler assigns stateful pods to nodes with insufficient capacity, Kubelet's pod construction process during [WaitForAttachAndMount](https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/volumemanager/volume_manager.go#L393) will now handle `ResourceExhausted` errors returned by CSI drivers during the `ControllerPublishVolume` RPC.
273334

274-
The `ResourceExhausted` error is directly reported on the `VolumeAttachment` object associated with the relevant attachment. To facilitate easier detection of `ResourceExhausted` errors from `VolumeAttachment` statuses, we propose adding a `StatusCode` field to the [VolumeError](https://github.com/kubernetes/api/blob/master/storage/v1/types.go#L219) struct.
335+
The `ResourceExhausted` error is directly reported on the `VolumeAttachment` object associated with the relevant attachment. To facilitate easier detection of `ResourceExhausted` errors from `VolumeAttachment` statuses, we propose adding an `ErrorCode` field to the [VolumeError](https://github.com/kubernetes/api/blob/master/storage/v1/types.go#L219) struct.
275336

276337
```golang
277338
if err := kl.volumeManager.WaitForAttachAndMount(pod); err != nil {
@@ -395,7 +456,6 @@ We expect no non-infra related flakes in the last month as a GA graduation crite
395456

396457
#### Beta
397458

398-
- Allowing time for feedback (at least 2 releases between beta and GA).
399459
- All unit tests/integration/e2e tests completed and enabled.
400460
- Validate kubelet behavior when API server rejects `CSINode` updates (older API server version).
401461
- Validate CSI driver behavior with and without the `NodeAllocatableUpdatePeriodSeconds` field set.
@@ -405,8 +465,7 @@ We expect no non-infra related flakes in the last month as a GA graduation crite
405465

406466
#### GA
407467

408-
- All beta criteria have been satisfied.
409-
- Feature is stable.
468+
- Feature stability: at least 2 releases between Beta and GA.
410469
- No bug reports / feedback / improvements to address.
411470

412471
### Upgrade / Downgrade Strategy
@@ -490,7 +549,7 @@ well as the [existing list] of feature gates.
490549

491550
- [X] Feature gate (also fill in values in `kep.yaml`)
492551
- Feature gate name: `MutableCSINodeAllocatableCount`
493-
- Components depending on the feature gate: `kube-apiserver`, `kube-controller-manager`, `kubelet`.
552+
- Components depending on the feature gate: `kube-apiserver`, `kubelet`.
494553

495554
###### Does enabling the feature change any default behavior?
496555

@@ -705,7 +764,7 @@ Yes, there will be new API calls to update the `CSINode` object:
705764
```
706765
API call type: PATCH
707766
Estimated throughput: Depends on the `NodeAllocatableUpdatePeriodSeconds` setting and the frequency of volume attachment failures.
708-
Originating component: Kubelet, KCM
767+
Originating component: Kubelet
709768
```
710769

711770
###### Will enabling / using this feature result in introducing new API types?
@@ -800,6 +859,8 @@ details). For now, we leave it here.
800859

801860
###### How does this feature react if the API server and/or etcd is unavailable?
802861

862+
When the API server is unavailable, `CSINode` update attempts fail and are logged. The periodic update goroutines continue running and retry at their configured intervals. Additionally, `ResourceExhausted` errors cannot trigger immediate updates because `VolumeAttachment` statuses cannot be read. Existing allocatable values remain unchanged, and stateful workloads continue running normally.
863+
803864
###### What are other known failure modes?
804865

805866
<!--
@@ -815,8 +876,12 @@ For each of them, fill in the following information by copying the below templat
815876
- Testing: Are there any tests for failure mode? If not, describe why.
816877
-->
817878

879+
No other known failure modes.
880+
818881
###### What steps should be taken if SLOs are not being met to determine the problem?
819882

883+
N/A
884+
820885
## Implementation History
821886

822887
<!--

keps/sig-storage/4876-mutable-csinode-allocatable/kep.yaml

Lines changed: 3 additions & 4 deletions
@@ -22,12 +22,12 @@ approvers:
2222
- "@msau42"
2323

2424
# The target maturity stage in the current dev cycle for this KEP.
25-
stage: alpha
25+
stage: beta
2626

2727
# The most recent milestone for which work toward delivery of this KEP has been
2828
# done. This can be the current (upcoming) milestone, if it is being actively
2929
# worked on.
30-
latest-milestone: "v1.33"
30+
latest-milestone: "v1.34"
3131

3232
# The milestone at which this feature was, or is targeted to be, at each stage.
3333
milestone:
@@ -38,10 +38,9 @@ milestone:
3838
# The following PRR answers are required at alpha release
3939
# List the feature gate name and the components for which it must be enabled
4040
feature-gates:
41-
- name: MutableCSINode
41+
- name: MutableCSINodeAllocatableCount
4242
components:
4343
- kube-apiserver
44-
- kube-controller-manager
4544
- kubelet
4645
disable-supported: true
4746
