-[Pod Construction Changes](#pod-construction-changes)
@@ -186,21 +190,45 @@ type VolumeNodeResources struct {

 #### CSIDriver

-A new field, `NodeAllocatableUpdatePeriodSeconds`, will be added to the `CSIDriverSpec` struct. This field allows a CSI driver to specify the interval at which the Kubelet should periodically query a driver's `NodeGetInfo` RPC endpoint to update the `CSINode` object. If this field is not set, updates will only occur in response to volume attachment failures as a result of no capacity.
+A new field, `NodeAllocatableUpdatePeriodSeconds`, will be added to the `CSIDriverSpec` struct. This field allows a CSI driver to specify the interval at which the Kubelet should periodically query a driver's `NodeGetInfo` RPC endpoint to update the `CSINode` object. If this field is not set, no updates occur (neither periodic nor upon detecting capacity-related failures), and the allocatable count remains static.

```golang
 // CSIDriverSpec is the specification of a CSIDriver.
 type CSIDriverSpec struct {
 	...
-	// NodeAllocatableUpdatePeriodSeconds specifies the interval between periodic updates of
-	// the CSINode allocatable capacity for this driver. If not set, periodic updates
-	// are disabled, and updates occur only upon detecting capacity-related failures.
-	// The minimum allowed value for this field is 10 seconds.
-	// +optional
+	// nodeAllocatableUpdatePeriodSeconds specifies the interval between periodic updates of
+	// the CSINode allocatable capacity for this driver. When set, both periodic updates and
+	// updates triggered by capacity-related failures are enabled. If not set, no updates
+	// occur (neither periodic nor upon detecting capacity-related failures), and the
+	// allocatable.count remains static. The minimum allowed value for this field is 10 seconds.
+	//
+	//
+	// This field is mutable.
+	//
+	// +featureGate=MutableCSINodeAllocatableCount
+	// +optional
 	NodeAllocatableUpdatePeriodSeconds *int64
 }
```
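
The field comment above implies a simple validation rule: the field is optional, and when set it must be at least 10 seconds. A minimal sketch of that rule follows. `validateUpdatePeriod` and `minUpdatePeriodSeconds` are hypothetical names used only for illustration; they are not the actual API validation code.

```go
package main

import "fmt"

// minUpdatePeriodSeconds mirrors the 10-second minimum stated in the field comment.
const minUpdatePeriodSeconds int64 = 10

// validateUpdatePeriod is a hypothetical helper, not the real Kubernetes validation.
// A nil period is valid (the field is optional and means "no updates");
// a set period must be at least minUpdatePeriodSeconds.
func validateUpdatePeriod(period *int64) error {
	if period == nil {
		return nil
	}
	if *period < minUpdatePeriodSeconds {
		return fmt.Errorf("nodeAllocatableUpdatePeriodSeconds must be at least %d, got %d",
			minUpdatePeriodSeconds, *period)
	}
	return nil
}
```

Under this sketch, an unset field and a value of 10 both pass, while a value of 5 is rejected.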

+#### VolumeError
+
+A new field, `ErrorCode`, will be added to the `VolumeError` struct to facilitate detection of capacity-related errors:
+
```golang
+// Captures an error encountered during a volume operation.
+type VolumeError struct {
+	...
+	// errorCode is a numeric gRPC code representing the error encountered during Attach or Detach operations.
+	//
+	// This is an optional field that can be set only when the MutableCSINodeAllocatableCount feature gate is enabled.
+	//
+	// +featureGate=MutableCSINodeAllocatableCount
+	// +optional
+	ErrorCode *int32
+}
```
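
To illustrate how a numeric `ErrorCode` could make capacity errors easier to detect, here is a minimal sketch. The `volumeError` type is a simplified stand-in for `VolumeError`, and `isCapacityExhausted` is a hypothetical helper; the only external fact relied on is that the gRPC status code `RESOURCE_EXHAUSTED` has the numeric value 8.

```go
package main

// grpcCodeResourceExhausted is the numeric value of the gRPC status code
// RESOURCE_EXHAUSTED.
const grpcCodeResourceExhausted int32 = 8

// volumeError is a simplified stand-in for the VolumeError struct; only the
// fields needed for this sketch are included.
type volumeError struct {
	Message   string
	ErrorCode *int32
}

// isCapacityExhausted is a hypothetical helper. It reports whether the error
// carries an ErrorCode equal to RESOURCE_EXHAUSTED. A nil error or an unset
// ErrorCode is treated as "not a capacity error".
func isCapacityExhausted(e *volumeError) bool {
	return e != nil && e.ErrorCode != nil && *e.ErrorCode == grpcCodeResourceExhausted
}
```

Comparing a numeric code avoids parsing the free-form error message, which is the motivation the section gives for adding the field.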
+
 #### Validation Changes

 The [ValidateCSINodeUpdate](https://github.com/kubernetes/kubernetes/blob/master/pkg/apis/storage/validation/validation.go#L304) function in the API validation code path will be modified to allow updates to the `Allocatable.Count`
@@ -226,20 +254,53 @@ func ValidateCSINodeUpdate(new, old *storage.CSINode) field.ErrorList {

 This updated logic allows the `Allocatable.Count` field to be modified when the feature gate is enabled, while ensuring all other fields remain immutable. When the feature gate is disabled, it falls back to the existing validation logic for backward compatibility.

-#### Volume Plugin Manager
+#### CSI Node Updater
+
+A new plugin-level updater will be implemented in `kubernetes/pkg/volume/csi/csi_node_updater.go` to manage periodic updates of CSINode allocatable counts. This updater watches for changes to CSIDriver objects and manages per-driver update goroutines based on the `NodeAllocatableUpdatePeriodSeconds` setting.
+
+##### Implementation details
+
```golang
+// csiNodeUpdater watches for changes to CSIDriver objects and manages the lifecycle
+// of per-driver goroutines that periodically update CSINodeDriver.Allocatable information
+type csiNodeUpdater struct {
+	// Informer for CSIDriver objects
+	driverInformer cache.SharedIndexInformer
+
+	// Map of driver names to stop channels for update goroutines
+	driverUpdaters sync.Map
+
+	// Ensures the updater is only started once
+	once sync.Once
+}
```
+#### Update behavior
+
+When a `CSIDriver` object is added or updated with `NodeAllocatableUpdatePeriodSeconds` set, the updater checks if the driver is installed on the node before running periodic updates.

-A new goroutine will be started in VolumePluginMgr’s [Run()](https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/plugins.go#L953) func if the `NodeAllocatableUpdatePeriodSeconds` is set to a nonzero value. This goroutine will periodically trigger updates to the `CSINode` object based on the specified interval:
+When `NodeAllocatableUpdatePeriodSeconds` is modified, the updater automatically adjusts by stopping the old goroutine and starting a new one. Setting the period to 0 or nil stops updates entirely. Driver uninstallation or `CSIDriver` object deletion also stops the update goroutine for that specific driver.
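
The stop-and-restart behavior described above can be sketched as a small reconciler keyed by driver name. This is an illustrative sketch under stated assumptions, not the proposed implementation: `driverLoops`, `reconcile`, and `remove` are hypothetical names, a mutex-guarded map is used here in place of `sync.Map`, and the update loop is injected as a function so the lifecycle logic can be exercised in isolation.

```go
package main

import "sync"

// driverLoops is a hypothetical, simplified stand-in for the updater's
// per-driver bookkeeping: driver name -> stop channel of its update goroutine.
type driverLoops struct {
	mu    sync.Mutex
	stops map[string]chan struct{}
	// start runs the update loop for one driver until stopCh is closed.
	start func(driver string, periodSeconds int64, stopCh <-chan struct{})
}

func newDriverLoops(start func(string, int64, <-chan struct{})) *driverLoops {
	return &driverLoops{stops: map[string]chan struct{}{}, start: start}
}

// reconcile stops any existing loop for the driver, then starts a new one
// only when the period is set and positive (0 or nil stops updates entirely).
func (d *driverLoops) reconcile(driver string, periodSeconds *int64) {
	d.mu.Lock()
	defer d.mu.Unlock()

	if stop, ok := d.stops[driver]; ok {
		close(stop) // signal the old goroutine to exit
		delete(d.stops, driver)
	}
	if periodSeconds == nil || *periodSeconds <= 0 {
		return
	}
	stop := make(chan struct{})
	d.stops[driver] = stop
	go d.start(driver, *periodSeconds, stop)
}

// remove handles driver uninstallation or CSIDriver deletion.
func (d *driverLoops) remove(driver string) {
	d.reconcile(driver, nil)
}
```

Always stopping before starting keeps at most one loop per driver, so a period change never leaves two goroutines updating the same `CSINode` entry.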
```golang
-	go wait.Until(pm.updateCSINodeInfo, pm.csiNodeUpdateInterval, stopCh)
+func (u *csiNodeUpdater) runPeriodicUpdate(driverName string, period time.Duration, stopCh <-chan struct{}) {
+	ticker := time.NewTicker(period)
+	defer ticker.Stop()
+
+	for {
+		select {
+		case <-ticker.C:
+			if err := updateCSIDriver(driverName); err != nil {
+				klog.ErrorS(err, "Failed to update CSIDriver", "driver", driverName)
+			}
+		case <-stopCh:
+			return
+		}
 	}
 }
```

-In case of a failure during the `updateCSINodeInfo` call, the `Allocatable.Count` will retain its current value and `updateCSINodeInfo` will be retried.
+#### Error handling

+If `updateCSIDriver()` fails, the error is logged but the allocatable count retains its current value. Updates continue at the configured interval regardless of individual failures.

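
The error-handling rule above (a failed update leaves the count untouched, and the next tick simply tries again) can be sketched as follows. The `allocatable` type, `refresh`, `simulateTicks`, and `errDriverUnavailable` are all hypothetical names for illustration; ticks are simulated with a slice of query functions rather than a real ticker.

```go
package main

import "errors"

// errDriverUnavailable is a hypothetical error used to simulate a failed query.
var errDriverUnavailable = errors.New("driver unavailable")

// allocatable holds the last successfully published count.
type allocatable struct {
	count int32
}

// refresh runs one update attempt. The count is overwritten only on success,
// so a failed query leaves the previous value intact.
func (a *allocatable) refresh(query func() (int32, error)) error {
	n, err := query()
	if err != nil {
		return err
	}
	a.count = n
	return nil
}

// simulateTicks applies one refresh per simulated tick and returns how many
// attempts failed. A failure does not stop later ticks from running.
func simulateTicks(a *allocatable, queries []func() (int32, error)) int {
	failures := 0
	for _, q := range queries {
		if err := a.refresh(q); err != nil {
			failures++ // a real updater would log here and wait for the next tick
		}
	}
	return failures
}
```

There is no immediate retry in this sketch: recovery relies on the next scheduled tick, which matches the statement that updates continue at the configured interval.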
 #### NodeInfoManager Interface Extension

@@ -262,7 +323,7 @@ This table explains how updates to the `CSINode.Spec.Drivers[*].Allocatable.Coun
 |**Feature Flag Status**|**`NodeAllocatableUpdatePeriodSeconds`**|**Behavior**|
 |---|---|---|
 | Enabled | Set | Periodic updates occur at the defined interval + when invalid state is detected (volume attachment failures due to `ResourceExhausted`)|
-| Enabled | Not set |Updates occur only in response to volume attachment failures (`ResourceExhausted` errors)|
+| Enabled | Not set |No updates occur; `Allocatable.Count` remains static|
 | Disabled | Set |`NodeAllocatableUpdatePeriodSeconds` is ignored; `Allocatable.Count` remains static and immutable |
 | Disabled | Not set | No updates occur; `Allocatable.Count` remains static and immutable |

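
As the table reads after this change, both update triggers are active only when the feature gate is enabled and the period is set. A minimal sketch of that decision, using the hypothetical names `updateBehavior` and `behaviorFor`:

```go
package main

// updateBehavior describes which update triggers are active for a driver.
type updateBehavior struct {
	Periodic  bool // updates at the configured interval
	OnFailure bool // updates after ResourceExhausted attachment failures
}

// behaviorFor is a hypothetical helper encoding the table's post-change rows:
// both triggers require the feature gate to be enabled and the period to be set.
func behaviorFor(featureGateEnabled bool, periodSeconds *int64) updateBehavior {
	active := featureGateEnabled && periodSeconds != nil
	return updateBehavior{Periodic: active, OnFailure: active}
}
```

In the three remaining combinations the sketch returns a zero value, which corresponds to `Allocatable.Count` remaining static.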
@@ -271,7 +332,7 @@ This table explains how updates to the `CSINode.Spec.Drivers[*].Allocatable.Coun

 To address race conditions where the scheduler assigns stateful pods to nodes with insufficient capacity, Kubelet's pod construction process during [WaitForAttachAndMount](https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/volumemanager/volume_manager.go#L393) will now handle `ResourceExhausted` errors returned by CSI drivers during the `ControllerPublishVolume` RPC.

-The `ResourceExhausted` error is directly reported on the `VolumeAttachment` object associated with the relevant attachment. To facilitate easier detection of `ResourceExhausted` errors from `VolumeAttachment` statuses, we propose adding a `StatusCode` field to the [VolumeError](https://github.com/kubernetes/api/blob/master/storage/v1/types.go#L219) struct.
+The `ResourceExhausted` error is directly reported on the `VolumeAttachment` object associated with the relevant attachment. To facilitate easier detection of `ResourceExhausted` errors from `VolumeAttachment` statuses, we propose adding an `ErrorCode` field to the [VolumeError](https://github.com/kubernetes/api/blob/master/storage/v1/types.go#L219) struct.
-- Components depending on the feature gate: `kube-apiserver`, `kube-controller-manager`, `kubelet`.
+- Components depending on the feature gate: `kube-apiserver`, `kubelet`.

 ###### Does enabling the feature change any default behavior?

@@ -705,7 +764,7 @@ Yes, there will be new API calls to update the `CSINode` object:
```
 API call type: PATCH
 Estimated throughput: Depends on the `NodeAllocatableUpdatePeriodSeconds` setting and the frequency of volume attachment failures.
-Originating component: Kubelet, KCM
+Originating component: Kubelet
```
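
For a rough sense of scale, the periodic share of that throughput can be bounded with simple arithmetic. The sketch below is an assumption-laden upper bound, not a measurement: it assumes every tick produces exactly one PATCH per node and driver, and it ignores failure-triggered updates entirely. `maxPeriodicPatchQPS` is a hypothetical helper name.

```go
package main

// maxPeriodicPatchQPS returns an upper bound on cluster-wide CSINode PATCH
// calls per second caused by periodic updates alone, assuming one PATCH per
// (node, driver) pair per period. A non-positive period means periodic
// updates are off, so the bound is zero.
func maxPeriodicPatchQPS(nodes, driversPerNode int, periodSeconds int64) float64 {
	if periodSeconds <= 0 {
		return 0
	}
	return float64(nodes*driversPerNode) / float64(periodSeconds)
}
```

For example, under these assumptions 5000 nodes with one driver each at the 10-second minimum period give a bound of 500 PATCH calls per second across the cluster; lengthening the period lowers the bound proportionally.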

 ###### Will enabling / using this feature result in introducing new API types?
@@ -800,6 +859,8 @@ details). For now, we leave it here.

 ###### How does this feature react if the API server and/or etcd is unavailable?

+When the API server is unavailable, `CSINode` update attempts fail and are logged; the periodic update goroutines continue running and retry at their configured intervals. Additionally, `ResourceExhausted` errors cannot trigger immediate updates because `VolumeAttachment` statuses cannot be read. Existing allocatable values remain unchanged, and stateful workloads continue running normally.
+
 ###### What are other known failure modes?

<!--
@@ -815,8 +876,12 @@ For each of them, fill in the following information by copying the below templat
 - Testing: Are there any tests for failure mode? If not, describe why.
 -->

+No other known failure modes.
+
 ###### What steps should be taken if SLOs are not being met to determine the problem?