Update CSI Migration KEP to include PRRQ #2432
Conversation
/assign @msau42
/cc @andyzhangx
* **Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?**
  Yes. The `volume.beta.kubernetes.io/migrated-to` annotation will be used by `v1.6.0+` of the CSI external provisioner to determine whether a PV or PVC should be operated on by the provisioner.
  The annotation will be set (and removed on rollback) for Kubernetes `v1.17.2+`; we will carefully document that rollback with migration on may not be successful to a version before `v1.17.2`.
  The benefit is that PV controller start-up annealing of this annotation allows the PV controller to stand down and the CSI external provisioner to pick up a PV that was dynamically provisioned before migration was enabled.
  These newer external provisioners will still be compatible with older versions of Kubernetes with migration on, even if those versions do not set the migrated-to annotation. However, without the migrated-to annotation, a newer provisioner on a Kubernetes cluster `<v1.17.2` will not be able to delete volumes provisioned before migration until the cluster is upgraded to `v1.17.2+`.
- On which object is the annotation set?
- Can you clarify in more detail what "rollback with migration on may not be successful" means?
nit: it would be much easier to comment, if you could split the lines
Oops, I copy-pasted this from the design doc, and it is actually not related to rollback with migration off. The issue only exists if we roll back to a version before v1.17.2 while the feature flag is still on. For more reading: https://github.com/kubernetes/community/blob/master/contributors/design-proposals/storage/csi-migration.md#persistent-volume-controller
Hi @Jiawei0227, 1.21 Enhancements Lead here. Please also update the … Thank you!
Done, thank you for the instruction!
Hey, @wojtek-t Could you help review this PRRQ? It's my first time doing this, so I might have forgotten or missed something. Feel free to comment and let me know. Much appreciated!
Notifications aren't useful for me (I wouldn't notice if not running kepctl) - next time please assign directly to me.
* **Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?**
  Yes. If there is an error or something unexpected, users can disable the CSI migration feature flags to roll it back.
nit: change to:
"Yes - can be disabled by disabling feature flags"
That said - questions:
- any concerns about state stored in etcd? Anything that CSI drivers will report differently than in-tree drivers that we may have problems reconciling after rollback? @msau42 for thoughts too
I don't think we have anything special stored in etcd after the feature is turned on. It is mainly a few annotations on PVs, and we will remove those when migration is turned from on to off.
PV annotations have been tested already in rollback scenarios.
Can you link to the section about the steps to enable/disable the feature? There are some strict requirements in ordering and steps (such as draining the node first), so that we don't end up with stale VolumeAttachment objects.
In general, there is some risk around state related to migration in PV and VolumeAttachment objects, so we need to make sure rollback and disablement are tested thoroughly (this was already done back in the 1.17 timeframe). @Jiawei0227 can you make sure the subtle issues are documented in the rollback section?
Sure, I will revisit these sections.
- All unit tests in the csi-translation-lib `staging/src/k8s.io/csi-translation-lib/translate_test.go`
- Controller tests with migration on CSI sidecars: external-provisioner, external-resizer
  - provisioner: pkg/controller/controller_test.go#TestProvisionWithMigration
  - resizer: pkg/resizer/csi_resizer_test.go#TestResizeMigratedPV
All of those check "what if FG enabled" or "what if FG disabled".
Any transition tests? Would they bring additional value (depending on the question about storing state above)?
I am guessing you mean the upgrade/downgrade/version skew tests? These tests are a requirement for the GA criteria. Updated the doc.
Note that we have some upgrade tests for storage here: https://github.com/kubernetes/kubernetes/tree/master/test/e2e/upgrades/storage
These test that a PVC created before migration was enabled continues to function after upgrade. We could enhance this to add more feature coverage if needed.
Thanks
* **How can a rollout fail? Can it impact already running workloads?**
  - The rollout can fail if the CSIMigration{cloud-provider} flag was enabled in the wrong order on kubelet and kube-controller-manager. Specifically, if kubelet enables the flag on the node side but the flag is not enabled on the control-plane side, the volume will not be able to be mounted successfully.
  - Workloads running on nodes that have not enabled CSI migration will not be impacted.
  - Any pod that is rescheduled to a node that has turned on migration will fail to mount its volume, and the pod will not come up correctly.
There is no pod rescheduling in Kubernetes - once bound to a node, the pod lives there until EOL.
Can you clarify what you meant here?
[nit - it would be much easier to comment if the lines were split]
There is a restriction on enabling the feature gate. It requires that a node being upgraded be drained before restarting kubelet. So while the node is being drained, there may or may not be pod movement, and that could potentially fail with the wrong ordering of feature gate enablement.
What I'm saying is "there is NOT anything like 'pod movement' in Kubernetes".
Once a pod is bound to a node, it will never be moved to a different node.
What we do is delete a pod and create a new one. But that's not pod movement or rescheduling.
Done
* **Will enabling / using this feature result in increasing size or count of the existing API objects?**
  CSI migration will require a CSI driver to be installed in the cluster, which adds CSI-related API objects including CSIDriver, CSINode, and VolumeAttachment. The size of a PV will include up to two more annotations. The existing Node object will include new labels.
Not strictly related to this KEP, but what is the lifecycle of VolumeAttachment object?
Once we switch to the CSI model, the VolumeAttachment object will be used to reconcile the attachment of the PV.
So the lifecycle would be:
- User creates a PVC => csi-provisioner notices the PVC => csi-provisioner calls the CSI driver to create the volume => csi-provisioner creates the PV
- User creates a pod that uses the PVC => kube-controller-manager creates a VolumeAttachment object => csi-attacher notices the object and calls the CSI driver to attach the volume => after attach finishes, csi-attacher updates the VolumeAttachment object
- User deletes the pod => kube-controller-manager deletes the VolumeAttachment => csi-attacher notices the VolumeAttachment has been marked for deletion and calls the CSI driver to detach the volume => the finalizer on the VolumeAttachment is removed => the VolumeAttachment object is removed from the API server
I hope this is somewhat clearer; happy to provide more info. Here is the design doc with more context: https://github.com/kubernetes/community/blob/master/contributors/design-proposals/storage/container-storage-interface.md#attaching-and-detaching
* **Will enabling / using this feature result in increasing time taken by any operations covered by [existing SLIs/SLOs]?**
  Depending on the design and implementation of the CSI driver, the operation time could vary. In general, total time spent might increase, because having the CSI sidecar detect the object in the API server and make the corresponding change through a socket can add traffic compared to the in-tree plugin model.
what "socket"?
After the in-tree switch to CSI, a volume operation - for example, a mount - that used to happen directly in kubelet now has to go through a unix domain socket as a gRPC CSI function call.
Here is the CSI design doc with more info on how CSI works:
https://github.com/kubernetes/community/blob/master/contributors/design-proposals/storage/container-storage-interface.md#design-overview
# The most recent milestone for which work toward delivery of this KEP has been
# done. This can be the current (upcoming) milestone, if it is being actively
# worked on.
latest-milestone: "v1.17"
1.17 does not seem correct here (nor below).
Hmm, sorry, what do you mean here? I want to clarify that the CSI migration feature has been Beta since v1.17. So it seems v1.17 is the right version to put here?
What's confusing is we have one overall kep for csi migration, but each cloud provider has their own feature gates and timelines. Some drivers went beta in 1.17, but for 1.21, we are targeting AzureFile beta. So maybe we have "latest-milestone" as 1.21, but keep beta at "1.17". And in the implementation history section, call out the timelines of each cloud provider.
Good point, I added a new section in the implementation history.
2 nits, otherwise lgtm! /lgtm
I added a few more comments, but this now looks pretty good - thanks a lot for huge improvements.
fields of API types, flags, etc.?**
  There will not be API removal in CSI migration itself. But eventually, when CSI migration is all finished, we plan to remove all in-tree plugins.
  So the in-tree plugin will be deprecated when CSIMigration{cloud-provider} goes to beta, and code removal will be required eventually.
  In addition, some CSI drivers are not able to maintain 100% backwards compatibility, so those drivers need to deprecate certain behaviors. For example, azurefile and vSphere have some corner cases that will not work anymore. (TODO: add more details.)
It would be really useful to describe them here (or link to a place that describes them).
Also - are azurefile and vSphere the only cases you're aware of?
For vSphere, you can link to kubernetes/kubernetes#98546.
@gnufied can you open an issue to track the azurefile deprecation?
For the remaining plugins, I'm not aware of any incompatibilities.
Added
- [Optional] Aggregation method:
- Components exposing the metric: CSI sidecars, kubelet, kube-controller-manager
- [x] Other (treat as last resort)
  - Details: A pod using an in-tree plugin has a failure.
How can you tell if the pod is using an in-tree plugin?
If the pod is using a PVC that uses an in-tree plugin, it should be possible to tell from the PVC spec.
I'm cancelling hold, because PRR approval is needed anyway.
The metric uses a different plugin name for in-tree vs CSI.
Two last minor comments.
* **How can a rollout fail? Can it impact already running workloads?**
  - The rollout can fail if the CSIMigration{cloud-provider} flag was enabled in the wrong order on kubelet and kube-controller-manager. Specifically, if kubelet enables the flag on the node side but the flag is not enabled on the control-plane side, the volume will not be able to be mounted successfully.
  - Workloads running on nodes that have not enabled CSI migration will not be impacted.
  - Any pod that is rescheduled to a node that turned on migration during the node upgrade will fail to mount its volume, and the pod will not come up correctly.
Still - this comment isn't addressed (pretty much a wording thing).
fields of API types, flags, etc.?**
  There will not be API removal in CSI migration itself. But eventually, when CSI migration is all finished, we plan to remove all in-tree plugins.
  So the in-tree plugin will be deprecated when CSIMigration{cloud-provider} goes to beta, and code removal will be required eventually.
  In addition, some CSI drivers are not able to maintain 100% backwards compatibility, so those drivers need to deprecate certain behaviors. For example, vSphere [kubernetes#98546](https://github.com/kubernetes/kubernetes/pull/98546).
Can you mention that for azure the link is TBD, and that for the other providers no deprecations are known?
Done
/lgtm
/hold cancel
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: Jiawei0227, msau42, wojtek-t. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
This PR updates the CSI Migration KEP to include the Production Readiness Review Questionnaire