-
Notifications
You must be signed in to change notification settings - Fork 38.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CSI Cluster Registry and Node Info CRDs #67803
Conversation
b1667ef
to
db1c227
Compare
/cc |
2cb115d
to
e024408
Compare
/test pull-kubernetes-integration |
e024408
to
ec34338
Compare
pkg/features/kube_features.go
Outdated
@@ -394,6 +400,7 @@ var defaultKubernetesFeatureGates = map[utilfeature.Feature]utilfeature.FeatureS | |||
MountContainers: {Default: false, PreRelease: utilfeature.Alpha}, | |||
VolumeScheduling: {Default: true, PreRelease: utilfeature.Beta}, | |||
CSIPersistentVolume: {Default: true, PreRelease: utilfeature.Beta}, | |||
CSICrdAutoInstall: {Default: true, PreRelease: utilfeature.Beta}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For testing only. Will flip to false and alpha before merge.
541a23a
to
d32491f
Compare
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: saad-ali, thockin The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@saad-ali: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here are some initial thoughts. Thank you for putting it as separate commits, that made it much easier to review.
|
||
// +k8s:deepcopy-gen=package,register | ||
// +groupName=csi.storage.k8s.io | ||
// +k8s:openapi-gen=true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a blank line to prevent this from being a godoc package comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
// (like podName, podUID, etc.) during mount operations. | ||
// If this is set to true, Kubelet will pass pod information as | ||
// VolumeAttributes in the CSI NodePublishVolume() calls. | ||
// If value is not specified, default is false -- meaning pod information |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It sounds like this should be a version counter that controls the format of the data passed, not a true/false field?
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object | ||
|
||
// CSINodeInfo holds information about all CSI drivers installed on a node. | ||
type CSINodeInfo struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Info is very vague. Is this spec (declaring which drivers should be running), status (reporting which drivers are running), both, or neither?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Who writes these objects? Who reads them? (answer in the comment)
|
||
// CSIDriverInfo contains information about one CSI driver installed on a node. | ||
type CSIDriverInfo struct { | ||
// Driver is the name of the CSI driver that this object refers to. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
driver
(use json capitalization)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
// that driver. | ||
Driver string `json:"driver"` | ||
|
||
// ID of the node from the driver point of view. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Explain this more. Why is it different from e.g. the node's UID?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will clarify.
if err != nil && !apierrors.IsAlreadyExists(err) { | ||
glog.Errorf("failed to create CSIDrivers CRD: %#v, err: %#v", | ||
res, err) | ||
} else if apierrors.IsAlreadyExists(err) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it safe to pass a nil error to this function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Cleaned up the block to prevent that.
res) | ||
} | ||
|
||
return nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How well will the rest of the system cope if only one of these is installed (say the other one has a transient error)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Different parts of the volume plugin provisioning/attach/mount process look at CRs for one of these CRDs. If one of the CRDs fails to create, it would just change where in the provision/attach/mount process subsequent failure will happen.
Since it is better to fail fast, I changed the errors here to return error instead of silently log.
@@ -649,6 +667,65 @@ func (adc *attachDetachController) processVolumesInUse( | |||
} | |||
} | |||
|
|||
// installCRDs creates the specified CustomResourceDefinition for the CSIDrivers object. | |||
func (adc *attachDetachController) installCRDs() error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am a little concerned about the implications of controller-manager doing schema changes when it starts up. Not claiming it's definitely problematic, just still thinking about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ya, we're doing this for alpha because it's the path of least resistance. For beta, I was talking to @timstclair and we were thinking about having a new/dedicated controller that is responsible for installing CRDs (and reinstalling them if they are uninstalled, e.g. like add on manager). Can revisit that for beta.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we just use addons for alpha? I drafted those changes here: tallclair@a58008e
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we're doing this for alpha because it's the path of least resistance
That attitude is concerning. Constructing APIs via CRD should not lower our standards for design or make us more willing to ship things we haven't considered the implications of.
@@ -649,6 +667,65 @@ func (adc *attachDetachController) processVolumesInUse( | |||
} | |||
} | |||
|
|||
// installCRDs creates the specified CustomResourceDefinition for the CSIDrivers object. | |||
func (adc *attachDetachController) installCRDs() error { | |||
crd := &apiextensionsv1beta1.CustomResourceDefinition{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the definition files should be stored in yaml format in a file. I don't think it's a good idea to construct them at runtime, it's too easy to make a change to the code and not realize you're introducing a compatibility problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noted. When we move this to a more permanent solution in beta (see #67803 (comment)), will consider this as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the definition files should be stored in yaml format in a file. I don't think it's a good idea to construct them at runtime, it's too easy to make a change to the code and not realize you're introducing a compatibility problem.
Why was this comment not considered necessary to address? I am somewhat dismayed by the attitude (echoed in the follow up PR) that the issues raised here don't matter for alpha.
Plural: csiapiv1alpha1.CsiDriverResourcePlural, | ||
Kind: reflect.TypeOf(csiapiv1alpha1.CSIDriver{}).Name(), | ||
}, | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use the validation
property? I think you should. I think there is a way to construct it from the types.go file you have.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh nice, I didn't realize validation was already supported. Looks like auto generation of OpenAPIV3Schema from types.go is still in the works kubernetes/kube-openapi#37. Before we move to beta, hopefully that is ready and we can use it. If not, we'll just construct the Validation schema manually then.
The fact that CSI plugins must fail if they get info they don't understand
seems weird. Can we explore that?
…On Fri, Aug 31, 2018 at 2:38 PM Saad Ali ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In staging/src/k8s.io/csi-api/pkg/apis/csi/v1alpha1/types.go
<#67803 (comment)>
:
> + Driver string `json:"driver"`
+
+ // Indicates this CSI volume driver requires an attach operation (because it
+ // implements the CSI ControllerPublishVolume() method), and that Kubernetes
+ // should call attach and wait for any attach operation to complete before
+ // proceeding to mounting.
+ // If value is not specified, default is true -- meaning attach will be
+ // called.
+ // +optional
+ AttachRequired *bool `json:"attachRequired"`
+
+ // Indicates this CSI volume driver requires additional pod information
+ // (like podName, podUID, etc.) during mount operations.
+ // If this is set to true, Kubelet will pass pod information as
+ // VolumeAttributes in the CSI NodePublishVolume() calls.
+ // If value is not specified, default is false -- meaning pod information
blech, can we fix CSI to be less brittle?
We looked in to adding workload info into CSI. The challange is that CSI
is agnostic to the CO (has to work with Kubernetes, Mesos, and others). So
Kubernetes specific pod information can't be encoded in to the spec. But
we'd still like to be able to have CSI drivers that want this Kube specific
info to be able to receive it. And this is a relatively clean way to do so
(leaves CSI spec agnostic, and allows individual drivers to choose not to
be).
—
You are receiving this because you were assigned.
Reply to this email directly, view it on GitHub
<#67803 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AFVgVBDyVEdh_FQJ1TkochxiQRUrPjjHks5uWazGgaJpZM4WK9To>
.
|
To be clear, the CSI spec does not say that a volume plugin must (or should) fail if it sees volume attributes it does not recognize. It leaves validation of volume attributes up to the volume plugin to determine. For backwards compat, a well behaved plugin should validate the parameters it cares about, and ignore the ones it doesn't understand. But, like storageClass parameters, ultimately it is up to the driver to determine what type of validation it will do. |
Would you say that a driver which explodes when given a parameter it
doesn't know is well-behaved or mis-behaved?
…On Fri, Aug 31, 2018 at 4:05 PM Saad Ali ***@***.***> wrote:
Currently all plugins are supposed to validate every volume attribute and error on attributes they don't recognize. This could break plugins when we upgrade Kubernetes independently of the plugin
The fact that CSI plugins must fail if they get info they don't understand seems weird. Can we explore that?
To be clear, the CSI spec does not say that a volume plugin must (or should) fail if it sees volume attributes it does not recognize. It leaves validation of volume attributes up to the volume plugin to determine. For backwards compat, a well behaved plugin should validate the parameters it cares about, and ignore the ones it doesn't understand. But, like storageClass parameters, ultimately it is up to the driver to determine what type of validation it will do.
—
You are receiving this because you were assigned.
Reply to this email directly, view it on GitHub, or mute the thread.
|
Should a driver that doesn't understand a parameter just blindly ignore it? |
That's the general behavior of most things, isn't it? A C function that
doesn't know a global variable ignores it. A shell program that doesn't
know an env var ignores it. Do you have a "problem vector" in mind?
…On Fri, Aug 31, 2018 at 4:18 PM Michelle Au ***@***.***> wrote:
Should a driver that doesn't understand a parameter just blindly ignore it?
—
You are receiving this because you were assigned.
Reply to this email directly, view it on GitHub
<#67803 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AFVgVGEAW8XKAXWGtBQtIK8VPKyWkab6ks5uWcRFgaJpZM4WK9To>
.
|
I'm mostly thinking about how StorageClass parameters are validated today for in-tree plugins. They fail if an unrecognized parameter is passed in to prevent an incorrect volume to be provisioned. For example, a user could mispell the parameter for "encryptionMode". The plugin could ignore it and provision an unencrypted volume when the user expected an encrypted one. |
Automatic merge from submit-queue (batch tested with PRs 64283, 67910, 67803, 68100). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md. |
Good point. This is a good reason for CSI to leave it up to the driver to determine what validation it wants to do (it can optimize for backwards compat or correctness) . |
Automatic merge from submit-queue (batch tested with PRs 66840, 68159). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md. CSI Cluster Registry and Node Info CRDs Improvements **What this PR does / why we need it**: #67803 merged before I could address @lavalamp's feedback. This PR addresses his feedback **Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*: Follow up on PR #67803 **Special notes for your reviewer**: **Release note**: ```release-note ``` /assign @lavalamp /assign @thockin CC @jsafrane @vladimirvivien @verult @gnufied @childsb
Automatic merge from submit-queue (batch tested with PRs 66840, 68159). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md. CSI Cluster Registry and Node Info CRDs Improvements **What this PR does / why we need it**: kubernetes/kubernetes#67803 merged before I could address @lavalamp's feedback. This PR addresses his feedback **Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*: Follow up on PR kubernetes/kubernetes#67803 **Special notes for your reviewer**: **Release note**: ```release-note ``` /assign @lavalamp /assign @thockin CC @jsafrane @vladimirvivien @verult @gnufied @childsb Kubernetes-commit: 416f63c050d01fe831d532d89e29d01093d9c127
Automatic merge from submit-queue (batch tested with PRs 68161, 68023, 67909, 67955, 67731). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md. CSI: skip attach for non-attachable drivers **What this PR does / why we need it**: This is implementation of kubernetes/community#2523. CSI volumes that don't need attach/detach now don't need external attacher running. WIP: * contains #67803 to get CSIDriver API. Ignore the first commit. * ~~missing e2e test~~ /sig storage cc: @saad-ali @vladimirvivien @verult @msau42 @gnufied @davidz627 **Release note**: ```release-note CSI volume plugin does not need external attacher for non-attachable CSI volumes. ```
Automatic merge from submit-queue (batch tested with PRs 68171, 67945, 68233). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md. CSI pod information in NodePublish **What this PR does / why we need it**: This is implementation of kubernetes/community#2439. It brings CSI closer to Flex volume features, CSI driver can (optionally) get name/namespace/UID/ServiceAccount of the pod that requires the CSI volume mounted. This allows CSI drivers to either do AAA before the volume is mounted or tailor content of volume to the pod. Work in progress: * contains #67803 to get `CSIDriver` API. Ignore the first commit. Related to #64984 #66362 (fixes only part of these issues) /sig storage cc: @saad-ali @vladimirvivien @verult @msau42 @gnufied **Release note**: ```release-note CSI NodePublish call can optionally contain information about the pod that requested the CSI volume. ```
Automatic merge from submit-queue (batch tested with PRs 66840, 68159). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md. CSI Cluster Registry and Node Info CRDs Improvements **What this PR does / why we need it**: kubernetes/kubernetes#67803 merged before I could address @lavalamp's feedback. This PR addresses his feedback **Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*: Follow up on PR kubernetes/kubernetes#67803 **Special notes for your reviewer**: **Release note**: ```release-note ``` /assign @lavalamp /assign @thockin CC @jsafrane @vladimirvivien @verult @gnufied @childsb Kubernetes-commit: 416f63c050d01fe831d532d89e29d01093d9c127
Automatic merge from submit-queue (batch tested with PRs 66840, 68159). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md. CSI Cluster Registry and Node Info CRDs Improvements **What this PR does / why we need it**: kubernetes/kubernetes#67803 merged before I could address @lavalamp's feedback. This PR addresses his feedback **Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*: Follow up on PR kubernetes/kubernetes#67803 **Special notes for your reviewer**: **Release note**: ```release-note ``` /assign @lavalamp /assign @thockin CC @jsafrane @vladimirvivien @verult @gnufied @childsb Kubernetes-commit: 416f63c050d01fe831d532d89e29d01093d9c127
@@ -192,9 +193,14 @@ func startAttachDetachController(ctx ControllerContext) (http.Handler, bool, err | |||
if ctx.ComponentConfig.AttachDetachController.ReconcilerSyncLoopPeriod.Duration < time.Second { | |||
return nil, true, fmt.Errorf("Duration time must be greater than one second as set via command line option reconcile-sync-loop-period.") | |||
} | |||
csiClientConfig := ctx.ClientBuilder.ConfigOrDie("attachdetach-controller") | |||
// csiClient works with CRDs that support json only | |||
csiClientConfig.ContentType = "application/json" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't have been necessary. Clients should have priority order content accept types which should result in json being served. What happened when you didn't set this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@liggitt wouldn't have expected this to show up in code using CRD
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like
kubeconfig.ContentConfig.ContentType = s.Generic.ClientConnection.ContentType |
defaulted at
kubernetes/staging/src/k8s.io/apimachinery/pkg/apis/config/v1alpha1/defaults.go
Lines 29 to 31 in 95c99eb
if len(obj.ContentType) == 0 { | |
obj.ContentType = "application/vnd.kubernetes.protobuf" | |
} |
I'm not seeing a fallback specified
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happened when you didn't set this?
Watch()
did not work. It received protobuf and failed to parse it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this related to #62175?
What this PR does / why we need it:
Introduces the new
CSIDriver
andCSINodeInfo
API Object as proposed in kubernetes/community#2514 and kubernetes/community#2034Which issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close the issue(s) when PR gets merged):Fixes kubernetes/enhancements#594
Special notes for your reviewer:
Per the discussion in https://groups.google.com/d/msg/kubernetes-sig-storage-wg-csi/x5CchIP9qiI/D_TyOrn2CwAJ the API is being added to the staging directory of the
kubernetes/kubernetes
repo because the consumers will be attach/detach controller and possibly kubelet, but it will be installed as a CRD (because we want to move in the direction where the API server is Kubernetes agnostic, and all Kubernetes specific types are installed).Release note:
CC @jsafrane