New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rework to library-go #44
Conversation
0c9b7b1
to
fea78fd
Compare
Makefile
Outdated
# $3 - Dockerfile path | ||
# $4 - context directory for image build | ||
# It will generate target "image-$(1)" for building the image and binding it as a prerequisite to target "images". | ||
$(call build-image,csi-driver-manila-operator,$(IMAGE_REGISTRY)/ocp/4.5:csi-driver-manila-operator,./build/Dockerfile.openshift,.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
4.6?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
Makefile
Outdated
@@ -0,0 +1,54 @@ | |||
SHELL :=/bin/bash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not used (I need to remove from AWS EBS too)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
assets/controller.yaml
Outdated
- key: CriticalAddonsOnly | ||
operator: Exists | ||
containers: | ||
# Warning: the operator expects the first container to be the CSI driver |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now we can remove this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
assets/controller.yaml
Outdated
allowPrivilegeEscalation: true | ||
image: ${DRIVER_IMAGE} | ||
args: | ||
- --v=5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
${LOG_LEVEL} in all containers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
assets/controller.yaml
Outdated
securityContext: | ||
privileged: true | ||
capabilities: | ||
add: ["SYS_ADMIN"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just making sure: provisioner and snapshotter need to be privileged?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No! Removed privileged from controller containers
test/e2e/storageclass.yaml
Outdated
metadata: | ||
name: ebs.csi.aws.com | ||
provisioner: ebs.csi.aws.com | ||
volumeBindingMode: WaitForFirstConsumer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you might want to remove these files, or the whole test dir
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed whole test directory
pkg/operator/starter.go
Outdated
|
||
klog.Info("Starting controllers") | ||
go manilaController.Run(ctx, 1) | ||
go secretSyncController.Run(ctx, 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This controller will run regardless of manila; can we move it to the manila.Runnable
slice above and have it run only when manila is present?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
"csi.storage.k8s.io/node-publish-secret-namespace": util.OperatorNamespace, | ||
}, | ||
} | ||
return sc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
storageclass-base.yaml
isn't being used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed
return factory.New().WithSync(c.sync).WithSyncDegradedOnError(operatorClient).WithInformers( | ||
operatorClient.Informer(), | ||
secretInformer.Core().V1().Secrets().Informer(), | ||
).ToController("SecretSync", eventRecorder) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no resync interval set with ResyncEvery()
so this controller won't resync periodically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added 20 minutes resync
Data: data, | ||
} | ||
|
||
return &secret, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Side note: why does the CSI driver and the cluster-credentials-operator produce distinct secrets? Are both formats "officially" supported/valid in OpenStack? If so, I'd like to see either cloud-credentials-operator or the CSI driver supporting an additional format. But that's not a discussion for this PR...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I talked to @Fedosin and there are at least three different confing/secret formats in OpenStack, each component using its own. I agree it should be in cloud-credentials-operator
3f025c9
to
57addc4
Compare
k8s.io/client-go v0.19.0-rc.2 | ||
k8s.io/component-base v0.19.0-rc.2 | ||
k8s.io/klog/v2 v2.3.0 | ||
sigs.k8s.io/yaml v1.2.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity: why are you pinning these deps?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
go mod tidy
does that.
pkg/controllers/manila/manila.go
Outdated
// under (short?) maintenance / reconfiguration. | ||
// Similarly, StorageClasses are not deleted when a share type disappears | ||
// from Manila. | ||
type Controller struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: in library-go (for debugging reasons) it's recommended to name these structs with specific names, like ManilaCSIDriverController
. Same for the files. We're not in library-go, but just it might be good to follow th same convention
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed
err = nil | ||
} | ||
return err | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we don't want to return if it's not found here, we want to fall through to ApplyStorageClass instead, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed + added comment about fallthrough
pkg/controllers/manila/manila.go
Outdated
return err | ||
} | ||
|
||
func (c *Controller) setDisabled(msg string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
setDisableCondition?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed
pkg/operator/starter.go
Outdated
kubeInformersForNamespaces, | ||
openstackClient, | ||
[]manila.Runnable{ | ||
manilaControllerSet, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: manila controller running the manila controller set sounds odd; perhaps rename:
manilaControllerSet -> csiDriverControllerSet
nfsController -> nfsCSIDriverController
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed
// Manila needs to replace two of them: Manila driver and NFS driver image. | ||
// Let the Manila image be replaced by CSIDriverController and NFS in this | ||
// custom asset loading func. | ||
func assetWithNFSDriver(file string) []byte { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice trick
|
||
const ( | ||
operandName = "manila-csi-driver" | ||
operatorName = "manila-csi-driver-operator" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: the operator namespace is in util
, not sure if want to have them in the same place
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's name of the operator, not name of the namespace. It's just coincidence they're the same :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant to say: have the operator name and namespace consts in the same place. But that's not really important
operatorClient, | ||
assetWithNFSDriver, | ||
kubeClient, | ||
controllerConfig.EventRecorder, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The order of these args is incorrect:
func NewCSIDriverController(
name string,
csiDriverName string,
csiDriverNamespace string,
instanceName string,
operatorClient v1helpers.OperatorClient,
assetFunc func(string) []byte,
kubeClient kubernetes.Interface,
eventRecorder events.Recorder,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch, fixed
pkg/controllers/manila/manila.go
Outdated
// Minimal interval between controller resyncs. The controller will detect | ||
// new share types in Manila and create StorageClasses for them at least | ||
// once per this interval. | ||
resyncInterval = 1 * time.Minute |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to resync every minute?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When someone creates a new share type in OpenStack, this operator should create StorageClass for it. Question is what's the right frequency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC the optimal resync interval for this type of storage backend (manila, EFS) will vary from cluster to cluster. For instance, in some clusters the number of share types will seldom change, while in others new share types will be added and removed quite often.
Having said that, whatever we hardcode here might not be the ideal for some use cases.
IMO the best way to approach this is to define a standard value, say 20 minutes, and allow the cluster admin to customize it through the CR. If we choose this approach, we could tackle the customization part in another PR.
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm updating it to 20 min, however, I am not sure how important it is to have it customizable.
README.md
Outdated
|
||
Operator to create, configure and manage CSI driver for OpenStack Manila in OpenShift. | ||
An operator to deploy the [Manil CSI driver](https://github.com/openshift/cloud-provider-openstack/tree/master/pkg/csi/manila) in OpenShift. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Manila
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
README.md
Outdated
|
||
### Installing the operator | ||
The operator is based on [openshift/library-go](https://github.com/openshift/library-go). It manages `ManilaDriver` instance named `cluster` and runs several controllers in parallel: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kind and CR changed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
pkg/operator/starter.go
Outdated
).WithCSIDriverController( | ||
"ManilaDriverController", | ||
operandName, | ||
string(operatorapi.ManilaCSIDriver), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aren't the instance/config name and driver/operand name swapped here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, I think
"NFSDriverController", | ||
string(operatorapi.ManilaCSIDriver), | ||
util.OperatorNamespace, | ||
"nfs-csi-driver", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Swap line 83 with line 85?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, I think
README.md
Outdated
* It starts `nfsController`: Runs `csidriverset.Controller` that installs NFS CSI driver itself. | ||
* It creates `StorageClass` for each share type reported by Manila and periodically syncs them at least once per minute, in case a new share type appears in Manila. | ||
* It never removes StorageClass if Manila service or share type disappears. It may be temporary OpenStack ir Manila re-configuration hiccup. | ||
* If there is no Manila service, it marks the `ManilaDriver` instance with `ManilaControllerDisabled: True` condition. It does not stop any CSI drivers started when Manila service was present! This allows pod to at least unmount their volumes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace ManilaDriver with ClusterCSIDriver
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
pkg/controllers/manila/manila.go
Outdated
type ManilaController struct { | ||
operatorClient v1helpers.OperatorClient | ||
kubeClient kubernetes.Interface | ||
secretLister corelisters.SecretLister |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're not using this anymore
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed
pkg/controllers/manila/openstack.go
Outdated
|
||
cert, err := getCloudProviderCert() | ||
if err != nil && !os.IsNotExist(err) { | ||
return nil, fmt.Errorf("Failed to get cloud provider CA certificate: %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: my editor says errors should not start with capital letters
pkg/controllers/manila/openstack.go
Outdated
if len(cert) > 0 { | ||
certPool, err := x509.SystemCertPool() | ||
if err != nil { | ||
return nil, fmt.Errorf("Create system cert pool failed: %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
/lgtm Adding hold in case @Fedosin wants to take a look |
README.md
Outdated
```sh | ||
oc delete -f deploy/crds/csi.openshift.io_maniladrivers_crd.yaml -f deploy/role.yaml -f deploy/role_binding.yaml -f deploy/service_account.yaml -f deploy/namespace.yaml | ||
``` | ||
* OpenStack cloud credentials are in Secret named "cloud-credentials" in the same namespace where the operator runs. The operator uses the credentials to check if Manila is present in the cluster and, since OpenStack does not allow any fine-grained access controll, it lets the CSI driver to use the same credentials. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
access control
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
README.md
Outdated
oc delete -f deploy/crds/csi.openshift.io_maniladrivers_crd.yaml -f deploy/role.yaml -f deploy/role_binding.yaml -f deploy/service_account.yaml -f deploy/namespace.yaml | ||
``` | ||
* OpenStack cloud credentials are in Secret named "cloud-credentials" in the same namespace where the operator runs. The operator uses the credentials to check if Manila is present in the cluster and, since OpenStack does not allow any fine-grained access controll, it lets the CSI driver to use the same credentials. | ||
* If underlying OpenStack uses self-signed certificate, the operator expects the ceritificate is present in a ConfigMap named "cloud-provider-config" with key "ca-bundle.pem" in the namespace where it runs. Generally, it should be a copy of "openshift-config/cloud-provider-config" ConfigMap. It then uses the certificate to talk to OpenStack API. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
certificate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
pkg/version/version.go
Outdated
func init() { | ||
buildInfo := prometheus.NewGaugeVec( | ||
prometheus.GaugeOpts{ | ||
Name: "openshift_aws_ebs_csi_driver_operator", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
openshift_manila_csi_driver_operator
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch, fixed
pkg/version/version.go
Outdated
buildInfo := prometheus.NewGaugeVec( | ||
prometheus.GaugeOpts{ | ||
Name: "openshift_aws_ebs_csi_driver_operator", | ||
Help: "A metric with a constant '1' value labeled by major, minor, git commit & git version from which OpenShift AWS EBS CSI Driver Operator was built.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Openshift Manila CSI Driver
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
Use OpenStack's ceritificates when OpenStack API uses self-signed ones. Assume that CSO creates a ConfigMap with it when it starts the operator. Since CSO is going to create the ConfigMap and operator Deployment in parallel, make sure the operator recovers when the ConfigMap is created later. In addition, the certificate may be added / updated later, when the operator already runs for some time.
The operator must already have valid cloud credentials when it starts to check if underlying OpenStack has Manila. Since OpenStack does not provide fine-grained permissions, the credentials used by the operator and the driver can be the same. Therefore let the driver reuse the operator credentials.
Delete and re-create a StorageClass when there is a change in StorageClass.Paramertes. Kubernetes API does not allow Parameters change after creation. This is useful when moving a StorageClass between namespaces - secrets namespace changes.
As required by OCP. 50 MB / 1% of a CPU core for most containers, 20MB / 0.5% for registrar.
It takes ~20 seconds to take some final error
1 minute is too often. We do not expect that new share types are added regularly.
library-go does full copare and it does not work well with defaulted fields
/hold cancel |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: Fedosin, jsafrane The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Rework to NewCSIControllerSet provided by library-go. In addition, following controllers are running:
The operator assumes that all connection details that the operator needs when talking to OpenStack is prepared by CSO, namely:
@openshift/storage