Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUILD-284: integrate shared resources operator #198

Merged
merged 2 commits into from
Oct 20, 2021
Merged

BUILD-284: integrate shared resources operator #198

merged 2 commits into from
Oct 20, 2021

Conversation

coreydaley
Copy link
Member

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 3, 2021
@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 4, 2021
@@ -0,0 +1,269 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the CSI driver need access to some CR for the driver? I don't see its RBAC here. In addition, who is creating the CRD? It could be CVO (then put the CRD into /manifests/) or it could be your operator (then the operator needs permissions to create/update CRDs).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah at the moment our various CI flows are passing because https://github.com/openshift/csi-driver-shared-resource/blob/master/Makefile#L54-L57 is creating the CRDs.

Along the "CVO path", @coreydaley has openshift/openshift-apiserver#244 up, which I think would result in the CVO handling it via how all the other CVO operator based CRDs get applied to the cluster.

If I am correct, then once that merged we could remove that make crd Makefile target in the driver repo.

If I am incorrect, then yes, we would need to add permissions here to create the new CRDs, given the CSO is not creating any CRDs at the moment, and go from there.

@coreydaley am I missing something based on perhaps you recent conversations with the apiserver team?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

granted I just realized the timestamp of @jsafrane comment here is over 2 months old, so perhaps everyone has since been sorted out on this :-)

@jsafrane
Copy link
Contributor

jsafrane commented Aug 5, 2021

Out of curiosity, do you aim at OCP 4.9? We're past feature freeze there and this looks like a feature.

@adambkaplan
Copy link
Contributor

@jsafrane this now being staged for when 4.10 is available. We missed ART's cutoff for adding things to the payload.

@openshift-ci openshift-ci bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 3, 2021
@openshift-ci openshift-ci bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 14, 2021
@coreydaley
Copy link
Member Author

/retest

3 similar comments
@coreydaley
Copy link
Member Author

/retest

@coreydaley
Copy link
Member Author

/retest

@coreydaley
Copy link
Member Author

/retest

@coreydaley
Copy link
Member Author

/assign @jsafrane @adambkaplan @gabemontero
ptal

@coreydaley
Copy link
Member Author

/retest

@gabemontero
Copy link
Contributor

/assign @jsafrane @adambkaplan @gabemontero ptal

other than the discussion point from back in August about how the CRDs get installed on the cluster, nothing jumped out at me; that said, I'm more comfortable with @jsafrane applying the labels for merge, as I suspect the boiler plate type aspects of this PR are very CSO specific

Just to confirm, I did see the elements about applying our new drivers on all the clouds / is "cloud neutral" and what @adambkaplan introduced here with https://github.com/openshift/cluster-storage-operator/pull/197/files

@jsafrane
Copy link
Contributor

From e2e-vsphere-csi job that sets TechPreviewNoUpgrade FeatureSet:

(CSIDriverStarter_SyncError::SHARESCSIDriverOperatorCR_SharedResourcesDriverStaticResourcesController_SyncError):
CSIDriverStarterDegraded: detected CSI driver csi.sharedresource.openshift.io that is not provided by
OpenShift - please remove it before enabling the OpenShift one

Sorry, I forgot to mention we use special annotation csi.openshift.io/managed in CSIDriver to detect which CSI driver is installed - OCP or community one. Please add the annotation to you CSIDriver instance in the operator. See AWS EBS for an example.
(We should document it somewhere...)

SHARESCSIDriverOperatorCRDegraded: SharedResourcesDriverStaticResourcesControllerDegraded: "rbac/node_role.yaml" (string):
clusterroles.rbac.authorization.k8s.io "shared-resource-secret-configmap-share-watch-sar-create" is forbidden:
user "system:serviceaccount:openshift-cluster-csi-drivers:shared-resource-csi-driver-operator"
(groups=["system:serviceaccounts" "system:serviceaccounts:openshift-cluster-csi-drivers" "system:authenticated"])
is attempting to grant RBAC permissions not currently held:

See https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_cluster-storage-operator/198/pull-ci-openshift-cluster-storage-operator-master-e2e-vsphere-csi/1450116529319317504 and storage conditions, it lists ~5 issues with RBAC.

The go code itself looks good.

@openshift-ci openshift-ci bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 19, 2021
@coreydaley coreydaley changed the title [WIP] Jira build 284 integrate shared resources operator BUILD-284: integrate shared resources operator Oct 19, 2021
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 19, 2021
@coreydaley
Copy link
Member Author

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 19, 2021
@openshift-ci openshift-ci bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 19, 2021
@coreydaley
Copy link
Member Author

/retest

@coreydaley
Copy link
Member Author

/retest

 Oct 19 17:48:17.759: FAIL: Some cluster operators are not ready: storage (Upgradeable=False VSphereProblemDetectorController_VSphereOlderVersionDetected: VSphereProblemDetectorControllerUpgradeable: Marking cluster un-upgradeable because one or more VMs are on hardware version vmx-13) 

@coreydaley
Copy link
Member Author

@jsafrane ptal, the e2e-vsphere-csi job is passing now,

@coreydaley
Copy link
Member Author

/retest

2 similar comments
@coreydaley
Copy link
Member Author

/retest

@coreydaley
Copy link
Member Author

/retest

@jsafrane
Copy link
Contributor

jsafrane commented Oct 20, 2021

This PR itself looks good, a new CSI driver is installed when TechPreviewNoUpgrade is present.
/lgtm
/approve

One blocking question/confirmation: do you have ART build of csi-driver-shared-resource-operator and csi-driver-shared-resource images? We cannot merge this PR until the images are built, otherwise it breaks nightly ART builds and blocks whole QA.
/hold

Non-blocking issues that need to be solved in subsequent PRs / another repos:

  • The CSI driver runs only on three nodes out of six. I guess you miss a toleration in the DaemonSet.
  • The driver containers complain about Failed to watch *v1alpha1.SharedConfigMap: failed to list *v1alpha1.SharedConfigMap: the server could not find the requested resource (get sharedconfigmaps.sharedresource.openshift.io).
    Something needs to install the CRD.
    • Probably the CSI driver operator should do that, so it's created only when TechPreviewNoUpgrade is set.
    • Or it could be CVO, then the CRD just needs to be present in manifests/ directory in CSO image. We have a trick how to copy CRD yamls from openshift/api to CSO image, see our Dockerfile. This way, the CRD will be installed also without TechPreviewNoUpgrade, which is bad.

@openshift-ci openshift-ci bot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. lgtm Indicates that a PR is ready to be merged. labels Oct 20, 2021
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Oct 20, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: coreydaley, jsafrane

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 20, 2021
@jsafrane
Copy link
Contributor

I was assured ART builds are present.
/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 20, 2021
@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

2 similar comments
@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Oct 20, 2021

@coreydaley: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-vsphere c283251 link false /test e2e-vsphere

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@coreydaley
Copy link
Member Author

/retest-required

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

5 similar comments
@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants