STOR-1049: Add topology aware sc #121

Closed
wants to merge 2 commits into from

@gnufied gnufied commented Dec 6, 2022

@gnufied
Member Author

gnufied commented Dec 6, 2022

cc @rvanderp3 @jcpowermac

		return nil, fmt.Errorf("failed to access datacenter %s: %s", dcName, err)
	}

	finder = find.NewFinder(vmClient.Client, false)

do we need a new finder here?

Member Author

fixed

@openshift-ci
Contributor

openshift-ci bot commented Dec 6, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gnufied

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 6, 2022
@gnufied
Member Author

gnufied commented Dec 8, 2022

/retest

@openshift-ci
Contributor

openshift-ci bot commented Dec 8, 2022

@gnufied: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-vsphere-csi | af44173 | link | true | /test e2e-vsphere-csi |
| ci/prow/e2e-vsphere-csi-migration | af44173 | link | false | /test e2e-vsphere-csi-migration |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

pkg/operator/storageclasscontroller/vmware.go
err = v.createOrUpdateTag(ctx, ds)
if err != nil {
	return v.policyName, fmt.Errorf("error creating or updating tag %s: %v", v.tagName, err)
}

Member

Instead of stopping the execution, should it try the next one?

Member Author

I think it might be better to hard error out rather than continue tagging whatever datastores we could access, because that could result in a storage policy with unpredictable behaviour.

Member

My point was: say you have 5 failureDomains/datastores, you successfully tag 3 of them, and the 4th fails. It sounds like it would be better to either:

  1. Try tagging all of them and return an aggregated error
  2. Return early in case of error, but don't leave behind tagged datastores
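
As an illustration of option 2 above, here is a minimal sketch that returns on the first failure but first removes the tags it already attached. The `tagger` interface, `tagAllOrNothing`, and `fakeTagger` are hypothetical names invented for this example; they are not the operator's API, and `errors.Join` assumes Go 1.20 or newer.

```go
package main

import (
	"errors"
	"fmt"
)

// tagger is a hypothetical stand-in for the operator's vSphere tagging calls.
type tagger interface {
	Tag(datastore string) error
	Untag(datastore string) error
}

// tagAllOrNothing tags every datastore. On the first failure it removes the
// tags it already attached, so no partially tagged set is left behind.
func tagAllOrNothing(t tagger, datastores []string) error {
	tagged := make([]string, 0, len(datastores))
	for _, ds := range datastores {
		if err := t.Tag(ds); err != nil {
			errs := []error{fmt.Errorf("error tagging datastore %s: %w", ds, err)}
			// Roll back in reverse order and report rollback failures as well.
			for i := len(tagged) - 1; i >= 0; i-- {
				if uerr := t.Untag(tagged[i]); uerr != nil {
					errs = append(errs, fmt.Errorf("error untagging datastore %s: %w", tagged[i], uerr))
				}
			}
			return errors.Join(errs...)
		}
		tagged = append(tagged, ds)
	}
	return nil
}

// fakeTagger keeps tags in memory and fails for the names listed in failOn.
type fakeTagger struct {
	tags   map[string]bool
	failOn map[string]bool
}

func (f *fakeTagger) Tag(ds string) error {
	if f.failOn[ds] {
		return errors.New("permission denied")
	}
	f.tags[ds] = true
	return nil
}

func (f *fakeTagger) Untag(ds string) error {
	delete(f.tags, ds)
	return nil
}

func main() {
	f := &fakeTagger{tags: map[string]bool{}, failOn: map[string]bool{"ds4": true}}
	err := tagAllOrNothing(f, []string{"ds1", "ds2", "ds3", "ds4", "ds5"})
	fmt.Println("error:", err)
	fmt.Println("datastores left tagged:", len(f.tags))
}
```

The trade-off is that a rollback can itself fail, which is why the sketch collects rollback errors instead of discarding them.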

Member Author

Okay, I have aggregated the errors and return them together. I think that is a good idea when there are multiple datastores we cannot access: rather than returning one error at a time, we report all inaccessible datastores at once.

However, I pushed that change to #125, which also includes #121.
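
For readers following along, the aggregation approach described above could look roughly like the sketch below. It is not the code from #125: `tagDatastores` and `errNoAccess` are hypothetical names, and `errors.Join` (Go 1.20 or newer) is used here only as a standard-library stand-in for whatever aggregation helper the operator actually uses.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoAccess is a hypothetical sentinel error used only by this example.
var errNoAccess = errors.New("no permission on datastore")

// tagDatastores attempts to tag every datastore and reports all failures
// at once instead of stopping at the first one.
func tagDatastores(datastores []string, tag func(string) error) error {
	var errs []error
	for _, ds := range datastores {
		if err := tag(ds); err != nil {
			errs = append(errs, fmt.Errorf("error creating or updating tag on %s: %w", ds, err))
		}
	}
	// errors.Join returns nil when no errors were collected.
	return errors.Join(errs...)
}

func main() {
	inaccessible := map[string]bool{"ds2": true, "ds4": true}
	tag := func(ds string) error {
		if inaccessible[ds] {
			return errNoAccess
		}
		return nil
	}
	err := tagDatastores([]string{"ds1", "ds2", "ds3", "ds4"}, tag)
	fmt.Println(err)
}
```

Note that this variant still leaves the accessible datastores tagged when it returns an error; it only improves the reporting.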

Member

@bertinatto bertinatto left a comment

Just a comment related to error aggregation, otherwise LGTM (not tagging to give @jsafrane a chance to review).

Comment on lines +155 to +159
vSphereInfraConfig := v.infra.Spec.PlatformSpec.VSphere
if vSphereInfraConfig != nil && len(vSphereInfraConfig.FailureDomains) > 1 {
	return v.createZonalStoragePolicy(ctx)
}

Contributor

createZonalStoragePolicy and the rest of createStoragePolicy below are almost the same; createZonalStoragePolicy only loops over and tags more datastores. Would it be possible to join them somehow? For example, getZonalDatastores() + getDefaultDatastore(), and then a common tagging loop + createStorageProfile over the result.
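
A rough sketch of the unification suggested above, assuming the only difference between the two paths is which datastores get tagged. The `failureDomain` type, `datastoresToTag`, and the callback-based `createStoragePolicy` are hypothetical simplifications and not the operator's real signatures; the de-duplication step is also an assumption.

```go
package main

import "fmt"

// failureDomain is a hypothetical, trimmed-down view of an infrastructure
// failure domain; only the datastore matters for this example.
type failureDomain struct {
	Name      string
	Datastore string
}

// datastoresToTag returns one datastore per failure domain for zonal
// clusters, and just the default datastore otherwise.
func datastoresToTag(domains []failureDomain, defaultDatastore string) []string {
	if len(domains) > 1 {
		seen := map[string]bool{}
		var out []string
		for _, fd := range domains {
			// Assumption: several failure domains may share a datastore.
			if !seen[fd.Datastore] {
				seen[fd.Datastore] = true
				out = append(out, fd.Datastore)
			}
		}
		return out
	}
	return []string{defaultDatastore}
}

// createStoragePolicy runs one tagging loop and one profile-creation step,
// whichever way the datastore list was produced.
func createStoragePolicy(policyName string, datastores []string, tag, createProfile func(string) error) (string, error) {
	for _, ds := range datastores {
		if err := tag(ds); err != nil {
			return policyName, fmt.Errorf("error creating or updating tag on %s: %v", ds, err)
		}
	}
	if err := createProfile(policyName); err != nil {
		return policyName, fmt.Errorf("error creating storage profile %s: %v", policyName, err)
	}
	return policyName, nil
}

func main() {
	domains := []failureDomain{
		{Name: "zone-a", Datastore: "ds-a"},
		{Name: "zone-b", Datastore: "ds-b"},
	}
	noop := func(string) error { return nil }
	name, err := createStoragePolicy("example-policy", datastoresToTag(domains, "ds-default"), noop, noop)
	fmt.Println(name, err)
}
```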

Member Author

Okay, I have unified that code, but I pushed the new changes to https://github.com/openshift/vmware-vsphere-csi-driver-operator/pull/125/files#diff-cd720ec01eefb8566ee378bc53aef398da5550c725aeacd65aba7efb2b39e311R146

That branch already includes the commits from this branch, adds more tests, and ensures tags are recreated if they are deleted.

if vSpherePlatformConfig != nil {
	failureDomains := vSpherePlatformConfig.FailureDomains
	if len(failureDomains) > 0 {
		return []string{defaultOpenshiftZoneCategory, defaultOpenshiftRegionCategory}

Contributor

I'd prefer a warning when both infra and clusterCSIDriver specify topology. Maybe with a metric + alert, but that probably does not belong in this function.
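
One possible shape for such a warning, sketched under assumptions: the function below returns a warning string alongside the categories so the caller can decide whether to log it, emit a metric, or raise an alert. `topologyCategories` and its parameters are hypothetical, the constant values are placeholders rather than values confirmed by this PR, and the precedence (infrastructure failure domains win) simply mirrors the snippet under review.

```go
package main

import "fmt"

// Placeholder values; the constant names come from the diff, the values
// are assumptions made for this example.
const (
	defaultOpenshiftZoneCategory   = "openshift-zone"
	defaultOpenshiftRegionCategory = "openshift-region"
)

// topologyCategories picks the topology categories and returns a non-empty
// warning when both the infrastructure object and the ClusterCSIDriver
// specify topology.
func topologyCategories(infraFailureDomains int, driverCategories []string) (categories []string, warning string) {
	if infraFailureDomains > 0 {
		if len(driverCategories) > 0 {
			warning = "topology is set in both infrastructure and ClusterCSIDriver; using infrastructure failure domains"
		}
		return []string{defaultOpenshiftZoneCategory, defaultOpenshiftRegionCategory}, warning
	}
	return driverCategories, ""
}

func main() {
	cats, warn := topologyCategories(2, []string{"k8s-zone", "k8s-region"})
	fmt.Println(cats)
	if warn != "" {
		fmt.Println("warning:", warn)
	}
}
```

Returning the warning instead of logging inside the function keeps the metric and alert decision with the caller, in line with the remark that it may not belong in this function.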

Member Author

ack. I can file this as a story attached to the same epic in jira.

@gnufied
Member Author

gnufied commented Dec 13, 2022

I am going to close this in favour of #125

/close

@openshift-ci openshift-ci bot closed this Dec 13, 2022
@openshift-ci
Contributor

openshift-ci bot commented Dec 13, 2022

@gnufied: Closed this PR.

In response to this:

I am going to close this in favour of #125

/close

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files.
5 participants