add lb uuid to docluster spec #291

Merged

Conversation

varshavaradarajan
Contributor

What this PR does / why we need it:

The DO LB UUID is currently stored in status. Upon restore, we lose this information and CAPDO creates a new LB instead of re-using the existing one. This PR adds the LB UUID to the spec and updates the DOCluster controller to prefer the spec UUID over the status UUID when it is present.
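
To illustrate the intended precedence, here is a minimal sketch, not the actual controller code. The types are trimmed-down stand-ins; only the ResourceId field name comes from this PR's diff, while ControlPlaneLoadBalancer and LoadBalancerID are assumed names used for illustration.

```go
package main

import "fmt"

// Hypothetical, trimmed-down stand-ins for the DOCluster types. Only the
// ResourceId field name is taken from the diff in this PR.
type DOLoadBalancer struct {
	ResourceId string
}

type DOClusterSpec struct {
	// ControlPlaneLoadBalancer is an assumed field name for this sketch.
	ControlPlaneLoadBalancer DOLoadBalancer
}

type DOClusterStatus struct {
	// LoadBalancerID is an assumed name for the UUID recorded in status.
	LoadBalancerID string
}

// loadBalancerID returns the UUID to reconcile against. The spec value wins,
// status is the fallback, and an empty result means a new LB must be created.
func loadBalancerID(spec DOClusterSpec, status DOClusterStatus) string {
	if id := spec.ControlPlaneLoadBalancer.ResourceId; id != "" {
		return id
	}
	return status.LoadBalancerID
}

func main() {
	// After a restore, status is empty but the spec still carries the UUID.
	restored := DOClusterSpec{ControlPlaneLoadBalancer: DOLoadBalancer{ResourceId: "lb-from-spec"}}
	fmt.Println(loadBalancerID(restored, DOClusterStatus{}))
}
```

Because the spec survives a backup and restore while status does not, reading the spec first is what lets the restored cluster find its existing LB.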

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #290

Special notes for your reviewer:

  • I only made this change for v1beta1. I ran make generate and got a few warnings like:
E1228 13:28:36.045121   30566 conversion.go:755] Warning: could not find nor generate a final Conversion function for sigs.k8s.io/cluster-api-provider-digitalocean/api/v1beta1.DOLoadBalancer -> github.com/kubernetes-sigs/cluster-api-provider-digitalocean/api/v1alpha4.DOLoadBalancer
  • I have yet to test this. The setup we use internally at DO still uses v1alpha4, which makes it difficult for me to test.

Documentation:

Release note:

Allow specifying an existing DO LB in docluster spec

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jan 3, 2022
@k8s-ci-robot

Welcome @varshavaradarajan!

It looks like this is your first PR to kubernetes-sigs/cluster-api-provider-digitalocean 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/cluster-api-provider-digitalocean has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot

Hi @varshavaradarajan. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jan 3, 2022
@MorrisLaw
Member

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jan 4, 2022
@MorrisLaw
Member

/retest

Member

@MorrisLaw MorrisLaw left a comment

Thanks for the issue and subsequent PR 🎉

@@ -126,6 +126,9 @@ type DOLoadBalancer struct {
// An object specifying health check settings for the Load Balancer. If omitted, default values will be provided.
// +optional
HealthCheck DOLoadBalancerHealthCheck `json:"healthCheck,omitempty"`
// The DO load balancer uuid. If omitted, a new load balancer will be created.
// +optional
ResourceId string `json:"resourceId,omitempty"`
Member

I think we may need to add this field to the struct in v1alpha4/types.go too. That is likely what's causing the conversion warning you're seeing when running make generate.

Member

Possibly v1alpha3/types.go too

Contributor

What's our policy on updating APIs as old as v1alpha3? My gut feeling is that v1alpha4 should be the oldest we add features to, but that's mostly based on what feels right.

Member

What's our policy on updating APIs as old as v1alpha3?

I'm not 100% sure. Maybe you're right that worrying about v1alpha3 is unnecessary. @cpanato or @prksu might know for sure? Either way it'd be good to have an explicit rule regarding that.

Contributor Author

@cpanato need your 👀 ^^

Member

For older APIs, I think we don't need to do that, since it might break users, but we should handle the new field in the conversion.

@fabriziopandini is it the correct approach to add a new field in the newer API but not in the older ones, and just write some conversions? Or what is the best practice for that? Thanks in advance.
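
For reference, a rough sketch of the "write some conversions" idea under simplifying assumptions: the older version has no ResourceId, so converting down stashes the value in an annotation and converting up restores it. The types, the annotation key, and the function names are all hypothetical; as far as I know, Cluster API's util/conversion package offers MarshalData and UnmarshalData helpers for the same pattern, which is what a real provider would more likely use.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Simplified, hypothetical stand-ins for the two API versions.
type OldCluster struct { // older shape, without the new field
	Annotations map[string]string
	Algorithm   string
}

type NewCluster struct { // newer shape, with the new field
	Annotations map[string]string
	Algorithm   string
	ResourceId  string
}

// conversionDataKey is a made-up annotation key for this sketch.
const conversionDataKey = "example.invalid/conversion-data"

// lostFields holds what the older version cannot represent.
type lostFields struct {
	ResourceId string `json:"resourceId,omitempty"`
}

// convertDown copies shared fields and stashes ResourceId in an annotation.
func convertDown(in NewCluster) (OldCluster, error) {
	out := OldCluster{Algorithm: in.Algorithm, Annotations: map[string]string{}}
	for k, v := range in.Annotations {
		out.Annotations[k] = v
	}
	raw, err := json.Marshal(lostFields{ResourceId: in.ResourceId})
	if err != nil {
		return OldCluster{}, err
	}
	out.Annotations[conversionDataKey] = string(raw)
	return out, nil
}

// convertUp copies shared fields and restores ResourceId from the annotation,
// dropping the bookkeeping annotation from the result.
func convertUp(in OldCluster) (NewCluster, error) {
	out := NewCluster{Algorithm: in.Algorithm, Annotations: map[string]string{}}
	for k, v := range in.Annotations {
		if k != conversionDataKey {
			out.Annotations[k] = v
		}
	}
	if raw, ok := in.Annotations[conversionDataKey]; ok {
		var lost lostFields
		if err := json.Unmarshal([]byte(raw), &lost); err != nil {
			return NewCluster{}, err
		}
		out.ResourceId = lost.ResourceId
	}
	return out, nil
}

func main() {
	down, _ := convertDown(NewCluster{Algorithm: "round_robin", ResourceId: "lb-uuid"})
	up, _ := convertUp(down)
	fmt.Println(up.ResourceId == "lb-uuid")
}
```

This way the older API type stays untouched for existing users, while a round trip through it does not lose the UUID.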

Contributor

@timoreimann timoreimann left a comment

Do we need to do something extra to ensure that the new spec field is only ever populated once?
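
One possible answer, sketched under assumptions rather than taken from this PR: an update-time validation (for example in a validating webhook) could allow the field to go from empty to set, but reject any later change. The helper name and error below are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// errResourceIDImmutable is a hypothetical error for this sketch.
var errResourceIDImmutable = errors.New("resourceId is immutable once set")

// validateResourceIDUpdate allows setting the ID when it was previously
// empty and allows no-op updates, but rejects changing or clearing a set ID.
func validateResourceIDUpdate(oldID, newID string) error {
	if oldID != "" && oldID != newID {
		return errResourceIDImmutable
	}
	return nil
}

func main() {
	fmt.Println(validateResourceIDUpdate("", "lb-1"))     // first write is allowed
	fmt.Println(validateResourceIDUpdate("lb-1", "lb-2")) // later change is rejected
}
```

Whether the field should be strictly immutable is a design decision for the maintainers; the sketch only shows what such a check could look like.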

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Feb 2, 2022
@cpanato
Member

cpanato commented Feb 7, 2022

@MorrisLaw @varshavaradarajan this PR would be good to have for the 1.1.0 release

@cpanato
Member

cpanato commented Feb 7, 2022

@varshavaradarajan please run make generate and push the updated files

@cpanato
Member

cpanato commented Feb 7, 2022

It would also be good to run some manual tests to cover this case. @varshavaradarajan can you write down the steps needed to reproduce this use case?

@varshavaradarajan
Contributor Author

@cpanato - I have been running make generate, but it doesn't change any files. :(

@cpanato
Member

cpanato commented Feb 7, 2022

@cpanato - I have been running make generate, but it doesn't change any files. :(

OK, I will check this tomorrow morning :)

@varshavaradarajan
Contributor Author

@cpanato - to reproduce:

  1. Set up a management cluster x.
  2. Set up a workload cluster.
  3. With the kubeconfig pointing to management cluster x, take a backup: clusterctl -v 10 backup -n <workload_cluster_namespace> --directory /tmp/test-backup
  4. Set up another management cluster y.
  5. Make management cluster x inaccessible, for example by deleting the apiserver or powering off its control plane node. Do not delete x. The workload cluster should still be intact.
  6. With the kubeconfig pointing to management cluster y, restore: clusterctl -v 10 restore --directory /tmp/test-backup

Expected Result:
Should re-use the existing control plane LB, not create a new one.

clusterctl restore had a bug which was fixed in v1.0.1: https://github.com/kubernetes-sigs/cluster-api/releases/tag/v1.0.1. Please build a binary with the fix.

@varshavaradarajan
Contributor Author

@cpanato - generated the conversion files through make generate finally! Waiting for your approval.

@cpanato
Member

cpanato commented Feb 11, 2022

@cpanato - generated the conversion files through make generate finally! Waiting for your approval.

thanks for your patience, I would like to run some upgrade tests and this case as well, so bear with me :)

@cpanato
Member

cpanato commented Feb 11, 2022

/test pull-cluster-api-provider-digitalocean-capi-e2e
/test pull-cluster-api-provider-digitalocean-conformance

@cpanato
Member

cpanato commented Feb 11, 2022

/hold for some manual tests and check why conformance and capi did not work well

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 11, 2022
@cpanato
Member

cpanato commented Feb 14, 2022

@varshavaradarajan can you please rebase this PR onto the latest main branch? We made some fixes in #299.
After that I will finish the manual tests and upgrade tests.

thanks for your patience

@cpanato
Member

cpanato commented Feb 15, 2022

6. clusterctl -v 10 restore --directory /tmp/test-backup

I'm trying the backup, but I'm getting:

$ clusterctl backup --directory /tmp/test-backup -n default -v 10
Using configuration File="/Users/cpanato/.cluster-api/clusterctl.yaml"
Performing backup...
Discovering Cluster API objects
Secret Count=8
ConfigMap Count=1
Total objects Count=12
Excluding secret from move (not linked with any Cluster) name="default-token-ph9ck"
Object won't be moved because it's not included in GVK considered for move kind="KubeadmConfig" name="test-control-plane-r76tb"
Object won't be moved because it's not included in GVK considered for move kind="KubeadmConfig" name="test-md-0-6txhk"
Object won't be moved because it's not included in GVK considered for move kind="KubeadmControlPlane" name="test-control-plane"
Starting backup of Cluster API objects Clusters=0
Moving Cluster API objects ClusterClasses=0
Pausing the source cluster
Pausing the source cluster classes
Saving files to /tmp/test-backup
Resuming the target cluter classes
Resuming the source cluster
Using configuration File="/Users/cpanato/.cluster-api/clusterctl.yaml"

@cpanato
Member

cpanato commented Feb 15, 2022

/test pull-cluster-api-provider-digitalocean-capi-e2e
/test pull-cluster-api-provider-digitalocean-conformance

@cpanato
Member

cpanato commented Feb 15, 2022

the upgrade from v1alpha4 > v1beta1 works!

Member

@cpanato cpanato left a comment

thanks for this!!
/approve

@timoreimann and @MorrisLaw you had comments. Can you please take another look and see if this looks good to you?

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 16, 2022
@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cpanato, varshavaradarajan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 16, 2022
@timoreimann
Contributor

For posterity, our related Slack discussion is here.

This CAPI Slack discussion indicates that the backup/restore failure is due to clusterctl, and not our implementation. Approving since all our tests pass.

/lgtm

@timoreimann
Contributor

/unhold

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 21, 2022
@k8s-ci-robot k8s-ci-robot merged commit d7490d7 into kubernetes-sigs:main Feb 21, 2022
@k8s-ci-robot k8s-ci-robot added this to the v1.1.0 milestone Feb 21, 2022
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

Store LB uuid in DOCluster spec
5 participants