Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gce: Reuse unsuccessfully provisioned volumes. #38702

Merged

Conversation

jsafrane
Copy link
Member

@jsafrane jsafrane commented Dec 13, 2016

GCE PD names generated by Kubernetes are guaranteed to be unique - they
contain name of the cluster and UID of the PVC that is behind it.
Presence of a GCE PD that has the same name as we want to provision
indicates that previous provisioning did not go well and most probably
the controller manager process was restarted in the meantime.

Kubernetes should reuse this volume and not provision a new one.

Fixes #38681

@jsafrane jsafrane added the sig/storage Categorizes an issue or PR as relevant to SIG Storage. label Dec 13, 2016
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Dec 13, 2016
@k8s-reviewable
Copy link

This change is Reviewable

@k8s-github-robot k8s-github-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. release-note-label-needed labels Dec 13, 2016
@jsafrane jsafrane added release-note-none Denotes a PR that doesn't merit a release note. and removed release-note-label-needed labels Dec 13, 2016
@jsafrane jsafrane force-pushed the gce-provisioning-existing branch 2 times, most recently from bec64d2 to 6582f38 Compare December 13, 2016 15:49
@k8s-ci-robot
Copy link
Contributor

Jenkins CRI GCE Node e2e failed for commit bec64d23c63d5f852c7600bfb27d0844d8e67cba. Full PR test history.

The magic incantation to run this job again is @k8s-bot cri node e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@mikedanese mikedanese assigned saad-ali and unassigned mikedanese Dec 20, 2016
return gce.waitForZoneOp(createOp, zone)
err = gce.waitForZoneOp(createOp, zone)
if isGCEError(err, "alreadyExists") {
glog.V(2).Infof("GCE PD %q already exists, reuising", name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe Warningf? Also, reuising -> reusing

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed both the level and typo

GCE PD names generated by Kubernetes are guaranteed to be unique - they
contain name of the cluster and UID of the PVC that is behind it.
Presence of a GCE PD that has the same name as we want to provision
indicates that previous provisioning did not go well and most probably
the controller manager process was restarted in the meantime.

Kubernetes should reuse this volume and not provision a new one.
@k8s-github-robot
Copy link

[APPROVALNOTIFIER] Needs approval from an approver in each of these OWNERS Files:

We suggest the following people:
cc @mikedanese
You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@jingxu97
Copy link
Contributor

/approval

@jingxu97 jingxu97 added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 10, 2017
@jsafrane
Copy link
Member Author

@jingxu97

/approval

that did not work, please /approve

@jingxu97
Copy link
Contributor

@approve

@jsafrane
Copy link
Member Author

I approve it by myself :-)
/approve

@jsafrane jsafrane added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 15, 2017
@childsb
Copy link
Contributor

childsb commented Feb 20, 2017

removing saads review since changes were made and its not showing the comment.

@childsb childsb dismissed saad-ali’s stale review February 20, 2017 19:43

updates were made to address.

@jsafrane
Copy link
Member Author

@k8s-bot kops aws e2e test this

@k8s-ci-robot
Copy link
Contributor

k8s-ci-robot commented Feb 23, 2017

@jsafrane: The following test(s) failed:

Test name Commit Details Rerun command
Jenkins GCI GKE smoke e2e ba6ab90 link @k8s-bot gci gke e2e test this

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-github-robot
Copy link

Automatic merge from submit-queue (batch tested with PRs 38702, 41810, 41778, 41858, 41872)

@k8s-github-robot k8s-github-robot merged commit 616d929 into kubernetes:master Feb 23, 2017
k8s-github-robot pushed a commit that referenced this pull request May 24, 2017
Automatic merge from submit-queue (batch tested with PRs 38505, 41785, 46315)

Fix provisioned GCE PD not being reused if already exists

@jsafrane PTAL 

This is another attempt at #38702 . We have observed that `gce.service.Disks.Insert(gce.projectID, zone, diskToCreate).Do()` instantly gets an error response of alreadyExists, so we must check for it.

I am not sure if we still need to check for the error after `waitForZoneOp`; I think that if there is an alreadyExists error, the `Do()` above will always respond with it instantly. But because I'm not sure, and to be safe, I will leave it.
calebamiles pushed a commit to kubernetes/cloud-provider-gcp that referenced this pull request Mar 21, 2018
Automatic merge from submit-queue (batch tested with PRs 38505, 41785, 46315)

Fix provisioned GCE PD not being reused if already exists

@jsafrane PTAL 

This is another attempt at kubernetes/kubernetes#38702 . We have observed that `gce.service.Disks.Insert(gce.projectID, zone, diskToCreate).Do()` instantly gets an error response of alreadyExists, so we must check for it.

I am not sure if we still need to check for the error after `waitForZoneOp`; I think that if there is an alreadyExists error, the `Do()` above will always respond with it instantly. But because I'm not sure, and to be safe, I will leave it.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants