
Wait until backend storage PVC is ready before starting VMIs #11373

Merged
Merged 2 commits into kubevirt:main from jean-edouard:bindbackendstorage on Apr 23, 2024

Conversation

@jean-edouard (Contributor)

What this PR does

Before this PR:
VMIs can encounter a (non-fatal) error while starting if the backend storage PVC is not ready yet.

After this PR:
VMIs wait until the PVC is ready and no longer run into that error.


Release note

NONE

@kubevirt-bot kubevirt-bot added release-note-none Denotes a PR that doesn't merit a release note. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. labels Feb 27, 2024
@fossedihelm (Contributor) left a comment:

Thank you @jean-edouard! Some comments below!

pkg/virt-controller/watch/vmi.go (outdated review thread, resolved)
Comment on lines 1213 to 1215
var backendStoragePending bool
backendStoragePending, err = backendstorage.IsPVCPending(vmi, c.clientset)
Contributor:

Maybe do it inline?

Suggested change:
- var backendStoragePending bool
- backendStoragePending, err = backendstorage.IsPVCPending(vmi, c.clientset)
+ backendStoragePending, err := backendstorage.IsPVCPending(vmi, c.clientset)

Contributor:

What happens if PVC goes to Lost state?

@jean-edouard (Author):

> Maybe do it inline?

'err' redeclared in this block

> What happens if PVC goes to Lost state?

Good point, changed to an error in that case.
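For illustration, a minimal sketch of the Lost-state handling described above, assuming the phase check lives in a small helper (the name and exact wiring are illustrative, not the code merged in the PR):

package backendstorage

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
)

// checkPVCPhase maps the PVC phase to the controller's decision:
// Bound means proceed, Lost is a hard error, anything else means wait.
func checkPVCPhase(pvc *v1.PersistentVolumeClaim) (pending bool, err error) {
	switch pvc.Status.Phase {
	case v1.ClaimBound:
		return false, nil
	case v1.ClaimLost:
		// Illustrative: per the discussion above, Lost becomes an error.
		return false, fmt.Errorf("backend storage PVC %s is lost", pvc.Name)
	default: // v1.ClaimPending
		return true, nil
	}
}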

Comment on lines 122 to 162
func IsPVCPending(vmi *corev1.VirtualMachineInstance, client kubecli.KubevirtClient) (bool, error) {
if !IsBackendStorageNeededForVMI(&vmi.Spec) {
return false, nil
}

pvc, err := client.CoreV1().PersistentVolumeClaims(vmi.Namespace).Get(context.Background(), PVCForVMI(vmi), metav1.GetOptions{})
if err != nil {
return false, err
}

if pvc.Status.Phase != v1.ClaimBound {
return true, nil
}

return false, nil
}
Contributor:

This is used from the vmi controller which has a pvcInformer. How about using it?

@jean-edouard (Author):

Well, if the PVC was just created, the informer might not have been updated yet, which would trigger an error.
In general, it seems like a better idea to fetch the latest status instead of risking failing the whole sync() on an outdated phase.

@jean-edouard jean-edouard marked this pull request as draft February 27, 2024 22:24
@kubevirt-bot kubevirt-bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 27, 2024
@jean-edouard (Author):

Moving to draft since storage tests are failing.

@akalenyu (Contributor) commented Feb 28, 2024:

> Moving to draft since storage tests are failing.

Are you holding off VMI start when pvc.phase == pending? In that case, this breaks WaitForFirstConsumer-based storage, which will only have the PVC bind once a workload exists that consumes it.

@jean-edouard jean-edouard marked this pull request as ready for review February 28, 2024 14:57
@kubevirt-bot kubevirt-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 28, 2024
@jean-edouard (Author):

> Moving to draft since storage tests are failing.

Should be fixed now; we had to error out to ensure the VMI gets re-enqueued.
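To illustrate why erroring out forces a retry: in a typical controller workqueue loop, a non-nil return from the sync function causes the key to be re-queued with backoff. A self-contained sketch under that assumption (the stubs below are hypothetical, not KubeVirt's actual controller code):

package main

import "fmt"

// Illustrative stubs; KubeVirt's real VMI controller differs.
type controller struct{}

func (c *controller) backendPVCPending(key string) (bool, error) { return true, nil }
func (c *controller) startVMI(key string) error                  { return nil }

// sync returns an error while the backend storage PVC is not ready, so the
// workqueue keeps retrying the VMI until the PVC binds.
func (c *controller) sync(key string) error {
	pending, err := c.backendPVCPending(key)
	if err != nil {
		return err
	}
	if pending {
		return fmt.Errorf("backend storage PVC for %s is not ready yet", key)
	}
	return c.startVMI(key)
}

func main() {
	fmt.Println((&controller{}).sync("default/testvmi"))
}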

@jean-edouard (Author):

> Are you holding off VMI start when pvc.phase == pending?

Yes, but this is limited to the backend storage PVC, which is not a volume of the VMI.

> In that case, this is breaking WaitForFirstConsumer based storage which will only have the PVC bind once a workload exists that consumes it

I believe this is not a concern for RWX PVCs, is that correct?

@akalenyu (Contributor):

> I believe this is not a concern for RWX PVCs, is that correct?

More often than not, RWX storage backends won't be topology-constrained (accessible from all nodes in the cluster), but we've witnessed large Ceph WFFC setups that contradict this.

> Yes, but this is limited to the backend storage PVC, which is not a volume of the VMI

Hmm, interesting. How does the backend storage PVC bind today with WFFC storage? Who is mounting it?

/cc @mhenriks

@jean-edouard (Author):

> More often than not, RWX storage backends won't be topology-constrained (accessible from all nodes in the cluster), but we've witnessed large ceph WFFC setups as well to contradict this.
>
> Hmm, interesting. How does the backend storage PVC bind today with WFFC storage? Who is mounting it?
>
> /cc @mhenriks

It's an RWX FS PVC, created like this: https://github.com/kubevirt/kubevirt/blob/main/pkg/storage/backend-storage/backend-storage.go#L99-L114
It is mounted inside virt-launcher as volume mounts, see https://github.com/kubevirt/kubevirt/blob/main/pkg/virt-controller/services/rendervolumes.go#L401-L422

By the way, everything works fine today, but on first VM start we sometimes get a failure logged because the PVC wasn't ready (the VM then successfully starts).
What I'm trying to do here is to get rid of that failure.
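For context, a minimal sketch of the shape of such a PVC, assuming a helper along these lines (the function name is hypothetical; the real construction is in the backend-storage.go link above, and size/storage-class handling is omitted here):

package backendstorage

import (
	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// buildBackendStoragePVC sketches an RWX filesystem PVC like the one the
// linked code creates for persistent VM state (TPM/EFI).
func buildBackendStoragePVC(name, namespace string) *v1.PersistentVolumeClaim {
	fsMode := v1.PersistentVolumeFilesystem
	return &v1.PersistentVolumeClaim{
		ObjectMeta: metav1.ObjectMeta{Name: name, Namespace: namespace},
		Spec: v1.PersistentVolumeClaimSpec{
			// RWX so the volume stays usable across nodes, e.g. during migration.
			AccessModes: []v1.PersistentVolumeAccessMode{v1.ReadWriteMany},
			VolumeMode:  &fsMode,
		},
	}
}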

@mhenriks (Member) commented Mar 1, 2024:

> I believe this is not a concern for RWX PVCs, is that correct?

RWX PVCs may be WFFC

@jean-edouard (Author):

> RWX PVCs may be WFFC

Good to know, thanks! So I guess we need to check for that and start the "doppleganger" pod if needed.
/hold

@kubevirt-bot kubevirt-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 1, 2024
@mhenriks (Member) commented Mar 1, 2024:

> Good to know, thanks! So I guess we need to check for that and start the "doppleganger" pod if needed. /hold

You shouldn't have to do the "doppleganger" for TPM/EFI PVCs. I would simply suggest checking if the PVC is WFFC. If it is WFFC, you can create the pod right away; otherwise, you can wait for it to be bound.

EDIT: maybe you will still get the error you are trying to avoid in the WFFC case, but it should be rare and, as you said, it is non-fatal.

Doppleganger is probably the only way to avoid the error entirely.

@jean-edouard (Author):

> You shouldn't have to do the "doppleganger" for TPM/EFI PVCs. I would simply suggest checking if the PVC is WFFC. If it is WFFC, you can create the pod right away otherwise you can wait for it to be bound

Ah, so if bound || WFFC then start pod else re-enqueue?

@mhenriks (Member) commented Mar 1, 2024:

> Ah, so if bound || WFFC then start pod else re-enqueue?

Yes, that's what I was thinking.
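A minimal sketch of that rule, assuming a helper like the following (the function name and wiring are illustrative, not the exact code that was merged):

package backendstorage

import (
	v1 "k8s.io/api/core/v1"
	storagev1 "k8s.io/api/storage/v1"
)

// shouldStartPod captures the "bound || WFFC" rule: start the pod if the
// backend storage PVC is bound, or if it is pending on a
// WaitForFirstConsumer storage class (such a PVC can only bind once a
// consumer pod exists); otherwise the VMI should be re-enqueued.
func shouldStartPod(pvc *v1.PersistentVolumeClaim, sc *storagev1.StorageClass) bool {
	if pvc.Status.Phase == v1.ClaimBound {
		return true
	}
	wffc := sc != nil && sc.VolumeBindingMode != nil &&
		*sc.VolumeBindingMode == storagev1.VolumeBindingWaitForFirstConsumer
	return pvc.Status.Phase == v1.ClaimPending && wffc
}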

@mhenriks (Member) commented Mar 1, 2024:

You can test RWX WFFC in kubevirtci with rook-ceph-block-wffc storageclass

@kubevirt-bot kubevirt-bot added the sig/buildsystem Denotes an issue or PR that relates to changes in the build system. label Apr 4, 2024
@jean-edouard (Author):

/cc @acardace

@jean-edouard jean-edouard force-pushed the bindbackendstorage branch 2 times, most recently from 7662eff to 1feb44c, on April 18, 2024 18:11
@kubevirt-bot kubevirt-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 19, 2024
Signed-off-by: Jed Lejosne <jed@redhat.com>
@kubevirt-bot kubevirt-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 19, 2024
@jean-edouard (Author):

/retest

@fossedihelm (Contributor) left a comment:

Thanks a lot @jean-edouard! Everything looks good from my side.
Just a very small note :)
/lgtm

Entry("using explicit Immediate mode", pointer.P(storagev1.VolumeBindingImmediate)),
Entry("using nil mode", nil))

It("should create a corresponding Pod on VMI creation with a storage class in WaitForFirstCustomer mode", func() {
@fossedihelm (Contributor):

nit: s/WaitForFirstCustomer/WaitForFirstConsumer/

@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Apr 19, 2024
// IsPVCReady returns true if either:
// - No PVC is needed for the VMI since it doesn't use backend storage
// - The backend storage PVC is bound
// - The backend storage PVC is pending uses a WaitForFirstCustomer storage class
@fossedihelm (Contributor):

nit: s/WaitForFirstCustomer/WaitForFirstConsumer/

Signed-off-by: Jed Lejosne <jed@redhat.com>
@kubevirt-bot kubevirt-bot removed the lgtm Indicates that a PR is ready to be merged. label Apr 22, 2024
@fossedihelm (Contributor):

Thank you! Great work!
/lgtm

@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Apr 22, 2024
@mhenriks (Member):

/approve

@kubevirt-bot:

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mhenriks

The full list of commands accepted by this bot can be found here.

The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.

@kubevirt-bot kubevirt-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 22, 2024
@kubevirt-commenter-bot:

Required labels detected, running phase 2 presubmits:
/test pull-kubevirt-e2e-windows2016
/test pull-kubevirt-e2e-kind-1.27-vgpu
/test pull-kubevirt-e2e-kind-sriov
/test pull-kubevirt-e2e-k8s-1.29-ipv6-sig-network
/test pull-kubevirt-e2e-k8s-1.27-sig-network
/test pull-kubevirt-e2e-k8s-1.27-sig-storage
/test pull-kubevirt-e2e-k8s-1.27-sig-compute
/test pull-kubevirt-e2e-k8s-1.27-sig-operator
/test pull-kubevirt-e2e-k8s-1.28-sig-network
/test pull-kubevirt-e2e-k8s-1.28-sig-storage
/test pull-kubevirt-e2e-k8s-1.28-sig-compute
/test pull-kubevirt-e2e-k8s-1.28-sig-operator

@kubevirt-bot:

@jean-edouard: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name                        Commit   Details  Required  Rerun command
pull-kubevirt-unit-test-arm64    bba9c9e  link     false     /test pull-kubevirt-unit-test-arm64
pull-kubevirt-conformance-arm64  bba9c9e  link     false     /test pull-kubevirt-conformance-arm64

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@kubevirt-bot kubevirt-bot merged commit 1f4e36c into kubevirt:main Apr 23, 2024
36 of 38 checks passed
@acardace (Member):

/cherrypick release-1.2

@kubevirt-bot:

@acardace: #11373 failed to apply on top of branch "release-1.2":

Applying: Wait until backend storage PVC is ready before starting VMIs
Using index info to reconstruct a base tree...
M	pkg/virt-controller/watch/BUILD.bazel
M	pkg/virt-controller/watch/vmi.go
M	pkg/virt-controller/watch/vmi_test.go
Falling back to patching base and 3-way merge...
Auto-merging pkg/virt-controller/watch/vmi_test.go
CONFLICT (content): Merge conflict in pkg/virt-controller/watch/vmi_test.go
Auto-merging pkg/virt-controller/watch/vmi.go
Auto-merging pkg/virt-controller/watch/BUILD.bazel
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 Wait until backend storage PVC is ready before starting VMIs
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

> /cherrypick release-1.2

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@acardace (Member):

@jean-edouard can you do a manual backport for this?

Labels: approved, dco-signoff: yes, lgtm, release-note-none, sig/buildsystem, sig/storage, size/L
7 participants