Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not backup volumes of pending pods #2270

Closed
wants to merge 2 commits into from

Conversation

ctrox
Copy link
Contributor

@ctrox ctrox commented Feb 13, 2020

Fixes #2269

My preferred solution to this would be to actually check if a PVC exists and not just completely ignore PVCs of Pending pods. But as far as I can see that is not easy to achieve since we don't have a kubernetes client in PodAction.

Fixes vmware-tanzu#2269

Signed-off-by: Cyrill Troxler <cyrill@nine.ch>
@ashish-amarnath
Copy link
Member

@ctrox Thanks for your PR.
On the outset, your approach seems fine. Have you been able test this out? Specifically, that the PVC (if any) should also included in the backup.

@ashish-amarnath
Copy link
Member

Can you also please add a change log to your PR?
See https://velero.io/docs/v1.3.0-beta.2/code-standards/ for instructions.

@ctrox
Copy link
Contributor Author

ctrox commented Feb 27, 2020

Have you been able test this out? Specifically, that the PVC (if any) should also included in the backup.

Yeah I have tested this locally, plus I added a test case with a pending pod. Correct me if I'm wrong but as long as a PVC is bound it should be backed up by the backup_pv_action.

Can you also please add a change log to your PR?

done

Signed-off-by: Cyrill Troxler <cyrill@nine.ch>
@ashish-amarnath
Copy link
Member

This PR LGTM.
Requesting reviews from more folks.

@skriss
Copy link
Contributor

skriss commented Feb 27, 2020

The code LGTM if we want to go ahead with this.

I do want to call out that we've previously had a bunch of back-and-forth on whether Velero should ignore cases like this, or whether they should be reported as errors/partial failures like they are today (see #1878). In the past, we've decided to continue reporting these as errors, so the user is at least alerted that something is wrong with their config.

What do others think? Should we revisit this decision?

@nrb
Copy link
Contributor

nrb commented Mar 2, 2020

@ctrox So if I'm reading this correctly, the Pod will be backed up, but no PVCs for it. In your correct scenario, does the Pod restore as Pending? Or is it able to restore successfully because new PVCs are provisioned?

@ctrox
Copy link
Contributor Author

ctrox commented Mar 3, 2020

In my scenario, the pod will be restored as Pending as the PVC reference was never valid anyways.
However, without knowing too much about the architecture of Velero, would it not make sense to completely separate PVC backups from Pods? If for some reason I have a PVC which is not currently bound to a pod, maybe for example some job that uses a volume and is not currently running, I would still expect that PVC+PV to be backed up.

@skriss
Copy link
Contributor

skriss commented Mar 10, 2020

@ctrox PVCs will still be backed up independent of pods, assuming they match all the filters defined in the backup spec - so, in your scenario, if you were backing up an entire namespace that included a pending pod and a PVC, the PVC would still be backed up. It would just happen at a different point in time, outside the "scope" of the pod backup. The reason for linking pod + PVC + PV backups is that sometimes, users want to run hooks in their pod just before backing up the pod + any PVC/PV it uses, in order to quiesce their application and get consistent volume snapshots.

@skriss
Copy link
Contributor

skriss commented Mar 10, 2020

I suppose one could argue that it makes sense not to try to back up related PVCs/PVs of Pending pods, because any pre/post hooks won't be able to be executed in the pod anyway, so there's no need to try to back up the PVC/PV within the scope of the pod, even if they exist. If they match the backup's filters, they'll still get backed up, just at a later point in time. And, if we do that, then you sidestep the issue of erroring when the pod is Pending because the PVC doesn't actually exist.

@carlisia
Copy link
Contributor

I wonder if we should have the concept of policy. A set of rules that Velero would follow to resolve issues like this one, and leave it up to the user to decide. These would be differ from configuration in a way that would dictate how Velero would behave for either all backups or on a per backup basis. I think I remember there being at least one other behavior decision that we were having trouble deciding one way or another (not #1878) but I can't recall what.

@skriss skriss added the On Hold label Apr 29, 2020
@cblecker
Copy link
Contributor

This PR would fix the specific use case I described here: #1878 (comment)

@skriss
Copy link
Contributor

skriss commented Jun 1, 2020

I've put up PR #2595 which fixes this in a more general way by not erroring when additional items returned from plugins can't be found. I've confirmed in testing that a pod that's pending because it references a nonexistent PVC no longer results in a Velero error, just a warning. Closing this PR out, but appreciate the input & contribution nonetheless!

@skriss skriss closed this Jun 1, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unscheduled pod with invalid PVC reference results in partial failure
6 participants