task has failed: more than one PersistentVolumeClaim is bound #3480

Closed
koceg opened this issue Nov 2, 2020 · 4 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

koceg commented Nov 2, 2020

Expected Behavior

The Task should execute from a PipelineRun the same way it executes from a TaskRun.

Actual Behavior

task backup-prometheus-snapshot has failed: more than one PersistentVolumeClaim is bound
pod for taskrun prometheus-snapshot-run-8cdnx-backup-prometheus-snapshot-fpcgx not available yet
Tasks Completed: 2 (Failed: 1, Cancelled 0), Skipped: 0

Steps to Reproduce the Problem

apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  creationTimestamp: null
  generateName: prometheus-snapshot-run-
spec:
  serviceAccountNames:
    - taskName: new-prometheus-snapshot
      serviceAccountName: prometheus-backup
  podTemplate:
    nodeSelector:
      kubernetes.io/hostname:
  params:
  - name: oc-image
    value: quay.io/openshift/origin-cli:4.5.0
  - name: oc-prometheus
    value: prometheus-k8s-1
  pipelineRef:
    name: prometheus-snapshot
  workspaces:
  - name: prometheus-pvc
    persistentVolumeClaim:
      claimName: "prometheus-k8s-db-$(params.oc-prometheus)"
  - name: backup-pvc
    persistentVolumeClaim:
      claimName: bkp
status: {}
---
apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  creationTimestamp: null
  generateName: oc-snapshot-bkp-run-
spec:
  podTemplate:
    nodeSelector:
      kubernetes.io/hostname:
  params:
  - name: oc-image
    value: quay.io/openshift/origin-cli:4.5.0
  resources: {}
  serviceAccountName: ""
  taskRef:
    name: oc-snapshot-bkp
  workspaces:
  - name: prometheus
    persistentVolumeClaim:
      claimName: prometheus-k8s-db-prometheus-k8s-1
    subPath: prometheus-db/snapshots
  - name: backup
    persistentVolumeClaim:
      claimName: bkp
status:
  podName: ""

Additional Info

  • Kubernetes version: v1.18.3

  • Tekton Pipeline version:

    Client version: 0.12.1
    Pipeline version: v0.17.2
    Triggers version: unknown

koceg added the kind/bug label Nov 2, 2020
@jlpettersson
Member

  workspaces:
  - name: prometheus
    persistentVolumeClaim:
      claimName: prometheus-k8s-db-prometheus-k8s-1
    subPath: prometheus-db/snapshots
  - name: backup
    persistentVolumeClaim:
      claimName: bkp

Could these two workspaces use the same PersistentVolumeClaim? I see one of them is using a subPath; could the other use another subPath of the same PVC (sketched below)? It does not work well to use two PVCs in the same Task. If you only use a single datacenter, it should work if you disable the affinity assistant, but if you use a cloud provider this most likely will not help, since the two PVCs may live in different Availability Zones in the cluster; the same can happen in clusters spanned across zones on-prem. In general, try to use only a single PVC for each Task.
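
For illustration, a minimal sketch of that suggestion applied to the TaskRun above, assuming the backup data could acceptably live on the same Prometheus PVC (the backups subPath name is hypothetical):

  workspaces:
  - name: prometheus
    persistentVolumeClaim:
      claimName: prometheus-k8s-db-prometheus-k8s-1
    subPath: prometheus-db/snapshots
  - name: backup
    persistentVolumeClaim:
      claimName: prometheus-k8s-db-prometheus-k8s-1  # same PVC as the first workspace
    subPath: backups                                  # hypothetical subPath for the backup data

With only a single PVC bound to the Task, the "more than one PersistentVolumeClaim is bound" check no longer applies.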


koceg commented Nov 2, 2020

The requirement for the Task itself is to save the snapshot data under prometheus-db/snapshots to some other remote location, and the subPath is used to simplify my logic in the Task that is executed. And yes, currently the setup spans different Availability Zones.
I've tested the same logic on
Kubernetes version: v1.18.3+2fbd7c7
Tekton Pipeline version: v0.11.3
and initially it was failing in almost the same way, with the exception of the log translating TaskSpec to Pod: serviceaccounts "pipeline" not found.
To work around that I added spec.serviceAccountName: default to the PipelineRun YAML file.
After I did that, the failing task ran successfully and the pipeline executed as well.
I don't know whether this was the original error, but the log output was different.
As explained, this might go against best practices, but I don't currently see another option regarding PVC usage.
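
For reference, that workaround amounts to one extra field in the PipelineRun spec; a trimmed sketch showing only the relevant fields:

apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  generateName: prometheus-snapshot-run-
spec:
  serviceAccountName: default   # added to avoid: serviceaccounts "pipeline" not found
  pipelineRef:
    name: prometheus-snapshot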

@jlpettersson
Member

And yes currently the setup is with different Availability Zones.

I don't see other option currently regarding PVC usage.

I think your use case is interesting, but also challenging. In this case you need to disable the affinity assistant (see the sketch below), and you also need to make sure that your PVCs are in the same Availability Zone; how this can be done depends on your storage system. Is there a way to set zones for your PVs? Or at least to enforce that all Pods using those PVCs are always scheduled to the same zone?
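
Disabling the affinity assistant is done through Tekton's feature-flags ConfigMap in the tekton-pipelines namespace; a sketch (verify the flag against the docs for your installed Tekton Pipelines version):

apiVersion: v1
kind: ConfigMap
metadata:
  name: feature-flags
  namespace: tekton-pipelines
data:
  disable-affinity-assistant: "true"   # allows a TaskRun to bind more than one PVC-backed workspace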

The volumeBindingMode of your StorageClass may also affect whether this is possible and how to do it, especially whether it is set to WaitForFirstConsumer.

For some storage systems and binding modes, the volume is first scheduled to a zone and the Pod then follows to that zone (as I have understood it, at least with volumeBindingMode: Immediate). This is a tricky field that I had to learn the hard way :) In your case, you need to find a configuration so that the two volumes end up in the same AZ, otherwise they cannot both be mounted by the same Pod. Or do you have a StorageClass that is available in all your AZs?
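
For illustration, one way to keep dynamically provisioned volumes in a single zone is a StorageClass with allowedTopologies; the provisioner and zone value below are placeholders for whatever your storage system uses:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: single-zone-storage               # hypothetical name
provisioner: example.com/provisioner      # placeholder; use your storage system's provisioner
volumeBindingMode: WaitForFirstConsumer   # delay binding until a consuming Pod is scheduled
allowedTopologies:
- matchLabelExpressions:
  - key: topology.kubernetes.io/zone
    values:
    - zone-a                              # hypothetical zone; both PVCs then land here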


koceg commented Nov 3, 2020

@jlpettersson I want to thank you for pointing out the affinity assistant; after disabling it on the initial cluster where I started testing, everything worked. The backup storage that I use is zone independent and does not pose any problem, as it allows the RWX access mode. I'm closing this as it's not really a bug but a lack of understanding on my side.

koceg closed this as completed Nov 3, 2020
jimmykarily pushed a commit to epinio/epinio that referenced this issue Aug 17, 2021
because trying to mount 2 PVCs results in an error:

tektoncd/pipeline#3480

Now we have another problem because the subpath for the source is not
automatically cleaned up. The pipeline fails with:

"remote origin already exists"
jimmykarily pushed a commit to epinio/epinio that referenced this issue Aug 18, 2021
because trying to mount 2 PVCs results in an error:

tektoncd/pipeline#3480

Since the PVC is re-used, it means we now have to remove the old code
otherwise the git clone will fail. We do that by adding a new "cleanup"
task.
cmoulliard added a commit to ch007m/tekton-pac that referenced this issue Jun 1, 2023
…kton reports such an issue: tektoncd/pipeline#3480

Signed-off-by: Charles Moulliard <cmoulliard@redhat.com>