
Output artifact gets deleted even when it resides on a volume mount #4676

Closed
antoniomo opened this issue Dec 9, 2020 · 6 comments

@antoniomo
Contributor

Summary

If we have an output artifact from a mounted PVC or some other Volume mount, it shouldn't be deleted after upload.

The code at https://github.com/argoproj/argo/blob/master/workflow/executor/executor.go#L343-L348 assumes that the artifact is on the container's ephemeral storage, but that might not be the case.
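A fix could guard the deletion by checking whether the artifact path lies under a declared volume mount. The sketch below is illustrative only; `isOnVolumeMount` and its inputs are hypothetical names, not Argo's actual executor API:

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// isOnVolumeMount reports whether artifactPath lives under any of the
// template's mounted volume paths. The executor could skip post-upload
// deletion when this returns true, since the data is not ephemeral.
func isOnVolumeMount(artifactPath string, mountPaths []string) bool {
	clean := filepath.Clean(artifactPath)
	for _, m := range mountPaths {
		m = filepath.Clean(m)
		// Match the mount point itself, or any path strictly below it.
		if clean == m || strings.HasPrefix(clean, m+string(filepath.Separator)) {
			return true
		}
	}
	return false
}

func main() {
	mounts := []string{"/mnt/vol"}
	fmt.Println(isOnVolumeMount("/mnt/vol/hello_world.txt", mounts)) // prints "true"
	fmt.Println(isOnVolumeMount("/tmp/out.txt", mounts))             // prints "false"
}
```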

We use a volume to share data quickly between workflow steps. However, we also want to upload some outputs produced in intermediate steps as output artifacts right away, so they can be used elsewhere. We see that Argo removes the files of those output artifacts, so we can't use them in the next steps. This goes against the recommended advice here: https://github.com/argoproj/argo/blob/master/docs/cost-optimisation.md#consider-trying-volume-claim-templates-or-volumes-instead-of-artifacts

Diagnostics

What Kubernetes provider are you using? 1.18

What version of Argo Workflows are you running? 2.9.3


Message from the maintainers:

Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.

@alexec
Contributor

alexec commented Dec 9, 2020

Interesting. Could be pretty nasty. We'll discuss.

@Ark-kun
Member

Ark-kun commented Dec 9, 2020

Perhaps there is no need to delete the artifact files after uploading them.
It's supposed to be an optimization, but I'm not sure it provides any benefit.
I also remember a bug I fixed a long time ago where the same file was output as both an artifact and a parameter.

> If we have an output artifact from a mounted PVC or some other Volume mount, it shouldn't be deleted after upload.

This scenario seems pretty strange. If the output data is generated by the component, then it's usually located on the local disk, not the mounted volume.

> However, some of the outputs produced on intermediate steps, we want to upload as output artifacts right away to also use elsewhere.

Perhaps it might be better to make uploading explicit and add an upload template. This can be beneficial as your repositories for intermediate data and persistent data might be different.

@antoniomo
Contributor Author

> This scenario seems pretty strange. If the output data is generated by the component, then it's usually located on the local disk, not the mounted volume.

It's actually quite common, and the recommended way to pass data between steps without an artifact upload followed by a download. We are talking about "heavy" artifacts of multiple gigabytes.

> Perhaps it might be better to make uploading explicit and add an upload template. This can be beneficial as your repositories for intermediate data and persistent data might be different.

As an optimization we typically do that, so the upload can happen concurrently with other workflow steps (otherwise, the step that creates the output data doesn't finish until the artifact upload is complete, preventing the dependent steps from starting). However, this upload template is just a dummy busybox `ls` or the like, used to declare an output artifact. We could use a container with the AWS CLI and `aws s3 cp`, but that seems wrong when S3 artifact support is a core Argo feature (not to mention that our teams of Argo users would have a hard time understanding why that's necessary).
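For reference, the dummy upload step described above might look something like this (a sketch only; the template name, artifact name, file path, and volume name are illustrative):

```yaml
# Hypothetical "upload-only" step: a no-op busybox container that merely
# declares a file already on the shared volume as an output artifact, so
# the upload runs concurrently with downstream steps reading the same volume.
- name: upload-intermediate
  container:
    image: busybox
    command: [sh, -c]
    args: ["ls -l /mnt/vol/intermediate.dat"]
    volumeMounts:
    - name: workdir
      mountPath: /mnt/vol
  outputs:
    artifacts:
    - name: intermediate
      path: /mnt/vol/intermediate.dat
```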

@alexec alexec self-assigned this Dec 9, 2020
alexec added a commit to alexec/argo-workflows that referenced this issue Dec 10, 2020
…proj#4676

Signed-off-by: Alex Collins <alex_collins@intuit.com>
@alexec
Contributor

alexec commented Dec 10, 2020

@antoniomo. I've created a dev build for you to test. Can you please test argoproj/argoexec:fix-4676?

@antoniomo
Contributor Author

> @antoniomo. I've created a dev build for you to test. Can you please test argoproj/argoexec:fix-4676?

Thanks, I can give it a go over the weekend!

@antoniomo
Contributor Author

Hi!

I used this modified example workflow for testing:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: volumes-existing-
spec:
  entrypoint: volumes-existing-example
  volumes:
  # Pass my-existing-volume as an argument to the volumes-existing-example template
  # Same syntax as k8s Pod spec
  - name: workdir
    persistentVolumeClaim:
      claimName: my-existing-volume

  templates:
  - name: volumes-existing-example
    steps:
    - - name: generate
        template: whalesay
    - - name: print
        template: print-message

  - name: whalesay
    container:
      image: docker/whalesay:latest
      command: [sh, -c]
      args: ["echo generating message in volume; cowsay hello world | tee /mnt/vol/hello_world.txt"]
      volumeMounts:
      - name: workdir
        mountPath: /mnt/vol
    outputs:
      artifacts:
        - name: hello
          path: /mnt/vol/hello_world.txt

  - name: print-message
    container:
      image: alpine:latest
      command: [sh, -c]
      args: ["echo getting message from volume; find /mnt/vol; cat /mnt/vol/hello_world.txt"]
      volumeMounts:
      - name: workdir
        mountPath: /mnt/vol

It completes just fine, including the expected output on the second step, reading from the volume after the output artifact upload (to the default artifact storage here).

The logs of the wait container on the first step correctly show:

time="2020-12-12T16:01:18.798Z" level=info msg="Saving output artifacts"
time="2020-12-12T16:01:18.798Z" level=info msg="Staging artifact: hello"
time="2020-12-12T16:01:18.798Z" level=info msg="Staging /mnt/vol/hello_world.txt from mirrored volume mount /mainctrfs/mnt/vol/hello_world.txt"
time="2020-12-12T16:01:18.798Z" level=info msg="Taring /mainctrfs/mnt/vol/hello_world.txt"
time="2020-12-12T16:01:18.799Z" level=info msg="Successfully staged /mnt/vol/hello_world.txt from mirrored volume mount /mainctrfs/mnt/vol/hello_world.txt"
time="2020-12-12T16:01:18.799Z" level=info msg="S3 Save path: /tmp/argo/outputs/artifacts/hello.tgz, key: volumes-existing-kt9jx/volumes-existing-kt9jx-3077291320/hello.tgz"
time="2020-12-12T16:01:18.799Z" level=info msg="Creating minio client minio:9000 using static credentials"
time="2020-12-12T16:01:18.799Z" level=info msg="Saving from /tmp/argo/outputs/artifacts/hello.tgz to s3 (endpoint: minio:9000, bucket: my-bucket, key: volumes-existing-kt9jx/volumes-existing-kt9jx-3077291320/hello.tgz)"
time="2020-12-12T16:01:18.804Z" level=info msg="not deleting local artifact" localArtPath=/tmp/argo/outputs/artifacts/hello.tgz
time="2020-12-12T16:01:18.804Z" level=info msg="Successfully saved file: /tmp/argo/outputs/artifacts/hello.tgz"

So I think it works as expected and is good to go :) Thank you!
