
Proposal: argo cp command to download artifacts #695

Closed
jessesuen opened this issue Jan 19, 2018 · 3 comments · Fixed by #8582
Labels
area/artifacts (S3/GCP/OSS/Git/HDFS etc) · area/cli (The `argo` CLI) · good first issue (Good for newcomers) · size/M (2-4 days) · solution/workaround (There's a workaround, might not be great, but exists) · type/feature (Feature request)
Comments

@jessesuen (Member)

This proposal is for an argo CLI command to download artifacts from a workflow step. Given a workflow result like:

$ argo get artifact-passing-jg2k8 -o wide
Name:             artifact-passing-jg2k8
Namespace:        default
ServiceAccount:   default
Status:           Succeeded
Created:          Thu Jan 18 15:02:42 -0800 (9 hours ago)
Started:          Thu Jan 18 15:02:42 -0800 (9 hours ago)
Finished:         Thu Jan 18 15:03:31 -0800 (9 hours ago)
Duration:         49 seconds

STEP                       PODNAME                            DURATION  ARTIFACTS  MESSAGE
 ✔ artifact-passing-jg2k8
 ├---✔ generate-artifact   artifact-passing-jg2k8-2089725168  25s       hello-art
 └---✔ consume-artifact    artifact-passing-jg2k8-2707584000  23s

One could do something like:

$ argo cp artifact-passing-jg2k8-2089725168:hello-art /tmp/

The workflow YAML already has most of the information required to download the artifact (e.g. bucket name and key) as seen in the node status:

    artifact-passing-jg2k8-2089725168:
      finishedAt: 2018-01-18T23:03:07Z
      id: artifact-passing-jg2k8-2089725168
      name: artifact-passing-jg2k8[0].generate-artifact
      outputs:
        artifacts:
        - name: hello-art
          path: /tmp/hello_world.txt
          s3:
            accessKeySecret:
              key: accesskey
              name: argo-artifacts-minio-user
            bucket: my-bucket
            endpoint: argo-artifacts-minio-svc.default:9000
            insecure: true
            key: artifact-passing-jg2k8/artifact-passing-jg2k8-2089725168/hello-art.tgz
            secretKeySecret:
              key: secretkey
              name: argo-artifacts-minio-user

The issues here are that:

  1. the artifact repository may not be directly accessible from the client (e.g. an internal minio server)
  2. the secrets may not be accessible

However, we could at least support the use case for GCS and AWS, where the repository will be directly accessible. Regarding bucket secrets, we can either look up the secrets using the k8s go-client, or rely on the .aws config/credentials in the user's home directory for the download.
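
For reference, the manual equivalent of what the command would do, using the values from the node status above. This is a rough sketch: it assumes the client can read the referenced secret with kubectl and can reach the minio endpoint, which is normally only resolvable in-cluster.

$ # read the bucket credentials referenced by the node status (stored base64-encoded in the secret)
$ export AWS_ACCESS_KEY_ID=$(kubectl get secret argo-artifacts-minio-user -o jsonpath='{.data.accesskey}' | base64 -d)
$ export AWS_SECRET_ACCESS_KEY=$(kubectl get secret argo-artifacts-minio-user -o jsonpath='{.data.secretkey}' | base64 -d)
$ # fetch the object using the bucket, key, and endpoint from the node status (http because insecure: true)
$ aws --endpoint-url http://argo-artifacts-minio-svc.default:9000 \
    s3 cp s3://my-bucket/artifact-passing-jg2k8/artifact-passing-jg2k8-2089725168/hello-art.tgz /tmp/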

@edlee2121 added this to the V2.3 milestone Aug 29, 2018
@alexmt modified the milestones: v2.3, v2.4 Jan 25, 2019
@jessesuen removed this from the v2.4 milestone Apr 19, 2019
@jessesuen (Member, Author)

This would be possible to do with a proper API server.

@alexec (Contributor) commented Sep 29, 2020

You can now curl artifacts.
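
For example, something like the following against the Argo Server (illustrative only: the exact path and port depend on the installation and version, and ARGO_TOKEN stands in for a valid auth token):

$ curl -H "Authorization: Bearer $ARGO_TOKEN" \
    "https://localhost:2746/artifacts/default/artifact-passing-jg2k8/artifact-passing-jg2k8-2089725168/hello-art" \
    -o /tmp/hello-art.tgz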

@alexec added the solution/workaround label Sep 29, 2020
icecoffee531 pushed a commit to icecoffee531/argo-workflows that referenced this issue Jan 5, 2022
@alexec added the area/artifacts and area/cli labels and removed the help wanted label Feb 7, 2022
@alexec added the good first issue label Apr 11, 2022
@alexec (Contributor) commented Apr 20, 2022

Users:

  • Want to download all artifacts (probably quite common).
  • Want to download the artifact from one step, but don't know the node ID; instead they know the template name.
argo cp outputDir --namespace=... --workflowName=... --nodeId=... --templateName=... --artifactName=... --artifactType=output

So what does outputDir end up containing?

There should never be a clash between two files from two different nodes, so the structure of that directory should probably match the identifier, i.e. {outputDir}/{namespace}/{workflowName}/{nodeID}/outputs/{artifactName} etc

This is probably inconvenient for many users, e.g. when the workflow only has a few uniquely named outputs. We could add a --flatten=true option that does not prefix the directory name.
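
Using the workflow from this issue as an illustration (names taken from the node status earlier; the layout follows the proposal above, not necessarily the final implementation), the un-flattened download of hello-art would land at:

    {outputDir}/default/artifact-passing-jg2k8/artifact-passing-jg2k8-2089725168/outputs/hello-art

With --flatten=true it would land at {outputDir}/hello-art instead, which works as long as artifact names don't collide across nodes.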

@alexec added this to To do in Artifacts Apr 25, 2022
mihirpandya-greenops added a commit to mihirpandya-greenops/argo-workflows that referenced this issue May 2, 2022
mihirpandya-greenops added a commit to mihirpandya-greenops/argo-workflows that referenced this issue May 2, 2022
Signed-off-by: mihirpandya-greenops <mihir@greenops.io>
mihirpandya-greenops added a commit to mihirpandya-greenops/argo-workflows that referenced this issue May 3, 2022
Signed-off-by: mihirpandya-greenops <mihir@greenops.io>
@alexec added the size/M label May 5, 2022
Artifacts automation moved this from To do to Done May 6, 2022
alexec pushed a commit that referenced this issue May 6, 2022
Signed-off-by: mihirpandya-greenops <mihir@greenops.io>