Prevent versioning of artifacts #35

Closed
jjaraalm opened this issue Dec 7, 2019 · 2 comments
Labels
help wanted Extra attention is needed

Comments

@jjaraalm

jjaraalm commented Dec 7, 2019

I really like the idea and structure of metaflow. For my use case, it looks like it could simultaneously solve a lot of different problems. That said, is there any way to disable versioning and archiving of specific artifacts? If I can guarantee that my upstream data source is versioned and archived appropriately, then I don't necessarily want duplication of all artifacts (because of the storage overhead).

I could just remove certain artifacts after a while, but then the cleaning tool would need to know what should and shouldn't be archived. It would be nicer if there were some syntax to declare an artifact as transient, or at the very least a call we could make at the end of a flow to dispose of artifacts that shouldn't be versioned.

@savingoyal
Collaborator

savingoyal commented Dec 7, 2019

One way to achieve this (if your data is stored in S3) is to use the metaflow.S3 client to access your data within steps, and to pass pointers to the S3 locations from step to step through self assignments. Only the pointers are then stored as artifacts, not the data itself.

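  from metaflow import S3, step

  @step
  def step_a(self):
      # Read inputs directly from S3 instead of loading them into an artifact.
      with S3(s3root=self.path_to_data_in_s3) as s3:
          res = s3.get('object')
          ...
          # put() returns the S3 URL of the uploaded object.
          url = s3.put('example_object', message)
          # Keep only the URL as an artifact; later steps can fetch the data from it.
          self.path_to_example_url_in_s3 = url
      ...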

Currently, we store the artifacts so that you can replicate the state of your workflow at any time in the future. Making artifacts transient would change that behaviour, since a past run could no longer be fully reconstructed from what Metaflow stored.
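
To give a rough sense of the storage trade-off, here is a minimal sketch using only the standard library. It assumes that each self attribute is persisted as a pickled object (that is my understanding of how artifacts are serialized, so treat it as an assumption), and the payload and the bucket path are made up for illustration.

```python
import pickle

# Hypothetical payload standing in for a large dataset held in memory.
data = list(range(1_000_000))

# Hypothetical S3 URL of the kind s3.put() returns; bucket and key are made up.
pointer = "s3://example-bucket/example-prefix/example_object"

# Assumption: an artifact costs roughly the size of its pickled value.
size_of_data = len(pickle.dumps(data))
size_of_pointer = len(pickle.dumps(pointer))

print(f"artifact holding the data:    {size_of_data:>10,} bytes")
print(f"artifact holding the pointer: {size_of_pointer:>10,} bytes")
```

The pointer artifact stays tiny no matter how large the object behind it is, so versioning it on every run is cheap. The catch is that a past run is only reproducible as long as the object at that URL is never overwritten or deleted.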

savingoyal added the "help wanted" label Dec 7, 2019
@savingoyal
Collaborator

Closing this issue for now. Please comment and re-open for any follow-up.
