Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(artifacts): Allow adding s3 bucket as reference artifact #6346

Merged
merged 15 commits into from
Sep 22, 2023

Conversation

ibindlish
Copy link
Contributor

@ibindlish ibindlish commented Sep 21, 2023

Fixes

Description

Allows

artifact.add_reference("s3://<bucket_name>")

If the bucket is empty then an empty artifact sequence (i.e., no versions) is created.

馃 Generated by Copilot at 2ba269f

This pull request enhances the S3Handler class to support empty keys as bucket artifacts and optimize the bucket operations. It modifies the load_path and store_path functions in s3_handler.py to implement these changes.

Testing

Upload

ENTITY = "ibindlish"
PROJECT = "test-s3"
ARTIFACT = "ref-artifact"

with wandb.init(entity=ENTITY, project=PROJECT) as run:
    artifact = wandb.Artifact(ARTIFACT, type="data")
    artifact.add_reference("s3://dev-test-ibindlish")
    run.log_artifact(artifact)

https://wandb.ai/ibindlish/test-s3/artifacts/data/ref-artifact/v2/overview

Download

with wandb.init() as run:
    artifact = run.use_artifact(f"{ENTITY}/{PROJECT}/{ARTIFACT}:v2", type="data")
    artifact_dir = artifact.download()
    print(artifact_dir)

Checklist

  • Include reference to internal ticket "Fixes WB-NNNNN" and/or GitHub issue "Fixes #NNNN" (if applicable)
  • Ensure PR title compliance with the conventional commits standards

馃 Generated by Copilot at 2ba269f

The S3Handler class had a flaw
When dealing with buckets so raw
It could not handle keys
That were empty with ease
But this pull request fixes that draw

@github-actions github-actions bot added cc-feat and removed cc-feat labels Sep 21, 2023
@ibindlish ibindlish changed the title feat(artifacts): Allow adding s3 bucket as reference artifact fix(artifacts): Allow adding s3 bucket as reference artifact Sep 21, 2023
@github-actions github-actions bot added cc-fix and removed cc-feat labels Sep 21, 2023
@codecov
Copy link

codecov bot commented Sep 21, 2023

Codecov Report

Merging #6346 (342c0cd) into main (d2ba674) will decrease coverage by 0.09%.
The diff coverage is 93.33%.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #6346      +/-   ##
==========================================
- Coverage   77.24%   77.16%   -0.09%     
==========================================
  Files         387      387              
  Lines       44436    44441       +5     
==========================================
- Hits        34323    34291      -32     
- Misses      10060    10097      +37     
  Partials       53       53              
Flag Coverage 螖
unittest 80.92% <93.33%> (-0.09%) 猬囷笍

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage 螖
wandb/sdk/artifacts/storage_handlers/s3_handler.py 83.67% <93.33%> (+0.57%) 猬嗭笍

... and 13 files with indirect coverage changes

@github-actions github-actions bot added cc-fix and removed cc-fix labels Sep 21, 2023
@ibindlish ibindlish marked this pull request as ready for review September 21, 2023 22:50
@github-actions github-actions bot added cc-fix and removed cc-fix labels Sep 21, 2023
@ibindlish ibindlish marked this pull request as draft September 22, 2023 00:39
@github-actions github-actions bot added cc-fix and removed cc-fix labels Sep 22, 2023
@ibindlish ibindlish marked this pull request as ready for review September 22, 2023 00:46
@github-actions github-actions bot added cc-fix and removed cc-fix labels Sep 22, 2023
Copy link
Contributor

@moredatarequired moredatarequired left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're in a hurry this can merge as-is, but if we have some more hours I'd like to take this opportunity to clean up the function (which was a little ugly before you got to it)

@szymon-piechowicz-wandb
Copy link
Contributor

If we're in a hurry this can merge as-is, but if we have some more hours I'd like to take this opportunity to clean up the function (which was a little ugly before you got to it)

Let's make this a follow up, I think we are under time pressure here.

@ibindlish ibindlish merged commit 1208a55 into main Sep 22, 2023
79 checks passed
@ibindlish ibindlish deleted the ibindlish/artifacts/s3_handle_bucket_ref branch September 22, 2023 22:56
newline=False,
)
objs = self._s3.Bucket(bucket).objects.filter(Prefix=key).limit(max_objects)
if key != "":
objs = objs = (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo, extra objs =

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants