-
Notifications
You must be signed in to change notification settings - Fork 618
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(sdk): add exist_ok=False
to file.download()
#4564
Conversation
…file exists but you don't want to redownload also add type annotations
exist_ok=False
to file.download()
exist_ok=False
to file.download()
wandb/apis/public.py
Outdated
@@ -2782,7 +2783,7 @@ def download(self, root=".", replace=False): | |||
`ValueError` if file already exists and replace=False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you update the doc string here.
wandb/apis/public.py
Outdated
@@ -2782,7 +2783,7 @@ def download(self, root=".", replace=False): | |||
`ValueError` if file already exists and replace=False | |||
""" | |||
path = os.path.join(root, self.name) | |||
if os.path.exists(path) and not replace: | |||
if os.path.exists(path) and not replace and not exist_ok: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you are handling the case where the file exists and exist_ok==True
, with your current change you will still re-download it (based on your description, you said you want to avoid downloading if the file exists)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kptkin Good catch. Sorry, was a bit hasty with this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No worries :) That's why we have reviews
exist_ok=False
to file.download()
exist_ok=False
to file.download()
wandb/apis/public.py
Outdated
if os.path.exists(path): | ||
if not replace and not exist_ok: | ||
raise ValueError("File already exists, pass replace=True to overwrite") | ||
elif not replace and exist_ok: | ||
return open(path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
total nit: (up to you)
if os.path.exists(path): | |
if not replace and not exist_ok: | |
raise ValueError("File already exists, pass replace=True to overwrite") | |
elif not replace and exist_ok: | |
return open(path) | |
if os.path.exists(path) and not replace: | |
if not exist_ok: | |
raise ValueError("File already exists, pass replace=True to overwrite") | |
else: | |
return open(path) |
wandb/apis/public.py
Outdated
raise ValueError("File already exists, pass replace=True to overwrite") | ||
if os.path.exists(path): | ||
if not replace and not exist_ok: | ||
raise ValueError("File already exists, pass replace=True to overwrite") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we can update the error message to reflect the option that the user can pass exist_ok
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, good idea!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved!
commit d244d07 Author: Katia Patkin <87335417+kptkin@users.noreply.github.com> Date: Tue Dec 27 13:31:20 2022 -0800 style(public-api): format public file with proper formating (#4697) fix format commit 3dc8584 Author: Janosh Riebesell <janosh.riebesell@gmail.com> Date: Tue Dec 27 10:21:41 2022 -0800 feat(sdk): add `exist_ok=False` to `file.download()` (#4564) * add exist_ok=False to file.download() to avoid raising ValueError if file exists but you don't want to redownload also add type annotations * handle case file exists and exist_ok==True in File.download() * point out exist_ok=True option in file already exists error * tweak if/else paths Co-authored-by: Katia Patkin <87335417+kptkin@users.noreply.github.com> commit 4a4651e Author: Vish Rajiv <8609620+vwrj@users.noreply.github.com> Date: Fri Dec 23 14:27:16 2022 -0800 fix(artifacts): artifact.version should be the version index from the associated collection (#4486) fix(artifacts): artifact.version should be the version index from the associated collection Co-authored-by: Hugh Wimberly <hugh.wimberly@wandb.com> Co-authored-by: Dmitry Duev <dmitryduev@users.noreply.github.com> commit a7de372 Author: Noah Luna <15202580+ngrayluna@users.noreply.github.com> Date: Fri Dec 23 14:19:14 2022 -0800 docs(sdk): Removed less than, greater than characters from dosctrings… (#4687) docs(sdk): Removed less than, greater than characters from dosctrings. Re: It breaks the new doc engine Docusuarus. commit a45629d Author: Noah Luna <15202580+ngrayluna@users.noreply.github.com> Date: Fri Dec 23 14:18:00 2022 -0800 docs(sdk): Fixed typo in docstring for data_types.Objects3D (#4543) Fixed typo in docstring for data_types.Objects3D commit 6df4b88 Author: Dmitry Duev <dmitryduev@users.noreply.github.com> Date: Fri Dec 23 13:16:16 2022 -0800 test(integrations): fix import tests (#4690) test(integrations): fix import tests commit 8da62bb Author: Hugh Wimberly <hugh.wimberly@wandb.com> Date: Fri Dec 23 11:46:00 2022 -0800 refactor(artifacts): consolidate hash utilities into lib.hashutil (#4525) * Move all MD5 and base64 encode utilities to lib.hashutil * Remove duplicate function definitions * Refactor common code out into small functions * Use hypothesis to for testing * Remove unused code * Update type annotations for hash types commit eb70114 Author: Hugh Wimberly <hugh.wimberly@wandb.com> Date: Thu Dec 22 09:44:23 2022 -0800 test(artifacts): improve storage handler test coverage (#4674) * Add test case for adding a file to a finalized artifact * Add test case for caching local file references * Add GCS and WBArtifact storage handler tests commit 7106d17 Author: Hugh Wimberly <hugh.wimberly@wandb.com> Date: Wed Dec 21 19:36:31 2022 -0800 fix(artifacts): get digest directly instead of from the manifests' manifest (#4681) Reference digest directly instead of though the manifests' manifest commit 0a21fd1 Author: KyleGoyette <kdgoyette@gmail.com> Date: Wed Dec 21 14:37:03 2022 -0800 feat(launch): Default to using model-registry project for agent and launch_add (#4613) commit 84be24c Author: Griffin Tarpenning <griffin.tarpenning@wandb.com> Date: Wed Dec 21 13:20:02 2022 -0800 chore(launch): remove fallback resource when not specified for a queue (#4637) commit cf7df1d Author: speezepearson <speezepearson@users.noreply.github.com> Date: Tue Dec 20 17:10:13 2022 -0800 test(sdk): add tests for Api.upload_file_retry (#4639) commit c1ad0c0 Author: Hugh Wimberly <hugh.wimberly@wandb.com> Date: Tue Dec 20 16:08:37 2022 -0800 refactor(artifacts): use ArtifactEntry directly instead of subclassing (#4649) * Subsume ArtifactEntry into ArtifactManifestEntry commit 491fc59 Author: speezepearson <speezepearson@users.noreply.github.com> Date: Tue Dec 20 15:26:54 2022 -0800 test(sdk): add unit tests for filesync.StepUpload (#4652) * add unit tests for filesync.StepUpload commit 4e667d0 Author: Hugh Wimberly <hugh.wimberly@wandb.com> Date: Tue Dec 20 15:02:09 2022 -0800 fix(artifacts): correctly handle url-encoded local file references. (#4665) * Add test case for path names needing correct urldecoding * Use local_fil_uri_to_path instead of urlparse * Fix existing tests that used relative path uris instead of absolute paths
commit d244d07 Author: Katia Patkin <87335417+kptkin@users.noreply.github.com> Date: Tue Dec 27 13:31:20 2022 -0800 style(public-api): format public file with proper formating (#4697) fix format commit 3dc8584 Author: Janosh Riebesell <janosh.riebesell@gmail.com> Date: Tue Dec 27 10:21:41 2022 -0800 feat(sdk): add `exist_ok=False` to `file.download()` (#4564) * add exist_ok=False to file.download() to avoid raising ValueError if file exists but you don't want to redownload also add type annotations * handle case file exists and exist_ok==True in File.download() * point out exist_ok=True option in file already exists error * tweak if/else paths Co-authored-by: Katia Patkin <87335417+kptkin@users.noreply.github.com> commit 4a4651e Author: Vish Rajiv <8609620+vwrj@users.noreply.github.com> Date: Fri Dec 23 14:27:16 2022 -0800 fix(artifacts): artifact.version should be the version index from the associated collection (#4486) fix(artifacts): artifact.version should be the version index from the associated collection Co-authored-by: Hugh Wimberly <hugh.wimberly@wandb.com> Co-authored-by: Dmitry Duev <dmitryduev@users.noreply.github.com> commit a7de372 Author: Noah Luna <15202580+ngrayluna@users.noreply.github.com> Date: Fri Dec 23 14:19:14 2022 -0800 docs(sdk): Removed less than, greater than characters from dosctrings… (#4687) docs(sdk): Removed less than, greater than characters from dosctrings. Re: It breaks the new doc engine Docusuarus. commit a45629d Author: Noah Luna <15202580+ngrayluna@users.noreply.github.com> Date: Fri Dec 23 14:18:00 2022 -0800 docs(sdk): Fixed typo in docstring for data_types.Objects3D (#4543) Fixed typo in docstring for data_types.Objects3D commit 6df4b88 Author: Dmitry Duev <dmitryduev@users.noreply.github.com> Date: Fri Dec 23 13:16:16 2022 -0800 test(integrations): fix import tests (#4690) test(integrations): fix import tests commit 8da62bb Author: Hugh Wimberly <hugh.wimberly@wandb.com> Date: Fri Dec 23 11:46:00 2022 -0800 refactor(artifacts): consolidate hash utilities into lib.hashutil (#4525) * Move all MD5 and base64 encode utilities to lib.hashutil * Remove duplicate function definitions * Refactor common code out into small functions * Use hypothesis to for testing * Remove unused code * Update type annotations for hash types commit eb70114 Author: Hugh Wimberly <hugh.wimberly@wandb.com> Date: Thu Dec 22 09:44:23 2022 -0800 test(artifacts): improve storage handler test coverage (#4674) * Add test case for adding a file to a finalized artifact * Add test case for caching local file references * Add GCS and WBArtifact storage handler tests commit 7106d17 Author: Hugh Wimberly <hugh.wimberly@wandb.com> Date: Wed Dec 21 19:36:31 2022 -0800 fix(artifacts): get digest directly instead of from the manifests' manifest (#4681) Reference digest directly instead of though the manifests' manifest commit 0a21fd1 Author: KyleGoyette <kdgoyette@gmail.com> Date: Wed Dec 21 14:37:03 2022 -0800 feat(launch): Default to using model-registry project for agent and launch_add (#4613) commit 84be24c Author: Griffin Tarpenning <griffin.tarpenning@wandb.com> Date: Wed Dec 21 13:20:02 2022 -0800 chore(launch): remove fallback resource when not specified for a queue (#4637) commit cf7df1d Author: speezepearson <speezepearson@users.noreply.github.com> Date: Tue Dec 20 17:10:13 2022 -0800 test(sdk): add tests for Api.upload_file_retry (#4639) commit c1ad0c0 Author: Hugh Wimberly <hugh.wimberly@wandb.com> Date: Tue Dec 20 16:08:37 2022 -0800 refactor(artifacts): use ArtifactEntry directly instead of subclassing (#4649) * Subsume ArtifactEntry into ArtifactManifestEntry commit 491fc59 Author: speezepearson <speezepearson@users.noreply.github.com> Date: Tue Dec 20 15:26:54 2022 -0800 test(sdk): add unit tests for filesync.StepUpload (#4652) * add unit tests for filesync.StepUpload commit 4e667d0 Author: Hugh Wimberly <hugh.wimberly@wandb.com> Date: Tue Dec 20 15:02:09 2022 -0800 fix(artifacts): correctly handle url-encoded local file references. (#4665) * Add test case for path names needing correct urldecoding * Use local_fil_uri_to_path instead of urlparse * Fix existing tests that used relative path uris instead of absolute paths
commit d244d07 Author: Katia Patkin <87335417+kptkin@users.noreply.github.com> Date: Tue Dec 27 13:31:20 2022 -0800 style(public-api): format public file with proper formating (#4697) fix format commit 3dc8584 Author: Janosh Riebesell <janosh.riebesell@gmail.com> Date: Tue Dec 27 10:21:41 2022 -0800 feat(sdk): add `exist_ok=False` to `file.download()` (#4564) * add exist_ok=False to file.download() to avoid raising ValueError if file exists but you don't want to redownload also add type annotations * handle case file exists and exist_ok==True in File.download() * point out exist_ok=True option in file already exists error * tweak if/else paths Co-authored-by: Katia Patkin <87335417+kptkin@users.noreply.github.com> commit 4a4651e Author: Vish Rajiv <8609620+vwrj@users.noreply.github.com> Date: Fri Dec 23 14:27:16 2022 -0800 fix(artifacts): artifact.version should be the version index from the associated collection (#4486) fix(artifacts): artifact.version should be the version index from the associated collection Co-authored-by: Hugh Wimberly <hugh.wimberly@wandb.com> Co-authored-by: Dmitry Duev <dmitryduev@users.noreply.github.com> commit a7de372 Author: Noah Luna <15202580+ngrayluna@users.noreply.github.com> Date: Fri Dec 23 14:19:14 2022 -0800 docs(sdk): Removed less than, greater than characters from dosctrings… (#4687) docs(sdk): Removed less than, greater than characters from dosctrings. Re: It breaks the new doc engine Docusuarus. commit a45629d Author: Noah Luna <15202580+ngrayluna@users.noreply.github.com> Date: Fri Dec 23 14:18:00 2022 -0800 docs(sdk): Fixed typo in docstring for data_types.Objects3D (#4543) Fixed typo in docstring for data_types.Objects3D commit 6df4b88 Author: Dmitry Duev <dmitryduev@users.noreply.github.com> Date: Fri Dec 23 13:16:16 2022 -0800 test(integrations): fix import tests (#4690) test(integrations): fix import tests commit 8da62bb Author: Hugh Wimberly <hugh.wimberly@wandb.com> Date: Fri Dec 23 11:46:00 2022 -0800 refactor(artifacts): consolidate hash utilities into lib.hashutil (#4525) * Move all MD5 and base64 encode utilities to lib.hashutil * Remove duplicate function definitions * Refactor common code out into small functions * Use hypothesis to for testing * Remove unused code * Update type annotations for hash types commit eb70114 Author: Hugh Wimberly <hugh.wimberly@wandb.com> Date: Thu Dec 22 09:44:23 2022 -0800 test(artifacts): improve storage handler test coverage (#4674) * Add test case for adding a file to a finalized artifact * Add test case for caching local file references * Add GCS and WBArtifact storage handler tests commit 7106d17 Author: Hugh Wimberly <hugh.wimberly@wandb.com> Date: Wed Dec 21 19:36:31 2022 -0800 fix(artifacts): get digest directly instead of from the manifests' manifest (#4681) Reference digest directly instead of though the manifests' manifest commit 0a21fd1 Author: KyleGoyette <kdgoyette@gmail.com> Date: Wed Dec 21 14:37:03 2022 -0800 feat(launch): Default to using model-registry project for agent and launch_add (#4613) commit 84be24c Author: Griffin Tarpenning <griffin.tarpenning@wandb.com> Date: Wed Dec 21 13:20:02 2022 -0800 chore(launch): remove fallback resource when not specified for a queue (#4637) commit cf7df1d Author: speezepearson <speezepearson@users.noreply.github.com> Date: Tue Dec 20 17:10:13 2022 -0800 test(sdk): add tests for Api.upload_file_retry (#4639) commit c1ad0c0 Author: Hugh Wimberly <hugh.wimberly@wandb.com> Date: Tue Dec 20 16:08:37 2022 -0800 refactor(artifacts): use ArtifactEntry directly instead of subclassing (#4649) * Subsume ArtifactEntry into ArtifactManifestEntry commit 491fc59 Author: speezepearson <speezepearson@users.noreply.github.com> Date: Tue Dec 20 15:26:54 2022 -0800 test(sdk): add unit tests for filesync.StepUpload (#4652) * add unit tests for filesync.StepUpload commit 4e667d0 Author: Hugh Wimberly <hugh.wimberly@wandb.com> Date: Tue Dec 20 15:02:09 2022 -0800 fix(artifacts): correctly handle url-encoded local file references. (#4665) * Add test case for path names needing correct urldecoding * Use local_fil_uri_to_path instead of urlparse * Fix existing tests that used relative path uris instead of absolute paths
To avoid raising
ValueError
if file exists but you don't want to re-download.Also adds type annotations.