Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Storage Cleaner] Add more basic functionality to storage adapters #380

Merged
merged 12 commits into from
Dec 7, 2023

Conversation

2015aroras
Copy link
Collaborator

@2015aroras 2015aroras commented Nov 22, 2023

This adds or updates the following functionality for storage adapters:

  • is_dir: Create a separate inner method to avoid calls between public-facing methods when the unsharder is implemented.
  • download_folder: Download a folder from storage to a local destination. Downloading a file can be done using cached path and so I exclude it.
  • upload: Upload a file or folder from local FS to storage.

PR Train:

  1. [Storage Cleaner] Add script for removing bad runs from local & cloud storage #364
  2. [Storage Cleaner] Migrate to cached_path #378
  3. [Storage Cleaner] Add option to get full paths when listing entries #379
  4. This PR [Storage Cleaner] Add more basic functionality to storage adapters #380
  5. [Storage Cleaner] Add unsharding to storage cleaner #381
  6. [Storage Cleaner] Handle some legacy checkpoints in unsharding #382

@2015aroras 2015aroras changed the title [storage cleaner] Add more basic functionality to storage adapters [Storage Cleaner] Add more basic functionality to storage adapters Nov 22, 2023
@2015aroras 2015aroras force-pushed the shanea/storage-cleaner-archive-path-fix branch from 1a4e300 to 9c6cd51 Compare November 22, 2023 17:07
@2015aroras 2015aroras marked this pull request as ready for review November 22, 2023 17:15
Copy link
Member

@epwalsh epwalsh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

One nit: A lot of the methods take some sort of path argument as a str, and then convert to a Path, while others take PathOrStr. I think it would be cleaner to allow everything to take PathOrStr.

@2015aroras
Copy link
Collaborator Author

LGTM.

One nit: A lot of the methods take some sort of path argument as a str, and then convert to a Path, while others take PathOrStr. I think it would be cleaner to allow everything to take PathOrStr.

The reason I use str for most things is that Path() breaks cloud URLs (it changes // to /). Using PathOrStr for most of the storage adapter methods could encourage the wrong behavior of passing in a Path for cloud paths. Thus I avoid using Path or PathOrStr for paths that could be from the cloud.

Most of the methods that take some a str path argument and convert to Path are in the local FS adapter, where the path is a local path and so doesn't get broken by Path().

@2015aroras 2015aroras changed the base branch from shanea/storage-cleaner-archive-path-fix to shanea/storage-cleaner-archive-fix-2 December 6, 2023 22:16
@2015aroras
Copy link
Collaborator Author

@epwalsh Sorry, didn't mean to re-request review for this one since it is more or less unchanged (just a different base).

Base automatically changed from shanea/storage-cleaner-archive-fix-2 to main December 7, 2023 18:38
@2015aroras 2015aroras merged commit a120ab2 into main Dec 7, 2023
10 checks passed
@2015aroras 2015aroras deleted the shanea/storage-cleaner-download-upload branch December 7, 2023 18:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants