Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rmdir and rmtree #52

Merged
merged 3 commits into from Aug 30, 2020
Merged

rmdir and rmtree #52

merged 3 commits into from Aug 30, 2020

Conversation

jayqi
Copy link
Member

@jayqi jayqi commented Aug 29, 2020

  • CloudPath.rmdir now errors if directory is not empty
  • new method CloudPath.rmtree that recursively deletes files, named after shutil.rmtree
  • Update tests to behave correctly with these new methods. Note that the current state of the mocked S3 backend assumes only files exist in S3 and not folders. Discussion about folders in Handle mkdir for cloud providers that support creating directories #51.

@jayqi jayqi requested a review from pjbull August 29, 2020 04:23
@codecov
Copy link

codecov bot commented Aug 29, 2020

Codecov Report

Merging #52 into master will decrease coverage by 0.28%.
The diff coverage is 83.33%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #52      +/-   ##
==========================================
- Coverage   81.51%   81.23%   -0.29%     
==========================================
  Files           9        9              
  Lines         622      634      +12     
==========================================
+ Hits          507      515       +8     
- Misses        115      119       +4     
Impacted Files Coverage Δ
cloudpathlib/cloudpath.py 90.61% <83.33%> (-0.64%) ⬇️
cloudpathlib/backends/s3/s3backend.py 97.67% <0.00%> (-1.17%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8f1c711...3080673. Read the comment docs.

Copy link
Member

@pjbull pjbull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One q, but this looks great! Glad to have rmtree

@@ -508,6 +519,7 @@ def download_to(self, destination: Union[str, os.PathLike]):
if self.is_file():
self.backend._download_file(self, destination)
else:
destination.mkdir(exist_ok=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will we need parents=True as well? I think if a user specifies a path we can assume they want the whole thing to be created.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess that's a design decision to make on whether we want to change the behavior of this function. The existing behavior of download_to if the source is a file also will not create parents, which is consistent with typical behavior for Path file methods (though it is inconvenient sometimes).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is doubly true. First, that pathlib never creates dirs for you, and second that it is inconvenient annoying. In a bunch of our projects we've got a repeated path.parent.mkdir(exist_ok=True, parents=True) before each path write.

I don't feel strongly either way. I didn't track down a reason for pathlib not creating dirs in some cursory searching of the Python bug tracker / mailing list, and I would be curious to see discussion why file creation methods (like write_text for example) wouldn't create the full path. I guess it's not explicitly saying to create a directory, but Path("asdf/asdf/asdf.txt").write_text("asdf") does seem pretty explicit...

In the absence of a reason not to, I'm inclined to create directories since that feels more useful and more like the API I would want to use in my development experience, but happy to punt the discussion to an issue as well.


I guess that's a design decision to make on whether we want to change the behavior of this function. The existing behavior of download_to if the source is a file also will not create parents

This is a good point that highlights that throughout the project we have contracts that are under-specified. It may be the case that the backend _download_file actually does create non-existing parent directories since we don't explicitly stop it from doing so. We haven't defined if Backend._download_file should or should not create parent directories, and we don't enforce it across backends. This is the kind of place where we're going to want to get more explicit in documenting these contracts and what is expected so that we mitigate differences across backend implementations.

I'll file an issue to track creating a backend implementation guide that specifies a handful of these things.

(Also, I will say that the new functions in cloudpathlib are the make-it-work implementations for a happy path. They definitely have not considered every use case and corner of API consistency. We should feel free to change them to make them better while we're developing.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess my inclination is to punt for now. Given that the precedent is to not create directories for you automatically, not just for pathlib, but also for shells like bash, I think it makes sense to standardize at first on that. If we decide otherwise, I think we should make a concerted effort to do that consistently everywhere in cloudpathlib, and also test for it everywhere.

@@ -179,3 +181,15 @@ def paginate(self, Bucket=None, Prefix="", Delimiter=None):
dirs = [{"Prefix": str(_.relative_to(self.root))} for _ in page if _.is_dir()]
files = [{"Key": str(_.relative_to(self.root))} for _ in page if _.is_file()]
yield {"CommonPrefixes": dirs, "Contents": files}


def delete_empty_parents_up_to_root(path, root):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now we act like a true blob store!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants