rmdir and rmtree #52

jayqi · 2020-08-29T04:23:10Z

CloudPath.rmdir now errors if directory is not empty
new method CloudPath.rmtree that recursively deletes files, named after shutil.rmtree
Update tests to behave correctly with these new methods. Note that the current state of the mocked S3 backend assumes only files exist in S3 and not folders. Discussion about folders in Handle mkdir for cloud providers that support creating directories #51.

codecov · 2020-08-29T04:24:11Z

Codecov Report

Merging #52 into master will decrease coverage by 0.28%.
The diff coverage is 83.33%.

@@            Coverage Diff             @@
##           master      #52      +/-   ##
==========================================
- Coverage   81.51%   81.23%   -0.29%     
==========================================
  Files           9        9              
  Lines         622      634      +12     
==========================================
+ Hits          507      515       +8     
- Misses        115      119       +4

Impacted Files	Coverage Δ
cloudpathlib/cloudpath.py	`90.61% <83.33%> (-0.64%)`	⬇️
cloudpathlib/backends/s3/s3backend.py	`97.67% <0.00%> (-1.17%)`	⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8f1c711...3080673. Read the comment docs.

pjbull

One q, but this looks great! Glad to have rmtree

pjbull · 2020-08-30T05:39:45Z

cloudpathlib/cloudpath.py

@@ -508,6 +519,7 @@ def download_to(self, destination: Union[str, os.PathLike]):
        if self.is_file():
            self.backend._download_file(self, destination)
        else:
+            destination.mkdir(exist_ok=True)


Will we need parents=True as well? I think if a user specifies a path we can assume they want the whole thing to be created.

I guess that's a design decision to make on whether we want to change the behavior of this function. The existing behavior of download_to if the source is a file also will not create parents, which is consistent with typical behavior for Path file methods (though it is inconvenient sometimes).

That is doubly true. First, that pathlib never creates dirs for you, and second that it is ~~inconvenient~~ annoying. In a bunch of our projects we've got a repeated path.parent.mkdir(exist_ok=True, parents=True) before each path write.

I don't feel strongly either way. I didn't track down a reason for pathlib not creating dirs in some cursory searching of the Python bug tracker / mailing list, and I would be curious to see discussion why file creation methods (like write_text for example) wouldn't create the full path. I guess it's not explicitly saying to create a directory, but Path("asdf/asdf/asdf.txt").write_text("asdf") does seem pretty explicit...

In the absence of a reason not to, I'm inclined to create directories since that feels more useful and more like the API I would want to use in my development experience, but happy to punt the discussion to an issue as well.

I guess that's a design decision to make on whether we want to change the behavior of this function. The existing behavior of download_to if the source is a file also will not create parents

This is a good point that highlights that throughout the project we have contracts that are under-specified. It may be the case that the backend _download_file actually does create non-existing parent directories since we don't explicitly stop it from doing so. We haven't defined if Backend._download_file should or should not create parent directories, and we don't enforce it across backends. This is the kind of place where we're going to want to get more explicit in documenting these contracts and what is expected so that we mitigate differences across backend implementations.

I'll file an issue to track creating a backend implementation guide that specifies a handful of these things.

(Also, I will say that the new functions in cloudpathlib are the make-it-work implementations for a happy path. They definitely have not considered every use case and corner of API consistency. We should feel free to change them to make them better while we're developing.)

I guess my inclination is to punt for now. Given that the precedent is to not create directories for you automatically, not just for pathlib, but also for shells like bash, I think it makes sense to standardize at first on that. If we decide otherwise, I think we should make a concerted effort to do that consistently everywhere in cloudpathlib, and also test for it everywhere.

pjbull · 2020-08-30T05:42:52Z

tests/mock_backends/mock_s3.py

@@ -179,3 +181,15 @@ def paginate(self, Bucket=None, Prefix="", Delimiter=None):
            dirs = [{"Prefix": str(_.relative_to(self.root))} for _ in page if _.is_dir()]
            files = [{"Key": str(_.relative_to(self.root))} for _ in page if _.is_file()]
            yield {"CommonPrefixes": dirs, "Contents": files}
+
+
+def delete_empty_parents_up_to_root(path, root):


Now we act like a true blob store!

jayqi added 2 commits August 28, 2020 20:13

rmdir error on nonempty; add rmtree

16dfce9

Update tests

42a99ec

jayqi requested a review from pjbull August 29, 2020 04:23

Fix download_to handling of subdirs (#53); fix s3 mock bugs

3080673

pjbull reviewed Aug 30, 2020

View reviewed changes

This was referenced Aug 30, 2020

Create API specifications for Backends/Clients #56

Open

download_to does not create non-existent parent directories #57

Open

pjbull merged commit feb8925 into master Aug 30, 2020

pjbull deleted the rmdir-rmtree branch August 30, 2020 18:41

jayqi mentioned this pull request Sep 4, 2020

Downloading a directory doesn't work if it contains subdirectories + incorrect mocked test #53

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

rmdir and rmtree #52

rmdir and rmtree #52

jayqi commented Aug 29, 2020

codecov bot commented Aug 29, 2020 •

edited

pjbull left a comment

pjbull Aug 30, 2020

jayqi Aug 30, 2020

pjbull Aug 30, 2020

jayqi Aug 30, 2020

pjbull Aug 30, 2020

rmdir and rmtree #52

rmdir and rmtree #52

Conversation

jayqi commented Aug 29, 2020

codecov bot commented Aug 29, 2020 • edited

Codecov Report

pjbull left a comment

Choose a reason for hiding this comment

pjbull Aug 30, 2020

Choose a reason for hiding this comment

jayqi Aug 30, 2020

Choose a reason for hiding this comment

pjbull Aug 30, 2020

Choose a reason for hiding this comment

jayqi Aug 30, 2020

Choose a reason for hiding this comment

pjbull Aug 30, 2020

Choose a reason for hiding this comment

codecov bot commented Aug 29, 2020 •

edited