Handle mkdir for cloud providers that support creating directories #51
Comments
This already leads to some tricky unintuitive behavior in our existing implementation. In filesystems, you can't have a directory and a file share a name:
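The snippet that originally followed is not preserved here. As a stand-in, here is a minimal sketch of the local filesystem behavior using only the standard library; the helper name `file_and_dir_can_share_name` is made up for illustration.

```python
import tempfile
from pathlib import Path


def file_and_dir_can_share_name() -> bool:
    """Return True if the local filesystem lets a file and a directory share a name."""
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "a"
        target.touch()  # create a regular file named "a"
        try:
            target.mkdir()  # now try to create a directory with the same name
        except FileExistsError:
            return False
        return True


print(file_and_dir_can_share_name())
```

On a regular filesystem the `mkdir()` call raises `FileExistsError`, so the helper returns `False`.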
S3 has no problem doing this. Not only that, but there is path dependence for how cloudpathlib handles it. Making
( Making
Overall, I think this is just going to be a problem with all flat object stores. I can also create Doing some testing with AWS CLI:
In that case, I think we will need to come up with a specification of what happens, and document how it works. For example, we may decide:
This seems like it would best match what the AWS CLI does.
When we add #17, we can write tests for these scenarios once we have defined what the behavior should be.
This is also broken for blob storage using Data Lake 2 Storage, which DOES have a concept of hierarchical directory namespaces.
Thanks @analog-cbarber. Can you provide a code snippet that reproduces the problem and some more details on the configuration?
If you create a data lake 2 storage container and create a directory in the container using
# AZ_TEST_ACCOUNT, AZ_SAS and AZ_TEST_CONTAINER are defined elsewhere
from cloudpathlib import AzureBlobClient, AzureBlobPath
az_test_client = AzureBlobClient(AZ_TEST_ACCOUNT, AZ_SAS)
az_test_path = AzureBlobPath(f'az://{AZ_TEST_CONTAINER}', client=az_test_client)
subdir = az_test_path.joinpath('subdir')
subdir.mkdir()
assert subdir.is_dir()  # fails
If the trick to create an S3 folder is to touch an object whose path ends with "/" (so that it does not get mixed up with a regular file), why don't we just do the following?
class S3Path(CloudPath):
    ...
    # simplified version
    def mkdir(self, parents=False, exist_ok=False):
        # use self as-is if it already ends with "/", otherwise append one
        path = self if str(self).endswith("/") else S3Path(f"{self}/")
        path.touch(exist_ok=exist_ok)
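To make the marker-object idea concrete without touching a real bucket, here is a rough sketch against an in-memory dict standing in for a flat object store. `FakeBucket` and all of its methods are hypothetical and exist only for this illustration; they are not cloudpathlib or boto3 APIs.

```python
class FakeBucket:
    """Hypothetical in-memory stand-in for a flat object store (not a real S3 API)."""

    def __init__(self):
        self.objects = {}  # key -> bytes; there is no directory table at all

    def put(self, key, body=b""):
        self.objects[key] = body

    def delete(self, key):
        del self.objects[key]

    def is_dir(self, prefix):
        # A "directory" exists only if some key starts with "<prefix>/".
        prefix = prefix.rstrip("/") + "/"
        return any(k.startswith(prefix) for k in self.objects)

    def mkdir(self, prefix):
        # The suggestion above: write an empty marker object whose key ends in "/".
        self.put(prefix.rstrip("/") + "/")


bucket = FakeBucket()
bucket.put("a/b/c.txt", b"data")
implied_before = bucket.is_dir("a/b")  # implied purely by the key prefix
bucket.delete("a/b/c.txt")
implied_after = bucket.is_dir("a/b")   # gone along with the only object

bucket.mkdir("a/b")
marker_dir = bucket.is_dir("a/b")      # the "a/b/" marker keeps it alive while empty
```

In this toy model the implied directory disappears when its only object is deleted, while the one backed by a trailing-slash marker survives being empty. A real implementation would also have to decide what `parents` and `exist_ok` mean for a flat store.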
Additional useful discussion in #295, which has been consolidated into this issue.
Treating fake directories on S3 as directories should work correctly now with #302. Changing this issue to just be about implementing the
S3 has an interesting situation with folders.
Like other object stores such as Azure, it has a flat structure, and when you upload a file to a/b/c.txt, for example, it creates an object literally named a/b/c.txt. The directories a and b aren't real and don't exist. The web console has special behavior to fake those as folders in the UI. When you delete c.txt, a and b will automatically be gone.
However, S3 does have another mechanism that lets you have folders. There is a "Create Folders" button in the web console, which lets you make folders that exist even while empty. These turn out to actually be dummy files with a trailing slash. So if you create a/, there is actually an object named a/ in your bucket, which is not actually a folder, but the S3 console will treat it like a folder for the UI. You can equivalently upload a file to a/ and it will do the same thing.
We need to think through the implications of this and what cloudpathlib should support.
PurePosixPath strips trailing / on instantiation, so this is something that doesn't map to a representation cleanly through PurePosixPath.
As a result, it's not currently possible to create an S3Path object that points to an S3 folder. EDIT: This was incorrect. See discussion in Handle and test for s3 fake directories #190. This is possible because the string representation of the input URI is the main basis for a CloudPath object, not a PurePosixPath.
S3Path.mkdir method is not implemented.
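The trailing-slash point can be checked with the standard library alone. This small sketch uses a made-up key, `a/`, and shows that the slash is lost once the key goes through `PurePosixPath`, while the raw string still carries it.

```python
from pathlib import PurePosixPath

key = "a/"                             # hypothetical key for an S3 "folder" marker object
normalized = str(PurePosixPath(key))   # PurePosixPath drops the trailing slash

raw_has_slash = key.endswith("/")                # the raw string still carries it
normalized_has_slash = normalized.endswith("/")  # the normalized form does not

print(normalized, raw_has_slash, normalized_has_slash)
```

This is why a path object built on the original URI string can still tell a folder marker apart from a regular object, while one built purely on `PurePosixPath` cannot.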