Can't access file in subdir ("No such file or directory" when accessing) #866

Closed
XavierGeerinck opened this issue Aug 17, 2022 · 27 comments

@XavierGeerinck

Which version of the blobfuse was used?

blobfuse2 latest

Which OS (please include version) are you using?

Linux Ubuntu 20.04

What problem was encountered?

I have a directory structure (e.g. A/B/C/file.txt), but accessing the file directly fails with "No such file or directory". It works only after performing an ls on each subdir (e.g. ls A, ls A/B, ls A/B/C and then cat A/B/C/file.txt).

Have you found a mitigation/solution?

It works after performing an ls on each subdir (e.g. ls A, ls A/B, ls A/B/C and then cat A/B/C/file.txt)
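The mitigation above can be scripted. The following is only a minimal sketch of that workaround, assuming the container is already mounted; `ancestor_dirs` and `list_then_open` are hypothetical helper names, not part of blobfuse2.

```python
from __future__ import annotations

import os
from pathlib import PurePosixPath


def ancestor_dirs(rel_path: str) -> list[str]:
    """Return every parent directory of rel_path, outermost first."""
    parts = PurePosixPath(rel_path).parts[:-1]
    return ["/".join(parts[: i + 1]) for i in range(len(parts))]


def list_then_open(mount_root: str, rel_path: str, mode: str = "rb"):
    """List each ancestor directory under mount_root, then open the file.

    Mirrors the manual steps: ls A, ls A/B, ls A/B/C, then read the file.
    """
    for directory in ancestor_dirs(rel_path):
        os.listdir(os.path.join(mount_root, directory))
    return open(os.path.join(mount_root, rel_path), mode)
```

For example, `ancestor_dirs("A/B/C/file.txt")` returns `["A", "A/B", "A/B/C"]`, and `list_then_open("/root/azurestorage", "A/B/C/file.txt")` lists those three directories before opening the file.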

By default, blobfuse logs errors to syslog. If this is relevant, is there anything in the syslog that might be helpful?

/

If relevant, please share your mount command.

Note: tried with both streaming and tmp

Mount Command: blobfuse2 mount /root/azurestorage --read-only --config-file=/docker/azure-blobfuse-config.yaml

Config File:

allow-other: true

logging:
  type: syslog
  level: log_debug

components:
  - libfuse
  - stream
  - attr_cache
  - azstorage

libfuse:
  attribute-expiration-sec: 120
  entry-expiration-sec: 120
  negative-entry-expiration-sec: 240

stream:
  block-size-mb: 8
  blocks-per-file: 3
  cache-size-mb: 1024

attr_cache:
  timeout-sec: 7200

azstorage:
  type: block
  endpoint: MASKED.blob.core.windows.net
  account-name: MASKED
  account-key: MASKED
  mode: key
  container: MASKED
@gapra-msft
Member

Hi @XavierGeerinck, thank you for reporting this issue. Could you please share whether you created these directories using blobfuse? Blob storage has a flat namespace, so it only has the concept of "virtual" directories. Blobfuse2 gets around this by creating marker directories and setting special metadata on them. If these marker directories don't exist, Blobfuse can't find the intermediate folders (which cat expects to exist).

This case works seamlessly with hierarchical namespace accounts, which do have the concept of directories, so all those intermediate directories are added when the file is uploaded through the portal.

@gapra-msft gapra-msft self-assigned this Aug 17, 2022
@XavierGeerinck
Author

XavierGeerinck commented Aug 17, 2022 via email

@gapra-msft gapra-msft added the V2 label Aug 17, 2022
@gapra-msft
Member

You could manually make the directories using blobfuse, or manually upload an empty blob with the metadata key-value pair 'hdi_isfolder': 'true'.

@XavierGeerinck
Author

XavierGeerinck commented Aug 17, 2022 via email

So I just create a file without a name and add that as content? I will try that asap.

@gapra-msft
Member

Sorry, the file name would be whatever virtual directory you are trying to add (for example, a, a/b, a/b/c in your example are the virtual directory names), its content is empty, and the metadata is 'hdi_isfolder': 'true'.
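As a rough illustration of the marker naming described here, the sketch below derives the marker blob names for a file path and uploads one empty blob per name. It assumes `container_client` behaves like `azure.storage.blob.ContainerClient` (only its `upload_blob(name, data, metadata=...)` method is used); `marker_blob_names` and `create_markers` are hypothetical helpers, and existing blobs are not handled.

```python
from __future__ import annotations

from pathlib import PurePosixPath

# Metadata pair quoted in this thread for directory marker blobs.
MARKER_METADATA = {"hdi_isfolder": "true"}


def marker_blob_names(blob_path: str) -> list[str]:
    """Return the virtual directory names that need a marker blob."""
    parts = PurePosixPath(blob_path).parts[:-1]
    return ["/".join(parts[: i + 1]) for i in range(len(parts))]


def create_markers(container_client, blob_path: str) -> list[str]:
    """Upload an empty marker blob for every parent directory of blob_path.

    Assumption: container_client is an azure.storage.blob.ContainerClient
    (or anything exposing a compatible upload_blob method).
    """
    names = marker_blob_names(blob_path)
    for name in names:
        container_client.upload_blob(
            name=name, data=b"", metadata=dict(MARKER_METADATA)
        )
    return names
```

For A/B/C/file.txt this yields the three marker names A, A/B and A/B/C, matching the example in the comment above.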

@XavierGeerinck
Author

So like this then?

image

@gapra-msft
Member

Yep, that looks correct.

@XavierGeerinck
Author

Awesome! Just verified it as well and it seems to work! Thanks a lot :) let's maybe update the documentation somehow to reflect this requirement? Or will there be some code change to automatically create it if it doesn't exist?

@gapra-msft
Member

We can definitely update the documentation to reflect this requirement! I will also check with my team and see if this automatic creation is feasible.

Thanks again for trying out blobfuse2! We are happy to help as you hit other questions/issues.

@XavierGeerinck
Author

Thanks a lot!

Interesting note: it works right after mounting, but after a while the created files disappear and it stops working again.

@gapra-msft
Member

When you say created files, do you mean the ones manually created by you or the ones that are listed by ls?

@XavierGeerinck
Author

> When you say created files do you mean the ones manually created by you or the ones that show that they are listed on ls?

The ones created by me. It now seems they only disappeared in Storage Explorer; when performing ls -la they still show up. So I am unsure whether they exist in the storage account or are indeed gone entirely.

In any case, the discovery doesn't seem to work anymore and the original issue returns, where I have to manually navigate to each directory to get it working.

@gapra-msft
Member

> > When you say created files do you mean the ones manually created by you or the ones that show that they are listed on ls?
>
> The ones created by me, now it seems they only disappeared in the Storage Explorer, when performing ls -la they still show up. So I am unsure if they exist in the storage account or are indeed gone entirely.
>
> In any case, the discovery doesn't seem to work anymore and the original issue returns where I have to manually navigate to each directory to get it working

It looks like Storage Explorer and Azure Portal have slightly different UIs. Storage Explorer simply shows the folder, but Azure Portal shows both the folder and the marker blob, so you could check Azure Portal to confirm whether the marker is indeed gone.

Blobfuse doesn't delete any folders unless an explicit call comes from the file system to delete the folder. Would you be able to share any debug logs for this behavior? Or even a set of steps I can try to follow?
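Besides looking in Azure Portal, the marker check could also be done programmatically. The sketch below is only an illustration: it assumes you have already collected a mapping from blob name to metadata (for instance by listing the container with metadata included), and `missing_markers` is a hypothetical helper name.

```python
from __future__ import annotations

from pathlib import PurePosixPath


def missing_markers(
    blob_metadata: dict[str, dict[str, str] | None], file_path: str
) -> list[str]:
    """Return parent directories of file_path that lack a marker blob.

    blob_metadata maps blob name -> metadata dict (or None). A directory
    counts as marked only if a blob with its exact name carries the
    metadata pair hdi_isfolder = "true".
    """
    parts = PurePosixPath(file_path).parts[:-1]
    expected = ["/".join(parts[: i + 1]) for i in range(len(parts))]
    return [
        name
        for name in expected
        if (blob_metadata.get(name) or {}).get("hdi_isfolder") != "true"
    ]
```

An empty result means every intermediate directory of the file has its marker; any names returned are the directories whose marker is missing or lacks the metadata.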

@XavierGeerinck
Author

I will debug it a bit more later, as a model is currently being trained through BlobFuse. It does seem that the Azure Portal shows the markers now, so it should work. I will keep you updated on the process! Thanks so much for helping with this so quickly.

@vibhansa-msft vibhansa-msft added this to the 2.0.0-preview.3 milestone Aug 19, 2022
@XavierGeerinck
Author

I looked into this further, and it works better with Hierarchical Namespaces, so I am using that for now until a more permanent fix is available :)

@vibhansa-msft
Member

@XavierGeerinck: Just to give you some background on this, block blob accounts use a flat namespace and have no concept of a directory, while blobfuse, being a file-system driver, has to support directory operations. This is why we create those special marker files. HNS has native support for directories, so it works well there. On any block blob account where directory markers do not exist, blobfuse will have the limitation discussed in this thread.
We will document this scenario in our README. Let us know if you need any further help on this. If the workaround suits your workflow, kindly close this issue.

@XavierGeerinck
Author

Thanks a lot for the clarification @vibhansa-msft !

Currently I am trying with HNS support, but am encountering access issues as well :/ It seems that blobfuse sometimes fails to load a file and the entire pipeline crashes (even though the file exists).

This is random though: sometimes it happens after 3000 epochs, sometimes after 20000, so I am not sure how I can assist in debugging this.

@vibhansa-msft
Member

Can you collect the debug logs for blobfuse? They can help us root-cause the issue.

@vibhansa-msft
Member

We have updated our README file to mention this directory marker file issue. I will close this issue here as this is by design. If you are facing any other issue, feel free to create a new bug on blobfuse.

@dashesy

dashesy commented Oct 12, 2022

A lot of the time these directories are not made by us, so we cannot change them. We still need a way to read them. Currently blobfuse2 stops working randomly. There is no such issue with blobfuse, so I'm going back to blobfuse for now.

@gapra-msft
Member

Hi @dashesy, we actually just added a fix for this behavior in the latest code on the main branch. Once the virtual-directory config parameter is set to true, virtual directories without markers should work. Could you please try it out? I'd also be happy to share a link to a private build if necessary.

@dashesy

dashesy commented Oct 13, 2022

Yes. I will try it out. Please let me know the branch.

@gapra-msft
Member

gapra-msft commented Oct 13, 2022

@dashesy It's already merged into the main branch: https://github.com/Azure/azure-storage-fuse/tree/main

Please be sure to add the azstorage.virtual-directory config file parameter and set it to true.
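For anyone scripting this change, here is a minimal sketch that adds the parameter to an existing config file's text. It assumes the setting is written as a `virtual-directory: true` line inside the top-level `azstorage:` section with two-space indentation, as in the config posted at the top of this thread; `enable_virtual_directory` is a hypothetical helper, not a blobfuse2 tool.

```python
def enable_virtual_directory(config_text: str) -> str:
    """Return config_text with virtual-directory: true under azstorage."""
    lines = config_text.splitlines()
    try:
        start = lines.index("azstorage:")
    except ValueError:
        raise ValueError("no top-level 'azstorage:' section found") from None

    # The section runs until the next non-blank line that is not indented.
    end = start + 1
    while end < len(lines) and (
        not lines[end].strip() or lines[end][0] in " \t"
    ):
        end += 1

    # Drop any existing setting so the result has exactly one entry.
    section = [
        line
        for line in lines[start + 1 : end]
        if not line.strip().startswith("virtual-directory:")
    ]
    section.insert(0, "  virtual-directory: true")
    return "\n".join(lines[: start + 1] + section + lines[end:]) + "\n"
```

The function is idempotent: running it on text that already contains the setting leaves a single `virtual-directory: true` line in the section.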

@ddl-giuliocapolino

@vibhansa-msft I am currently trying to mount ADLS with HNS and am wondering whether blobfuse can respect directory-level scope/access at mount time. Say my ADLS has a container with 3 directories but my user only has ACL access to 2: is it possible to mount that container so that blobfuse recognizes my directory-level access and displays only what I have access to?

@vibhansa-msft
Member

There are two ways to achieve this.

  1. Mount blobfuse as root with allow_other so that it's accessible to other users on the system. Set your ACLs so that "others" have only restricted access, such as read or write.
  2. Mount only a sub-directory where you have access instead of mounting the entire container.

@ddl-giuliocapolino

Thanks @vibhansa-msft - for 1., how do you go about setting ACLs on “others”, given that ACL operations (setacl specifically) are not supported by fuse?

@vibhansa-msft
Member

chmod is the only option you have to change the ACL for any given file. If your account is not HNS enabled, or you have mounted using the blob endpoint (type: adls is not given in the config file), chmod will not be supported.
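As a small illustration of restricting "others" through chmod, the sketch below clears the permission bits for others on a path and optionally leaves read access. It uses only the Python standard library; `restrict_others` is a hypothetical helper, and on a blobfuse mount it would only take effect under the conditions described above (HNS account mounted with type: adls).

```python
import os
import stat


def restrict_others(path: str, allow_read: bool = True) -> int:
    """Remove write/execute for "others" on path; optionally keep read.

    Returns the new permission bits that were applied.
    """
    mode = stat.S_IMODE(os.stat(path).st_mode)
    mode &= ~stat.S_IRWXO  # drop every permission bit for "others"
    if allow_read:
        mode |= stat.S_IROTH  # give back read-only access
    os.chmod(path, mode)
    return mode
```

Note that for directories, read access without the execute bit does not allow traversal, so the right bits depend on what "others" should be able to do.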
