File system notification support / improved copy speed #32

Closed
hasse69 opened this issue Mar 13, 2015 · 59 comments
Comments

@hasse69
Owner

hasse69 commented Mar 13, 2015

My main purpose for trying out rar2fs is to enjoy media without the need to unpack it. I mount a whole directory full of archives and media with rar2fs and use the mount as a DLNA server's root folder.

This works perfectly; however, any new media put into the folder is not picked up by the DLNA server. The folder is watched with inotify. I need to rebuild the media database manually every time there is a change.

I have looked into the reason, and fuse does not and will not support forwarding notifications on the src to the mounted target directory. This is understandable, as the mount does not need to be on a local filesystem.

So we are left with setting a watch on the mount and moving files there directly instead of to the src. This also works. However, moving files there is painfully slow, as I presume a copy is done instead of a mv. Is there any way of speeding things up if a file is moved within the same local disk? Or is this out of our hands and a fuse thing?

Thanks

Original issue reported on code.google.com by fafarago on 2014-03-07

@hasse69
Owner Author

hasse69 commented May 2, 2015

I am a bit confused here. I understand the issue as such, but what I do not understand
is when you say
"we are left with setting a watch on the mount and directly moving files there instead
of the src"
What watch are you setting on the mount point? Moving files there? Why?
Can you elaborate a bit on what you are doing here? You should not need to move/copy
anything.

Original issue reported on code.google.com by hasse69 on 2014-03-07 06:56:46

@hasse69
Owner Author

hasse69 commented May 2, 2015

My apologies for not explaining correctly.
* I have a media directory on /md0/media including archived and non-archived files.
* to seamlessly access both kinds of data I mount it with rar2fs on /md0/media_all
* I point the DLNA server's root directory to /md0/media_all and I can access all the
files
* the DLNA server starts monitoring - with inotify - its root directory (/md0/media_all)
for changes so updates are reflected instantly on client application
* now I have downloaded a new media item called music.mp3 in /md0/downloads and want
it to be picked up by the DLNA server
* where do I put this file?

1. if I move it to /md0/media - which is instantaneous (mv /md0/downloads/music.mp3
/md0/media) because both directories are on the same physical disk and partition -
then the file will not be picked up by the DLNA server as it is watching the /md0/media_all
directory.
2. if I move it to /md0/media_all the file is picked up by the DLNA server. However,
calling "mv /md0/downloads/music.mp3 /md0/media_all" physically moves the file to media_all,
which takes quite a long time if the file is big and the NAS is not very powerful.

I know #1 cannot be solved as it is a FUSE limitation (see [1], [2]), but can #2 be solved?
Is that something rar2fs has under control? E.g. could placing files from the same disk into
the mount directory be made equivalent to putting them in the mount-source directory?

[1] http://fuse.996288.n3.nabble.com/Current-state-of-the-art-with-inotify-like-functionality-td9262.html
[2] https://sourceware.org/ml/libc-help/2011-08/msg00009.html

Original issue reported on code.google.com by fafarago on 2014-03-07 08:13:35

@hasse69
Owner Author

hasse69 commented May 2, 2015

Ok. Now I understand a lot better what you are trying to achieve.

And you are absolutely right. By moving a file from (in this case) download to b,
and assuming b is your mount point of a, it will physically write the file to a; b
is completely virtual. When you move a file from download to a using the normal file
system operations, the move does not require moving data around; it is all about inode
entries. So at most there is some bookkeeping information that needs to be changed,
but other than that nothing is really done to your media. But when you do this operation
towards b, you enter the fuse mount point. rar2fs receives a request to create/open
the new file, and after that come a lot of write() requests. So from the rar2fs/fuse perspective
this is like creating a completely new file. I think the same would happen if you tried
a simple bind mount, but I am not sure. Moving files across file system borders
usually requires a complete write procedure. Your a and b are not considered part
of the same file system, even though physically the backend and mount point in fact
(in this case) are part of the same. It is possible, but I need to check, that rar2fs
could detect this case and actually move the file "externally" and ignore the fuse
operations. But I have never tried something like that before. It might even throw errors
from fuse. I will ask in the fuse developer forum and take it from there.
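
The difference between the two moves can be reproduced outside rar2fs. What follows is a minimal sketch in Python and assumes nothing about rar2fs internals: rename(2) fails with EXDEV when source and destination are on different mounts, and tools such as mv then fall back to copying the data and deleting the original. The function names are hypothetical.

```python
import errno
import os
import shutil


def same_filesystem(path_a, path_b):
    """True if both paths live on the same mounted file system (same st_dev)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def move_into(src_file, dst_dir):
    """Move src_file into dst_dir and report how the move was carried out."""
    dst_file = os.path.join(dst_dir, os.path.basename(src_file))
    try:
        # Cheap case: only directory entries change, no data is rewritten.
        os.rename(src_file, dst_file)
        return "renamed"
    except OSError as exc:
        if exc.errno != errno.EXDEV:
            raise
    # Different mount: every byte has to be written again, then the
    # original is removed. This is the slow path seen on the FUSE mount.
    shutil.copy2(src_file, dst_file)
    os.unlink(src_file)
    return "copied"
```

With the paths from this thread, a move from /md0/downloads to /md0/media would be expected to report "renamed", while a move to /md0/media_all, a FUSE mount with its own device number, would be expected to take the copy path.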

Going back to your 1). Again, you are right. This is not supported by fuse. But can
you not simply remount the rar2fs mount point after moving the file? That would surely
trigger inotify and update the state in your DLNA server, or?
If that does not work, does the DLNA server not have some forced refresh/update option?
I know this is how many PLEX users do it. It should not take that much effort to request
an update of the DLNA media server database. A lot faster than having to "copy" the
file at least.


Original issue reported on code.google.com by hasse69 on 2014-03-07 11:31:23

@hasse69
Owner Author

hasse69 commented May 2, 2015

Original issue reported on code.google.com by hasse69 on 2014-03-07 11:33:15

  • Labels added: Type-Enhancement
  • Labels removed: Type-Defect

@hasse69
Owner Author

hasse69 commented May 2, 2015

Yes, your suggestion will work. However, it is a manual process or would require some
scripting support.

I will go the periodic-refresh way for the DLNA server. Unfortunately there is no
support for that in minidlna so I'll have to implement it myself. Thanks for the quick
response.
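
A periodic refresh of this kind boils down to comparing two snapshots of the source directory. The following is a rough sketch only and is not taken from minidlna. The function names are made up for illustration, and what to do with the result (for example, trigger a rescan) is left open.

```python
import os


def snapshot(root):
    """Map every file below root to its (size, mtime) pair."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            state[path] = (info.st_size, info.st_mtime_ns)
    return state


def changed_paths(old, new):
    """Paths that were added, removed or modified between two snapshots."""
    return sorted(p for p in old.keys() | new.keys() if old.get(p) != new.get(p))
```

A script could take a snapshot every few minutes and request a database rebuild only when changed_paths returns a non-empty list.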

Original issue reported on code.google.com by fafarago on 2014-03-08 04:21:45

@hasse69
Owner Author

hasse69 commented May 2, 2015

Giving this a second thought, I do not think it is that easy to move the file in the
background. The FUSE file system does not get information about the source file. It
only sees the request to create a new file. 
Do you know if inotify supports monitoring soft links? In that case, could you not
create a soft link from a folder in your 'media_all' to your 'media' folder? Then any
changes to 'media' should be picked up by inotify. rar2fs supports soft links.

Original issue reported on code.google.com by hasse69 on 2014-03-09 10:26:25

@hasse69
Owner Author

hasse69 commented May 2, 2015

Have you tried the small trick of moving the file x to /md0/media and after that doing
'touch /md0/media_all/x'? That might trigger inotify the way you want.
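
Scripted, the trick could look roughly like this. It is a sketch under assumptions: the function name is hypothetical, and whether the file is already visible through the mount at that moment, and whether the watcher reacts to the timestamp update, is not confirmed in this thread.

```python
import os
import shutil


def move_and_touch(src_file, source_dir, mount_dir):
    """Move a file into the source folder, then touch it through the mount."""
    name = os.path.basename(src_file)
    # Fast move on the backing file system (same disk, so no data copy).
    shutil.move(src_file, os.path.join(source_dir, name))
    via_mount = os.path.join(mount_dir, name)
    # Same effect as `touch` on an existing file: update its timestamps.
    os.utime(via_mount, None)
    return via_mount
```

For the example in this thread the call would be move_and_touch("/md0/downloads/music.mp3", "/md0/media", "/md0/media_all").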

Original issue reported on code.google.com by hasse69 on 2014-03-29 13:22:09

@hasse69
Owner Author

hasse69 commented Jan 16, 2016

There is a patch for FUSE currently under review that aims to perform read/write transparently to the user-space implementation. When/if that patch is applied (together with a new version of libfuse), read/write can be handled separately, and inotify watches on the FUSE mount point still work as expected, rar2fs could use it to resolve this issue. Let's see.

@gardar

gardar commented Aug 31, 2016

Any news on this? Do you have a link where I can follow the status of the fuse patch?

@hasse69
Owner Author

hasse69 commented Sep 1, 2016

The kernel patch you can check here
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1065661.html

I am not sure when and where to get information about the FUSE library API that also needs to be updated for this to work. You might try the FUSE developer mailing archive.

https://sourceforge.net/p/fuse/mailman/fuse-devel/thread/56B24F7C.4060703%40codeaurora.org/#msg34819304

@doobnet

doobnet commented Feb 7, 2017

I've run into the same issue myself and came up with a workaround:

I created a dynamic library which implements the inotify_add_watch system call. I start the application, in this case Plex, with the LD_PRELOAD environment variable, with the value pointing to the path of the dynamic library. When Plex calls inotify_add_watch, the implementation in the dynamic library receives the call. The implementation will change the path to the corresponding path in the source directory, in this case it will change the path from /md0/media_all to /md0/media, and then forward the call to the original implementation in the operating system. This works beautifully 😃.
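
The path rewrite at the core of this workaround is small. The sketch below only illustrates that mapping, in Python for readability. The library itself is native code loaded through LD_PRELOAD and wraps inotify_add_watch. The function name and the default paths are illustrative assumptions, with the paths borrowed from the example earlier in this thread.

```python
import posixpath


def remap_watch_path(path, mount_root="/md0/media_all", source_root="/md0/media"):
    """Translate a path under the rar2fs mount to the same path under the source."""
    path = posixpath.normpath(path)
    mount_root = posixpath.normpath(mount_root)
    if path == mount_root:
        return source_root
    if path.startswith(mount_root + "/"):
        # Keep the part below the mount root and re-anchor it at the source.
        return posixpath.join(source_root, path[len(mount_root) + 1:])
    # Anything outside the mount is passed through untouched.
    return path
```

Comparing against mount_root plus a trailing slash avoids rewriting sibling directories that merely share the prefix, such as a hypothetical /md0/media_all_backup.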

@hasse69
Owner Author

hasse69 commented Feb 7, 2017

Interesting!
But I have one question. What did this solve really? Inotify watches on PLEX works just fine if you make sure to write your files to the mount point rather than the source directory. Or is your solution allowing PLEX (or any other observer) to detect changes to the mount point even if you write files to the source folder which would then become a lot faster than writing it through FUSE? You are very welcome to write a user guide that explains this in more detail and then I can add it to the Wiki.

@doobnet

doobnet commented Feb 7, 2017

Or is your solution allowing PLEX (or any other observer) to detect changes to the mount point even if you write files to the source folder

Yes.

which would then become a lot faster than writing it through FUSE

This second part of the question makes me wonder if I misunderstand what's possible to do with rar2fs. I had the impression that rar2fs mounted the file system as read only. Or at least that new rar files needed to be added/written to the source directory.

@hasse69
Owner Author

hasse69 commented Feb 7, 2017

Nope, rar2fs is a pass-through file-system that can easily be mounted read/write.
The only limitation is that you cannot write/change files inside RAR archives. But as regular files (e.g. your RARs) you can write them directly to the mount point just as any other file. In fact, this is the recommended way since then rar2fs will detect the changes to the folder and can invalidate the folder cache.
If you write behind the back of rar2fs the cache would be kept and you will not see your new updates until you either force a change to the folder through rar2fs or manually invalidate the cache.

In fact, with the new folder cache I am not sure your solution would work? What you can do is to touch a file in the corresponding mount point folder after all files are written to the source folder.
Or alternatively send rar2fs a USR1 signal, which however will invalidate your entire cache. Btw. I think I should change USR1 to HUP instead. That would make more sense.
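
Sending the signal from a script could be sketched as follows, assuming the signal is still USR1 as described in this comment. The function name is hypothetical and the process id has to be looked up by the caller.

```python
import os
import signal


def invalidate_rar2fs_cache(pid):
    """Send SIGUSR1 to the process with the given id."""
    os.kill(pid, signal.SIGUSR1)
```

The pid could come from a tool such as pidof. With several rar2fs instances running, each one would need its own signal.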

@doobnet

doobnet commented Feb 7, 2017

Hmm, I see. That's a couple of days' work wasted for nothing. But it was a fun exercise. Perhaps it could be useful for some other scenario.

@hasse69
Owner Author

hasse69 commented Feb 7, 2017

Yes, you are on to something good here! The actual problem this issue is all about is the fact that writing to the rar2fs mount point forces all writes to enter the user space application.
On a native file system a move of files from one place to another would only change a few pointers in the metadata and job done. But not if you write to the mount point. This is what the fuse patch is trying to address: that every write request does not need to be sent to user space but could instead be handled in the kernel the same way other file systems would handle it. That still leaves us with a problem! How would rar2fs then know if the file system was altered? This is also what fuse tries to solve using some sort of internal notification.

@hasse69
Owner Author

hasse69 commented Feb 7, 2017

What you have solved is half of the problem: letting an observer watching a mount point see changes made to the source folder. Now all that is needed is a way to tell rar2fs that this happened, preferably carrying information about exactly which folder was changed.

@doobnet

doobnet commented Feb 7, 2017

How would rar2fs then know if the file system was altered? This is also what fuse tries to add using some sort of internal notification.

Can't rar2fs use inotify to watch the source directory?

@hasse69
Owner Author

hasse69 commented Feb 7, 2017

Yes, it could but there is a reason (actually several) it does not.
Using inotify on every single folder (last time I checked there is no way to simply monitor a top folder and detect any changes below it) would need a lot of resources!
Also, inotify is not very portable.
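
To get a feeling for the cost: inotify watches are not recursive, so one watch is needed per directory. A small sketch that counts how many that would be for a given tree (the function name is hypothetical):

```python
import os


def watches_needed(root):
    """Number of directories at and below root, i.e. one inotify watch each."""
    return sum(1 for _dirpath, _dirnames, _filenames in os.walk(root))
```

On Linux the result can be compared against the per-user limit in /proc/sys/fs/inotify/max_user_watches.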

But I have an idea, give it a few minutes of thought.
If you made two libraries, a server and a client part that could communicate, e.g. using a socket or something, then you could LD_PRELOAD the client to rar2fs and the server to e.g. PLEX. If I added an external API for notification, your client library could call that when the server tells the client a change was made. You need some simple protocol between the client and the server to handle multiple clients since there might be several instances of rar2fs mounting different folders.
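
A protocol of that kind could be as simple as one line of text per change. The sketch below is purely illustrative: the message format (mount id, tab, folder, newline), the function names and the on_change callback are all assumptions, and on_change merely stands in for a notification function that rar2fs does not export at this point.

```python
import socket


def announce_change(sock, mount_id, folder):
    """Server side: report that a folder changed for one particular mount."""
    sock.sendall(f"{mount_id}\t{folder}\n".encode("utf-8"))


def dispatch_changes(sock, on_change):
    """Client side: read messages until the peer closes, calling on_change for each."""
    buffer = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        buffer += chunk
        # A single recv() may carry several messages or only part of one.
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            mount_id, folder = line.decode("utf-8").split("\t", 1)
            on_change(mount_id, folder)
```

The mount id lets several rar2fs instances share one server. Folder names containing a tab or newline would need escaping, which this sketch leaves out.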

@doobnet

doobnet commented Feb 7, 2017

Yes, it could but here is a reason it does not.
Using inotify on every single folder (last time I checked there is no way to simply monitor a top folder and detect any changes below it) would need a lot of resources!

Ok, I see.

But I have an idea, give it a few minutes of thought.
If you made two libraries, a server and a client part that could communicate, e.g. using a socket or something, then you could LD_PRELOAD the client to rar2fs and the server to e.g. PLEX. If I added an external API for notification, your client library could call that when the server tells the client a change was made. You need some simple protocol between the client and the server to handle multiple clients since there might be several instances of rar2fs mounting different folders.

Yeah, I was thinking the same thing. But wouldn't it be easier to modify rar2fs to directly receive notifications from the library? The only reason I used the LD_PRELOAD technique is because Plex is not open source.

This would also make the implementation a bit messy. The read system call would need to be implemented and a list of all file descriptors that are currently being watched would need to be stored. This would be necessary since read can be used for other things as well. The read function would need to check if the given file descriptor is one of the stored ones.

@doobnet

doobnet commented Feb 7, 2017

To move a RAR file out of the directory, it needs to be moved out of the source directory, correct? In that case, that's another use case which the LD_PRELOAD technique can be used for.

@hasse69
Owner Author

hasse69 commented Feb 7, 2017 via email

@doobnet

doobnet commented Feb 7, 2017

I tried adding a RAR file to the mounted directory. For some reason Plex doesn't pick it up. I tried debugging with my library, only logging enabled. For some reason -1 is returned in the call to original inotify_add_watch. errno is set to Permission denied. I'll continue debugging tomorrow.

@hasse69
Owner Author

hasse69 commented Feb 7, 2017 via email

@doobnet

doobnet commented Feb 8, 2017

If we should add notification support it must be portable and handled in a platform independent way. That is why I would rather not add notification support internally in rar2fs but instead export a simple function for independent implementations to call.

I see.

Btw, you would not use read to watch file descriptors. You use a dedicated thread and select()/epoll() that can monitor multiple descriptors.

So the library would use a traditional inotify setup with inotify_add_watch and epoll but with the addition that it would get the paths to listen for filesystem events from inotify_add_watch that the application calls?
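
For reference, waiting on several descriptors at once, as a dedicated thread would do, could look like the following. It uses Python's selectors module (which picks epoll where available) purely as an illustration. Any readable descriptor, an inotify descriptor included, is registered the same way. The function name is hypothetical.

```python
import selectors


def readable_descriptors(fds, timeout=None):
    """Block until at least one descriptor is readable; return the ready ones."""
    selector = selectors.DefaultSelector()
    try:
        for fd in fds:
            selector.register(fd, selectors.EVENT_READ)
        # select() returns (key, events) pairs for every ready descriptor.
        return [key.fd for key, _events in selector.select(timeout)]
    finally:
        selector.close()
```

A watcher thread would call this in a loop and read the pending events from each descriptor that is returned.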

@doobnet

doobnet commented Feb 9, 2017

I am not really sure what you are saying here. But the server side of the library should do what it does today, but it needs to talk to the client side in order to report updates. The client side will then forward the notification to a dedicated function in rar2fs. How to make the latter interface generic enough for others to write libraries for other platforms, e.g. OSX and FreeBSD, remains to be thought about.

What the library is doing today is only changing the path passed to inotify_add_watch. It doesn't watch any directories on its own, i.e. calling inotify_add_watch and using epoll. It doesn't handle any notifications. I created a repository with the code: https://github.com/doobnet/renotify. The readme has some documentation at the bottom how the library is implemented/works.

@hasse69
Owner Author

hasse69 commented Feb 9, 2017

Neither master nor v1.23.1 works.

Are we sure this is a rar2fs issue? I do not seem to see what you see :(

@doobnet

doobnet commented Feb 10, 2017

Are we sure this is a rar2fs issue? I do not seem to see what you see :(

If I unmount rar2fs and add the same directory, with the regular file, to the same directory as I use for the mount point it works.

If I monitor the same directory with inotifywait it doesn't have the same problem. But if I run inotifywait as the same user as Plex is running (by default Plex is run by the "plex" user) it shows the same problem as Plex. But I start rar2fs with: rar2fs -o allow_other, so I don't understand why I get this problem.

If I mount rar2fs as the plex user, both inotifywait and Plex work as expected. Could there be some issue with the allow_other option?

@hasse69
Owner Author

hasse69 commented Feb 10, 2017 via email

@doobnet

doobnet commented Feb 11, 2017

I also connect over samba with allow_other. No problems there. But I mount as root, maybe that is the difference? Try to use the uid mount option to set it to the PLEX user.

I'll give that a try.

Regarding the library, yes I assume you do not set any notifications yourself. The whole point is that PLEX or whatever application is used should do that. But in the same way as you modify the path you should also be able to "steal" the callback and then notify both the application and rar2fs.

That's what I initially was thinking, but then you mentioned threads and epoll, which got me uncertain. At least something inside Plex is calling read with the same file descriptor as the initial call to inotify_add_watch. Not sure if epoll is calling read.

@hasse69
Owner Author

hasse69 commented Feb 11, 2017 via email

@doobnet

doobnet commented Feb 11, 2017

Try to use the uid mount option to set it to the PLEX user

Yes, setting the uid to the Plex user works. But that means it changes the owner of the mount point, which means I cannot add content with my regular user (unless using sudo). I don't really understand what the issue is since Plex should only need read access (inotifywait behaves the same). And there's no problem when rar2fs is not involved.

@hasse69
Owner Author

hasse69 commented Feb 11, 2017

Maybe it is the group/gid? rar2fs has nothing to do with permissions really. If this is a problem on exporting the file system, then it is rather something with FUSE.

What you can try is to patch rar2fs.c and remove the automatic use of option default_permissions (just search the code for one instance of it). See if that makes a difference or not.

Change

        fuse_opt_add_arg(&args, "-osync_read,fsname=rar2fs,subtype=rar2fs,default_permissions");

to

        fuse_opt_add_arg(&args, "-osync_read,fsname=rar2fs,subtype=rar2fs");

@doobnet

doobnet commented Feb 11, 2017

Yeah, that works.

@hasse69
Owner Author

hasse69 commented Feb 11, 2017

Ok, need to figure out if that is a good solution or not. May have other implications.

@hasse69
Owner Author

hasse69 commented Feb 11, 2017

Can you try to apply this patch to latest master

patch.txt

$ patch < patch.txt

Try both with and without default permissions.

@doobnet

doobnet commented Feb 11, 2017

With that patch, it works if I remove default_permissions; otherwise it doesn't work.

@hasse69
Owner Author

hasse69 commented Feb 11, 2017

Ok, then the patch has no purpose. That is news of sorts.
Will look into the possibility to disable default_permissions. It has been there almost since day one so I am a bit surprised no one has reported this before.

@hasse69
Owner Author

hasse69 commented Feb 15, 2017

Ok, default permissions option is gone, see 6212575

@doobnet

doobnet commented Feb 16, 2017

👍

@zimme
Collaborator

zimme commented Nov 19, 2017

Any new thoughts on a possible solution for this?

edit: Some related links
https://github.com/libfuse/libfuse/wiki/Fsnotify-and-FUSE
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/uapi/linux/fuse.h#n38

Seems like there's support for some kind of notification since fuse protocol 7.18

@hasse69
Owner Author

hasse69 commented Nov 20, 2017 via email

@hasse69
Owner Author

hasse69 commented Nov 20, 2017

Please correct me if I am wrong here, but this has never been about file system notification, or?
Tools like inotify work just fine if you make sure to copy files to the mount point rather than the back-end file system. What is needed is something that can bypass the user-space file system but still keep tools like inotify working as intended. That is what the pass-through file system patches were addressing.

@zimme
Collaborator

zimme commented Nov 20, 2017

My interest in the issue was coming from a place of me needing file system notifications to be triggered on a rar2fs mount whenever a file was added to the source.

I mount /data in a docker container and within the container I mount /data-unrar using rar2fs.

Whenever new media is downloaded it's moved into the source folder. This all happens in another container, and I want plex to pick up the changes.

@hasse69
Owner Author

hasse69 commented Nov 20, 2017

@zimme ok, and copying stuff to the mount point (even though a bit slow) is not an option?

@zimme
Collaborator

zimme commented Nov 20, 2017

Not really as the mount point only lives inside the docker container that runs plex. This was made possible when docker added support for CAP_SYS_ADMIN.

docker run -d --cap-add SYS_ADMIN --device /dev/fuse ... zimme/plex-rar2fs

So it's only my plex container which has access to /data-unrar all the other containers for downloading etc. get access to specific data sets.

However this isn't a "BIG" problem for me as I've just set plex to refresh a few times every hour.

@hasse69
Owner Author

hasse69 commented Nov 20, 2017

I think this thread ran off-track a long time ago ;)
It all started with a question about how to improve performance of writes to the mount point (which lets inotify do its thing). What you have here is another scenario. Since tools like PLEX use inotify to watch the mount point (which is a rather natural approach), they cannot pick up changes to something they are not aware of. On the other hand, rar2fs cannot tell inotify that something happened either. So that leaves you with two options: a) make sure to copy files to the mount point or b) fool inotify into thinking something in the mount point was changed. If I get your use case correct it is b) that you are looking for. Did you ever try the trick by doobnet? https://github.com/doobnet/renotify

@zimme
Collaborator

zimme commented Nov 20, 2017

I'll have to look into renotify 👍

I'll have to read up a bit more on fuse and fsnotify, because I got the sense that it's trying to solve the problem I'm looking to get solved.

Anyways, thanks for all your work on this awesome project, it saves me a ton of space 😄

@doobnet

doobnet commented Nov 21, 2017

@zimme if I understand your problem correctly, I think the simplest solution would be to mount an empty directory using Docker and then mount rar2fs on that directory inside the container. That should give you access to both the source and the mount point outside of the container.

@zimme
Collaborator

zimme commented Nov 21, 2017

@doobnet Thanks, I'll look into that 👍

@hasse69 hasse69 added this to To do in Backlog Oct 10, 2019
@hasse69
Owner Author

hasse69 commented Nov 5, 2019

I am going to close this one.
There are some improvements made in the detection of directory updates in source folder(s).
See issue #116 and commit f622557
More things can still be done in this area though. When time allows I might start looking into improvements in both file modification detection and archive validation.

@hasse69 hasse69 closed this as completed Nov 5, 2019
@hasse69 hasse69 moved this from To do to Done in Backlog Nov 5, 2019

4 participants