Reducing scan times on systems without native recursive watching #87
Thanks for the recording, that's super helpful! It looks like Mutagen's taking a few seconds to perform rescans, which is a bit long. As a rule of thumb, they should take about as long as a

Can you tell me which filesystems are being used on each endpoint? I'd assume APFS on macOS? Is the Docker container using some type of virtual or network filesystem? Also, can you give me an idea of file counts (e.g. with

One good place to start with debugging would be to run the

One quick idea would be to ignore
I'm indeed using APFS.

```
CONTAINER_PREFIX=app
CONTAINER_PATH=/var/www/app
```

```yaml
services:
  sync:
    image: alpine:latest
    container_name: ${CONTAINER_PREFIX}.sync
    command: tail -f /dev/null
    working_dir: ${CONTAINER_PATH}
    volumes:
      - ${CONTAINER_PREFIX}.sync:${CONTAINER_PATH}:nocopy

volumes:
  app.sync:
```

This was already displayed in my screencast:

```shell
$ find . | wc -l
117022
$ du -sh .
2,3G	.
```

In fact, I have 3 big folders:
chiming in here. also seeing 3-5 second delays in file sync propagation with docker. assuming this is due to the large filesystem (using magento 2 here). looking for a way to set and forget. i'd like to ignore
yep -- my suspicions were correct. if i ignore vcs and vendor folders, reloads are instantaneous:
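For reference, this kind of ignore can also be captured in configuration rather than passed per-session. The sketch below assumes Mutagen's YAML configuration format with `sync.defaults.ignore` settings; the exact keys and file location vary by Mutagen version, so treat the names as illustrative:

```yaml
# Illustrative Mutagen configuration sketch: skip VCS metadata and
# vendored dependencies so scans never have to traverse those trees.
sync:
  defaults:
    ignore:
      vcs: true          # skip .git, .svn, etc.
      paths:
        - "vendor"       # e.g. Composer/Magento dependencies
        - "node_modules"
```

Since these directories are typically regenerable on each side (composer install, npm install), excluding them removes the bulk of the file count without losing anything that needs to be synchronized.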
@markshust (and @qkdreyer) There are some changes (037e713, 858f908, db2494f) coming in v0.9.0 that will add parallel

Even with these changes, ignoring content (especially platform-specific content like
I wrote a python application, mutagen-helper, just released today, that wraps the mutagen binary and can help you manage your sessions. Put the configuration in a yaml file in a directory to sync, and run

This could help people with performance issues by splitting a sync session into multiple sessions and multiple configuration files, and starting/stopping them when required with a simple command.
I'd like to give a quick update on performance efforts here, as well as a summary of the current Mutagen bottlenecks and the plans to alleviate them (CC @saulfautley).

First, Mutagen v0.9.0-beta2 is now available and has a number of optimizations, fixes, and features that I hope will alleviate some of the performance woes that you've experienced. The most relevant change is the option to perform accelerated scans. This feature is still experimental, so it's not enabled by default, but it's easy to turn on for a session with

More generally, the performance issues with multi-GB and high-file-multiplicity synchronization roots in Mutagen come down to two things:

1. the time needed to transfer large amounts of data over the network, especially during an initial synchronization, and
2. the time needed to rescan the synchronization root on disk.
There are other areas where Mutagen's performance could be improved, but these are the big, O(n), user-perceptible performance issues.

There's not much that Mutagen can do about the first issue. It already transfers changes as efficiently as possible using the rsync algorithm, but an initial sync of GBs of files is going to take as long as it takes given bandwidth constraints. About the only optimization that Mutagen could potentially do here would be to switch to a raw TCP-based transport (cutting out the overhead of

The second issue is the real performance pain point with Mutagen, and it's where most of the optimization focus has been. The accelerated scans mentioned above lay the foundation for fixing this problem, but they're only truly helpful on systems with native recursive watching. Systems without native recursive watching (Linux et al.) simply don't have the facilities to ensure total, race-free watching of a synchronization root, and thus a full rescan (i.e. recursive usage of

Programs like Watchman simulate recursive watching on these systems by starting and stopping non-recursive watches based on other watches, but the process is an approximation and prone to missing events. Even after years of work, it's not perfect. Moreover, systems without native recursive watching generally have low default limits on the number of watches they can establish. On Linux, each inotify watch requires a watch descriptor, and the OS generally limits the number of active watch descriptors to a few thousand by default. This limit is also per-user, not per-process. On BSD systems, which use kqueue for watching, each watch requires a file descriptor, meaning that the maximum number of open files is what limits the watch count (and, if reached, causes problems for the rest of the program). It is possible to increase these limits, but doing so requires manual intervention (and in some cases superuser permissions). And then there's the problem of scalability...
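To make the watch-limit point concrete: a non-recursive watcher like inotify needs one watch per directory, so the number of watches a tree would consume can be counted directly. A minimal sketch (not Mutagen code; the function name is illustrative):

```python
import os

def inotify_watches_needed(root: str) -> int:
    """Count the directories under root, inclusive.

    On Linux, a watcher built on inotify needs one watch descriptor
    per directory, so this is how many watches a simulated recursive
    watch of `root` would consume.
    """
    count = 0
    for _dirpath, _dirnames, _filenames in os.walk(root):
        count += 1  # one watch per directory, including root itself
    return count
```

Comparing the result against `fs.inotify.max_user_watches` (often 8192 by default, and shared across all of a user's processes) shows why trees like the 117,022-entry root earlier in this thread quickly exhaust the default limit.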
If you're talking about individual watches on tens or hundreds of thousands of individual files, which is what we're talking about here, then you're looking at significantly straining system resources, probably beyond what Mutagen's rescans do.

My feeling is that approximating recursive watching on Linux (and other platforms) is a non-starter: it would be prone to missing events, it wouldn't scale to the number of files that we're talking about here, and it would require manual superuser intervention to scale at all. I think that the best option is for Mutagen to attempt to use Linux's fanotify API to perform recursive watching. The reason that Mutagen doesn't do this now is that fanotify is (or at least was) extremely limited in terms of the events that it can detect. It also requires superuser permissions. However, Linux 5.1 significantly expanded the fanotify API, adding more granular events. Additionally, since much of Mutagen's target use case is running inside containers, where

If we can make fanotify work, and that's definitely my next avenue of research, then I think that these scan performance problems will largely evaporate. It won't help things on BSD systems and other platforms, but I think it will cover the 99.9% case. It will probably be Linux 5.1+ only (falling back to existing mechanisms on earlier kernels), but then it's just a matter of waiting for that to roll out. It may not help RHEL/CentOS/Debian systems with ancient kernels, but it will become rapidly available in container environments with their more modern kernels.

Beyond that, there are a few other avenues of approach that I've been considering:
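Since the expanded fanotify events landed in Linux 5.1, a watcher taking this route would need a runtime kernel-version gate before attempting the newer API, falling back to existing mechanisms otherwise. A sketch of such a check (the parsing and function names are illustrative, not Mutagen's actual logic):

```python
def kernel_at_least(release: str, major: int, minor: int) -> bool:
    """Return True if a kernel release string such as
    '5.4.0-42-generic' is at least the given major.minor version."""
    version = release.split("-", 1)[0]   # drop the distro suffix
    parts = version.split(".")
    got = (int(parts[0]), int(parts[1]))
    return got >= (major, minor)

def fanotify_has_granular_events(release: str) -> bool:
    """Gate the expanded-fanotify code path on Linux 5.1+,
    per the discussion above."""
    return kernel_at_least(release, 5, 1)
```

On a live system the release string would come from `platform.release()` or `uname -r`; older kernels would take the existing rescan-based path.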
So that's the state of things. At the end of the day, though, there will still be limits. Synchronizing 100,000 files that add up to GBs is going to take time. But if Mutagen strives for an implementation where those cases are network-latency-limited, then everything else is going to be instant as well.
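The bandwidth floor on initial syncs mentioned above is easy to estimate with a back-of-the-envelope calculation. A sketch (the function name is made up, and the 2.3 GiB / 100 Mbit/s figures are illustrative, loosely based on the root size reported earlier in this thread):

```python
def initial_sync_lower_bound(total_bytes: float, link_bits_per_s: float) -> float:
    """Lower bound on initial sync time in seconds: no transport,
    however efficient, can move the data faster than the link carries it."""
    return total_bytes / (link_bits_per_s / 8)

# ~2.3 GiB over a 100 Mbit/s link: roughly 200 seconds no matter
# how well-optimized the sync protocol is.
seconds = initial_sync_lower_bound(2.3 * 2**30, 100e6)
```

Subsequent syncs are far cheaper because the rsync algorithm only moves changed blocks, which is why the initial transfer dominates the first-run experience.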
Just one final update on this issue: as far as optimizations go, I think the scanning is about as optimized as it can be on systems without recursive watching mechanisms like FSEvents,

Also, as of Mutagen v0.14, there is also support for using
Following up on #81, I've noticed too that I need to wait ~5s before changes propagate to my docker sync container, but my alpha URL is a local path on macOS, and my beta URL is a docker path on the same machine. (Using Docker Desktop Community 2.0.3.0 on the edge channel.)

I'd be delighted to help you find out what is causing this delay.

We could start investigating with this sample screen recording: http://recordit.co/Bk3JlWaIwN