Lsyncd causes load while monitoring large filesystem. #202
Comments
This is not the case: the default behavior is to wait for the configured delay. If you set the delay to zero, Lsyncd has no possibility to aggregate changes. I'm afraid right now there isn't much of a way around that limit.
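The delay behavior described above can be sketched with a minimal model (illustrative Python only, not Lsyncd internals): events arriving within one delay window are collapsed into a single de-duplicated batch, so one rsync can carry many changes, while a delay of zero flushes every event on its own.

```python
# Minimal model of delay-based aggregation (not Lsyncd's actual code).
# Events inside one delay window collapse into a single batch; with
# delay = 0 every event flushes immediately and nothing aggregates.

def batch_events(events, delay):
    """events: list of (timestamp, path); returns batches of paths."""
    batches, current, window_end = [], [], None
    for ts, path in events:
        if window_end is None or ts > window_end:
            if current:
                batches.append(current)
            current, window_end = [], ts + delay
        if path not in current:   # de-duplicate within a batch
            current.append(path)
    if current:
        batches.append(current)
    return batches

events = [(0.0, "/a"), (0.5, "/b"), (1.0, "/a"), (10.0, "/c")]
print(batch_events(events, delay=5))  # [['/a', '/b'], ['/c']]
print(batch_events(events, delay=0))  # [['/a'], ['/b'], ['/a'], ['/c']]
```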
I know level 4 has a delay which aggregates the file list before spawning rsync. But we deal with around 40 small files being changed every second (based on the log I created, there were 145,000 files affected in an hour). Setting the delay to a large value didn't work either; there seems to be a limit to it. Level 4 couldn't keep up in one of the tests I did. Memory is not an issue: we have 24 GB RAM on the server, and lsyncd uses around 500 MB of it.
The aggregation is limited to 1000 events in the queue, since checking each new event against every event already in the queue has O(n²) runtime. This could be changed to O(n log n) by clever use of lookup tables, but that hasn't been done so far. Running the Lua profiler with a default configuration would be helpful to see where the CPU is eaten, if it's in Lua itself and not the kernel after all.
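The complexity argument above can be illustrated with a short sketch (again, not Lsyncd's actual code): a linear scan of the queue costs O(n) per insert, i.e. O(n²) for n events, which is why the queue is capped; a hash-based lookup table makes the membership check ~O(1), for roughly O(n) total (a balanced-tree index would give the O(n log n) mentioned above).

```python
# Sketch of the two de-duplication strategies (not Lsyncd internals).

def enqueue_linear(queue, event):
    # Linear scan: each insert checks every queued event -> O(n) per
    # event, O(n^2) for n events. Capping the queue (e.g. at 1000
    # entries) keeps this cost bounded.
    for queued in queue:
        if queued == event:
            return
    queue.append(event)

def enqueue_indexed(queue, index, event):
    # Lookup table (hash set): membership test is ~O(1), so n events
    # cost ~O(n) total while preserving queue order.
    if event in index:
        return
    index.add(event)
    queue.append(event)

q1 = []
for path in ["/a", "/b", "/a", "/c", "/b"]:
    enqueue_linear(q1, path)

q2, idx = [], set()
for path in ["/a", "/b", "/a", "/c", "/b"]:
    enqueue_indexed(q2, idx, path)

print(q1)  # ['/a', '/b', '/c']
print(q2)  # ['/a', '/b', '/c']
```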
I can still confirm this. Tried it with ~1,000,000 directories on a 3 TB file system, and apart from the initial rsync taking a long time, the inotify_add_watch for every folder took up an enormous amount of time and (single-core) CPU load, which seemingly led to Lsyncd repeatedly doing an init, or at least part of one. I didn't see the rsync again after it finished, but checking on it after two days with strace I realized it was adding watches for folders I had already seen being added.
If Lsyncd encounters an inotify queue overflow event, it fully restarts. Otherwise, if a path is moved, or deleted and recreated, it will of course add new watches for that path.
Is that preventable somehow? Also, would it be possible to multithread the add_watch process or otherwise increase its performance?
You can increase the inotify queue length as a kernel parameter (via sysctl): /proc/sys/fs/inotify/max_queued_events. This is an issue only if events pile up faster than Lsyncd can empty them out. The basic possibility of this faucet being faster than Lsyncd can drain it is not avoidable, however; it can only be made less likely. To avoid it entirely you'd have to go with GlusterFS or DRBD or so, where the sync is controlled at the device level and incoming events can thus be throttled.
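For reference, these limits live under /proc/sys/fs/inotify on Linux. A small sketch for checking the current values; the fallback default here (16384, the usual kernel default for max_queued_events) is an assumption used only so the snippet degrades gracefully on non-Linux systems:

```python
# Read an inotify limit from /proc; returns `default` if the file is
# missing (e.g. not on Linux) or unreadable.
from pathlib import Path

INOTIFY_DIR = Path("/proc/sys/fs/inotify")

def inotify_limit(name, default=None):
    try:
        return int((INOTIFY_DIR / name).read_text())
    except (OSError, ValueError):
        return default

# 16384 is the usual kernel default for max_queued_events; raising it
# is done via sysctl, e.g.: sysctl fs.inotify.max_queued_events=65536
print(inotify_limit("max_queued_events", default=16384))
```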
PS: You should be able to see in the log whether there was an overflow, or why a watch for a folder has been re-added (e.g. the folder was moved).
Hi,
I am using lsyncd to monitor an entire filesystem (~500 GB with 1,250,000 folder watches). Using an lsyncd level 2 config, I call a shell script to echo the event.pathname to a timestamped file. This is done for backup purposes, using a custom rsync script to sync from this timestamped file containing the list of paths of files that have changed. The built-in level 4 lsyncd config doesn't work for me because level 4 spawns a single rsync for each file changed, and it doesn't keep up with the bulk of files changed on my server.
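The pipeline described above can be sketched like so (illustrative Python only; the real config is Lua, linked in the gist below, and the file names and paths here are my own assumptions): each event appends its pathname to a per-minute list file, which a later rsync run consumes with --files-from.

```python
# Sketch of the timestamped-changelist approach (not the actual gist).
import time

def list_file_for(base="/var/spool/lsyncd", now=None):
    """One list file per minute; the naming scheme is illustrative."""
    if now is None:
        now = time.time()
    stamp = time.strftime("%Y%m%d-%H%M", time.localtime(now))
    return f"{base}/changes-{stamp}.list"

def record_change(pathname, list_file):
    # Called once per event; appending is cheap, so bursts of many
    # small files cost O(1) per event instead of one rsync per file.
    with open(list_file, "a") as f:
        f.write(pathname + "\n")

# A periodic job would then sync each batch in one rsync invocation:
#   rsync -a --files-from=changes-<stamp>.list / backup-host:/dst/
```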
Initially there weren't any issues besides high CPU utilization by lsyncd, but now it causes the load average on the system to increase too much. Shutting lsyncd down brings the load average back down, so it's obvious that this issue is caused by lsyncd.
I even tried replacing the shell scripts with Lua code in the lsyncd config; the CPU utilization was reduced, but the load average problem still persisted.
Here is a link to the lsyncd config I wrote:
https://gist.github.com/perfectayush/5502216
Any idea why this happens and what can be done to tackle it?