Lsyncd causes load while monitoring large filesystem. #202

Open

perfectayush opened this issue May 2, 2013 · 8 comments

@perfectayush

Hi,

I am using lsyncd to monitor an entire filesystem (~500 GB with 1,250,000 folder watches). Using an lsyncd level-2 config, I call a shell script that echoes the event.pathname to a timestamped file. This is done for backup purposes: a custom rsync script syncs from this timestamped file, which contains the list of paths that have changed. The built-in level-4 lsyncd config doesn't work for me because level 4 spawns a single rsync for each changed file, and it can't keep up with the bulk of files changed on my server.

Initially there weren't any issues besides high CPU utilization by lsyncd, but now it drives up the system's load average too much. Shutting lsyncd down brings the load average back down, so it's obvious that this issue is being caused by lsyncd.

I even tried replacing the shell scripts with Lua code in the lsyncd config; the CPU utilization went down, but the load-average problem still persisted.

Here is a link to the lsyncd config I wrote:
https://gist.github.com/perfectayush/5502216

Any idea why this happens and what can be done to tackle it?
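
A simplified sketch of that setup (the source path, list location, and handler details are placeholders; the real config is in the gist linked above):

    -- Sketch only: append each changed path to an hourly timestamped list
    -- that a separate rsync job consumes later. Paths are placeholders.
    local function record(event)
        spawnShell(event,
            'echo "$1" >> /var/backup/changed-$(date +%Y%m%d%H).list',
            event.pathname)
    end

    sync {
        source   = "/data",   -- placeholder for the monitored filesystem
        onCreate = record,
        onModify = record,
        onDelete = record,
    }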

@axkibe
Copy link
Collaborator

axkibe commented May 3, 2013

because level 4 spawns a single rsync for each changed file, and it can't
keep up with the bulk of files changed on my server.

This is not the case. The default behavior is to wait for the defined delay
timeout and then send out one single rsync, which gets the list of changed
files transferred through a pipe (or 1000 affected files, whichever comes
first). I put a lot of effort into making this possible :-)

If you set the delay to zero, Lsyncd has no chance to aggregate changes,
since it has zero time to do so.
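
For illustration, the delay is set per sync, e.g. with the stock default.rsync layer (source and target here are placeholders):

    sync {
        default.rsync,
        source = "/data",                -- placeholder source
        target = "backuphost::module",   -- placeholder rsync target
        delay  = 15,                     -- seconds to aggregate events before one rsync is spawned
    }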

Initially there weren't any issues besides high CPU utilization by lsyncd,
but now it drives up the system's load average too much. Shutting lsyncd
down brings the load average back down, so it's obvious that this issue is
being caused by lsyncd.

How many file changes per second are we talking about here? So far, Lsyncd's
CPU usage has never been a problem for anyone I have heard of. Memory can be
tough, though! Since the kernel keeps approximately 1 KB of unswappable memory
per watch, this can add up. In your case that comes to roughly 1.2 GB. Maybe
your system is running out of memory?

I'm afraid there isn't much of a way around that limit right now. It's a
limitation built into inotify.

  • Axel

@perfectayush
Author

I know level 4 has a delay that aggregates the file list before spawning rsync. But we deal with around 40 small files being changed every second (based on the log I created, about 145,000 files were affected in an hour). Setting the delay to a large value didn't work either; there seems to be a limit to it. Level 4 couldn't keep up in one of the tests I did. Memory is not an issue: we have 24 GB of RAM on the server, and lsyncd uses around 500 MB.

@axkibe
Collaborator

axkibe commented May 3, 2013

The aggregation is limited to 1000 events in the queue, since checking each incoming event against every event already in the queue has n^2 runtime. This could be reduced to n log n by clever use of lookup tables, but that hasn't been done so far.
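
As a sketch of the lookup-table idea (illustrative only, not lsyncd's actual internals): keeping a table keyed by pathname turns "is this path already queued?" into a constant-time lookup instead of a scan over the whole delay queue.

    -- Illustrative sketch only; names are made up.
    local queue, queued = {}, {}

    local function enqueue(pathname)
        if queued[pathname] then
            return false               -- path already pending, skip the duplicate
        end
        queue[#queue + 1] = pathname
        queued[pathname] = true
        return true
    end

    local function drain()
        local batch = queue
        queue, queued = {}, {}         -- reset queue and lookup table together
        return batch                   -- the aggregated batch goes to a single rsync
    end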

Running the Lua profiler with a default configuration would be helpful to see where the CPU time goes, and whether it is spent in Lua itself rather than in the kernel after all.

axkibe closed this as completed May 3, 2013
axkibe reopened this May 3, 2013
@izzy

izzy commented Jan 16, 2017

I can still confirm this. I tried it with ~1,000,000 directories on a 3 TB file system, and apart from the initial rsync taking a long time, the inotify_add_watch call for every folder took up an enormous amount of time and (single-core) CPU load, which seemingly led to lsyncd repeatedly doing an init, or at least part of one. I didn't see the rsync again after it finished, but checking on it after two days with strace, I realized it was adding watches for folders I had already seen being added.

@axkibe
Collaborator

axkibe commented Jan 16, 2017

If Lsyncd encounters an inotify queue overflow event, it fully restarts.

Otherwise, if a path is moved, or deleted and recreated, it will of course add new watches for that path.

@izzy

izzy commented Jan 16, 2017

If Lsyncd encounters an inotify queue overflow event, it fully restarts.

Is that preventable somehow? Also, would it be possible to multithread the add_watch process or otherwise increase its performance?

@axkibe
Collaborator

axkibe commented Jan 16, 2017

You can increase the inotify queue length via a kernel parameter (sysctl):

/proc/sys/fs/inotify/max_queued_events

This is an issue only if events pile up faster than Lsyncd can drain them. That this faucet can in principle run faster than Lsyncd can drain it is not avoidable, however; it can only be made less likely. To avoid it entirely you'd have to go with GlusterFS or DRBD or the like, where the sync is controlled at the device level and incoming events can thus be throttled.

@axkibe
Collaborator

axkibe commented Jan 16, 2017

PS: You should be able to see in the log whether there was an Overflow, or why a watch for a folder was re-added (e.g. because the folder was moved).
