check_disk is very slow #1919

Closed
nono-gdv opened this issue Sep 8, 2023 · 10 comments

@nono-gdv

nono-gdv commented Sep 8, 2023

Hi,

Since upgrading from Debian 11 (monitoring-plugins 2.3.1) to Debian 12 (monitoring-plugins 2.3.3), check_disk has gotten very slow on one of our servers.

Here is the old version 2.3.1:

# time ./check_disk -W 1% -K 1% -w 1% -c 1% -l --exclude-type=fuse.lxcfs --exclude-type=tmpfs --exclude-type=devtmpfs --exclude-type=overlay --exclude-type=nsfs --exclude-type=tracefs --exclude-type=squashfs --exclude-type=zfs -e -A -i ^/var/lib/lxc/.+ -i ^/etc/octopuce/
DISK OK| /=11765MB;19786;19786;0;19986

real    0m14,709s
user    0m10,701s
sys     0m0,307s

I tried the exact same command line with version 2.3.3 but gave up and stopped the command after more than one hour.

The machine has about 16000 mounted filesystems, most of them ZFS snapshots (which check_disk should ignore with this command line).

# mount | wc -l
16486
# mount | grep -v zfs | wc -l
24
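
To see which filesystem types account for those mounts, the two counts above can be broken down per type. This is only a minimal sketch, assuming a Linux system where `/proc/mounts` lists one mount per line with the filesystem type in the third field; the function name `count_mounts_by_type` is hypothetical and not part of monitoring-plugins.

```shell
# Count mounted filesystems per type, most frequent first.
# Reads /proc/mounts by default (assumption: Linux); pass another file
# in the same format as the first argument to override.
count_mounts_by_type() {
    awk '{ count[$3]++ } END { for (t in count) print count[t], t }' "${1:-/proc/mounts}" |
        sort -k1,1nr -k2,2
}

# Usage:
#   count_mounts_by_type
```

On a machine like the one described here, the zfs line would be expected to dominate the output, which makes it easy to check that the `--exclude-type` options cover the bulk of the mounts.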

Running the plugin under strace shows it calling statfs again and again.
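
One way to put a number on this, rather than reading the raw trace, is to write the trace to a file and count the statfs lines. The sketch below assumes strace's default output format, where each line starts with the syscall name (that is, a log written with `-o` and without `-f`); the function name `count_statfs_calls` and the file name `trace.log` are hypothetical.

```shell
# Count statfs() calls recorded in an strace log.
# Assumes the log was written without -f, so lines are not prefixed with a PID.
count_statfs_calls() {
    grep -c '^statfs(' "$1"
}

# Usage (check_disk options shortened for illustration):
#   strace -o trace.log -e trace=statfs ./check_disk -w 1% -c 1% -l --exclude-type=zfs
#   count_statfs_calls trace.log
```

Comparing the count with the number of mounts reported by `mount | wc -l` gives a rough idea of whether the plugin is touching filesystems it was asked to exclude.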

I suspect this behaviour was introduced in commit dd249c5 but I have not formally verified it.

@RincewindsHat
Member

Hi @nono-gdv,
First of all, thanks for reporting this. It looks quite interesting.
When I find the time, I will try to reproduce the problem and find out what the root cause is.

If possible, could you grab a debugger and see where check_disk spends its time?

PS: I also would be quite interested to learn what this setup does.

@sni
Contributor

sni commented Sep 11, 2023

I assume this is due to #1879, which was fixed in #1880 but, as far as I know, has not made it into a release yet.

@RincewindsHat
Member

Ahhh, I forgot about that. This might fix the issue; it is worth a try.

@nono-gdv
Author

I just tested and can confirm the issue was introduced in commit dd249c5 and fixed in 0dd1110, even though there is no automounter involved in my case. Sorry for the noise, I should have checked the master branch first.

As for "what this setup does", the box hosts mirrors of 40 or 50 APT repos with daily snapshots (implemented as ZFS snapshots), with the oldest going back almost 3 years, hence the huge number of mounts.

@RincewindsHat
Member

@waja Sounds like you might want to cherry-pick that patch then.

@waja
Member

waja commented Sep 12, 2023

@nono-gdv It would be very welcome if you reported that issue in the Debian BTS. This might raise the chances of getting it fixed in Debian 12, because patches for minor (or no) bug fixes have only a small chance of being accepted by the stable release managers.

@nono-gdv
Author

Done, https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1051768

@waja
Member

waja commented Sep 12, 2023

This was fixed with 22_check_disk_avoid_mount (https://sources.debian.org/src/monitoring-plugins/2.3.3-6/debian/patches/22_check_disk_avoid_mount/) in Version 2.3.3-6. Let's see if I get the chance to backport it.

@waja
Member

waja commented Sep 26, 2023

This issue will be solved with the next stable point release: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1052218#24

@waja
Member

waja commented Oct 7, 2023

This issue will be solved with the next stable point release: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1052218#24

Which happened right now. :)
