Disk free space threshold - at least a Warning message in the log file #8367
Comments
Not having enough disk space to allocate the shard is worth warning about. Closes elastic#8367
I had to fix this for someone recently as well. Lots of warnings would be annoying (and maybe counterproductive if you don't partition your disks sanely) but would still make the problem less mysterious. You could always turn off the warnings with logging config if you don't like them.
Fixes an issue where only absolute bytes were taken into account when kicking off an automatic reroute due to disk usage. Also randomized the tests to use either an absolute value or a percentage so this is tested. Also adds logging for each node over the high and low watermark every time a new cluster info usage is gathered (defaults to every 30 seconds). Related to elastic#8368 Fixes elastic#8367
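As a rough illustration of the check the commit message above describes, here is a minimal Python sketch. It is not the Elasticsearch implementation: the function names (`parse_watermark`, `exceeds_watermark`, `watermark_messages`), the message wording, and the unit table are all hypothetical. It assumes a percentage watermark means a limit on used disk and an absolute value means a minimum amount of free space.

```python
import re

# Binary multipliers for the size suffixes this sketch understands (assumption).
_UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}


def parse_watermark(value):
    """Return ("percent_used", float) or ("bytes_free", int) for a watermark string."""
    text = value.strip().lower()
    if text.endswith("%"):
        return ("percent_used", float(text[:-1]))
    match = re.fullmatch(r"(\d+(?:\.\d+)?)\s*(b|kb|mb|gb|tb)", text)
    if match is None:
        raise ValueError(f"unrecognized watermark: {value!r}")
    number, unit = match.groups()
    return ("bytes_free", int(float(number) * _UNITS[unit]))


def exceeds_watermark(total_bytes, free_bytes, watermark):
    """True if the node's disk usage is past the given watermark."""
    kind, limit = parse_watermark(watermark)
    if kind == "percent_used":
        used_percent = 100.0 * (total_bytes - free_bytes) / total_bytes
        return used_percent > limit
    # Absolute watermark: the node must keep at least this many bytes free.
    return free_bytes < limit


def watermark_messages(node_usage, low, high):
    """Build one message per node that is over the low or high watermark.

    node_usage maps a node name to a (total_bytes, free_bytes) tuple.
    """
    messages = []
    for node, (total, free) in sorted(node_usage.items()):
        if exceeds_watermark(total, free, high):
            messages.append(f"high watermark [{high}] exceeded on [{node}]")
        elif exceeds_watermark(total, free, low):
            messages.append(f"low watermark [{low}] exceeded on [{node}]")
    return messages
```

For example, a node with 100 GB total and 11 GB free is 89% used, so with `low="85%"` and `high="90%"` it is reported as over the low watermark only. The same node is also over an absolute watermark of `"15gb"`, which is the case that has to be handled alongside percentages.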
@dakrone I worked with a client for whom this fix may have made things even worse. A log entry every 30 seconds means the log file keeps getting bigger and bigger. If there is no place to move shards to, or if the move takes a long time to finish (think large indexes, or many pending re-allocations), then this may well crash the node by leaving 0 disk space after a short while. My 2c.
@synhershko opened #8686 to address this.
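For readers wondering how repeated warnings like these can be kept from flooding a log, one generic approach is to throttle them per node. The sketch below is only an illustration of that idea and is not necessarily what #8686 does; the class name `ThrottledWarning` and its parameters are hypothetical.

```python
import time


class ThrottledWarning:
    """Allow at most one warning per key within a minimum interval."""

    def __init__(self, min_interval_seconds, clock=time.monotonic):
        self._min_interval = min_interval_seconds
        self._clock = clock  # injectable so the behaviour can be tested
        self._last_emitted = {}  # key -> time of the last emitted warning

    def should_emit(self, key):
        """Return True if a warning for this key may be logged now."""
        now = self._clock()
        last = self._last_emitted.get(key)
        if last is not None and now - last < self._min_interval:
            return False
        self._last_emitted[key] = now
        return True
```

A caller would check `should_emit(node_name)` before writing the watermark warning, so a node that stays over the threshold produces one log entry per interval instead of one per 30-second poll.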
Hi,
Today I had an issue where none of my replica shards would start.
After 3 hours I enabled DEBUG in logging.yml and finally spotted:
Less than the required 15.0% free disk threshold (11.348983564561003% free) on node [blahblah], preventing allocation.
So I freed some space and everything is back online.
It would be great if ES emitted at least a warning in the log file.