Disk free space threshold - at least a Warning message in the log file #8367

Closed
j0r0 opened this issue Nov 6, 2014 · 3 comments


@j0r0
Contributor

commented Nov 6, 2014

Hi,
Today I had an issue where none of my replica shards would start.
After 3 hours I enabled DEBUG in logging.yml and finally spotted:
Less than the required 15.0% free disk threshold (11.348983564561003% free) on node [blahblah], preventing allocation.

So I freed some space and everything is back online.

It would be great if ES emitted at least a warning in the log file.
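
For readers who want to check a node by hand, here is a minimal sketch of the comparison the log line above describes: free space as a percentage of total capacity, checked against a required minimum. The 15.0 figure is taken from the quoted log message. The function names and the `"/"` path are illustrative assumptions, not part of Elasticsearch.

```python
import shutil

# Taken from the log line quoted above; adjust to match your cluster settings.
REQUIRED_FREE_PERCENT = 15.0


def free_percent(total_bytes: int, free_bytes: int) -> float:
    """Return free space as a percentage of total capacity."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * free_bytes / total_bytes


def below_threshold(total_bytes: int, free_bytes: int,
                    required: float = REQUIRED_FREE_PERCENT) -> bool:
    """Return True when free space is under the required percentage."""
    return free_percent(total_bytes, free_bytes) < required


if __name__ == "__main__":
    # "/" is a placeholder; point this at the Elasticsearch data path.
    usage = shutil.disk_usage("/")
    pct = free_percent(usage.total, usage.free)
    state = "BELOW" if below_threshold(usage.total, usage.free) else "above"
    print(f"{pct:.2f}% free, {state} the {REQUIRED_FREE_PERCENT}% threshold")
```

A node reporting roughly 11.35% free, as in the log line, would fall below the 15% requirement, which matches the "preventing allocation" outcome described.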

nik9000 added a commit to nik9000/elasticsearch that referenced this issue Nov 6, 2014
Raise log level on DiskThresholdDecider
Having not enough disk space to allocate the shard is worth warning about.

Closes elastic#8367
@nik9000

Contributor

commented Nov 6, 2014

I had to fix this for someone recently as well. Lots of warnings would be annoying (and maybe counterproductive if you don't partition your disks sanely) but would still make the problem less mysterious.

You could always turn off the warnings if you don't like them with logging config.

dakrone added a commit to dakrone/elasticsearch that referenced this issue Nov 7, 2014
Take percentage watermarks into account for reroute listener
Fixes an issue where only absolute bytes were taken into account when
kicking off an automatic reroute due to disk usage. Also randomized the
tests to use either an absolute value or a percentage so this is tested.

Also adds logging for each node over the high and low watermark every
time a new cluster info usage is gathered (defaults to every 30
seconds).

Related to elastic#8368
Fixes elastic#8367

@dakrone dakrone closed this in 3712d97 Nov 7, 2014
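
The commit message above distinguishes absolute-byte watermarks from percentage watermarks. A rough sketch of handling both forms follows. It assumes the documented semantics: a percentage such as `85%` is a limit on used disk, and a byte value such as `500mb` is a minimum amount of free space. The function `over_watermark` and its parsing rules are hypothetical and much simpler than the real Elasticsearch code.

```python
def over_watermark(watermark: str, total_bytes: int, free_bytes: int) -> bool:
    """Return True if a node's disk usage has crossed the given watermark.

    Assumed semantics: "85%" means more than 85% of the disk is used;
    a byte value such as "500mb" means less than that much space is free.
    """
    value = watermark.strip().lower()
    if value.endswith("%"):
        used_percent = 100.0 * (total_bytes - free_bytes) / total_bytes
        return used_percent > float(value[:-1])
    # Two-letter suffixes are checked before the bare "b" suffix.
    units = {"kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4, "b": 1}
    for suffix, factor in units.items():
        if value.endswith(suffix):
            return free_bytes < float(value[: -len(suffix)]) * factor
    raise ValueError(f"unrecognised watermark: {watermark!r}")
```

Handling only the byte branch would leave percentage-configured clusters unchecked, which is the kind of gap the commit message describes.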

dakrone added a commit that referenced this issue Nov 7, 2014
Take percentage watermarks into account for reroute listener
dakrone added a commit that referenced this issue Nov 13, 2014
Take percentage watermarks into account for reroute listener
@synhershko

Contributor

commented Nov 27, 2014

@dakrone I worked with a client for whom this fix may have made things even worse. A log entry every 30 seconds means the log file keeps growing. If there is nowhere to move shards to, or if the move takes a long time to finish (think large indexes, or many pending re-allocations), then this may well crash the node once the disk fills up completely. My 2c.
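
One common way to limit this kind of log growth is to throttle repeated warnings per node. The sketch below illustrates that general idea only. It is not taken from Elasticsearch and is not necessarily what any follow-up change implemented. The class name, the one-hour default interval, and the injected clock are all assumptions.

```python
import time
from typing import Callable, Dict


class ThrottledWarner:
    """Allow at most one warning per node within a minimum interval."""

    def __init__(self, min_interval_s: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._min_interval_s = min_interval_s
        self._clock = clock  # injectable so the behaviour can be tested
        self._last_emitted: Dict[str, float] = {}

    def should_warn(self, node: str) -> bool:
        """Return True if a warning for this node may be logged now."""
        now = self._clock()
        last = self._last_emitted.get(node)
        if last is not None and now - last < self._min_interval_s:
            return False  # warned about this node too recently
        self._last_emitted[node] = now
        return True
```

A caller would check `should_warn(node_name)` before each log call, so a node that stays over its watermark produces one entry per interval rather than one per 30-second check.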

@dakrone

Member

commented Nov 27, 2014

@synhershko opened #8686 to address this.

mute pushed a commit to mute/elasticsearch that referenced this issue Jul 29, 2015
Take percentage watermarks into account for reroute listener
4 participants