
Can't free up space because there's not enough space? ;) #9260

Closed
darkpixel opened this issue Jan 12, 2015 · 4 comments

Comments

@darkpixel

One of my nodes is 'low' on space, down to ~16 GB.

So I tried running curator to remove older logs, and I got the following message:

[2015-01-12 11:53:50,250][INFO ][cluster.metadata         ] [tetrad] [logstash-2014.12.13] deleting index
[2015-01-12 11:53:50,251][DEBUG][action.admin.indices.delete] [tetrad] [logstash-2014.12.13] failed to delete index
java.lang.IllegalStateException: Free bytes [4518450191893] cannot be less than 0 or greater than total bytes [4509977353216]
        at org.elasticsearch.cluster.DiskUsage.<init>(DiskUsage.java:36)
        at org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDecider.canRemain(DiskThresholdDecider.java:439)
        at org.elasticsearch.cluster.routing.allocation.decider.AllocationDeciders.canRemain(AllocationDeciders.java:105)
        at org.elasticsearch.cluster.routing.allocation.AllocationService.moveShards(AllocationService.java:257)
        at org.elasticsearch.cluster.routing.allocation.AllocationService.reroute(AllocationService.java:223)
        at org.elasticsearch.cluster.routing.allocation.AllocationService.reroute(AllocationService.java:160)
        at org.elasticsearch.cluster.routing.allocation.AllocationService.reroute(AllocationService.java:146)
        at org.elasticsearch.cluster.metadata.MetaDataDeleteIndexService$2.execute(MetaDataDeleteIndexService.java:130)
        at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:329)
        at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:153)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
[2015-01-12 11:53:56,767][INFO ][cluster.routing.allocation.decider] [tetrad] low disk watermark [15%] exceeded on [Pc_MAIWOQVe6qKNtKVIYpw][zefram] free: 16.3gb[14.5%], replicas will not be assigned to this node

The actual error from curator is:

root@tetrad:~# /root/.virtualenvs/curator/bin/curator --host localhost delete --older-than 10;
2015-01-12 11:53:50,238 INFO      Job starting...
2015-01-12 11:53:50,241 INFO      Deleting indices...
Traceback (most recent call last):
  File "/root/.virtualenvs/curator/bin/curator", line 9, in <module>
    load_entry_point('elasticsearch-curator==2.1.1', 'console_scripts', 'curator')()
  File "/root/.virtualenvs/curator/local/lib/python2.7/site-packages/curator/curator_script.py", line 364, in main
    arguments.func(client, **argdict)
  File "/root/.virtualenvs/curator/local/lib/python2.7/site-packages/curator/curator.py", line 1025, in delete
    _op_loop(client, matching_indices, op=delete_index, dry_run=dry_run, **kwargs)
  File "/root/.virtualenvs/curator/local/lib/python2.7/site-packages/curator/curator.py", line 767, in _op_loop
    skipped = op(client, item, **kwargs)
  File "/root/.virtualenvs/curator/local/lib/python2.7/site-packages/curator/curator.py", line 610, in delete_index
    client.indices.delete(index=index_name)
  File "/root/.virtualenvs/curator/local/lib/python2.7/site-packages/elasticsearch/client/utils.py", line 68, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/root/.virtualenvs/curator/local/lib/python2.7/site-packages/elasticsearch/client/indices.py", line 188, in delete
    params=params)
  File "/root/.virtualenvs/curator/local/lib/python2.7/site-packages/elasticsearch/transport.py", line 301, in perform_request
    status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
  File "/root/.virtualenvs/curator/local/lib/python2.7/site-packages/elasticsearch/connection/http_urllib3.py", line 82, in perform_request
    self._raise_error(response.status, raw_data)
  File "/root/.virtualenvs/curator/local/lib/python2.7/site-packages/elasticsearch/connection/base.py", line 102, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
elasticsearch.exceptions.TransportError: TransportError(500, u'IllegalStateException[Free bytes [4518450191893] cannot be less than 0 or greater than total bytes [4509977353216]]')
root@tetrad:~# 
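
The stack trace above explains the paradox in the title: deleting an index triggers a cluster reroute (MetaDataDeleteIndexService → AllocationService.reroute → DiskThresholdDecider.canRemain), the reroute rebuilds a DiskUsage object from the node's filesystem stats, and that constructor rejects stats where free bytes exceed total bytes, so the 500 TransportError bubbles back to curator. Below is a minimal sketch of the check implied by the exception message; the constructor signature and field names are assumptions, not the verbatim Elasticsearch source.

public class DiskUsage {
    final String nodeId;
    final String nodeName;
    final long totalBytes;
    final long freeBytes;

    public DiskUsage(String nodeId, String nodeName, long totalBytes, long freeBytes) {
        // This validation is what the index deletion trips over: the delete triggers a
        // reroute, the reroute rebuilds DiskUsage from filesystem stats, and the ZFS-backed
        // node reports more free bytes (4518450191893) than total bytes (4509977353216).
        if (freeBytes < 0 || freeBytes > totalBytes) {
            throw new IllegalStateException("Free bytes [" + freeBytes
                    + "] cannot be less than 0 or greater than total bytes [" + totalBytes + "]");
        }
        this.nodeId = nodeId;
        this.nodeName = nodeName;
        this.totalBytes = totalBytes;
        this.freeBytes = freeBytes;
    }
}
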
@darkpixel
Author

One thing I forgot to mention: the nodes use ZFS for their storage. Maybe the error about free bytes exceeding total bytes is somehow related to ZFS compression?

@darkpixel
Author

The workaround:

  • stop the node that is low on disk space
  • run curator to delete older indexes
  • start the node that is low on disk space
  • the node will delete its old indexes, freeing up space

@dakrone
Member

dakrone commented Jan 13, 2015

Related to #9249; the workaround there will also work for this until it is fixed.

@darkpixel
Author

Thanks for the pointer!

dakrone added the v1.3.8 label Jan 13, 2015
dakrone added a commit to dakrone/elasticsearch that referenced this issue Jan 26, 2015
Apparently some filesystems such as ZFS and occasionally NTFS can report
filesystem usages that are negative, or above the maximum total size of
the filesystem. This relaxes the constraints on `DiskUsage` so that an
exception is not thrown.

If 0 is passed as the totalBytes, `.getFreeDiskAsPercentage()` will
always return 100.0% free (to ensure the disk threshold decider fails
open)

Fixes elastic#9249
Relates to elastic#9260
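
As a rough illustration of what the commit message describes, here is a minimal sketch of the relaxed percentage calculation, building on the DiskUsage sketch above; the method body is an assumption and may differ from the actual source.

public double getFreeDiskAsPercentage() {
    // Fail open: with an unknown (0) total, report the disk as 100% free so the
    // disk threshold decider does not block allocation on bogus stats.
    if (totalBytes == 0) {
        return 100.0;
    }
    // Out-of-range values (negative free bytes, or free > total as ZFS/NTFS can
    // report) are no longer rejected up front; the percentage is computed as-is.
    return 100.0 * ((double) freeBytes / totalBytes);
}
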
dakrone added a commit that referenced this issue Jan 26, 2015 (same commit message as above)
dakrone added a commit that referenced this issue Jan 26, 2015 (same commit message as above)
dakrone added a commit that referenced this issue Jan 26, 2015 (same commit message as above, resolving conflicts in DiskUsage.java and DiskUsageTests.java)
mute pushed a commit to mute/elasticsearch that referenced this issue Jul 29, 2015 (same commit message as above)
mute pushed a commit to mute/elasticsearch that referenced this issue Jul 29, 2015 (same commit message as above, resolving conflicts in DiskUsage.java and DiskUsageTests.java)