ElasticSearch fails on startup if it finds an empty state file for an index #27007

Closed
nefelim4ag opened this issue Oct 13, 2017 · 4 comments

@nefelim4ag

ES 5.6.3

openjdk version "1.8.0_141"
OpenJDK Runtime Environment (build 1.8.0_141-8u141-b15-1~deb9u1-b15)
OpenJDK 64-Bit Server VM (build 25.141-b15, mixed mode)

Linux ids 4.12.0-0.bpo.1-amd64 #1 SMP Debian 4.12.6-1~bpo9+1 (2017-08-27) x86_64 GNU/Linux

Sometimes after a server crash, ES can find files like:
/ES/ES_STOR_00/nodes/0/indices/9UbB7kLtQ4mBJxPCzgHjZQ/_state/state-30.st
with zero size (empty), and fail to start up.

I think ES should just ignore/drop those files and reread the indexes.

  1. Stop ES
  2. truncate -s 0 /ES/ES_STOR_00/nodes/0/indices/9UbB7kLtQ4mBJxPCzgHjZQ/_state/state-30.st
  3. Start ES
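A quick way to spot such files before starting ES is to search the data path for zero-length state files. The sketch below demonstrates this against a throwaway directory; the `demoUUID` name and file names are illustrative, and in practice you would point `DATA` at your real data path (e.g. /ES/ES_STOR_00/nodes/0).

```shell
# Build a throwaway directory mimicking the index-state layout from this report.
DATA=$(mktemp -d)
mkdir -p "$DATA/indices/demoUUID/_state"
: > "$DATA/indices/demoUUID/_state/state-30.st"          # empty file, like the one reported
printf 'x' > "$DATA/indices/demoUUID/_state/state-29.st" # healthy, non-empty file

# List any zero-length state files (only state-30.st should match).
find "$DATA" -name 'state-*.st' -size 0c -print
```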

Thanks

@jasontedor
Member

These state files are written atomically. That is, when we are going to write a new state file out to disk, we first write it to a temporary file, then we fsync that file to disk, then we atomically move the temporary file into its permanent location (the underlying filesystem must support atomic move; there is no fallback here), and finally we fsync the directory to disk. Seeing a zero-length state file means the file was modified by an external force. Failing startup is the right behavior here: we should not leniently allow external forces to modify files on disk. This needs to be forcefully brought to your attention so that you can address the root problem, and it requires manual intervention to recover from.
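A rough shell approximation of that write sequence (illustrative only — Elasticsearch does this in Java via NIO; the directory, file name, and contents below are made up, and `sync FILE` here stands in for the fsync calls):

```shell
STATE_DIR=$(mktemp -d)
NEW_STATE='{"version": 31}'

printf '%s' "$NEW_STATE" > "$STATE_DIR/state-31.st.tmp"   # 1. write to a temporary file
sync "$STATE_DIR/state-31.st.tmp"                         # 2. fsync the temporary file
mv "$STATE_DIR/state-31.st.tmp" "$STATE_DIR/state-31.st"  # 3. atomic rename (same filesystem)
sync "$STATE_DIR"                                         # 4. fsync the directory entry

cat "$STATE_DIR/state-31.st"
```

Because the rename is atomic, readers only ever see either the old complete file or the new complete file, never a partially written one.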

I think ES should just ignore/drop those files and reread the indexes

We simply abhor leniency of this form. If these files are modified by an external force, corrupted on disk, etc., it needs to be brought to your attention. Lines in a log file are not enough.

@nefelim4ag
Author

@jasontedor, thanks for the explanation! :)

@jasontedor
Member

@nefelim4ag You're most welcome.

@grantholly

I'm having a similar issue:

ES 5.6.0
java version "1.8.0_131"
Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)
CentOS Release 7.3

I had a node die with an OutOfMemoryError. When I tried to restart the nodes, they could not start back up and were complaining about being unable to read an index state file.

[2017-10-17T13:50:16,501][ERROR][o.e.g.GatewayMetaState ] [REDACTED] failed to read local state, exiting... org.elasticsearch.ElasticsearchException: java.io.IOException: failed to read [id:9, legacy:false, file:/data/2/logstash-1/REDACTED/nodes/0/indices/5FxmYg9DQziflptxDNdW0A/_state/state-9.st]

I checked the file, and unlike other index state files, the one named in the log has no content:

$ stat /data/1/logstash-1/REDACTED/nodes/0/indices/5FxmYg9DQziflptxDNdW0A/_state/state-9.st
  File: ‘/data/1/logstash-1/REDACTED/nodes/0/indices/5FxmYg9DQziflptxDNdW0A/_state/state-9.st’
  Size: 0          Blocks: 0          IO Block: 4096   regular empty file
Device: 831h/2097d  Inode: 4298048169  Links: 1
Access: (0644/-rw-r--r--)  Uid: (  496/elasticsearch)  Gid: (  390/elasticsearch)
Access: 2017-10-17 13:42:55.163173624 -0700
Modify: 2017-10-16 16:34:24.419638685 -0700
Change: 2017-10-16 16:34:24.419638685 -0700
 Birth: -

Elasticsearch will NOT start up in this state. My cluster is stuck in a perpetual red state, as I cannot allocate the unassigned primary and replica shards that have damaged state files. Here's part of the output from when I tried to allocate the unassigned shards:

_cluster/reroute?dry_run -d '{"commands": [{"allocate_stale_primary": {"index": "logstash-2017.10.07", "shard": 5, "node": "REDACTED", "accept_data_loss": true}}]}'

...
"unassigned": [
  {
    "unassigned_info": {
      "allocation_status": "no_attempt",
      "details": "node_left[BYwe091yTUG_0aPwOcpirQ]",
      "delayed": false,
      "at": "2017-10-16T23:36:05.480Z",
      "reason": "NODE_LEFT"
    },
    "recovery_source": {
      "type": "PEER"
    },
    "index": "logstash-2017.10.07",
    "shard": 5,
    "relocating_node": null,
    "node": null,
    "primary": false,
    "state": "UNASSIGNED"
  },
  {
    "unassigned_info": {
      "allocation_status": "no_valid_shard_copy",
      "details": "node_left[65jx7VyYRty3DZ1SPbMKUg]",
      "delayed": false,
      "at": "2017-10-16T23:40:06.448Z",
      "reason": "NODE_LEFT"
    },
    "recovery_source": {
      "type": "EXISTING_STORE"
    },
    "index": "logstash-2017.10.10",
    "shard": 1,
    "relocating_node": null,
    "node": null,
    "primary": true,
    "state": "UNASSIGNED"
  },
  {
    "unassigned_info": {
      "allocation_status": "no_attempt",
      "details": "primary failed while replica initializing",
      "delayed": false,
      "at": "2017-10-16T23:40:06.448Z",
      "reason": "PRIMARY_FAILED"
    },
    "recovery_source": {
      "type": "PEER"
    },
    "index": "logstash-2017.10.10",
    "shard": 1,
    "relocating_node": null,
    "node": null,
    "primary": false,
    "state": "UNASSIGNED"
  },
  {
    "unassigned_info": {
      "allocation_status": "no_valid_shard_copy",
      "details": "failed recovery, failure RecoveryFailedException[[logstash-2017.10.06][5]: Recovery failed on {REDACTED}{SnJTnI9YS7GKLV9dkOCbtw}{Y6DRDT41Qiqylyu1gXaVsw}{10.0.44.201}{10.0.44.201:9300}{rack=1}]; nested: IndexShardRecoveryException[failed to fetch index version after copying it over]; nested: IndexShardRecoveryException[shard allocated for local recovery (post api), should exist, but doesn't, current files: []]; nested: FileNotFoundException[no segments* file found in store(mmapfs(/data/5/logstash-2/nodes/0/indices/qKtuZNX9SHCHEcj8dviiMQ/5/index)): files: []]; ",
      "delayed": false,
      "failed_attempts": 1,
      "at": "2017-10-17T22:15:50.136Z",
      "reason": "ALLOCATION_FAILED"
    },
    "recovery_source": {
      "type": "EXISTING_STORE"
    },
    "index": "logstash-2017.10.06",
    "shard": 5,
    "relocating_node": null,
    "node": null,
    "primary": true,
    "state": "UNASSIGNED"
  },
  {
    "unassigned_info": {
      "allocation_status": "no_attempt",
      "details": "primary failed while replica initializing",
      "delayed": false,
      "at": "2017-10-16T23:40:06.448Z",
      "reason": "PRIMARY_FAILED"
    },
    "recovery_source": {
      "type": "PEER"
    },
    "index": "logstash-2017.10.06",
    "shard": 5,
    "relocating_node": null,
    "node": null,
    "primary": false,
    "state": "UNASSIGNED"
  }
]
},
"restore": {
  "snapshots": []
}
},
"acknowledged": true
}
...

If I'm reading the docs right, I'm looking at data loss if I were to use the reroute API with allocate_empty_primary. I have a few questions:

  1. If I'm totally wrong about being stuck allocating an empty primary, what are my options for recovering?
  2. If I can't recover those shards, is the most expedient path to getting my cluster back to green just deleting the affected indices?
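For reference, a hedged sketch of what a full allocate_empty_primary reroute request might look like, based on the command shape shown earlier in this thread. The host, index, shard, and node values are placeholders, and this command discards any remaining data for the shard, so it is a last resort. The sketch only builds and sanity-checks the request body; the actual call is left commented out.

```shell
# Request body for allocating an empty primary (placeholders throughout).
BODY='{"commands":[{"allocate_empty_primary":{"index":"logstash-2017.10.10","shard":1,"node":"REDACTED","accept_data_loss":true}}]}'

# Sanity-check that the body parses as JSON before sending it anywhere.
echo "$BODY" | python3 -m json.tool

# The actual call (destructive; run against your master node when ready):
# curl -XPOST "http://localhost:9200/_cluster/reroute?dry_run" \
#      -H 'Content-Type: application/json' -d "$BODY"
```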
