Restore shard routing. #6393

jpountz · 2014-06-03T14:12:31Z

Routing was inadvertently changed in #5562, resulting in documents going to
different shards in 1.2. This is a terrible bug because an indexing request
for an existing document would not necessarily go to the shard that already
holds it, potentially leading to duplicates.

Close #6391
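
To make the failure mode concrete, here is a minimal Python sketch of why a routing change can produce duplicates. Everything in it is hypothetical: `route_before` and `route_after` are stand-ins for "the routing function before and after a change" and are not Elasticsearch's actual hash, and the shards are plain dicts. The only behaviour assumed is that an `_id` is unique within a shard, not across the whole index.

```python
import zlib

NUM_SHARDS = 5


def route_before(doc_id: str) -> int:
    """Hypothetical routing function in use when the document was first indexed."""
    return zlib.crc32(doc_id.encode("utf-8")) % NUM_SHARDS


def route_after(doc_id: str) -> int:
    """Hypothetical routing function after an (artificial) change.

    Adding 1 before the modulo guarantees a different shard when NUM_SHARDS > 1.
    """
    return (zlib.crc32(doc_id.encode("utf-8")) + 1) % NUM_SHARDS


def index_doc(shards, route, doc_id, body):
    """Write the document to whichever shard the routing function selects."""
    shards[route(doc_id)][doc_id] = body


def shards_holding(shards, doc_id):
    """Return the shard numbers that currently hold a copy of doc_id."""
    return [number for number, shard in enumerate(shards) if doc_id in shard]


shards = [dict() for _ in range(NUM_SHARDS)]
index_doc(shards, route_before, "user-1", {"version": 1})  # original write
index_doc(shards, route_after, "user-1", {"version": 2})   # intended overwrite
duplicate_shards = shards_holding(shards, "user-1")
print(duplicate_shards)
```

The second write was meant to replace the first, but it lands on a different shard, so both copies survive and a search across all shards returns the document twice.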

s1monw · 2014-06-03T14:15:04Z

LGTM

colings86 · 2014-06-03T14:17:44Z

src/main/resources/org.elasticsearch.cluster.routing/shard_routes.txt

@@ -0,0 +1,217 @@
+# Index num_shards _type _id _routing shard_id


Shouldn't this file be in src/test/resources rather than src/main/resources, since it's a test file?

Good point!
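
The header line in the diff suggests that each row of the file records one expected routing decision. A fixture like that could be replayed against a routing function roughly as sketched below. Assumptions are flagged loudly: the thread shows only the header, so the tab delimiter, the `Route`, `parse_routes` and `mismatches` names, and the sample rows are all invented for illustration and are not taken from the real file or test.

```python
from typing import Callable, Iterable, List, NamedTuple


class Route(NamedTuple):
    """One row: Index num_shards _type _id _routing shard_id."""
    index: str
    num_shards: int
    doc_type: str
    doc_id: str
    routing: str
    shard_id: int


def parse_routes(lines: Iterable[str]) -> List[Route]:
    """Parse fixture rows, skipping blank lines and '#' comments.

    Assumes tab-separated columns (the real delimiter is not shown in the thread).
    """
    routes = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            continue
        index, num_shards, doc_type, doc_id, routing, shard_id = line.split("\t")
        routes.append(
            Route(index, int(num_shards), doc_type, doc_id, routing, int(shard_id))
        )
    return routes


def mismatches(routes: List[Route], route_fn: Callable[[Route], int]) -> List[Route]:
    """Return every row whose recorded shard differs from what route_fn computes."""
    return [row for row in routes if route_fn(row) != row.shard_id]


# Invented sample rows, for illustration only.
SAMPLE = [
    "# Index num_shards _type _id _routing shard_id",
    "idx_a\t5\tdoc\t1\tr1\t3",
    "idx_b\t2\tdoc\t2\tr2\t0",
]
sample_routes = parse_routes(SAMPLE)
```

Pinning expected shard numbers in a data file means any later change to the hashing or modulo step shows up as a non-empty `mismatches` result.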

jpountz · 2014-06-03T14:27:11Z

Thanks @colings86, I updated the path and will push shortly.

andrassy · 2014-06-03T15:39:36Z

If data is indexed with the broken 1.2 and a fixed version is then released, won't that, again, lead to data corruption? Really nasty bug :(

s1monw · 2014-06-04T08:09:29Z

@andrassy Agreed, indexed data with 1.2.0 will not be compatible with 1.2.1. Yet, we are working on tooling / help for the problem. I also agree that this bug is really nasty.

micpalmia · 2014-06-04T10:18:45Z

We had 0.90.7 installed, with no routing in place. We updated to 1.2.0 and reindexed everything with custom routing, that is now enforced using aliases (we don't explicitly route documents in the queries).

Does this mean that we might store (or might have stored) documents in the wrong shards, putting them in some kind of endless limbo? Any advice on how we should proceed?

bleskes · 2014-06-04T11:01:48Z

@micpalmia sadly, roughly 50% of the documents indexed with 1.2.0 are affected by this issue, regardless of whether custom routing is used or not. We're still evaluating the options for creating a tool to help identify (and hopefully fix) the problematic documents, but in your case it will likely be half of your data. Since you've reindexed once, I think your quickest option will be to reindex again once you upgrade to 1.2.1. Do note that 1.2.0 is consistent with itself, so if you indexed all data with 1.2.0 you're good to access it with 1.2.0. This may help with timing the reindexing.
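
A figure near 50% is what one would expect if the two versions differ only in how a signed hash is reduced to a shard number. The thread does not say what actually changed in #5562, so the sketch below is an assumption for illustration only. It compares "absolute value of the remainder" with a floor modulo: the two agree for every non-negative hash and usually disagree for negative ones. `toy_hash` is a hypothetical DJB2-style stand-in wrapped to signed 32 bits and is not Elasticsearch's hash function.

```python
def to_int32(value: int) -> int:
    """Wrap a Python int to the signed 32-bit range, like Java int overflow."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def toy_hash(key: str) -> int:
    """Hypothetical DJB2-style hash; can be negative after 32-bit wrapping."""
    h = 5381
    for ch in key:
        h = to_int32(h * 33 + ord(ch))
    return h


def shard_abs_remainder(hash_value: int, num_shards: int) -> int:
    """Magnitude of the truncated remainder (Java: Math.abs(hash % numShards))."""
    return abs(hash_value) % num_shards


def shard_floor_mod(hash_value: int, num_shards: int) -> int:
    """Floor modulo; Python's % already floors for a positive divisor."""
    return hash_value % num_shards


def fraction_moved(keys, num_shards: int) -> float:
    """Share of keys that the two reductions send to different shards."""
    moved = sum(
        1
        for key in keys
        if shard_abs_remainder(toy_hash(key), num_shards)
        != shard_floor_mod(toy_hash(key), num_shards)
    )
    return moved / len(keys)


keys = [f"doc-{i}" for i in range(1000)]
print(fraction_moved(keys, 5))
```

For a hash of -7 and 5 shards, the first reduction gives shard 2 and the second gives shard 3. Each rule is deterministic on its own, which matches the observation that data written and read by the same version stays consistent.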

micpalmia · 2014-06-04T13:12:01Z

Thank you @bleskes, should we look somewhere specific for future updates about this issue?

nariman-haghighi · 2014-06-04T16:41:29Z

Any updates here? This is about as ugly as it could get. It's shocking something like this made it out. Not only does it impact us, it impacts our customers and our SLAs. Is there a manual workaround at the moment? Deleting duplicates through a query and re-indexing? Re-indexing alone won't do it.

kimchy · 2014-06-04T17:46:49Z

@nariman-haghighi we are working on a tool that will go over the data and fix it; it should be out in a day or two.

s1monw · 2014-06-04T18:53:03Z

@nariman-haghighi one manual workaround is to move to 1.2.1 and reindex into a new index. The new index will have everything in the right place. That said, it's not always feasible, and we are trying to come up with a solution that is less painful.

nariman-haghighi · 2014-06-06T19:12:09Z

The workaround of re-indexing into a new index worked for 2 of 3 indices, but something peculiar happens with the 3rd index. It has 454 documents (all with unique IDs), but the attempt to re-index with the bulk API results in 1 document in the destination index. No changes to mappings. Maybe a NEST issue? /cc: @Mpdreamz

jpountz · 2014-06-06T23:44:42Z

@nariman-haghighi Have you checked if the bulk response contained errors? Additionally, http://www.elasticsearch.org/blog/tool-help-routing-issues-elasticsearch-1-2-0/ might help get your indices back to normal.
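
Checking a bulk response for errors can be sketched as below. A bulk request can succeed at the HTTP level while individual items fail, so each entry under `items` has to be inspected. The sketch works on an already parsed JSON response; `failed_items` is a hypothetical helper name and the sample response is invented for illustration. If the top-level `errors` flag is absent, the sketch falls back to scanning every item.

```python
def failed_items(bulk_response: dict) -> list:
    """Collect (action, _id, status, error) for every failed bulk item."""
    failures = []
    # When the flag is present and false, no item failed, so skip the scan.
    if not bulk_response.get("errors", True):
        return failures
    for item in bulk_response.get("items", []):
        # Each item is a single-key dict such as {"index": {...}}.
        for action, result in item.items():
            if "error" in result:
                failures.append(
                    (action, result.get("_id"), result.get("status"), result["error"])
                )
    return failures


# Invented response, for illustration only.
sample_response = {
    "took": 3,
    "errors": True,
    "items": [
        {"index": {"_index": "dest", "_id": "1", "status": 201}},
        {"index": {"_index": "dest", "_id": "2", "status": 400,
                   "error": "MapperParsingException[failed to parse]"}},
    ],
}
failures = failed_items(sample_response)
for action, doc_id, status, error in failures:
    print(action, doc_id, status, error)
```

Comparing the number of failures plus successes with the number of documents sent is a quick way to tell rejected documents apart from documents that overwrote each other.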

colings86 reviewed Jun 3, 2014

Address @colings86 's concerns.

4f74e30

jpountz added v1.2.1 labels Jun 3, 2014

jpountz closed this Jun 3, 2014

javanna mentioned this pull request Jun 4, 2014

Duplicate Documents In All Queries Following Upgrade to 1.2.0 #6396

Closed

clintongormley changed the title ~~Routing: Restore shard routing.~~ Restore shard routing. Jun 7, 2015

clintongormley added the :Core/Infra/Core Core issues without another label label Jun 7, 2015
