Restore shard routing. #6393

Closed
wants to merge 2 commits into from

Contributor

@jpountz jpountz commented Jun 3, 2014

Routing was inadvertently changed in #5562, resulting in documents going to
different shards in 1.2. This is a terrible bug because an indexing request
for a given document would not necessarily go to the same shard anymore,
potentially leading to duplicates.

Close #6391
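
To make the failure mode concrete, here is a minimal sketch of hash-modulo shard routing and of what happens when the hash function changes. The two hash functions (`djb2` and `crc32`) are stand-ins chosen for illustration; neither is claimed to be what Elasticsearch used before or after #5562, and all function names here are hypothetical.

```python
import zlib


def djb2(key: str) -> int:
    """Stand-in hash A (classic DJB2 over the characters of the key)."""
    h = 5381
    for ch in key:
        h = (h * 33 + ord(ch)) & 0xFFFFFFFF
    return h


def crc32(key: str) -> int:
    """Stand-in hash B; any function that differs from A would do."""
    return zlib.crc32(key.encode("utf-8")) & 0xFFFFFFFF


def shard_for(key: str, num_shards: int, hash_fn) -> int:
    """Pick a shard by hashing the routing key modulo the shard count."""
    return hash_fn(key) % num_shards


def moved_fraction(keys, num_shards: int, old_fn, new_fn) -> float:
    """Fraction of keys that land on a different shard once the hash changes."""
    moved = sum(
        1
        for k in keys
        if shard_for(k, num_shards, old_fn) != shard_for(k, num_shards, new_fn)
    )
    return moved / len(keys)


if __name__ == "__main__":
    ids = [f"doc-{i}" for i in range(10_000)]
    print(moved_fraction(ids, 5, djb2, djb2))   # same function: nothing moves
    print(moved_fraction(ids, 5, djb2, crc32))  # different function: many keys move
```

Any key whose shard changes is the kind of document the description is worried about: a later indexing request for the same `_id` is sent to a shard that has never seen it.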

@s1monw
Contributor

s1monw commented Jun 3, 2014

LGTM

@@ -0,0 +1,217 @@
# Index num_shards _type _id _routing shard_id
Contributor

Shouldn't this file be in src/test/resources rather than src/main/resources, since it's a test file?

Contributor Author

Good point!

@jpountz
Contributor Author

jpountz commented Jun 3, 2014

Thanks @colings86, I updated the path and will push shortly.
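
The fixture header quoted in the diff (`Index num_shards _type _id _routing shard_id`) suggests a regression test that records the expected shard for a set of documents and fails if routing ever changes. A rough sketch of that idea follows. The tab-separated layout, the sample rows, the empty string for "no routing", and both hash functions are assumptions for illustration, not the contents of the actual file or the actual routing code.

```python
def good_hash(key):
    """Stand-in for the known-good routing hash (classic DJB2, an assumption)."""
    h = 5381
    for ch in key:
        h = (h * 33 + ord(ch)) & 0xFFFFFFFF
    return h


def changed_hash(key):
    """Stand-in for an accidental change: shifts every key by one shard."""
    return good_hash(key) + 1


# Made-up sample rows: (index, num_shards, _type, _id, _routing); "" means no routing.
ROWS = [
    ("index-a", 5, "type1", "1", ""),
    ("index-a", 5, "type1", "2", "user-7"),
    ("index-b", 2, "type2", "abc", ""),
]


def make_fixture(rows, route_fn):
    """Render fixture lines once, using a routing function known to be correct."""
    lines = ["# Index num_shards _type _id _routing shard_id"]
    for index, num_shards, doc_type, doc_id, routing in rows:
        shard = route_fn(routing or doc_id) % num_shards
        lines.append(f"{index}\t{num_shards}\t{doc_type}\t{doc_id}\t{routing}\t{shard}")
    return lines


def check_fixture(lines, route_fn):
    """Return rows whose recorded shard differs from what route_fn computes now."""
    mismatches = []
    for line in lines:
        if not line or line.startswith("#"):
            continue  # skip the header comment and blank lines
        index, num_shards, doc_type, doc_id, routing, shard_id = line.split("\t")
        actual = route_fn(routing or doc_id) % int(num_shards)
        if actual != int(shard_id):
            mismatches.append((index, doc_id, int(shard_id), actual))
    return mismatches


if __name__ == "__main__":
    fixture = make_fixture(ROWS, good_hash)
    print(check_fixture(fixture, good_hash))     # no mismatches
    print(check_fixture(fixture, changed_hash))  # every row is reported
```

The point of freezing the expected shard ids in a file is that the test compares against recorded history rather than against whatever the current code computes.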

@andrassy

andrassy commented Jun 3, 2014

If data is indexed with the broken 1.2 and a fixed version is then released, won't that, again, lead to data corruption? Really nasty bug :(

@s1monw
Contributor

s1monw commented Jun 4, 2014

@andrassy Agreed, data indexed with 1.2.0 will not be compatible with 1.2.1. That said, we are working on tooling to help with the problem. I also agree that this bug is really nasty.

@micpalmia
Contributor

We had 0.90.7 installed, with no routing in place. We updated to 1.2.0 and reindexed everything with custom routing, which is now enforced using aliases (we don't explicitly route documents in the queries).

Does this mean that we might store (or might have stored) documents in the wrong shards, putting them in some kind of endless limbo? Any advice on how we should proceed?

@bleskes
Contributor

bleskes commented Jun 4, 2014

@micpalmia sadly, roughly 50% of the documents indexed with 1.2.0 are affected by this issue, regardless of whether custom routing is used or not. We're still evaluating the options for creating a tool to help identify (and hopefully fix) the problematic documents, but in your case it will likely be half of your data. Since you've reindexed once, I think your quickest option will be to reindex again once you upgrade to 1.2.1. Do note that 1.2.0 is consistent with itself, so if you indexed all data with 1.2.0 you're good to access it with 1.2.0. This may help with timing the reindexing.
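
A toy in-memory model may help show why a single version is consistent with itself while mixing versions produces misses and duplicates. `TinyIndex`, `hash_a`, and `hash_b` are hypothetical names; `hash_b` deliberately moves every key to the next shard, which is more drastic than the roughly 50% reported above.

```python
def hash_a(key):
    """Stand-in routing hash for "version A" (classic DJB2, an assumption)."""
    h = 5381
    for ch in key:
        h = (h * 33 + ord(ch)) & 0xFFFFFFFF
    return h


def hash_b(key):
    """Stand-in routing hash for "version B": always one shard further along."""
    return hash_a(key) + 1


class TinyIndex:
    """In-memory model of an index: one dict per shard, keyed by document id."""

    def __init__(self, num_shards, route_fn):
        self.shards = [dict() for _ in range(num_shards)]
        self.route_fn = route_fn

    def _shard(self, doc_id):
        return self.shards[self.route_fn(doc_id) % len(self.shards)]

    def index(self, doc_id, source):
        """Write goes only to the shard the current routing function picks."""
        self._shard(doc_id)[doc_id] = source

    def get(self, doc_id):
        """Read looks only at the shard the current routing function picks."""
        return self._shard(doc_id).get(doc_id)

    def copies(self, doc_id):
        """Count how many shards hold a document with this id."""
        return sum(1 for shard in self.shards if doc_id in shard)


if __name__ == "__main__":
    idx = TinyIndex(2, hash_a)
    idx.index("42", {"v": 1})
    print(idx.get("42"))     # found: same function for write and read

    idx.route_fn = hash_b    # simulate switching versions on existing data
    print(idx.get("42"))     # lookup goes to the other shard and misses
    idx.index("42", {"v": 2})
    print(idx.copies("42"))  # the same id now lives on two shards
```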

@micpalmia
Contributor

Thank you @bleskes. Should we look somewhere specific for future updates about this issue?

@nariman-haghighi

Any updates here? This is about as ugly as it could get. It's shocking something like this made it out. Not only does it impact us, it impacts our customers and our SLAs. Is there a manual workaround at the moment? Deleting duplicates through a query and re-indexing? Re-indexing alone won't do it.

@kimchy
Member

kimchy commented Jun 4, 2014

@nariman-haghighi we are working on a tool that will go over the data and fix it; it should be out in a day or two.

@s1monw
Contributor

s1monw commented Jun 4, 2014

@nariman-haghighi one manual workaround is to move to 1.2.1 and reindex into a new index. The new index will have everything in the right place. That said, it's not always feasible, and we are trying to come up with a solution that is less painful.
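
In the abstract, the reindex workaround amounts to reading every document from every shard of the old index, wherever it ended up, and writing it to a fresh index under the corrected routing. The sketch below models that with plain dicts. `reindex` and `route` are hypothetical names, the hash is a stand-in, and letting the last copy read win when an id appears twice is an assumption; a real migration would need its own rule for choosing between duplicate copies.

```python
def route(key):
    """Stand-in for the corrected routing hash (classic DJB2, an assumption)."""
    h = 5381
    for ch in key:
        h = (h * 33 + ord(ch)) & 0xFFFFFFFF
    return h


def reindex(old_shards, num_shards, route_fn):
    """Copy all documents into new shards chosen by route_fn.

    old_shards is a list of dicts (doc id -> source). Duplicate ids across
    old shards collapse to one document; the last copy read wins.
    """
    new_shards = [dict() for _ in range(num_shards)]
    for shard in old_shards:
        for doc_id, source in shard.items():
            new_shards[route_fn(doc_id) % num_shards][doc_id] = source
    return new_shards


if __name__ == "__main__":
    # Made-up old index in which id "a" was written to two different shards.
    old = [{"a": 1, "b": 2}, {"a": 3}]
    new = reindex(old, 3, route)
    print(sum(len(shard) for shard in new))  # two documents remain, one per id
```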

@nariman-haghighi

The workaround of re-indexing into a new index worked for 2 of 3 indices, but something peculiar happens with the 3rd index. It has 454 documents (all with unique IDs), but the attempt to re-index with the bulk API results in 1 document in the destination index. No changes to mappings. Maybe a NEST issue? /cc: @Mpdreamz

@jpountz
Contributor Author

jpountz commented Jun 6, 2014

@nariman-haghighi Have you checked if the bulk response contained errors? Additionally, http://www.elasticsearch.org/blog/tool-help-routing-issues-elasticsearch-1-2-0/ might help get your indices back to normal.
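
Checking a bulk response for errors can be sketched as below, working on the parsed JSON body as a plain dict. The shape assumed here (a top-level `items` list in which each entry has a single action key such as `index`, mapping to a result with `_id`, `status`, and an optional `error`) should be verified against the response your client actually returns. `bulk_failures` is a hypothetical helper and the sample response is made up.

```python
def bulk_failures(response):
    """Collect (action, _id, status, error) for every failed item in a bulk response."""
    failures = []
    for item in response.get("items", []):
        # Each item is expected to hold one key naming the action
        # (index, create, update, or delete) and its per-document result.
        for action, result in item.items():
            if "error" in result or result.get("status", 200) >= 400:
                failures.append(
                    (action, result.get("_id"), result.get("status"), result.get("error"))
                )
    return failures


if __name__ == "__main__":
    # Made-up response for illustration only.
    sample = {
        "errors": True,
        "items": [
            {"index": {"_index": "dest", "_id": "1", "status": 201}},
            {"index": {"_index": "dest", "_id": "2", "status": 400,
                       "error": "example error text"}},
        ],
    }
    for failure in bulk_failures(sample):
        print(failure)
```

A bulk request can return success at the HTTP level while individual items fail, so inspecting the per-item results is the only way to see which documents were rejected.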

@clintongormley clintongormley changed the title from "Routing: Restore shard routing." to "Restore shard routing." Jun 7, 2015
@clintongormley clintongormley added the :Core/Infra/Core label Jun 7, 2015
Successfully merging this pull request may close these issues.

Two documents with the same _id
9 participants