Reindexing losing data? #11435
Hi @ryanbaldwin. Could you tell us more about your mappings, and exactly how you do the bulk and scan/scroll? Also, in scan/scroll, could you check for shard failures, and check your logs to see whether any exceptions are reported. See #11419 (comment) for a similar issue. Also, could you give us the output of …
Are you calling refresh before you do the count call? How many docs are you missing, and how is the index created? Can you provide more info?
Hey all. I'm not at home right now, but I'll provide a detailed explanation when I'm able to, maybe later tonight or tomorrow at the latest. For now I'll go by memory while fat-thumbing on my phone. High-level answers:
3. After the bulk call I repeat step 2 using the scroll id returned by the previous scroll call. Wash, rinse, repeat until the scroll call returns 0 hits. Like I said, it seems as though I'm missing 1 doc for each scan/bulk cycle. Perhaps it's something in my script, but after specifying the scan size in the original scan call I never rely on that number again; I simply iterate over every document. I can provide more details later, such as an excerpt of the bulk calls. For now: thoughts?
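For reference, the scan/scroll-and-bulk cycle Ryan describes might look roughly like this in Python with the elasticsearch-py client. The index names and batch size of 500 come from the thread; everything else (client setup, query) is illustrative:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Open the scroll with the initial search (batch size 500, as in the thread).
resp = es.search(index="audit_v1", scroll="5m", size=500,
                 body={"query": {"match_all": {}}})
scroll_id = resp["_scroll_id"]

while True:
    hits = resp["hits"]["hits"]
    if not hits:
        break  # wash, rinse, repeat until the scroll returns 0 hits
    # One action line plus one source line per document; the client
    # serializes this list into a well-formed newline-delimited body.
    actions = []
    for hit in hits:
        actions.append({"index": {"_index": "audit_v2",
                                  "_type": hit["_type"],
                                  "_id": hit["_id"]}})
        actions.append(hit["_source"])
    es.bulk(body=actions)
    # Fetch the next batch using the scroll id from the previous call.
    resp = es.scroll(scroll_id=scroll_id, scroll="5m")
    scroll_id = resp["_scroll_id"]
```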
Thanks Ryan for the details. Quick question: do you use parent/child documents or custom routing?
Negative. The setup is pretty stock: the default 5 shards + 1 replica, and however the docs get routed is how they get routed. That said, I AM using a dynamic mapping template on the target index, but it mostly just sets 99% of the incoming string values to not_analyzed, since this is explicit audit data and not something that really requires full-text search.

As a side note, here's some possibly useful information about the topology: I have 2 ES servers, each in their own Docker container, each configured identically, and each running on the same host. Each ES server has its own persistent logs/data volumes on the host (i.e. they are not sharing the same logs/data directories). Sitting in front is an nginx doing simple round-robin balancing between the two. The Clojure app does everything through nginx, as do the manual queries I run via Sense. As far as I'm aware, that topology with Docker should roughly approximate (at a minimum) what 2 separate instances on 2 separate hosts in a network would look like.
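For context, a dynamic template of the kind Ryan describes, mapping incoming string fields as not_analyzed, could look like the following. This is an illustrative sketch in ES 1.x-era syntax created via the elasticsearch-py client, not his actual mapping:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Illustrative only: every dynamically mapped string field becomes
# not_analyzed, suitable for exact-match audit data rather than
# full-text search.
es.indices.create(index="audit_v2", body={
    "mappings": {
        "_default_": {
            "dynamic_templates": [{
                "strings_not_analyzed": {
                    "match_mapping_type": "string",
                    "mapping": {"type": "string", "index": "not_analyzed"}
                }
            }]
        }
    }
})
```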
Also: no parent/child docs. Just 45k documents, each one an audit event, and each one completely independent.
Hi @ryanbaldwin
This sounds a lot like a bug in your code.
An easy way to test this would be to use a client module known to work. If you're familiar with Perl, you could install the Search::Elasticsearch module (see https://metacpan.org/pod/Search::Elasticsearch) and run a short reindex-and-count script, updating the index names for your local setup.
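Clinton's Perl script itself is not preserved in this thread. As an illustration of the same check, a rough Python equivalent using the elasticsearch-py client could look like this (index names from the thread; everything else is assumed):

```python
from elasticsearch import Elasticsearch
from elasticsearch.helpers import scan, bulk

es = Elasticsearch("http://localhost:9200")

# Copy every document from audit_v1 to audit_v2 with a known-good helper,
# then refresh and compare counts against the source index.
actions = ({"_index": "audit_v2", "_type": hit["_type"],
            "_id": hit["_id"], "_source": hit["_source"]}
           for hit in scan(es, index="audit_v1"))
ok, errors = bulk(es, actions, chunk_size=500, raise_on_error=False)
print("indexed:", ok, "errors:", errors)

es.indices.refresh(index="audit_v2")
print("source:", es.count(index="audit_v1")["count"])
print("target:", es.count(index="audit_v2")["count"])
```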
Ugh. Clinton, you are indeed correct. I made the classic _bulk error: I did not append a "\n" to the final document body, hence the one missing document per scan/bulk cycle. Very sorry, but thank you for your help. Also, Clinton, I must congratulate you on the Elasticsearch: The Definitive Guide book. It is, by far, the best tech book I've read in over a decade: extremely easy to understand, an excellent voice, and absolute gold on every page (including the "don't forget to put a \n after the last document when using the bulk api!", which I obviously, promptly, forgot). Huge kudos to you and Zach. Thanks for your help, and again, my apologies for the false alarm.
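For anyone who lands here with the same symptom: the _bulk body is newline-delimited JSON, and the final line must end with "\n" too. A minimal sketch of building such a body by hand (the index, type, and documents are made up):

```python
import json

docs = [{"user": "alice", "action": "login"},
        {"user": "bob", "action": "logout"}]

lines = []
for i, doc in enumerate(docs):
    # One action line plus one source line per document.
    lines.append(json.dumps({"index": {"_index": "audit_v2",
                                       "_type": "event", "_id": i}}))
    lines.append(json.dumps(doc))

# The trailing "\n" is mandatory. On old versions (like the 1.5.0 in this
# issue) a body without it could silently drop the last document; newer
# versions reject the request instead.
body = "\n".join(lines) + "\n"
```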
Kind words @ryanbaldwin, thank you :) /cc @polyfractal
I'm experiencing the same issue, but I'm using the Reindex API, and there's an additional caveat: testing the Reindex call in my local Docker environment never misses a document, but doing the same thing in the Docker Swarm architecture for our development and beta environments loses all the data. I'm using Elixir and its Tirexs client to communicate with Elasticsearch. Let me show the very basic tests I'm trying:
Pretty naive and straightforward. This is the feedback I get when running this procedure:
Look at the difference in the output for points 2) and 5): something prevented my 4 documents from being reindexed.
@sebastialonso Please ask questions like these in the forum. The most likely explanation is that your temp index hadn't refreshed before you started the second reindex, so no documents were visible to search.
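Sketched in Python rather than the poster's Elixir (index names are hypothetical), the fix is an explicit refresh between the two reindex calls, so the documents copied into the temp index are visible to the second one:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

es.reindex(body={"source": {"index": "original"},
                 "dest": {"index": "temp"}},
           wait_for_completion=True)
es.indices.refresh(index="temp")  # make the copied docs searchable
es.reindex(body={"source": {"index": "temp"},
                 "dest": {"index": "original_v2"}},
           wait_for_completion=True)
```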
We have around 700k records, and I managed to fix this by adding timeouts of 10 seconds in 2 places. Without the timeouts, in our case, it was always missing a few thousand records.
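Those sleeps most likely work by waiting out the index's refresh interval (1 s by default), after which newly indexed documents become visible. A deterministic alternative, sketched here with the elasticsearch-py client and a hypothetical index name, is an explicit refresh instead of a fixed timeout:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Instead of time.sleep(10), force a refresh so everything indexed so far
# is visible to search and _count before the next step runs.
es.indices.refresh(index="audit_v2")
```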
This still happens on 8.3.1.
Hi,
I've been playing around with ES for the purpose of introducing it into our organization for searching audit data. Part of my adventure is data modelling and playing with mappings.
I had a small index called "audit_v1", which has 43,754 documents. I created a second index called "audit_v2", then did a scan & scroll and bulk-created the 43,754 documents 500 at a time into the new index. I've done this 5 times now, and every single time I see fewer records showing up in the audit_v2 index, and it's the same (lower) number every single time. This is despite the _bulk API not reporting any errors in the response. From what I can tell all the documents should be there, but apparently they aren't.
Is this a legit bug, or is it possible I'm misunderstanding exactly what [index]/_count returns?
Sorry for the trivial question, but I can't find a clear answer online. I'm using v1.5.0.
Thanks