Deleted documents show up in completion suggester #117

Closed
johnywith1n opened this issue Jul 7, 2014 · 17 comments

Comments

@johnywith1n johnywith1n commented Jul 7, 2014

I have the following steps:

  1. create an index
  2. index 2 documents
  3. get suggestions
  4. delete a document
  5. get suggestions

When I do this with a set of curl commands, the 5th step doesn't return any suggestions as expected since I deleted the document corresponding to it. But when I do these steps with the library, a suggestion is returned for the deleted document.

Here's the project with the script and curl commands: https://github.com/johnywith1n/elasticsearch-problem

Member

@spalger spalger commented Jul 7, 2014

Hey @johnywith1n

Here are a few things that might help you:

  1. there is no reason to use q.promisifyAll. We will return a promise if you don't send a callback.
  2. you can pass refresh: true to most commands that modify documents (including delete and create).
  3. I'm not sure that sending index: '' is doing what you want. Try my other suggestions and let me know if you are still experiencing the issue.
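
A minimal sketch of points 1 and 2, assuming the elasticsearch-js client of that era. The helper name `deleteWithRefresh` is hypothetical, and the index and type name `test` are taken from the curl output quoted later in this thread rather than from the original script.

```javascript
// Hypothetical helper: delete one document and ask Elasticsearch to refresh
// the affected shards right away, so the delete is visible to later searches.
// `client` is assumed to be an elasticsearch-js client instance.
function deleteWithRefresh(client, id) {
  // No callback is passed, so the client is expected to return a promise.
  return client.delete({
    index: 'test', // assumed index name
    type: 'test',  // assumed type name
    id: id,
    refresh: true
  });
}

// Usage (requires a running cluster):
//   deleteWithRefresh(client, '1').then(function (resp) { console.log(resp); });
```

Because the promise comes straight from the client, no promisifyAll wrapper should be needed around this call.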

Author

@johnywith1n johnywith1n commented Jul 7, 2014

I'm using promisifyAll since elasticsearch.js is on bluebird 1.x and not 2.x.

Adding refresh: true worked. But I'm curious, what's the difference between passing in refresh to the create and delete commands and client.indices.refresh?

Also, isn't setting refresh: true on every operation bad for performance? I thought ES refreshes automatically, but these deleted documents still show up in the suggestions after a day.

Member

@spalger spalger commented Jul 7, 2014

Yes, setting refresh: true for every operation would result in bad performance, but I was under the impression you were just trying to get this script working. Seeing deleted documents show up in suggestions after days implies that something else entirely is happening (automatic refreshes happen every second by default).

I tried running your curl script and got:

{"acknowledged":true}
{"_index":"test","_type":"test","_id":"1","_version":1,"created":true}
{"_index":"test","_type":"test","_id":"2","_version":1,"created":true}
{"_shards":{"total":5,"successful":5,"failed":0}}
{"found":true,"_index":"test","_type":"test","_id":"1","_version":2}
{"_shards":{"total":5,"successful":5,"failed":0}}

I also get very similar output when I run the same commands in Marvel.

As for the promises bit: try passing defer: q.defer to the client constructor and it will use your version of bluebird.
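
As a sketch of what such a `defer` option might look like: the client configuration is understood to accept a function that returns an object with `promise`, `resolve`, and `reject`. The version below uses native promises so that it runs on its own; `createDefer` is a hypothetical name, and with q or bluebird one would return that library's deferred object instead.

```javascript
// Hypothetical defer factory built on native promises.
// It returns the { promise, resolve, reject } shape the client is assumed to expect.
function createDefer() {
  var deferred = {};
  deferred.promise = new Promise(function (resolve, reject) {
    deferred.resolve = resolve;
    deferred.reject = reject;
  });
  return deferred;
}

// Usage (assumes the elasticsearch package is installed):
//   var client = new elasticsearch.Client({ host: 'localhost:9200', defer: createDefer });
```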

@spalger spalger closed this Jul 7, 2014
Member

@spalger spalger commented Jul 7, 2014

This is likely not an issue with the client itself, but I didn't mean to close the ticket.

@spalger spalger reopened this Jul 7, 2014
Author

@johnywith1n johnywith1n commented Jul 7, 2014

Yea, I think you're right about it not being the client.

I tried refreshing the index, but that didn't update the suggester. It looks like the completion suggester isn't updated with regard to deleted documents until a merge occurs. But a merge doesn't occur until there are enough deleted documents. So in the meantime, deleted documents will still be suggested.

Is the completion suggester supposed to only be updated by a merge and not just a refresh? I may have to switch to just a regular search if that's the case.

Member

@spalger spalger commented Jul 7, 2014

Found this in the docs: "The suggest data structure might not reflect deletes on documents immediately. You may need to do an Optimize for that. You can call optimize with the only_expunge_deletes=true to only cater for deletes or alternatively call a Merge operation."

Right at the bottom of this section
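
With the 1.x-era client, the call the docs describe might look like the sketch below. It assumes the client exposes `indices.optimize` and accepts the camelCased `onlyExpungeDeletes` parameter; `expungeDeletes` is a hypothetical helper name.

```javascript
// Hypothetical helper: ask Elasticsearch to merge away segments' deleted
// documents, which is what the quoted docs suggest for stale suggestions.
// `client` is assumed to be an elasticsearch-js client instance.
function expungeDeletes(client, index) {
  return client.indices.optimize({
    index: index,
    onlyExpungeDeletes: true // maps to only_expunge_deletes=true
  });
}

// Usage (requires a running cluster):
//   expungeDeletes(client, 'test').then(function (resp) { console.log(resp); });
```

Optimize is a heavy operation, so this is better suited to occasional cleanup than to running after every delete.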

@spalger spalger closed this Jul 7, 2014
Contributor

@missinglink missinglink commented Aug 28, 2014

FWIW I have the same issue and running _optimize doesn't fix the issue. This is clearly not a client bug.

I use the suggester a load and it only happens once in a blue moon. In this case I was running an import on 8 cores maxing out the CPU and ES got a bit confused; now those suggest records are stuck in there forever even after their documents have been deleted.


@svola svola commented Jan 28, 2015

For me, calling optimize after deleting documents worked.
The completion suggester was updated then and only then.


@ebuildy ebuildy commented Feb 16, 2015

I am running into the same weird behavior: documents are deleted but still show up in the completion suggester. I tried _optimize, cache/_clear, etc., and they are still present.

How can I debug this? (I mean for instance how can I get the document ID the suggest search gives...)

Thanks,


@svola svola commented Feb 17, 2015

You can add the id of the document to the payload-section at index-time:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-suggesters-completion.html#indexing

For me it worked after calling optimize.... Good luck!
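
A sketch of what that could look like for the 1.x completion suggester, assuming the completion field is named `suggest` and its mapping sets `payloads: true`. The helper `buildSuggestDoc` and the field names are illustrative, not taken from anyone's actual index.

```javascript
// Hypothetical helper: build a document body whose completion field carries
// the document id as a payload, so each suggestion can be traced back to
// the document that produced it.
function buildSuggestDoc(id, title) {
  return {
    title: title,
    suggest: {            // assumed name of the completion field
      input: [title],     // text the suggester matches against
      output: title,      // text returned in the suggestion
      payload: { id: id } // returned alongside each suggestion option
    }
  };
}

// Usage (requires a running cluster and a mapping with payloads enabled):
//   client.index({ index: 'test', type: 'test', id: '1', body: buildSuggestDoc('1', 'Nirvana') });
```

Each option in the suggest response should then carry the payload, which makes it possible to check whether a suggestion belongs to a document that has already been deleted.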


@ebuildy ebuildy commented Feb 18, 2015

I have a ton of documents. It works sometimes (even without optimize) but sometimes not. I believe the suggester uses an FST engine, so maybe orphans are created after a delete operation. I will investigate how ES deals with this and with replication.


@micpalmia micpalmia commented Feb 19, 2015

I'm having the same issue. Has anybody investigated and/or opened a ticket on the ES GitHub already?


@mcanzerini mcanzerini commented Mar 24, 2015

I'm having the same issue too. Any suggestions?

Contributor

@missinglink missinglink commented Mar 24, 2015

It's a known issue [1] but will likely not be fixed until the es@2.0 release [2].

It's a problem related to the FST data structures and certainly not a bug in elasticsearch-js

[1] elastic/elasticsearch#7761
[2] elastic/elasticsearch#8909


@hgw2101 hgw2101 commented Mar 2, 2016

Does calling optimize every time after deleting a document cause any performance concerns?

Member

@spalger spalger commented Mar 3, 2016

@hgw2101 yes, it does.


@Kamapcuc Kamapcuc commented Oct 25, 2016

The same bug occurs when you change the suggest context of a document. 😞

9 participants