Duplicate IDs when indexing #1716

Closed
philippeauriach opened this issue Sep 22, 2021 · 8 comments

Comments

@philippeauriach

Describe the bug
Uploading several batches of the same content results in duplicate IDs in Meilisearch. The objects come from a Strapi service and are sent to a Meilisearch instance (local Docker, latest image: 0.22.0) via the meilisearch Node.js package.

Full reproduction project available at: https://github.com/philippeauriach/meilisearch-dup-id

To Reproduce

  • clone https://github.com/philippeauriach/meilisearch-dup-id
  • configure your meilisearch instance endpoint in hooks/meilisearch/utils.js if needed
  • yarn install
  • yarn develop
  • => you should have 27 products in your meilisearch instance (expected behavior)
  • kill the process, then run yarn develop again
  • => meilisearch now has 54 products (broken behavior, products are duplicated)

Expected behavior
Products with the same ID should be replaced, not duplicated.

Screenshots
[Screenshot 2021-09-22 at 22:49:59]

MeiliSearch version: v0.22.0

Additional context
Strapi ?

@curquiza
Member

Thank you so much for the report!
@ManyTheFish is investigating it 🙂

@curquiza curquiza added the bug Something isn't working as expected label Sep 23, 2021
@curquiza curquiza added this to Candidates in Bug triage via automation Sep 23, 2021
@curquiza curquiza moved this from Candidates to Bugs - severity 1 🔥 in Bug triage Sep 23, 2021
bors bot added a commit to meilisearch/milli that referenced this issue Sep 23, 2021
369: Add test checking the bug reported in meilisearch issue 1716 r=Kerollmops a=ManyTheFish

The bug is not present in the newer milli version.

Related to [Meilisearch#1716](meilisearch/meilisearch#1716)

Co-authored-by: many <maxime@meilisearch.com>
@ManyTheFish
Member

ManyTheFish commented Sep 23, 2021

Hello @philippeauriach! Thanks for your bug report, this should be fixed in the next version of Meilisearch.

The bug occurs because a document addition of 0 documents is made between other document additions.
To work around the bug in your application, you can change your code slightly to avoid empty document additions:

philippeauriach/meilisearch-dup-id#1
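
As a rough illustration of the idea (not the exact change in the linked pull request), a small guard can skip the call when a batch is empty. The helper name `addDocumentsIfAny` is hypothetical; `index` is assumed to be an object exposing `addDocuments(documents)`, as the index objects of the meilisearch Node.js package do.

```javascript
// Hypothetical helper: forwards a batch to index.addDocuments only when it
// contains at least one document, and returns null when the batch is skipped.
// `index` is assumed to expose addDocuments(documents).
function addDocumentsIfAny(index, documents) {
  if (!Array.isArray(documents) || documents.length === 0) {
    // Nothing to send, so no empty document addition reaches the server.
    return null;
  }
  return index.addDocuments(documents);
}

// Possible usage (the index name 'products' is an assumption):
//   const index = client.index('products');
//   await addDocumentsIfAny(index, products);
```

Returning null for a skipped batch lets the caller tell "nothing was sent" apart from a real update.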

Thanks again for your issue! 👍

@curquiza curquiza added this to the v0.23.0 milestone Sep 23, 2021
@curquiza curquiza changed the title Duplicate IDs when indexing from nodejs meilisearch (and strapi ?) Duplicate IDs when indexing Sep 23, 2021
@khash

khash commented Sep 28, 2021

FWIW, I've come across this issue with a pure Ruby client. I don't know whether it is the same issue, but here is how you can recreate it:

(in this example index = client.index('acme'))

  1. Add some docs: index.add_documents([{id: "foo"}, {id: "bar"}])
  2. Add an empty array: index.add_documents([])

This breaks the index in such a way that, while the API still returns the documents correctly with /indexes/acme/documents, a GET on /indexes/acme/documents/foo returns a 404.

This means that if you run the first step again (add docs), you will end up with duplicates.

This is caused by adding an empty array. Here is how you can reproduce it with only curl on a clean setup:

$ curl -X POST http://localhost:7700/indexes/acme/documents --data '[{"id": "foo"}, {"id":"bar"}]'

$ curl http://localhost:7700/indexes/acme/documents/foo
{"id":"foo"}

$ curl -X POST http://localhost:7700/indexes/acme/documents --data '[]'

$ curl http://localhost:7700/indexes/acme/documents/foo
{"message":"Document with id foo not found.","errorCode":"document_not_found","errorType":"invalid_request_error","errorLink":"https://docs.meilisearch.com/errors#document_not_found"}

All the while, curl http://localhost:7700/indexes/acme/documents returns the correct set of documents which makes things very confusing.
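
To check whether an index has already been affected, one option is to scan the document listing for IDs that appear more than once. The sketch below assumes the listing has already been fetched and parsed into an array of objects; the function name `findDuplicateIds` and the default primary key `id` are assumptions for illustration.

```javascript
// Hypothetical helper: returns the primary-key values that occur more than
// once in a list of documents (for example, the parsed JSON response of
// GET /indexes/acme/documents).
function findDuplicateIds(documents, primaryKey = 'id') {
  const counts = new Map();
  for (const doc of documents) {
    const id = doc[primaryKey];
    counts.set(id, (counts.get(id) || 0) + 1);
  }
  // Keep only the IDs seen at least twice, in first-seen order.
  return [...counts].filter(([, n]) => n > 1).map(([id]) => id);
}
```

An empty result means no duplicates were found in the listing that was passed in; it says nothing about documents outside that page of results.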

@curquiza
Member

Hello @khash! Your bug should not be present in the next version. I could not reproduce your issue on our branch, which will be merged into main soon :)

@curquiza
Member

Closed by #1711

Bug triage automation moved this from Bugs - severity 1 🔥 to Done Sep 29, 2021
@Arsapol

Arsapol commented Nov 7, 2022

I am facing this problem in version 0.29.1 when I update the index settings more than once.

Before the update: everything works fine
After the update: documents are duplicated with the same id

@curquiza
Member

curquiza commented Nov 7, 2022

Hello @Arsapol

Is it related to this? #3021

Otherwise, can you open an issue with the detailed steps to reproduce?

@Kerollmops
Member

Indeed it looks related to #3021 and will be fixed in the next release of Meilisearch, i.e. v0.30.
