Unable to set _id in bulk index with raw source documents #2861
Labels: type: bug
Comments
spring-projects-issues added the status: waiting-for-triage label on Feb 28, 2024
sothawo added the status: waiting-for-feedback label and removed the status: waiting-for-triage label on Feb 28, 2024
I'm sorry, I posted the code with that mistake in it. Please see my edited post. The problem is not a compilation error (that was a copy-and-paste mistake on my part) but the "withId" call: whatever value you pass there is ignored in the bulk request.
spring-projects-issues added the status: feedback-provided label and removed the status: waiting-for-feedback label on Feb 28, 2024
sothawo added the type: bug label and removed the status: feedback-provided label on Feb 28, 2024
sothawo added a commit that referenced this issue on Feb 28, 2024
I tried to bulk index a set of raw JSON records into Elasticsearch and needed to set a custom _id value for each of them. Indexing a single document works: building the query with "IndexQueryBuilder().withId(some_id_value)" and passing it to the individual index method stores the document under that _id. The "bulkIndex" method, however, ignores the _id value set this way.
Here's the code that ignores the ".withId" call:
It would be useful (if not essential) for users to be able to set the _id of each individual record sent in a bulk request.
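For context, at the Elasticsearch REST level a bulk request is newline-delimited JSON in which each source document is preceded by an action line, and the per-document _id travels in that action line. The sketch below only illustrates that wire format using the Java standard library; it is not Spring Data Elasticsearch code and not the library's implementation. The class name, method names, and the "records" index name are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that illustrates the bulk wire format only.
final class BulkBodySketch {

    // Minimal escaping (backslash and double quote) for values placed inside a JSON string.
    static String escape(String value) {
        return value.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Builds an NDJSON bulk body: one action line carrying _index and _id,
    // followed by the raw source document. Each raw source must be single-line JSON.
    static String buildBulkBody(String index, Map<String, String> sourcesById) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> entry : sourcesById.entrySet()) {
            body.append("{\"index\":{\"_index\":\"").append(escape(index))
                .append("\",\"_id\":\"").append(escape(entry.getKey()))
                .append("\"}}\n");
            body.append(entry.getValue()).append('\n');
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the output order is predictable.
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("custom-1", "{\"name\":\"first\"}");
        docs.put("custom-2", "{\"name\":\"second\"}");
        System.out.print(buildBulkBody("records", docs));
    }
}
```

If the _id is missing from the action line, Elasticsearch generates one itself, which matches the symptom described here: the documents are indexed, but not under the requested _id.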
As a workaround, I looped over the individual IndexQuery objects and sent them to ES one by one. That sets the _id value correctly but is far slower: for my ~2 million JSON records, elapsed time rises from 15-20 minutes (bulk requests in batches of 2000 records) to about 3 hours.
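The batching mentioned above can be sketched as a simple partition of the record list into fixed-size chunks, with each chunk then sent as one bulk request. This is a minimal standard-library sketch, not the code from this report; the class and method names are hypothetical, and the actual bulk call is left as a comment.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for splitting records into bulk-sized batches.
final class BatchSketch {

    // Splits items into consecutive batches of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the sublist so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> records = new ArrayList<>();
        for (int i = 0; i < 4500; i++) {
            records.add(i);
        }
        for (List<Integer> batch : partition(records, 2000)) {
            // In a real application, each batch would be passed to one bulk call here.
            System.out.println("batch of " + batch.size() + " records");
        }
    }
}
```

With 4500 records and a batch size of 2000, this yields three batches of 2000, 2000, and 500 records, so the number of round trips drops from one per record to one per batch.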