Add pipeline to ensure unique Enrich index documents #46348
This reverts commit 84d025f.
Pinging @elastic/es-core-features
This PR build failed because node startup took too long in the enrich QA security module.
I've seen this in other PR builds too (#46351). Not sure what the actual cause is here. Maybe a slow build node?
Looks good! I left two small comments.
.../plugin/enrich/src/main/java/org/elasticsearch/xpack/enrich/EnrichPolicyReindexPipeline.java
IndexResponse indexRequest = client().index(new IndexRequest()
    .index(sourceIndex)
    .id(collidingDocId)
    .source(
I think we should set a specific _routing value here too and verify that it is no longer present in the .enrich index. Normally, if a _routing value is specified, it gets indexed into a _routing field that is also queryable, so testing that it is queryable in the source index and not in the enrich index should be sufficient.
LGTM
Adds a pipeline that removes ids and routing from documents before indexing them into enrich indices. Enrich documents may come from multiple source indices, so their ids can collide. This pipeline ensures that documents with colliding ids do not clobber one another during the reindex operation that runs when an enrich policy is executed.
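To illustrate the collision problem described above, here is a minimal toy model in plain Java. It is only a sketch: a `Map` stands in for the target index, and every name in it (`EnrichReindexSketch`, `SourceDoc`, `reindexKeepingIds`, `reindexDroppingIds`) is hypothetical and not taken from the Elasticsearch code base or from this PR. It assumes that a document written without an id receives a freshly generated one, which is modelled here with a random UUID.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Toy model only: a Map keyed by document id stands in for the enrich index.
// All names here are hypothetical and do not mirror the Elasticsearch API.
public class EnrichReindexSketch {

    // A source document: originating index, id, optional routing, and one field value.
    public static final class SourceDoc {
        final String index;
        final String id;
        final String routing;
        final String value;

        public SourceDoc(String index, String id, String routing, String value) {
            this.index = index;
            this.id = id;
            this.routing = routing;
            this.value = value;
        }
    }

    // Reindex while keeping the original id: a later document with the same id
    // overwrites the earlier one.
    public static Map<String, String> reindexKeepingIds(List<SourceDoc> docs) {
        Map<String, String> target = new LinkedHashMap<>();
        for (SourceDoc doc : docs) {
            target.put(doc.id, doc.value);
        }
        return target;
    }

    // Reindex after dropping id and routing: every document is stored under a
    // freshly generated id (assumption: modelled with a random UUID), so
    // nothing is overwritten.
    public static Map<String, String> reindexDroppingIds(List<SourceDoc> docs) {
        Map<String, String> target = new LinkedHashMap<>();
        for (SourceDoc doc : docs) {
            String generatedId = UUID.randomUUID().toString();
            target.put(generatedId, doc.value);
        }
        return target;
    }

    public static void main(String[] args) {
        // Two documents from different source indices that share the id "1".
        List<SourceDoc> docs = Arrays.asList(
            new SourceDoc("source-a", "1", "some-routing", "from a"),
            new SourceDoc("source-b", "1", null, "from b"));

        System.out.println("keeping ids:  " + reindexKeepingIds(docs).size() + " document(s)");
        System.out.println("dropping ids: " + reindexDroppingIds(docs).size() + " document(s)");
    }
}
```

In this model, keeping the ids leaves a single document in the target because the second write replaces the first, while dropping them preserves both.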