Fix inner hits + aggregations concurrency bug #128036

Merged
benchaplin merged 21 commits into elastic:main from inner_hits_aggs_concurrency_bug on Jun 2, 2025

benchaplin
Contributor

Resolves #122419.

There's a concurrency bug that occurs when doing aggregations on inner hits. It can result in one of three exceptions:

  • java.lang.IllegalStateException: Error retrieving path
  • java.lang.NullPointerException: Cannot invoke "java.util.Map.get(Object)" because "this.preloadedStoredFieldValues" is null
  • java.lang.AssertionError: invalid decRef call: already closed

The underlying issue is that InnerHitSubContext is not thread-safe, yet instances are shared across the leaf slice search threads during an aggregation. Specifically, the race occurs when the InnerHitSubContext.rootId and InnerHitSubContext.rootSource fields are set and read concurrently by multiple threads.
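
To make the failure mode concrete, here is a minimal sketch of the interleaving described above. It is not the Elasticsearch code: SharedHitContext and SharedContextRaceSketch are hypothetical stand-ins, and only the field names rootId and rootSource come from the description. The two "threads" are replayed by hand on a single thread so the outcome is deterministic.

```java
import java.util.Map;

// Hypothetical illustration only; none of these classes exist in Elasticsearch.
final class SharedContextRaceSketch {
    // Replays one bad interleaving of two slice threads (A and B) that share
    // a single context instance. The steps run sequentially on purpose.
    static String replay() {
        SharedHitContext shared = new SharedHitContext();

        // Thread A starts processing its root hit.
        shared.setRootId("doc-A");
        shared.setRootSource(Map.of("owner", "A"));

        // Thread B runs before A reads the fields back and overwrites them.
        shared.setRootId("doc-B");
        shared.setRootSource(Map.of("owner", "B"));

        // Thread A resumes and reads state that now belongs to B.
        return shared.rootId() + "/" + shared.rootSource().get("owner");
    }

    public static void main(String[] args) {
        System.out.println("thread A expected doc-A/A but read " + replay());
    }
}

// Assumed shape: a context object with mutable per-hit fields and no locking.
final class SharedHitContext {
    private String rootId;
    private Map<String, Object> rootSource;

    void setRootId(String rootId) { this.rootId = rootId; }
    void setRootSource(Map<String, Object> rootSource) { this.rootSource = rootSource; }
    String rootId() { return rootId; }
    Map<String, Object> rootSource() { return rootSource; }
}
```

In a real search the overwrite happens at an unpredictable point, which would be consistent with the same bug surfacing as different exceptions from run to run.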

The tests I've added to TopHitsIT reproduce the issue. If you paste them into main and run them a few times, you should see one of the exceptions.

I've solved this by forking the InnerHitSubContext instances, similar to what was done in #106990. SearchExecutionContext is at times accessed from InnerHitSubContext, so I also had to make sure the forked SearchExecutionContext is used in those cases.
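
The forking idea can be sketched roughly as follows. This is an assumption-laden illustration, not the change in this PR: ForkableHitContext, its fork() method, and ForkPerSliceSketch are hypothetical names, and the sketch leaves out the SearchExecutionContext handling mentioned above. The point is only that each slice thread works on its own copy of the mutable state while immutable configuration is still shared.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; none of these classes exist in Elasticsearch.
final class ForkPerSliceSketch {
    // Runs one task per slice. Each task forks the shared template, writes its
    // own rootId into the fork, and returns what it reads back.
    static List<String> runSlices(int slices) {
        ForkableHitContext template = new ForkableHitContext("inner_hits_name");
        ExecutorService pool = Executors.newFixedThreadPool(slices);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < slices; i++) {
                final String id = "doc-" + i;
                futures.add(pool.submit(() -> {
                    ForkableHitContext local = template.fork(); // private copy per task
                    local.setRootId(id);
                    Thread.yield(); // give other tasks a chance to interleave
                    return local.name() + ":" + local.rootId();
                }));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runSlices(4));
    }
}

final class ForkableHitContext {
    private final String name; // immutable configuration, safe to share
    private String rootId;     // mutable per-hit state, never shared after fork()

    ForkableHitContext(String name) { this.name = name; }

    // Returns a fresh instance that shares configuration but not per-hit state.
    ForkableHitContext fork() { return new ForkableHitContext(name); }

    void setRootId(String rootId) { this.rootId = rootId; }
    String rootId() { return rootId; }
    String name() { return name; }
}
```

Because no task ever writes to the template or to another task's fork, every task reads back the id it wrote, regardless of scheduling.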

benchaplin requested a review from javanna on May 12, 2025 19:52
benchaplin added the >bug, Team:Search Foundations, :Search Foundations/Search, v8.19.0, and v9.1.0 labels on May 12, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-foundations (Team:Search Foundations)

@elasticsearchmachine
Collaborator

Hi @benchaplin, I've created a changelog YAML for you.

benchaplin force-pushed the inner_hits_aggs_concurrency_bug branch from 3647f68 to ff7d042 on May 21, 2025 21:33
benchaplin requested a review from javanna on June 2, 2025 13:20
Member

javanna left a comment

LGTM great work

benchaplin merged commit 13bce60 into elastic:main on Jun 2, 2025
18 checks passed
benchaplin added the auto-backport label on Jun 2, 2025
benchaplin added a commit to benchaplin/elasticsearch that referenced this pull request Jun 2, 2025
Fork InnerHitSubContext instances before source is fetched in 
aggregations to prevent inter-segment race conditions.

Relates to elastic#122419
elasticsearchmachine pushed a commit that referenced this pull request Jun 2, 2025
@javanna
Member

javanna commented Jun 3, 2025

Hey @benchaplin I think it makes sense to backport this fix to 9.0 as well. Thoughts?

mridula-s109 pushed a commit to mridula-s109/elasticsearch that referenced this pull request Jun 3, 2025
benchaplin added a commit to benchaplin/elasticsearch that referenced this pull request Jun 3, 2025
@benchaplin
Contributor Author

> Hey @benchaplin I think it makes sense to backport this fix to 9.0 as well. Thoughts?

Ah, agreed. Thanks for catching this, I got a little mixed up with versions.

elasticsearchmachine pushed a commit that referenced this pull request Jun 3, 2025
joshua-adams-1 pushed a commit to joshua-adams-1/elasticsearch that referenced this pull request Jun 3, 2025
Samiul-TheSoccerFan pushed a commit to Samiul-TheSoccerFan/elasticsearch that referenced this pull request Jun 5, 2025
Labels

  • auto-backport: Automatically create backport pull requests when merged
  • >bug
  • :Search Foundations/Search: Catch all for Search Foundations
  • Team:Search Foundations: Meta label for the Search Foundations team in Elasticsearch
  • v8.19.0
  • v9.1.0

Development

Successfully merging this pull request may close these issues.

IllegalStateException during search: "Error retrieving path ..."
3 participants