Skip to content

Conversation

@s1monw
Copy link
Member

@s1monw s1monw commented Nov 21, 2018

While indexing we try to apply frozen deletes packages concurrently
on indexing threads if necessary. This is done in an opaque way via
IndexWriter#processEvents. Yet, when we commit or refresh we have to
ensure we apply all frozen update packages before we return.
Today we execute the apply method in a blocking fashion which is unnecessary
when we are in a IndexWriter#processEvents context, we block indexing
threads while they could just continue since it's already being applied.
We also might wait in BufferedUpdatesStream when we apply all necessary updates
were we can continue with other work instead of waiting.
This change also tries to apply the packages that are not currently applied
first in order to not unnecessarily block.

@s1monw
Copy link
Member Author

s1monw commented Nov 21, 2018

@mikemccand can you take a look

@s1monw
Copy link
Member Author

s1monw commented Nov 21, 2018

I recommend to look at the diff with ?w=1 -> https://github.com/apache/lucene-solr/pull/503/files?w=1

Copy link
Contributor

@jpountz jpountz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a look out of curiosity but I'm not familiar enough with this class to comment, I'll leave it to Mike. :)

…cessEvents

While indexing we try to apply frozen deletes packages concurrently
on indexing threads if necessary. This is done in an opaque way via
IndexWriter#processEvents. Yet, when we commit or refresh we have to
ensure we apply all frozen update packages before we return.
Today we execute the apply method in a blocking fashion which is unncessary
when we are in a IndexWriter#processEvents context, we block indexing
threads while they could just continue since it's already being applied.
We also might wait in BufferedUpdatesStream when we apply all necessary updates
were we can continue with other work instead of waiting.
This change also tries to apply the packages that are not currently applied
first in order to not unnecessarily block.
@s1monw s1monw force-pushed the optimize_apply_frozen_updates branch from 12a878e to 1203118 Compare November 21, 2018 13:02
@s1monw
Copy link
Member Author

s1monw commented Nov 21, 2018

pushed

@s1monw s1monw closed this Nov 21, 2018
@msokolov
Copy link
Contributor

I'm curious if you were able to measure a good improvement from this? I guess a large delete concurrent with indexing lots of docs would stress it?

@s1monw
Copy link
Member Author

s1monw commented Nov 25, 2018

@msokolov I measures a pretty signficant lock contention on this lock without my change. with the change it went down significantly. Refreshes got 30% faster for that benchmark (small docs with 25% collision on the IDs and soft-deletes enabled). There was some minor indexing throughput improvements but nothing major.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants