Switch Lucene DocValues codec to Mode.BEST_SPEED #86
Comments
@gautamworah96 We saw a performance increase for these tasks when compression was introduced, because it seems that the compressed format performs better for linear scans and worse for selective queries. So it's not unlikely that your change did not slow down these tasks.
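One possible intuition for the scan-versus-selective difference, sketched as a toy model: if values are compressed in blocks, a linear scan pays one decompression per block and spreads that cost over every value in it, while a selective query may pay a full block decompression for a single value. Everything here is an assumption made for illustration (the block size, the use of `zlib`, the `BlockReader` class); it is not Lucene's doc values format and it says nothing about absolute speed against uncompressed data.

```python
import struct
import zlib

BLOCK_SIZE = 32  # values per compressed block (arbitrary choice for this toy)


def build_blocks(values):
    """Pack ints into fixed-size blocks and compress each block independently."""
    blocks = []
    for start in range(0, len(values), BLOCK_SIZE):
        chunk = values[start:start + BLOCK_SIZE]
        raw = struct.pack(f"<{len(chunk)}q", *chunk)
        blocks.append(zlib.compress(raw))
    return blocks


class BlockReader:
    """Hypothetical reader that caches only the most recently decompressed block."""

    def __init__(self, blocks):
        self.blocks = blocks
        self.cached_index = -1
        self.cached_values = ()
        self.decompressions = 0

    def get(self, doc_id):
        block_index, offset = divmod(doc_id, BLOCK_SIZE)
        if block_index != self.cached_index:
            # Cache miss: the whole block must be decompressed to read one value.
            raw = zlib.decompress(self.blocks[block_index])
            self.cached_values = struct.unpack(f"<{len(raw) // 8}q", raw)
            self.cached_index = block_index
            self.decompressions += 1
        return self.cached_values[offset]


values = [i % 12 for i in range(1024)]
blocks = build_blocks(values)

# Linear scan: visit every document in order.
scan = BlockReader(blocks)
scan_total = sum(scan.get(doc) for doc in range(len(values)))

# Selective query: only a few widely spaced documents match.
hits = range(0, len(values), 64)
selective = BlockReader(blocks)
selective_total = sum(selective.get(doc) for doc in hits)

# Decompressions paid per value actually read.
scan_cost = scan.decompressions / len(values)
selective_cost = selective.decompressions / len(hits)
print(f"scan: {scan_cost:.3f} decompressions per value")
print(f"selective: {selective_cost:.3f} decompressions per value")
```

In this toy the scan decompresses each block once, whereas every selective hit lands in a different block, so the per-value decompression cost is much higher for the selective case.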
Results from luceneutil benchmarks comparing
I think all we need to do for this issue is re-enable
Also, in my day job (Amazon's customer-facing product search) we found that enabling compression for the taxonomy index was performance neutral and shrank the taxonomy index, even though we count far fewer hits than the "pure browse" tests here, which count facets for every document in the index. Net/net, compression for Lucene's facets seems to be a good thing!
I am running two benchmarks now: one with changes in
Edit: benchmark results before and after this change show no difference. I'll try directly sending in the
git diff which disregards the
gives the following results:
The next step is to just remove the
Whoa, great! This confirms that enabling compression substantially speeds up pure browse faceting use cases based on the taxonomy index. SSDV facets are not impacted, which is expected because they do not even use the taxonomy index.
OK.
Thanks for digging @gautamworah96!
The PR for this issue was merged.
This gave an impressive bump in QPS in the nightly benchmarks, e.g. pure browse facets for last-modified year/month/day. It looks like a ~30% speedup, much more than the previous ~5% speedup. Confused :)
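For reference, a speedup percentage like the ones quoted above is conventionally the relative change in QPS between a baseline run and a candidate run. The sketch below assumes that definition; the function name and the QPS figures are hypothetical and are not taken from the nightly benchmarks.

```python
def speedup_pct(baseline_qps, candidate_qps):
    """Relative QPS change of the candidate over the baseline, in percent."""
    if baseline_qps <= 0:
        raise ValueError("baseline_qps must be positive")
    return (candidate_qps - baseline_qps) / baseline_qps * 100.0


# Hypothetical QPS values, chosen only to illustrate the arithmetic.
example = speedup_pct(100.0, 130.0)
print(f"{example:.1f}% speedup")
```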
Hmm, and facet performance for a simple
I will add an annotation.
We have noticed a big drop in faceting throughput for all Day of Year facets. One possible explanation for this could be a change that added two modes to the Lucene codec: BEST_SPEED and BEST_COMPRESSION.
It is possible that using the BEST_COMPRESSION mode caused the drop in faceting throughput. This issue is to confirm whether FACET_FIELD_DV_FORMAT_DEFAULT='Lucene80' uses BEST_COMPRESSION by default and, if so, whether changing it to BEST_SPEED restores the previous throughput.