The HistogramCollector uses the PointTreeBulkCollector logic only when the PointTree is dense relative to the buckets it is being collected into. But for fields that don't have SortedNumericDocValues indexed at all, we should allow collection using the PointTree irrespective of how slow it is, instead of throwing IllegalStateException("Expected numeric field, but got doc-value type: "). Related to #14439 (comment)
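To make the behavior concrete, here is a minimal, self-contained sketch of the decision described above. It does not use Lucene's API: the class `CollectorChoiceSketch`, the `Strategy` enum, the `current` and `proposed` methods, and the density threshold of 50 points per bucket are all hypothetical and chosen only for illustration. The real heuristic in PointTreeBulkCollector may look quite different.

```java
public class CollectorChoiceSketch {

    /** Hypothetical labels for the two collection paths discussed in this issue. */
    public enum Strategy { POINT_TREE, DOC_VALUES }

    // Assumed threshold, for illustration only (not Lucene's actual heuristic).
    static final long MIN_POINTS_PER_BUCKET = 50;

    /** Treats the point tree as "dense" when the average bucket holds enough points. */
    static boolean isDense(long pointCount, long minValue, long maxValue, long bucketWidth) {
        long buckets = (maxValue - minValue) / bucketWidth + 1;
        return pointCount / buckets >= MIN_POINTS_PER_BUCKET;
    }

    /** Behavior as described in the issue: points are used only when dense. */
    public static Strategy current(boolean hasPoints, boolean hasDocValues,
                                   long pointCount, long minValue, long maxValue,
                                   long bucketWidth) {
        if (hasPoints && isDense(pointCount, minValue, maxValue, bucketWidth)) {
            return Strategy.POINT_TREE;
        }
        if (hasDocValues) {
            return Strategy.DOC_VALUES;
        }
        throw new IllegalStateException(
            "Expected numeric field, but got doc-value type: NONE");
    }

    /** Proposed behavior: without doc values, fall back to points even if sparse. */
    public static Strategy proposed(boolean hasPoints, boolean hasDocValues,
                                    long pointCount, long minValue, long maxValue,
                                    long bucketWidth) {
        if (hasPoints
            && (!hasDocValues || isDense(pointCount, minValue, maxValue, bucketWidth))) {
            return Strategy.POINT_TREE;
        }
        if (hasDocValues) {
            return Strategy.DOC_VALUES;
        }
        throw new IllegalStateException(
            "Expected numeric field, but got doc-value type: NONE");
    }

    public static void main(String[] args) {
        // A field with a points index but no doc values: 10,000 values in [0, 99,999].
        for (long width : new long[] {100, 1000}) {
            String result;
            try {
                result = current(true, false, 10_000, 0, 99_999, width).toString();
            } catch (IllegalStateException e) {
                result = "IllegalStateException";
            }
            System.out.println("width=" + width + " current=" + result
                + " proposed=" + proposed(true, false, 10_000, 0, 99_999, width));
        }
    }
}
```

With these made-up numbers, a bucket width of 100 gives about 10 points per bucket, so `current` throws, while a width of 1000 gives about 100 points per bucket and succeeds. `proposed` returns `POINT_TREE` in both cases.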
Activity
jpountz commented on Apr 24, 2025
In my opinion, it's fine to require doc values to be indexed for faceting to work. I don't think we should try to support faceting (or sorting) when the field has a points index but no doc values.
jainankitk commented on Apr 26, 2025
Currently, we use the PointTreeBulkCollector if we can collect the documents efficiently, irrespective of whether DocValues are indexed or not. IMO, we should be consistent in requiring doc values to be indexed for faceting to work. For example, it might be confusing for users to get an exception when they try a bucket width of 100 but succeed with a bucket width of 1000.
In this specific case, it is fairly easy to support, so I believe we should go ahead with that for consistency of functionality. I have created a small PR, #14559, for the same. Let me know what you think.