Allow Histogram Collection using PointTree when SortedNumericDocValues is absent #14536

@jainankitk
Contributor

Description

The HistogramCollector uses the PointTreeBulkCollector logic only when the PointTree is dense relative to the buckets being collected. But when a field has no SortedNumericDocValues indexed at all, we should allow collection using the PointTree regardless of how slow it is, instead of throwing IllegalStateException("Expected numeric field, but got doc-value type: "). Related to #14439 (comment).
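To make the request concrete, here is a minimal sketch of the strategy choice described above. It is plain Java, not Lucene code: the class `HistogramStrategySketch`, the `Strategy` enum, the `choose` method and all of its boolean parameters are hypothetical names introduced for illustration. The `allowSlowPointTree` flag models the behavior proposed in this issue; with it set to `false` the sketch models the current behavior as described here.

```java
/** Hypothetical model of the collection-strategy choice; not actual Lucene code. */
class HistogramStrategySketch {

  enum Strategy { POINT_TREE, DOC_VALUES }

  /**
   * @param hasPoints            the field has a points index (assumed input)
   * @param hasDocValues         the field has numeric doc values (assumed input)
   * @param pointTreeIsEfficient the point tree is dense relative to the buckets (assumed input)
   * @param allowSlowPointTree   true models the proposal in this issue, false the current behavior
   */
  static Strategy choose(
      boolean hasPoints,
      boolean hasDocValues,
      boolean pointTreeIsEfficient,
      boolean allowSlowPointTree) {
    // Fast path: used today whether or not doc values are indexed.
    if (hasPoints && pointTreeIsEfficient) {
      return Strategy.POINT_TREE;
    }
    // Regular path: iterate doc values.
    if (hasDocValues) {
      return Strategy.DOC_VALUES;
    }
    // Proposed fallback: no doc values, so accept a slow point-tree traversal.
    if (hasPoints && allowSlowPointTree) {
      return Strategy.POINT_TREE;
    }
    // Current outcome for a points-only field that cannot be collected efficiently.
    throw new IllegalStateException("Expected numeric field, but got doc-value type: ");
  }

  public static void main(String[] args) {
    // Points-only field, inefficient point tree, proposal enabled.
    System.out.println(choose(true, false, false, true)); // prints POINT_TREE
  }
}
```

With `allowSlowPointTree` set to `false`, the same points-only field either succeeds or throws depending only on `pointTreeIsEfficient`, which is the inconsistency this issue is about.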

Activity

jpountz commented on Apr 24, 2025

Contributor

In my opinion, it's fine to require doc values to be indexed for faceting to work. I don't think we should try to support faceting (or sorting) when the field has a points index but no doc values.

jainankitk commented on Apr 26, 2025

Contributor, Author

Currently, we use the PointTreeBulkCollector whenever we can collect the documents efficiently, irrespective of whether doc values are indexed. IMO, we should be consistent about whether doc values must be indexed for faceting to work. For example, it might be confusing for users to get an exception when they try a bucket width of 100, yet succeed with a bucket width of 1000.

In this specific case, it is fairly easy to support, so I believe we should go ahead with it for consistency of functionality. I have created a small PR, #14559, for this. Let me know what you think.
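The bucket-width example can be illustrated with a small sketch. It assumes, purely for illustration, that the point-tree path is taken only when the number of buckets spanned by the field's value range stays under a fixed limit. The class `BucketWidthSketch`, the `MAX_BUCKETS` value of 1024 and the value range used below are all hypothetical; the actual efficiency check in Lucene may differ.

```java
/** Hypothetical illustration of why bucket width can flip the outcome; not Lucene code. */
class BucketWidthSketch {

  // Assumed limit on the number of buckets for the efficient point-tree path.
  static final long MAX_BUCKETS = 1024;

  /** Number of fixed-width buckets touched by values in [minValue, maxValue]. */
  static long bucketCount(long minValue, long maxValue, long bucketWidth) {
    return Math.floorDiv(maxValue, bucketWidth) - Math.floorDiv(minValue, bucketWidth) + 1;
  }

  /** True if the assumed efficiency criterion is met. */
  static boolean pointTreeAllowed(long minValue, long maxValue, long bucketWidth) {
    return bucketCount(minValue, maxValue, bucketWidth) <= MAX_BUCKETS;
  }

  public static void main(String[] args) {
    long min = 0;
    long max = 500_000; // hypothetical value range of a points-only field

    System.out.println(bucketCount(min, max, 100));       // prints 5001
    System.out.println(pointTreeAllowed(min, max, 100));  // prints false
    System.out.println(bucketCount(min, max, 1000));      // prints 501
    System.out.println(pointTreeAllowed(min, max, 1000)); // prints true
  }
}
```

Under these assumptions, a field with points but no doc values would fail at width 100 and succeed at width 1000, even though nothing about the index changed between the two requests.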

Allow Histogram Collection using PointTree when SortedNumericDocValues is absent · Issue #14536 · apache/lucene