forked from apache/lucene
Reduce duplication in taxonomy facets; always do counts (apache#12966)
This is a large change, refactoring most of the taxonomy facets code and changing internal behaviour, without changing the API. There are specific API changes this sets us up to do later, e.g. retrieving counts from aggregation facets.

1. Move most of the responsibility from TaxonomyFacets implementations to TaxonomyFacets itself. This reduces code duplication and enables future development. Addresses genericity issue mentioned in apache#12553.
2. As a consequence, introduce sparse values to FloatTaxonomyFacets, which previously used dense values always. This issue is part of apache#12576.
3. Compute counts for all taxonomy facets always, which enables us to add an API to retrieve counts for association facets in the future. Addresses apache#11282.
4. As a consequence of having counts, we can check whether we encountered a label while faceting (count > 0), while previously we relied on the aggregation value to be positive. Closes apache#12585.
5. Introduce the idea of doing multiple aggregations in one go, with association facets doing the aggregation they were already doing, plus a count. We can extend to an arbitrary number of aggregations, as suggested in apache#12546.
6. Don't change the API. The only change in behaviour users should notice is the fix for non-positive aggregation values, which were previously discarded.
7. Add tests which were missing for sparse/dense values and non-positive aggregations.
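Points 3 to 5 can be illustrated with a small sketch that does not use Lucene at all. Every name in it (`CountVsValueDemo`, `Aggregates`, `seenByValue`, `seenByCount`) is hypothetical and is not taken from the commit. It aggregates a sum and a count per ordinal in one pass, then compares the old style of check (aggregated value > 0) with the count-based check (count > 0).

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: not Lucene code. All names here are hypothetical. */
final class CountVsValueDemo {

  /** Per-ordinal sums and counts, filled in a single pass over the hits. */
  static final class Aggregates {
    final float[] sums;
    final int[] counts;

    Aggregates(int numOrds) {
      sums = new float[numOrds];
      counts = new int[numOrds];
    }
  }

  /** Does two aggregations in one go: a sum of association values and a count. */
  static Aggregates aggregate(int numOrds, int[] hitOrds, float[] hitValues) {
    Aggregates agg = new Aggregates(numOrds);
    for (int i = 0; i < hitOrds.length; i++) {
      agg.sums[hitOrds[i]] += hitValues[i];
      agg.counts[hitOrds[i]]++;
    }
    return agg;
  }

  /** Value-based check: an ordinal counts as seen only if its sum is positive. */
  static List<Integer> seenByValue(Aggregates agg) {
    List<Integer> ords = new ArrayList<>();
    for (int ord = 0; ord < agg.sums.length; ord++) {
      if (agg.sums[ord] > 0) {
        ords.add(ord);
      }
    }
    return ords;
  }

  /** Count-based check: an ordinal counts as seen if at least one hit carried it. */
  static List<Integer> seenByCount(Aggregates agg) {
    List<Integer> ords = new ArrayList<>();
    for (int ord = 0; ord < agg.counts.length; ord++) {
      if (agg.counts[ord] > 0) {
        ords.add(ord);
      }
    }
    return ords;
  }

  public static void main(String[] args) {
    // Ordinal 0 sums to 2.5, ordinal 1 to -4.0, ordinal 2 to 0.0; ordinal 3 is never hit.
    int[] hitOrds = {0, 1, 1, 2};
    float[] hitValues = {2.5f, -1.0f, -3.0f, 0.0f};
    Aggregates agg = aggregate(4, hitOrds, hitValues);
    System.out.println(seenByValue(agg)); // prints [0]
    System.out.println(seenByCount(agg)); // prints [0, 1, 2]
  }
}
```

In this sketch the value-based check drops ordinals 1 and 2 even though hits carried them, which is the kind of discarded non-positive aggregation that point 6 describes. The count-based check keeps them and still excludes ordinal 3, which was never encountered.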
1 parent b888652
commit 4130c72
Showing 14 changed files with 940 additions and 889 deletions.
55 changes: 55 additions & 0 deletions
lucene/facet/src/java/org/apache/lucene/facet/TopOrdAndNumberQueue.java
@@ -0,0 +1,55 @@
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License. You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.lucene.facet;

import org.apache.lucene.util.PriorityQueue;

/** Keeps highest results, first by largest value, then tie-break by smallest ord. */
public abstract class TopOrdAndNumberQueue extends PriorityQueue<TopOrdAndNumberQueue.OrdAndValue> {

  /** Holds a single entry. */
  public abstract static class OrdAndValue {

    /** Ordinal of the entry. */
    public int ord;

    /** Default constructor. */
    public OrdAndValue() {}

    /** Compare with another {@link OrdAndValue}. */
    public abstract boolean lessThan(OrdAndValue other);

    /** Get the value stored in this {@link OrdAndValue}. */
    public abstract Number getValue();
  }

  /** Sole constructor. */
  public TopOrdAndNumberQueue(int topN) {
    super(topN);
  }

  @Override
  public boolean lessThan(TopOrdAndNumberQueue.OrdAndValue a, TopOrdAndNumberQueue.OrdAndValue b) {
    return a.lessThan(b);
  }

  /**
   * Create a new {@link org.apache.lucene.facet.TopOrdAndNumberQueue.OrdAndValue} of the
   * appropriate type.
   */
  public abstract OrdAndValue newOrdAndValue();
}
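The ordering contract stated in the class Javadoc (largest value first, ties broken by smallest ord) can be sketched without a Lucene dependency. The example below is an assumption-laden stand-in, not code from this commit. `TopOrdAndIntDemo` and `OrdAndInt` are hypothetical names, and the heap is `java.util.PriorityQueue`, not `org.apache.lucene.util.PriorityQueue`. It only shows what a concrete `lessThan` for int values might look like and how a bounded queue driven by it keeps the top N entries.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Illustrative only: mirrors the ordering contract without depending on Lucene. */
final class TopOrdAndIntDemo {

  /** Hypothetical int-valued entry, loosely modelled on OrdAndValue. */
  static final class OrdAndInt {
    final int ord;
    final int value;

    OrdAndInt(int ord, int value) {
      this.ord = ord;
      this.value = value;
    }

    /** An entry is "less" if its value is smaller or, on equal values, if its ord is larger. */
    boolean lessThan(OrdAndInt other) {
      if (value != other.value) {
        return value < other.value;
      }
      return ord > other.ord;
    }
  }

  /** Returns up to topN entries, best first; valuesByOrd[ord] is the value for that ord. */
  static List<OrdAndInt> topN(int topN, int[] valuesByOrd) {
    // Min-heap on lessThan: the head is always the weakest entry currently kept.
    Comparator<OrdAndInt> leastFirst =
        (a, b) -> a.lessThan(b) ? -1 : (b.lessThan(a) ? 1 : 0);
    PriorityQueue<OrdAndInt> queue = new PriorityQueue<>(leastFirst);
    for (int ord = 0; ord < valuesByOrd.length; ord++) {
      queue.add(new OrdAndInt(ord, valuesByOrd[ord]));
      if (queue.size() > topN) {
        queue.poll(); // evict the weakest entry so at most topN remain
      }
    }
    List<OrdAndInt> result = new ArrayList<>();
    while (!queue.isEmpty()) {
      result.add(queue.poll()); // weakest first
    }
    Collections.reverse(result); // best first
    return result;
  }

  public static void main(String[] args) {
    // Ords 1 and 2 tie on value 9; the smaller ord ranks higher.
    for (OrdAndInt e : topN(2, new int[] {5, 9, 9, 1})) {
      System.out.println(e.ord + " -> " + e.value);
    }
  }
}
```

Putting the comparison on the entry type, as `TopOrdAndNumberQueue` does by delegating to `OrdAndValue.lessThan`, lets int-valued and float-valued entries share one queue class while each entry type compares its own primitive values.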