[Proposal] Multi-column filtering, let DimensionSpec handle extraction functions exclusively #3378
Comments
Un-milestoning this from 0.10.0. @jon-wei reading through this, it looks like some of it has already been done as part of PRs in 0.10.0 and some of it has not. (And for some of it, it looks like we decided to go in a different direction, like filters not taking dimensionSpecs.) Do you think we should keep it open and scope it down to what still makes sense given what has already been done? Or close it, reimagine, and open new proposals?
@jon-wei is this still needed given the expression stuff that exists now?
This issue has been marked as stale due to 280 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.
This issue has been closed due to lack of activity. If you think that is incorrect, or the issue requires additional review, you can revive the issue at any time.
Multi-column filtering, let DimensionSpec handle extraction functions exclusively
To support filters that accept multi-column inputs, this proposal suggests that:
DimensionSpec
The following methods are added to DimensionSpec:
- DimensionSpec changes to accept a list of dimension names, instead of a single name.
- Instead of calling ColumnSelectorFactory.makeDimensionSelector(), column reading code will call DimensionSpec.getColumnValueSelector(columnSelectorFactory) and use getOutputType() to determine the type of the selector (DimensionSelector, LongColumnSelector, etc.).
- The spec reads the DimensionSelectors, LongColumnSelectors, etc. for the individual columns from the ColumnSelectorFactory and combines them, applying the extraction function, into a new "virtual column" DimensionSelector. If the output type is long, the returned object will be a LongColumnSelector.
- decorate() becomes a private method, wrapping a delegate selector when getColumnValueSelector() is called.
- getExtractionFn(): extractionFn is no longer needed by or exposed to reader code.
- ExtractionDimensionSpec: for grouping/filtering on "virtual columns" derived from multiple input columns, as well as any dimensions with extraction functions applied.
- DefaultDimensionSpec: for reading from the base columns without transformations.
- RegexFilteredDimensionSpec and ListFilteredDimensionSpec can just pass the new method calls to the delegate spec.
- LookupDimensionSpec: replace this usage with ExtractionDimensionSpec and the right extraction function?
- Bitmap index lookup goes through DimensionSpec; it can return null for cases where the bitmap is not valid (e.g., with a multiple column extraction fn, or other extraction fn that makes the indexes unusable).
- Instead of BitmapIndexSelector.getBitmapIndex(dimension), call dimensionSpec.getBitmapIndex(BitmapIndexSelector).
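To make the multi-column idea concrete, here is a rough sketch of a spec that builds one "virtual column" selector from several input columns and declines to provide a bitmap index when more than one column is involved. All types below (RowSelectorFactory, IndexSelector, MultiColumnSpec) are simplified, hypothetical stand-ins invented for this example. They are not Druid's real interfaces, and the real selectors are considerably richer than a Supplier<String>.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical stand-in for a ColumnSelectorFactory: one selector per column name.
interface RowSelectorFactory {
    Supplier<String> makeDimensionSelector(String column);
}

// Hypothetical stand-in for a BitmapIndexSelector: value -> row ids for a column.
interface IndexSelector {
    Map<String, List<Integer>> getBitmapIndex(String column);
}

// Assumed shape of a spec that accepts a list of dimension names.
final class MultiColumnSpec {
    private final List<String> dimensions;
    private final Function<List<String>, String> extractionFn;

    MultiColumnSpec(List<String> dimensions, Function<List<String>, String> extractionFn) {
        this.dimensions = List.copyOf(dimensions);
        this.extractionFn = extractionFn;
    }

    // Reads one selector per input column and combines them into a single
    // "virtual column" selector by applying the extraction function.
    Supplier<String> getColumnValueSelector(RowSelectorFactory factory) {
        List<Supplier<String>> inputs = new ArrayList<>();
        for (String dimension : dimensions) {
            inputs.add(factory.makeDimensionSelector(dimension));
        }
        return () -> {
            List<String> values = new ArrayList<>();
            for (Supplier<String> input : inputs) {
                values.add(input.get());
            }
            return extractionFn.apply(values);
        };
    }

    // Returns null when the index cannot be used. This sketch only checks the
    // multi-column case; a fuller version would also return null for
    // extraction functions that make the index unusable.
    Map<String, List<Integer>> getBitmapIndex(IndexSelector indexSelector) {
        if (dimensions.size() != 1) {
            return null;
        }
        return indexSelector.getBitmapIndex(dimensions.get(0));
    }
}

final class MultiColumnSpecDemo {
    // Joins two columns of a single in-memory row into one derived value.
    static String fullName() {
        Map<String, String> row = Map.of("first", "Ada", "last", "Lovelace");
        RowSelectorFactory factory = column -> () -> row.get(column);
        MultiColumnSpec spec = new MultiColumnSpec(
            List.of("first", "last"), values -> String.join(" ", values));
        return spec.getColumnValueSelector(factory).get();
    }
}
```

Callers that receive null from getBitmapIndex() would fall back to scanning rows with the selector, which is the case the proposal calls out for multi-column extraction functions.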
ValueMatcherFactory
- Call DimensionSpec.getColumnValueSelector() instead of ColumnSelectorFactory.makeDimensionSelector()
Query engines
- Call DimensionSpec.getColumnValueSelector() instead of ColumnSelectorFactory.makeDimensionSelector()
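The caller-side change for value matchers and query engines could look roughly like the following. The reader asks the spec for a selector and never touches the extraction function itself. SelectorFactory, SelectorSpec, and SpecDrivenReader are hypothetical names made up for this sketch, with selectors reduced to Supplier<String>; none of this is Druid's actual API.

```java
import java.util.function.BooleanSupplier;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical stand-in for a ColumnSelectorFactory.
interface SelectorFactory {
    Supplier<String> makeDimensionSelector(String column);
}

// Hypothetical stand-in for a DimensionSpec that owns selector creation.
interface SelectorSpec {
    Supplier<String> getColumnValueSelector(SelectorFactory factory);
}

final class SpecDrivenReader {
    // Spec that reads the base column without transformation.
    static SelectorSpec plain(String column) {
        return factory -> factory.makeDimensionSelector(column);
    }

    // Spec that wraps the base selector with an extraction function.
    static SelectorSpec extracting(String column, Function<String, String> fn) {
        return factory -> {
            Supplier<String> base = factory.makeDimensionSelector(column);
            return () -> fn.apply(base.get());
        };
    }

    // Reader code: it only asks the spec for a selector, so it works the same
    // way whether or not an extraction function is involved.
    static BooleanSupplier makeValueMatcher(
            SelectorSpec spec, SelectorFactory factory, String expected) {
        Supplier<String> selector = spec.getColumnValueSelector(factory);
        return () -> expected.equals(selector.get());
    }
}
```

With a row whose country column holds "us", a matcher built from extracting("country", String::toUpperCase) matches "US", while one built from plain("country") does not. The matcher code is identical in both cases.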
ColumnSelectorFactory
- Expose isDescending() so ExtractionDimensionSpec can create SingleScanTimeDimSelector for __time
Filters:
- Accept a DimensionSpec instead of a dimension name
- Use the preservesOrdering and extractionType properties of the DimensionSpec (e.g., for BoundFilter optimizations)
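As one illustration of why a filter might consult preservesOrdering, the sketch below scans a sorted value dictionary for a bound filter and stops early only when the spec says ordering is preserved. This is just one plausible optimization, not necessarily what BoundFilter does. OrderingAwareSpec and BoundFilterSketch are hypothetical names for this example, and the dictionary is modeled as a plain sorted list of strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for a DimensionSpec exposing a preservesOrdering property.
final class OrderingAwareSpec {
    final Function<String, String> extractionFn;
    final boolean preservesOrdering;

    OrderingAwareSpec(Function<String, String> extractionFn, boolean preservesOrdering) {
        this.extractionFn = extractionFn;
        this.preservesOrdering = preservesOrdering;
    }
}

final class BoundFilterSketch {
    // Returns the raw dictionary values whose extracted form lies in
    // [lower, upper]. The dictionary must be sorted ascending.
    static List<String> matchingValues(
            List<String> sortedDictionary, OrderingAwareSpec spec, String lower, String upper) {
        List<String> matches = new ArrayList<>();
        for (String raw : sortedDictionary) {
            String extracted = spec.extractionFn.apply(raw);
            if (extracted.compareTo(upper) > 0) {
                if (spec.preservesOrdering) {
                    // Later values can only extract to something larger, so stop.
                    break;
                }
                // Ordering is not preserved: later values may still match.
                continue;
            }
            if (extracted.compareTo(lower) >= 0) {
                matches.add(raw);
            }
        }
        return matches;
    }
}
```

When the extraction function does not preserve ordering, the filter has to examine every dictionary value, which is why the flag needs to be visible to the filter through the spec.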
Related Topics/Issues/PRs: