feat: add SegmentPruner support for datasources/policies #19228
clintropolis merged 2 commits into apache:master
changes:
* adds new `include` method to `SegmentPruner` for checking whether an individual segment should be pruned
* adds default implementation of `prune` method which calls `include`
* adds new `combine` method to `SegmentPruner` for merging pruners
* adds new `CompositeSegmentPruner` for cases where pruners cannot be naturally combined
* adds new `createSegmentPruner` method to `DataSource` and `Policy` so that they can participate in pruning
* updates `ExecutionVertex` to combine the new datasource pruner with the pruner of the filter
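To make the shape of these changes easier to picture, here is a minimal sketch of the `include` / `prune` / `combine` / composite pattern. It is not the code from this PR: `SegmentInfo`, `SketchPruner`, `CompositeSketchPruner`, and `combineNullable` are hypothetical names, and the signatures are simplified assumptions rather than Druid's actual `SegmentPruner` API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a segment descriptor; not Druid's DataSegment.
final class SegmentInfo {
    final String dataSource;
    final int partition;

    SegmentInfo(String dataSource, int partition) {
        this.dataSource = dataSource;
        this.partition = partition;
    }
}

// Simplified pruner; the method shapes are assumptions made for illustration.
interface SketchPruner {
    // Per-segment check: true means the segment is kept, false means it is pruned.
    boolean include(SegmentInfo segment);

    // Default bulk pruning expressed in terms of include().
    default List<SegmentInfo> prune(List<SegmentInfo> segments) {
        List<SegmentInfo> kept = new ArrayList<>();
        for (SegmentInfo segment : segments) {
            if (include(segment)) {
                kept.add(segment);
            }
        }
        return kept;
    }

    // Fallback merge: when two pruners cannot be merged naturally, wrap both.
    default SketchPruner combine(SketchPruner other) {
        return new CompositeSketchPruner(List.of(this, other));
    }

    // Null-safe merge, assuming a datasource or filter may supply no pruner at all.
    static SketchPruner combineNullable(SketchPruner a, SketchPruner b) {
        if (a == null) {
            return b;
        }
        return b == null ? a : a.combine(b);
    }
}

// Keeps a segment only if every delegate pruner keeps it.
final class CompositeSketchPruner implements SketchPruner {
    private final List<SketchPruner> delegates;

    CompositeSketchPruner(List<SketchPruner> delegates) {
        this.delegates = List.copyOf(delegates);
    }

    @Override
    public boolean include(SegmentInfo segment) {
        for (SketchPruner pruner : delegates) {
            if (!pruner.include(segment)) {
                return false;
            }
        }
        return true;
    }
}
```

Because `include` is the only abstract method in this sketch, a pruner can be written as a lambda such as `s -> s.partition % 2 == 0`, and `combineNullable(datasourcePruner, filterPruner)` returns whichever side is non-null when only one exists. The composite treats the delegates as a conjunction, which is one plausible reading of "combine"; the real implementation may merge pruners more cleverly where it can.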
     * such as filters may still be used.
     */
    @Nullable
    default SegmentPruner createSegmentPruner()
Could also implement this in FilteredDataSource.
Yeah, I thought about this, but left it out for now. I think FilteredDataSource, and UnnestDataSource since it has a filter too, both need to be a bit more thoughtful in how they prune. They probably need to combine with the pruner from the base datasource beneath them, but maybe only in some cases, and modify it in others. For unnest, for example, we might want to prune differently if the filter is on the unnest column, depending on whether the unnest is on an MVD or an array, similar to what we do for unnest filter pushdown. I'm not certain whether FilteredDataSource needs anything besides combine; I haven't fully thought it through yet, and didn't want to for now 😅
I'll look into improving this in a follow-up.