Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Filter::mergeWith for MultiRange #154

Closed
mbasmanova opened this issue Sep 2, 2021 · 7 comments
Closed

Implement Filter::mergeWith for MultiRange #154

mbasmanova opened this issue Sep 2, 2021 · 7 comments
Labels
good first issue Good for newcomers stale

Comments

@mbasmanova
Copy link
Contributor

Similar to #153 this task is to implement Filter::mergeWith() for MultiRange filters which represent an OR of two of more filters.

CC: @pedroerp @kgpai @aditi-pandit @majetideepak

@mbasmanova mbasmanova added the good first issue Good for newcomers label Sep 2, 2021
@amaliujia
Copy link
Contributor

@mbasmanova

I am a person who is interested in contributing to open source. Looks like this issue is a good one for a starter. May I work on this issue?

@mbasmanova
Copy link
Contributor Author

Welcome, @amaliujia . Sure, feel free to pick this up. CC: @atanu1991 who is working on a related issue #153

@amaliujia
Copy link
Contributor

@mbasmanova

I want to discuss a bit on high level before start to implement.

MultiRange is a class that contains a list of Filter, with nan_allowed and null_allowed. So my current idea of the implementation is:

  1. Two MultiRange can merge only if they have the same nan_allowed and null_allowed.
  2. Two MultiRange can merge only if they have the same data type on all the Filters.
  3. If 1 and 2 are true, then simply have a nested loop to do
 for f1 in MultiRangeA.getFilters():
      for f2 in MultiRangeB.getFilters():
           f1.mergeWith(f2);

Can you give me some suggestions?

@mbasmanova
Copy link
Contributor Author

mbasmanova commented Sep 24, 2021

Rui,

MultiRange represents and OR of multiple filters. NaN values pass only if MultiRange::nanAllowed is true. Null values pass only if nullAllowed is true.

Merging two filters is combining the filters with an AND. A value passes the new filter only if it passes both the original filters.

Here is how I would represent a merge of two MultiRange filters (with just 2 filters in each for simplicity):

(a OR b) AND (c OR d) = (a AND c) OR (a AND d) OR (b AND c) OR (b AND d)

Therefore, what you have proposed is very close.

  1. Calculate nullAllowed for the combined filter: bool bothNullAllowed = nullAllowed_ && other->testNull();
  2. Calculate nanAllowed for the combined filter: {same as above}
  3. Loop over all combinations of filters from left and right side and merge these.
  4. Construct a new filter using flags and list of filters from previous steps.

In step (3), make sure to check if a merge of two filter is kAlwaysFalse, kIsNull, kIsNotNull. In this case, the filter can be dropped from the result (null and NaN flags in the nested filters do not count for MultiRange filter; only top-level flags take effect).

Step (3) may result in an empty list of filters or a list of one. If the list is empty, return kAlwaysFalse, kIsNull, or kIsNotNull as appropriate (see nullOrFalse(bothNullAllowed) in Filter.cpp). If the list contains just one filter, return that filter after incorporating the null flag, e.g. if bothNullAllowed && !filter->testNull() return filter.mergeWith(IsNull) else return filter.

That should be it.

(Check out #233 that implements merge logic for BytesValues and BytesRanges.)

@amaliujia
Copy link
Contributor

amaliujia commented Sep 24, 2021

Thanks @mbasmanova!

I prepared a WIP impl based on your suggestions: #299.

Can you comment on the PR to see if I am on the right track? I haven't added test cases and I can add those later.

Sorry to occupy too much of your time. I am a beginner so probably need more guidance at this moment (for example on how to deal with nullness in Volex)? Maybe later I could help to contribute a doc about nullness, nan, always true, always false, is null, is not null with Filter.

@stale
Copy link

stale bot commented Sep 14, 2022

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the stale label Sep 14, 2022
@kgpai
Copy link
Contributor

kgpai commented Sep 15, 2022

Looks like this issue was resolved with #299 . Closing.

@kgpai kgpai closed this as completed Sep 15, 2022
zhouyuan pushed a commit to zhouyuan/velox that referenced this issue Jun 7, 2023
relative pr:

Fix hashjoin runtime issue facebookincubator#106
INVALID_STATE on HashJoin when spill is turned on facebookincubator#154
SIGABRT on DecimalAvgAggregate<UnscaleLongDecimal, UnscaleShortDecimal> when spilling is engaged facebookincubator#236
Support kPreceeding & kFollowing for window range frame type facebookincubator#287
Yohahaha pushed a commit to Yohahaha/velox that referenced this issue Jul 4, 2023
relative pr:

Fix hashjoin runtime issue facebookincubator#106
INVALID_STATE on HashJoin when spill is turned on facebookincubator#154
SIGABRT on DecimalAvgAggregate<UnscaleLongDecimal, UnscaleShortDecimal> when spilling is engaged facebookincubator#236
Support kPreceeding & kFollowing for window range frame type facebookincubator#287
chenxu14 pushed a commit to chenxu14/velox that referenced this issue Jul 5, 2023
relative pr:

Fix hashjoin runtime issue facebookincubator#106
INVALID_STATE on HashJoin when spill is turned on facebookincubator#154
SIGABRT on DecimalAvgAggregate<UnscaleLongDecimal, UnscaleShortDecimal> when spilling is engaged facebookincubator#236
Support kPreceeding & kFollowing for window range frame type facebookincubator#287
PHILO-HE pushed a commit to PHILO-HE/velox that referenced this issue Jul 17, 2023
relative pr:

Fix hashjoin runtime issue facebookincubator#106
INVALID_STATE on HashJoin when spill is turned on facebookincubator#154
SIGABRT on DecimalAvgAggregate<UnscaleLongDecimal, UnscaleShortDecimal> when spilling is engaged facebookincubator#236
Support kPreceeding & kFollowing for window range frame type facebookincubator#287
rui-mo pushed a commit to rui-mo/velox that referenced this issue Jul 21, 2023
relative pr:

Fix hashjoin runtime issue facebookincubator#106
INVALID_STATE on HashJoin when spill is turned on facebookincubator#154
SIGABRT on DecimalAvgAggregate<UnscaleLongDecimal, UnscaleShortDecimal> when spilling is engaged facebookincubator#236
Support kPreceeding & kFollowing for window range frame type facebookincubator#287
rui-mo pushed a commit to rui-mo/velox that referenced this issue Jul 24, 2023
relative pr:

Fix hashjoin runtime issue facebookincubator#106
INVALID_STATE on HashJoin when spill is turned on facebookincubator#154
SIGABRT on DecimalAvgAggregate<UnscaleLongDecimal, UnscaleShortDecimal> when spilling is engaged facebookincubator#236
Support kPreceeding & kFollowing for window range frame type facebookincubator#287
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Good for newcomers stale
Projects
None yet
Development

No branches or pull requests

3 participants