@matthewmturner
Contributor

Which issue does this PR close?

Closes #1162

Rationale for this change

What changes are included in this PR?

Are there any user-facing changes?

@matthewmturner
Contributor Author

@alamb I took a first cut at adding an optimization and a test for it. Would you mind giving it a quick review to make sure it's good before I proceed with the rest?

@pjmore
Contributor

pjmore commented Oct 31, 2021

This is pretty bikesheddy, so feel free to ignore me, but for symmetric matches, like the one for OR, I tend to prefer matching like this:

(Expr::Literal(ScalarValue::Boolean(lit)), col)
| (col, Expr::Literal(ScalarValue::Boolean(lit)))
    if self.is_boolean_type(col) =>
{
    match lit {
        Some(true) => Expr::Literal(ScalarValue::Boolean(Some(true))),
        _ => *col,
    }
}

This allows a single implementation instead of two nearly identical branches.

@matthewmturner
Contributor Author

@pjmore Thank you for the recommendation, it makes sense. I will update.

@matthewmturner
Contributor Author

After making the above simplification, I'm getting an error on the inner match. I'm a bit confused by the error, as it looks to me like the match arms have the same types. Will keep looking into it.

@alamb
Contributor

alamb commented Nov 1, 2021

I plan to review this PR later today, FYI

@pjmore
Contributor

pjmore commented Nov 1, 2021

@matthewmturner Ah that's what I get for not fully testing my suggestion. The compilation error comes from an ownership issue.

// Annotated pseudocode: the `name: Type` notes show each binding's type
// and are not valid Rust syntax.
let left: Box<Expr>;
let right: Box<Expr>;
match (left.as_ref(): &Expr, right.as_ref(): &Expr) {
    (col: &Expr, Expr::Literal(ScalarValue::Boolean(lit: &Option<bool>))) => {
        *left // What you had initially. Moving out of Box<Expr> gives Expr.
        *col  // My suggestion. Moving out of an immutable reference is not allowed by the ownership rules.
    }
    // Fall-through pattern
    _ => Expr::BinaryExpr {
        left,
        op: Operator::Or,
        right,
    },
}

As far as I can tell, without box patterns or some unnecessary allocations in the fall-through pattern, there isn't a way to do what I suggested. Sorry about that, bad suggestion on my part.
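For readers following along, here is a minimal sketch of the ownership point above. It uses a toy `Expr` enum as a stand-in for DataFusion's type; the variants and the function names `simplify_or` and `simplify_or_cloning` are assumptions made for illustration, not the code from this PR. The first function moves the operand out of its `Box`, the second shows what the symmetric or-pattern costs when only references are available.

```rust
// Toy stand-in for DataFusion's `Expr`; these variants are hypothetical.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Literal(Option<bool>),
    Column(String),
    Or(Box<Expr>, Box<Expr>),
}

// Owning version: the match only inspects references, and each arm moves
// the boxed operand it wants to keep out of `left` or `right`.
fn simplify_or(left: Box<Expr>, right: Box<Expr>) -> Expr {
    match (left.as_ref(), right.as_ref()) {
        // `true OR x` and `x OR true` are always true.
        (Expr::Literal(Some(true)), _) | (_, Expr::Literal(Some(true))) => {
            Expr::Literal(Some(true))
        }
        // `x OR false` is `x`: `*left` moves the Expr out of its Box.
        (_, Expr::Literal(Some(false))) => *left,
        (Expr::Literal(Some(false)), _) => *right,
        // Fall-through: reuse the original boxes, no new allocation.
        _ => Expr::Or(left, right),
    }
}

// Borrowing version: the symmetric or-pattern binds `col: &Expr`.
// Writing `*col` here is rejected with error E0507 (cannot move out of a
// shared reference), so the operand has to be cloned instead.
fn simplify_or_cloning(left: &Expr, right: &Expr) -> Expr {
    match (left, right) {
        (Expr::Literal(Some(true)), _) | (_, Expr::Literal(Some(true))) => {
            Expr::Literal(Some(true))
        }
        (Expr::Literal(Some(false)), col) | (col, Expr::Literal(Some(false))) => col.clone(),
        _ => Expr::Or(Box::new(left.clone()), Box::new(right.clone())),
    }
}
```

The second version gets the single symmetric arm, but pays for it with a clone of the kept operand and fresh allocations in the fall-through, which is the trade-off described above.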

@matthewmturner
Contributor Author

@pjmore No problem, thanks for the explanation. I think I understand better now. To confirm: it doesn't work because col is an immutable reference to left or right, not the actual boxed value within them?

I'll admit I'm still a little confused about why col can't be copied while left/right can, but I'll need to dig more into Rust's ownership rules for that.

I'll revert to the old implementation for now.

_ => Expr::Literal(ScalarValue::Boolean(None)),
},
(Expr::Literal(ScalarValue::Boolean(b)), _)
if self.is_boolean_type(&right) =>
Contributor Author

@alamb To help me understand, can you give an example of an Expr that isn't a Boolean but that has a boolean type?

Contributor Author

Never mind, I get it now.

Contributor

One example is if someone writes col_a OR true and col_a has a type of int64 or something -- I don't know how far such an expression will get (it will likely error out when trying to actually run the PhysicalExpr).

Contributor

@alamb left a comment

I think this looks great @matthewmturner -- thank you! I am sorry for the delay in reviewing

Comment on lines 215 to 216
Expr::Literal(ScalarValue::Boolean(l)),
Expr::Literal(ScalarValue::Boolean(r)),
Contributor

The ScalarValue == ScalarValue case is covered by ConstEvaluator -- so while it is not incorrect to add it here, I also don't think it is necessary (and it might be confusing, as it would imply ConstEvaluator can't handle this).

if self.is_boolean_type(&right) =>
{
match b {
Some(true) => Expr::Literal(ScalarValue::Boolean(Some(true))),
Contributor

so cool!

Contributor

I think this can match null as well -- aka ScalarValue::Boolean(None)

because null OR true == true

alamb=# select null or true;
 ?column? 
----------
 t
(1 row)

if self.is_boolean_type(&left) =>
{
match b {
Some(true) => Expr::Literal(ScalarValue::Boolean(Some(true))),
Contributor

We could also add the rule that null OR false --> null

alamb=# select null or false;
 ?column? 
----------
 
(1 row)
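The two psql results for OR follow SQL's three-valued logic. As a rough sketch, with `Option<bool>` standing in for a nullable boolean (`None` meaning NULL) and `kleene_or` being a hypothetical helper name rather than anything in DataFusion, the full rule can be written as:

```rust
// Three-valued OR over nullable booleans; `None` plays the role of SQL NULL.
// `kleene_or` is an illustrative name, not a DataFusion function.
fn kleene_or(left: Option<bool>, right: Option<bool>) -> Option<bool> {
    match (left, right) {
        // A single `true` decides the result, even if the other side is NULL.
        (Some(true), _) | (_, Some(true)) => Some(true),
        // Only two known `false` values give `false`.
        (Some(false), Some(false)) => Some(false),
        // Anything else involves a NULL that could still turn out to be `true`.
        _ => None,
    }
}
```

With this definition, `kleene_or(None, Some(true))` is `Some(true)` and `kleene_or(None, Some(false))` is `None`, matching the two queries above.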

right,
},
},
Operator::And => match (left.as_ref(), right.as_ref()) {
Contributor

The rules for null and AND are NULL AND true --> NULL, and NULL AND false --> false

alamb=# select null and true;
 ?column? 
----------
 
(1 row)

alamb=# select null and false;
 ?column? 
----------
 f
(1 row)
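To see how the AND rules might translate into rewrites when one operand is a boolean literal and the other is an arbitrary boolean expression, here is a small sketch. `AndRewrite`, `simplify_and_with_literal`, and `kleene_and` are hypothetical names for illustration only; this is not the code in this PR.

```rust
// Three-valued AND over nullable booleans; `None` plays the role of SQL NULL.
fn kleene_and(left: Option<bool>, right: Option<bool>) -> Option<bool> {
    match (left, right) {
        // A single `false` decides the result, even if the other side is NULL.
        (Some(false), _) | (_, Some(false)) => Some(false),
        (Some(true), Some(true)) => Some(true),
        _ => None,
    }
}

// What `lit AND expr` can be rewritten to when `expr` is not known up front.
#[derive(Debug, PartialEq)]
enum AndRewrite {
    // Replace the whole expression with a constant.
    Constant(Option<bool>),
    // Replace the whole expression with the non-literal operand.
    OtherOperand,
    // Leave the expression as it is.
    NoChange,
}

fn simplify_and_with_literal(lit: Option<bool>) -> AndRewrite {
    match lit {
        // `false AND x` is false for every x, including NULL.
        Some(false) => AndRewrite::Constant(Some(false)),
        // `true AND x` is x for every x, including NULL.
        Some(true) => AndRewrite::OtherOperand,
        // `NULL AND x` is false when x is false and NULL otherwise,
        // so it cannot be folded without knowing x.
        None => AndRewrite::NoChange,
    }
}
```

The `None` arm is the reason the NULL rules above only fold when the other side is also a literal: `NULL AND true` gives NULL while `NULL AND false` gives false.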

Contributor

@alamb left a comment

@matthewmturner -- I left some feedback (but I don't think it is required) -- let me know if you want to address it as part of this PR or if we should do it as a follow on PR.

Thanks again!

@matthewmturner
Contributor Author

> @matthewmturner -- I left some feedback (but I don't think it is required) -- let me know if you want to address it as part of this PR or if we should do it as a follow on PR.
>
> Thanks again!

Thanks! Yes, I will update this PR.

@matthewmturner
Contributor Author

@alamb I believe I've handled all your comments. Can you review when you have time?

Contributor

@alamb left a comment

looks great -- thank you @matthewmturner very cool

@alamb alamb merged commit bbd8e1b into apache:master Nov 2, 2021
@houqp houqp added the enhancement New feature or request label Nov 6, 2021
Development

Successfully merging this pull request may close these issues.

Add additional algebraic simplifications
