Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support DataType::Decimal for IN and NOT IN expressions #2764

Merged
merged 1 commit into from
Jun 24, 2022

Conversation

liukun4515
Copy link
Contributor

@liukun4515 liukun4515 commented Jun 22, 2022

Which issue does this PR close?

Part of #2755

Rationale for this change

What changes are included in this PR?

Are there any user-facing changes?

@github-actions github-actions bot added logical-expr Logical plan and expressions physical-expr Physical Expressions labels Jun 22, 2022
.collect();
// TODO in the arrow-rs, should support NULL type to Decimal Data type
// TODO support in the arrow-rs, NULL value cast to Decimal Value
let result_type = get_coerce_type(expr_type, &list_types);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The IN operator is a shorthand for multiple OR conditions.
We should get the coerced data type for IN expr.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this issue: #2759
The type of list is BOOLEAN, INT32, and these two type are not compatible.

@liukun4515 liukun4515 force-pushed the support_decimal_inlist_filter branch from e641539 to 9851725 Compare June 22, 2022 10:21
);

// expression: "a in (0, 1, NULL)"
let list = vec![
lit(ScalarValue::Int64(Some(0))),
lit(ScalarValue::Int64(Some(1))),
lit(ScalarValue::Utf8(None)),
lit(ScalarValue::Null),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't get the coerced data type between INT64 and UTF8.
I think the type of NULL is Null

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From this test case, we can know that we should not treat Null as UTF8 data type

@liukun4515
Copy link
Contributor Author

@alamb @andygrove @viirya PTAL

@@ -310,6 +313,8 @@ pub fn create_physical_expr(
&list_expr_data_type,
&value_expr_data_type,
) {
// TODO: Can't cast from list type to value type directly
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can't cast the list data type to the value data type directly, and need to coerced common data type if they are compatible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

about issue: #2759

Copy link
Contributor

@yahoNanJing yahoNanJing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@liukun4515 liukun4515 force-pushed the support_decimal_inlist_filter branch from 88cfa73 to fb3cef3 Compare June 23, 2022 11:20
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some suggestions for code improvements that I think should be done, but the logic and tests in this PR look good to me. 👍 thanks @liukun4515

@liukun4515 liukun4515 force-pushed the support_decimal_inlist_filter branch from f0d0d7c to 8837d9d Compare June 24, 2022 02:16
@codecov-commenter
Copy link

Codecov Report

Merging #2764 (f0d0d7c) into master (bc00776) will increase coverage by 0.17%.
The diff coverage is 80.36%.

❗ Current head f0d0d7c differs from pull request most recent head 8837d9d. Consider uploading reports for the commit 8837d9d to get more accurate results

@@            Coverage Diff             @@
##           master    #2764      +/-   ##
==========================================
+ Coverage   84.95%   85.13%   +0.17%     
==========================================
  Files         271      272       +1     
  Lines       48053    48207     +154     
==========================================
+ Hits        40824    41039     +215     
+ Misses       7229     7168      -61     
Impacted Files Coverage Δ
...atafusion/physical-expr/src/expressions/in_list.rs 74.33% <80.12%> (+0.43%) ⬆️
datafusion/expr/src/binary_rule.rs 84.50% <100.00%> (ø)
datafusion/physical-expr/src/planner.rs 91.80% <100.00%> (ø)
datafusion/core/src/physical_plan/hash_join.rs 94.68% <0.00%> (-1.44%) ⬇️
datafusion/optimizer/src/filter_null_join_keys.rs 92.85% <0.00%> (-1.03%) ⬇️
datafusion/expr/src/window_frame.rs 92.43% <0.00%> (-0.85%) ⬇️
datafusion/core/src/execution/context.rs 78.37% <0.00%> (-0.28%) ⬇️
datafusion/expr/src/utils.rs 90.68% <0.00%> (-0.13%) ⬇️
datafusion/common/src/scalar.rs 74.82% <0.00%> (-0.12%) ⬇️
... and 12 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bc00776...8837d9d. Read the comment docs.

@liukun4515
Copy link
Contributor Author

All comments have be resolved.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @liukun4515

@alamb alamb changed the title support decimal for inlist expr Support DataType::Decimal for IN and NOT IN expressions Jun 24, 2022
@alamb alamb merged commit c2972a2 into apache:master Jun 24, 2022
@liukun4515 liukun4515 deleted the support_decimal_inlist_filter branch June 25, 2022 13:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
logical-expr Logical plan and expressions physical-expr Physical Expressions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants