[GLUTEN-4772][VL] Support empty map/array literal #4771

Merged 14 commits into apache:main on Mar 6, 2024

Conversation

@WangGuangxin
Contributor

WangGuangxin commented Feb 25, 2024

What changes were proposed in this pull request?

Support array and map literals that have no values (empty literals).
For example:

create table map_table(a map<bigint, string>) using parquet;
insert into table map_table select map(1, 'hello');
select size(coalesce(a, map())) from map_table;
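
To make the semantics of the query above concrete, here is a minimal C++ sketch. It is not Gluten or Velox code: `MapCell` and `sizeOfCoalesced` are hypothetical names, and the nullable `map<bigint, string>` column is modeled with `std::optional<std::map<...>>`. The point it illustrates is that the empty `map()` literal has no entries but still has to take on the column's key and value types, which is why an empty literal needs type information even though it carries no values.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for one cell of a nullable map<bigint, string> column.
using MapCell = std::optional<std::map<std::int64_t, std::string>>;

// Models size(coalesce(a, map())): a NULL cell is replaced by an empty map
// of the same key/value types, so the result is 0 rather than NULL.
inline std::size_t sizeOfCoalesced(const MapCell& cell) {
  const std::map<std::int64_t, std::string> emptyLiteral;  // typed, zero entries
  return cell.value_or(emptyLiteral).size();
}
```

For the row inserted above (`map(1, 'hello')`) this model yields 1, and for a NULL cell it yields 0.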

How was this patch tested?

Added unit tests.

Thanks for opening a pull request!

Could you open an issue for this pull request on Github Issues?

https://github.com/oap-project/gluten/issues

Then could you also rename commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

See also:

Run Gluten Clickhouse CI

@WangGuangxin WangGuangxin changed the title support empty map/array literal [VL] support empty map/array literal Feb 25, 2024

@WangGuangxin WangGuangxin changed the title [VL] support empty map/array literal [GLUTEN-4772][VL] Support empty map/array literal Feb 25, 2024

#4772

@WangGuangxin
Contributor Author

@rui-mo @ulysses-you

@ulysses-you
Contributor

It seems the failed CH backend test is related.

10:50:44  - test literals *** FAILED ***
10:50:44    org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1285.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1285.0 (TID 2287) (gluten-gluten-ci-7197-7bl43-bdv22-rrs5h executor driver): io.glutenproject.exception.GlutenException: Unsupported spark literal type kEmptyList
10:50:44  0. /home/jenkins/agent/workspace/gluten/gluten-ci/gluten/cpp-ch/build/../ClickHouse/src/Common/Exception.cpp:96: DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000b8320fb in /home/jenkins/agent/workspace/gluten/gluten-ci/gluten/cpp-ch/build/utils/extern-local-engine/libch.so
10:50:44  1. DB::Exception::Exception<std::basic_string_view<char, std::char_traits<char>>>(int, FormatStringHelperImpl<std::type_identity<std::basic_string_view<char, std::char_traits<char>>>::type>, std::basic_string_view<char, std::char_traits<char>>&&) @ 0x00000000060cf590 in /home/jenkins/agent/workspace/gluten/gluten-ci/gluten/cpp-ch/build/utils/extern-local-engine/libch.so
10:50:44  2. /home/jenkins/agent/workspace/gluten/gluten-ci/gluten/cpp-ch/build/../local-engine/Parser/SerializedPlanParser.cpp:1537: local_engine::SerializedPlanParser::parseLiteral(substrait::Expression_Literal const&) @ 0x000000000bb58a9f in /home/jenkins/agent/workspace/gluten/gluten-ci/gluten/cpp-ch/build/utils/extern-local-engine/libch.so

literalBuilder.setList(listBuilder.build());
} else {
Type.List.Builder listTypeBuilder = Type.List.newBuilder();
listTypeBuilder.setType(elementType.toProtobuf());
Contributor

If the element type is NullType, we will still fall back, right? E.g., select array();

Contributor Author

It's related to #2996. It will no longer fall back once NullType is supported.

@@ -382,7 +393,8 @@ std::shared_ptr<const core::ConstantTypedExpr> SubstraitVeloxExprConverter::toVe
 ArrayVectorPtr SubstraitVeloxExprConverter::literalsToArrayVector(const ::substrait::Expression::Literal& literal) {
   auto childSize = literal.list().values().size();
   if (childSize == 0) {
-    return makeEmptyArrayVector(pool_);
+    // childSize == 0 should not happend here but just for integrity check
+    return makeEmptyArrayVector(pool_, UNKNOWN());
Contributor

Maybe it's better to throw for unexpected behavior.

Contributor

+1. If the child size is 0, it is an empty list, and we should not enter this method. @WangGuangxin can you address this comment?
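
The suggestion in this thread (throw on an unexpected empty input and route empty literals elsewhere) can be sketched in plain C++ as follows. This is only an illustration of the pattern under simplifying assumptions, not the code that was merged: `ListLiteral`, `convertNonEmptyList` and `convertList` are hypothetical names, and a `std::vector<std::int64_t>` stands in for the Velox array vector.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical, simplified model of a list literal holding its child values.
struct ListLiteral {
  std::vector<std::int64_t> values;
};

// Converts a non-empty list literal. An empty input means the caller skipped
// the empty-list path, so fail loudly instead of returning a placeholder.
inline std::vector<std::int64_t> convertNonEmptyList(const ListLiteral& literal) {
  if (literal.values.empty()) {
    throw std::invalid_argument("empty list literal must use the empty-list path");
  }
  return literal.values;
}

// Dispatcher: empty literals are handled before the non-empty converter runs.
inline std::vector<std::int64_t> convertList(const ListLiteral& literal) {
  if (literal.values.empty()) {
    return {};  // empty result, built without visiting any children
  }
  assert(!literal.values.empty());  // invariant relied on by the converter
  return convertNonEmptyList(literal);
}
```

With this split, a silent fallback value can no longer mask a dispatch bug: calling `convertNonEmptyList` directly with an empty literal raises `std::invalid_argument`, while `convertList` returns an empty result.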

@@ -393,7 +405,8 @@ ArrayVectorPtr SubstraitVeloxExprConverter::literalsToArrayVector(const ::substr
 MapVectorPtr SubstraitVeloxExprConverter::literalsToMapVector(const ::substrait::Expression::Literal& literal) {
   auto childSize = literal.map().key_values().size();
   if (childSize == 0) {
-    return makeEmptyMapVector(pool_);
+    // childSize == 0 should not happend here but just for integrity check
+    return makeEmptyMapVector(pool_, UNKNOWN(), UNKNOWN());
Contributor

Same.

@@ -497,6 +510,7 @@ RowVectorPtr SubstraitVeloxExprConverter::literalsToRowVector(const ::substrait:
vectors.emplace_back(literalsToMapVector(child));
break;
}

Contributor

nit: remove this empty line?

Run Gluten Clickhouse CI

@ulysses-you
Contributor

@WangGuangxin can you take a look at the CH backend test? It still seems to be failing.

@WangGuangxin
Contributor Author

@WangGuangxin can you take a look at the CH backend test? It still seems to be failing.

@ulysses-you Sure. Do you mean this one?
[image]

I cannot log in to it, since it requires a username and password.

Can you paste the failing test?

@ulysses-you
Contributor

github-actions bot commented Mar 2, 2024

Run Gluten Clickhouse CI

@WangGuangxin
Contributor Author

It's ok now, cc @ulysses-you @rui-mo

validateFallbackResult("SELECT array(map())")
validateFallbackResult("SELECT map()")
validateFallbackResult("SELECT map(1, array())")
validateFallbackResult("SELECT map(1, map())")
validateFallbackResult("SELECT array(null)")
Contributor

Do you know why it falls back?

Contributor Author

Yes. It's not related to empty literals but to NullType support.
I'll fix it in another PR; otherwise this PR would be too big.
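
The reply above attributes these fallbacks to NullType rather than to emptiness. Assuming each of the listed expressions has NullType somewhere inside its type (for example, an untyped `map()` nested in an array), a fallback check of that shape could look like the sketch below. The type tree and all names (`Kind`, `TypeNode`, `leaf`, `arrayOf`, `mapOf`, `containsNullType`) are hypothetical and are not Spark, Gluten or Velox classes.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical miniature type tree, for illustration only.
enum class Kind { Null, Integer, Array, Map };

struct TypeNode {
  Kind kind;
  std::vector<std::shared_ptr<TypeNode>> children;  // element, or key and value
};

inline std::shared_ptr<TypeNode> leaf(Kind kind) {
  return std::make_shared<TypeNode>(TypeNode{kind, {}});
}

inline std::shared_ptr<TypeNode> arrayOf(std::shared_ptr<TypeNode> element) {
  return std::make_shared<TypeNode>(TypeNode{Kind::Array, {std::move(element)}});
}

inline std::shared_ptr<TypeNode> mapOf(std::shared_ptr<TypeNode> key,
                                       std::shared_ptr<TypeNode> value) {
  return std::make_shared<TypeNode>(
      TypeNode{Kind::Map, {std::move(key), std::move(value)}});
}

// Returns true if NullType appears at any depth of the type.
inline bool containsNullType(const TypeNode& type) {
  if (type.kind == Kind::Null) {
    return true;
  }
  for (const auto& child : type.children) {
    assert(child != nullptr);
    if (containsNullType(*child)) {
      return true;
    }
  }
  return false;
}
```

Under that assumption, a type such as a map from integer to an array of NullType is flagged because of the nested leaf, while a map from integer to integer is not.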

@ulysses-you
Contributor

@WangGuangxin can you resolve this comment? #4771 (comment)

@WangGuangxin
Contributor Author

@WangGuangxin can you resolve this comment? #4771 (comment)

OK.

@zhouyuan
Contributor

zhouyuan commented Mar 4, 2024

CC: @PHILO-HE

github-actions bot commented Mar 4, 2024

Run Gluten Clickhouse CI

rui-mo previously approved these changes Mar 5, 2024
Contributor

@rui-mo left a comment

If CI passes.

@WangGuangxin
Contributor Author

There is a failing test, VeloxSubstraitRoundTripTest.arrayLiteral. I'll fix it.

github-actions bot commented Mar 5, 2024

Run Gluten Clickhouse CI

@WangGuangxin
Contributor Author

Issues resolved. @ulysses-you @rui-mo

Contributor

@rui-mo left a comment

Thanks!

@ulysses-you ulysses-you merged commit c2b0805 into apache:main Mar 6, 2024
20 checks passed
@GlutenPerfBot
Contributor

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

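| query | log/native_4771_time.csv | log/native_master_03_06_2024_10910e900_time.csv | difference | percentage |
|-------|--------------------------|--------------------------------------------------|------------|------------|
| q1 | 34.77 | 35.69 | 0.920 | 102.65% |
| q2 | 26.27 | 24.09 | -2.178 | 91.71% |
| q3 | 37.54 | 37.17 | -0.367 | 99.02% |
| q4 | 38.11 | 37.31 | -0.800 | 97.90% |
| q5 | 71.46 | 70.29 | -1.162 | 98.37% |
| q6 | 7.16 | 7.07 | -0.087 | 98.79% |
| q7 | 84.52 | 84.52 | -0.001 | 100.00% |
| q8 | 87.11 | 86.24 | -0.877 | 98.99% |
| q9 | 124.63 | 125.77 | 1.144 | 100.92% |
| q10 | 44.26 | 42.91 | -1.349 | 96.95% |
| q11 | 20.77 | 20.70 | -0.072 | 99.65% |
| q12 | 29.19 | 29.30 | 0.106 | 100.36% |
| q13 | 45.88 | 45.63 | -0.251 | 99.45% |
| q14 | 15.59 | 20.55 | 4.957 | 131.79% |
| q15 | 28.78 | 29.07 | 0.291 | 101.01% |
| q16 | 12.73 | 13.92 | 1.193 | 109.37% |
| q17 | 104.63 | 102.68 | -1.952 | 98.13% |
| q18 | 144.11 | 145.05 | 0.938 | 100.65% |
| q19 | 12.67 | 12.61 | -0.065 | 99.49% |
| q20 | 27.58 | 25.67 | -1.908 | 93.08% |
| q21 | 226.00 | 224.29 | -1.708 | 99.24% |
| q22 | 15.05 | 13.74 | -1.306 | 91.32% |
| total | 1238.81 | 1234.28 | -4.534 | 99.63% |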
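
The report does not say how its last two columns are derived. Judging from the numbers, the difference appears to be the master time minus the PR time, and the percentage appears to be the master time divided by the PR time. The C++ sketch below encodes that inferred reading; `Comparison`, `compare` and `roundTo2` are hypothetical names and not part of the benchmark tooling.

```cpp
#include <cassert>
#include <cmath>

// Inferred from the table values, not from the report generator itself.
struct Comparison {
  double difference;  // baseline (master) time minus PR time
  double percentage;  // baseline time as a percentage of PR time
};

inline Comparison compare(double prTime, double baselineTime) {
  assert(prTime > 0.0);  // a zero PR time would make the ratio undefined
  return Comparison{baselineTime - prTime, baselineTime / prTime * 100.0};
}

// Rounds to two decimal places, matching the precision of the percentage column.
inline double roundTo2(double value) {
  return std::round(value * 100.0) / 100.0;
}
```

For q1, `compare(34.77, 35.69)` gives a difference of about 0.92 and a percentage that rounds to 102.65, matching the row. Other rows can differ in the last digits (q14, for instance), presumably because the report computes from unrounded timings.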

taiyang-li pushed a commit to bigo-sg/gluten that referenced this pull request Mar 25, 2024
WangGuangxin added a commit to WangGuangxin/gluten that referenced this pull request May 14, 2024