Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GLUTEN-4695][VL] Fix array type Substrait signature #4686

Merged
merged 3 commits into from
Feb 15, 2024

Conversation

liujiayi771
Copy link
Contributor

What changes were proposed in this pull request?

We need to pass elementType in ArrayType to Substrait signature.

How was this patch tested?

Add test case in TestOperator.

Copy link

Thanks for opening a pull request!

Could you open an issue for this pull request on Github Issues?

https://github.com/oap-project/gluten/issues

Then could you also rename commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

See also:

Copy link

Run Gluten Clickhouse CI

@liujiayi771
Copy link
Contributor Author

liujiayi771 commented Feb 12, 2024

ARRAY comparison not supported for values that contain nulls, exlcude the test first.

2024-02-12T03:36:26.3508665Z Caused by: java.lang.RuntimeException: Exception: VeloxUserError
2024-02-12T03:36:26.3509365Z Error Source: USER
2024-02-12T03:36:26.3509736Z Error Code: INVALID_ARGUMENT
2024-02-12T03:36:26.3510401Z Reason: ARRAY comparison not supported for values that contain nulls
2024-02-12T03:36:26.3511114Z Retriable: False
2024-02-12T03:36:26.3511767Z Expression: !decoded.base()->containsNullAt(indices[index])
2024-02-12T03:36:26.3512426Z Function: checkNestedNulls
2024-02-12T03:36:26.3512982Z File: ../../velox/functions/lib/CheckNestedNulls.cpp
2024-02-12T03:36:26.3513643Z Line: 34
2024-02-12T03:36:26.3513933Z Stack trace:
2024-02-12T03:36:26.3516454Z # 0  facebook::velox::VeloxException::VeloxException(char const*, unsigned long, char const*, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, bool, facebook::velox::VeloxException::Type, std::basic_string_view<char, std::char_traits<char> >)
2024-02-12T03:36:26.3520742Z # 1  void facebook::velox::detail::veloxCheckFail<facebook::velox::VeloxUserError, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>(facebook::velox::detail::VeloxCheckFailArgs const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
2024-02-12T03:36:26.3523237Z # 2  facebook::velox::functions::checkNestedNulls(facebook::velox::DecodedVector const&, int const*, int, bool)
2024-02-12T03:36:26.3525717Z # 3  facebook::velox::aggregate::prestosql::(anonymous namespace)::NonNumericMinAggregate::addRawInput(char**, facebook::velox::SelectivityVector const&, std::vector<std::shared_ptr<facebook::velox::BaseVector>, std::allocator<std::shared_ptr<facebook::velox::BaseVector> > > const&, bool)
2024-02-12T03:36:26.3527961Z # 4  facebook::velox::exec::StreamingAggregation::evaluateAggregates()
2024-02-12T03:36:26.3528861Z # 5  facebook::velox::exec::StreamingAggregation::getOutput()
2024-02-12T03:36:26.3530397Z # 6  facebook::velox::exec::Driver::runInternal(std::shared_ptr<facebook::velox::exec::Driver>&, std::shared_ptr<facebook::velox::exec::BlockingState>&, std::shared_ptr<facebook::velox::RowVector>&)
2024-02-12T03:36:26.3532127Z # 7  facebook::velox::exec::Driver::next(std::shared_ptr<facebook::velox::exec::BlockingState>&)
2024-02-12T03:36:26.3533129Z # 8  facebook::velox::exec::Task::next(folly::SemiFuture<folly::Unit>*)
2024-02-12T03:36:26.3533861Z # 9  gluten::WholeStageResultIterator::next()
2024-02-12T03:36:26.3534623Z # 10 Java_io_glutenproject_vectorized_ColumnarBatchOutIterator_nativeHasNext
2024-02-12T03:36:26.3535336Z # 11 0x00007f189e7bfc00

Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

@liujiayi771
Copy link
Contributor Author

@PHILO-HE @rui-mo Could you help to review?

Copy link
Contributor

@PHILO-HE PHILO-HE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your work! Could you please create an issue to record array containing null is not supported?


if (startWith(signature, "list")) {
auto listStart = signature.find_first_of('<');
auto listEnd = signature.find_first_of('>');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to consider nested array? For nested array, find_first_of('>') will return wrong ending position for outer array, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we should use find_last_of.

@liujiayi771 liujiayi771 changed the title [VL] Fix array type Substrait signature [GLUTEN-4695][VL] Fix array type Substrait signature Feb 14, 2024
Copy link

#4695

Copy link

Run Gluten Clickhouse CI

@liujiayi771
Copy link
Contributor Author

@PHILO-HE Create an issue #4695.

Copy link
Contributor

@PHILO-HE PHILO-HE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Thanks!

@PHILO-HE
Copy link
Contributor

Gluten ClickHouse CI is red. Please check it.

@liujiayi771
Copy link
Contributor Author

Gluten ClickHouse CI is red. Please check it.

CH CI has passed.

@PHILO-HE PHILO-HE merged commit 182bb9c into apache:main Feb 15, 2024
19 checks passed
@GlutenPerfBot
Copy link
Contributor

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

query log/native_4686_time.csv log/native_master_02_12_2024_65f940287_time.csv difference percentage
q1 33.44 34.33 0.895 102.68%
q2 24.41 24.28 -0.134 99.45%
q3 39.07 38.39 -0.678 98.27%
q4 37.77 37.95 0.178 100.47%
q5 71.05 70.33 -0.727 98.98%
q6 6.10 7.09 0.991 116.25%
q7 82.79 85.24 2.450 102.96%
q8 83.40 85.96 2.557 103.07%
q9 124.85 123.47 -1.378 98.90%
q10 42.88 42.49 -0.395 99.08%
q11 20.30 20.53 0.227 101.12%
q12 21.41 28.57 7.161 133.44%
q13 44.27 45.01 0.740 101.67%
q14 20.76 15.14 -5.621 72.92%
q15 28.82 27.31 -1.509 94.76%
q16 14.12 14.37 0.247 101.75%
q17 101.50 102.63 1.122 101.11%
q18 149.62 148.90 -0.723 99.52%
q19 13.92 12.50 -1.417 89.82%
q20 29.25 26.50 -2.753 90.59%
q21 222.35 227.26 4.911 102.21%
q22 17.16 14.05 -3.111 81.87%
total 1229.25 1232.28 3.033 100.25%

@GlutenPerfBot
Copy link
Contributor

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

query log/native_master_02_15_2024_time.csv log/native_master_02_12_2024_65f940287_time.csv difference percentage
q1 33.08 34.33 1.254 103.79%
q2 25.47 24.28 -1.196 95.30%
q3 37.97 38.39 0.415 101.09%
q4 39.78 37.95 -1.835 95.39%
q5 72.63 70.33 -2.303 96.83%
q6 5.49 7.09 1.604 129.22%
q7 81.13 85.24 4.114 105.07%
q8 85.37 85.96 0.587 100.69%
q9 124.15 123.47 -0.682 99.45%
q10 43.10 42.49 -0.614 98.58%
q11 19.99 20.53 0.540 102.70%
q12 26.86 28.57 1.717 106.39%
q13 46.27 45.01 -1.258 97.28%
q14 16.39 15.14 -1.253 92.35%
q15 28.00 27.31 -0.693 97.53%
q16 14.65 14.37 -0.287 98.04%
q17 101.18 102.63 1.445 101.43%
q18 148.68 148.90 0.220 100.15%
q19 12.80 12.50 -0.296 97.69%
q20 26.68 26.50 -0.181 99.32%
q21 228.94 227.26 -1.683 99.26%
q22 13.63 14.05 0.421 103.09%
total 1232.24 1232.28 0.036 100.00%

@liujiayi771 liujiayi771 deleted the array-validation branch February 16, 2024 01:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants