Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GLUTEN-4483][CH]Improve divide function #4484

Closed
wants to merge 8 commits into from

Conversation

KevinyhZou
Copy link
Contributor

What changes were proposed in this pull request?

(Please fill in changes proposed in this fix)

(Fixes: #4483)

How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)

性能测试,通过开发完成的 benchmark_spark_divide_function.cpp 测试 新的divide 函数, 对比CH 的divide函数,以及过去使用FunctionParser的divide 的性能差异,如下

2024-01-22 16:51:29.252 <Warning> SignalHandler: LD_PRELOAD is not set, SignalHandler is disabled
2024-01-22T16:51:29+08:00
Running ./benchmark_local_engine
Run on (32 X 2100 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x16)
  L1 Instruction 32 KiB (x16)
  L2 Unified 1024 KiB (x16)
  L3 Unified 11264 KiB (x2)
Load Average: 13.31, 24.04, 42.68
--------------------------------------------------------------------------------------
Benchmark                                            Time             CPU   Iterations
--------------------------------------------------------------------------------------
BM_CHDivideFunction/iterations:10                  268 ms          260 ms           10
BM_SparkDivideFunction/iterations:10               243 ms          243 ms           10
BM_GlutenDivideFunctionParser/iterations:10        322 ms          322 ms           10

可见, 新的SparkDivideFunction 相对于CH 的Divide 大约有10% 左右的提升,相对于Gluten 的FunctionParser 的实现约有30% ~ 40% 左右的提升

Copy link

#4483

Copy link

Run Gluten Clickhouse CI

2 similar comments
Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

@@ -63,6 +63,7 @@ include_directories(
${CMAKE_CURRENT_BINARY_DIR}/proto
${THRIFT_INCLUDE_DIR}
${CMAKE_BINARY_DIR}/contrib/thrift-cmake
${CMAKE_BINARY_DIR}/contrib/llvm-project/llvm/include
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if not add this, while compile code, will has the excpeiton 'llvm/IR/IRBuilder.h' file can not be found, as this is defined in FunctionBinaryArithmetic.h.

@@ -72,6 +73,8 @@ include_directories(
${ClickHouse_SOURCE_DIR}/contrib/azure/sdk/storage/azure-storage-blobs/inc
${ClickHouse_SOURCE_DIR}/contrib/azure/sdk/core/azure-core/inc
${ClickHouse_SOURCE_DIR}/contrib/azure/sdk/storage/azure-storage-common/inc
${ClickHouse_SOURCE_DIR}/contrib/llvm-project/llvm/include
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as above mentioned

R r = vec2->getElement(i);
if (r == 0)
{
data[i] = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

Copy link

This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days.

@github-actions github-actions bot added the stale stale label Mar 11, 2024
Copy link

This PR was auto-closed because it has been stalled for 10 days with no activity. Please feel free to reopen if it is still valid. Thanks.

@github-actions github-actions bot closed this Mar 21, 2024
@KevinyhZou KevinyhZou changed the title [GLUTEN-4483][CH]Improve divide [GLUTEN-4483][CH]Improve divide function Apr 12, 2024
Copy link

#4483

@KevinyhZou
Copy link
Contributor Author

reopen

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stale stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CH] Improve divide function performance
2 participants