New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Continue optimizing branch miss of if function when result type is float*/decimal*/int* #59148
Continue optimizing branch miss of if function when result type is float*/decimal*/int* #59148
Conversation
This is an automated comment for commit 818fb98 with description of existing statuses. It's updated for the latest CI running ❌ Click here to open a full report in a separate page Successful checks
|
Performance report on x86-64Performance report on aarch64Some bad cases
|
Bad cases are solved |
Fast test failures are unrelated (unpinned dependency update). Please rebase with master |
// This macro performs a branch-free conditional assignment for floating point types. | ||
// It uses bitwise operations to avoid branching, which can be beneficial for performance. | ||
#define BRANCHFREE_IF_FLOAT(TYPE, vc, va, vb, vr) \ | ||
using UIntType = typename NumberTraits::Construct<false, false, sizeof(TYPE)>::Type; \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interestingly, we have a similar approach in AggregateFunctionSum:
ClickHouse/src/AggregateFunctions/AggregateFunctionSum.h
Lines 171 to 187 in 76abd32
static_assert(sizeof(Value) == 4 || sizeof(Value) == 8); | |
using equivalent_integer = typename std::conditional_t<sizeof(Value) == 4, UInt32, UInt64>; | |
constexpr size_t unroll_count = 128 / sizeof(T); | |
T partial_sums[unroll_count]{}; | |
const auto * unrolled_end = ptr + (count / unroll_count * unroll_count); | |
while (ptr < unrolled_end) | |
{ | |
for (size_t i = 0; i < unroll_count; ++i) | |
{ | |
equivalent_integer value; | |
std::memcpy(&value, &ptr[i], sizeof(Value)); | |
value &= (!condition_map[i] != add_if_zero) - 1; | |
Value d; | |
std::memcpy(&d, &value, sizeof(Value)); |
@Algunenano I know you like such oprimizations, so, please be a reviewer instead of me :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks really nice to me. Thanks a lot, specially for adding the extra parenthesis in for+if situations (helps readability IMO). I've left a few comments. Can you have a look, please?
BTW, this is another opt of if function when input type is map: #59413 @Algunenano |
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Continue optimizing branch miss of if function when result type is float*/decimal*/int* , follow up of #57885
The POCs helps me prove my ideas: https://github.com/bigo-sg/gluten/commits/improve_if/ (the last two commits)