[GLUTEN-4350][CH] Improve NativeReader's performance for small blocks #4369

Merged
merged 1 commit into from
Jan 29, 2024


lgbo-ustc
Contributor

What changes were proposed in this pull request?


Fixes: #4350

After this modification: [image: after_improve_native_reader]

How was this patch tested?


unit tests


Run Gluten Clickhouse CI

lgbo-ustc commented Jan 11, 2024

Note that when parsing aggregate data, there are many virtual function calls to initialize each aggregate data place. Reducing these virtual function calls should bring some improvement.

I ran a test on this:

static void readFixedSizeAggregateData(DB::ReadBuffer &in, DB::ColumnPtr & column, size_t rows, NativeReader::ColumnParseUtil & column_parse_util)
{
    ColumnAggregateFunction & real_column = typeid_cast<ColumnAggregateFunction &>(*column->assumeMutable());
    auto & arena = real_column.createOrGetArena();
    ColumnAggregateFunction::Container & vec = real_column.getData();
    size_t initial_size = vec.size();
    vec.reserve(initial_size + rows);
    #if 0
    for (size_t i = 0; i < rows; ++i)
    {
        AggregateDataPtr place = arena.alignedAlloc(column_parse_util.aggregate_state_size, column_parse_util.aggregate_state_align);
        column_parse_util.aggregate_function->create(place);
        auto n = in.read(place, column_parse_util.aggregate_state_size);
        chassert(n == column_parse_util.aggregate_state_size);
        vec.push_back(place);
    }
    #else
    AggregateDataPtr places = arena.alignedAlloc(column_parse_util.aggregate_state_align_size * rows, column_parse_util.aggregate_state_align);
    column_parse_util.aggregate_function->createPlaces(places, rows, column_parse_util.aggregate_state_align_size);
    for (size_t i = 0; i < rows; ++i)
    {
        auto n = in.read(places, column_parse_util.aggregate_state_size);
        chassert(n == column_parse_util.aggregate_state_size);
        // Record the current state before advancing to the next slot.
        vec.push_back(places);
        places += column_parse_util.aggregate_state_align_size;
    }
    #endif
}
                if (!column_parse_util.aggregate_state_align)
                    column_parse_util.aggregate_state_align_size = column_parse_util.aggregate_state_size;
                else
                {
                    auto n = column_parse_util.aggregate_state_size % column_parse_util.aggregate_state_align;
                    if (n)
                    {
                        // Round the state size up to the next multiple of the alignment.
                        column_parse_util.aggregate_state_align_size = column_parse_util.aggregate_state_size + (column_parse_util.aggregate_state_align - n);
                    }
                    else
                        column_parse_util.aggregate_state_align_size = column_parse_util.aggregate_state_size;
                }
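
The intent of the alignment computation appears to be to round the state size up to a multiple of the alignment, so that consecutive states packed into one allocation each start on an aligned boundary. A minimal standalone sketch of that rounding follows; `alignedStride` is a hypothetical helper name used only for illustration and is not part of Gluten or ClickHouse.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (illustration only): returns the distance in bytes
// between consecutive states so that each one starts on an `align`-byte
// boundary. An alignment of 0 is treated as "no alignment requirement".
constexpr std::size_t alignedStride(std::size_t size, std::size_t align)
{
    if (align == 0)
        return size;
    const std::size_t remainder = size % align;
    // Pad only when the size is not already a multiple of the alignment.
    return remainder == 0 ? size : size + (align - remainder);
}

static_assert(alignedStride(24, 8) == 24, "already aligned: no padding");
static_assert(alignedStride(20, 8) == 24, "padded up to the next multiple");
static_assert(alignedStride(20, 0) == 20, "no alignment requested");
```

With a stride computed this way, state `i` in a batch allocation sits at `base + i * stride`, which is the layout the batched branch of the experiment relies on.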

Got result: [image]

However, this requires modifying ClickHouse core.
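
To make the needed core change concrete, here is a rough sketch of one shape a batched creation hook could take. Everything in it is an assumption for illustration: `ToyAggregateFunction`, `ToyCount`, and `createAndSum` are made-up names, and the `createPlaces` signature simply mirrors the call in the experiment above rather than any existing ClickHouse interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <vector>

// Toy stand-in for an aggregate-function interface (NOT ClickHouse's real one).
struct ToyAggregateFunction
{
    virtual ~ToyAggregateFunction() = default;

    // Initialise one state at `place`: one virtual call per row.
    virtual void create(char * place) const = 0;

    // Hypothetical batched variant: initialise `rows` states laid out
    // `stride` bytes apart. The default falls back to per-row create().
    virtual void createPlaces(char * places, std::size_t rows, std::size_t stride) const
    {
        for (std::size_t i = 0; i < rows; ++i)
            create(places + i * stride);
    }
};

// Example state: a 64-bit counter that starts at zero.
struct ToyCount final : ToyAggregateFunction
{
    void create(char * place) const override { new (place) std::uint64_t{0}; }

    // One virtual call for the whole batch; the loop body is a direct store.
    void createPlaces(char * places, std::size_t rows, std::size_t stride) const override
    {
        for (std::size_t i = 0; i < rows; ++i)
            new (places + i * stride) std::uint64_t{0};
    }
};

// Creates `rows` counters with a single batched call and returns their sum.
std::uint64_t createAndSum(const ToyAggregateFunction & fn, std::size_t rows)
{
    // Suitably aligned backing memory, pre-filled with a non-zero pattern.
    std::vector<std::uint64_t> storage(rows, 0xFF);
    fn.createPlaces(reinterpret_cast<char *>(storage.data()), rows, sizeof(std::uint64_t));
    std::uint64_t sum = 0;
    for (std::uint64_t v : storage)
        sum += v;
    return sum;
}
```

Keeping a default implementation that loops over `create` would let existing functions work unchanged, while functions with simple fixed-size states could override the batch path to avoid per-row dispatch.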

@baibaichen baibaichen added the look into details Clickhouse changes its interface, we need to look into whether behavior is changed or not. label Jan 29, 2024
@baibaichen baibaichen left a comment

LGTM

@baibaichen baibaichen merged commit 6cbd187 into apache:main Jan 29, 2024
7 checks passed
Successfully merging this pull request may close these issues.

[CH] The execution time of SourceFromJavaIter seems too large then expected.
2 participants