Skip to content

fixed group by int16 and Date types on AMD EPYC 7401P machine#3512

Merged
alexey-milovidov merged 3 commits intoClickHouse:masterfrom
lapkofear:amd_perf_problem
Nov 4, 2018
Merged

fixed group by int16 and Date types on AMD EPYC 7401P machine#3512
alexey-milovidov merged 3 commits intoClickHouse:masterfrom
lapkofear:amd_perf_problem

Conversation

@lapkofear
Copy link
Copy Markdown
Contributor

There is sever performance degradation on AMD EPYC 7401P based systems. Every query with grouping by int16 or Date type becomes very slow (see: #3490).

Table & table_src,
Arena * arena) const
{
for (auto it = table_src.begin(); it != table_src.end(); ++it)
Copy link
Copy Markdown
Contributor Author

@lapkofear lapkofear Nov 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the last cycle iteration, calculation of the table_src.end() becomes extremely slow(up to 1s) . for template parameters [with Method = DB::AggregationMethodOneNumber<short unsigned int, HashMapTable<long unsigned int, HashMapCell<long unsigned int, char*, TrivialHash, HashTableNoState>, TrivialHash, HashTableFixedGrower<16>, Allocator<true> > >; Table = HashMapTable<long unsigned int, HashMapCell<long unsigned int, char*, TrivialHash, HashTableNoState>, TrivialHash, HashTableFixedGrower<16>, Allocator<true> >] . Such strange behavior is not reproduced on Intel desktop machines. (On Xeon we did not test this) and on other data types.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could provide branch with logs and they output, because such behavior is really strange for me.

Comment thread dbms/src/Interpreters/Aggregator.cpp Outdated
for (auto it = table_src.begin(); it != table_src.end(); ++it)
decltype(table_src.end()) end = table_src.end();

for (auto it = table_src.begin(); it != end; ++it)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can write
for (auto it = table_src.begin(), end = table_src.end(); it != end; ++it)
for more convenience.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, thank you.

@alexey-milovidov
Copy link
Copy Markdown
Member

Yes, really strange.
For this hash table, end() is just buf + 65536.

Also it's strange that it is calculated on every interation.
I have a testing server on AMD EPYC and I will try to reproduce.

@alexey-milovidov
Copy link
Copy Markdown
Member

I have successfully reproduced this issue.

@alexey-milovidov
Copy link
Copy Markdown
Member

I have no sudo on the server with AMD EPYC, will wait...

@lapkofear
Copy link
Copy Markdown
Contributor Author

It would be great to find out the root cause

@alexey-milovidov alexey-milovidov merged commit c25e093 into ClickHouse:master Nov 4, 2018
@alexey-milovidov
Copy link
Copy Markdown
Member

I have merged this PR, but we need to investigate further, this is TODO.

@lapkofear lapkofear deleted the amd_perf_problem branch November 5, 2018 08:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants