Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fixed group by int16 and Date types on AMD EPYC 7401P machine #3512

Merged
merged 3 commits into from Nov 4, 2018
Merged

fixed group by int16 and Date types on AMD EPYC 7401P machine #3512

merged 3 commits into from Nov 4, 2018

Conversation

lapkofear
Copy link
Contributor

There is sever performance degradation on AMD EPYC 7401P based systems. Every query with grouping by int16 or Date type becomes very slow (see: #3490).

@@ -1475,7 +1475,9 @@ void NO_INLINE Aggregator::mergeDataImpl(
Table & table_src,
Arena * arena) const
{
for (auto it = table_src.begin(); it != table_src.end(); ++it)
Copy link
Contributor Author

@lapkofear lapkofear Nov 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the last cycle iteration, calculation of the table_src.end() becomes extremely slow(up to 1s) . for template parameters [with Method = DB::AggregationMethodOneNumber<short unsigned int, HashMapTable<long unsigned int, HashMapCell<long unsigned int, char*, TrivialHash, HashTableNoState>, TrivialHash, HashTableFixedGrower<16>, Allocator<true> > >; Table = HashMapTable<long unsigned int, HashMapCell<long unsigned int, char*, TrivialHash, HashTableNoState>, TrivialHash, HashTableFixedGrower<16>, Allocator<true> >] . Such strange behavior is not reproduced on Intel desktop machines. (On Xeon we did not test this) and on other data types.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could provide branch with logs and they output, because such behavior is really strange for me.

for (auto it = table_src.begin(); it != table_src.end(); ++it)
decltype(table_src.end()) end = table_src.end();

for (auto it = table_src.begin(); it != end; ++it)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can write
for (auto it = table_src.begin(), end = table_src.end(); it != end; ++it)
for more convenience.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, thank you.

@alexey-milovidov
Copy link
Member

Yes, really strange.
For this hash table, end() is just buf + 65536.

Also it's strange that it is calculated on every interation.
I have a testing server on AMD EPYC and I will try to reproduce.

@alexey-milovidov
Copy link
Member

I have successfully reproduced this issue.

@alexey-milovidov
Copy link
Member

I have no sudo on the server with AMD EPYC, will wait...

@lapkofear
Copy link
Contributor Author

It would be great to find out the root cause

@alexey-milovidov alexey-milovidov merged commit c25e093 into ClickHouse:master Nov 4, 2018
@alexey-milovidov
Copy link
Member

I have merged this PR, but we need to investigate further, this is TODO.

@lapkofear lapkofear deleted the amd_perf_problem branch November 5, 2018 08:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants