Introduce String Hash Map to speed up aggregation over short string keys. #6243

Merged
merged 3 commits into from Oct 23, 2019

Conversation

akuzm
Contributor

@akuzm akuzm commented Jul 31, 2019

This PR introduces the string hash map that speeds up the aggregation over short string keys, finishing the series of patches from #5417.

Category:

  • Performance Improvement

Changelog entry:
The performance of aggregation over short string keys is improved.

@akuzm akuzm changed the base branch from 19.11 to master August 9, 2019 12:28
@akuzm akuzm force-pushed the aku/hashtables branch 5 times, most recently from 744e913 to 5f3057f Compare August 14, 2019 11:09
@akuzm akuzm force-pushed the aku/hashtables branch 4 times, most recently from 79ce3e1 to 2bb68c2 Compare August 21, 2019 10:23
@akuzm akuzm changed the title [wip] memory management in hash tables [wip] string hash tables review Aug 30, 2019
@akuzm akuzm force-pushed the aku/hashtables branch 2 times, most recently from c00ddeb to ce3671e Compare September 25, 2019 11:47
@akuzm akuzm mentioned this pull request Oct 17, 2019
@akuzm
Contributor Author

akuzm commented Oct 17, 2019

No significant difference on this query:

SELECT URL, count() AS c FROM hits_100m_single GROUP BY URL ORDER BY c DESC LIMIT 10
┏━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━━━┓
┃ version               ┃ count() ┃   min ┃   med ┃   max ┃ q95-q05 ┃
┡━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━━━┩
│ /home/akuzm/ch-string │     161 │ 6.832 │ 7.107 │ 7.296 │   0.221 │
├───────────────────────┼─────────┼───────┼───────┼───────┼─────────┤
│ /home/akuzm/ch-master │     161 │ 6.935 │ 7.108 │  7.29 │    0.22 │
└───────────────────────┴─────────┴───────┴───────┴───────┴─────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ q                                             ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ [0,0.005000114440917969,0.018000125885009766] │
└───────────────────────────────────────────────┘
result-10.17-16.23.01.txt

@akuzm
Contributor Author

akuzm commented Oct 18, 2019

This is what I got from the string_hash_map test, with ArenaKeyHolder, 49 runs:

┏━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━┓
┃ file   ┃ version              ┃    min ┃    med ┃    max ┃ q95-q05 ┃        mem ┃
┡━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━╇━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━┩
│ term1  │ HashMapWithSavedHash │  1.061 │  1.075 │  1.117 │    0.04 │  134217728 │
├────────┼──────────────────────┼────────┼────────┼────────┼─────────┼────────────┤
│ term1  │ StringHashMap        │  0.558 │  0.565 │  0.594 │   0.021 │  134217728 │
├────────┼──────────────────────┼────────┼────────┼────────┼─────────┼────────────┤
│ term2  │ HashMapWithSavedHash │  2.181 │  2.213 │  2.303 │   0.054 │  134217728 │
├────────┼──────────────────────┼────────┼────────┼────────┼─────────┼────────────┤
│ term2  │ StringHashMap        │  1.019 │  1.033 │  1.091 │    0.04 │  134217728 │
├────────┼──────────────────────┼────────┼────────┼────────┼─────────┼────────────┤
│ term4  │ HashMapWithSavedHash │  5.229 │  5.276 │  5.558 │   0.112 │  134217728 │
├────────┼──────────────────────┼────────┼────────┼────────┼─────────┼────────────┤
│ term4  │ StringHashMap        │  2.435 │  2.463 │   2.54 │    0.08 │  134217728 │
├────────┼──────────────────────┼────────┼────────┼────────┼─────────┼────────────┤
│ term8  │ HashMapWithSavedHash │  8.192 │  8.284 │  9.519 │    0.88 │  402653184 │
├────────┼──────────────────────┼────────┼────────┼────────┼─────────┼────────────┤
│ term8  │ StringHashMap        │  5.586 │  5.653 │  5.735 │   0.128 │  134217728 │
├────────┼──────────────────────┼────────┼────────┼────────┼─────────┼────────────┤
│ term16 │ HashMapWithSavedHash │  8.348 │  8.441 │  8.788 │   0.311 │  805306368 │
├────────┼──────────────────────┼────────┼────────┼────────┼─────────┼────────────┤
│ term16 │ StringHashMap        │  6.822 │  6.894 │   7.73 │   0.088 │  134217728 │
├────────┼──────────────────────┼────────┼────────┼────────┼─────────┼────────────┤
│ term24 │ HashMapWithSavedHash │  8.754 │  8.883 │  9.245 │   0.297 │ 1207959552 │
├────────┼──────────────────────┼────────┼────────┼────────┼─────────┼────────────┤
│ term24 │ StringHashMap        │  8.146 │  8.245 │  9.893 │   0.232 │  671088640 │
├────────┼──────────────────────┼────────┼────────┼────────┼─────────┼────────────┤
│ term48 │ HashMapWithSavedHash │  9.601 │  9.679 │ 10.469 │   0.326 │ 2415919104 │
├────────┼──────────────────────┼────────┼────────┼────────┼─────────┼────────────┤
│ term48 │ StringHashMap        │ 10.198 │ 10.349 │ 11.339 │   0.785 │ 2415919104 │
└────────┴──────────────────────┴────────┴────────┴────────┴─────────┴────────────┘

All short strings improve, and for long strings (term48) there is a regression of roughly 6-7% by min and median, which shouldn't be noticeable in a real workload. We'll see what the CI performance test says.
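
The percentages behind this summary can be recomputed from the median column of the table above. The snippet below is only an illustrative sketch: the helper name relativeChangePercent and the MedianRow struct are made up for this example and are not part of the PR. By these medians, term1 takes about 47% less time with StringHashMap, while term48 takes about 7% more.

```cpp
#include <cassert>
#include <cstdio>

// Relative change of `candidate` against `baseline`, in percent.
// Negative values mean the candidate took less time.
double relativeChangePercent(double baseline, double candidate)
{
    return (candidate - baseline) / baseline * 100.0;
}

// Hypothetical container for one pair of medians from the table.
struct MedianRow
{
    const char * file;
    double saved_hash;  // HashMapWithSavedHash median
    double string_map;  // StringHashMap median
};

// Median values copied from the benchmark table above.
const MedianRow median_rows[] = {
    {"term1", 1.075, 0.565},
    {"term2", 2.213, 1.033},
    {"term4", 5.276, 2.463},
    {"term8", 8.284, 5.653},
    {"term16", 8.441, 6.894},
    {"term24", 8.883, 8.245},
    {"term48", 9.679, 10.349},
};

// Prints one line per file with the signed change in percent.
void printMedianChanges()
{
    for (const MedianRow & row : median_rows)
        std::printf("%-6s %+.1f%%\n", row.file,
                    relativeChangePercent(row.saved_hash, row.string_map));
}
```

Calling printMedianChanges() lists the change per file; only the term48 line comes out positive.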

@akuzm
Contributor Author

akuzm commented Oct 18, 2019

The next thing I want to do is to study the coverage info -- e.g. mergeToViaFind is not covered at all.

@akuzm
Contributor Author

akuzm commented Oct 21, 2019

SELECT MobilePhoneModel, uniq(UserID) AS u FROM hits_100m_single WHERE MobilePhoneModel != '' GROUP BY MobilePhoneModel ORDER BY u DESC LIMIT 10
┏━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━━━┓
┃ version               ┃ count() ┃   min ┃   med ┃   max ┃ q95-q05 ┃
┡━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━━━┩
│ /home/akuzm/ch-string │     161 │ 0.657 │ 0.671 │ 0.747 │   0.037 │
├───────────────────────┼─────────┼───────┼───────┼───────┼─────────┤
│ /home/akuzm/ch-master │     161 │  0.66 │ 0.671 │ 0.726 │   0.038 │
└───────────────────────┴─────────┴───────┴───────┴───────┴─────────┘

This query regresses by a factor of 1.23 in the performance test, but I can't reproduce the difference. According to the perf report, the time is dominated by reading the table, applying the prewhere clause and calculating uniq(UserID), which is why we are not seeing any change.

@github-actions github-actions bot added the pr-build Pull request with build/testing/packaging improvement label Oct 21, 2019
@akuzm akuzm changed the title [wip] string hash tables review Introduce String Hash Map to speed up aggregation over short string keys. Oct 21, 2019
@akuzm akuzm marked this pull request as ready for review October 21, 2019 13:26
It speeds up aggregation over short string keys. Use it as a default
aggregation method for string keys.
switch (sz)
{
    case 0:
        keyHolderDiscardKey(key_holder);
Collaborator

hmm, we don't persist yet, why discard?

Contributor Author

emplace() guarantees that it will either persist or discard the key. It doesn't pass the zero key_holder forward, and instead passes the empty key0, so it doesn't need the key from key_holder and should discard it here.

Actually I'm not sure when leaving it out would have any visible consequences -- discard is only defined for SerializedKeyHolder, and if it is zero, it's going to have zero size, and the discard would be a noop.
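
The "either persist or discard" contract described above can be illustrated with a toy model. This is a sketch under stated assumptions, not the code from this PR: ToyArena, ToySerializedKeyHolder, ToySet and serializeKey are hypothetical stand-ins, and only the name keyHolderDiscardKey appears in the diff itself (the get/persist counterparts are assumed here by analogy). The empty-key branch mirrors the case 0 shown in the reviewed snippet: the table tracks the empty key on its own, so the holder's copy is discarded.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <string_view>
#include <unordered_set>

// Hypothetical arena: a deque keeps references to earlier chunks stable.
struct ToyArena
{
    std::deque<std::string> chunks;
    std::string_view push(std::string_view s) { chunks.emplace_back(s); return chunks.back(); }
    void popLast() { chunks.pop_back(); }
};

// Hypothetical holder whose key was already written to the end of the arena.
struct ToySerializedKeyHolder
{
    std::string_view key;
    ToyArena & arena;
};

std::string_view keyHolderGetKey(const ToySerializedKeyHolder & h) { return h.key; }
void keyHolderPersistKey(ToySerializedKeyHolder &) {}  // key already lives in the arena
void keyHolderDiscardKey(ToySerializedKeyHolder & h)   // roll the temporary copy back
{
    h.arena.popLast();
    h.key = {};
}

ToySerializedKeyHolder serializeKey(std::string_view s, ToyArena & arena)
{
    return {arena.push(s), arena};
}

struct ToySet
{
    std::unordered_set<std::string_view> keys;
    bool has_empty = false;

    // Returns true if the key is new. Every path either persists or discards.
    template <typename KeyHolder>
    bool emplace(KeyHolder && holder)
    {
        std::string_view key = keyHolderGetKey(holder);
        if (key.empty())
        {
            // The empty key is stored as a flag, so the holder's key is not needed.
            keyHolderDiscardKey(holder);
            bool inserted = !has_empty;
            has_empty = true;
            return inserted;
        }
        if (keys.count(key))
        {
            keyHolderDiscardKey(holder);
            return false;
        }
        keyHolderPersistKey(holder);
        keys.insert(keyHolderGetKey(holder));
        return true;
    }
};
```

In this model a duplicate or empty key leaves the arena exactly as it was before serialization, which is the visible effect of calling discard.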

Collaborator

"emplace() guarantees that it will either persist or discard the key."

Hmm, but dispatch is also used for find() and potentially hash().

Contributor Author

For uses other than emplace, it should always be the NoopKeyHolder, for which this call does nothing. I noticed that the signature of FindCallable::operator () is somewhat misleading now, because it accepts a template KeyHolder. I'll change it to only accept the NoopKeyHolder.

Contributor Author

Oops, we don't have the noop key holder anymore. It's represented by just the plain key now.
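
One way to picture "the plain key is the no-op key holder" is with generic fallback overloads, as in the sketch below. It is an assumption-laden illustration, not the PR's code: CountingHolder and dispatchLength are hypothetical, and the template overloads only approximate the idea that a shared dispatch path can call keyHolderDiscardKey on whatever it receives, with the call compiling to nothing for plain keys.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Generic fallbacks: a plain key acts as its own holder and ignores persist/discard.
template <typename Key> Key & keyHolderGetKey(Key & key) { return key; }
template <typename Key> void keyHolderPersistKey(Key &) {}
template <typename Key> void keyHolderDiscardKey(Key &) {}

// Hypothetical holder that counts discards, standing in for a real key holder.
struct CountingHolder
{
    std::string_view key;
    int discards = 0;
};

// Non-template overloads win over the generic fallbacks for CountingHolder.
inline std::string_view & keyHolderGetKey(CountingHolder & h) { return h.key; }
inline void keyHolderDiscardKey(CountingHolder & h) { ++h.discards; }

// Shared dispatch path used by both emplace-like and find-like callers (hypothetical).
template <typename KeyHolder>
size_t dispatchLength(KeyHolder && holder)
{
    std::string_view key = keyHolderGetKey(holder);
    if (key.empty())
        keyHolderDiscardKey(holder);  // no-op when `holder` is just a plain key
    return key.size();
}
```

A find-style caller passes a std::string_view directly, while an emplace-style caller passes a real holder; both go through the same dispatch function.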

Collaborator

@amosbird amosbird Oct 21, 2019

I just realized I already answered this question about OnExistingKey and OnNewKey for you in #5417 (comment). Heh, now it confuses me. Those names do convey more than they should.

Contributor Author

Well, I think persist/discard is better than onNew/onExisting, but maybe it is still bad indeed. I don't know how to rename it. Basically, it is a way for a hash table to say that it either needs a persistent version of this key, or it doesn't. Something like makeKeyPersistent/dontMakeKeyPersistent? This may also be bad, because for SerializedKeyHolder, discardKey actually destroys the key, and you can't use it anymore.

@akuzm akuzm added pr-performance Pull request with some performance improvements and removed pr-build Pull request with build/testing/packaging improvement labels Oct 21, 2019
@akuzm
Contributor Author

akuzm commented Oct 21, 2019

The ubsan failure is in preciseExp10, not related to this PR.

@github-actions github-actions bot added the pr-build Pull request with build/testing/packaging improvement label Oct 21, 2019
@blinkov blinkov removed the pr-build Pull request with build/testing/packaging improvement label Oct 21, 2019
@akuzm
Contributor Author

akuzm commented Oct 22, 2019

With the latest dispatch improvement (switch ((sz - 1) >> 3)), there is even less regression on the string_hash_map test:

┏━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━┓
┃ file   ┃ version              ┃ count() ┃   min ┃   med ┃    max ┃ q95-q05 ┃        mem ┃
┡━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━┩
│ term1  │ HashMapWithSavedHash │      21 │ 1.058 │ 1.068 │    1.1 │   0.027 │  268435456 │
├────────┼──────────────────────┼─────────┼───────┼───────┼────────┼─────────┼────────────┤
│ term1  │ StringHashMap        │      21 │ 0.539 │ 0.545 │  0.565 │   0.019 │  167794720 │
├────────┼──────────────────────┼─────────┼───────┼───────┼────────┼─────────┼────────────┤
│ term2  │ HashMapWithSavedHash │      21 │ 2.197 │  2.21 │  2.285 │   0.068 │  671088640 │
├────────┼──────────────────────┼─────────┼───────┼───────┼────────┼─────────┼────────────┤
│ term2  │ StringHashMap        │      21 │ 1.057 │ 1.075 │  1.342 │   0.255 │  269238304 │
├────────┼──────────────────────┼─────────┼───────┼───────┼────────┼─────────┼────────────┤
│ term4  │ HashMapWithSavedHash │      21 │ 5.257 │ 5.303 │  5.413 │    0.13 │ 2281701376 │
├────────┼──────────────────────┼─────────┼───────┼───────┼────────┼─────────┼────────────┤
│ term4  │ StringHashMap        │      21 │  2.61 │ 2.627 │  2.672 │   0.053 │  721436704 │
├────────┼──────────────────────┼─────────┼───────┼───────┼────────┼─────────┼────────────┤
│ term8  │ HashMapWithSavedHash │      21 │ 8.185 │ 8.251 │  8.683 │   0.127 │ 4697620480 │
├────────┼──────────────────────┼─────────┼───────┼───────┼────────┼─────────┼────────────┤
│ term8  │ StringHashMap        │      21 │ 5.742 │ 5.778 │  5.907 │   0.141 │ 2826969120 │
├────────┼──────────────────────┼─────────┼───────┼───────┼────────┼─────────┼────────────┤
│ term16 │ HashMapWithSavedHash │      21 │ 8.254 │ 8.451 │  9.588 │   0.557 │ 5100273664 │
├────────┼──────────────────────┼─────────┼───────┼───────┼────────┼─────────┼────────────┤
│ term16 │ StringHashMap        │      21 │  6.95 │ 7.035 │  8.062 │   0.595 │ 3992977440 │
├────────┼──────────────────────┼─────────┼───────┼───────┼────────┼─────────┼────────────┤
│ term24 │ HashMapWithSavedHash │      21 │ 8.741 │ 8.836 │  9.085 │   0.305 │ 5502926848 │
├────────┼──────────────────────┼─────────┼───────┼───────┼────────┼─────────┼────────────┤
│ term24 │ StringHashMap        │      21 │ 8.182 │ 8.348 │   8.54 │   0.332 │ 5066784800 │
├────────┼──────────────────────┼─────────┼───────┼───────┼────────┼─────────┼────────────┤
│ term48 │ HashMapWithSavedHash │      21 │ 9.566 │ 9.695 │ 10.905 │   1.231 │ 6710886400 │
├────────┼──────────────────────┼─────────┼───────┼───────┼────────┼─────────┼────────────┤
│ term48 │ StringHashMap        │      21 │ 9.827 │ 9.902 │ 10.856 │   0.877 │ 6710904864 │
└────────┴──────────────────────┴─────────┴───────┴───────┴────────┴─────────┴────────────┘
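
The dispatch expression mentioned before the table, switch ((sz - 1) >> 3), groups key lengths into 8-byte classes with a single subtraction and shift. The sketch below shows that arithmetic in isolation. The enum, the function name and the exact class boundaries (8, 16 and 24 bytes, with everything longer going to a generic path) are assumptions for illustration; they are not quoted from the PR.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical length classes for string keys.
enum class KeyBucket
{
    Empty,   // sz == 0
    UpTo8,   // 1..8 bytes
    UpTo16,  // 9..16 bytes
    UpTo24,  // 17..24 bytes
    Long     // 25 bytes and more
};

KeyBucket bucketForLength(size_t sz)
{
    // Handle zero first: (sz - 1) would wrap around for an unsigned zero.
    if (sz == 0)
        return KeyBucket::Empty;

    // (sz - 1) >> 3 maps 1..8 to 0, 9..16 to 1, 17..24 to 2, and so on.
    switch ((sz - 1) >> 3)
    {
        case 0:
            return KeyBucket::UpTo8;
        case 1:
            return KeyBucket::UpTo16;
        case 2:
            return KeyBucket::UpTo24;
        default:
            return KeyBucket::Long;
    }
}
```

Compared with a switch over every individual length, this form needs only four case labels, at the cost of a separate check for the empty key.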

@akuzm
Contributor Author

akuzm commented Oct 22, 2019

Functional stateless tests (debug) failure is out of disk space in the CI.

@akuzm akuzm merged commit 1a609c2 into master Oct 23, 2019
@akuzm akuzm deleted the aku/hashtables branch December 9, 2019 15:29
Labels
pr-performance Pull request with some performance improvements
3 participants