Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(core): FastMap to use Robin Hood hashing #4054

Closed
wants to merge 7 commits into from
Closed

Conversation

jerrinot
Copy link
Contributor

@jerrinot jerrinot commented Dec 11, 2023

FastMap to use Robin Hood hashing.

Testing setup: Clickbench running on c6a.16xlarge
Query: SELECT UserID, COUNT(*) AS c FROM hits GROUP BY UserID ORDER BY c DESC LIMIT 10;

7.3.8-snapshot vanilla
timings: 10992530610
timings: 10956769125
timings: 10934153712
timings: 10952430208
timings: 10920044785
timings: 10868601623
timings: 10865651603
timings: 10885177297

7.3.8-snapshot + Robin Hood FastMap
timings: 10353366441
timings: 10266570821
timings: 10262674985
timings: 10261647727
timings: 10245524141
timings: 10273126124
timings: 10252424759
timings: 10245550197

@puzpuzpuz puzpuzpuz added SQL Issues or changes relating to SQL execution Performance Performance improvements labels Dec 11, 2023
@jerrinot
Copy link
Contributor Author

jerrinot commented Dec 11, 2023

oh boy, a correctness regression
edit: fixed

@jerrinot jerrinot marked this pull request as draft December 11, 2023 11:49
@jerrinot jerrinot marked this pull request as ready for review December 11, 2023 14:09
@@ -333,22 +333,63 @@ private FastMapValue asNew(BaseKey keyWriter, int index, int hashCode, FastMapVa
kPos = (kPos + 7) & ~0x7;
setOffset(offsets, index, keyWriter.startAddress - heapStart);
setHashCode(offsets, index, hashCode);
if (--free == 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we change the default load factor for FastMap to 0.8 or even 0.9? To me, the main benefit of using Robin Hood hashing is being able to fit more entries into less memory and that should improve performance in certain scenarios.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea, I can test it.

@ideoma
Copy link
Collaborator

ideoma commented Jan 5, 2024

[PR Coverage check]

😍 pass : 50 / 50 (100.00%)

file detail

path covered line new line coverage
🔵 io/questdb/cairo/map/FastMap.java 50 50 100.00%

@jerrinot
Copy link
Contributor Author

jerrinot commented Jan 5, 2024

closing as new results do not show any improvements with default load and actually it performs worse with higher loads. chances are this PR shifted bottlenecks: #4032

@jerrinot jerrinot closed this Jan 5, 2024
@bluestreak01 bluestreak01 deleted the jh_rh_hash branch January 8, 2024 13:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Performance Performance improvements SQL Issues or changes relating to SQL execution
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants