
Unexpected performance drop-off (~10x) aggregating 64bit int vs 32bit int groups #4208

alexander-beedie opened this issue Jul 26, 2022 · 3 comments · Fixed by #4304
alexander-beedie commented Jul 26, 2022

What happens?

I was looking at the grouped aggregation benchmark earlier, as referenced in Parallel Grouped Aggregation in DuckDB, and found a dramatic performance issue (so large that I suspect it is a bug) simply by using 64bit values, while keeping everything else about the benchmark the same.

Performance drops by a full order of magnitude (~10x) across the board.

To Reproduce

Run the grouped aggregation benchmark with the following minor change to the experiment() function:

# scale by a value that will force the underlying column dtypes to 64bit
scaling_factor = 5_000_000_000

g1 = np.repeat(np.arange(0, int(n_groups / 2)), int(n_rows / (n_groups / 2))) * scaling_factor
g2 = np.tile(np.arange(0, 2), int(n_rows / 2)) * scaling_factor
d = np.random.randint(0, n_rows, n_rows) * scaling_factor

This changes nothing about how much data is generated; it simply scales the test values to guarantee that 64bit column dtypes are applied. The result is an unexpected performance falloff.
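For convenience, a rough self-contained version of the modified experiment is sketched below. The exact query, data layout, and timing harness of the original benchmark may differ, so treat this as an approximation rather than the benchmark itself:

```python
import time

import duckdb
import numpy as np
import pandas as pd

n_rows, n_groups = 10_000_000, 1_000

# scale by a value that will force the underlying column dtypes to 64bit
scaling_factor = 5_000_000_000

g1 = np.repeat(np.arange(0, int(n_groups / 2)), int(n_rows / (n_groups / 2))) * scaling_factor
g2 = np.tile(np.arange(0, 2), int(n_rows / 2)) * scaling_factor
d = np.random.randint(0, n_rows, n_rows) * scaling_factor

df = pd.DataFrame({"g1": g1, "g2": g2, "d": d})
assert df.dtypes.eq("int64").all()  # all three columns end up as 64bit ints

con = duckdb.connect()
start = time.time()
con.execute("SELECT g1, g2, SUM(d) FROM df GROUP BY g1, g2").fetchall()
print(f"grouped aggregation: {time.time() - start:.3f}s")
```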

Results

Characterizing the slowdown under 32bit and 64bit regimes, with Polars as a reference implementation (all times in seconds):

| rows | groups | DuckDB 32bit | DuckDB 64bit | DuckDB slowdown | Polars 32bit | Polars 64bit | Polars slowdown |
|---:|---:|---:|---:|---:|---:|---:|---:|
| 10,000,000 | 1,000 | 0.0140 | 0.588 | 42.0x | 0.032 | 0.088 | 2.8x |
| 10,000,000 | 10,000 | 0.0520 | 0.664 | 12.8x | 0.031 | 0.089 | 2.9x |
| 10,000,000 | 100,000 | 0.0680 | 0.599 | 8.8x | 0.038 | 0.098 | 2.6x |
| 10,000,000 | 1,000,000 | 0.1280 | 1.481 | 11.6x | 0.090 | 0.183 | 2.0x |
| 10,000,000 | 10,000,000 | 0.1980 | 2.830 | 14.3x | 0.228 | 0.307 | 1.3x |
| 100,000,000 | 1,000 | 0.2130 | 4.145 | 19.5x | 0.319 | 0.841 | 2.6x |
| 100,000,000 | 100,000 | 0.5490 | 4.530 | 8.3x | 0.415 | 0.978 | 2.4x |
| 100,000,000 | 1,000,000 | 0.8060 | 7.525 | 9.3x | 0.841 | 1.893 | 2.3x |
| 100,000,000 | 10,000,000 | 1.5920 | 16.932 | 10.6x | 1.609 | 2.984 | 1.9x |
| 100,000,000 | 100,000,000 | 2.7520 | 31.819 | 11.6x | 3.729 | 5.861 | 1.6x |

Moving to 64bit...

  • DuckDB slows down by ~10x across the board; this degree of drop-off is not proportional to the extra work being done.

  • Polars slows down by ~2x, which is broadly what you'd expect, given that 64bit values occupy twice the memory of 32bit values (a quick check of that expectation is sketched below).
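
As a rough sanity check of the "twice the memory" expectation, the per-column footprint really does double when moving from 32bit to 64bit integers (plain numpy, nothing DuckDB-specific):

```python
import numpy as np

a32 = np.arange(10_000_000, dtype=np.int32)
a64 = a32.astype(np.int64)

# 40,000,000 bytes vs 80,000,000 bytes for the same 10M values
print(a32.nbytes, a64.nbytes)
```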

[Figure: DuckDB vs Polars grouped aggregation timings. y-axis: time in seconds (lower is better); x-axis: rows/groups buckets from the table above.]

OS:

macOS (12.4)

DuckDB Version:

0.4.0

DuckDB Client:

Python

Full Name:

Alexander Beedie

Have you tried this on the latest master branch?

  • I agree (no - tested on latest officially released package, v0.4.0)

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

  • I agree
alexander-beedie changed the title from "Huge performance drop-off (~10x) aggregating 64 bit int vs 32 bit int groups" to "Unexpected performance drop-off (~10x) aggregating 64bit int vs 32bit int groups" on Jul 26, 2022.
Mytherin (Collaborator) commented:

Thanks for the report! This data set seems to trigger a worst-case behavior in our hash function, causing all the data to be handled in the same hash partition. Modifying the hash function fixes the issue. This is likely because the data all consists of very large numbers that are evenly spaced out. We should definitely have a look at fixing this, although I doubt this is a common data distribution in practice.
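
To illustrate the failure mode with a toy model (this is not DuckDB's actual hash function, just a sketch of how this can happen): every key in the modified benchmark is a multiple of 5,000,000,000, which is itself a multiple of 2^9, so the low bits of every key are zero. A multiplication-only hash preserves that property, and radix partitioning on the low bits of such a hash then sends every row to the same partition:

```python
import numpy as np

# Toy model of a weak (multiplication-only) hash plus radix partitioning on
# the low bits -- NOT DuckDB's real hash, just an illustration.
scaling_factor = 5_000_000_000               # = 5**10 * 2**9, so the low 9 bits are zero
keys = np.arange(100_000, dtype=np.uint64) * np.uint64(scaling_factor)

multiplier = np.uint64(0x9E3779B97F4A7C15)   # arbitrary odd constant
hashes = keys * multiplier                   # wraps modulo 2**64; the zero low bits survive

partitions = hashes & np.uint64(0xFF)        # radix partition on the low 8 bits
print(np.unique(partitions))                 # -> [0]: every key lands in one partition
```

A single hot partition means one thread ends up doing essentially all of the aggregation work, which would explain a slowdown far beyond the ~2x cost of wider values.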

alexander-beedie (Author) commented:

> We should definitely have a look at fixing this...

That would be great!

> ...although I doubt this is a common data distribution in practice.

Hopefully not; I might try some other workloads to see if this can be triggered in other data regimes, because when it does occur it seems to hit hard, heh. Thanks! :)

Mytherin (Collaborator) commented Aug 7, 2022:

This should be fixed on the master now by the introduction of a stronger hash function (#4304).
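
For anyone wanting to sanity-check the fix on a build that includes #4304: assuming the SQL-level hash() function exposes the same internal hash used by the aggregation hash table (an assumption, not something confirmed in this thread), one can bucket its low bits over similarly scaled keys and confirm that they now spread out; the 256-bucket split below is purely illustrative, not DuckDB's internal partition count.

```python
import duckdb
import numpy as np

con = duckdb.connect()
# hash() over keys that are multiples of 5,000,000,000, as in the report
rows = con.execute(
    "SELECT hash(range * 5000000000) FROM range(100000)"
).fetchall()

hashes = np.array([r[0] for r in rows], dtype=np.uint64)
buckets, counts = np.unique(hashes & np.uint64(0xFF), return_counts=True)

# with a well-behaved hash, all 256 buckets should be populated fairly evenly
print(len(buckets), counts.min(), counts.max())
```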
