@WillAyd WillAyd commented Mar 5, 2024

supersedes #56432

This is just a research project at the moment, but the main goal is to use templated functions to get better code organization and runtime performance than what's in main.

Haven't spent a ton of time optimizing, but the first benchmark run shows factorization of a high cardinality column can be over 2x as fast, though factorization of a low cardinality column is 2x as slow.

| Change   | Before [58e63ec1] <khashl-usage~7^2>   | After [23fc5842] <khashl-usage>   |   Ratio | Benchmark (Parameter)                                                                        |
|----------|----------------------------------------|-----------------------------------|---------|----------------------------------------------------------------------------------------------|
| +        | 1.39±0.04ms                            | 3.05±0.3ms                        |    2.19 | hash_functions.Unique.time_unique_with_duplicates('Int64')                                   |
| +        | 21.4±0.8μs                             | 40.1±20μs                         |    1.88 | hash_functions.NumericSeriesIndexingShuffled.time_loc_slice(<class 'numpy.float64'>, 10000)  |
| +        | 2.55±0.09ms                            | 3.96±0.2ms                        |    1.55 | hash_functions.Unique.time_unique('Int64')                                                   |
| +        | 53.5±3μs                               | 81.6±20μs                         |    1.52 | hash_functions.NumericSeriesIndexingShuffled.time_loc_slice(<class 'numpy.float64'>, 100000) |
| +        | 3.88±0.3ms                             | 5.85±0.8ms                        |    1.51 | hash_functions.Unique.time_unique_with_duplicates('Float64')                                 |
| +        | 15.3±0.2μs                             | 17.2±0.6μs                        |    1.12 | hash_functions.NumericSeriesIndexing.time_loc_slice(<class 'numpy.float64'>, 10000)          |
| -        | 8.46±0.3ms                             | 7.28±0.4ms                        |    0.86 | hash_functions.UniqueAndFactorizeArange.time_unique(5)                                       |
| -        | 9.01±0.2ms                             | 7.48±0.2ms                        |    0.83 | hash_functions.UniqueAndFactorizeArange.time_unique(15)                                      |
| -        | 8.68±0.4ms                             | 7.14±0.3ms                        |    0.82 | hash_functions.UniqueAndFactorizeArange.time_unique(4)                                       |
| -        | 8.84±0.4ms                             | 7.23±0.1ms                        |    0.82 | hash_functions.UniqueAndFactorizeArange.time_unique(9)                                       |
| -        | 8.86±0.4ms                             | 7.14±0.4ms                        |    0.81 | hash_functions.UniqueAndFactorizeArange.time_unique(11)                                      |
| -        | 8.76±0.3ms                             | 6.64±0.1ms                        |    0.76 | hash_functions.UniqueAndFactorizeArange.time_unique(8)                                       |
| -        | 14.4±1ms                               | 7.10±0.8ms                        |    0.49 | hash_functions.UniqueAndFactorizeArange.time_factorize(15)                                   |
| -        | 14.7±0.9ms                             | 6.66±0.3ms                        |    0.45 | hash_functions.UniqueAndFactorizeArange.time_factorize(10)                                   |
| -        | 14.7±1ms                               | 6.58±0.1ms                        |    0.45 | hash_functions.UniqueAndFactorizeArange.time_factorize(11)                                   |
| -        | 15.1±2ms                               | 6.76±0.2ms                        |    0.45 | hash_functions.UniqueAndFactorizeArange.time_factorize(12)                                   |
| -        | 15.4±1ms                               | 6.95±0.3ms                        |    0.45 | hash_functions.UniqueAndFactorizeArange.time_factorize(7)                                    |
| -        | 15.2±1ms                               | 6.92±0.4ms                        |    0.45 | hash_functions.UniqueAndFactorizeArange.time_factorize(8)                                    |
| -        | 14.9±1ms                               | 6.60±0.2ms                        |    0.44 | hash_functions.UniqueAndFactorizeArange.time_factorize(4)                                    |
| -        | 15.4±1ms                               | 6.69±0.1ms                        |    0.44 | hash_functions.UniqueAndFactorizeArange.time_factorize(5)                                    |
| -        | 15.3±1ms                               | 6.77±0.2ms                        |    0.44 | hash_functions.UniqueAndFactorizeArange.time_factorize(6)                                    |
| -        | 15.7±0.8ms                             | 6.27±0.1ms                        |    0.4  | hash_functions.UniqueAndFactorizeArange.time_factorize(9)                                    |

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE DECREASED.
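
For context on what the `time_factorize` and `time_unique` benchmarks above exercise: factorization assigns each value an integer code pointing at its first-seen unique value, using a hash table for the lookups. The following is only a minimal pure-Python sketch of that idea, with a `dict` standing in for the khash/khashl table; the function name and inputs are hypothetical and this is not the pandas implementation.

```python
def factorize_sketch(values):
    """Return (codes, uniques) for an iterable of hashable values.

    Illustrative only: a dict stands in for the C hash table.
    """
    table = {}    # value -> integer code
    uniques = []  # unique values in first-seen order
    codes = []    # one code per input value
    for value in values:
        code = table.get(value)
        if code is None:
            # First time this value is seen: insert it into the table.
            code = len(uniques)
            table[value] = code
            uniques.append(value)
        codes.append(code)
    return codes, uniques


# High cardinality: every value is new, so every step is an insert.
high_codes, high_uniques = factorize_sketch(range(10))

# Low cardinality: most steps are lookups that hit an existing key.
low_codes, low_uniques = factorize_sketch([1, 2, 1, 2, 1])
```

In the high cardinality case the work is dominated by inserts (and table growth), while in the low cardinality case it is dominated by lookups of keys already present, which may be part of why the two kinds of benchmark move in opposite directions.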

@WillAyd WillAyd added the Performance Memory or execution speed performance label Mar 5, 2024
mroeschke previously approved these changes Mar 5, 2024

@mroeschke mroeschke left a comment

Oops accidentally approved

github-actions bot commented Apr 8, 2024

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

@github-actions github-actions bot added the Stale label Apr 8, 2024

WillAyd commented Apr 9, 2024

Will reopen if I get time to look at this again. Generally the idea is promising; it just takes some time to get the benchmarks right.

@WillAyd WillAyd closed this Apr 9, 2024