New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Use np.dot
to speedup spatial.distance.correlation
#19583
Conversation
From what I can tell the CI failure seems to be unrelated but happy to have a closer look if you think that's not the case. |
Yeah, that's unrelated. Out of interest, is |
In an isolated test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checking the benchmarks on x86_64
Linux with asv continuous -e -b ".*Xdist.*" main spatial-correlation-dot
yields nothing but improvements:
| Change | Before [ae9dc131] <main> | After [4141d2f4] <spatial-correlation-dot> | Ratio | Benchmark (Parameter) |
|----------|----------------------------|----------------------------------------------|---------|------------------------------------------------------|
| - | 1.88±0.01ms | 544±1μs | 0.29 | spatial.XdistWeighted.time_cdist(10, 'cosine') |
| - | 186±0.1ms | 53.7±0.3ms | 0.29 | spatial.XdistWeighted.time_cdist(100, 'cosine') |
| - | 7.49±0.01ms | 2.19±0.05ms | 0.29 | spatial.XdistWeighted.time_cdist(20, 'cosine') |
| - | 847±2μs | 247±1μs | 0.29 | spatial.XdistWeighted.time_pdist(10, 'cosine') |
| - | 92.9±0.2ms | 26.6±0.07ms | 0.29 | spatial.XdistWeighted.time_pdist(100, 'cosine') |
| - | 3.54±0.01ms | 1.03±0.02ms | 0.29 | spatial.XdistWeighted.time_pdist(20, 'cosine') |
| - | 1.30±0ms | 308±8μs | 0.24 | spatial.XdistWeighted.time_pdist(10, 'correlation') |
| - | 2.88±0ms | 663±0.8μs | 0.23 | spatial.XdistWeighted.time_cdist(10, 'correlation') |
| - | 288±0.6ms | 66.0±0.1ms | 0.23 | spatial.XdistWeighted.time_cdist(100, 'correlation') |
| - | 11.5±0.01ms | 2.63±0ms | 0.23 | spatial.XdistWeighted.time_cdist(20, 'correlation') |
| - | 142±0.08ms | 32.7±0.1ms | 0.23 | spatial.XdistWeighted.time_pdist(100, 'correlation') |
| - | 5.47±0.01ms | 1.25±0ms | 0.23 | spatial.XdistWeighted.time_pdist(20, 'correlation') |
SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.
Same thing on ARM Mac (M2 Max):
| Change | Before [ae9dc131] <main> | After [4141d2f4] <spatial-correlation-dot> | Ratio | Benchmark (Parameter) |
|----------|----------------------------|----------------------------------------------|---------|------------------------------------------------------|
| - | 680±5μs | 571±70μs | 0.84 | spatial.Xdist.time_pdist(1000, 'chebyshev') |
| - | 799±3μs | 262±2μs | 0.33 | spatial.XdistWeighted.time_pdist(10, 'cosine') |
| - | 1.74±0.01ms | 558±2μs | 0.32 | spatial.XdistWeighted.time_cdist(10, 'cosine') |
| - | 175±0.5ms | 55.4±0.6ms | 0.32 | spatial.XdistWeighted.time_cdist(100, 'cosine') |
| - | 3.38±0ms | 1.07±0ms | 0.32 | spatial.XdistWeighted.time_pdist(20, 'cosine') |
| - | 7.09±0.09ms | 2.22±0.01ms | 0.31 | spatial.XdistWeighted.time_cdist(20, 'cosine') |
| - | 88.7±2ms | 27.3±0.08ms | 0.31 | spatial.XdistWeighted.time_pdist(100, 'cosine') |
| - | 10.9±0.08ms | 2.79±0.01ms | 0.26 | spatial.XdistWeighted.time_cdist(20, 'correlation') |
| - | 2.74±0.07ms | 694±3μs | 0.25 | spatial.XdistWeighted.time_cdist(10, 'correlation') |
| - | 273±1ms | 69.2±2ms | 0.25 | spatial.XdistWeighted.time_cdist(100, 'correlation') |
| - | 1.24±0.02ms | 316±1μs | 0.25 | spatial.XdistWeighted.time_pdist(10, 'correlation') |
| - | 137±0.1ms | 34.1±0.07ms | 0.25 | spatial.XdistWeighted.time_pdist(100, 'correlation') |
| - | 5.40±0.03ms | 1.34±0ms | 0.25 | spatial.XdistWeighted.time_pdist(20, 'correlation') |
SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.
Combined with the fact that code complexity hasn't increased much/we haven't added more compiled code, it is a pretty easy decision to merge, thanks.
and thanks for your first contribution to SciPy |
Great, thanks for properly running the benchmarks and the fast review. |
What does this implement/fix?
This PR changes the implementation of
spatial.distance.correlation
to rely on dot products for computing the distance (7e85a88). 77f135d switches to builtin math operations when working on scalars as they have less overhead compared to numpy.Overall this reduces the runtime of
spatial.distance.cosine
by ~6 x and the weighted version by ~3.5 x 🚀Additional information
The above numbers where measured using the script below.
Here are the results of the
XdistWeighted
(Xdist
doesn't show a difference since it uses C code instead):