
Code speedup due to faster einsum replacement #23

Merged 1 commit on Sep 9, 2023

Conversation

winedarksea
Contributor

I use your simdkalman in my AutoTS project; thanks for the work, it is excellent code.
While running the scalene profiler on my code base, it identified a possible performance improvement and suggested a speedup. The credit for this really goes to GPT-4.

I've tested the replacement: it is faster and functionally identical for my use case. I haven't tested it across your entire code base's range of inputs and outputs, though, so there may be some hidden catch that I am unaware of.

Simple speed comparison:

%timeit np.einsum("...ij,...kj->...ik", A, B)
31 ms ± 967 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit np.matmul(A, np.swapaxes(B, -1, -2))
7.61 ms ± 328 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

It's hard to quantify the bigger picture, but in my use case the change makes the function about 10% faster overall.
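For reference, the two expressions compute the same batched A @ B^T contraction, so swapping one for the other should not change results beyond floating-point rounding. A minimal equivalence check (the array shapes here are hypothetical; the PR does not state which sizes were timed):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((1000, 8, 8))
B = rng.standard_normal((1000, 8, 8))

# Original form: batched contraction over the last axis of both operands.
ref = np.einsum("...ij,...kj->...ik", A, B)

# Proposed replacement: the same contraction as an explicit batched matmul
# against B with its last two axes transposed.
fast = np.matmul(A, np.swapaxes(B, -1, -2))

print(np.allclose(ref, fast))  # → True
```

Timing the two forms (e.g. with `%timeit` as above) on the actual array shapes used in the model is the most reliable way to confirm the speedup on a given machine.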

@oseiskar
Owner

oseiskar commented Sep 3, 2023

Thank you! This seems like a great performance tweak. I see no immediate reason to prefer einsum over swapaxes here. The original design goal was simply to ensure that all relevant methods use vectorized NumPy operations; I did not explore very thoroughly which NumPy operations would be fastest once the first working one was found.

Speculating about the performance: it can be critical which BLAS/LAPACK operation this eventually translates to under the hood of NumPy, and if the answer is "none", that is bad too. The swapaxes version probably translates directly to a sequence of very efficient matrix multiplications where the LHS is stored in row-major and the RHS in column-major order, while einsum could fall back to a generic C loop.
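One detail supporting this: `np.swapaxes` returns a strided view rather than a copy, so the transpose itself is free, and `matmul` receives the original buffer with swapped strides, which a GEMM-style kernel can consume directly. A small sketch (shapes chosen arbitrarily for illustration):

```python
import numpy as np

B = np.zeros((1000, 8, 8))
Bt = np.swapaxes(B, -1, -2)

# The swap is a zero-copy view over B's buffer...
print(Bt.base is B)  # → True

# ...whose last two axes are now stored "column-major" relative to B,
# i.e. the view is no longer C-contiguous.
print(Bt.flags['C_CONTIGUOUS'])  # → False
```

So no data movement happens before the multiply; whether einsum ends up on an equally fast path for this particular subscript pattern is the open question the timing above answers empirically.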

I will run some tests and merge this if there are no issues. It may take a week or two since I'm currently a bit swamped.

@oseiskar oseiskar merged commit 9b75ad8 into oseiskar:master Sep 9, 2023
@oseiskar
Owner

oseiskar commented Sep 9, 2023

Worked fine. Merged and released in 1.0.3. Thanks again!
