A need for contiguity-based axis-order optimization in tensordot #11940

Open
rsokl opened this issue Sep 13, 2018 · 2 comments
Comments

@rsokl
Contributor

rsokl commented Sep 13, 2018

The ordering of axes fed to tensordot can have a massive (order-of-magnitude) impact on its efficiency, depending on the memory layout of the array(s) being summed:

>>> import numpy as np
>>> x = np.random.rand(100, 100, 100)
>>> %%timeit
... np.tensordot(x, x, axes=((0, 1, 2), (0, 1, 2)))  
151 µs ± 6.9 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> %%timeit
... np.tensordot(x, x, axes=((1, 2, 0), (1, 2, 0))) 
7.9 ms ± 143 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Moving x's last axis to the front swaps the timings:

>>> xt = np.moveaxis(x, -1, 0)
>>> %%timeit
... np.tensordot(xt, xt, axes=((0, 1, 2), (0, 1, 2)))  
10.8 ms ± 213 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
>>> %%timeit
... np.tensordot(xt, xt, axes=((1, 2, 0), (1, 2, 0))) 
146 µs ± 4.47 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

As suggested by @eric-wieser, tensordot would benefit from axis-ordering based on memory contiguity to help guard against these massive slowdowns.
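
A minimal sketch of what such an optimization could look like (illustration only; contiguity_ordered_tensordot and its stride heuristic are hypothetical names, not NumPy code): reorder the paired contraction axes so that the axes of the first operand are visited in decreasing-stride order, which tends to keep the internal transpose().reshape() copy-free for contiguous inputs.

import numpy as np

def contiguity_ordered_tensordot(a, b, axes):
    # Assumes `axes` is a pair of sequences, e.g. ((0, 1, 2), (0, 1, 2)).
    axes_a, axes_b = axes
    # Reordering the axis *pairs* does not change the result; pick the order
    # that matches a's memory layout (largest stride first, i.e. C order).
    order = sorted(range(len(axes_a)),
                   key=lambda i: a.strides[axes_a[i]], reverse=True)
    axes_a = tuple(axes_a[i] for i in order)
    axes_b = tuple(axes_b[i] for i in order)
    return np.tensordot(a, b, axes=(axes_a, axes_b))

x = np.random.rand(100, 100, 100)
xt = np.moveaxis(x, -1, 0)
# Both calls now take the fast path regardless of the axis order passed in.
print(np.allclose(contiguity_ordered_tensordot(x, x, ((1, 2, 0), (1, 2, 0))),
                  np.tensordot(x, x, axes=((1, 2, 0), (1, 2, 0)))))
print(np.allclose(contiguity_ordered_tensordot(xt, xt, ((0, 1, 2), (0, 1, 2))),
                  np.tensordot(xt, xt, axes=((0, 1, 2), (0, 1, 2)))))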

@liwt31
Contributor

liwt31 commented Oct 17, 2018

I've run into this problem in my own projects, and found that the difference originates in the reshape calls inside tensordot:

at = a.transpose(newaxes_a).reshape(newshape_a)
bt = b.transpose(newaxes_b).reshape(newshape_b)

If the reshape has to copy the data into a new array here, it is very time consuming.

I don't think a general solution is easy, because as long as we rely on numpy.dot to perform the final contraction, we can't avoid rearranging the data in every case. A better approach might be to fall back to einsum when the reshape would be expensive, as sketched below.
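
A rough sketch of that einsum fallback, assuming the axes are given as a pair of sequences (the helper name and the subscript construction are mine, for illustration only; deciding *when* to fall back would additionally need a cheap check for whether the reshape copies):

import numpy as np

def tensordot_via_einsum(a, b, axes):
    axes_a, axes_b = axes
    letters = iter('abcdefghijklmnopqrstuvwxyz')
    sub_a = [None] * a.ndim
    sub_b = [None] * b.ndim
    # Contracted axis pairs share a subscript letter.
    for i, j in zip(axes_a, axes_b):
        sub_a[i] = sub_b[j] = next(letters)
    # Free axes each get their own letter; tensordot orders the output as
    # a's free axes followed by b's free axes.
    out = ''
    for sub in (sub_a, sub_b):
        for k, s in enumerate(sub):
            if s is None:
                sub[k] = next(letters)
                out += sub[k]
    subscripts = '{},{}->{}'.format(''.join(sub_a), ''.join(sub_b), out)
    # einsum iterates over the operands with their existing strides, so no
    # large temporary copy is made when the memory layout is unfavourable.
    return np.einsum(subscripts, a, b)

x = np.random.rand(100, 100, 100)
print(np.allclose(tensordot_via_einsum(x, x, ((1, 2, 0), (1, 2, 0))),
                  np.tensordot(x, x, axes=((1, 2, 0), (1, 2, 0)))))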

I'm happy to prepare a PR for this, but I would like to hear more advice first.

@liwt31
Contributor

liwt31 commented Oct 17, 2018

Here are the results of some experiments. First, reproduce the earlier results:

>>> import numpy as np
>>> x = np.random.rand(100, 100, 100)
>>> %timeit np.tensordot(x, x, axes=((0, 1, 2), (0, 1, 2)))
... 1000 loops, best of 3: 547 µs per loop
>>> %timeit np.tensordot(x, x, axes=((1, 2, 0), (1, 2, 0))) 
... 100 loops, best of 3: 13.8 ms per loop
>>> xt = np.moveaxis(x, -1, 0)
>>> %timeit np.tensordot(xt, xt, axes=((0, 1, 2), (0, 1, 2)))  
... 100 loops, best of 3: 17 ms per loop
>>> %timeit np.tensordot(xt, xt, axes=((1, 2, 0), (1, 2, 0))) 
... 1000 loops, best of 3: 498 µs per loop

Everything goes "as expected".
Next, for x, mimic the logic of the tensordot function to break down where the time goes:

>>> %timeit x.transpose((0, 1, 2)).reshape(100 ** 3)
... The slowest run took 7.41 times longer than the fastest. This could mean that an intermediate result is being cached.
    1000000 loops, best of 3: 1.23 µs per loop
>>> x_ = x.transpose((0, 1, 2)).reshape(100 ** 3)
>>> %timeit x_.dot(x_)
... 1000 loops, best of 3: 340 µs per loop
>>> %timeit x.transpose((1, 2, 0)).reshape(100 ** 3)
... 100 loops, best of 3: 5.55 ms per loop
>>> x_ = x.transpose((1, 2, 0)).reshape(100 ** 3)
>>> %timeit x_.dot(x_)
... 1000 loops, best of 3: 344 µs per loop

I'm not sure what the caching warning means, but the conclusion that the reshape takes most of the time seems clear.
Now the same for xt:

>>> %timeit xt.transpose((0, 1, 2)).reshape(100 ** 3)
... 100 loops, best of 3: 7.85 ms per loop
>>> xt_ = xt.transpose((0, 1, 2)).reshape(100 ** 3)
>>> %timeit xt_.dot(xt_)
... 1000 loops, best of 3: 345 µs per loop
>>> %timeit xt.transpose((1, 2, 0)).reshape(100 ** 3)
... The slowest run took 8.00 times longer than the fastest. This could mean that an intermediate result is being cached.
    1000000 loops, best of 3: 1.28 µs per loop
>>> xt_ = xt.transpose((1, 2, 0)).reshape(100 ** 3)
>>> %timeit xt_.dot(xt_)
... 1000 loops, best of 3: 345 µs per loop
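
One way to see at a glance which combinations above are fast (a side note, not part of tensordot itself): np.shares_memory reveals whether transpose().reshape() returned a view or had to copy the data. The fast timings correspond exactly to the view cases.

import numpy as np

x = np.random.rand(100, 100, 100)
xt = np.moveaxis(x, -1, 0)

for name, arr in (('x', x), ('xt', xt)):
    for perm in ((0, 1, 2), (1, 2, 0)):
        flat = arr.transpose(perm).reshape(100 ** 3)
        kind = 'view' if np.shares_memory(arr, flat) else 'copy'
        print(name, perm, kind)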
