
Thoughts on making the numpy dependency optional? #203

Open
janeyx99 opened this issue Nov 1, 2022 · 5 comments

janeyx99 (Contributor) commented Nov 1, 2022

Proposal

Make the numpy dependency optional, if possible.

Why?

Minimizing dependencies is a general goal as it allows a bigger audience to reap the benefits of this library. More specifically, some of us are interested in making opt_einsum a hard dependency for torch, but we would like to keep numpy unrequired. (If you're curious why torch does not have a hard dependency on numpy, see pytorch/pytorch#60081, tl;dr being the last comment.)

A hard dependency would mean all torch users would get the benefits of opt_einsum right away without thinking too hard/needing to manually install opt_einsum themselves.

Alternatives

We could also have torch vendor in opt_einsum, but that is increased complexity/maintenance + we would like to automatically subscribe to improvements in opt_einsum!

dgasmith (Owner) commented Nov 2, 2022

I think this is entirely possible, as we have a fairly weak dependence on NumPy beyond testing. Feel free to take a crack at it; I can look into removing NumPy this weekend.

jcmgray (Collaborator) commented Nov 2, 2022

Yes, I agree this would be a nice thing to do. From what I can tell, the minor problem points where one can't just import/mock numpy lazily are:

  • ssa_to_linear - this function, which is used in the greedy path optimization and is not related to backends etc., is actually $\mathcal{O}(N^2)$ and uses numpy to speed it up. I could check whether a pure Python version can be made fast enough.
  • some type hints use np.ndarray - not sure what the solution is for using types from libraries that are not installed.
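For the second point, one common pattern (a sketch of one option, not necessarily what opt_einsum should adopt) is to import numpy only for static type checkers via `typing.TYPE_CHECKING`, using string annotations so nothing is evaluated at runtime. The function below is hypothetical, purely for illustration:

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by mypy/pyright, never executed at runtime,
    # so numpy remains an optional install.
    import numpy as np

def transpose_hint(x: "np.ndarray") -> "np.ndarray":
    # String annotations are not evaluated at runtime,
    # so this module imports fine without numpy installed.
    return x
```

This keeps the precise `np.ndarray` hints for users who do have numpy, at the cost of slightly noisier quoted annotations.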

jcmgray (Collaborator) commented Nov 2, 2022

Here's an alternative pure-Python implementation of `ssa_to_linear`:

```python
def ssa_to_linear_A(ssa_path):
    # Number of inputs: each contraction consumes len(scon) ids and makes one.
    N = sum(map(len, ssa_path)) - len(ssa_path) + 1
    ids = list(range(N))
    path = []
    ssa = N
    for scon in ssa_path:
        # Translate SSA ids into current linear positions (list.index is O(N)).
        con = sorted(map(ids.index, scon))
        for j in reversed(con):
            ids.pop(j)
        ids.append(ssa)
        path.append(con)
        ssa += 1
    return path
```

It's actually faster until one gets to quite large numbers of inputs (x-axis):

[benchmark plot: ssa_to_linear runtime vs. number of inputs]

At the largest size here the actual main greedy algorithm takes ~7 sec, so the extra slowdown is not ideal but still only a small part of the whole path finding. Maybe the implementation can be improved too.
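To make the SSA-to-linear convention concrete, here is a tiny worked example (the function is copied from the comment above; the input path is my own illustration). SSA ids never change, while linear indices re-number after every contraction:

```python
def ssa_to_linear_A(ssa_path):
    N = sum(map(len, ssa_path)) - len(ssa_path) + 1
    ids = list(range(N))
    path = []
    ssa = N
    for scon in ssa_path:
        con = sorted(map(ids.index, scon))
        for j in reversed(con):
            ids.pop(j)
        ids.append(ssa)
        path.append(con)
        ssa += 1
    return path

# Three inputs 0, 1, 2: first contract tensors 0 and 1 (producing ssa id 3),
# then contract tensor 2 with that intermediate (ssa ids 2 and 3).
print(ssa_to_linear_A([(0, 1), (2, 3)]))  # → [[0, 1], [0, 1]]
```

The second linear step is `[0, 1]` again because, after the first contraction, the remaining tensors are re-numbered as `[tensor 2, intermediate]`.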

dgasmith (Owner) commented Nov 2, 2022

Nice check!

n is unlikely to go over 1000 except in extreme cases, so we could also keep two algorithms and switch between them:

```python
if n > 1000 and has_numpy:
    return _numpy_impl(*args)
else:
    return _python_impl(*args)
```

The library isn't strongly type hinted yet (still plenty of `Any` running around). I would vote we replace `np.ndarray` with some library-agnostic `Protocol` type.

jcmgray (Collaborator) commented Sep 2, 2023

I actually realized you can implement `ssa_to_linear` in $\mathcal{O}(n \log n)$ time:

```python
import bisect

def ssa_to_linear_bis(ssa_path, N=None):
    if N is None:
        N = sum(map(len, ssa_path)) - len(ssa_path) + 1
    # `ids` stays sorted: entries are only ever removed, and the appended
    # ssa id is strictly increasing, so binary search can replace the
    # O(N) list.index lookup.
    ids = list(range(N))
    path = []
    ssa = N
    for scon in ssa_path:
        con = sorted([bisect.bisect_left(ids, s) for s in scon])
        for j in reversed(con):
            ids.pop(j)
        ids.append(ssa)
        path.append(con)
        ssa += 1
    return path
```

This is significantly faster than the numpy version throughout the range.
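As a sanity check, here is a small harness (my own, not from the thread) that compares the two pure-Python implementations on randomly generated pairwise contraction paths:

```python
import bisect
import random

def ssa_to_linear_index(ssa_path):
    # O(N^2) reference version using list.index (as posted earlier).
    N = sum(map(len, ssa_path)) - len(ssa_path) + 1
    ids, path, ssa = list(range(N)), [], N
    for scon in ssa_path:
        con = sorted(map(ids.index, scon))
        for j in reversed(con):
            ids.pop(j)
        ids.append(ssa)
        path.append(con)
        ssa += 1
    return path

def ssa_to_linear_bis(ssa_path, N=None):
    # bisect version: `ids` stays sorted because only removals happen and
    # the appended ssa id is strictly increasing.
    if N is None:
        N = sum(map(len, ssa_path)) - len(ssa_path) + 1
    ids, path, ssa = list(range(N)), [], N
    for scon in ssa_path:
        con = sorted(bisect.bisect_left(ids, s) for s in scon)
        for j in reversed(con):
            ids.pop(j)
        ids.append(ssa)
        path.append(con)
        ssa += 1
    return path

# Build random pairwise-contraction SSA paths and compare both outputs.
rng = random.Random(0)
for n in (2, 5, 20, 100):
    remaining, ssa_path, ssa = list(range(n)), [], n
    while len(remaining) > 1:
        a, b = rng.sample(remaining, 2)
        remaining.remove(a)
        remaining.remove(b)
        remaining.append(ssa)
        ssa_path.append((a, b))
        ssa += 1
    assert ssa_to_linear_bis(ssa_path) == ssa_to_linear_index(ssa_path)
print("implementations agree")
```

The key invariant that makes the bisect version valid is that `ids` is always sorted, so `bisect_left` on an id that is present returns exactly its index.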
