
Thoughts on making the numpy dependency optional? #203

Open
janeyx99 opened this issue Nov 1, 2022 · 5 comments

janeyx99 (Contributor) commented Nov 1, 2022

Proposal

Make the numpy dependency optional, if possible.

Why?

Minimizing dependencies is a general goal as it allows a bigger audience to reap the benefits of this library. More specifically, some of us are interested in making opt_einsum a hard dependency for torch, but we would like to keep numpy unrequired. (If you're curious why torch does not have a hard dependency on numpy, see pytorch/pytorch#60081, tl;dr being the last comment.)

A hard dependency would mean all torch users would get the benefits of opt_einsum right away without thinking too hard/needing to manually install opt_einsum themselves.

Alternatives

We could also have torch vendor in opt_einsum, but that is increased complexity/maintenance + we would like to automatically subscribe to improvements in opt_einsum!

dgasmith (Owner) commented Nov 2, 2022

I think this is entirely possible, as we have a fairly weak dependence on NumPy beyond testing. Feel free to take a crack at it; I can look into removing NumPy this weekend.

jcmgray (Collaborator) commented Nov 2, 2022

Yes, I agree this would be a nice thing to do. From what I can tell, the minor problem points where one can't just import/mock numpy lazily are:

  • ssa_to_linear - this function, which is used in the greedy path optimization and is not related to backends etc., is actually $\mathcal{O}(N^2)$ and uses numpy to speed it up. I could check whether a pure Python version can be made fast enough.
  • some type hints use np.ndarray - not sure what the solution is for using types from libraries that are not installed.
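For the second point, one common pattern (a sketch of one option, not necessarily what opt_einsum should adopt) is to import numpy only for static type checkers via `typing.TYPE_CHECKING`, using string annotations so nothing is evaluated at runtime. The function below is hypothetical, purely for illustration:

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by mypy/pyright, never executed at runtime,
    # so numpy remains an optional install.
    import numpy as np

def transpose_hint(x: "np.ndarray") -> "np.ndarray":
    # String annotations are not evaluated at runtime,
    # so this module imports fine without numpy installed.
    return x
```

This keeps the precise `np.ndarray` hints for users who do have numpy, at the cost of slightly noisier quoted annotations.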

jcmgray (Collaborator) commented Nov 2, 2022

Here's an alternative pure-Python implementation of `ssa_to_linear`:

```python
def ssa_to_linear_A(ssa_path):
    # Number of inputs: each contraction consumes len(scon) ids and makes one.
    N = sum(map(len, ssa_path)) - len(ssa_path) + 1
    ids = list(range(N))
    path = []
    ssa = N
    for scon in ssa_path:
        # Translate SSA ids into current linear positions (list.index is O(N)).
        con = sorted(map(ids.index, scon))
        for j in reversed(con):
            ids.pop(j)
        ids.append(ssa)
        path.append(con)
        ssa += 1
    return path
```

It's actually faster until one gets to quite large numbers of inputs (x-axis):

[benchmark plot: ssa_to_linear runtime vs. number of inputs]

At the largest size here the actual main greedy algorithm takes ~7 sec, so the extra slowdown is not ideal but still only a small part of the whole path finding. Maybe the implementation can be improved too.
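To make the SSA-to-linear convention concrete, here is a tiny worked example (the function is copied from the comment above; the input path is my own illustration). SSA ids never change, while linear indices re-number after every contraction:

```python
def ssa_to_linear_A(ssa_path):
    N = sum(map(len, ssa_path)) - len(ssa_path) + 1
    ids = list(range(N))
    path = []
    ssa = N
    for scon in ssa_path:
        con = sorted(map(ids.index, scon))
        for j in reversed(con):
            ids.pop(j)
        ids.append(ssa)
        path.append(con)
        ssa += 1
    return path

# Three inputs 0, 1, 2: first contract tensors 0 and 1 (producing ssa id 3),
# then contract tensor 2 with that intermediate (ssa ids 2 and 3).
print(ssa_to_linear_A([(0, 1), (2, 3)]))  # → [[0, 1], [0, 1]]
```

The second linear step is `[0, 1]` again because, after the first contraction, the remaining tensors are re-numbered as `[tensor 2, intermediate]`.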

dgasmith (Owner) commented Nov 2, 2022

Nice check!

n is unlikely to go over 1000 except in extreme cases, so we could also keep two algorithms and switch between them:

```python
if n > 1000 and has_numpy:
    return _numpy_impl(*args)
else:
    return _python_impl(*args)
```

The library isn't strongly type hinted yet (still plenty of `Any` running around). I would vote we replace `np.ndarray` with some library-agnostic `Protocol` type.

jcmgray (Collaborator) commented Sep 2, 2023

I actually realized you can implement `ssa_to_linear` in $\mathcal{O}(n \log n)$ time:

```python
import bisect

def ssa_to_linear_bis(ssa_path, N=None):
    if N is None:
        N = sum(map(len, ssa_path)) - len(ssa_path) + 1
    # `ids` stays sorted: entries are only ever removed, and the appended
    # ssa id is strictly increasing, so binary search can replace the
    # O(N) list.index lookup.
    ids = list(range(N))
    path = []
    ssa = N
    for scon in ssa_path:
        con = sorted([bisect.bisect_left(ids, s) for s in scon])
        for j in reversed(con):
            ids.pop(j)
        ids.append(ssa)
        path.append(con)
        ssa += 1
    return path
```

This is significantly faster than the numpy version throughout the range.
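As a sanity check, here is a small harness (my own, not from the thread) that compares the two pure-Python implementations on randomly generated pairwise contraction paths:

```python
import bisect
import random

def ssa_to_linear_index(ssa_path):
    # O(N^2) reference version using list.index (as posted earlier).
    N = sum(map(len, ssa_path)) - len(ssa_path) + 1
    ids, path, ssa = list(range(N)), [], N
    for scon in ssa_path:
        con = sorted(map(ids.index, scon))
        for j in reversed(con):
            ids.pop(j)
        ids.append(ssa)
        path.append(con)
        ssa += 1
    return path

def ssa_to_linear_bis(ssa_path, N=None):
    # bisect version: `ids` stays sorted because only removals happen and
    # the appended ssa id is strictly increasing.
    if N is None:
        N = sum(map(len, ssa_path)) - len(ssa_path) + 1
    ids, path, ssa = list(range(N)), [], N
    for scon in ssa_path:
        con = sorted(bisect.bisect_left(ids, s) for s in scon)
        for j in reversed(con):
            ids.pop(j)
        ids.append(ssa)
        path.append(con)
        ssa += 1
    return path

# Build random pairwise-contraction SSA paths and compare both outputs.
rng = random.Random(0)
for n in (2, 5, 20, 100):
    remaining, ssa_path, ssa = list(range(n)), [], n
    while len(remaining) > 1:
        a, b = rng.sample(remaining, 2)
        remaining.remove(a)
        remaining.remove(b)
        remaining.append(ssa)
        ssa_path.append((a, b))
        ssa += 1
    assert ssa_to_linear_bis(ssa_path) == ssa_to_linear_index(ssa_path)
print("implementations agree")
```

The key invariant that makes the bisect version valid is that `ids` is always sorted, so `bisect_left` on an id that is present returns exactly its index.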
