Translations and ideas for extensions #5

Open
RikVoorhaar opened this issue Feb 16, 2021 · 4 comments
@RikVoorhaar (Contributor)
In addition to the take translation I added in my previous PR, there are some more translations that might be good to add. At least, I am using these myself; I can make a PR.

  • split. The syntax differs between numpy and tensorflow/torch: the former wants the number of splits or an array of split locations (indices), whereas tensorflow/torch want either the number of splits or an array of split sizes. We can go from one format to the other using np.diff (see the sketch after this list).
  • diff. This is implemented in tensorflow as tf.experimental.numpy.diff, and not implemented at all for torch. This also means I don't know what the cleanest way is to implement split mentioned above. Maybe just use np.diff and then convert to an array of the right backend if necessary?
  • linalg.norm seems to work with tensorflow, but for torch we need to do _SUBMODULE_ALIASES["torch", "linalg.norm"] = "torch"
    I didn't check these things for any other libraries.
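For the split point, a minimal sketch of the conversion (the helper name is hypothetical, not part of autoray): np.diff turns numpy-style split indices into the section sizes that torch.split / tf.split expect.

```python
import numpy as np

def indices_to_sizes(indices, total_size):
    # hypothetical helper: numpy-style split indices -> torch/tf-style section sizes
    # e.g. indices [2, 5] on an axis of length 8 -> sizes [2, 3, 3]
    return np.diff(np.concatenate(([0], indices, [total_size]))).tolist()

sizes = indices_to_sizes([2, 5], 8)  # [2, 3, 3]
# np.split(x, [2, 5], axis=0) then corresponds to torch.split(x, sizes, dim=0)
# or tf.split(x, sizes, axis=0)
```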

Maybe a bit of an overly ambitious idea, but have you ever thought about baking in support for JIT?
  • TensorFlow: right now it seems that everything works with eager execution, and I'm not sure you can compile the computation graphs resulting from a series of ar.do calls.
  • PyTorch also supports JIT to some extent with TorchScript.
  • NumPy doesn't have JIT, but there is Numba.
  • CuPy has an interface with Numba that does seem to allow JIT.
  • JAX has support for JIT.

Another thing is gradients. Several of these libraries support automatic differentiation, and having an autoray interface for computing with automatic gradients would be fantastic as well (although probably also ambitious).

If you think these things are doable at all, I wouldn't mind spending some time to try to figure out how this could work.


Less ambitiously, you did mention in #3 that something along the lines of

```python
with set_backend(like):
    ...
```
would be pretty nice. I can try to do this. This probably comes down to checking for a global flag in ar.do after the line

```python
if like is None:
```
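A minimal sketch of how such a context manager could look, assuming a hypothetical module-level default that ar.do would consult when like is None (not autoray's actual API):

```python
import contextlib

_DEFAULT_BACKEND = None  # hypothetical module-level default

@contextlib.contextmanager
def set_backend(backend):
    # temporarily set a global default backend for do() calls without like=...
    global _DEFAULT_BACKEND
    old, _DEFAULT_BACKEND = _DEFAULT_BACKEND, backend
    try:
        yield
    finally:
        _DEFAULT_BACKEND = old

# inside ar.do, roughly:
#     if like is None:
#         backend = _DEFAULT_BACKEND or infer_backend_from_args(...)
```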
@jcmgray (Owner) commented Feb 16, 2021

Regarding split, I think torch has tensor_split which is the equivalent? Since the user interface is unified around the numpy version, feel free to contribute any versions, even if limited to just one other backend and specific syntaxes - we can incrementally improve from there.


autoray works with all the jit libraries that work via tracing (jax.jit, torch.jit.trace, tensorflow.function) - indeed this is one of the reasons for it to exist and how I use it. Similarly for autodiff - I use autoray with all the above and also e.g. autograd.
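For example, a minimal sketch of that tracing pattern with jax (assuming jax and autoray are installed; the function itself is just illustrative):

```python
import jax
import jax.numpy as jnp
from autoray import do

def fn(x):
    # backend-agnostic: dispatches based on the type of x
    return do("sum", do("tanh", x) ** 2)

jitted = jax.jit(fn)
print(jitted(jnp.ones((3, 3))))  # compiled via tracing, like any jax function
```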

Do you mean 'baked in' like a unified autoray.jit(backend='torch')(fn) / autoray.gradient(fn(arrays), arrays) like interface? This is to some extent what quimb.tensor.TNOptimizer does.
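Purely as an illustration of what such a unified wrapper might dispatch to under the hood (hypothetical names, not an existing autoray API):

```python
def jit(backend):
    # hypothetical unified wrapper dispatching to each library's own JIT entry point
    def wrapper(fn):
        if backend == "jax":
            import jax
            return jax.jit(fn)
        if backend == "tensorflow":
            import tensorflow as tf
            return tf.function(fn)
        if backend == "torch":
            import torch
            cache = {}
            def traced(*args):
                if "fn" not in cache:  # trace lazily with the first example inputs
                    cache["fn"] = torch.jit.trace(fn, args)
                return cache["fn"](*args)
            return traced
        return fn  # e.g. plain numpy: nothing to compile
    return wrapper
```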

I do have a private, experimental 'lazy' autoray module which is like a very minimal, lightweight computational graph - you can perform an entire autoray computation symbolically, do some basic simplifications/reuse, then optionally compute later.

The motivation is partly just that libraries like dask are very slow for large (100000+) computational graphs, but also maybe down the line to think about automatically chunking jit, checkpointing autodiff etc.


with_backend I am slightly in two minds about. One negative is that it's just another way to do something one can already do - i.e. it doesn't really solve any problems I have come across so far.

The other is that it adds more overhead to do, which I feel is crucial to keep low since it is often called millions of times. Though that might be cleverly avoided somehow.

However if it had some crucial use case and little to no overhead I could be tempted.

@RikVoorhaar (Contributor, Author)

> Regarding split, I think torch has tensor_split which is the equivalent? Since the user interface is unified around the numpy version, feel free to contribute any versions, even if limited to just one other backend and specific syntaxes - we can incrementally improve from there.

Right, tensor_split is indeed much more like numpy's split. I'll make a PR.

> autoray works with all the jit libraries that work via tracing (jax.jit, torch.jit.trace, tensorflow.function) - indeed this is one of the reasons for it to exist and how I use it. Similarly for autodiff - I use autoray with all the above and also e.g. autograd.
>
> Do you mean 'baked in' like a unified autoray.jit(backend='torch')(fn) / autoray.gradient(fn(arrays), arrays) like interface? This is to some extent what quimb.tensor.TNOptimizer does.
>
> I do have a private, experimental 'lazy' autoray module which is like a very minimal, lightweight computational graph - you can perform an entire autoray computation symbolically, do some basic simplifications/reuse, then optionally compute later.
>
> The motivation is partly just that libraries like dask are very slow for large (100000+) computational graphs, but also maybe down the line to think about automatically chunking jit, checkpointing autodiff etc.

I wasn't aware JIT worked so well with autoray. Maybe it's actually a good idea to document this then? Same for autograd.
I was indeed thinking of something like a function decorator @autoray_jit(backend). A unified interface like that would make benchmarking between libraries much easier. Support for Numba in particular would also be nice.

> with_backend I am slightly in two minds about. One negative is that it's just another way to do something one can already do - i.e. it doesn't really solve any problems I have come across so far.
>
> The other is that it adds more overhead to do, which I feel is crucial to keep low since it is often called millions of times. Though that might be cleverly avoided somehow.
>
> However if it had some crucial use case and little to no overhead I could be tempted.

I do think the overhead would be minimal, but I can't think of a particularly convincing use case. If you're creating a lot of arrays it's annoying to always supply like=backend as an argument, and with_backend would make for cleaner code. I thought of it mostly as a nice-to-have feature.

@RikVoorhaar (Contributor, Author)

Another thing I just noticed is that when calling ar.do("where", X == 0), for numpy and torch we get a tuple of index arrays, one for each dimension of the array, whereas for tensorflow we get an (N x d) array, with d the number of dimensions.
This can lead to cryptic error messages, so maybe it's a good idea to translate the output of this function to conform to numpy's form? Even transposing the resulting array would essentially fix it.
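A small sketch of the difference and the transpose-based fix (the adapter name is hypothetical):

```python
import numpy as np
import tensorflow as tf

X = np.array([[0, 1], [2, 0]])

np.where(X == 0)               # numpy/torch style: (array([0, 1]), array([0, 1]))
tf.where(tf.constant(X) == 0)  # tensorflow style: [[0, 0], [1, 1]], shape (N, d)

def where_like_numpy(indices_2d):
    # hypothetical adapter: (N, d) coordinate array -> tuple of d per-dimension index arrays
    return tuple(np.asarray(indices_2d).T)

where_like_numpy(tf.where(tf.constant(X) == 0))  # (array([0, 1]), array([0, 1]))
```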

@jcmgray (Owner) commented Feb 17, 2021

> I wasn't aware JIT worked so well with autoray. Maybe it's actually a good idea to document this then? Same for autograd.
> I was indeed thinking of something like a function decorator @autoray_jit(backend). A unified interface like that would make benchmarking between libraries much easier. Support for Numba in particular would also be nice.

Yeah, I'll definitely have a think about this. One might even just want @autoray.jit with backend inference. Numba however is much trickier, as it doesn't trace the function and then compile - it inspects the actual code.
