
Feature Request: Add "axes" option in tensor contract #74

Open
Yue-Zhengyuan opened this issue Oct 28, 2019 · 12 comments

@Yue-Zhengyuan

Yue-Zhengyuan commented Oct 28, 2019

Update: TensorOperations does not have a big advantage over NumPy for large tensors.

In NumPy, we have a handy feature in np.tensordot, as illustrated below:

import numpy as np

ts1 = np.random.rand(2,3,4,5)
ts2 = np.random.rand(5,4,3,2)

# contract the last index of ts1 and first index of ts2
ts_ctrct = np.tensordot(ts1, ts2, axes=((3), (0)))

print(np.shape(ts_ctrct))

The axes option tells the program directly which indices should be contracted. However, the equivalent function tensorcontract is cumbersome to use when the number of indices is large. For example,

using TensorOperations
using Random

ts1 = rand(2,3,4,5)
ts2 = rand(5,4,3,2)

@tensor ts_ctrct = tensorcontract(ts1, (1,2,3,4), ts2, (4,5,6,7))

println(size(ts_ctrct))

I have to write out (1,2,3,4,...) by hand. Thus I would like to suggest adopting the axes option in place of the index labels currently used throughout TensorOperations.

Yue-Zhengyuan changed the title from 'Feature Request: Add "axes" option in tensor product' to 'Feature Request: Add "axes" option in tensor contract' on Oct 28, 2019
@Jutho
Owner

Jutho commented Oct 28, 2019

Two remarks:

1. If you use the methods in TensorOperations.jl, you shouldn't be using the @tensor macro, so use either

   ts_ctrct = tensorcontract(ts1, (1,2,3,4), ts2, (4,5,6,7))

   or

   @tensor ts_ctrct[1,2,3,5,6,7] := ts1[1,2,3,4]*ts2[4,5,6,7]

2. Could you provide some timings? Are you sure you are not measuring compilation time? Also, keep in mind that I designed this package with large tensors in mind in the first place.

@Jutho
Owner

Jutho commented Oct 28, 2019

I did some tests myself:

Python 3.7.4 (default, Aug 13 2019, 15:17:50) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.8.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import numpy as np                                                                                    

In [2]: ts1 = np.random.rand(2,3,4,5)                                                                         

In [3]: ts2 = np.random.rand(5,4,3,2)                                                                         

In [4]: %timeit ts_ctrct = np.tensordot(ts1, ts2, axes=((3), (0)))                                            
17.1 µs ± 107 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

versus

               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.2.0 (2019-08-20)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> using TensorOperations

julia> using BenchmarkTools

julia> ts1 = rand(2,3,4,5);

julia> ts2 = rand(5,4,3,2);

julia> @btime tensorcontract($ts1, (1,2,3,4), $ts2, (4,5,6,7));
  2.387 μs (25 allocations: 6.28 KiB)

Could you provide more details regarding your conclusion that TensorOperations is way slower?

@Yue-Zhengyuan
Author

Sorry for the late reply; I indeed forgot about the compilation time... I also benchmarked on my computer using your code and got similar results.

But I still want to ask whether there is a neat way of using tensorcontract when the tensors to be contracted have, say, 10 subscripts each; I did not find an explicit example of this. How can I avoid writing tensorcontract(ts1, (1,2,3,4,5,6,7,8,9,10), ts2, (10,11,12,13,14,15,16,17,18,19))? In NumPy this can be done with np.tensordot(ts1, ts2, axes=1). Thank you.
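A minimal sketch of one workaround, assuming the integer-label form of tensorcontract shown above: the label tuples can be generated programmatically instead of typed out.

using TensorOperations

ts1 = rand(2, 2, 2, 2, 2, 2, 2, 2, 2, 2);
ts2 = rand(2, 2, 2, 2, 2, 2, 2, 2, 2, 2);

# build the label tuples (1,2,...,10) and (10,11,...,19) programmatically;
# the shared label 10 marks the contracted pair of indices
IA = Tuple(1:10)
IB = Tuple(10:19)

ts_ctrct = tensorcontract(ts1, IA, ts2, IB);
ndims(ts_ctrct)   # 18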

@Jutho
Owner

Jutho commented Oct 29, 2019

I could add a tensordot function which behaves somewhat like the Python version.
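Such a wrapper could plausibly be a thin layer over tensorcontract. A minimal sketch, assuming the integer-label method form used earlier in this thread (tensordot_sketch is a hypothetical name, not an existing TensorOperations function):

using TensorOperations

# contract the last n axes of A with the first n axes of B,
# mimicking np.tensordot(A, B, axes=n)
function tensordot_sketch(A, B, n::Int)
    NA, NB = ndims(A), ndims(B)
    IA = Tuple(1:NA)                  # labels 1..NA for A
    IB = Tuple((NA - n) .+ (1:NB))    # B's first n labels coincide with A's last n
    return tensorcontract(A, IA, B, IB)
end

ts1 = rand(2, 3, 4, 5);
ts2 = rand(5, 4, 3, 2);
size(tensordot_sketch(ts1, ts2, 1))   # (2, 3, 4, 4, 3, 2)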

@Yue-Zhengyuan
Author

In addition, I think the tensortrace function could also be improved. In NumPy we have np.trace(tensor, axis1=m, axis2=n), which traces over the mth and nth indices.
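For reference, the label-based tensortrace can already express this by repeating a label, though less directly than NumPy's keyword form. A small sketch, assuming the tuple-label methods used elsewhere in this thread:

using TensorOperations

A = rand(3, 4, 3);

# trace over the 1st and 3rd indices by giving them the same label
t = tensortrace(A, (1, 2, 1));
size(t)   # (4,)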

@Yue-Zhengyuan
Author

I did some tests with bigger tensors (RAM: 16 GB):

Julia 1.0.5:

julia> using BenchmarkTools

julia> using Random

julia> using TensorOperations

julia> ts1 = rand(16,16,16,16);

julia> ts2 = rand(16,16,16,16);

julia> @btime tensorcontract($ts1, (1,2,3,4), $ts2, (4,5,6,7));
  22.499 ms (26 allocations: 128.00 MiB)

Python 3.7.3 with NumPy 1.17.2

Python 3.7.3 (default, Mar 27 2019, 16:54:48) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.4.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import numpy as np                                      

In [2]: np.version.version                                      
Out[2]: '1.17.2'

In [3]: ts1 = np.random.rand(16,16,16,16)                       

In [4]: ts2 = np.random.rand(16,16,16,16)                       

In [5]: %timeit np.tensordot(ts1, ts2, axes=1)                  
30.6 ms ± 1.81 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

So tensorcontract is only a little faster. Is this expected?

My NumPy configuration:

blas_mkl_info:
  NOT AVAILABLE
blis_info:
  NOT AVAILABLE
openblas_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/Users/zhengyuanyue/anaconda3/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
blas_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/Users/zhengyuanyue/anaconda3/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
lapack_mkl_info:
  NOT AVAILABLE
openblas_lapack_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/Users/zhengyuanyue/anaconda3/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
lapack_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/Users/zhengyuanyue/anaconda3/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]

@Jutho
Owner

Jutho commented Oct 30, 2019

Yes; contracting, say, the last axis of tensor 1 with the first axis of tensor 2 is exactly one of the tensor contractions that can be mapped directly to a matrix multiplication without any additional permutation or reshuffling of the data in memory. So essentially all the runtime is in the matrix multiplication; I would actually expect the timings to be even more similar.

Anyway, for large tensors most of the runtime should be in the matrix multiplication even when reshuffling is involved, so the timings should never differ by huge amounts.
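A minimal sketch illustrating that equivalence, relying on Julia's column-major layout: contracting the last axis of ts1 with the first axis of ts2 is a single matrix multiplication after two reshapes.

using TensorOperations

ts1 = rand(16, 16, 16, 16);
ts2 = rand(16, 16, 16, 16);

# fold the first three axes of ts1 into rows and the last three axes of ts2
# into columns, multiply, then unfold; no data permutation is needed
M = reshape(ts1, :, 16) * reshape(ts2, 16, :);
C = reshape(M, 16, 16, 16, 16, 16, 16);

C ≈ tensorcontract(ts1, (1,2,3,4), ts2, (4,5,6,7))   # true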

@tomohiro-soejima

Hi,

Just chiming in to say that I would also love to see an np.tensordot-like function implemented in TensorOperations. In some applications (e.g. iterative contraction, where the number of legs increases after each iteration), I find the NumPy syntax a lot easier to write and read than tensorcontract.

I'd also be happy to create a pull request for tensordot. I expect this to be little more than a wrapper for tensorcontract.

@Jutho
Owner

Jutho commented Apr 17, 2020

In principle I am not opposed to other methods; I hardly ever use the method syntax myself.

However, they do have to use Julia terminology, and not just be plain copies of NumPy functions. So I guess dims instead of axes would be the preferred keyword. In that respect, I am also not completely sold on the name tensordot. In NumPy, if I am correct, even basic matrix multiplication requires dot, whereas in Julia dot always means the inner product. I am not sure there is a better name, though; tensormul, with an in-place version tensormul!, would somewhat complement the in-place matrix multiplication method mul!.

@MacKenzieHnC

Is this related to why this fails?

using TensorOperations

# naive inner product for nd tensors
A = rand(1,2,3,4,5)
B = rand(1,2,3,4,5)

@tensor C[a,b,c,d] := A[a,b,c,d,e] * conj(B[a,b,c,d,e])

Or do I just have a logical error?

It represents the self-attention score calculation in a transformer, in case that helps anyone understand.

@mcabbott

mcabbott commented Oct 6, 2020

@tensor C[a,b,c,d] := A[a,b,c,d,e] * conj(B[a,b,c,d,e])

I believe the rule for this package is that every index must appear exactly twice: either twice on the right, or once on the left and once on the right (when there is a single term). It doesn't handle things like batched matrix multiplication, nor this, which I suppose you could call a batched dot product.
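A minimal illustration of that rule: in the contraction below, b appears twice on the right and is summed over, while a and c each appear once on the left and once on the right and remain open.

using TensorOperations

A = rand(2, 3);
B = rand(3, 4);

# valid: every index appears exactly twice
@tensor C[a, c] := A[a, b] * B[b, c];
size(C)   # (2, 4)

# the example above fails because a, b, c, d each appear three times
# (once on the left and twice on the right)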

@MacKenzieHnC

Okay, yeah, I was speaking gibberish, haha. I think I get it now.

Thank you ☺️
