Literals get contracted as if they were symbols #113

Open
antoine-levitt opened this issue Dec 7, 2021 · 2 comments

@antoine-levitt

julia> a = ones(2,2)
2×2 Matrix{Float64}:
 1.0  1.0
 1.0  1.0

julia> @tensor A[i,j] := a[i,1]*a[j,1]
2×2 Matrix{Float64}:
 2.0  2.0
 2.0  2.0

This is hugely surprising to me. I guess that if constants are not supported (#111), throwing an error would be better than contracting here.
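
For reference, the output above is consistent with the literal 1 being read as a repeated index label that is summed over, rather than as a column selector. The following is a minimal sketch in plain Julia (it does not call TensorOperations; the names `contracted` and `sliced` are illustrative only) that compares the two readings:

```julia
using LinearAlgebra

a = ones(2, 2)

# Reading 1: the literal `1` is a label shared by both factors, so it is summed over:
# A[i,j] = sum_k a[i,k] * a[j,k], which is the matrix product a * a'.
contracted = a * a'

# Reading 2: the literal `1` selects the first column, giving an outer product:
# A[i,j] = a[i,1] * a[j,1].
sliced = a[:, 1] * a[:, 1]'

@show contracted  # every entry is 2.0, matching the @tensor output above
@show sliced      # every entry is 1.0, the column-selection reading
```

With `ones(2,2)` the two readings differ only by a factor of two, so the discrepancy is easy to overlook on less uniform data.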

@Jutho
Owner

Jutho commented Dec 8, 2021

This used to be in the package README, but I guess it moved to the docs when I created more extensive documentation (which has since grown a bit out of date):
https://jutho.github.io/TensorOperations.jl/stable/indexnotation/

The point is that in my field, there is a very convenient notation for contracting several tensors: integer labels denote which indices need to be contracted, and the value of each integer indicates the order in which the pairwise contractions will happen.

For example, if you want to multiply two matrices and a vector, you could write this as
y[a] = A[a,b]*B[b,c]*x[c]
but this might end up first multiplying A with B, and only then the result of that with x. Instead, with integer literals, we could specify the same contraction as
y[-1] = A[-1,2]*B[2,1]*x[1]
This might seem trivial for this example, which could easily be fixed with parentheses as y[a] = A[a,b]*(B[b,c]*x[c]). However, once several tensors are involved, the convention is extremely useful. Note that we also have the convention of using negative integers to indicate output indices, where the output indices are sorted in decreasing value if not specified. That is, a contraction such as
S[-1,-2] = R[3,1]*A[1,2,-2]*O[4,2]*A[3,4,-1]
could be rewritten as
S[:] = R[3,1]*A[1,2,-2]*O[4,2]*A[3,4,-1]
This convention is known as NCON (for network contractor), see e.g. https://arxiv.org/abs/1402.0939 or https://github.com/mhauru/ncon.
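
As a rough illustration of the ordering rule described above, the matrix-matrix-vector example y[-1] = A[-1,2]*B[2,1]*x[1] can be spelled out in plain Julia. This is only a sketch under assumed small inputs (the values of `A`, `B` and `x` and the names `Bx`, `y_ncon`, `y_other` are made up for the example); it does not show how @tensor implements the contraction internally:

```julia
using LinearAlgebra

A = [1.0 2.0; 3.0 4.0]
B = [0.0 1.0; 1.0 0.0]
x = [1.0, 2.0]

# y[-1] = A[-1,2]*B[2,1]*x[1]
# Label 1 is contracted first: B with x, a matrix-vector product.
Bx = B * x
# Label 2 is contracted next: A with the intermediate vector.
y_ncon = A * Bx

# Contracting A with B first gives the same values, but the intermediate
# is a full matrix-matrix product, which costs more for large sizes.
y_other = (A * B) * x

@show y_ncon
@show y_ncon ≈ y_other
```

Both orders give the same vector; the integer labels only fix which pairwise contraction happens first, and the negative label -1 marks the single open (output) index.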

Note that you can still obtain your intended behaviour as

@tensor A[i,j] := a[:,1][i]*a[:,1][j]

or

@tensor A[i,j] := view(a,:,1)[i]*view(a,:,1)[j]

I agree that this is less pleasing for your use case, but no longer adhering to the NCON convention when it is being used is not really an option. What could be considered is a style where @tensor follows the NCON convention if all index labels are integers, and does not otherwise. However, this would require some work and a transition period during which users are warned that the behaviour will change, as it would be a strongly breaking change.

@antoine-levitt
Author

I see. Yes, it would be nice to mention somewhere in the README that it follows the NCON convention rather than the "usual" Einstein summation (e.g. the one in numpy). In particular, I came to this package from the EinSum.jl package, expecting more or less a BLAS-dispatching version of EinSum.
