
Tensor contraction (einsum, permutation, ...)

Roman edited this page Aug 17, 2020 · 6 revisions

Fastor implements most tensor algebra algorithms using the Einstein index notation. The indices of a tensor contraction/permutation must be compile-time constants, as Fastor performs compile-time operation minimisation to restructure the tensor network expression into the most efficient evaluation order. Most linear algebra functions, such as matrix-matrix multiplication, trace, determinant and adjugate of a matrix, can be computed through the Einstein summation. The most general tensor contraction interface exposed by Fastor is the einsum function. If you are familiar with other implementations of einsum, such as NumPy's, a (NumPy to Fastor) cheatsheet is provided at the bottom of this page. A non-exhaustive list of what can be done with einsum follows:

Tensor contraction for vectors and matrices [einsum]

To follow the mathematical syntax of Einstein summation we will first define a couple of enums

enum {i,j,k,l,m,n};
enum {I,J,K,L,M,N};

Note that this is not mandatory; you can simply use constant integers instead of an enum.

Finding the trace of a matrix (2D tensor)

Tensor<double,2,2> A  = {{1,2},{3,4}};
Tensor<double> traceA = einsum<Index<i,i>>(A);

// or simply using constant integers
Tensor<double> traceA = einsum<Index<0,0>>(A);

// or creating an alias for shorter syntax
using i_i = Index<i,i>;
Tensor<double> traceA = einsum<i_i>(A);

In Einstein notation any index that appears twice gets summed over and the corresponding dimension disappears (is sum-reduced). In the above example in Index<i,i> the index i appears twice and hence gets sum-reduced.

Also note that the indices need not be sequentially ascending or descending; you can simply do

Tensor<double> traceA = einsum<Index<j,j>>(A);
Tensor<double> traceA = einsum<Index<k,k>>(A);

// or simply using constant integers
Tensor<double> traceA = einsum<Index<1,1>>(A);
Tensor<double> traceA = einsum<Index<2,2>>(A);

These are all functionally identical.

Matrix-vector multiplication

Tensor<double,4,10> A; Tensor<double,10> b;
Tensor<double,4> c = einsum<Index<i,j>,Index<j>>(A,b);

// or simply using constant integers
Tensor<double,4> c = einsum<Index<0,1>,Index<1>>(A,b);

// or creating an alias for shorter syntax
using ij = Index<i,j>;
using j_ = Index<j>;   // the alias cannot reuse the enumerator name j
Tensor<double,4> c = einsum<ij,j_>(A,b);

This will run as efficiently as c = matmul(A,b) or c = A % b both of which do matrix-vector multiplication.

Vector-matrix multiplication

Tensor<double,4,10> A; Tensor<double,4> b;
Tensor<double,10> c = einsum<Index<i>,Index<i,j>>(b,A);

// or simply using constant integers
Tensor<double,10> c = einsum<Index<0>,Index<0,1>>(b,A);

// or creating an alias for shorter syntax
using i_ = Index<i>;   // the alias cannot reuse the enumerator name i
using ij = Index<i,j>;
Tensor<double,10> c = einsum<i_,ij>(b,A);

This will run as efficiently as c = matmul(b,A) or c = b % A both of which do vector-matrix multiplication.

Matrix-matrix multiplication

Tensor<double,4,10> A; Tensor<double,10,3> B;
Tensor<double,4,3> C = einsum<Index<i,j>,Index<j,k>>(A,B);

// or simply using constant integers
Tensor<double,4,3> C = einsum<Index<0,1>,Index<1,2>>(A,B);

// or creating an alias for shorter syntax
using ij = Index<i,j>;
using jk = Index<j,k>;
Tensor<double,4,3> C = einsum<ij,jk>(A,B);

This will run as efficiently as C = matmul(A,B) or C = A % B both of which do matrix-matrix multiplication.

Tensor contraction for high order tensors [einsum]

Let us now have a look at how to contract two arbitrary-order tensors using the Einstein summation feature einsum

Tensor<double,2,3,5> A; Tensor<double,3,5,2,4> B;
// fill A and B
A.random(); B.random();
auto C = einsum<Index<I,J,K>,Index<J,L,M,N>>(A,B);

// or simply using constant integer numbers instead of an enum
auto C = einsum<Index<0,1,2>,Index<1,3,4,5>>(A,B);

Indices that appear twice within the Index get summed over. In the above example using enums, the index J is summed over/contracted while the other indices are free. This results in a tensor of type Tensor<double,2,5,5,2,4>. The resulting tensor is always ordered based on the relative position of the free indices (this is neither alphabetical nor ascending); that is, in the above case the resulting tensor is ordered based on the free indices I,K,L,M,N, since that is how they appear in the einsum call. If you want to order the output tensor differently, you can use OIndex to indicate the shape of the output tensor

Tensor<double,2,3,5> A; Tensor<double,3,5,2,4> B;
// fill A and B
A.random(); B.random();
// Order as I,K,L,M,N - same as default
auto C = einsum<Index<I,J,K>,Index<J,L,M,N>,OIndex<I,K,L,M,N>>(A,B);    

// Order as I,L,K,N,M - resulting tensor [Tensor<double,2,5,5,4,2>]
auto C = einsum<Index<I,J,K>,Index<J,L,M,N>,OIndex<I,L,K,N,M>>(A,B);    

Tensor-network contraction [einsum]

Fastor implements a graph-search optimisation scheme to find the best contraction sequence in a tensor network. So if you have multiple tensors to contract, instead of contracting them pair by pair it is always advisable to contract them in one go

Tensor<double,2,3> A;
Tensor<double,3,4,5> B;
Tensor<double,4,7> C;
auto D = einsum<Index<I,J>,Index<J,K,L>,Index<K,M>>(A,B,C);

The type and dimensions of the output tensor are automatically deduced and need not be specified. Again, the type of the output tensor D in this case is Tensor<double,2,5,7>, deduced from the free indices I,L,M. Say you want to re-order the output tensor to shape 7,2,5 instead; you can do

auto D = einsum<Index<I,J>,Index<J,K,L>,Index<K,M>,OIndex<M,I,L>>(A,B,C);

If your tensor network also comprises sums over singletons, for instance e(i) = a(i) + B(i,j)*C(k,j)*d(k), you can simply write them like so

// Establish a set of indices for convenience
using ij = Index<I,J>;
using kj = Index<K,J>;
using k_ = Index<K>;   // the alias cannot reuse the enumerator name k

// Perform tensor contraction over the network
auto e = a + einsum<ij,kj,k_>(B,C,d);

Here einsum<ij,kj,k_>(B,C,d) is evaluated first, and the result is then added to a and assigned to e.

Tensor contraction - many-index-contraction [inner]

By convention, Einstein summation does not allow summing over indices that appear more than twice. In such cases you can use one of Fastor's alternative functions, such as the inner function. An example of summing over three repetitions of an index is the following

Tensor<double,5,5,5> D; D.random();
auto E = inner(D);

which computes D_iii, in essence the trace of a third-order tensor. You can also apply the inner function to pairs of tensors, such as inner(A,B), to compute the inner product of two tensors; see linear algebra functions.

Tensor permutation - [permute]

An example of tensor permutation, which permutes the values and reshapes the tensor into a tensor of a different shape

Tensor<float,3,4,5,2> F; F.random();
Tensor<float,4,5,3,2> G = permute<Index<J,K,I,L>>(F);

// or simply using constant integer numbers instead of an enum
Tensor<float,4,5,3,2> G = permute<Index<1,2,0,3>>(F);

Tensor permutation - [permutation]

This works just like the permute function but permutes the values slightly differently. It is kept for historical reasons, as other projects using Fastor rely on this feature (prefer permute instead)

Tensor<float,3,4,5,2> F; F.random();
Tensor<float,4,5,3,2> G = permutation<Index<J,K,I,L>>(F);

// or simply using constant integer numbers instead of an enum
Tensor<float,4,5,3,2> G = permutation<Index<1,2,0,3>>(F);

The main difference between permutation and permute can be explained through the following example

Tensor<double,2,2,2,2> a;
auto b = permutation<Index<i,l,j,k>>(a);    // b(i,j,k,l) == a(i,l,j,k);
auto c = permute<Index<i,l,j,k>>(a);        // c(i,l,j,k) == a(i,j,k,l);

So, loosely speaking, permutation is the inverse of permute.

Ordering of tensor contractions

By default, Fastor deduces the type of the output tensor automatically from the indices of a tensor contraction. For einsum with one or two tensors, the resulting output tensor is always ordered using the free (non-contracted) indices in the order they appear in the contraction. For instance, in the following tensor contraction

Tensor<double,2,3,5> A; Tensor<double,3,5,2,4> B;
auto C = einsum<Index<I,J,K>,Index<J,K,M,N>>(A,B);

The output tensor is Tensor<double,2,2,4>, deduced from the free indices I,M,N and ordered in that fashion. Note that this ordering is neither ascending nor alphabetical: if you rewrite the same contraction pattern as follows

Tensor<double,2,3,5> A; Tensor<double,3,5,2,4> B;
auto C = einsum<Index<N,J,K>,Index<J,K,M,I>>(A,B);

The output tensor is still going to be Tensor<double,2,2,4> deduced from the free indices N,M,I and sorted in that fashion.

If you need to re-order the output to a tensor of a different shape, you can either use an OIndex sequence or call the permute function explicitly. For instance, if you want the tensor to be ordered N,I,M rather than I,M,N you can simply do

Tensor<double,2,3,5> A; Tensor<double,3,5,2,4> B;
auto D = einsum<Index<I,J,K>,Index<J,K,M,N>,OIndex<N,I,M>>(A,B);

or explicitly call permute

Tensor<double,2,3,5> A; Tensor<double,3,5,2,4> B;
auto C = einsum<Index<I,J,K>,Index<J,K,M,N>>(A,B);
auto D = permute<Index<N,I,M>>(C);

The output tensor D is now Tensor<double,4,2,2>.

Note that Fastor's einsum does not support disabling contraction over an index, which you may find in other implementations of einsum, for instance in NumPy. In such cases you can use the sum and diag features while specifying two different indices to avoid summing over a repeated index. For example, to broadcast an inner contraction over a dimension, you can do

Tensor<double,4,4> A, B;
diag(einsum<Index<i,j>,Index<i,k>>(A,B));           // similar to NumPy's einsum("ij,ij->j",A,B);

Given that Fastor determines the output tensor implicitly, for tensor networks (more than two tensors in an einsum call) it might not always be apparent what the output tensor is, since in these cases Fastor re-orders the indices to determine the best contraction pattern. In other words, with more than two tensors in an einsum call the output tensor may not always be ordered in any particular way. In such cases you can query the resulting pattern. For example, say we have a contraction between 5 tensors

Tensor<double,4,4> fCC;
Tensor<double,4,4,4,4> fII;
auto fD = einsum<Index<P,I>,Index<Q,J>,Index<I,J,K,L>,Index<R,K>,Index<S,L>>(fCC,fCC,fII,fCC,fCC);

The resulting index of fD can now be queried using the rather verbose einsum_helper. This resulting index is constructed such that if you permute the output tensor with it, you get a tensor ordered the way the free indices appear in the einsum call (you can certainly also control the shape of the output using an OIndex sequence)

using resulting_index = einsum_helper<Index<P,I>,Index<Q,J>,Index<I,J,K,L>,Index<R,K>,Index<S,L>,
        decltype(fCC),decltype(fCC),decltype(fII),decltype(fCC),decltype(fCC)>::resulting_index;

einsum_helper is a helper trait that takes all the indices and all the tensor types as template parameters, i.e. einsum_helper<Indices...,Tensors...>. You can also query resulting_tensor just like resulting_index.

You can now use permute to permute the output tensor

fD = permute<resulting_index>(fD);

For debugging purposes, you can also simply print the type of resulting index using the type_name function

print(type_name<resulting_index>());

will print

Fastor::Index<8ul, 7ul, 6ul, 9ul> 

NumPy to Fastor cheatsheet

If you are familiar with NumPy's einsum function, then you can use the following non-exhaustive cheatsheet for code translation

| NumPy | Fastor |
| --- | --- |
| `einsum("ii",A)` | `einsum<Index<i,i>>(A)` |
| `einsum("ji",A)` | `permute<Index<j,i>>(A)` |
| `einsum("ijk->kij",A)` | `permute<Index<k,i,j>>(A)` |
| `einsum("ij,jk",A,B)` | `einsum<Index<i,j>,Index<j,k>>(A,B)` |
| `einsum("ij,jk->ki",A,B)` | `einsum<Index<i,j>,Index<j,k>,OIndex<k,i>>(A,B)` |
| `einsum("mij,jk,lk->mli",A,B,C)` | `einsum<Index<m,i,j>,Index<j,k>,Index<l,k>,OIndex<m,l,i>>(A,B,C)` |

Note that Fastor's einsum is not fully equivalent to NumPy's einsum; in particular

  1. An index cannot appear more than twice in an einsum call
  2. You cannot explicitly disable summing over an index in Fastor, as mentioned above, but you can use the sum and diag functions to achieve the same thing