In [86]:
import numpy as np
import time

# Tensor Contractions
Tensor contractions can be hard! Here, we show how to perform tensor contractions using 2D matrix multiplications (e.g. DGEMM) whose results match those produced by NumPy's Einstein summation function (`np.einsum`). In doing so, we will unravel what einsum is doing under the hood and code our own einsum algorithm!

In the context of coupled cluster theory, we are often faced with contractions of the form
$$c_{an}^{ef}=v_{mn}^{ef}t_{a}^m$$
where summation over the repeated index $m$ is implied, $m,n = 1,\ldots,N_{occ}$, and $a,e,f = N_{occ}+1,\ldots,N_{tot}$. Here $N_{occ}$ is the number of occupied spinorbitals in the reference (HF) state and $N_{tot}$ is the total number of spinorbitals in the basis set.
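
To make the implied summation concrete, here is a minimal sketch that evaluates the contraction with explicit loops and compares it against `np.einsum`. The array sizes and the names `n_occ`, `n_unocc`, `v`, `t1` are illustrative assumptions, not part of the setup used below.

```python
import numpy as np

rng = np.random.default_rng(0)
n_occ, n_unocc = 2, 3                      # illustrative sizes only

v = rng.integers(0, 10, (n_occ, n_occ, n_unocc, n_unocc))   # v(m,n,e,f)
t1 = rng.integers(0, 10, (n_unocc, n_occ))                   # t1(a,m)

# c(a,n,e,f) = sum_m v(m,n,e,f) * t1(a,m), written out term by term
c_loops = np.zeros((n_unocc, n_occ, n_unocc, n_unocc), dtype=v.dtype)
for a in range(n_unocc):
    for n in range(n_occ):
        for e in range(n_unocc):
            for f in range(n_unocc):
                for m in range(n_occ):
                    c_loops[a, n, e, f] += v[m, n, e, f] * t1[a, m]

# the same contraction expressed as an einsum string
c_einsum = np.einsum('mnef,am->anef', v, t1)
print(np.array_equal(c_loops, c_einsum))
```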

Here is a mock setup:


In [87]:
def get_matrices(Nocc,Nunocc):
    
    Norb = Nocc+Nunocc
    
    # random integer mock "integrals", antisymmetrized in the last two indices
    V = np.random.randint(0,10,(Norb,Norb,Norb,Norb))
    V = V - np.einsum('pqrs->pqsr',V,optimize=True)
    
    T1 = np.random.randint(0,10,(Nunocc,Nocc))
    T2 = np.random.randint(0,10,(Nunocc,Nunocc,Nocc,Nocc))
    
    for a in range(Nunocc):
        for b in range(a+1,Nunocc):
            for i in range(Nocc):
                for j in range(i+1,Nocc):
                    T2[b,a,i,j] = -T2[a,b,i,j]
                    T2[a,b,j,i] = -T2[a,b,i,j]
                    T2[b,a,j,i] = T2[a,b,i,j]
    
    return V, T1, T2, Norb

Nocc = 10
Nunocc = 16

V, T1, T2, Norb = get_matrices(Nocc,Nunocc)

In accordance with practical calculations, we have initialized the two-electron repulsion integral tensor over the total basis set. The indices of the $T_1$ cluster amplitudes are always read from bottom to top, so $t_a^m$ is stored as `T1[a,m]`, with unoccupied indices along rows and occupied indices along columns.

First, let's think about how DGEMM works. DGEMM multiplies two 2D matrices, each stored as a flat, contiguous block of memory (in Fortran and Matlab, flattening occurs in a column-major fashion, whereas NumPy defaults to row-major). A tensor contraction of multidimensional arrays 

$$v(m,n,e,f)t_1(a,m)$$

occurs by contracting the last dimension of the object on the left (the slow-moving dimension in column-major memory) with the first dimension of the object on the right (the fast-moving dimension in column-major memory). As written, the above contraction cannot be carried out as a matrix product because the contraction dimensions are not lined up properly. First, we must permute $v(m,n,e,f) \rightarrow v'(n,e,f,m)$ and $t_1(a,m)\rightarrow t_1'(m,a)$. Then, we can form

$$v'(n,e,f,m)t_1'(m,a)$$

Next, we must reshape the tensors into 2D matrices $v'(n,e,f,m) \rightarrow \textbf{V}(nef,m)$ and $t_1'(m,a) \rightarrow \textbf{t}_1(m,a)$ to form the DGEMM matrix product

$$\textbf{C}(nef,a) = \textbf{V}(nef,m)\textbf{t}_1(m,a)$$

Now, we can reshape this matrix into the correct (unravelled) dimensions

$$\textbf{C}(nef,a) \rightarrow c(n,e,f,a)$$

and finally permute this object into the correct tensor we want 

$$c(n,e,f,a)\rightarrow c(a,n,e,f)$$
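
The steps above can be sketched directly in NumPy. This is a minimal illustration on small random arrays (the sizes and variable names are assumptions made for the example), with `@` standing in for the DGEMM call:

```python
import numpy as np

rng = np.random.default_rng(0)
n_occ, n_unocc = 3, 4                      # illustrative sizes only

v = rng.integers(0, 10, (n_occ, n_occ, n_unocc, n_unocc))   # v(m,n,e,f)
t1 = rng.integers(0, 10, (n_unocc, n_occ))                   # t1(a,m)

# permute so the contracted index m is last in v and first in t1
v_p = np.transpose(v, (1, 2, 3, 0))        # v'(n,e,f,m)
t1_p = np.transpose(t1, (1, 0))            # t1'(m,a)

# reshape into 2D matrices
V2d = v_p.reshape(n_occ * n_unocc * n_unocc, n_occ)    # V(nef,m)
T2d = t1_p.reshape(n_occ, n_unocc)                      # t1(m,a)

# 2D matrix product
C2d = V2d @ T2d                            # C(nef,a)

# unravel, then permute into the desired ordering
c = C2d.reshape(n_occ, n_unocc, n_unocc, n_unocc)       # c(n,e,f,a)
c = np.transpose(c, (3, 0, 1, 2))                       # c(a,n,e,f)

print(np.array_equal(c, np.einsum('mnef,am->anef', v, t1)))
```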

The same is true for an arbitrary number of contracted and uncontracted dimensions. Let's take another example from coupled cluster calculations:

$$c_{af}^{mj} = \sum_{n,e} v_{mn}^{ef}\,(t_2)_{ae}^{nj} \equiv v(m,n,e,f)\,t_2(a,e,n,j)$$

First, we permute to line up the contraction dimensions 
$$v(m,n,e,f)\rightarrow v'(m,f,n,e)$$  $$t_2(a,e,n,j) \rightarrow t_2'(n,e,a,j)$$ 

Note that the contracted indices occur in the SAME order at the end of tensor $v$ and at the beginning of tensor $t_2$. This is because we are using reshape to put the tensors in the correct dimensions for matrix multiplication, and reshape flattens both groups of indices in the same way. In other contexts, we might require that the contracted indices be the mirror image of one another, i.e. $v'(m,f,n,e)t'_2(e,n,a,j)$, as this has directly suitable dimensions for DGEMM (or perhaps this comes about from row-major vs. column-major linear indexing). Next, we reshape into 2D matrices to use DGEMM multiplication. 

$$v'(m,f,n,e)t_2'(n,e,a,j) \rightarrow \textbf{C}(mf,aj)=\textbf{V}(mf,ne)\textbf{t}_2(ne,aj) $$

The product is then unravelled into its corresponding tensor and permuted into the answer we want
$$\textbf{C}(mf,aj) \rightarrow c(m,f,a,j) \rightarrow c(a,f,m,j)$$
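
As a check on this second example, the same sequence of permute, reshape, multiply, unravel, and permute can be written out as follows. The sizes and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_occ, n_unocc = 3, 4                      # illustrative sizes only

v = rng.integers(0, 10, (n_occ, n_occ, n_unocc, n_unocc))     # v(m,n,e,f)
t2 = rng.integers(0, 10, (n_unocc, n_unocc, n_occ, n_occ))    # t2(a,e,n,j)

# contracted indices (n,e) go last in v and first in t2, in the same order
v_p = np.transpose(v, (0, 3, 1, 2))        # v'(m,f,n,e)
t2_p = np.transpose(t2, (2, 1, 0, 3))      # t2'(n,e,a,j)

V2d = v_p.reshape(n_occ * n_unocc, n_occ * n_unocc)      # V(mf,ne)
T2d = t2_p.reshape(n_occ * n_unocc, n_unocc * n_occ)     # t2(ne,aj)

C2d = V2d @ T2d                                          # C(mf,aj)

c = C2d.reshape(n_occ, n_unocc, n_unocc, n_occ)          # c(m,f,a,j)
c = np.transpose(c, (2, 1, 0, 3))                        # c(a,f,m,j)

print(np.array_equal(c, np.einsum('mnef,aenj->afmj', v, t2)))
```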

#### The key here is that we want to formalize the notion that, when contracting tensors $AB$, the contraction indices must be placed at the end of $A$ and at the beginning of $B$
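
The following sketch illustrates why the contracted indices must also appear in the same order in both tensors when NumPy's default reshape is used. It compares the same-order layout $t_2'(n,e,a,j)$ with the mirror-image layout $t_2'(e,n,a,j)$ on random data. The sizes are chosen (as an assumption for this example) so that $n$ and $e$ have different ranges, which makes the mismatch visible.

```python
import numpy as np

rng = np.random.default_rng(1)
n_occ, n_unocc = 2, 3                      # illustrative sizes only

v_p = rng.random((n_occ, n_unocc, n_occ, n_unocc))       # v'(m,f,n,e)
t2_same = rng.random((n_occ, n_unocc, n_unocc, n_occ))   # t2'(n,e,a,j)
t2_mirror = np.transpose(t2_same, (1, 0, 2, 3))          # t2'(e,n,a,j)

reference = np.einsum('mfne,neaj->mfaj', v_p, t2_same)

n_con = n_occ * n_unocc                    # size of the composite (n,e) index
V2d = v_p.reshape(-1, n_con)               # V(mf,ne)

# same order: composite index runs over (n,e) in both matrices
same = (V2d @ t2_same.reshape(n_con, -1)).reshape(reference.shape)

# mirror image: composite index runs over (e,n) in t2 but (n,e) in v
mirror = (V2d @ t2_mirror.reshape(n_con, -1)).reshape(reference.shape)

print(np.allclose(same, reference))        # same-order layout reproduces einsum
print(np.allclose(mirror, reference))      # mirror-image layout does not
```

Both products have compatible shapes, so the mirror-image version runs without error; it simply pairs up the wrong elements.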

## An algorithm
The above is obviously formulaic and can be boiled down to an algorithm as follows:

#### Problem: 
Calculate the contraction $A(i_1,i_2,\ldots,i_m)B(j_1,j_2,\ldots,j_n)$ where the sets $\{i_{c(u)}\}$ and $\{j_{c(u)}\}$ denote contracted (uncontracted) indices in $A$ and $B$, respectively. Note that the number of elements in $\{i_{c}\}$ must equal the number of elements in $\{j_{c}\}$; we are simply using different letters to emphasize that the shared contraction indices may appear at different indical positions in both $A$ and $B$. 

(1) Establish the indical positions of contraction and uncontraction in $A$ and $B$

(2) Permute the tensors $A$ and $B$ such that they are in the order $A(\{i_{u}\},\{i_{c}\})$ and $B(\{j_{c}\},\{j_{u}\})$

(3) Reshape $A(\{i_{u}\},\{i_{c}\}) \rightarrow \textbf{A}_{N^A_u \times N_c}$ and $B(\{j_{c}\},\{j_{u}\}) \rightarrow \textbf{B}_{N_c\times N^B_u}$

(4) Perform the DGEMM matrix product $\textbf{C}_{N^A_u \times N^B_u} = \textbf{A}_{N^A_u \times N_c}\textbf{B}_{N_c\times N^B_u}$

(5) Unravel the DGEMM product into its tensorial dimensions $\textbf{C}_{N^A_u \times N^B_u} \rightarrow c(\{i_u\},\{j_u\})$

(6) If needed, permute $c(\{i_u\},\{j_u\})$ into the desired output ordering

Here is code that implements this basic functionality up to step (5)

In [88]:
def myeinsum_test(A,indA,B,indB):
    # contracts indices indA of tensor A with indices indB of tensor B such that
    # indA[k] contracts with indB[k]
    
    # list of total indices
    ind0A = range(0,len(np.shape(A)))
    ind0B = range(0,len(np.shape(B)))
    
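    # list those indices that are uncontracted in both, preserving their original order
    # (a set difference does not guarantee any particular ordering)
    indA_un = [i for i in ind0A if i not in indA]
    indB_un = [i for i in ind0B if i not in indB]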
    
    # The permuted order suitable for contraction
    # NOTE: A(uncontracted,contracted)*B(contracted,uncontracted)
    A_permuted = np.transpose(A,indA_un+indA)
    B_permuted = np.transpose(B,indB+indB_un)
    
    dim_un_A = [A.shape[indA_un[i]] for i in range(len(indA_un))]
    dim_un_B = [B.shape[indB_un[i]] for i in range(len(indB_un))]
    
    numel_unc = [np.prod(dim_un_A), np.prod(dim_un_B)]
    numel_con = int(A.size/numel_unc[0])
    
    if numel_con != int(B.size/numel_unc[1]):
        print('Error: contraction dimensions not compatible')
        return
    else:
        Ars = np.reshape(A_permuted,(numel_unc[0],numel_con))
        Brs = np.reshape(B_permuted,(numel_con,numel_unc[1]))
    
        dim_out = [dim_un_A[i] for i in range(len(dim_un_A))] + [dim_un_B[i] for i in range(len(dim_un_B))]
    
        return np.reshape(Ars@Brs,dim_out)
    
    

Let's test our function out. We will use the aforementioned example of 

$$c(f,e,n,a) = v(m,n,e,f)t_1(a,m)$$

For our function, we must explicitly supply the positions of the contracted indices in $v$ and $t_1$. We are contracting over $m$, which appears in position $0$ in $v$ and position $1$ in $t_1$. Note, however, that the code above cannot actually produce $c(f,e,n,a)$. We are restricted to the tensor ordering that results from simply concatenating the uncontracted indices of $v$ and $t_1$ in the order they appear after contraction:
$$v(n,e,f,m)t_1(m,a) = c(n,e,f,a)$$

Anyway, to produce $v_{mn}^{ef}$, we require that the first two indices enumerate occupied spinorbitals and the last two run over unoccupied ones. This is done efficiently using Python's built-in `slice` objects.

In [89]:
o = slice(0,Nocc)
u = slice(Nocc,Norb)
Voouu = V[o,o,u,u]

To test our code, we will perform the contraction $v(m,n,e,f)t_1(a,m) = c(n,e,f,a)$ using both `np.einsum` and our own method

In [90]:
C_exact = np.einsum('mnef,am->nefa',Voouu,T1)
C_test = myeinsum_test(Voouu,[0],T1,[1])

np.testing.assert_array_equal(C_exact,C_test,err_msg='Arrays not equal!')
print('Arrays equal!')

Arrays equal!


It works! That's certainly nice, but we would like to take it a step further. It's somewhat undesirable that we have to supply the correct index positions for contraction in both tensors. It would be nice if the function acted more like an actual tensor contraction and automatically contracted over shared alphanumeric indices.

Furthermore, if we are using alphanumeric indices, we could also use the strings to specify input and output orderings for the involved tensors so that we are not restricted to the tensor ordering that results from the segregation of uncontracted from contracted indices. In fact, this is exactly what NumPy's einsum function does (and why it is so convenient to work with!). Here we present an extension of our simple algorithm that performs contractions with the same alphanumeric indical functionality.

Note that the code consists of two parts: the first part (which takes up most of the lines of code) simply parses the input strings to extract the contraction index positions needed for the second part, which is the core contraction algorithm provided previously. The somewhat tricky bit is that, for example, the product $v(m,f,n,e)t_2(n,e,a,j)$ produces $c(m,f,a,j)$ after reshaping, DGEMM multiplication, and unravelling. Suppose we want the output tensor in a specified permutation, say $c(f,a,j,m)$. The string $mfaj$ must then be permuted into the desired order held in the variable `strC` (in this case $fajm$). The function `intermediate_output_str(indA_un, indB_un)` produces the string $mfaj$ from the uncontracted indices. The function `return_output_indices(c, strC)` compares the string $mfaj$ to the desired ordering, e.g. $fajm$, and produces the index permutation that `np.transpose()` uses to permute the resulting tensor product into the desired ordering.
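
As a small worked example of that last step, the permutation taking the intermediate string $mfaj$ to the desired string $fajm$ can be computed as below. The variable names and the dummy array shape are assumptions made for illustration.

```python
import numpy as np

intermediate = 'mfaj'                      # ordering produced by the contraction
desired = 'fajm'                           # ordering requested by the caller

# for each label of the desired output, find its axis in the intermediate result
perm = [intermediate.index(label) for label in desired]
print(perm)                                # [1, 2, 3, 0]

# dummy intermediate tensor with distinct axis lengths: m=2, f=3, a=4, j=5
c_mfaj = np.zeros((2, 3, 4, 5))
c_fajm = np.transpose(c_mfaj, perm)
print(c_fajm.shape)                        # (3, 4, 5, 2)
```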

Note that the following code does not support self-contractions within individual tensors. For example, it could not evaluate the expression $v_{ii}^{mn}t_{mn}^{ef}$ since the first tensor has a contraction over its own indices (although one could add this functionality without too much trouble!). 
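
One possible way to add this, sketched here under stated assumptions, is to pre-process each tensor and sum over any label that occurs twice within it before the pairwise contraction. The helper name `trace_repeated_labels` is hypothetical, and the sketch assumes that a repeated label appears exactly twice in one tensor and nowhere else in the expression.

```python
import numpy as np

def trace_repeated_labels(A, labels):
    """Sum over every label that appears exactly twice in `labels`.

    Hypothetical helper: returns the reduced array and the remaining labels.
    """
    labels = list(labels)
    for label in sorted(set(labels)):
        if labels.count(label) == 2:
            i = labels.index(label)
            j = labels.index(label, i + 1)
            # np.trace sums the diagonal over axes i and j and removes both axes
            A = np.trace(A, axis1=i, axis2=j)
            del labels[j]                  # delete the later position first
            del labels[i]
    return A, ''.join(labels)

rng = np.random.default_rng(0)
v = rng.integers(0, 10, (3, 3, 4, 5))      # mock v(i,i,m,n)

v_reduced, remaining = trace_repeated_labels(v, 'iimn')
print(remaining)                           # mn
print(np.array_equal(v_reduced, np.einsum('iimn->mn', v)))
```

The reduced array and label string could then be passed on to a pairwise contraction routine in place of the original tensor.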

In [91]:
# Tensor contraction function
def tensor_contract(A,strA,B,strB,strC):
    
    # strA = string of char labels for indices of A
    # strB = string of char labels for indices of B
    # strC = string of char labels for indices of output C
    # contraction implied by shared indices and output char string must respect this
    
    def return_contraction_indices(a, b):
        indA = []
        indB = []
        for i,v1 in enumerate(a):
            for j,v2 in enumerate(b):
                if v1 == v2:
                    indA.append(a.index(v1))
                    indB.append(b.index(v2))
        return indA, indB
    
    def intermediate_output_str(indA_un, indB_un):
        c = ''
        for i in range(len(indA_un)):
            c += strA[indA_un[i]]
        for i in range(len(indB_un)):
            c += strB[indB_un[i]]
        return c
        
    def return_output_indices(c,strC):
        # position in the intermediate string c of each label of strC, in order
        return [c.index(label) for label in strC]
    
    indA, indB = return_contraction_indices(strA,strB)
    
    # contracts indices indA of tensor A with indices indB of tensor B such that
    # indA[k] contracts with indB[k]
    
    # list of total indices
    ind0A = range(0,len(np.shape(A)))
    ind0B = range(0,len(np.shape(B)))
    
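    # list those indices that are uncontracted in both, preserving their original order
    # (a set difference does not guarantee any particular ordering)
    indA_un = [i for i in ind0A if i not in indA]
    indB_un = [i for i in ind0B if i not in indB]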
    
    c = intermediate_output_str(indA_un,indB_un)
    
  #  print(c)
    
    indC = return_output_indices(c,strC)
    
  #  print(indC)
    
    # The permuted order suitable for contraction
    # NOTE: A(uncontracted,contracted)*B(contracted,uncontracted)
    A_permuted = np.transpose(A,indA_un+indA)
    B_permuted = np.transpose(B,indB+indB_un)
    
    dim_un_A = [A.shape[indA_un[i]] for i in range(len(indA_un))]
    dim_un_B = [B.shape[indB_un[i]] for i in range(len(indB_un))]
    
    numel_unc = [np.prod(dim_un_A), np.prod(dim_un_B)]
    numel_con = int(A.size/numel_unc[0])

    if numel_con != int(B.size/numel_unc[1]):
        print('Error: contraction dimensions not compatible')
        return
    else:
        # explicitly typecast to ints to avoid float 1.0 when one tensor gets fully contracted
        Ars = np.reshape(A_permuted,(int(numel_unc[0]),int(numel_con)))
        Brs = np.reshape(B_permuted,(int(numel_con),int(numel_unc[1])))
        dim_out = dim_un_A + dim_un_B
        return np.transpose(np.reshape(Ars@Brs,dim_out),indC)
    
    
# Einsum-like parsing and wrapper for tensor contraction function
def einsumKG(input_str, arr_1, arr_2):
    
    temp = input_str.split('->')
    strC = temp[1]
    temp2 = temp[0].split(',')
    strA = temp2[0]
    strB = temp2[1]
    
    return tensor_contract(arr_1, strA, arr_2, strB, strC)

    
    

And we can test our code now for various contractions and permutations against `np.einsum`. 

In [94]:
def test_einsum(Nocc,Nunocc):
    
    print('Nocc = {}, Nunocc = {}'.format(Nocc,Nunocc))
    
    V, T1, T2, Norb = get_matrices(Nocc,Nunocc)
    
    o = slice(0,Nocc)
    u = slice(Nocc,Norb)
    
    Voouu = V[o,o,u,u]
    
    print('Test 1 - nmaf,an->mf')
    # sum over n and a of v_{nm}^{af} t1_{a}^{n}
    t0 = time.time()
    C_exact = np.einsum('nmaf,an->mf',Voouu,T1)
    t1 = time.time()
    tau1 = t1 - t0

    t0 = time.time()
    C_exact = np.einsum('nmaf,an->mf',Voouu,T1,optimize=True)
    t1 = time.time()
    tau2 = t1 - t0

    t0 = time.time()
    C_test = einsumKG('nmaf,an->mf',Voouu,T1)
    t1 = time.time()
    tau3 = t1 - t0

    np.testing.assert_array_equal(C_exact,C_test,err_msg='Arrays are not equal!')
    print('Arrays equal!')
    print('Time for Numpy einsum: {} ms'.format(tau1*1000))
    print('Time for Numpy einsum (optimized): {} ms'.format(tau2*1000))
    print('Time for einsumKG: {} ms'.format(tau3*1000))
    
    print('Test 2 - nmab,abnm->')
    t0 = time.time()
    C_exact = np.einsum('nmab,abnm->',Voouu,T2)
    t1 = time.time()
    tau1 = t1 - t0

    t0 = time.time()
    C_exact = np.einsum('nmab,abnm->',Voouu,T2,optimize=True)
    t1 = time.time()
    tau2 = t1 - t0

    t0 = time.time()
    C_test = einsumKG('nmab,abnm->',Voouu,T2)
    t1 = time.time()
    tau3 = t1 - t0

    np.testing.assert_array_equal(C_exact,C_test,err_msg='Arrays are not equal!')
    print('Arrays equal!')
    print('Time for Numpy einsum: {} ms'.format(tau1*1000))
    print('Time for Numpy einsum (optimized): {} ms'.format(tau2*1000))
    print('Time for einsumKG: {} ms'.format(tau3*1000))
    print('\n')
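
Single-shot timings with `time.time()` at the sub-millisecond scale can be noisy, so the numbers from this test are best read as rough indications. As a hedged alternative, the standard-library `timeit` module can repeat a call many times and report the best batch. The array sizes, the helper name `contract`, and the repeat counts below are illustrative assumptions.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
v = rng.integers(0, 10, (10, 10, 20, 20))  # mock v(n,m,a,f)
t1 = rng.integers(0, 10, (20, 10))         # mock t1(a,n)

def contract():
    # the contraction being timed
    return np.einsum('nmaf,an->mf', v, t1)

n_calls = 100                              # calls per batch (illustrative)
batch_times = timeit.repeat(contract, repeat=5, number=n_calls)

# the minimum over batches is the least disturbed by other system activity
best_ms = min(batch_times) / n_calls * 1000
print('best of 5 batches: {:.4f} ms per call'.format(best_ms))
```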

In [95]:
test_einsum(5,8)
test_einsum(10,20)
test_einsum(20,50)

Nocc = 5, Nunocc = 8
Test 1 - nmaf,an->mf
Arrays equal!
Time for Numpy einsum: 0.05412101745605469 ms
Time for Numpy einsum (optimized): 0.14090538024902344 ms
Time for einsumKG: 0.1518726348876953 ms
Test 2 - nmab,abnm->
Arrays equal!
Time for Numpy einsum: 0.05412101745605469 ms
Time for Numpy einsum (optimized): 0.12493133544921875 ms
Time for einsumKG: 0.13017654418945312 ms


Nocc = 10, Nunocc = 20
Test 1 - nmaf,an->mf
Arrays equal!
Time for Numpy einsum: 0.16188621520996094 ms
Time for Numpy einsum (optimized): 0.2970695495605469 ms
Time for einsumKG: 0.1819133758544922 ms
Test 2 - nmab,abnm->
Arrays equal!
Time for Numpy einsum: 0.1747608184814453 ms
Time for Numpy einsum (optimized): 0.19884109497070312 ms
Time for einsumKG: 0.32401084899902344 ms


Nocc = 20, Nunocc = 50
Test 1 - nmaf,an->mf
Arrays equal!
Time for Numpy einsum: 3.1061172485351562 ms
Time for Numpy einsum (optimized): 1.2888908386230469 ms
Time for einsumKG: 6.085872650146484 ms
Test 2 - nmab,abnm->
Arrays equa