### TNSAA Workshop 2018, January 29, Beijing

# TensorKit.jl:
## Tensors and tensor networks with symmetries in Julia

### Jutho Haegeman
#### Department of Physics and Astronomy, UGent

## Overview

* **What is a tensor?**
* **Tensors and symmetries**
* **Intro to the Julia Language**
* **TensorKit.jl**
* **Some (preliminary) benchmarks**
* **Outlook**

# What is a tensor?

## Tensors as elements from a tensor product space
* Tensors are elements from the tensor product of vector spaces

    * e.g. a rank-$N$ tensor $t$:
    $$ t \in V_1 \otimes V_2 \otimes \ldots \otimes V_N $$
      <img src="tensor.svg">

    * tensors behave as vectors: tensors in the same space can be added and multiplied by scalars
    
    * (For quantum mechanics, we will only consider complex vector spaces or super vector spaces with Euclidean inner product, for which the dual vector space is naturally isomorphic to the complex conjugate vector space: $\overline{V} = V^{\ast}$.)
    
    * e.g. without symmetries: $t \in \mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2} \otimes \ldots \otimes \mathbb{C}^{d_N}$
    $$t = \sum_{i_1=1}^{d_1}\sum_{i_2=1}^{d_2}\cdots \sum_{i_N=1}^{d_N} t_{i_1,i_2,\ldots,i_N} \vert i_1\rangle \otimes \vert i_2\rangle \otimes\ldots\otimes \vert i_N\rangle  $$
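
As a minimal sketch in plain Julia (no TensorKit; the dimensions `d1, d2, d3` and the helper `inner` are choices made here for illustration), the components $t_{i_1,i_2,i_3}$ of a rank-3 tensor without symmetries form an ordinary multidimensional array, and the vector space operations act componentwise:

```julia
# Components of rank-3 tensors in ℂ^2 ⊗ ℂ^3 ⊗ ℂ^4 stored as plain arrays.
d1, d2, d3 = 2, 3, 4
t = randn(ComplexF64, d1, d2, d3)
u = randn(ComplexF64, d1, d2, d3)

# Tensors in the same space behave as vectors.
w = 2.0 * t + im * u

# The Euclidean inner product sums over all components.
inner(a, b) = sum(conj.(a) .* b)

ndims(w), length(w)   # rank N = 3, total dimension d1*d2*d3 = 24
```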

## Tensors as linear maps between tensor product spaces

* Sometimes we also want to think of a tensor as a linear map between 2 tensor product spaces:
    
    * tensor contraction = composition of linear maps (matrix multiplication)
    * tensor factorization (singular value decomposition, eigenvalue decomposition, QR factorization)

    * e.g. a linear map (henceforth called tensor map)
    $$t: W_1 \otimes W_2 \otimes \ldots \otimes W_{N_2} \to V_1 \otimes V_2 \otimes \ldots \otimes V_{N_1}$$ 
      <img src="tensormap.svg">

        * has range/codomain $\mathrm{cod}(t) = V_1 \otimes V_2 \otimes \ldots \otimes V_{N_1}$
        * has domain $\mathrm{dom}(t) = W_1 \otimes W_2 \otimes \ldots \otimes W_{N_2}$
        * a tensor of rank $N$ is a tensor map with $N_1=N, N_2 = 0$
    
* Category theory: tensor product spaces are the objects of the category $\mathsf{Vect}$ and tensor (maps) are the corresponding morphisms. $\mathsf{Vect}$ is a monoidal category as $V_1 \otimes (V_2 \otimes V_3) \equiv (V_1 \otimes V_2) \otimes V_3$ are trivially isomorphic.
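
The statement that contraction is composition of linear maps can be checked with plain arrays. The following is only a sketch with made-up dimensions, not TensorKit code: a tensor map is stored as a rank-4 array, grouped into a matrix, and composed with a second map by a matrix product.

```julia
# A tensor map t: W1 ⊗ W2 → V1 ⊗ V2 stored as an array t[i1, i2, j1, j2].
dV1, dV2, dW1, dW2 = 2, 3, 4, 2
t = randn(dV1, dV2, dW1, dW2)
s = randn(dW1, dW2, 5)            # a map ℂ^5 → W1 ⊗ W2

# Grouping codomain and domain indices turns both into matrices.
tmat = reshape(t, dV1 * dV2, dW1 * dW2)
smat = reshape(s, dW1 * dW2, 5)

# Composition t ∘ s is a matrix product ...
ts = reshape(tmat * smat, dV1, dV2, 5)

# ... and agrees with contracting the shared indices j1, j2 explicitly.
ts_ref = [sum(t[i1, i2, j1, j2] * s[j1, j2, k] for j1 in 1:dW1, j2 in 1:dW2)
          for i1 in 1:dV1, i2 in 1:dV2, k in 1:5]
ts ≈ ts_ref
```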

## Tensors and index permutations
* Bipartition between domain and codomain indices of a tensor is not fixed and has to be constantly changed in contracting tensor networks or during tensor factorizations.
* To change the order between spaces in either domain or codomain of the tensor, we need a braiding structure $c_{1,2}$, i.e. an isomorphism $c_{1,2}:V_1 \otimes V_2 \to V_2 \otimes V_1$.
    * For bosons or spins (finite-dimensional complex Euclidean spaces), the braiding is a trivial swap. For fermions (super vector spaces, i.e. with a natural $\mathbb{Z}_2$ grading), the swap is accompanied by a controlled $Z$ (a sign $-1$ when both states are odd).
    * These braidings are symmetric ($c_{2,1}=c_{1,2}^{-1}$), which means that crossing lines in a tensor network diagram are well defined.
    <img src="braiding.svg">
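
A small numerical sketch of the two braidings, assuming the smallest graded space $\mathbb{C}^{1|1}$ with one even and one odd basis state (the helpers `idx` and `braiding` are names introduced here, not part of any library):

```julia
using LinearAlgebra

# Basis states of a ℤ₂-graded space ℂ^{1|1}: parity 0 (even) and 1 (odd).
parities = (0, 1)
idx(a, b) = 1 + a + 2b        # linear index of |a⟩ ⊗ |b⟩

# braiding(phase) maps |a⟩ ⊗ |b⟩ to phase(a, b) * |b⟩ ⊗ |a⟩.
function braiding(phase)
    c = zeros(4, 4)
    for a in parities, b in parities
        c[idx(b, a), idx(a, b)] = phase(a, b)
    end
    return c
end

c_boson   = braiding((a, b) -> 1)             # plain swap
c_fermion = braiding((a, b) -> (-1)^(a * b))  # sign -1 for odd ⊗ odd

CZ = Diagonal([1, 1, 1, -1])                  # controlled Z
c_fermion == c_boson * CZ                     # fermionic braiding = swap ∘ controlled Z
c_fermion * c_fermion == Matrix(I, 4, 4)      # symmetric: c_{2,1} ∘ c_{1,2} = id
```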

## Tensors and index permutations
* Bipartition between domain and codomain indices of a tensor is not fixed and has to be constantly changed in contracting tensor networks or during tensor factorizations.
* To exchange spaces between domain and codomain, we use the autonomous and pivotal structure ($\epsilon: V^{\ast} \otimes V \to \mathbb{C}$ and $\eta:\mathbb{C} \to V \otimes V^{\ast}$ with $V^{\ast \ast} \equiv V$). Henceforth, we denote $V^\ast$ as $\overline{V}$ as they are the same.
<img src="repartition.svg">

## Tensors, adjoints and inner products
* A general tensor map can have normal and dual spaces in both domain and codomain: we will use left pointing arrows for normal spaces and right pointing arrows for dual spaces
<img src="arrowedtensor.svg">
e.g. $$t \big( \langle j_1\vert \otimes \vert j_2\rangle\big) = t_{i_1,i_2,i_3; j_1,j_2} \vert i_1\rangle \otimes \langle i_2\vert \otimes \vert i_3\rangle$$

* The adjoint (dagger or Hermitian conjugate) of a tensor map
$$t: W_1 \otimes W_2 \otimes \ldots \otimes W_{N_2} \to V_1 \otimes V_2 \otimes \ldots \otimes V_{N_1}$$ 
is a tensor map
$$t^\dagger: V_1 \otimes V_2 \otimes \ldots \otimes V_{N_1} \to W_1 \otimes W_2 \otimes \ldots \otimes W_{N_2}$$

* Note that the dagger of a tensor $t \in V_1 \otimes V_2 \otimes \ldots \otimes V_{N}$ is not a tensor but a dual object $t^\dagger: V_1 \otimes V_2 \otimes \ldots \otimes V_{N} \to \mathbb{C}$. By using index permutations, $t^\dagger$ can be permuted into a tensor in $\overline{V}_N \otimes \ldots \otimes \overline{V}_2 \otimes \overline{V}_1$ or in $\overline{V}_1 \otimes \overline{V}_2 \otimes \ldots \otimes \overline{V}_N$ (using additional braidings).

# Tensors and symmetries

## Symmetries and unitary representations
* Let $v(g)$ be a unitary representation of elements $g$ of a group $\mathsf{G}$ on a (complex Euclidean) space $V$. We can decompose $v(g)$ into a block diagonal form corresponding to the irreducible representations $j$ of $\mathsf{G}$. $V$ is completely specified (up to basis transforms) by the number of times $n_j$ that irrep $j$ appears, i.e.
$$ V = \bigoplus_{j} R_j \otimes \mathbb{C}^{n_j}$$
The corresponding representation $v(g)$ takes the form
$$ v(g) = \bigoplus_{j} r_j(g) \otimes 1_{n_j}$$
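
As a worked example (standard $\mathsf{SU}_2$ angular momentum addition, not taken from the slides), take $V$ to be the space of three spin-$1/2$ particles:
$$ R_{1/2} \otimes R_{1/2} \otimes R_{1/2} \cong (R_0 \oplus R_1) \otimes R_{1/2} \cong \big(R_{1/2} \otimes \mathbb{C}^{2}\big) \oplus \big(R_{3/2} \otimes \mathbb{C}^{1}\big) $$
so that $n_{1/2} = 2$ and $n_{3/2} = 1$. The dimensions match: $2 \cdot 2 + 4 \cdot 1 = 8 = 2^3$.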

## Symmetries and unitary representations
* Consider a tensor map
$$t: W_1 \otimes W_2 \otimes \ldots \otimes W_{N_2} \to V_1 \otimes V_2 \otimes \ldots \otimes V_{N_1}$$ 
where the spaces $V_i$ and $W_j$ have unitary representations $v_i(g)$ and $w_j(g)$ for the elements $g$ of a group $\mathsf{G}$. A symmetric tensor map acts as an intertwiner between the unitary representations 
$$v_1(g) \otimes v_2(g) \otimes \ldots \otimes v_{N_1}(g)$$
and
$$w_1(g) \otimes w_2(g) \otimes \ldots \otimes w_{N_2}(g),$$
i.e.
$$(v_1(g) \otimes v_2(g) \otimes \ldots \otimes v_{N_1}(g)) \circ t = t \circ (w_1(g) \otimes w_2(g) \otimes \ldots \otimes w_{N_2}(g)) .$$

* In particular, a symmetric tensor $t \in V_1 \otimes V_2 \otimes \ldots \otimes V_N$ is invariant under the action of $$v_1(g) \otimes v_2(g) \otimes \ldots \otimes v_{N}(g)$$
for all $g \in \mathsf{G}$.
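
For an abelian group the intertwiner condition is easy to test numerically. The sketch below assumes a $\mathsf{U}_1$ symmetry with made-up charge assignments `qV` and `qW`; a map that only connects basis states of equal charge satisfies the condition, a generic one does not.

```julia
using LinearAlgebra

# U(1) charges of the basis states of V (codomain) and W (domain).
qV = [0, 0, 1, -1]
qW = [0, 1, 1]

# Unitary representations v(θ) and w(θ) of the group element e^{iθ}.
v(θ) = Diagonal(cis.(θ .* qV))
w(θ) = Diagonal(cis.(θ .* qW))

# A symmetric map only connects basis states of equal charge.
t = [qV[i] == qW[j] ? randn() : 0.0 for i in eachindex(qV), j in eachindex(qW)]

θ = 0.7
v(θ) * t ≈ t * w(θ)            # intertwiner property holds

# A generic map violates it: this norm is nonzero.
g = randn(length(qV), length(qW))
norm(v(θ) * g - g * w(θ))
```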

## Symmetric tensors and fusion trees
* By Schur's Lemma, a symmetric tensor map will be block diagonal in the coupled basis of irreps, obtained from fusing the irreps contained in the tensor products $v_1(g) \otimes v_2(g) \otimes \ldots \otimes v_{N_1}(g)$ and $w_1(g) \otimes w_2(g) \otimes \ldots \otimes w_{N_2}(g)$.

* We can use the elementary isometric intertwiners $x_{a,b}^{c;\mu}: R_{c} \to R_{a} \otimes R_{b}$ (fusion/splitting tensors, a.k.a. Clebsch-Gordan coefficients):
    <img src="splittingtensor.svg">
    Here, $\mu = 1,\ldots, N_{a,b}^c$ is a degeneracy label for the number of times $N_{a,b}^c$ that irrep $c$ appears in the fusion product $a \otimes b$.
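
For two spin-$1/2$ irreps of $\mathsf{SU}_2$ the splitting tensors are the familiar singlet and triplet states, and $\mu$ is trivial because $N_{a,b}^c \leq 1$. A sketch that checks the isometry property and the irrep label through the Casimir operator (all names below are introduced for this illustration):

```julia
using LinearAlgebra

# Spin-1/2 operators and the total spin on ℂ² ⊗ ℂ².
sx = [0 1; 1 0] / 2
sy = [0 -im; im 0] / 2
sz = [1 0; 0 -1] / 2
one2 = Matrix{Float64}(I, 2, 2)
total(s) = kron(s, one2) + kron(one2, s)
S2 = sum(total(s)^2 for s in (sx, sy, sz))   # Casimir S·S

# Splitting tensors R_c → R_{1/2} ⊗ R_{1/2} as matrices in the basis
# |↑↑⟩, |↑↓⟩, |↓↑⟩, |↓↓⟩.
x0 = reshape([0, 1, -1, 0] / √2, 4, 1)              # c = 0 (singlet)
x1 = [1 0 0; 0 1/√2 0; 0 1/√2 0; 0 0 1]             # c = 1 (triplet)

x0' * x0 ≈ Matrix{Float64}(I, 1, 1)   # isometries ...
x1' * x1 ≈ Matrix{Float64}(I, 3, 3)
norm(x0' * x1) < 1e-12                # ... that are mutually orthogonal
norm(S2 * x0) < 1e-12                 # j(j+1) = 0 for c = 0
S2 * x1 ≈ 2 * x1                      # j(j+1) = 2 for c = 1
```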

## Symmetric tensors and fusion trees

* For the space $V_1 \otimes V_2 \otimes \overline{V}_3 \otimes \overline{V}_4$, we can build a set of canonical fusion trees $x_{a_1 a_2 \bar{a}_3 \bar{a}_4}^{c,(e_1,e_2),(\mu_1,\mu_2,\mu_3)}$ for fusing to a joint irrep $c$
    <img src="fusiontree.svg">
    * Here, $z_a$ is a canonical mapping $R_a \to \overline{R}_{\bar{a}}$ that can be constructed from the elementary intertwiner $x_{a,\bar{a}}^{1}$. Here, $\bar{a}$ is the dual representation of $a$, the unique irrep that has $N_{a,\bar{a}}^{1} > 0$ (and actually $N_{a,\bar{a}}^{1} = 1$).
    * The fusion trees are mutually orthogonal isometries, i.e.
    $$ (x_{a_1 a_2 \bar{a}_3 \bar{a}_4}^{c,(e_1,e_2),(\mu_1,\mu_2,\mu_3)})^\dagger \circ (x_{a_1 a_2 \bar{a}_3 \bar{a}_4}^{c',(e_1',e_2'),(\mu_1',\mu_2',\mu_3')}) = \delta_{c,c'} \delta_{e_1,e_1'}\delta_{e_2,e_2'}\delta_{\mu_1,\mu_1'}\delta_{\mu_2,\mu_2'}\delta_{\mu_3,\mu_3'} 1_{R_c}$$
    

## Symmetric tensors and fusion trees


* We can now decompose a tensor map $t:W_1 \otimes \overline{W}_2 \otimes W_3 \to V_1 \otimes V_2 \otimes \overline{V}_3 \otimes \overline{V}_4$ as
    $$ t=\sum_c \sum_{a_1,a_2,a_3,a_4,b_1,b_2,b_3,e_1,e_2,f_1,\mu_1,\mu_2,\mu_3,\nu_1,\nu_2} x_{a_1 a_2 \bar{a}_3 \bar{a}_4}^{c,(e_1,e_2),(\mu_1,\mu_2,\mu_3)} \otimes t_{a_1 a_2 \bar{a}_3 \bar{a}_4; b_1 \bar{b}_2 b_3}^{c,(e_1,e_2),(f_1),(\mu_1,\mu_2,\mu_3),(\nu_1,\nu_2)} \otimes \big(x_{b_1 \bar{b}_2 b_3}^{c,(f_1),(\nu_1,\nu_2)}\big)^\dagger $$
    or graphically
    <img src="tensorfused.svg">
     * Here, $t_{a_1 a_2 \bar{a}_3 \bar{a}_4; b_1 \bar{b}_2 b_3}^{c,(e_1,e_2),(f_1),(\mu_1,\mu_2,\mu_3),(\nu_1,\nu_2)}$ is a regular complex array of dimension $(m_1 \times m_2 \times m_3 \times m_4) \times (n_1 \times n_2 \times n_3)$, with $m_i$ the degeneracy of irrep $a_i$ in $V_i$ and $n_j$ the degeneracy of irrep $b_j$ in $W_j$.

## Symmetric tensors and fusion trees


* The advantage of working with symmetric tensors is that we only need to store and manipulate the blocks $t_{a_1 a_2 \bar{a}_3 \bar{a}_4; b_1 \bar{b}_2 b_3}^{c,(e_1,e_2),(f_1),(\mu_1,\mu_2,\mu_3),(\nu_1,\nu_2)}$.

* Furthermore, the various blocks for a fixed label of $c$ can be patched together into a full matrix, which is related to the full matrix representation of the tensor by isometries. For matrix multiplication or singular value decomposition, we don't have to care about those isometries.
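
A sketch of why the block structure is enough for factorizations, using plain matrices in place of TensorKit objects (sector labels and block sizes are made up): the singular values of the block-diagonal matrix are the union of the singular values of the individual blocks.

```julia
using LinearAlgebra

# Degeneracy blocks of a symmetric map, one matrix per coupled sector c.
blocks = Dict(0 => randn(4, 3), 1 => randn(2, 2), -1 => randn(3, 5))

# Factorize each block independently.
block_svds = Dict(c => svd(b) for (c, b) in blocks)
svals_blockwise = sort!(vcat([F.S for F in values(block_svds)]...), rev = true)

# Reference: the same map as one dense block-diagonal matrix.
full = zeros(9, 10)
full[1:4, 1:3]  = blocks[0]
full[5:6, 4:5]  = blocks[1]
full[7:9, 6:10] = blocks[-1]

svals_blockwise ≈ svdvals(full)[1:length(svals_blockwise)]
```

The dense matrix has $9 \times 10$ entries, while the three blocks together hold only $12 + 4 + 15 = 31$ numbers.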


## Symmetric tensors and fusion trees

* To manipulate a tensor (index permutations), we need to be able to manipulate the corresponding fusion trees, in order to bring them back into canonical form.

* For this, we only need the "topological" data of the group:
    <img src="Fmove.svg">
    <img src="Rmove.svg">
    <img src="Bmove.svg">
where the Frobenius-Schur indicator $\chi_b$, the irrep dimension $d_b$ and the $B$-symbol are all encoded in the $F$-symbol ($6j$-symbols in the case of $\mathsf{SU}_2$).

* All of this simplifies drastically for
    * Abelian groups
    * Non-abelian groups with no degeneracies ($N_{ab}^{c} \leq 1$), e.g. $\mathsf{SU}_2$

# Introduction to the Julia Language

  <img src="julia.png" style="width: 200px;"/>


* Selling point: dynamic high-level language with the speed of a statically-compiled language

* Key features:
    * Just-in-time compiled (using LLVM infrastructure)
    * Dynamic type system
    * Multiple dispatch:
        * define function behavior across many combinations of argument types
        * automatic generation of efficient, specialized code for different argument types
    * Good support for computational science: numerics, statistics, multidimensional arrays, ...
    * Homoiconic and powerful metaprogramming facilities
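
A minimal illustration of multiple dispatch. The types `Charge` and `Parity` and the function `fuse` are hypothetical and defined only for this example; they are not TensorKit's sector types.

```julia
# Two hypothetical sector types, introduced only for this illustration.
struct Charge           # U(1)-like label
    q::Int
end
struct Parity           # ℤ₂-like label
    p::Bool
end

# One generic function, one method per combination of argument types.
fuse(a::Charge, b::Charge) = Charge(a.q + b.q)
fuse(a::Parity, b::Parity) = Parity(xor(a.p, b.p))

# Generic code is written once; the compiler specializes it per argument type.
fuseall(xs...) = reduce(fuse, xs)

fuseall(Charge(1), Charge(-1), Charge(2))     # Charge(2)
fuseall(Parity(true), Parity(true))           # Parity(false)
```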

## Code specialization

In [1]:
function myabs(x)
    if x < 0
        return -x
    end
    return x
end

myabs (generic function with 1 method)

In [2]:
@code_llvm myabs(3) # LLVM code for 64-bit integer


define i64 @julia_myabs_62987(i64) #0 !dbg !5 {
top:
  %1 = icmp sgt i64 %0, -1
  br i1 %1, label %L4, label %if

if:                                               ; preds = %top
  %2 = sub i64 0, %0
  ret i64 %2

L4:                                               ; preds = %top
  ret i64 %0
}


## Code specialization

In [3]:
code_llvm(myabs,Tuple{UInt64}) # LLVM code for 64-bit unsigned integer


define i64 @julia_myabs_62993(i64) #0 !dbg !5 {
L4:
  ret i64 %0
}


In [4]:
code_llvm(myabs,Tuple{Float64}) # LLVM code for 64-bit floating point


define double @julia_myabs_62994(double) #0 !dbg !5 {
top:
  %1 = fcmp uge double %0, 0.000000e+00
  br i1 %1, label %L9, label %if

if:                                               ; preds = %top
  %2 = fsub double -0.000000e+00, %0
  ret double %2

L9:                                               ; preds = %top
  ret double %0
}


## Type inference & type stability

In [5]:
mysqrt(x) = x < zero(x) ? sqrt(complex(x)) : sqrt(x)
code_warntype(mysqrt,Tuple{Float64})

Variables:
  #self# <optimized out>
  x::Float64

Body:
  begin 
      unless (Base.lt_float)(x::Float64, (Base.sitofp)(Float64, 0)::Float64)::Bool goto 3
      return $(Expr(:invoke, MethodInstance for sqrt(::Complex{Float64}), :(Main.sqrt), :($(Expr(:new, Complex{Float64}, :(x), :((Base.sitofp)(Float64, 0)::Float64))))))
      3: 
      return (Base.Math.sqrt_llvm)(x::Float64)::Float64
  end::Union{Complex{Float64}, Float64}


## Type inference & type stability

In [6]:
function summyabs(v::Vector)
    s = myabs(v[1])
    for i = 2:length(v)
        s += myabs(v[i])
    end
    return s
end
code_warntype(summyabs, Tuple{Vector{Float64}})

Variables:
  #self# <optimized out>
  v::Array{Float64,1}
  i::Int64
  #temp#@_4::Int64
  s::Float64
  fy::Float64
  #temp#@_7::Float64

Body:
  begin 
      SSAValue(2) = (Base.arrayref)(v::Array{Float64,1}, 1)::Float64
      $(Expr(:inbounds, false))
      # meta: location In[1] myabs 2
      # meta: location float.jl < 491
      fy::Float64 = (Base.sitofp)(Float64, 0)::Float64
      # meta: pop location
      unless (Base.or_int)((Base.lt_float)(SSAValue(2), fy::Float64)::Bool, (Base.and_int)((Base.and_int)((Base.eq_float)(SSAValue(2), fy::Float64)::Bool, (Base.lt_float)(fy::Float64, 9.223372036854776e18)::Bool)::Bool, (Base.slt_int)((Base.fptosi)(Int64, fy::Float64)::Int64, 0)::Bool)::Bool)::Bool goto 11 # line 3:
      #temp#@_7::Float64 = (Base.neg_float)(SSAValue(2))::Float64
      goto 14
      11:  # line 5:
      #temp#@_7::Float64 = SSAValue(2)
      14: 
      # meta: pop location
      $(Expr(:inbounds, :pop))
      s::Float64 = #temp#@_7::Float64 # line 3:
      SSAValue(

In [7]:
x=randn(100)
@time summyabs(x)
x=randn(1000)
@time summyabs(x)
x=randn(10000)
@time summyabs(x)

  0.007333 seconds (192 allocations: 13.248 KiB)
  0.000005 seconds (5 allocations: 176 bytes)
  0.000026 seconds (5 allocations: 176 bytes)


7900.138261367283

## Type inference & type stability

In [8]:
function summysqrt(v::Vector)
    s = mysqrt(v[1])
    for i = 2:length(v)
        s += mysqrt(v[i])
    end
    return s
end
code_warntype(summysqrt,Tuple{Vector{Int64}})

Variables:
  #self# <optimized out>
  v::Array{Int64,1}
  i::Int64
  #temp#@_4::Int64
  s::Union{Complex{Float64}, Float64}
  #temp#@_6::Union{Complex{Float64}, Float64}
  #temp#@_7::Union{Complex{Float64}, Float64}
  #temp#@_8::Core.MethodInstance
  #temp#@_9::Union{Complex{Float64}, Float64}

Body:
  begin 
      SSAValue(2) = (Base.arrayref)(v::Array{Int64,1}, 1)::Int64
      $(Expr(:inbounds, false))
      # meta: location In[5] mysqrt 1
      unless (Base.slt_int)(SSAValue(2), 0)::Bool goto 7
      #temp#@_6::Union{Complex{Float64}, Float64} = $(Expr(:invoke, MethodInstance for sqrt(::Complex{Float64}), :(Base.sqrt), :($(Expr(:new, Complex{Float64}, :((Base.sitofp)(Float64, SSAValue(2))::Float64), :((Base.sitofp)(Float64, 0)::Float64))))))
      goto 9
      7: 
      #temp#@_6::Union{Complex{Float64}, Float64} = (Base.Math.sqrt_llvm)((Base.sitofp)(Float64, SSAValue(2)

In [9]:
x=randn(100)
@time summysqrt(x)
x=randn(1000)
@time summysqrt(x)
x=randn(10000)
@time summysqrt(x)

  0.046556 seconds (3.00 k allocations: 162.022 KiB)
  0.000085 seconds (3.00 k allocations: 86.203 KiB)
  0.000861 seconds (30.00 k allocations: 860.094 KiB)


4060.6137972516444 + 4121.888993375175im
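
One common way to remove such an instability, sketched here under the assumption that a complex result is always acceptable, is to promote the argument before taking the square root. The names `mysqrt_stable` and `sumsqrt` are introduced for this example; `Base.return_types` is used only to inspect what the compiler infers.

```julia
# Type-unstable: the return type depends on the value of x.
mysqrt(x) = x < zero(x) ? sqrt(complex(x)) : sqrt(x)

# Type-stable alternative: always promote to a complex number first.
mysqrt_stable(x) = sqrt(complex(x))

function sumsqrt(f, v::Vector)
    s = f(v[1])
    for i = 2:length(v)
        s += f(v[i])
    end
    return s
end

# Inferred return types for a Float64 argument.
Base.return_types(mysqrt, (Float64,))          # a Union of two types
Base.return_types(mysqrt_stable, (Float64,))   # a single concrete type

x = randn(1000)
sumsqrt(mysqrt, x) ≈ sumsqrt(mysqrt_stable, x)
```

The price is that the stable version returns a complex number even when all inputs are non-negative.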

## Homoiconicity

In [10]:
ex=quote
    function summysqrt(v::Vector)
        s = mysqrt(v[1])
        for i = 2:length(v)
            s += mysqrt(v[i])
        end
        return x
    end
end

quote  # In[10], line 2:
    function summysqrt(v::Vector) # In[10], line 3:
        s = mysqrt(v[1]) # In[10], line 4:
        for i = 2:length(v) # In[10], line 5:
            s += mysqrt(v[i])
        end # In[10], line 7:
        return x
    end
end

In [11]:
typeof(ex)

Expr

## Homoiconicity

In [12]:
Meta.show_sexpr(ex)

(:block,
  (:line, 2, Symbol("In[10]")),
  (:function, (:call, :summysqrt, (:(::), :v, :Vector)), (:block,
      (:line, 3, Symbol("In[10]")),
      (:(=), :s, (:call, :mysqrt, (:ref, :v, 1))),
      (:line, 4, Symbol("In[10]")),
      (:for, (:(=), :i, (:(:), 2, (:call, :length, :v))), (:block,
          (:line, 5, Symbol("In[10]")),
          (:+=, :s, (:call, :mysqrt, (:ref, :v, :i)))
        )),
      (:line, 7, Symbol("In[10]")),
      (:return, :x)
    ))
)

## Metaprogramming

In [13]:
macro twice(ex)
    Expr(:block, esc(ex), esc(ex))
end

@twice (macro with 1 method)

In [14]:
x=3;
@twice x+=1
x

5

In [15]:
macroexpand(:(@twice x+=1))

quote 
    x += 1
    x += 1
end

# TensorKit.jl


In [16]:
using Revise
using TensorKit
using TensorOperations.@optimalcontractiontree



## General tensors (without symmetries)

In [17]:
t = TensorMap(randisometry, ℂ^4 ⊗ ℂ^2, ℂ^3 ⊗ ℂ^2) #or randnormal, randuniform

TensorMap((ℂ^4 ⊗ ℂ^2) ← (ℂ^3 ⊗ ℂ^2)):
[:, :, 1, 1] =
  0.0312438  -0.210017
  0.0601861   0.12207 
 -0.0771344  -0.65382 
 -0.0610272   0.706569

[:, :, 2, 1] =
 -0.225436   0.17414 
 -0.243116  -0.459363
 -0.177601   0.091665
  0.724433   0.289803

[:, :, 3, 1] =
 -0.177359   -0.825418 
 -0.342269    0.28757  
 -0.0450495   0.216373 
  0.190785   -0.0462469

[:, :, 1, 2] =
 -0.581861  0.0401665
  0.542701  0.18495  
 -0.109522  0.436191 
 -0.110039  0.341655 

[:, :, 2, 2] =
  0.518423  -0.0479517
 -0.141276  -0.151231 
  0.277309   0.549337 
 -0.18156    0.523902 

[:, :, 3, 2] =
  0.477003  -0.0236493
  0.228581   0.174189 
 -0.806983   0.135042 
  0.137927  -0.0289096


## General tensors (without symmetries)

* Permuting `TensorMap` indices (between codomain and domain)

In [18]:
permuteind(t, (1,3),(4,2))

TensorMap((ℂ^4 ⊗ (ℂ^3)') ← (ℂ^2 ⊗ (ℂ^2)')):
[:, :, 1, 1] =
  0.0312438  -0.225436  -0.177359 
  0.0601861  -0.243116  -0.342269 
 -0.0771344  -0.177601  -0.0450495
 -0.0610272   0.724433   0.190785 

[:, :, 2, 1] =
 -0.581861   0.518423   0.477003
  0.542701  -0.141276   0.228581
 -0.109522   0.277309  -0.806983
 -0.110039  -0.18156    0.137927

[:, :, 1, 2] =
 -0.210017   0.17414   -0.825418 
  0.12207   -0.459363   0.28757  
 -0.65382    0.091665   0.216373 
  0.706569   0.289803  -0.0462469

[:, :, 2, 2] =
 0.0401665  -0.0479517  -0.0236493
 0.18495    -0.151231    0.174189 
 0.436191    0.549337    0.135042 
 0.341655    0.523902   -0.0289096


## General tensors (without symmetries)

* Tensor Factorization:

In [19]:
U,S,V = svd(t,(1,3),(4,2),truncerr(0.5)) # or qr, ...
S

LoadError: MethodError: no method matching resize!(::Strided.StridedView{Float64,1,Array{Float64,1},Base.#identity}, ::Int64)
Closest candidates are:
  resize!(::Array{T,1} where T, ::Integer) at array.jl:772
  resize!(::BitArray{1}, ::Integer) at bitarray.jl:825

## Symmetries

In [20]:
c = U1Irrep(1)
@show c
@show dual(c)
@show one(c)
@show collect(c ⊗ c)
s = SU2Irrep(1)
@show s
@show dual(s)
@show one(s)
@show collect(s ⊗ s);
n = c × s
@show n
@show dual(n)
@show one(n)
@show collect(n ⊗ n);

c = U₁(1//1)
dual(c) = U₁(-1//1)
one(c) = U₁(0//1)
collect(c ⊗ c) = U₁[2//1]
s = SU₂(1//1)
dual(s) = SU₂(1//1)
one(s) = SU₂(0//1)
collect(s ⊗ s) = SU₂[0//1, 1//1, 2//1]
n = (U₁(1//1) × SU₂(1//1))
dual(n) = (U₁(-1//1) × SU₂(1//1))
one(n) = (U₁(0//1) × SU₂(0//1))
collect(n ⊗ n) = (U₁ × SU₂)[(2//1, 0//1), (2//1, 1//1), (2//1, 2//1)]


### Symmetries and fusion trees

In [21]:
collect(fusiontrees((c,conj(c),c,conj(c),c),c))

1-element Array{TensorKit.FusionTree{U₁,5,3,4,Void},1}:
 FusionTree{U₁}((1//1, -1//1, 1//1, -1//1, 1//1), 1//1, (0//1, 1//1, 0//1))

In [22]:
collect(fusiontrees((s,s,s,s,s),s))

15-element Array{TensorKit.FusionTree{SU₂,5,3,4,Void},1}:
 FusionTree{SU₂}((1//1, 1//1, 1//1, 1//1, 1//1), 1//1, (0//1, 1//1, 0//1))
 FusionTree{SU₂}((1//1, 1//1, 1//1, 1//1, 1//1), 1//1, (0//1, 1//1, 1//1))
 FusionTree{SU₂}((1//1, 1//1, 1//1, 1//1, 1//1), 1//1, (0//1, 1//1, 2//1))
 FusionTree{SU₂}((1//1, 1//1, 1//1, 1//1, 1//1), 1//1, (1//1, 0//1, 1//1))
 FusionTree{SU₂}((1//1, 1//1, 1//1, 1//1, 1//1), 1//1, (1//1, 1//1, 0//1))
 FusionTree{SU₂}((1//1, 1//1, 1//1, 1//1, 1//1), 1//1, (1//1, 1//1, 1//1))
 FusionTree{SU₂}((1//1, 1//1, 1//1, 1//1, 1//1), 1//1, (1//1, 1//1, 2//1))
 FusionTree{SU₂}((1//1, 1//1, 1//1, 1//1, 1//1), 1//1, (1//1, 2//1, 1//1))
 FusionTree{SU₂}((1//1, 1//1, 1//1, 1//1, 1//1), 1//1, (1//1, 2//1, 2//1))
 FusionTree{SU₂}((1//1, 1//1, 1//1, 1//1, 1//1), 1//1, (2//1, 1//1, 0//1))
 FusionTree{SU₂}((1//1, 1//1, 1//1, 1//1, 1//1), 1//1, (2//1, 1//1, 1//1))
 FusionTree{SU₂}((1//1, 1//1, 1//1, 1//1, 1//1), 1//1, (2//1, 1//1, 2//1))
 FusionTree{SU₂}((1//1, 1//1, 1//1, 1//1, 
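
The number of trees reported above can be cross-checked without TensorKit. The sketch below assumes multiplicity-free $\mathsf{SU}_2$ fusion rules and stores every spin as the integer $2j$ to avoid rational arithmetic; `fuse` and `count_fusiontrees` are names made up for this example.

```julia
# SU(2) fusion rules with spins stored as twice the spin (2j):
# a ⊗ b = |a-b| ⊕ |a-b|+1 ⊕ … ⊕ (a+b) becomes a step-2 range in 2j.
fuse(a, b) = abs(a - b):2:(a + b)

# Count fusion trees that couple `uncoupled` spins (left to right) to `coupled`.
function count_fusiontrees(uncoupled, coupled)
    counts = Dict(uncoupled[1] => 1)      # intermediate spin => number of trees
    for a in uncoupled[2:end]
        next = Dict{Int,Int}()
        for (e, n) in counts, c in fuse(e, a)
            next[c] = get(next, c, 0) + n
        end
        counts = next
    end
    return get(counts, coupled, 0)
end

s = 2                                     # spin 1, stored as 2j = 2
count_fusiontrees((s, s, s, s, s), s)     # 15
```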

### Symmetries and fusion trees

In [23]:
collect(fusiontrees((n,dual(n),n,dual(n),n),n))

15-element Array{TensorKit.FusionTree{(U₁ × SU₂),5,3,4,Void},1}:
 FusionTree{(U₁ × SU₂)}(((1//1, 1//1), (-1//1, 1//1), (1//1, 1//1), (-1//1, 1//1), (1//1, 1//1)), (1//1, 1//1), ((0//1, 0//1), (1//1, 1//1), (0//1, 0//1)))
 FusionTree{(U₁ × SU₂)}(((1//1, 1//1), (-1//1, 1//1), (1//1, 1//1), (-1//1, 1//1), (1//1, 1//1)), (1//1, 1//1), ((0//1, 0//1), (1//1, 1//1), (0//1, 1//1)))
 FusionTree{(U₁ × SU₂)}(((1//1, 1//1), (-1//1, 1//1), (1//1, 1//1), (-1//1, 1//1), (1//1, 1//1)), (1//1, 1//1), ((0//1, 0//1), (1//1, 1//1), (0//1, 2//1)))
 FusionTree{(U₁ × SU₂)}(((1//1, 1//1), (-1//1, 1//1), (1//1, 1//1), (-1//1, 1//1), (1//1, 1//1)), (1//1, 1//1), ((0//1, 1//1), (1//1, 0//1), (0//1, 1//1)))
 FusionTree{(U₁ × SU₂)}(((1//1, 1//1), (-1//1, 1//1), (1//1, 1//1), (-1//1, 1//1), (1//1, 1//1)), (1//1, 1//1), ((0//1, 1//1), (1//1, 1//1), (0//1, 0//1)))
 FusionTree{(U₁ × SU₂)}(((1//1, 1//1), (-1//1, 1//1), (1//1, 1//1), (-1//1, 1//1), (1//1, 1//1)), (1//1, 1//1), ((0//1, 1//1), (1//1, 1//1), (0//1, 1//1)))

## Symmetries and representation spaces

In [24]:
V1 = Z2Space(3,4)
V2 = U1Space(0=>3,1=>2,-1=>2)
V3 = SU2Space(0=>1,1/2=>2,1=>1)
V4 = RepresentationSpace{ℤ₂×ℤ₂}((0,0)=>3,(1,0)=>2,(0,1)=>2,(1,1)=>1)
@show V1
@show V2
@show V3
@show V4;
W1 = V1 ⊗ V1 ⊗ V1'
c = ℤ₂(0)
@show dim(V1)
@show dims(W1,(c,c,c))
@show dim(W1,(c,c,c))
W2 = V2 ⊗ V2 ⊗ V2
@show collect(TensorKit.blocksectors(W2));

V1 = RepresentationSpace{ℤ₂}(0=>3, 1=>4)
V2 = RepresentationSpace{U₁}(0//1=>3, 1//1=>2, -1//1=>2)
V3 = RepresentationSpace{SU₂}(0//1=>1, 1//2=>2, 1//1=>1)
V4 = RepresentationSpace{(ℤ₂ × ℤ₂)}((0, 0)=>3, (1, 0)=>2, (0, 1)=>2, (1, 1)=>1)
dim(V1) = 7
dims(W1, (c, c, c)) = (3, 3, 3)
dim(W1, (c, c, c)) = 27
collect(TensorKit.blocksectors(W2)) = U₁[0//1, 1//1, -1//1, 2//1, -2//1, 3//1, -3//1]


### Symmetric tensors

In [25]:
V1 = U1Space(0=>3,1=>2,-1=>2)
V2 = U1Space(0=>4,1=>3,-1=>3,2=>2,-2=>2)
t = Tensor(randn, V1 ⊗ V2 ⊗ V2' ⊗ V1')

TensorMap((RepresentationSpace{U₁}(0//1=>3, 1//1=>2, -1//1=>2) ⊗ RepresentationSpace{U₁}(0//1=>4, 1//1=>3, -1//1=>3, 2//1=>2, -2//1=>2) ⊗ RepresentationSpace{U₁}(0//1=>4, 1//1=>3, -1//1=>3, 2//1=>2, -2//1=>2)' ⊗ RepresentationSpace{U₁}(0//1=>3, 1//1=>2, -1//1=>2)') ← ProductSpace{U₁Space,0}()):
* Data for sector (U₁(0//1), U₁(0//1), U₁(0//1), U₁(0//1)) ← ():
[:, :, 1, 1] =
  0.48829     1.20849   -0.373954   0.655721 
 -0.0328748   0.144397  -2.29702    2.35152  
 -0.0572206  -1.19095   -0.383219  -0.0455046

[:, :, 2, 1] =
 -1.28738   -1.21872    0.284383    0.771736
 -2.20188    0.069144   0.0238391  -0.502429
  0.982306   1.02042   -2.0276      0.316226

[:, :, 3, 1] =
  0.695419   0.963833   0.275837  -2.00103  
 -0.783211  -0.735714  -0.12261   -0.0890856
  1.93299   -1.2437     1.30646    1.7517   

[:, :, 4, 1] =
 -0.985491   1.73759   0.337716  -1.16531 
 -1.1265    -0.572328  0.634778   0.967514
  0.554456  -0.654078  1.18396   -0.460801

[:, :, 1, 2] =
 -0.986715   0.143518  

### Symmetric tensors

In [26]:
U,S,V=svd(t,(1,3),(2,4));
@show dim(domain(S))
S

dim(domain(S)) = 98


TensorMap(ProductSpace(RepresentationSpace{U₁}(0//1=>24, 1//1=>21, -1//1=>21, -2//1=>12, 2//1=>12, -3//1=>4, 3//1=>4)) ← ProductSpace(RepresentationSpace{U₁}(0//1=>24, 1//1=>21, -1//1=>21, -2//1=>12, 2//1=>12, -3//1=>4, 3//1=>4))):
* Data for sector (U₁(0//1),) ← (U₁(0//1),):
 9.40942  0.0      0.0      0.0      0.0      …  0.0      0.0       0.0     
 0.0      8.72463  0.0      0.0      0.0         0.0      0.0       0.0     
 0.0      0.0      7.81949  0.0      0.0         0.0      0.0       0.0     
 0.0      0.0      0.0      7.49838  0.0         0.0      0.0       0.0     
 0.0      0.0      0.0      0.0      7.13776     0.0      0.0       0.0     
 0.0      0.0      0.0      0.0      0.0      …  0.0      0.0       0.0     
 0.0      0.0      0.0      0.0      0.0         0.0      0.0       0.0     
 0.0      0.0      0.0      0.0      0.0         0.0      0.0       0.0     
 0.0      0.0      0.0      0.0      0.0         0.0      0.0       0.0     
 0.0      0.0      0.0      0.

### Symmetric tensors

In [27]:
U,S,V,ϵ=svd(t,(1,3),(2,4),truncdim(97));
ϵ

LoadError: MethodError: no method matching resize!(::Strided.StridedView{Float64,1,Array{Float64,1},Base.#identity}, ::Int64)
Closest candidates are:
  resize!(::Array{T,1} where T, ::Integer) at array.jl:772
  resize!(::BitArray{1}, ::Integer) at bitarray.jl:825

## Tensor Contractions

In [28]:
d, D = 2, 32
Vphys = ℂ^d
Vvirt = ℂ^D
A = Tensor(randnormal, Vvirt ⊗ Vphys ⊗ Vvirt')
h = TensorMap(randn, Vphys ⊗ Vphys, Vphys ⊗ Vphys)

@tensor hAA[α,t1,t2,α'] := h[t1,t2,s1,s2]*A[α,s1,β]*A[β,s2,α']

@tensor hAA[-1,-2,-3,-4] = h[-2,-3,2,3]*A[-1,2,1]*A[1,3,-4];
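
For dense arrays, the contraction above amounts to a sequence of reshapes, index permutations and matrix products. The following sketch spells this out with plain arrays and small dimensions; it illustrates the idea only and is not a description of how `@tensor` is implemented.

```julia
# Small dimensions so that the explicit reference sum stays cheap.
d, D = 2, 4
A = randn(D, d, D)
h = randn(d, d, d, d)

# Step 1: AA[α,s1,s2,α'] = Σ_β A[α,s1,β] * A[β,s2,α'] as a matrix product.
AA = reshape(reshape(A, D * d, D) * reshape(A, D, d * D), D, d, d, D)

# Step 2: contract s1, s2 with h[t1,t2,s1,s2], again as a matrix product.
AAmat = reshape(permutedims(AA, (2, 3, 1, 4)), d * d, D * D)
hAAmat = reshape(h, d * d, d * d) * AAmat
hAA = permutedims(reshape(hAAmat, d, d, D, D), (3, 1, 2, 4))   # [α,t1,t2,α']

# Reference: the defining sum over s1, s2 and β.
hAA_ref = [sum(h[t1, t2, s1, s2] * A[α, s1, β] * A[β, s2, α′]
               for s1 in 1:d, s2 in 1:d, β in 1:D)
           for α in 1:D, t1 in 1:d, t2 in 1:d, α′ in 1:D]
hAA ≈ hAA_ref
```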

## Optimization of tensor contraction order

Contraction order matters!

* matrix - matrix - vector multiplication: `A*B*v`: 
  `A*(B*v)` is much faster than `(A*B)*v`

* Optimal contraction order in more complicated tensor networks?
  <img src="mera.png" style="width: 200px;"/>
  
* Pairwise contraction is always sufficient, but in which sequence?
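
The matrix - matrix - vector case can be quantified with a simple multiplication count. The cost functions below are a rough model written for this example (they ignore additions and memory traffic):

```julia
n = 200
A, B, v = randn(n, n), randn(n, n), randn(n)

# Multiplication counts for the two pairwise contraction sequences.
cost_right(n) = 2 * n^2       # A*(B*v): two matrix-vector products
cost_left(n)  = n^3 + n^2     # (A*B)*v: one matrix-matrix, one matrix-vector product

cost_left(n) / cost_right(n)  # (n + 1)/2, i.e. about 100 for n = 200

# Both sequences give the same result up to rounding.
A * (B * v) ≈ (A * B) * v
```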

## Optimization of tensor contraction order


* Manual determination can become a laborious task
* Contraction of two-dimensional multiscale entanglement renormalization ansatz:
  <img src="2dmerac.png" style="width: 1200px;"/>

### Algorithmic determination of optimal contraction sequence

"Faster identification of optimal contraction sequences for tensor networks"

Robert N. C. Pfeifer, JH, and Frank Verstraete, Phys. Rev. E 90, 033315 (2014)

* Breadth-first constructive approach:
  <img src="algorithm.png" style="width: 800px;"/>
* Add tricks to make it efficient   

## Optimization of tensor contraction order

  <img src="mera2.png" style="width: 1200px;"/>
  

## Optimization of tensor contraction order



In [29]:
ex=:(result[-4,-5,-6,-1,-2,-3] := 
        W1[1,2,-1]*W2[3,4,-2]*W3[5,6,-3]*
        U1[7,8,2,3]*U2[9,10,4,5]*
        h[11,12,13,7,8,9]*
        conj(U1)[11,12,14,15]*conj(U2)[13,10,16,17]*
        conj(W1)[1,14,-4]*conj(W2)[15,16,-5]*conj(W3)[17,6,-6])
Meta.show_sexpr(ex)


(:(:=), (:ref, :result, -4, -5, -6, -1, -2, -3), (:call, :*, (:ref, :W1, 1, 2, -1), (:ref, :W2, 3, 4, -2), (:ref, :W3, 5, 6, -3), (:ref, :U1, 7, 8, 2, 3), (:ref, :U2, 9, 10, 4, 5), (:ref, :h, 11, 12, 13, 7, 8, 9), (:ref, (:call, :conj, :U1), 11, 12, 14, 15), (:ref, (:call, :conj, :U2), 13, 10, 16, 17), (:ref, (:call, :conj, :W1), 1, 14, -4), (:ref, (:call, :conj, :W2), 15, 16, -5), (:ref, (:call, :conj, :W3), 17, 6, -6)))
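
Because the contraction is available as an `Expr`, its index structure can be read off by walking the expression tree. The sketch below uses a hypothetical helper `tensorrefs!` on a smaller expression; it is not the parser used by TensorOperations.

```julia
# Collect every indexed object `name[labels...]` in a contraction expression.
function tensorrefs!(refs, ex)
    if ex isa Expr
        if ex.head == :ref
            push!(refs, (ex.args[1], Int[ex.args[2:end]...]))
        else
            foreach(a -> tensorrefs!(refs, a), ex.args)
        end
    end
    return refs
end

ex = :(C[-1, -2] := A[-1, 1] * conj(B)[1, -2])
refs = tensorrefs!(Any[], ex)

# Labels that occur twice on the right-hand side are contracted; the rest are open.
rhs = vcat([labels for (_, labels) in refs[2:end]]...)
contracted = unique(filter(l -> count(isequal(l), rhs) == 2, rhs))
openlabels = filter(l -> count(isequal(l), rhs) == 1, rhs)
```

With the positive/negative label convention of the slides, `contracted` collects the positive labels and `openlabels` the negative ones.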

In [30]:
@optimalcontractiontree W1[1,2,-1]*W2[3,4,-2]*W3[5,6,-3]*
        U1[7,8,2,3]*U2[9,10,4,5]*
        h[11,12,13,7,8,9]*
        conj(U1)[11,12,14,15]*conj(U2)[13,10,16,17]*
        conj(W1)[1,14,-4]*conj(W2)[15,16,-5]*conj(W3)[17,6,-6]


(((9, 1), ((3, 11), ((2, 5), ((8, 10), (4, (6, 7)))))), 2*χ^9 + 4*χ^8 + 0*χ^7 + 2*χ^6 + 2*χ^5 + 0*χ^4 + 0*χ^3 + 0*χ^2 + 0*χ + 0)

In [31]:
tic()
@optimalcontractiontree W1[1,2,-1]*W2[3,4,-2]*W3[5,6,-3]*
        U1[7,8,2,3]*U2[9,10,4,5]*
        h[11,12,13,7,8,9]*
        conj(U1)[11,12,14,15]*conj(U2)[13,10,16,17]*
        conj(W1)[1,14,-4]*conj(W2)[15,16,-5]*conj(W3)[17,6,-6]
toc()


elapsed time: 0.021642211 seconds


0.021642211

In [32]:
tic()
@optimalcontractiontree W1[1,2,-1]*W2[3,4,-2]*W3[5,6,-3]*
        U1[7,8,2,3]*U2[9,10,4,5]*
        h[11,12,13,7,8,9]*
        conj(U1)[11,12,14,15]*conj(U2)[13,10,16,17]*
        conj(W1)[1,14,-4]*conj(W2)[15,16,-5]*conj(W3)[17,6,-6]
toc()



elapsed time: 0.002390268 seconds


0.002390268

## Optimization of tensor contraction order

In [33]:
V=ℂ^5;
W1 = W2 = W3 = TensorMap(randisometry,V ⊗ V, V);
U1 = U2 = TensorMap(randisometry, V ⊗ V, V ⊗ V);
h = TensorMap(randn, V ⊗ V ⊗ V, V ⊗ V ⊗ V);

In [34]:
@time @tensor result[-4,-5,-6,-1,-2,-3] := 
        W1[1,2,-1]*W2[3,4,-2]*W3[5,6,-3]*
        U1[7,8,2,3]*U2[9,10,4,5]*
        h[11,12,13,7,8,9]*
        conj(U1[11,12,14,15])*conj(U2[13,10,16,17])*
        conj(W1[1,14,-4])*conj(W2[15,16,-5])*conj(W3[17,6,-6])
@time @tensoropt result[-4,-5,-6,-1,-2,-3] := 
        W1[1,2,-1]*W2[3,4,-2]*W3[5,6,-3]*
        U1[7,8,2,3]*U2[9,10,4,5]*
        h[11,12,13,7,8,9]*
        conj(U1[11,12,14,15])*conj(U2[13,10,16,17])*
        conj(W1[1,14,-4])*conj(W2[15,16,-5])*conj(W3[17,6,-6]);

 18.231829 seconds (10.02 M allocations: 2.083 GiB, 5.22% gc time)
  1.110044 seconds (671.56 k allocations: 41.934 MiB, 9.34% gc time)


In [35]:
@time @tensor result[-4,-5,-6,-1,-2,-3] := 
        W1[1,2,-1]*W2[3,4,-2]*W3[5,6,-3]*
        U1[7,8,2,3]*U2[9,10,4,5]*
        h[11,12,13,7,8,9]*
        conj(U1[11,12,14,15])*conj(U2[13,10,16,17])*
        conj(W1[1,14,-4])*conj(W2[15,16,-5])*conj(W3[17,6,-6])
@time @tensoropt result[-4,-5,-6,-1,-2,-3] := 
        W1[1,2,-1]*W2[3,4,-2]*W3[5,6,-3]*
        U1[7,8,2,3]*U2[9,10,4,5]*
        h[11,12,13,7,8,9]*
        conj(U1[11,12,14,15])*conj(U2[13,10,16,17])*
        conj(W1[1,14,-4])*conj(W2[15,16,-5])*conj(W3[17,6,-6]);

  0.012159 seconds (372 allocations: 2.665 MiB, 82.11% gc time)


# Some benchmarks

* TensorKit.jl vs ITensor v2.1.0 and Uni10 1.0
* All linked to same version of OpenBLAS

* Disclaimers:
    * Very preliminary and limited benchmark (only one specific contraction, no factorization)
    * I am not a C++ programmer 

# Some benchmarks

In [None]:
using Cxx
using ProfileView
using PyPlot

In [None]:
const itensorpath = "/Users/jutho/Dropbox/Code/Libraries/itensor/"
Libdl.dlopen(itensorpath * "lib/libitensor.dylib", Libdl.RTLD_GLOBAL)
addHeaderDir(itensorpath, kind=C_System)
addHeaderDir("/usr/local/opt/openblas/include", kind=C_System)
cxxinclude("itensor/all.h")

In [None]:
const uni10path = "/Users/jutho/Dropbox/Code/Libraries/uni10/"
Libdl.dlopen(uni10path * "lib/libuni10.dylib", Libdl.RTLD_GLOBAL)
addHeaderDir(uni10path * "include", kind=C_System)
cxxinclude("uni10.hpp")

## MPS - MPO Environment without symmetries

In [None]:
function time_mpo_nosym_tk(dphys::Int, dvirt::Int, n::Int)
    Vphys = ℂ^dphys
    Vvirt = ℂ^dvirt

    A = Tensor(randn, Vvirt ⊗ Vphys ⊗ Vvirt')
    FL = Tensor(randn, Vvirt ⊗ Vphys' ⊗ Vvirt')
    FR = Tensor(randn, Vvirt ⊗ Vphys ⊗ Vvirt')
    M = Tensor(randn, Vphys ⊗ Vphys ⊗ Vphys' ⊗ Vphys')

    e = 0.;
    times = zeros(Float64, n)
    for i = 1:n
        elapsed = Base.time_ns()
        @tensor e += (((FL[alpha',t,alpha]*A[alpha,s,beta])*FR[beta,t',beta'])*conj(A[alpha',s',beta']))*M[t,s',t',s]
        elapsed = Base.time_ns() - elapsed
        times[i] = elapsed/1e9
    end
    return times
end
function time_mpo_nosym_itensor(dphys::Int, dvirt::Int, n::Int)
    itimes = Vector{UInt}(n)
    icxx"""
    using namespace itensor;

    auto alpha = Index("alpha", $dvirt);
    auto beta = Index("beta", $dvirt);
    auto s = Index("s", $dphys);
    auto t = Index("t", $dphys);
    
    auto A = randomTensor(alpha, s, beta);
    auto FL = randomTensor(prime(alpha), t, alpha);
    auto FR = randomTensor(beta, prime(t), prime(beta));
    auto M = randomTensor(t, prime(s), prime(t), s);

    unsigned long* itimes = $(pointer(itimes)::Ptr{UInt});
    int i = 0;
    double e = 0.;
    while (i < $n)
    {
        unsigned long elapsed = $:(Base.time_ns()::UInt64);
        ITensor v = (((FL * A) * FR)*prime(conj(A))) * M;
        e += v.real();
        elapsed = $:(Base.time_ns()::UInt64) - elapsed;
        itimes[i] = elapsed;
        i += 1;
    }
    """
    times = itimes/1e9
end
function time_mpo_nosym_uni10(dphys::Int, dvirt::Int, n::Int)
    itimes = Vector{UInt}(n)
    icxx"""
    using namespace uni10;
    Bond Pout(BD_IN, $dphys);
    Bond Pin(BD_OUT, $dphys);
    Bond Vout(BD_IN, $dvirt);
    Bond Vin(BD_OUT, $dvirt);
    
    std::vector<Bond> Aind;
    Aind.push_back(Vout);
    Aind.push_back(Pout);
    Aind.push_back(Vin);
    UniTensor A(Aind, "A");
    A.randomize();

    std::vector<Bond> FLind;
    FLind.push_back(Vout);
    FLind.push_back(Pin);
    FLind.push_back(Vin);
    UniTensor FL(FLind, "FL");
    FL.randomize();

    std::vector<Bond> FRind;
    FRind.push_back(Vout);
    FRind.push_back(Pout);
    FRind.push_back(Vin);
    UniTensor FR(FRind, "FR");
    FR.randomize();
    
    std::vector<Bond> Mind;
    Mind.push_back(Pout);
    Mind.push_back(Pout);
    Mind.push_back(Pin);
    Mind.push_back(Pin);
    UniTensor M(Mind, "M");
    M.randomize();
    
    std::vector<Bond> A2ind;
    A2ind.push_back(Vout);
    A2ind.push_back(Pin);
    A2ind.push_back(Vin);
    UniTensor A2(A2ind, "cA");
    A2.randomize();

    int labelA[] = {1,5,2};
    int labelFL[] = {6,3,1};
    int labelFR[] = {2,4,8};
    int labelM[] = {3,7,4,5};
    int labelA2[] = {8,7,6};

    A.setLabel(labelA);
    FL.setLabel(labelFL);
    FR.setLabel(labelFR);
    M.setLabel(labelM);
    A2.setLabel(labelA2);

    unsigned long* itimes = $(pointer(itimes)::Ptr{UInt});
    int i = 0;
    double e = 0.;
    while (i < $n)
    {
        unsigned long elapsed = $:(Base.time_ns()::UInt64);
        auto v = (((FL*A)*FR)*A2)*M;
        e += v.getBlock()[0];
        elapsed = $:(Base.time_ns()::UInt64) - elapsed;
        itimes[i] = elapsed;
        i += 1;
    }
    """
    return itimes/1e9  # nanoseconds to seconds
end

## MPS - MPO Environment without symmetries

In [None]:
# compile
d = 2; D = 32;
n = 1;
times = (time_mpo_nosym_tk(d,D,n), time_mpo_nosym_itensor(d,D,n), time_mpo_nosym_uni10(d,D,n))

In [None]:
# small
d = 2; D = 32;
n = 10000;
times = (time_mpo_nosym_tk(d,D,n), time_mpo_nosym_itensor(d,D,n), time_mpo_nosym_uni10(d,D,n))
plot(sort(times[1]));plot(sort(times[2]));plot(sort(times[3]));ylim([0.,10*minimum(times[1])])
legend(["TensorKit.jl","ITensor v2.1.0","Uni10 v1.0"])

## MPS - MPO Environment without symmetries

In [None]:
# medium
d = 3; D = 128; n = 1000;
times = (time_mpo_nosym_tk(d,D,n), time_mpo_nosym_itensor(d,D,n), time_mpo_nosym_uni10(d,D,n))
plot(sort(times[1]));plot(sort(times[2]));plot(sort(times[3]));ylim([0.,10*minimum(times[1])])
legend(["TensorKit.jl","ITensor v2.1.0","Uni10 v1.0"])

## MPS - MPO Environment without symmetries

In [None]:
# large
d = 4; D = 512; n = 50;
times = (time_mpo_nosym_tk(d,D,n), time_mpo_nosym_itensor(d,D,n), time_mpo_nosym_uni10(d,D,n))
plot(sort(times[1]));plot(sort(times[2]));plot(sort(times[3]));ylim([0.,10*minimum(times[1])])
legend(["TensorKit.jl","ITensor v2.1.0","Uni10 v1.0"])
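
The sorted timing curves above can also be condensed into a couple of robust numbers. The following is a minimal sketch in plain Julia, not code from the benchmark itself: the helpers `time_runs` and `summarize_times` are hypothetical names, and a dense matrix product stands in for the tensor contraction. It performs one untimed warm-up call (the role played by the `# compile` cell) and then reports the minimum and median of the recorded times.

```julia
# Hypothetical helper (not part of the notebook): time n calls of f in seconds,
# after one untimed warm-up call that absorbs JIT compilation.
function time_runs(f, n::Int)
    f()                                    # warm-up, not recorded
    times = zeros(Float64, n)
    for i = 1:n
        t0 = time_ns()
        f()
        times[i] = (time_ns() - t0) / 1e9  # nanoseconds -> seconds
    end
    return times
end

# Minimum and median of a timing vector; the median is read off the sorted
# samples so that no extra package is needed.
function summarize_times(times::Vector{Float64})
    s = sort(times)
    n = length(s)
    med = isodd(n) ? s[(n + 1) ÷ 2] : (s[n ÷ 2] + s[n ÷ 2 + 1]) / 2
    return s[1], med
end

# Stand-in workload: a dense matrix product instead of a tensor contraction.
A = randn(64, 64)
tmin, tmed = summarize_times(time_runs(() -> A * A, 200))
println("min = ", tmin, " s, median = ", tmed, " s")
```

The minimum is the least noisy estimate of the achievable run time, while the median indicates what a typical call costs once garbage collection and other interruptions are included.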

## MPS - MPO Environment with U1 symmetry

In [None]:
function time_mpo_sym_tk(Vphys::U1Space, Vvirt::U1Space, n::Int)
    A = Tensor(randn, Vvirt ⊗ Vphys ⊗ Vvirt')
    FL = Tensor(randn, Vvirt ⊗ Vphys' ⊗ Vvirt')
    FR = Tensor(randn, Vvirt ⊗ Vphys ⊗ Vvirt')
    M = Tensor(randn, Vphys ⊗ Vphys ⊗ Vphys' ⊗ Vphys')


    times = zeros(Float64, n)
    for i = 1:n
        elapsed = Base.time_ns()
        @tensor Aenv[alpha' beta'; s'] := ((FL[alpha',t,alpha]*A[alpha,s,beta])*M[t,s',t',s])*FR[beta, t', beta']
        elapsed = Base.time_ns() - elapsed
        times[i] = elapsed/1e9
    end
    return times
end
function time_mpo_sym_itensor(Vphys::U1Space, Vvirt::U1Space, n::Int)
    itimes = Vector{UInt}(n)
    Qphys = Vector{Int}()
    dphys = Vector{Int}()
    for q in sectors(Vphys)
        push!(Qphys, q.charge)
        push!(dphys, dim(Vphys, q))
    end
    nphys = length(Qphys)
    Qvirt = Vector{Int}()
    dvirt = Vector{Int}()
    for q in sectors(Vvirt)
        push!(Qvirt, q.charge)
        push!(dvirt, dim(Vvirt, q))
    end
    nvirt = length(Qvirt)
    icxx"""
    using namespace itensor;
    auto QNphys = stdx::reserve_vector<IndexQN>($nphys);
    auto QNvirt = stdx::reserve_vector<IndexQN>($nvirt);

    long *Qphys = $(pointer(Qphys)::Ptr{Int});
    long *dphys = $(pointer(dphys)::Ptr{Int});
    long *Qvirt = $(pointer(Qvirt)::Ptr{Int});
    long *dvirt = $(pointer(dvirt)::Ptr{Int});

    int i = 0;
    while (i < $nphys) {
        QNphys.emplace_back(Index("I"+std::to_string(Qphys[i]),dphys[i]), QN(Qphys[i]));
        i++;
    }
    i = 0;
    while (i < $nvirt) {
        QNvirt.emplace_back(Index("I"+std::to_string(Qvirt[i]),dvirt[i]), QN(Qvirt[i]));
        i++;
    }
    auto QNvirt2 = std::vector<IndexQN>(QNvirt);
    auto QNphys2 = std::vector<IndexQN>(QNphys);

    auto alpha = IQIndex("alpha", std::move(QNvirt));
    auto beta = IQIndex("beta", std::move(QNvirt2));
    auto s = IQIndex("s", std::move(QNphys));
    auto t = IQIndex("t", std::move(QNphys2));

    auto A = randomTensor(QN(0), alpha, s, dag(beta));
    auto FL = randomTensor(QN(0), prime(alpha), dag(t), dag(alpha));
    auto FR = randomTensor(QN(0), beta, prime(t), dag(prime(beta)));
    auto M = randomTensor(QN(0), t, prime(s), dag(prime(t)), dag(s));
    
    unsigned long* itimes = $(pointer(itimes)::Ptr{UInt});
    i = 0;
    while (i < $n)
    {
        unsigned long elapsed = $:(Base.time_ns()::UInt64);
        auto FLA = ((FL * A)*M)*FR;
        elapsed = $:(Base.time_ns()::UInt64) - elapsed;
        itimes[i] = elapsed;
        i += 1;
    }
    """
    return itimes/1e9  # nanoseconds to seconds
end
function time_mpo_sym_uni10(Vphys::U1Space, Vvirt::U1Space, n::Int)
    itimes = Vector{UInt}(n)
    Qphys = Vector{Int}()
    for q in sectors(Vphys)
        for k = 1:dim(Vphys, q)
            push!(Qphys, q.charge)
        end
    end
    nphys = length(Qphys)
    Qvirt = Vector{Int}()
    for q in sectors(Vvirt)
        for k = 1:dim(Vvirt, q)
            push!(Qvirt, q.charge)
        end
    end
    nvirt = length(Qvirt)
    icxx"""
    using namespace uni10;
    std::vector<Qnum> Qphys;
    std::vector<Qnum> Qvirt;
    long *intphys = $(pointer(Qphys)::Ptr{Int});
    long *intvirt = $(pointer(Qvirt)::Ptr{Int});
    int i = 0;
    while (i < $nphys) {
        Qphys.push_back(Qnum(intphys[i]));
        i++;
    }
    i = 0;
    while (i < $nvirt) {
        Qvirt.push_back(Qnum(intvirt[i]));
        i++;
    }
    Bond Pout(BD_IN, Qphys);
    Bond Pin(BD_OUT, Qphys);
    Bond Vout(BD_IN, Qvirt);
    Bond Vin(BD_OUT, Qvirt);
        
    std::vector<Bond> Aind;
    Aind.push_back(Vout);
    Aind.push_back(Pout);
    Aind.push_back(Vin);

    std::vector<Bond> FLind;
    FLind.push_back(Vout);
    FLind.push_back(Pin);
    FLind.push_back(Vin);

    std::vector<Bond> FRind;
    FRind.push_back(Vout);
    FRind.push_back(Pout);
    FRind.push_back(Vin);
    
    std::vector<Bond> Mind;
    Mind.push_back(Pout);
    Mind.push_back(Pout);
    Mind.push_back(Pin);
    Mind.push_back(Pin);

    
    UniTensor A(Aind, "A");
    A.randomize();

    UniTensor FL(FLind, "FL");
    FL.randomize();

    UniTensor FR(FRind, "FR");
    FR.randomize();

    UniTensor M(Mind, "M");
    M.randomize();


    
    // Note: A2 (and its label below) is not used in the contraction timed in this function.
    std::vector<Bond> A2ind;
    A2ind.push_back(Vout);
    A2ind.push_back(Pin);
    A2ind.push_back(Vin);
    UniTensor A2(A2ind, "cA");
    A2.randomize();

    int labelA[] = {1,5,2};
    int labelFL[] = {6,3,1};
    int labelFR[] = {2,4,8};
    int labelM[] = {3,7,4,5};
    int labelA2[] = {8,7,6};

    A.setLabel(labelA);
    FL.setLabel(labelFL);
    FR.setLabel(labelFR);
    M.setLabel(labelM);
    A2.setLabel(labelA2);

    unsigned long* itimes = $(pointer(itimes)::Ptr{UInt});
    i = 0;
    double e = 0.;
    while (i < $n)
    {
        unsigned long elapsed = $:(Base.time_ns()::UInt64);
        auto FLA = ((FL * A)*M)*FR;
        elapsed = $:(Base.time_ns()::UInt64) - elapsed;
        itimes[i] = elapsed;
        i += 1;
    }
    """
    return itimes/1e9  # nanoseconds to seconds
end

In [None]:
# compile
Vphys = U1Space(0=>2,-1=>1,+1=>1)
Vvirt = U1Space(0=>10,-1=>8,+1=>8,-2=>6,+2=>6,-3=>4,+3=>4,-4=>2,+4=>2)
n = 100
times = (time_mpo_sym_tk(Vphys,Vvirt,n),time_mpo_sym_itensor(Vphys,Vvirt,n),time_mpo_sym_uni10(Vphys,Vvirt,n));

## MPS - MPO Environment with U1 symmetry

In [None]:
# small
Vphys = U1Space(0=>2,-1=>1,+1=>1)
Vvirt = U1Space(0=>10,-1=>8,+1=>8,-2=>6,+2=>6,-3=>4,+3=>4,-4=>2,+4=>2)
n = 10000
times = (time_mpo_sym_tk(Vphys,Vvirt,n),time_mpo_sym_itensor(Vphys,Vvirt,n),time_mpo_sym_uni10(Vphys,Vvirt,n))
plot(sort(times[1]));plot(sort(times[2]));plot(sort(times[3]));ylim([0.,10*minimum(times[1])])
legend(["TensorKit.jl","ITensor v2.1.0","Uni10 v1.0"])

## MPS - MPO Environment with U1 symmetry

In [None]:
# medium
Vphys = U1Space(0=>4,-1=>2,+1=>2,-2=>2,+2=>2)
Vvirt = U1Space(0=>50,-1=>40,+1=>40,-2=>30,+2=>30,-3=>20,+3=>20,-4=>10,+4=>10)
n = 1000
times = (time_mpo_sym_tk(Vphys,Vvirt,n),time_mpo_sym_itensor(Vphys,Vvirt,n),time_mpo_sym_uni10(Vphys,Vvirt,n))
plot(sort(times[1]));plot(sort(times[2]));plot(sort(times[3]));ylim([0.,10*minimum(times[1])])
legend(["TensorKit.jl","ITensor v2.1.0","Uni10 v1.0"])
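
To give a feeling for how much less data the $\mathsf{U}_1$ benchmarks touch compared to a dense tensor of the same total dimension, the sketch below counts the entries of `A` that charge conservation allows. It is plain Julia and does not use TensorKit: the `Dict`s mapping charge to degeneracy and the helper `count_entries` are hypothetical stand-ins for `U1Space`, and the convention that the charges of the first two legs add up to the charge of the (dual) third leg is an assumption made for this illustration.

```julia
# Hypothetical stand-in for U1Space: a Dict mapping charge => degeneracy.
# Counts the entries of a tensor in Vvirt ⊗ Vphys ⊗ Vvirt', once as a dense
# array and once keeping only blocks with a + s - b == 0 (assumed convention).
function count_entries(Vvirt::Dict{Int,Int}, Vphys::Dict{Int,Int})
    dense = sum(values(Vvirt))^2 * sum(values(Vphys))
    allowed = 0
    for (a, da) in Vvirt, (s, ds) in Vphys
        b = a + s                               # charge on the outgoing virtual leg
        allowed += da * ds * get(Vvirt, b, 0)   # block is absent if b is not in Vvirt
    end
    return dense, allowed
end

# The spaces of the "small" benchmark above, written as charge => degeneracy.
Vphys = Dict(0 => 2, -1 => 1, 1 => 1)
Vvirt = Dict(0 => 10, -1 => 8, 1 => 8, -2 => 6, 2 => 6, -3 => 4, 3 => 4, -4 => 2, 4 => 2)
dense, allowed = count_entries(Vvirt, Vphys)
println("dense entries: ", dense, ", symmetry-allowed entries: ", allowed)
# dense entries: 10000, symmetry-allowed entries: 1320
```

Under these assumptions only about 13% of the entries of `A` can be nonzero, so a library that stores and contracts the allowed blocks alone has correspondingly less work to do.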

## Number of lines of code: ITensor

In [None]:
cd(itensorpath*"itensor")
run(pipeline(`find . -iname '*.h' -o -iname '*.ih' -o -iname '*.cc'`,`grep -v mps`,`xargs wc -l`,`tail -n 1`))

## Number of lines of code: Uni10

In [None]:
cd("/Users/jutho/Dropbox/Code/Libraries/uni10-build/uni10/src")
run(pipeline(`find . -iname '*.hpp' -o -iname '*.h' -o -iname '*.cpp'`,`xargs wc -l`,`tail -n 1`))

## Number of lines of code: TensorKit & dependencies

In [None]:
cd("/Users/jutho/Dropbox/Code/Packages/Strided/src")
run(pipeline(`find . -iname '*.jl'`,`xargs wc -l`,`tail -n 1`))

In [None]:
cd("/Users/jutho/Dropbox/Code/Packages/TupleTools/src")
run(pipeline(`find . -iname '*.jl'`,`xargs wc -l`,`tail -n 1`))

In [None]:
cd("/Users/jutho/Dropbox/Code/Packages/TensorOperations/src")
run(pipeline(`find . -iname '*.jl'`,`xargs wc -l`,`tail -n 1`))

In [None]:
cd("/Users/jutho/Dropbox/Code/Packages/WignerSymbols/src")
run(pipeline(`find . -iname '*.jl'`,`xargs wc -l`,`tail -n 1`))

In [None]:
cd("/Users/jutho/Dropbox/Code/Packages/TensorKit/src")
run(pipeline(`find . -iname '*.jl'`,`xargs wc -l`,`tail -n 1`))
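
The counts above rely on `find`, `xargs` and `wc`. For very long file lists, `xargs` can split the work over several `wc` invocations, in which case `tail -n 1` only shows the total of the last batch. A shell-independent alternative is sketched below in plain Julia; `count_lines` is a hypothetical helper that is not part of any of the packages listed, and it does not reproduce the `grep -v mps` filter used for ITensor.

```julia
# Hypothetical helper: total line count of all files below `dir` whose name ends
# in one of `exts` (compared case-insensitively, like `find -iname`).
function count_lines(dir::AbstractString, exts)
    total = 0
    for (root, dirs, files) in walkdir(dir)
        for file in files
            if any(ext -> endswith(lowercase(file), ext), exts)
                total += countlines(joinpath(root, file))
            end
        end
    end
    return total
end

# Example on a throwaway directory; replace it with a package's `src` folder.
dir = mktempdir()
open(joinpath(dir, "a.jl"), "w") do io
    print(io, "x = 1\ny = 2\n")
end
open(joinpath(dir, "b.txt"), "w") do io
    print(io, "ignored\n")
end
println(count_lines(dir, [".jl"]))   # counts only the two lines of a.jl
```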

## Summary

* Currently in TensorKit:
    * Abelian and non-Abelian symmetries: $\mathsf{U}_1$, $\mathbb{Z}_N$, $\mathsf{SU}_2$, $\mathsf{U}_1 \rtimes C$
    * Any tensor product of the above
    * Any user-provided symmetry, although fusion multiplicities $N_{a,b}^{c} > 1$ still require extra work
    * Fermions (using super vector spaces), but currently untested
* Any number type: `Float32`, `Float64`, `Complex{Float32}`, `Complex{Float64}`, MPFR arbitrary precision (`BigFloat`), high-precision numbers provided by external packages, ...

* First steps into multithreading beyond BLAS
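
Support for arbitrary number types comes largely for free from Julia's parametric types. The sketch below illustrates the idea with plain arrays rather than TensorKit objects: `contract` is a hypothetical, deliberately naive pairwise contraction, written once and compiled separately for every element type it is called with.

```julia
# Hypothetical example: C[i,k] = sum_j A[i,j] * B[j,k] for any number type T.
function contract(A::AbstractMatrix{T}, B::AbstractMatrix{T}) where {T<:Number}
    size(A, 2) == size(B, 1) || throw(DimensionMismatch("inner dimensions differ"))
    C = zeros(T, size(A, 1), size(B, 2))
    for k in 1:size(B, 2), j in 1:size(A, 2), i in 1:size(A, 1)
        C[i, k] += A[i, j] * B[j, k]
    end
    return C
end

# The same source code runs for hardware floats, complex numbers and MPFR-based BigFloat.
for T in (Float32, Float64, Complex{Float32}, Complex{Float64}, BigFloat)
    C = contract(ones(T, 2, 3), ones(T, 3, 2))
    println(T, " => ", eltype(C), ", C[1,1] = ", C[1, 1])
end
```

A production implementation would dispatch to BLAS for the hardware float types and fall back to a generic kernel like this one only for the remaining types.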

## Outlook

* Public version (open source, MIT licensed) soon... (also waiting for Julia to stabilize)
* Actual tensor network algorithms
* Arbitrary topological symmetries
* Tensor network objects that allow temporaries to be recycled efficiently, environments to be extracted, ...
* Spatial symmetries?
* GPU support?