# CUDA tensor network contraction demo

## Requirements
- A CUDA-capable NVIDIA GPU must be available on the system.
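A quick way to confirm the requirement before running the rest of the notebook is to ask CUDA.jl whether it found a usable GPU. This is a minimal sketch; the warning text is illustrative and not part of the original notebook.

```julia
using CUDA

# CUDA.functional() returns true only if a driver, toolkit, and device were found.
if CUDA.functional()
    # Print driver, toolkit, and device details for the record.
    CUDA.versioninfo()
else
    @warn "No functional CUDA GPU detected; the GPU cells below will fail."
end
```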

In [1]:
using Tenet
using EinExprs
using Adapt
using CUDA
using BenchmarkTools



Create a random tensor network and find its contraction path:

In [2]:
# Initialize random tensor network
regularity = 6
ntensors = 10
tn = rand(TensorNetwork, ntensors, regularity)
path = einexpr(tn; optimizer=Exhaustive())

SizedEinExpr{Symbol}(EinExpr{Symbol}(Symbol[], EinExpr{Symbol}[EinExpr{Symbol}([:P, :A, :c, :Y], EinExpr{Symbol}[]), EinExpr{Symbol}([:P, :A, :c, :Y], EinExpr{Symbol}[EinExpr{Symbol}([:H, :U, :F, :Z, :X, :O, :A, :c], EinExpr{Symbol}[]), EinExpr{Symbol}([:H, :P, :U, :F, :Z, :X, :O, :Y], EinExpr{Symbol}[EinExpr{Symbol}([:E, :K, :U, :V, :I, :a, :F, :Z, :b], EinExpr{Symbol}[]), EinExpr{Symbol}([:E, :H, :P, :K, :V, :I, :a, :b, :X, :O, :Y], EinExpr{Symbol}[EinExpr{Symbol}([:D, :E, :H, :M, :P], EinExpr{Symbol}[]), EinExpr{Symbol}([:D, :M, :K, :V, :I, :a, :b, :X, :O, :Y], EinExpr{Symbol}[EinExpr{Symbol}([:D, :J, :G, :a, :O, :T, :Y], EinExpr{Symbol}[EinExpr{Symbol}([:D, :J, :R], EinExpr{Symbol}[]), EinExpr{Symbol}([:R, :G, :a, :O, :T, :Y], EinExpr{Symbol}[])]), EinExpr{Symbol}([:J, :M, :G, :K, :V, :I, :b, :X, :T], EinExpr{Symbol}[EinExpr{Symbol}([:M, :G, :S, :K, :Q], EinExpr{Symbol}[]), EinExpr{Symbol}([:J, :S, :Q, :V, :I, :b, :X, :T], EinExpr{Symbol}[EinExpr{Symbol}([:V, :b, :N, :C, :L, :X, :B
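To see why the contraction path matters, it helps to look at the smallest possible case: a chain of three matrices, which is a three-tensor network. The sketch below uses hypothetical shapes and a simple cost model (scalar multiplications only); it is not how `einexpr` computes costs internally, only an illustration of why the pairwise order is worth optimizing.

```julia
# Cost model: multiplying an (m×k) matrix by a (k×n) matrix
# takes m*k*n scalar multiplications.
matmul_cost(m, k, n) = m * k * n

# Hypothetical shapes: A is 10×1000, B is 1000×10, C is 10×1000.
a, b, c, d = 10, 1000, 10, 1000

# Path 1: contract A with B first, then the result with C.
cost_left = matmul_cost(a, b, c) + matmul_cost(a, c, d)

# Path 2: contract B with C first, then A with the result.
cost_right = matmul_cost(b, c, d) + matmul_cost(a, b, d)

println("(A*B)*C: ", cost_left, " multiplications")
println("A*(B*C): ", cost_right, " multiplications")
```

Both paths give the same result, but under this model the second one costs 100 times more. With ten tensors and thirty indices the gap between good and bad paths can be far larger, which is why the path is computed once and reused for both benchmarks.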

Move the tensors' underlying arrays to the GPU by converting them to `CuArray`s:

In [3]:
cudatn = adapt(CuArray, tn)

TensorNetwork (#tensors=10, #inds=30)
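The printed summary does not show where the data lives, so a small check can be useful. The sketch below assumes that `tensors(tn)` returns the tensors of the network and that `parent(t)` returns the array wrapped by a tensor; both are assumptions about the Tenet API rather than something shown in this notebook. It also assumes that `adapt(Array, ...)` moves the data back to host memory in the same way.

```julia
using Tenet, Adapt, CUDA

tn = rand(TensorNetwork, 10, 6)
cudatn = adapt(CuArray, tn)

# Assumption: `tensors` lists the tensors and `parent` exposes the wrapped array.
@show all(t -> parent(t) isa CuArray, tensors(cudatn))

# Assumption: adapting with `Array` as the target copies the data back to the CPU.
cputn = adapt(Array, cudatn)
@show all(t -> parent(t) isa Array, tensors(cputn))
```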

Benchmark the tensor network contraction on the GPU:

In [4]:
@benchmark contract(cudatn; path)

BenchmarkTools.Trial: 1355 samples with 1 evaluation.
 Range [90m([39m[36m[1mmin[22m[39m … [35mmax[39m[90m):  [39m[36m[1m854.907 μs[22m[39m … [35m  9.324 ms[39m  [90m┊[39m GC [90m([39mmin … max[90m): [39m0.00% … 36.81%
 Time  [90m([39m[34m[1mmedian[22m[39m[90m):     [39m[34m[1m  3.749 ms               [22m[39m[90m┊[39m GC [90m([39mmedian[90m):    [39m0.00%
 Time  [90m([39m[32m[1mmean[22m[39m ± [32mσ[39m[90m):   [39m[32m[1m  3.673 ms[22m[39m ± [32m796.963 μs[39m  [90m┊[39m GC [90m([39mmean ± σ[90m):  [39m0.85% ±  3.09%

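GPU operations are generally launched asynchronously, so a benchmark that does not wait for the device may partly measure launch overhead instead of the full computation. A more conservative variant, sketched below, wraps the call in `CUDA.@sync` and interpolates the global variables with `$`, as BenchmarkTools recommends. Whether this changes the numbers for this particular network has not been checked here.

```julia
using Tenet, EinExprs, Adapt, CUDA, BenchmarkTools

# Same setup as the cells above.
tn = rand(TensorNetwork, 10, 6)
path = einexpr(tn; optimizer=Exhaustive())
cudatn = adapt(CuArray, tn)

# `$` interpolation avoids timing global-variable lookups;
# CUDA.@sync blocks until all queued GPU work has finished.
@benchmark CUDA.@sync contract($cudatn; path=$path)
```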

Benchmark the same contraction on the CPU, using the original `Array`-backed tensor network:

In [5]:
@benchmark contract(tn; path)

BenchmarkTools.Trial: 12 samples with 1 evaluation.
 Range [90m([39m[36m[1mmin[22m[39m … [35mmax[39m[90m):  [39m[36m[1m325.174 ms[22m[39m … [35m966.103 ms[39m  [90m┊[39m GC [90m([39mmin … max[90m): [39m 1.81% … 57.31%
 Time  [90m([39m[34m[1mmedian[22m[39m[90m):     [39m[34m[1m334.514 ms               [22m[39m[90m┊[39m GC [90m([39mmedian[90m):    [39m 1.78%
 Time  [90m([39m[32m[1mmean[22m[39m ± [32mσ[39m[90m):   [39m[32m[1m433.969 ms[22m[39m ± [32m194.491 ms[39m  [90m┊[39m GC [90m([39mmean ± σ[90m):  [39m19.64% ± 19.62%

  [39m█[34m [39m[39m [39m [39m [39m [39m [39m [39m [39m [32m [39m[39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m [39m 
  [39m█[34m▄[39m[