# EE4375-2022: Second Lab Session: Functions, Type Stability and Benchmarking

## Import Packages  

In [1]:
using LinearAlgebra 
using SparseArrays 

using IterativeSolvers
using Preconditioners

using BenchmarkTools
using Profile
using ProfileView

using Plots 

## Section 1: Build Linear System as Sparse From the Start
Motivation 
1. build the matrix as sparse directly, i.e., avoid conversion from a dense to a sparse matrix; 
2. use the code as a building block for more complex code; 
3. profile the code in terms of type stability, memory and CPU usage; 
4. unit test the code (compare with an analytical solution or check the rate of convergence); 
5. upload the code to GitHub to work with GitHub Actions. 

### Section 1.1: Build Coefficient Matrix 
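We annotate the input argument N with its type. The annotation documents the intended input and restricts dispatch to Int64; it does not by itself make the code faster, since Julia compiles a specialized method instance for the concrete argument types in any case. In the following, we 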
1. test the code on small input values; 
2. verify the type stability of the code;
3. benchmark the code;

Note that the type instability that @code_warntype reports, namely the Union{Nothing, Tuple{Int64, Int64}} state of the for-loop iterator, is considered harmless, as reported [here](https://discourse.julialang.org/t/how-to-prevent-type-instability-in-for-loops/30508). 
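The flagged Union is the value returned by `iterate` inside the `for` loop; the return type of the function itself can still be concrete. A minimal sketch of how one might check this programmatically, assuming the `Test` standard library; `sumTo` is a hypothetical helper and not part of the lab code:

```julia
using Test

# Hypothetical helper (not part of the lab code): sums the integers 1..n in a loop.
function sumTo(n::Int64)
    s = 0
    for i in 1:n      # the loop state is a Union{Nothing, Tuple{Int64, Int64}}
        s += i
    end
    return s
end

# @inferred throws an error if the inferred return type differs from the actual one
result = @inferred sumTo(10)

# Base.return_types lists the return types that inference finds for this signature
inferredTypes = Base.return_types(sumTo, (Int64,))
```

If `@inferred` passes, the loop-state Union did not leak into the return type.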

In [2]:
function buildMat1D(N::Int64)
  Nm1::Int64 = N-1; Np1::Int64 = N+1 
  h::Float64 = 1/N; h2::Float64 = h*h; 
  stencil = Vector{Float64}([-1/h2, 2/h2, -1/h2]);
  #..Allocate row, column and value vector 
  I = Int64[]; sizehint!(I, 3*Nm1); 
  J = Int64[]; sizehint!(J, 3*Nm1); 
  vals=Float64[]; sizehint!(vals, 3*Nm1);
  intervalRows = Vector{Int64}(2:N);
  for i in intervalRows 
    append!(I, [i,i,i])
    append!(J, [i-1,i,i+1])
    append!(vals,stencil) 
  end 
  #..Build matrix for interior rows   
  A = sparse(I,J,vals,Np1,Np1)
  #..Build matrix for boundary rows
  A[1,1] = 1; A[end,end]=1; A[2,1] =0; A[end-1,end]=0; 
  return A 
end 

buildMat1D (generic function with 1 method)
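Each call `append!(I, [i,i,i])` above creates a small temporary array per row. As a possible refinement, the sketch below writes into preallocated vectors instead. The name `buildMat1DPrealloc` is hypothetical, and whether this variant is actually faster should be measured with `@benchmark` rather than assumed:

```julia
using SparseArrays

# Hypothetical variant of buildMat1D that fills preallocated index/value vectors.
function buildMat1DPrealloc(N::Int64)
    h2 = (1/N)^2
    nEntries = 3*(N-1) + 2            # three entries per interior row, one per boundary row
    I = Vector{Int64}(undef, nEntries)
    J = Vector{Int64}(undef, nEntries)
    vals = Vector{Float64}(undef, nEntries)
    k = 0
    for i in 2:N                      # interior rows
        for (j, v) in ((i-1, -1/h2), (i, 2/h2), (i+1, -1/h2))
            k += 1
            I[k] = i; J[k] = j; vals[k] = v
        end
    end
    # boundary rows: identity entries
    I[k+1] = 1;   J[k+1] = 1;   vals[k+1] = 1.0
    I[k+2] = N+1; J[k+2] = N+1; vals[k+2] = 1.0
    A = sparse(I, J, vals, N+1, N+1)
    # remove the coupling of the interior rows to the boundary nodes, as in buildMat1D
    A[2,1] = 0; A[end-1,end] = 0
    return A
end

A4 = buildMat1DPrealloc(4)
```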

In [3]:
A = buildMat1D(4)

5×5 SparseMatrixCSC{Float64, Int64} with 11 stored entries:
 1.0     ⋅      ⋅      ⋅    ⋅ 
 0.0   32.0  -16.0     ⋅    ⋅ 
  ⋅   -16.0   32.0  -16.0   ⋅ 
  ⋅      ⋅   -16.0   32.0  0.0
  ⋅      ⋅      ⋅      ⋅   1.0
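The two `0.0` values in the output are explicitly stored zeros: assigning zero to an entry that is already stored keeps it in the sparsity pattern, which is why 11 entries are reported. A small sketch on a 2×2 example (not the lab matrix) of how `dropzeros!` from SparseArrays removes such entries:

```julia
using SparseArrays

# Small example with three stored entries
A = sparse([1, 2, 2], [1, 1, 2], [1.0, -16.0, 32.0], 2, 2)
storedBefore = nnz(A)

# Overwriting a stored entry with zero keeps it in the sparsity pattern
A[2,1] = 0
storedAfterAssign = nnz(A)

# dropzeros! removes stored zeros in place
dropzeros!(A)
storedAfterDrop = nnz(A)
```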

In [15]:
B = Matrix(A)

5×5 Matrix{Float64}:
 1.0    0.0    0.0    0.0  0.0
 0.0   32.0  -16.0    0.0  0.0
 0.0  -16.0   32.0  -16.0  0.0
 0.0    0.0  -16.0   32.0  0.0
 0.0    0.0    0.0    0.0  1.0

In [16]:
typeof(B)

Matrix{Float64} (alias for Array{Float64, 2})

In [28]:
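# NB: the output below was recorded for an untyped method buildMat1D(N);
# the method buildMat1D(N::Int64) defined above does not accept the Float64 argument 5.
@code_warntype buildMat1D(5.);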

MethodInstance for buildMat1D(::Float64)
  from buildMat1D(N) in Main at In[21]:2
Arguments
  #self#::Core.Const(buildMat1D)
  N::Float64
Locals
  @_3::Union{Nothing, Tuple{Int64, Int64}}
  A::SparseMatrixCSC{Float64, Int64}
  intervalRows::Vector{Int64}
  vals::Vector{Float64}
  J::Vector{Int64}
  I::Vector{Int64}
  stencil::Vector{Float64}
  h2::Float64
  h::Float64
  Np1::Int64
  Nm1::Int64
  i::Int64
Body::SparseMatrixCSC{Float64, Int64}
1 ─       Core.NewvarNode(:(A))
│   %2  = (N - 1)::Float64
│   %3  = Base.convert(Main.Int64, %2)::Int64
│         (Nm1 = Core.typeassert(%3, Main.Int64))
│   %5  = (N + 1)::Float64
│   %6  = Base.convert(Main.Int64, %5)::Int64
│         (Np1 = Core.typeassert(%6, Main.Int64))
│

In [25]:
@benchmark buildMat1D(1000)

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):   87.333 μs …   5.549 ms  ┊ GC (min … max):  0.00% … 98.09%
 Time  (median):      97.166 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   119.481 μs ± 289.167 μs  ┊ GC (mean ± σ):  17.06% ±  6.86%

### Section 1.2: Construction of the Right-Hand Side Vector

In [22]:
function buildRhs1D(N::Int64,sourceFct::Function)
  h = 1/N;
  x = Vector(0:h:1)
  #..Build vector for interior rows 
  f = sourceFct(x)
  #..Set entries of the vector for boundary rows
  f[1] = 0; f[end] = 0; 
  return f 
end 

buildRhs1D (generic function with 1 method)

In [20]:
sourceFct(x)= x.*sin.(π*x)

sourceFct (generic function with 1 method)
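The source function above is written for a whole vector of grid points. An alternative, sketched here under the assumption that a scalar definition is preferred, is to define the source term pointwise and let broadcasting apply it to the grid. The names `sourceScalar` and `buildRhs1DBroadcast` are hypothetical and chosen so as not to clash with the definitions above:

```julia
# Scalar source term (hypothetical name)
sourceScalar(x) = x * sin(π * x)

# Hypothetical variant of buildRhs1D that broadcasts a scalar function over the grid
function buildRhs1DBroadcast(N::Int64, sourceFct::Function)
    x = range(0, 1, length=N+1)   # lazy grid with N+1 points
    f = sourceFct.(x)             # broadcasting allocates the result vector
    f[1] = 0; f[end] = 0          # boundary rows
    return f
end

fBroadcast = buildRhs1DBroadcast(4, sourceScalar)
```

With this design the same scalar function can also be evaluated at a single point, for instance when plotting or testing.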

In [33]:
f = buildRhs1D(5,sourceFct);
typeof(A)

SparseMatrixCSC{Float64, Int64}

In [33]:
@code_warntype buildRhs1D(5,sourceFct)

MethodInstance for buildRhs1D(::Int64, ::typeof(sourceFct))
  from buildRhs1D(N::Int64, sourceFct::Function) in Main at In[30]:1
Arguments
  #self#::Core.Const(buildRhs1D)
  N::Int64
  sourceFct::Core.Const(sourceFct)
Locals
  f::Vector{Float64}
  x::Vector{Float64}
  h::Float64
Body::Vector{Float64}
1 ─      (h = 1 / N)
│   %2 = (0:h:1)::StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}
│        (x = Main.Vector(%2))
│        (f = (sourceFct)(x))
│        Base.setindex!(f, 0, 1)
│   %6 = f::Vector{Float64}
│   %7 = Base.lastindex(f)::Int64
│        Base.setindex!(%6, 0, %7)
└──      return f



In [34]:
@benchmark buildRhs1D(1000,sourceFct)

BenchmarkTools.Trial: 10000 samples with 5 evaluations.
 Range (min … max):  6.017 μs …  1.082 ms  ┊ GC (min … max):  0.00% … 99.30%
 Time  (median):     7.542 μs              ┊ GC (median):     0.00%
 Time  (mean ± σ):   8.621 μs ± 31.902 μs  ┊ GC (mean ± σ):  12.26% ±  3.29%

### Section 1.3: Solve the Linear System - Default Versions 
Here we employ a sparse direct solver. 

In [25]:
function solvePoisson1D(N::Int64,sourceFct::Function)
  A = buildMat1D(N);
  f = buildRhs1D(N,sourceFct)
  u = A\f 
  return u 
end

solvePoisson1D (generic function with 1 method)
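One way to unit test the solver, as listed in the motivation, is a manufactured solution. For the stencil [-1, 2, -1]/h² the discrete problem approximates -u'' = f with u(0) = u(1) = 0, so choosing u(x) = sin(πx) gives f(x) = π² sin(πx). The sketch below checks that the maximum error drops by roughly a factor of four when N is doubled. It is self-contained: `solvePoissonSketch` is a hypothetical compact stand-in for `buildMat1D`, `buildRhs1D` and `A\f`, not the lab code itself.

```julia
using LinearAlgebra, SparseArrays

# Hypothetical compact stand-in for buildMat1D + buildRhs1D + A\f
function solvePoissonSketch(N::Int64, sourceFct::Function)
    h = 1/N
    x = Vector(0:h:1)
    A = spdiagm(-1 => fill(-1/h^2, N), 0 => fill(2/h^2, N+1), 1 => fill(-1/h^2, N))
    # boundary rows: u = 0 at both end points
    A[1,1] = 1; A[1,2] = 0
    A[end,end] = 1; A[end,end-1] = 0
    f = sourceFct(x)
    f[1] = 0; f[end] = 0
    return x, A \ f
end

manufacturedSource(x) = π^2 .* sin.(π .* x)   # equals -u'' for u(x) = sin(πx)
exactSolution(x) = sin.(π .* x)

# Maximum nodal error on a mesh with N elements
function maxError(N::Int64)
    x, u = solvePoissonSketch(N, manufacturedSource)
    return maximum(abs.(u .- exactSolution(x)))
end

err1 = maxError(32)
err2 = maxError(64)
observedOrder = log2(err1 / err2)   # close to 2 for a second-order scheme
```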

In [26]:
@benchmark solvePoisson1D(1000,sourceFct)

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  282.958 μs …   9.458 ms  ┊ GC (min … max): 0.00% … 81.18%
 Time  (median):     307.000 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   356.168 μs ± 490.612 μs  ┊ GC (mean ± σ):  9.81% ±  6.73%

In [34]:
# algorithm being used to solve the linear system 
# use edit("/Applications/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/SparseArrays/src/linalg.jl:1548")
# to view the source code 
methods(\, (SparseMatrixCSC{Float64, Int64}, Vector{Float64}))
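As a complement to `methods`, the `@which` macro reports the single method that dispatch selects for a concrete call. The sketch below uses a small example matrix (not the lab matrix) and also shows, as an optional design choice, how a sparse LU factorization can be computed once with `lu` and reused for several right-hand sides:

```julia
using InteractiveUtils   # provides @which (loaded automatically in the REPL and in notebooks)
using LinearAlgebra, SparseArrays

A = sparse([4.0 -1.0 0.0; -1.0 4.0 -1.0; 0.0 -1.0 4.0])
f = [1.0, 2.0, 3.0]

# Method selected by dispatch for a sparse matrix and a dense vector
m = @which A \ f
println(m)

# Factorize once, then reuse the factorization for several right-hand sides
F = lu(A)
u1 = F \ f
u2 = F \ (2 .* f)
```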

### Section 1.4: Solve the Linear System - Alternative Versions 
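Here we employ a dense direct solver: the sparse matrix is first converted to a dense Matrix. The function below has the same name and signature as the one in Section 1.3 and therefore overwrites it. 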

In [29]:
function solvePoisson1D(N::Int64,sourceFct::Function)
  A = buildMat1D(N);
  B = Matrix(A)
  f = buildRhs1D(N,sourceFct)
  u = B\f 
  return u 
end

solvePoisson1D (generic function with 1 method)

In [30]:
@benchmark solvePoisson1D(1000,sourceFct)

BenchmarkTools.Trial: 184 samples with 1 evaluation.
 Range (min … max):  10.585 ms … 248.194 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     16.441 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   27.204 ms ±  29.754 ms  ┊ GC (mean ± σ):  2.50% ± 6.99%

In [35]:
# algorithm being used to solve the linear system 
methods(\, (Matrix{Float64}, Vector{Float64}))
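
The sparse and the dense solver should return the same solution up to rounding. A minimal consistency check might look as follows. The sketch is self-contained and assembles the same system through `Tridiagonal` from LinearAlgebra, which is an assumption made here for brevity and a third storage format one could benchmark, not the approach used above:

```julia
using LinearAlgebra, SparseArrays

N = 8
h2 = (1/N)^2
lower = fill(-1/h2, N); mainDiag = fill(2/h2, N+1); upper = fill(-1/h2, N)
# boundary rows: the first and last equation reduce to u = 0
mainDiag[1] = 1; upper[1] = 0
mainDiag[end] = 1; lower[end] = 0
T = Tridiagonal(lower, mainDiag, upper)

x = Vector(0:1/N:1)
f = x .* sin.(π .* x)
f[1] = 0; f[end] = 0

uSparse = sparse(T) \ f    # sparse direct solver
uDense  = Matrix(T) \ f    # dense direct solver
uTri    = T \ f            # solver specialized for tridiagonal matrices
maxDiff = max(maximum(abs.(uSparse .- uDense)), maximum(abs.(uTri .- uDense)))
```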