# EE4375-2022: Second Lab Session: Functions, Type Stability and Benchmarking

## Import Packages  

In [1]:
using LinearAlgebra 
using SparseArrays 

using IterativeSolvers
using Preconditioners

using BenchmarkTools
using Profile
using ProfileView

using Plots 

## Section 1: Build Linear System as Sparse From the Start
Motivation 
1. build the matrix as sparse directly, i.e., avoid conversion from a dense to a sparse matrix; 
2. use the code as a building block for more complex code; 
3. profile the code in terms of type stability, memory and CPU usage; 
4. unit test the code (compare with an analytical solution or check the rate of convergence); 
5. upload the code to GitHub to work with GitHub Actions. 

### Section 1.1: Build Coefficient Matrix 
We specify the type of the input argument N. This restricts the function to integer input; Julia specializes the compiled code on the argument types in either case. In the following, we 
1. test the code on small input values; 
2. verify the type stability of the code;
3. benchmark the code;

Note that the type instability that @code_warntype reports for the for-loop is considered harmless, as discussed [here](https://discourse.julialang.org/t/how-to-prevent-type-instability-in-for-loops/30508). 

In [2]:
function buildMat1D(N::Int64)
  Nm1::Int64 = N-1; Np1::Int64 = N+1 
  h::Float64 = 1/N; h2::Float64 = h*h; 
  stencil = Vector{Float64}([-1/h2, 2/h2, -1/h2]);
  #..Allocate row, column and value vector 
  I = Int64[]; sizehint!(I, 3*Nm1); 
  J = Int64[]; sizehint!(J, 3*Nm1); 
  vals=Float64[]; sizehint!(vals, 3*Nm1);
  intervalRows = Vector{Int64}(2:N);
  for i in intervalRows 
    append!(I, [i,i,i])
    append!(J, [i-1,i,i+1])
    append!(vals,stencil) 
  end 
  #..Build matrix for interior rows   
  A = sparse(I,J,vals,Np1,Np1)
  #..Build matrix for boundary rows
  A[1,1] = 1; A[end,end]=1; A[2,1] =0; A[end-1,end]=0; 
  return A 
end 

buildMat1D (generic function with 1 method)

In [3]:
A = buildMat1D(4)

5×5 SparseMatrixCSC{Float64, Int64} with 11 stored entries:
 1.0     ⋅      ⋅      ⋅    ⋅ 
 0.0   32.0  -16.0     ⋅    ⋅ 
  ⋅   -16.0   32.0  -16.0   ⋅ 
  ⋅      ⋅   -16.0   32.0  0.0
  ⋅      ⋅      ⋅      ⋅   1.0
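
For comparison, the same matrix can be assembled from its three diagonals. The following is a minimal sketch, assuming `spdiagm` and `dropzeros!` from SparseArrays; the name `buildMat1DDiag` is hypothetical and not part of the lab code. Because the explicit zeros are dropped, it stores fewer entries than `buildMat1D`.

```julia
using SparseArrays

# Hypothetical alternative to buildMat1D: assemble the matrix from its diagonals.
# Assumes N >= 2 so that at least one interior row exists.
function buildMat1DDiag(N::Int64)
  h2 = (1/N)^2
  #..main diagonal: identity on the two boundary rows, 2/h^2 on the interior rows
  d = [1.0; fill(2/h2, N-1); 1.0]
  #..off-diagonals: -1/h^2 between interior nodes, zero next to a boundary node
  off = [0.0; fill(-1/h2, N-2); 0.0]
  A = spdiagm(-1 => off, 0 => d, 1 => off)
  #..remove the explicitly stored zeros of the off-diagonals
  dropzeros!(A)
  return A
end

A = buildMat1DDiag(4)
```

Whether this variant is preferable depends on how the code is extended later; the triplet assembly in `buildMat1D` carries over more directly to problems where the stencil varies per row.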

In [4]:
B = Matrix(A)

5×5 Matrix{Float64}:
 1.0    0.0    0.0    0.0  0.0
 0.0   32.0  -16.0    0.0  0.0
 0.0  -16.0   32.0  -16.0  0.0
 0.0    0.0  -16.0   32.0  0.0
 0.0    0.0    0.0    0.0  1.0

In [5]:
typeof(B)

Matrix{Float64} (alias for Array{Float64, 2})

In [7]:
@code_warntype buildMat1D(5);
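
The visual inspection with `@code_warntype` can be complemented by an automated check. The sketch below assumes the `@inferred` macro from the `Test` standard library, which throws an error when the actual return type differs from the inferred one. The two toy functions are hypothetical and only illustrate the mechanism.

```julia
using Test

# Hypothetical toy functions, not part of the lab code.
# meshWidthUnstable returns a Float64 or an Int64 depending on the value of N.
meshWidthUnstable(N::Int64) = N > 0 ? 1/N : 0
# meshWidthStable returns a Float64 on both branches.
meshWidthStable(N::Int64) = N > 0 ? 1/N : 0.0

# passes: the inferred return type matches the actual return type
h = @inferred meshWidthStable(4)

# @inferred throws for the unstable variant; record whether it did
flagged = try
  @inferred meshWidthUnstable(4)
  false
catch
  true
end
```

The same macro can be applied to `buildMat1D(5)` inside a unit test, so that a later change that breaks type stability is caught without reading the `@code_warntype` output by hand.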

In [8]:
@benchmark buildMat1D(1000)

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):   86.416 μs …   5.541 ms  ┊ GC (min … max):  0.00% … 98.13%
 Time  (median):      97.208 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   117.298 μs ± 260.107 μs  ┊ GC (mean ± σ):  15.56% ±  6.82%

### Section 1.2: Construction of the Right-Hand Side Vector

In [9]:
function buildRhs1D(N::Int64,sourceFct::Function)
  h = 1/N;
  x = Vector(0:h:1)
  #..Build vector for interior rows 
  f = sourceFct(x)
  #..Build vector for boundary rows
  f[1] = 0; f[end] = 0; 
  return f 
end 

buildRhs1D (generic function with 1 method)

In [10]:
sourceFct(x)= x.*sin.(π*x)

sourceFct (generic function with 1 method)
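
`buildRhs1D` calls `sourceFct` on the whole vector `x`, so the source function has to be written with broadcasting dots, as above. A possible variant, sketched here under the hypothetical name `buildRhs1DScalar`, broadcasts inside the builder so that a scalar function can be passed instead.

```julia
# Hypothetical variant of buildRhs1D that accepts a scalar source function.
function buildRhs1DScalar(N::Int64, sourceFct::Function)
  x = range(0, stop=1, length=N+1)
  #..Evaluate the scalar function in every node (allocates the result vector)
  f = sourceFct.(x)
  #..Build vector for boundary rows
  f[1] = 0; f[end] = 0
  return f
end

# scalar version of the source function used in this section
scalarSourceFct(x) = x*sin(π*x)
f = buildRhs1DScalar(5, scalarSourceFct)
```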

In [11]:
f = buildRhs1D(5,sourceFct);
typeof(A)

SparseMatrixCSC{Float64, Int64}

In [12]:
@code_warntype buildRhs1D(5,sourceFct)

MethodInstance for buildRhs1D(::Int64, ::typeof(sourceFct))
  from buildRhs1D(N::Int64, sourceFct::Function) in Main at In[9]:1
Arguments
  #self#::Core.Const(buildRhs1D)
  N::Int64
  sourceFct::Core.Const(sourceFct)
Locals
  f::Vector{Float64}
  x::Vector{Float64}
  h::Float64
Body::Vector{Float64}
1 ─      (h = 1 / N)
│   %2 = (0:h:1)::StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}
│        (x = Main.Vector(%2))
│        (f = (sourceFct)(x))
│        Base.setindex!(f, 0, 1)
│   %6 = f::Vector{Float64}
│   %7 = Base.lastindex(f)::Int64
│        Base.setindex!(%6, 0, %7)
└──      return f



In [13]:
@benchmark buildRhs1D(1000,sourceFct)

BenchmarkTools.Trial: 10000 samples with 5 evaluations.
 Range (min … max):  6.067 μs … 941.433 μs  ┊ GC (min … max):  0.00% … 99.24%
 Time  (median):     7.542 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   8.667 μs ±  28.727 μs  ┊ GC (mean ± σ):  11.45% ±  3.43%

### Section 1.3: Solve the Linear System - Default Versions 
Here we employ a sparse direct solver. 

In [21]:
function solvePoisson1D(N::Int64,sourceFct::Function)
  A = buildMat1D(N);
  f = buildRhs1D(N,sourceFct)
  u = A\f 
  return u 
end

solvePoisson1D (generic function with 1 method)

In [22]:
@benchmark solvePoisson1D(100,sourceFct)

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  30.083 μs …   8.825 ms  ┊ GC (min … max): 0.00% … 51.90%
 Time  (median):     34.042 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   42.430 μs ± 250.964 μs  ┊ GC (mean ± σ):  9.33% ±  1.57%

In [16]:
# algorithm being used to solve the linear system 
# use edit("/Applications/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/SparseArrays/src/linalg.jl:1548")
# to view the source code 
methods(\, (SparseMatrixCSC{Float64, Int64}, Vector{Float64}))
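
One of the motivations listed in Section 1 is to unit test the code against an analytical solution. A minimal sketch of such a test follows, assuming the manufactured solution u(x) = sin(πx), for which -u'' = π² sin(πx) and u(0) = u(1) = 0. The helper `solveModel1D` is hypothetical and restates the assembly compactly to keep the block self-contained. For this three-point stencil, an observed order close to 2 is the expected outcome.

```julia
using LinearAlgebra, SparseArrays

# Hypothetical compact restatement of the assembly and solve steps; assumes N >= 2.
function solveModel1D(N::Int64, sourceFct::Function)
  h2 = (1/N)^2
  x = collect(range(0, stop=1, length=N+1))
  #..stencil on the interior rows, identity on the two boundary rows
  d   = [1.0; fill(2/h2, N-1); 1.0]
  off = [0.0; fill(-1/h2, N-2); 0.0]
  A = spdiagm(-1 => off, 0 => d, 1 => off)
  #..right-hand side with homogeneous Dirichlet values
  f = sourceFct(x)
  f[1] = 0; f[end] = 0
  return x, A\f
end

# manufactured solution and the matching source term
uExact(x) = sin.(π*x)
sourceExact(x) = π^2*sin.(π*x)

# maximum nodal error on a mesh with N elements
function maxError(N::Int64)
  x, u = solveModel1D(N, sourceExact)
  return norm(u - uExact(x), Inf)
end

# observed order of convergence when the mesh width is halved
order = log2(maxError(10)/maxError(20))
```

In a test suite, the last line would typically be wrapped in an assertion with a tolerance around 2 rather than compared for exact equality.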

### Section 1.4: Solve the Linear System - Alternative Versions 
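Here we intend to employ a dense direct solver. Note that in the cell below the conversion B = Matrix(A) is commented out: the matrix is converted to Tridiagonal and back to sparse, so the benchmark that follows still times the sparse direct solver. 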

In [28]:
function solvePoisson1D(N::Int64,sourceFct::Function)
  A = buildMat1D(N);
  # B = Matrix(A)
  B = Tridiagonal(A) 
  B = sparse(B)
  f = buildRhs1D(N,sourceFct)
  u = B\f 
  return u 
end

solvePoisson1D (generic function with 1 method)

In [29]:
@benchmark solvePoisson1D(100,sourceFct)

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  38.000 μs …   8.350 ms  ┊ GC (min … max): 0.00% … 55.68%
 Time  (median):     42.209 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   51.483 μs ± 257.605 μs  ┊ GC (mean ± σ):  9.35% ±  1.86%

In [35]:
# algorithm being used to solve the linear system 
methods(\, (Matrix{Float64}, Vector{Float64}))

Here we employ a tridiagonal matrix (it is not obvious which solver variant is being used). 

In [50]:
function solvePoisson1D(N::Int64,sourceFct::Function)
  A = buildMat1D(N);
  B = Tridiagonal(A) 
  f = buildRhs1D(N,sourceFct)
  u = B\f 
  return u 
end

solvePoisson1D (generic function with 1 method)
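
Instead of converting the sparse matrix, the `Tridiagonal` matrix can also be built directly from its three diagonals, which avoids the intermediate sparse assembly. A minimal sketch, assuming the constructor `Tridiagonal(dl, d, du)` from LinearAlgebra; the name `buildTridiag1D` and the small right-hand side are hypothetical.

```julia
using LinearAlgebra

# Hypothetical direct construction of the coefficient matrix; assumes N >= 2.
function buildTridiag1D(N::Int64)
  h2 = (1/N)^2
  d  = [1.0; fill(2/h2, N-1); 1.0]    # main diagonal
  dl = [0.0; fill(-1/h2, N-2); 0.0]   # sub-diagonal
  du = copy(dl)                       # super-diagonal, same values, separate storage
  return Tridiagonal(dl, d, du)
end

T = buildTridiag1D(4)
f = [0.0, 1.0, 1.0, 1.0, 0.0]   # illustrative right-hand side with zero boundary values
u = T\f
```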

In [51]:
@benchmark solvePoisson1D(100,sourceFct)

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  27.417 μs …   5.577 ms  ┊ GC (min … max):  0.00% … 99.30%
 Time  (median):     29.750 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   33.891 μs ± 140.691 μs  ┊ GC (mean ± σ):  10.96% ±  2.62%

## Sandbox

In [39]:
?sparse

search: sparse sparsevec sparse_vcat sparse_hcat sparse_hvcat SparseVector



```
sparse(A)
```

Convert an AbstractMatrix `A` into a sparse matrix.

# Examples

```jldoctest
julia> A = Matrix(1.0I, 3, 3)
3×3 Matrix{Float64}:
 1.0  0.0  0.0
 0.0  1.0  0.0
 0.0  0.0  1.0

julia> sparse(A)
3×3 SparseMatrixCSC{Float64, Int64} with 3 stored entries:
 1.0   ⋅    ⋅
  ⋅   1.0   ⋅
  ⋅    ⋅   1.0
```

---

```
sparse(I, J, V,[ m, n, combine])
```

Create a sparse matrix `S` of dimensions `m x n` such that `S[I[k], J[k]] = V[k]`. The `combine` function is used to combine duplicates. If `m` and `n` are not specified, they are set to `maximum(I)` and `maximum(J)` respectively. If the `combine` function is not supplied, `combine` defaults to `+` unless the elements of `V` are Booleans in which case `combine` defaults to `|`. All elements of `I` must satisfy `1 <= I[k] <= m`, and all elements of `J` must satisfy `1 <= J[k] <= n`. Numerical zeros in (`I`, `J`, `V`) are retained as structural nonzeros; to drop numerical zeros, use [`dropzeros!`](@ref).

For additional documentation and an expert driver, see `SparseArrays.sparse!`.

# Examples

```jldoctest
julia> Is = [1; 2; 3];

julia> Js = [1; 2; 3];

julia> Vs = [1; 2; 3];

julia> sparse(Is, Js, Vs)
3×3 SparseMatrixCSC{Int64, Int64} with 3 stored entries:
 1  ⋅  ⋅
 ⋅  2  ⋅
 ⋅  ⋅  3
```


In [40]:
A = buildMat1D(100);
B = Matrix(A); # convert from sparse to dense 
C = Tridiagonal(B)
f = buildRhs1D(100,sourceFct);

In [37]:
typeof(A)

SparseMatrixCSC{Float64, Int64}

In [38]:
typeof(B)

Matrix{Float64} (alias for Array{Float64, 2})

In [35]:
@btime B \ f ;

  156.125 μs (4 allocations: 81.55 KiB)


In [36]:
@btime A \ f ;

  19.083 μs (54 allocations: 43.08 KiB)


In [41]:
@btime C \ f ;

  2.296 μs (8 allocations: 5.34 KiB)


In [42]:
typeof(C)

Tridiagonal{Float64, Vector{Float64}}
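
The ordering of these timings is consistent with how many entries each storage format holds. The sketch below counts them for the same matrix. The compact `buildMat1D` restated here is an assumption made to keep the block self-contained, and the counts say nothing about the solver algorithms themselves.

```julia
using LinearAlgebra, SparseArrays

# Compact restatement of buildMat1D (triplet assembly as in Section 1.1).
function buildMat1D(N::Int64)
  h2 = (1/N)^2
  I = Int64[]; J = Int64[]; vals = Float64[]
  for i in 2:N
    append!(I, (i, i, i))
    append!(J, (i-1, i, i+1))
    append!(vals, (-1/h2, 2/h2, -1/h2))
  end
  A = sparse(I, J, vals, N+1, N+1)
  A[1,1] = 1; A[end,end] = 1; A[2,1] = 0; A[end-1,end] = 0
  return A
end

A = buildMat1D(100)
B = Matrix(A)        # dense copy
C = Tridiagonal(B)   # keeps only the three central diagonals

storedSparse  = nnz(A)                                       # entries held in CSC format
storedDense   = length(B)                                    # all (N+1)^2 entries
storedTridiag = length(C.dl) + length(C.d) + length(C.du)    # three diagonals
```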

In [52]:
methods(\, (Tridiagonal{Float64, Vector{Float64}}, Vector{Float64}))

In [46]:
edit("/Applications/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/LinearAlgebra/src/tridiag.jl")