# EE4375-2022: Third Lab Session: Solve 2D Poisson Using Ruge-Stueben and Smoothed Aggregation AMG

## Import Packages  

In [1]:
using Kronecker 

using LinearAlgebra 
using SparseArrays 

using IterativeSolvers
using Preconditioners
using AlgebraicMultigrid

using BenchmarkTools
using Profile
using ProfileView

## Section 1: Build Linear System as Sparse From the Start
Motivation 
1. Build the matrix as sparse directly, i.e., avoid conversion from a dense to a sparse matrix; 

To do 
1. incorporate boundary conditions in 2D matrix properly; 
2. illustrate how the lazy Kronecker format saves memory. Still open: how to save CPU time as well, and whether `size` continues to work for the lazy format. 
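
A minimal sketch of why a lazy Kronecker sum can save memory, using only the standard library rather than Kronecker.jl. It assumes the usual definition A ⊕ B = A ⊗ I + I ⊗ B and the column-major identity (C ⊗ D) vec(X) = vec(D X Cᵀ). The helper name `kronsum_matvec` and the small test matrices are hypothetical and only serve as illustration.

```julia
using LinearAlgebra, SparseArrays

# Hypothetical helper: apply A ⊕ B = A ⊗ I + I ⊗ B to x without forming the big matrix.
function kronsum_matvec(A, B, x)
    n = size(A, 1); m = size(B, 1)
    X = reshape(x, m, n)                   # column-major, so x == vec(X)
    return vec(X * transpose(A) + B * X)   # (A ⊗ I) x = vec(X Aᵀ), (I ⊗ B) x = vec(B X)
end

# Small non-symmetric factors, so that the transpose in the identity is exercised
A = sparse([1.0 2 0 0; 0 3 4 0; 0 0 5 6; 7 0 0 8])
B = sparse([1.0 0 2; 3 4 0; 0 5 6])
n = size(A, 1); m = size(B, 1)

# Explicit Kronecker sum, built here for comparison only
K = kron(A, sparse(1.0I, m, m)) + kron(sparse(1.0I, n, n), B)

x = collect(1.0:n*m)
y_lazy = kronsum_matvec(A, B, x)
y_full = K * x
println(y_lazy ≈ y_full)
```

The matrix-free product only touches the two small factors, whereas a collected dense matrix holds (n·m)² entries.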

### Section 1.1: Build Coefficient Matrix 

In [2]:
function buildMat1D(N)
  Nm1 = N-1; Np1 = N+1 
  h = 1/N; h2 = h*h; 
  stencil = [-1/h2, 2/h2, -1/h2]; 
  #..Allocate row, column and value vector 
  I = zeros(Int64,3*Nm1) # allocate 1D array of Int64 
  J = zeros(Int64,3*Nm1) # allocate 1D array of Int64 
  vals = zeros(3*Nm1)
  #..Construct row, column and value vector 
  for i in 2:N
    offset = 3*(i-2)
    I[[offset+1, offset+2, offset+3]] = [i,i,i]
    J[[offset+1, offset+2, offset+3]] = [i-1,i,i+1]
    vals[[offset+1, offset+2, offset+3]] = stencil
  end 
  #..Build matrix for interior rows   
  A = sparse(I,J,vals,Np1,Np1)
  #..Build matrix for boundary rows
  A[1,1] = 1; A[end,end]=1; A[2,1] =0; A[end-1,end]=0; 
  return A 
end 

function buildMat2D(N)
    A1d = buildMat1D(N)
    A2d = KroneckerSum(A1d, A1d) # using lazy evaluation 
    return A2d 
end 

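function buildMat2DAnis(N,epsilon)
    A1d = buildMat1D(N)   # (N+1) x (N+1) 1D operator
    #..Build N x N sparse identity matrix 
    I   = range(1, N); J = range(1, N); vals = ones(Float64, N);
    D   = sparse(I,J,vals,N,N)
    #..Anisotropic operator: epsilon scales the second Kronecker term 
    A2d = kronecker(A1d, D) + epsilon * kronecker(D,A1d)  
    return A2d
end 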

buildMat2DAnis (generic function with 1 method)
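
As an alternative to filling the row, column and value vectors in a loop, the same 1D matrix can be sketched with `spdiagm` from SparseArrays. This is an illustration under the assumption that N ≥ 2, and the name `buildMat1DDiag` is hypothetical. The boundary rows are identity rows and the couplings to the boundary nodes are zero, as in `buildMat1D` above.

```julia
using LinearAlgebra, SparseArrays

# Hypothetical variant of buildMat1D based on spdiagm (assumes N >= 2)
function buildMat1DDiag(N)
    h2 = (1/N)^2
    #..Main diagonal: 1 on the two boundary rows, 2/h2 on the interior rows
    d = [1.0; fill(2/h2, N-1); 1.0]
    #..Off-diagonals: zero where an entry couples to a boundary node
    off = [0.0; fill(-1/h2, N-2); 0.0]
    A = spdiagm(-1 => off, 0 => d, 1 => off)
    return dropzeros!(A)   # remove the explicitly stored zeros
end

A = buildMat1DDiag(4)
println(Matrix(A))
```

For N = 4 the interior diagonal is 2/h² = 32 and the interior off-diagonals are -16, which agrees with the entries printed later in this notebook for `buildMat2DAnis(4,1)`.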

In [3]:
A2d = buildMat2D(30)
A2d = buildMat2DAnis(30,0.0001);
# A2dCollected = collect(A2d)
#varinfo()

In [4]:
A = sparse(collect(A2d))
p = AMGPreconditioner{RugeStuben}(A);
# p = AMGPreconditioner{SmoothedAggregation}(A);
println(p)

AMGPreconditioner{RugeStuben, AlgebraicMultigrid.MultiLevel{AlgebraicMultigrid.Pinv{Float64}, GaussSeidel{SymmetricSweep}, GaussSeidel{SymmetricSweep}, SparseMatrixCSC{Float64, Int64}, Adjoint{Float64, SparseMatrixCSC{Float64, Int64}}, SparseMatrixCSC{Float64, Int64}, AlgebraicMultigrid.MultiLevelWorkspace{Vector{Float64}, 1}}, AlgebraicMultigrid.V}(Multilevel Solver
-----------------
Operator Complexity: 2.45
Grid Complexity: 1.937
No. of Levels: 7
Coarse Solver: Pinv
Level     Unknowns     NonZeros
-----     --------     --------
    1          930         4290 [40.81%]
    2          463         3307 [31.46%]
    3          230         1822 [17.33%]
    4          110          826 [ 7.86%]
    5           46          202 [ 1.92%]
    6           16           48 [ 0.46%]
    7            6           16 [ 0.15%]
, AlgebraicMultigrid.V())
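
The complexities printed above can be reproduced from the level table. The sketch below assumes the usual definitions: operator complexity is the total number of nonzeros over all levels divided by the nonzeros on the finest level, and grid complexity is the same ratio for the unknowns. The numbers are copied from the table, and the variable names are illustrative only.

```julia
# Level sizes copied from the hierarchy printed above
unknowns = [930, 463, 230, 110, 46, 16, 6]
nonzeros = [4290, 3307, 1822, 826, 202, 48, 16]

operator_complexity = sum(nonzeros) / nonzeros[1]   # 10511 / 4290
grid_complexity     = sum(unknowns) / unknowns[1]   # 1801 / 930
level_share         = 100 .* nonzeros ./ sum(nonzeros)   # percentage column

println(round(operator_complexity, digits=3))   # 2.45
println(round(grid_complexity, digits=3))       # 1.937
```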


### Section 1.2: Construction of the Right-Hand Side Vector

In [5]:
A = collect(A2d)
N,_ = size(A)
x = ones(N); 
f = A2d*x;
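
Choosing x as the vector of ones and setting f = A*x gives a right-hand side with a known exact solution, so a solver can be checked against it. A standard-library sketch of that check follows. It builds a small isotropic 2D matrix with `kron` and an (N+1) x (N+1) identity, which is an assumption made for this illustration and differs from the N x N identity used in `buildMat2DAnis`.

```julia
using LinearAlgebra, SparseArrays

N = 8; h2 = (1/N)^2
#..1D operator with identity boundary rows (same structure as buildMat1D)
off = [0.0; fill(-1/h2, N-2); 0.0]
A1d = spdiagm(-1 => off, 0 => [1.0; fill(2/h2, N-1); 1.0], 1 => off)
#..2D operator as an explicit sparse Kronecker sum
Id = sparse(1.0I, N+1, N+1)
A  = kron(A1d, Id) + kron(Id, A1d)

#..Manufactured right-hand side: the exact solution is the vector of ones
x_exact = ones(size(A, 1))
f = A * x_exact

#..Solve and compare with the known solution
x = A \ f
relerr = norm(x - x_exact) / norm(x_exact)
println(relerr)
```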

## Section 2: Intermediate Stuff 

In [6]:
@code_warntype buildMat2D(5)

MethodInstance for buildMat2D(::Int64)
  from buildMat2D(N) in Main at In[2]:23
Arguments
  #self#::Core.Const(buildMat2D)
  N::Int64
Locals
  A2d::KroneckerSum{Float64, SparseMatrixCSC{Float64, Int64}, SparseMatrixCSC{Float64, Int64}}
  A1d::SparseMatrixCSC{Float64, Int64}
Body::KroneckerSum{Float64, SparseMatrixCSC{Float64, Int64}, SparseMatrixCSC{Float64, Int64}}
1 ─     (A1d = Main.buildMat1D(N))
│       (A2d = Main.KroneckerSum(A1d, A1d))
└──     return A2d



In [7]:
A = buildMat2D(100);
A = buildMat2DAnis(4,1);
println(A)

[2.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0; 0.0 33.0 -16.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0; 0.0 -16.0 33.0 -16.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0; 0.0 0.0 -16.0 33.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0; 0.0 0.0 0.0 0.0 33.0 0.0 0.0 0.0 -16.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0; 0.0 0.0 0.0 0.0 0.0 33.0 0.0 0.0 0.0 -16.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0; 0.0 0.0 0.0 0.0 0.0 0.0 64.0 -16.0 0.0 0.0 -16.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0; 0.0 0.0 0.0 0.0 0.0 0.0 -16.0 64.0 -16.0 0.0 0.0 -16.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0; 0.0 0.0 0.0 0.0 -16.0 0.0 0.0 -16.0 64.0 0.0 0.0 0.0 -16.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0; 0.0 0.0 0.0 0.0 0.0 -16.0 0.0 0.0 0.0 33.0 0.0 0.0 0.0 -16.0 0.0 0.0 0.0 0.0 0.0 0.0; 0.0 0.0 0.0 0.0 0.0 0.0 -16.0 0.0 0.0 0.0 33.0 0.0 0.0 0.0 -16.0 0.0 0.0 0.0 0.0 0.0; 0.0 0.0 0.0 0.0 0.0 0.0 0.0 -16.0 0.0 0.0 0.0 64.0 -16.0 0.0 0

In [8]:
typeof(A)

Matrix{Float64} (alias for Array{Float64, 2})

In [13]:
typeof(sparse(collect(A)))

SparseMatrixCSC{Float64, Int64}

In [9]:
@which A*b

LoadError: UndefVarError: b not defined

In [10]:
Np1,_ = size(A)
b = ones(Np1)
L = tril(A,-1); U = triu(A,1); D = Diagonal(A)
@btime A*b
@btime L*b 
@btime L*b + U*b 
@btime L*b + U*b + D*b 
@btime L*b + U*b + L*b 

  254.534 ns (1 allocation: 224 bytes)
  253.916 ns (1 allocation: 224 bytes)
  557.659 ns (3 allocations: 672 bytes)
  615.370 ns (4 allocations: 896 bytes)
  799.202 ns (4 allocations: 896 bytes)


20-element Vector{Float64}:
   0.0
 -16.0
 -48.0
 -32.0
 -16.0
 -16.0
 -32.0
 -64.0
 -80.0
 -48.0
 -48.0
 -64.0
 -80.0
 -64.0
 -32.0
 -32.0
 -16.0
 -48.0
 -32.0
   0.0
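
The cell above splits A into its strictly lower part L, its diagonal D and its strictly upper part U. A small sketch of that splitting on a hypothetical 3 x 3 matrix shows that L*b + D*b + U*b reproduces A*b. The last timed expression above adds L*b twice, so its printed vector is not A*b.

```julia
using LinearAlgebra, SparseArrays

# Small illustrative matrix (not taken from the notebook)
A = sparse([4.0 -1 0; -1 4 -1; 0 -1 4])
b = ones(3)

#..Splitting A = L + D + U
L = tril(A, -1); U = triu(A, 1); D = Diagonal(A)

y_full  = A * b
y_split = L*b + D*b + U*b
println(y_split ≈ y_full)
```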

In [11]:
# issparse(D)

## Section 3: Linear System Solve with the Laplacian

In [24]:
A2d = buildMat2D(30)
A = sparse(collect(A2d))
p = AMGPreconditioner{RugeStuben}(A);
println(p)
Np1,_ = size(A)
f = ones(Np1)
# @benchmark A\f 
# print(bm1)
#bm2 = @benchmark cg(A, f, Pl=p, log=true)
#print(bm2)

AMGPreconditioner{RugeStuben, AlgebraicMultigrid.MultiLevel{AlgebraicMultigrid.Pinv{Float64}, GaussSeidel{SymmetricSweep}, GaussSeidel{SymmetricSweep}, SparseMatrixCSC{Float64, Int64}, Adjoint{Float64, SparseMatrixCSC{Float64, Int64}}, SparseMatrixCSC{Float64, Int64}, AlgebraicMultigrid.MultiLevelWorkspace{Vector{Float64}, 1}}, AlgebraicMultigrid.V}(Multilevel Solver
-----------------
Operator Complexity: 2.122
Grid Complexity: 1.694
No. of Levels: 6
Coarse Solver: Pinv
Level     Unknowns     NonZeros
-----     --------     --------
    1          961         4433 [47.12%]
    2          478         3720 [39.54%]
    3          135          977 [10.38%]
    4           39          227 [ 2.41%]
    5           12           46 [ 0.49%]
    6            3            5 [ 0.05%]
, AlgebraicMultigrid.V())
Trial(386.042 μs)

In [25]:
@benchmark A\f

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  385.625 μs …   1.956 ms  ┊ GC (min … max): 0.00% … 41.22%
 Time  (median):     394.083 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   406.320 μs ± 122.446 μs  ┊ GC (mean ± σ):  1.49% ±  3.84%

In [26]:
cg(A, f, Pl=p, log=true)

([0.49999999999569716, 0.01484988212760718, 0.028605153013143488, 0.041281096290838044, 0.0528917963423019, 0.06345015394505359, 0.07296790060775629, 0.0814556116045568, 0.08892271772509272, 0.09537751575445769  …  0.09537751575322337, 0.0889227177238336, 0.08145561160315551, 0.07296790060636599, 0.0634501539436287, 0.05289179634098615, 0.04128109628968657, 0.028605153012334805, 0.014849882127077012, 0.49999999999569716], Converged after 6 iterations.)
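
To make visible what the `cg` call iterates on, here is a plain, unpreconditioned conjugate gradient sketch using only the standard library. It is not the IterativeSolvers implementation and it omits the AMG preconditioner `Pl=p`. The name `cg_sketch`, the tolerance and the 1D test matrix are assumptions made for illustration.

```julia
using LinearAlgebra, SparseArrays

# Hypothetical helper: unpreconditioned CG for a symmetric positive definite A
function cg_sketch(A, f; tol=1e-8, maxiter=10*length(f))
    x = zeros(length(f))
    r = copy(f)                  # residual f - A*x for the initial guess x = 0
    p = copy(r)                  # first search direction
    rho = dot(r, r)
    rho == 0 && return x, 0      # zero right-hand side: x = 0 is the solution
    f_norm = norm(f)
    for k in 1:maxiter
        q = A * p
        alpha = rho / dot(p, q)  # step length along p
        x .+= alpha .* p
        r .-= alpha .* q
        rho_new = dot(r, r)
        sqrt(rho_new) <= tol * f_norm && return x, k
        p .= r .+ (rho_new / rho) .* p   # next A-conjugate direction
        rho = rho_new
    end
    return x, maxiter
end

#..Test problem: 1D Laplacian stencil [-1, 2, -1] on 50 unknowns
n = 50
A = spdiagm(-1 => fill(-1.0, n-1), 0 => fill(2.0, n), 1 => fill(-1.0, n-1))
f = ones(n)
x, iters = cg_sketch(A, f)
println(norm(A*x - f) / norm(f))
```

A preconditioned variant would apply the preconditioner to the residual before forming the search direction, which is the role the AMG V-cycle plays in the call above.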

In [27]:
@benchmark cg(A, f, Pl=p, log=true)

BenchmarkTools.Trial: 9366 samples with 1 evaluation.
 Range (min … max):  524.334 μs …  4.942 ms  ┊ GC (min … max): 0.00% … 89.08%
 Time  (median):     528.750 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   532.675 μs ± 96.063 μs  ┊ GC (mean ± σ):  0.45% ±  2.21%