#### Test Pipeline
This Julia notebook checks the correctness of the C++ implementation by comparing its output with the original Julia code.

The notebook proceeds as follows: 
- Generate a synthetic instance
- Run the original implementation of the algorithm (pure Julia)
- Run the C++ implementation of the algorithm (through the R package `sPCAmPC`)

#### Step 1: Data Generation

Generates a ground-truth covariance matrix $S = I + \beta x_1 x_1^\top + \beta x_2 x_2^\top$, where
- $x_1, x_2$ are unit-norm and $k$-sparse, with non-overlapping supports
- $\beta$ controls the signal-to-noise ratio

Then samples $n$ multivariate normal observations from $\mathcal{N}(0,S)$ and constructs the empirical covariance matrix $\Sigma$ (`Sn` in the code).

In [1]:
using Random, LinearAlgebra
p = 10 #Dimension
r = 2 #Number of sparse PCs
k = 4 #Sparsity of each PC
β = 1 #Signal strength


x1 = zeros(p); x1[1:k] = sign.(rand(k) .- .5)
x2 = zeros(p); x2[(k+1):(k+k)] = sign.(rand(k) .- .5)

# x1[(2*k_nonoverlapping+1):(2*k_nonoverlapping+k_overlap)] = sign.(rand(k_overlap) .- .5)
# x2[(2*k_nonoverlapping+1):(2*k_nonoverlapping+k_overlap_half)] = -x1[(2*k_nonoverlapping+1):(2*k_nonoverlapping+k_overlap_half)]
# x2[(2*k_nonoverlapping+k_overlap_half+1):(2*k_nonoverlapping+k_overlap)] = x1[(2*k_nonoverlapping+k_overlap_half+1):(2*k_nonoverlapping+k_overlap)]

@assert sum(abs.(x1) .> 0) == k
@assert sum(abs.(x2) .> 0) == k
@assert abs(dot(x1,x2)) ≤ 1e-10

shufflecoords = randperm(p)
x1 = x1[shufflecoords]; x2=x2[shufflecoords] 

x1 /= norm(x1); x2 /= norm(x2) 

S = β*x1*x1'+β*x2*x2'+ Matrix(1.0*I, p, p)
S = (S + S')/2

10×10 Matrix{Float64}:
  1.25   0.0    0.0    0.0   0.0   0.25   0.0   0.0  -0.25   0.25
  0.0    1.25  -0.25   0.25  0.0   0.0    0.25  0.0   0.0    0.0
  0.0   -0.25   1.25  -0.25  0.0   0.0   -0.25  0.0   0.0    0.0
  0.0    0.25  -0.25   1.25  0.0   0.0    0.25  0.0   0.0    0.0
  0.0    0.0    0.0    0.0   1.0   0.0    0.0   0.0   0.0    0.0
  0.25   0.0    0.0    0.0   0.0   1.25   0.0   0.0  -0.25   0.25
  0.0    0.25  -0.25   0.25  0.0   0.0    1.25  0.0   0.0    0.0
  0.0    0.0    0.0    0.0   0.0   0.0    0.0   1.0   0.0    0.0
 -0.25   0.0    0.0    0.0   0.0  -0.25   0.0   0.0   1.25  -0.25
  0.25   0.0    0.0    0.0   0.0   0.25   0.0   0.0  -0.25   1.25
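
As a sanity check on the construction: when $x_1$ and $x_2$ are unit-norm and orthogonal, the spiked form implies that $S$ has eigenvalue $1+\beta$ along each of them and eigenvalue $1$ on the rest of the space. The sketch below verifies this on a deterministic instance. The fixed supports (`1:k` and `k+1:2k`, all entries positive, no shuffling) are assumptions made for reproducibility and do not match the random instance above.

```julia
using LinearAlgebra

p, k, β = 10, 4, 1.0

# Two unit-norm, k-sparse vectors with disjoint supports (fixed, not random).
x1 = zeros(p); x1[1:k] .= 1 / sqrt(k)
x2 = zeros(p); x2[k+1:2k] .= 1 / sqrt(k)

# Spiked covariance; Symmetric tells eigvals to use the symmetric solver,
# which returns real eigenvalues in ascending order.
S_check = Symmetric(Matrix(1.0I, p, p) + β * x1 * x1' + β * x2 * x2')
λ = eigvals(S_check)

@assert λ[end] ≈ 1 + β          # spike along x1 or x2
@assert λ[end-1] ≈ 1 + β        # the other spike
@assert all(λ[1:end-2] .≈ 1.0)  # remaining p - 2 directions
```

With $\beta = 1$ the two spikes are only twice as large as the noise eigenvalues, so the instance is not trivially easy once sampling noise is added.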

In [2]:
n = 1000 #Sample size

using Distributions
d = MvNormal(zeros(p), S)
X = rand(d, n) # p-by-n matrix of observations (one column per sample)

Sn = cov(X') #Sample covariance matrix

10×10 Matrix{Float64}:
  1.2882       -0.0103172  …   0.0257327  -0.296523     0.240164
 -0.0103172     1.27416       -0.0228875   0.0476198   -0.0311994
 -0.000765166  -0.286993       0.0424141   0.00607313  -0.0153745
 -0.0356709     0.209423      -0.0335702   0.0152126   -0.0421924
  0.0398914    -0.0738074      0.0349229   0.00422163   0.0113545
  0.240236     -0.0335114  …   0.034226   -0.280671     0.25605
  0.013439      0.272195       0.0193981   0.0102632    0.0433183
  0.0257327    -0.0228875      1.05778     0.0246143   -0.0069118
 -0.296523      0.0476198      0.0246143   1.26297     -0.275914
  0.240164     -0.0311994     -0.0069118  -0.275914     1.21442
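
Because `X` stores one observation per column, it is transposed before being passed to `cov`, which by default treats rows as observations. The following sketch shows what that call computes, using a small standard-normal sample instead of the `MvNormal` draw above; the dimensions, the seed, and the name `Sn_manual` are illustrative assumptions.

```julia
using Statistics, LinearAlgebra, Random

Random.seed!(1)          # fixed seed so the sketch is reproducible
p_demo, n_demo = 3, 500
X_demo = randn(p_demo, n_demo)   # p-by-n, one observation per column

# Center each coordinate (row), then form the unbiased estimator.
Xc = X_demo .- mean(X_demo, dims=2)
Sn_manual = (Xc * Xc') / (n_demo - 1)

# Built-in equivalent: rows of X_demo' are the observations.
Sn_builtin = cov(X_demo')

@assert isapprox(Sn_manual, Sn_builtin; atol=1e-12)
```

The notebook itself does not fix a random seed, so the instance and `Sn` change between runs; calling `Random.seed!` before the data generation would make the printed matrices reproducible.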

In [3]:
show(stdout, "text/plain", Sn)

10×10 Matrix{Float64}:
  1.2882       -0.0103172  -0.000765166  -0.0356709   0.0398914    0.240236     0.013439     0.0257327  -0.296523     0.240164
 -0.0103172     1.27416    -0.286993      0.209423   -0.0738074   -0.0335114    0.272195    -0.0228875   0.0476198   -0.0311994
 -0.000765166  -0.286993    1.20937      -0.171858    0.0661205    0.052722    -0.249636     0.0424141   0.00607313  -0.0153745
 -0.0356709     0.209423   -0.171858      1.22118    -0.065833    -0.0467649    0.269919    -0.0335702   0.0152126   -0.0421924
  0.0398914    -0.0738074   0.0661205    -0.065833    0.986653     0.0580241    0.00448651   0.0349229   0.00422163   0.0113545
  0.240236     -0.0335114   0.052722     -0.0467649   0.0580241    1.25667      0.00631185   0.034226   -0.280671     0.25605
  0.013439      0.272195   -0.249636      0.269919    0.00448651   0.00631185   1.25078      0.0193981   0.0102632    0.0433183
  0.0257327    -0.0228875   0.0424141    -0.0335702   0.0349229    0.034226     0.01

In [4]:
[x1 x2]

10×2 Matrix{Float64}:
 -0.5   0.0
  0.0  -0.5
  0.0   0.5
  0.0  -0.5
  0.0   0.0
 -0.5   0.0
  0.0  -0.5
  0.0   0.0
  0.5   0.0
 -0.5   0.0

In [5]:
[k, k]

2-element Vector{Int64}:
 4
 4

#### Step 2: Julia benchmark

Applies the Julia code to $S$ (true covariance matrix) and $\Sigma$ (empirical covariance matrix, `Sn`).

In [6]:
include("algorithm2.jl")

findmultPCs_deflation (generic function with 1 method)

In [7]:
ofv_best, violation_best, runtime, x_best = findmultPCs_deflation(S, r, [k,k]; numIters = 20, verbose = true, violation_tolerance = 1e-4 )

x_best

---- Iterative deflation algorithm for sparse PCA with multiple PCs ---
Dimension: 10
Number of PCs: 2
Sparsity pattern: [4, 4]

  Iteration |      Objective value |   Orthogonality Violation |       Time 
          1 |                0.333 |                  2.00e+00 |      1.853 
          2 |                0.333 |                  1.00e-07 |      2.293 

10×2 Matrix{Float64}:
  0.5   0.0
  0.0  -0.5
  0.0   0.5
  0.0  -0.5
  0.0   0.0
  0.5   0.0
  0.0  -0.5
  0.0   0.0
 -0.5   0.0
  0.5   0.0
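
The "Objective value" column is not defined in this notebook. One reading, which is an assumption and not taken from `algorithm2.jl`, is the fraction of total variance explained, $\operatorname{tr}(X^\top S X)/\operatorname{tr}(S)$. For the ground-truth instance this gives $4/12 \approx 0.333$, in line with the printed value. The sketch below recomputes that quantity, together with a simple orthogonality measure, from the components printed above. Both helper names are hypothetical, and the orthogonality measure need not match the one the algorithm reports.

```julia
using LinearAlgebra

# Assumed metric: share of tr(S) captured by the columns of X.
explained_fraction(S, X) = tr(X' * S * X) / tr(S)

# Hypothetical measure: largest entrywise deviation of X'X from the identity.
orthogonality_gap(X) = maximum(abs.(X' * X - I))

# Components as printed by the Julia run on S (signs are arbitrary).
x1_hat = [0.5, 0.0, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0, -0.5, 0.5]
x2_hat = [0.0, -0.5, 0.5, -0.5, 0.0, 0.0, -0.5, 0.0, 0.0, 0.0]
X_hat = [x1_hat x2_hat]

# Ground-truth covariance rebuilt from the same supports, with β = 1.
β_demo = 1.0
S_demo = Matrix(1.0I, 10, 10) + β_demo * x1_hat * x1_hat' + β_demo * x2_hat * x2_hat'

frac = explained_fraction(S_demo, X_hat)
gap = orthogonality_gap(X_hat)
```

Flipping the sign of a column leaves both quantities unchanged, which is why the recovered components can differ in sign from `[x1 x2]` without affecting the objective.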

In [8]:
ofv_best, violation_best, runtime, x_best = findmultPCs_deflation(Sn, r, [k,k]; numIters = 20, verbose = true, violation_tolerance = 1e-4 )

x_best

---- Iterative deflation algorithm for sparse PCA with multiple PCs ---
Dimension: 10
Number of PCs: 2
Sparsity pattern: [4, 4]

  Iteration |      Objective value |   Orthogonality Violation |       Time 
          1 |                0.341 |                  2.00e+00 |      0.025 
          2 |                0.341 |                  2.00e+00 |      0.061 
          3 |                0.341 |                  2.00e+00 |      0.088 
          4 |                0.341 |                  2.00e+00 |      0.127 
          5 |                0.335 |                  1.00e-07 |      0.167 
          6 |                0.335 |                  1.00e-07 |      0.195 
          7 |                0.335 |                  1.00e-07 |      0.232 
          8 |                0.335 |                  1.00e-07 |      0.261 

10×2 Matrix{Float64}:
  0.0        0.508031
 -0.538193   0.0
  0.480423   0.0
 -0.450682   0.0
  0.0        0.0
  0.0        0.493069
 -0.525764   0.0
  0.0        0.0
  0.0       -0.526066
  0.0        0.471213

In [None]:
show(stdout, "text/plain", x_best)

#### Step 3: R/C++ implementation

Applies the C++ implementation, called through the R package `sPCAmPC`, to $S$ (true covariance matrix) and $\Sigma$ (empirical covariance matrix, `Sn`).

In [9]:
using RCall

In [10]:
R"""library(sPCAmPC)"""
# R"""
# library(devtools)
# reload(pkg = "sPCAmPC", quiet = FALSE)"""


RObject{StrSxp}
[1] "sPCAmPC"   "stats"     "graphics"  "grDevices" "utils"     "datasets" 
[7] "methods"   "base"     


In [11]:
R"""

TestMat <- $S 

cpp_findmultPCs_deflation(TestMat, 2, c(4, 4), numIters=20)
"""

---- Iterative deflation algorithm for sparse PCA with multiple PCs ---
Dimension: 10
Number of PCs: 2
Sparsity pattern:  4 4

  Iteration |      Objective value |   Orthogonality Violation |       Time
          1 |                0.292 |                  2.00e+00 |      0.114
          2 |                0.292 |                  1.00e-07 |      0.262


RObject{RealSxp}
           [,1]       [,2]
 [1,] 0.5773503  0.0000000
 [2,] 0.0000000  0.5773571
 [3,] 0.0000000 -0.5773367
 [4,] 0.0000000  0.5773570
 [5,] 0.0000000  0.0000000
 [6,] 0.5773503  0.0000000
 [7,] 0.0000000  0.0000000
 [8,] 0.0000000  0.0000000
 [9,] 0.0000000  0.0000000
[10,] 0.5773502  0.0000000


In [12]:
R"""

TestMat <- $Sn 

cpp_findmultPCs_deflation(TestMat, 2, c(4, 4), numIters=20)
"""

---- Iterative deflation algorithm for sparse PCA with multiple PCs ---
Dimension: 10
Number of PCs: 2
Sparsity pattern:  4 4

  Iteration |      Objective value |   Orthogonality Violation |       Time
          1 |                0.341 |                  2.00e+00 |      0.112
          2 |                0.341 |                  2.00e+00 |      0.212
          3 |                0.341 |                  2.00e+00 |      0.323
          4 |                0.341 |                  2.00e+00 |      0.423
          5 |                0.335 |                  1.00e-07 |      0.525
          6 |                0.335 |                  1.00e-07 |      0.626
          7 |                0.335 |                  1.00e-07 |      0.725
          8 |                0.335 |                  1.00e-07 |      0.826

RObject{RealSxp}
            [,1]       [,2]
 [1,]  0.0000000  0.5075273
 [2,] -0.5364319  0.0000000
 [3,]  0.4759456  0.0000000
 [4,] -0.4487386  0.0000000
 [5,]  0.0000000  0.0000000
 [6,]  0.0000000  0.4911845
 [7,] -0.5332450  0.0000000
 [8,]  0.0000000  0.0000000
 [9,]  0.0000000 -0.5294960
[10,]  0.0000000  0.4698806


In [13]:
[x1 x2]

10×2 Matrix{Float64}:
 -0.5   0.0
  0.0  -0.5
  0.0   0.5
  0.0  -0.5
  0.0   0.0
 -0.5   0.0
  0.0  -0.5
  0.0   0.0
  0.5   0.0
 -0.5   0.0
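
Comparing the two implementations by eye is error-prone, since principal components are only identified up to sign. A minimal sketch of an automated comparison follows. The helper `max_gap_up_to_sign` is hypothetical, it assumes both matrices list the components in the same column order, and the numbers are transcribed from the Julia and C++ runs on `Sn` printed above.

```julia
using LinearAlgebra

# Hypothetical helper: largest entrywise gap between A and B after flipping
# each column of B to the sign that best matches the same column of A.
function max_gap_up_to_sign(A::AbstractMatrix, B::AbstractMatrix)
    size(A) == size(B) || throw(DimensionMismatch("A and B must have the same size"))
    gap = 0.0
    for j in axes(A, 2)
        s = dot(A[:, j], B[:, j]) < 0 ? -1.0 : 1.0
        gap = max(gap, maximum(abs.(A[:, j] .- s .* B[:, j])))
    end
    return gap
end

# Julia result on Sn (transcribed from the output above).
x_julia = [
     0.0       0.508031
    -0.538193  0.0
     0.480423  0.0
    -0.450682  0.0
     0.0       0.0
     0.0       0.493069
    -0.525764  0.0
     0.0       0.0
     0.0      -0.526066
     0.0       0.471213
]

# C++ result on Sn (transcribed from the output above).
x_cpp = [
     0.0        0.5075273
    -0.5364319  0.0
     0.4759456  0.0
    -0.4487386  0.0
     0.0        0.0
     0.0        0.4911845
    -0.5332450  0.0
     0.0        0.0
     0.0       -0.5294960
     0.0        0.4698806
]

gap_sn = max_gap_up_to_sign(x_julia, x_cpp)
same_support = (x_julia .!= 0) == (x_cpp .!= 0)
```

Checking the support separately from the entry values is useful here: the support is the combinatorial part of the solution, while small numerical differences in the nonzero entries may simply reflect different stopping behaviour. Any pass/fail tolerance on `gap_sn` is a judgment call.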