#### Test Pipeline
This Julia notebook checks the correctness of the C++ code by comparing it with the original Julia implementation on a synthetic instance.

The notebook proceeds as follows: 
- Generate a synthetic instance
- Run the original implementation of the algorithm (pure Julia)
- Run the C++ implementation of the algorithm

#### Step 1: Data Generation

Generates a ground-truth covariance matrix $S = I + \beta x_1 x_1^\top + \beta x_2 x_2^\top$, where
- $x_1, x_2$ are unit-norm and $k$-sparse, with non-overlapping supports
- $\beta$ controls the signal-to-noise ratio

Then samples $n$ multivariate normal observations from $\mathcal{N}(0,S)$ and constructs the empirical covariance matrix $\Sigma$ (`Sn` in the code).

In [12]:
using Random, LinearAlgebra
Random.seed!(1234)

p = 10 #Dimension
r = 2 #Number of sparse PCs
k = 4 #Sparsity of each PC
β = 1 #Signal strength


x1 = zeros(p); x1[1:k] = sign.(rand(k) .- .5)
x2 = zeros(p); x2[(k+1):(k+k)] = sign.(rand(k) .- .5)

# x1[(2*k_nonoverlapping+1):(2*k_nonoverlapping+k_overlap)] = sign.(rand(k_overlap) .- .5)
# x2[(2*k_nonoverlapping+1):(2*k_nonoverlapping+k_overlap_half)] = -x1[(2*k_nonoverlapping+1):(2*k_nonoverlapping+k_overlap_half)]
# x2[(2*k_nonoverlapping+k_overlap_half+1):(2*k_nonoverlapping+k_overlap)] = x1[(2*k_nonoverlapping+k_overlap_half+1):(2*k_nonoverlapping+k_overlap)]

@assert sum(abs.(x1) .> 0) == k
@assert sum(abs.(x2) .> 0) == k
@assert abs(dot(x1,x2)) ≤ 1e-10

shufflecoords = randperm(p)
x1 = x1[shufflecoords]; x2=x2[shufflecoords] 

x1 /= norm(x1); x2 /= norm(x2) 

S = β*x1*x1'+β*x2*x2'+ Matrix(1.0*I, p, p)
S = (S + S')/2

10×10 Matrix{Float64}:
  1.25   0.25  0.0  0.0   0.0    0.0    0.0   -0.25  -0.25   0.0
  0.25   1.25  0.0  0.0   0.0    0.0    0.0   -0.25  -0.25   0.0
  0.0    0.0   1.0  0.0   0.0    0.0    0.0    0.0    0.0    0.0
  0.0    0.0   0.0  1.0   0.0    0.0    0.0    0.0    0.0    0.0
  0.0    0.0   0.0  0.0   1.25   0.25  -0.25   0.0    0.0   -0.25
  0.0    0.0   0.0  0.0   0.25   1.25  -0.25   0.0    0.0   -0.25
  0.0    0.0   0.0  0.0  -0.25  -0.25   1.25   0.0    0.0    0.25
 -0.25  -0.25  0.0  0.0   0.0    0.0    0.0    1.25   0.25   0.0
 -0.25  -0.25  0.0  0.0   0.0    0.0    0.0    0.25   1.25   0.0
  0.0    0.0   0.0  0.0  -0.25  -0.25   0.25   0.0    0.0    1.25
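
As a sanity check on the construction above, the following minimal sketch verifies the eigenstructure of a spiked covariance of this form: $x_1$ and $x_2$ are eigenvectors with eigenvalue $1+\beta$, and every other eigenvalue equals 1. The vectors used here are illustrative (fixed signs, no shuffling), not the ones drawn in the cell above.

```julia
using LinearAlgebra

# Toy spiked covariance mirroring the construction above (illustrative values).
p, k, β = 10, 4, 1.0
x1 = zeros(p); x1[1:k] .= 1 / sqrt(k)        # unit norm, support 1:k
x2 = zeros(p); x2[k+1:2k] .= 1 / sqrt(k)     # unit norm, disjoint support k+1:2k
S = I + β * x1 * x1' + β * x2 * x2'

# x1 and x2 are eigenvectors with eigenvalue 1 + β.
@assert S * x1 ≈ (1 + β) * x1
@assert S * x2 ≈ (1 + β) * x2

# The remaining p - 2 eigenvalues are all equal to 1.
λ = eigvals(Symmetric(S))                    # sorted in increasing order
@assert λ[end-1:end] ≈ [1 + β, 1 + β]
@assert all(isapprox.(λ[1:end-2], 1.0; atol = 1e-10))

# Diagonal entries on the support equal 1 + β/k (1.25 for β = 1, k = 4).
@assert S[1, 1] ≈ 1 + β / k
```

With $\beta = 1$ and $k = 4$ this gives diagonal entries of 1.25 on the support, which agrees with the matrix printed above.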

In [13]:
n = 1000 #Sample size

Random.seed!(1234)

using Distributions
d = MvNormal(zeros(p), S)
X = rand(d, n) #p by n matrix of observations (one observation per column)

Sn = cov(X') #Sample covariance matrix

10×10 Matrix{Float64}:
  1.19918      0.272751    -0.0275862   …  -0.255738   -0.238637    0.0827532
  0.272751     1.2777      -0.072279       -0.34593    -0.325634   -0.0808126
 -0.0275862   -0.072279     1.02957         0.0472149   0.0213239   0.0220759
  0.0227836    0.00966214   0.00435918     -0.0416068  -0.0176989  -0.0304168
 -0.0218518    0.0632956   -0.00482757     -0.0352677   0.0490125  -0.252779
  0.00376276   0.0444074   -0.0216567   …  -0.0519505  -0.0310439  -0.15713
 -0.00259936  -0.0351958   -0.0289323      -0.0595101  -0.0786064   0.267666
 -0.255738    -0.34593      0.0472149       1.34151     0.286398   -0.0679424
 -0.238637    -0.325634     0.0213239       0.286398    1.28691    -0.0201723
  0.0827532   -0.0808126    0.0220759      -0.0679424  -0.0201723   1.32953
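
The transpose in `cov(X')` matters because `cov` treats rows as observations by default, whereas `X` stores one observation per column. A small sketch on illustrative standard-normal data (not the `MvNormal` sample above) shows that this call matches the usual unbiased estimator computed by hand:

```julia
using Random, Statistics, LinearAlgebra

Random.seed!(1234)
p, n = 3, 500
X = randn(p, n)                    # p × n, one observation per column

Sn = cov(X')                       # rows of X' are observations
Xc = X .- mean(X, dims = 2)        # center each coordinate
Sn_manual = Xc * Xc' / (n - 1)     # unbiased sample covariance

@assert size(Sn) == (p, p)
@assert Sn ≈ Sn_manual
@assert cov(X, dims = 2) ≈ Sn      # equivalent call without the transpose
```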

In [14]:
# show(stdout, "text/plain", Sn)

In [15]:
# [x1 x2]

In [16]:
# [k, k]

#### Step 2: Julia benchmark

Applies the Julia code to $S$ (true covariance matrix) and $\Sigma$ (empirical covariance matrix)

In [17]:
include("algorithm2.jl")

findmultPCs_deflation (generic function with 1 method)

In [18]:
ofv_best, violation_best, runtime, x_best = findmultPCs_deflation(Sn, r, [k,k]; numIters = 10, verbose = true, violation_tolerance = 1e-4 )

x_best

---- Iterative deflation algorithm for sparse PCA with multiple PCs ---
Dimension: 10
Number of PCs: 2
Sparsity pattern: [4, 4]

  Iteration |      Objective value |   Orthogonality Violation |       Time 
lambda_partial: 2.1486541170596754
lambda_partial: 2.148654117059676
ofv_overall: 4.297308234119351
          1 |                0.351 |                  2.00e+00 |      0.193 
lambda_partial: 2.1056810347184816
lambda_partial: 2.1056810347184816
ofv_overall: 4.297308234119351
          2 |                0.351 |                  2.00e+00 |      0.209 
lambda_partial: 2.059513889570907
lambda_partial: 2.059513889570907
ofv_overall: 4.297308234119351
          3 |                0.351 |                  2.00e+00 |      0.233 
lambda_partial: 2.0133467444233317
lambda_partial: 2.0133467444233317
ofv_overall: 4.29730823411935
          4 |                0.351 |                  2.00e+00 |      0.248 
lambda_partial: 1.9683030430823858
lambda_partial: 2.1486541170596754
ofv_overall: 4.116957160142061
          5 |                0.336 |                  3.33e-16 |      0.270 
λ0: 884589.728110177
lambda_partial: 884591.6963236693
λ0: 884589.9085508048
lambda_partial: 884592.0571700307
ofv_overall: 4.1168327179259325
          6 |                0.336 |                  2.22e-16 |      0.286 
λ0: 1.7691546850068106e6
lambda_partial: 1.7691566532206854e6
λ0: 1.7691548654121626e6
lambda_partial: 1.7691570140313827e6
ofv_overall: 4.116833094476751
          7 |                0.336 |                  3.33e-16 |      0.307 
λ0: 2.653719722776313e6
lambda_partial: 2.6537216909901844e6
λ0: 2.653719903181661e6
lambda_partial: 2.653722051800879e6
ofv_overall: 4.116833090299198
          8 |                0.336 |                  1.11e-16 |      0.329 


10×2 Matrix{Float64}:
  0.0        0.421568
  0.0        0.529844
  0.0        0.0
  0.0        0.0
  0.458548   0.0
  0.468789   0.0
 -0.536703   0.0
  0.0       -0.543248
  0.0       -0.496414
 -0.530962   0.0
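
The loadings printed above can be checked directly for sparsity, unit norm, and orthogonality. The sketch below copies the values from that output. The violation measure used here (sum of absolute off-diagonal entries of the Gram matrix) is an assumption made for illustration; the definition used inside `algorithm2.jl` is not shown in this notebook.

```julia
using LinearAlgebra

# Loadings copied from the output above.
x_best = [ 0.0        0.421568
           0.0        0.529844
           0.0        0.0
           0.0        0.0
           0.458548   0.0
           0.468789   0.0
          -0.536703   0.0
           0.0       -0.543248
           0.0       -0.496414
          -0.530962   0.0 ]

# Assumed violation measure: sum of |x_i' x_j| over i ≠ j.
offdiag_violation(X) = (G = X' * X; sum(abs.(G - Diagonal(diag(G)))))

nnz_per_pc = vec(sum(x_best .!= 0, dims = 1))      # nonzeros in each column
norms = [norm(c) for c in eachcol(x_best)]

@assert nnz_per_pc == [4, 4]                       # sparsity pattern [k, k]
@assert all(isapprox.(norms, 1.0; atol = 1e-4))    # unit norm up to print rounding
@assert offdiag_violation(x_best) < 1e-12          # disjoint supports

# Two identical unit-norm columns give a violation of about 2 under this measure.
dup = [x_best[:, 1] x_best[:, 1]]
@assert isapprox(offdiag_violation(dup), 2.0; atol = 1e-4)
```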

In [15]:
[x1 x2]

10×2 Matrix{Float64}:
 -0.5   0.0
 -0.5   0.0
  0.0  -0.5
 -0.5   0.0
  0.0   0.0
  0.0  -0.5
  0.0   0.5
  0.0  -0.5
  0.5   0.0
  0.0   0.0

In [9]:
# show(stdout, "text/plain", x_best)

#### Step 3: R/C++ implementation

Applies the R/C++ code (the `sPCAmPC` R package) to $S$ (true covariance matrix) and $\Sigma$ (empirical covariance matrix)

In [7]:
using RCall

R"""library(sPCAmPC)"""
# R"""
# library(devtools)
# reload(pkg = "sPCAmPC", quiet = FALSE)"""


RObject{StrSxp}
[1] "sPCAmPC"   "stats"     "graphics"  "grDevices" "utils"     "datasets" 
[7] "methods"   "base"     


In [8]:
R"""

TestMat <- $Sn

results <- cpp_findmultPCs_deflation(TestMat, 2, c(4, 4), numIters=10)
"""

---- Iterative deflation algorithm for sparse PCA with multiple PCs ---
Dimension: 10
Number of PCs: 2
Sparsity pattern:  4 4

  Iteration |      Objective value |   Orthogonality Violation |       Time
Direct: 1.6313
Via dot: 1.6313
Direct: 1.87427
Via dot: 1.87427
[... several hundred further `Direct:` / `Via dot:` debug pairs, truncated by the notebook; the two values are identical in every complete pair ...]

RObject{VecSxp}
$objective_value
[1] 3.727657

$orthogonality_violation
[1] 2.220446e-16

$runtime
[1] 0.109

$x_best
            [,1]      [,2]
 [1,]  0.4546087 0.0000000
 [2,]  0.0000000 0.4817843
 [3,]  0.0000000 0.5178144
 [4,] -0.4913661 0.0000000
 [5,]  0.0000000 0.0000000
 [6,]  0.0000000 0.0000000
 [7,]  0.0000000 0.5211327
 [8,]  0.0000000 0.4776744
 [9,] -0.5394452 0.0000000
[10,] -0.5107731 0.0000000



In [11]:
x_best

10×2 Matrix{Float64}:
 -0.456511   0.0
  0.513042   0.0
 -0.543893   0.0
  0.0        0.449626
  0.0        0.458088
  0.0        0.0
  0.0        0.569211
  0.482251   0.0
  0.0       -0.513801
  0.0        0.0

In [13]:
[x1 x2]

10×2 Matrix{Float64}:
 -0.5   0.0
  0.0  -0.5
  0.0   0.5
  0.0  -0.5
  0.0   0.0
 -0.5   0.0
  0.0  -0.5
  0.0   0.0
  0.5   0.0
 -0.5   0.0
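
When comparing loadings from two implementations, or against the ground truth, note that sparse PCs are only identified up to the sign of each column and the order of the columns. The helper below is a hypothetical utility (it is not part of `algorithm2.jl` or `sPCAmPC`) that sketches such a comparison with a simple greedy matching, which is adequate when the columns are clearly distinct.

```julia
using LinearAlgebra

# Hypothetical helper: true if the columns of A match the columns of B
# up to reordering and sign flips, within absolute tolerance atol.
function same_up_to_sign_perm(A::AbstractMatrix, B::AbstractMatrix; atol = 1e-6)
    size(A) == size(B) || return false
    unused = collect(1:size(B, 2))          # columns of B not yet matched
    for a in eachcol(A)
        pos = findfirst(unused) do j
            isapprox(a, B[:, j]; atol = atol) || isapprox(a, -B[:, j]; atol = atol)
        end
        pos === nothing && return false
        deleteat!(unused, pos)              # each column of B is used at most once
    end
    return true
end

# Toy example: B is A with its columns swapped and one sign flipped.
A = [0.6 0.0; 0.8 0.0; 0.0 1.0]
B = [0.0 -0.6; 0.0 -0.8; 1.0 0.0]
C = [0.6 0.0; 0.0 0.8; 0.8 0.6]             # different supports

@assert same_up_to_sign_perm(A, B)
@assert !same_up_to_sign_perm(A, C)
```

In this notebook it could be called on the Julia `x_best` and on the `x_best` component of the R result once that has been copied back into Julia.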