#### Test Pipeline
This Julia notebook checks the correctness of the C++ code against the original pure-Julia implementation.

The notebook proceeds as follows: 
- Generate a synthetic instance
- Run the original implementation of the algorithm (pure Julia)
- Run the C++ implementation of the algorithm

#### Step 1: Data Generation

Generates a ground-truth covariance matrix $S = I + \beta x_1 x_1^\top + \beta x_2 x_2^\top$, where 
- $x_1, x_2$ are $k$-sparse unit vectors with non-overlapping supports
- $\beta$ controls the signal-to-noise ratio

Then samples $n$ multivariate normal observations from $\mathcal{N}(0,S)$ and constructs the empirical covariance matrix $\Sigma$.

In [1]:
using Random, LinearAlgebra
p = 10 # Dimension
r = 2  # Number of sparse PCs
k = 4  # Sparsity of each PC
β = 1  # Signal strength


x1 = zeros(p); x1[1:k] = sign.(rand(k) .- .5)
x2 = zeros(p); x2[(k+1):(k+k)] = sign.(rand(k) .- .5)

# x1[(2*k_nonoverlapping+1):(2*k_nonoverlapping+k_overlap)] = sign.(rand(k_overlap) .- .5)
# x2[(2*k_nonoverlapping+1):(2*k_nonoverlapping+k_overlap_half)] = -x1[(2*k_nonoverlapping+1):(2*k_nonoverlapping+k_overlap_half)]
# x2[(2*k_nonoverlapping+k_overlap_half+1):(2*k_nonoverlapping+k_overlap)] = x1[(2*k_nonoverlapping+k_overlap_half+1):(2*k_nonoverlapping+k_overlap)]

@assert sum(abs.(x1) .> 0) == k
@assert sum(abs.(x2) .> 0) == k
@assert abs(dot(x1,x2)) ≤ 1e-10

shufflecoords = randperm(p)
x1 = x1[shufflecoords]; x2=x2[shufflecoords] 

x1 /= norm(x1); x2 /= norm(x2) 

S = β*x1*x1'+β*x2*x2'+ Matrix(1.0*I, p, p)
S = (S + S')/2

10×10 Matrix{Float64}:
  1.25  0.0  -0.25   0.0   0.0   0.0    0.0    0.25  -0.25   0.0
  0.0   1.0   0.0    0.0   0.0   0.0    0.0    0.0    0.0    0.0
 -0.25  0.0   1.25   0.0   0.0   0.0    0.0   -0.25   0.25   0.0
  0.0   0.0   0.0    1.25  0.0   0.25  -0.25   0.0    0.0    0.25
  0.0   0.0   0.0    0.0   1.0   0.0    0.0    0.0    0.0    0.0
  0.0   0.0   0.0    0.25  0.0   1.25  -0.25   0.0    0.0    0.25
  0.0   0.0   0.0   -0.25  0.0  -0.25   1.25   0.0    0.0   -0.25
  0.25  0.0  -0.25   0.0   0.0   0.0    0.0    1.25  -0.25   0.0
 -0.25  0.0   0.25   0.0   0.0   0.0    0.0   -0.25   1.25   0.0
  0.0   0.0   0.0    0.25  0.0   0.25  -0.25   0.0    0.0    1.25
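
One way to sanity-check this construction: because $x_1$ and $x_2$ are orthonormal, $S$ should have the eigenvalue $1+\beta$ with multiplicity 2 and the eigenvalue $1$ with multiplicity $p-2$. The sketch below rebuilds a small instance with fixed, hypothetical signs (no random draw, no shuffling) and checks that spectrum. It is an illustration and not part of the original notebook.

```julia
using LinearAlgebra

# Hypothetical instance mirroring the construction above (fixed signs, no shuffling).
p, k, β = 10, 4, 1.0
x1 = zeros(p); x1[1:k] .= [1, -1, 1, 1]
x2 = zeros(p); x2[(k+1):(2k)] .= [-1, 1, 1, -1]
x1 /= norm(x1); x2 /= norm(x2)

S_demo = Symmetric(I + β * x1 * x1' + β * x2 * x2')

λ = eigvals(S_demo)   # real eigenvalues, sorted in ascending order

# Two "spiked" eigenvalues equal to 1 + β, the remaining p - 2 equal to 1.
spikes_ok = all(isapprox.(λ[end-1:end], 1 + β; atol = 1e-10))
bulk_ok   = all(isapprox.(λ[1:end-2], 1.0; atol = 1e-10))
println("spikes ok: ", spikes_ok, ", bulk ok: ", bulk_ok)
```

The same check applies to the shuffled `S` built above, since permuting coordinates does not change the eigenvalues.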

In [2]:
n = 1000 #Sample size

using Distributions
d = MvNormal(zeros(p), S)
X = rand(d, n) # p × n matrix of observations (one observation per column)

Σ = cov(X') #Sample covariance matrix

10×10 Matrix{Float64}:
  1.21098      0.0198762   -0.270174    …  -0.196772    0.017612
  0.0198762    0.963283     0.0230383      -0.0322775   0.0500873
 -0.270174     0.0230383    1.24186         0.238108   -0.0158213
 -0.0496477    0.05742      0.00808103     -0.0172331   0.208627
  0.0143871    0.0532761    0.028962       -0.0296409   0.0487803
  0.00265853  -0.0118599    0.0193225   …   0.0328639   0.198726
  0.0213643    0.00662886  -0.0373458       0.0059465  -0.212316
  0.264522    -0.0200359   -0.179285       -0.210893   -0.000571205
 -0.196772    -0.0322775    0.238108        1.21893    -0.0292043
  0.017612     0.0500873   -0.0158213      -0.0292043   1.22934
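
A quick way to see how far an empirical covariance matrix is from the ground truth is to look at the operator norm of the difference. The sketch below is self-contained and avoids the `Distributions` dependency by sampling through a Cholesky factor, which is a swapped-in technique and not what the cell above does. The covariance `S_demo`, the seed, and all `_demo` names are hypothetical.

```julia
using Random, LinearAlgebra, Statistics

Random.seed!(1)   # arbitrary seed, for reproducibility only
p, n = 10, 1000

# Hypothetical positive-definite covariance (identity plus one correlated pair).
S_demo = Matrix(1.0I, p, p)
S_demo[1, 2] = S_demo[2, 1] = 0.25

# If z ~ N(0, I), then L*z ~ N(0, L*L') = N(0, S_demo).
L = cholesky(Symmetric(S_demo)).L
X_demo = L * randn(p, n)        # p × n, one observation per column

Σ_demo = cov(X_demo')           # cov treats rows as observations, hence the transpose
err = opnorm(Σ_demo - S_demo)   # spectral-norm estimation error
println("‖Σ - S‖₂ = ", err)
```

The error shrinks as `n` grows relative to `p`, so increasing `n` is a simple way to separate sampling noise from implementation differences.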

In [14]:
show(stdout, "text/plain", Σ)

10×10 Matrix{Float64}:
  1.2885      -0.00166515   0.240096    0.202755    -0.0204167    0.0052062   0.0185172   0.0123348   -0.0180624  -0.276094
 -0.00166515   1.19835      0.031256    0.0559613    0.00384667  -0.24765    -0.0404634  -0.203201    -0.29201    -0.0353311
  0.240096     0.031256     1.28936     0.302539     0.0133862   -0.036967   -0.0191132   0.0341      -0.0332284  -0.315479
  0.202755     0.0559613    0.302539    1.26655      0.00563072   0.0309291  -0.0665495  -0.022145     0.0364086  -0.265132
 -0.0204167    0.00384667   0.0133862   0.00563072   1.04526      0.061717   -0.0174228  -0.0210354   -0.0648273  -0.0207834
  0.0052062   -0.24765     -0.036967    0.0309291    0.061717     1.3015      0.0775003   0.256675     0.217528    0.0277423
  0.0185172   -0.0404634   -0.0191132  -0.0665495   -0.0174228    0.0775003   1.03382     0.055277     0.0194558   0.0386565
  0.0123348   -0.203201     0.0341     -0.022145    -0.0210354    0.256675    0.055277    1.26787      0.

In [3]:
[x1 x2]

10×2 Matrix{Float64}:
  0.0   0.5
  0.0   0.0
  0.0  -0.5
  0.5   0.0
  0.0   0.0
  0.5   0.0
 -0.5   0.0
  0.0   0.5
  0.0  -0.5
  0.5   0.0

In [4]:
[k, k]

2-element Vector{Int64}:
 4
 4

#### Step 2: Julia benchmark

Applies the Julia code to $S$ (true covariance matrix) and $\Sigma$ (empirical covariance matrix)

In [5]:
include("algorithm2.jl")

findmultPCs_deflation (generic function with 1 method)

In [6]:
ofv_best, violation_best, runtime, x_best = findmultPCs_deflation(Σ, r, [k,k]; numIters = 20, verbose = true, violation_tolerance = 1e-4 )

x_best

---- Iterative deflation algorithm for sparse PCA with multiple PCs ---
Dimension: 10
Number of PCs: 2
Sparsity pattern: [4, 4]

  Iteration |      Objective value |   Orthogonality Violation |       Time 
          1 |                0.324 |                  2.00e+00 |      1.120 
          2 |                0.324 |                  2.00e+00 |      1.432 
          3 |                0.324 |                  2.00e+00 |      1.447 
          4 |                0.320 |                  1.00e-07 |      1.471 
          5 |                0.320 |                  1.00e-07 |      1.486 
          6 |                0.320 |                  1.00e-07 |      1.509 
          7 |                0.320 |                  1.00e-07 |      1.531 


10×2 Matrix{Float64}:
  0.0        0.521685
  0.0        0.0
  0.0       -0.515621
  0.539966   0.0
  0.0        0.0
  0.451028   0.0
 -0.508223   0.0
  0.0        0.484551
  0.0       -0.476644
  0.496709   0.0

In [19]:
show(stdout, "text/plain", x_best)

10×2 Matrix{Float64}:
  0.0        0.461411
  0.469387   0.0
  0.0        0.532221
  0.0        0.48601
  0.0        0.0
 -0.508664   0.0
  0.0        0.0
 -0.487371   0.0
 -0.532359   0.0
  0.0       -0.517335
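
The returned components can be checked for feasibility independently of the solver log. The sketch below copies the first `x_best` printed above (rounded to six digits) and verifies the sparsity budget, the unit norms, and orthogonality. The `violation` measure used here, the sum of absolute off-diagonal entries of the Gram matrix, is an assumption for illustration and may differ from the quantity reported as "Orthogonality Violation" in the log.

```julia
using LinearAlgebra

# First x_best printed above, copied by hand (six-digit rounding).
x_est = [ 0.0        0.521685
          0.0        0.0
          0.0       -0.515621
          0.539966   0.0
          0.0        0.0
          0.451028   0.0
         -0.508223   0.0
          0.0        0.484551
          0.0       -0.476644
          0.496709   0.0 ]
ks = [4, 4]   # sparsity budget per component

# Number of nonzeros in each column must not exceed its budget.
nnz_per_col = [count(!iszero, col) for col in eachcol(x_est)]
sparsity_ok = all(nnz_per_col .<= ks)

# Columns should have unit norm, up to the rounding of the printed values.
norms_ok = all(isapprox.(norm.(eachcol(x_est)), 1.0; atol = 1e-4))

# Assumed orthogonality measure: off-diagonal mass of the Gram matrix.
G = x_est' * x_est
violation = sum(abs, G - Diagonal(G))

println((sparsity_ok = sparsity_ok, norms_ok = norms_ok, violation = violation))
```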

#### Step 3: R/C++ implementation

Applies the R code to $S$ (true covariance matrix) and $\Sigma$ (empirical covariance matrix)

In [7]:
using RCall

In [11]:
R"""
install.packages('devtools', repos='http://cran.us.r-project.org', dependencies=TRUE)
library(devtools)
"""


The downloaded binary packages are in
	/var/folders/tw/x4vcf7js2pdbgpn_tmnglgj40000gp/T//RtmpvkmsCb/downloaded_packages


│ Content type 'application/x-gzip' length 430872 bytes (420 KB)
│ downloaded 420 KB
│ 
└ @ RCall /Users/jeanpauphilet/.julia/packages/RCall/aK5sD/src/io.jl:172


RObject{StrSxp}
[1] "devtools"  "usethis"   "stats"     "graphics"  "grDevices" "utils"    
[7] "datasets"  "methods"   "base"     


In [14]:
R"""install_github('jeanpauphilet/sPCAmPC/R', auth_token ="XXX")"""

InterruptException: InterruptException:

In [15]:
R"""library(sPCAmPC)"""

│   running command ''/usr/bin/git' ls-remote git://github.com/jeanpauphilet/sPCAmPC/R  2>/dev/null' had status 128 and error message 'Interrupted system call'
│ Error: Failed to install 'unknown package' from Git:
│   Command failed (128)
└ @ RCall /Users/jeanpauphilet/.julia/packages/RCall/aK5sD/src/io.jl:172


RObject{StrSxp}
 [1] "sPCAmPC"   "devtools"  "usethis"   "stats"     "graphics"  "grDevices"
 [7] "utils"     "datasets"  "methods"   "base"     


In [22]:
R"""

TestMat <- $S 

TestKS <- matrix( c(4, 4), nrow = 1, ncol = 2, byrow = TRUE)

cpp_findmultPCs_deflation(TestMat, 2, TestKS, numIters=20)
"""

---- Iterative deflation algorithm for sparse PCA with multiple PCs ---
Dimension: 10
Number of PCs: 2
Sparsity pattern:  4 4

  Iteration |      Objective value |   Orthogonality Violation |       Time
          1 |                 0.25 |                  2.00e+00 |      0.078
          2 |                0.292 |                  1.00e-07 |      0.155
          3 |                0.292 |                  1.00e-07 |      0.241


RObject{RealSxp}
            [,1]       [,2]
 [1,]  0.5773503  0.0000000
 [2,]  0.0000000  0.0000000
 [3,] -0.5773503  0.0000000
 [4,]  0.0000000  0.5773542
 [5,]  0.0000000  0.0000000
 [6,]  0.0000000  0.0000000
 [7,]  0.0000000 -0.5773541
 [8,]  0.5773502  0.0000000
 [9,]  0.0000000  0.0000000
[10,]  0.0000000  0.5773426


In [20]:
x_best

10×2 Matrix{Float64}:
  0.0        0.521685
  0.0        0.0
  0.0       -0.515621
  0.539966   0.0
  0.0        0.0
  0.451028   0.0
 -0.508223   0.0
  0.0        0.484551
  0.0       -0.476644
  0.496709   0.0

In [19]:
[x1 x2]

10×2 Matrix{Float64}:
  0.0   0.5
  0.0   0.0
  0.0  -0.5
  0.5   0.0
  0.0   0.0
  0.5   0.0
 -0.5   0.0
  0.0   0.5
  0.0  -0.5
  0.5   0.0
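
Comparing these matrices by eye is error-prone because principal components are only identified up to sign and column order. The sketch below defines two hypothetical helpers, `supports` and `alignment`, that compare an estimate to the ground truth in a way that ignores both. The matrices are copied by hand from the outputs printed above. Note that the Julia estimate was computed from $\Sigma$ while the R/C++ call above was given $S$, so the two estimates are not expected to match digit for digit.

```julia
using LinearAlgebra

# Set of column supports (indices of nonzeros), insensitive to column order.
supports(X) = Set(findall(!iszero, col) for col in eachcol(X))

# For each true component, the largest |inner product| with any estimated one.
# Equals 1 for a perfect match up to sign and column permutation.
alignment(X_true, X_est) = [maximum(abs.(X_est' * xt)) for xt in eachcol(X_true)]

# Ground truth [x1 x2], Julia estimate, and R/C++ estimate, copied from above.
x_true = [0.0 0.5; 0.0 0.0; 0.0 -0.5; 0.5 0.0; 0.0 0.0;
          0.5 0.0; -0.5 0.0; 0.0 0.5; 0.0 -0.5; 0.5 0.0]
x_julia = [0.0 0.521685; 0.0 0.0; 0.0 -0.515621; 0.539966 0.0; 0.0 0.0;
           0.451028 0.0; -0.508223 0.0; 0.0 0.484551; 0.0 -0.476644; 0.496709 0.0]
x_cpp = [0.5773503 0.0; 0.0 0.0; -0.5773503 0.0; 0.0 0.5773542; 0.0 0.0;
         0.0 0.0; 0.0 -0.5773541; 0.5773502 0.0; 0.0 0.0; 0.0 0.5773426]

for (name, X) in (("Julia", x_julia), ("R/C++", x_cpp))
    println(name, ": same supports = ", supports(X) == supports(x_true),
            ", alignment = ", round.(alignment(x_true, X); digits = 4))
end
```

A support mismatch or an alignment well below 1 points at the entries to inspect first.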