# Compare pure CCD, pure PCA-CCD, and hybrids of them

Coordinate descent is NOT guaranteed to converge to a global solution, even when the objective is convex (this can happen when the objective is non-smooth or the constraints couple the coordinates). Example:

![Image Test](CCD_counterexample.png)
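
A minimal sketch of this failure mode, using a toy objective that is an assumption of this example and is not taken from the figure: f(x, y) = |x − y| + 0.1(x² + y²) is convex with its global minimum at the origin, yet exact coordinate-wise minimization started at (1, 1) cannot move, while a joint step along the diagonal can. The helper names (`argmin_on_grid`, `ccd_sweep`) are hypothetical.

```julia
# Toy convex, non-smooth objective (illustrative assumption, not the figure's example)
f(x, y) = abs(x - y) + 0.1 * (x^2 + y^2)

# Crude 1-D minimizer over a grid of candidate values; good enough for a sketch
function argmin_on_grid(g; lo=-2.0, hi=2.0, n=4001)
    ts = range(lo, hi, length=n)
    return ts[argmin(g.(ts))]
end

# One full sweep of cyclic coordinate descent: minimize over x, then over y
function ccd_sweep(x, y)
    x = argmin_on_grid(t -> f(t, y))
    y = argmin_on_grid(t -> f(x, t))
    return x, y
end

# Starting at (1, 1), each coordinate is already optimal given the other,
# so the sweep leaves the point where it is
x, y = ccd_sweep(1.0, 1.0)

# Moving both coordinates together along the direction (1, 1) escapes
s = argmin_on_grid(t -> f(x + t, y + t))
x_new, y_new = x + s, y + s   # approximately the global minimum (0, 0)
```

The point (1, 1) is a fixed point of coordinate descent even though f(1, 1) = 0.2 is well above the global minimum of 0, which is why an update that changes several coordinates at once can help.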

Fortunately, Zihuai's PCA idea perturbs more than one coordinate at a time, which helps CCD move out of local minima. In my experiments:

+ Cyclic coordinate descent (CCD): converges quickly, but can get stuck at a local minimum
+ PCA: converges more slowly, but is less likely to get stuck at a local minimum

A practical strategy is to combine them. For example:
+ Run 10 PCA iterations to prime the CCD algorithm, then run 5 CCD iterations to converge quickly to a local minimum, then run PCA again to escape it, and keep alternating.
+ Convergence is declared only when both PCA and CCD have converged.
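
The alternating schedule can be sketched on a toy problem. This is a sketch under stated assumptions, not the Knockoffs.jl implementation: the objective, the grid line search, and the names `line_search`, `sweep`, `hybrid`, `coord_dirs`, and `joint_dirs` are all hypothetical, and two fixed diagonal directions stand in for the principal-component directions.

```julia
# Toy convex, non-smooth objective (illustrative assumption)
f(v) = abs(v[1] - v[2]) + 0.1 * (v[1]^2 + v[2]^2)

# Crude line search: try a grid of step sizes along direction d, keep the best
function line_search(f, v, d; lo=-2.0, hi=2.0, n=4001)
    ts = range(lo, hi, length=n)
    vals = [f(v .+ t .* d) for t in ts]
    return v .+ ts[argmin(vals)] .* d
end

# One sweep = one line search along each direction in turn
function sweep(f, v, directions)
    for d in directions
        v = line_search(f, v, d)
    end
    return v
end

# Alternate n_joint multi-coordinate sweeps with n_ccd coordinate sweeps
function hybrid(f, v, coord_dirs, joint_dirs; n_joint=10, n_ccd=5, outer_iter=100, tol=1e-8)
    for _ in 1:outer_iter
        before = f(v)
        for _ in 1:n_joint
            v = sweep(f, v, joint_dirs)
        end
        after_joint = f(v)
        for _ in 1:n_ccd
            v = sweep(f, v, coord_dirs)
        end
        after_ccd = f(v)
        # stop only when neither phase improved the objective
        if before - after_joint < tol && after_joint - after_ccd < tol
            break
        end
    end
    return v
end

coord_dirs = [[1.0, 0.0], [0.0, 1.0]]                          # coordinate axes
joint_dirs = [[1.0, 1.0] ./ sqrt(2), [1.0, -1.0] ./ sqrt(2)]   # stand-in for PCA directions

v_ccd = sweep(f, [1.0, 1.0], coord_dirs)                  # pure CCD: stuck at (1, 1)
v_hybrid = hybrid(f, [1.0, 1.0], coord_dirs, joint_dirs)  # hybrid: reaches the origin up to grid resolution
```

Requiring both phases to stall before stopping matters here: checking only the CCD phase would declare convergence at (1, 1) on the very first sweep.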

In [1]:
using Revise
using Knockoffs
using Random
using GLMNet
using Distributions
using LinearAlgebra
using ToeplitzMatrices
using StatsBase
using CSV, DataFrames
using DelimitedFiles
using Plots
gr(fmt=:png);

function TP(correct_groups, signif_groups)
    return length(signif_groups ∩ correct_groups) / max(1, length(correct_groups))
end
function FDR(correct_groups, signif_groups)
    FP = length(signif_groups) - length(signif_groups ∩ correct_groups) # number of false positives
    return FP / max(1, length(signif_groups))
end

function get_sigma(option::Int, p::Int)
    # note: groups are defined empirically within each simulation
    datadir = "/Users/biona001/Benjamin_Folder/research/4th_project_PRS/group_knockoff_test_data"
    if option == 1
        ρ = 0.7
        Σ = SymmetricToeplitz(ρ.^(0:(p-1))) |> Matrix
    elseif option == 2
        ρ = 0.7
        γ = 0.25
        groups = repeat(1:Int(p/5), inner=5)
        Σ = simulate_block_covariance(groups, ρ, γ)
    elseif option == 3
        covfile = CSV.read(joinpath(datadir, "CorG_2_127374341_128034347.txt"), DataFrame) # 3782 SNPs
        Σ = covfile |> Matrix{Float64}
        Σ = 0.99Σ + 0.01I #ensure PSD
    elseif option == 4
        df = CSV.read(joinpath(datadir, "21_37870779_38711704.csv"), DataFrame)
        Σ = df[:, 7:end] |> Matrix |> Symmetric |> Matrix
    elseif option == 5
        df = CSV.read(joinpath(datadir, "22_17674295_18295575.csv"), DataFrame)
        Σ = df[:, 7:end] |> Matrix |> Symmetric |> Matrix
    else
        error("Option should be 1-5 but was $option")
    end
    return Σ[1:p, 1:p]
end

sigma_option = 4
p = 1000
Σ = get_sigma(sigma_option, p)
top_dir = "/Users/biona001/Desktop/group_knockoff_simulations/gnomad_chr21"

┌ Info: Precompiling Knockoffs [878bf26d-0c49-448a-9df5-b057c815d613]
└ @ Base loading.jl:1423


"/Users/biona001/Desktop/group_knockoff_simulations/gnomad_chr21"
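
To make the two metric helpers defined in the cell above concrete, here is a small self-contained check on made-up group labels (the labels are purely illustrative and do not come from the simulation). `TP` is the fraction of truly causal groups that were selected, i.e. power, and `FDR` is the fraction of selected groups that are not causal; for a single run this is strictly the false discovery proportion.

```julia
# Same definitions as in the notebook cell, repeated so the snippet runs on its own
function TP(correct_groups, signif_groups)
    return length(signif_groups ∩ correct_groups) / max(1, length(correct_groups))
end
function FDR(correct_groups, signif_groups)
    FP = length(signif_groups) - length(signif_groups ∩ correct_groups) # number of false positives
    return FP / max(1, length(signif_groups))
end

correct  = [15, 4, 125]       # hypothetical truly causal groups
selected = [15, 125, 7, 300]  # hypothetical discoveries

power = TP(correct, selected)    # 2 of the 3 causal groups were found
fdp   = FDR(correct, selected)   # 2 of the 4 discoveries are false
```

The `max(1, ...)` guards avoid dividing by zero when nothing is selected or nothing is causal; in those cases both helpers return 0.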

## Example data: gnomAD  

This is one simulation in a region where coordinate descent converges to a local (but not global) minimum, while the hybrid approach ultimately arrives at the best optimum.

In [2]:
# import some data

which = method = :maxent
seed = 9
group_def = "hc"
rep_def = :id
force_contiguous = false
n = 10000
y_dist = "normal"
feature_importance_method = "marginal"
sigma_option = 4
p = 1000
m = 5   # number of knockoffs per variable
k = 10  # number of causal variables

function get_signif_groups(β, groups)
    correct_groups = Int[]
    for i in findall(!iszero, β)
        g = groups[i]
        g ∈ correct_groups || push!(correct_groups, g)
    end
    return correct_groups
end

outdir = joinpath(top_dir, "sim$(seed)")
yfile = joinpath(outdir, "y_$(y_dist).txt")
Xfile = joinpath(outdir, "X.txt")
βfile = joinpath(outdir, "beta.txt")
group_name = force_contiguous ? "$(group_def)_groups_contig" : "$(group_def)_groups"
gfile = joinpath(outdir, group_name, "groups.txt")

βtrue = readdlm(βfile) |> vec
y = readdlm(yfile) |> vec
X = readdlm(Xfile)
groups = readdlm(gfile, Int) |> vec
correct_groups = get_signif_groups(βtrue, groups)

7-element Vector{Int64}:
  15
   4
 125
  24
  89
 212
 259

# ME group knockoffs

### CCD only

In [40]:
μ = zeros(size(Σ, 1))
Random.seed!(seed)
@time me_ccd = modelX_gaussian_group_knockoffs(X, :maxent, groups, μ, Σ, 
    m=5, verbose=true, inner_pca_iter=0, inner_ccd_iter=1, outer_iter=100);

└ @ Knockoffs /Users/biona001/.julia/dev/Knockoffs/src/group.jl:275


Maxent initial obj = -53174.763728468555
Iter 1 (CCD): obj = -49091.53577899868, δ = 0.5422509289614127, t1 = 2.23, t2 = 13.09, t3 = 0.02
Iter 2 (CCD): obj = -46928.59723953059, δ = 0.5203499680150293, t1 = 4.16, t2 = 25.87, t3 = 0.04
Iter 3 (CCD): obj = -45352.19736376025, δ = 0.16185048824703102, t1 = 5.98, t2 = 38.5, t3 = 0.06
Iter 4 (CCD): obj = -44091.93570188365, δ = 0.22595708111271595, t1 = 7.84, t2 = 51.03, t3 = 0.08
Iter 5 (CCD): obj = -42969.25381508038, δ = 0.34345352419463976, t1 = 9.91, t2 = 64.04, t3 = 0.12
Iter 6 (CCD): obj = -41862.84992890996, δ = 0.1966446154888424, t1 = 12.24, t2 = 76.68, t3 = 0.15
Iter 7 (CCD): obj = -40688.13705501341, δ = 0.5749590459061092, t1 = 14.88, t2 = 89.32, t3 = 0.46
Iter 8 (CCD): obj = -39723.94458631194, δ = 0.3201430276120457, t1 = 17.85, t2 = 101.89, t3 = 0.47
Iter 9 (CCD): obj = -38843.38717916202, δ = 0.2901213611279898, t1 = 21.1, t2 = 114.75, t3 = 0.49
Iter 10 (CCD): obj = -37869.02930288915, δ = 0.299606112059142, t1 = 24.53, t2 

### PCA only

In [41]:
@time me_pca = modelX_gaussian_group_knockoffs(X, :maxent, groups, μ, Σ, 
    m=5, verbose=true, inner_pca_iter=1, inner_ccd_iter=0, outer_iter=100);

└ @ Knockoffs /Users/biona001/.julia/dev/Knockoffs/src/group.jl:275


Maxent initial obj = -53174.763728468555
Iter 1 (PCA): obj = -48186.89944729032, δ = 0.6139366928460879, t1 = 1.65, t2 = 0.18
Iter 2 (PCA): obj = -43598.392599229235, δ = 0.6320233160969265, t1 = 3.32, t2 = 0.36
Iter 3 (PCA): obj = -40575.18138832505, δ = 0.596232402437525, t1 = 5.03, t2 = 0.54
Iter 4 (PCA): obj = -38037.83649863352, δ = 0.6939543708648634, t1 = 6.77, t2 = 0.72
Iter 5 (PCA): obj = -36235.76041179674, δ = 0.4878572844878825, t1 = 8.33, t2 = 0.9
Iter 6 (PCA): obj = -35087.31362589038, δ = 0.35526941737134404, t1 = 9.99, t2 = 1.08
Iter 7 (PCA): obj = -33986.133346233044, δ = 0.5913511001369454, t1 = 11.61, t2 = 1.26
Iter 8 (PCA): obj = -33051.557397785575, δ = 0.3393544125421494, t1 = 13.24, t2 = 1.44
Iter 9 (PCA): obj = -32110.97469189481, δ = 0.3598276329599284, t1 = 14.89, t2 = 1.62
Iter 10 (PCA): obj = -31519.097917306743, δ = 0.17883080926488387, t1 = 16.56, t2 = 1.79
Iter 11 (PCA): obj = -31133.244241325083, δ = 0.34607240298337766, t1 = 18.24, t2 = 1.98
Iter 12 (PC

### 10 iter PCA -> 5 iter CCD -> 10 iter PCA...

In [42]:
@time me_pca10_ccd5 = modelX_gaussian_group_knockoffs(X, :maxent, groups, μ, Σ, 
    m=5, verbose=true, inner_pca_iter=10, inner_ccd_iter=5, outer_iter=100);

└ @ Knockoffs /Users/biona001/.julia/dev/Knockoffs/src/group.jl:275


Maxent initial obj = -53174.763728468555
Iter 1 (PCA): obj = -48186.89944729032, δ = 0.6139366928460879, t1 = 1.67, t2 = 0.19
Iter 2 (PCA): obj = -43598.392599229235, δ = 0.6320233160969265, t1 = 3.46, t2 = 0.37
Iter 3 (PCA): obj = -40575.18138832505, δ = 0.596232402437525, t1 = 5.08, t2 = 0.56
Iter 4 (PCA): obj = -38037.83649863352, δ = 0.6939543708648634, t1 = 6.75, t2 = 0.74
Iter 5 (PCA): obj = -36235.76041179674, δ = 0.4878572844878825, t1 = 8.53, t2 = 0.93
Iter 6 (PCA): obj = -35087.31362589038, δ = 0.35526941737134404, t1 = 10.16, t2 = 1.11
Iter 7 (PCA): obj = -33986.133346233044, δ = 0.5913511001369454, t1 = 11.92, t2 = 1.29
Iter 8 (PCA): obj = -33051.557397785575, δ = 0.3393544125421494, t1 = 13.66, t2 = 1.48
Iter 9 (PCA): obj = -32110.97469189481, δ = 0.3598276329599284, t1 = 15.41, t2 = 1.66
Iter 10 (PCA): obj = -31519.097917306743, δ = 0.17883080926488387, t1 = 17.01, t2 = 1.85
Iter 11 (CCD): obj = -28405.018213735286, δ = 0.5905296969105198, t1 = 27.77, t2 = 14.67, t3 = 0.2

### 1 iter PCA -> 1 iter CCD -> 1 iter PCA...

In [44]:
@time me_pca1_ccd1 = modelX_gaussian_group_knockoffs(X, :maxent, groups, μ, Σ, 
    m=5, verbose=true, inner_pca_iter=1, inner_ccd_iter=1, outer_iter=100);

└ @ Knockoffs /Users/biona001/.julia/dev/Knockoffs/src/group.jl:275


Maxent initial obj = -53174.763728468555
Iter 1 (PCA): obj = -48186.89944729032, δ = 0.6139366928460879, t1 = 1.62, t2 = 0.19
Iter 2 (CCD): obj = -43299.446841651945, δ = 0.8294976590660618, t1 = 4.63, t2 = 13.21, t3 = 0.05
Iter 3 (PCA): obj = -39175.840241565755, δ = 0.9419563294742748, t1 = 6.44, t2 = 13.39
Iter 4 (CCD): obj = -34406.62268834304, δ = 0.6066774740336571, t1 = 11.59, t2 = 26.41, t3 = 0.07
Iter 5 (PCA): obj = -32079.30690776617, δ = 0.9199812024378085, t1 = 13.36, t2 = 26.6
Iter 6 (CCD): obj = -29646.677209647936, δ = 0.5894825766694997, t1 = 22.37, t2 = 39.57, t3 = 0.09
Iter 7 (PCA): obj = -28538.13636383661, δ = 0.44776680463743296, t1 = 23.93, t2 = 39.75
Iter 8 (CCD): obj = -27107.12719117422, δ = 0.3824064032153433, t1 = 39.1, t2 = 52.6, t3 = 0.11
Iter 9 (PCA): obj = -26519.79835999211, δ = 0.34218623052554287, t1 = 40.74, t2 = 52.79
Iter 10 (CCD): obj = -26079.528612042126, δ = 0.369067697828775, t1 = 58.49, t2 = 65.55, t3 = 0.14
Iter 11 (PCA): obj = -25773.2422058

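Check the final objective for all methods. The maxent objective increases across iterations in the logs above, so larger (less negative) values are better.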

In [46]:
@show me_ccd.obj # CCD only
@show me_pca.obj # PCA only
@show me_pca10_ccd5.obj # 10 pca -> 5 ccd -> 10 pca ...
@show me_pca1_ccd1.obj; # 1 pca -> 1 ccd -> 1 pca ...

me_ccd.obj = -29610.33204931772
me_pca.obj = -27364.466094282798
me_pca10_ccd5.obj = -24852.561492103552
me_pca1_ccd1.obj = -24623.575168329797
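
As a quick way to read these numbers, the snippet below hard-codes the four printed values and reports each method's gap to the best one. It assumes, based on the logs above where the maxent objective rises at every iteration, that larger is better; the variable names are hypothetical.

```julia
# Final ME objectives copied from the output above (larger is better here)
me_obj = Dict(
    "CCD only"       => -29610.33204931772,
    "PCA only"       => -27364.466094282798,
    "10 PCA + 5 CCD" => -24852.561492103552,
    "1 PCA + 1 CCD"  => -24623.575168329797,
)

# findmax on a Dict returns (largest value, its key)
best_obj, best_method = findmax(me_obj)

# Gap between the best objective and each method's objective
gaps = Dict(name => best_obj - obj for (name, obj) in me_obj)

# Print methods from best to worst
for (name, gap) in sort(collect(gaps), by=last)
    println(rpad(name, 16), " gap to best = ", round(gap, digits=1))
end
```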


# MVR group knockoffs

### CCD only

In [72]:
@time mvr_ccd = modelX_gaussian_group_knockoffs(X, :mvr, groups, μ, Σ, 
    m=5, verbose=true, inner_pca_iter=0, inner_ccd_iter=1, outer_iter=100);

└ @ Knockoffs /Users/biona001/.julia/dev/Knockoffs/src/group.jl:275


MVR initial obj = 4.603153798576854e10
Iter 1 (CCD): obj = 1.1219304840684552e9, δ = 0.11103967296786993, t1 = 1.97, t2 = 36.76,t3 = 0.03
Iter 2 (CCD): obj = 7.151123100950178e8, δ = 0.0959515551095888, t1 = 4.13, t2 = 73.45,t3 = 0.05
Iter 3 (CCD): obj = 4.8270163365084046e8, δ = 0.05066928206004477, t1 = 6.7, t2 = 110.2,t3 = 0.08
Iter 4 (CCD): obj = 3.224494491298074e8, δ = 0.06319281634001095, t1 = 9.63, t2 = 146.97,t3 = 0.11
Iter 5 (CCD): obj = 2.1335526274113634e8, δ = 0.03791649626120984, t1 = 13.09, t2 = 183.85,t3 = 0.14
Iter 6 (CCD): obj = 1.5275028429222417e8, δ = 0.020173451682141467, t1 = 17.34, t2 = 220.77,t3 = 0.17
Iter 7 (CCD): obj = 1.1727300380326009e8, δ = 0.03287029484371291, t1 = 22.45, t2 = 257.24,t3 = 0.2
Iter 8 (CCD): obj = 9.724704711080976e7, δ = 0.019877947306114432, t1 = 28.14, t2 = 293.56,t3 = 0.23
Iter 9 (CCD): obj = 8.359059281064972e7, δ = 0.017848907676604894, t1 = 34.59, t2 = 329.89,t3 = 0.26
Iter 10 (CCD): obj = 7.468646445501359e7, δ = 0.016185891119883

### PCA only

In [73]:
@time mvr_pca = modelX_gaussian_group_knockoffs(X, :mvr, groups, μ, Σ, 
    m=5, verbose=true, inner_pca_iter=1, inner_ccd_iter=0, outer_iter=100);

└ @ Knockoffs /Users/biona001/.julia/dev/Knockoffs/src/group.jl:275


MVR initial obj = 4.603153798576854e10
Iter 1 (PCA): obj = 9.577799222667847e9, δ = 0.3017276922493496, t1 = 2.36, t2 = 0.52
Iter 2 (PCA): obj = 1.7915684075516936e8, δ = 0.24515187797693075, t1 = 4.3, t2 = 1.04
Iter 3 (PCA): obj = 7.626404949388093e7, δ = 0.07918423089187183, t1 = 6.03, t2 = 1.58
Iter 4 (PCA): obj = 6.07225790874753e7, δ = 0.035042226871946504, t1 = 7.69, t2 = 2.1
Iter 5 (PCA): obj = 5.632063495609645e7, δ = 0.032748302101832744, t1 = 9.36, t2 = 2.62
Iter 6 (PCA): obj = 5.436775555390754e7, δ = 0.03997893965891557, t1 = 11.05, t2 = 3.14
Iter 7 (PCA): obj = 5.320454893838543e7, δ = 0.036141781642411094, t1 = 12.68, t2 = 3.66
Iter 8 (PCA): obj = 5.238563263406797e7, δ = 0.02452169773051111, t1 = 14.38, t2 = 4.19
Iter 9 (PCA): obj = 5.180117227255002e7, δ = 0.03122226595493174, t1 = 16.02, t2 = 4.7
Iter 10 (PCA): obj = 5.140526245014456e7, δ = 0.037324731766023085, t1 = 17.66, t2 = 5.22
Iter 11 (PCA): obj = 5.117888245451419e7, δ = 0.026439788727054468, t1 = 19.3, t2 = 5

### 10 iter PCA -> 5 iter CCD -> 10 iter PCA...

In [74]:
@time mvr_pca10_ccd5 = modelX_gaussian_group_knockoffs(X, :mvr, groups, μ, Σ, 
    m=5, verbose=true, inner_pca_iter=10, inner_ccd_iter=5, outer_iter=100);

└ @ Knockoffs /Users/biona001/.julia/dev/Knockoffs/src/group.jl:275


MVR initial obj = 4.603153798576854e10
Iter 1 (PCA): obj = 9.577799222667847e9, δ = 0.3017276922493496, t1 = 1.62, t2 = 0.52
Iter 2 (PCA): obj = 1.7915684075516936e8, δ = 0.24515187797693075, t1 = 3.23, t2 = 1.05
Iter 3 (PCA): obj = 7.626404949388093e7, δ = 0.07918423089187183, t1 = 4.85, t2 = 1.57
Iter 4 (PCA): obj = 6.07225790874753e7, δ = 0.035042226871946504, t1 = 6.48, t2 = 2.1
Iter 5 (PCA): obj = 5.632063495609645e7, δ = 0.032748302101832744, t1 = 8.12, t2 = 2.62
Iter 6 (PCA): obj = 5.436775555390754e7, δ = 0.03997893965891557, t1 = 9.75, t2 = 3.15
Iter 7 (PCA): obj = 5.320454893838543e7, δ = 0.036141781642411094, t1 = 11.38, t2 = 3.67
Iter 8 (PCA): obj = 5.238563263406797e7, δ = 0.02452169773051111, t1 = 12.98, t2 = 4.19
Iter 9 (PCA): obj = 5.180117227255002e7, δ = 0.03122226595493174, t1 = 14.65, t2 = 4.72
Iter 10 (PCA): obj = 5.140526245014456e7, δ = 0.037324731766023085, t1 = 16.46, t2 = 5.27
Iter 11 (CCD): obj = 4.64303483004633e7, δ = 0.06528603037171885, t1 = 33.85, t2 = 4

### 1 iter PCA -> 1 iter CCD -> 1 iter PCA...

In [75]:
@time mvr_pca1_ccd1 = modelX_gaussian_group_knockoffs(X, :mvr, groups, μ, Σ, 
    m=5, verbose=true, inner_pca_iter=1, inner_ccd_iter=1, outer_iter=100);

└ @ Knockoffs /Users/biona001/.julia/dev/Knockoffs/src/group.jl:275


MVR initial obj = 4.603153798576854e10
Iter 1 (PCA): obj = 9.577799222667847e9, δ = 0.3017276922493496, t1 = 1.67, t2 = 0.53
Iter 2 (CCD): obj = 2.985914805691862e8, δ = 0.269524635851602, t1 = 5.79, t2 = 36.96,t3 = 0.02
Iter 3 (PCA): obj = 7.283636979645956e7, δ = 0.27742986492656746, t1 = 7.44, t2 = 37.49
Iter 4 (CCD): obj = 4.903225830562202e7, δ = 0.08233604814304266, t1 = 23.84, t2 = 74.07,t3 = 0.04
Iter 5 (PCA): obj = 4.748049612264232e7, δ = 0.11535605994516225, t1 = 26.07, t2 = 74.62
Iter 6 (CCD): obj = 4.594931736987893e7, δ = 0.043638257580752036, t1 = 42.7, t2 = 111.18,t3 = 0.08
Iter 7 (PCA): obj = 4.5555052944730006e7, δ = 0.031124252579181784, t1 = 44.56, t2 = 111.71
Iter 8 (CCD): obj = 4.5092087791791074e7, δ = 0.02768146544106985, t1 = 59.55, t2 = 148.29,t3 = 0.72
Iter 9 (PCA): obj = 4.4919690303285964e7, δ = 0.023300838955965283, t1 = 61.41, t2 = 148.83
Iter 10 (CCD): obj = 4.4687188279407315e7, δ = 0.027545395036238728, t1 = 75.26, t2 = 185.38,t3 = 0.74
Iter 11 (PCA): 

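Check the final objective for all methods. The MVR objective decreases across iterations in the logs above, so smaller values are better.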

In [76]:
@show mvr_ccd.obj
@show mvr_pca.obj
@show mvr_pca10_ccd5.obj
@show mvr_pca1_ccd1.obj;

mvr_ccd.obj = 4.996950440468871e7
mvr_pca.obj = 5.08867382132361e7
mvr_pca10_ccd5.obj = 4.402891940847431e7
mvr_pca1_ccd1.obj = 4.406225428900643e7


# SDP group knockoffs

### Equi

In [15]:
μ = zeros(size(Σ, 1))
Random.seed!(seed)
@time equi = modelX_gaussian_group_knockoffs(X, :equi, groups, μ, Σ, m=5, verbose=true);

  4.410695 seconds (12.80 k allocations: 2.446 GiB, 4.30% gc time)


### CCD only

In [94]:
@time sdp_ccd = modelX_gaussian_group_knockoffs(X, :sdp, groups, μ, Σ, 
    m=5, verbose=true, inner_pca_iter=0, inner_ccd_iter=1, outer_iter=100);

└ @ Knockoffs /Users/biona001/.julia/dev/Knockoffs/src/group.jl:275


SDP initial obj = 56079.690064005146
Iter 1 (CCD): obj = 56077.87206422066, δ = 0.35306433181026914, t1 = 7.7, t2 = 12.83, t3 = 0.0
Iter 2 (CCD): obj = 56077.571790988455, δ = 0.034747623926398116, t1 = 13.39, t2 = 25.94, t3 = 0.0
Iter 3 (CCD): obj = 56077.40936475633, δ = 0.012066367756926977, t1 = 18.27, t2 = 39.08, t3 = 0.0
Iter 4 (CCD): obj = 56077.117586482884, δ = 0.13319824412493447, t1 = 22.58, t2 = 52.15, t3 = 0.0
Iter 5 (CCD): obj = 56076.95774296385, δ = 0.022911483999140662, t1 = 26.1, t2 = 65.22, t3 = 0.0
Iter 6 (CCD): obj = 56076.63833849053, δ = 0.12790204056018703, t1 = 29.54, t2 = 78.31, t3 = 0.0
Iter 7 (CCD): obj = 56076.36310981196, δ = 0.13840927034906447, t1 = 33.18, t2 = 91.33, t3 = 0.0
Iter 8 (CCD): obj = 56076.256084204586, δ = 0.03153464580595722, t1 = 36.43, t2 = 104.37, t3 = 0.0
Iter 9 (CCD): obj = 56076.09203739157, δ = 0.08137679834024376, t1 = 39.22, t2 = 117.52, t3 = 0.0
Iter 10 (CCD): obj = 56075.87788011633, δ = 0.1300671696174998, t1 = 41.73, t2 = 130.

### PCA only

In [91]:
@time sdp_pca = modelX_gaussian_group_knockoffs(X, :sdp, groups, μ, Σ, 
    m=5, verbose=true, inner_pca_iter=1, inner_ccd_iter=0, outer_iter=100);

└ @ Knockoffs /Users/biona001/.julia/dev/Knockoffs/src/group.jl:275


SDP initial obj = 56079.690064005146
Iter 1 (PCA): obj = 56072.45214288683, δ = 0.49392194177097315, t1 = 1.92, t2 = 0.19, t3 = 0.05
Iter 2 (PCA): obj = 56067.94306172441, δ = 0.3391991448713793, t1 = 3.61, t2 = 0.38, t3 = 0.1
Iter 3 (PCA): obj = 56064.283676741405, δ = 0.3862316769955135, t1 = 5.28, t2 = 0.56, t3 = 0.15
Iter 4 (PCA): obj = 56061.14079050244, δ = 0.2983201317453147, t1 = 6.95, t2 = 0.74, t3 = 0.21
Iter 5 (PCA): obj = 56056.34643923874, δ = 0.5321204371633618, t1 = 8.59, t2 = 0.92, t3 = 0.27
Iter 6 (PCA): obj = 56053.95850381139, δ = 0.15026949025383504, t1 = 10.26, t2 = 1.1, t3 = 0.33
Iter 7 (PCA): obj = 56050.287827327615, δ = 0.2374151867081092, t1 = 11.92, t2 = 1.29, t3 = 0.39
Iter 8 (PCA): obj = 56048.261159168054, δ = 0.33595365603884825, t1 = 13.59, t2 = 1.47, t3 = 0.45
Iter 9 (PCA): obj = 56046.268787619054, δ = 0.1585401400160023, t1 = 15.26, t2 = 1.65, t3 = 0.51
Iter 10 (PCA): obj = 56045.13259323847, δ = 0.0640219332589984, t1 = 17.09, t2 = 1.84, t3 = 0.58
It

### 10 iter PCA -> 5 iter CCD -> 10 iter PCA...

In [95]:
@time sdp_pca10_ccd5 = modelX_gaussian_group_knockoffs(X, :sdp, groups, μ, Σ, 
    m=5, verbose=true, inner_pca_iter=10, inner_ccd_iter=5, outer_iter=100);

└ @ Knockoffs /Users/biona001/.julia/dev/Knockoffs/src/group.jl:275


SDP initial obj = 56079.690064005146
Iter 1 (PCA): obj = 56072.45214288683, δ = 0.49392194177097315, t1 = 1.83, t2 = 0.19, t3 = 0.05
Iter 2 (PCA): obj = 56067.94306172441, δ = 0.3391991448713793, t1 = 3.47, t2 = 0.37, t3 = 0.1
Iter 3 (PCA): obj = 56064.283676741405, δ = 0.3862316769955135, t1 = 5.16, t2 = 0.54, t3 = 0.15
Iter 4 (PCA): obj = 56061.14079050244, δ = 0.2983201317453147, t1 = 6.86, t2 = 0.72, t3 = 0.21
Iter 5 (PCA): obj = 56056.34643923874, δ = 0.5321204371633618, t1 = 8.53, t2 = 0.89, t3 = 0.27
Iter 6 (PCA): obj = 56053.95850381139, δ = 0.15026949025383504, t1 = 10.09, t2 = 1.07, t3 = 0.33
Iter 7 (PCA): obj = 56050.287827327615, δ = 0.2374151867081092, t1 = 11.62, t2 = 1.24, t3 = 0.39
Iter 8 (PCA): obj = 56048.261159168054, δ = 0.33595365603884825, t1 = 13.17, t2 = 1.42, t3 = 0.45
Iter 9 (PCA): obj = 56046.268787619054, δ = 0.1585401400160023, t1 = 14.8, t2 = 1.6, t3 = 0.52
Iter 10 (PCA): obj = 56045.13259323847, δ = 0.0640219332589984, t1 = 16.66, t2 = 1.78, t3 = 0.59
Ite

### 1 iter PCA -> 1 iter CCD -> 1 iter PCA...

In [96]:
@time sdp_pca1_ccd1 = modelX_gaussian_group_knockoffs(X, :sdp, groups, μ, Σ, 
    m=5, verbose=true, inner_pca_iter=1, inner_ccd_iter=1, outer_iter=100);

└ @ Knockoffs /Users/biona001/.julia/dev/Knockoffs/src/group.jl:275


SDP initial obj = 56079.690064005146
Iter 1 (PCA): obj = 56072.45214288683, δ = 0.49392194177097315, t1 = 2.0, t2 = 0.19, t3 = 0.05
Iter 2 (CCD): obj = 56071.97600268918, δ = 0.03906536495422958, t1 = 10.42, t2 = 12.9, t3 = 0.05
Iter 3 (PCA): obj = 56065.25153163343, δ = 0.5017774825881296, t1 = 12.39, t2 = 13.09, t3 = 0.1
Iter 4 (CCD): obj = 56063.34397218509, δ = 0.7884953450774447, t1 = 22.14, t2 = 25.86, t3 = 0.1
Iter 5 (PCA): obj = 56056.040617055245, δ = 0.32686786653935135, t1 = 24.96, t2 = 26.05, t3 = 0.15
Iter 6 (CCD): obj = 56055.04218060528, δ = 0.06937611989658213, t1 = 37.74, t2 = 38.92, t3 = 0.15
Iter 7 (PCA): obj = 56051.55629167982, δ = 0.08045836782440748, t1 = 39.63, t2 = 39.11, t3 = 0.2
Iter 8 (CCD): obj = 56050.187281274586, δ = 0.15688937652765522, t1 = 53.73, t2 = 51.97, t3 = 0.2
Iter 9 (PCA): obj = 56046.58677847234, δ = 0.1841571069521299, t1 = 55.84, t2 = 52.15, t3 = 0.25
Iter 10 (CCD): obj = 56045.35090792832, δ = 0.20044063777301901, t1 = 72.09, t2 = 64.89, t

Compare final S values for all methods

In [97]:
idx = findall(!iszero, equi.S)
[equi.S[idx] sdp_ccd.S[idx] sdp_pca.S[idx] sdp_pca10_ccd5.S[idx] sdp_pca1_ccd1.S[idx]]

76664×5 Matrix{Float64}:
  0.000569304   0.0177378     0.796475      0.790119      0.78983
  0.000569304   0.000931357   0.576174      0.347541      0.632671
  0.000569304   0.000557871   0.00235804    0.0525715     0.0403276
  0.000569304   0.00063354    0.0030676     0.00344463    0.00268033
  0.000467792   0.000553064  -0.00182387   -0.00164511   -0.00096014
  0.000370561   0.000405127   0.000851871   0.00106886    0.0010919
  0.000468284   0.000500645  -0.00052845   -0.00065369   -0.000324445
  0.000430677   0.000463411   5.72631e-5   -0.000198569   8.11117e-5
  0.000328536   0.000324269  -0.00032393    0.000135954   0.000322808
  0.00033097    0.000324846  -0.000249686   0.000143566   0.000419341
 -0.000222831  -0.000295231  -0.00153589   -0.00186573   -0.00171383
 -0.00024201   -0.000251217  -0.000926161  -0.00102635   -0.000946974
  0.000308462   0.000308083   0.000171974   0.000315805   0.000393788
  ⋮                                                      
  0.000569304   0.0005

In [98]:
sum([equi.S[idx] sdp_ccd.S[idx] sdp_pca.S[idx] sdp_pca10_ccd5.S[idx] sdp_pca1_ccd1.S[idx]], dims=1)

1×5 Matrix{Float64}:
 24.0512  28.1528  82.6089  109.765  101.402

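Check the final objective for all methods. The SDP objective decreases across iterations in the logs above, so smaller values are better.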

In [100]:
@show equi.obj
@show sdp_ccd.obj
@show sdp_pca.obj
@show sdp_pca10_ccd5.obj
@show sdp_pca1_ccd1.obj;

equi.obj = 56079.69006400531
sdp_ccd.obj = 56075.4848221942
sdp_pca.obj = 56007.01838879511
sdp_pca10_ccd5.obj = 55947.64547709972
sdp_pca1_ccd1.obj = 55960.72049095445


## Conclusion

CCD by itself can converge to a sub-optimal solution. PCA-based updates can reach better solutions, but get there much more slowly. A practical strategy is to run a hybrid of the two, which reached the best objective in all three experiments above (ME, MVR, and SDP), while computation time remains comparable to (and is sometimes faster than) pure CCD.