# Comparing pure CCD, pure PCA-CCD, and hybrids of the two

Coordinate descent is NOT guaranteed to converge to the global solution, even when the objective is convex. Counterexample:

![Image Test](CCD_counterexample.png)
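The same failure mode can be reproduced numerically. The sketch below uses the convex but non-smooth function f(x, y) = |x + y| + 3|x − y|, which is not necessarily the function shown in the figure. Its global minimum is f(0, 0) = 0, yet exact coordinate-wise minimization started at (1, 1) never moves. The objective `f`, the grid-based line search `argmin_on_grid`, and the starting point are all choices made for this illustration.

```julia
# Convex, non-smooth toy objective (an assumption of this sketch, not taken from the figure).
f(x, y) = abs(x + y) + 3 * abs(x - y)

# Brute-force 1-D minimization over a fixed grid; crude, but enough for illustration.
argmin_on_grid(g; grid=-2:0.01:2) = grid[argmin(g.(grid))]

# One cycle of coordinate descent: minimize over x with y fixed, then over y with x fixed.
function ccd_cycle(x, y)
    xnew = argmin_on_grid(t -> f(t, y))
    ynew = argmin_on_grid(t -> f(xnew, t))
    return xnew, ynew
end

function run_ccd(x, y; cycles=20)
    for _ in 1:cycles
        x, y = ccd_cycle(x, y)
    end
    return x, y
end

x, y = run_ccd(1.0, 1.0)
println("CCD ends at (", x, ", ", y, ") with f = ", f(x, y))
println("global minimum: f(0, 0) = ", f(0.0, 0.0))
```

Along either axis through (1, 1) the objective only increases, so every coordinate update returns the current point even though moving along the diagonal would decrease `f`.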

Empirically:

+ Cyclic coordinate descent (CCD): converges quickly, but can get stuck at a suboptimal point
+ PCA-CCD (PCA for short): converges more slowly, but is less likely to get stuck

A practical strategy is to combine them. For example,
+ Run 10 PCA iterations to prime the CCD algorithm
+ Rotate between running PCA and CCD until both reach convergence.
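
The alternating idea can be illustrated on a toy problem. The sketch below is an analogy only and is not how `Knockoffs.jl` implements its hybrid solvers: it minimizes the non-smooth convex function f(x, y) = |x + y| + 3|x − y| by brute-force line searches, where the "CCD" sweep searches along the coordinate axes and the "PCA-like" sweep searches along a rotated orthonormal basis chosen by hand. All function and variable names here are hypothetical.

```julia
using LinearAlgebra

# Toy objective (assumption for illustration): convex, non-smooth, minimized at the origin.
f(v) = abs(v[1] + v[2]) + 3 * abs(v[1] - v[2])

# Line search along direction d from point v, by brute force over a grid of step sizes.
function line_search(v, d; steps=-3:0.001:3)
    vals = [f(v .+ t .* d) for t in steps]
    return v .+ steps[argmin(vals)] .* d
end

# One sweep: line-search along each column of `dirs` in turn.
function sweep(v, dirs)
    for j in axes(dirs, 2)
        v = line_search(v, dirs[:, j])
    end
    return v
end

axis_dirs = Matrix(1.0I, 2, 2)               # coordinate axes ("CCD" sweeps)
rot_dirs  = [1.0 1.0; 1.0 -1.0] ./ sqrt(2)   # rotated basis ("PCA-like" sweeps)

# Alternate the two kinds of sweeps until a full round no longer improves the objective.
function hybrid(v; maxiter=50, tol=1e-8)
    for _ in 1:maxiter
        fold = f(v)
        v = sweep(v, rot_dirs)
        v = sweep(v, axis_dirs)
        fold - f(v) < tol && break
    end
    return v
end

v0 = [1.0, 1.0]
v_ccd = sweep(v0, axis_dirs)   # axis-only sweep: stays at (1, 1)
v_hyb = hybrid(v0)             # alternating sweeps: reaches the origin up to grid resolution
println("axis-only: f = ", f(v_ccd), ", alternating: f = ", f(v_hyb))
```

In this toy the rotated sweep supplies the descent direction that the axis sweep cannot see, which is the intuition behind priming or interleaving CCD with PCA iterations.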

In [1]:
using Revise
using Knockoffs
using Random
using GLMNet
using Distributions
using LinearAlgebra
using ToeplitzMatrices
using StatsBase
using CSV, DataFrames
using DelimitedFiles
using Plots
gr(fmt=:png);

# power: fraction of truly causal groups that were selected
function TP(correct_groups, signif_groups)
    return length(signif_groups ∩ correct_groups) / max(1, length(correct_groups))
end
# false discovery proportion: fraction of selected groups that are not causal
function FDR(correct_groups, signif_groups)
    FP = length(signif_groups) - length(signif_groups ∩ correct_groups) # number of false positives
    return FP / max(1, length(signif_groups))
end

function get_sigma(option::Int, p::Int)
    # note: groups are defined empirically within each simulation
    datadir = "/Users/biona001/Benjamin_Folder/research/4th_project_PRS/group_knockoff_test_data"
    if option == 1
        ρ = 0.7
        Σ = SymmetricToeplitz(ρ.^(0:(p-1))) |> Matrix
    elseif option == 2
        ρ = 0.7
        γ = 0.25
        groups = repeat(1:Int(p/5), inner=5)
        Σ = simulate_block_covariance(groups, ρ, γ)
    elseif option == 3
        covfile = CSV.read(joinpath(datadir, "CorG_2_127374341_128034347.txt"), DataFrame) # 3782 SNPs
        Σ = covfile |> Matrix{Float64}
        Σ = 0.99Σ + 0.01I #ensure PSD
    elseif option == 4
        df = CSV.read(joinpath(datadir, "21_37870779_38711704.csv"), DataFrame)
        Σ = df[:, 7:end] |> Matrix |> Symmetric |> Matrix
    elseif option == 5
        df = CSV.read(joinpath(datadir, "22_17674295_18295575.csv"), DataFrame)
        Σ = df[:, 7:end] |> Matrix |> Symmetric |> Matrix
    else
        error("Option should be 1-5 but was $option")
    end
    return Σ[1:p, 1:p]
end

sigma_option = 4
p = 1000
Σ = get_sigma(sigma_option, p)

# define groups
# groups, _ = id_partition_groups(Symmetric(Σ), force_contiguous=true)
# @show length(unique(groups));

top_dir = "/Users/biona001/Desktop/group_knockoff_simulations/gnomad_chr21"

┌ Info: Precompiling Knockoffs [878bf26d-0c49-448a-9df5-b057c815d613]
└ @ Base loading.jl:1423


"/Users/biona001/Desktop/group_knockoff_simulations/gnomad_chr21"

## Example data: gnomAD  

This is an example where the hybrid approach beats all other methods.

In [2]:
# import some data

which = method = :maxent
seed = 9
group_def = "hc"
rep_def = :id
force_contiguous = false
n = 10000
y_dist = "normal"
feature_importance_method = "marginal"
sigma_option = 4
p = 1000
m = 5   # number of knockoffs per variable
k = 10  # number of causal variables

# returns the groups that contain at least one causal variable (nonzero entry of β)
function get_signif_groups(β, groups)
    correct_groups = Int[]
    for i in findall(!iszero, β)
        g = groups[i]
        g ∈ correct_groups || push!(correct_groups, g)
    end
    return correct_groups
end

outdir = joinpath(top_dir, "sim$(seed)")
yfile = joinpath(outdir, "y_$(y_dist).txt")
Xfile = joinpath(outdir, "X.txt")
βfile = joinpath(outdir, "beta.txt")
group_name = force_contiguous ? "$(group_def)_groups_contig" : "$(group_def)_groups"
gfile = joinpath(outdir, group_name, "groups.txt")

βtrue = readdlm(βfile) |> vec
y = readdlm(yfile) |> vec
X = readdlm(Xfile)
groups = readdlm(gfile, Int) |> vec
correct_groups = get_signif_groups(βtrue, groups)

7-element Vector{Int64}:
  15
   4
 125
  24
  89
 212
 259
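
To make the roles of the helper functions concrete, here is a small usage sketch on made-up inputs. It repeats the definitions of `TP`, `FDR`, and `get_signif_groups` from the cells above so that it runs on its own. The toy vectors `β_toy`, `groups_toy`, and `selected` are invented for illustration and are unrelated to the simulation data.

```julia
# Definitions repeated from the notebook cells so that this block is self-contained.
TP(correct_groups, signif_groups) =
    length(signif_groups ∩ correct_groups) / max(1, length(correct_groups))
function FDR(correct_groups, signif_groups)
    FP = length(signif_groups) - length(signif_groups ∩ correct_groups)
    return FP / max(1, length(signif_groups))
end
function get_signif_groups(β, groups)
    correct_groups = Int[]
    for i in findall(!iszero, β)
        g = groups[i]
        g ∈ correct_groups || push!(correct_groups, g)
    end
    return correct_groups
end

# Toy inputs (invented): 6 variables in 3 groups, with causal variables 1, 2, and 5.
β_toy      = [0.5, -1.0, 0.0, 0.0, 2.0, 0.0]
groups_toy = [1, 1, 2, 2, 3, 3]
truth      = get_signif_groups(β_toy, groups_toy)   # groups 1 and 3

selected = [1, 2]   # hypothetical selection: one true group and one false discovery
println("true groups = ", truth)
println("power = ", TP(truth, selected), ", FDP = ", FDR(truth, selected))
```

With one of the two causal groups recovered and one of the two selected groups being false, both the power and the false discovery proportion equal 0.5 in this toy case.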

# ME (maximum entropy) group knockoffs

### CCD only

In [48]:
μ = zeros(size(Σ, 1))
Random.seed!(seed)
@time ko_ccd = modelX_gaussian_group_knockoffs(X, :maxent, groups, μ, Σ, m=5, verbose=true);

Full CCD initial obj = -55688.13281259273, 76664 optimization variables
Iter 1: obj = -51125.070717908784, δ = 0.8152279364652412, t1 = 2.3, t2 = 12.94, t3 = 0.02
Iter 2: obj = -46793.979555385275, δ = 0.704329830607164, t1 = 4.33, t2 = 25.77, t3 = 0.03
Iter 3: obj = -44101.23670728737, δ = 0.5109706744537869, t1 = 6.82, t2 = 38.6, t3 = 0.21
Iter 4: obj = -42032.48468851866, δ = 0.6202280623277363, t1 = 9.77, t2 = 51.61, t3 = 0.23
Iter 5: obj = -40321.16733235221, δ = 0.18611712800896402, t1 = 13.47, t2 = 64.57, t3 = 0.25
Iter 6: obj = -38713.78675381487, δ = 0.13416763547365318, t1 = 17.77, t2 = 77.34, t3 = 0.28
Iter 7: obj = -37635.01261419391, δ = 0.32363973156000375, t1 = 22.66, t2 = 90.41, t3 = 0.3
Iter 8: obj = -36804.658225914354, δ = 0.06838962544431994, t1 = 27.9, t2 = 103.46, t3 = 0.32
Iter 9: obj = -35986.071383762814, δ = 0.22477155101803795, t1 = 33.09, t2 = 116.49, t3 = 0.36
Iter 10: obj = -35279.04688752074, δ = 0.0741147922687955, t1 = 38.73, t2 = 129.52, t3 = 0.39
Iter

### PCA only

In [36]:
@time pca_ko = modelX_gaussian_group_knockoffs(X, :maxent_pca, groups, μ, Σ, m=5, verbose=true);

initial obj = -55688.13281259273
Iter 1: obj = -49887.00792787721, δ = 1.220243050869317, t1 = 1.73, t2 = 0.19
Iter 2: obj = -45340.67335726165, δ = 0.8615363863992264, t1 = 3.54, t2 = 0.38
Iter 3: obj = -42288.66198436306, δ = 0.5544813089887345, t1 = 5.24, t2 = 0.56
Iter 4: obj = -39190.97955718186, δ = 0.36244198534542765, t1 = 7.24, t2 = 0.75
Iter 5: obj = -37215.09358490321, δ = 0.4729149947782355, t1 = 9.26, t2 = 0.93
Iter 6: obj = -35407.62732602055, δ = 0.4026994323247552, t1 = 11.86, t2 = 1.11
Iter 7: obj = -33805.21726638959, δ = 0.397549654650883, t1 = 14.59, t2 = 1.29
Iter 8: obj = -32733.07359318107, δ = 0.49713191803492757, t1 = 19.7, t2 = 1.49
Iter 9: obj = -32041.510810555363, δ = 0.17720141440603618, t1 = 21.57, t2 = 1.68
Iter 10: obj = -31500.218093227486, δ = 0.3055799752655914, t1 = 23.3, t2 = 1.87
Iter 11: obj = -31054.572111108388, δ = 0.2823693483955372, t1 = 25.11, t2 = 2.05
Iter 12: obj = -30710.659686604264, δ = 0.30788353909059657, t1 = 26.92, t2 = 2.23
Iter 

### 10 iterations of PCA, then full CCD

In [46]:
μ = zeros(size(Σ, 1))
Random.seed!(seed)
@time pca_ccd_ko = modelX_gaussian_group_knockoffs(X, :maxent, groups, μ, Σ, m=5, verbose=true);

Performing 10 PCA-CCD steps to prime main algorithm


initial obj = -55688.13281259273
Iter 1: obj = -49887.00792787721, δ = 1.220243050869317, t1 = 1.57, t2 = 0.18
Iter 2: obj = -45340.67335726165, δ = 0.8615363863992264, t1 = 3.14, t2 = 0.36
Iter 3: obj = -42288.66198436306, δ = 0.5544813089887345, t1 = 4.73, t2 = 0.53
Iter 4: obj = -39190.97955718186, δ = 0.36244198534542765, t1 = 6.34, t2 = 0.71
Iter 5: obj = -37215.09358490321, δ = 0.4729149947782355, t1 = 7.96, t2 = 0.89
Iter 6: obj = -35407.62732602055, δ = 0.4026994323247552, t1 = 9.58, t2 = 1.07
Iter 7: obj = -33805.21726638959, δ = 0.397549654650883, t1 = 11.17, t2 = 1.25
Iter 8: obj = -32733.07359318107, δ = 0.49713191803492757, t1 = 12.79, t2 = 1.43
Iter 9: obj = -32041.510810555363, δ = 0.17720141440603618, t1 = 14.41, t2 = 1.61
Iter 10: obj = -31500.218093227486, δ = 0.3055799752655914, t1 = 16.11, t2 = 1.8
Full CCD initial obj = -31500.218093227795, 76664 optimization variables
Iter 1: obj = -28354.42544883219, δ = 0.5983694665154821, t1 = 11.86, t2 = 12.62, t3 = 0.02
Iter 

### PCA and CCD hybrid

In [38]:
@time ko_hybrid = modelX_gaussian_group_knockoffs(X, :maxent_hybrid, groups, μ, Σ, m=5, verbose=true);

Maxent hybrid CCD initial obj = -55688.13281259273
Iter 1: obj = -49887.00792787721, δ = 1.220243050869317, t1 = 1.63, t2 = 0.18
Iter 2: obj = -45340.67335726165, δ = 0.8615363863992264, t1 = 3.24, t2 = 0.37
Iter 3: obj = -42288.66198436306, δ = 0.5544813089887345, t1 = 5.0, t2 = 0.55
Iter 4: obj = -39190.97955718186, δ = 0.36244198534542765, t1 = 6.75, t2 = 0.74
Iter 5: obj = -37215.09358490321, δ = 0.4729149947782355, t1 = 8.53, t2 = 0.92
Iter 6: obj = -35407.62732602055, δ = 0.4026994323247552, t1 = 10.28, t2 = 1.1
Iter 7: obj = -33805.21726638959, δ = 0.397549654650883, t1 = 12.06, t2 = 1.29
Iter 8: obj = -32733.07359318107, δ = 0.49713191803492757, t1 = 13.85, t2 = 1.47
Iter 9: obj = -32041.510810555363, δ = 0.17720141440603618, t1 = 15.67, t2 = 1.66
Iter 10: obj = -31500.218093227486, δ = 0.3055799752655914, t1 = 17.33, t2 = 1.84
Iter 1: obj = -28354.42544883188, δ = 0.5983694665154821, t1 = 11.87, t2 = 12.72, t3 = 0.02
Iter 2: obj = -27027.811155315056, δ = 0.4747001129581579, t

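Check the final objective for all methods. The ME objective printed in the logs above increases at every iteration, so larger (less negative) values are better.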

In [49]:
@show ko_ccd.obj # CCD only
@show pca_ko.obj # PCA only
@show pca_ccd_ko.obj # 10 PCA iter followed by CCD 
@show ko_hybrid.obj; # hybrid

ko_ccd.obj = -27511.287174459656
pca_ko.obj = -27342.8892273691
pca_ccd_ko.obj = -25984.370487958855
ko_hybrid.obj = -24930.782036418357
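
For a quick side-by-side reading of these numbers, the sketch below ranks the four reported objectives and prints each method's gap to the best one. It assumes that a larger objective is better for ME, which is consistent with the verbose logs where the objective increases at every iteration. The values are copied from the output above, and the labels are informal names for the four runs.

```julia
# Final ME objectives copied from the `@show` output above.
results = [
    "CCD only"         => -27511.287174459656,
    "PCA only"         => -27342.8892273691,
    "10 PCA, then CCD" => -25984.370487958855,
    "hybrid"           => -24930.782036418357,
]

# Assumption: a larger (less negative) objective is better, as suggested by the logs.
ranked = sort(results; by=last, rev=true)
best = last(first(ranked))
for (name, obj) in ranked
    println(rpad(name, 18), round(obj; digits=1), "   gap to best = ", round(best - obj; digits=1))
end
```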


# MVR (minimum variance-based reconstructability) group knockoffs

### PCA only

In [78]:
@time pca_ko = modelX_gaussian_group_knockoffs(X, :mvr_pca, groups, μ, Σ, m=5, verbose=true);

initial obj = 5.917809329366885e9
Iter 1: obj = 1.9359158238124454e8, δ = 0.06559019805806361, t1 = 1.9, t2 = 0.54
Iter 2: obj = 8.780314277501507e7, δ = 0.05710761152107989, t1 = 3.74, t2 = 1.08
Iter 3: obj = 6.758024068947296e7, δ = 0.08264719074810654, t1 = 5.54, t2 = 1.61
Iter 4: obj = 6.212911097281186e7, δ = 0.07484120446507896, t1 = 7.48, t2 = 2.15
Iter 5: obj = 5.9000301337717205e7, δ = 0.044710054742149195, t1 = 9.21, t2 = 2.68
Iter 6: obj = 5.658224149396229e7, δ = 0.03902989662425233, t1 = 10.87, t2 = 3.21
Iter 7: obj = 5.469612003874917e7, δ = 0.04443459102588582, t1 = 12.77, t2 = 3.75
Iter 8: obj = 5.3461908628842056e7, δ = 0.047578891867269955, t1 = 14.69, t2 = 4.28
Iter 9: obj = 5.278209107589113e7, δ = 0.0420124766960568, t1 = 16.52, t2 = 4.81
Iter 10: obj = 5.232980876477032e7, δ = 0.029692575160892243, t1 = 18.13, t2 = 5.34
Iter 11: obj = 5.185210914355141e7, δ = 0.01356588525070232, t1 = 19.76, t2 = 5.87
Iter 12: obj = 5.1451410662945226e7, δ = 0.006873114037468125, 

### 10 iterations of PCA, then full CCD

In [79]:
@time pca_ccd_ko = modelX_gaussian_group_knockoffs(X, :mvr, groups, μ, Σ, m=5, verbose=true);

Performing 10 PCA-CCD steps to prime main algorithm


initial obj = 5.917809329366885e9
Iter 1: obj = 1.9359158238124454e8, δ = 0.06559019805806361, t1 = 1.83, t2 = 0.55
Iter 2: obj = 8.780314277501507e7, δ = 0.05710761152107989, t1 = 3.79, t2 = 1.1
Iter 3: obj = 6.758024068947296e7, δ = 0.08264719074810654, t1 = 5.68, t2 = 1.65
Iter 4: obj = 6.212911097281186e7, δ = 0.07484120446507896, t1 = 7.91, t2 = 2.21
Iter 5: obj = 5.9000301337717205e7, δ = 0.044710054742149195, t1 = 14.9, t2 = 2.89
Iter 6: obj = 5.658224149396229e7, δ = 0.03902989662425233, t1 = 17.23, t2 = 3.42
Iter 7: obj = 5.469612003874917e7, δ = 0.04443459102588582, t1 = 19.11, t2 = 3.95
Iter 8: obj = 5.3461908628842056e7, δ = 0.047578891867269955, t1 = 21.58, t2 = 4.5
Iter 9: obj = 5.278209107589113e7, δ = 0.0420124766960568, t1 = 23.69, t2 = 5.04
Iter 10: obj = 5.232980876477032e7, δ = 0.029692575160892243, t1 = 25.6, t2 = 5.57
Full CCD initial obj = 5.232980876412458e7, 76664 optimization variables
Iter 1: obj = 4.6597326064555384e7, δ = 0.052498376734749316, t1 = 17.66, t

### Hybrid

In [77]:
@time hybrid_ko = modelX_gaussian_group_knockoffs(X, :mvr_hybrid, groups, μ, Σ, m=5, verbose=true);

MVR hybrid CCD initial obj = 5.917809329366885e9
Iter 1: obj = 1.9359158238124454e8, δ = 0.06559019805806361, t1 = 2.09, t2 = 0.55
Iter 2: obj = 8.780314277501507e7, δ = 0.05710761152107989, t1 = 4.01, t2 = 1.14
Iter 3: obj = 6.758024068947296e7, δ = 0.08264719074810654, t1 = 5.77, t2 = 1.72
Iter 4: obj = 6.212911097281186e7, δ = 0.07484120446507896, t1 = 7.53, t2 = 2.27
Iter 5: obj = 5.9000301337717205e7, δ = 0.044710054742149195, t1 = 9.32, t2 = 2.86
Iter 6: obj = 5.658224149396229e7, δ = 0.03902989662425233, t1 = 11.1, t2 = 3.43
Iter 7: obj = 5.469612003874917e7, δ = 0.04443459102588582, t1 = 12.89, t2 = 4.0
Iter 8: obj = 5.3461908628842056e7, δ = 0.047578891867269955, t1 = 14.76, t2 = 4.61
Iter 9: obj = 5.278209107589113e7, δ = 0.0420124766960568, t1 = 16.5, t2 = 5.19
Iter 10: obj = 5.232980876477032e7, δ = 0.029692575160892243, t1 = 18.3, t2 = 5.76
Iter 1: obj = 4.6597326065201126e7, δ = 0.052498376734749316, t1 = 17.65, t2 = 36.42,t3 = 0.05
Iter 2: obj = 4.529119139290935e7, δ = 

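Check the final objective for all methods. The MVR objective printed in the logs above decreases at every iteration, so smaller values are better.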

In [81]:
@show pca_ko.obj
@show pca_ccd_ko.obj
@show hybrid_ko.obj;

pca_ko.obj = 5.1451410662945226e7
pca_ccd_ko.obj = 4.437091639271612e7
hybrid_ko.obj = 4.405531109909656e7


# SDP (semidefinite programming) group knockoffs

### CCD only

In [105]:
μ = zeros(size(Σ, 1))
Random.seed!(seed)
@time ccd_ko = modelX_gaussian_group_knockoffs(X, :sdp, groups, μ, Σ, m=5, verbose=true);

Full CCD initial obj = 56079.69006400531, 76664 optimization variables
Iter 1: obj = 56077.87206422007, δ = 0.35306433181026914, t1 = 7.66, t2 = 12.46, t3 = 0.0
Iter 2: obj = 56077.57179098782, δ = 0.034747623926398116, t1 = 13.36, t2 = 25.01, t3 = 0.0
Iter 3: obj = 56077.409364755884, δ = 0.012066367756926977, t1 = 18.14, t2 = 37.64, t3 = 0.0
Iter 4: obj = 56077.117586481996, δ = 0.13319824412493447, t1 = 22.39, t2 = 50.17, t3 = 0.0
Iter 5: obj = 56076.95774296262, δ = 0.022911483999140662, t1 = 25.84, t2 = 62.72, t3 = 0.0
Iter 6: obj = 56076.63833848922, δ = 0.12790204056018703, t1 = 29.17, t2 = 75.22, t3 = 0.0
Iter 7: obj = 56076.36310981076, δ = 0.13840927034906447, t1 = 32.71, t2 = 87.78, t3 = 0.0
Iter 8: obj = 56076.25608420319, δ = 0.03153464580595722, t1 = 35.89, t2 = 100.66, t3 = 0.0
Iter 9: obj = 56076.09203739025, δ = 0.08137679834024376, t1 = 38.67, t2 = 113.78, t3 = 0.0
Iter 10: obj = 56075.877880114924, δ = 0.1300671696174998, t1 = 41.12, t2 = 126.35, t3 = 0.0
Iter 11: ob

In [111]:
# feasibility check: both minimum eigenvalues should be ≥ 0 (up to numerical error)
eigmin(ccd_ko.S), eigmin((m+1)/m*Σ-ccd_ko.S)

(7.855838280496751e-11, 1.8573797753988826e-15)
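
The two eigenvalues above test the conditions S ⪰ 0 and (m+1)/m·Σ − S ⪰ 0 for `m` knockoffs. Values this close to zero suggest that the solution sits on the boundary of the feasible set. A minimal sketch of the same check wrapped in a helper, applied to a made-up 2×2 covariance: `is_feasible`, its tolerance `tol`, and the toy inputs are hypothetical and not part of `Knockoffs.jl`.

```julia
using LinearAlgebra

# Hypothetical helper: checks S ⪰ 0 and (m+1)/m * Σ - S ⪰ 0 up to a numerical tolerance.
function is_feasible(S, Σ, m; tol=1e-8)
    λ1 = eigmin(Symmetric(Matrix(S)))
    λ2 = eigmin(Symmetric((m + 1) / m * Σ - S))
    return λ1 ≥ -tol && λ2 ≥ -tol
end

Σ_toy = [1.0 0.5; 0.5 1.0]   # toy covariance, invented for illustration
m_toy = 5

# (6/5)Σ - 0.5I has eigenvalues 0.1 and 1.3, so S = 0.5I is feasible.
println(is_feasible(0.5 * I(2), Σ_toy, m_toy))
# (6/5)Σ - I has eigenvalues -0.4 and 0.8, so S = I is infeasible.
println(is_feasible(1.0 * I(2), Σ_toy, m_toy))
```

The tolerance matters in practice: a tiny negative eigenvalue caused by round-off would otherwise flag a boundary solution as infeasible.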

### PCA only

In [109]:
μ = zeros(size(Σ, 1))
Random.seed!(seed)
@time pca_ko = modelX_gaussian_group_knockoffs(X, :sdp_pca, groups, μ, Σ, m=5, verbose=true);

initial obj = 56079.690064005146
Iter 1: obj = 56072.45214288683, δ = 0.49392194177097315, t1 = 1.71, t2 = 0.18, t3 = 0.08
Iter 2: obj = 56067.94306172441, δ = 0.3391991448713793, t1 = 3.27, t2 = 0.36, t3 = 0.17
Iter 3: obj = 56064.283676741405, δ = 0.3862316769955135, t1 = 4.92, t2 = 0.54, t3 = 0.26
Iter 4: obj = 56061.14079050244, δ = 0.2983201317453147, t1 = 6.59, t2 = 0.71, t3 = 0.36
Iter 5: obj = 56056.34643923874, δ = 0.5321204371633618, t1 = 8.42, t2 = 0.89, t3 = 0.46
Iter 6: obj = 56053.95850381139, δ = 0.15026949025383504, t1 = 10.28, t2 = 1.07, t3 = 0.56
Iter 7: obj = 56050.287827327615, δ = 0.2374151867081092, t1 = 12.06, t2 = 1.25, t3 = 0.67
Iter 8: obj = 56048.261159168054, δ = 0.33595365603884825, t1 = 13.7, t2 = 1.43, t3 = 0.78
Iter 9: obj = 56046.268787619054, δ = 0.1585401400160023, t1 = 15.24, t2 = 1.61, t3 = 0.89
Iter 10: obj = 56045.13259323847, δ = 0.0640219332589984, t1 = 16.84, t2 = 1.78, t3 = 1.01
Iter 11: obj = 56044.1336

In [110]:
# feasibility check for the PCA solution: both minimum eigenvalues should be ≥ 0
eigmin(pca_ko.S), eigmin((m+1)/m*Σ-pca_ko.S)

(1.2134956551386175e-8, 7.150086486046698e-9)

In [112]:
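# side-by-side comparison of S entries: CCD (left) vs. PCA (right), at the nonzero entries of the CCD solution
idx = findall(!iszero, ccd_ko.S)
[ccd_ko.S[idx] pca_ko.S[idx]]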

76664×2 Matrix{Float64}:
  0.0177378     0.796475
  0.000931357   0.576174
  0.000557871   0.00235804
  0.00063354    0.0030676
  0.000553064  -0.00182387
  0.000405127   0.000851871
  0.000500645  -0.00052845
  0.000463411   5.72631e-5
  0.000324269  -0.00032393
  0.000324846  -0.000249686
 -0.000295231  -0.00153589
 -0.000251217  -0.000926161
  0.000308083   0.000171974
  ⋮            
  0.00056466    0.0245963
  0.000556263   0.109572
  0.000401297   0.099169
  0.000562975   0.00697376
  0.000401297   0.099169
  0.000810437   0.109572
  0.000558794   0.0436767
  0.00059342    2.52078e-5
  0.000759886   0.0124663
  0.000641093   0.342549
  0.000562948   0.000106783
  0.00296159    0.0255176

## Conclusion

CCD by itself can converge to a suboptimal solution. PCA-based CCD can converge to better solutions, but gets there much more slowly. A practical strategy is to run a hybrid of the two, which tends to converge to the best solution while its computation time remains comparable to (and is often faster than) that of pure CCD.