# Compare MVR, ME, and fast SDP knockoffs with KnockPy

The code below was run on Sherlock with Julia v1.6.2 and Python 3.9.10.

In [10]:
using Revise
using Knockoffs
using Random
using GLMNet
using Distributions
using LinearAlgebra
using ToeplitzMatrices
using StatsBase
using PyCall
using BenchmarkTools
# using Plots
# gr(fmt=:png);

py"""
import numpy as np
import knockpy as kpy
from knockpy.knockoff_filter import KnockoffFilter
from knockpy.mrc import solve_mvr
from knockpy.mrc import solve_maxent
from knockpy.mrc import _solve_maxent_sdp_cd
"""

## First check accuracy

In [11]:
seed = 2022

# simulate X
Random.seed!(seed)
n = 600
p = 300
ρ = 0.5
# Σ = (1-ρ) * I + ρ * ones(p, p)
Σ = Matrix(SymmetricToeplitz(ρ.^(0:(p-1)))) # true covariance matrix
μ = zeros(p)
L = cholesky(Σ).L
X = randn(n, p) * L' # each row is L * z with z ~ N(0, I), so cov = L * L' = Σ

# simulate y
Random.seed!(seed)
k = Int(0.2p)
βtrue = zeros(p)
βtrue[1:k] .= rand(-1:2:1, k) .* rand(Uniform(0.5, 1), k)
shuffle!(βtrue)
correct_position = findall(!iszero, βtrue)
y = X * βtrue + randn(n)

# solve s vector in Julia
Xko_fastSDP = modelX_gaussian_knockoffs(X, :sdp_fast, μ, Σ)
Xko_maxent = modelX_gaussian_knockoffs(X, :maxent, μ, Σ)
Xko_mvr = modelX_gaussian_knockoffs(X, :mvr, μ, Σ)

# solve s vector in Python
py"""
s1 = _solve_maxent_sdp_cd($Σ, True, verbose=False)
s2 = solve_maxent($Σ, verbose=False)
s3 = solve_mvr($Σ, verbose=False)
"""
python_sdp_fast = [py"s1"[j, j] for j in 1:p]
python_me = [py"s2"[j, j] for j in 1:p]
python_mvr = [py"s3"[j, j] for j in 1:p];
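The simulation above draws rows of X from N(0, Σ) using a Cholesky factor of the Toeplitz matrix with entries Σ[i, j] = ρ^|i - j|. As a small sanity check in plain Python, the sketch below uses p = 3 instead of 300 and verifies that L Lᵀ reproduces Σ while Lᵀ L does not, so standard normal rows need to be multiplied by Lᵀ. The helpers `ar1_covariance`, `cholesky_lower`, `matmul`, and `transpose` are illustrative names only and are not part of Knockoffs.jl or knockpy.

```python
import math

def ar1_covariance(p, rho):
    """Toeplitz AR(1) covariance: Sigma[i][j] = rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(p)] for i in range(p)]

def cholesky_lower(a):
    """Lower-triangular L with L @ L.T == a, for symmetric positive definite a."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

Sigma = ar1_covariance(3, 0.5)
L = cholesky_lower(Sigma)
LLt = matmul(L, transpose(L))  # covariance of the rows of Z @ L.T
LtL = matmul(transpose(L), L)  # covariance of the rows of Z @ L

print(LLt[0][0], LtL[0][0])
```

In this small case the first diagonal entry of Lᵀ L is 1 + 0.25 + 0.0625 rather than 1, which is why the order of the factors matters.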

Compare coordinate descent SDP solutions

In [12]:
[Xko_fastSDP.s python_sdp_fast]

300×2 Matrix{Float64}:
 1.0       0.999023
 0.657597  0.652421
 0.682226  0.684445
 0.656115  0.658447
 0.672361  0.666494
 0.663902  0.663628
 0.667925  0.672643
 0.666119  0.658839
 0.666898  0.670295
 0.666572  0.663683
 0.666705  0.670267
 0.666651  0.657008
 0.666673  0.678568
 ⋮         
 0.666641  0.661239
 0.66665   0.666006
 0.666724  0.669058
 0.666804  0.662648
 0.666575  0.667994
 0.665975  0.665694
 0.667282  0.665956
 0.671132  0.667645
 0.647843  0.657317
 0.704939  0.684848
 0.640859  0.652141
 1.0       0.999023

Compare MVR solutions

In [13]:
[Xko_mvr.s python_mvr]

300×2 Matrix{Float64}:
 0.594468  0.593903
 0.430784  0.43036
 0.438428  0.438004
 0.438477  0.438049
 0.438445  0.438011
 0.438447  0.438028
 0.438447  0.437975
 0.438447  0.438034
 0.438447  0.438015
 0.438447  0.438032
 0.438447  0.438001
 0.438447  0.438023
 0.438447  0.438017
 ⋮         
 0.438447  0.43802
 0.438447  0.438019
 0.438447  0.438019
 0.438447  0.438018
 0.438447  0.43802
 0.438447  0.438018
 0.438447  0.438012
 0.438445  0.438026
 0.438477  0.438046
 0.438428  0.438013
 0.430784  0.430359
 0.594468  0.593901

Compare ME solutions

In [14]:
[Xko_maxent.s python_me]

300×2 Matrix{Float64}:
 0.658052  0.657408
 0.470574  0.470117
 0.486666  0.486189
 0.485212  0.484739
 0.485343  0.484868
 0.485331  0.484857
 0.485332  0.484858
 0.485332  0.484856
 0.485332  0.484859
 0.485332  0.484858
 0.485332  0.484858
 0.485332  0.484859
 0.485332  0.484858
 ⋮         
 0.485332  0.484857
 0.485332  0.484858
 0.485332  0.484858
 0.485332  0.484859
 0.485332  0.484858
 0.485332  0.484859
 0.485331  0.484857
 0.485343  0.48487
 0.485212  0.484738
 0.486666  0.486191
 0.470574  0.470113
 0.658052  0.65741
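The side-by-side columns above can be condensed into one number per solver, the largest absolute gap between the Julia and Python entries. The sketch below is only an illustration: it uses just the first three rows printed above for each method, so it does not summarize all 300 entries, and `max_abs_diff` is a hypothetical helper name.

```python
def max_abs_diff(a, b):
    """Largest absolute elementwise difference between two equal-length vectors."""
    return max(abs(x - y) for x, y in zip(a, b))

# First three rows of each printed comparison: (Julia, Python)
printed = {
    "SDP (coordinate descent)": ([1.0, 0.657597, 0.682226], [0.999023, 0.652421, 0.684445]),
    "MVR": ([0.594468, 0.430784, 0.438428], [0.593903, 0.43036, 0.438004]),
    "ME": ([0.658052, 0.470574, 0.486666], [0.657408, 0.470117, 0.486189]),
}

gaps = {name: max_abs_diff(jl, py) for name, (jl, py) in printed.items()}
for name, gap in gaps.items():
    print(f"{name}: {gap:.6f}")
```

On the full vectors, the same quantity could be computed in the notebook itself by taking the maximum of the absolute difference between, for example, `Xko_mvr.s` and `python_mvr`.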

## Timing results (Julia 1.6)

Here we force all functions to run the same number of coordinate descent iterations, so the timings compare per-iteration cost rather than convergence behavior.
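The timings in this section come from BenchmarkTools' `@btime`, which reports the minimum time over its samples, with the Python solvers invoked through PyCall. As a cross-check, the Python side could also be timed directly with the standard library. The sketch below is a minimal analogue under that assumption: `best_time_ms` is a hypothetical helper, and the workload is a stand-in to be replaced by a real call such as `lambda: solve_mvr(Sigma, verbose=False)`.

```python
import timeit

def best_time_ms(fn, repeat=5, number=1):
    """Minimum wall-clock time per call, in milliseconds, over `repeat` samples."""
    samples = timeit.repeat(fn, repeat=repeat, number=number)
    return min(samples) / number * 1e3

# Stand-in workload; swap in the solver call to be timed.
elapsed = best_time_ms(lambda: sum(i * i for i in range(10_000)))
print(f"{elapsed:.3f} ms")
```

Taking the minimum rather than the mean mirrors what `@btime` prints and reduces the influence of background noise on the machine.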

In [21]:
# simulate covariance matrix Sigma in python, and bring it over to Julia
py"""
p = 300
rho = 0.5
Sigma = (1-rho) * np.eye(p) + rho * np.ones((p, p))
"""
Sigma = py"Sigma";

Maximum entropy

In [22]:
@btime begin
    py"""
    solve_maxent(Sigma, verbose=False) # 5 iter
    """
end

  935.925 ms (3 allocations: 144 bytes)


In [31]:
@btime Knockoffs.solve_max_entropy(Sigma, tol=1e-15); # 5 iter

  169.500 ms (24 allocations: 2.87 MiB)


In [32]:
935.925 / 169.367

5.5260174650315586

MVR

In [33]:
@btime begin
    py"""
    solve_mvr(Sigma, verbose=False) # 5 iter
    """
end

  817.788 ms (3 allocations: 144 bytes)


In [25]:
@btime Knockoffs.solve_MVR(Sigma, tol=1e-13); # 5 iter

  142.864 ms (22 allocations: 2.86 MiB)


In [40]:
817.788 / 142.864

5.724241236420652

Coordinate descent SDP

In [34]:
@btime begin
    py"""
    _solve_maxent_sdp_cd(Sigma, True, verbose=False) # 49 iter
    """
end

  7.112 s (3 allocations: 144 bytes)


In [39]:
@btime Knockoffs.solve_sdp_fast(Sigma, verbose=false, niter=49); # 49 iter

  1.472 s (9 allocations: 1.38 MiB)


In [41]:
7.112 / 1.472

4.831521739130435

Julia is roughly 5x faster in all three cases: about 5.5x for maximum entropy, 5.7x for MVR, and 4.8x for coordinate descent SDP.
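The speedup ratios can be recomputed in one place from the `@btime` outputs printed above. The snippet below is a small sketch that takes those printed timings as given (converted to milliseconds) and divides the Python time by the Julia time for each solver.

```python
# (Python time, Julia time) in milliseconds, copied from the @btime outputs above
timings_ms = {
    "maximum entropy": (935.925, 169.500),
    "MVR": (817.788, 142.864),
    "coordinate descent SDP": (7112.0, 1472.0),
}

speedups = {name: t_python / t_julia for name, (t_python, t_julia) in timings_ms.items()}

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.2f}x")
# maximum entropy: 5.52x
# MVR: 5.72x
# coordinate descent SDP: 4.83x
```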