In [1]:
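include("functions.jl")
using DelimitedFiles, StatsBase  # readdlm, sample (harmless if functions.jl already loads them)

# Load the full dataset and keep a random subset of 200 rows
magic = readdlm("magic04.data", ',', Any, '\n')
magic = magic[sample(1:size(magic, 1), 200, replace = false), :]

# Encode the class label in column 11: "g" -> 1, "h" -> 0
loc = findall(x -> x == "g", magic[:, 11])
magic[loc, 11] .= 1
loc = findall(x -> x == "h", magic[:, 11])
magic[loc, 11] .= 0
magic = convert(Array{Float64}, magic)

# Save the response, then reuse column 11 as the intercept column
class = magic[:, 11]
magic[:, 11] .= 1
n, p = size(magic)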

(200, 11)

Magic gamma telescope dataset
- \# of samples: 19,020
- \# of covariates: 11 (including intercept)
- response: binary

Choose $N_1$ random samples and fit the logistic regression model to obtain $\hat\beta$. Then, using $\hat\beta$, find a locally $A_K$- or $D$-optimal design of sample size $N_2$.
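For reference, the criteria named above can be written out as follows. This is a sketch using the standard textbook definitions; the symbols $M$, $w_i$, and $\mathcal{S}$ are introduced here for illustration only, and the weighting actually used inside `sagnol_A` (defined in `functions.jl`) is not shown in this notebook.

$$
M(\mathcal{S}; \hat\beta) = \sum_{i \in \mathcal{S}} w_i(\hat\beta)\, x_i x_i^\top,
\qquad
w_i(\hat\beta) = \hat p_i (1 - \hat p_i),
\qquad
\hat p_i = \frac{1}{1 + e^{-x_i^\top \hat\beta}}
$$

$$
A_K\text{-optimal:}\ \min_{\mathcal{S}} \operatorname{tr}\!\left(K^\top M^{-1} K\right),
\qquad
D\text{-optimal:}\ \max_{\mathcal{S}} \det M
$$

With $K = I$, the $A_K$ criterion reduces to the usual $A$-criterion $\operatorname{tr}(M^{-1})$.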

Since the whole dataset is too large, we conduct the experiment on a random subset of 200 samples.

#### Case 1, OED
- $A$-optimality
- $N_1$: 30
- $N_2$: 100

In [12]:
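using BenchmarkTools, StatsBase, Random

Random.seed!(1)
N1 = 30   # pilot sample size
N2 = 100  # number of design points to select from the candidates
samp1 = sample(1:n, N1, replace = false)  # pilot sample indices
cand = setdiff(1:n, samp1)                # remaining candidate indices

# IC=1: integer-constrained solve (the log below reports 171 integer variables)
@time aopt = sagnol_A(magic[cand,:], magic[samp1,:], N2; verbose=1, IC=1)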

Problem
  Name                   :                 
  Objective sense        : min             
  Type                   : CONIC (conic optimization problem)
  Constraints            : 3008            
  Cones                  : 171             
  Scalar variables       : 5256            
  Matrix variables       : 0               
  Integer variables      : 171             

Optimizer started.
Mixed integer optimizer started.
Threads used: 24
Presolve started.
Presolve terminated. Time = 0.05
Presolved problem: 2882 variables, 463 constraints, 27240 non-zeros
Presolved problem: 0 general integer, 170 binary, 2712 continuous
Clique table size: 0
BRANCHES RELAXS   ACT_NDS  DEPTH    BEST_INT_OBJ         BEST_RELAX_OBJ       REL_GAP(%)  TIME  
0        1        0        0        NA                   -2.9961939855e-01    NA          0.1   
0        1        0        0        -2.9943662035e-01    -2.9961939855e-01    0.06        0.4   
Cut generation started.
0        2        0        0   

170-element Array{Float64,1}:
 1.0
 0.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 ⋮  
 0.0
 0.0
 0.0
 1.0
 0.0
 1.0
 0.0
 1.0
 0.0
 1.0
 1.0
 0.0

In [13]:
# IC=0: continuous relaxation (the log below reports 0 integer variables)
@time aopt = sagnol_A(magic[cand,:], magic[samp1,:], N2; verbose=1, IC=0)

Problem
  Name                   :                 
  Objective sense        : min             
  Type                   : CONIC (conic optimization problem)
  Constraints            : 3179            
  Cones                  : 171             
  Scalar variables       : 5256            
  Matrix variables       : 0               
  Integer variables      : 0               

Optimizer started.
Presolve started.
Linear dependency checker started.
Linear dependency checker terminated.
Eliminator - tries                  : 0                 time                   : 0.00            
Lin. dep.  - tries                  : 1                 time                   : 0.00            
Lin. dep.  - number                 : 0               
Presolve terminated. Time: 0.01    
Problem
  Name                   :                 
  Objective sense        : min             
  Type                   : CONIC (conic optimization problem)
  Constraints            : 3179            
  Cones               

170-element Array{Float64,1}:
 0.9999637050811876   
 1.5723347073907028e-6
 0.9999992469131076   
 0.9984728906815348   
 0.9999993713492352   
 0.9999192659569099   
 0.9999998533940073   
 0.9999986315986255   
 0.9999892575658899   
 0.999999515256423    
 0.9999988243112853   
 0.99999789086498     
 0.9999957319600167   
 ⋮                    
 7.830113670134259e-6 
 1.421254845588156e-6 
 9.106789426881728e-6 
 0.999998043582222    
 1.3669476183583764e-6
 0.9999845389028844   
 3.46937016511816e-6  
 0.9999983996649368   
 1.979452192440497e-6 
 0.9999856743068779   
 0.9999999602352794   
 1.563541605161239e-6 

In [14]:
#N1 = 30
#N2 = 100
#samp1 = sample(1:n, N1, replace = false)

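# Exact design: integer-constrained solve, rounded to a Boolean mask
aopt = sagnol_A(magic[cand,:], magic[samp1,:], N2; K = magic', verbose=0, IC=1)
aopt = BitArray(round.(aopt))
println(sum(aopt))  # number of selected candidates
samp2 = [samp1; cand[aopt]]
# Unweighted A-criterion tr((X'X)^{-1}) of pilot sample plus selected points
exact = tr(inv(magic[samp2,:]'magic[samp2,:]))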

Optimal
100


14.7893563822838
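The trace evaluated in the cell above is the unweighted $A$-criterion $\operatorname{tr}((X^\top X)^{-1})$ of the combined design. A minimal sketch of the same computation as a reusable function follows; `a_criterion` is a hypothetical helper that is not part of `functions.jl`, and the Cholesky factorization is a choice made here so that a singular design raises an error instead of silently returning a huge value.

```julia
using LinearAlgebra

# Hypothetical helper (not from functions.jl): A-criterion of a design
# whose rows are the selected covariate vectors.
function a_criterion(X::AbstractMatrix{<:Real})
    M = Symmetric(X' * X)        # unweighted information matrix X'X
    return tr(inv(cholesky(M)))  # tr(M^{-1}); cholesky throws if M is not positive definite
end

# Toy check: for X = 2I (3x3), X'X = 4I, so the criterion is 3/4.
X_toy = 2.0 * Matrix{Float64}(I, 3, 3)
a_toy = a_criterion(X_toy)
```

Under these assumptions, `a_criterion(magic[samp2,:])` would reproduce the value stored in `exact`.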

In [15]:
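# Relaxed solve (IC=0): entries of aopt are fractional weights in [0, 1]
aopt = sagnol_A(magic[cand,:], magic[samp1,:], N2; K = magic', verbose=0, IC=0)

n_try = 30  # number of random roundings

apprx = zeros(Float64, n_try)     # A-criterion of each rounded design
apprx_size = zeros(Int64, n_try)  # number of candidates selected in each rounding
for i = 1:n_try
    # Include candidate j independently with probability aopt[j]
    bool_opt = rand(n-N1) .< aopt
    apprx_size[i] = sum(bool_opt)
    samp2 = [samp1; cand[bool_opt]]
    apprx[i] = tr(inv(magic[samp2,:]'magic[samp2,:]))
end

countmap(apprx_size)  # distribution of rounded design sizes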

Stall


└ @ Convex /DATA/home/ppinsm/.julia/packages/Convex/6NNC8/src/solution.jl:51


Dict{Int64,Int64} with 5 entries:
  100 => 11
  102 => 1
  98  => 2
  101 => 9
  99  => 7
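The rounding step in the loop above draws each candidate independently with probability equal to its relaxed weight, so the size of a rounded design is random. A small self-contained sketch on toy weights illustrates this; `bernoulli_round` and the toy vector `w` are hypothetical names introduced only for this example.

```julia
using Random

# Hypothetical helper: draw one random design from relaxed weights w in [0, 1].
# Point j is included independently with probability w[j].
function bernoulli_round(rng::AbstractRNG, w::AbstractVector{<:Real})
    return rand(rng, length(w)) .< w
end

rng = MersenneTwister(1)
w = [1.0, 0.0, 0.5, 0.5]  # toy relaxed solution with sum(w) = 2
sizes = [sum(bernoulli_round(rng, w)) for _ in 1:1000]
mean_size = sum(sizes) / length(sizes)
```

The expected size of a rounded design equals `sum(w)`. If the relaxed weights returned by `sagnol_A` sum to `N2`, which is an assumption here, that would explain why the sizes in the table above scatter around 100.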

In [16]:
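# Ratio of exact to rounded A-criterion; values below 1 mean the rounded design has the larger trace
describe(exact ./ apprx)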

Summary Stats:
Length:         30
Missing Count:  0
Mean:           0.999673
Minimum:        0.988966
1st Quartile:   0.998954
Median:         1.000270
3rd Quartile:   1.002729
Maximum:        1.005671
Type:           Float64
