In [1]:
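include("functions.jl")
using DelimitedFiles, StatsBase  # readdlm, sample (in case functions.jl does not load them)

# Read the raw data; column 11 holds the class label ("g" or "h").
magic = readdlm("magic04.data", ',', Any, '\n')
# Keep a random subset of 200 rows.
magic = magic[sample(1:size(magic, 1), 200, replace = false), :]
# Recode the labels: "g" -> 1, "h" -> 0.
loc = findall(x -> x == "g", magic[:, 11])
magic[loc, 11] .= 1
loc = findall(x -> x == "h", magic[:, 11])
magic[loc, 11] .= 0
magic = convert(Array{Float64}, magic)

# Save the response, then reuse column 11 as the intercept.
class = magic[:, 11]
magic[:, 11] .= 1
n, p = size(magic)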

(200, 11)

Magic gamma telescope dataset
- \# of samples: 19,020
- \# of covariates: 11 (including intercept)
- response: binary

Choose $N_1$ random samples and obtain $\hat\beta$ for the logistic regression model. Then find a locally {$A_K$, $D$}-optimal design of sample size $N_2$, using $\hat\beta$ as the parameter value.

Since the whole dataset is too large, we conduct our experiment on a random subset of 200 samples.
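
The first stage (estimating $\hat\beta$ from $N_1$ samples) is not shown in this notebook. As a minimal sketch of what such a fit could look like, the block below runs Newton's method for logistic regression on synthetic data. The function name `fit_logistic`, the synthetic design, and `β_true` are all hypothetical and are not taken from `functions.jl`.

```julia
using LinearAlgebra, Random

# Newton's method (IRLS) for logistic regression.
# X is assumed to already contain the intercept column, as `magic` does.
function fit_logistic(X, y; iters = 25)
    β = zeros(size(X, 2))
    for _ in 1:iters
        prob = 1 ./ (1 .+ exp.(-X * β))   # fitted probabilities
        w = prob .* (1 .- prob)            # logistic variance weights
        # Newton step: (X' W X)^{-1} X' (y - prob)
        β += (X' * (w .* X)) \ (X' * (y .- prob))
    end
    return β
end

# Synthetic data (hypothetical, for illustration only).
rng = MersenneTwister(1)
n_obs = 500
X = hcat(randn(rng, n_obs, 2), ones(n_obs))
β_true = [1.0, -0.5, 0.2]
y = Float64.(rand(rng, n_obs) .< 1 ./ (1 .+ exp.(-X * β_true)))

β_hat = fit_logistic(X, y)
# At the maximum likelihood estimate the score vector should vanish.
score = X' * (y .- 1 ./ (1 .+ exp.(-X * β_hat)))
println("β_hat = ", β_hat)
println("score norm = ", norm(score))
```

In the notebook's setting, `X` would be `magic[samp1,:]` and `y` would be `class[samp1]`.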

#### Case 2, OED
- $D$-optimality
- $N_1$: 30
- $N_2$: 100
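
For reference, one common way to write the second-stage problem is sketched below. It assumes that `sagnol_D` (defined in `functions.jl`, which is not shown here) solves a problem of this form; the symbols $S_1$, $C$, $w_j$, $\pi_i$, and $M$ are notation introduced for this sketch.

$$
M(w) = \sum_{i \in S_1} \pi_i (1 - \pi_i)\, x_i x_i^\top + \sum_{j \in C} w_j\, \pi_j (1 - \pi_j)\, x_j x_j^\top,
\qquad \pi_i = \frac{1}{1 + e^{-x_i^\top \hat\beta}}
$$

$$
\max_{w} \; \log\det M(w) \quad \text{subject to} \quad \sum_{j \in C} w_j = N_2, \quad w_j \in \{0, 1\}
$$

Here $S_1$ is the set of $N_1$ first-stage samples and $C$ is the set of remaining candidates. Relaxing $w_j \in \{0, 1\}$ to $0 \le w_j \le 1$ gives a convex problem whose solution can be rounded at random. Note that the cells below score each design with the unweighted determinant $\det(X^\top X)$.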

In [14]:
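using BenchmarkTools, StatsBase
using Random, LinearAlgebra  # Random.seed!, det (in case functions.jl does not load them)

Random.seed!(1)
N1 = 30
N2 = 100
# Stage 1: N1 random rows; the remaining rows are candidates for stage 2.
samp1 = sample(1:n, N1, replace = false)
cand = setdiff(1:n, samp1)

# IC=1: mixed-integer solve (the log below reports 170 binary variables).
@time aopt = sagnol_D(magic[cand,:], magic[samp1,:], N2; verbose=1, IC=1)
aopt = BitArray(round.(aopt))  # round the solver output to a 0/1 mask
println(sum(aopt))             # number of selected candidates
samp2 = [samp1; cand[aopt]]
exact = det(magic[samp2,:]'magic[samp2,:])  # criterion value of the exact design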

Problem
  Name                   :                 
  Objective sense        : min             
  Type                   : CONIC (conic optimization problem)
  Constraints            : 9962            
  Cones                  : 1893            
  Scalar variables       : 12277           
  Matrix variables       : 0               
  Integer variables      : 171             

Optimizer started.
Mixed integer optimizer started.
Threads used: 24
Presolve started.
Presolve terminated. Time = 0.14
Presolved problem: 8048 variables, 3852 constraints, 26581 non-zeros
Presolved problem: 0 general integer, 170 binary, 7878 continuous
Clique table size: 0
BRANCHES RELAXS   ACT_NDS  DEPTH    BEST_INT_OBJ         BEST_RELAX_OBJ       REL_GAP(%)  TIME  
0        1        0        0        NA                   -7.7828933196e+03    NA          0.6   
0        1        0        0        -9.1611658192e+03    -9.1611658192e+03    0.00e+00    1.9   
An optimal solution satisfying the relative gap tolera

8.164016818453245e39

In [15]:
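# IC=0: continuous solve (the log below reports 0 integer variables).
@time aopt = sagnol_D(magic[cand,:], magic[samp1,:], N2; verbose=1, IC=0)

n_try = 30

apprx = zeros(Float64, n_try)     # criterion value of each rounded design
apprx_size = zeros(Int64, n_try)  # number of candidates selected in each draw
for i = 1:n_try
    # Randomized rounding: keep candidate j with probability aopt[j].
    bool_opt = rand(n-N1) .< aopt
    apprx_size[i] = sum(bool_opt)
    samp2 = [samp1; cand[bool_opt]]
    apprx[i] = det(magic[samp2,:]'magic[samp2,:])
end

countmap(apprx_size)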

Problem
  Name                   :                 
  Objective sense        : min             
  Type                   : CONIC (conic optimization problem)
  Constraints            : 10133           
  Cones                  : 1893            
  Scalar variables       : 12277           
  Matrix variables       : 0               
  Integer variables      : 0               

Optimizer started.
Presolve started.
Linear dependency checker started.
Linear dependency checker terminated.
Eliminator started.
Freed constraints in eliminator : 1894
Eliminator terminated.
Eliminator started.
Freed constraints in eliminator : 0
Eliminator terminated.
Eliminator - tries                  : 2                 time                   : 0.00            
Lin. dep.  - tries                  : 1                 time                   : 0.00            
Lin. dep.  - number                 : 0               
Presolve terminated. Time: 0.02    
Problem
  Name                   :                 
  Objective

└ @ Convex /DATA/home/ppinsm/.julia/packages/Convex/6NNC8/src/solution.jl:51


Dict{Int64,Int64} with 10 entries:
  100 => 4
  102 => 5
  98  => 3
  101 => 3
  99  => 2
  103 => 3
  104 => 2
  97  => 5
  96  => 1
  105 => 2

In [16]:
describe(apprx ./ exact)

Summary Stats:
Length:         30
Missing Count:  0
Mean:           0.998754
Minimum:        0.825276
1st Quartile:   0.911174
Median:         1.007646
3rd Quartile:   1.081248
Maximum:        1.213229
Type:           Float64
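
The spread in design sizes reported by `countmap(apprx_size)` follows from the rounding step: each candidate is kept independently with probability equal to its relaxed weight, so only the expected size equals the sum of the weights. A minimal, self-contained sketch is given below. It uses hypothetical uniform weights rather than the actual `sagnol_D` output, and the helper name `bernoulli_round` is made up for illustration.

```julia
using Random, Statistics

# Keep entry j with probability w[j]; returns a 0/1 mask.
bernoulli_round(rng, w) = rand(rng, length(w)) .< w

rng = MersenneTwister(1)
m, N2 = 170, 100          # candidates and target size, as in the cells above
w = fill(N2 / m, m)       # hypothetical relaxed weights in [0, 1] summing to N2

# Repeat the rounding many times and record the size of each design.
sizes = [sum(bernoulli_round(rng, w)) for _ in 1:2000]
println("mean size = ", mean(sizes))
println("std of size = ", std(sizes))
```

The rounded size has variance $\sum_j w_j (1 - w_j)$, so it varies less when many relaxed weights are already close to 0 or 1. Because the rounded designs differ in size from the exact design, a ratio `apprx ./ exact` above 1 does not mean that the exact solution is suboptimal at size $N_2$.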
