# Randomized Controlled Trials discussion questions
	• What is a Randomized Controlled Trial (RCT)? When/where are RCTs used?
    
    An experimental design in which subjects are randomly assigned to treatment or control groups. 
    RCTs are used in economics, clinical research, and engineering.     
    
	• Why randomization in scientific trials? What is the objective of an RCT? Tip: think statistics.
    
    To remove the influence of bias and subject variation on the measured treatment/placebo effects. 
    The objective of an RCT is to make the statistics (e.g. moments) of the two groups as similar as possible, while retaining good variation within each group.   
    
	• Can we use optimization to improve outcomes of an RCT? How? What is the form of the problem?
    We can do optimization over moments, using data about participants to assign them. 
    The trick is mixed-integer optimization (MIO).  
    
	• How can uncertainty play a role in RCTs, or in optimal experimental design (OED)?
    
    In other words, where could uncertainty come from (let's assume we are doing experimental design for Covid vaccine trials):
    - Measurement error in patient features. 
    - Misreporting error. 
    - Interpolation of omitted values (if we do imputation). 
    

# Robust Optimal Experiments

The Randomized Controlled Trial (RCT) is a trusted method in experimental design that aims to measure responses to certain interventions while reducing the discrepancy in results due to variance in subjects. In fact, in 2019, Profs. Esther Duflo and Abhijit Banerjee from MIT (jointly with Michael Kremer) won the Nobel Prize in Economics for using RCTs to address issues in development economics (esp. poverty and availability of healthcare). 

But very few people talk about the fact that RCTs are quite ineffective in several respects:
- They rely on the Law of Large Numbers, and large experimental populations are expensive. 
- For small samples, they are bad at balancing the experimental groups, i.e. at achieving "uniform randomness" across them. 

Instead, there is research to suggest that **optimal** experimental design (OED) can be significantly more powerful. 

This lecture will hopefully demonstrate that randomization is NOT a reliable method for getting the right distribution of "features" in subjects. Furthermore, it will demonstrate the influence of robustness on OEDs. 
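
To make the small-sample issue concrete, here is a minimal simulation sketch (not part of the original recitation). It assumes a single standard-normal trait and an even random split into two groups of size `k`; the function name `mean_imbalance` and the trial count are illustrative choices. For this setup the expected absolute gap between group means is $2/\sqrt{\pi k}$, so roughly 0.5 for `k = 5` and 0.05 for `k = 500`.

```julia
using Random, Statistics

# Average absolute difference in group means for one standard-normal trait
# when 2k subjects are split at random into two groups of size k.
function mean_imbalance(k::Int; n_trials::Int = 2000, rng = MersenneTwister(1))
    total = 0.0
    for _ in 1:n_trials
        trait = randn(rng, 2k)          # one trait value per subject
        perm = randperm(rng, 2k)        # random assignment
        total += abs(mean(trait[perm[1:k]]) - mean(trait[perm[k+1:end]]))
    end
    return total / n_trials
end

small = mean_imbalance(5)     # 5 subjects per group, as in the trial below
large = mean_imbalance(500)   # 500 subjects per group
println("k = 5:   ", round(small; sigdigits = 3))
println("k = 500: ", round(large; sigdigits = 3))
```

The gap shrinks only like $1/\sqrt{k}$, which is why pure randomization needs large populations to balance the groups.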

## Motivational Problem: Medical Trials

Suppose that we only have the budget to conduct initial Covid-19 vaccine trials on 10 patients, where the patients are split 50/50 between control and treatment groups. We have had 20 applicants, each with 5 traits, which we generate randomly. (We have chosen small numbers since this problem can quickly become computationally challenging, but it can be solved at larger scale as well.)

For simplicity, we will only consider the diagonal of the covariance matrix in recitation. However, this method can be extended to the full covariance matrix, by adding a lot more variables!
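
As a rough sketch of what that extension could look like (the notation here is an assumption, not taken from the recitation): let $d_{it}$ be trait $t$ of person $i$, $x_{ig} \in \{0,1\}$ indicate that person $i$ is assigned to group $g$, and $k$ be the group size. The cross-moment of traits $s$ and $t$ in group $g$ is still linear in $x$, and each pair $s < t$ gets its own error variable $C_{st}$:

$$
c_{g,s,t} = \frac{1}{k}\sum_{i} d_{is}\, d_{it}\, x_{ig},
\qquad
C_{st} \ge \pm\left(c_{1,s,t} - c_{2,s,t}\right) \quad \forall\, s < t .
$$

With 5 traits this adds 10 off-diagonal pairs on top of the 5 diagonal terms, and the count grows quadratically in the number of traits.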

In [None]:
using Pkg
Pkg.activate(".")
using JuMP, Distributions, Random, LinearAlgebra, Gurobi, Plots

In [None]:
function generate_random_people(n_people::Int64 = 20, n_traits::Int64 = 5)
    # Traits are drawn from a standard normal, i.e. the data is already normalized,
    # which makes the formulation more straightforward.
    continuous_values = rand(MersenneTwister(314), Normal(0.00, 1), (n_people, n_traits))
    return continuous_values
end
n_groups = 2     # control and vaccine
n_patients = 10  # total number of patients we can afford
n_people = 20    # number of applicants
n_traits = 5     # traits recorded per applicant
data = generate_random_people(n_people, n_traits)

In [None]:
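# Let's first try randomization. The traits are i.i.d. draws, so taking the
# first n_patients applicants in order is equivalent to a random assignment.
ctrl_idxs = collect(1:(n_patients ÷ 2))
vacc_idxs = collect((n_patients ÷ 2 + 1):n_patients)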
function print_details(data, ctrl_idxs, vacc_idxs)
    println("Control group: ", ctrl_idxs)
    println("Vaccine group: ", vacc_idxs)
    println("Mean traits of control group: ", round.(mean(data[ctrl_idxs, :], dims=1); sigdigits = 4))
    println("Mean traits of vaccine group: ", round.(mean(data[vacc_idxs, :], dims=1); sigdigits = 4))
    println("Var of traits of control group: ", round.(var(data[ctrl_idxs, :], dims=1); sigdigits = 4))
    println("Var of traits of vaccine group: ", round.(var(data[vacc_idxs, :], dims=1); sigdigits = 4))
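    # Note: this uses the centered sample variance, whereas the optimization models
    # match uncentered second moments, so it need not equal the solver's objective value.
    println("Nominal objective: ", round.(sum(abs.(mean(data[ctrl_idxs, :], dims=1) - mean(data[vacc_idxs, :], dims=1))) + 
            0.5 * sum(abs.(var(data[ctrl_idxs, :], dims=1) - var(data[vacc_idxs, :], dims=1))); sigdigits = 4))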
    return
end
print_details(data, ctrl_idxs, vacc_idxs)
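
The cell above simply takes applicants 1 to 10 in order. If you would rather draw the groups explicitly at random, a minimal sketch could look like the following; `random_assignment` and the seed are hypothetical names and values introduced here for illustration.

```julia
using Random

# Draw n_selected of n_applicants at random and split them evenly into two groups.
function random_assignment(n_applicants::Int, n_selected::Int; rng = MersenneTwister(2718))
    chosen = randperm(rng, n_applicants)[1:n_selected]
    half = n_selected ÷ 2
    return sort(chosen[1:half]), sort(chosen[half+1:end])
end

ctrl_rand, vacc_rand = random_assignment(20, 10)
println("Control group: ", ctrl_rand)
println("Vaccine group: ", vacc_rand)
```

Passing the result to `print_details(data, ctrl_rand, vacc_rand)` reports the same statistics for this draw.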

In [None]:
# Plotting the errors
function plot_errors(data, ctrl_idxs, vacc_idxs)
    l = @layout [a ; b]
    p1 = bar(collect(1:n_traits), mean(data[ctrl_idxs, :], dims=1)' .-  mean(data[vacc_idxs, :], dims=1)', label = "Mean errors")
    p2 = bar(collect(1:n_traits), var(data[ctrl_idxs, :], dims=1)' .- var(data[vacc_idxs, :], dims=1)', label = "Variance errors")
    plt = plot(p1, p2, layout = l)
    return plt
end
plt = plot_errors(data, ctrl_idxs, vacc_idxs)

### Can Optimization do better? 
It sure can! Let's start by writing out the problem. 

In this case, we will pick 2 groups of equal numbers of patients from the population, while minimizing the L1-norm error in the mean and variances between the two groups. 
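
Written out, the problem looks roughly as follows. The symbols are shorthand chosen here and may differ from other references: $d_{it}$ is trait $t$ of applicant $i$, $x_{ig}$ is 1 if applicant $i$ is placed in group $g$, $k$ is the group size (`n_patients/n_groups`), and $w$ is the weight on the variance error (called `rho` in the code that follows). As in the code, the "variance" is approximated by the uncentered second moment, which is reasonable because the data is normalized.

$$
\begin{aligned}
\min_{x,\,\mu,\,\sigma,\,M,\,V}\quad & \sum_{t} M_t + w \sum_{t} V_t \\
\text{s.t.}\quad
& \mu_{g,t} = \frac{1}{k}\sum_{i} d_{it}\, x_{ig}, \qquad
  \sigma_{g,t} = \frac{1}{k}\sum_{i} d_{it}^2\, x_{ig} && \forall\, g, t \\
& \sum_{i} x_{ig} = k \quad \forall\, g, \qquad
  \sum_{g} x_{ig} \le 1 \quad \forall\, i \\
& M_t \ge \pm\left(\mu_{1,t} - \mu_{2,t}\right), \qquad
  V_t \ge \pm\left(\sigma_{1,t} - \sigma_{2,t}\right) && \forall\, t \\
& x_{ig} \in \{0, 1\}
\end{aligned}
$$

Because $M_t$ and $V_t$ are minimized, each settles at the absolute value of the corresponding difference, which gives the L1-norm error.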

In [None]:
# Let's start creating our model, and trying to solve without uncertainty
m = Model(Gurobi.Optimizer)
set_optimizer_attribute(m, "OutputFlag", 0)
@variable(m, x[i=1:n_people, 1:n_groups], Bin)
@variable(m, μ_p[i=1:n_groups, j=1:n_traits]) # Mean
@variable(m, σ_p[i=1:n_groups, j=1:n_traits]) # Second moment (variance proxy, since the data is normalized)
for j = 1:n_groups # Taking the mean and second moment of the traits for each group
    @constraint(m, μ_p[j,:] .== 1/(n_patients/n_groups) * 
                    sum(data[i,:] .* x[i,j] for i=1:n_people)) # computing means
    @constraint(m, σ_p[j,:] .== 1/(n_patients/n_groups) * 
                    sum(data[i,:].^2 .* x[i,j] for i=1:n_people)) # computing second moments
    @constraint(m, sum(x[:,j]) == n_patients/n_groups)
end
for i = 1:n_people
    @constraint(m, sum(x[i, :]) <= 1) # each patient only picked at most once
end

@variable(m, M[1:n_traits]) # mean error
@variable(m, V[1:n_traits]) # variance error
rho = 0.5 # weight on the variance error relative to the mean error
@objective(m, Min, sum(M) + rho*sum(V))
for i = 1:n_groups
    for j = i+1:n_groups
        @constraint(m, M[:] .>= μ_p[i,:] - μ_p[j,:])
        @constraint(m, M[:] .>= μ_p[j,:] - μ_p[i,:])
        @constraint(m, V[:] .>= σ_p[i, :] - σ_p[j, :])
        @constraint(m, V[:] .>= σ_p[j, :] - σ_p[i, :])
    end
end

In [None]:
optimize!(m)

In [None]:
# Let's see the results
# Compare against 0.5 rather than 1: solvers return binaries as floats (e.g. 0.9999999)
ctrl_opt = findall(v -> v > 0.5, value.(x[:,1]))
vacc_opt = findall(v -> v > 0.5, value.(x[:,2]))
print_details(data, ctrl_opt, vacc_opt)

In [None]:
# Plotting the distribution
plot_errors(data, ctrl_opt, vacc_opt)

#### Important note: We are not limited to this objective function!
For example, we could try maximizing variance while keeping the mean variation below a threshold... you can try any combination that is bounded from below!
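
One possible way to write that alternative, as a sketch only: keep the assignment and moment constraints of the model above, bound each mean error by a tolerance $\varepsilon$ (a hypothetical parameter you would choose), and maximize the total second moment of the selected groups.

$$
\begin{aligned}
\max_{x,\,\mu,\,\sigma,\,M}\quad & \sum_{g}\sum_{t} \sigma_{g,t} \\
\text{s.t.}\quad & M_t \ge \pm\left(\mu_{1,t} - \mu_{2,t}\right), \qquad M_t \le \varepsilon && \forall\, t \\
& \text{assignment and moment constraints as before.}
\end{aligned}
$$

Since $x$ is binary and the data is finite, this objective is bounded, so the problem is well posed whenever some assignment satisfies the tolerance.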

### How does Robust Optimization (RO) change our solutions? 
Let's find out by writing out the robust version of the problem. 

Note that we have to embed our uncertainty in our errors instead of the mean and variance variables. This is because putting uncertain variables in an equality is equivalent to collapsing the feasible set to a point, as you saw in the first question of Homework 1. 
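
For the uncertainty set used in this recitation, $\{\ell : \|\ell\|_1 \le \Gamma,\ \|\ell\|_\infty \le \rho\}$, the worst case of a linear term $a^\top \ell$ has the dual form

$$
\max_{\ell}\ a^\top \ell \;=\; \min_{y}\ \rho\,\|y\|_1 + \Gamma\,\|a - y\|_\infty ,
$$

which is what the auxiliary variables in the robust model encode. The sketch below is a sanity check of that identity on a tiny example, not part of the original notebook; `worst_case` and `dual_bound` are hypothetical helper names. The left-hand side is computed greedily by pushing the largest entries of $|a|$ to the box limit until the budget runs out.

```julia
# Worst case of a'ℓ over {‖ℓ‖₁ ≤ Γ, ‖ℓ‖∞ ≤ ρ}, computed greedily.
function worst_case(a::AbstractVector{<:Real}, ρ::Real, Γ::Real)
    budget = float(Γ)
    total = 0.0
    for mag in sort(abs.(a); rev = true)
        step = min(ρ, budget)   # perturb this entry as far as box and budget allow
        total += mag * step
        budget -= step
        budget <= 0 && break
    end
    return total
end

# Value of the dual expression ρ‖y‖₁ + Γ‖a − y‖∞ for a given y (an upper bound).
dual_bound(a, y, ρ, Γ) = ρ * sum(abs, y) + Γ * maximum(abs, a .- y)

a = [1.0, -1.0, 0.0, 1.0]      # e.g. entries x[k,2] - x[k,1] for four people
wc = worst_case(a, 0.1, 0.25)
println("worst case:         ", wc)
println("dual bound, y = 0:  ", dual_bound(a, zeros(4), 0.1, 0.25))
println("dual bound, y = a:  ", dual_bound(a, a, 0.1, 0.25))
```

Any choice of `y` gives an upper bound on the worst case, and the best `y` attains it; here `y = 0` is already tight. The solver picks `y` for us, which is why it appears as a decision variable in each robust constraint.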

In [None]:
# Let's start creating our model, and trying to solve with uncertainty
rm = Model(Gurobi.Optimizer)
@variable(rm, x[i=1:n_people, 1:n_groups], Bin)
@variable(rm, μ_p[i=1:n_groups, j=1:n_traits]) # Mean
@variable(rm, σ_p[i=1:n_groups, j=1:n_traits]) # Second moment (variance proxy)
ρ = 0.1 # max perturbation of any single data entry (not the objective weight rho used earlier)
Γ = 2   # budget on the total absolute perturbation
# With the following budget uncertainty
# @uncertain(rm, ell[1:n_people, 1:n_traits])
# @constraint(rm, norm(ell, 1) <= Γ)
# @constraint(rm, -ρ .<= ell .<= ρ)  
for j = 1:n_groups # Taking the mean and second moment of the traits for each group
    @constraint(rm, μ_p[j,:] .== 1/(n_patients/n_groups) * 
                    sum(data[i,:].*x[i,j] for i=1:n_people))
    @constraint(rm, σ_p[j,:] .== 1/(n_patients/n_groups) * 
                    sum(data[i,:].^2 .* x[i,j] for i=1:n_people))
    @constraint(rm, sum(x[:,j]) == n_patients/n_groups)
end
for i = 1:n_people
    @constraint(rm, sum(x[i, :]) <= 1)
end
@variable(rm, M[1:n_traits]) # mean error
@variable(rm, V[1:n_traits]) # variance error
@objective(rm, Min, sum(M) + 0.5*sum(V)) # same 0.5 weight as in the nominal model
# Let's use the robust counterpart. For a coefficient vector a (one entry per person),
# the worst case of a'ell over the set above is min_y ρ*‖y‖_1 + Γ*‖a - y‖_∞.
# The auxiliary variables are therefore indexed by person (1:n_people), not by trait.
for i = 1:n_groups # We embed the uncertainty in the errors!
    for j = i+1:n_groups
        for l = 1:n_traits
            # Mean error, M[l] >= +(difference in means) + worst-case perturbation
            y = @variable(rm, [1:n_people])
            normdummy = @variable(rm, [1:n_people]) # normdummy[k] >= |y[k]|
            @constraint(rm, normdummy .>= y)
            @constraint(rm, normdummy .>= -y)
            infdummy = @variable(rm) # infdummy >= max_k |a[k] - y[k]|
            @constraint(rm, [k = 1:n_people], infdummy >= (x[k,j] - x[k,i] - y[k]))
            @constraint(rm, [k = 1:n_people], infdummy >= -(x[k,j] - x[k,i] - y[k]))
            @constraint(rm, M[l] * n_patients/n_groups >= 
                        sum(data[k,l] .* (x[k,j] - x[k,i]) for k=1:n_people) + 
                        ρ * sum(normdummy) + Γ*infdummy)
            # Mean error, M[l] >= -(difference in means) + worst-case perturbation
            y = @variable(rm, [1:n_people])
            normdummy = @variable(rm, [1:n_people])
            @constraint(rm, normdummy .>= y)
            @constraint(rm, normdummy .>= -y)
            infdummy = @variable(rm)
            @constraint(rm, [k = 1:n_people], infdummy >= (x[k,j] - x[k,i] - y[k]))
            @constraint(rm, [k = 1:n_people], infdummy >= -(x[k,j] - x[k,i] - y[k]))
            @constraint(rm, M[l] * n_patients/n_groups >= 
                        - sum(data[k,l] .* (x[k,j] - x[k,i]) for k=1:n_people) +  
                        ρ * sum(normdummy) + Γ*infdummy)
            # Sometimes you have to get creative... linearization of the change of the variance:
            # (d + ell)^2 ≈ d^2 + 2*d*ell, so the coefficient of ell[k,l] is 2*data[k,l]*(x[k,j] - x[k,i]).
            y = @variable(rm, [1:n_people])
            normdummy = @variable(rm, [1:n_people])
            @constraint(rm, normdummy .>= y)
            @constraint(rm, normdummy .>= -y)
            infdummy = @variable(rm)
            @constraint(rm, [k = 1:n_people], infdummy >= (2*data[k,l]*(x[k,j] - x[k,i]) - y[k]))
            @constraint(rm, [k = 1:n_people], infdummy >= -(2*data[k,l]*(x[k,j] - x[k,i]) - y[k]))
            @constraint(rm, V[l] * n_patients/n_groups >= 
                        sum(data[k,l].^2 .* (x[k,j] - x[k,i]) for k=1:n_people) + 
                        ρ*sum(normdummy) + Γ*infdummy)
            # Variance error, opposite sign
            y = @variable(rm, [1:n_people])
            normdummy = @variable(rm, [1:n_people])
            @constraint(rm, normdummy .>= y)
            @constraint(rm, normdummy .>= -y)
            infdummy = @variable(rm)
            @constraint(rm, [k = 1:n_people], infdummy >= (2*data[k,l]*(x[k,j] - x[k,i]) - y[k]))
            @constraint(rm, [k = 1:n_people], infdummy >= -(2*data[k,l]*(x[k,j] - x[k,i]) - y[k]))
            @constraint(rm, V[l] * n_patients/n_groups >= 
                        -sum(data[k,l].^2 .* (x[k,j] - x[k,i]) for k=1:n_people) + 
                        ρ*sum(normdummy) + Γ*infdummy)
        end
    end
end

In [None]:
optimize!(rm)

In [None]:
# Let's see the results
# Compare against 0.5 rather than 1: solvers return binaries as floats (e.g. 0.9999999)
ctrl_ro = findall(v -> v > 0.5, value.(x[:,1]))
vacc_ro = findall(v -> v > 0.5, value.(x[:,2]))
print_details(data, ctrl_ro, vacc_ro)
println("Robust objective: ", objective_value(rm))

In [None]:
plot_errors(data, ctrl_ro, vacc_ro)

# Conclusions

- Optimal experimental design is a useful method to make sure that the moments of our treatment and control groups are similar, while still being representative of the global population. 
- Uncertainty can result from a variety of factors in experimental designs, such as measurement error, misreporting, and imputation of missing values.
- Robust optimal experimental design can improve the efficacy of experiments with only a small effect on the nominal statistics of the optimized groups. 
- The worst-case (robust) objective of a robust solution can be much worse than its nominal objective, so robust solutions are less conservative than their robust objective value makes them look!