# BinaryTwoStageDesigns - Quickstart

This is a [Jupyter](http://jupyter.org) notebook using the [Julia](https://julialang.org/) kernel [IJulia.jl](https://github.com/JuliaLang/IJulia.jl). It demonstrates the use of the Julia package [BinaryTwoStageDesigns](https://github.com/imbi-heidelberg/BinaryTwoStageDesigns).

To run this notebook, working installations of the Cbc and Ipopt solver packages for MathProgBase are required.

In [1]:
using BinaryTwoStageDesigns, Cbc, Ipopt, DataFrames

## Setting

Assume that a new anti-cancer agent is to be tested against a historical response rate of $p_0=0.2$ in a phase-II trial and a response rate of $p_1=0.4$ is expected.
The maximal tolerable type-I-error rate for testing $\mathcal{H}_0:p\leq p_0$ is 5% and a type-II-error rate of 20% is deemed acceptable at $p_1=0.4$.

The corresponding single-stage design would require $n=47$ patients in this situation.
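
For orientation, the sample size of an exact single-stage binomial test can be found by brute force. The following is a minimal sketch in plain Julia and is not part of BinaryTwoStageDesigns; the helper names `upper_tail` and `single_stage` are hypothetical. The search returns the smallest $n$ for which some critical value $c$ satisfies both error constraints. Because the power of an exact test is not monotone in $n$, other conventions may lead to a different number.

```julia
# P(X >= c) for X ~ Binomial(n, p), accumulated from a recursively
# updated probability mass function (requires 0 < p < 1).
function upper_tail(n, p, c)
    pmf = (1 - p)^n                          # P(X = 0)
    tail = c <= 0 ? pmf : 0.0
    for k in 1:n
        pmf *= (n - k + 1) / k * p / (1 - p) # P(X = k) from P(X = k - 1)
        if k >= c
            tail += pmf
        end
    end
    return tail
end

# Smallest n (with critical value c, reject if X >= c) such that the
# type one error rate is <= alpha and the power at p1 is >= 1 - beta.
function single_stage(p0, p1, alpha, beta; nmax = 200)
    for n in 1:nmax
        for c in 0:(n + 1)
            if upper_tail(n, p0, c) <= alpha
                # c is the smallest admissible critical value for this n;
                # any larger c could only reduce the power.
                if upper_tail(n, p1, c) >= 1 - beta
                    return (n, c)
                end
                break
            end
        end
    end
    return nothing
end

n, c = single_stage(0.2, 0.4, 0.05, 0.2)
println("n = ", n, ", reject if at least ", c, " responses are observed")
```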

In [2]:
p0    = 0.2
p1    = 0.4
beta  = 0.2
alpha = 0.05

0.05

## Adaptive Design

Alternatively, a two-stage adaptive design could be used which minimizes the expected sample size under $p_1=0.4$ subject to the same constraints. 
Additionally, for operational reasons a potential second stage must enroll at least 5 patients. Also, upon rejection of the null hypothesis, at least 25 patients must be enrolled to ensure a sufficiently precise effect estimate for subsequent phase-III planning.

### Sample Space

First, a sample space object is defined. It simply holds information about the allowable search space for the optimization algorithm. Here, the range of possible stage-one sample sizes is limited to 15 to 25 and the maximal overall sample size to 75.

In [3]:
s = SampleSpace(
    15:25, # n1 range
    75     # nmax
)



### Parameters

Next, the design parameters are stored in an object as well. The simplest parameters object corresponds to minimizing the expected sample size. Besides the sample space object created earlier, a `MESS` object only requires $p_0$, $p_1$, the maximal type one and type two error rates, and the response rate at which the expected sample size is to be minimized.

In [4]:
params1 = MESS(
    s,                          # sample space
    p0, p1;                     # null and planning alternative
    alpha = alpha, beta = beta, # max. type one and two error rates
    pess = p1                   # alternative on which to minimize expected sample size
)

SampleSpace
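
The objective behind a `MESS` object, the expected sample size at a given response rate, can be illustrated without the package. The sketch below is plain Julia under stated assumptions: the function names `binomial_pmf` and `expected_sample_size` are hypothetical, and the stage-two sample size vector `n2` describes a made-up design that is not the optimal one.

```julia
# P(X = 0), ..., P(X = n) for X ~ Binomial(n, p), with 0 < p < 1.
function binomial_pmf(n, p)
    probs = zeros(n + 1)
    probs[1] = (1 - p)^n
    for k in 1:n
        probs[k + 1] = probs[k] * (n - k + 1) / k * p / (1 - p)
    end
    return probs
end

# E[N] = n1 + sum over x1 of n2(x1) * P(X1 = x1), where n2[x1 + 1] is the
# stage-two sample size after observing x1 responses in stage one.
function expected_sample_size(n1, n2, p)
    probs = binomial_pmf(n1, p)
    return n1 + sum(n2[x1 + 1] * probs[x1 + 1] for x1 in 0:n1)
end

# Hypothetical design: stop early with at most 3 or at least 8 responses
# out of 15, otherwise enroll 20 further patients.
n1 = 15
n2 = [4 <= x1 <= 7 ? 20 : 0 for x1 in 0:n1]
println(expected_sample_size(n1, n2, 0.4))
```

An optimizer searches over all admissible vectors `n2` (and the corresponding critical values) for each stage-one sample size in the sample space.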



### Optimization

Finally, a solver is defined and the optimization process is started. Note that both the optimal design and all designs found while exhaustively exploring the $n_1$-space are returned. The basic technique via integer programming has been described in [Kunzmann & Kieser 2016](https://arxiv.org/abs/1605.00249).

In [None]:
design1, res1 = optimaldesign(params1, CbcSolver(), VERBOSE = 1)
DataFrame(design1)

MESS
optimizing design for parameters ''
considering 11 stage-one sample sizes between 15 and 25 using Cbc.CbcMathProgSolverInterface.CbcSolver as solver

    time    n1   % done   sol. time [s]   cum. time [min]       score        best   % diff to best
16:36:11    15      9.1             108               1.8   +2.45e+01   +2.45e+01              0.0



#### Bayesian Approach

Alternatively, a Bayesian design criterion can be used where the expected sample size under a prior distribution is minimized subject to a constraint on expected power.
To this end, the minimal clinically relevant response rate $p_{MCR}$ must be defined. 
Here we assume that $p_{MCR}=p_0+0.1$.

In [None]:
pmcr = p0 + .1

For the prior, we simply define a Beta distribution with mass centered slightly below $0.4$:

In [None]:
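import Distributions # makes the qualified names Distributions.pdf and Distributions.Beta available
f(p) = Distributions.pdf(Distributions.Beta(5, 8), p)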
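
As a quick sanity check on this prior, its mean is $5/(5+8)\approx 0.385$. The sketch below verifies this numerically in plain Julia, without Distributions.jl, and also computes the prior mass above $p_{MCR}=0.3$. The helpers `beta_pdf` and `midpoint` are hypothetical names introduced only for this illustration, and the closed-form normalizing constant is valid for integer shape parameters only.

```julia
# Density of Beta(a, b) for integer a, b >= 1, using
# 1 / B(a, b) = (a + b - 1)! / ((a - 1)! * (b - 1)!).
function beta_pdf(p, a::Int, b::Int)
    c = factorial(a + b - 1) / (factorial(a - 1) * factorial(b - 1))
    return c * p^(a - 1) * (1 - p)^(b - 1)
end

# Midpoint rule for the integral of g over [lo, hi] with m subintervals.
function midpoint(g, lo, hi; m = 10_000)
    h = (hi - lo) / m
    return h * sum(g(lo + (i - 0.5) * h) for i in 1:m)
end

prior(p) = beta_pdf(p, 5, 8)

total_mass      = midpoint(prior, 0.0, 1.0)               # should be close to 1
prior_mean      = midpoint(p -> p * prior(p), 0.0, 1.0)   # should be close to 5/13
mass_above_pmcr = midpoint(prior, 0.3, 1.0)               # prior probability of p > 0.3

println((total_mass, prior_mean, mass_above_pmcr))
```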

Also, for operational reasons, additional constraints are imposed on the feasible region, i.e., the sample space.
Often, it is sensible to require a certain minimal number of subjects for the second stage to outweigh the operational burden of an interim analysis (here: 5), and to require a certain minimal overall number upon rejection of the null hypothesis to ensure sufficient precision of the response rate estimate when going on to a subsequent phase III trial (here: 25).

In [None]:
s2 = SampleSpace(
    15:25,
    75,
    n2min = 5,    # minimal stage-two sample size
    nmincont = 25 # minimal sample size upon rejection of the null
)

params2 = MBESS(
    s2,                         # sample space
    p0, pmcr, f,                # null, minimal clinically relevant rate, and prior
    alpha = alpha, beta = beta  # max. type one error rate and max. expected type two error rate
)

In [None]:
design2, res2 = optimaldesign(params2, CbcSolver(), VERBOSE = 1)
DataFrame(design2)

## Inference

After completing the trial with 5 responses among the 16 stage-one patients and 10 further responses in stage two, a point estimate and a confidence interval are required. Point estimates were discussed in [Kunzmann & Kieser 2016](http://onlinelibrary.wiley.com/doi/10.1002/sim.7200/abstract) and different estimators are implemented. Here, we use a compatible minimum expected mean squared error estimator with several favorable properties.

In [None]:
est = OCEstimator(design1, IpoptSolver())

estimate(est, 5, 10)

The maximum likelihood estimator (MLE) would have been:

In [None]:
# total number of responses divided by the overall sample size after 5 stage-one responses
(5 + 10) / samplesize(design1, 5)

Any estimator induces an ordering on the sample space which in turn implies p values. The major advantage of the novel optimal compatible estimators in [Kunzmann & Kieser 2016](http://onlinelibrary.wiley.com/doi/10.1002/sim.7200/abstract) is the fact that their implied p values are *always* compatible with the design's test decision.

In [None]:
pvalue(est, 5, 10, p0)

The very same ordering/p values can then be used to derive a Clopper-Pearson type confidence interval (paper under review):

In [None]:
ci = ECPInterval(est, confidence = .9)

limits(ci, 5, 10)
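
For comparison, the classical single-stage Clopper-Pearson interval can be computed by inverting binomial tail probabilities. The sketch below is plain Julia and deliberately ignores the two-stage structure, so it is not what `ECPInterval` computes; it only illustrates the inversion principle. The helper names `upper_tail`, `bisect` and `clopper_pearson` are hypothetical, as is the overall sample size of 40 used in the example call. The recursion is adequate for the moderate sample sizes of phase II trials but may underflow for very large $n$.

```julia
# P(X >= c) for X ~ Binomial(n, p), with 0 < p < 1.
function upper_tail(n, p, c)
    pmf = (1 - p)^n
    tail = c <= 0 ? pmf : 0.0
    for k in 1:n
        pmf *= (n - k + 1) / k * p / (1 - p)
        if k >= c
            tail += pmf
        end
    end
    return tail
end

# Root of a function g that is increasing on [lo, hi], found by bisection.
function bisect(g, lo, hi; iters = 60)
    for _ in 1:iters
        mid = (lo + hi) / 2
        if g(mid) < 0
            lo = mid
        else
            hi = mid
        end
    end
    return (lo + hi) / 2
end

# Two-sided Clopper-Pearson interval for x responses among n patients.
function clopper_pearson(x, n; confidence = 0.9)
    a = (1 - confidence) / 2
    # lower limit solves P(X >= x | p) = a; upper limit solves P(X <= x | p) = a
    lower = x == 0 ? 0.0 : bisect(p -> upper_tail(n, p, x) - a, 0.0, 1.0)
    upper = x == n ? 1.0 : bisect(p -> a - (1 - upper_tail(n, p, x + 1)), 0.0, 1.0)
    return (lower, upper)
end

println(clopper_pearson(15, 40))  # 15 responses, hypothetical n = 40
```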