We will introduce the use of the SafetySignalDetection.jl
package with a small example.
We use a small example data set.
Note that:
- The outcome
y
needs to be a bool vector. A different type of vector (e.g. integer) will fail later for the model fitting. - The column
time
needs to be aFloat64
(time until adverse event or until last treatment or follow up). - The column
trialindex
needs to beInt64
(index of historical trials, starting from 1 and consecutively numbered).
using CSV
using DataFrames
using SafetySignalDetection
module_dir = pkgdir(SafetySignalDetection)
csv_path = joinpath(module_dir, "test", "small_historic_dataset.csv")
df = DataFrame(CSV.File(csv_path))
df.y = map(x -> x == 1, df.y)
For both parameters Distributions.Beta()
.
For the mean
For the discount factor
Note:
- The number of samples (e.g. 10,000) and the number of threads (1) need to be passed as integers. Otherwise the call will fail.
- We don't need to discard a burn in period now because the NUTS sampler is used internally and is discarding a burn in automatically already.
using Distributions
prior_a = Beta(1 / 3, 1 / 3)
prior_b = Beta(5, 5)
map_samples = meta_analytic_samples(df, prior_a, prior_b, 10_000, 1)
map_samples[1:5]
The resulting map_samples
are from the meta-analytic prior for the adverse event rate per unit of time in the control arm in the ongoing blinded trial.
Now we give the fit_beta_mixture()
function, together with the
number of components. Usually 2 components will be sufficient to achieve a good approximation.
Internally, the classic Expectation Maximization (EM) algorithm is used.
We can check the approximation graphically by overlaying the resulting probability density function with the samples histogram.
using Plots
map_approx = fit_beta_mixture(map_samples, 2)
stephist(map_samples, label = "prior samples", norm = :pdf)
xvals = range(minimum(map_samples), maximum(map_samples); length = 100)
plot!(xvals, pdf(map_approx, xvals), label = "approximate prior")
Now we can proceed to analyzing the blinded clinical trial.
The important new ingredients here are:
- The prior on the experimental adverse event rate. This can be uninformative, or weakly informative using a diluted background rate distribution.
- The proportion of patients in the experimental treatment arm.
prior_exp = Beta(1, 1)
prior_ctrl = map_approx
exp_proportion = 0.5
csv_current_path = joinpath(module_dir, "docs", "src", "small_current_dataset.csv")
df_current = DataFrame(CSV.File(csv_current_path))
df_current.y = map(x -> x == 1, df_current.y)
result = blinded_analysis_samples(df_current, prior_exp, prior_ctrl, exp_proportion, 10_000, 1)
first(result, 5)
We can now e.g. look at the posterior probability that the rate in the experimental arm,
mean(result[!, :pi_exp] .> result[!, :pi_ctrl])
We can also create corresponding plots.
stephist(result[!, :pi_exp], label = "experimental arm", norm = :pdf)
stephist!(result[!, :pi_ctrl], label = "control arm", norm = :pdf)