Skip to content

Julia library for Bayesian non- and semi-parametric hazard models using B-splines

License

Notifications You must be signed in to change notification settings

mbrueckner/SplineHazard.jl

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

32 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

SplineHazard.jl

Julia library for Bayesian non- and semi-parametric hazard models using splines

Build Status

This is a Julia library implementing the reversible jump MCMC for Bayesian B-spline hazard models as described in Bayesian adaptive B-spline estimation in proportional hazards frailty models by Sharef et al.

It can be used either as standalone for non-parametric estimation of hazard, cumulative hazard and survival functions, or as part of semi-parametric model, e.g. as baseline hazard in a proportional hazards model.

It uses the Dierckx Julia package for B-spline evaluation.

Features

  • Univariate hazard models using 1-dimensional B-splines with number and location of knot positions determined by the data via reversible jump MCMC
  • Designed to be easily integrated into existing MCMC procedures with only a few lines of code without placing restrictions on the rest of the model (only the likelihood function needs to be supplied by the user)
  • Plot mean posterior hazard / cumulative hazard / survival functions with posterior credible intervals
  • Estimate coverage probability of posterior credible intervals for given true hazard / cumulative hazard / survival functions

TODO

  • More example models (additive hazard model, ...)
  • parametric hazard penalty

Install (Julia 0.7 and later)

(v1.0) pkg> add SplineHazard

(Type ] to enter package mode.) Requires Dierckx, DataFrames and Distributions Julia packages to be installed.

Example Usage (Semi-parametric proportional hazards model)

This example shows how to implement a semi-parametric proportional hazards model with a cubic B-spline baseline hazard and a single covariate. The full code is in src/examples/ph.jl.

Generate an example survival dataset

function ph_data(n=200, beta=0.0, tmax=36)
    z = rand(Binomial(1, 0.5), n) .- 0.5

    ## 25 patients per month
    max_entry = n / 25
    
    entry = rand(Uniform(0, max_entry), n)
    U = -log.(1 .-rand(Uniform(0, 1), n))

    ## 12 months median survival time
    T = U.*exp.(-beta*z)./(log(2)/12)

    ## approx: 10% admin cens, 10% drop-out
    C = min.(rand(Uniform(0, 3*36), n), tmax)
   
    time = min.(T, C)
    status = T .< C
    id = 1:n
    
    DataFrame(id=id, entry=entry, time=time, status=status, z=z)
end

data = ph_data(500, log(0.67)) 

Load the package

using SplineHazard

Define the log-likelihood function. First argument is the B-spline (object of type Dierckx.Spline1D) defined by the current number and location of the knots and the spline weights at the current iteration of the MCMC algorithm. Second argument is the current value of the log-hazard ratio. Third argument is a DataFrame or NamedTuple with fields time (vector of censored event times), status (boolean vector of censoring indicators) and z (vector of group indicators).

function loglik(s::Sampler, beta, data)
    h = hazard(data.time[data.status], s)
    ch = cumhaz(data.time, s)
    sum(log.(h) .+ beta.*data.z[data.status]) - sum(ch.*exp.(beta.*data.z))
end

Function to sample from the full conditional distribution of beta given all other parameters

function sample_beta(beta::Float64, s::Sampler, data, prior::UnivariateDistribution) 
    X = data.z
    time = data.time
    status = data.status

    ch = extract_cumhaz(s, time, s.t)
        
    function log_target(x::Float64)
        logpdf(prior, x) + sum(status.*x.*X - ch.*exp.(x.*X))
    end

    prop = rand(Normal(beta, 2.38), 1)[1]
    r = log_target(prop) - log_target(beta)
    
    if log(rand(1)[1]) < r
        return prop, true
    else
        return beta, false
    end
end

Place 100 candidate knots at quantiles of observed failure times

N_max = 100
cand_knots = quantile(data.time[data.status], Vector(0.0:1/(N_max+1):1.0))[2:(end-1)]

Use Poisson distribution as prior on number of knots, inverse gamma distribution for penalty parameter, use cubic splines, and place the two outer knots at 0.0 and the largest observed failure time.

prior = SplineHazard.Prior(Poisson(4), InverseGamma(0.01, 0.01), (theta, v) -> sum(theta.^2)/(2*v),
                               4, N_max, cand_knots, (0.0, maximum(data.time[data.status])))

Create sampler using log-likelihood function and prior object

s = create_sampler((sp, beta) -> loglik(sp, beta, data), prior)

Randomly select 4 candidate knots as initial knots for the spline

knots0 = sort(sample(1:length(cand_knots), 4))

Set initial values for all parameters. All spline weights set to 0.0 (constant hazard function) and the penalty parameter set to 1.0

M = 10000
set_initial_state!(s, M, Param(4, zeros(Float64, 8), knots0, 1.0))

Tune penalty parameter during warmup iterations aiming for 25% acceptance rate for the spline weight parameters theta. This uses the simple linear regression based tuning as proposed by Sharef et al.

set_tuner!(s, Tuner(Int(M/2))) 

Allocate vector to hold posterior samples of log-hazard ratio and set initial value to 0.0

beta = Vector{Float64}(undef, M+1)
beta[1] = 0.0
beta_ac = 0 ## count acceptances of beta proposals

Normal prior for log-hazard ratio

prior_beta = Normal(0, 10)

Gibbs sampler requires only call to update! function at each iteration to use the RJ-MCMC for the baseline hazard model

for t in 1:M
    beta[t+1], ac = sample_beta(beta[t], s, data, prior_beta)
    beta_ac += ac
        
    update!(s, beta[t+1])  ## beta[t+1] gets passed to the log-likelihood function without modification
end

Print summary information of posterior samples

print("Acceptance rate for beta: $(beta_ac/M)")
print("Posterior mean: $(mean(beta[(s.warmup+1):end]))")
print("Posterior variance: $(var(beta[(s.warmup+1):end]))")

df = summary(s)
head(df)

References

Sharef et al.: Bayesian adaptive B-spline estimation in proportional hazards frailty models

About

Julia library for Bayesian non- and semi-parametric hazard models using B-splines

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages