# Baseline simulations and deterministic sensitivity analyses
This notebook implements the computations for a health state transition model, commonly referred to as a Markov model. In the healthcare setting, a patient occupies one of a predefined set of Markov states (health states) during each unit of time. Each health state is associated with a health reward (Quality-Adjusted Life Years) and costs per unit of time. Transitions between health states occur according to defined transition probabilities. For further explanation of Markov models, read Markov_model_explanation.docx in the repository.

Baseline simulations refer to the use of a set of predefined parameters to perform health state transition simulations. Baseline parameters are reported in table 1 in: <br>

"Cost and health effects of case management compared to outpatient clinic follow-up in a Dutch heart failure cohort" <br> by H. van Voorst and A.E.R. Arnold <br>
DOI: 10.1002/ehf2.12692

In addition to the baseline simulation results, the code below computes deterministic one-way sensitivity analyses based on percentage changes. No correlation between parameters is included in these analyses: each parameter is changed in turn while all other parameters are kept at their baseline value. For a given percentage change of every model parameter, the simulated Quality-Adjusted Life Years (QALYs), costs (€), and Net Monetary Benefit (NMB in €), cumulative over a 5-year simulated follow-up, are computed. An explanation of the background of the computations is available in the Markov_model_explanation.docx file in this repository.
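
The NMB combines incremental QALYs and incremental costs into one monetary figure. The sketch below assumes the usual incremental definition (willingness-to-pay threshold times incremental QALYs, minus incremental costs). The function name, the threshold value and the example numbers are hypothetical and are not taken from the study.

```python
def net_monetary_benefit(d_qalys, d_costs, wtp):
    """Incremental NMB in euros: wtp * d_qalys - d_costs.

    d_qalys: QALY difference (intervention - control) per patient
    d_costs: cost difference (intervention - control) per patient in euros
    wtp: willingness-to-pay threshold in euros per QALY (assumed value)
    """
    return wtp * d_qalys - d_costs


# Hypothetical inputs for illustration only (not study results)
example_nmb = net_monetary_benefit(d_qalys=0.10, d_costs=1500.0, wtp=50_000.0)
print(example_nmb)
```

A positive value means that, at the chosen threshold, the health gain is valued higher than the additional costs.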

This notebook is the first in a series of three:
1. Baseline simulations and one-way deterministic sensitivity analyses.
2. Probabilistic sensitivity analysis: uniform distributed parameters
3. Probabilistic sensitivity analysis: most probable distributed parameters 

In [1]:
import pandas as pd
import numpy as np
import math
import time
import os
import pickle

## Probability-time adjustment functions
Not all baseline input values were estimated over the same time span, so they were converted to a common monthly scale. Furthermore, the probability of an event in the intervention arm was computed from the probability of that event in the control arm and the Relative Risk (RR). The functions below assume that event probabilities are constant over time.

In [2]:
def monthly_prob(totmonths, events, total):
    """
    Function computes the monthly probability of an 
    event if measurement of events is over multiple months
    totmonths: number of months of measurement
    events: number of events in totmonths
    total: total number of patients at risk in totmonths
    """
    prob_event = (events/total)
    prob_surv = 1 - prob_event
    
    monthly_event_free_prob = prob_surv**(1/totmonths)
    monthly_event_prob = 1 - monthly_event_free_prob
    
    return monthly_event_prob


def RR_intervention(p_control, RR, rr_months, pc_months):
    """
    Computes the probability of an event 
    in the intervention arm based on 
    the Relative Risk (RR) and probability
    in the control arm (p_control).
    
    p_control: Probability of event in control arm
    RR: Relative Risk of event in intervention 
    arm relative to control arm
    rr_months: months used to compute RR
    pc_months: months over which p_control is measured
    """
    
    if rr_months==pc_months:
        p_intervention = p_control*RR
    else:
        # first convert the control probability to the 
        # same follow up time of the RR probabilities
        pc_adj = 1-(1-p_control)**(rr_months/pc_months)
        # compute the intervention probability event free
        p_int_eventfree = 1-pc_adj*RR # probability of no event in intervention group
        # go back to the followup time of the control probability
        p_intervention = 1-p_int_eventfree**(pc_months/rr_months)
    return p_intervention
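
As a worked illustration of the two conversions above, the sketch below uses the readmission counts that appear later in this notebook (185 events in 948 patients over 12 months) and the readmission RR of 0.64. The helper names `to_monthly` and `to_period` are hypothetical and only restate the constant-probability assumption; they are not part of the notebook's code.

```python
def to_monthly(p_period, months):
    """Convert a probability measured over `months` months to a 1-month probability."""
    return 1 - (1 - p_period) ** (1 / months)


def to_period(p_monthly, months):
    """Convert a 1-month probability to the probability over `months` months."""
    return 1 - (1 - p_monthly) ** months


# Control arm: 12-month readmission probability and its monthly equivalent
p_year_control = 185 / 948
p_month_control = to_monthly(p_year_control, 12)

# Intervention arm: apply the RR on the 12-month scale on which it was
# estimated, then convert the result back to a monthly probability
rr_readmission = 0.64
p_year_intervention = p_year_control * rr_readmission
p_month_intervention = to_monthly(p_year_intervention, 12)

print(p_month_control, p_month_intervention)
```

Compounding the monthly probability over 12 months returns the original 12-month probability, which is a quick way to check such a conversion.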


## Defining Costs and QALYs per health state
The functions below define the cost and QALY parameters per health state. The function infl_adjustment computes a correction factor for the increase in costs over time.

In [3]:
def infl_adjustment(months, yearly_CPI=1.029):
    """
    Compute an inflation adjustment factor for the number
    of months (months) that have passed since the 
    start year (reference year; 2020). Use a predefined
    inflation factor (yearly_CPI).
    
    Output: The inflation adjustment factor.
    """
    CPI_adj_factor = yearly_CPI**(months/12)
    return CPI_adj_factor

def define_Costs(ic, CPI, refyear):
    """
    - ic: Either the 'Intervention' or 'Control' arm 
        as follow-up costs can differ.
    - CPI: Consumer price index adjustment factor, 
        used to compute the current
        costs indexed from the year in which costs were computed.
        In the study either 2014 or 2016 were 
        used for different costs.
    - refyear: The reference case year in which the simulations start,
        in the study 2020 was used.

    Output: Cost per month for each of the 4 Markov States
    """
    FU_cost = 36*(CPI**(refyear-2016))
    if ic=='Intervention':
        FU_cost = 36*(CPI**(refyear-2014))
    
    Costs_N12 = round(FU_cost,2)
    Costs_N34 = round(FU_cost,2)
    Costs_H = round(3800*(CPI**(refyear-2016)),2)
    Costs_D = 0
    
    return Costs_N12, Costs_N34, Costs_H, Costs_D

def define_QALYs():
    """
    Define monthly QALYs for the 4 Markov 
    states used in this model.
    
    Output: QALYs per month for each of the 4 Markov states
     """
    QALY_N12 = 0.76/12
    QALY_N34 = 0.54/12
    QALY_H = 0.54/12
    QALY_D = 0
    
    return QALY_N12, QALY_N34, QALY_H, QALY_D
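
To make the indexing arithmetic concrete, the sketch below indexes the monthly follow-up cost of 36 euros from 2016 to the 2020 reference year and computes the within-simulation inflation factor after 6 months, both with the yearly factor of 1.029 used above. The helper name `index_cost` is hypothetical.

```python
def index_cost(cost, yearly_cpi, years):
    """Index a cost forward by `years` years at a constant yearly CPI factor."""
    return cost * yearly_cpi ** years


yearly_cpi = 1.029

# Follow-up cost estimated in 2016, expressed in 2020 euros
cost_2020 = index_cost(36.0, yearly_cpi, 2020 - 2016)

# Inflation factor after 6 simulated months (same formula as infl_adjustment)
factor_6m = yearly_cpi ** (6 / 12)

print(round(cost_2020, 2), round(factor_6m, 4))
```

Applying the 6-month factor twice gives exactly one year of inflation, so the monthly adjustment is consistent with the yearly factor.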


## Model input definition
The function model_input receives a dictionary with all parameters (the control-arm transition probabilities, RRs, costs and QALYs) and returns the transition probability matrices, the cost matrices and the QALY matrix for the control and intervention arms. As the model contains 4 Markov states, a patient in any state can move to one of 4 states in each simulation period (month), so a 4x4 transition matrix is defined.

In [4]:
def model_input(dct):
    """
    A dictionary with all the below defined
    parameters was used as input for this function.
    
    Output: control (fullc) and intervention (fulli) 
    transition matrices. Cost (control;intervention: 
    C_mat_c;C_mat_i) and QALY (Q_mat) matrices.
    """
    
    #control arm
    bc = dct['b']
    cc = dct['c']
    dc = dct['d']
    ac = 1-bc-cc-dc
    
    ec = dct['e'] #0
    gc = dct['g']
    hc = dct['h']
    fc = 1-ec-gc-hc
    
    jc = dct['j']
    kc = dct['k'] #0
    lc = dct['l']
    ic = 1-jc-kc-lc
        
    fullc = np.array([[ac, bc, cc, dc],
                 [ec, fc, gc, hc],
                 [ic, jc, kc, lc],
                 [0,0,0,1]], dtype = 'float64')
    
    #intervention arm 
    # RR was computed over 12 months
    # Probabilities over 1 month
    bi = dct['b']
    ci = RR_intervention(dct['c'], dct['RR_read'], 12, 1)
    di = RR_intervention(dct['d'], dct['RR_mort'], 12, 1)
    ai = 1-bi-ci-di

    ei = dct['e'] 
    gi = RR_intervention(dct['g'], dct['RR_read'], 12, 1)
    hi = RR_intervention(dct['h'], dct['RR_mort'], 12, 1)
    fi = 1-ei-gi-hi
    
    ji = dct['j']
    ki = dct['k'] 
    li = dct['l']
    ii = 1-ji-ki-li

    fulli = np.array([[ai, bi, ci, di],
                     [ei, fi, gi, hi],
                     [ii, ji, ki, li],
                     [0,0,0,1]],dtype = 'float64')
    
    Q_mat = np.array([dct['Q_N12'],dct['Q_N34'],dct['Q_H'],0])
    # define the cost matrices, assume equal costs for hospitalization
    C_mat_c = np.array([dct['C_N12_c'],dct['C_N34_c'],dct['C_H_c'],0])
    C_mat_i = np.array([dct['C_N12_i'],dct['C_N34_i'],dct['C_H_c'],0])

    return fullc, fulli, C_mat_c, C_mat_i, Q_mat
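
The residual construction used in model_input (a, f and i are computed as one minus the other entries in their row) can be sketched in isolation as follows. The helper `with_residual` and all probabilities below are hypothetical; the sketch only shows how each row is made to sum to 1 and adds a guard against negative residuals, which the function above does not have.

```python
import numpy as np


def with_residual(probs, residual_index):
    """Return a row in which the entry at residual_index is 1 - sum(other entries)."""
    row = np.array(probs, dtype="float64")
    row[residual_index] = 0.0
    row[residual_index] = 1.0 - row.sum()
    if row[residual_index] < 0:
        raise ValueError("the other probabilities in this row sum to more than 1")
    return row


# Hypothetical monthly probabilities; states ordered N12, N34, H, D
P = np.vstack([
    with_residual([0.0, 0.00, 0.02, 0.02], residual_index=0),  # from N12
    with_residual([0.00, 0.0, 0.02, 0.04], residual_index=1),  # from N34
    with_residual([0.0, 0.07, 0.00, 0.08], residual_index=0),  # from H
    [0.0, 0.0, 0.0, 1.0],                                      # dead is absorbing
])

print(P.sum(axis=1))
```

Checking that every row sums to 1 after a percentage change is a cheap safeguard, because a large change can push a residual below zero.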


## Simulate a month
The function below simulates a single period (month) based on input transition probabilities in a matrix and then calculates the QALYs and Costs with discounting.

In [5]:
def simulate_month(df, # pd DataFrame where all results are stored
                   r, # A name to add to each row of new results in df
                   month, # the period (month) since start of simulation
                   patient_dist, # Markov state distribution before new period
                   transition_mat, # Matrix with transition probabilities
                   Q_mat, # Matrix with QALYs per Markov state
                   C_mat, # Matrix with Costs per Markov state
                   discount_rate_C, # Discounting % for costs
                   CPI, # Inflation rate
                   discount_rate_Q): # Discounting % for QALYs
    """
    Simulates a single month given input parameters
    
    Output: df with results (df) and new Markov state
    distribution of patients
    """
    
    # Compute inflation adjustment factor (CPI_adj)
    # for the amount of months that have passed since 
    # the begin of simulations
    CPI_adj = infl_adjustment(month, CPI)
    
    # compute the patient Markov state distribution after 1 period (month)
    new_patient_dist = np.matmul(patient_dist,transition_mat)
    
    # Use the patient Markov state distribution 
    #to compute costs and QALYs for the specified period (month)
    QALYs = new_patient_dist*Q_mat
    Costs = new_patient_dist*C_mat
    T_Q = QALYs.sum()
    T_C = Costs.sum()
    # Compute discounted costs and QALYs
    disc_factor_C = discount_rate_C**(month/12)
    disc_factor_Q = discount_rate_Q**(month/12)
    disc_Q = T_Q/(disc_factor_Q )
    disc_C = round((T_C*CPI_adj)/(disc_factor_C), 2)
    
    # put everything in a new row in the dataframe
    nr = [r, month, *new_patient_dist, 
          *QALYs, T_Q, disc_Q, 
          *Costs, T_C, disc_C]
    df.loc[len(df)]=nr
    
    return df, new_patient_dist
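
The core of a single simulated month is one vector-matrix product followed by discounting. The sketch below reproduces that step on its own, using the monthly QALY weights and the yearly QALY discount factor of 1.015 from this notebook; the transition matrix and the cohort of 1000 patients are hypothetical.

```python
import numpy as np

# Hypothetical transition matrix; states ordered N12, N34, H, D
P = np.array([[0.96, 0.00, 0.02, 0.02],
              [0.00, 0.94, 0.02, 0.04],
              [0.85, 0.07, 0.00, 0.08],
              [0.00, 0.00, 0.00, 1.00]])

dist = np.array([0.0, 0.0, 1000.0, 0.0])             # everyone starts in hospital
qaly_per_state = np.array([0.76, 0.54, 0.54, 0.0]) / 12

month = 1
new_dist = dist @ P                                   # distribution after one month
qalys = float((new_dist * qaly_per_state).sum())      # undiscounted QALYs this month
disc_qalys = qalys / 1.015 ** (month / 12)            # discounted to the start year

print(new_dist, qalys, disc_qalys)
```

The row vector is multiplied from the left, so entry (r, c) of the matrix is read as the probability of moving from state r to state c.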


## Perform the simulation
The function below implements the simulation of a single cohort over multiple periods (months) given a specified transition matrix.

In [6]:
def simulate_cohort(transition_mat,  # the transition matrix per period (month)
                    C_mat, # Matrix with Costs per Markov state
                    Q_mat, # Matrix with QALYs per Markov state
                    r='Control', #Name to add to each row in the output df
                    sim_months = 60, # Total number of months to simulate
                    cohort_size = 1e5, # Amount of patients in the cohort
                    CPI = 1.029, # Yearly inflation rate (2.9%)
                    discount_rate_Q = 1.015,# Yearly discounting rate of QALYs
                    discount_rate_C = 1.04):# Yearly discounting rate of Costs
    """
    Simulates the cohort for multiple periods (months)
    
    Output: Dataframe with outcome per period (result_df),
    cumulative costs (Cost_tot_disc) and QALYs (QALY_tot_disc) 
    over the simulated period (sim_months)
    """
    
    t1 = time.time()
    # Define columns of the output file
    result_df = \
    pd.DataFrame(columns = ['Cohort_type', 'Month', 
            'NYHA_12', 'NYHA_34', 'Hospital', 'Dead', 
            'Q_N12','Q_N34','Q_H', 'Q_D', 
            'QALY_tot', 'QALY_disc',
            'C_N12', 'C_N34','C_H', 'C_D', 
            'Cost_tot', 'Cost_disc'])
    # Define the start patient distribution across Markov states 
    patient_dist = np.array([0,0,cohort_size,0])
    # Perform computations per month
    for month in range(1,sim_months+1):

        result_df, patient_dist = \
        simulate_month(result_df,r,month,
                       patient_dist,transition_mat, Q_mat, 
                       C_mat,discount_rate_C, CPI, discount_rate_Q)
        
    Cost_tot_disc = result_df['Cost_disc'].sum()
    QALY_tot_disc = result_df['QALY_disc'].sum()

    t2 = time.time()
    print('Total simulation time '+r+':', round((t2-t1),2), 'seconds')    
    return result_df, Cost_tot_disc, QALY_tot_disc
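
The monthly loop can also be written so that rows are collected in a list and the dataframe is built once at the end, which avoids growing a dataframe row by row. This is a minimal sketch of that alternative, not the implementation used above; the function name `trace_cohort` and the transition matrix are hypothetical, and costs and QALYs are left out for brevity.

```python
import numpy as np
import pandas as pd


def trace_cohort(P, start_dist, n_months):
    """Return the Markov state distribution after each simulated month."""
    dist = np.asarray(start_dist, dtype="float64")
    rows = []
    for month in range(1, n_months + 1):
        dist = dist @ P                 # advance the cohort by one month
        rows.append([month, *dist])
    return pd.DataFrame(
        rows, columns=["Month", "NYHA_12", "NYHA_34", "Hospital", "Dead"])


# Hypothetical transition matrix; states ordered N12, N34, H, D
P = np.array([[0.96, 0.00, 0.02, 0.02],
              [0.00, 0.94, 0.02, 0.04],
              [0.85, 0.07, 0.00, 0.08],
              [0.00, 0.00, 0.00, 1.00]])

trace = trace_cohort(P, [0, 0, 1e5, 0], n_months=60)
print(trace.tail())
```

Because every row of the matrix sums to 1, the four state columns add up to the cohort size in every month, and the number of dead patients can only grow.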
                

## General functions implemented in final function
Two general helper functions were used to store and combine the results in a useful format.

In [7]:
def excel_multtabs(df_list, tabname_list, loc, fname):
    """
    Based on a list of pandas dataframes (df_list),
    defined tabnames (tabname_list with the same length as df_list),
    a location and filename (loc, fname) an Excel file
    with multiple tabs is created and saved.
     """
    # os.path.join keeps the path portable across operating systems
    f = os.path.join(loc, fname)
    # Create a Pandas Excel writer using XlsxWriter as the engine.
    writer = pd.ExcelWriter(f, engine='xlsxwriter')
    #loop over the list and write the tabs
    for df,tabname in zip(df_list,tabname_list):
        df.to_excel(writer, sheet_name=tabname)
        
    # Close the Pandas Excel writer and output the Excel file
    # (writer.save() was removed in pandas 2.0).
    writer.close()
    return

def merge_dicts(d1, d2):
    """
    Function to merge two dictionaries to one
     """
    for k,v in d1.items():
        d1[k] = {**d1[k],**d2[k]}
    return d1
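
Note that merge_dicts updates d1 in place and expects every key of d1 to be present in d2. As a sketch of an alternative that leaves both inputs untouched, the hypothetical function `merge_nested` below merges the inner dictionaries key by key; the example values are made up and only mimic the shape of the sensitivity results.

```python
def merge_nested(d1, d2):
    """Return a new dict in which the inner dicts of d1 and d2 are merged per key."""
    merged = {key: dict(inner) for key, inner in d1.items()}
    for key, inner in d2.items():
        merged.setdefault(key, {}).update(inner)
    return merged


# Hypothetical results for parameter 'c' at two percentage changes
first = {"c": {0.1: {"dcosts": -10.0}, -0.1: {"dcosts": 12.0}}}
second = {"c": {0.2: {"dcosts": -21.0}, -0.2: {"dcosts": 25.0}}}

combined = merge_nested(first, second)
print(sorted(combined["c"]))
```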

## Deterministic sensitivity analysis probabilities
To evaluate the effect of independent changes in the parameters, each parameter is changed by minus and plus a given percentage (for example -10% and +10%) while the others stay at baseline, and the results are collected in a dataframe. The probabilities a, e, f, i and k are excluded because of how they are defined: e and k are fixed at zero, and a, f and i are residuals that absorb changes in the other parameters so that each row of transition probabilities keeps summing to 1.

In [8]:
def one_way_sens(fullc,# monthly probabilities of the control arm
                 perc_change, # percentage change to use for sensitivity analysis
                 RR_readmission=0.64, # RR of hospital readmission for intervention arm
                 RR_mortality=0.78, #RR of mortality for intervention arm
                 cohort_size = 1e5,# cohort size
                 sloc=None): # if sloc is specified results are saved
    
    """
    Given a set of input parameters one-way sensitivity
    analysis with change of each parameter with
    a defined percentage (perc_change) was performed
    
    Output: Dictionary with {parameter:{change_percentage:
    {dCosts:cost values, dQALYs: QALY values}}}
    """
    
    # define baseline parameters
    a,b,c,d,e,f,g,h,i,j,k,l = list(fullc)
    dt = np.array([a,b,c,d,e,f,g,h,i,j,k,l])
    costs_control = list(define_Costs('Control', 1.029, 2020)[:-1])
    costs_intervention = list(define_Costs('Intervention', 1.029, 2020)[:-1])
    QALYs = list(define_QALYs()[:-1])

    # create a dict of baseline control arm parameters
    cols = ['changed_parameter',
            'a', 'b', 'c', 'd', 
            'e', 'f', 'g', 'h', 
            'i', 'j', 'k', 'l', 
            'RR_read', 'RR_mort', 
            'C_N12_c', 'C_N34_c','C_H_c', 
            'C_N12_i', 'C_N34_i','C_H_i', 
            'Q_N12','Q_N34','Q_H']
    data_row1 = ['original', *dt,RR_readmission,RR_mortality, 
                *costs_control, *costs_intervention, *QALYs]
    dct = {}
    for col,dr in zip(cols,data_row1):
        dct[col]=dr
        
    # Generate baseline model results
    c_mat, i_mat, Cm_c, Cm_i, Qm = model_input(dct)
    dfc, costc, qc = simulate_cohort(c_mat, Cm_c, Qm , 
                                     r='Control_baseline',
                                     cohort_size = cohort_size)
    dfi, costi, qi = simulate_cohort(i_mat, Cm_i, Qm , 
                                     r='Intervention_baseline',
                                     cohort_size = cohort_size)
    dcost_base = (costi-costc)/cohort_size
    dqaly_base = (qi-qc)/cohort_size
    
    df_list = [dfc,dfi]
    tabname_list = ['C_base', 'I_base']
    res = pd.DataFrame(columns = ['changed_parameter', 
                                  'C_costs', 'I_costs', 
                                  'C_QALY', 'I_QALY', 
                                  'dcosts', 'dQALYs', 
                                  'dcosts_base', 'dQALY_base'])
    
    res.loc[len(res)] = \
    ['baseline',costc/cohort_size, costi/cohort_size, 
    qc/cohort_size, qi/cohort_size, dcost_base, dqaly_base, 0,0]
    
    # define variables that are not used for 
    # deterministic sensitivity (by default left over parameters)
    resultants = ['a', 'f', 'i']
    zeros = ['e','k']
    dct_restabl = {}
    
    # loop over all parameters defined
    for k,v in dct.items():
        # exclude parameters that are resultants or defined as zero
        if (k not in resultants)&\
        (k not in zeros)&\
        (k!='changed_parameter'):
            tdict = {} # data is stored in this dictionary
            updict = {**dct} # copy original parameters
            downdict = {**dct} # copy original parameters
            #change values +/- a perc_change
            updict[k] = v*(1+perc_change)
            downdict[k] = v*(1-perc_change)
            
            #construct transition matrices for perc_change up and down
            upc_mat, upi_mat, upCost_c, upCost_i, upQ = model_input(updict)
            downc_mat, downi_mat, downCost_c, downCost_i, downQ = model_input(downdict)
            
            #compute results for up and down perc_change of parameter k
            addname = k+'__'+str(1+perc_change)
            dfc, costc, qc = simulate_cohort(upc_mat, \
            upCost_c, upQ, r='Control_'+addname, cohort_size = cohort_size)
            dfi, costi, qi = simulate_cohort(upi_mat, \
            upCost_i, upQ, r='Intervention_'+addname, cohort_size = cohort_size)
            #compute difference between intervention and 
            #cohort and difference of difference compared to baseline
            dcost = (costi-costc)/cohort_size # per patient difference in costs (intervention-control)
            dqaly = (qi-qc)/cohort_size # per patient difference in QALYs (intervention-control)
            dCb = abs(dcost)-abs(dcost_base) # difference in costs relative to baseline simulation
            dQb = abs(dqaly)-abs(dqaly_base)# difference in QALYs relative to baseline simulation
            df_list.extend([dfc,dfi])
            tabname_list.extend(['C_'+addname, 'I_'+addname])
            res.loc[len(res)] = [addname,costc/cohort_size, costi/cohort_size, 
                                 qc/cohort_size, qi/cohort_size, dcost, dqaly, dCb, dQb]
            tdict[perc_change] = {'dcosts': dcost, 'dQALYs':dqaly}
            
            addname = k+'__'+str(1-perc_change)
            dfc, costc, qc = simulate_cohort(downc_mat, downCost_c, downQ, r='Control_'+addname,cohort_size = cohort_size)
            dfi, costi, qi = simulate_cohort(downi_mat, downCost_i, downQ, r='Intervention_'+addname,cohort_size = cohort_size)
            dcost = (costi-costc)/cohort_size
            dqaly = (qi-qc)/cohort_size
            dCb = abs(dcost)-abs(dcost_base)
            dQb = abs(dqaly)-abs(dqaly_base)
            df_list.extend([dfc,dfi])
            tabname_list.extend(['C_'+addname, 'I_'+addname])
            res.loc[len(res)] = [addname,costc/cohort_size, costi/cohort_size, 
                                 qc/cohort_size, qi/cohort_size, dcost, dqaly,dCb, dQb]
            tdict[-perc_change] = {'dcosts': dcost, 'dQALYs':dqaly}
            
            dct_restabl[k] = tdict
            
            
    df_list.append(res)
    tabname_list.append('differences') 
    
    # store all dataframes with per simulation results if required
    if sloc is not None:
        fname = 'one-way-sens_'+str(perc_change)+'.xlsx'
        excel_multtabs(df_list, tabname_list, sloc, fname)
    
    return dct_restabl
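
Stripped of the Markov model, the one-way procedure amounts to copying the baseline parameters, scaling one value up and down, and recording the change in the outcome. The sketch below shows that pattern on a toy outcome function; `one_way`, `toy_cost` and the parameter values are hypothetical and only serve to illustrate the loop.

```python
def one_way(params, outcome, perc_change, skip=()):
    """Change each parameter by +/- perc_change and report the outcome difference."""
    base = outcome(params)
    results = {}
    for name, value in params.items():
        if name in skip:
            continue
        up = {**params, name: value * (1 + perc_change)}     # copy, one value changed
        down = {**params, name: value * (1 - perc_change)}
        results[name] = {perc_change: outcome(up) - base,
                         -perc_change: outcome(down) - base}
    return results


def toy_cost(p):
    """Toy outcome: expected monthly cost of one patient (illustration only)."""
    return p["p_event"] * p["cost_event"] + p["cost_fu"]


baseline = {"p_event": 0.02, "cost_event": 3800.0, "cost_fu": 36.0}
sens = one_way(baseline, toy_cost, perc_change=0.1)
print(sens)
```

Each entry of the result maps a parameter name to the outcome change for the upward and downward variation, the same nesting that one_way_sens returns.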



## Implementation of study data

In [9]:
# Control arm probabilities per month 

#B = N12 -> N34 (NYHA decay from NYHA 1/2 to 
#NYHA 3/4; net effect assumed zero)
bc = 0

#C = N12 -> H (Hospital readmission from NYHA 1/2)
c_tot = 948
c_event = 185
c_months = 12
cc = monthly_prob(c_months, c_event, c_tot)

#D = N12 -> D (Mortality from NYHA 1/2)
d_tot = 948
d_event = 217
d_months = 12
dc = monthly_prob(d_months, d_event, d_tot)

#A = N12 -> N12 (residual; No change from NYHA 1/2)
ac = 1 - bc - cc - dc


#E = N34 -> N12 (Recovery from NYHA 3/4 to 
#NYHA 1/2 was assumed to be 0)
ec = 0

#G = N34 -> H (Hospital readmission rate from NYHA 3/4)
g_tot = 78 # Value observed in the cohort was not used
gc = 0.0227 # Literature estimate that was used (monthly)

#H = N34 -> D (Mortality rate from NYHA 3/4)
h_tot = 78 
h_event = 31
h_months = 12
hc = monthly_prob(h_months, h_event, h_tot)

#F = N34 -> N34 (residual; No change from NYHA 3/4)
fc = 1 - gc - hc
#fi = 1 - gi - hi

#I = H -> N12 (Discharge to NYHA 1/2)
i_tot = 1114 
i_event = 948
ic = i_event/i_tot

#J = H -> N34 (Discharge to NYHA 3/4)
j_tot = 1114 
j_event = 78
jc = j_event/j_tot

#K = H -> H (Hospital admissions were 
# no longer than 1 month; defined as 0)
kc = 0

#L = H -> D (In hospital mortality)
l_tot = 1114 
l_event = 88
lc = l_event/l_tot

#M,N,O define as zero, P defined as 1 (dead = dead)

fullc = np.array([ac, bc, cc, dc,
                 ec, fc, gc, hc,
                 ic, jc, kc, lc])

## Implementation of multiple one-way deterministic sensitivity analyses
Below, multiple one-way deterministic sensitivity analyses are run for different percentage changes (pc).

In [None]:
sloc = r'C:\Users\henkvanvoorst\Documents\publicaties\HF simulatie\rebuttle\final_results\rebuttle_codecorrect'
dct_restabl = None
for pc in [.1,.2,.3,.4,.5]:
    dct_restabl2 = one_way_sens(fullc,pc,RR_readmission=0.64, RR_mortality=0.78, sloc=sloc)
    if dct_restabl is not None:
        dct_restabl = merge_dicts(dct_restabl, dct_restabl2)
    else:
        dct_restabl = dct_restabl2


Total simulation time Control_baseline: 0.42 seconds
Total simulation time Intervention_baseline: 0.43 seconds
Total simulation time Control_b__1.1: 0.43 seconds
Total simulation time Intervention_b__1.1: 0.45 seconds
Total simulation time Control_b__0.9: 0.49 seconds
Total simulation time Intervention_b__0.9: 0.42 seconds
Total simulation time Control_c__1.1: 0.42 seconds
Total simulation time Intervention_c__1.1: 0.42 seconds
Total simulation time Control_c__0.9: 0.43 seconds
Total simulation time Intervention_c__0.9: 0.42 seconds
Total simulation time Control_d__1.1: 0.41 seconds
Total simulation time Intervention_d__1.1: 0.43 seconds
Total simulation time Control_d__0.9: 0.47 seconds
Total simulation time Intervention_d__0.9: 0.42 seconds
Total simulation time Control_g__1.1: 0.48 seconds
Total simulation time Intervention_g__1.1: 0.43 seconds
Total simulation time Control_g__0.9: 0.51 seconds
Total simulation time Intervention_g__0.9: 0.44 seconds
Total simulation time Control_h__

Total simulation time Control_Q_H__0.8: 0.42 seconds
Total simulation time Intervention_Q_H__0.8: 0.42 seconds
Total simulation time Control_baseline: 0.42 seconds
Total simulation time Intervention_baseline: 0.5 seconds
Total simulation time Control_b__1.3: 0.43 seconds
Total simulation time Intervention_b__1.3: 0.43 seconds
Total simulation time Control_b__0.7: 0.45 seconds
Total simulation time Intervention_b__0.7: 0.43 seconds
Total simulation time Control_c__1.3: 0.42 seconds
Total simulation time Intervention_c__1.3: 0.41 seconds
Total simulation time Control_c__0.7: 0.42 seconds
Total simulation time Intervention_c__0.7: 0.42 seconds
Total simulation time Control_d__1.3: 0.42 seconds
Total simulation time Intervention_d__1.3: 0.42 seconds
Total simulation time Control_d__0.7: 0.43 seconds
Total simulation time Intervention_d__0.7: 0.42 seconds
Total simulation time Control_g__1.3: 0.47 seconds
Total simulation time Intervention_g__1.3: 0.44 seconds
Total simulation time Control_

Total simulation time Control_Q_H__1.4: 0.42 seconds
Total simulation time Intervention_Q_H__1.4: 0.42 seconds
Total simulation time Control_Q_H__0.6: 0.43 seconds
Total simulation time Intervention_Q_H__0.6: 0.43 seconds
Total simulation time Control_baseline: 0.53 seconds
Total simulation time Intervention_baseline: 0.55 seconds
Total simulation time Control_b__1.5: 0.43 seconds
Total simulation time Intervention_b__1.5: 0.41 seconds
Total simulation time Control_b__0.5: 0.42 seconds
Total simulation time Intervention_b__0.5: 0.4 seconds
Total simulation time Control_c__1.5: 0.43 seconds
Total simulation time Intervention_c__1.5: 0.57 seconds
Total simulation time Control_c__0.5: 0.54 seconds
Total simulation time Intervention_c__0.5: 0.5 seconds
Total simulation time Control_d__1.5: 0.44 seconds
Total simulation time Intervention_d__1.5: 0.46 seconds
Total simulation time Control_d__0.5: 0.41 seconds
Total simulation time Intervention_d__0.5: 0.47 seconds
Total simulation time Contr
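
The merged dictionary has the form {parameter: {percentage change: {'dcosts': ..., 'dQALYs': ...}}}. For tabulating or plotting, it can be flattened into one row per parameter and change. The sketch below assumes that structure; the function name `flatten_results` and the numbers are hypothetical and are not results of the study.

```python
import pandas as pd


def flatten_results(nested):
    """Turn {param: {change: {'dcosts': .., 'dQALYs': ..}}} into a long dataframe."""
    rows = []
    for param, by_change in nested.items():
        for change, outcome in by_change.items():
            rows.append({"parameter": param, "change": change, **outcome})
    table = pd.DataFrame(rows)
    return table.sort_values(["parameter", "change"]).reset_index(drop=True)


# Hypothetical values for illustration only
example = {
    "c": {0.1: {"dcosts": -10.0, "dQALYs": 0.010},
          -0.1: {"dcosts": 12.0, "dQALYs": 0.008}},
    "d": {0.1: {"dcosts": 5.0, "dQALYs": 0.012},
          -0.1: {"dcosts": -4.0, "dQALYs": 0.007}},
}

table = flatten_results(example)
print(table)
```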