The goal of this notebook is to use the Microsim codebase to run simulations that validate the simulated Kaiser population
and its outcomes against references. We first use outcomes that rely on WMH-specific information, and then outcomes that
adjust the risks only by a constant, without any WMH-specific information for risk adjustment
(set the wmhSpecific variable to False).
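One simple way to read a simulated rate against its reference is to check whether it falls inside the interval printed next to the reference value. The sketch below is not part of Microsim; `ReferenceRate` and `within_reference` are hypothetical helpers. The three mortality rows are copied from the printed output of the WMH-specific run further down. It treats the parenthesized range as an inclusive interval, which is an assumption about how that range should be used.

```python
from typing import NamedTuple


class ReferenceRate(NamedTuple):
    """A published reference rate with the interval printed next to it."""
    estimate: float
    lower: float
    upper: float


def within_reference(simulated: float, ref: ReferenceRate) -> bool:
    """Return True when the simulated rate lies inside the reference interval (inclusive)."""
    return ref.lower <= simulated <= ref.upper


# Mortality rows copied from the WMH-specific run's printed table:
# group -> (reference, simulated rate).
mortality = {
    "CT SBI": (ReferenceRate(61.5, 59.1, 63.9), 40.8),
    "CT WMD": (ReferenceRate(63.8, 62.6, 65.1), 52.2),
    "CT BOTH": (ReferenceRate(84.9, 80.9, 89.2), 94.1),
}

for group, (ref, sim) in mortality.items():
    flag = "inside" if within_reference(sim, ref) else "outside"
    print(f"{group:>8}: simulated {sim:5.1f} vs reference {ref.estimate:5.1f} "
          f"({ref.lower} - {ref.upper}) -> {flag}")
```

A pass/fail flag like this is only a first screen; the size and direction of the gap usually matter more than whether a boundary is crossed.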

In [1]:
import os
microsimDir = "/Users/deligkaris.1/OneDrive - The Ohio State University Wexner Medical Center/MICROSIM/CODE/microsim"
os.chdir(microsimDir)

from microsim.validation import Validation

import pandas as pd
pd.set_option('future.no_silent_downcasting', True) #make the calculation here future-proof

dataDir = "/Users/deligkaris.1/OneDrive - The Ohio State University Wexner Medical Center/MICROSIM/NOTEBOOKS/DATA"

Run the simulation and save the dataframes with the proportional hazard model information.

In [2]:
%%time
wmhSpecific = True
dfs = Validation.kaiser(wmhSpecific=wmhSpecific, nWorkers=5)


VALIDATION OF BASELINE SIMULATED POPULATION

                          Printing a summary of risk factors and default treatments...
                          min      0.25    med     0.75     max    mean     sd
                          -----------------------------------------------------
                    age    45.0    59.0    65.0    72.0   112.0    65.8     9.5
                    hdl     5.4    42.1    51.8    62.3   127.3    52.6    15.0
                    bmi    13.5    24.9    28.5    32.4    59.5    28.9     5.8
                totChol    53.1   159.0   187.5   215.4   475.8   187.1    41.3
                   trig     9.0    96.8   144.7   200.3  2501.4   155.5    83.6
                    a1c     3.3     5.5     6.2     6.9    14.6     6.3     1.2
                    ldl     8.1    82.8   107.1   130.8   364.1   106.8    35.0
                  waist    34.6    91.3   101.1   110.9   164.1   101.1    14.6
             creatinine     0.1     0.7     1.0     1.3    15.2     



             Printing outcome incidence rates at the end of year 4...
             References: a Microsim simulation with all WMH-related models.

             Outcome     Reference     Simulation
             ----------------------------------------
              stroke          12.0           12.0
            dementia          11.0           10.8
               death          27.0           26.9
                  mi          12.0           11.5


             Printing outcome incidence rates by SCD group and modality at the end of year 11...
             References: Stroke-Kent2021, Wang2024, Mortality-Clancy2025, Dementia-Kent2022, MI-no available publication.

             Mortality rates
             ----------------------------------------
     Group                  Reference     Simulation
    CT SBI       61.5 ( 59.1 - 63.9 )           40.8
    CT WMD       63.8 ( 62.6 - 65.1 )           52.2
   CT BOTH       84.9 ( 80.9 - 89.2 )           94.1
   CT NONE       18.2 ( 17.8 - 1

Export the dataframes so that we can use R to do the proportional hazards modeling (feel free to use Python of course if that is your preference).

In [3]:
dfs["stroke"].to_csv(dataDir+"/kaiserStrokeValidation13YrTimes.csv", index=False)
dfs["mi"].to_csv(dataDir+"/kaiserMiValidation13YrTimes.csv", index=False)
dfs["dementia"].to_csv(dataDir+"/kaiserDementiaValidation13YrTimes.csv", index=False)
dfs["death"].to_csv(dataDir+"/kaiserDeathValidation13YrTimes.csv", index=False)

Now we run the simulation again, but this time the outcomes reflect the increased risk of the Kaiser population
only on average, without using any WMH-specific information.
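To make the distinction concrete, here is a toy sketch of the two adjustment styles. It is not Microsim's implementation: the groups, multipliers, and baseline risk are made-up numbers chosen only for illustration. The point it shows is that a single constant can reproduce the population-average risk while flattening the differences between groups.

```python
from statistics import mean

# Hypothetical population: (group label, baseline annual risk).
# All numbers here are illustrative, not Microsim parameters.
people = [("NONE", 0.010)] * 6 + [("WMD", 0.010)] * 3 + [("SBI", 0.010)] * 1

# Group-specific adjustment: each group gets its own risk multiplier.
group_multiplier = {"NONE": 1.0, "WMD": 2.0, "SBI": 3.0}
specific = [(g, r * group_multiplier[g]) for g, r in people]

# Constant adjustment: one multiplier for everyone, chosen so that the
# population-average risk matches the group-specific case.
constant = mean(r for _, r in specific) / mean(r for _, r in people)
uniform = [(g, r * constant) for g, r in people]


def mean_risk(rows, group=None):
    """Average risk over all rows, or over a single group when one is given."""
    return mean(r for g, r in rows if group is None or g == group)


print("overall:", mean_risk(specific), mean_risk(uniform))
for g in group_multiplier:
    print(g, mean_risk(specific, g), mean_risk(uniform, g))
```

In this toy setup the overall averages agree, but the per-group averages do not, which is why the by-group comparisons are the informative ones when the constant adjustment is used.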

In [4]:
%%time
wmhSpecific = False
dfs = Validation.kaiser(wmhSpecific=wmhSpecific, nWorkers=5)


VALIDATION OF BASELINE SIMULATED POPULATION

                          Printing a summary of risk factors and default treatments...
                          min      0.25    med     0.75     max    mean     sd
                          -----------------------------------------------------
                    age    45.0    59.0    65.0    72.0   107.0    65.8     9.5
                    hdl     5.4    42.1    51.9    62.3   130.3    52.6    15.0
                    bmi    13.5    24.9    28.5    32.4    60.7    28.9     5.8
                totChol    53.1   158.9   187.5   215.5   510.3   187.1    41.4
                   trig     9.0    96.8   144.7   200.6  3139.6   155.6    83.8
                    a1c     3.3     5.5     6.2     6.9    15.3     6.3     1.2
                    ldl     8.1    82.8   107.0   130.9   468.0   106.8    35.1
                  waist    42.9    91.3   101.1   110.8   164.7   101.1    14.5
             creatinine     0.1     0.7     1.0     1.3    13.0     



             Printing outcome incidence rates at the end of year 4...
             References: a Microsim simulation with all WMH-related models.

             Outcome     Reference     Simulation
             ----------------------------------------
              stroke          12.0           11.8
            dementia          11.0           10.7
               death          27.0           26.6
                  mi          12.0           11.7


             Printing outcome incidence rates by SCD group and modality at the end of year 11...
             References: Stroke-Kent2021, Wang2024, Mortality-Clancy2025, Dementia-Kent2022, MI-no available publication.

             Mortality rates
             ----------------------------------------
     Group                  Reference     Simulation
    CT SBI       61.5 ( 59.1 - 63.9 )           35.1
    CT WMD       63.8 ( 62.6 - 65.1 )           49.0
   CT BOTH       84.9 ( 80.9 - 89.2 )           89.2
   CT NONE       18.2 ( 17.8 - 1

In [5]:
# Note: these file names are the same as in cell In [3], so this export
# overwrites the CSV files written by the WMH-specific run.
dfs["stroke"].to_csv(dataDir+"/kaiserStrokeValidation13YrTimes.csv", index=False)
dfs["mi"].to_csv(dataDir+"/kaiserMiValidation13YrTimes.csv", index=False)
dfs["dementia"].to_csv(dataDir+"/kaiserDementiaValidation13YrTimes.csv", index=False)
dfs["death"].to_csv(dataDir+"/kaiserDeathValidation13YrTimes.csv", index=False)
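Both export cells write to the same file names, so the second run replaces the files from the first. If both sets are needed side by side, one option is to tag each file name with the adjustment mode. The sketch below assumes `dfs` maps the four outcome names to objects with a pandas-style `to_csv` method, as the cells above suggest. The helper names and the file-name suffixes are hypothetical, not part of Microsim.

```python
import os

# Outcome keys used in the export cells above.
OUTCOMES = ("stroke", "mi", "dementia", "death")


def validation_csv_path(data_dir: str, outcome: str, wmh_specific: bool) -> str:
    """Build a CSV path that records which risk-adjustment mode produced the file."""
    # The suffixes are an illustrative naming choice, not a Microsim convention.
    suffix = "WmhSpecific" if wmh_specific else "ConstantAdjustment"
    name = f"kaiser{outcome.capitalize()}Validation13YrTimes{suffix}.csv"
    return os.path.join(data_dir, name)


def export_validation(dfs, data_dir: str, wmh_specific: bool, outcomes=OUTCOMES) -> list:
    """Write each outcome dataframe to its own tagged CSV and return the paths."""
    paths = []
    for outcome in outcomes:
        path = validation_csv_path(data_dir, outcome, wmh_specific)
        dfs[outcome].to_csv(path, index=False)
        paths.append(path)
    return paths
```

With this in place, each export cell would reduce to `export_validation(dfs, dataDir, wmhSpecific)`, and the downstream R scripts would need to read the tagged file names.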