# Analysis

This notebook reproduces the analysis from:

> Monks T, Worthington D, Allen M, Pitt M, Stein K, James MA. A modelling tool for capacity planning in acute and community stroke services. BMC Health Serv Res. 2016 Sep 29;16(1):530. doi: [10.1186/s12913-016-1789-4](https://doi.org/10.1186/s12913-016-1789-4). PMID: 27688152; PMCID: PMC5043535.

It is organised into:

* Base case
    * Run the model
    * Figure 1
    * Figure 3
* Scenario analysis
    * Scenario 1
    * Table 2
    * Scenario 4
    * Supplementary table 1

In [1]:
# pylint: disable=missing-module-docstring
%load_ext autoreload
%autoreload 1
%aimport simulation

# pylint: disable=wrong-import-position
import os

from IPython.display import display
import pandas as pd
import plotly.express as px

from simulation.parameters import Param, ASUArrivals, RehabArrivals
from simulation.model import Model
from simulation.runner import Runner

In [2]:
# Path to the outputs folder
OUTPUT_DIR = "../outputs/"

## Base case

### Run the model

In [3]:
# Set up runner to run in parallel with nine cores
runner = Runner(param=Param(cores=9))

# Run the model for 150 replications
base_reps, base_overall = runner.run_reps()

### Figure 1

**Figure 1.** Simulation probability density function for occupancy of an acute stroke unit.

In [4]:
def plot_occupancy_freq(df, unit, file, path=OUTPUT_DIR):
    """
    Plot the frequency at which each occupancy level was observed in the audit.

    Parameters
    ----------
    df: pd.DataFrame
        Dataframe output by `get_occupancy_freq()` containing the frequency
        each occupancy was observed at.
    unit: str
        Name of unit ("asu", "rehab")
    file: str
        Filename to save figure to (e.g. "figure.png").
    path: str
        Path to save file to (excluding filename).
    """
    # Create plot
    fig = px.bar(df, x="beds", y="pct", color_discrete_sequence=["black"])

    # Specify axis labels, theme and dimensions
    if unit == "asu":
        unit_lab = "acute"
    elif unit == "rehab":
        unit_lab = "rehabilitation"
    else:
        raise ValueError("unit must be either 'asu' or 'rehab'")

    fig.update_layout(
        xaxis_title=f"No. patients in {unit_lab} unit",
        yaxis_title="% observations",
        template="plotly_white",
        height=450,
        width=800
    )

    # Add box around figure, and set tick spacing to 1
    fig.update_xaxes(linecolor="black", mirror=True, dtick=1)
    fig.update_yaxes(linecolor="black", mirror=True, tickformat=",.0%")

    # Show figure
    fig.show()

    # Save figure
    fig.write_image(os.path.join(path, file))

**Generate plots.**

The article only includes the plot for the acute stroke unit; the rehabilitation unit plot is generated here as well.

In [5]:
# Acute stroke unit
plot_occupancy_freq(base_overall["asu"], unit="asu",
                    file="occupancy_freq_asu.png")

# Rehabilitation unit
plot_occupancy_freq(base_overall["rehab"], unit="rehab",
                    file="occupancy_freq_rehab.png")

### Figure 3

**Figure 3**. Simulated trade-off between the probability that a patient is delayed and the no. of acute beds available.

We can use our frequency and cumulative frequency of occupied beds from the simulation to calculate blocking probability.

The model outputs these values in a table for each unit. The first rows for the acute stroke unit are:

In [6]:
base_overall["asu"].head()

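| | `beds` | `freq` | `pct` | `c_pct` | `prob_delay` |
| - | - | - | - | - | - |
| 0 | 0 | 67 | 0.000245 | 0.000245 | 1.0 |
| 1 | 1 | 570 | 0.002082 | 0.002327 | 0.894819 |
| 2 | 2 | 2142 | 0.007825 | 0.010152 | 0.770781 |
| 3 | 3 | 6086 | 0.022232 | 0.032384 | 0.68652 |
| 4 | 4 | 12600 | 0.046027 | 0.078411 | 0.587002 |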


We can interpret...

* `pct` as the probability of having exactly x beds occupied.
* `c_pct` as the probability of having x or fewer beds occupied.

We can then calculate `pct/c_pct` (stored in the `prob_delay` column), which is the probability of delay when the system has exactly x beds occupied.

For example:

| Beds | `pct` | `c_pct` | Probability of delay |
| - | - | - | - |
| 7 | 0.2 | 0.2 | 1.0 |
| 8 | 0.3 | 0.5 | 0.6 |
| 9 | 0.4 | 0.9 | 0.44 |
| 10 | 0.1 | 1.0 | 0.1 |

Interpretation for 8 beds:

* If we **randomly select a day when the occupancy is 8 or fewer beds**, there's a **60%** chance that the occupancy will be **exactly 8 beds (rather than 7 beds)**.

This can then be connected to the probability of delay by thinking about system capacity:

* If we assume that the unit has a total of 8 beds, then when 8 beds are occupied, the unit is at **full capacity**.
* Any new patients arriving when 8 beds are occupied would experience a delay.
* So 0.6 represents the probability that, given we're at or below capacity (7 or 8 beds), we're actually at full capacity (8 beds).

In other words, `pct/c_pct` is the probability that a new arrival will experience a delay when the system has exactly x beds occupied, given that the capacity of the system is x beds.
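
The worked example above can be reproduced with a short sketch. The counts below are hypothetical (chosen to give the same `pct` values as the example table) and are not simulation output; only the standard library is used.

```python
from itertools import accumulate

# Hypothetical counts: occupancy level -> number of times it was observed.
# Chosen so that pct matches the worked example (0.2, 0.3, 0.4, 0.1).
freq = {7: 20, 8: 30, 9: 40, 10: 10}

beds = sorted(freq)
counts = [freq[b] for b in beds]
total = sum(counts)
cum_counts = list(accumulate(counts))  # running total of observations

rows = []
for b, n, cum_n in zip(beds, counts, cum_counts):
    rows.append({
        "beds": b,
        "pct": n / total,         # P(exactly b beds occupied)
        "c_pct": cum_n / total,   # P(b or fewer beds occupied)
        "prob_delay": n / cum_n,  # pct / c_pct
    })

for row in rows:
    print(row["beds"], round(row["pct"], 2), round(row["c_pct"], 2),
          round(row["prob_delay"], 2))
```

Dividing the raw count by the cumulative count is equivalent to `pct/c_pct`, because the total number of observations cancels out.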

In [7]:
def plot_delay_prob(df, unit, file, path=OUTPUT_DIR):
    """
    Plot the simulated trade-off between the probability of delay and the
    number of beds available.

    Parameters
    ----------
    df: pd.DataFrame
        Dataframe output by `get_occupancy_freq()` containing the frequency
        each occupancy was observed at.
    unit: str
        Name of unit ("asu", "rehab")
    file: str
        Filename to save figure to (e.g. "figure.png").
    path: str
        Path to save file to (excluding filename).
    """
    # Create the step plot
    fig = px.line(df, x="beds", y="prob_delay",
                  color_discrete_sequence=["black"])
    fig.update_traces(mode="lines", line_shape="hv")

    # Add axis labels, set theme and dimensions
    if unit == "asu":
        unit_lab = "acute"
    elif unit == "rehab":
        unit_lab = "rehabilitation"
    else:
        raise ValueError("unit must be either 'asu' or 'rehab'")

    fig.update_layout(
        xaxis_title=f"No. of {unit_lab} beds available",
        yaxis_title="Probability of delay",
        template="simple_white",
        height=450,
        width=800
    )

    # Set tick frequency and adjust axis
    fig.update_xaxes(dtick=1)
    fig.update_yaxes(dtick=0.1, range=[0, 1])

    # Show figure
    fig.show()

    # Save figure
    fig.write_image(os.path.join(path, file))

In [8]:
plot_delay_prob(base_overall["asu"], unit="asu", file="delay_prob_asu.png")
plot_delay_prob(base_overall["rehab"], unit="rehab",
                file="delay_prob_rehab.png")

## Scenario analysis

### Scenario 1

**5% more admissions.** A 5% increase in admissions across all patient subgroups.

In [9]:
def alter_by_5_percent(params_dict):
    """
    Helper function to reduce every value in a parameter dictionary by 5%.

    Parameters
    ----------
    params_dict: dict
        Dictionary of parameters.
    """
    return {k: v * 0.95 for k, v in params_dict.items() if k != '_initialised'}


# Reduce inter-arrival times by 5%, so that admissions are more frequent
s1_param = Param(
    asu_arrivals=ASUArrivals(**alter_by_5_percent(vars(ASUArrivals()))),
    rehab_arrivals=RehabArrivals(**alter_by_5_percent(vars(RehabArrivals()))),
    cores=9
)

print(vars(s1_param.asu_arrivals))
print(vars(s1_param.rehab_arrivals))

{'stroke': 1.14, 'tia': 8.835, 'neuro': 3.42, 'other': 3.04, '_initialised': True}
{'stroke': 20.71, 'neuro': 30.115, 'other': 27.17, '_initialised': True}
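
The printed values are consistent with each mean inter-arrival time (IAT) being multiplied by 0.95 (for example, 1.14 = 1.2 × 0.95, assuming a base value of 1.2). Shortening the mean IAT by 5% raises the arrival rate by 1/0.95 − 1 ≈ 5.26%, slightly more than 5%, whereas dividing the IAT by 1.05 would give exactly 5% more arrivals. The sketch below illustrates the arithmetic. The base value and the `rate_increase` helper are illustrative assumptions, not part of the `simulation` package.

```python
# Assumed base mean inter-arrival time for ASU stroke arrivals, inferred
# from the printed output (1.14 / 0.95 = 1.2). Illustrative only.
base_iat = 1.2

scaled_iat = base_iat * 0.95  # approach used in this notebook
exact_iat = base_iat / 1.05   # alternative giving exactly 5% more arrivals


def rate_increase(new_iat, old_iat):
    """Relative change in arrivals per unit time when the mean IAT changes."""
    return old_iat / new_iat - 1


pct_scaled = round(rate_increase(scaled_iat, base_iat) * 100, 2)
pct_exact = round(rate_increase(exact_iat, base_iat) * 100, 2)

print(f"IAT x 0.95 -> {pct_scaled}% more arrivals")  # 5.26
print(f"IAT / 1.05 -> {pct_exact}% more arrivals")   # 5.0
```

The difference is small, but it is worth bearing in mind when comparing these results against the article's "5% more admissions" scenario.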


In [10]:
# Run the model for 150 replications
runner = Runner(param=s1_param)
s1_reps, s1_overall = runner.run_reps()

### Table 2

**Table 2** Likelihood of delay. Current admissions versus 5% more admissions.

This table presents results from the base case and scenario 1 for acute beds 9-14 and rehab beds 10-16.
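
Alongside each probability of delay, the table reports the equivalent "1 in every n patients delayed", which is the reciprocal of the probability rounded to the nearest whole number. A minimal sketch, using hypothetical probabilities rather than simulation output, shows why it matters that the reciprocal is taken before the probability is rounded to 2 d.p.:

```python
# Hypothetical unrounded probabilities of delay (not simulation output)
probs = [0.1913, 0.0222, 0.0127]

results = []
for p in probs:
    one_in_n = round(1 / p)                # from the unrounded probability
    from_rounded = round(1 / round(p, 2))  # from the 2 d.p. probability
    results.append((round(p, 2), one_in_n, from_rounded))

for shown_prob, one_in_n, from_rounded in results:
    print(f"prob={shown_prob}: 1 in {one_in_n} "
          f"(would be 1 in {from_rounded} if computed after rounding)")
```

For small probabilities the two approaches diverge noticeably, so a table can legitimately show a probability of 0.02 next to "1 in 45" rather than "1 in 50".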

In [11]:
def make_delay_table(
        scenario, scenario_name, base=base_overall, base_name="current",
        asu_beds=list(range(9, 15)), rehab_beds=list(range(10, 17))
    ):
    """
    Create table with the probability of delay and 1 in n patients delayed,
    for the base case and a provided scenario.

    Parameters
    ----------
    scenario: dict
        Dictionary containing two dataframes: "asu" and "rehab". These contain
        the overall results from a scenario run of the simulation.
    scenario_name: str
        Name for scenario to use in table labels.
    base: dict
        Dictionary containing two dataframes: "asu" and "rehab". These contain
        the overall results from the base case run of the simulation.
    base_name: str
        Name for base case to use in table labels.
    asu_beds: list
        List of acute stroke unit (ASU) bed numbers to get results for.
    rehab_beds: list
        List of rehabilitation unit bed numbers to get results for.
    """
    # Create list to store the ASU and rehab dataframes
    tab_full = []

    # Loop over ASU and rehab units...
    for unit_name, unit_beds in {"asu": asu_beds, "rehab": rehab_beds}.items():

        # Create list to store base case and scenario dataframes
        tab_segment = []

        # Loop over base case and scenario (loop variable is called `name`
        # so that it does not overwrite the `scenario_name` argument)
        for name, df in {base_name: base[unit_name],
                         scenario_name: scenario[unit_name]}.items():

            # Extract results for specified beds (copy, so that new columns
            # are not assigned to a view of the original dataframe)
            df = df.loc[df["beds"].isin(unit_beds),
                        ["beds", "prob_delay"]].copy()

            # Add column with calculation of 1 in every n patients delayed
            df[f"1_in_n_delay_{name}"] = (
                round(1 / df["prob_delay"])).astype(int)

            # Round probability of delay to 2 d.p.
            df["prob_delay"] = round(df["prob_delay"], 2)

            # Rename column to be specific to scenario
            df = df.rename(columns={
                "prob_delay": f"prob_delay_{name}"})

            # Save dataframe to list
            tab_segment.append(df)

        # Combine into single dataframe
        full_df = pd.merge(tab_segment[0], tab_segment[1], on="beds")

        # Add column with unit name
        full_df.insert(0, "unit", unit_name)

        # Save dataframe to list
        tab_full.append(full_df)

    # Combine into a single table
    return pd.concat(tab_full).reset_index(drop=True)

In [12]:
full_tab2 = make_delay_table(scenario=s1_overall, scenario_name="5%")
full_tab2

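| | `unit` | `beds` | `prob_delay_current` | `1_in_n_delay_current` | `prob_delay_5%` | `1_in_n_delay_5%` |
| - | - | - | - | - | - | - |
| 0 | asu | 9 | 0.19 | 5 | 0.21 | 5 |
| 1 | asu | 10 | 0.14 | 7 | 0.16 | 6 |
| 2 | asu | 11 | 0.09 | 11 | 0.11 | 9 |
| 3 | asu | 12 | 0.06 | 16 | 0.08 | 13 |
| 4 | asu | 13 | 0.04 | 26 | 0.05 | 20 |
| 5 | asu | 14 | 0.02 | 45 | 0.03 | 33 |
| 6 | rehab | 10 | 0.2 | 5 | 0.23 | 4 |
| 7 | rehab | 11 | 0.15 | 7 | 0.18 | 6 |
| 8 | rehab | 12 | 0.11 | 9 | 0.13 | 8 |
| 9 | rehab | 13 | 0.08 | 13 | 0.09 | 11 |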


The table is then adjusted to match how it is presented in the article, by hiding or dropping some results.

The 1 in n delay columns are **rounded to the nearest whole number**, but an integer column cannot hold NaN or Inf, so any column with hidden results is stored as floats.

In [13]:
adj_full_tab_2 = full_tab2.copy()

# Drop the result for ASU beds 9 and rehab beds 10 for the scenario
adj_full_tab_2.loc[(adj_full_tab_2["unit"] == "asu") &
                   (adj_full_tab_2["beds"] == 9),
                   ["prob_delay_5%", "1_in_n_delay_5%"]] = None
adj_full_tab_2.loc[(adj_full_tab_2["unit"] == "rehab") &
                   (adj_full_tab_2["beds"] == 10),
                   ["prob_delay_5%", "1_in_n_delay_5%"]] = None

# Drop the result for rehab 11 beds
adj_full_tab_2 = adj_full_tab_2[
    ~((adj_full_tab_2["unit"] == "rehab") & (adj_full_tab_2["beds"] == 11))]

# Display and save to csv
display(adj_full_tab_2)
adj_full_tab_2.to_csv(
    os.path.join(OUTPUT_DIR, "delay_scenario1.csv"), index=False)

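| | `unit` | `beds` | `prob_delay_current` | `1_in_n_delay_current` | `prob_delay_5%` | `1_in_n_delay_5%` |
| - | - | - | - | - | - | - |
| 0 | asu | 9 | 0.19 | 5 | | |
| 1 | asu | 10 | 0.14 | 7 | 0.16 | 6.0 |
| 2 | asu | 11 | 0.09 | 11 | 0.11 | 9.0 |
| 3 | asu | 12 | 0.06 | 16 | 0.08 | 13.0 |
| 4 | asu | 13 | 0.04 | 26 | 0.05 | 20.0 |
| 5 | asu | 14 | 0.02 | 45 | 0.03 | 33.0 |
| 6 | rehab | 10 | 0.2 | 5 | | |
| 8 | rehab | 12 | 0.11 | 9 | 0.13 | 8.0 |
| 9 | rehab | 13 | 0.08 | 13 | 0.09 | 11.0 |
| 10 | rehab | 14 | 0.05 | 20 | 0.06 | 15.0 |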


### Scenario 4

**No complex-neurological cases.** Complex neurological patients are excluded from the pathway in order to assess their impact on bed requirements.

In [14]:
# Set IAT very high, essentially meaning that we have no neuro arrivals
s4_param = Param(
    asu_arrivals=ASUArrivals(neuro=10_000_000_000),
    rehab_arrivals=RehabArrivals(neuro=10_000_000_000),
    cores=9
)
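
To see why such a large mean inter-arrival time effectively removes these patients, the sketch below estimates the chance of even one arrival during a run. It assumes exponentially distributed inter-arrival times and a hypothetical run length of 1825 time units; neither is confirmed by this notebook, so treat the numbers as illustrative.

```python
import math

mean_iat = 10_000_000_000  # mean IAT set for neuro arrivals in this scenario
run_length = 1825          # hypothetical run length (same time units)

# Assuming exponential inter-arrival times, the probability of at least one
# arrival by time t is 1 - exp(-t / mean). expm1 keeps precision for the
# very small exponent used here.
p_any_arrival = -math.expm1(-run_length / mean_iat)

print(f"P(at least one neuro arrival in run) = {p_any_arrival:.2e}")
```

Under these assumptions the probability is below one in a million per run, so the scenario behaves as if neurological arrivals were switched off.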

In [15]:
# Run the model for 150 replications
runner = Runner(param=s4_param)
s4_reps, s4_overall = runner.run_reps()

### Supplementary table 1

**Supplementary Table 1.** Likelihood of delay. Current admissions versus No Complex neurological patients.

In [16]:
# Make table
sup_tab1 = make_delay_table(scenario=s4_overall,
                            scenario_name="no_complex_neuro",
                            asu_beds=list(range(10, 16)),
                            rehab_beds=list(range(12, 17)))

# Display and save to csv
display(sup_tab1)
sup_tab1.to_csv(os.path.join(OUTPUT_DIR, "delay_scenario4.csv"), index=False)

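| | `unit` | `beds` | `prob_delay_current` | `1_in_n_delay_current` | `prob_delay_no_complex_neuro` | `1_in_n_delay_no_complex_neuro` |
| - | - | - | - | - | - | - |
| 0 | asu | 10 | 0.14 | 7 | 0.09 | 11 |
| 1 | asu | 11 | 0.09 | 11 | 0.06 | 18 |
| 2 | asu | 12 | 0.06 | 16 | 0.03 | 31 |
| 3 | asu | 13 | 0.04 | 26 | 0.02 | 55 |
| 4 | asu | 14 | 0.02 | 45 | 0.01 | 111 |
| 5 | asu | 15 | 0.01 | 79 | 0.0 | 225 |
| 6 | rehab | 12 | 0.11 | 9 | 0.05 | 20 |
| 7 | rehab | 13 | 0.08 | 13 | 0.03 | 32 |
| 8 | rehab | 14 | 0.05 | 20 | 0.02 | 57 |
| 9 | rehab | 15 | 0.03 | 31 | 0.01 | 101 |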
