# Surgical Capacity and Waiting Times for Planned Surgeries in Denmark  
*A Regional and Temporal Analysis*

# Problem Statement
**How does the capacity of _surgical_ hospital beds (available/normed) and surgical activity affect waiting times for planned surgery in the five Danish regions?**

---

## Sub-questions

### Development over time (surgery)
- How has the number of **surgically available** and **surgically normed** beds changed over time per region?
- Are there **seasonal fluctuations** in capacity, waiting-time buckets, and number of surgeries?

### Comparison of regions (surgery)
- Which regions have the highest/lowest **surgical capacity level**, and how has the development been?
- Which regions have **shorter waiting times relative to capacity** (e.g., waiting list per 100 surgically available beds)?
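
The "waiting list per 100 surgically available beds" ratio mentioned above could be computed roughly as follows. This is a minimal sketch: the column names `Kirurgi_Venteliste_total` and `Disponible_senge` and the five January 2020 rows are taken from the merged table shown later in this notebook, while the helper `waitlist_per_100_beds` and the output column `waitlist_per_100_beds` are hypothetical names introduced only for illustration.

```python
import pandas as pd

# Five rows copied from the January 2020 slice of the merged table in this notebook.
sample = pd.DataFrame({
    "Region": ["Region Syddanmark", "Region Nordjylland", "Region Hovedstaden",
               "Region Sjælland", "Region Midtjylland"],
    "Disponible_senge": [671.0, 302.0, 1501.0, 413.0, 710.0],
    "Kirurgi_Venteliste_total": [864, 328, 1251, 604, 922],
})

def waitlist_per_100_beds(df):
    """Hypothetical helper: waiting-list size per 100 available surgical beds."""
    return 100 * df["Kirurgi_Venteliste_total"] / df["Disponible_senge"]

# Add the ratio as a new (hypothetical) column and rank regions by it.
sample["waitlist_per_100_beds"] = waitlist_per_100_beds(sample)
print(sample.sort_values("waitlist_per_100_beds")[["Region", "waitlist_per_100_beds"]])
```

A single month is only a snapshot; the same ratio would need to be computed per region and month before drawing any regional comparison.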

### Relationship between waiting time and capacity/activity (surgery)
- Is there a **correlation** between **surgically available beds** and the number of patients in waiting-time buckets (0–30, 31–60, 61–90, 90+ days)?
- Are **more surgically available beds** and/or **higher surgical activity** associated with **shorter waiting times**?

### Statistical analyses / models (surgery)
- Can a **(multiple) linear regression** predict the waiting-list size or the 90+ share from **surgical capacity** and **surgical activity** (with month/region controls)?
- Can a **classification model** identify months with **high load** (e.g., months in the top 25% for 90+ day waits)?  
  (Evaluated with a **confusion matrix** and accuracy/F1.)
- Do **correlation heatmaps, tree models, clusters,** and **3D visualizations** provide consistent patterns?
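
One way the "high load" target for the classification question could be built is sketched below. It assumes that "top 25%" refers to the 90+ share of the waiting list (90+ patients divided by the waiting-list total) and that the cut-off is the 75th percentile of that share; both are assumptions, not definitions given in this notebook. The rows are copied from the merged table shown later, and `share_90_plus` and `high_load` are hypothetical column names.

```python
import pandas as pd

# Rows copied from the merged table in this notebook:
# (Dato, Region, Kirurgi_Ventetid_90_plus_dage, Kirurgi_Venteliste_total)
rows = [
    ("2020-01-01", "Region Syddanmark", 147, 864),
    ("2020-01-01", "Region Nordjylland", 37, 328),
    ("2020-01-01", "Region Hovedstaden", 268, 1251),
    ("2020-01-01", "Region Sjælland", 156, 604),
    ("2020-01-01", "Region Midtjylland", 96, 922),
    ("2024-12-01", "Region Sjælland", 161, 800),
    ("2024-12-01", "Region Midtjylland", 186, 1286),
    ("2024-12-01", "Region Hovedstaden", 265, 1410),
    ("2024-12-01", "Region Nordjylland", 65, 654),
]
sample = pd.DataFrame(rows, columns=[
    "Dato", "Region", "Kirurgi_Ventetid_90_plus_dage", "Kirurgi_Venteliste_total",
])

# Assumed definition: share of the waiting list that has waited 90+ days.
sample["share_90_plus"] = (
    sample["Kirurgi_Ventetid_90_plus_dage"] / sample["Kirurgi_Venteliste_total"]
)

# Assumed cut-off: rows strictly above the 75th percentile count as high load.
threshold = sample["share_90_plus"].quantile(0.75)
sample["high_load"] = sample["share_90_plus"] > threshold

print(sample[["Dato", "Region", "share_90_plus", "high_load"]])
```

The resulting boolean column could then serve as the label for a classifier, with the confusion matrix computed against it. Whether the threshold should be set globally or per region is a design choice this sketch leaves open.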

In [None]:
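# Make `src/` importable so `src/hosp_dataloader.py` can load the surgical datasets.
# The path is built from the parent of the current working directory,
# so the notebook must be run from a folder that sits next to `src/`.

import os
import sys

sys.path.append(os.path.join(os.path.dirname(os.getcwd()), "src"))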

In [None]:
# Import the hosp_dataloader module and preview the datasets (loaded as pandas DataFrames)

import hosp_dataloader as hd
from IPython.display import display

kir_op = hd.kir_op
kir_sp = hd.kir_sp
kir_vt = hd.kir_vt

display(kir_op.head())
display(kir_sp.head())
display(kir_vt.head())

## Explanation
- `kir_op`: Surgical operations by region/year/month  
- `kir_sp`: Surgical bed capacity (available vs. normed) by region/year/month  
- `kir_vt`: Waiting-time buckets (0–30, 31–60, 61–90, 90+ days) by region/year/month  

In [None]:
# Show the size of each DataFrame (number of rows and columns)

for name, df in [("operations", kir_op), ("beds", kir_sp), ("waiting", kir_vt)]:
    print(f"{name}: {df.shape}")

In [None]:
# Apply the cleaning function from hosp_clean.py to each dataset 
# so they have consistent columns, valid year/month, and a proper 'Dato' column

import hosp_clean as hc

kir_op_cleaned = hc.clean(kir_op)
kir_sp_cleaned = hc.clean(kir_sp)
kir_vt_cleaned = hc.clean(kir_vt)
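
The source of `hc.clean` is not shown in this notebook. Purely as an illustration of the three goals named in the comment above (consistent columns, valid year/month, a proper `Dato` column), a cleaning step of this kind might look like the sketch below. `clean_sketch` is a hypothetical stand-in, the toy rows are invented for the example, and the real function may work differently.

```python
import pandas as pd

def clean_sketch(df):
    """Hypothetical stand-in for hc.clean; the real implementation may differ."""
    out = df.copy()

    # Consistent columns: strip stray whitespace from the column names.
    out.columns = [str(c).strip() for c in out.columns]

    # Valid year/month: coerce to numbers and drop rows that fail.
    out["År"] = pd.to_numeric(out["År"], errors="coerce")
    out["Måned"] = pd.to_numeric(out["Måned"], errors="coerce")
    valid = out["År"].notna() & out["Måned"].between(1, 12)
    out = out.loc[valid].copy()
    out["År"] = out["År"].astype(int)
    out["Måned"] = out["Måned"].astype(int)

    # Proper Dato column: first day of the month as a datetime.
    out["Dato"] = pd.to_datetime(
        pd.DataFrame({"year": out["År"], "month": out["Måned"], "day": 1})
    )
    return out.reset_index(drop=True)

# Toy input, invented for illustration: one good row, one bad month, one bad year.
raw = pd.DataFrame({
    " Region": ["Region Sjælland", "Region Sjælland", "Region Sjælland"],
    "År": ["2020", "2020", "ukendt"],
    "Måned": [1, 13, 2],
})
cleaned = clean_sketch(raw)
print(cleaned)
```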

In [None]:
# Display dataset structure (.info) for operations, beds, and waiting,
# with section headers to make the outputs easy to distinguish

In [None]:
print("=== Operations ===")
kir_op_cleaned.info()

In [None]:
print("\n=== Beds ===")
kir_sp_cleaned.info()

In [None]:
print("\n=== Waiting ===")
kir_vt_cleaned.info()

In [None]:
# Merge operations, beds, and waiting times on Region + Dato, then rebuild a single
# År/Måned pair from Dato and drop the suffixed duplicate time columns

import pandas as pd

def merge_all(kir_op_cleaned, kir_sp_cleaned, kir_vt_cleaned):
    # Inner joins keep only the Region/Dato pairs present in all three datasets
    df = (kir_op_cleaned
            .merge(kir_sp_cleaned, on=["Region","Dato"], how="inner", suffixes=("_op","_beds"))
            .merge(kir_vt_cleaned, on=["Region","Dato"], how="inner", suffixes=("", "_wait")))

    # Parse the date key once and derive year and month from it
    dato = pd.to_datetime(df["Dato"])
    df["År"] = dato.dt.year
    df["Måned"] = dato.dt.month

    # Drop leftover time columns such as År_op or Måned_beds
    dup_cols = [c for c in df.columns
                if (c.startswith("År") or c.startswith("Måned")) and c not in ("År", "Måned")]
    df = df.drop(columns=dup_cols)

    return df

all_kir_cleaned = merge_all(kir_op_cleaned, kir_sp_cleaned, kir_vt_cleaned)

display(all_kir_cleaned)


| | Dato | Region | Kirurgi_Operationer_total | Disponible_senge | Normerede_senge | Belægningsgrad_disponible | Belægningsgrad_normerede | Patienter_total | År | Måned | Kirurgi_Ventetid_0_30_dage | Kirurgi_Ventetid_31_60_dage | Kirurgi_Ventetid_61_90_dage | Kirurgi_Ventetid_90_plus_dage | Kirurgi_Venteliste_total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2020-01-01 | Region Syddanmark | 3592 | 671.0 | 675.0 | 77.0 | 76.0 | 15927.0 | 2020 | 1 | 323 | 241 | 153 | 147 | 864 |
| 1 | 2020-01-01 | Region Nordjylland | 1621 | 302.0 | 329.0 | 95.0 | 87.0 | 8893.0 | 2020 | 1 | 146 | 96 | 49 | 37 | 328 |
| 2 | 2020-01-01 | Region Hovedstaden | 4960 | 1501.0 | 1584.0 | 67.0 | 64.0 | 31201.0 | 2020 | 1 | 510 | 248 | 225 | 268 | 1251 |
| 3 | 2020-01-01 | Region Sjælland | 2115 | 413.0 | 443.0 | 74.0 | 69.0 | 9455.0 | 2020 | 1 | 201 | 153 | 94 | 156 | 604 |
| 4 | 2020-01-01 | Region Midtjylland | 3758 | 710.0 | 741.0 | 81.0 | 78.0 | 17906.0 | 2020 | 1 | 442 | 260 | 124 | 96 | 922 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 295 | 2024-12-01 | Region Sjælland | 2295 | 345.0 | 442.0 | 80.0 | 63.0 | 8608.0 | 2024 | 12 | 357 | 175 | 107 | 161 | 800 |
| 296 | 2024-12-01 | Region Midtjylland | 3591 | 572.0 | 646.0 | 77.0 | 68.0 | 13668.0 | 2024 | 12 | 653 | 317 | 130 | 186 | 1286 |
| 297 | 2024-12-01 | Region Hovedstaden | 5233 | 1057.0 | 1353.0 | 67.0 | 53.0 | 22044.0 | 2024 | 12 | 600 | 369 | 176 | 265 | 1410 |
| 298 | 2024-12-01 | Region Nordjylland | 1631 | 277.0 | 321.0 | 78.0 | 68.0 | 6726.0 | 2024 | 12 | 366 | 166 | 57 | 65 | 654 |
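
Because `merge_all` uses inner joins, any Region/Dato pair missing from one dataset disappears silently from the merged table. A quick check along the following lines could surface such rows before merging. It is a sketch only: the two small frames reuse December 2024 values from the table above, and the bed row for Region Nordjylland is deliberately left out to provoke a mismatch. `validate` and `indicator` are standard `DataFrame.merge` parameters.

```python
import pandas as pd

keys = ["Region", "Dato"]

# Small stand-ins for the cleaned operations and bed datasets.
ops = pd.DataFrame({
    "Region": ["Region Sjælland", "Region Nordjylland"],
    "Dato": pd.to_datetime(["2024-12-01", "2024-12-01"]),
    "Kirurgi_Operationer_total": [2295, 1631],
})
beds = pd.DataFrame({
    "Region": ["Region Sjælland"],  # Nordjylland omitted on purpose
    "Dato": pd.to_datetime(["2024-12-01"]),
    "Disponible_senge": [345.0],
})

# An outer join with indicator=True marks where each key was found;
# validate="one_to_one" raises a MergeError if a key is duplicated on either side.
checked = ops.merge(beds, on=keys, how="outer", validate="one_to_one", indicator=True)
unmatched = checked[checked["_merge"] != "both"]

print(unmatched[keys + ["_merge"]])
```

An empty `unmatched` frame would suggest that the inner join loses no rows for that pair of datasets.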
