# Surgical Capacity and Waiting Times for Planned Surgeries in Denmark  
*A Regional and Temporal Analysis*

# Problem Statement
**How do the capacity of _surgical_ hospital beds (available/normed) and surgical activity affect waiting times for planned surgery in the five Danish regions?**

---

## Sub-questions

### Development over time (surgery)
- How has the number of **surgically available** and **surgically normed** beds changed over time per region?
- Are there **seasonal fluctuations** in capacity, waiting-time buckets, and number of surgeries?

### Comparison of regions (surgery)
- Which regions have the highest/lowest **surgical capacity level**, and how has the development been?
- Which regions have **shorter waiting times relative to capacity** (e.g., waiting list per 100 surgically available beds)?
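
The relative measure in the last question could be computed along the lines of this minimal sketch. It assumes a DataFrame that already holds `Kirurgi_Venteliste_total` and `Disponible_senge` for the same region and month; the helper name `waiting_per_100_beds`, the output column name, and the sample values are hypothetical.

```python
import pandas as pd

def waiting_per_100_beds(df: pd.DataFrame) -> pd.Series:
    """Waiting-list size per 100 available surgical beds, row by row."""
    return 100 * df["Kirurgi_Venteliste_total"] / df["Disponible_senge"]

# Illustrative values only, not taken from the datasets
toy = pd.DataFrame({
    "Region": ["A", "B"],
    "Kirurgi_Venteliste_total": [1200, 300],
    "Disponible_senge": [1600.0, 500.0],
})
toy["Venteliste_pr_100_senge"] = waiting_per_100_beds(toy)
print(toy)
```

A lower value means fewer waiting patients per unit of capacity, which makes regions of different size comparable.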

### Relationship between waiting time and capacity/activity (surgery)
- Is there a **correlation** between **surgically available beds** and the number of patients in waiting-time buckets (0–30, 31–60, 61–90, 90+ days)?
- Are **more surgically available beds** and/or **higher surgical activity** associated with **shorter waiting times**?
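
One way to approach the correlation question is sketched below. It assumes a merged DataFrame containing `Disponible_senge` and the four bucket columns; the helper name `bed_bucket_correlations` and the sample values are hypothetical, and a correlation found this way says nothing about causation.

```python
import pandas as pd

BUCKETS = [
    "Kirurgi_Ventetid_0_30_dage",
    "Kirurgi_Ventetid_31_60_dage",
    "Kirurgi_Ventetid_61_90_dage",
    "Kirurgi_Ventetid_90_plus_dage",
]

def bed_bucket_correlations(df: pd.DataFrame, method: str = "pearson") -> pd.Series:
    """Correlation of available beds with each waiting-time bucket."""
    return df[BUCKETS].corrwith(df["Disponible_senge"], method=method)

# Illustrative values only, not taken from the datasets
toy = pd.DataFrame({
    "Disponible_senge": [100.0, 110.0, 120.0, 130.0],
    "Kirurgi_Ventetid_0_30_dage": [10, 20, 30, 40],
    "Kirurgi_Ventetid_31_60_dage": [15, 12, 18, 14],
    "Kirurgi_Ventetid_61_90_dage": [9, 11, 8, 10],
    "Kirurgi_Ventetid_90_plus_dage": [40, 30, 20, 10],
})
corr = bed_bucket_correlations(toy)
print(corr)
```

Passing `method="spearman"` gives a rank-based alternative that is less sensitive to outliers.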

### Statistical analyses / models (surgery)
- Can a **multiple linear regression** predict the waiting-list size or the 90+ share from **surgical capacity** and **surgical activity** (with month/region controls)?
- Can a **classification model** identify months with **high load** (e.g., the top 25% of months by the 90+ waiting-time bucket)?  
  (Evaluated with a **confusion matrix** and accuracy/F1.)
- Do **correlation heatmaps, tree models, clusters,** and **3D visualizations** provide consistent patterns?
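
The high-load label and its evaluation could look like the following minimal sketch. It assumes that "top 25%" means a 90+ count strictly above the 75th percentile; the helper names (`flag_high_load`, `confusion`, `f1`), the sample counts, and the predictions are hypothetical and stand in for the output of whichever classifier is eventually used.

```python
import pandas as pd

def flag_high_load(waiting_90_plus: pd.Series, quantile: float = 0.75) -> pd.Series:
    """True for months whose 90+ count lies strictly above the given quantile."""
    return waiting_90_plus > waiting_90_plus.quantile(quantile)

def confusion(actual: pd.Series, predicted: pd.Series) -> pd.DataFrame:
    """2x2 confusion matrix with actual labels as rows, predictions as columns."""
    return pd.crosstab(actual, predicted, rownames=["actual"], colnames=["predicted"])

def f1(actual: pd.Series, predicted: pd.Series) -> float:
    """F1 score for the positive (high-load) class."""
    tp = int((actual & predicted).sum())
    fp = int((~actual & predicted).sum())
    fn = int((actual & ~predicted).sum())
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

# Illustrative values only, not taken from the datasets
counts = pd.Series([10, 20, 30, 40, 50, 60, 70, 80])
actual = flag_high_load(counts)
# Hypothetical model output, one prediction per month
predicted = pd.Series([False, False, False, False, False, True, True, False])

cm = confusion(actual, predicted)
print(cm)
print("F1:", f1(actual, predicted))
```

Because only about a quarter of the months are positive by construction, accuracy alone can look good for a model that never predicts high load, which is why F1 is reported alongside it.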

In [1]:
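# Make `src/` importable so that `src/hosp_dataloader.py` can be used to load the surgical datasets.
# Assumes the notebook runs from a directory that sits next to `src/`.

import os
import sys

sys.path.append(os.path.join(os.path.dirname(os.getcwd()), "src"))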

In [2]:
# Import the hosp_dataloader module and preview the datasets (loaded as pandas DataFrames)

import hosp_dataloader as hd
from IPython.display import display

kir_op = hd.kir_op
kir_sp = hd.kir_sp
kir_vt = hd.kir_vt

display(kir_op.head())
display(kir_sp.head())
display(kir_vt.head())

|   | Region | År | Måned | Kirurgi_Operationer_total |
|---|---|---|---|---|
| 0 | Region Hovedstaden | 2015 | 1 | 5143 |
| 1 | Region Hovedstaden | 2015 | 2 | 5139 |
| 2 | Region Hovedstaden | 2015 | 3 | 5636 |
| 3 | Region Hovedstaden | 2015 | 4 | 5507 |
| 4 | Region Hovedstaden | 2015 | 5 | 5091 |


|   | Region | År | Måned | Disponible_senge | Normerede_senge | Belægningsgrad_disponible | Belægningsgrad_normerede | Patienter_total |
|---|---|---|---|---|---|---|---|---|
| 0 | Region Hovedstaden | 2014 | 1 | 1631.0 | 1662.0 | 81.0 | 79.0 | 40895.0 |
| 1 | Region Hovedstaden | 2014 | 2 | 1602.0 | 1651.0 | 81.0 | 79.0 | 36503.0 |
| 2 | Region Hovedstaden | 2014 | 3 | 1641.0 | 1659.0 | 80.0 | 79.0 | 40555.0 |
| 3 | Region Hovedstaden | 2014 | 4 | 1580.0 | 1629.0 | 79.0 | 77.0 | 37499.0 |
| 4 | Region Hovedstaden | 2014 | 5 | 1629.0 | 1656.0 | 78.0 | 76.0 | 39248.0 |


|   | Region | År | Måned | Kirurgi_Ventetid_0_30_dage | Kirurgi_Ventetid_31_60_dage | Kirurgi_Ventetid_61_90_dage | Kirurgi_Ventetid_90_plus_dage | Kirurgi_Venteliste_total |
|---|---|---|---|---|---|---|---|---|
| 0 | Region Hovedstaden | 2020 | 1 | 510 | 248 | 225 | 268 | 1251 |
| 1 | Region Hovedstaden | 2020 | 2 | 510 | 248 | 225 | 268 | 1251 |
| 2 | Region Hovedstaden | 2020 | 3 | 509 | 247 | 225 | 267 | 1248 |
| 3 | Region Hovedstaden | 2020 | 4 | 335 | 76 | 99 | 295 | 805 |
| 4 | Region Hovedstaden | 2020 | 5 | 334 | 76 | 99 | 295 | 804 |


## Explanation
- `kir_op`: Total surgical operations by region/year/month
- `kir_sp`: Surgical bed capacity (available vs. normed beds), the occupancy rate for each, and total patients by region/year/month
- `kir_vt`: Patients per waiting-time bucket (0–30, 31–60, 61–90, 90+ days) and the total waiting list by region/year/month
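
Since all three tables share the keys `Region`, `År`, and `Måned`, they can be combined into one analysis table. The sketch below is one possible way to do that; the helper name `merge_surgical`, the choice of an inner join, and the sample rows are assumptions, not part of the project code.

```python
import pandas as pd

KEYS = ["Region", "År", "Måned"]

def merge_surgical(op: pd.DataFrame, sp: pd.DataFrame, vt: pd.DataFrame,
                   how: str = "inner") -> pd.DataFrame:
    """Join operations, beds, and waiting times on region/year/month."""
    return op.merge(sp, on=KEYS, how=how).merge(vt, on=KEYS, how=how)

# Illustrative rows only, not taken from the datasets
op = pd.DataFrame({"Region": ["A", "A"], "År": [2020, 2020], "Måned": [1, 2],
                   "Kirurgi_Operationer_total": [100, 110]})
sp = pd.DataFrame({"Region": ["A", "A"], "År": [2019, 2020], "Måned": [12, 1],
                   "Disponible_senge": [48.0, 50.0]})
vt = pd.DataFrame({"Region": ["A"], "År": [2020], "Måned": [1],
                   "Kirurgi_Venteliste_total": [20]})

merged = merge_surgical(op, sp, vt)
print(merged)
```

An inner join keeps only the months present in all three tables; `how="outer"` would keep every month and leave gaps as `NaN`.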

In [3]:
# Show the size of each DataFrame (number of rows and columns)

for name, df in [("operations", kir_op), ("beds", kir_sp), ("waiting", kir_vt)]:
    print(f"{name}: {df.shape}")

operations: (600, 4)
beds: (690, 8)
waiting: (330, 8)
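
The row counts differ, and the previews start in different years (2015, 2014, and 2020), which suggests that the three tables cover different periods. A minimal sketch for checking the covered period of a table follows; the helper name `period_span` and the sample rows are hypothetical.

```python
import pandas as pd

def period_span(df: pd.DataFrame) -> tuple:
    """Return (first (year, month), last (year, month), number of distinct months)."""
    keys = df[["År", "Måned"]].drop_duplicates().sort_values(["År", "Måned"])
    first = tuple(int(v) for v in keys.iloc[0])
    last = tuple(int(v) for v in keys.iloc[-1])
    return first, last, len(keys)

# Illustrative rows only, not taken from the datasets
toy = pd.DataFrame({
    "Region": ["A", "B", "A", "B"],
    "År": [2020, 2020, 2019, 2019],
    "Måned": [1, 1, 12, 12],
})
span = period_span(toy)
print(span)
```

Applied to `kir_op`, `kir_sp`, and `kir_vt`, this would show which months are available in all three tables before they are compared.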


In [None]:
# Clean each dataset with the `hosp_clean` module (assumes the module exposes a `clean` function)

import hosp_clean as hc

kir_op_cleaned = hc.clean(kir_op)
kir_sp_cleaned = hc.clean(kir_sp)
kir_vt_cleaned = hc.clean(kir_vt)

['__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'clean_beds', 'hd', 'os', 'pd']


In [None]:
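# Inspect dtypes and non-null counts.
# `DataFrame.info()` prints its summary and returns None, so it is called directly rather than through `display()`.

kir_op_cleaned.info()
kir_sp_cleaned.info()
kir_vt_cleaned.info()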
