
# 🏥 BI Dashboard Notebook — Hospitals & Waiting Times

This notebook is designed to **run your existing BI Python script** and present **all diagrams** with clear, structured explanations.  
You can keep developing in the `.py` file, and simply re-run this notebook to refresh all visuals.

**What you get here:**
- Import & execute your existing Python file (no code duplication)
- Narrative markdown cells explaining **what the code does** and **what the charts show**
- Optional utility cells to re-run specific plots without re-importing everything



## 1) Configuration

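Set the path to your BI Python file below.  
Keep this notebook **in the same project** as the script and its `data/` folder: a relative `PY_FILE` is resolved against the notebook's working directory, while the script locates `data/` relative to its own file (see section 3).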


In [None]:

# --- REQUIRED: point to your BI script (.py) ---
# If this notebook sits next to your .py file, the default below will work.
PY_FILE = "e8378042-da47-43d4-a7f4-446cdaecb2ab.py"

# If your file has a different name or location, update the path:
# PY_FILE = "path/to/your_script.py"



## 2) Environment (optional)

If you're running this for the first time in a fresh environment, you may need these installs.  
If everything already works, you can **skip** this cell.


In [None]:

# Uncomment if needed
# %pip install pandas numpy matplotlib seaborn scikit-learn graphviz scipy openpyxl



## 3) Import the BI module (executes your code)

The cell below **loads your `.py` file as a module** from its file path and executes it once.  
Since your script defines `BASE_DIR` using `__file__`, and a module loaded from a file path has `__file__` set to that path, the `data/` paths resolve relative to the script rather than to the notebook.

> **Note:** Because your `.py` runs its plotting code at the top level, importing the module will **display all figures** once.


In [None]:

import importlib.util, sys, pathlib

mod_name = "bi_project_module"
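py_path = pathlib.Path(PY_FILE).resolve()
# Fail early with an explicit error if the configured path is wrong.
if not py_path.is_file():
    raise FileNotFoundError(f"File not found: {py_path}")

# Build an import spec from the file path, then execute the file as a module.
spec = importlib.util.spec_from_file_location(mod_name, str(py_path))
bi = importlib.util.module_from_spec(spec)
sys.modules[mod_name] = bi  # register before executing, as a regular import would
spec.loader.exec_module(bi)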

print(f"Imported {mod_name} from {py_path}")
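
Because the `.py` file keeps changing while you develop, it can help to wrap the loading steps in a function that you call again after each edit. The following is a minimal sketch of that idea, not part of your script; the name `load_module_from_path` is hypothetical.

```python
import importlib.util
import sys
from pathlib import Path


def load_module_from_path(py_file, mod_name="bi_project_module"):
    """Execute a .py file as a module and return the module object.

    Calling this again re-executes the file, so edits made to the script
    since the previous call are picked up.
    """
    py_path = Path(py_file).resolve()
    if not py_path.is_file():
        raise FileNotFoundError(f"File not found: {py_path}")

    spec = importlib.util.spec_from_file_location(mod_name, str(py_path))
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot build an import spec for {py_path}")

    module = importlib.util.module_from_spec(spec)
    sys.modules[mod_name] = module  # register before executing the file
    try:
        spec.loader.exec_module(module)
    except Exception:
        # Do not leave a half-initialised module registered.
        sys.modules.pop(mod_name, None)
        raise
    return module
```

With this in place, `bi = load_module_from_path(PY_FILE)` would refresh `bi` (and, while the plots still run at the top level of the script, re-render all figures).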



## 4) Quick data sanity check

Peek at the cleaned frames exposed by your module to confirm shapes and a few rows.


In [None]:

import pandas as pd

frames = {
    "riget_disp_clean_df": getattr(bi, "riget_disp_clean_df", None),
    "riget_norm_clean_df": getattr(bi, "riget_norm_clean_df", None),
    "aarhus_disp_clean_df": getattr(bi, "aarhus_disp_clean_df", None),
    "aarhus_norm_clean_df": getattr(bi, "aarhus_norm_clean_df", None),
    "wp_hstaden_clean_df": getattr(bi, "wp_hstaden_clean_df", None),
    "wp_mjylland_clean_df": getattr(bi, "wp_mjylland_clean_df", None),
    "all_hosp_disp": getattr(bi, "all_hosp_disp", None),
    "all_hosp_norm": getattr(bi, "all_hosp_norm", None),
}

for name, df in frames.items():
    if isinstance(df, pd.DataFrame):
        print(f"{name:>24}: shape={df.shape}")
        display(df.head(3))
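
The loop above skips any name that the module does not expose, so a missing frame goes unnoticed. A stricter check could look like the sketch below. It assumes only that `frames` maps names to DataFrames or `None`; the helper name `summarize_frames` is hypothetical.

```python
import pandas as pd


def summarize_frames(frames):
    """Return one row per expected frame, flagging those the module did not expose."""
    rows = []
    for name, df in frames.items():
        found = isinstance(df, pd.DataFrame)
        rows.append({
            "frame": name,
            "found": found,
            "rows": df.shape[0] if found else 0,
            "cols": df.shape[1] if found else 0,
            # Total number of missing cells across all columns.
            "null_cells": int(df.isna().sum().sum()) if found else 0,
        })
    return pd.DataFrame(rows).set_index("frame")
```

`display(summarize_frames(frames))` would then list every expected name in one table, including those with `found == False`.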



## 5) What the code does (high level)

**Load & Clean**
- Loads hospital bed capacity datasets (available and occupied) for **Rigshospitalet** and **Aarhus Universitetshospital**.
- Loads patient **waiting time** buckets for **Region Hovedstaden** and **Region Midtjylland**.
- Fixes header issues, standardizes column names, enforces valid year/month, and converts numeric columns.

**Combine**
- Concatenates hospitals and regions into `all_hosp_disp`, `all_hosp_norm`, and a long-form waiting dataset.
- Computes `TotalBeds` per region-month for capacity comparisons.

**Visualize**
- **Histograms** for bed counts by department.
- **Time-series line plots** for departments over time.
- **Correlation heatmaps** for department-level relationships.
- **Scatter**: patients waiting vs available beds (colored by waiting bucket; styled by region).
- **Bar plots**: average beds per year.
- **Box plots**: distribution of beds across departments.
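
For orientation, the Load & Clean and Combine steps could be sketched roughly as below. This is an illustration under stated assumptions, not the code in your script: the department names come from the histogram section and `År` from the yearly bar plots, while the `Måned` and `Region` column names and both function names are hypothetical.

```python
import pandas as pd

# Department columns as named in the histogram section of this notebook.
DEPARTMENTS = ["Kirurgi", "Medicin", "Onkologi", "Øvrige"]


def clean_beds(raw, region):
    """Standardise headers, coerce numerics, and keep only valid year/month rows."""
    df = raw.copy()
    df.columns = [str(c).strip() for c in df.columns]
    # Non-numeric cells become NaN instead of raising.
    for col in ["År", "Måned"] + DEPARTMENTS:
        df[col] = pd.to_numeric(df[col], errors="coerce")
    df = df.dropna(subset=["År", "Måned"])
    df = df[df["Måned"].between(1, 12)]
    df = df.astype({"År": int, "Måned": int})
    df["Region"] = region  # assumed label column, used for grouping below
    return df.reset_index(drop=True)


def total_beds(df):
    """Sum the department columns into TotalBeds per region and month."""
    out = df.copy()
    out["TotalBeds"] = out[DEPARTMENTS].sum(axis=1)
    keys = ["Region", "År", "Måned"]
    return out.groupby(keys, as_index=False)["TotalBeds"].sum()
```

Frames cleaned this way can be stacked with `pd.concat([...], ignore_index=True)` before `total_beds` is applied.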



## 6) Histograms — Distribution of beds

These histograms show how bed counts are distributed across months for each department (**Kirurgi, Medicin, Onkologi, Øvrige**).  
Use them to spot **skew**, **outliers**, and **typical ranges**.

> These figures were already rendered when you imported the module. Re-run the cell below to regenerate selected ones.


In [None]:

# Re-run selected histograms (optional)
bi.plot_hosp_histogram(bi.riget_disp_clean_df, title="Available Beds Riget")
bi.plot_hosp_histogram(bi.riget_norm_clean_df, title="Occupied Beds Riget")
bi.plot_hosp_histogram(bi.aarhus_disp_clean_df, title="Available Beds Aarhus")
bi.plot_hosp_histogram(bi.aarhus_norm_clean_df, title="Occupied Beds Aarhus")
bi.plot_hosp_histogram(bi.all_hosp_disp, title="All Available")
bi.plot_hosp_histogram(bi.all_hosp_norm, title="All Occupied")
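
To back the visual impression with numbers, skew and typical ranges can also be tabulated directly. The sketch below assumes the department columns are numeric; `distribution_summary` is a hypothetical helper, not a function in your script.

```python
import pandas as pd


def distribution_summary(df, columns):
    """Tabulate range, quartiles, and sample skewness for each column."""
    summary = df[columns].describe().T[["min", "25%", "50%", "75%", "max"]]
    # Positive skew: long right tail; negative skew: long left tail.
    return summary.assign(skew=df[columns].skew())
```

For example, `distribution_summary(bi.riget_disp_clean_df, ["Kirurgi", "Medicin", "Onkologi", "Øvrige"])` would give one row per department, provided those columns exist in the frame.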



## 7) Beds over time — Trends

Time-series charts show month-by-month trends for departments.  
Use them to identify **seasonality**, **shifts in capacity**, or **structural changes**.


In [None]:

# Re-run selected time-series (optional)
bi.plot_beds_over_time(bi.riget_disp_clean_df, title="Available Beds Riget")
bi.plot_beds_over_time(bi.riget_norm_clean_df, title="Occupied Beds Riget")
bi.plot_beds_over_time(bi.aarhus_disp_clean_df, title="Available Beds Aarhus")
bi.plot_beds_over_time(bi.aarhus_norm_clean_df, title="Occupied Beds Aarhus")
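
One way to separate seasonality from longer-term shifts is to compare the monthly series with its 12-month rolling mean, which averages out a yearly cycle. The sketch below assumes integer-like year and month columns; `År` appears elsewhere in this notebook, whereas `Måned` and both function names are hypothetical.

```python
import pandas as pd


def to_monthly_series(df, value_col, year_col="År", month_col="Måned"):
    """Build a date-indexed series (first day of each month) from year/month columns."""
    parts = pd.DataFrame({
        "year": df[year_col].astype(int),
        "month": df[month_col].astype(int),
        "day": 1,
    })
    dates = pd.to_datetime(parts)
    series = pd.Series(
        df[value_col].to_numpy(),
        index=pd.DatetimeIndex(dates),
        name=value_col,
    )
    return series.sort_index()


def rolling_trend(series, window=12):
    """Rolling mean over `window` months; the first window - 1 values are NaN."""
    return series.rolling(window=window).mean()
```

Plotting the raw series and `rolling_trend(series)` on the same axes makes a level shift visible even when the monthly values fluctuate strongly.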



## 8) Correlation heatmaps — Department relationships

Correlation heatmaps compare departments to see whether capacity tends to **move together** (positive) or **trade off** (negative).  
Keep in mind that correlation ≠ causation: the coefficients only measure **linear co-movement** and can miss non-linear relationships.


In [None]:

# Re-run heatmaps (optional)
bi.plot_correlation_heatmap(bi.all_hosp_disp, title="Hospital Bed Data (available) — Correlation Heatmap")
bi.plot_correlation_heatmap(bi.all_hosp_norm, title="Hospital Bed Data (occupied) — Correlation Heatmap")
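
The numbers behind a heatmap like this can be inspected directly. The sketch below assumes Pearson correlation, which is the pandas default; whether `plot_correlation_heatmap` uses the same method is not confirmed here. Both function names are hypothetical.

```python
import pandas as pd


def department_correlations(df, columns, method="pearson"):
    """Pairwise correlation matrix of the given department columns."""
    return df[columns].corr(method=method)


def ranked_pairs(corr):
    """List each column pair once, strongest absolute correlation first."""
    cols = list(corr.columns)
    pairs = [
        (a, b, float(corr.loc[a, b]))
        for i, a in enumerate(cols)
        for b in cols[i + 1:]
    ]
    return sorted(pairs, key=lambda p: abs(p[2]), reverse=True)
```

Ranking the pairs is a quick way to see which department relationships deserve a closer look in the heatmap.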



## 9) Waiting patients vs available beds — Capacity signal

This scatter compares **patients waiting** (y-axis) against **available beds** (x-axis), segmented by `bucket` (waiting time bands) and region.  
Look for patterns such as **higher waiting at lower capacity** or **bucket-specific clustering**.


In [None]:

# Re-run the scatter (optional)
bi.plot_wait_vs_capacity_scatter(bi.wait_vs_beds, title="Waiting Patients vs Available Beds")
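
A frame such as `wait_vs_beds` pairs each waiting-time row with the bed capacity of the same region and month. How your script builds it is not shown here; the sketch below is one plausible construction, assuming a long-form waiting table and a `TotalBeds` table that share the key columns. The key names `Region` and `Måned` and the function name are hypothetical.

```python
import pandas as pd


def join_wait_and_beds(wait_long, beds_total, keys=("Region", "År", "Måned")):
    """Attach TotalBeds to every waiting-bucket row of the same region and month.

    validate="many_to_one" makes pandas raise if beds_total has duplicate
    keys, which would otherwise silently multiply the waiting rows.
    """
    return wait_long.merge(
        beds_total,
        on=list(keys),
        how="inner",  # keep only months present in both tables
        validate="many_to_one",
    )
```

The inner join drops months that are missing from either side, so comparing `len(wait_long)` with the length of the result shows how much data the scatter actually uses.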



## 10) Average beds per year — Capacity levels

Bar charts show the **average** beds per year by department.  
Use them to communicate **year-over-year** changes at a glance.


In [None]:

# Re-run yearly averages (optional)
bi.plot_avg_beds_per_year(bi.riget_disp_clean_df, group_col="År", title="Riget — Avg Available Beds per Year")
bi.plot_avg_beds_per_year(bi.aarhus_disp_clean_df, group_col="År", title="Aarhus — Avg Available Beds per Year")
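
To quote year-over-year changes as numbers, the yearly averages can be computed without plotting. This sketch assumes the bar chart averages each department column within each `År` group; the function names are hypothetical.

```python
import pandas as pd


def avg_beds_per_year(df, columns, group_col="År"):
    """Mean of each department column within each year."""
    return df.groupby(group_col)[columns].mean()


def year_over_year_pct(yearly):
    """Percentage change of each yearly average relative to the previous year.

    The first year has no predecessor, so its row is NaN.
    """
    return (yearly / yearly.shift(1) - 1.0) * 100.0
```

Note that a year with only a few months of data still produces an average, so partial years should be read with care.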



## 11) Box plots — Variability across departments

Box plots summarize **median**, **interquartile range**, and **outliers** for each department.  
They are well suited to highlighting **spread** and **comparative variability**.


In [None]:

# Re-run box plots (optional)
bi.plot_department_boxplots(bi.all_hosp_disp, "Available Beds — Boxplot per Department")
bi.plot_department_boxplots(bi.all_hosp_norm, "Occupied Beds — Boxplot per Department")
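
The quantities a box plot draws can be computed by hand. The sketch below uses the common 1.5 × IQR rule for outliers, which is matplotlib's default whisker setting; whether `plot_department_boxplots` keeps that default is an assumption. `boxplot_stats` is a hypothetical helper built on the standard library only.

```python
from statistics import quantiles


def boxplot_stats(values, whisker=1.5):
    """Quartiles, IQR, outlier fences, and outliers for one department."""
    data = sorted(float(v) for v in values)
    # method="inclusive" interpolates linearly between data points.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - whisker * iqr
    upper = q3 + whisker * iqr
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "lower_fence": lower,
        "upper_fence": upper,
        "outliers": [v for v in data if v < lower or v > upper],
    }
```

For a single column, `boxplot_stats(bi.all_hosp_disp["Kirurgi"].dropna())` would list the months that the plot marks as outliers, assuming that column exists.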



## 12) Tips & next steps

- To keep imports clean, consider moving plot calls in your `.py` under `if __name__ == "__main__":` and exposing functions only.  
  The notebook can then call the functions explicitly — giving you full control over which figures render.
- You can add **parameter cells** (e.g., date filters, region selection) and pass them into your plotting functions.
- Notebook generated on **2025-10-01 19:06**.
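
The first two tips could be combined roughly as follows: the script exposes functions that accept filter parameters, and anything that should run only when the file is executed directly sits behind the `__main__` guard. This is a sketch with hypothetical names (`filter_frame`, `main`, the `Region` column) and made-up demo values, not a rewrite of your script.

```python
import pandas as pd


def filter_frame(df, year_from=None, year_to=None, regions=None,
                 year_col="År", region_col="Region"):
    """Return the rows matching the optional year range and region list."""
    mask = pd.Series(True, index=df.index)
    if year_from is not None:
        mask &= df[year_col] >= year_from
    if year_to is not None:
        mask &= df[year_col] <= year_to
    if regions is not None:
        mask &= df[region_col].isin(list(regions))
    return df.loc[mask].copy()


def main():
    """Runs for `python your_script.py`, but not when the notebook imports the file."""
    demo = pd.DataFrame({
        "År": [2021, 2022, 2023, 2023],
        "Region": ["Hovedstaden", "Hovedstaden", "Midtjylland", "Hovedstaden"],
        "TotalBeds": [100, 110, 90, 120],  # illustrative values only
    })
    print(filter_frame(demo, year_from=2022, regions=["Hovedstaden"]))


if __name__ == "__main__":
    main()
```

In the notebook, a parameter cell would then set values such as `YEAR_FROM = 2022` and pass the filtered frame to a plotting function, for example `bi.plot_beds_over_time(filter_frame(df, year_from=YEAR_FROM), title=...)`.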
