# DICOM Viewer with AP/Lateral Labels

A notebook for reviewing DSA (Digital Subtraction Angiography) DICOM images with occlusion location labels.

---

## How to Use

### Running the Notebook
1. **Cell 1:** Installs required packages; in Colab, also mounts Google Drive and enables the widget manager
2. **Cell 2:** Sets file paths based on environment
3. **Cell 3:** Loads helper functions
4. **Cell 4:** Reads the Excel file and builds the list of DICOM images
5. **Cell 5:** Displays the interactive viewer

### Viewer Layout

**Left Panel - Run Review:**
- **Mark as OK / Mark for manual review:** Buttons to flag the current run's status
- **Save All:** Saves all mark statuses to the Excel file
- **Run list:** Click any run to load it in the viewer. Shows status prefix:
  - `[OK]` - Marked as reviewed and okay
  - `[REVIEW]` - Marked for manual review
  - `[--]` - Not yet reviewed

**Right Panel - Image Viewer:**
- **Previous Run / Next Run:** Navigate between different DSA runs
- **Prev Frame / Next Frame:** Navigate frames within a multi-frame DICOM
- **Frame slider:** Drag to quickly scrub through frames
- **Study info:** Shows Study_Key, column (AP_1, Lateral_1, etc.), filename, location label, and frame count
- **DICOM image:** The current frame of the DSA run
- **Notes:** Text area to add notes for the current study
- **Save Notes:** Saves notes to the Excel file

### Workflow for Manual Review

1. **Start reviewing:** The viewer loads the first run automatically
2. **View the DSA:** Use the frame slider or Prev/Next Frame buttons to scrub through the angiogram
3. **Check the location label:** Compare the displayed "Location" (e.g., "L M2") with what the imaging shows
4. **Mark the run:**
   - Click **Mark as OK** if the label is correct
   - Click **Mark for manual review** if the label needs correction or is unclear
5. **Add notes (optional):** Type any observations in the notes box and click **Save Notes**
6. **Move to next run:** Click **Next Run** or select from the list on the left
7. **Save your progress:** Click **Save All** periodically to save mark statuses to Excel

### Data Columns in Excel

The notebook reads/writes these columns:
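- `Accession` - Accession number; names the subfolder under the base directory that holds the study's DICOM files (read only, not shown in the viewer)
- `Study_Key` - Anonymous UUID shown to the user in place of the accession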
- `AP_1`, `AP_2`, `AP_3` - AP view DICOM filenames
- `Lateral_1`, `Lateral_2`, `Lateral_3` - Lateral view DICOM filenames
- `AP_1_Location`, `Lateral_1_Location`, etc. - Occlusion location labels
- `AP_1_Review_Flag`, `Lateral_1_Review_Flag`, etc. - Review status per run (`OK`, `Review`, or blank if not yet reviewed)
- `Manual Review Notes` - Free-text notes per study
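
The naming pattern above can be sketched in a few lines. This is only an illustration: `expected_columns` and `missing_columns` are hypothetical helpers that do not exist in the notebook, and the run order is assumed to match `RUN_ORDER` from Cell 2.

```python
# Hypothetical helpers (not part of the notebook) that derive the column
# names described above from the run order.
RUN_ORDER = ["AP_1", "Lateral_1", "AP_2", "Lateral_2", "AP_3", "Lateral_3"]


def expected_columns(run_order: list[str]) -> list[str]:
    """Return every column the viewer reads or writes, in a stable order."""
    columns = ["Accession", "Study_Key"]
    columns += run_order                                    # DICOM filenames
    columns += [f"{run}_Location" for run in run_order]     # occlusion labels
    columns += [f"{run}_Review_Flag" for run in run_order]  # OK / Review / blank
    columns.append("Manual Review Notes")
    return columns


def missing_columns(header: list[str], run_order: list[str]) -> list[str]:
    """List expected columns that are absent from a spreadsheet header."""
    present = set(header)
    return [name for name in expected_columns(run_order) if name not in present]


if __name__ == "__main__":
    # A header with only the identifier and filename columns.
    header = ["Accession", "Study_Key", *RUN_ORDER]
    print(missing_columns(header, RUN_ORDER))
```

Note that the notebook itself only fails on a missing `Accession`, `Study_Key`, or filename column; the review-flag and notes columns are created on load if absent, and location labels are shown as blank when their column is missing.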

---


In [1]:
# Detect environment and setup
import sys
import os

# Check if running in Google Colab
IN_COLAB = 'google.colab' in sys.modules

if IN_COLAB:
    # Mount Google Drive
    from google.colab import drive
    drive.mount('/content/drive')
    print("Google Drive mounted at /content/drive")

    # Install required packages
    !pip install -q pandas pydicom matplotlib ipywidgets openpyxl

    # Enable ipywidgets in Colab
    from google.colab import output
    output.enable_custom_widget_manager()
    print("Widget manager enabled for Colab")
else:
    print(f"Running locally: {sys.executable}")
    !{sys.executable} -m pip install pandas pydicom matplotlib ipywidgets openpyxl

Mounted at /content/drive
Google Drive mounted at /content/drive
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m2.4/2.4 MB[0m [31m38.6 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m1.6/1.6 MB[0m [31m64.7 MB/s[0m eta [36m0:00:00[0m
Widget manager enabled for Colab


In [2]:
from pathlib import Path
import sys

# Detect environment
IN_COLAB = 'google.colab' in sys.modules

# Set paths based on environment
if IN_COLAB:
    # Google Colab paths (Google Drive mounted at /content/drive)
    EXCEL_PATH = Path("/content/drive/MyDrive/M2_M3_data/AP_Lateral_Labels_Split.xlsx")
    BASE_DIR = Path("/content/drive/MyDrive/M2_M3_data")
else:
    # Local Windows paths
    EXCEL_PATH = Path(r"H:\My Drive\M2_M3_data\AP_Lateral_Labels_Split.xlsx")
    BASE_DIR = Path(r"H:\My Drive\M2_M3_data")

print(f"Environment: {'Google Colab' if IN_COLAB else 'Local'}")
print(f"Excel path: {EXCEL_PATH}")
print(f"Base dir: {BASE_DIR}")

# Ordered by DSA run: AP then Lateral for each run
RUN_ORDER = [
    "AP_1",
    "Lateral_1",
    "AP_2",
    "Lateral_2",
    "AP_3",
    "Lateral_3",
]
COLUMNS_TO_CHECK = RUN_ORDER

# Column to show to users instead of Accession
KEY_COLUMN = "Study_Key"


Environment: Google Colab
Excel path: /content/drive/MyDrive/M2_M3_data/Accession_MRN_AP_Lateral_Labels_Split.xlsx
Base dir: /content/drive/MyDrive/M2_M3_data


In [3]:
try:
    import pandas as pd
    import numpy as np
    import pydicom
    import matplotlib.pyplot as plt
    import ipywidgets as widgets
    from IPython.display import display, clear_output
except ImportError as exc:
    raise ImportError(
        "Missing packages. Install with: pip install pandas pydicom matplotlib ipywidgets openpyxl"
    ) from exc


def normalize_value(value: object) -> str:
    """Return a stripped string, mapping None/NaN/"nan" cells to ""."""
    if value is None or (isinstance(value, float) and pd.isna(value)):
        return ""
    text = str(value).strip()
    if text.lower() == "nan":
        return ""
    return text


def candidate_paths(base_dir: Path, accession: str, image_value: str) -> list[Path]:
    """List the paths to try for a run, with and without a .dcm suffix."""
    accession_dir = base_dir / accession
    if image_value.lower().endswith(".dcm"):
        return [accession_dir / image_value]
    return [accession_dir / f"{image_value}.dcm", accession_dir / image_value]


def dcm_display_name(base_dir: Path, accession: str, image_value: str) -> str:
    """Return the filename to show in the viewer, with a .dcm suffix."""
    accession_dir = base_dir / accession
    if image_value.lower().endswith(".dcm"):
        return (accession_dir / image_value).name
    return f"{image_value}.dcm"


def location_column_for(column: str) -> str:
    return f"{column}_Location"


def dicom_to_image(ds: pydicom.Dataset) -> np.ndarray:
    image = ds.pixel_array.astype(np.float32)
    if image.ndim == 3:
        # Multi-frame DICOM: show the middle frame
        mid_index = image.shape[0] // 2
        image = image[mid_index]
    slope = float(getattr(ds, "RescaleSlope", 1.0))
    intercept = float(getattr(ds, "RescaleIntercept", 0.0))
    image = image * slope + intercept
    if getattr(ds, "PhotometricInterpretation", "") == "MONOCHROME1":
        image = np.max(image) - image
    return image
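
The lookup rule in `candidate_paths` can be checked without any DICOM files on disk. The sketch below copies the helper and runs it on pure paths; the base directory `/data` and the accession `ACC001` are made-up values for illustration only.

```python
from pathlib import PurePosixPath


def candidate_paths(base_dir, accession, image_value):
    """Copy of the notebook helper; pure paths mean nothing touches the disk."""
    accession_dir = base_dir / accession
    if image_value.lower().endswith(".dcm"):
        return [accession_dir / image_value]
    return [accession_dir / f"{image_value}.dcm", accession_dir / image_value]


base = PurePosixPath("/data")  # hypothetical base directory

# A spreadsheet value without a suffix yields two candidates, .dcm first.
print([str(p) for p in candidate_paths(base, "ACC001", "17.51.1")])
# ['/data/ACC001/17.51.1.dcm', '/data/ACC001/17.51.1']

# A value that already ends in .dcm (any case) yields exactly one candidate.
print([str(p) for p in candidate_paths(base, "ACC001", "17.51.1.DCM")])
# ['/data/ACC001/17.51.1.DCM']
```

Cell 4 takes the first candidate that exists, so a file stored without the `.dcm` extension is still found when the spreadsheet lists the bare name.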

In [4]:
if not EXCEL_PATH.exists():
    raise FileNotFoundError(f"Excel file not found: {EXCEL_PATH}")
if not BASE_DIR.exists():
    raise FileNotFoundError(f"Base directory not found: {BASE_DIR}")

df = pd.read_excel(EXCEL_PATH)
required_columns = {"Accession", *COLUMNS_TO_CHECK}
missing = required_columns - set(df.columns)
if missing:
    missing_list = ", ".join(sorted(missing))
    raise KeyError(f"Missing required column(s): {missing_list}")
if KEY_COLUMN not in df.columns:
    raise KeyError(
        f"Missing required column: {KEY_COLUMN}. Run add_study_key.py first."
    )
if "Manual Review Notes" not in df.columns:
    df["Manual Review Notes"] = ""
else:
    df["Manual Review Notes"] = df["Manual Review Notes"].fillna("").astype(str)
RUN_FLAG_COLUMNS = [f"{col}_Review_Flag" for col in RUN_ORDER]
for flag_col in RUN_FLAG_COLUMNS:
    if flag_col not in df.columns:
        df[flag_col] = ""
    else:
        df[flag_col] = df[flag_col].fillna("").astype(str)

image_records: list[dict[str, str]] = []
missing_count = 0
for _, row in df.iterrows():
    accession = normalize_value(row.get("Accession"))
    if not accession:
        continue

    key_value = normalize_value(row.get(KEY_COLUMN)) or "Unknown"

    for column in COLUMNS_TO_CHECK:
        image_value = normalize_value(row.get(column))
        if not image_value:
            continue

        paths = candidate_paths(BASE_DIR, accession, image_value)
        display_name = dcm_display_name(BASE_DIR, accession, image_value)
        existing_path = next((path for path in paths if path.exists()), None)
        if existing_path is None:
            print(f"Missing file for {KEY_COLUMN} {key_value} ({column}): {display_name}")
            missing_count += 1
            continue

        location_col = location_column_for(column)
        location_value = (
            normalize_value(row.get(location_col)) if location_col in df.columns else ""
        )

        image_records.append(
            {
                "study_key": key_value,
                "column": column,
                "filename": display_name,
                "location": location_value,
                "path": str(existing_path),
            }
        )

print(f"Loaded {len(image_records)} images. Missing files: {missing_count}.")

Missing file for Study_Key 7b034d82-a11a-4bd6-b9c9-07c950539e59 (AP_2): 17.51.1.dcm
Missing file for Study_Key 7b034d82-a11a-4bd6-b9c9-07c950539e59 (Lateral_2): 16.5.1.dcm
Missing file for Study_Key 7b034d82-a11a-4bd6-b9c9-07c950539e59 (AP_3): 25.6.1.dcm
Missing file for Study_Key aabb7474-1a32-4c19-95d0-739e21be7b45 (AP_2): 13.1.1.dcm
Missing file for Study_Key 18be06b7-4429-4dfd-b12d-1c10635b86a6 (Lateral_1): 38.30.1.dcm
Missing file for Study_Key bff2c100-e44b-46c2-a6c8-4f3dada0337a (Lateral_2): 17.3.1.dcm
Missing file for Study_Key c6e89ee4-8878-4579-8830-aeb358b6dc9b (AP_2): 23.8.1.dcm
Missing file for Study_Key dcdcb51e-8964-404e-ab35-04e7332528dc (AP_3): 18.5.1.dcm
Missing file for Study_Key 794fee71-54af-4679-a65c-f72c324f4abd (Lateral_1): 11.1.11.dcm
Missing file for Study_Key 1b832d14-6c92-4cab-9690-b3d14994a946 (AP_3): 18.4.1.dcm
Missing file for Study_Key c7539472-c632-4b7d-8af4-32ab7fa0df59 (AP_2): 19.5.1.dcm
Missing file for Study_Key c7539472-c632-4b7d-8af4-32ab7fa0df59 

In [5]:
if not image_records:
    raise ValueError("No existing DICOM files found for the configured columns.")

image_out = widgets.Output()
info_out = widgets.Output()
status_label = widgets.HTML(value="")

record_index = 0
current_record_id = None
slider_update = False

frame_slider = widgets.FloatSlider(
    value=0,
    min=0,
    max=1,
    step=1,
    description="Frame",
    continuous_update=True,
    readout=True,
    readout_format=".0f",
    layout=widgets.Layout(width="400px"),
    style={"description_width": "50px", "handle_color": "#ffffff"},
)
prev_button = widgets.Button(description="Previous Run", button_style="info")
next_button = widgets.Button(description="Next Run", button_style="info")
frame_prev_button = widgets.Button(description="Prev Frame", button_style="warning")
frame_next_button = widgets.Button(description="Next Frame", button_style="warning")


def current_record() -> dict[str, str]:
    return image_records[record_index]


def update_frame_slider(frame_count: int, set_value: bool) -> None:
    global slider_update
    slider_update = True
    try:
        if frame_count <= 1:
            frame_slider.min = 0.0
            frame_slider.max = 0.0
            frame_slider.value = 0.0
            frame_slider.disabled = True
        else:
            frame_slider.disabled = False
            frame_slider.min = 0.0
            frame_slider.max = float(frame_count - 1)
            frame_slider.step = 1.0
            if set_value:
                frame_slider.value = float(frame_count // 2)
    finally:
        slider_update = False


def step_frame(delta: int) -> None:
    if frame_slider.disabled:
        return
    new_val = max(frame_slider.min, min(frame_slider.max, frame_slider.value + float(delta)))
    if new_val != frame_slider.value:
        frame_slider.value = new_val


def render_current(frame_index: int | None = None) -> None:
    global current_record_id
    record = current_record()
    record_id = f"{record['study_key']}|{record['column']}|{record['path']}"
    record_changed = record_id != current_record_id
    current_record_id = record_id

    ds = pydicom.dcmread(record["path"])
    image = ds.pixel_array.astype(np.float32)

    if image.ndim == 3:
        frame_count = image.shape[0]
        update_frame_slider(frame_count, set_value=record_changed)
        idx = int(frame_slider.value) if frame_index is None else int(frame_index)
        idx = max(0, min(frame_count - 1, idx))
        image = image[idx]
    else:
        update_frame_slider(1, set_value=True)

    slope = float(getattr(ds, "RescaleSlope", 1.0))
    intercept = float(getattr(ds, "RescaleIntercept", 0.0))
    image = image * slope + intercept
    if getattr(ds, "PhotometricInterpretation", "") == "MONOCHROME1":
        image = np.max(image) - image

    with info_out:
        clear_output(wait=True)
        frame_info = "Frames: 1" if frame_slider.disabled else f"Frame: {int(frame_slider.value) + 1} / {int(frame_slider.max) + 1}"
        print(f"{KEY_COLUMN}: {record['study_key']}")
        print(f"Column: {record['column']}")
        print(f"Filename: {record['filename']}")
        print(f"Location: {record['location']}")
        print(frame_info)

    highlight_run(record["study_key"], record["column"])
    load_notes(record["study_key"])

    with image_out:
        clear_output(wait=True)
        fig, ax = plt.subplots(figsize=(6, 6))
        ax.imshow(image, cmap="gray")
        ax.axis("off")
        plt.tight_layout()
        plt.show()


def on_frame_change(change):
    if slider_update:
        return
    if change.get("name") == "value":
        render_current(change["new"])


def on_prev_click(_):
    global record_index
    record_index = max(0, record_index - 1)
    render_current()


def on_next_click(_):
    global record_index
    record_index = min(len(image_records) - 1, record_index + 1)
    render_current()


def on_frame_prev(_):
    step_frame(-1)


def on_frame_next(_):
    step_frame(1)


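# Observe only the "value" trait so min/max/disabled changes do not fire the handler.
frame_slider.observe(on_frame_change, names="value")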
prev_button.on_click(on_prev_click)
next_button.on_click(on_next_click)
frame_prev_button.on_click(on_frame_prev)
frame_next_button.on_click(on_frame_next)

run_controls = widgets.HBox([prev_button, next_button])
frame_buttons = widgets.HBox([frame_prev_button, frame_next_button])
frame_controls = widgets.VBox([frame_buttons, frame_slider])

study_keys = sorted({record["study_key"] for record in image_records})
run_labels: list[tuple[str, str]] = []
for key in study_keys:
    columns_present = {
        record["column"] for record in image_records if record["study_key"] == key
    }
    for column in RUN_ORDER:
        if column in columns_present:
            run_labels.append((key, column))


def format_run_label(column: str, key: str) -> str:
    return f"{column.replace('_', '')} {key}"


run_to_index: dict[str, int] = {}
for idx, record in enumerate(image_records):
    run_key = f"{record['study_key']}|{record['column']}"
    run_to_index.setdefault(run_key, idx)

notes_by_key = {
    normalize_value(row.get(KEY_COLUMN)): normalize_value(row.get("Manual Review Notes"))
    for _, row in df.iterrows()
    if normalize_value(row.get(KEY_COLUMN))
}

flag_by_run: dict[str, str] = {}
for _, row in df.iterrows():
    study_key = normalize_value(row.get(KEY_COLUMN))
    if not study_key:
        continue
    for col in RUN_ORDER:
        flag_col = f"{col}_Review_Flag"
        flag_value = normalize_value(row.get(flag_col))
        run_key = f"{study_key}|{col}"
        flag_by_run[run_key] = flag_value
notes_status = widgets.HTML(value="")
notes_area = widgets.Textarea(
    value="",
    placeholder="Enter manual review notes here",
    layout=widgets.Layout(width="520px", height="140px"),
)
save_notes_button = widgets.Button(description="Save Notes", button_style="primary")

current_notes_key = None


def load_notes(study_key: str) -> None:
    global current_notes_key
    if not study_key:
        return
    if current_notes_key != study_key:
        notes_area.value = notes_by_key.get(study_key, "")
        current_notes_key = study_key
    notes_status.value = f"Notes for {KEY_COLUMN}: {study_key}"


def save_notes(_):
    if not current_notes_key:
        return
    notes_by_key[current_notes_key] = notes_area.value
    df.loc[
        df[KEY_COLUMN].astype(str) == current_notes_key,
        "Manual Review Notes",
    ] = notes_area.value
    df.to_excel(EXCEL_PATH, index=False)
    notes_status.value = f"Saved notes for {KEY_COLUMN}: {current_notes_key}"


save_notes_button.on_click(save_notes)

run_key_to_study = {f"{key}|{column}": key for key, column in run_labels}
run_options = [
    (format_run_label(column, key), f"{key}|{column}")
    for key, column in run_labels
]

marked_runs: set[str] = set()
reviewed_runs: set[str] = set()
mark_label = widgets.HTML(value="")
mark_status = widgets.HTML(value="")
mark_button = widgets.Button(description="Mark for manual review", button_style="warning")
ok_button = widgets.Button(description="Mark as OK", button_style="warning")
mark_save_button = widgets.Button(description="Save All", button_style="primary")

select_update = False


def get_flag_status(value: str) -> str:
    val = (value or "").strip().lower()
    if val in {"review", "1", "true", "yes", "y"}:
        return "review"
    if val == "ok":
        return "ok"
    return ""


for key, column in run_labels:
    run_key = f"{key}|{column}"
    status = get_flag_status(flag_by_run.get(run_key, ""))
    if status == "review":
        marked_runs.add(run_key)
    elif status == "ok":
        reviewed_runs.add(run_key)


def get_status_prefix(run_key: str) -> str:
    if run_key in marked_runs:
        return "[REVIEW]"
    elif run_key in reviewed_runs:
        return "[OK]"
    return "[--]"


def build_select_options() -> list[tuple[str, str]]:
    options = []
    for key, column in run_labels:
        run_key = f"{key}|{column}"
        label = format_run_label(column, key)
        prefix = get_status_prefix(run_key)
        options.append((f"{prefix} {label}", run_key))
    return options


run_select = widgets.Select(
    options=build_select_options(),
    rows=min(30, len(run_labels)) if run_labels else 10,
    layout=widgets.Layout(width="420px", height="600px"),
)


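def update_run_list() -> None:
    global select_update
    current_value = run_select.value
    select_update = True
    try:
        # Rebuilding the options resets the selection, so restore it afterwards.
        run_select.options = build_select_options()
        run_select.value = current_value
    finally:
        # Always re-enable the selection handler, even if restoring fails.
        select_update = False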


def update_mark_label() -> None:
    mark_label.value = f"Marked for review: {len(marked_runs)} | OK: {len(reviewed_runs)} | Total: {len(run_labels)}"


def update_buttons() -> None:
    run_key = run_select.value
    if not run_key:
        mark_button.button_style = "warning"
        ok_button.button_style = "warning"
        return
    mark_button.button_style = "danger" if run_key in marked_runs else "warning"
    ok_button.button_style = "success" if run_key in reviewed_runs else "warning"


def save_marks(_):
    for key, column in run_labels:
        run_key = f"{key}|{column}"
        flag_col = f"{column}_Review_Flag"
        if run_key in marked_runs:
            df.loc[df[KEY_COLUMN].astype(str) == key, flag_col] = "Review"
        elif run_key in reviewed_runs:
            df.loc[df[KEY_COLUMN].astype(str) == key, flag_col] = "OK"
        else:
            df.loc[df[KEY_COLUMN].astype(str) == key, flag_col] = ""
    df.to_excel(EXCEL_PATH, index=False)
    mark_status.value = f"Saved: {len(marked_runs)} review, {len(reviewed_runs)} OK"


def toggle_mark_review(_):
    run_key = run_select.value
    if not run_key:
        return
    reviewed_runs.discard(run_key)
    if run_key in marked_runs:
        marked_runs.remove(run_key)
    else:
        marked_runs.add(run_key)
    update_mark_label()
    update_buttons()
    update_run_list()


def toggle_mark_ok(_):
    run_key = run_select.value
    if not run_key:
        return
    marked_runs.discard(run_key)
    if run_key in reviewed_runs:
        reviewed_runs.remove(run_key)
    else:
        reviewed_runs.add(run_key)
    update_mark_label()
    update_buttons()
    update_run_list()


def on_run_selected(change):
    if select_update:
        return
    if change.get("name") == "value":
        run_key = change.get("new")
        if run_key and run_key in run_to_index:
            global record_index
            record_index = run_to_index[run_key]
            study_key = run_key_to_study.get(run_key, "")
            load_notes(study_key)
            update_buttons()
            update_run_list()
            render_current()


run_select.observe(on_run_selected, names="value")
mark_button.on_click(toggle_mark_review)
ok_button.on_click(toggle_mark_ok)
mark_save_button.on_click(save_marks)
update_mark_label()
update_buttons()
update_run_list()


def highlight_run(study_key: str, column: str) -> None:
    global select_update
    run_key = f"{study_key}|{column}"
    if run_select.value == run_key:
        return
    select_update = True
    try:
        run_select.value = run_key
    except Exception:
        pass
    select_update = False
    update_buttons()


# No custom dark styling - VS Code handles dark mode itself

left_panel = widgets.VBox(
    [
        widgets.HTML(value="<b>Run review</b>"),
        mark_label,
        widgets.HBox([ok_button, mark_button]),
        mark_save_button,
        mark_status,
        run_select,
    ],
)
right_panel = widgets.VBox(
    [
        run_controls,
        frame_controls,
        info_out,
        image_out,
        notes_status,
        notes_area,
        save_notes_button,
    ]
)
container = widgets.HBox([left_panel, right_panel])

display(container)

render_current()
