# 02 — MAUDE Brand-Level Safety Summary Export

This helper notebook takes the unified MAUDE event-level dataset produced by the core **maude_signals** pipeline and condenses it into a compact, brand-level safety table.

**Goals:**
- Load the flattened MAUDE event data (`_analysis_event_level.csv`).
- Normalize device brand names to a consistent form.
- Aggregate unique event counts per brand (`total_events`).
- Derive a simple relative `safety_score` in \[0, 1\].
- Save the result as `maude_safety_summary.csv` so it can be copied into the  
  `quantum-clinical-trial-optimization/data/external/` folder and used as an optional safety-signal input for trial optimization scenarios.


In [1]:
# ============================================================
# Suppress pandas DtypeWarning (mixed types) for this notebook
# ============================================================

import warnings
from pandas.errors import DtypeWarning

warnings.filterwarnings("ignore", category=DtypeWarning)
print("DtypeWarning suppressed for this notebook.")




In [2]:
# ============================================================
# Cell 1 — Load unified MAUDE event-level data
# ============================================================

from pathlib import Path
import pandas as pd

# Unified event-level CSV written by maude_signals.ipynb (Cell 04)
EVENTS_PATH = Path("/home/parallels/data/capstone-maude/Data/outputs/tables/_analysis_event_level.csv")

print("Reading:", EVENTS_PATH)

df_events = pd.read_csv(EVENTS_PATH)
print("df_events shape:", df_events.shape)
df_events.head()


Reading: /home/parallels/data/capstone-maude/Data/outputs/tables/_analysis_event_level.csv
df_events shape: (11693760, 7)


| | event_id | event_date | brand | model | problem_code | problem_desc | quarter |
|---|---|---|---|---|---|---|---|
| 0 | 11662326 | 2021-04-13 00:00:00 | 630G INSULIN PUMP MMT-1715K 630G BLACK MG | MMT-1715K | 3010 | Power Problem | 2021-Q2.0 |
| 1 | 11662327 | 2021-04-13 00:00:00 | EMBLEM S-ICD | A209 | 1057 | Premature Discharge of Battery | 2021-Q2.0 |
| 2 | 11662328 | 2021-04-13 00:00:00 | EMBLEM MRI S-ICD | A219 | 1057 | Premature Discharge of Battery | 2021-Q2.0 |
| 3 | 11662328 | 2021-04-13 00:00:00 | EMBLEM MRI S-ICD | A219 | 3273 | Noise, Audible | 2021-Q2.0 |
| 4 | 11662329 | 2021-04-13 00:00:00 | EMBLEM S-ICD | A209 | 1057 | Premature Discharge of Battery | 2021-Q2.0 |


### Cell 1 — Load Unified MAUDE Event-Level Data

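This cell loads the fully flattened MAUDE event dataset created by the main `maude_signals` pipeline. The file `_analysis_event_level.csv` contains one row per reported device problem, so an event with several problem codes spans several rows (in the output above, `event_id` 11662328 appears twice, once per problem code). Key fields include: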

- `event_id` — unique identifier for the MAUDE event  
- `event_date` — when the event was reported  
- `brand` — reported device brand name  
- `model` — device model identifier  
- `problem_code` / `problem_desc` — coded and textual problem descriptions  
- `quarter` — time-bucketed reporting period

By reading this table into `df_events` and printing its shape and first few rows, we confirm that the unified event-level data is available and correctly structured before aggregating to brand-level safety signals in the next cell.
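
The structural check described above can also be made explicit instead of relying on a visual scan of `head()`. The following is a minimal sketch, assuming the seven columns shown in the output above; the helper name `check_event_table` and the `extra_rows` figure it reports are illustrative and not part of the `maude_signals` pipeline. The sample rows are a small stand-in for `df_events`, modeled on the rows displayed earlier.

```python
import pandas as pd

# Columns taken from the head() output above; adjust if the pipeline changes.
EXPECTED_COLUMNS = [
    "event_id", "event_date", "brand", "model",
    "problem_code", "problem_desc", "quarter",
]


def check_event_table(df: pd.DataFrame) -> dict:
    """Hypothetical helper: verify columns and summarize row/event counts."""
    missing = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"Missing expected columns: {missing}")
    n_rows = len(df)
    n_events = int(df["event_id"].nunique())
    return {
        "rows": n_rows,
        "unique_events": n_events,
        # Rows beyond the first for an event_id (e.g. extra problem codes).
        "extra_rows": n_rows - n_events,
    }


# Tiny stand-in for df_events, modeled on the rows shown above.
sample = pd.DataFrame({
    "event_id": [11662327, 11662328, 11662328],
    "event_date": ["2021-04-13 00:00:00"] * 3,
    "brand": ["EMBLEM S-ICD", "EMBLEM MRI S-ICD", "EMBLEM MRI S-ICD"],
    "model": ["A209", "A219", "A219"],
    "problem_code": [1057, 1057, 3273],
    "problem_desc": [
        "Premature Discharge of Battery",
        "Premature Discharge of Battery",
        "Noise, Audible",
    ],
    "quarter": ["2021-Q2.0"] * 3,
})

report = check_event_table(sample)
print(report)
```

A non-zero `extra_rows` value is expected for this file, which is why the aggregation in the next cell counts unique `event_id` values rather than rows.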

In [3]:
# ============================================================
# Cell 2 — Build brand-level MAUDE safety summary
# ============================================================

from pathlib import Path
import pandas as pd

def normalize_name(name: str) -> str | None:
    """Uppercase a brand name and collapse runs of whitespace; missing values map to None."""
    if pd.isna(name):
        return None
    return " ".join(str(name).upper().split())

# 1) Normalize brand names
df_events["brand_norm"] = df_events["brand"].apply(normalize_name)

# 2) Group by normalized brand and count unique events
summary = (
    df_events
    .groupby("brand_norm", dropna=True)
    .agg(
        total_events=("event_id", "nunique"),
    )
    .reset_index()
)

# 3) Simple safety_score: divide by the largest brand count, so scores fall in (0, 1]
max_events = summary["total_events"].max()
summary["safety_score"] = summary["total_events"] / max_events

# 4) Rename brand_norm -> brand for clarity
summary.rename(columns={"brand_norm": "brand"}, inplace=True)

# 5) Write summary to disk (same area as other MAUDE outputs)
out_path = Path("/home/parallels/data/capstone-maude/Data/outputs/tables/maude_safety_summary.csv")
out_path.parent.mkdir(parents=True, exist_ok=True)
summary.to_csv(out_path, index=False)

print("Wrote", out_path, "with shape", summary.shape)
summary.head()


Wrote /home/parallels/data/capstone-maude/Data/outputs/tables/maude_safety_summary.csv with shape (155489, 3)


| | brand | total_events | safety_score |
|---|---|---|---|
| 0 | !!! POWERFLEXX | 1 | 1e-06 |
| 1 | !!93-P PROFLEXX | 1 | 1e-06 |
| 2 | !M1 | 1 | 1e-06 |
| 3 | "1.0MM" SYSTEM TWIST DRILL TW 7X50MM 3MMSTOP W... | 1 | 1e-06 |
| 4 | "1.5MM" SYSTEM 1.5X4MM HT SD X-DR SCR 5-PK | 1 | 1e-06 |


### Cell 2 — Build Brand-Level MAUDE Safety Summary

This cell condenses the full MAUDE event table into a compact, brand-level safety view.

1. **Normalize brand names**  
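   The raw `brand` values are cleaned into a standardized form (`brand_norm`) by uppercasing and collapsing whitespace, so spellings that differ only in case or spacing are counted as one brand. No other variation is merged: `EMBLEM S-ICD` and `EMBLEM MRI S-ICD`, for example, remain separate brands, and rows with a missing brand are dropped from the summary.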

2. **Aggregate events by brand**  
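   For each normalized brand, the notebook counts the number of unique `event_id` values, producing `total_events`. Counting unique IDs rather than rows matters because a single event can occupy several rows, one per problem code. The result is a simple measure of how frequently each brand appears in MAUDE.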

3. **Derive a relative safety_score**  
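   Each brand's `total_events` is divided by the largest `total_events` in the table. The most-reported brand therefore scores exactly 1, and brands with very few events score near 0 (a single-event brand scores about 1e-06 in this run). Note the direction: a higher `safety_score` means more reported events, not a safer device.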

4. **Persist for downstream use**  
   The resulting three-column table (`brand`, `total_events`, `safety_score`) is written to `Data/outputs/tables/maude_safety_summary.csv` so it can be copied into the quantum-clinical-trial-optimization project and used as an optional safety signal.
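
To make the normalization, unique-event counting, and scaling steps concrete, here is a small worked example on made-up rows (not real MAUDE records). It reuses the notebook's `normalize_name` logic and the same `groupby` / `nunique` pattern; the `toy` and `toy_summary` names are illustrative only.

```python
import pandas as pd


def normalize_name(name):
    """Uppercase and collapse whitespace; missing values become None."""
    if pd.isna(name):
        return None
    return " ".join(str(name).upper().split())


# Made-up rows for illustration only. Event 2 spans two rows, and the
# first three brand spellings differ only in case and spacing.
toy = pd.DataFrame({
    "event_id": [1, 2, 2, 3, 4],
    "brand": ["Emblem  S-ICD", "EMBLEM S-ICD", "emblem s-icd", "Pump X", None],
})
toy["brand_norm"] = toy["brand"].apply(normalize_name)

# Count unique events per normalized brand; the missing brand is dropped.
toy_summary = (
    toy.groupby("brand_norm", dropna=True)
       .agg(total_events=("event_id", "nunique"))
       .reset_index()
)

# Max-scaling: the brand with the most events gets a score of exactly 1.
toy_summary["safety_score"] = (
    toy_summary["total_events"] / toy_summary["total_events"].max()
)
print(toy_summary)
```

In this toy table the three `Emblem` spellings collapse into one brand with two unique events (score 1.0), `PUMP X` has one event (score 0.5), and the row with no brand does not appear at all.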


## Notebook Summary — MAUDE Brand-Level Safety Export

This helper notebook bridges the MAUDE safety pipeline and the quantum clinical-trial optimization project.

Starting from the unified MAUDE event-level file (`_analysis_event_level.csv`), it:

1. Loads all device adverse events into a single DataFrame (`df_events`).
2. Normalizes device brand names to a consistent form.
3. Aggregates unique event counts per brand (`total_events`).
4. Converts those counts into a simple relative `safety_score` scaled to \[0, 1\].
5. Writes the resulting three-column table (`brand`, `total_events`, `safety_score`) to `/home/parallels/data/capstone-maude/Data/outputs/tables/maude_safety_summary.csv`.

This summary can then be copied into the `quantum-clinical-trial-optimization/data/external/` folder and used as an optional safety-signal input in downstream trial scenario and optimization notebooks.
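
How the downstream project consumes the file is not specified here, so the following is only a sketch of one possible approach: a left join of `safety_score` onto a trial-level table by normalized brand. The helper `attach_safety_score`, the `trials` table, and its `device_brand` and `trial_id` columns are all hypothetical; only the `brand` and `safety_score` columns come from this notebook. In practice `summary` would be loaded with `pd.read_csv` from the copied `maude_safety_summary.csv`; a small in-memory stand-in is used below.

```python
import pandas as pd


def attach_safety_score(
    trials: pd.DataFrame,
    summary: pd.DataFrame,
    brand_col: str = "device_brand",  # hypothetical column name
) -> pd.DataFrame:
    """Hypothetical helper: left-join safety_score onto a trial table by brand."""
    out = trials.copy()
    # Apply the same normalization used when the summary was built.
    out["_brand_key"] = out[brand_col].map(
        lambda s: None if pd.isna(s) else " ".join(str(s).upper().split())
    )
    lookup = summary[["brand", "safety_score"]].rename(
        columns={"brand": "_brand_key"}
    )
    # Left join keeps every trial row; unmatched brands get a missing score.
    out = out.merge(lookup, how="left", on="_brand_key")
    return out.drop(columns=["_brand_key"])


# In-memory stand-ins (made-up values, not real MAUDE or trial data).
summary = pd.DataFrame({
    "brand": ["EMBLEM S-ICD", "PUMP X"],
    "total_events": [2, 1],
    "safety_score": [1.0, 0.5],
})
trials = pd.DataFrame({
    "trial_id": ["T1", "T2", "T3"],
    "device_brand": ["emblem s-icd", "Unknown Device", None],
})

result = attach_safety_score(trials, summary)
print(result)
```

Keeping the join as a left join leaves the signal optional: trials whose device brand is missing or not found in MAUDE keep a missing `safety_score` rather than being dropped, and the consuming notebook can decide how to treat those gaps.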