# OpenFDA → RWE Governance Scorecard (Demo)

This notebook pulls a small sample of drug adverse event reports from the **OpenFDA** API,
computes basic governance/quality metrics, and renders a **scorecard**. It is kept intentionally
simple so that the governance concepts come across through a small, runnable example.

**Note:** Internet access is required to fetch data from the API.


In [2]:
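import os, sys, json, requests, pandas as pd, numpy as np
from pathlib import Path
import matplotlib.pyplot as plt

# scripts.scorecard is a local module; put the repository root on sys.path
# (assumed to be the working directory or its parent) so the import resolves.
for root in (Path.cwd(), Path.cwd().parent):
    if (root / 'scripts').is_dir() and str(root) not in sys.path:
        sys.path.insert(0, str(root))

from scripts.scorecard import compute_basic_metrics, score, plot_scorecard

pd.set_option('display.max_columns', 50)
print('Setup complete.')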

ModuleNotFoundError: No module named 'scripts'

## 1) Fetch data from OpenFDA API

We request a small sample of **drug adverse events**. Endpoint: https://api.fda.gov/drug/event.json (documentation: https://open.fda.gov/apis/drug/event/)

You can change the `search` and `limit` parameters to explore different products or time windows.
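
As a minimal sketch of how such variations might be assembled, the hypothetical helper `build_params` below (not part of this notebook or of any OpenFDA client) composes a `search` string from an optional product name and a date window. It assumes the `patient.drug.medicinalproduct` field and the `AND`/`TO` operators of the OpenFDA query syntax. The operators are written with spaces because URL-encoding turns a space into `+`, whereas a literal `+` would be escaped as `%2B`.

```python
from urllib.parse import urlencode


def build_params(start, end, product=None, limit=100):
    """Build OpenFDA query parameters (hypothetical helper, for illustration)."""
    clauses = [f'receivedate:[{start} TO {end}]']
    if product:
        # Quote the product name so multi-word names are matched as a phrase.
        clauses.insert(0, f'patient.drug.medicinalproduct:"{product}"')
    return {'search': ' AND '.join(clauses), 'limit': limit}


params_example = build_params('20240101', '20240630', product='aspirin', limit=50)
print(params_example['search'])
# Spaces become '+' once the parameters are URL-encoded.
print(urlencode(params_example))
```

The resulting dictionary can be passed as `params=` to `requests.get` in the cell below.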


In [None]:
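BASE = 'https://api.fda.gov/drug/event.json'
params = {
    # Write the range with spaces: requests encodes a space as '+',
    # whereas a literal '+' would be escaped to '%2B' and sent as a plus sign.
    'search': 'receivedate:[20240101 TO 20251231]',
    'limit': 100,
}
r = requests.get(BASE, params=params, timeout=30)
r.raise_for_status()  # fail early on HTTP errors
payload = r.json()
# Number of records returned and the top-level keys of the response
len(payload.get('results', [])), list(payload.keys())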

## 2) Normalize to a tabular view

OpenFDA returns nested JSON. We extract a few commonly available fields for governance checks.
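
To illustrate what extracting from nested JSON can involve, here is a small sketch of a safe lookup helper. The name `dig` is hypothetical, and the sample record is hand-written with illustrative values. Only its shape (a `sender` object, a `patient.reaction` list) follows the OpenFDA drug event format.

```python
def dig(record, *keys, default=None):
    """Walk nested dicts/lists safely; return `default` when a step is missing."""
    current = record
    for key in keys:
        if isinstance(current, dict):
            current = current.get(key)
        elif (isinstance(current, list) and isinstance(key, int)
              and -len(current) <= key < len(current)):
            current = current[key]
        else:
            return default
        if current is None:
            return default
    return current


# Hand-written record shaped like an OpenFDA result (illustrative values only).
sample = {
    'safetyreportid': '0000001',
    'sender': {'senderorganization': 'Example Org'},
    'patient': {'reaction': [{'reactionmeddrapt': 'Nausea'}]},
}

print(dig(sample, 'sender', 'senderorganization'))
print(dig(sample, 'patient', 'reaction', 0, 'reactionmeddrapt'))
print(dig(sample, 'patient', 'drug', 0, 'medicinalproduct'))  # missing, so None
```

A helper of this kind keeps the flattening step from raising `KeyError` or `IndexError` when a report omits an optional block.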


In [None]:
results = payload.get('results', [])


def flatten(rec):
    """Pick a few top-level fields from one OpenFDA report."""
    sender = rec.get('sender')
    return {
        'safetyreportid': rec.get('safetyreportid'),
        'receiptdate': rec.get('receiptdate'),
        'receivedate': rec.get('receivedate'),
        'occurcountry': rec.get('occurcountry'),
        'serious': rec.get('serious'),
        # `sender` is a nested object and may be missing
        'senderorganization': sender.get('senderorganization') if isinstance(sender, dict) else None,
        'companynumb': rec.get('companynumb'),
    }


df = pd.DataFrame([flatten(rec) for rec in results])
df.head()

## 3) Compute governance/quality metrics and an overall score

These are illustrative metrics only; refine as needed for your organization and data model.
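
For readers without access to `scripts/scorecard.py`, the sketch below shows one way such metrics could be defined. It is not the implementation behind `compute_basic_metrics` or `score`. The function names, the per-field completeness and ID-uniqueness metrics, and the unweighted-mean scoring rule are all assumptions made for illustration, and the sample rows are invented.

```python
def sketch_metrics(records, required_fields, id_field='safetyreportid'):
    """Illustrative metrics; NOT the implementation in scripts.scorecard."""
    n = len(records)
    if n == 0:
        return {}
    metrics = {}
    for field in required_fields:
        # Completeness: share of records with a non-empty value for the field.
        filled = sum(1 for rec in records if rec.get(field) not in (None, ''))
        metrics[f'completeness_{field}'] = filled / n
    # Uniqueness: share of distinct IDs among the records that have an ID.
    ids = [rec.get(id_field) for rec in records if rec.get(id_field)]
    metrics['uniqueness_id'] = len(set(ids)) / len(ids) if ids else 0.0
    return metrics


def sketch_score(metrics):
    """Unweighted mean of all metrics, scaled to 0-100 (an assumed scoring rule)."""
    return round(100 * sum(metrics.values()) / len(metrics), 1) if metrics else 0.0


# Invented rows with one missing country, one missing date, and one duplicate ID.
rows = [
    {'safetyreportid': 'A1', 'receiptdate': '20240105', 'occurcountry': 'US', 'serious': '1'},
    {'safetyreportid': 'A2', 'receiptdate': '20240106', 'occurcountry': None, 'serious': '2'},
    {'safetyreportid': 'A2', 'receiptdate': None, 'occurcountry': 'DE', 'serious': '1'},
    {'safetyreportid': 'A3', 'receiptdate': '20240108', 'occurcountry': 'FR', 'serious': '1'},
]
fields = ['safetyreportid', 'receiptdate', 'occurcountry', 'serious']
m = sketch_metrics(rows, fields)
print(m)
print(sketch_score(m))  # 85.0
```

The sketch works on a list of dictionaries, so the notebook's DataFrame could be fed in via `df.to_dict('records')`. `NaN` values, if present, would need extra handling.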


In [None]:
required_fields = ['safetyreportid','receiptdate','occurcountry','serious']
metrics = compute_basic_metrics(df, required_fields)
overall = score(metrics)
metrics, overall

## 4) Plot scorecard

In [None]:
plot_scorecard(metrics)

## 5) Save artifacts (optional)

Cache the raw JSON and the flattened CSV locally in `data/`.
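
A cached file can also spare the API on later runs. The following is a minimal sketch of a load-or-fetch pattern. `load_or_fetch` and `fake_fetch` are hypothetical names, and the demonstration uses a stand-in fetcher and a temporary directory rather than the real endpoint.

```python
import json
import tempfile
from pathlib import Path


def load_or_fetch(cache_path, fetch):
    """Return cached JSON if present; otherwise call `fetch()` and cache its result."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        return json.loads(cache_path.read_text(encoding='utf-8'))
    data = fetch()
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_text(json.dumps(data, indent=2), encoding='utf-8')
    return data


# Demonstration with a stand-in fetcher instead of a real API call.
calls = []


def fake_fetch():
    calls.append(1)
    return {'results': [{'safetyreportid': 'A1'}]}


with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / 'data' / 'openfda_sample.json'
    first = load_or_fetch(cache_file, fake_fetch)   # fetches and writes the cache
    second = load_or_fetch(cache_file, fake_fetch)  # reads the cache, no fetch

print(first == second, len(calls))  # True 1
```

In the notebook, `fetch` would be a function wrapping the request from section 1. Deleting the cached file forces a fresh download.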


In [None]:
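out_dir = Path('data')
out_dir.mkdir(exist_ok=True)
# Write with an explicit encoding so the output does not depend on the platform default
(out_dir / 'openfda_sample.json').write_text(json.dumps(payload, indent=2), encoding='utf-8')
df.to_csv(out_dir / 'openfda_flat.csv', index=False)
print('Saved data/openfda_sample.json and data/openfda_flat.csv')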

## Next steps
- Extend metrics: OMOP/FHIR conformity checks, vocabulary coverage (RxNorm, SNOMED), negative controls.
- Add jurisdiction-aware compliance flags (e.g., GDPR, HIPAA proxies).
- Create a portfolio dashboard comparing multiple sources/time windows.
