# 07 Responsible Interpretation & Limitations

**Goal:** Provide transparent assumptions, limitations, and responsible guidance for interpreting results.

This notebook is non-technical by design:
- Summarizes dataset constraints and bias risks
- Documents analytical choices made across notebooks
- Lists recommended responsible uses and non-uses

```python
from pathlib import Path

import pandas as pd
from IPython.display import display

DATA = Path("../data")


def read_optional(path):
    """Load a CSV if it exists; taxonomy files may be absent from a snapshot."""
    return pd.read_csv(path) if path.exists() else None


def norm(df):
    """Strip and lower-case column names, and unify the incident id column name."""
    if df is None:
        return None
    df = df.rename(columns=lambda c: c.strip().lower())
    if "incident id" in df.columns and "incident_id" not in df.columns:
        df = df.rename(columns={"incident id": "incident_id"})
    return df


# Core tables are required; classification tables are optional.
inc = norm(pd.read_csv(DATA / "incidents.csv"))
rep = norm(pd.read_csv(DATA / "reports.csv"))
sub = norm(pd.read_csv(DATA / "submissions.csv"))

mit = norm(read_optional(DATA / "classifications_MIT.csv"))
gmf = norm(read_optional(DATA / "classifications_GMF.csv"))
cset = norm(read_optional(DATA / "classifications_CSETv1.csv"))

incident_ids = set(inc["incident_id"]) if "incident_id" in inc.columns else set()


def coverage(df):
    """Unique classified incident ids divided by unique incident ids, or None if unavailable."""
    if df is None or "incident_id" not in df.columns or not incident_ids:
        return None
    return df["incident_id"].nunique() / len(incident_ids)


# Dataset card: one row summarizing table sizes and taxonomy coverage.
card = {
    "Incidents (rows)": len(inc),
    "Reports (rows)": len(rep),
    "Unique report URLs": rep["url"].nunique() if "url" in rep.columns else None,
    "Unique source domains": rep["source_domain"].nunique() if "source_domain" in rep.columns else None,
    "Submissions rows (auxiliary)": len(sub),
    "MIT coverage (incident-level)": coverage(mit),
    "GMF coverage (incident-level)": coverage(gmf),
    "CSET coverage (incident-level)": coverage(cset),
}

display(pd.DataFrame([card]))
```

| Incidents (rows) | Reports (rows) | Unique report URLs | Unique source domains | Submissions rows (auxiliary) | MIT coverage (incident-level) | GMF coverage (incident-level) | CSET coverage (incident-level) |
|---|---|---|---|---|---|---|---|
| 1367 | 6687 | 5846 | 1781 | 45 | 0.908559 | 0.238478 | 0.156547 |
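
The coverage figures above divide the number of unique classified incident ids by the number of unique incident ids. A stricter variant, sketched here as an assumption rather than as this notebook's method, counts only ids that also appear in the incidents table, so that stray ids in a classification file cannot inflate the share. The function name `coverage_share` and the sample ids are hypothetical.

```python
def coverage_share(classified_ids, incident_ids):
    """Share of known incidents that have at least one classification row."""
    incident_ids = set(incident_ids)
    if not incident_ids:
        return None  # no incidents to compare against
    # Intersect so ids missing from the incidents table are ignored.
    matched = set(classified_ids) & incident_ids
    return len(matched) / len(incident_ids)


# Hypothetical ids: 99 is not a known incident, and 2 appears twice.
share = coverage_share([1, 2, 2, 99], [1, 2, 3, 4])
print(share)
```

With the notebook's tables, this could be called as `coverage_share(mit["incident_id"], inc["incident_id"])`, assuming both columns exist.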


## What this dataset is (and is not)

**What it is:**
- A curated, documented collection of AI-related incidents and reporting artifacts.
- Useful for identifying broad patterns in incident types, harms, and failure modes.
- Useful for transparency and reproducible policy-facing analytics.

**What it is not:**
- A complete census of all AI harms.
- A measurement of “true prevalence” of incidents across sectors.
- A causal dataset (co-occurrence ≠ causality).

## Key limitations and risks of misinterpretation

1. **Selection / reporting bias**
   - Incidents included depend on discoverability, media coverage, and monitoring.
   - Certain regions/sectors may be under-represented.

2. **Taxonomy coverage differences**
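   - MIT coverage is high (about 91% of incidents), while GMF (about 24%) and CSET (about 16%) cover far fewer.
   - Failure patterns derived from the lower-coverage taxonomies describe only the classified subset and may not generalize to all incidents.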

3. **Schema variation across snapshots**
   - Column naming and completeness may differ between exports.
   - Analyses must defensively detect columns and document assumptions.

4. **Temporal ambiguity**
   - Incident dates and report publication dates measure different phenomena.
   - Without full incident↔report mapping, “reporting lag” can only be approximated on subsets.
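
One way to detect columns defensively, as item 3 asks, is to look up a column by a list of acceptable names and record an explicit note when none is found. The sketch below is illustrative only: `find_column`, the candidate names, and the two example exports are hypothetical, and it works on plain lists of column names (for a DataFrame, pass `df.columns`).

```python
def find_column(columns, candidates):
    """Return the first column whose normalized name matches a candidate, else None."""

    def key(name):
        # Normalize case, surrounding whitespace, and spaces vs. underscores.
        return str(name).strip().lower().replace(" ", "_")

    lookup = {}
    for col in columns:
        lookup.setdefault(key(col), col)  # keep the first column per normalized name

    for cand in candidates:
        hit = lookup.get(key(cand))
        if hit is not None:
            return hit
    return None


# Hypothetical column names from two differently formatted exports.
export_a = ["Incident ID", "date", "Title"]
export_b = ["incident_id", "title"]

id_col_a = find_column(export_a, ["incident_id"])
date_col_b = find_column(export_b, ["date", "incident_date"])

# Document the assumption instead of failing silently.
assumptions = []
if date_col_b is None:
    assumptions.append("export_b: no incident date column found; temporal analysis skipped")

print(id_col_a, date_col_b, assumptions)
```

Returning `None` rather than raising lets each analysis decide whether to skip a step or stop, and the `assumptions` list can be printed next to the results.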

## Responsible usage guidance

**Appropriate:**
- Macro-level descriptive trends (domains, intent, failure types)
- Transparency reporting and “what kinds of harms show up”
- Hypothesis generation for deeper audits

**Not appropriate:**
- Ranking organizations/products as “most harmful”
- Making causal claims about model architectures or actors
- Estimating the true rate of AI harms in the world

## Reproducibility commitments in this submission

- Every notebook is runnable from a clean environment with the provided snapshot.
- Outputs are saved to `outputs/figures/` and key tables can be exported as CSV.
- Limitations are explicitly stated alongside findings.
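
As a minimal sketch of the CSV export mentioned above: the helper below writes any table object that has a pandas-style `to_csv` method to a named file. The function name `export_table` and the default directory `outputs/tables` are assumptions for illustration; only `outputs/figures/` is named in this submission.

```python
from pathlib import Path


def export_table(table, name, out_dir="outputs/tables"):
    """Write a table to <out_dir>/<name>.csv and return the path.

    `table` is expected to expose a pandas-style `to_csv(path, index=...)` method.
    """
    out_path = Path(out_dir) / f"{name}.csv"
    out_path.parent.mkdir(parents=True, exist_ok=True)  # create the folder on first use
    table.to_csv(out_path, index=False)
    return out_path
```

For example, the dataset card could be saved with `export_table(pd.DataFrame([card]), "dataset_card")`, assuming `card` is defined as in the first cell.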