# 07 Responsible Interpretation and Limitations

**Purpose**  
Document dataset scope, key constraints, and responsible interpretation guidance for all analytical outputs.

This notebook intentionally performs no new analysis; it only reports core dataset diagnostics.

## Configuration and Dataset Snapshot

In [1]:
import sys
from pathlib import Path
import pandas as pd
from IPython.display import display

PROJECT_ROOT = next((p for p in [Path.cwd(), *Path.cwd().parents] if (p / "src").exists()), Path.cwd())
if str(PROJECT_ROOT) not in sys.path:
    sys.path.insert(0, str(PROJECT_ROOT))

from src.notebook_utils import load_data, normalize_incident_id

DATA_PATH = PROJECT_ROOT / "data"
OUTPUT_PATH = PROJECT_ROOT / "outputs" / "figures"
TOP_N = 15
DATE_CANDIDATES = ["date", "date_published", "date_submitted"]

loaded_tables = load_data(DATA_PATH, tables=["incidents", "reports", "submissions", "mit", "gmf", "cset"])

incidents_df = loaded_tables["incidents"]
reports_df = loaded_tables["reports"]
submissions_df = loaded_tables["submissions"]
mit_df = normalize_incident_id(loaded_tables["mit"])
gmf_df = normalize_incident_id(loaded_tables["gmf"])
cset_df = normalize_incident_id(loaded_tables["cset"])

if incidents_df is None or reports_df is None or submissions_df is None:
    raise FileNotFoundError("Required tables missing: incidents.csv, reports.csv, or submissions.csv.")

incident_id_values = set(incidents_df["incident_id"]) if "incident_id" in incidents_df.columns else set()


def taxonomy_coverage(taxonomy_df):
    """Distinct incident IDs in a taxonomy table divided by the number of incident IDs, or None if not computable."""
    if taxonomy_df is None or "incident_id" not in taxonomy_df.columns or not incident_id_values:
        return None
    return taxonomy_df["incident_id"].nunique() / len(incident_id_values)


overview_card = {
    "Incidents (rows)": len(incidents_df),
    "Reports (rows)": len(reports_df),
    "Unique report URLs": reports_df["url"].nunique() if "url" in reports_df.columns else None,
    "Unique source domains": reports_df["source_domain"].nunique() if "source_domain" in reports_df.columns else None,
    "Submissions rows (auxiliary)": len(submissions_df),
    "MIT coverage (incident-level)": taxonomy_coverage(mit_df),
    "GMF coverage (incident-level)": taxonomy_coverage(gmf_df),
    "CSET coverage (incident-level)": taxonomy_coverage(cset_df),
}

display(pd.DataFrame([overview_card]))

| | Incidents (rows) | Reports (rows) | Unique report URLs | Unique source domains | Submissions rows (auxiliary) | MIT coverage (incident-level) | GMF coverage (incident-level) | CSET coverage (incident-level) |
|---|---|---|---|---|---|---|---|---|
| 0 | 1367 | 6687 | 5846 | 1781 | 45 | 0.908559 | 0.238478 | 0.156547 |
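
The coverage values above divide the number of distinct `incident_id` values in each taxonomy table by the number of distinct incident IDs. The following is a minimal sketch of a stricter variant that counts only IDs also present in the incidents table, so the ratio cannot exceed 1. The function name `incident_coverage` and the toy frames are hypothetical and are not part of `src.notebook_utils`.

```python
import pandas as pd


def incident_coverage(taxonomy_df, incident_ids, id_col="incident_id"):
    """Share of known incidents with at least one row in taxonomy_df.

    Returns None when the table is missing, lacks the ID column,
    or there are no incidents to compare against.
    """
    if taxonomy_df is None or id_col not in taxonomy_df.columns or not incident_ids:
        return None
    # Keep only taxonomy IDs that are also known incidents.
    covered = set(taxonomy_df[id_col].dropna()) & set(incident_ids)
    return len(covered) / len(incident_ids)


# Toy data for illustration only (not drawn from the real snapshot).
toy_incidents = pd.DataFrame({"incident_id": [1, 2, 3, 4]})
toy_taxonomy = pd.DataFrame({"incident_id": [1, 1, 2, 99]})  # 99 is not a known incident

toy_ids = set(toy_incidents["incident_id"])
toy_coverage = incident_coverage(toy_taxonomy, toy_ids)
print(toy_coverage)
```

If the two definitions disagree on a real snapshot, the taxonomy table likely contains IDs that are absent from the incidents table, which is worth reporting alongside the coverage figure.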


## What this dataset is (and is not)

**What it is:**
- A curated, documented collection of AI-related incidents and reporting artifacts.
- Useful for identifying broad patterns in incident types, harms, and failure modes.
- Useful for transparency and reproducible policy-facing analytics.

**What it is not:**
- A complete census of all AI harms.
- A measurement of true prevalence of incidents across sectors.
- A causal dataset (co-occurrence does not imply causality).

## Key limitations and risks of misinterpretation

1. **Selection and reporting bias**
   - Incidents included depend on discoverability, media coverage, and monitoring.
   - Certain regions or sectors may be under-represented.

2. **Taxonomy coverage differences**
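   - MIT labels cover about 91% of incidents in this snapshot, compared with about 24% for GMF and about 16% for CSET.
   - Deep technical patterns drawn from the lower-coverage taxonomies describe only the annotated subset and may not generalize to all incidents.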

3. **Schema variation across snapshots**
   - Column naming and completeness can differ between exports.
   - Notebooks therefore use defensive column detection and explicit assumptions.

4. **Temporal ambiguity**
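   - Incident dates and report publication dates measure different events and should not be used interchangeably.
   - Reporting lag can only be approximated for the subset of records that join across tables with usable dates on both sides.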
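
Limitations 3 and 4 can be illustrated together. The sketch below picks the first available date column from a candidate list, then approximates reporting lag only for reports that join to an incident with a parseable date on both sides. The helper `first_existing_column`, the assumption that the reports table carries an `incident_id` column, and the toy data are all hypothetical; the real schema may differ between snapshots.

```python
import pandas as pd

DATE_CANDIDATES = ["date", "date_published", "date_submitted"]


def first_existing_column(df, candidates):
    """Return the first candidate name present in df.columns, or None."""
    return next((c for c in candidates if c in df.columns), None)


# Toy frames for illustration only; real column names may differ.
toy_incidents = pd.DataFrame(
    {"incident_id": [1, 2, 3], "date": ["2020-01-01", "2021-06-15", None]}
)
toy_reports = pd.DataFrame(
    {
        "incident_id": [1, 1, 2, 3, 4],
        "date_published": [
            "2020-01-11", "2020-03-01", "2021-06-20", "2021-07-01", "2022-01-01",
        ],
    }
)

incident_date_col = first_existing_column(toy_incidents, DATE_CANDIDATES)
report_date_col = first_existing_column(toy_reports, DATE_CANDIDATES)

# Rename to fixed names so the join does not depend on snapshot-specific columns.
report_dates = toy_reports[["incident_id", report_date_col]].rename(
    columns={report_date_col: "report_date"}
)
incident_dates = toy_incidents[["incident_id", incident_date_col]].rename(
    columns={incident_date_col: "incident_date"}
)

# Inner join: reports without a matching incident (id 4) drop out.
joined = report_dates.merge(incident_dates, on="incident_id", how="inner")
for col in ("report_date", "incident_date"):
    joined[col] = pd.to_datetime(joined[col], errors="coerce")

# Rows with a missing or unparseable date on either side (incident 3) drop out.
joined = joined.dropna(subset=["report_date", "incident_date"])
joined["lag_days"] = (joined["report_date"] - joined["incident_date"]).dt.days

lag_days = joined["lag_days"].tolist()
usable_share = len(joined) / len(toy_reports)
print(lag_days, usable_share)
```

Reporting `usable_share` next to any lag statistic makes it explicit how much of the reports table the estimate actually rests on.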

## Responsible usage guidance

Appropriate:
- Macro-level descriptive trends (domains, intent, failure types)
- Transparency reporting and hypothesis generation
- Reproducible policy-facing analytics

Not appropriate:
- Ranking organizations as most harmful
- Making causal claims about architectures or actors
- Estimating true global incidence rates

## Reproducibility commitments

- Notebooks run top-to-bottom with shared configuration and reusable helpers.
- Outputs are saved to `outputs/figures/` and key tables are exportable.
- Limitations are explicitly documented next to findings.
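
As an illustration of the export commitment, the sketch below writes a table to CSV and reads it back. The helper `export_table`, the `tables` subfolder, and the file name are hypothetical; a temporary directory stands in for the project output folder so the example has no side effects.

```python
import tempfile
from pathlib import Path

import pandas as pd


def export_table(df, out_dir, name):
    """Write df to <out_dir>/<name>.csv, creating the directory if needed."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_file = out_dir / f"{name}.csv"
    # index=False avoids an extra unnamed index column when the file is re-read.
    df.to_csv(out_file, index=False)
    return out_file


# Toy table for illustration only.
toy_table = pd.DataFrame([{"Incidents (rows)": 3, "Reports (rows)": 7}])

with tempfile.TemporaryDirectory() as tmp:
    written = export_table(toy_table, Path(tmp) / "tables", "overview_card")
    round_trip = pd.read_csv(written)

print(round_trip)
```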