# Gap 5: Temporal Drift Analysis (Scenario 5)

**Objective:** Analyze how record linkage quality changes across enrollment years,
detect name/state migration patterns, and quantify temporal drift in the
Med→PECOS and OP→Med linkage pipelines.

**Key Questions:**
1. Does the name-mismatch rate between Medicare and PECOS increase for older enrollments?
2. How many providers change states over time (geographic mobility)?
3. Does fuzzy match confidence (Jaro-Winkler scores) degrade with enrollment age?
4. Are there year-specific linkage failures that suggest schema or coding changes?

**Input:**
- `../artifacts/phase5_entity_resolution/unified_provider_entities.parquet`
- `../artifacts/phase4_linkage/med_pecos_tier1_npi.parquet`
- `../artifacts/phase4_linkage/feature_matrix.csv` (optional; used in 5.8)
- `../artifacts/phase2_preprocessing/pecos_clean.parquet`

**Output:** `../artifacts/phase7_temporal_drift/`

## 5.0 Setup

In [5]:
import os, time
import pandas as pd
import numpy as np
from collections import Counter

INPUTDIR_P2 = "../artifacts/phase2_preprocessing"
INPUTDIR_P4 = "../artifacts/phase4_linkage"
INPUTDIR_P5 = "../artifacts/phase5_entity_resolution"
OUTPUTDIR = "../artifacts/phase7_temporal_drift"
os.makedirs(OUTPUTDIR, exist_ok=True)

print("Setup complete.")


Setup complete.


## 5.1 Load Datasets

In [6]:
print("=" * 60)
print("5.1 LOAD DATASETS")
print("=" * 60)

# Unified provider table (from Phase 5)
unified = pd.read_parquet(os.path.join(INPUTDIR_P5, "unified_provider_entities.parquet"))
print(f"Unified providers: {len(unified):,}")

# Med→PECOS tier-1 links (individual) — has ENRLMT_YEAR
mp_tier1 = pd.read_parquet(os.path.join(INPUTDIR_P4, "med_pecos_tier1_npi.parquet"))
print(f"Med→PECOS tier-1 links: {len(mp_tier1):,}")

# Full PECOS clean — for multi-enrollment analysis
pecos = pd.read_parquet(os.path.join(INPUTDIR_P2, "pecos_clean.parquet"))
print(f"PECOS records: {len(pecos):,}")

# Year distribution
print(f"\nEnrollment year range: {mp_tier1['ENRLMT_YEAR'].min()} – {mp_tier1['ENRLMT_YEAR'].max()}")
print(f"\nYear distribution (Med→PECOS tier-1):")
print(mp_tier1["ENRLMT_YEAR"].value_counts().sort_index().to_string())


5.1 LOAD DATASETS
Unified providers: 1,237,145
Med→PECOS tier-1 links: 1,084,185
PECOS records: 2,936,748

Enrollment year range: 2003 – 2025

Year distribution (Med→PECOS tier-1):
ENRLMT_YEAR
2003     5859
2004    70057
2005    56807
2006    33209
2007    39593
2008    43134
2009    38273
2010    73436
2011    47627
2012    39073
2013    31955
2014    36830
2015    41113
2016    43470
2017    46877
2018    48994
2019    55587
2020    58635
2021    68297
2022    78378
2023    70089
2024    31928
2025    24964
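
The later cells read specific name, state, and year columns from `mp_tier1`. A minimal sketch of a guard that fails early when one is absent is shown here. `require_columns` and `TIER1_REQUIRED` are hypothetical names, not part of the notebook, and the toy frame only stands in for the real table.

```python
import pandas as pd

def require_columns(df: pd.DataFrame, required: list[str], name: str) -> None:
    """Raise a KeyError naming every required column that is absent from df."""
    missing = [c for c in required if c not in df.columns]
    if missing:
        raise KeyError(f"{name} is missing columns: {missing}")

# Columns that sections 5.2 and 5.4 read from the tier-1 link table.
TIER1_REQUIRED = [
    "ENRLMT_YEAR",
    "Rndrng_Prvdr_First_Name", "Rndrng_Prvdr_Last_Org_Name",
    "FIRST_NAME", "LAST_NAME",
    "Rndrng_Prvdr_State_Abrvtn", "STATE_CD",
]

# Toy frame standing in for mp_tier1 (illustrative values only).
toy = pd.DataFrame({c: ["x"] for c in TIER1_REQUIRED})
require_columns(toy, TIER1_REQUIRED, "mp_tier1")  # passes silently
```

Calling the same helper right after `pd.read_parquet` would surface a renamed column before any aggregation runs.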


## 5.2 Name Mismatch Rate by Enrollment Year

**Hypothesis:** Older PECOS enrollments are more likely to have name mismatches
with the 2023 Medicare snapshot, because providers may have changed names
(marriage, legal changes) or because older PECOS records used different
naming conventions.

In [8]:
print("=" * 60)
print("5.2 NAME MISMATCH RATE BY ENROLLMENT YEAR")
print("=" * 60)

# mp_tier1 has both Medicare names (Rndrng_Prvdr_*) and PECOS names (FIRST_NAME, LAST_NAME)
# Compare them by enrollment year

mp = mp_tier1.copy()

# Standardize for comparison
mp["med_first"] = mp["Rndrng_Prvdr_First_Name"].str.upper().str.strip()
mp["med_last"]  = mp["Rndrng_Prvdr_Last_Org_Name"].str.upper().str.strip()
mp["pecos_first"] = mp["FIRST_NAME"].str.upper().str.strip()
mp["pecos_last"]  = mp["LAST_NAME"].str.upper().str.strip()

# Detect mismatches
mp["first_mismatch"] = (mp["med_first"] != mp["pecos_first"]) & mp["med_first"].notna() & mp["pecos_first"].notna()
mp["last_mismatch"]  = (mp["med_last"] != mp["pecos_last"]) & mp["med_last"].notna() & mp["pecos_last"].notna()
mp["any_mismatch"]   = mp["first_mismatch"] | mp["last_mismatch"]

# Aggregate by year
year_stats = mp.groupby("ENRLMT_YEAR").agg(
    total=("any_mismatch", "count"),
    n_mismatch=("any_mismatch", "sum"),
    n_first_mm=("first_mismatch", "sum"),
    n_last_mm=("last_mismatch", "sum"),
).reset_index()

year_stats["mismatch_rate"] = (year_stats["n_mismatch"] / year_stats["total"] * 100).round(2)
year_stats["first_mm_rate"] = (year_stats["n_first_mm"] / year_stats["total"] * 100).round(2)
year_stats["last_mm_rate"]  = (year_stats["n_last_mm"] / year_stats["total"] * 100).round(2)

print(year_stats.to_string(index=False))

# Save
year_stats.to_csv(os.path.join(OUTPUTDIR, "name_mismatch_by_year.csv"), index=False)
print(f"\n✓ Saved name_mismatch_by_year.csv")


5.2 NAME MISMATCH RATE BY ENROLLMENT YEAR
 ENRLMT_YEAR  total  n_mismatch  n_first_mm  n_last_mm  mismatch_rate  first_mm_rate  last_mm_rate
        2003   5859         226         103        126           3.86           1.76          2.15
        2004  70057        2381        1108       1303           3.40           1.58          1.86
        2005  56807        1968         940       1062           3.46           1.65          1.87
        2006  33209        1112         491        635           3.35           1.48          1.91
        2007  39593        1262         536        740           3.19           1.35          1.87
        2008  43134        1409         660        767           3.27           1.53          1.78
        2009  38273        1180         547        651           3.08           1.43          1.70
        2010  73436        2025        1000       1044           2.76           1.36          1.42
        2011  47627        1404         639        794           2.
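
The hypothesis in 5.2 names two causes: real name changes and differing naming conventions. The exact string comparison above cannot tell them apart. The sketch below, on toy data, compares names after stripping punctuation and spaces, so that only substantive differences remain. `normalize_name` is a hypothetical helper, and the rule it applies (keep A-Z only) is an assumption about which differences count as formatting.

```python
import pandas as pd

def normalize_name(s: pd.Series) -> pd.Series:
    """Uppercase, then drop everything except A-Z so "O'BRIEN" equals "OBRIEN"."""
    return s.str.upper().str.replace(r"[^A-Z]", "", regex=True)

# Toy pairs: two formatting differences, one real difference, one missing value.
toy = pd.DataFrame({
    "med_last":   ["SMITH-JONES", "O'BRIEN", "GARCIA", None],
    "pecos_last": ["SMITH JONES", "OBRIEN",  "LOPEZ",  "LEE"],
})

# Same null handling as the cell above: a pair counts only when both sides are present.
both = toy["med_last"].notna() & toy["pecos_last"].notna()
strict = (toy["med_last"] != toy["pecos_last"]) & both
loose = (normalize_name(toy["med_last"]) != normalize_name(toy["pecos_last"])) & both

toy["strict_mismatch"] = strict
toy["loose_mismatch"] = loose
print(toy)
```

The gap between the strict and loose rates per year would indicate how much of the measured drift is formatting rather than a changed name.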

## 5.3 Mismatch Trend Visualization

In [9]:
import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt

fig, ax1 = plt.subplots(figsize=(12, 6))

# Bar chart: record count per year
ax1.bar(year_stats["ENRLMT_YEAR"], year_stats["total"], alpha=0.3, color="steelblue", label="Records")
ax1.set_xlabel("PECOS Enrollment Year")
ax1.set_ylabel("Number of Med→PECOS Links", color="steelblue")
ax1.tick_params(axis="y", labelcolor="steelblue")

# Line chart: mismatch rate
ax2 = ax1.twinx()
ax2.plot(year_stats["ENRLMT_YEAR"], year_stats["mismatch_rate"], "ro-", linewidth=2, markersize=5, label="Any Mismatch %")
ax2.plot(year_stats["ENRLMT_YEAR"], year_stats["first_mm_rate"], "g^--", linewidth=1, markersize=4, label="First Name MM %")
ax2.plot(year_stats["ENRLMT_YEAR"], year_stats["last_mm_rate"], "bs--", linewidth=1, markersize=4, label="Last Name MM %")
ax2.set_ylabel("Mismatch Rate (%)", color="red")
ax2.tick_params(axis="y", labelcolor="red")

# Legend
lines1, labels1 = ax1.get_legend_handles_labels()
lines2, labels2 = ax2.get_legend_handles_labels()
ax1.legend(lines1 + lines2, labels1 + labels2, loc="upper left")

plt.title("Name Mismatch Rate by PECOS Enrollment Year\n(Medicare 2023 vs PECOS historical)")
plt.tight_layout()
plt.savefig(os.path.join(OUTPUTDIR, "temporal_mismatch_trend.png"), dpi=150)
plt.close()
print("✓ Saved temporal_mismatch_trend.png")


✓ Saved temporal_mismatch_trend.png


## 5.4 Geographic Mobility — State Changes Over Time

A provider whose PECOS enrollment state differs from the state in Medicare's 2023
snapshot may have moved, or may hold enrollments in more than one state (see 5.6).
Either way, the difference weakens state-based blocking strategies and can cause
linkage failures.

In [12]:
print("=" * 60)
print("5.4 GEOGRAPHIC MOBILITY (STATE CHANGES)")
print("=" * 60)

# Compare PECOS state (STATE_CD) vs Medicare state (Rndrng_Prvdr_State_Abrvtn)
mp["state_changed"] = (
    (mp["Rndrng_Prvdr_State_Abrvtn"] != mp["STATE_CD"]) &
    mp["Rndrng_Prvdr_State_Abrvtn"].notna() &
    mp["STATE_CD"].notna()
)

# By year
state_drift = mp.groupby("ENRLMT_YEAR").agg(
    total=("state_changed", "count"),
    n_changed=("state_changed", "sum"),
).reset_index()
state_drift["change_rate"] = (state_drift["n_changed"] / state_drift["total"] * 100).round(2)

print("State change rate by enrollment year:")
print(state_drift.to_string(index=False))

# Overall
total_changed = mp["state_changed"].sum()
total_checked = mp["state_changed"].count()
print(f"\nOverall state changes: {total_changed:,} / {total_checked:,} ({total_changed/total_checked*100:.2f}%)")

# Top migration corridors
movers = mp[mp["state_changed"]].copy()
if len(movers) > 0:
    movers["corridor"] = movers["STATE_CD"] + " → " + movers["Rndrng_Prvdr_State_Abrvtn"]
    print(f"\nTop 15 migration corridors (PECOS state → Medicare 2023 state):")
    print(movers["corridor"].value_counts().head(15).to_string())

state_drift.to_csv(os.path.join(OUTPUTDIR, "state_drift_by_year.csv"), index=False)
print(f"\n✓ Saved state_drift_by_year.csv")


5.4 GEOGRAPHIC MOBILITY (STATE CHANGES)
State change rate by enrollment year:
 ENRLMT_YEAR  total  n_changed  change_rate
        2003   5859         71         1.21
        2004  70057       1289         1.84
        2005  56807       1294         2.28
        2006  33209        862         2.60
        2007  39593       1528         3.86
        2008  43134       1888         4.38
        2009  38273       2068         5.40
        2010  73436       2705         3.68
        2011  47627       2996         6.29
        2012  39073       2895         7.41
        2013  31955       3032         9.49
        2014  36830       4026        10.93
        2015  41113       4961        12.07
        2016  43470       5709        13.13
        2017  46877       7125        15.20
        2018  48994       7975        16.28
        2019  55587       9390        16.89
        2020  58635      11809        20.14
        2021  68297      13578        19.88
        2022  78378      17395        22.1
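
A state difference on a link does not by itself show a move, since a provider can be enrolled in several states at once. The sketch below, on toy data, separates differences that another PECOS enrollment of the same NPI explains from those it does not. It assumes the link table carries an `NPI` column, which the notebook does not show, and all values are illustrative.

```python
import pandas as pd

# Toy PECOS enrollments: NPI 1 is enrolled in two states.
pecos_toy = pd.DataFrame({
    "NPI":      [1, 1, 2, 3],
    "STATE_CD": ["NY", "NJ", "TX", "CA"],
})
# Toy link table: one row per NPI with the PECOS state the link happens to carry.
links_toy = pd.DataFrame({
    "NPI":                       [1, 2, 3],
    "STATE_CD":                  ["NY", "TX", "CA"],
    "Rndrng_Prvdr_State_Abrvtn": ["NJ", "TX", "FL"],
})

# Set of every state in which each NPI has a PECOS enrollment.
states_by_npi = (
    pecos_toy.dropna(subset=["STATE_CD"]).groupby("NPI")["STATE_CD"].apply(set)
)

links_toy["state_changed"] = links_toy["STATE_CD"] != links_toy["Rndrng_Prvdr_State_Abrvtn"]
known = links_toy["NPI"].map(states_by_npi)
links_toy["med_state_in_pecos"] = [
    isinstance(states, set) and med in states
    for states, med in zip(known, links_toy["Rndrng_Prvdr_State_Abrvtn"])
]
# Unexplained: the states differ and no PECOS enrollment lists the Medicare state.
links_toy["unexplained_change"] = links_toy["state_changed"] & ~links_toy["med_state_in_pecos"]
print(links_toy)
```

Only the unexplained rows are candidates for a real relocation; the rest could be recovered by blocking on any enrollment state instead of a single one.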

## 5.5 State Drift Visualization

In [13]:
fig, ax1 = plt.subplots(figsize=(12, 6))

ax1.bar(state_drift["ENRLMT_YEAR"], state_drift["total"], alpha=0.3, color="steelblue", label="Records")
ax1.set_xlabel("PECOS Enrollment Year")
ax1.set_ylabel("Number of Med→PECOS Links", color="steelblue")
ax1.tick_params(axis="y", labelcolor="steelblue")

ax2 = ax1.twinx()
ax2.plot(state_drift["ENRLMT_YEAR"], state_drift["change_rate"], "ro-", linewidth=2, markersize=5, label="State Change %")
ax2.set_ylabel("State Change Rate (%)", color="red")
ax2.tick_params(axis="y", labelcolor="red")

lines1, labels1 = ax1.get_legend_handles_labels()
lines2, labels2 = ax2.get_legend_handles_labels()
ax1.legend(lines1 + lines2, labels1 + labels2, loc="upper left")

plt.title("Geographic Mobility: PECOS Enrollment State vs Medicare 2023 State")
plt.tight_layout()
plt.savefig(os.path.join(OUTPUTDIR, "state_drift_trend.png"), dpi=150)
plt.close()
print("✓ Saved state_drift_trend.png")


✓ Saved state_drift_trend.png


## 5.6 Multi-Enrollment Provider Analysis

Some providers have multiple PECOS enrollment records across different years.
This section measures enrollment churn and counts providers whose last name or
state differs between enrollments.

In [14]:
print("=" * 60)
print("5.6 MULTI-ENROLLMENT TEMPORAL ANALYSIS")
print("=" * 60)

# Filter individual PECOS enrollments
pecos_indiv = pecos[pecos["ENRLMT_ENTITY"] == "I"].copy()

# How many enrollments per NPI?
enrl_per_npi = pecos_indiv.groupby("NPI").agg(
    n_enrollments=("ENRLMT_ID", "nunique"),
    n_years=("ENRLMT_YEAR", "nunique"),
    first_year=("ENRLMT_YEAR", "min"),
    last_year=("ENRLMT_YEAR", "max"),
    n_states=("STATE_CD", "nunique"),
    n_names=("LAST_NAME", "nunique"),
).reset_index()

print(f"Unique individual NPIs in PECOS: {len(enrl_per_npi):,}")
print(f"\nEnrollment count distribution:")
print(enrl_per_npi["n_enrollments"].describe().to_string())

print(f"\nProviders with multiple enrollments: {(enrl_per_npi['n_enrollments'] > 1).sum():,}")
print(f"Providers spanning multiple years:   {(enrl_per_npi['n_years'] > 1).sum():,}")
print(f"Providers in multiple states:        {(enrl_per_npi['n_states'] > 1).sum():,}")
print(f"Providers with name changes:         {(enrl_per_npi['n_names'] > 1).sum():,}")

# Year span distribution
enrl_per_npi["year_span"] = enrl_per_npi["last_year"] - enrl_per_npi["first_year"]
print(f"\nYear span distribution (last_year - first_year):")
print(enrl_per_npi["year_span"].describe().to_string())


5.6 MULTI-ENROLLMENT TEMPORAL ANALYSIS
Unique individual NPIs in PECOS: 2,142,466

Enrollment count distribution:
count    2.142466e+06
mean     1.167989e+00
std      6.895810e-01
min      1.000000e+00
25%      1.000000e+00
50%      1.000000e+00
75%      1.000000e+00
max      5.100000e+01

Providers with multiple enrollments: 250,185
Providers spanning multiple years:   220,390
Providers in multiple states:        247,564
Providers with name changes:         0

Year span distribution (last_year - first_year):
count    2.142466e+06
mean     6.745055e-01
std      2.618637e+00
min      0.000000e+00
25%      0.000000e+00
50%      0.000000e+00
75%      0.000000e+00
max      2.200000e+01
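
A count of exactly zero name changes, next to 247,564 multi-state providers, is worth a sanity check. It may mean that `LAST_NAME` is stored once per NPI and copied to every enrollment, in which case it cannot vary by construction. The sketch below, on toy data, reports per-column variation within a key. `variation_report` is a hypothetical helper and the values are illustrative.

```python
import pandas as pd

def variation_report(df: pd.DataFrame, key: str, cols: list[str]) -> pd.DataFrame:
    """For each column, count keys that carry more than one distinct non-null value."""
    per_key = df.groupby(key)[cols].nunique()   # nunique ignores NaN
    multi_row = df.groupby(key).size() > 1      # keys with two or more rows
    return pd.DataFrame({
        "keys_with_variation": (per_key > 1).sum(),
        "share_of_multi_row_keys": (per_key[multi_row] > 1).mean().round(3),
        "null_share": df[cols].isna().mean().round(3),
    })

toy = pd.DataFrame({
    "NPI":       [1, 1, 2, 2, 3],
    "LAST_NAME": ["LEE", "LEE", "KIM", "KIM", "DIAZ"],
    "STATE_CD":  ["NY", "NJ", "TX", "TX", "CA"],
})
report = variation_report(toy, "NPI", ["LAST_NAME", "STATE_CD"])
print(report)
```

If every name-like column shows zero variation on the real data while the state column does not, the zero reflects how the table is built rather than name stability.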


## 5.7 Enrollment Tenure Cohort Analysis

Group providers into tenure cohorts (years since first enrollment) and measure
how linkage-relevant attributes change with tenure.

In [15]:
print("=" * 60)
print("5.7 TENURE COHORT ANALYSIS")
print("=" * 60)

# Define cohorts based on first enrollment year
# Reference year = 2023 (Medicare snapshot)
enrl_per_npi["tenure"] = 2023 - enrl_per_npi["first_year"]

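# NOTE: with right=True and no include_lowest, the first interval is (0, 2].
# Tenure 0 (first enrolled in 2023) and negative tenure (first enrolled after
# 2023) therefore map to NaN and drop out of the groupby below.
bins = [0, 2, 5, 10, 15, 50]
labels = ["0-2yr", "3-5yr", "6-10yr", "11-15yr", "16+yr"]
enrl_per_npi["cohort"] = pd.cut(enrl_per_npi["tenure"], bins=bins, labels=labels, right=True)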

# Merge with unified to get linkage status
unified_slim = unified[["npi", "has_op_payments", "has_pecos_enrollment", "linkage_coverage"]].copy()
unified_slim["npi"] = pd.to_numeric(unified_slim["npi"], errors="coerce")
enrl_per_npi["NPI"] = pd.to_numeric(enrl_per_npi["NPI"], errors="coerce")

cohort_merged = enrl_per_npi.merge(unified_slim, left_on="NPI", right_on="npi", how="left")

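# Aggregate by cohort
# (the left merge leaves NaN for NPIs absent from `unified`; mean() skips NaN,
#  so the has_* percentages cover only NPIs found in `unified`)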
cohort_stats = cohort_merged.groupby("cohort", observed=True).agg(
    n_providers=("NPI", "count"),
    multi_state_pct=("n_states", lambda x: (x > 1).mean() * 100),
    name_change_pct=("n_names", lambda x: (x > 1).mean() * 100),
    has_op_pct=("has_op_payments", lambda x: x.mean() * 100),
    has_pecos_pct=("has_pecos_enrollment", lambda x: x.mean() * 100),
    avg_enrollments=("n_enrollments", "mean"),
).round(2).reset_index()

print(cohort_stats.to_string(index=False))
cohort_stats.to_csv(os.path.join(OUTPUTDIR, "tenure_cohort_analysis.csv"), index=False)
print(f"\n✓ Saved tenure_cohort_analysis.csv")


5.7 TENURE COHORT ANALYSIS
 cohort  n_providers  multi_state_pct  name_change_pct  has_op_pct  has_pecos_pct  avg_enrollments
  0-2yr       243722            12.72              0.0       45.54          100.0             1.17
  3-5yr       280836            16.17              0.0       47.90          100.0             1.23
 6-10yr       411750            14.30              0.0       49.06          100.0             1.21
11-15yr       384652            10.99              0.0       51.62          100.0             1.17
  16+yr       321847            12.21              0.0       51.86          100.0             1.19

✓ Saved tenure_cohort_analysis.csv
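
The cohort counts above sum to 1,642,807, which is 499,659 fewer than the 2,142,466 individual NPIs reported in 5.6. One likely contributor is how `pd.cut` treats the lowest bin edge. The sketch below, on toy tenure values, shows which values fall outside the bins as the cell is written, and how `include_lowest=True` changes that.

```python
import pandas as pd

tenure = pd.Series([-2, -1, 0, 1, 2, 3, 20], name="tenure")
bins = [0, 2, 5, 10, 15, 50]
labels = ["0-2yr", "3-5yr", "6-10yr", "11-15yr", "16+yr"]

# As in the cell above: intervals are (0, 2], (2, 5], ... so 0 and negatives get NaN.
as_written = pd.cut(tenure, bins=bins, labels=labels, right=True)

# include_lowest=True closes the first interval to [0, 2], which keeps tenure 0.
with_zero = pd.cut(tenure, bins=bins, labels=labels, right=True, include_lowest=True)

print(pd.DataFrame({"tenure": tenure, "as_written": as_written, "with_zero": with_zero}))
```

Negative tenure stays unassigned in both variants, so enrollments that begin after the 2023 reference year would need their own bin or an explicit filter.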


## 5.8 Fuzzy Match Score Degradation by Year

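For tier-2 fuzzy matches (from the Phase 4 OP→Med feature matrix), check whether
Jaro-Winkler or other similarity scores decrease for providers with older PECOS
enrollments. The cell below covers the first step only: it summarizes score
distributions for matched and possible pairs, without yet splitting them by
enrollment year.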

In [16]:
print("=" * 60)
print("5.8 FUZZY MATCH SCORE DEGRADATION")
print("=" * 60)

# Load feature matrix from Phase 4 (OP→Med)
PH4DIR = "../artifacts/phase4_linkage"
feat_path = os.path.join(PH4DIR, "feature_matrix.csv")

if os.path.exists(feat_path):
    feat = pd.read_csv(feat_path)
    print(f"Feature matrix loaded: {len(feat):,} pairs")
    print(f"Columns: {feat.columns.tolist()}")

    # Get the match_tier column
    if "match_tier" in feat.columns:
        matches = feat[feat["match_tier"].isin(["match", "possible"])].copy()
        print(f"Matched/possible pairs: {len(matches):,}")

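        # If we have similarity scores, analyze them
        # NOTE: these keywords do not match the "*_jw", "*_lev" or "*_avg" columns,
        # so the Jaro-Winkler features are not picked up by this filter.
        score_cols = [c for c in feat.columns if any(k in c.lower() for k in ["jaro", "score", "sim", "ratio"])]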
        if score_cols:
            print(f"\nScore columns found: {score_cols}")
            print(f"Score distributions for matched pairs:")
            print(matches[score_cols].describe().round(3).to_string())
        else:
            print("\nNo explicit similarity score columns found in feature matrix.")
            print("Available columns:", feat.columns.tolist())
    else:
        print("No match_tier column — cannot separate matches from non-matches.")
else:
    print(f"Feature matrix not found at {feat_path}")
    print("Skipping fuzzy score degradation analysis.")
    print("To enable: ensure Phase 4 feature_matrix.csv exists.")


5.8 FUZZY MATCH SCORE DEGRADATION
Feature matrix loaded: 492,427 pairs
Columns: ['index_op', 'index_med', 'first_jw', 'first_lev', 'last_jw', 'last_lev', 'first_soundex_match', 'last_soundex_match', 'first_metaphone_match', 'last_metaphone_match', 'street_jw', 'city_match', 'zip5_match', 'name_avg', 'addr_avg', 'raw_score', 'name_len_ratio', 'full_name_jw', 'match_tier', 'which_path', 'ml_match_prob', 'ml_match_pred']
Matched/possible pairs: 558

Score columns found: ['raw_score', 'name_len_ratio']
Score distributions for matched pairs:
       raw_score  name_len_ratio
count    558.000         558.000
mean       0.687           0.981
std        0.145           0.064
min        0.540           0.500
25%        0.589           1.000
50%        0.602           1.000
75%        0.751           1.000
max        1.000           1.000
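
The printed column list includes `first_jw`, `last_jw`, and `full_name_jw`, yet the keyword filter found only `raw_score` and `name_len_ratio`. The sketch below, on toy data, selects the Jaro-Winkler columns by suffix and averages them per enrollment year. The `ENRLMT_YEAR` column is an assumption: the real feature matrix has no year, so it would first have to be joined on from the link tables, a step not shown in the notebook.

```python
import pandas as pd

# Toy feature matrix using column names from the printed list above.
# ENRLMT_YEAR is hypothetical here; it is NOT a column of the real feature matrix.
feat_toy = pd.DataFrame({
    "first_jw":     [1.00, 0.92, 0.85, 0.95],
    "last_jw":      [1.00, 0.96, 0.80, 0.90],
    "full_name_jw": [1.00, 0.94, 0.82, 0.93],
    "raw_score":    [1.00, 0.75, 0.58, 0.70],
    "match_tier":   ["match", "possible", "possible", "match"],
    "ENRLMT_YEAR":  [2005, 2005, 2020, 2020],
})

# Select Jaro-Winkler columns by their "_jw" suffix.
jw_cols = [c for c in feat_toy.columns if c.endswith("_jw")]

# Same tier filter as the cell above, then a per-year mean of each JW score.
matched = feat_toy[feat_toy["match_tier"].isin(["match", "possible"])]
jw_by_year = matched.groupby("ENRLMT_YEAR")[jw_cols].mean().round(3)
print(jw_by_year)
```

With only 558 matched or possible pairs in the real matrix, per-year means would rest on small groups, so pair counts per year belong next to any such table.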


## 5.9 Year-Specific Linkage Failure Detection

Identify enrollment years with anomalously high linkage risk, using the sum of the
name mismatch rate and the state change rate as a proxy for linkage failure. Such
years could indicate schema changes, data quality issues, or coding shifts
in the PECOS data.

In [17]:
print("=" * 60)
print("5.9 YEAR-SPECIFIC LINKAGE FAILURES")
print("=" * 60)

# Combine name mismatch + state drift into a single "linkage risk" score per year
combined = year_stats[["ENRLMT_YEAR", "total", "mismatch_rate"]].merge(
    state_drift[["ENRLMT_YEAR", "change_rate"]],
    on="ENRLMT_YEAR"
)
combined["combined_risk"] = combined["mismatch_rate"] + combined["change_rate"]

# Flag anomalous years (> 1.5 IQR above Q3)
q1, q3 = combined["combined_risk"].quantile([0.25, 0.75])
iqr = q3 - q1
threshold = q3 + 1.5 * iqr
combined["anomalous"] = combined["combined_risk"] > threshold

print("Year-by-year linkage risk:")
print(combined.to_string(index=False))

anomalous = combined[combined["anomalous"]]
if len(anomalous) > 0:
    print(f"\n⚠️  Anomalous years detected (combined risk > {threshold:.2f}%):")
    print(anomalous[["ENRLMT_YEAR", "combined_risk"]].to_string(index=False))
else:
    print(f"\n✅ No anomalous years detected (threshold: {threshold:.2f}%)")

combined.to_csv(os.path.join(OUTPUTDIR, "year_linkage_risk.csv"), index=False)
print(f"\n✓ Saved year_linkage_risk.csv")


5.9 YEAR-SPECIFIC LINKAGE FAILURES
Year-by-year linkage risk:
 ENRLMT_YEAR  total  mismatch_rate  change_rate  combined_risk  anomalous
        2003   5859           3.86         1.21           5.07      False
        2004  70057           3.40         1.84           5.24      False
        2005  56807           3.46         2.28           5.74      False
        2006  33209           3.35         2.60           5.95      False
        2007  39593           3.19         3.86           7.05      False
        2008  43134           3.27         4.38           7.65      False
        2009  38273           3.08         5.40           8.48      False
        2010  73436           2.76         3.68           6.44      False
        2011  47627           2.95         6.29           9.24      False
        2012  39073           3.08         7.41          10.49      False
        2013  31955           2.88         9.49          12.37      False
        2014  36830           2.62        10.93   
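
The state change rate rises almost steadily with enrollment year, so a fixed upper fence on the raw combined risk tends to react to the trend itself rather than to a single unusual year. A possible alternative, sketched below on toy data, fits a straight line first and applies the IQR rule to the residuals. `flag_trend_outliers` is a hypothetical helper, and the linear trend is an assumption that the real series may not satisfy.

```python
import numpy as np
import pandas as pd

def flag_trend_outliers(years: pd.Series, values: pd.Series, k: float = 1.5) -> pd.Series:
    """Fit a straight line, then flag residuals more than k * IQR outside the quartiles."""
    slope, intercept = np.polyfit(
        years.to_numpy(dtype=float), values.to_numpy(dtype=float), deg=1
    )
    resid = values - (slope * years + intercept)
    q1, q3 = resid.quantile([0.25, 0.75])
    iqr = q3 - q1
    return (resid > q3 + k * iqr) | (resid < q1 - k * iqr)

# Toy series: a steady upward trend with one year (2010) pushed well above it.
toy = pd.DataFrame({"year": np.arange(2003, 2023)})
toy["risk"] = 5.0 + 1.0 * (toy["year"] - 2003)
toy.loc[toy["year"] == 2010, "risk"] += 8.0

toy["anomalous"] = flag_trend_outliers(toy["year"], toy["risk"])
print(toy[toy["anomalous"]])
```

On this toy series the raw-value fence from the cell above flags nothing, because the spike at 2010 is still below the values of the latest years.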

## 5.10 Summary & Export

In [18]:
print("=" * 60)
print("5.10 TEMPORAL DRIFT SUMMARY")
print("=" * 60)

# Overall statistics
overall_mm = mp["any_mismatch"].mean() * 100
overall_state = mp["state_changed"].mean() * 100
multi_enrl = (enrl_per_npi["n_enrollments"] > 1).mean() * 100
name_churn = (enrl_per_npi["n_names"] > 1).mean() * 100

print(f"--- Key Findings ---")
print(f"Overall name mismatch rate (Med vs PECOS):  {overall_mm:.2f}%")
print(f"Overall state change rate:                   {overall_state:.2f}%")
print(f"Providers with multiple PECOS enrollments:   {multi_enrl:.1f}%")
print(f"Providers with name changes across enrl:     {name_churn:.2f}%")

# Correlation: does older enrollment = higher mismatch?
if len(year_stats) > 3:
    corr = year_stats["ENRLMT_YEAR"].corr(year_stats["mismatch_rate"])
    print(f"\nCorrelation (year vs mismatch rate):          {corr:.3f}")
    if corr < -0.3:
        print("  → Older enrollments DO have higher mismatch rates (negative correlation)")
    elif corr > 0.3:
        print("  → Newer enrollments have higher mismatch rates (unexpected)")
    else:
        print("  → No strong linear trend detected")

corr_state = state_drift["ENRLMT_YEAR"].corr(state_drift["change_rate"])
print(f"Correlation (year vs state change rate):      {corr_state:.3f}")

# Implications for blocking
print(f"\n--- Implications for Record Linkage ---")
print(f"1. State-based blocking will miss ~{overall_state:.1f}% of providers who moved")
print(f"2. Name-based matching should use fuzzy methods to handle {overall_mm:.1f}% name drift")
print(f"3. {name_churn:.1f}% of providers have PECOS name changes — use most-recent enrollment")

# List all outputs
print(f"\n--- Artifacts ---")
for f in sorted(os.listdir(OUTPUTDIR)):
    sz = os.path.getsize(os.path.join(OUTPUTDIR, f)) / 1024
    print(f"  {f:45s} {sz:8.1f} KB")

print("\n" + "=" * 60)
print("GAP 5: TEMPORAL DRIFT ANALYSIS COMPLETE")
print("=" * 60)


5.10 TEMPORAL DRIFT SUMMARY
--- Key Findings ---
Overall name mismatch rate (Med vs PECOS):  2.66%
Overall state change rate:                   15.60%
Providers with multiple PECOS enrollments:   11.7%
Providers with name changes across enrl:     0.00%

Correlation (year vs mismatch rate):          -0.945
  → Older enrollments DO have higher mismatch rates (negative correlation)
Correlation (year vs state change rate):      0.742

--- Implications for Record Linkage ---
1. State-based blocking will miss ~15.6% of providers who moved
2. Name-based matching should use fuzzy methods to handle 2.7% name drift
3. 0.0% of providers have PECOS name changes — use most-recent enrollment

--- Artifacts ---
  name_mismatch_by_year.csv                          1.0 KB
  state_drift_by_year.csv                            0.5 KB
  state_drift_trend.png                             81.1 KB
  temporal_mismatch_trend.png                      132.9 KB
  tenure_cohort_analysis.csv                         0
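
The correlations in 5.10 are unweighted Pearson coefficients over year-level rates, so 2003 with 5,859 links counts as much as 2010 with 73,436. The sketch below compares that figure with a rank-based and a count-weighted variant. It uses the 2003-2014 rows printed in 5.9, since the printed table stops there. `weighted_pearson` is a hypothetical helper, not part of the notebook.

```python
import numpy as np
import pandas as pd

# Rows for 2003-2014 copied from the year-level table printed in 5.9.
ys = pd.DataFrame({
    "ENRLMT_YEAR":   list(range(2003, 2015)),
    "total":         [5859, 70057, 56807, 33209, 39593, 43134,
                      38273, 73436, 47627, 39073, 31955, 36830],
    "mismatch_rate": [3.86, 3.40, 3.46, 3.35, 3.19, 3.27,
                      3.08, 2.76, 2.95, 3.08, 2.88, 2.62],
})

def weighted_pearson(x: pd.Series, y: pd.Series, w: pd.Series) -> float:
    """Pearson correlation where each row counts in proportion to its weight."""
    mx, my = np.average(x, weights=w), np.average(y, weights=w)
    cov = np.average((x - mx) * (y - my), weights=w)
    var_x = np.average((x - mx) ** 2, weights=w)
    var_y = np.average((y - my) ** 2, weights=w)
    return float(cov / np.sqrt(var_x * var_y))

pearson = ys["ENRLMT_YEAR"].corr(ys["mismatch_rate"])
# Spearman computed as Pearson on ranks, which avoids a SciPy dependency.
spearman = ys["ENRLMT_YEAR"].rank().corr(ys["mismatch_rate"].rank())
weighted = weighted_pearson(ys["ENRLMT_YEAR"], ys["mismatch_rate"], ys["total"])
print(f"pearson={pearson:.3f}  spearman={spearman:.3f}  weighted={weighted:.3f}")
```

Agreement among the three would suggest the trend is not driven by a few small or extreme years. These figures cover only the twelve printed years and are not comparable to the full-range values in 5.10.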