# 07 - Master Quality Report

**Purpose:** Aggregate all pipeline outputs into a comprehensive quality audit report.

**Outputs:**
- `reports/Master_Audit_Log_YYYYMMDD_HHMMSS.xlsx` (4 sheets)
- Console summary with quality decisions

**References:**
- Cereatti et al. (2024) - Data lineage & SNR
- Winter (2009) - Residual validation
- Rácz et al. (2025) - Calibration layer

---

## Table of Contents

1. [Setup & Data Loading](#setup) - Load all JSON files once
2. [Data Lineage & Provenance](#section-0) - Section 0
3. [Calibration Layer](#section-1) - Section 1 (Rácz)
4. [Temporal Quality](#section-2) - Section 2
5. [Interpolation Transparency](#section-3) - Section 3 (Winter)
6. [Filtering Validation](#section-4) - Section 4 (Winter)
7. [Reference Quality](#section-5) - Section 5
8. [Biomechanics & Outliers](#section-6) - Section 6
9. [Quality Scores](#section-7) - Component breakdown
10. [Decision Matrix](#section-8) - Final ACCEPT/REVIEW/REJECT
11. [Excel Export](#export) - Generate Master Audit Log

---

<a id="setup"></a>
## 1. Setup & Data Loading

Load all JSON files **once** and reuse throughout the notebook.

In [1]:
# ============================================================
# IMPORTS & PATH SETUP
# ============================================================
import os
import sys
import pandas as pd
from datetime import datetime
from IPython.display import display, HTML

# Setup paths
if os.path.basename(os.getcwd()) == 'notebooks':
    PROJECT_ROOT = os.path.abspath(os.path.join(os.getcwd(), ".."))
else:
    PROJECT_ROOT = os.path.abspath(os.getcwd())
SRC_PATH = os.path.join(PROJECT_ROOT, "src")
if SRC_PATH not in sys.path:
    sys.path.insert(0, SRC_PATH)

# Import our utility module
from utils_nb07 import (
    load_all_runs, filter_complete_runs, build_quality_row,
    extract_parameters_flat, export_to_excel, export_schema_json,
    export_schema_markdown, safe_get_path, safe_float, safe_int,
    get_git_hash, print_section_header, PARAMETER_SCHEMA, SECTION_DESCRIPTIONS
)

print(f"Project Root: {PROJECT_ROOT}")
print(f"Git Hash: {get_git_hash(PROJECT_ROOT)}")
print(f"Timestamp: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")

Project Root: c:\Users\drorh\OneDrive - Mobileye\Desktop\gaga
Git Hash: d5c8380
Timestamp: 2026-01-29 14:56:38


In [2]:
# ============================================================
# LOAD ALL DATA (ONCE)
# ============================================================
DERIV_ROOT = os.path.join(PROJECT_ROOT, "derivatives")

# Load all JSON files
print("Loading JSON files...")
all_runs = load_all_runs(DERIV_ROOT)
print(f"Found {len(all_runs)} total runs")

# Filter to complete runs (require step_01 and step_06)
runs_data = filter_complete_runs(all_runs, required_steps=["step_01", "step_06"])
print(f"Complete runs: {len(runs_data)}")

# Show available steps per run (expecting: step_01 through step_06)
print("\nSteps available per run:")
expected_steps = ['step_01', 'step_02', 'step_03', 'step_04', 'step_05', 'step_06']
for run_id, steps in runs_data.items():
    steps_list = sorted(steps.keys())
    missing = [s for s in expected_steps if s not in steps_list]
    print(f"  {run_id[:50]}...")
    print(f"    Found: {steps_list}")
    if missing:
        print(f"    ⚠️ Missing: {missing}")

Loading JSON files...
Found 1 total runs
Complete runs: 1

Steps available per run:
  734_T3_P2_R1_Take 2025-12-30 04.12.54 PM_002...
    Found: ['step_01', 'step_02', 'step_03', 'step_04', 'step_05', 'step_06']
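
The completeness filter can be pictured with a small sketch. It assumes `all_runs` maps each run ID to a dict of per-step payloads keyed by step name, which matches how `runs_data` is iterated above. `keep_complete_runs` and `missing_steps` are hypothetical stand-ins, not the code in `utils_nb07`, and the example data is made up.

```python
def keep_complete_runs(all_runs, required_steps):
    """Keep only runs whose step dict contains every required step."""
    return {
        run_id: steps
        for run_id, steps in all_runs.items()
        if all(step in steps for step in required_steps)
    }


def missing_steps(steps, expected_steps):
    """List expected steps absent from one run, in the expected order."""
    return [step for step in expected_steps if step not in steps]


# Illustrative data only: run names and payloads are invented.
example_runs = {
    "run_A": {"step_01": {}, "step_02": {}, "step_06": {}},
    "run_B": {"step_01": {}, "step_02": {}},
}
complete = keep_complete_runs(example_runs, ["step_01", "step_06"])
print(sorted(complete))  # ['run_A']
print(missing_steps(complete["run_A"], ["step_01", "step_02", "step_03", "step_06"]))  # ['step_03']
```

A run can pass the filter and still lack intermediate steps, which is why the cell above reports missing steps separately.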


In [3]:
# ============================================================
# BUILD MASTER DATAFRAMES (REUSED IN ALL SECTIONS)
# ============================================================

# Quality report DataFrame (aggregated metrics)
quality_rows = [build_quality_row(run_id, steps) for run_id, steps in runs_data.items()]
df_quality = pd.DataFrame(quality_rows)
df_quality = df_quality.sort_values("Quality_Score", ascending=False).reset_index(drop=True)

# Parameter audit DataFrame (raw JSON values)
param_rows = [extract_parameters_flat(run_id, steps) for run_id, steps in runs_data.items()]
df_params = pd.DataFrame(param_rows)

print(f"Quality DataFrame: {len(df_quality)} rows x {len(df_quality.columns)} columns")
print(f"Parameter DataFrame: {len(df_params)} rows x {len(df_params.columns)} columns")

Quality DataFrame: 1 rows x 118 columns
Parameter DataFrame: 1 rows x 78 columns


---

<a id="section-0"></a>
## 2. Section 0: Data Lineage & Provenance

**Purpose:** Ensure recording traceability from raw file to final result (Cereatti et al., 2024)

In [4]:
print_section_header("SECTION 0: DATA LINEAGE & PROVENANCE")

# Display provenance info
cols_s0 = ['Run_ID', 'Subject_ID', 'Session_ID', 'Processing_Date', 'Pipeline_Version']
display(df_quality[cols_s0])

print(f"\nTotal Runs: {len(df_quality)}")
print(f"Subjects: {df_quality['Subject_ID'].nunique()}")
print(f"Sessions: {df_quality['Session_ID'].nunique()}")

SECTION 0: DATA LINEAGE & PROVENANCE


| | Run_ID | Subject_ID | Session_ID | Processing_Date | Pipeline_Version |
|---|---|---|---|---|---|
| 0 | 734_T3_P2_R1_Take 2025-12-30 04.12.54 PM_002 | 734 | T3 | 2026-01-29 11:38 | v2.6_calibration_enhanced |



Total Runs: 1
Subjects: 1
Sessions: 1


---

<a id="section-1"></a>
## 3. Section 1: Rácz Calibration Layer

**Purpose:** Verify the "Ground Truth" of the skeleton setup (Rácz et al., 2025)

**Thresholds:**
- Pointer ≤ 2.0 mm
- Wand ≤ 1.0 mm
- Bone CV ≤ 1.5%
- Static Offset ≤ 15.0°
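
As a rough illustration, the thresholds above can be expressed as upper limits and checked per metric. The dictionary keys and the `check_calibration` helper are hypothetical names chosen for this sketch, and the measured values are illustrative rather than taken from a run.

```python
# Limits copied from the threshold list above; the key names are hypothetical.
CALIBRATION_LIMITS = {
    "pointer_mm": 2.0,
    "wand_mm": 1.0,
    "bone_cv_pct": 1.5,
    "static_offset_deg": 15.0,
}


def check_calibration(measured):
    """Return {metric: passed} for every measured metric that has a limit."""
    return {
        name: value <= CALIBRATION_LIMITS[name]
        for name, value in measured.items()
        if name in CALIBRATION_LIMITS
    }


# Illustrative values, not taken from a real recording.
result = check_calibration({"bone_cv_pct": 0.9, "static_offset_deg": 16.2})
print(result)  # {'bone_cv_pct': True, 'static_offset_deg': False}
```

Metrics without a configured limit are ignored here. A stricter variant could raise on unknown names instead.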

In [5]:
print_section_header("SECTION 1: RÁCZ CALIBRATION LAYER")

cols_s1 = ['Run_ID', 'OptiTrack_Error_mm', 'Bone_CV_%', 'Bone_Status', 
           'Worst_Bone', 'Left_Offset_Deg', 'Right_Offset_Deg', 'Score_Calibration']
display(df_quality[cols_s1])

# Summary
print(f"\nCalibration Summary:")
print(f"  Mean Bone CV: {df_quality['Bone_CV_%'].mean():.3f}%")
print(f"  Max Left Offset: {df_quality['Left_Offset_Deg'].max():.1f}°")
print(f"  Max Right Offset: {df_quality['Right_Offset_Deg'].max():.1f}°")
print(f"  Mean Calibration Score: {df_quality['Score_Calibration'].mean():.1f}/100")

SECTION 1: RÁCZ CALIBRATION LAYER


| | Run_ID | OptiTrack_Error_mm | Bone_CV_% | Bone_Status | Worst_Bone | Left_Offset_Deg | Right_Offset_Deg | Score_Calibration |
|---|---|---|---|---|---|---|---|---|
| 0 | 734_T3_P2_R1_Take 2025-12-30 04.12.54 PM_002 | 0.0 | 0.532 | GOLD | Hips->Spine | 7.67 | 9.89 | 95.0 |



Calibration Summary:
  Mean Bone CV: 0.532%
  Max Left Offset: 7.7°
  Max Right Offset: 9.9°
  Mean Calibration Score: 95.0/100


---

<a id="section-2"></a>
## 4. Section 2: Temporal Quality & Sampling

**Purpose:** Verify sampling rate and recording duration

In [6]:
print_section_header("SECTION 2: TEMPORAL QUALITY & SAMPLING")

# Include step_03 resample validation fields
cols_s2 = ['Run_ID', 'Total_Frames', 'Duration_Sec', 'Sampling_Rate_Hz', 
           'Target_Fs_Hz', 'Time_Grid_Std_Dt', 'Temporal_Status', 'Score_Temporal']
display(df_quality[cols_s2])

print(f"\nTemporal Summary:")
print(f"  Total Frames: {df_quality['Total_Frames'].sum():,}")
print(f"  Total Duration: {df_quality['Duration_Sec'].sum()/60:.1f} minutes")
print(f"  Mean Sampling Rate: {df_quality['Sampling_Rate_Hz'].mean():.2f} Hz")

# Step 03 resample validation
if 'Temporal_Status' in df_quality.columns:
    perfect_count = (df_quality['Temporal_Status'] == 'PERFECT').sum()
    print(f"  Temporal Grid PERFECT: {perfect_count}/{len(df_quality)}")
print(f"  Mean Temporal Score: {df_quality['Score_Temporal'].mean():.1f}/100")

SECTION 2: TEMPORAL QUALITY & SAMPLING


| | Run_ID | Total_Frames | Duration_Sec | Sampling_Rate_Hz | Target_Fs_Hz | Time_Grid_Std_Dt | Temporal_Status | Score_Temporal |
|---|---|---|---|---|---|---|---|---|
| 0 | 734_T3_P2_R1_Take 2025-12-30 04.12.54 PM_002 | 16504 | 137.5 | 120.0 | 120.0 | 0.0 | PERFECT | 100.0 |



Temporal Summary:
  Total Frames: 16,504
  Total Duration: 2.3 minutes
  Mean Sampling Rate: 120.00 Hz
  Temporal Grid PERFECT: 1/1
  Mean Temporal Score: 100.0/100
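
The temporal figures above can be cross-checked by hand: duration is frame count divided by sampling rate, and a perfectly uniform time grid has (near) zero spread in its time steps. The sketch below assumes `Time_Grid_Std_Dt` is the standard deviation of consecutive time differences, which the notebook does not state. Both helper names are hypothetical.

```python
import statistics


def duration_seconds(total_frames, fs_hz):
    """Recording length implied by a frame count at a fixed sampling rate."""
    return total_frames / fs_hz


def time_grid_std_dt(timestamps):
    """Population standard deviation of consecutive time steps."""
    steps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return statistics.pstdev(steps)


fs_hz = 120.0
total_frames = 16504  # frame count reported in the table above
duration = duration_seconds(total_frames, fs_hz)
print(f"{duration:.1f} s = {duration / 60:.1f} min")  # 137.5 s = 2.3 min

# A synthetic uniform grid: the spread is zero up to floating-point rounding.
uniform_grid = [i / fs_hz for i in range(1000)]
print(time_grid_std_dt(uniform_grid) < 1e-9)  # True
```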


---

<a id="section-3"></a>
## 5. Section 3: Gap & Interpolation Transparency

**Purpose:** "No Silent Fixes" (Winter, 2009) - Full disclosure of data reconstruction

**Thresholds:**
- Missing data ≤ 5.0%

In [7]:
print_section_header("SECTION 3: GAP & INTERPOLATION TRANSPARENCY (Winter, 2009)")

# Include step_03 interpolation methods for positions and rotations
cols_s3 = ['Run_ID', 'Raw_Missing_%', 'Interpolation_Method', 
           'Resample_Interp_Positions', 'Resample_Interp_Rotations', 'Score_Interpolation']
display(df_quality[cols_s3])

# Classify interpolation methods
def classify_method(method):
    method_str = str(method).lower()
    if 'quaternion' in method_str or 'slerp' in method_str:
        return '✅ Quaternion (SLERP)'
    elif 'spline' in method_str or 'cubic' in method_str:
        return '✅ Spline/Cubic'
    elif 'linear' in method_str:
        return '🟠 Linear Fallback'
    else:
        return '⚠️ Unknown'

df_quality['Method_Category'] = df_quality['Interpolation_Method'].apply(classify_method)

print(f"\nInterpolation Summary:")
print(f"  Pristine Data (0% missing): {(df_quality['Raw_Missing_%'] == 0).sum()}/{len(df_quality)}")
print(f"  Mean Missing: {df_quality['Raw_Missing_%'].mean():.2f}%")
print(f"  Mean Interpolation Score: {df_quality['Score_Interpolation'].mean():.1f}/100")

# Step 03: Rotation & Position Methods
if 'Resample_Interp_Rotations' in df_quality.columns:
    slerp_count = df_quality['Resample_Interp_Rotations'].str.contains('SLERP', case=False, na=False).sum()
    print(f"\nStep 03 Resample Methods:")
    print(f"  Rotations using SLERP: {slerp_count}/{len(df_quality)}")
    print(f"  Position Methods: {df_quality['Resample_Interp_Positions'].value_counts().to_dict()}")

print(f"\nGap-Fill Method Distribution:")
print(df_quality['Method_Category'].value_counts().to_string())

SECTION 3: GAP & INTERPOLATION TRANSPARENCY (Winter, 2009)


| | Run_ID | Raw_Missing_% | Interpolation_Method | Resample_Interp_Positions | Resample_Interp_Rotations | Score_Interpolation |
|---|---|---|---|---|---|---|
| 0 | 734_T3_P2_R1_Take 2025-12-30 04.12.54 PM_002 | 0.0 | linear_quaternion_normalized | CubicSpline | SLERP | 100.0 |



Interpolation Summary:
  Pristine Data (0% missing): 1/1
  Mean Missing: 0.00%
  Mean Interpolation Score: 100.0/100

Step 03 Resample Methods:
  Rotations using SLERP: 1/1
  Position Methods: {'CubicSpline': 1}

Gap-Fill Method Distribution:
Method_Category
✅ Quaternion (SLERP)    1
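
The category reported above depends on the order of the substring checks in `classify_method`. The sketch below mirrors that branch order with simplified labels (the label strings and the function name are this sketch's own). It shows that a method name containing both `quaternion` and `linear`, such as `linear_quaternion_normalized`, lands in the quaternion branch because that test runs first. A name-based check like this cannot tell true SLERP apart from a normalized linear blend.

```python
def classify_by_first_match(method):
    """Return the first matching category, testing substrings in a fixed order."""
    method_str = str(method).lower()
    if "quaternion" in method_str or "slerp" in method_str:
        return "quaternion"
    if "spline" in method_str or "cubic" in method_str:
        return "spline"
    if "linear" in method_str:
        return "linear"
    return "unknown"


for name in ["linear_quaternion_normalized", "CubicSpline", "linear", None]:
    print(name, "->", classify_by_first_match(name))
# linear_quaternion_normalized -> quaternion
# CubicSpline -> spline
# linear -> linear
# None -> unknown
```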


---

<a id="section-4"></a>
## 6. Section 4: Winter's Residual Validation

**Purpose:** Justify the filtering frequency (Winter, 2009) - Signal vs. Noise separation

**Acceptable Range:** 4.0-12.0 Hz for dance movements
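
Residual analysis compares a raw signal with its filtered version and summarizes the difference as an RMS value. The minimal sketch below assumes `Residual_RMS_mm` is this quantity, which the notebook does not confirm. The `residual_rms` and `cutoff_in_range` helpers are hypothetical, and the sample signals are made up.

```python
import math


def residual_rms(raw, filtered):
    """Root-mean-square difference between a raw and a filtered signal."""
    if len(raw) != len(filtered) or not raw:
        raise ValueError("signals must be non-empty and equal in length")
    squared = [(r - f) ** 2 for r, f in zip(raw, filtered)]
    return math.sqrt(sum(squared) / len(squared))


def cutoff_in_range(cutoff_hz, low_hz=4.0, high_hz=12.0):
    """Check a cutoff frequency against the acceptable range stated above."""
    return low_hz <= cutoff_hz <= high_hz


# Made-up samples in millimetres: the filter halves each peak.
raw = [0.0, 1.0, 0.0, -1.0]
filtered = [0.0, 0.5, 0.0, -0.5]
print(f"{residual_rms(raw, filtered):.3f} mm")  # 0.354 mm
print(cutoff_in_range(6.0), cutoff_in_range(20.0))  # True False
```

In a full residual analysis this RMS would be evaluated over a sweep of candidate cutoff frequencies. The sketch covers a single raw/filtered pair.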

In [8]:
print_section_header("SECTION 4: PER-REGION FILTERING VALIDATION")

# Core filtering columns
cols_s4 = ['Run_ID', 'Filtering_Mode', 'Region_Cutoffs_Applied', 'Residual_RMS_mm']
display(df_quality[cols_s4])

# Per-region Winter validation details
print("\n" + "="*60)
print("WINTER VALIDATION PER REGION")
print("="*60)
cols_s4_detail = ['Run_ID', 'RMS_Knee_Per_Region', 'Diminishing_Per_Region', 'Region_Validation_Status']
display(df_quality[cols_s4_detail])

# Summary statistics
print(f"\nFiltering Summary:")
print(f"  Filtering Mode: {df_quality['Filtering_Mode'].iloc[0] if len(df_quality) > 0 else 'N/A'}")
if 'Residual_RMS_mm' in df_quality.columns:
    print(f"  Mean Residual RMS: {df_quality['Residual_RMS_mm'].mean():.2f} mm")
print(f"  Mean Filtering Score: {df_quality['Score_Filtering'].mean():.1f}/100")

# TRUE RAW SNR - Capture Quality Assessment
print("\n" + "="*60)
print("TRUE RAW SNR (CAPTURE QUALITY)")
print("="*60)
print("Method: Raw data frequency analysis (signal: 0.5-10Hz, noise: 15-50Hz)")
print("This measures inherent capture quality, NOT filtering effectiveness.")
snr_cols = ['Run_ID', 'Raw_SNR_Mean_dB', 'Raw_SNR_Min_dB', 'Raw_SNR_Max_dB', 'Raw_SNR_Status']
snr_cols_available = [c for c in snr_cols if c in df_quality.columns]
if snr_cols_available:
    display(df_quality[snr_cols_available])
    if 'Raw_SNR_Mean_dB' in df_quality.columns:
        mean_snr = df_quality['Raw_SNR_Mean_dB'].mean()
        print(f"\nSNR Summary: Mean = {mean_snr:.1f} dB")
        if mean_snr >= 30:
            print("  Status: EXCELLENT - Publication quality capture")
        elif mean_snr >= 20:
            print("  Status: GOOD - Acceptable for research")
        elif mean_snr >= 15:
            print("  Status: ACCEPTABLE - Review recommended")
        else:
            print("  Status: POOR - Check capture environment")
else:
    print("SNR data not yet computed. Re-run notebook 04_filtering.ipynb.")

SECTION 4: PER-REGION FILTERING VALIDATION


| | Run_ID | Filtering_Mode | Region_Cutoffs_Applied | Residual_RMS_mm |
|---|---|---|---|---|
| 0 | 734_T3_P2_R1_Take 2025-12-30 04.12.54 PM_002 | 3_stage_pipeline | {'head': 14.5, 'upper_proximal': 14.5, 'trunk'... | 0.0 |



WINTER VALIDATION PER REGION


| | Run_ID | RMS_Knee_Per_Region | Diminishing_Per_Region | Region_Validation_Status |
|---|---|---|---|---|
| 0 | 734_T3_P2_R1_Take 2025-12-30 04.12.54 PM_002 | {} | {} | {} |



Filtering Summary:
  Filtering Mode: 3_stage_pipeline
  Mean Residual RMS: 0.00 mm
  Mean Filtering Score: 50.0/100

TRUE RAW SNR (CAPTURE QUALITY)
Method: Raw data frequency analysis (signal: 0.5-10Hz, noise: 15-50Hz)
This measures inherent capture quality, NOT filtering effectiveness.


| | Run_ID | Raw_SNR_Mean_dB | Raw_SNR_Min_dB | Raw_SNR_Max_dB | Raw_SNR_Status |
|---|---|---|---|---|---|
| 0 | 734_T3_P2_R1_Take 2025-12-30 04.12.54 PM_002 | 47.9 | 39.5 | 55.9 | EXCELLENT |



SNR Summary: Mean = 47.9 dB
  Status: EXCELLENT - Publication quality capture
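
The status bands used in the cell above can be restated as a small function. The decibel conversion is the standard power-ratio definition. How the notebook obtains the signal-band and noise-band powers is not shown here, so `snr_db` takes them as plain inputs. Both function names are hypothetical.

```python
import math


def snr_db(signal_power, noise_power):
    """SNR in decibels from two band powers: 10 * log10(signal / noise)."""
    return 10.0 * math.log10(signal_power / noise_power)


def snr_status(mean_snr_db):
    """Map a mean SNR in dB to the status bands printed in the cell above."""
    if mean_snr_db >= 30:
        return "EXCELLENT"
    if mean_snr_db >= 20:
        return "GOOD"
    if mean_snr_db >= 15:
        return "ACCEPTABLE"
    return "POOR"


print(snr_status(47.9))   # EXCELLENT (mean SNR reported above)
print(snr_status(19.99))  # ACCEPTABLE
print(snr_status(snr_db(signal_power=50.0, noise_power=1.0)))  # ACCEPTABLE (about 17 dB)
```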


---

<a id="section-5"></a>
## 7. Section 5: Reference Detection & Stability

**Purpose:** Verify static pose alignment quality

In [9]:
print_section_header("SECTION 5: REFERENCE DETECTION & STABILITY")

cols_s5 = ['Run_ID', 'Ref_Quality_Score', 'Ref_Confidence', 'Score_Reference']
display(df_quality[cols_s5])

print(f"\nReference Summary:")
print(f"  Mean Quality Score: {df_quality['Ref_Quality_Score'].mean():.3f}")
print(f"  HIGH Confidence: {(df_quality['Ref_Confidence'] == 'HIGH').sum()}/{len(df_quality)}")
print(f"  Mean Reference Score: {df_quality['Score_Reference'].mean():.1f}/100")

SECTION 5: REFERENCE DETECTION & STABILITY


| | Run_ID | Ref_Quality_Score | Ref_Confidence | Score_Reference |
|---|---|---|---|---|
| 0 | 734_T3_P2_R1_Take 2025-12-30 04.12.54 PM_002 | 0.895 | HIGH | 100.0 |



Reference Summary:
  Mean Quality Score: 0.895
  HIGH Confidence: 1/1
  Mean Reference Score: 100.0/100


---

<a id="section-6"></a>
## 8. Section 6: Biomechanics & Outlier Analysis

**Purpose:** Gaga-aware movement validation - distinguish extreme dance from tracking errors

In [10]:
print_section_header("SECTION 6: BIOMECHANICS & OUTLIER ANALYSIS")

cols_s6 = ['Run_ID', 'Pipeline_Status', 'Max_Ang_Vel_deg_s', 'Outlier_Frames', 
           'Outlier_%', 'Path_Length_mm', 'Intensity_Index', 'Score_Biomechanics']
display(df_quality[cols_s6])

print(f"\nBiomechanics Summary:")
print(f"  Recordings Processed: {len(df_quality)}")
print(f"  Latest Processing Step: {df_quality['Pipeline_Status'].mode()[0] if len(df_quality) > 0 else 'N/A'}")
print(f"  Mean Outlier %: {df_quality['Outlier_%'].mean():.2f}%")
print(f"  Max Angular Velocity: {df_quality['Max_Ang_Vel_deg_s'].max():.1f} deg/s")
print(f"  Mean Biomechanics Score: {df_quality['Score_Biomechanics'].mean():.1f}/100")

SECTION 6: BIOMECHANICS & OUTLIER ANALYSIS


KeyError: "['Path_Length_mm', 'Intensity_Index'] not in index"
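
The `KeyError` above is raised because `df_quality` has no `Path_Length_mm` or `Intensity_Index` columns. One way to keep the section running is to display only the requested columns that exist, the same pattern the SNR block in Section 4 uses. The sketch below shows the selection logic on plain lists. `split_columns` is a hypothetical helper, and the `available` list is an illustrative stand-in for `list(df_quality.columns)`.

```python
def split_columns(requested, available):
    """Split requested column names into (present, missing), keeping the requested order."""
    available_set = set(available)
    present = [c for c in requested if c in available_set]
    missing = [c for c in requested if c not in available_set]
    return present, missing


requested = ['Run_ID', 'Pipeline_Status', 'Max_Ang_Vel_deg_s', 'Outlier_Frames',
             'Outlier_%', 'Path_Length_mm', 'Intensity_Index', 'Score_Biomechanics']
# Illustrative stand-in for list(df_quality.columns).
available = ['Run_ID', 'Pipeline_Status', 'Max_Ang_Vel_deg_s', 'Outlier_Frames',
             'Outlier_%', 'Score_Biomechanics', 'Quality_Score']

present, missing = split_columns(requested, available)
print(present)
print(missing)  # ['Path_Length_mm', 'Intensity_Index']
```

In the notebook this would become `display(df_quality[present])`, with a printed note listing `missing` so the gap stays visible instead of silently disappearing.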

---

<a id="section-7"></a>
## 9. Quality Score Breakdown

**Purpose:** Show how the overall quality score is computed from components

In [None]:
print_section_header("QUALITY SCORE BREAKDOWN")

score_cols = ['Run_ID', 'Quality_Score', 'Score_Calibration', 'Score_Temporal', 
              'Score_Interpolation', 'Score_Filtering', 'Score_Reference', 
              'Score_Biomechanics', 'Score_Signal']
display(df_quality[score_cols])

print(f"\nComponent Score Summary (Mean):")
print(f"  Calibration (15%):   {df_quality['Score_Calibration'].mean():.1f}")
print(f"  Temporal (10%):      {df_quality['Score_Temporal'].mean():.1f}")
print(f"  Interpolation (15%): {df_quality['Score_Interpolation'].mean():.1f}")
print(f"  Filtering (10%):     {df_quality['Score_Filtering'].mean():.1f}")
print(f"  Reference (15%):     {df_quality['Score_Reference'].mean():.1f}")
print(f"  Biomechanics (15%):  {df_quality['Score_Biomechanics'].mean():.1f}")
print(f"  Signal (20%):        {df_quality['Score_Signal'].mean():.1f}")
print(f"  ─────────────────────────────")
print(f"  OVERALL (weighted):  {df_quality['Quality_Score'].mean():.1f}")
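
The percentages printed above sum to 100, so one plausible reading is that `Quality_Score` is a plain weighted sum of the component scores. The sketch below works through that reading. It is an assumption about `build_quality_row`, not a copy of it, and the component values are made up.

```python
# Weights taken from the labels printed in the cell above.
WEIGHTS = {
    "Score_Calibration": 0.15,
    "Score_Temporal": 0.10,
    "Score_Interpolation": 0.15,
    "Score_Filtering": 0.10,
    "Score_Reference": 0.15,
    "Score_Biomechanics": 0.15,
    "Score_Signal": 0.20,
}


def weighted_quality_score(scores):
    """Weighted sum of 0-100 component scores (assumed aggregation rule)."""
    if abs(sum(WEIGHTS.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Made-up component scores for illustration only.
example_scores = {
    "Score_Calibration": 90.0,
    "Score_Temporal": 100.0,
    "Score_Interpolation": 80.0,
    "Score_Filtering": 50.0,
    "Score_Reference": 100.0,
    "Score_Biomechanics": 70.0,
    "Score_Signal": 90.0,
}
print(f"{weighted_quality_score(example_scores):.1f}")  # 84.0
```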

---

<a id="section-8"></a>
## 10. Section 8: Decision Matrix

**Purpose:** Final verdict combining all QC metrics

**Thresholds:**
- ACCEPT: Score ≥ 80
- REVIEW: Score 60-79
- REJECT: Score < 60
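
These thresholds map onto a three-way decision. The sketch below assumes scores may be fractional, so it treats REVIEW as any score from 60 up to but not including 80. That covers values such as 79.5, which the "60-79" wording leaves open. `research_decision` is a hypothetical name, not the function that fills `Research_Decision`.

```python
def research_decision(score, accept_at=80.0, review_at=60.0):
    """Map a 0-100 quality score to ACCEPT, REVIEW, or REJECT."""
    if score >= accept_at:
        return "ACCEPT"
    if score >= review_at:
        return "REVIEW"
    return "REJECT"


for score in (92.5, 80.0, 79.5, 60.0, 59.9):
    print(score, research_decision(score))
# 92.5 ACCEPT
# 80.0 ACCEPT
# 79.5 REVIEW
# 60.0 REVIEW
# 59.9 REJECT
```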

In [None]:
print_section_header("SECTION 8: DECISION MATRIX")

# Final decision table
cols_decision = ['Run_ID', 'Quality_Score', 'Research_Decision', 'Pipeline_Status']
display(df_quality[cols_decision])

# Decision summary
total = len(df_quality)
accept = (df_quality['Research_Decision'] == 'ACCEPT').sum()
review = (df_quality['Research_Decision'] == 'REVIEW').sum()
reject = (df_quality['Research_Decision'] == 'REJECT').sum()

def pct(count):
    """Share of all runs in percent; 0.0 when no runs were loaded (avoids ZeroDivisionError)."""
    return count / total * 100 if total else 0.0

print(f"\n{'='*60}")
print("DECISION SUMMARY")
print(f"{'='*60}")
print(f"  ✅ ACCEPT: {accept}/{total} ({pct(accept):.1f}%)")
print(f"  ⚠️ REVIEW: {review}/{total} ({pct(review):.1f}%)")
print(f"  ❌ REJECT: {reject}/{total} ({pct(reject):.1f}%)")
print(f"{'='*60}")

# List runs by decision (one loop replaces three near-identical blocks)
decision_labels = [
    ('ACCEPT', '✅ ACCEPTED RUNS:'),
    ('REVIEW', '⚠️ REVIEW REQUIRED:'),
    ('REJECT', '❌ REJECTED RUNS:'),
]
for decision, label in decision_labels:
    subset = df_quality[df_quality['Research_Decision'] == decision]
    if len(subset) > 0:
        print(f"\n{label}")
        for _, row in subset.iterrows():
            print(f"  {row['Run_ID'][:60]} (Score: {row['Quality_Score']})")

---

<a id="export"></a>
## 11. Export to Excel

**Output:** `reports/Master_Audit_Log_YYYYMMDD_HHMMSS.xlsx`

**Sheets:**
1. Executive_Summary - High-level statistics
2. Quality_Report - Aggregated metrics per run
3. Parameter_Audit - All raw JSON values
4. Parameter_Schema - What each parameter means
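
The `YYYYMMDD_HHMMSS` part of the output name corresponds to the `strftime` pattern `%Y%m%d_%H%M%S`. The sketch below builds the path from a fixed datetime so the result is reproducible. `audit_log_path` is a hypothetical helper, and the date is only an example.

```python
import os
from datetime import datetime


def audit_log_path(reports_dir, when):
    """Build the timestamped workbook path for a given moment."""
    stamp = when.strftime("%Y%m%d_%H%M%S")
    return os.path.join(reports_dir, f"Master_Audit_Log_{stamp}.xlsx")


path = audit_log_path("reports", datetime(2026, 1, 29, 14, 56, 38))
print(os.path.basename(path))  # Master_Audit_Log_20260129_145638.xlsx
```

Because the timestamp has one-second resolution, two exports within the same second would share a filename.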

In [None]:
print_section_header("EXPORT TO EXCEL")

# Create output path
REPORTS_DIR = os.path.join(PROJECT_ROOT, "reports")
os.makedirs(REPORTS_DIR, exist_ok=True)

timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
excel_path = os.path.join(REPORTS_DIR, f"Master_Audit_Log_{timestamp}.xlsx")

# Export
output_path = export_to_excel(runs_data, excel_path, PROJECT_ROOT)

print(f"\n✅ Excel Report Created:")
print(f"   {output_path}")
print(f"\n   Sheets:")
print(f"   1. Executive_Summary - High-level statistics")
print(f"   2. Quality_Report - {len(df_quality)} runs with metrics")
print(f"   3. Parameter_Audit - {len(df_params.columns)} parameters extracted")
print(f"   4. Parameter_Schema - Parameter documentation")

In [None]:
# ============================================================
# OPTIONAL: Export schema documentation
# ============================================================
DOCS_DIR = os.path.join(PROJECT_ROOT, "docs", "technical")
CONFIG_DIR = os.path.join(PROJECT_ROOT, "config")

# Export Markdown schema
md_path = os.path.join(DOCS_DIR, "PARAMETER_SCHEMA.md")
export_schema_markdown(md_path)
print(f"✅ Markdown Schema: {md_path}")

# Export JSON schema
json_path = os.path.join(CONFIG_DIR, "report_schema.json")
export_schema_json(json_path)
print(f"✅ JSON Schema: {json_path}")

---

## Summary

This notebook aggregated quality metrics from all pipeline steps and generated:

1. **Console Summary** - Section-by-section quality analysis
2. **Excel Report** - 4-sheet comprehensive audit log
3. **Schema Documentation** - Parameter reference (MD + JSON)

### Next Steps

- Review runs marked as **REVIEW** manually
- Investigate runs marked as **REJECT** for reprocessing
- Use **Parameter_Audit** sheet to trace any issues back to source JSON

In [None]:
print_section_header("NOTEBOOK COMPLETE")
print(f"\nRuns Processed: {len(runs_data)}")
print(f"Excel Output: {excel_path}")
print(f"\nDecision Distribution:")
print(df_quality['Research_Decision'].value_counts().to_string())
print(f"\nMean Quality Score: {df_quality['Quality_Score'].mean():.2f}")