# 🌿 Carbon Offset Quality Screener - Demo Notebook

Full pipeline: fetch → clean → score → flag → report.
Falls back to synthetic data that mirrors the Verra VCS schema when the live API is unavailable.

*Basis: ICVCM Core Carbon Principles (2023), Verra VCS Standard v4.1*
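
The live-or-synthetic fallback mentioned above can be illustrated with a minimal, dependency-free sketch. This is not the implementation of `fetch_verra_projects`, whose internals are not shown in this notebook; the helper names (`make_synthetic_projects`, `load_projects`, `unavailable_api`) and the record fields are hypothetical and chosen only for illustration.

```python
import random

def make_synthetic_projects(n=5, seed=42):
    """Generate placeholder project records (all field names are hypothetical)."""
    rng = random.Random(seed)  # seeded so the demo data is reproducible
    types = ["REDD+", "Renewable Energy", "Cookstoves"]
    projects = []
    for i in range(n):
        issued = rng.randint(10_000, 1_000_000)
        retired = rng.randint(0, issued)  # retired credits never exceed issued
        projects.append({
            "project_id": f"SYN-{i:04d}",
            "project_type": rng.choice(types),
            "credits_issued": issued,
            "credits_retired": retired,
        })
    return projects

def load_projects(fetch_live, n_synthetic=5):
    """Try the live source first; fall back to synthetic records on any failure."""
    try:
        return fetch_live(), "live"
    except Exception as exc:  # deliberately broad: network, auth, parsing errors
        print(f"Live fetch failed ({exc!r}); using synthetic data")
        return make_synthetic_projects(n_synthetic), "synthetic"

def unavailable_api():
    """Stand-in for a registry call that fails (hypothetical)."""
    raise ConnectionError("registry API unreachable")

projects, source = load_projects(unavailable_api)
print(source, len(projects))
```

Returning the data source alongside the records makes it easy to label any downstream report as based on synthetic rather than registry data.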

In [None]:
import sys, warnings
sys.path.insert(0, '..')
warnings.filterwarnings('ignore')

import pandas as pd
import plotly.io as pio
pio.renderers.default = 'notebook_connected'

from src.fetcher import fetch_verra_projects
from src.cleaner import clean_project_data
from src.scorer import compute_quality_index, get_score_summary
from src.red_flags import detect_red_flags, get_flag_statistics
from src.visualizer import (plot_qi_distribution, plot_retirement_vs_qi,
    plot_flag_heatmap, plot_portfolio_risk_summary, generate_html_report)
from pathlib import Path
print("✓ Imports OK")

## 1 - Load Data

In [None]:
df_raw = fetch_verra_projects(project_types=None, min_credits_issued=0,
                               raw_data_dir=Path('../data/raw'))
print(f"Projects loaded: {len(df_raw)}")
df_raw.head(3)

## 2 - Clean & Standardize

In [None]:
df_clean = clean_project_data(df_raw)
print(f"Projects after cleaning: {len(df_clean)}")
print(df_clean['project_type'].value_counts().to_string())

## 3 - Quality Index

In [None]:
df_scored = compute_quality_index(df_clean)
print(get_score_summary(df_scored).to_string())
print(f"\nMean QI: {df_scored['quality_index'].mean():.1f} | Tiers: {df_scored['quality_tier'].value_counts().to_dict()}")

In [None]:
plot_qi_distribution(df_scored).show()

In [None]:
plot_retirement_vs_qi(df_scored).show()

## 4 - Red Flag Detection

In [None]:
df_flagged = detect_red_flags(df_scored)
stats = get_flag_statistics(df_flagged)
print(stats[['Flag','Severity','N Projects','% of Total']].to_string(index=False))
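
To make the summary table above concrete, here is a rough sketch of how such per-flag counts could be derived from boolean flag columns. The column names (`flag_low_retirement`, `flag_thin_buffer`), the severity labels assigned to them, and the toy data are assumptions for illustration; the actual output of `detect_red_flags` and `get_flag_statistics` may be structured differently.

```python
import pandas as pd

# Hypothetical flag columns and toy data, not real registry records.
demo_flags = pd.DataFrame({
    "project_id": ["P1", "P2", "P3", "P4"],
    "flag_low_retirement": [True, False, True, False],
    "flag_thin_buffer": [False, False, True, False],
})

# Assumed severity per flag (illustrative only).
demo_severity = {"flag_low_retirement": "HIGH", "flag_thin_buffer": "CRITICAL"}

flag_cols = list(demo_severity)
counts = demo_flags[flag_cols].sum()  # True counts as 1, so this is projects per flag

demo_stats = pd.DataFrame({
    "Flag": flag_cols,
    "Severity": [demo_severity[c] for c in flag_cols],
    "N Projects": [int(counts[c]) for c in flag_cols],
    "% of Total": [round(100 * counts[c] / len(demo_flags), 1) for c in flag_cols],
})
print(demo_stats.to_string(index=False))
```

The sketch uses `demo_`-prefixed names so that running it inside this notebook does not overwrite `stats` or `df_flagged`.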

In [None]:
plot_flag_heatmap(df_flagged).show()

In [None]:
plot_portfolio_risk_summary(df_flagged).show()

## 5 - Inspect Results

In [None]:
cols = [c for c in ['project_id','project_type','country','quality_index',
         'quality_tier','retirement_rate','buffer_pool_ratio',
         'flags_count','max_severity','flags_summary'] if c in df_flagged.columns]

print("TOP 10 HIGHEST QUALITY:")
display(df_flagged.sort_values('quality_index', ascending=False).head(10)[cols])

print("\nCRITICAL FLAGS:")
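# Filter to projects whose worst flag is CRITICAL; fall back to an empty
# frame (same columns) if the severity column is missing.
if 'max_severity' in df_flagged.columns:
    critical = df_flagged[df_flagged['max_severity'] == 'CRITICAL']
else:
    critical = df_flagged.head(0)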
display(critical[cols].head(10))

## 6 - Export

In [None]:
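out_dir = Path('../outputs')
out_dir.mkdir(parents=True, exist_ok=True)  # to_csv does not create missing directories
report_path = generate_html_report(df_flagged,
    output_path=out_dir / 'screening_report.html', top_n=15)
df_flagged.to_csv(out_dir / 'projects_screened_full.csv', index=False)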
print(f"Report: {report_path}")
print(f"CSV: {len(df_flagged)} rows exported")

---
## Methodological Notes

| Dimension | Weight | Rationale |
|---|---|---|
| Retirement Rate | 25% | Strongest demand signal |
| Buffer Pool | 20% | Reversal insurance adequacy |
| Vintage Freshness | 15% | Recency reduces regulatory risk |
| Issuance Consistency | 15% | Operational stability |
| Project Longevity | 10% | Track record |
| Type Risk | 10% | REDD+ faces higher permanence risk |
| Verification Diversity | 5% | Auditor independence |

Scores are relative rankings, not absolute certifications. REDD+ additionality requires field verification beyond registry data.
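
As a rough illustration of how the weights in the table above could combine into a single index, the sketch below computes a weighted sum. It assumes each dimension has already been normalized to a sub-score in [0, 1] and that the index is reported on a 0-100 scale; both are assumptions, and the dictionary keys, the `composite_score` helper, and the example values are hypothetical rather than taken from `compute_quality_index`.

```python
# Weights copied from the table above; the key names are hypothetical.
WEIGHTS = {
    "retirement_rate": 0.25,
    "buffer_pool": 0.20,
    "vintage_freshness": 0.15,
    "issuance_consistency": 0.15,
    "project_longevity": 0.10,
    "type_risk": 0.10,
    "verification_diversity": 0.05,
}

# Sanity check: the weights should sum to 100%.
assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9

def composite_score(sub_scores):
    """Weighted sum of sub-scores in [0, 1], scaled to 0-100 (assumed scale)."""
    missing = set(WEIGHTS) - set(sub_scores)
    if missing:
        raise KeyError(f"Missing sub-scores: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0.0 <= sub_scores[name] <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {sub_scores[name]}")
    return 100 * sum(WEIGHTS[name] * sub_scores[name] for name in WEIGHTS)

# Made-up sub-scores for a single illustrative project.
example = {
    "retirement_rate": 0.8,
    "buffer_pool": 0.5,
    "vintage_freshness": 0.6,
    "issuance_consistency": 0.7,
    "project_longevity": 0.4,
    "type_risk": 0.3,
    "verification_diversity": 1.0,
}
print(f"Composite score: {composite_score(example):.1f}")
```

Because the result is a weighted average of normalized inputs, it only ranks projects relative to whatever normalization was applied, which is consistent with the caveat that scores are not absolute certifications.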

**References:** ICVCM CCP (2023), Verra VCS v4.1, Berkeley Carbon Trading Project ratings methodology.