<!--
✒ Metadata
    - Title: {Notebook Title} (SME Episode {X.X} - v1.0)
    - File Name: {NN}-{notebook-name}.ipynb
    - Relative Path: research-stacks/{X.0}-{section}/{X.X}-{episode}/notebooks/{NN}-{notebook-name}.ipynb
    - Artifact Type: notebook
    - Version: 1.0.0
    - Date: {YYYY-MM-DD}
    - Update: {Day, Month DD, YYYY}
    - Author: Dennis 'dnoice' Smaltz
    - A.I. Acknowledgement: Anthropic - Claude Opus 4.5
    - Signature: ︻デ═─── ✦ ✦ ✦ | Aim Twice, Shoot Once!

✒ Description:
    {Brief description of the notebook's purpose and what analysis it performs.
    How does this notebook contribute to the episode's overall analysis?}

✒ Key Features:
    - Feature 1: {Description}
    - Feature 2: {Description}
    - Feature 3: {Description}
    - Feature 4: {Description}
    - Feature 5: {Description}

✒ Usage Instructions:
    Run cells sequentially after activating the project virtual environment.
    Ensure data files are present in ../data/raw/ before execution.

✒ Other Important Information:
    - Dependencies: pandas, numpy, matplotlib, seaborn, plotly, scipy
    - Input Data: ../data/raw/{source_files}
    - Output Data: ../data/processed/{output_files}
    - Output Figures: ../visualizations/figures/{figure_files}
    - Related Notebooks: {list related notebooks}
    - Article Section: {which article sections this supports}
---------
-->

# Episode {X.X}: {Episode Title}

## Notebook {NN}: {Notebook Purpose}

> **Series:** Sixth Mass Extinction | **Section:** {X.0} - {Section Name} | **Episode:** {X.X}
>
> **Article:** [ARTICLE_{X.X}.md](../article/ARTICLE_{X.X}.md) | **Data:** [MANIFEST.md](../data/metadata/MANIFEST.md)

---

## Table of Contents

1. [Setup & Configuration](#1-setup--configuration)
2. [Data Acquisition](#2-data-acquisition)
3. [Data Processing](#3-data-processing)
4. [Exploratory Analysis](#4-exploratory-analysis)
5. [Statistical Analysis](#5-statistical-analysis)
6. [Visualizations](#6-visualizations)
7. [Key Findings](#7-key-findings)
8. [Export & Documentation](#8-export--documentation)

---

## 1. Setup & Configuration

In [None]:
# =============================================================================
# STANDARD IMPORTS
# =============================================================================

# Data manipulation
import pandas as pd
import numpy as np

# Visualization
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go

# Statistics
from scipy import stats

# Utilities
from pathlib import Path
import warnings
import json
from datetime import datetime

# Suppress warnings
warnings.filterwarnings('ignore')

# =============================================================================
# CONFIGURATION
# =============================================================================

# Episode metadata
EPISODE = '{X.X}'
EPISODE_NAME = '{Episode Name}'
SECTION = '{X.0}'
SECTION_NAME = '{Section Name}'

# Paths
PROJECT_ROOT = Path('../../../..')
EPISODE_ROOT = Path('..')

DATA_RAW = EPISODE_ROOT / 'data' / 'raw'
DATA_PROCESSED = EPISODE_ROOT / 'data' / 'processed'
DATA_METADATA = EPISODE_ROOT / 'data' / 'metadata'

VIZ_FIGURES = EPISODE_ROOT / 'visualizations' / 'figures'
VIZ_INTERACTIVE = EPISODE_ROOT / 'visualizations' / 'interactive'
VIZ_EXPORTS = EPISODE_ROOT / 'visualizations' / 'exports'

# Ensure directories exist
for path in [DATA_RAW, DATA_PROCESSED, DATA_METADATA, VIZ_FIGURES, VIZ_INTERACTIVE, VIZ_EXPORTS]:
    path.mkdir(parents=True, exist_ok=True)

# =============================================================================
# SME STYLE CONFIGURATION
# =============================================================================

# Load SME color palette
import sys
sys.path.insert(0, str(PROJECT_ROOT / 'research-stacks' / '_shared' / 'styles'))
try:
    from color_palettes import *
    print('SME color palette loaded')
except ImportError:
    print('Warning: SME color palette not found, using defaults')
    CRISIS_RED = '#d32f2f'
    DATA_BLUE = '#1976d2'
    HOPE_GREEN = '#388e3c'

# Matplotlib configuration
plt.style.use('seaborn-v0_8-whitegrid')
plt.rcParams.update({
    'figure.figsize': (12, 6),
    'figure.dpi': 100,
    'font.size': 12,
    'axes.titlesize': 14,
    'axes.titleweight': 'bold',
    'axes.labelsize': 12,
    'axes.spines.top': False,
    'axes.spines.right': False,
    'savefig.dpi': 300,
    'savefig.bbox': 'tight'
})

print(f'\n{"="*60}')
print(f'SME Episode {EPISODE}: {EPISODE_NAME}')
print(f'Section {SECTION}: {SECTION_NAME}')
print(f'{"="*60}')
print(f'Setup complete - {datetime.now().strftime("%Y-%m-%d %H:%M:%S")}')

---

## 2. Data Acquisition

### 2.1 Data Source Overview

| Source | Type | Tier | File | Description |
|--------|------|------|------|-------------|
| {Source 1} | {Type} | {1-4} | `{filename}` | {Description} |
| {Source 2} | {Type} | {1-4} | `{filename}` | {Description} |

In [None]:
# =============================================================================
# LOAD RAW DATA
# =============================================================================

# Primary dataset
# df_raw = pd.read_csv(DATA_RAW / 'primary_data.csv')

# Secondary dataset (if applicable)
# df_secondary = pd.read_csv(DATA_RAW / 'secondary_data.csv')

# Display data info
# print(f'Primary dataset shape: {df_raw.shape}')
# print(f'Columns: {list(df_raw.columns)}')
# df_raw.head()
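
The usage instructions ask that raw files be present in `../data/raw/` before execution. A minimal sketch of a pre-load check follows, assuming the placeholder file names used in the commented-out loads; `missing_raw_files` is a hypothetical helper, not part of the template.

```python
from pathlib import Path


def missing_raw_files(data_dir, expected_names):
    """Return the expected file names that are not present in data_dir."""
    data_dir = Path(data_dir)
    return [name for name in expected_names if not (data_dir / name).is_file()]


# Hypothetical file names, mirroring the commented-out loads in this cell.
expected = ['primary_data.csv', 'secondary_data.csv']
missing = missing_raw_files(Path('..') / 'data' / 'raw', expected)
if missing:
    print(f'Missing raw files: {missing}')
else:
    print('All expected raw files are present')
```

Running a check like this before `pd.read_csv` gives one clear message listing every absent file, rather than a `FileNotFoundError` for only the first one.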

In [None]:
# =============================================================================
# DATA QUALITY CHECK
# =============================================================================

# def data_quality_report(df, name='Dataset'):
#     """Generate data quality report."""
#     print(f'\n{"="*60}')
#     print(f'Data Quality Report: {name}')
#     print(f'{"="*60}')
#     print(f'Shape: {df.shape[0]} rows × {df.shape[1]} columns')
#     print(f'\nMissing Values:')
#     missing = df.isnull().sum()
#     missing_pct = (missing / len(df)) * 100
#     for col in df.columns:
#         if missing[col] > 0:
#             print(f'  {col}: {missing[col]} ({missing_pct[col]:.1f}%)')
#     print(f'\nData Types:')
#     print(df.dtypes)
#     return df.describe()

# data_quality_report(df_raw, 'Primary Dataset')

---

## 3. Data Processing

### 3.1 Data Cleaning

In [None]:
# =============================================================================
# DATA CLEANING
# =============================================================================

# Create working copy
# df = df_raw.copy()

# Handle missing values
# df = df.dropna(subset=['required_column'])
# df['optional_column'] = df['optional_column'].fillna(0)

# Remove duplicates
# df = df.drop_duplicates()

# Type conversions
# df['date'] = pd.to_datetime(df['date'])
# df['category'] = df['category'].astype('category')

# print(f'After cleaning: {df.shape[0]} rows ({len(df_raw) - len(df)} removed)')

### 3.2 Feature Engineering

In [None]:
# =============================================================================
# FEATURE ENGINEERING
# =============================================================================

# Temporal features
# df['year'] = df['date'].dt.year
# df['decade'] = (df['year'] // 10) * 10

# Calculated metrics
# df['rate'] = df['count'] / df['total'] * 100

# Categorical encoding
# df['region_code'] = df['region'].map({'North': 1, 'South': 2, 'East': 3, 'West': 4})

# Aggregations
# df_grouped = df.groupby(['year', 'category']).agg({
#     'value': ['sum', 'mean', 'count']
# }).reset_index()

# print(f'Features added: {list(df.columns)}')
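
Passing a list of functions to `.agg()` on a dict, as in the aggregation above, produces two-level column labels, which `to_csv` writes as two header rows. One possible alternative is pandas named aggregation, sketched here on synthetic rows; the `_demo` names and output column names are illustrative assumptions.

```python
import pandas as pd

# Synthetic rows for illustration only; not project data.
df_demo = pd.DataFrame({
    'year': [2000, 2000, 2001, 2001],
    'category': ['A', 'A', 'A', 'B'],
    'value': [1.0, 3.0, 5.0, 7.0],
})

# Named aggregation yields flat, self-describing column names.
df_grouped_demo = (
    df_demo.groupby(['year', 'category'])
    .agg(
        value_sum=('value', 'sum'),
        value_mean=('value', 'mean'),
        value_count=('value', 'count'),
    )
    .reset_index()
)
print(list(df_grouped_demo.columns))
```

The flat column names survive a CSV round trip unchanged, which keeps the saved aggregated file easier to reload.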

In [None]:
# =============================================================================
# SAVE PROCESSED DATA
# =============================================================================

# Save main processed dataset
# df.to_csv(DATA_PROCESSED / f'{EPISODE}_processed.csv', index=False)
# print(f'Saved: {DATA_PROCESSED / f"{EPISODE}_processed.csv"}')

# Save aggregated dataset
# df_grouped.to_csv(DATA_PROCESSED / f'{EPISODE}_aggregated.csv', index=False)
# print(f'Saved: {DATA_PROCESSED / f"{EPISODE}_aggregated.csv"}')

---

## 4. Exploratory Analysis

### 4.1 Summary Statistics

In [None]:
# =============================================================================
# SUMMARY STATISTICS
# =============================================================================

# Descriptive statistics
# df.describe()

In [None]:
# =============================================================================
# KEY METRICS CALCULATION
# =============================================================================

# Example: Calculate E/MSY (extinctions per million species-years)
# def calculate_e_msy(extinctions, species_count, years):
#     """Calculate extinction rate in E/MSY."""
#     species_years = species_count * years
#     return extinctions / species_years * 1_000_000

# e_msy = calculate_e_msy(
#     extinctions=df['extinctions'].sum(),
#     species_count=df['species'].nunique(),
#     years=df['year'].nunique()
# )
# print(f'E/MSY: {e_msy:.2f}')
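
As a worked example of the E/MSY arithmetic, the sketch below applies the same formula to hypothetical round numbers (10 extinctions among 10,000 species over 100 years, i.e. 1,000,000 species-years). The input validation is an added assumption, not part of the template function.

```python
def calculate_e_msy(extinctions, species_count, years):
    """Extinctions per million species-years (E/MSY)."""
    species_years = species_count * years
    if species_years <= 0:
        # Added guard (assumption): avoid dividing by zero or a negative span.
        raise ValueError('species_count * years must be positive')
    return extinctions / species_years * 1_000_000


# Hypothetical inputs chosen for easy arithmetic, not real estimates.
e_msy_demo = calculate_e_msy(extinctions=10, species_count=10_000, years=100)
print(f'E/MSY: {e_msy_demo:.2f}')  # prints "E/MSY: 10.00"
```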

### 4.2 Distribution Analysis

In [None]:
# =============================================================================
# DISTRIBUTION PLOTS
# =============================================================================

# fig, axes = plt.subplots(1, 3, figsize=(15, 4))

# Histogram
# axes[0].hist(df['value'], bins=30, color=DATA_BLUE, edgecolor='white')
# axes[0].set_title('Distribution of Values')
# axes[0].set_xlabel('Value')
# axes[0].set_ylabel('Frequency')

# Box plot
# sns.boxplot(data=df, y='value', ax=axes[1], color=DATA_BLUE)
# axes[1].set_title('Value Distribution')

# QQ plot
# stats.probplot(df['value'], dist='norm', plot=axes[2])
# axes[2].set_title('Q-Q Plot')

# plt.tight_layout()
# plt.show()

---

## 5. Statistical Analysis

### 5.1 Trend Analysis

In [None]:
# =============================================================================
# TREND ANALYSIS
# =============================================================================

# def calculate_trend(x, y):
#     """Calculate linear trend with statistics."""
#     slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)
#     return {
#         'slope': slope,
#         'intercept': intercept,
#         'r_squared': r_value**2,
#         'p_value': p_value,
#         'std_err': std_err
#     }

# trend = calculate_trend(df['year'], df['value'])
# print(f'Trend Analysis:')
# print(f'  Slope: {trend["slope"]:.4f} per year')
# print(f'  R²: {trend["r_squared"]:.4f}')
# print(f'  P-value: {trend["p_value"]:.4e}')
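
To sanity-check the trend helper before real data is loaded, it can be run on a series with a known slope. The sketch below uses a synthetic line (slope 2, small alternating offsets) and reads the named attributes of the `scipy.stats.linregress` result; the data is invented for illustration.

```python
import numpy as np
from scipy import stats

# Synthetic series for illustration: y = 2x + 1 plus small fixed offsets.
years = np.arange(2000, 2010)
values = 2.0 * (years - 2000) + 1.0 + np.array([0.1, -0.1] * 5)

# linregress returns a result object with named fields.
result = stats.linregress(years, values)
print(f'Slope: {result.slope:.4f} per year')
print(f'R²: {result.rvalue ** 2:.4f}')
print(f'P-value: {result.pvalue:.4e}')
```

A recovered slope close to 2 suggests the inputs are passed in the right order (x first, then y).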

### 5.2 Correlation Analysis

In [None]:
# =============================================================================
# CORRELATION ANALYSIS
# =============================================================================

# Calculate correlation matrix
# numeric_cols = df.select_dtypes(include=[np.number]).columns
# corr_matrix = df[numeric_cols].corr()

# Visualize
# fig, ax = plt.subplots(figsize=(10, 8))
# sns.heatmap(corr_matrix, annot=True, cmap='RdBu_r', center=0, ax=ax)
# ax.set_title('Correlation Matrix')
# plt.tight_layout()
# plt.show()

---

## 6. Visualizations

### 6.1 Figure 1: {Primary Visualization Title}

**Purpose:** {What this figure shows and why it matters}

**Article Section:** {Which section of the article this supports}

In [None]:
# =============================================================================
# FIGURE 1: {TITLE}
# =============================================================================

# fig, ax = plt.subplots(figsize=(12, 6))

# Main plot
# ax.plot(df['year'], df['value'], color=CRISIS_RED, linewidth=2, label='Observed')

# Trend line
# x_trend = np.array(df['year'])
# y_trend = trend['slope'] * x_trend + trend['intercept']
# ax.plot(x_trend, y_trend, '--', color='gray', linewidth=1.5, label='Trend')

# Labels and formatting
# ax.set_title('Figure 1: {Title}', fontsize=14, fontweight='bold')
# ax.set_xlabel('Year')
# ax.set_ylabel('Value')
# ax.legend(loc='upper left')
# ax.grid(True, alpha=0.3)

# Save
# fig.savefig(VIZ_FIGURES / 'fig01-{name}.png', dpi=300, bbox_inches='tight')
# fig.savefig(VIZ_EXPORTS / 'fig01-{name}.png', dpi=300, bbox_inches='tight')
# plt.show()
# print('Saved: fig01-{name}.png')

### 6.2 Figure 2: {Secondary Visualization Title}

In [None]:
# =============================================================================
# FIGURE 2: {TITLE}
# =============================================================================

# {Visualization code}

### 6.3 Interactive Visualization

In [None]:
# =============================================================================
# INTERACTIVE VISUALIZATION
# =============================================================================

# fig = px.line(
#     df, 
#     x='year', 
#     y='value',
#     title=f'Episode {EPISODE}: {EPISODE_NAME}',
#     labels={'year': 'Year', 'value': 'Value'},
#     color_discrete_sequence=[CRISIS_RED]
# )

# fig.update_layout(
#     template='plotly_white',
#     hovermode='x unified'
# )

# fig.write_html(VIZ_INTERACTIVE / f'{EPISODE}_interactive.html')
# fig.show()
# print(f'Saved: {EPISODE}_interactive.html')

---

## 7. Key Findings

### 7.1 Summary Statistics Table

In [None]:
# =============================================================================
# KEY FINDINGS SUMMARY
# =============================================================================

# findings = {
#     'Episode': EPISODE,
#     'Episode Name': EPISODE_NAME,
#     'Analysis Date': datetime.now().strftime('%Y-%m-%d'),
#     'Sample Size': len(df),
#     'Time Period': f"{df['year'].min()}-{df['year'].max()}",
#     'Primary Metric': '{value}',
#     'Trend (per year)': f"{trend['slope']:.4f}",
#     'R-squared': f"{trend['r_squared']:.4f}",
#     'P-value': f"{trend['p_value']:.4e}"
# }

# findings_df = pd.DataFrame(list(findings.items()), columns=['Metric', 'Value'])
# findings_df

### 7.2 Key Insights

| Finding | Value | Confidence | Implication |
|---------|-------|------------|-------------|
| {Finding 1} | {Value} | {High/Medium/Low} | {What it means} |
| {Finding 2} | {Value} | {High/Medium/Low} | {What it means} |
| {Finding 3} | {Value} | {High/Medium/Low} | {What it means} |

### 7.3 Limitations and Caveats

- {Limitation 1}
- {Limitation 2}
- {Limitation 3}

---

## 8. Export & Documentation

In [None]:
# =============================================================================
# EXPORT SUMMARY DATA
# =============================================================================

# Export findings
# findings_df.to_csv(DATA_PROCESSED / f'{EPISODE}_findings.csv', index=False)

# Export as JSON for web integration
# with open(DATA_PROCESSED / f'{EPISODE}_findings.json', 'w') as f:
#     json.dump(findings, f, indent=2)

print(f'\n{"="*60}')
print(f'Analysis Complete: Episode {EPISODE}')
print(f'{"="*60}')
print('\nOutputs:')
print(f'  Data: {DATA_PROCESSED}')
print(f'  Figures: {VIZ_FIGURES}')
print(f'  Interactive: {VIZ_INTERACTIVE}')
print('\nNext Steps:')
print('  1. Review figures for article integration')
print(f'  2. Update ARTICLE_{EPISODE}.md with findings')
print('  3. Cross-check with methodology documentation')
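
The JSON export in the cell above works when every value in `findings` is a built-in type. If a findings dict later holds raw NumPy scalars (for example an `np.int64` from an aggregation), `json.dump` raises `TypeError`. A minimal sketch of one workaround, assuming a hypothetical `to_builtin` fallback passed through the `default` parameter:

```python
import json
import numpy as np


def to_builtin(obj):
    """Fallback for json.dumps: convert NumPy scalars and arrays to built-in types."""
    if isinstance(obj, np.generic):
        return obj.item()
    if isinstance(obj, np.ndarray):
        return obj.tolist()
    raise TypeError(f'Object of type {type(obj).__name__} is not JSON serializable')


# Hypothetical findings with NumPy values, for illustration only.
findings_demo = {'Sample Size': np.int64(120), 'Mean Value': np.float32(2.5)}
payload = json.dumps(findings_demo, default=to_builtin, indent=2)
print(payload)
```

The same `default=to_builtin` argument can be passed to `json.dump` when writing to a file.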

---

## Notebook Metadata

| Field | Value |
|-------|-------|
| **Episode** | {X.X} - {Episode Name} |
| **Notebook** | {NN} - {Notebook Purpose} |
| **Type** | {Article-Aligned / Novel Analysis} |
| **Version** | 1.0.0 |
| **Last Run** | {YYYY-MM-DD} |
| **Python** | 3.11+ |
| **Dependencies** | pandas, numpy, matplotlib, seaborn, plotly, scipy |

---

> **︻デ═─── ✦ ✦ ✦ | Aim Twice, Shoot Once!**