## 🔗 Open This Notebook in Google Colab

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/DavidLangworthy/ds4s/blob/master/days/day05/notebook/day05_starter.ipynb)

# 🔥 Day 5 – Capstone: Emissions and Temperature
You now have a full storytelling toolkit. We will still move in tight loops: inspect each dataset, align them carefully, and build a final two-panel story that avoids dual-axis ambiguity.

## 🗂️ Data Card: CO₂ Emissions + Temperature Anomalies
- **Sources:**
  - Global CO₂ emissions from fossil fuels and cement production (Our World in Data, based on Global Carbon Project).
  - NASA GISTEMP global surface temperature anomalies.
- **Temporal coverage:** 1880–2022 for the combined analysis. The emissions series starts in 1750, but the temperature record, and therefore the overlap, begins in 1880.
- **Units:** CO₂ in gigatonnes per year; temperature anomaly in °C relative to 1951–1980.
- **Last updated:** Files downloaded in November 2024.
- **Method notes:** Annual aggregates; emissions exclude land-use change. Temperature anomalies were already cleaned in Day 1.
- **Caveats:** Emissions data includes methodological revisions; temperature anomalies reflect global mean, not regional extremes.
- **Integrity prompt:** Dual axes can mislead — commit to a layout (like small multiples) that keeps scales transparent.

## Story Scaffold Reminder
- **Claim:** What relationship between emissions and warming do you want to highlight?
- **Evidence:** Which decades or inflection points illustrate it?
- **Visual:** How will you juxtapose the series without confusing scales?
- **Takeaway:** Draft the closing message for your portfolio piece.

## Step 0 · Imports

In [None]:
import matplotlib.pyplot as plt
import pandas as pd

from days.utils import (
    add_story_footer,
    baseline_style,
    check_story_metadata,
    load_data,
    quick_diagnostics,
    save_last_fig,
)

baseline_style()

## Step 1 · Load each dataset

In [None]:
co2 = load_data("data/global_co2.csv")
temperature = load_data(
    "data/GLB.Ts+dSST.csv",
    skiprows=1,
    usecols=["Year", "J-D"],
    na_values=["***"],
).rename(columns={"J-D": "TempAnomaly"})
temperature["TempAnomaly"] = pd.to_numeric(temperature["TempAnomaly"], errors="coerce")
co2.rename(columns={"CO2": "CO2_Gt"}, inplace=True)
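
The temperature load above leans on three reader options: `skiprows`, `usecols`, and `na_values`. The sketch below shows what each one does on a tiny in-memory file. It assumes `load_data` forwards these keyword arguments to `pd.read_csv`, which this notebook does not confirm, and the rows are invented placeholders rather than real GISTEMP values.

```python
import io

import pandas as pd

# Invented rows in the shape of a GISTEMP-style export: a title line,
# then a header row, then one row per year. Values are placeholders.
raw = """Land-Ocean: Global Means
Year,Jan,Feb,J-D
2001,0.10,0.20,0.15
2002,0.30,0.40,0.35
2003,0.50,***,***
"""

sample = pd.read_csv(
    io.StringIO(raw),
    skiprows=1,               # skip the title line above the header row
    usecols=["Year", "J-D"],  # keep the year and the annual mean only
    na_values=["***"],        # read the missing-value marker as NaN
).rename(columns={"J-D": "TempAnomaly"})

print(sample.dtypes)
print(sample)
```

Because `***` is declared as a missing value, the anomaly column parses as floats straight away, so a later `pd.to_numeric(..., errors="coerce")` acts only as a safety net for any other stray strings.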

In [None]:
quick_diagnostics(
    co2,
    expected_columns=["Year", "CO2_Gt"],
    rows_between=(250, 350),
    head_rows=5,
)
quick_diagnostics(
    temperature,
    expected_columns=["Year", "TempAnomaly"],
    rows_between=(140, 160),
    head_rows=5,
)
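
If you are curious what a check like this might do under the hood, here is a minimal sketch. `sketch_diagnostics` is a hypothetical stand-in written for illustration; it is not the implementation of `quick_diagnostics` in `days.utils`, and the toy frame holds placeholder values.

```python
import pandas as pd


def sketch_diagnostics(df, expected_columns, rows_between, head_rows=5):
    """Hypothetical stand-in for the course helper, not its real code."""
    present = [col for col in expected_columns if col in df.columns]
    missing = [col for col in expected_columns if col not in df.columns]
    low, high = rows_between
    report = {
        "missing_columns": missing,
        "row_count": len(df),
        "row_count_ok": low <= len(df) <= high,
        # Count nulls only in the expected columns that actually exist.
        "null_counts": df[present].isna().sum().to_dict(),
    }
    print(df.head(head_rows))
    return report


# Placeholder frame, invented to exercise the checks.
toy = pd.DataFrame({"Year": [2001, 2002, 2003], "TempAnomaly": [0.1, None, 0.3]})
report = sketch_diagnostics(toy, ["Year", "TempAnomaly"], rows_between=(2, 5))
print(report)
```

The point of the row-count window is to catch a silently truncated or duplicated download before it reaches the merge.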

## Step 2 · Align the time span and combine

In [None]:
merged = pd.merge(co2, temperature, on="Year", how="inner")
merged = merged[merged["Year"] >= 1880]
merged = merged.dropna(subset=["CO2_Gt", "TempAnomaly"])
merged = merged.reset_index(drop=True)
quick_diagnostics(
    merged,
    expected_columns=["Year", "CO2_Gt", "TempAnomaly"],
    rows_between=(140, 160),
    head_rows=5,
)
print("Expected: overlapping 1880–2022 range with no missing values.")
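
An inner merge drops unmatched years without telling you which ones. One way to audit that, sketched here on invented placeholder frames, is an outer merge with `indicator=True`, which adds a `_merge` column saying where each row came from.

```python
import pandas as pd

# Placeholder values, chosen only to show the mechanics.
left = pd.DataFrame({"Year": [1878, 1879, 1880, 1881], "CO2_Gt": [1.0, 1.1, 1.2, 1.3]})
right = pd.DataFrame({"Year": [1880, 1881, 1882], "TempAnomaly": [-0.2, -0.1, -0.1]})

# Keep every year from both sides and record its origin in `_merge`.
audit = pd.merge(left, right, on="Year", how="outer", indicator=True)

# Rows that an inner merge would have discarded.
dropped = audit[audit["_merge"] != "both"]
print(dropped[["Year", "_merge"]])

# Rows that an inner merge would have kept.
kept = audit[audit["_merge"] == "both"].drop(columns="_merge")
print(kept)
```

Running the same audit on `co2` and `temperature` would list the pre-1880 emission years as `left_only`, which is the expected reason for the row count shrinking.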

## Step 3 · Interim checkpoint
Plot each series quickly to verify trends before crafting the final story layout.

![Interim preview – emission and temperature lines.](../../plots/day05_solution_plot.png)

In [None]:
fig, axes = plt.subplots(2, 1, figsize=(10, 6), sharex=True)
axes[0].plot(merged["Year"], merged["CO2_Gt"], color="#444e86")
axes[0].set_ylabel("CO₂ emissions (Gt)")
axes[0].set_title("Draft: Emissions keep rising")
axes[1].plot(merged["Year"], merged["TempAnomaly"], color="#d1495b")
axes[1].set_ylabel("Temperature anomaly (°C)")
axes[1].set_xlabel("Year")
axes[1].set_title("Draft: Global temperature anomaly")
plt.tight_layout()
plt.show()

## Step 4 · Story metadata

In [None]:
TITLE = "Emissions climbed sixfold — and the planet warmed a full degree"
SUBTITLE = "Global CO₂ emissions vs. temperature anomaly, 1880–2022"
ANNOTATION = "Post-1950 industrial growth drives both the emissions surge and rapid warming."
SOURCE = "Sources: Global Carbon Project via OWID; NASA GISTEMP"
UNITS = "Units: CO₂ (gigatonnes per year); temperature anomaly (°C vs. 1951–1980)"

check_story_metadata(
    TITLE=TITLE,
    SUBTITLE=SUBTITLE,
    ANNOTATION=ANNOTATION,
    SOURCE=SOURCE,
    UNITS=UNITS,
)

## Step 5 · Rationale for avoiding dual axes
Dual axes can imply a causal relationship where none is quantified and can mislead if scales are chosen arbitrarily. Small multiples (stacked panels with a shared x-axis) keep context while preserving trustworthy scales.
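
To see why the scale choice matters, note that a dual-axis chart draws each value at its fraction of the axis height. The sketch below uses invented placeholder numbers (not the real series) and two candidate ranges for the right-hand axis; the helper names are made up for this illustration.

```python
def axis_fraction(value, lower, upper):
    """Vertical position of a value as a fraction of the axis height."""
    return (value - lower) / (upper - lower)


# Placeholder series, invented for illustration only.
years = [2000, 2010, 2020]
left_series = [10.0, 20.0, 30.0]   # drawn against the left axis
right_series = [0.2, 0.4, 0.9]     # drawn against the right axis
left_limits = (0.0, 40.0)


def relations(right_limits):
    """Say whether the right-axis line sits above or below the left-axis line."""
    out = []
    for left, right in zip(left_series, right_series):
        left_pos = axis_fraction(left, *left_limits)
        right_pos = axis_fraction(right, *right_limits)
        out.append("above" if right_pos > left_pos else "below")
    return out


for right_limits in [(0.0, 1.0), (0.0, 3.0)]:
    print(right_limits, dict(zip(years, relations(right_limits))))
```

With the first range the two lines appear to cross before the last year; with the second they never meet. The data are identical in both cases, so any visual "crossover" is a product of the axis limits alone.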

## Step 6 · Final two-panel narrative

In [None]:
fig, (ax_top, ax_bottom) = plt.subplots(2, 1, figsize=(11, 7), sharex=True, gridspec_kw=dict(height_ratios=[2, 1]))
ax_top.plot(merged["Year"], merged["CO2_Gt"], color="#3f37c9", linewidth=2.3)
ax_top.fill_between(merged["Year"], 0, merged["CO2_Gt"], color="#3f37c9", alpha=0.15)
ax_top.set_ylabel("CO₂ emissions (Gt)")
ax_top.set_title(f"{TITLE}\n{SUBTITLE}")
ax_top.axvline(1950, color="#8d99ae", linestyle="--", linewidth=1)
ax_top.annotate(
    "Great Acceleration",
    xy=(1950, merged.loc[merged["Year"] == 1950, "CO2_Gt"].values[0]),
    xytext=(1910, merged["CO2_Gt"].max() * 0.6),
    arrowprops=dict(arrowstyle="->", color="#3f37c9"),
    fontsize=11,
    color="#3f37c9",
)
ax_bottom.plot(merged["Year"], merged["TempAnomaly"], color="#d1495b", linewidth=2.3)
ax_bottom.axhline(0, color="#8d99ae", linestyle="--", linewidth=1)
ax_bottom.set_ylabel("Temp anomaly (°C)")
ax_bottom.set_xlabel("Year")
ax_bottom.annotate(
    ANNOTATION,
    xy=(1980, merged.loc[merged["Year"] == 1980, "TempAnomaly"].values[0]),
    xytext=(1890, merged["TempAnomaly"].max()),
    arrowprops=dict(arrowstyle="->", color="#d1495b"),
    fontsize=11,
    color="#d1495b",
)
add_story_footer(ax_bottom, source=SOURCE, units=UNITS)
plt.tight_layout(h_pad=2.5)
plt.show()

## Step 7 · Interpret with the scaffold
- **Claim:** Rapid post-1950 emission growth aligns with the steep rise in global temperature anomalies.
- **Evidence:** After 1950, emissions climb from under 10 Gt to over 35 Gt per year, while the temperature panel shows roughly +1 °C of warming over the same window.
- **Visual:** Stacked panels share the timeline, avoiding dual axes while keeping both stories visible.
- **Takeaway:** “Industrial emissions surged after 1950 and the climate responded — limiting warming means bending the emissions curve down.”
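
Before quoting figures in the Evidence line, it helps to compute them rather than read them off the chart. This is a minimal sketch: `value_at` and `summarize_window` are hypothetical helpers, and the toy frame contains invented values, so swap in `merged` from Step 2 to get the real numbers.

```python
import pandas as pd


def value_at(df, column, year):
    """Return the value of `column` for `year`, or raise if the year is absent."""
    match = df.loc[df["Year"] == year, column]
    if match.empty:
        raise KeyError(f"Year {year} not found")
    return float(match.iloc[0])


def summarize_window(df, start, end):
    """Fold change in emissions and absolute change in anomaly over a window."""
    return {
        "co2_fold": value_at(df, "CO2_Gt", end) / value_at(df, "CO2_Gt", start),
        "temp_delta": value_at(df, "TempAnomaly", end) - value_at(df, "TempAnomaly", start),
    }


# Placeholder frame with invented values, not real measurements.
toy = pd.DataFrame(
    {"Year": [1950, 2022], "CO2_Gt": [2.0, 8.0], "TempAnomaly": [-0.5, 0.25]}
)
summary = summarize_window(toy, 1950, 2022)
print(summary)
```

Raising on a missing year is a deliberate choice here: it fails loudly instead of returning an empty selection that would break later arithmetic.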

### Limitations to note
- Correlation is not causation; other forcings (aerosols, land use) also influence temperature.
- Annual aggregates hide intra-annual variability.
- Future work could add uncertainty bands or scenario projections to extend the story.
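
The first limitation can be made concrete with a small experiment. In this sketch, two invented series share an upward trend but have unrelated wiggles; the correlation of their levels is high simply because both rise, while the correlation of their year-over-year changes is close to zero.

```python
import pandas as pd

steps = range(40)
# Invented series: the same linear trend plus two different wiggle patterns.
series_a = pd.Series([t + (0.5 if t % 2 == 0 else -0.5) for t in steps])
series_b = pd.Series([t + (0.5 if t % 4 < 2 else -0.5) for t in steps])

# Correlation of the raw levels is dominated by the shared trend.
level_corr = series_a.corr(series_b)
# Correlation of first differences looks at step-to-step changes only.
change_corr = series_a.diff().corr(series_b.diff())

print(f"correlation of levels:  {level_corr:.2f}")
print(f"correlation of changes: {change_corr:.2f}")
```

This does not say emissions and temperature are unrelated. It only shows that a high correlation between two trending series is weak evidence on its own, which is why the physical mechanism and the other forcings belong in the story.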

## Step 8 · Save the figure

In [None]:
save_last_fig("day05_solution_plot.png")