## 🔗 Open This Notebook in Google Colab

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/DavidLangworthy/ds4s/blob/master/days/day01/notebook/day01_starter.ipynb)

# 🌎 Day 1 – Visualizing Global Warming

Welcome to your first day of the sprint! We will move in short, confidence-building passes: learn a single idea, apply it right away, and check that everything worked before you advance. By the time you reach the final cell you will have a polished climate story ready to share.

## 🗂️ Data Card: NASA GISTEMP Annual Anomalies
- **Source:** [NASA Goddard Institute for Space Studies – GISTEMP v4](https://data.giss.nasa.gov/gistemp/).
- **Temporal coverage:** 1880–2024, one row per year.
- **Units:** Temperature anomaly relative to the 1951–1980 global mean (°C).
- **Last updated:** December 2024 (download mirrored in this repo).
- **Method notes:** Combines land and sea surface temperature estimates from meteorological stations, ships, and buoys; anomalies are computed against a 30-year baseline.
- **Caveats:** Values are anomalies (not absolute °C). Annual means mask seasonal extremes; polar regions remain uncertain.
- **Integrity prompt:** What choices (smoothing, axes range, color) might exaggerate or understate warming? Name one safeguard you will use.
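
To make the "anomaly relative to a baseline" idea concrete, here is a minimal sketch. The temperatures are made-up numbers and the three-year window is a stand-in for 1951–1980; this illustrates the arithmetic only, not NASA's actual processing pipeline.

```python
import pandas as pd

# Made-up absolute temperatures (°C) for illustration only; not NASA data.
absolute = pd.DataFrame(
    {
        "Year": [1950, 1951, 1952, 1953, 1954],
        "TempC": [13.8, 14.0, 14.1, 13.9, 14.4],
    }
)

# Hypothetical 3-year baseline window standing in for 1951–1980.
baseline_years = absolute["Year"].between(1951, 1953)
baseline_mean = absolute.loc[baseline_years, "TempC"].mean()

# An anomaly is the difference from the baseline mean, so the baseline
# period itself averages to zero by construction.
absolute["Anomaly"] = absolute["TempC"] - baseline_mean
print(absolute)
```

Because the baseline period averages to zero by definition, the zero line on the charts marks "typical for the baseline years", not 0 °C.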

## 🧭 Story Scaffold
We will keep returning to this four-part structure until it becomes automatic:
- **Claim:** What signal does the data reveal?
- **Evidence:** Which columns/statistics support it?
- **Visual:** Which design choices make it legible and honest?
- **Takeaway:** How will you phrase the message for a public audience?

## Step 0 · Set up your workspace
Run the imports once so every later cell can focus on interpretation rather than boilerplate.

In [None]:
from pathlib import Path

import matplotlib.pyplot as plt
import pandas as pd

from days.utils import (
    add_story_footer,
    baseline_style,
    check_story_metadata,
    load_data,
    quick_diagnostics,
    save_last_fig,
)

baseline_style()

## Step 1 · Load and inspect the climate record
Focus on a single tidy DataFrame: one row per year and a numeric anomaly column. The quick diagnostic cell prints everything you should verify before plotting.

In [None]:
temperature = load_data(
    "data/GLB.Ts+dSST.csv",
    skiprows=1,
    usecols=[0, 13],
    names=["Year", "TempAnomaly"],
    header=0,
)
temperature["TempAnomaly"] = pd.to_numeric(temperature["TempAnomaly"], errors="coerce")
temperature = temperature.dropna(subset=["TempAnomaly"]).sort_values("Year").reset_index(drop=True)
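
If you are curious what those arguments do, the sketch below runs the same steps with plain `pandas.read_csv` on a tiny in-memory file. It assumes `load_data` forwards its keyword arguments to pandas, which this notebook does not confirm, and that the file follows the usual GISTEMP layout (a title line, a header row, and `***` for missing values). All values are placeholders, not real anomalies.

```python
from io import StringIO

import pandas as pd

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
header = ["Year"] + MONTHS + ["J-D", "D-N", "DJF", "MAM", "JJA", "SON"]


def make_row(year, annual):
    # Year, 12 monthly placeholders, the annual (J-D) value, 5 seasonal placeholders.
    return [str(year)] + ["0.00"] * 12 + [annual] + ["0.00"] * 5


# Placeholder rows, deliberately unsorted and with one missing annual value.
rows = [make_row(2002, "0.20"), make_row(2001, "0.10"), make_row(2003, "***")]
raw_text = "\n".join(
    ["Land-Ocean: Global Means", ",".join(header)] + [",".join(r) for r in rows]
)

sample = pd.read_csv(
    StringIO(raw_text),
    skiprows=1,                     # drop the title line above the header
    usecols=[0, 13],                # column 0 is Year, column 13 is the annual mean (J-D)
    names=["Year", "TempAnomaly"],  # rename the two columns we keep
    header=0,                       # the first remaining row is the header being replaced
)
# "***" cannot be parsed as a number, so it becomes NaN and is dropped.
sample["TempAnomaly"] = pd.to_numeric(sample["TempAnomaly"], errors="coerce")
sample = sample.dropna(subset=["TempAnomaly"]).sort_values("Year").reset_index(drop=True)
print(sample)
```

The year with a missing annual value disappears, and the remaining rows come back sorted by year.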

In [None]:
quick_diagnostics(
    temperature,
    expected_columns=["Year", "TempAnomaly"],
    rows_between=(140, 160),
)
print("Expected: ~145 rows from 1880 through 2024; anomalies expressed in °C.")
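
The internals of `quick_diagnostics` are not shown in this notebook. As a rough idea of the kind of checks worth running before plotting, here is a hypothetical helper, `basic_checks`, which is not part of `days.utils`:

```python
import pandas as pd


def basic_checks(df, expected_columns, rows_between):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    missing = [col for col in expected_columns if col not in df.columns]
    if missing:
        problems.append(f"missing columns: {missing}")
    low, high = rows_between
    if not low <= len(df) <= high:
        problems.append(f"row count {len(df)} outside {low}-{high}")
    if "Year" in df.columns:
        if df["Year"].duplicated().any():
            problems.append("duplicate years")
        if not df["Year"].is_monotonic_increasing:
            problems.append("years not sorted")
    return problems


# Tiny made-up frame, used only to exercise the checks.
demo = pd.DataFrame({"Year": [2001, 2002, 2003], "TempAnomaly": [0.1, 0.2, 0.3]})
print(basic_checks(demo, ["Year", "TempAnomaly"], rows_between=(1, 5)))
print(basic_checks(demo, ["Year", "TempAnomaly"], rows_between=(140, 160)))
```

Returning a list of problems instead of raising on the first one lets you see every issue in a single pass.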

## Step 2 · Build context columns
A 10-year rolling average smooths the jagged annual series so the long-term trend is easier to read. Keep the original series too so viewers can still see year-to-year variability.

In [None]:
temperature["RollingDecade"] = temperature["TempAnomaly"].rolling(window=10, min_periods=3).mean()
quick_diagnostics(
    temperature[["Year", "TempAnomaly", "RollingDecade"]],
    expected_columns=["Year", "TempAnomaly", "RollingDecade"],
    rows_between=(140, 160),
)
print("Rolling values should look smoother; with min_periods=3 only the first two rows are NaN.")
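
To see how `window=10` and `min_periods=3` interact, here is a small sketch on a toy series (the numbers 1 through 11, not climate data). It shows where the NaNs appear and which values feed each average.

```python
import pandas as pd

values = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0])
rolled = values.rolling(window=10, min_periods=3).mean()

print(rolled.isna().sum())  # positions 0 and 1 have fewer than three observations
print(rolled.iloc[2])       # mean of the first three values
print(rolled.iloc[9])       # first full 10-value window
print(rolled.iloc[10])      # the window has slid forward by one value
```

One consequence for the chart: the earliest points of the "10-year average" are built from fewer than ten years, so they are noisier than the rest of the smoothed line.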

## Step 3 · Sketch an interim view
Check your progress with a draft Matplotlib line chart. Focus on legibility: units on the axes, zero line for context, and color that remains readable to colorblind viewers.

![Interim progress preview – grey line chart with a zero reference.](../../plots/day01_solution_plot.png)

In [None]:
fig, ax = plt.subplots(figsize=(10, 5))
ax.plot(temperature["Year"], temperature["TempAnomaly"], color="#6c7a89", linewidth=1.5)
ax.axhline(0, color="#444444", linewidth=1, linestyle="--")
ax.set_xlabel("Year")
ax.set_ylabel("Temperature anomaly (°C)")
ax.set_title("Draft: Global temperature anomalies")
plt.show()
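
One way to sanity-check legibility is to compute the luminance contrast between a line color and the background. The sketch below uses the WCAG relative-luminance formula and assumes a white background, which `baseline_style()` may or may not set. Luminance contrast is only one ingredient of colorblind-friendly design, so treat it as a quick check, not a guarantee.

```python
def relative_luminance(hex_color):
    """WCAG relative luminance of an sRGB color given as '#rrggbb'."""
    hex_color = hex_color.lstrip("#")
    channels = [int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Undo the sRGB gamma curve to get linear-light channel values.
    linear = [
        c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    red, green, blue = linear
    return 0.2126 * red + 0.7152 * green + 0.0722 * blue


def contrast_ratio(color_a, color_b):
    """WCAG contrast ratio, from 1 (identical luminance) up to 21 (black on white)."""
    lum_a = relative_luminance(color_a)
    lum_b = relative_luminance(color_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


# Colors from the draft chart, compared with an assumed white background.
for name, color in [("annual line", "#6c7a89"), ("zero line", "#444444")]:
    print(name, round(contrast_ratio(color, "#ffffff"), 2))
```

A higher ratio means the line stays distinguishable even when hue information is lost, for example in a grayscale printout.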

## Step 4 · Lock in your story metadata
Titles, subtitles, annotations, and credits make the visual trustworthy. Fill every field before rendering the final figure.

In [None]:
TITLE = "Earth keeps warming"
SUBTITLE = "NASA GISTEMP temperature anomalies relative to 1951–1980, 1880–2024"
ANNOTATION = "Recent years stay above +1 °C; the trend shows no sign of reversing."
SOURCE = "Source: NASA GISTEMP v4 (downloaded Dec 2024)"
UNITS = "Units: Annual global mean temperature anomaly (°C)"

check_story_metadata(
    TITLE=TITLE,
    SUBTITLE=SUBTITLE,
    ANNOTATION=ANNOTATION,
    SOURCE=SOURCE,
    UNITS=UNITS,
)
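
The notebook does not show how `check_story_metadata` works. As an illustration of the idea, a hypothetical helper (`find_empty_fields`, not part of `days.utils`) could flag any field that is blank or not a string:

```python
def find_empty_fields(**fields):
    """Return the names of fields that are non-string or contain only whitespace."""
    return [
        name
        for name, value in fields.items()
        if not isinstance(value, str) or not value.strip()
    ]


# Example call with two deliberately incomplete fields.
empty = find_empty_fields(TITLE="Earth keeps warming", SUBTITLE="   ", SOURCE="")
print(empty)  # ['SUBTITLE', 'SOURCE']
```

Checking before rendering means a missing source line or unit label is caught while it is still cheap to fix.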

## Step 5 · Craft the publishable chart
Blend the raw line with the rolling mean, annotate the headline takeaway, and include ethical guardrails such as a zero baseline and transparent smoothing choices.

In [None]:
fig, ax = plt.subplots(figsize=(11, 6))
ax.plot(temperature["Year"], temperature["TempAnomaly"], color="#a7b1c2", linewidth=1.2, label="Annual anomaly")
ax.plot(temperature["Year"], temperature["RollingDecade"], color="#c73f46", linewidth=2.4, label="10-year average")
ax.axhline(0, color="#444444", linewidth=1, linestyle="--", label="1951–1980 baseline")
# Shade positive values under the smoothed line; without step= the fill follows the plotted line.
ax.fill_between(
    temperature["Year"],
    0,
    temperature["RollingDecade"],
    where=temperature["RollingDecade"] > 0,
    color="#c73f46",
    alpha=0.08,
)
ax.set_title(f"{TITLE}\n{SUBTITLE}")
ax.set_xlabel("Year")
ax.set_ylabel("Temperature anomaly (°C)")
ax.legend(loc="upper left", frameon=False)
peak_year = temperature.loc[temperature["RollingDecade"].idxmax(), "Year"]
peak_value = temperature["RollingDecade"].max()
ax.annotate(
    ANNOTATION,
    xy=(peak_year, peak_value),
    xytext=(peak_year - 25, peak_value + 0.2),
    arrowprops=dict(arrowstyle="->", color="#c73f46"),
    fontsize=11,
    color="#c73f46",
)
add_story_footer(ax, source=SOURCE, units=UNITS)
plt.show()

## Step 6 · Interpret with the storytelling scaffold
- **Claim:** Global temperatures have risen steadily and the past decade remains more than 1 °C above the 1951–1980 baseline.
- **Evidence:** The 10-year average crosses +1 °C after 2015 and never returns below; individual annual anomalies cluster above zero after the 1980s.
- **Visual:** Twin lines (raw + smoothed), a zero reference, and an annotation on the sustained warmth clarify the message.
- **Takeaway:** “NASA’s record shows that the world has warmed over 1 °C compared to the mid-1900s — a trend driven by human activity and still climbing.”
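
Before publishing an evidence statement such as "crosses a threshold and never returns below", it is worth checking it against the data. The sketch below defines a hypothetical helper, `first_sustained_crossing`, and runs it on made-up values; to test the real claim you would pass `temperature["Year"]` and `temperature["RollingDecade"]` instead.

```python
import numpy as np
import pandas as pd


def first_sustained_crossing(years, values, threshold):
    """Return the first year from which every later value stays above threshold, or None."""
    above = (values > threshold).to_numpy()  # NaN compares as False
    # Scan from the end: True only where this row and all later rows are above.
    stays_above = np.logical_and.accumulate(above[::-1])[::-1]
    if not stays_above.any():
        return None
    return int(years.to_numpy()[stays_above][0])


# Made-up series: it dips back below 1.0 once before staying above.
demo_years = pd.Series([2000, 2001, 2002, 2003, 2004, 2005])
demo_values = pd.Series([0.2, 1.1, 0.9, 1.05, 1.2, 1.3])
print(first_sustained_crossing(demo_years, demo_values, threshold=1.0))
```

If the function returns `None`, or a year later than the one in your claim, revise the wording to match what the smoothed series shows.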

### What this plot cannot tell us
- It does **not** show within-year extremes or regional differences — a future analysis could map seasonal or geographic patterns.
- Measurement uncertainty and revisions are small but real; cite version numbers when sharing.
- Even with a careful palette, the fill area could imply cumulative heat. Make clear in captions that values are anomalies, not total degrees of warming.

## Step 7 · Export the figure
Save the latest Matplotlib figure so you can reuse it in slide decks or reports without rerunning the notebook.

In [None]:
save_last_fig("day01_solution_plot.png")
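
`save_last_fig` is a course helper whose implementation is not shown here. If you ever need to export a figure without it, a minimal sketch with plain Matplotlib might look like the following. The output folder, file name, and `dpi` value are arbitrary choices for illustration.

```python
import tempfile
from pathlib import Path

import matplotlib

matplotlib.use("Agg")  # non-interactive backend so this runs without a display; omit in a notebook
import matplotlib.pyplot as plt

# Stand-in figure with made-up points; in the notebook you would reuse your own figure.
fig, ax = plt.subplots(figsize=(4, 3))
ax.plot([0, 1, 2], [0.0, 0.5, 1.0])

out_dir = Path(tempfile.mkdtemp())  # temporary folder for the example
out_path = out_dir / "example_plot.png"

# plt.gcf() returns the current (most recently active) figure.
plt.gcf().savefig(out_path, dpi=200, bbox_inches="tight")
plt.close(fig)
print(out_path.exists())
```

`bbox_inches="tight"` trims surrounding whitespace when the file is written, which can help keep titles and footers from being clipped in slides.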