## 🔗 Open Solution in Google Colab

[![Open Solution in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/DavidLangworthy/ds4s/blob/master/days/day01/solution/day01_solution.ipynb)

# 🌎 Day 1 – Global Temperature Signals

Welcome! This lab eases you into Python by tracing how Earth's temperature has shifted since 1880. Work through each small step, run the quick checks, and use the hints only if you truly need them.

### Data Card — NASA GISTEMP Annual Temperature Anomalies

| Field | Details |
| --- | --- |
| **Source** | [NASA Goddard Institute for Space Studies (GISS)](https://data.giss.nasa.gov/gistemp/) |
| **Download** | `data/GLB.Ts+dSST.csv` (auto-downloads if missing) |
| **Temporal coverage** | 1880–2024 (annual) |
| **Units** | °C anomaly relative to the 1951–1980 baseline |
| **Last updated** | January 2025 release |
| **Caveats** | Annual means are reconstructed from land-station and sea-surface temperature measurements; recent years may be revised as new measurements arrive. |


### Step 1 · Imports, style, and shared helpers

Run the cell to load the libraries used throughout the lab.

In [None]:
# Ensure shared helpers are available when running on Google Colab
import pathlib
import urllib.request

UTILS_PATH = pathlib.Path("utils.py")
if not UTILS_PATH.exists():
    UTILS_URL = "https://raw.githubusercontent.com/DavidLangworthy/ds4s/master/utils.py"
    print("Fetching shared helper module…")
    UTILS_PATH.write_bytes(urllib.request.urlopen(UTILS_URL).read())

import pandas as pd
import matplotlib.pyplot as plt

from utils import (
    baseline_style,
    expect_rows_between,
    load_csv,
    quick_check,
    save_last_fig,
    validate_columns,
    validate_story_elements,
)

baseline_style()


### Step 2 · Load the climate dataset

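`load_csv` reads the CSV from the local `data` folder. If the file is missing, the helper downloads a copy from GitHub. Passing `skiprows=1` skips the first line of the file so that the next line is read as the column header.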

In [None]:
temperature_raw = load_csv("data/GLB.Ts+dSST.csv", read_csv_kwargs=dict(skiprows=1))
print(f"Rows loaded: {len(temperature_raw)}")
temperature_raw.head()


<details>
<summary>Need a nudge?</summary>

- Confirm the file path matches the entry in the data card.
- `.head()` should show columns for each month plus the annual mean (`J-D`).

</details>
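
To see why `skiprows=1` matters, here is a minimal sketch using plain `pd.read_csv` on a tiny in-memory file. It assumes the real file starts with a single title line above the header, which is what the `skiprows=1` argument implies; the title text and the numbers below are made up for illustration and are not NASA values.

```python
import io

import pandas as pd

# Made-up miniature file: one title line, then the header row, then data.
sample_csv = io.StringIO(
    "Title line that is not part of the table\n"
    "Year,Jan,Feb,J-D\n"
    "1880,-0.20,-0.26,-0.18\n"
    "1881,-0.21,-0.16,-0.10\n"
)

# skiprows=1 drops the title line, so the second line becomes the header.
sample_raw = pd.read_csv(sample_csv, skiprows=1)
print(sample_raw.columns.tolist())
print(f"Rows loaded: {len(sample_raw)}")
```

Without `skiprows=1`, pandas would try to use the title line as the header and the column names would not match the ones used in later steps.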

### Step 3 · Clean and focus the columns we need

Convert the annual anomaly column to numeric values and keep only the year/anomaly pair for plotting.

In [None]:
temperature_df = (
    temperature_raw[["Year", "J-D"]]
    .rename(columns={"J-D": "TempAnomaly"})
    .assign(TempAnomaly=lambda df: pd.to_numeric(df["TempAnomaly"], errors="coerce"))
    .dropna(subset=["TempAnomaly"])
)

validate_columns(temperature_df, ["Year", "TempAnomaly"])
expect_rows_between(temperature_df, minimum=140, maximum=200)


In [None]:
quick_check(temperature_df, name="NASA anomalies")


<details>
<summary>Still stuck?</summary>

- Use `pd.to_numeric(..., errors="coerce")` to turn `***` placeholders into `NaN`.
- Drop rows without anomalies before plotting.

</details>
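
The cleaning chain above can be tried on a three-row frame to watch the `***` placeholder disappear. This is a small sketch with made-up values, not rows from the NASA file.

```python
import pandas as pd

# Illustrative values only; "***" stands in for a year without an annual mean.
sample = pd.DataFrame({"Year": [2022, 2023, 2024], "J-D": ["0.9", "1.2", "***"]})

cleaned = (
    sample.rename(columns={"J-D": "TempAnomaly"})
    # errors="coerce" converts anything non-numeric (here "***") to NaN.
    .assign(TempAnomaly=lambda df: pd.to_numeric(df["TempAnomaly"], errors="coerce"))
    # Rows whose anomaly became NaN are dropped before plotting.
    .dropna(subset=["TempAnomaly"])
)
print(cleaned)
```

The row with the placeholder is removed, and `TempAnomaly` ends up as a float column rather than text.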

### Step 4 · Smooth the series for context (optional but recommended)

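A centered 5-year rolling mean highlights the long-term signal while the annual series stays on the chart. Because `min_periods=1`, the first two and last two years are averaged over fewer than five values instead of being left empty.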

In [None]:
temperature_df = temperature_df.assign(
    Rolling5=lambda df: df["TempAnomaly"].rolling(window=5, center=True, min_periods=1).mean()
)
quick_check(temperature_df.tail(), name="Recent values")
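
The effect of `center=True` and `min_periods=1` is easiest to see on a short series. The sketch below uses six made-up numbers and compares the setting from the cell above with the pandas default, where `min_periods` equals the window size.

```python
import pandas as pd

values = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# Same settings as the lab: edges use however many values fall in the window.
smoothed = values.rolling(window=5, center=True, min_periods=1).mean()
print(smoothed.tolist())  # [2.0, 2.5, 3.0, 4.0, 4.5, 5.0]

# Default min_periods: any window with fewer than 5 values yields NaN.
strict = values.rolling(window=5, center=True).mean()
print(strict.isna().sum())  # 4
```

The edge values in `smoothed` are based on only three or four points, so the ends of the rolling line are less smooth than the middle.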


### Step 5 · Draft the story elements

Fill in every field of the `story` dictionary below; together they form the narrative that accompanies the chart. All fields should be filled before plotting.

In [None]:
story = {
    "title": "Global Temperature Anomalies Keep Rising",
    "subtitle": "NASA GISTEMP annual means relative to the 1951–1980 climate baseline",
    "annotation": "Recent years exceed +1.3°C above the 1951–1980 baseline.",
    "source": "NASA GISS Surface Temperature Analysis (accessed 2025-01)",
    "units": "°C anomaly",
}
validate_story_elements(story)
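
The internals of `validate_story_elements` are not shown in this lab. As a rough idea of what an "all fields filled" check could look like, here is a hypothetical stand-in; the function name `missing_story_fields` and the list of required fields are assumptions taken from the keys used above, not the helper's actual code.

```python
# Assumed required fields, copied from the keys of the story dictionary above.
REQUIRED_FIELDS = ("title", "subtitle", "annotation", "source", "units")


def missing_story_fields(story: dict) -> list:
    """Hypothetical check: return required fields that are absent or blank."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = story.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


draft = {
    "title": "Global Temperature Anomalies Keep Rising",
    "subtitle": "   ",  # whitespace only counts as blank
    "units": "°C anomaly",
}
print(missing_story_fields(draft))  # ['subtitle', 'annotation', 'source']
```

An empty list would mean the draft is ready for plotting.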


### Step 6 · Plot the warming trend with debug checkpoints

The preview should show a red annual line trending upward, a thicker dark-blue 5-year rolling mean, a dashed zero line, and an annotation pointing at the latest year.

In [None]:
fig, ax = plt.subplots(figsize=(10, 5))
ax.plot(temperature_df["Year"], temperature_df["TempAnomaly"], color="#c1121f", linewidth=1.5, label="Annual anomaly")
ax.plot(temperature_df["Year"], temperature_df["Rolling5"], color="#003049", linewidth=2.5, label="5-year rolling mean")
ax.axhline(0, color="#6c757d", linestyle="--", linewidth=1)

last_year = int(temperature_df["Year"].iloc[-1])
last_value = temperature_df["TempAnomaly"].iloc[-1]
ax.annotate(
    story["annotation"],
    xy=(last_year, last_value),
    xytext=(last_year - 25, last_value + 0.3),
    arrowprops=dict(arrowstyle="->", color="#6c757d"),
    fontsize=11,
)
ax.set_title(story["title"], fontsize=16, pad=14)
ax.set_xlabel("Year")
ax.set_ylabel(story["units"])
ax.legend(loc="upper left")
ax.grid(alpha=0.3)
ax.text(0.01, -0.18, f"Source: {story['source']}", transform=ax.transAxes, fontsize=9, ha="left")
plt.tight_layout()
plt.show()
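
One way to turn the expected preview into checkpoints is to inspect the axes after drawing. The sketch below builds a small stand-alone figure with made-up values (using `matplotlib.figure.Figure` directly so it does not touch the notebook's pyplot state) and asserts on what was drawn; in the lab you could apply the same checks to `ax`.

```python
from matplotlib.figure import Figure

demo_fig = Figure(figsize=(6, 3))
demo_ax = demo_fig.subplots()

years = [2020, 2021, 2022]  # made-up values for illustration
demo_ax.plot(years, [0.1, 0.2, 0.3], label="Annual anomaly")
demo_ax.plot(years, [0.15, 0.2, 0.25], label="5-year rolling mean")
demo_ax.axhline(0, linestyle="--")
demo_ax.annotate("latest", xy=(2022, 0.3), xytext=(2020.5, 0.35),
                 arrowprops=dict(arrowstyle="->"))

# Checkpoint 1: two data lines plus the dashed zero line.
line_count = len(demo_ax.lines)
assert line_count == 3

# Checkpoint 2: the first two lines carry the legend labels we expect.
labels = [line.get_label() for line in list(demo_ax.lines)[:2]]
assert labels == ["Annual anomaly", "5-year rolling mean"]

# Checkpoint 3: exactly one text artist (the annotation) is on the axes.
annotation_count = len(demo_ax.texts)
assert annotation_count == 1
```

In the full plot the source line added with `ax.text` is also a text artist, so the last checkpoint would expect two entries there.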


### Step 7 · Save the figure for the automated export

This uses the shared helper so the GitHub Action can archive a high-resolution copy.

In [None]:
save_last_fig(fig, "plots/day01_solution_plot.png")
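
The body of `save_last_fig` lives in `utils.py` and is not reproduced here. A minimal sketch of a comparable save step, assuming it only needs to create the folder and write a high-resolution PNG, might look like this; `save_figure_sketch` and the `dpi=300` default are hypothetical choices, not the helper's actual behavior.

```python
import pathlib
import tempfile

from matplotlib.figure import Figure


def save_figure_sketch(fig, path, dpi=300):
    """Hypothetical stand-in: create the folder, then write a high-dpi PNG."""
    target = pathlib.Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    # bbox_inches="tight" keeps text placed below the axes inside the image.
    fig.savefig(target, dpi=dpi, bbox_inches="tight")
    return target


demo_fig = Figure(figsize=(4, 2))
demo_fig.subplots().plot([0, 1], [0, 1])

# Write into a temporary folder so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    saved = save_figure_sketch(demo_fig, pathlib.Path(tmp) / "plots" / "demo.png")
    saved_ok = saved.exists() and saved.stat().st_size > 0
print(saved_ok)
```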


### Step 8 · Reflect

- How quickly did the anomaly cross +1°C?
- Which checkpoints helped you confirm the calculation?
- Note one uncertainty or caveat from the data card to mention in discussion.
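
For the first reflection question, one possible approach is to look up the first year whose anomaly reaches a threshold. The sketch below uses made-up numbers so it does not give away the answer; in the notebook you would run the same lookup on `temperature_df`.

```python
import pandas as pd

# Made-up anomalies for illustration only, not NASA values.
demo = pd.DataFrame(
    {"Year": [1, 2, 3, 4, 5], "TempAnomaly": [0.6, 0.7, 0.9, 1.1, 0.95]}
)

threshold = 1.0
# Years at or above the threshold, in the order they appear in the table.
above = demo.loc[demo["TempAnomaly"] >= threshold, "Year"]
first_crossing = int(above.iloc[0]) if not above.empty else None
print(first_crossing)  # 4
```

Note that the series can dip back below the threshold after the first crossing, as year 5 does here, which is worth keeping in mind when describing "how quickly" the line was crossed.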