## 🔗 Open This Notebook in Google Colab

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/DavidLangworthy/ds4s/blob/main/days/day04/notebook/day04_starter.ipynb)

# 🌍 Day 4 – Mapping Forest Change
Let's turn long-table data into a map that shows where forests are shrinking or rebounding.

## 🧾 Data Card – World Bank Forest Area
- **Source:** [World Bank – Forest area (% of land area)](https://data.worldbank.org/indicator/AG.LND.FRST.ZS).
- **Temporal coverage:** 1990–2021 (annual).
- **Units:** Percent of land area covered by forest.
- **Update cadence:** Annual, based on FAO Global Forest Resources Assessment.
- **Method notes:** Derived from FAO country reports; includes natural forests and plantations.
- **Caveats:** Small countries may show volatility due to rounding; some territories share aggregate codes.

## 🧭 Story Scaffold
- **Claim:** Which regions have lost or gained forest cover since 1990?
- **Evidence:** Percent-point change and baseline levels.
- **Visual:** Choropleth map highlighting change with an accessible color scale.
- **Takeaway:** Emphasize that percentages hide absolute area differences and that reporting methods vary.
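
The takeaway above can be made concrete with a little arithmetic. The sketch below uses two made-up countries (the land areas and forest shares are hypothetical, not World Bank figures) to show how the same percent-point change can correspond to very different absolute areas, and how it differs from a relative change.

```python
# Hypothetical land areas (km^2) and forest shares (%); illustrative only.
countries = {
    "Large country": {"land_km2": 8_000_000, "pct_1990": 60.0, "pct_2020": 55.0},
    "Small country": {"land_km2": 50_000, "pct_1990": 60.0, "pct_2020": 55.0},
}


def forest_change_summary(land_km2, pct_1990, pct_2020):
    """Return percent-point change, relative change (%), and absolute change (km^2)."""
    change_pp = pct_2020 - pct_1990
    relative_pct = change_pp / pct_1990 * 100
    area_km2 = land_km2 * change_pp / 100
    return change_pp, relative_pct, area_km2


summaries = {name: forest_change_summary(**vals) for name, vals in countries.items()}
for name, (pp, rel, area) in summaries.items():
    print(f"{name}: {pp:+.1f} pp, {rel:+.1f}% relative, {area:+,.0f} km^2")
```

Both countries would share one color on a percent-point map even though the areas involved differ by orders of magnitude.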

In [None]:
from __future__ import annotations

from pathlib import Path
import sys

import pandas as pd
import plotly.express as px

for candidate in [Path.cwd(), Path.cwd().parent, Path.cwd().parent.parent]:
    utils_path = candidate / "utils.py"
    if utils_path.exists():
        if str(candidate) not in sys.path:
            sys.path.insert(0, str(candidate))
        break
else:
    raise FileNotFoundError("Unable to locate utils.py. Did you download the full project?")

from utils import (
    diagnose_dataframe,
    expect_rows_between,
    load_data,
    save_last_fig,
    validate_columns,
    validate_story_elements,
)


In [None]:
# Example: simple choropleth with placeholder data
example_df = pd.DataFrame(
    {
        "Country Code": ["USA", "BRA", "IDN"],
        "Change": [-5.0, -8.2, -3.4],
    }
)
fig = px.choropleth(example_df, locations="Country Code", color="Change", color_continuous_scale="Earth")
fig.update_layout(title="Example: change in forest cover (percent points)")
fig


In [None]:
# Step 1 – Load the forest area dataset
forest = load_data("forest_area_long.csv")

# TODO: Load the forest_area_long.csv file into a dataframe


<details>
<summary>Loading hint</summary>
<ul>
<li>Call <code>load_data</code> with the filename inside the <code>data</code> folder.</li>
<li>Store the result in a descriptive variable such as <code>forest</code>.</li>
</ul>
</details>
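
For orientation, a long table has one row per country and year. The sketch below builds a tiny stand-in from an inline CSV. The rows and country codes are invented, and the column names are the ones the later steps of this notebook rely on. `load_data` itself lives in `utils.py` and is not reproduced here.

```python
import io

import pandas as pd

# Invented rows that mimic the expected long layout (not real World Bank values).
csv_text = """Country Name,Country Code,Year,ForestPercent
Examplia,XAA,1990,40.0
Examplia,XAA,2020,35.5
Samplestan,XBB,1990,10.0
Samplestan,XBB,2020,12.0
"""

sample = pd.read_csv(io.StringIO(csv_text))
print(sample.dtypes)  # check that Year parsed as an integer column
print(sample.shape)   # rows = countries x years
```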

In [None]:
# Step 2 – Filter to the baseline and latest years
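# Coerce Year to numeric before filtering so the match also works if years were read as text
year_numeric = pd.to_numeric(forest["Year"], errors="coerce")
baseline_or_latest = year_numeric.isin([1990, 2020])
forest_focus = forest[baseline_or_latest].copy()
forest_focus["Year"] = year_numeric[baseline_or_latest].astype(int)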

# TODO: Keep only rows for 1990 and 2020, ensuring Year is numeric


<details>
<summary>Filtering hint</summary>
<ul>
<li>Use <code>.isin([1990, 2020])</code> to grab specific years.</li>
<li><code>astype(int)</code> keeps the year column tidy for grouping.</li>
</ul>
</details>
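
Why the numeric conversion matters can be seen on a small invented frame in which `Year` arrived as text. This is only a sketch; whether the real CSV stores years as text depends on how it was exported.

```python
import pandas as pd

# Invented rows; Year is stored as text, as can happen after a CSV round trip.
raw = pd.DataFrame(
    {
        "Country Code": ["XAA", "XAA", "XAA"],
        "Year": ["1990", "2005", "2020"],
        "ForestPercent": [40.0, 38.0, 35.5],
    }
)

# Comparing text years against integers matches nothing.
text_matches = int(raw["Year"].isin([1990, 2020]).sum())

# Converting first makes the same filter behave as intended.
year_numeric = pd.to_numeric(raw["Year"], errors="coerce")
numeric_matches = int(year_numeric.isin([1990, 2020]).sum())

print(text_matches, numeric_matches)
```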

In [None]:
# Step 3 – Compute change between 1990 and 2020
forest_change = (
    forest_focus
    .pivot(index=["Country Name", "Country Code"], columns="Year", values="ForestPercent")
    .rename(columns={1990: "percent_1990", 2020: "percent_2020"})
    .dropna()
)
forest_change["change_pp"] = forest_change["percent_2020"] - forest_change["percent_1990"]
forest_change = forest_change.reset_index()

# TODO: Pivot to wide format, rename, drop missing rows, and compute change_pp


<details>
<summary>Pivot hint</summary>
<ul>
<li>Use <code>.pivot(..., columns="Year")</code> to get separate 1990 and 2020 columns.</li>
<li>Rename the columns for clarity.</li>
<li>Subtract 1990 from 2020 to get a percent-point change column.</li>
</ul>
</details>
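
The pivot is easier to follow on a handful of rows. In this sketch the three countries and their values are invented; the third has no 2020 value, so `dropna()` removes it before the change is computed.

```python
import pandas as pd

# Invented long-format rows; "Partialand" is missing its 2020 observation.
long_df = pd.DataFrame(
    {
        "Country Name": ["Examplia", "Examplia", "Samplestan", "Samplestan", "Partialand"],
        "Country Code": ["XAA", "XAA", "XBB", "XBB", "XCC"],
        "Year": [1990, 2020, 1990, 2020, 1990],
        "ForestPercent": [40.0, 35.5, 10.0, 12.0, 22.0],
    }
)

wide = (
    long_df
    .pivot(index=["Country Name", "Country Code"], columns="Year", values="ForestPercent")
    .rename(columns={1990: "percent_1990", 2020: "percent_2020"})
    .dropna()  # keep only countries observed in both years
)
wide["change_pp"] = wide["percent_2020"] - wide["percent_1990"]
wide = wide.reset_index()
print(wide)
```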

In [None]:
# Step 4 – Diagnostics
diagnose_dataframe(forest_change, name="Forest change (1990–2020)")
validate_columns(forest_change, ["Country Name", "Country Code", "percent_1990", "percent_2020", "change_pp"], name="forest_change")
expect_rows_between(forest_change, 180, 220, name="forest_change")
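
The helpers used in Step 4 come from `utils.py`, which is not shown in this notebook. As a rough idea of what such checks amount to, the sketch below re-creates two of them with plain pandas. The names `check_columns` and `check_row_count` are hypothetical stand-ins, not the project's actual implementations.

```python
import pandas as pd


def check_columns(df, required):
    """Raise if any required column is missing (hypothetical stand-in)."""
    missing = [col for col in required if col not in df.columns]
    if missing:
        raise ValueError(f"Missing columns: {missing}")
    return True


def check_row_count(df, low, high):
    """Raise if the row count falls outside [low, high] (hypothetical stand-in)."""
    if not low <= len(df) <= high:
        raise ValueError(f"Expected {low}-{high} rows, found {len(df)}")
    return True


# Invented two-row frame to exercise the checks.
demo = pd.DataFrame({"Country Code": ["XAA", "XBB"], "change_pp": [-4.5, 2.0]})
columns_ok = check_columns(demo, ["Country Code", "change_pp"])
rows_ok = check_row_count(demo, 1, 5)
print(columns_ok, rows_ok)
```

A row count far outside the expected range usually points to an earlier step, for example a filter that matched nothing or a pivot that dropped most countries.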


In [None]:
# Step 5 – Bucket countries into change categories
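# pd.cut uses right-closed intervals by default: (-100, -20], (-20, -10], (-10, 0], (0, 10], (10, 100].
# A change of exactly 0 therefore lands in "Slight loss", and any gain above 10 points is "Moderate gain".
bins = [-100, -20, -10, 0, 10, 100]
labels = ["Severe loss", "Moderate loss", "Slight loss", "Slight gain", "Moderate gain"]
forest_change["change_bucket"] = pd.cut(forest_change["change_pp"], bins=bins, labels=labels)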
forest_change.head()
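
To see where the bucket boundaries fall, it helps to probe them with values that sit on or just past each edge. The sketch below reuses the bins and labels from Step 5 on a few hand-picked numbers.

```python
import pandas as pd

bins = [-100, -20, -10, 0, 10, 100]
labels = ["Severe loss", "Moderate loss", "Slight loss", "Slight gain", "Moderate gain"]

# Edge values chosen to probe the bin boundaries.
probe = pd.Series([-20.0, -10.0, 0.0, 0.1, 10.0, 10.1])
buckets = pd.cut(probe, bins=bins, labels=labels)
print(pd.DataFrame({"change_pp": probe, "bucket": buckets}))
```

Because each interval includes its right edge, a value that sits exactly on a boundary is assigned to the lower bucket.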


In [None]:
# Step 6 – Story metadata strings
TITLE = "Three Decades of Forest Change"
SUBTITLE = "Percent-point change in forest area, 1990–2020 (World Bank)"
ANNOTATION = "Large losses cluster in the Amazon basin and Southeast Asia."
SOURCE = "World Bank, FAO FRA (downloaded 2024-04-15)"
UNITS = "Percent-point change in forest area"

validate_story_elements(
    {
        "TITLE": TITLE,
        "SUBTITLE": SUBTITLE,
        "ANNOTATION": ANNOTATION,
        "SOURCE": SOURCE,
        "UNITS": UNITS,
    }
)

# TODO: Customize the metadata strings for your map


In [None]:
# Step 7 – Choropleth map of forest change
fig = px.choropleth(
    forest_change,
    locations="Country Code",
    color="change_pp",
    color_continuous_scale="BrBG",
    # Symmetric limits keep zero change at the neutral midpoint of the diverging BrBG scale
    range_color=(-25, 25),
    hover_name="Country Name",
    hover_data={"change_pp": ':.1f', "percent_1990": ':.1f', "percent_2020": ':.1f'},
)
fig.update_layout(
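    # Show the subtitle under the title; Plotly text accepts <br> and <sup> tags
    title=f"{TITLE}<br><sup>{SUBTITLE}</sup>",
    coloraxis_colorbar_title=UNITS,
    # Leave bottom margin so the source note placed at y=-0.08 is not clipped
    margin=dict(l=0, r=0, t=60, b=50),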
    annotations=[
        dict(
            x=0.5,
            y=-0.08,
            xref="paper",
            yref="paper",
            text=f"Source: {SOURCE} — {ANNOTATION}",
            showarrow=False,
        )
    ],
)
forest_map = fig
fig

# TODO: Configure the choropleth with clear color limits and annotations
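
One way to choose clear color limits is to derive them from the data and keep them symmetric about zero, so the neutral color of a diverging scale means "no change". The sketch below uses invented percent-point changes, and the 0.9 quantile is an arbitrary choice for illustration rather than a recommendation from the source.

```python
import pandas as pd

# Invented percent-point changes, including one extreme value.
change_pp = pd.Series([-32.0, -12.5, -4.0, -0.5, 0.0, 1.5, 3.0, 9.0])

# Use a high quantile of the absolute change so a single outlier does not wash out the map.
limit = float(change_pp.abs().quantile(0.9))
range_color = (-limit, limit)
print(range_color)
```

Passing such a tuple as `range_color` keeps zero at the midpoint of a scale like `BrBG`; values beyond the limits are drawn in the end colors.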


In [None]:
# Step 8 – Final validation and save option
validate_story_elements(
    {
        "TITLE": TITLE,
        "SUBTITLE": SUBTITLE,
        "ANNOTATION": ANNOTATION,
        "SOURCE": SOURCE,
        "UNITS": UNITS,
    }
)
save_last_fig("day04_forest_change.png", fig=forest_map)
