## 🔗 Open This Notebook in Google Colab

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/DavidLangworthy/ds4s/blob/master/Day%204_%20Mapping%20Biodiversity%20%26%20Deforestation.ipynb)

# 🌍 Day 4 – Mapping Biodiversity & Deforestation
### Advanced Visualization: Forest Loss Over Time

Today’s focus is geographic storytelling. You’ll animate forest cover change, highlight uncertainty, and practice ethical map design.

### 🗂️ Data Card
| Field | Details |
| --- | --- |
| **Dataset** | World Bank – Forest Area (% of Land Area) |
| **Source & link** | World Bank SDG Atlas — [AG.LND.FRST.ZS](https://data.worldbank.org/indicator/AG.LND.FRST.ZS) |
| **Temporal / spatial coverage** | Country-level, annual 1990–2020 |
| **Key units** | Forest area as % of total land area |
| **Method & caveats** | Interpolated values for some countries. Differences in national reporting can introduce year-to-year noise. |

### ⏱️ Learning Path for Today

Each loop takes about 10–15 minutes:

- [ ] Load and inspect the long-form forest dataset.
- [ ] Check for missing values and prepare friendly metadata.
- [ ] Design the choropleth scale and narrative scaffold.
- [ ] Render the animated map and document limitations.

> 👩‍🏫 **Teacher tip:** Use these checkpoints for quick formative assessment. Have students raise a colored card after each check cell to signal confidence or questions.

> ### 👩‍🏫 Teacher Sidebar
> **Suggested timing:** ~55 minutes including reflection.
>
> **Likely misconceptions:** Interpreting lighter colors as ‘better’ without context; forgetting that map projections distort the apparent size of countries.
>
> **Fast finisher extension:** Ask students to compute forest loss by region and annotate biodiversity hotspots.

In [None]:
from __future__ import annotations

from pathlib import Path
from typing import Mapping, Sequence

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

try:
    import plotly.express as px  # noqa: F401 - imported for student use
except ModuleNotFoundError:  # pragma: no cover - Plotly installed in Colab
    px = None

pd.options.display.float_format = "{:.2f}".format
sns.set_theme(style="whitegrid", context="talk")
plt.rcParams.update(
    {
        "axes.titlesize": 18,
        "axes.titleweight": "bold",
        "axes.labelsize": 13,
        "axes.grid": True,
        "grid.alpha": 0.25,
        "figure.dpi": 120,
        "axes.spines.top": False,
        "axes.spines.right": False,
    }
)

STORY_KEYS = (
    "title",
    "subtitle",
    "claim",
    "evidence",
    "visual",
    "takeaway",
    "source",
    "units",
    "annotation",
    "alt_text",
)


def load_csv(path: Path, *, description: str = "", **read_kwargs) -> pd.DataFrame:
    df = pd.read_csv(path, **read_kwargs)
    label = description or path.name
    print(
        f"✅ Loaded {label} with shape {df.shape[0]} rows × {df.shape[1]} columns."
    )
    return df


def validate_columns(
    df: pd.DataFrame, required: Sequence[str], *, df_name: str = "DataFrame"
) -> None:
    missing = [col for col in required if col not in df.columns]
    if missing:
        raise ValueError(f"{df_name} is missing columns: {missing}")
    print(f"✅ {df_name} includes required columns: {', '.join(required)}")


def expect_rows_between(
    df: pd.DataFrame,
    lower: int,
    upper: int,
    *,
    df_name: str = "DataFrame",
) -> None:
    rows = len(df)
    if not (lower <= rows <= upper):
        raise ValueError(
            f"{df_name} has {rows} rows; expected between {lower} and {upper}."
        )
    print(f"✅ {df_name} row count {rows} within [{lower}, {upper}].")


def quick_null_check(df: pd.DataFrame, *, df_name: str = "DataFrame") -> pd.Series:
    nulls = df.isna().sum()
    print(f"{df_name} missing values per column:\n{nulls}")
    return nulls


def quick_preview(
    df: pd.DataFrame, *, n: int = 5, df_name: str = "DataFrame"
) -> pd.DataFrame:
    print(f"🔍 Previewing {df_name} (first {n} rows):")
    return df.head(n)


def numeric_sanity_check(
    series: pd.Series,
    *,
    minimum: float | None = None,
    maximum: float | None = None,
    name: str = "Series",
) -> None:
    if minimum is not None and series.min() < minimum:
        raise ValueError(
            f"{name} has values below the expected minimum of {minimum}."
        )
    if maximum is not None and series.max() > maximum:
        raise ValueError(
            f"{name} has values above the expected maximum of {maximum}."
        )
    print(
        f"✅ {name} within expected range"
        f"{f' ≥ {minimum}' if minimum is not None else ''}"
        f"{f' and ≤ {maximum}' if maximum is not None else ''}."
    )


def story_fields_are_complete(story: Mapping[str, str]) -> None:
    missing = [key for key in STORY_KEYS if not str(story.get(key, "")).strip()]
    if missing:
        raise ValueError(
            "Please complete the storytelling scaffold before plotting: "
            + ", ".join(missing)
        )
    print(
        "✅ Story scaffold complete (title, subtitle, claim, evidence, visual,"
        " takeaway, source, units, annotation, alt text)."
    )


def print_story_scaffold(story: Mapping[str, str]) -> None:
    story_fields_are_complete(story)
    print("\n📖 Story Scaffold")
    print(f"Claim: {story['claim']}")
    print(f"Evidence: {story['evidence']}")
    print(f"Visual focus: {story['visual']}")
    print(f"Takeaway: {story['takeaway']}")
    print(f"Source: {story['source']} ({story['units']})")


def apply_matplotlib_story(ax: plt.Axes, story: Mapping[str, str]) -> None:
    story_fields_are_complete(story)
    ax.set_title(f"{story['title']}\n{story['subtitle']}", loc="left", pad=18)
    ax.figure.text(
        0.01,
        -0.08,
        (
            f"Claim: {story['claim']} | Evidence: {story['evidence']}"
            f" | Takeaway: {story['takeaway']}"
            f"\nSource: {story['source']} • Units: {story['units']}"
        ),
        ha="left",
        fontsize=10,
    )


def annotate_callout(
    ax: plt.Axes,
    *,
    xy: tuple[float, float],
    xytext: tuple[float, float],
    text: str,
) -> None:
    ax.annotate(
        text,
        xy=xy,
        xytext=xytext,
        arrowprops=dict(arrowstyle="->", color="black", lw=1),
        bbox=dict(boxstyle="round,pad=0.3", fc="white", ec="black", alpha=0.8),
    )


def record_alt_text(text: str) -> None:
    print(f"📝 Alt text ready: {text}")


def accessibility_checklist(
    *, palette: str, has_alt_text: bool, contrast_passed: bool = True
) -> None:
    print("♿ Accessibility checklist:")
    print(f" • Palette: {palette}")
    print(
        f" • Alt text provided: {'yes' if has_alt_text else 'add alt text before sharing'}"
    )
    print(f" • Contrast OK: {'yes' if contrast_passed else 'adjust colors'}")


def save_figure(fig: plt.Figure, filename: str) -> Path:
    plots_dir = Path.cwd() / "plots"
    plots_dir.mkdir(parents=True, exist_ok=True)
    output_path = plots_dir / filename
    fig.savefig(output_path, dpi=300, bbox_inches="tight")
    print(f"💾 Saved figure to {output_path}")
    return output_path


def save_plotly_figure(fig, filename: str) -> Path:
    plots_dir = Path.cwd() / "plots"
    plots_dir.mkdir(parents=True, exist_ok=True)
    # with_suffix swaps any extension for ".html", not only ".png"
    html_path = plots_dir / Path(filename).with_suffix(".html")
    fig.write_html(html_path)
    print(f"💾 Saved interactive figure to {html_path}")
    try:
        static_path = plots_dir / filename
        fig.write_image(str(static_path))
        print(f"💾 Saved static image to {static_path}")
    except Exception as exc:  # pragma: no cover - depends on kaleido
        print(f"⚠️ Static export skipped: {exc}")
    return html_path

In [None]:
from pathlib import Path

DATA_DIR = Path.cwd() / "data"
PLOTS_DIR = Path.cwd() / "plots"
PLOTS_DIR.mkdir(parents=True, exist_ok=True)

print(f"Data directory: {DATA_DIR}")
print(f"Plots directory: {PLOTS_DIR}")

## Loop 1 · Load & Validate Forest Data
Confirm schema and range before designing the map.

In [None]:
forest_path = DATA_DIR / "forest_area_long.csv"
df_forest = load_csv(
    forest_path, description="World Bank forest area long-form"
)
validate_columns(
    df_forest,
    ["Country Name", "Country Code", "Year", "ForestPercent"],
    df_name="forest data",
)
expect_rows_between(df_forest, 5000, 8000, df_name="forest data")
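
The file name suggests the data was reshaped from a wide World Bank export (one column per year) into long form. The sketch below shows one way that reshaping could be done with `pandas.melt`. It assumes a wide table with `Country Name`, `Country Code`, and one column per year; the toy values and the names `wide` and `long_df` are invented for illustration and are not real World Bank figures.

```python
import pandas as pd

# Toy wide-format table (values are made up for illustration only).
wide = pd.DataFrame(
    {
        "Country Name": ["Country A", "Country B"],
        "Country Code": ["AAA", "BBB"],
        "1990": [40.0, 12.5],
        "2000": [38.0, None],
        "2020": [35.5, 11.0],
    }
)

# Treat every all-digit column label as a year column.
year_columns = [col for col in wide.columns if col.isdigit()]

long_df = (
    wide.melt(
        id_vars=["Country Name", "Country Code"],
        value_vars=year_columns,
        var_name="Year",
        value_name="ForestPercent",
    )
    .dropna(subset=["ForestPercent"])  # drop country-years with no value
    .astype({"Year": int})             # year labels arrive as strings
    .sort_values(["Country Code", "Year"])
    .reset_index(drop=True)
)

print(long_df)
```

Dropping empty country-years here means a missing observation shows up later as an absent row rather than a `NaN` cell, which matters for the null check in the next cell.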

In [None]:
# Only a cell's last expression is displayed automatically, so print the preview.
print(quick_preview(df_forest, n=5, df_name="forest data"))
quick_null_check(df_forest, df_name="forest data")
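
`quick_null_check` counts `NaN` cells, but in long-form data a missing observation can also be a row that simply is not there. A minimal sketch of a coverage check that catches both cases follows. The helper name `coverage_gaps` and the toy data are hypothetical, not part of the notebook's helpers.

```python
import pandas as pd

# Toy long-form data (invented values). Country BBB has one NaN cell
# and no row at all for 2020.
toy = pd.DataFrame(
    {
        "Country Code": ["AAA", "AAA", "AAA", "BBB", "BBB"],
        "Year": [1990, 2000, 2020, 1990, 2000],
        "ForestPercent": [40.0, 38.0, 35.5, 12.5, None],
    }
)


def coverage_gaps(df: pd.DataFrame) -> pd.Series:
    """Count, per country, the years in the dataset without a usable value."""
    n_years = df["Year"].nunique()
    valid = df.dropna(subset=["ForestPercent"])
    observed = valid.groupby("Country Code")["Year"].nunique()
    # Countries with no valid rows at all would vanish from the groupby,
    # so reindex against every code that appears in the data.
    observed = observed.reindex(df["Country Code"].unique(), fill_value=0)
    return (n_years - observed).sort_values(ascending=False)


gaps = coverage_gaps(toy)
print(gaps)
```

Countries at the top of the result are the ones whose color in the animation will drop out or rest on few observations.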

## Loop 2 · Prepare Metadata & Diagnostics
Clean types, compute a yearly cross-country average, and surface limitations.

In [None]:
df_forest["Year"] = df_forest["Year"].astype(int)
numeric_sanity_check(
    df_forest["ForestPercent"], minimum=0, maximum=100, name="ForestPercent"
)

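# Unweighted mean across countries: a rough diagnostic, not the true global
# forest share (which would weight each country by its land area).
global_summary = df_forest.groupby("Year")["ForestPercent"].mean()
print("🔍 Mean country forest % for the five most recent years:")
print(global_summary.tail())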

caveats = (
    "Some countries interpolate census years; small island nations may report 0% forest despite mangroves."
)
print("Caveats to mention:", caveats)
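
The yearly mean hides which countries drive the trend. One way to surface that, in the spirit of the fast finisher extension, is to compare each country's first and last reported values. The sketch below is an illustration only: the helper name `forest_change` and the toy values are invented, and the result is in percentage points, not percent change.

```python
import pandas as pd

# Toy long-form data (invented values, not World Bank figures).
toy = pd.DataFrame(
    {
        "Country Name": ["Country A", "Country A", "Country B", "Country B"],
        "Year": [1990, 2020, 1990, 2020],
        "ForestPercent": [40.0, 35.5, 12.5, 14.0],
    }
)


def forest_change(df: pd.DataFrame) -> pd.DataFrame:
    """Change in forest share between each country's first and last valid year."""
    valid = df.dropna(subset=["ForestPercent"]).sort_values("Year")
    grouped = valid.groupby("Country Name")
    summary = pd.DataFrame(
        {
            "first_year": grouped["Year"].first(),
            "last_year": grouped["Year"].last(),
            "first_pct": grouped["ForestPercent"].first(),
            "last_pct": grouped["ForestPercent"].last(),
        }
    )
    # Difference of two percentages, so the unit is percentage points.
    summary["change_pp"] = summary["last_pct"] - summary["first_pct"]
    return summary.sort_values("change_pp")  # largest losses first


change = forest_change(toy)
print(change)
```

A drop from 40% to 35.5% is a loss of 4.5 percentage points but about 11% of the original forest share, so state which one you are reporting.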

## Loop 3 · Story Scaffold for the Map
Capture the claim, evidence, and annotation before rendering.

In [None]:
latest_year = df_forest["Year"].max()
earliest_year = df_forest["Year"].min()
brazil_recent = df_forest[
    (df_forest["Country Name"] == "Brazil") & (df_forest["Year"] == latest_year)
]["ForestPercent"].iloc[0]
brazil_past = df_forest[
    (df_forest["Country Name"] == "Brazil") & (df_forest["Year"] == earliest_year)
]["ForestPercent"].iloc[0]

story = {
    "title": "Forest Cover Has Fallen in Key Biodiversity Hotspots",
    "subtitle": f"Share of land area covered by forest, {earliest_year}–{latest_year}",
    "claim": "Global forest cover is shrinking, with pronounced declines in tropical nations.",
    "evidence": (
        f"Brazil dropped from {brazil_past:.1f}% forest cover in {earliest_year} to {brazil_recent:.1f}% in {latest_year}."
    ),
    "visual": "Animated choropleth highlighting forest percent by country.",
    "takeaway": "Protecting remaining forests is critical for biodiversity and carbon sinks.",
    "source": "World Bank SDG Atlas (2024 download)",
    "units": "% of national land area covered by forest",
    "annotation": "Brazil’s forest share has declined over three decades",
    "alt_text": (
        "Animated world map where deep green indicates high forest share; many tropical countries lighten"
        f" between {earliest_year} and {latest_year}, showing shrinking forest cover."
    ),
}

print_story_scaffold(story)
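
The `.iloc[0]` lookups above raise a bare `IndexError` if the chosen country has no row for the chosen year. A small lookup helper with a clearer message can make that failure easier for students to debug. This is a sketch under stated assumptions: `forest_value` is a hypothetical name and the toy values are invented.

```python
import pandas as pd

# Toy long-form data (invented values). Country B has no 2020 row.
toy = pd.DataFrame(
    {
        "Country Name": ["Country A", "Country A", "Country B"],
        "Year": [1990, 2020, 1990],
        "ForestPercent": [40.0, 35.5, 12.5],
    }
)


def forest_value(df: pd.DataFrame, country: str, year: int) -> float:
    """Return one country's forest share for one year, or fail with a clear message."""
    match = df.loc[
        (df["Country Name"] == country) & (df["Year"] == year), "ForestPercent"
    ].dropna()
    if match.empty:
        raise LookupError(f"No forest value for {country} in {year}.")
    return float(match.iloc[0])


print(forest_value(toy, "Country A", 1990))
```

With a helper like this, the evidence sentence in the scaffold fails loudly at the lookup instead of partway through an f-string.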

## Loop 4 · Render the Animated Choropleth
Apply ethical mapping practices: consistent scales, clear annotation, and limitations.

In [None]:
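if px is None:
    raise ImportError(
        "Plotly is required for the animated map; install it with `pip install plotly`."
    )

fig = px.choropleth(
    df_forest.sort_values("Year"),  # sort so animation frames play in chronological order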
    locations="Country Code",
    color="ForestPercent",
    hover_name="Country Name",
    animation_frame="Year",
    color_continuous_scale="Greens",
    range_color=[0, 100],
    labels={"ForestPercent": "Forest area (% of land)"},
)

fig.update_layout(
    title=dict(
        text=f"<b>{story['title']}</b><br><sup>{story['subtitle']}</sup>",
        x=0,
        xanchor="left",
    ),
    coloraxis_colorbar=dict(title="% forest"),
    margin=dict(l=10, r=10, t=80, b=120),
)

story_fields_are_complete(story)
fig.add_annotation(
    xref="paper",
    yref="paper",
    x=0,
    y=-0.2,
    align="left",
    showarrow=False,
    text=(
        f"Claim: {story['claim']}<br>Evidence: {story['evidence']}<br>Takeaway: {story['takeaway']}<br>Source: {story['source']} • Units: {story['units']}"
    ),
)

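# x and y are fractions of the figure ("paper" coordinates), not longitude and
# latitude; adjust them so the arrow lands on Brazil in your rendered layout.
fig.add_annotation(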
    x=0.8,
    y=0.2,
    xref="paper",
    yref="paper",
    text=story["annotation"],
    showarrow=True,
    arrowcolor="#386641",
    arrowhead=2,
)

record_alt_text(story["alt_text"])
accessibility_checklist(
    palette="Sequential greens with fixed endpoints",
    has_alt_text=bool(story["alt_text"].strip()),
)

fig.show()

### 🔍 Map Limitations to Discuss
- Interpolated values can create smooth trends where reality is jagged.
- Choropleths emphasise area, so large countries appear visually dominant.
- Animations can hide precise numbers; pair with a static chart for reporting.
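
To illustrate the last point, here is a minimal sketch of a static companion chart: a line per country with the final value labelled, on the same fixed 0–100 scale as the map. The toy values are invented for illustration; in the notebook you would filter `df_forest` to a handful of countries instead.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Toy long-form data (invented values, not World Bank figures).
toy = pd.DataFrame(
    {
        "Country Name": ["Country A"] * 3 + ["Country B"] * 3,
        "Year": [1990, 2005, 2020] * 2,
        "ForestPercent": [40.0, 37.5, 35.5, 12.5, 13.0, 14.0],
    }
)

fig, ax = plt.subplots(figsize=(7, 4))
for name, group in toy.groupby("Country Name"):
    group = group.sort_values("Year")
    ax.plot(group["Year"], group["ForestPercent"], marker="o", label=name)
    # Label the last point so readers can quote an exact number.
    last = group.iloc[-1]
    ax.annotate(
        f"{last['ForestPercent']:.1f}%",
        xy=(last["Year"], last["ForestPercent"]),
        xytext=(5, 0),
        textcoords="offset points",
    )

ax.set_xlabel("Year")
ax.set_ylabel("Forest area (% of land)")
ax.set_ylim(0, 100)  # same fixed range as the choropleth color scale
ax.legend(frameon=False)
```

The resulting `fig` can be passed to the notebook's `save_figure` helper so the static chart sits next to the exported map.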

In [None]:
save_plotly_figure(fig, "day04_solution_plot.png")