## 🔗 Open This Notebook in Google Colab

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/DavidLangworthy/ds4s/blob/master/days/day04/solution/day04_solution.ipynb)

# 🌍 Day 4 – Mapping Forest Change and Biodiversity Pressure
### Guided loops: validate → map → animate → interpret

Today's loops focus on geospatial storytelling. We'll load a long-form forest dataset, check for plausible ranges, build a baseline choropleth, then extend it into an animation that reveals regional forest loss over three decades.

## 📇 Data Card: World Bank Forest Area (% of Land)
- **Source**: FAO & World Bank World Development Indicators (downloaded January 2024).
- **Temporal coverage**: 1990–2021, annual values for countries and regions.
- **Units**: Percent of land area covered by forest (0–100).
- **Processing notes**: Data reshaped to long format with ISO3 codes; regional aggregates included.
- **Last updated**: December 2023 WDI refresh.
- **Caveats**: The forest definition follows FAO (land of more than 0.5 ha with trees taller than 5 m and canopy cover above 10%); natural and plantation forests are not distinguished. Regional aggregates must be filtered out for country-level analysis.

> 🔎 **What this map cannot tell us**: Biodiversity quality, sub-national hotspots, or drivers (logging vs fire vs agriculture). Complement with qualitative context before policy decisions.
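
The processing note in the data card says the data were reshaped to long format. A minimal sketch of that kind of reshaping with `DataFrame.melt`, assuming a wide table with one column per year. The toy values and country codes are invented, and the course's actual preprocessing script is not shown in this notebook.

```python
import pandas as pd

# Toy wide-format table (invented values, not real WDI figures).
wide = pd.DataFrame({
    "Country Name": ["Country A", "Country B"],
    "Country Code": ["AAA", "BBB"],
    "1990": [60.0, 40.0],
    "2021": [55.0, 42.0],
})

# Turn each year column into its own row.
long_df = wide.melt(
    id_vars=["Country Name", "Country Code"],
    var_name="Year",
    value_name="ForestPercent",
)

# Year labels start as strings because they were column names.
long_df["Year"] = long_df["Year"].astype(int)
long_df = long_df.sort_values(["Country Code", "Year"]).reset_index(drop=True)
print(long_df)
```

The result has one row per country-year, which is the shape `animation_frame="Year"` expects later in the notebook.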

## 🗺️ Workflow Map
1. **Setup & helpers**.
2. **Load & inspect** the long-form dataset.
3. **Filter** to ISO-coded countries and confirm ranges.
4. **Story scaffold** for map metadata and ethics.
5. **Visualise** static baseline + animated timeline with Plotly.
6. **Reflect** on uncertainty, color choices, and interpretation limits.

## Step 0 · Imports, style, and diagnostics

In [None]:

from pathlib import Path
from textwrap import dedent

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from IPython.display import Image, display

# Style defaults are applied once by baseline_style(), defined and called below.


def baseline_style():
    """Reset the Matplotlib/Seaborn style so every figure starts consistent."""
    sns.set_theme(style="whitegrid")
    plt.rcParams.update({
        "axes.titlesize": 18,
        "axes.labelsize": 13,
        "axes.titleweight": "bold",
        "figure.titlesize": 20,
        "xtick.labelsize": 11,
        "ytick.labelsize": 11,
        "legend.title_fontsize": 12,
        "legend.fontsize": 11,
    })
    return plt


def quick_peek(df, expected_columns=None, sample=3, label="DataFrame"):
    """Print a friendly snapshot so students can self-diagnose issues quickly."""
    print(f"\n🔍 {label} preview")
    print(df.head(sample))
    print(f"Rows: {len(df):,} | Columns: {list(df.columns)}")
    if expected_columns:
        missing = [col for col in expected_columns if col not in df.columns]
        if missing:
            print(f"⚠️ Missing column(s): {missing}")
        else:
            print("✅ Columns match the expectation.")
    return df


def expect_rows_between(df, low, high, label="row count"):
    """Report whether the row count falls within the inclusive range [low, high]."""
    rows = len(df)
    if low <= rows <= high:
        print(f"✅ {label.title()} looks right: {rows:,}.")
    else:
        print(f"⚠️ {label.title()} looks off: {rows:,}. Expected between {low:,} and {high:,}.")
    return rows


def validate_story_elements(elements):
    """Flag storytelling fields that are empty or whitespace-only."""
    missing = [key for key, value in elements.items() if not value or not str(value).strip()]
    if missing:
        print(f"⚠️ Please fill in these storytelling fields: {', '.join(missing)}")
    else:
        print("✅ Story scaffold is ready: every element is filled in.")
    return elements


def save_last_fig(filename, fig=None, dpi=300):
    """Save the latest Matplotlib figure with consistent export settings."""
    output_path = Path.cwd() / filename
    output_path.parent.mkdir(parents=True, exist_ok=True)
    if fig is None:
        fig = plt.gcf()
    if fig and getattr(fig, "axes", None):
        fig.savefig(output_path, dpi=dpi, bbox_inches="tight")
        print(f"💾 Saved figure to {output_path}")
    else:
        print("⚠️ No figure detected to save.")
    return output_path

baseline_style()


## Step 1 · Load the long-form forest dataset
**Micro-task**: read the CSV, inspect the first rows, and note column names.

In [None]:

data_dir = Path.cwd() / "data"
forest_df = pd.read_csv(data_dir / "forest_area_long.csv")
quick_peek(forest_df, expected_columns=["Country Name", "Country Code", "Year", "ForestPercent"], label="Forest area raw table")


### Sanity checks: year range and percent bounds

In [None]:

min_year, max_year = forest_df["Year"].min(), forest_df["Year"].max()
print(f"🗓️ Years covered: {min_year} → {max_year}")
if min_year > 1990 or max_year < 2021:
    print("⚠️ Expected coverage of 1990–2021. Check the CSV path or filters.")
else:
    print("✅ Year range matches the data card.")

min_pct, max_pct = forest_df["ForestPercent"].min(), forest_df["ForestPercent"].max()
print(f"🌲 Forest percent range: {min_pct:.2f} → {max_pct:.2f}")
if min_pct < 0 or max_pct > 100:
    print("⚠️ Percent values outside 0–100. Investigate outliers before mapping.")
else:
    print("✅ Percent values fall within 0–100.")
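
Range checks do not catch missing values or repeated country-year rows, both of which can quietly distort a choropleth. A minimal sketch of two extra checks, assuming the same column names as the dataset above. `integrity_report` is a hypothetical helper, and the toy rows are invented.

```python
import pandas as pd

# Toy records with one missing value and one repeated country-year pair.
toy = pd.DataFrame({
    "Country Code": ["AAA", "AAA", "AAA", "BBB"],
    "Year": [1990, 1991, 1991, 1990],
    "ForestPercent": [60.0, 59.5, 59.5, None],
})


def integrity_report(df, keys=("Country Code", "Year"), value="ForestPercent"):
    """Count missing values and repeated country-year keys (hypothetical helper)."""
    return {
        "missing_values": int(df[value].isna().sum()),
        "duplicate_keys": int(df.duplicated(subset=list(keys)).sum()),
    }


report = integrity_report(toy)
print(report)
```

Running `integrity_report(forest_df)` on the real table would show whether any rows need attention before mapping.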


## Step 2 · Filter to ISO3 countries
Regional aggregates can distort the map; keep ISO codes only.

In [None]:

def is_iso(code: str) -> bool:
    return isinstance(code, str) and len(code) == 3 and code.isalpha()

forest_countries = forest_df[forest_df["Country Code"].apply(is_iso)].copy()
expect_rows_between(forest_countries, 5000, 6000, label="country-year rows")
quick_peek(forest_countries, label="Filtered country records")
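
One caveat about the filter above: World Bank aggregates such as `WLD` (World) or `SSF` (Sub-Saharan Africa) also use three-letter alphabetic codes, so a shape-only test like `is_iso` lets them through if they are present. Whether the course CSV contains such rows is not confirmed here. A minimal sketch of an explicit exclusion step follows. `AGGREGATE_CODES` is an illustrative subset rather than a complete list, `drop_aggregates` is a hypothetical helper, and the toy values are placeholders.

```python
import pandas as pd

# Illustrative subset of World Bank aggregate codes; NOT a complete list.
AGGREGATE_CODES = {
    "WLD", "EAS", "ECS", "LCN", "MEA", "NAC", "SAS", "SSF",
    "HIC", "LIC", "LMC", "UMC",
}


def drop_aggregates(df, code_col="Country Code", aggregates=AGGREGATE_CODES):
    """Return only rows whose code is not in the aggregate set (hypothetical helper)."""
    return df[~df[code_col].isin(aggregates)].copy()


# Toy rows with placeholder values, not real WDI figures.
toy = pd.DataFrame({
    "Country Name": ["Brazil", "World", "Finland"],
    "Country Code": ["BRA", "WLD", "FIN"],
    "Year": [2021, 2021, 2021],
    "ForestPercent": [59.0, 31.0, 73.0],
})

countries_only = drop_aggregates(toy)
print(countries_only["Country Code"].tolist())
```

Plotly ignores location codes it cannot match, so leftover aggregates may not appear on the map, but they still inflate row counts and any summary statistics.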


### Progress anchor

In [None]:
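# width belongs to Image(), not display(); the PNG is written by the export checkpoint further down.
display(Image(filename=str(Path.cwd() / "plots" / "day04_solution_plot.png"), width=420))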

## Step 3 · Story-first map checklist

In [None]:

TITLE = "Forest Cover Has Fallen Sharply in Key Biodiversity Hotspots"
SUBTITLE = "Share of national land area covered by forest, 1990–2021"
ANNOTATION = "Watch tropical regions in the Amazon, Congo Basin, and Southeast Asia thin over time."
SOURCE = "World Bank WDI (FAO Forest Resources Assessment, 2023 update)"
UNITS = "Forest area (% of land area)"
ACCESSIBILITY_NOTES = "Sequential 'Greens' palette with fixed 0–100 scale; animation labeled by year; includes source footnote."

validate_story_elements({
    "TITLE": TITLE,
    "SUBTITLE": SUBTITLE,
    "ANNOTATION": ANNOTATION,
    "SOURCE": SOURCE,
    "UNITS": UNITS,
    "ACCESSIBILITY_NOTES": ACCESSIBILITY_NOTES,
})


## Step 4 · Build static and animated maps with Plotly

In [None]:

import plotly.express as px

latest_year = int(forest_countries["Year"].max())
latest_snapshot = forest_countries[forest_countries["Year"] == latest_year]

fig_static = px.choropleth(
    latest_snapshot,
    locations="Country Code",
    color="ForestPercent",
    hover_name="Country Name",
    color_continuous_scale="Greens",
    range_color=[0, 100],
    labels={"ForestPercent": "Forest area (%)"},
    title=f"{TITLE}<br><sup>{SUBTITLE} | snapshot: {latest_year}</sup>",
)
fig_static.update_layout(
    margin=dict(l=0, r=0, t=80, b=60),  # bottom margin leaves room for the source footnote at y=-0.1
    coloraxis_colorbar=dict(title="% forest"),
    annotations=[
        dict(
            text=f"Source: {SOURCE} | Notes: {ACCESSIBILITY_NOTES}",
            x=0,
            y=-0.1,
            showarrow=False,
            xref="paper",
            yref="paper",
            font=dict(size=11, color="#4f4f4f"),
            align="left",
        )
    ],
)

fig_animation = px.choropleth(
    forest_countries,
    locations="Country Code",
    color="ForestPercent",
    hover_name="Country Name",
    animation_frame="Year",
    color_continuous_scale="Greens",
    range_color=[0, 100],
    labels={"ForestPercent": "Forest area (%)"},
    title=f"{TITLE}<br><sup>{SUBTITLE}</sup>",
)
fig_animation.update_layout(
    margin=dict(l=0, r=0, t=80, b=0),
    coloraxis_colorbar=dict(title="% forest"),
)
fig_animation.add_annotation(
    dict(
        text=ANNOTATION,
        x=0.01,
        y=0.05,
        xref="paper",
        yref="paper",
        showarrow=False,
        bgcolor="rgba(255,255,255,0.8)",
        font=dict(size=11, color="#264653"),
        align="left",
    )
)

fig_static.show()
fig_animation.show()
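
Plotly Express generally builds animation frames in the order the `animation_frame` values first appear in the data, so an unsorted table can play the years out of sequence. A minimal sketch of a pre-flight step to run before calling `px.choropleth`. `prepare_frames` is a hypothetical helper, and the toy rows are invented.

```python
import pandas as pd


def prepare_frames(df, frame_col="Year"):
    """Sort rows chronologically and list any years missing from the sequence."""
    ordered = df.sort_values(frame_col, kind="stable").reset_index(drop=True)
    years = ordered[frame_col].unique()
    expected = range(int(years.min()), int(years.max()) + 1)
    gaps = sorted(set(expected) - {int(y) for y in years})
    return ordered, gaps


# Toy records listed out of order, with 1991 absent.
toy = pd.DataFrame({
    "Country Code": ["AAA", "AAA", "AAA"],
    "Year": [1992, 1990, 1993],
    "ForestPercent": [59.0, 60.0, 58.5],
})

ordered, gaps = prepare_frames(toy)
print(ordered["Year"].tolist(), gaps)
```

Passing the sorted table to `px.choropleth` keeps the slider chronological, and a non-empty `gaps` list signals years the animation will skip.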


### Export checkpoint

In [None]:

plots_dir = Path.cwd() / "plots"
plots_dir.mkdir(parents=True, exist_ok=True)
try:
    fig_animation.write_html(str(plots_dir / "day04_solution_map.html"))
    fig_static.write_image(str(plots_dir / "day04_solution_plot.png"))
    print("💾 Saved animated map HTML and static PNG to plots/")
except Exception as exc:
    print("⚠️ Export step warning:", exc)


## Step 5 · Reflect on interpretation, integrity, and uncertainty
- **Claim → Evidence → Visual → Takeaway**:
  - **Claim**: Forest cover is shrinking fastest in tropical biodiversity hotspots, while some temperate countries gain cover.
  - **Evidence**: Animated choropleth shows Amazon, Congo Basin, and Southeast Asia shifting from dark to pale green; Europe thickens modestly.
  - **Visual**: Static snapshot plus animation with consistent scale and annotation.
  - **Takeaway**: Habitat loss is spatially concentrated and ongoing, underscoring the need for targeted conservation.
- **Limitations**: National averages hide sub-national variation; FAO definitions include plantations.
- **Potential misreads**: A sequential palette may imply linear change; note that percentage points can mask differences in absolute area.
- **Next questions**: Which policies reversed trends in nations with gains? How do forest changes align with biodiversity loss metrics?
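
The point about percentage points masking absolute area can be made concrete with a little arithmetic. A minimal sketch, assuming two made-up countries: the codes, forest shares, and land areas below are all hypothetical, and land area is not a column in the forest dataset used in this notebook.

```python
import pandas as pd

# Toy long-form records (invented values, not real WDI figures).
toy = pd.DataFrame({
    "Country Code": ["AAA", "AAA", "BBB", "BBB"],
    "Year": [1990, 2021, 1990, 2021],
    "ForestPercent": [60.0, 55.0, 40.0, 30.0],
})

# Hypothetical land areas in square kilometres.
land_km2 = pd.Series({"AAA": 5_000_000, "BBB": 50_000})

# One row per country, one column per year.
wide = toy.pivot(index="Country Code", columns="Year", values="ForestPercent")

change = pd.DataFrame({"pp_change": wide[2021] - wide[1990]})
# Convert percentage points of land area into square kilometres.
change["km2_change"] = change["pp_change"] * land_km2 / 100
print(change)
```

In this toy case country BBB loses twice as many percentage points as AAA, yet AAA loses fifty times more forest area, which is why a percent-based map is worth pairing with absolute figures.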

## Process quality checklist
✅ Loaded and validated long-form data • ✅ Filtered ISO-coded countries • ✅ Completed story scaffold • ✅ Built static + animated maps with clear metadata • ✅ Reflected on spatial uncertainty and integrity.