[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/mihiarc/pyfia/blob/main/notebooks/04_change_analysis.ipynb)

---

In [None]:
# Google Colab Setup - Run this cell first!
import sys
if 'google.colab' in sys.modules:
    print("Running in Google Colab - installing pyFIA...")
    !pip install -q pyfia polars duckdb matplotlib rich
    
    # Download helpers.py for Colab
    import urllib.request
    helpers_url = "https://raw.githubusercontent.com/mihiarc/pyfia/main/notebooks/helpers.py"
    urllib.request.urlretrieve(helpers_url, "helpers.py")
    print("Setup complete! You may now run the remaining cells.")
else:
    print("Running locally - no additional setup needed.")

# Change Analysis: Growth, Mortality, Removals

This notebook covers temporal forest change analysis using FIA's Growth-Removal-Mortality (GRM) methodology.

## What You'll Learn

1. Introduction to GRM (Growth-Removal-Mortality)
2. `mortality()` - Annual mortality by species and size class
3. `growth()` - Net growth estimation
4. `removals()` - Harvest analysis
5. Net change calculation (growth - mortality - removals)
6. Different measure options (volume, biomass, TPA)
7. Forest sustainability assessment

**Prerequisites**: Complete Notebooks 1-3

**Estimated time**: 45 minutes

---

## Setup

In [None]:
# Core imports
from pyfia import (
    FIA,
    volume,
    mortality,
    growth,
    removals,
    join_species_names,
)
import polars as pl
import matplotlib.pyplot as plt

# Notebook helpers
from helpers import ensure_ri_data, display_estimate, plot_by_category

# Ensure data is available
db_path = ensure_ri_data()
print("Ready to begin!")

---

## 1. Introduction to GRM

### What is GRM?

**GRM (Growth-Removal-Mortality)** is FIA's methodology for tracking forest change over time. It measures:

| Component | Description |
|-----------|-------------|
| **Growth** | Volume/biomass added to surviving trees + ingrowth |
| **Removals** | Trees harvested or removed |
| **Mortality** | Trees that died from natural causes |

### Net Change

**Net Change = Growth - Mortality - Removals**

- Positive net change → Forest is accumulating
- Negative net change → Forest is depleting
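
To make the sign convention concrete, here is a minimal sketch of the balance using made-up numbers. The `net_change` helper and the values are illustrative assumptions, not pyFIA API or real Rhode Island estimates.

```python
def net_change(growth: float, mortality: float, removals: float) -> float:
    """Return growth minus mortality minus removals (all in the same annual units)."""
    return growth - mortality - removals


# Hypothetical annual totals, in million cubic feet per year
example_net = net_change(growth=10.0, mortality=4.0, removals=3.0)

if example_net > 0:
    status = "accumulating"
elif example_net < 0:
    status = "depleting"
else:
    status = "stable"

print(f"Net change: {example_net:+.1f} million cuft/year ({status})")
```

All three inputs must be annual values in the same unit (for example, cubic feet per year) for the subtraction to be meaningful.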

### How FIA Tracks Change

FIA uses a **rotating panel design**:
- Plots are revisited every 5-10 years (varies by state)
- Trees are tagged and remeasured
- GRM tables track what happened to each tree between visits

### GRM Tables

pyFIA uses these FIA tables for change analysis:
- `TREE_GRM_COMPONENT` - Change by component (growth, mortality, removal)
- `TREE_GRM_MIDPT` - Mid-point estimates (for annualization)
- `TREE_GRM_BEGIN` - Beginning-of-period tree state
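
Annualization, mentioned above for the mid-point table, conceptually means dividing the change observed between two plot visits by the number of years between them. The sketch below shows that idea only. The `annualize` helper and the numbers are hypothetical and are not pyFIA's internal implementation.

```python
def annualize(change_between_visits: float, remeasurement_years: float) -> float:
    """Convert a change observed over a remeasurement period to a per-year rate."""
    if remeasurement_years <= 0:
        raise ValueError("remeasurement period must be positive")
    return change_between_visits / remeasurement_years


# Hypothetical tree that added 3.5 cubic feet over a 7-year remeasurement period
annual_growth = annualize(3.5, 7.0)
print(f"{annual_growth:.2f} cubic feet/year")
```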

---

## 2. Mortality Estimation

The `mortality()` function estimates annual tree mortality. This is critical for understanding forest health and disturbance impacts.

### Basic Mortality Estimate

In [None]:
with FIA(db_path) as db:
    db.clip_most_recent(eval_type="GRM")  # GRM evaluation needed
    
    result = mortality(db)

display_estimate(result, title="Annual Mortality")

### Understanding Mortality Results

| Column | Description | Unit |
|--------|-------------|------|
| `MORTALITY_ACRE` | Annual mortality per acre | cubic feet/acre/year |
| `MORTALITY_ACRE_SE` | Standard error | cubic feet/acre/year |
| `MORTALITY_TOTAL` | Total annual mortality | cubic feet/year |
| `AREA` | Forest area | acres |
| `N_PLOTS` | Plots with mortality | count |
| `N_TREES` | Dead trees measured | count |

**Note**: GRM estimates are already **annualized** - they represent yearly rates.
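
The standard error column can be turned into an approximate confidence interval. The sketch below assumes a normal approximation (z = 1.96 for roughly 95% coverage) and uses made-up numbers. The `confidence_interval` helper is hypothetical. Notebook 5 covers the statistical methodology in detail.

```python
def confidence_interval(estimate: float, standard_error: float, z: float = 1.96):
    """Return (low, high) bounds of an approximate interval around an estimate."""
    half_width = z * standard_error
    return estimate - half_width, estimate + half_width


# Hypothetical per-acre mortality of 20.0 cuft/acre/year with SE of 2.5
low, high = confidence_interval(20.0, 2.5)
print(f"Approximate 95% CI: {low:.1f} to {high:.1f} cubic feet/acre/year")
```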

In [None]:
# Extract key metrics
mort_per_acre = result["MORTALITY_ACRE"][0]
mort_total = result["MORTALITY_TOTAL"][0]
area_total = result["AREA"][0]

print(f"Rhode Island Annual Mortality:")
print(f"  Per acre: {mort_per_acre:,.1f} cubic feet/acre/year")
print(f"  Total:    {mort_total/1e6:,.2f} million cubic feet/year")
print(f"  Area:     {area_total:,.0f} acres")

### Mortality by Species

In [None]:
with FIA(db_path) as db:
    db.clip_most_recent(eval_type="GRM")
    result = mortality(db, by_species=True)
    
    # Add species names (pass db for reference table lookup)
    result_named = join_species_names(result, db)

# Top 10 by mortality
top_mortality = result_named.sort("MORTALITY_TOTAL", descending=True).head(10)
display_estimate(
    top_mortality.select(["SPCD_NAME", "MORTALITY_ACRE", "MORTALITY_TOTAL", "N_TREES"]),
    title="Top 10 Species by Mortality Volume"
)

In [None]:
# Visualize mortality by species
fig = plot_by_category(
    top_mortality,
    category_col="SPCD_NAME",
    value_col="MORTALITY_TOTAL",
    title="Annual Mortality by Species (Top 10)",
    xlabel="Mortality (cubic feet/year)",
    color="#C62828"
)
plt.show()

### Mortality by Size Class

In [None]:
with FIA(db_path) as db:
    db.clip_most_recent(eval_type="GRM")
    result = mortality(db, by_size_class=True)

# Sort by size class
result_sorted = result.sort("SIZE_CLASS")
display_estimate(
    result_sorted.select(["SIZE_CLASS", "MORTALITY_ACRE", "MORTALITY_TOTAL", "N_TREES"]),
    title="Mortality by Diameter Class"
)

### Different Mortality Measures

Use the `measure` parameter to get mortality in different units:

| `measure` | Description | Columns |
|-----------|-------------|--------|
| `"volume"` | Cubic foot volume (default) | MORTALITY_ACRE, MORTALITY_TOTAL |
| `"biomass"` | Short tons of biomass | MORTALITY_BIOMASS_ACRE, etc. |
| `"tpa"` | Trees per acre | MORTALITY_TPA_ACRE, etc. |
| `"basal_area"` | Square feet per acre | MORTALITY_BAA_ACRE, etc. |

In [None]:
with FIA(db_path) as db:
    db.clip_most_recent(eval_type="GRM")
    
    mort_vol = mortality(db, measure="volume")
    mort_tpa = mortality(db, measure="tpa")
    mort_bio = mortality(db, measure="biomass")

print("Annual Mortality by Different Measures:")
print(f"  Volume:  {mort_vol['MORTALITY_TOTAL'][0]/1e6:,.2f} million cuft/year")
print(f"  Trees:   {mort_tpa['MORTALITY_TPA_TOTAL'][0]/1e6:,.1f} million trees/year")
print(f"  Biomass: {mort_bio['MORTALITY_BIOMASS_TOTAL'][0]/1e3:,.1f} thousand tons/year")

---

## 3. Growth Estimation

The `growth()` function estimates annual net growth, including:
- **Survivor growth**: Volume added to trees that survived between measurements
- **Ingrowth**: Trees that crossed the 5" diameter threshold
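
As a rough illustration of how the two components add up, the sketch below sums survivor growth and ingrowth for a few made-up tree records. The `TreeRecord` class, its fields, and the volumes are hypothetical. Counting an ingrowth tree's full end-of-period volume is a simplification, and real GRM estimation also applies expansion factors and annualization, which are omitted here.

```python
from dataclasses import dataclass

DIAMETER_THRESHOLD = 5.0  # inches; trees below this are not counted in volume


@dataclass
class TreeRecord:
    dia_start: float  # diameter at the first visit (inches)
    dia_end: float    # diameter at the second visit (inches)
    vol_start: float  # volume at the first visit (cubic feet)
    vol_end: float    # volume at the second visit (cubic feet)


def growth_components(trees):
    """Return (survivor_growth, ingrowth) in cubic feet for live trees."""
    survivor = 0.0
    ingrowth = 0.0
    for tree in trees:
        if tree.dia_start >= DIAMETER_THRESHOLD:
            # Already above the threshold: count the volume gained
            survivor += tree.vol_end - tree.vol_start
        elif tree.dia_end >= DIAMETER_THRESHOLD:
            # Crossed the threshold between visits: count its new volume
            ingrowth += tree.vol_end
    return survivor, ingrowth


sample_trees = [
    TreeRecord(8.0, 9.0, 10.0, 12.5),  # survivor
    TreeRecord(4.5, 5.5, 0.0, 1.5),    # ingrowth
    TreeRecord(3.0, 4.0, 0.0, 0.0),    # still below the threshold
]
survivor_growth, ingrowth_volume = growth_components(sample_trees)
print(f"Survivor growth: {survivor_growth:.1f} cuft, ingrowth: {ingrowth_volume:.1f} cuft")
```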

In [None]:
with FIA(db_path) as db:
    db.clip_most_recent(eval_type="GRM")
    
    result = growth(db)

display_estimate(result, title="Annual Net Growth")

In [None]:
# Growth metrics
growth_per_acre = result["GROWTH_ACRE"][0]
growth_total = result["GROWTH_TOTAL"][0]

print(f"Rhode Island Annual Growth:")
print(f"  Per acre: {growth_per_acre:,.1f} cubic feet/acre/year")
print(f"  Total:    {growth_total/1e6:,.2f} million cubic feet/year")

### Growth by Species

In [None]:
with FIA(db_path) as db:
    db.clip_most_recent(eval_type="GRM")
    result = growth(db, by_species=True)
    
    # Add species names (pass db for reference table lookup)
    result_named = join_species_names(result, db)

top_growth = result_named.sort("GROWTH_TOTAL", descending=True).head(10)

display_estimate(
    top_growth.select(["SPCD_NAME", "GROWTH_ACRE", "GROWTH_TOTAL", "N_TREES"]),
    title="Top 10 Species by Growth"
)

In [None]:
# Visualize growth by species
fig = plot_by_category(
    top_growth,
    category_col="SPCD_NAME",
    value_col="GROWTH_TOTAL",
    title="Annual Growth by Species (Top 10)",
    xlabel="Growth (cubic feet/year)",
    color="#2E7D32"
)
plt.show()

---

## 4. Removals Estimation

The `removals()` function estimates the annual volume of trees harvested or otherwise removed. This includes:
- Timber harvesting
- Land clearing
- Salvage operations

In [None]:
with FIA(db_path) as db:
    db.clip_most_recent(eval_type="GRM")
    
    result = removals(db)

display_estimate(result, title="Annual Removals")

In [None]:
# Removals metrics
rem_per_acre = result["REMOVALS_ACRE"][0]
rem_total = result["REMOVALS_TOTAL"][0]

print(f"Rhode Island Annual Removals:")
print(f"  Per acre: {rem_per_acre:,.1f} cubic feet/acre/year")
print(f"  Total:    {rem_total/1e6:,.2f} million cubic feet/year")

### Removals by Species

In [None]:
with FIA(db_path) as db:
    db.clip_most_recent(eval_type="GRM")
    result = removals(db, by_species=True)
    
    # Add species names (pass db for reference table lookup)
    result_named = join_species_names(result, db)

top_removals = result_named.sort("REMOVALS_TOTAL", descending=True).head(10)

display_estimate(
    top_removals.select(["SPCD_NAME", "REMOVALS_ACRE", "REMOVALS_TOTAL", "N_TREES"]),
    title="Top 10 Species by Removals"
)

---

## 5. Net Change Calculation

Now let's calculate the forest balance: **Net Change = Growth - Mortality - Removals**

In [None]:
with FIA(db_path) as db:
    db.clip_most_recent(eval_type="GRM")
    
    growth_result = growth(db)
    mort_result = mortality(db)
    rem_result = removals(db)
    
    # Also get current standing volume for context
    db.clip_most_recent(eval_type="VOL")  # Switch to volume evaluation
    standing_vol = volume(db)

# Extract totals
g = growth_result["GROWTH_TOTAL"][0]
m = mort_result["MORTALITY_TOTAL"][0]
r = rem_result["REMOVALS_TOTAL"][0]
standing = standing_vol["VOLCFNET_TOTAL"][0]

# Calculate net change
net_change = g - m - r

print("="*55)
print("RHODE ISLAND FOREST CHANGE SUMMARY (Annual)")
print("="*55)
print(f"\n  Growth:           +{g/1e6:>8,.2f} million cuft/year")
print(f"  Mortality:        -{m/1e6:>8,.2f} million cuft/year")
print(f"  Removals:         -{r/1e6:>8,.2f} million cuft/year")
print(f"  " + "-"*40)
print(f"  Net Change:       {'+' if net_change >= 0 else ''}{net_change/1e6:>8,.2f} million cuft/year")
print(f"\n  Standing Volume:   {standing/1e6:>8,.1f} million cuft")
print(f"  Annual Change %:   {'+' if net_change >= 0 else ''}{net_change/standing*100:.2f}%")
print("="*55)

In [None]:
# Visualize the forest balance
fig, ax = plt.subplots(figsize=(10, 6))

components = ['Growth', 'Mortality', 'Removals', 'Net Change']
values = [g/1e6, -m/1e6, -r/1e6, net_change/1e6]
colors = ['#2E7D32', '#C62828', '#F57C00', '#1565C0' if net_change >= 0 else '#C62828']

bars = ax.bar(components, values, color=colors, edgecolor='white', linewidth=2)

# Add value labels
for bar, val in zip(bars, values):
    height = bar.get_height()
    ax.text(bar.get_x() + bar.get_width()/2., height,
            f'{val:+.2f}',
            ha='center', va='bottom' if height >= 0 else 'top',
            fontweight='bold', fontsize=12)

ax.axhline(y=0, color='black', linestyle='-', linewidth=0.5)
ax.set_ylabel('Million Cubic Feet per Year')
ax.set_title('Rhode Island Annual Forest Change Components', fontsize=14, fontweight='bold')
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)

plt.tight_layout()
plt.show()

---

## 6. Species-Level Change Analysis

Let's calculate net change by species to see which are gaining or losing.
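
A species can appear in one component but not another (for example, mortality recorded but no removals), so the merge has to keep every species and treat missing values as zero. Here is a minimal plain-Python sketch of that logic. The species codes and totals are made up for illustration.

```python
# Hypothetical annual totals keyed by species code (million cuft/year)
growth_by_species = {316: 5.0, 833: 3.0}
mortality_by_species = {316: 1.0, 129: 0.5}
removals_by_species = {833: 0.5}

# Union of keys plays the role of a full (outer) join
all_species = sorted(
    set(growth_by_species) | set(mortality_by_species) | set(removals_by_species)
)

# Missing components default to 0.0, like filling nulls after the join
net_by_species = {
    spcd: growth_by_species.get(spcd, 0.0)
    - mortality_by_species.get(spcd, 0.0)
    - removals_by_species.get(spcd, 0.0)
    for spcd in all_species
}

for spcd, net in net_by_species.items():
    print(f"SPCD {spcd}: {net:+.1f}")
```

Species 129 has no growth entry in this toy data, so it ends up with a negative net change rather than being dropped, which is the behavior an inner join would lose.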

In [None]:
with FIA(db_path) as db:
    db.clip_most_recent(eval_type="GRM")
    
    growth_sp = growth(db, by_species=True)
    mort_sp = mortality(db, by_species=True)
    rem_sp = removals(db, by_species=True)

    # Merge datasets on SPCD
    # A full join keeps species that are missing from any one component
    change_df = (
        growth_sp.select(["SPCD", "GROWTH_TOTAL"])
        .join(
            mort_sp.select(["SPCD", "MORTALITY_TOTAL"]),
            on="SPCD",
            how="full",
            coalesce=True,  # keep a single SPCD key column
        )
        .join(
            rem_sp.select(["SPCD", "REMOVALS_TOTAL"]),
            on="SPCD",
            how="full",
            coalesce=True,
        )
        .fill_null(0)  # Species missing from a component contribute 0
    )

    # Calculate net change
    change_df = change_df.with_columns(
        (pl.col("GROWTH_TOTAL") - pl.col("MORTALITY_TOTAL") - pl.col("REMOVALS_TOTAL")).alias("NET_CHANGE")
    )

    # Add species names (pass db for reference table lookup)
    change_named = join_species_names(change_df, db)

# Sort by net change
change_sorted = change_named.sort("NET_CHANGE", descending=True)

In [None]:
# Top gainers
print("Species GAINING Volume:")
gainers = change_sorted.filter(pl.col("NET_CHANGE") > 0).head(10)
for row in gainers.iter_rows(named=True):
    print(f"  {row['SPCD_NAME']:<30} +{row['NET_CHANGE']/1e6:,.2f} M cuft/yr")

In [None]:
# Top losers
print("Species LOSING Volume:")
losers = change_sorted.filter(pl.col("NET_CHANGE") < 0).sort("NET_CHANGE").head(10)
for row in losers.iter_rows(named=True):
    print(f"  {row['SPCD_NAME']:<30} {row['NET_CHANGE']/1e6:,.2f} M cuft/yr")

In [None]:
# Visualize winners and losers
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 5))

# Gainers
top_gainers = change_sorted.filter(pl.col("NET_CHANGE") > 0).head(8)
ax1.barh(top_gainers["SPCD_NAME"].to_list()[::-1], 
         [v/1e6 for v in top_gainers["NET_CHANGE"].to_list()[::-1]], 
         color='#2E7D32')
ax1.set_xlabel('Net Change (million cuft/yr)')
ax1.set_title('Species Gaining Volume', fontweight='bold')
ax1.spines['top'].set_visible(False)
ax1.spines['right'].set_visible(False)

# Losers
top_losers = change_sorted.filter(pl.col("NET_CHANGE") < 0).sort("NET_CHANGE").head(8)
ax2.barh(top_losers["SPCD_NAME"].to_list()[::-1], 
         [v/1e6 for v in top_losers["NET_CHANGE"].to_list()[::-1]], 
         color='#C62828')
ax2.set_xlabel('Net Change (million cuft/yr)')
ax2.set_title('Species Losing Volume', fontweight='bold')
ax2.spines['top'].set_visible(False)
ax2.spines['right'].set_visible(False)

plt.tight_layout()
plt.show()

---

## 7. Change by Ownership

Analyze change patterns across different ownership categories.

In [None]:
ownership_names = {10: "National Forest", 20: "Other Federal", 30: "State/Local", 40: "Private"}

with FIA(db_path) as db:
    db.clip_most_recent(eval_type="GRM")
    
    growth_own = growth(db, grp_by="OWNGRPCD")
    mort_own = mortality(db, grp_by="OWNGRPCD")
    rem_own = removals(db, grp_by="OWNGRPCD")

# Build change summary
print("Net Change by Ownership:")
print(f"{'Ownership':<20} {'Growth':>12} {'Mortality':>12} {'Removals':>12} {'Net Change':>12}")
print("-" * 70)

for own_code, own_name in ownership_names.items():
    g_row = growth_own.filter(pl.col("OWNGRPCD") == own_code)
    m_row = mort_own.filter(pl.col("OWNGRPCD") == own_code)
    r_row = rem_own.filter(pl.col("OWNGRPCD") == own_code)
    
    if len(g_row) > 0:
        g_val = g_row["GROWTH_TOTAL"][0]
        m_val = m_row["MORTALITY_TOTAL"][0] if len(m_row) > 0 else 0
        r_val = r_row["REMOVALS_TOTAL"][0] if len(r_row) > 0 else 0
        net = g_val - m_val - r_val
        
        print(f"{own_name:<20} {g_val/1e3:>+12,.0f} {-m_val/1e3:>12,.0f} {-r_val/1e3:>12,.0f} {net/1e3:>+12,.0f}")

print("\n(Values in thousand cubic feet per year)")

---

## 8. Forest Sustainability Assessment

Use GRM data to assess forest sustainability.

In [None]:
with FIA(db_path) as db:
    db.clip_most_recent(eval_type="GRM")
    
    g = growth(db)
    m = mortality(db)
    r = removals(db)
    
    db.clip_most_recent(eval_type="VOL")
    standing = volume(db)

# Calculate key ratios
growth_val = g["GROWTH_TOTAL"][0]
mort_val = m["MORTALITY_TOTAL"][0]
rem_val = r["REMOVALS_TOTAL"][0]
standing_val = standing["VOLCFNET_TOTAL"][0]

# Growth-to-drain ratio (sustainable if > 1.0)
drain = mort_val + rem_val
growth_drain_ratio = growth_val / drain if drain > 0 else float('inf')

# Removal-to-growth ratio (sustainable harvest level)
removal_growth_ratio = rem_val / growth_val if growth_val > 0 else 0

# Mortality-to-growth ratio
mort_growth_ratio = mort_val / growth_val if growth_val > 0 else 0

print("="*55)
print("FOREST SUSTAINABILITY INDICATORS")
print("="*55)
print(f"\nGrowth-to-Drain Ratio:        {growth_drain_ratio:>6.2f}")
print(f"  (Values > 1.0 indicate accumulation)")

print(f"\nRemoval-to-Growth Ratio:      {removal_growth_ratio:>6.1%}")
print(f"  (Sustainable harvest typically < 80%)")

print(f"\nMortality-to-Growth Ratio:    {mort_growth_ratio:>6.1%}")
print(f"  (High values may indicate forest stress)")

print(f"\nAnnual Net Change:            {'+' if (growth_val - drain) >= 0 else ''}{(growth_val - drain)/standing_val*100:.2f}%")
print(f"  (of standing inventory)")
print("="*55)

# Assessment
print("\nSUSTAINABILITY ASSESSMENT:")
if growth_drain_ratio > 1.0:
    print("  Forest is ACCUMULATING volume")
else:
    print("  Forest is DEPLETING - removals + mortality exceed growth")
    
if removal_growth_ratio < 0.5:
    print("  Harvest level is well below sustainable capacity")
elif removal_growth_ratio < 0.8:
    print("  Harvest level is within sustainable range")
else:
    print("  Harvest level is at or above sustainable capacity")

---

## Exercise 1: Mortality Analysis by Size Class

**Task**: Analyze mortality patterns across diameter classes.

1. Get mortality by size class using `by_size_class=True`
2. Calculate the mortality rate (mortality / standing volume) for each class
3. Identify which size class has the highest mortality rate

**Hint**: You'll need to also get standing volume by size class from `volume()`

In [None]:
# Your code here


<details>
<summary><b>Click to reveal solution</b></summary>

```python
with FIA(db_path) as db:
    db.clip_most_recent(eval_type="GRM")
    mort_size = mortality(db, by_size_class=True)
    
    db.clip_most_recent(eval_type="VOL")
    vol_size = volume(db, by_size_class=True)

# Merge and calculate rate
analysis = (
    mort_size.select(["SIZE_CLASS", "MORTALITY_TOTAL"])
    .join(
        vol_size.select(["SIZE_CLASS", "VOLCFNET_TOTAL"]),
        on="SIZE_CLASS"
    )
    .with_columns(
        (pl.col("MORTALITY_TOTAL") / pl.col("VOLCFNET_TOTAL") * 100).alias("MORTALITY_RATE_PCT")
    )
    .sort("SIZE_CLASS")
)

print("Mortality Rate by Size Class:")
print(f"{'Size Class':<15} {'Mortality':>15} {'Standing Vol':>15} {'Rate':>10}")
print("-" * 55)
for row in analysis.iter_rows(named=True):
    print(f"{row['SIZE_CLASS']:<15} {row['MORTALITY_TOTAL']:>15,.0f} {row['VOLCFNET_TOTAL']:>15,.0f} {row['MORTALITY_RATE_PCT']:>9.2f}%")

# .row(0, named=True) returns the top row as a dict, so values format directly
highest = analysis.sort("MORTALITY_RATE_PCT", descending=True).row(0, named=True)
print(f"\nHighest mortality rate: {highest['SIZE_CLASS']} class at {highest['MORTALITY_RATE_PCT']:.2f}% per year")
```

</details>

---

## Exercise 2: Species Sustainability Report

**Task**: Create a sustainability report for the top 5 species by volume.

For each species, calculate:
1. Standing volume
2. Annual growth
3. Annual mortality
4. Annual removals
5. Net change
6. Years to double (if accumulating) or deplete (if declining)

**Hint**: Years to double/deplete ≈ Standing Volume / |Net Change|
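
The hint's formula is a linear approximation: it assumes the absolute net change stays constant every year. The sketch below compares it with a compound estimate that assumes a constant percentage rate instead. Both helpers and the numbers are hypothetical and only illustrate the difference.

```python
import math


def years_linear(standing: float, net_change: float) -> float:
    """Years to double (or deplete) if the absolute net change stays constant."""
    if net_change == 0:
        return math.inf
    return abs(standing / net_change)


def years_to_double_compound(standing: float, net_change: float) -> float:
    """Years to double if the percentage growth rate stays constant."""
    rate = net_change / standing
    if rate <= 0:
        return math.inf  # a non-growing inventory never doubles
    return math.log(2) / math.log(1 + rate)


# Hypothetical species: 100 million cuft standing, +2 million cuft/year net
standing_volume, annual_net = 100.0, 2.0
print(f"Linear estimate:   {years_linear(standing_volume, annual_net):.0f} years")
print(f"Compound estimate: {years_to_double_compound(standing_volume, annual_net):.0f} years")
```

The linear form is simpler and is what the exercise asks for. The compound form gives a shorter doubling time for the same inputs.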

In [None]:
# Your code here


<details>
<summary><b>Click to reveal solution</b></summary>

```python
with FIA(db_path) as db:
    db.clip_most_recent(eval_type="VOL")
    standing = volume(db, by_species=True)
    
    db.clip_most_recent(eval_type="GRM")
    growth_sp = growth(db, by_species=True)
    mort_sp = mortality(db, by_species=True)
    rem_sp = removals(db, by_species=True)

    # Get top 5 species
    top_species = standing.sort("VOLCFNET_TOTAL", descending=True).head(5)["SPCD"].to_list()

    print("SPECIES SUSTAINABILITY REPORT")
    print("="*80)

    for spcd in top_species:
        # Get values for this species
        s = standing.filter(pl.col("SPCD") == spcd)
        g = growth_sp.filter(pl.col("SPCD") == spcd)
        m = mort_sp.filter(pl.col("SPCD") == spcd)
        r = rem_sp.filter(pl.col("SPCD") == spcd)
        
        if len(s) > 0:
            s_val = s["VOLCFNET_TOTAL"][0]
            g_val = g["GROWTH_TOTAL"][0] if len(g) > 0 else 0
            m_val = m["MORTALITY_TOTAL"][0] if len(m) > 0 else 0
            r_val = r["REMOVALS_TOTAL"][0] if len(r) > 0 else 0
            net = g_val - m_val - r_val
            
            # Get species name (pass db for reference table lookup)
            s_named = join_species_names(s, db)
            name = s_named["SPCD_NAME"][0]
            
            years = abs(s_val / net) if net != 0 else float('inf')
            
            print(f"\n{name} (SPCD {spcd})")
            print(f"  Standing Volume: {s_val/1e6:,.2f} million cuft")
            print(f"  Annual Growth:   +{g_val/1e3:,.0f} thousand cuft/yr")
            print(f"  Annual Mortality: -{m_val/1e3:,.0f} thousand cuft/yr")
            print(f"  Annual Removals:  -{r_val/1e3:,.0f} thousand cuft/yr")
            print(f"  Net Change:      {'+' if net >= 0 else ''}{net/1e3:,.0f} thousand cuft/yr")
            
            if net > 0:
                print(f"  Status: ACCUMULATING (~{years:.0f} years to double)")
            elif net < 0:
                print(f"  Status: DECLINING (~{years:.0f} years to deplete)")
            else:
                print(f"  Status: STABLE")

print("\n" + "="*80)
```

</details>

---

## Summary

In this notebook, you learned:

1. **GRM methodology** - Growth, Removal, Mortality estimation
2. **`mortality()`** - Annual mortality by species, size, ownership
3. **`growth()`** - Annual net growth (survivor + ingrowth)
4. **`removals()`** - Annual harvest and removal volumes
5. **Net Change** - Growth - Mortality - Removals
6. **Measure options** - volume, biomass, tpa, basal_area
7. **Sustainability metrics** - Growth-to-drain ratio, removal-to-growth

### Key Points

- GRM estimates are **annualized** - they represent yearly rates
- Use `eval_type="GRM"` for change analysis
- Positive net change indicates accumulation, negative indicates depletion
- Growth-to-drain ratio > 1.0 means growth exceeds mortality plus removals, so the forest is accumulating volume

## Next Steps

Continue to **Notebook 5: Validation and Statistics** to learn:
- Statistical methodology (Bechtold & Patterson 2005)
- Understanding variance and confidence intervals
- Validating results against EVALIDator
- Best practices and common pitfalls