# Getting Started with FIA Data in Python

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/mihiarc/fiatools/blob/main/tutorials/01_getting_started_with_fia_data.ipynb)
[![FIAtools](https://img.shields.io/badge/FIAtools-Ecosystem-2E7D32)](https://fiatools.org)

This tutorial introduces you to working with USDA Forest Service Forest Inventory and Analysis (FIA) data using Python and the **pyFIA** library.

## What You'll Learn

- What FIA data is and why it's useful
- How to install pyFIA and download FIA data
- How to query forest area, timber volume, biomass, and tree counts
- How to filter data by state, species, and other criteria
- How to interpret the statistical output

## Prerequisites

- Basic Python knowledge
- Python 3.11 or higher

---

## 1. What is FIA Data?

The [Forest Inventory and Analysis (FIA)](https://www.fia.fs.usda.gov/) program is the nation's forest census. The USDA Forest Service collects data on:

- **Forest area** - How much land is forested?
- **Timber volume** - How much merchantable wood is available?
- **Biomass & Carbon** - How much carbon is stored in forests?
- **Tree counts** - How many trees per acre by species?
- **Forest health** - Mortality rates, growth, removals

FIA data comes from roughly 300,000 permanent sample plots across the US, and each plot is remeasured every 5 to 10 years.

### Why pyFIA?

The official tool for querying FIA data is [EVALIDator](https://apps.fs.usda.gov/fiadb-api/evalidator), a web-based interface. **pyFIA** provides programmatic access with:

- Python API for scripting and automation
- DuckDB backend for fast queries
- EVALIDator-compatible statistical methods
- Full control over filtering and grouping

## 2. Installation

Install pyFIA from PyPI:

In [None]:
# Install pyFIA
!pip install -q pyfia

## 3. Downloading FIA Data

pyFIA can download data directly from the FIA DataMart. Let's download data for a small state (Rhode Island) to keep the download quick:

In [None]:
from pyfia import FIA

# Download FIA data for Rhode Island (smallest state = fastest download)
# This downloads ~20MB and takes about 1-2 minutes
db = FIA.from_download(
    states="RI",          # Rhode Island
    common=True,          # Only tables needed for pyFIA
    show_progress=True
)

print(f"Database loaded: {db.db_path}")

### State Abbreviations

The `states` argument takes standard two-letter postal abbreviations, for example:

| State | Code | State | Code |
|-------|------|-------|------|
| Alabama | AL | Montana | MT |
| California | CA | North Carolina | NC |
| Colorado | CO | Oregon | OR |
| Florida | FL | Texas | TX |
| Georgia | GA | Virginia | VA |
| Maine | ME | Washington | WA |

## 4. Filtering to a State

Before running queries, filter the database to a specific state. This also selects the evaluation (identified by an EVALID) that the estimates are based on:

In [None]:
# Filter to Rhode Island with most recent evaluation
# State FIPS code for Rhode Island is 44
db.clip_by_state(44)

print(f"Filtered to state FIPS: {db.state_filter}")
print(f"Using EVALID: {db.evalid}")

### State FIPS Codes

FIA identifies states by numeric FIPS code, and `clip_by_state` takes that code rather than the two-letter abbreviation. Common codes:

| State | FIPS | State | FIPS |
|-------|------|-------|------|
| Alabama | 1 | North Carolina | 37 |
| California | 6 | Oregon | 41 |
| Florida | 12 | Rhode Island | 44 |
| Georgia | 13 | Texas | 48 |

See the full list at [FIPS state codes](https://www.census.gov/library/reference/code-lists/ansi.html).
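
Because `from_download` takes a postal abbreviation while `clip_by_state` takes a numeric FIPS code, a small lookup can keep the two in sync. The sketch below is plain Python and not part of pyFIA. `STATE_FIPS` and `fips_for` are hypothetical names, and the mapping covers only the states that appear in this tutorial.

```python
# Hypothetical helper, not part of pyFIA: map postal abbreviations to FIPS codes.
# Only states mentioned in this tutorial are included.
STATE_FIPS = {
    "AL": 1, "CA": 6, "DE": 10, "FL": 12, "GA": 13,
    "NC": 37, "OR": 41, "RI": 44, "TX": 48,
}


def fips_for(abbrev: str) -> int:
    """Return the numeric FIPS code for a two-letter state abbreviation."""
    key = abbrev.strip().upper()
    if key not in STATE_FIPS:
        raise KeyError(f"No FIPS code on file for state {abbrev!r}")
    return STATE_FIPS[key]


print(fips_for("RI"))
```

The result could then be passed along as `db.clip_by_state(fips_for("RI"))`, which avoids typing the code by hand.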

## 5. Core Queries

pyFIA provides functions for common forest metrics. Let's explore each one.

### 5.1 Forest Area

Query the total forest land area:

In [None]:
from pyfia import area

# Total forest area
forest_area = area(db, land_type="forest")
print("Forest Area in Rhode Island:")
print(forest_area)

In [None]:
# Forest area by ownership group
area_by_owner = area(db, land_type="forest", grp_by="OWNGRPCD")
print("\nForest Area by Ownership Group:")
print(area_by_owner)

**Ownership Group Codes (OWNGRPCD):**
- 10 = National Forest
- 20 = Other Federal
- 30 = State/Local Government
- 40 = Private

**Parameters:**
- `land_type`: "forest", "timber", "all"
- `grp_by`: Group results by column(s) like "FORTYPCD" (forest type), "OWNGRPCD" (ownership)
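
Grouped results report codes rather than names, so it can help to attach labels and work out each group's share of the total. The following is a minimal sketch in plain Python. The acreage figures are made up for illustration, and `OWNGRP_LABELS` and `label_and_share` are hypothetical names rather than pyFIA functions.

```python
# Labels taken from the OWNGRPCD list in this tutorial.
OWNGRP_LABELS = {
    10: "National Forest",
    20: "Other Federal",
    30: "State/Local Government",
    40: "Private",
}


def label_and_share(rows):
    """Add a readable label and percent-of-total to each (code, acres) pair."""
    total = sum(acres for _, acres in rows)
    return [
        {
            "OWNGRPCD": code,
            "label": OWNGRP_LABELS.get(code, "Unknown"),
            "acres": acres,
            # Guard against an empty or all-zero input.
            "share_pct": 100.0 * acres / total if total else 0.0,
        }
        for code, acres in rows
    ]


# Made-up acreages, for illustration only (not real FIA estimates).
example = [(30, 100_000.0), (40, 300_000.0)]
for rec in label_and_share(example):
    print(f"{rec['label']}: {rec['acres']:,.0f} acres ({rec['share_pct']:.1f}%)")
```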

### 5.2 Timber Volume

Query merchantable timber volume:

In [None]:
from pyfia import volume

# Total merchantable volume on timberland
timber_vol = volume(db, land_type="timber", tree_type="gs")
print("Timber Volume (Growing Stock):")
print(timber_vol)

In [None]:
# Volume by species group
vol_by_species = volume(db, land_type="timber", grp_by="SPGRPCD")
print("\nVolume by Species Group:")
print(vol_by_species)

**Parameters:**
- `land_type`: "forest", "timber"
- `tree_type`: "all", "gs" (growing stock)
- `grp_by`: Group by species, size class, etc.

### 5.3 Biomass and Carbon

Query above-ground and below-ground biomass:

In [None]:
from pyfia import biomass

# Total biomass
total_biomass = biomass(db)
print("Total Biomass:")
print(total_biomass)

In [None]:
# Biomass by species group
biomass_by_species = biomass(db, grp_by="SPGRPCD")
print("\nBiomass by Species Group:")
print(biomass_by_species)

**Output columns:**
- `BIO_TOTAL`: Total dry biomass (tons)
- `BIO_ACRE`: Dry biomass per acre (tons/acre)
- `CARB_TOTAL`: Total carbon (tons)
- `CARB_ACRE`: Carbon per acre (tons/acre)

### 5.4 Trees Per Acre

Query tree density:

In [None]:
from pyfia import tpa

# All live trees
live_trees = tpa(db, tree_domain="STATUSCD == 1")
print("Live Trees Per Acre:")
print(live_trees)

In [None]:
# Trees by species group
tpa_by_species = tpa(db, tree_domain="STATUSCD == 1", grp_by="SPGRPCD")
print("\nTrees Per Acre by Species Group:")
print(tpa_by_species)

**Tree Domain Filters:**
- `STATUSCD == 1`: Live trees only
- `STATUSCD == 2`: Dead trees only
- `DIA >= 5.0`: Trees 5 inches diameter or larger
- `SPCD == 131`: Loblolly pine only (species code 131)
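
To make these conditions concrete, the sketch below applies the same predicates to a few made-up tree records in plain Python. It does not call pyFIA and says nothing about how pyFIA parses `tree_domain` strings. The records and the `select` function are hypothetical.

```python
# Made-up tree records using FIA column names (not real plot data).
trees = [
    {"STATUSCD": 1, "DIA": 7.2, "SPCD": 131},   # live loblolly pine
    {"STATUSCD": 2, "DIA": 9.0, "SPCD": 131},   # dead loblolly pine
    {"STATUSCD": 1, "DIA": 3.1, "SPCD": 316},   # live tree under 5 inches
    {"STATUSCD": 1, "DIA": 12.4, "SPCD": 316},  # live tree, other species
]


def select(records, live_only=True, min_dia=None, spcd=None):
    """Keep records matching STATUSCD == 1, DIA >= min_dia, and SPCD == spcd."""
    kept = []
    for tree in records:
        if live_only and tree["STATUSCD"] != 1:
            continue
        if min_dia is not None and tree["DIA"] < min_dia:
            continue
        if spcd is not None and tree["SPCD"] != spcd:
            continue
        kept.append(tree)
    return kept


print(len(select(trees)))                         # live trees
print(len(select(trees, min_dia=5.0)))            # live and DIA >= 5.0
print(len(select(trees, min_dia=5.0, spcd=131)))  # live loblolly, DIA >= 5.0
```

Each added condition narrows the set of trees that contribute to the estimate, which is why a restrictive domain usually comes with a larger standard error.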

## 6. Understanding the Output

pyFIA returns Polars DataFrames with statistical estimates. Key columns:

| Column | Description |
|--------|-------------|
| `*_TOTAL` | Estimated total for the metric (acres, cubic feet, or tons, depending on the query) |
| `*_SE` | Standard error of the estimate |
| `*_PERCENT` | Percentage of total |
| `N_PLOTS` | Number of plots in the estimate |
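
The totals in this table are in US customary units. If metric figures are needed, each conversion is a single multiplication. The sketch below uses standard conversion factors and assumes that "tons" means US short tons, which this tutorial does not state explicitly. The function names are hypothetical and not part of pyFIA.

```python
# Standard conversion factors.
ACRES_TO_HECTARES = 0.40468564224
CUBIC_FEET_TO_CUBIC_METERS = 0.028316846592
SHORT_TONS_TO_TONNES = 0.90718474  # assumption: "tons" are US short tons


def acres_to_hectares(acres: float) -> float:
    """Convert an area in acres to hectares."""
    return acres * ACRES_TO_HECTARES


def cubic_feet_to_cubic_meters(cubic_feet: float) -> float:
    """Convert a volume in cubic feet to cubic meters."""
    return cubic_feet * CUBIC_FEET_TO_CUBIC_METERS


def short_tons_to_tonnes(tons: float) -> float:
    """Convert a mass in US short tons to metric tonnes."""
    return tons * SHORT_TONS_TO_TONNES


# Made-up input value, for illustration only.
print(f"{acres_to_hectares(1_000.0):,.1f} ha")
```

Because each conversion is a constant multiplier, a standard error converts with the same factor as its estimate.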

### Interpreting Standard Error

The standard error (SE) indicates uncertainty. A 95% confidence interval is approximately:

```
Estimate ± (1.96 × SE)
```

A lower SE means a more precise estimate, which usually results from a larger number of plots.
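
As a worked example of the interval formula, the sketch below computes the bounds from an estimate and its SE. It also reports the SE as a percentage of the estimate, a common companion measure that this tutorial does not otherwise cover. The function names are hypothetical and the input numbers are made up.

```python
def confidence_interval(estimate: float, se: float, z: float = 1.96):
    """Return (lower, upper) for the approximate interval estimate ± z * SE."""
    margin = z * se
    return estimate - margin, estimate + margin


def relative_se_percent(estimate: float, se: float) -> float:
    """Express the standard error as a percentage of the estimate."""
    if estimate == 0:
        raise ValueError("relative SE is undefined for a zero estimate")
    return 100.0 * se / abs(estimate)


# Made-up numbers, for illustration only (not real FIA estimates).
low, high = confidence_interval(400_000.0, 10_000.0)
print(f"95% CI: {low:,.0f} to {high:,.0f} acres")
print(f"Relative SE: {relative_se_percent(400_000.0, 10_000.0):.1f}%")
```

With an estimate of 400,000 and an SE of 10,000, the margin is 1.96 × 10,000 = 19,600, so the interval runs from about 380,400 to 419,600.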

## 7. Putting It All Together

Here's a complete example that generates a forest summary report:

In [None]:
from pyfia import FIA, area, volume, biomass, tpa

def generate_forest_summary(db):
    """Generate a summary of forest resources."""
    
    # Get forest metrics
    forest_area_result = area(db, land_type="forest")
    timber_vol_result = volume(db, land_type="timber", tree_type="gs")
    biomass_result = biomass(db)
    tpa_result = tpa(db, tree_domain="STATUSCD == 1")
    
    # Print summary
    print(f"\n{'='*50}")
    print("Forest Summary Report")
    print(f"{'='*50}\n")
    
    if not forest_area_result.is_empty():
        row = forest_area_result.row(0, named=True)
        total = row.get('AREA', 0)
        se = row.get('AREA_SE', 0)
        print(f"Forest Area: {total:,.0f} acres (SE: {se:,.0f})")
    
    if not timber_vol_result.is_empty():
        row = timber_vol_result.row(0, named=True)
        total = row.get('VOLCFNET_TOTAL', 0)
        print(f"Timber Volume: {total:,.0f} cubic feet")
    
    if not biomass_result.is_empty():
        row = biomass_result.row(0, named=True)
        total = row.get('BIO_TOTAL', 0)
        print(f"Above-ground Biomass: {total:,.0f} tons")
    
    if not tpa_result.is_empty():
        row = tpa_result.row(0, named=True)
        tpa_val = row.get('TPA', 0)
        print(f"Live Trees: {tpa_val:,.0f} trees/acre")
    
    print(f"\n{'='*50}")

# Run the summary
generate_forest_summary(db)

## 8. Try It Yourself: Different State

Download data for a different state and run the same analysis:

In [None]:
# Try Delaware (small state, quick download)
# Uncomment to run:

# db_de = FIA.from_download(states="DE", common=True)
# db_de.clip_by_state(10)  # Delaware FIPS = 10
# generate_forest_summary(db_de)

## 9. Next Steps

Now that you understand the basics, explore more:

### More pyFIA Features
- **Mortality analysis**: `mortality(db)` for tree death rates
- **Growth estimation**: `growth(db)` for net growth
- **Custom grouping**: Use `grp_by` parameter for any FIA column
- **Reference tables**: `join_species_names()` to add species names

### Other FIAtools
- **[gridFIA](https://fiatools.org/tools/gridfia/)**: 30m resolution biomass maps
- **[pyFVS](https://fiatools.org/tools/pyfvs/)**: Forest growth simulation
- **[askFIA](https://fiatools.org/tools/askfia/)**: Natural language queries

### Resources
- [pyFIA Documentation](https://mihiarc.github.io/pyfia/)
- [FIAtools Website](https://fiatools.org)
- [FIA DataMart](https://apps.fs.usda.gov/fia/datamart/datamart.html)
- [EVALIDator](https://apps.fs.usda.gov/fiadb-api/evalidator) (for comparison)

---

**Questions or feedback?** Open an issue on [GitHub](https://github.com/mihiarc/pyfia/issues).