# 04. Population Data Ingestion

## Why Population Data Matters for Fire Prediction

Population data is crucial for wildfire risk assessment because:

### **Human Fire Causes**
- **Roughly 85-90% of U.S. wildfires** are human-caused (campfires, equipment use, arson, etc.)
- Higher population density generally means a higher probability of human ignition
- Urban areas have more ignition sources (power lines, vehicles, etc.)

### **Wildland-Urban Interface (WUI)**
- Areas where homes meet or intermingle with wildland vegetation are among the highest fire risk areas
- Evacuation planning requires knowing how many people live in fire-prone areas
- Resource allocation prioritizes protecting populated areas

### **Fire Suppression & Response**
- More populated areas tend to have more fire stations and resources nearby
- Population density affects emergency response times
- Evacuation routes depend on population distribution

### **Economic Impact**
- Property damage scales with population density
- Insurance costs and fire suppression budgets correlate with population
- Business disruption affects more people in dense areas

## Data Source
- **Source**: US Census Bureau API
- **Coverage**: California counties (2000-2024)
- **Update Frequency**: Annual estimates (American Community Survey), with full counts every ten years (Decennial Census)
- **API Documentation**: https://www.census.gov/data/developers/data-sets.html
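
As a rough illustration of how a county-level request to this source might look, the sketch below builds a query URL and parses the header-plus-rows JSON layout the Census API returns. The dataset path (`dec/pl`), the variable `P1_001N` (total population in the 2020 redistricting file), and the helper names are assumptions chosen for illustration, not necessarily what this notebook uses. The sample rows are made-up placeholders rather than real counts, so the block runs without network access.

```python
from urllib.parse import urlencode

CENSUS_BASE = "https://api.census.gov/data"
CALIFORNIA_FIPS = "06"  # state FIPS code for California


def build_county_population_url(year=2020, dataset="dec/pl",
                                variable="P1_001N", state_fips=CALIFORNIA_FIPS):
    """Build a Census API URL requesting one variable for every county in a state.

    The dataset and variable defaults are assumptions for illustration.
    """
    query = urlencode(
        {"get": f"NAME,{variable}", "for": "county:*", "in": f"state:{state_fips}"},
        safe=",:*",  # keep the Census query syntax readable instead of percent-encoding it
    )
    return f"{CENSUS_BASE}/{year}/{dataset}?{query}"


def parse_census_rows(rows, variable="P1_001N"):
    """Convert the API's list-of-lists response (first row is the header) to dicts."""
    header, *data = rows
    records = []
    for row in data:
        item = dict(zip(header, row))
        records.append({
            "name": item["NAME"],
            "fips": item["state"] + item["county"],  # 5-digit county FIPS
            "population": int(item[variable]),       # values arrive as strings
        })
    return records


# Placeholder response in the API's shape; the names and counts are made up.
sample_response = [
    ["NAME", "P1_001N", "state", "county"],
    ["Example County A, California", "1000", "06", "001"],
    ["Example County B, California", "250", "06", "003"],
]

url = build_county_population_url()
records = parse_census_rows(sample_response)
print(url)
print(records)
```

In a live run, the parsed JSON body of an HTTP GET on `url` would take the place of `sample_response`. An API key may be required for heavier use.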

## Objectives
1. Load and validate California population data
2. Explore population trends and distributions
3. Calculate population density metrics
4. Identify high-risk Wildland-Urban Interface areas
5. Prepare population features for the ML model
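
The validation and density objectives could be sketched roughly as follows. The column names (`fips`, `year`, `population`, `land_area_km2`), both helper functions, and the sample values are hypothetical. The actual schema depends on how the Census data is loaded, and land area would have to come from a separate source such as county boundary files.

```python
import pandas as pd


def validate_population(df):
    """Run basic sanity checks; raise ValueError listing any problems found."""
    problems = []
    if df["population"].isna().any():
        problems.append("missing population values")
    if (df["population"] < 0).any():
        problems.append("negative population values")
    if (df["land_area_km2"] <= 0).any():
        problems.append("non-positive land area")
    if df.duplicated(subset=["fips", "year"]).any():
        problems.append("duplicate county-year rows")
    if problems:
        raise ValueError("; ".join(problems))
    return df


def add_density_features(df):
    """Add people per square km and percent change since the previous record."""
    out = df.sort_values(["fips", "year"]).copy()
    out["pop_density_km2"] = out["population"] / out["land_area_km2"]
    # Change is computed within each county, so each county's first year is NaN.
    out["pop_change_pct"] = out.groupby("fips")["population"].pct_change() * 100
    return out


# Made-up values for two hypothetical counties, for illustration only.
counties = pd.DataFrame({
    "fips": ["06001", "06001", "06003", "06003"],
    "year": [2010, 2020, 2010, 2020],
    "population": [1000, 1100, 200, 190],
    "land_area_km2": [50.0, 50.0, 100.0, 100.0],
})

features = add_density_features(validate_population(counties))
print(features)
```

Raising on invalid input, rather than silently dropping rows, keeps data problems visible before they reach the model. Whether that is the right trade-off depends on how noisy the source data turns out to be.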


## Import Libraries


In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import geopandas as gpd
from pathlib import Path
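import warnings
# Suppress library warnings to keep notebook output readable
# (note: this also hides deprecation notices)
warnings.filterwarnings('ignore')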

# Set plotting style
plt.style.use('default')
sns.set_palette("husl")

print("📚 Libraries imported successfully!")
