# eBird County-Level Weekly Bird Group Maps (Illinois)

This notebook pulls bird observations from the **eBird API** for a specified date range in Illinois,
maps each observation into one of **15 predefined bird groups**, and generates:

- a filtered dataset containing **only the species in those groups**
- weekly aggregated counts by:
  - `week_start`
  - `county`
  - `group`
- choropleth maps showing weekly county totals for each group

Data sources:
- eBird API (historic observations)
- Census TIGER/Cartographic Boundary County Shapefiles (Illinois counties)


## 1. Imports

Imports for API calls, date handling, data processing, and mapping. `bird_groups` is a project-local module that provides `get_group_for_species`.

In [1]:
import os
import json
import requests
from pathlib import Path
from datetime import datetime, timedelta, date as DateType
from concurrent.futures import ThreadPoolExecutor, as_completed

import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt

from bird_groups import get_group_for_species

## 2. API Fetching (Historic Observations)

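We use the eBird *historic observations* endpoint to pull observations for each day
in the date range. By default this endpoint returns one observation per taxon per day
(selected by its `rank` parameter) rather than every checklist entry, so the counts
derived below are a sample of reported birds, not a complete tally.
Each day’s response is cached to disk so that reruns skip the network request.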

In [2]:
def get_bird_data_date(state: str, date: DateType) -> list[dict]:
	'''
	Fetch one day of observations for `state` from the eBird historic endpoint.

	Makes a single request so that it can be submitted to a ThreadPoolExecutor,
	one call per day. Returns the parsed JSON list, or [] if the request fails.

	NOTE: `date` must be a datetime.date (for example, the result of datetime.datetime.date()).
	If you start from "mm/dd/yy", parse it before calling.

	URL format:
	https://api.ebird.org/v2/data/obs/{state}/historic/{y}/{m}/{d}
	'''

	# --- auth ---
	api_key = os.getenv("EBIRD_API_KEY")
	if not api_key:
		raise RuntimeError("EBIRD_API_KEY not found in environment.")
	headers = {'X-eBirdApiToken': api_key}

	# --- ensure cache dir exists ---
	cache_dir = Path("cache")
	cache_dir.mkdir(parents=True, exist_ok=True)

	# --- cache file ---
	cache_file = cache_dir / f"{state}_{date.strftime('%Y-%m-%d')}.json"

	# return the cached response if this day was already pulled
	if cache_file.exists():
		with open(cache_file, "r") as f:
			return json.load(f)

	# --- request ---
	url = f"https://api.ebird.org/v2/data/obs/{state}/historic/{date.year}/{date.month:02}/{date.day:02}"
	# need full detail to get county
	response = requests.get(url, headers=headers, params={"detail": "full"}, timeout=60)

	if response.status_code == 200:
		day_data = response.json()
		with open(cache_file, "w") as f:
			json.dump(day_data, f)
		return day_data
	else:
		print(f"Failed {state} {date.isoformat()} ({response.status_code})")
		return []
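
The cache-then-fetch pattern in the cell above can be separated from the network call, which makes it testable without an API key. The following is a minimal sketch, not the notebook's code: `load_or_fetch` and `fake_fetch` are hypothetical names, and the stub stands in for the real `requests.get` call.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def load_or_fetch(cache_file: Path, fetch: Callable[[], list]) -> list:
    """Return cached JSON if present; otherwise call `fetch` and cache its result."""
    if cache_file.exists():
        with open(cache_file, "r") as f:
            return json.load(f)
    data = fetch()
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    with open(cache_file, "w") as f:
        json.dump(data, f)
    return data


calls = []


def fake_fetch() -> list:
    # Hypothetical stand-in for the API request; records each invocation
    calls.append(1)
    return [{"comName": "Bald Eagle", "howMany": 2}]


with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "cache" / "US-IL_2026-01-08.json"
    first = load_or_fetch(cache_file, fake_fetch)   # cache miss: calls fake_fetch
    second = load_or_fetch(cache_file, fake_fetch)  # cache hit: read from disk
```

Only the first call reaches the fetcher; the second is served from the JSON file.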

## 3. Download Observations for a Date Range

We pull eBird observations for each day in the date range (threaded using ThreadPoolExecutor),
then combine everything into a single DataFrame.

In [3]:
def daterange(start_date: str, end_date: str) -> list:
    """
    start_date/end_date in mm/dd/yy (inclusive)
    """
    start = datetime.strptime(start_date, "%m/%d/%y").date()
    end   = datetime.strptime(end_date, "%m/%d/%y").date()
    days = []
    cur = start
    while cur <= end:
        days.append(cur)
        cur += timedelta(days=1)
    return days


def get_data_as_df(state: str, start_date: str, end_date: str, max_workers: int = 6) -> pd.DataFrame:
    """
    Pulls historic observations for all dates in [start_date, end_date] and returns one DataFrame.
    """
    days = daterange(start_date, end_date)

    all_rows: list[dict] = []

    with ThreadPoolExecutor(max_workers=max_workers) as ex:
        futures = [ex.submit(get_bird_data_date, state, d) for d in days]

        for fut in as_completed(futures):
            rows = fut.result()  # list[dict]
            if rows:
                all_rows.extend(rows)

    df = pd.DataFrame(all_rows)
    return df
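
Because `as_completed` yields futures in the order they finish, the rows of the combined DataFrame are not guaranteed to be in date order. The weekly aggregation later in the notebook does not depend on row order, but if ordering matters, one option is `Executor.map`, which returns results in input order. A minimal sketch with a stub fetcher (`fake_fetch` is hypothetical and stands in for `get_bird_data_date`):

```python
import time
from concurrent.futures import ThreadPoolExecutor
from datetime import date, timedelta

import pandas as pd


def fake_fetch(day: date) -> list[dict]:
    # Hypothetical stub: earlier days sleep longer, so they finish last
    time.sleep(0.01 * (9 - day.day))
    return [{"obsDt": day.isoformat(), "comName": "Bald Eagle"}]


days = [date(2026, 1, 6) + timedelta(days=i) for i in range(3)]

with ThreadPoolExecutor(max_workers=3) as ex:
    # map() yields results in the order of `days`, regardless of finish order
    per_day = list(ex.map(fake_fetch, days))

# Flatten the per-day lists into one list of rows
rows = [row for day_rows in per_day for row in day_rows]
df = pd.DataFrame(rows)
```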

## 4. County Name Lookup (eBird Region Reference)

eBird returns counties using `subnational2Code` (example: `US-IL-031`).
This helper function downloads the region list for Illinois counties and provides a
mapping from:

`subnational2Code → county name`

This is later merged into the aggregated weekly dataset.


In [4]:
def get_county_names(state_code: str) -> pd.DataFrame:
    """
    Fetch eBird county reference list for a state.

    Returns DataFrame with columns:
    - subnational2Code (e.g., US-IL-031)
    - county (e.g., Cook)
    """
    url = f"https://api.ebird.org/v2/ref/region/list/subnational2/{state_code}"

    api_key = os.getenv("EBIRD_API_KEY")
    if not api_key:
        raise RuntimeError("EBIRD_API_KEY not found in environment.")
    headers = {'X-eBirdApiToken': api_key}

    r = requests.get(url, headers=headers, timeout=60)
    r.raise_for_status()

    rows = r.json()  # [{'code': 'US-IL-031', 'name': 'Cook'}, ...]
    return pd.DataFrame(rows).rename(
        columns={"code": "subnational2Code", "name": "county"}
    )
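
To illustrate how the lookup is used, here is a small sketch with hand-written sample rows in the shape shown in the comment above (the rows and the `US-IL-999` code are illustrative, not API output). A left merge keeps every observation row, and a missing `county` value flags codes that the reference list did not cover.

```python
import pandas as pd

# Sample rows in the shape returned by the region list endpoint
rows = [
    {"code": "US-IL-001", "name": "Adams"},
    {"code": "US-IL-031", "name": "Cook"},
]
county_df = pd.DataFrame(rows).rename(
    columns={"code": "subnational2Code", "name": "county"}
)

# Illustrative aggregated data; "US-IL-999" is a made-up code with no match
obs = pd.DataFrame({
    "subnational2Code": ["US-IL-031", "US-IL-001", "US-IL-999"],
    "total_count": [5, 2, 1],
})

merged = obs.merge(county_df, on="subnational2Code", how="left")

# Rows whose code was not found in the reference list
unmatched = merged[merged["county"].isna()]
```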

## 5. Download Observations and Filter to 15 Groups

We pull raw observations for the date range, assign each record to one of our predefined bird groups,
and remove any species not included in those groups.

In [5]:
df = get_data_as_df("US-IL", "01/06/26", "01/15/26", max_workers=6)

df["group"] = df["comName"].apply(get_group_for_species)

# Keep only observations that match our predefined 15 groups/species list
df = df[df["group"] != "NOT_IN_GROUPS"].copy()

df["group"].value_counts()

group
Passerines                  72
Diving Ducks                66
Dabbling Ducks              63
Raptors - Hawks & Eagles    57
Geese                       31
Owls                        26
Swans                       24
Gulls & Terns               16
Pigeons & Coots             16
Raptors - Falcons           16
Pelicans & Cormorants       16
Vultures                    15
Cranes                      13
Herons & Egrets             10
Grebes                       8
Name: count, dtype: int64
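
The `bird_groups` module is not shown in this notebook. As a rough sketch of the behavior the cell above relies on, a lookup of this kind could be a dictionary keyed by common name that falls back to the `NOT_IN_GROUPS` sentinel. The mapping below is a hypothetical three-species excerpt, not the real species list.

```python
import pandas as pd

# Hypothetical excerpt; the real mapping lives in bird_groups.py and is not shown here
SPECIES_TO_GROUP = {
    "Bald Eagle": "Raptors - Hawks & Eagles",
    "Mallard": "Dabbling Ducks",
    "Great Horned Owl": "Owls",
}


def get_group_for_species(common_name: str) -> str:
    """Return the group for a species, or the sentinel if it is not tracked."""
    return SPECIES_TO_GROUP.get(common_name, "NOT_IN_GROUPS")


df = pd.DataFrame({"comName": ["Mallard", "Bald Eagle", "Wild Turkey", "Mallard"]})
df["group"] = df["comName"].apply(get_group_for_species)

# Drop species outside the tracked groups
df = df[df["group"] != "NOT_IN_GROUPS"].copy()
counts = df["group"].value_counts()
```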

## 6. Clean Key Fields

We convert:
- `obsDt` → datetime
- `howMany` → numeric (missing or non-numeric counts are treated as a count of 1)

In [6]:
df["obsDt"] = pd.to_datetime(df["obsDt"], errors="coerce")
df["howMany"] = pd.to_numeric(df.get("howMany"), errors="coerce").fillna(1)

df[["obsDt", "howMany", "subnational2Code", "group"]].head()

| | obsDt | howMany | subnational2Code | group |
|---|---|---|---|---|
| 0 | 2026-01-08 22:55:00 | 1.0 | US-IL-103 | Owls |
| 3 | 2026-01-08 18:11:00 | 2.0 | US-IL-001 | Passerines |
| 4 | 2026-01-08 18:11:00 | 13.0 | US-IL-001 | Passerines |
| 7 | 2026-01-08 18:11:00 | 1.0 | US-IL-001 | Passerines |
| 9 | 2026-01-08 16:54:00 | 70.0 | US-IL-181 | Vultures |
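
A small sketch of the `howMany` conversion on made-up values shows what the coercion does: anything that cannot be parsed as a number becomes `NaN`, and `fillna(1)` then counts that record as a single bird. The sample frame is illustrative only.

```python
import pandas as pd

# Illustrative records: one numeric count, one missing, one non-numeric
raw = pd.DataFrame({
    "obsDt": ["2026-01-08 22:55", "2026-01-08 18:11", "2026-01-08 16:54"],
    "howMany": [13, None, "X"],
})

clean = raw.copy()
clean["obsDt"] = pd.to_datetime(clean["obsDt"], errors="coerce")
# Unparseable or missing counts become NaN, then default to 1
clean["howMany"] = pd.to_numeric(clean["howMany"], errors="coerce").fillna(1)
```

Defaulting to 1 is a conservative lower bound: the species was reported, so at least one individual was present.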


## 7. Weekly Aggregation (County × Group)

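We compute a weekly `week_start` and aggregate total observations per:
- week_start
- county (subnational2Code)
- group

Note that the pandas frequency alias `W-MON` denotes weeks that *end* on Monday, so each `week_start` below is a Tuesday (for example, 2026-01-06). For weeks that run Monday through Sunday, use `W-SUN` instead.

In [7]:
county_df = get_county_names("US-IL")

# "W-MON" = weekly periods ending on Monday, so start_time is a Tuesday
# (use "W-SUN" for Monday-to-Sunday weeks)
df["week_start"] = df["obsDt"].dt.to_period("W-MON").apply(lambda r: r.start_time.date())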

weekly = (
    df.groupby(["week_start", "subnational2Code", "group"], as_index=False)["howMany"]
      .sum()
      .rename(columns={"howMany": "total_count"})
)

# Add county names for readability
weekly = weekly.merge(county_df, on="subnational2Code", how="left")

weekly[["week_start", "county", "group", "total_count"]].head()

| | week_start | county | group | total_count |
|---|---|---|---|---|
| 0 | 2026-01-06 | Adams | Passerines | 137.0 |
| 1 | 2026-01-06 | Adams | Pelicans & Cormorants | 60.0 |
| 2 | 2026-01-06 | Bond | Dabbling Ducks | 2.0 |
| 3 | 2026-01-06 | Boone | Owls | 1.0 |
| 4 | 2026-01-06 | Carroll | Raptors - Hawks & Eagles | 2.0 |
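
The pandas alias `W-MON` denotes weekly periods that end on Monday, which is why the `week_start` values above fall on a Tuesday (2026-01-06). The sketch below compares it with `W-SUN`, whose periods end on Sunday and therefore start on Monday. The sample dates are illustrative.

```python
import pandas as pd

obs = pd.Series(pd.to_datetime([
    "2026-01-05",  # Monday
    "2026-01-06",  # Tuesday
    "2026-01-08",  # Thursday
    "2026-01-12",  # Monday
    "2026-01-13",  # Tuesday
]))

# Weeks ending on Monday: each period starts on a Tuesday
tue_start = obs.dt.to_period("W-MON").dt.start_time.dt.date

# Weeks ending on Sunday: each period starts on a Monday
mon_start = obs.dt.to_period("W-SUN").dt.start_time.dt.date

comparison = pd.DataFrame({"obs": obs, "W-MON": tue_start, "W-SUN": mon_start})
```

Switching the alias changes which observations share a week, so aggregated totals will differ between the two choices.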


## (Optional) Export Weekly Aggregated Dataset

This saves the weekly county × group totals as a CSV for reuse outside the notebook.

In [8]:
# weekly.to_csv("weekly_bird_counts_by_county_group.csv", index=False)
# print("Saved: weekly_bird_counts_by_county_group.csv")

## 8. County Boundaries (Illinois)

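We load Illinois county geometries from the US Census *cartographic boundary* shapefiles (the `cb_` files).
These are generalized boundaries intended for thematic mapping; the `5m` file used here is the 1:5,000,000 scale version.
Reading the zip archive directly from the Census URL means no shapefile has to be stored locally, although the full national county file is downloaded each time the cell runs.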

In [9]:
# Census cartographic boundary counties (good for visualization)
COUNTIES_URL = "https://www2.census.gov/geo/tiger/GENZ2024/shp/cb_2024_us_county_5m.zip"

# Load all US counties, then filter Illinois (STATEFP = 17)
counties = gpd.read_file(COUNTIES_URL).to_crs("EPSG:4326")
il_counties = counties[counties["STATEFP"] == "17"].copy()

# Standardize county name column for merging
il_counties["county"] = il_counties["NAME"].str.strip()

# Keep only the required fields
il_counties = il_counties[["GEOID", "county", "geometry"]]
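
County names can differ in spelling or punctuation between sources, so a join on a numeric identifier is often more robust than a join on `county`. The sketch below derives a Census-style `GEOID` from the eBird code. It rests on an assumption that should be checked against the data: that the last segment of a US `subnational2Code` is the three-digit county FIPS code. `ebird_code_to_geoid` is a hypothetical helper, and the plain DataFrame stands in for `il_counties` without geometry.

```python
import pandas as pd


def ebird_code_to_geoid(subnational2_code: str, state_fips: str = "17") -> str:
    """Build a 5-digit GEOID from a code such as 'US-IL-031' (Illinois = 17)."""
    # Assumption: the final segment is the 3-digit county FIPS code
    county_fips = subnational2_code.rsplit("-", 1)[-1]
    return f"{state_fips}{county_fips.zfill(3)}"


weekly = pd.DataFrame({
    "subnational2Code": ["US-IL-031", "US-IL-001"],
    "total_count": [5.0, 2.0],
})
weekly["GEOID"] = weekly["subnational2Code"].map(ebird_code_to_geoid)

# Stand-in for il_counties (geometry column omitted for brevity)
il_counties = pd.DataFrame({
    "GEOID": ["17001", "17031", "17043"],
    "county": ["Adams", "Cook", "DuPage"],
})

# Left join keeps every county; counties without data get NaN
joined = il_counties.merge(weekly[["GEOID", "total_count"]], on="GEOID", how="left")
```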

## 9. Generate Weekly Choropleth Maps (County × Group)

For each week and each bird group:
- merge aggregated county totals with Illinois county geometry
- generate a choropleth map
- save output as PNG

Output folder structure:
`maps/<week_start>/<group>.png`
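
As a quick illustration of that layout, the sketch below builds one output path using the same character replacements the map loop applies to group names. `map_path` is a hypothetical helper, not part of the notebook.

```python
from datetime import date
from pathlib import Path


def map_path(out_dir: Path, week_start: date, group: str) -> Path:
    """Return maps/<week_start>/<group>.png with a filesystem-safe group name."""
    safe = group.replace("/", "-").replace("&", "and").replace(" ", "_")
    return out_dir / str(week_start) / f"{safe}.png"


p = map_path(Path("maps"), date(2026, 1, 6), "Raptors - Hawks & Eagles")
print(p.as_posix())  # maps/2026-01-06/Raptors_-_Hawks_and_Eagles.png
```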

In [10]:
out_dir = Path("maps")
out_dir.mkdir(exist_ok=True)

weeks = sorted(weekly["week_start"].unique())
groups = sorted(weekly["group"].unique())

print("Weeks:", weeks)
print("Groups:", groups)

Weeks: [datetime.date(2026, 1, 6), datetime.date(2026, 1, 13)]
Groups: ['Cranes', 'Dabbling Ducks', 'Diving Ducks', 'Geese', 'Grebes', 'Gulls & Terns', 'Herons & Egrets', 'Owls', 'Passerines', 'Pelicans & Cormorants', 'Pigeons & Coots', 'Raptors - Falcons', 'Raptors - Hawks & Eagles', 'Swans', 'Vultures']


In [11]:
for wk in weeks:
    wk_dir = out_dir / str(wk)
    wk_dir.mkdir(parents=True, exist_ok=True)

    wk_data = weekly[weekly["week_start"] == wk].copy()

    for grp in groups:
        grp_data = wk_data[wk_data["group"] == grp][["county", "total_count"]].copy()

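        # merge to geometry (joins on county name, so names must match between eBird and Census)
        gdf = il_counties.merge(grp_data, on="county", how="left")
        # counties with no records become 0, so `missing_kwds` below never takes effect
        gdf["total_count"] = gdf["total_count"].fillna(0)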

        fig, ax = plt.subplots(1, 1, figsize=(10, 10))
        gdf.plot(
            column="total_count",
            ax=ax,
            legend=True,
            linewidth=0.7,
            edgecolor="black",
            missing_kwds={"color": "lightgrey"},
        )

        ax.set_title(f"{grp} | Week starting {wk}", fontsize=14)
        ax.axis("off")

        # safe filename
        safe_grp = grp.replace("/", "-").replace("&", "and").replace(" ", "_")
        out_path = wk_dir / f"{safe_grp}.png"

        fig.savefig(out_path, dpi=300, bbox_inches="tight")
        plt.close(fig)

print("Maps saved to:", out_dir.resolve())

Maps saved to: /Users/aarshpatel/Downloads/midwestern_birds/maps
