###Exploring the migration of the Swainson's Thrush (Catharus ustulatus) in 2023 using GBIF data


Swainson’s Thrush (Catharus ustulatus) is a small to medium-sized migratory songbird in the thrush family (Turdidae). It is known for its olive-brown back, buff-colored underparts, distinct pale eye-ring, and lightly spotted breast. The species breeds widely across northern North America and winters in Central and South America. It comprises two main subspecies groups, a coastal russet-backed group and an inland olive-backed group, which differ in both appearance and migration routes, reflecting their evolutionary divergence ([Audubon.org](https://www.audubon.org/field-guide/bird/swainsons-thrush)).

As a classic Nearctic–Neotropical migrant, Swainson’s Thrush travels thousands of kilometers each year. Spring migration typically occurs from mid-March to early June, while fall migration spans late July to early November. Inland birds tend to follow interior routes through the central U.S. into Central America, whereas coastal populations travel closer to the Pacific coast. These birds rely on forested stopover habitats along their route for rest and refueling, making them particularly sensitive to habitat changes across a vast geographic range ([eBird Status and Trends](https://science.ebird.org/en/status-and-trends/species/swathr/range-map)).

Though still relatively common, Swainson’s Thrush faces growing threats from habitat loss, window strikes during nocturnal flights, and potential climate-related shifts in food availability. Because different populations use distinct migration corridors, conservation strategies must account for their migratory connectivity by protecting not just breeding grounds, but also critical stopover and wintering habitats. Recent tracking efforts, such as those using Motus stations, are helping scientists better understand these movements and support targeted conservation actions ([Humpel et al. (2020)](https://www.nature.com/articles/s41598-020-62132-6)).


Using data from the Global Biodiversity Information Facility (GBIF), which compiles occurrence records for a wide variety of species, this short assignment examines the migration pattern of Swainson's Thrush.

In [1]:
# Restore variables saved by an earlier notebook
%store -r df_swainsons swainsons_gdf ecoregion_gdf

import os
import pathlib
import time
import calendar
import zipfile
from getpass import getpass
from glob import glob

import geopandas as gpd
import pandas as pd
import pygbif.occurrences as occ
import pygbif.species as species

# Dynamic mapping
import hvplot.pandas
import cartopy.crs as ccrs
import panel as pn


In [2]:
swainsons_gdf.head()



gbifID,occurrenceID,species,scientificName,countryCode,occurrenceStatus,individualCount,decimalLatitude,decimalLongitude,day,month,year,speciesKey,basisOfRecord,geometry
4013362172,WNMR02000000,Catharus ustulatus,"Catharus ustulatus (Nuttall, 1840)",CA,PRESENT,,50.67613,-120.33787,15.0,7.0,2018,2490821,MATERIAL_SAMPLE,POINT (-120.33787 50.67613)
5146756083,https://arctos.database.museum/guid/MVZ:Bird:1...,Catharus ustulatus,"Catharus ustulatus (Nuttall, 1840)",US,PRESENT,,37.380274,-121.73866,14.0,5.0,2022,2490821,PRESERVED_SPECIMEN,POINT (-121.73866 37.38027)
5146763419,https://arctos.database.museum/guid/MVZ:Bird:1...,Catharus ustulatus,"Catharus ustulatus oedicus (Oberholser, 1899)",US,PRESENT,,35.51643,-115.57742,24.0,5.0,2018,2490821,PRESERVED_SPECIMEN,POINT (-115.57742 35.51643)
5146855430,https://arctos.database.museum/guid/MVZ:Bird:1...,Catharus ustulatus,"Catharus ustulatus oedicus (Oberholser, 1899)",US,PRESENT,,34.02132,-116.29706,9.0,5.0,2018,2490821,PRESERVED_SPECIMEN,POINT (-116.29706 34.02132)
5146761428,https://arctos.database.museum/guid/MVZ:Bird:1...,Catharus ustulatus,"Catharus ustulatus oedicus (Oberholser, 1899)",US,PRESENT,,35.77275,-115.88782,26.0,5.0,2017,2490821,PRESERVED_SPECIMEN,POINT (-115.88782 35.77275)


In [3]:
ecoregion_gdf.head()

ecoregion,OBJECTID,ECO_NAME,BIOME_NUM,BIOME_NAME,REALM,ECO_BIOME_,NNH,ECO_ID,SHAPE_LENG,SHAPE_AREA,NNH_NAME,COLOR,COLOR_BIO,COLOR_NNH,LICENSE,geometry
0,1.0,Adelie Land tundra,11.0,Tundra,Antarctica,AN11,1,117,9.74978,0.038948,Half Protected,#63CFAB,#9ED7C2,#257339,CC-BY 4.0,"MULTIPOLYGON (((158.7141 -69.60657, 158.71264 ..."
1,2.0,Admiralty Islands lowland rain forests,1.0,Tropical & Subtropical Moist Broadleaf Forests,Australasia,AU01,2,135,4.800349,0.170599,Nature Could Reach Half Protected,#70A800,#38A700,#7BC141,CC-BY 4.0,"MULTIPOLYGON (((147.28819 -2.57589, 147.2715 -..."
2,3.0,Aegean and Western Turkey sclerophyllous and m...,12.0,"Mediterranean Forests, Woodlands & Scrub",Palearctic,PA12,4,785,162.523044,13.844952,Nature Imperiled,#FF7F7C,#FE0000,#EE1E23,CC-BY 4.0,"MULTIPOLYGON (((26.88659 35.32161, 26.88297 35..."
3,4.0,Afghan Mountains semi-desert,13.0,Deserts & Xeric Shrublands,Palearctic,PA13,4,807,15.084037,1.355536,Nature Imperiled,#FA774D,#CC6767,#EE1E23,CC-BY 4.0,"MULTIPOLYGON (((65.48655 34.71401, 65.52872 34..."
4,5.0,Ahklun and Kilbuck Upland Tundra,11.0,Tundra,Nearctic,NE11,1,404,22.590087,8.196573,Half Protected,#4C82B6,#9ED7C2,#257339,CC-BY 4.0,"MULTIPOLYGON (((-160.26404 58.64097, -160.2673..."


In [4]:
ecoregion_gdf.index.name = 'ecoregion'

In [5]:
# Simplify the geometry to speed up processing
# (the tolerance is in CRS units, here degrees)
ecoregion_gdf.geometry = ecoregion_gdf.simplify(.1, preserve_topology=False)
# Change the CRS to Mercator for mapping
ecoregion_gdf = ecoregion_gdf.to_crs(ccrs.Mercator())
# Check that the plot runs in a reasonable amount of time
#ecoregion_gdf.hvplot(geo=True, crs=ccrs.Mercator())
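
To illustrate what the simplification tolerance means, here is a minimal pure-Python sketch of the Douglas-Peucker algorithm, which (to my understanding) is the approach behind `simplify` when `preserve_topology=False`. The function names and the toy line are hypothetical and are not part of GeoPandas; the real implementation operates on full geometries rather than coordinate lists.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, tolerance):
    """Drop vertices that lie within `tolerance` of the simplified line."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    distances = [point_segment_distance(p, first, last) for p in points[1:-1]]
    max_dist = max(distances)
    if max_dist <= tolerance:
        # Every interior vertex is close enough: keep only the endpoints
        return [first, last]
    # Otherwise keep the farthest vertex and recurse on both halves
    split = distances.index(max_dist) + 1
    left = douglas_peucker(points[: split + 1], tolerance)
    right = douglas_peucker(points[split:], tolerance)
    return left[:-1] + right


# Hypothetical toy line: one tiny wiggle and one real peak
line = [(0.0, 0.0), (1.0, 0.05), (2.0, 0.0), (3.0, 1.0), (4.0, 0.0)]
simplified = douglas_peucker(line, 0.1)
print(simplified)
```

With a tolerance of 0.1, the small wiggle is removed while the larger peak survives, which is the trade-off the tolerance controls for ecoregion outlines.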

In [6]:
# Convert the Swainson's Thrush occurrence data to a GeoDataFrame
swainsons_gdf = (
    gpd.GeoDataFrame(
        swainsons_gdf,
        # Build the points from the same table so rows and geometries stay aligned
        geometry=gpd.points_from_xy(
            swainsons_gdf.decimalLongitude,
            swainsons_gdf.decimalLatitude),
        crs="EPSG:4326")
)
# Reproject to Mercator to match the ecoregions
swainsons_gdf = swainsons_gdf.to_crs(ccrs.Mercator())

In [7]:
swainsons_ecoregion_gdf = (
    ecoregion_gdf
    # Match the CRS of the GBIF data and the ecoregions
    .to_crs(swainsons_gdf.crs)
    # Find ecoregion for each observation
    .sjoin(
        swainsons_gdf,
        how='inner', 
        predicate='contains')
    # Select the required columns
    [['OBJECTID', 'gbifID', 'ECO_NAME','BIOME_NUM','BIOME_NAME', 'month', 'SHAPE_AREA']]
)


# Aggregate the occurrences to ecoregion and month
swainsons_occ_df = (
    swainsons_ecoregion_gdf
    # For each ecoregion, for each month...
    .groupby(['ecoregion', 'month'])
    # ...count the occurrences and keep the ecoregion area
    .agg(occurrences=('gbifID', 'count'),
         area=('SHAPE_AREA', 'first'))
)
# Drop ecoregion/month combinations with only one record (possible misidentification?)
swainsons_occ_df = swainsons_occ_df[swainsons_occ_df.occurrences > 1]
swainsons_occ_df

ecoregion,month,occurrences,area
4,5.0,2,8.196573
4,6.0,28,8.196573
4,7.0,4,8.196573
4,8.0,5,8.196573
9,5.0,343,28.388010
...,...,...,...
833,9.0,1220,35.905513
833,10.0,49,35.905513
833,11.0,9,35.905513
833,12.0,6,35.905513
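
The aggregation step above can be sketched on a small table. The values below are hypothetical and only mimic the shape of the spatially joined data (an `ecoregion` index plus `gbifID`, `month`, and `SHAPE_AREA` columns); they are not taken from GBIF.

```python
import pandas as pd

# Hypothetical joined records: one row per observation, indexed by ecoregion
joined = pd.DataFrame(
    {
        "gbifID": [101, 102, 103, 104, 105, 106],
        "month": [5, 5, 5, 6, 5, 5],
        "SHAPE_AREA": [8.2, 8.2, 8.2, 8.2, 28.4, 28.4],
    },
    index=pd.Index([4, 4, 4, 4, 9, 9], name="ecoregion"),
)

# Count records per ecoregion and month, keeping the ecoregion area
counts = (
    joined
    .groupby(["ecoregion", "month"])
    .agg(occurrences=("gbifID", "count"), area=("SHAPE_AREA", "first"))
)

# Drop ecoregion/month combinations with a single record
filtered = counts[counts.occurrences > 1]
print(filtered)
```

In this toy table, ecoregion 4 has a single June record, so that combination is dropped by the `occurrences > 1` filter while the May rows remain.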


In [8]:
# Take the mean by ecoregion
st_mean_occ_ecoregion = (
    swainsons_occ_df
    .groupby('ecoregion')
    .mean()
)
# Take the mean by month
st_mean_occ_month = (
    swainsons_occ_df
    .groupby('month')
    .mean()
)
#st_mean_occ_ecoregion

In [9]:
# Normalize for sampling effort
swainsons_occ_df['norm_occurrences'] = (
    swainsons_occ_df[['occurrences']]
    / st_mean_occ_ecoregion[['occurrences']]
    / st_mean_occ_month[['occurrences']]
)
swainsons_occ_df

# Calculate observation density
#swainsons_occ_df['density'] = (
#    swainsons_occ_df.occurrences / swainsons_occ_df.area
#)

#swainsons_occ_df

ecoregion,month,occurrences,area,norm_occurrences
4,5.0,2,8.196573,0.000049
4,6.0,28,8.196573,0.001076
4,7.0,4,8.196573,0.000194
4,8.0,5,8.196573,0.000846
9,5.0,343,28.388010,0.000109
...,...,...,...,...
833,9.0,1220,35.905513,0.000123
833,10.0,49,35.905513,0.000012
833,11.0,9,35.905513,0.000007
833,12.0,6,35.905513,0.000005
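
The normalization divides each count by the mean count of its ecoregion and by the mean count of its month. A minimal sketch on hypothetical toy counts follows; it uses `groupby(...).transform("mean")`, which is intended as a more explicit equivalent of the index-aligned division in the cell above rather than a copy of it.

```python
import pandas as pd

# Hypothetical counts: ecoregion 2 has ten times more records than ecoregion 1,
# but both show the same seasonal pattern (three times more in June than May)
occ_df = pd.DataFrame(
    {
        "ecoregion": [1, 1, 2, 2],
        "month": [5, 6, 5, 6],
        "occurrences": [10, 30, 100, 300],
    }
).set_index(["ecoregion", "month"])

# Mean count per ecoregion and per month, broadcast back to every row
eco_mean = occ_df.groupby("ecoregion")["occurrences"].transform("mean")
month_mean = occ_df.groupby("month")["occurrences"].transform("mean")

occ_df["norm_occurrences"] = occ_df["occurrences"] / eco_mean / month_mean
print(occ_df)
```

After normalization all four rows carry the same value, because the only differences in the toy data come from how heavily an ecoregion or a month is sampled overall. Anything left over in real data is the signal the map is meant to show.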


In [10]:
# Join the ecoregions to the normalized occurrence data
swainsons_ecoregion_gdf = ecoregion_gdf.join(swainsons_occ_df)
swainsons_ecoregion_gdf

ecoregion,month,OBJECTID,ECO_NAME,BIOME_NUM,BIOME_NAME,REALM,ECO_BIOME_,NNH,ECO_ID,SHAPE_LENG,SHAPE_AREA,NNH_NAME,COLOR,COLOR_BIO,COLOR_NNH,LICENSE,geometry,occurrences,area,norm_occurrences
4,5.0,5.0,Ahklun and Kilbuck Upland Tundra,11.0,Tundra,Nearctic,NE11,1,404,22.590087,8.196573,Half Protected,#4C82B6,#9ED7C2,#257339,CC-BY 4.0,"MULTIPOLYGON (((-17930832.005 8046779.358, -17...",2,8.196573,0.000049
4,6.0,5.0,Ahklun and Kilbuck Upland Tundra,11.0,Tundra,Nearctic,NE11,1,404,22.590087,8.196573,Half Protected,#4C82B6,#9ED7C2,#257339,CC-BY 4.0,"MULTIPOLYGON (((-17930832.005 8046779.358, -17...",28,8.196573,0.001076
4,7.0,5.0,Ahklun and Kilbuck Upland Tundra,11.0,Tundra,Nearctic,NE11,1,404,22.590087,8.196573,Half Protected,#4C82B6,#9ED7C2,#257339,CC-BY 4.0,"MULTIPOLYGON (((-17930832.005 8046779.358, -17...",4,8.196573,0.000194
4,8.0,5.0,Ahklun and Kilbuck Upland Tundra,11.0,Tundra,Nearctic,NE11,1,404,22.590087,8.196573,Half Protected,#4C82B6,#9ED7C2,#257339,CC-BY 4.0,"MULTIPOLYGON (((-17930832.005 8046779.358, -17...",5,8.196573,0.000846
9,5.0,10.0,Alaska-St. Elias Range tundra,11.0,Tundra,Nearctic,NE11,2,405,98.400727,28.388010,Nature Could Reach Half Protected,#61D2F2,#9ED7C2,#7BC141,CC-BY 4.0,"MULTIPOLYGON (((-16886232.729 9049093.235, -16...",343,28.388010,0.000109
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
833,9.0,839.0,Northern Rockies conifer forests,5.0,Temperate Conifer Forests,Nearctic,NE05,2,361,56.924527,35.905513,Nature Could Reach Half Protected,#ACC13E,#458970,#7BC141,CC-BY 4.0,"POLYGON ((-13358313.218 7236575.932, -13331349...",1220,35.905513,0.000123
833,10.0,839.0,Northern Rockies conifer forests,5.0,Temperate Conifer Forests,Nearctic,NE05,2,361,56.924527,35.905513,Nature Could Reach Half Protected,#ACC13E,#458970,#7BC141,CC-BY 4.0,"POLYGON ((-13358313.218 7236575.932, -13331349...",49,35.905513,0.000012
833,11.0,839.0,Northern Rockies conifer forests,5.0,Temperate Conifer Forests,Nearctic,NE05,2,361,56.924527,35.905513,Nature Could Reach Half Protected,#ACC13E,#458970,#7BC141,CC-BY 4.0,"POLYGON ((-13358313.218 7236575.932, -13331349...",9,35.905513,0.000007
833,12.0,839.0,Northern Rockies conifer forests,5.0,Temperate Conifer Forests,Nearctic,NE05,2,361,56.924527,35.905513,Nature Could Reach Half Protected,#ACC13E,#458970,#7BC141,CC-BY 4.0,"POLYGON ((-13358313.218 7236575.932, -13331349...",6,35.905513,0.000005


In [11]:
# summary statistics 

# Total number of ecoregions with observations
num_ecoregions = swainsons_occ_df.index.get_level_values('ecoregion').nunique()
print(f"Number of ecoregions with >1 observation: {num_ecoregions}")

# Total number of months with observations
num_months = swainsons_occ_df.index.get_level_values('month').nunique()
print(f"Number of months with observations: {num_months}")

# Ecoregion/month with the highest normalized occurrence
max_norm = swainsons_occ_df['norm_occurrences'].idxmax()
max_norm_val = swainsons_occ_df['norm_occurrences'].max()
print(f"Ecoregion/month with highest normalized occurrence: {max_norm} (value: {max_norm_val:.2f})")

# Month with the most total occurrences (summed across ecoregions)
month_totals = swainsons_occ_df.groupby('month')['occurrences'].sum()
peak_month = month_totals.idxmax()
print(f"Month with most total occurrences: {peak_month} ({month_totals[peak_month]})")

# Top 5 ecoregions by total occurrences
ecoregion_totals = swainsons_occ_df.groupby('ecoregion')['occurrences'].sum().sort_values(ascending=False)
print("Top 5 ecoregions by total occurrences:")
print(ecoregion_totals.head())

Number of ecoregions with >1 observation: 210
Number of months with observations: 12
Ecoregion/month with highest normalized occurrence: (np.int64(753), np.float64(11.0)) (value: 0.01)
Month with most total occurrences: 5.0 (615542)
Top 5 ecoregions by total occurrences:
ecoregion
674    172945
138    128261
573    114974
499     81633
471     69266
Name: occurrences, dtype: int64
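
The peak normalized value prints as `0.01` because the `:.2f` format rounds to two decimals, which hides most of the detail in values this small. A minimal sketch, using a hypothetical value rather than one from this dataset, of how scientific notation keeps more significant digits:

```python
# Hypothetical small value, chosen only to illustrate the formatting
example_val = 0.0123

fixed = f"{example_val:.2f}"       # fixed-point, two decimals
scientific = f"{example_val:.2e}"  # scientific notation, three significant digits

print(f"fixed-point: {fixed}")
print(f"scientific:  {scientific}")
```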


In [12]:
# Set up a slider widget labeled with month names
mon_widget = pn.widgets.DiscreteSlider(
    options={
        calendar.month_name[month_num]: month_num
        for month_num in range(1, 13)
    }
)
#mon_widget
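
The slider options are a plain dictionary mapping month names to month numbers. A small sketch of that mapping on its own, without Panel, shows why the range starts at 1 (`month_options` is a name used here only for illustration):

```python
import calendar

# calendar.month_name[0] is an empty string, so months are indexed from 1
month_options = {
    calendar.month_name[month_num]: month_num
    for month_num in range(1, 13)
}

for name, number in month_options.items():
    print(number, name)
```

Month names come from the active locale, so the labels may differ on systems that are not set to English.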

In [13]:
# Get the plot bounds so they don't change with the slider
xmin, ymin, xmax, ymax = swainsons_ecoregion_gdf.to_crs(ccrs.Mercator()).total_bounds

# Plot occurrence by ecoregion and month
swainsons_migration_plot = (
    swainsons_ecoregion_gdf.hvplot(
        c='norm_occurrences',
        groupby='month',
        # Use background tiles
        geo=True, crs=ccrs.Mercator(), tiles='CartoLight',
        title="Swainson's Thrush Migration Across Ecoregions in 2023",
        xlim=(xmin, xmax), ylim=(ymin, ymax),
        frame_height=600,
        widgets={'month': mon_widget},
        widget_location='bottom'
    )
)


# Show the plot
swainsons_migration_plot

[Interactive map output: Swainson's Thrush migration across ecoregions in 2023, with ecoregions colored by norm_occurrences and a month slider at the bottom]

####Summary of the plot above:

The interactive migration map shows where and when Swainson’s Thrush was recorded across North American ecoregions during 2023. Observation counts are normalized by each ecoregion's mean and each month's mean, which reduces the effect of uneven sampling effort and allows a fairer comparison of habitat use across ecoregions and months. The normalization does not adjust for ecoregion area, since the density calculation is commented out. The map suggests that Swainson’s Thrush predominantly occupies northern forested ecoregions during the breeding season, with a marked southward shift in distribution during autumn migration. The spread of breeding and stopover habitats highlights the species’ reliance on a diversity of ecological regions throughout its annual cycle. These patterns underscore the importance of conserving a network of habitats along migratory routes to support population connectivity and long-term species persistence.

In [14]:
#save the plot as html
swainsons_migration_plot.save('Swainsons_Thrush_Migration_2023.html', embed=True)
print("Plot saved as Swainsons_Thrush_Migration_2023.html")

#also saving to my home folder
swainsons_migration_plot.save('/Users/niko2485/Library/CloudStorage/OneDrive-UCB-O365/Desktop/Data_ES/species_migration_data/Swainsons_Thrush_Migration_2023.html', embed=True)

Plot saved as Swainsons_Thrush_Migration_2023.html
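
The second save call uses a user-specific absolute path, which will fail on another machine. A minimal sketch of building the output path with `pathlib` instead is shown below. The helper `build_output_path` and the temporary base directory are hypothetical; in the notebook, `Path.home()` or a project folder could be passed as the base.

```python
import tempfile
from pathlib import Path


def build_output_path(base_dir, filename="Swainsons_Thrush_Migration_2023.html"):
    """Return base_dir/species_migration_data/filename, creating the folder."""
    out_dir = Path(base_dir) / "species_migration_data"
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir / filename


# A temporary directory stands in for the real base folder in this sketch
demo_base = Path(tempfile.mkdtemp())
out_path = build_output_path(demo_base)
print(out_path)
```

The resulting path can be passed to the plot's `save` call as `str(out_path)`.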