## Examining Migration of the Pacific Loon (Gavia pacifica) in 2023 using data from the Global Biodiversity Information Facility (GBIF)

The Pacific loon, *Gavia pacifica* (*Colimbo pacífico* in Spanish), is one of five species of loons found in North America. Loons as a group are known for their distinctive, "haunting" calls, which include wails, hoots, and yodels [(Loon Preservation Committee, n.d.)](https://loon.org/the-call-of-the-loon/). Male and female loons look alike, with black or dark gray heads and dark plumage that is spotted or striped with white.

Pacific loons are medium- to long-distance migrants. They breed throughout Alaska and northern Canada and spend winters off the Pacific coast of North America, from southern Alaska through Baja California, Mexico, and the Sea of Cortez [(Billerman et al., 2022)](https://www.allaboutbirds.org/guide/Pacific_Loon/maps-range). However, Pacific loons have also been spotted in many other parts of the U.S., including along the Atlantic coast, and in the Northwestern Passages in Canada [(eBird, 2021)](https://ebird.org/species/pacloo).

According to [Audubon.org](https://explorer.audubon.org/explore/species/1497/pacific-loon) (2024), the Pacific loon is a species of least concern on the International Union for Conservation of Nature (IUCN) Red List, with an estimated global population of about 840,000 individuals.

<footer>
    <a data-flickr-embed="true" data-footer="true" href="https://www.flickr.com/photos/mickthompson/19238537242/in/photolist-vj3wRY-2ovBQMc-b9j9ov-2j3jZHY-2ntaWdC-2oC6cDC-2nntnDC-HLwDEG-28kJ9Ly-4S5YCf-2iRcV9S-S2jQjR-2j3rboZ-2n4xgwt-2nnvPT4-284dNe6-Cg8B2p-Xnruj1-GDe29X-vhZQpu-28HdFvN-unJp9V-2nideZQ-2qfonJJ-qpkqQ2-NbwGEF-2nt9Ddg-PhHejN-2p6MXBb-Xnruhh-PhGWJ3-PhGUzU-ca4y95-PhHaUm-2p6LnBY-CfVufJ-21we98p-2f1Ncy5-zpZ35i-8UrVUs-4c2ozn-LaWqw5-2f1NbuG-H89Bjd-GmKeaf-Ciw2Gy-2nt9DfR-pVWECy-2nt3aDZ-2nt9CDf" title="Pacific Loon"><img src="https://live.staticflickr.com/65535/19238537242_f9b5c528f7_z.jpg" width="640" height="427" alt="Pacific Loon"/></a><script async src="//embedr.flickr.com/assets/client-code.js" charset="utf-8"></script>
    <p>Image credit: Flickr/Mick Thompson, CC BY-NC 2.0</p>
</footer>

### GBIF

The Global Biodiversity Information Facility (GBIF) website compiles a variety of species identification and observation data, including observations by citizen scientists. 

In [1]:
%store -r

import os
import pathlib
import time
import calendar 
import zipfile
from getpass import getpass
from glob import glob

import geopandas as gpd
import pandas as pd
import pygbif.occurrences as occ
import pygbif.species as species

#dynamic mapping
import hvplot.pandas
import cartopy.crs as ccrs
import panel as pn

In [None]:
reset_credentials = True
# GBIF needs a username, password, and email
credentials = dict(
    GBIF_USER=(input, 'GBIF username: '),
    GBIF_PWD=(getpass, 'GBIF password: '),
    GBIF_EMAIL=(input, 'GBIF email: '),
)
for env_variable, (prompt_func, prompt_text) in credentials.items():
    # Delete credential from environment if requested
    if reset_credentials and (env_variable in os.environ):
        os.environ.pop(env_variable)
    # Ask for credential and save to environment
    if env_variable not in os.environ:
        os.environ[env_variable] = prompt_func(prompt_text)

In [None]:
# Create data directory in the home folder
data_dir = os.path.join(
    # Home directory
    pathlib.Path.home(),
    # Earth analytics data directory
    'earth-analytics',
    'data',
    # Project directory
    'spec_dist_pacific_loon',
)
os.makedirs(data_dir, exist_ok=True)

# Define the directory name for GBIF data
gbif_dir = os.path.join(data_dir, 'spec_dist_pacific_loon')

In [None]:
# Query species
species_info = species.name_lookup('Gavia pacifica', rank='SPECIES')

# Get the first result
first_result = species_info['results'][0]

# Get the species key (nubKey)
species_key = first_result['nubKey']

# Check the result
#first_result['species'], species_key

In [None]:
# Only download once!
gbif_pattern = os.path.join(gbif_dir, '*.csv')
if not glob(gbif_pattern):
    # Only submit one request
    if 'GBIF_DOWNLOAD_KEY' not in os.environ:
        # Submit query to GBIF (2481955 is the species key for Gavia pacifica)
        gbif_query = occ.download([
            "speciesKey = 2481955",
            "year = 2023",
            "hasCoordinate = True",
        ])
        os.environ['GBIF_DOWNLOAD_KEY'] = gbif_query[0]

    # Wait for the download to build, checking the status every 5 seconds
    download_key = os.environ['GBIF_DOWNLOAD_KEY']
    while occ.download_meta(download_key)['status'] != 'SUCCEEDED':
        time.sleep(5)

    # Download GBIF data
    download_info = occ.download_get(download_key, path=data_dir)

    # Unzip GBIF data
    with zipfile.ZipFile(download_info['path']) as download_zip:
        download_zip.extractall(path=gbif_dir)

# Find the extracted .csv file path (take the first result)
gbif_path = glob(gbif_pattern)[0]

# Load the GBIF data
gaviapac_gbif_df = pd.read_csv(
    gbif_path,
    sep='\t',
    index_col='gbifID',
    header='infer',
    usecols=['gbifID', 'occurrenceID', 'species', 'scientificName', 
             'countryCode', 'occurrenceStatus', 'individualCount',
             'decimalLatitude', 'decimalLongitude', 'month', 'year', 
             'speciesKey', 'basisOfRecord']
    )
gaviapac_gbif_df.head()
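
The wait loop in the cell above polls GBIF until the download status is `SUCCEEDED`, so it would never finish if a request failed. A minimal sketch of a more defensive poll is shown below. `wait_for_status` and its parameters are hypothetical names, not part of pygbif, and the set of failure statuses is an assumption that should be checked against the GBIF API documentation.

```python
import time


def wait_for_status(get_status, done='SUCCEEDED',
                    failed=('FAILED', 'KILLED', 'CANCELLED'),
                    interval=5, timeout=3600, sleep=time.sleep):
    """Poll get_status() until it returns `done`.

    get_status is a zero-argument callable that returns a status string.
    Raises RuntimeError on a failure status (assumed names) and
    TimeoutError once `timeout` seconds of waiting have passed.
    """
    waited = 0
    while True:
        status = get_status()
        if status == done:
            return status
        if status in failed:
            raise RuntimeError(f'Download ended with status {status}')
        if waited >= timeout:
            raise TimeoutError(f'Still {status} after {waited} seconds')
        sleep(interval)
        waited += interval
```

In this notebook it could be called as `wait_for_status(lambda: occ.download_meta(download_key)['status'])`.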

In [2]:
#Get the ecoregions shapefile
ecoreg_shp_dir = os.path.join(
    # Home directory
    pathlib.Path.home(),
    # Earth analytics data directory
    'earth-analytics',
    'data',
    # Project directory
    'species_dist_coding_assign',
    'ecoregions_dirname'
)
os.makedirs(ecoreg_shp_dir, exist_ok=True)
ecoregion_shppath = os.path.join(ecoreg_shp_dir, 'ecoregions_filename.shp')
# Open up the ecoregions boundaries
ecoreg_gdf = gpd.read_file(ecoregion_shppath)

# Name the index so it will match the other data later on
ecoreg_gdf.index.name = 'ecoregion'
#ecoreg_gdf.crs

In [4]:
# Simplify the geometry to speed up processing
ecoreg_gdf.geometry = ecoreg_gdf.simplify(.1, preserve_topology=False)
# Change the CRS to Mercator for mapping
ecoreg_gdf = ecoreg_gdf.to_crs(ccrs.Mercator())
# Check that the plot runs in a reasonable amount of time
#ecoreg_gdf.hvplot(geo=True, crs=ccrs.Mercator())

In [None]:
#convert the pacific loon occurrence data to a geodataframe
gaviapac_gbif_gdf = (
    gpd.GeoDataFrame(
        gaviapac_gbif_df, 
        geometry=gpd.points_from_xy(
            gaviapac_gbif_df.decimalLongitude, 
            gaviapac_gbif_df.decimalLatitude), 
        crs="EPSG:4326")
    # Select the desired columns
    #[['gbifID', 'decimalLatitude', 'decimalLongitude', 'month']]
)
gaviapac_gbif_gdf = gaviapac_gbif_gdf.to_crs(ccrs.Mercator())
#gaviapac_gbif_gdf
#gaviapac_gbif_gdf.crs

| gbifID | occurrenceID | species | scientificName | countryCode | occurrenceStatus | individualCount | decimalLatitude | decimalLongitude | month | year | speciesKey | basisOfRecord | geometry |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4953151418 | https://www.inaturalist.org/observations/17616... | Gavia pacifica | Gavia pacifica (Lawrence, 1858) | CA | PRESENT | | 58.765791 | -94.122485 | 8 | 2023 | 2481955 | HUMAN_OBSERVATION | POINT (-10477667.102 8093368.02) |
| 4950273871 | https://www.inaturalist.org/observations/24385... | Gavia pacifica | Gavia pacifica (Lawrence, 1858) | US | PRESENT | | 32.758759 | -117.245769 | 11 | 2023 | 2481955 | HUMAN_OBSERVATION | POINT (-13051739.303 3840207.905) |
| 4946632056 | https://www.inaturalist.org/observations/23034... | Gavia pacifica | Gavia pacifica (Lawrence, 1858) | US | PRESENT | | 36.909267 | -122.026831 | 11 | 2023 | 2481955 | HUMAN_OBSERVATION | POINT (-13583964.69 4400804.577) |
| 4937187752 | https://www.inaturalist.org/observations/14980... | Gavia pacifica | Gavia pacifica (Lawrence, 1858) | US | PRESENT | | 49.000610 | -123.167541 | 2 | 2023 | 2481955 | HUMAN_OBSERVATION | POINT (-13710947.946 6242699.205) |
| 4936192321 | https://www.inaturalist.org/observations/19074... | Gavia pacifica | Gavia pacifica (Lawrence, 1858) | US | PRESENT | | 34.725271 | -118.167093 | 11 | 2023 | 2481955 | HUMAN_OBSERVATION | POINT (-13154300.621 4102268.936) |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 4018248179 | https://www.inaturalist.org/observations/14657... | Gavia pacifica | Gavia pacifica (Lawrence, 1858) | CA | PRESENT | | 48.315404 | -123.650051 | 1 | 2023 | 2481955 | HUMAN_OBSERVATION | POINT (-13764660.714 6127561.023) |
| 4018104911 | https://www.inaturalist.org/observations/14636... | Gavia pacifica | Gavia pacifica (Lawrence, 1858) | US | PRESENT | | 36.959997 | -122.018938 | 1 | 2023 | 2481955 | HUMAN_OBSERVATION | POINT (-13583086.045 4407839.324) |
| 4015258054 | https://www.inaturalist.org/observations/14594... | Gavia pacifica | Gavia pacifica (Lawrence, 1858) | US | PRESENT | | 57.082025 | -135.379996 | 1 | 2023 | 2481955 | HUMAN_OBSERVATION | POINT (-15070432.218 7741003.326) |
| 4011669284 | https://www.inaturalist.org/observations/14571... | Gavia pacifica | Gavia pacifica (Lawrence, 1858) | US | PRESENT | | 39.517589 | -83.990892 | 1 | 2023 | 2481955 | HUMAN_OBSERVATION | POINT (-9349823.329 4768891.522) |


In [None]:

gaviapac_ecoregion_gdf = (
    ecoreg_gdf
    # Match the CRS of the GBIF data and the ecoregions
    .to_crs(gaviapac_gbif_gdf.crs)
    # Find ecoregion for each observation
    .sjoin(
        gaviapac_gbif_gdf,
        how='inner', 
        predicate='contains')
    # Select the required columns
    [['OBJECTID', 'gbifID', 'ECO_NAME','BIOME_NUM','BIOME_NAME', 'month', 'SHAPE_AREA']]
)
#gaviapac_ecoregion_gdf

# Aggregate the occurrences to ecoregion and month
gaviapac_occ_df = (
    gaviapac_ecoregion_gdf
    # For each ecoregion, for each month...
    .groupby(['ecoregion', 'month'])
    # ...count the number of occurrences and keep the ecoregion area
    .agg(occurrences=('gbifID', 'count'),
         area=('SHAPE_AREA', 'first'))
)
# Get rid of rare observations (possible misidentification?)
gaviapac_occ_df = gaviapac_occ_df[gaviapac_occ_df['occurrences'] > 1].copy()

# Normalize by area
gaviapac_occ_df['density'] = (
    gaviapac_occ_df['occurrences'] / gaviapac_occ_df['area']
)

#gaviapac_occ_df
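
To make the aggregation step concrete, here is a small sketch on made-up data. The ecoregion labels, areas, and counts are invented for illustration and are not GBIF results. Only the column names `month`, `gbifID`, and `SHAPE_AREA` follow the notebook.

```python
import pandas as pd

# Toy observations: one row per occurrence, already matched to an ecoregion
toy_obs = pd.DataFrame({
    'ecoregion': ['A', 'A', 'A', 'B', 'B', 'B'],
    'month': [1, 1, 6, 1, 1, 6],
    'gbifID': [101, 102, 103, 104, 105, 106],
    'SHAPE_AREA': [2.0, 2.0, 2.0, 4.0, 4.0, 4.0],
})

# Count occurrences per ecoregion and month, keeping the ecoregion area
toy_occ = (
    toy_obs
    .groupby(['ecoregion', 'month'])
    .agg(occurrences=('gbifID', 'count'),
         area=('SHAPE_AREA', 'first'))
)

# Drop ecoregion-months with only a single observation
toy_occ = toy_occ[toy_occ['occurrences'] > 1].copy()

# Occurrences per unit area
toy_occ['density'] = toy_occ['occurrences'] / toy_occ['area']
print(toy_occ)
```

Two January observations in the smaller ecoregion A give a higher density than two in the larger ecoregion B, which is the reason for dividing by area.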

In [None]:
#check the monthly data values across all ecoregions
# gaviapac_occ_df.groupby('month').mean()

#check the data given plotting issues
#ecoregion_mean = gaviapac_occ_df.groupby('ecoregion').mean()
#ecoregion_mean
#ecoregion_mean.to_csv('ecoregion_means.csv')
#gaviapac_occ_df.to_csv('gaviapacifica_occur_df.csv')

In [None]:
# Merge/join the ecoregions to the normalized occurrence data
#gaviapac_occ_df.crs
gaviapac_ecoreg_gdf = ecoreg_gdf.join(gaviapac_occ_df)

# Check the data for plot troubleshooting 
#gaviapac_ecoreg_gdf.to_csv('gavia_pacifica_ecoreg_gdf.csv')

In [12]:
# Set up a slider widget labeled with the month names
mon_widget = pn.widgets.DiscreteSlider(
    options={calendar.month_name[month_num]: month_num
             for month_num in range(1, 13)}
)
#mon_widget
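
The `options` dictionary maps the label shown on the slider to the value passed to the plot. As a small illustration, the mapping can be inspected without Panel, using only the standard library (`month_options` is a name introduced here for the example):

```python
import calendar

# Month name -> month number, in calendar order
month_options = {calendar.month_name[month_num]: month_num
                 for month_num in range(1, 13)}

# Show the first three label/value pairs
print(list(month_options.items())[:3])
```

Note that `calendar.month_name` follows the current locale, so the labels are English month names only under an English or default locale.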

In [17]:
# Get the plot bounds so they don't change with the slider
xmin, ymin, xmax, ymax = gaviapac_ecoreg_gdf.to_crs(ccrs.Mercator()).total_bounds

# Plot occurrence density by ecoregion and month
gaviapac_migration_plot = (
    gaviapac_ecoreg_gdf.hvplot(
        # Color by the area-normalized 'density' column computed earlier
        c='density',
        groupby='month',
        # Use background tiles
        geo=True, crs=ccrs.Mercator(), tiles='CartoLight',
        title="Pacific Loon (Gavia pacifica) Migration Across Ecoregions in 2023",
        xlim=(xmin, xmax), ylim=(ymin, ymax),
        frame_height=600,
        widgets={'month': mon_widget},
        widget_location='bottom'
    )
)

# Save the plot
gaviapac_migration_plot.save('gaviapac_migration_plot_migration.html',
                             embed=True)

# Show the plot
gaviapac_migration_plot

### What the GBIF migration data tell us

According to GBIF-reported field observations from 2023, *Gavia pacifica* overwinters in the Amazon region of South America, primarily in Brazil. However, the Pacific loon is known to overwinter off the Pacific coast of the U.S., so these records are likely misidentifications of another species.

It is important to keep in mind that crowd-sourced data, such as the observations included in GBIF, are not guaranteed to be identified with 100% accuracy. It is therefore possible that some observations are actually of a different *Gavia* species, or even of a different genus entirely.
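
One way to act on this caution is to flag records that fall far outside the species' documented range before aggregating. The sketch below is an illustration under stated assumptions, not part of the original analysis: the `flag_out_of_range` helper is hypothetical, and the 15°N cutoff is an arbitrary placeholder rather than a published range limit.

```python
import pandas as pd


def flag_out_of_range(df, min_latitude=15.0):
    """Return a copy of df with a boolean 'out_of_range' column.

    min_latitude is a placeholder cutoff in decimal degrees, not a
    published range limit for Gavia pacifica.
    """
    flagged = df.copy()
    flagged['out_of_range'] = flagged['decimalLatitude'] < min_latitude
    return flagged


# Two coordinates from the occurrence table above plus one invented
# equatorial record, included only to show a flagged row
sample = pd.DataFrame({
    'decimalLatitude': [58.765791, 32.758759, -3.0],
    'decimalLongitude': [-94.122485, -117.245769, -60.0],
})
flagged = flag_out_of_range(sample)
print(flagged['out_of_range'].tolist())
```

Flagged rows could then be reviewed by hand or excluded before the ecoregion join.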

### References

National Audubon Society. 2024. Field guide: Pacific Loon. https://explorer.audubon.org/explore/species/1497/pacific-loon

Billerman, S. M., B. K. Keeney, P. G. Rodewald, and T. S. Schulenberg (Editors). 2022. Birds of the World. Cornell Laboratory of Ornithology, Ithaca, NY, USA. https://birdsoftheworld.org/bow/home

eBird. 2021. eBird: An online database of bird distribution and abundance [web application]. eBird, Cornell Lab of Ornithology, Ithaca, New York. Available: http://www.ebird.org.

GBIF.org. 22 October 2024. GBIF Occurrence Download. https://doi.org/10.15468/dl.c96k9k

Loon Preservation Committee. n.d. About Loons. https://loon.org/about-the-common-loon/

U.S. Geological Survey (USGS) - Gap Analysis Project (GAP), 2018, Pacific Loon (Gavia pacifica) bPALOx_CONUS_2001v1 Range Map: U.S. Geological Survey data release, https://doi.org/10.5066/F7MK6BXK.