## Willow Flycatcher

<img src="willow-flycatcher.png" alt="Willow Flycatcher" width="720" height="550" longdesc="https://macaulaylibrary.org/asset/451259001" /> 

**About**

Willow Flycatchers (*Empidonax traillii*) favor habitats with willows and other shrubs [near running and still water](https://www.audubon.org/field-guide/bird/willow-flycatcher). They are about 6 inches long, with brown, gray, and white feathers, rounded wings, and a square-tipped tail. The call of a Willow Flycatcher is a chirp, buzz, or trill that follows an undulating pattern. Nests are placed 4-15 feet above water or damp ground and built as an open cup of grass, bark, and plant fibers. The species migrates long distances, breeding in the U.S. and Canada and wintering in Mexico, Central America, and northern South America.

Willow Flycatchers remain common across most of their range despite a [25% decline](https://www.allaboutbirds.org/guide/Willow_Flycatcher/lifehistory) in population between 1966 and 2019. The loss of wet marshes, wet meadows, and riparian vegetation has contributed to [declining species abundance](https://www.fs.usda.gov/detail/tahoe/landmanagement/resourcemanagement/?cid=stelprdb5357314#:~:text=The%20scientific%20name%20for%20willow,t). According to the Bird Genoscape Project, there are seven genetically [distinct populations](https://www.birdgenoscape.org/willow-flycatcher/) of Willow Flycatcher in North America: Pacific Northwest; Kern, California; Southern California; White Mountain, Arizona; Interior West; Southwest; and Eastern. In my home state of California, there are three [endangered subspecies](https://www.fs.usda.gov/detail/tahoe/landmanagement/resourcemanagement/?cid=stelprdb5357314#:~:text=The%20scientific%20name%20for%20willow,t): the Southwestern Willow Flycatcher in central and southern California (federally and state listed), the Little Willow Flycatcher in the high-elevation Sierra Nevada (state listed), and the Great Basin Willow Flycatcher in desert riparian areas (state listed).

Researchers have found that gene variants associated with [adapting to wet and humid conditions](https://www.allaboutbirds.org/news/endangered-willow-flycatchers-in-san-diego-are-adapting-to-climate-change/) are more prevalent in the Southwestern Willow Flycatcher today than they were 100 years ago. This difference is likely due to interbreeding with populations in the Southwest and Pacific Northwest, producing an evolutionary response to climate change. Adaptations like these are why it is vital to preserve the connectivity of populations by protecting habitat and the birds' ability to move across the landscape.

#### Imports

In [1]:
%%bash
pip install pygbif



In [2]:
import os
import pathlib
import time
import zipfile
from getpass import getpass
from glob import glob
import pandas as pd

import pygbif.occurrences as occ
import pygbif.species as species

#### Analysis

In [3]:
# Create data directory in the home folder
data_dir = os.path.join(
    # Home directory
    pathlib.Path.home(),
    # Earth analytics data directory
    'earth-analytics',
    'data',
    # Project directory
    'species-distribution',
)
os.makedirs(data_dir, exist_ok=True)

# Define the directory name for GBIF data
gbif_dir = os.path.join(data_dir, 'willow-flycatcher', '2023')

In [4]:
gbif_dir

'/home/jovyan/earth-analytics/data/species-distribution/willow-flycatcher/2023'

#### Access GBIF

In [5]:
reset_credentials = False
# GBIF needs a username, password, and email.
# Each entry pairs a prompt function with the prompt text shown to the user,
# so no credential is stored in the notebook itself.
credentials = dict(
    GBIF_USER=(input, 'GBIF username: '),
    GBIF_PWD=(getpass, 'GBIF password: '),
    GBIF_EMAIL=(input, 'GBIF email: '),
)

for env_variable, (prompt_func, prompt_text) in credentials.items():
    # Delete credential from environment if requested
    if reset_credentials and (env_variable in os.environ):
        os.environ.pop(env_variable)
    # Ask for credential and save to environment
    if env_variable not in os.environ:
        os.environ[env_variable] = prompt_func(prompt_text)


In [6]:
# Query species
species_info = species.name_lookup('Empidonax traillii', rank='SPECIES')

# Get the first result
first_result = species_info['results'][0]

# Get the species key (nubKey)
species_key = first_result['nubKey']

# Check the result
first_result['species'], species_key

('Empidonax traillii', 2482786)

### Download data from GBIF

::: {.callout-task title="Submit a request to GBIF"}

1.  Replace `csv_file_pattern` with a string that will match **any**
    `.csv` file when used in the `glob` function. HINT: the character
    `*` represents any number of any values except the file separator
    (e.g. `/`)

2.  Add parameters to the GBIF download function, `occ.download()` to
    limit your query to:

    -   observations
    -   from 2023
    -   with spatial coordinates.

3.  Then, run the download. **This can take a few minutes**.

:::

In [8]:
# Only download once
gbif_pattern = os.path.join(gbif_dir, '*.csv')
if not glob(gbif_pattern):
    # Only submit one request to GBIF
    if 'GBIF_DOWNLOAD_KEY' not in os.environ:
        gbif_query = occ.download(
            [
                f"speciesKey = {species_key}",
                "year = 2023",
                "hasCoordinate = TRUE",
            ],
            # Credentials come from the environment variables set above
            user=os.environ['GBIF_USER'],
            pwd=os.environ['GBIF_PWD'],
            email=os.environ['GBIF_EMAIL'],
        )
        os.environ['GBIF_DOWNLOAD_KEY'] = gbif_query[0]

    # Wait for the download to build
    download_key = os.environ['GBIF_DOWNLOAD_KEY']
    wait = occ.download_meta(download_key)['status']
    while wait != 'SUCCEEDED':
        time.sleep(5)
        wait = occ.download_meta(download_key)['status']

    # Download GBIF data
    download_info = occ.download_get(download_key, path=data_dir)

    # Unzip GBIF data
    with zipfile.ZipFile(download_info['path']) as download_zip:
        download_zip.extractall(path=gbif_dir)

# Find the extracted .csv file path
gbif_path = glob(gbif_pattern)[0]

In [9]:
gbif_path

'/home/jovyan/earth-analytics/data/species-distribution/willow-flycatcher/2023/0001229-241007104925546.csv'
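
The download cell waits until the request status is `SUCCEEDED`, so it would keep looping if a request failed. Below is a minimal sketch of a bounded wait, not part of the original workflow. The helper `wait_for_status` and the stand-in `fake_status` are hypothetical names, and the failure statuses listed are an assumption about what GBIF may report. In the notebook, `get_status` could be something like `lambda: occ.download_meta(download_key)['status']`.

```python
import time


def wait_for_status(get_status, done='SUCCEEDED',
                    failed=('FAILED', 'KILLED', 'CANCELLED'),
                    interval=5, timeout=600, sleep=time.sleep):
    """Poll get_status() until it returns `done`; raise on failure or timeout."""
    waited = 0
    while True:
        status = get_status()
        if status == done:
            return status
        if status in failed:  # assumed failure statuses
            raise RuntimeError(f'Download ended with status {status}')
        if waited >= timeout:
            raise TimeoutError(f'Still {status} after {waited} seconds')
        sleep(interval)
        waited += interval


# Stand-in for the real status call so the sketch runs offline
_statuses = iter(['PREPARING', 'RUNNING', 'SUCCEEDED'])


def fake_status():
    return next(_statuses)


result = wait_for_status(fake_status, interval=0, sleep=lambda seconds: None)
print(result)
```

Passing the status function as an argument keeps the polling logic testable without a GBIF account or network access.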

### Load the GBIF data into Python

<link rel="stylesheet" type="text/css" href="./assets/styles.css"><div class="callout callout-style-default callout-titled callout-task"><div class="callout-header"><div class="callout-icon-container"><i class="callout-icon"></i></div><div class="callout-title-container flex-fill">Try It: Load GBIF data</div></div><div class="callout-body-container callout-body"><ol type="1">
<li>Look at the beginning of the file you downloaded using the code
below. What do you think the <strong>delimiter</strong> is?</li>
<li>Run the following code cell. What happens?</li>
<li>Uncomment and modify the parameters of <code>pd.read_csv()</code>
below until your data loads successfully and you have only the columns
you want.</li>
</ol></div></div>

You can use the following code to look at the beginning of your file:

In [10]:
!head -n 2 $gbif_path 

gbifID	datasetKey	occurrenceID	kingdom	phylum	class	order	family	genus	species	infraspecificEpithet	taxonRank	scientificName	verbatimScientificName	verbatimScientificNameAuthorship	countryCode	locality	stateProvince	occurrenceStatus	individualCount	publishingOrgKey	decimalLatitude	decimalLongitude	coordinateUncertaintyInMeters	coordinatePrecision	elevation	elevationAccuracy	depth	depthAccuracy	eventDate	day	month	year	taxonKey	speciesKey	basisOfRecord	institutionCode	collectionCode	catalogNumber	recordNumber	identifiedBy	dateIdentified	license	rightsHolder	recordedBy	typeStatus	establishmentMeans	lastInterpreted	mediaType	issue
4846727844	50c9509d-22c7-4a22-a47d-8c48425ef4a7	https://www.inaturalist.org/observations/176130744	Animalia	Chordata	Aves	Passeriformes	Tyrannidae	Empidonax	Empidonax traillii		SPECIES	Empidonax traillii (Audubon, 1828)	Empidonax traillii		US		Illinois	PRESENT		28eb1a3f-1c15-4a95-931a-4af90ecb574d	41.840597	-88.075757	61.0						2023-08-01T09:28	1	8	2023	2482786	
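
One way to check a guess about the delimiter is to count how often each candidate character appears in the header line. The sketch below does that for the first few column names from the header above; `guess_delimiter` is a hypothetical helper, and the candidate list is an assumption.

```python
def guess_delimiter(header_line, candidates=('\t', ',', ';', '|')):
    """Return the candidate character that occurs most often in the header."""
    return max(candidates, key=header_line.count)


# First few column names from the GBIF header, separated by tabs
header = 'gbifID\tdatasetKey\toccurrenceID\tkingdom\tphylum'
delimiter = guess_delimiter(header)
print(repr(delimiter))
```

The result can then be passed to the `delimiter` parameter of `pd.read_csv()`.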

#### Year 2023

In [11]:
# Load the GBIF data
gbif_df_2023 = pd.read_csv(
    gbif_path, 
    delimiter='\t',
    index_col='gbifID',
    on_bad_lines='skip',
    usecols=['gbifID', 'month', 'year', 'countryCode', 'stateProvince', 'decimalLatitude', 'decimalLongitude']
)

gbif_df_2023.head(2)

| gbifID | countryCode | stateProvince | decimalLatitude | decimalLongitude | month | year |
|---|---|---|---|---|---|---|
| 4846727844 | US | Illinois | 41.840597 | -88.075757 | 8 | 2023 |
| 4458666436 | CA | Alberta | 51.1703 | -115.593645 | 6 | 2023 |


In [20]:
len(gbif_df_2023["countryCode"].unique())

14

In [21]:
gbif_df_2023["countryCode"].unique()

array(['US', 'CA', 'EC', 'MX', 'NI', 'HN', 'GT', 'CR', 'CO', 'VE', 'BZ',
       'PA', 'SV', 'BM'], dtype=object)

In [13]:
# United States 

wf_US = gbif_df_2023.loc[gbif_df_2023['countryCode'] == 'US']
wf_US.value_counts()

countryCode  stateProvince  decimalLatitude  decimalLongitude  month  year
US           Pennsylvania   39.889350        -75.260150        5      2023    227
                                                               6      2023    216
             Illinois       41.963383        -87.634420        5      2023    177
             Ohio           41.451912        -82.667366        5      2023    169
             Pennsylvania   40.271667        -76.247734        7      2023    166
                                                                             ... 
             Wyoming        44.811672        -108.800170       5      2023      1
                            44.659695        -106.948850       5      2023      1
                            44.510040        -109.146800       6      2023      1
                            44.462100        -110.853570       5      2023      1
                            44.438520        -109.217010       8      2023      1
Name: count, Length: 32

In [14]:
# Canada

wf_CA = gbif_df_2023.loc[gbif_df_2023['countryCode'] == 'CA']
wf_CA.value_counts()

countryCode  stateProvince     decimalLatitude  decimalLongitude  month  year
CA           Ontario           43.628270        -79.32917         5      2023    160
             British Columbia  49.234290        -122.79964        7      2023    142
                               48.319800        -123.54715        8      2023    139
                               49.234290        -122.79964        6      2023    115
             Ontario           41.955400        -82.51400         5      2023    109
                                                                                ... 
                               42.325730        -82.89813         6      2023      1
             British Columbia  49.235115        -122.89093        6      2023      1
             Ontario           42.325302        -82.92369         5      2023      1
             British Columbia  49.235120        -122.63271        6      2023      1
             Ontario           42.335026        -81.85847         7     

In [17]:
# Ecuador

wf_EC = gbif_df_2023.loc[gbif_df_2023['countryCode'] == 'EC']
wf_EC.value_counts()

countryCode  stateProvince  decimalLatitude  decimalLongitude  month  year
EC           Napo           -1.083466        -77.577210        10     2023    7
             Sucumbíos      -0.438398        -76.170040        3      2023    6
                            -0.439735        -76.280334        3      2023    4
                            -0.418850        -76.530075        3      2023    4
                            -0.410007        -76.566800        2      2023    3
                            -0.408760        -76.569570        4      2023    3
                            -0.366095        -76.589090        1      2023    2
                            -0.407598        -76.562935        1      2023    2
                            -0.473889        -76.316444        1      2023    2
             Orellana       -0.492047        -76.331710        1      2023    2
             Chimborazo     -1.738496        -78.755394        11     2023    2
             Orellana       -0.498996        

In [18]:
# Mexico 

wf_MX = gbif_df_2023.loc[gbif_df_2023['countryCode'] == 'MX']
wf_MX.value_counts()

countryCode  stateProvince  decimalLatitude  decimalLongitude  month  year
MX           Tabasco        17.989754        -92.973090        9      2023    18
             Oaxaca         15.761899        -96.127110        11     2023    12
             Sonora         27.009628        -108.913380       10     2023    10
             Veracruz       19.469866        -96.788970        9      2023     8
             Nayarit        21.528181        -105.219970       2      2023     7
                                                                              ..
             Veracruz       18.543131        -95.148419        5      2023     1
                            18.455866        -95.185870        10     2023     1
                            19.110413        -96.121056        5      2023     1
                            19.067734        -96.075860        5      2023     1
                            19.490616        -96.332400        5      2023     1
Name: count, Length: 212, dtype: i
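
The per-country cells above repeat the same filter for each country code. As a rough alternative, the records for every country can be summarized in one step. This sketch uses a tiny inline table instead of the real download: the `gbifID` values are made up, the other values are loosely based on the outputs above, and `sample_df` is a hypothetical stand-in for `gbif_df_2023`.

```python
import io

import pandas as pd

# Tiny tab-separated stand-in for the GBIF file (gbifID values are made up)
sample_tsv = (
    'gbifID\tcountryCode\tstateProvince\tdecimalLatitude\tdecimalLongitude\tmonth\tyear\n'
    '1\tUS\tIllinois\t41.840597\t-88.075757\t8\t2023\n'
    '2\tCA\tAlberta\t51.1703\t-115.593645\t6\t2023\n'
    '3\tUS\tPennsylvania\t39.88935\t-75.26015\t5\t2023\n'
    '4\tMX\tTabasco\t17.989754\t-92.97309\t9\t2023\n'
)
sample_df = pd.read_csv(
    io.StringIO(sample_tsv), delimiter='\t', index_col='gbifID')

# Number of records per country
records_per_country = sample_df['countryCode'].value_counts()

# First and last month with a record in each country
month_range = sample_df.groupby('countryCode')['month'].agg(['min', 'max'])

print(records_per_country)
print(month_range)
```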

In [None]:
# hv plot migration 
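
The migration plot itself is left as a placeholder above. One possible preparation step, sketched here under assumptions, is to summarize where records fall in each month; the mean latitude per month gives a rough picture of north-south movement. The `records` table is hypothetical sample data, not the GBIF download.

```python
import pandas as pd

# Hypothetical records with only the two columns needed for the summary
records = pd.DataFrame({
    'month': [1, 1, 5, 5, 8, 10],
    'decimalLatitude': [-0.37, -0.47, 39.89, 41.96, 41.84, -1.08],
})

# Record count and mean latitude for each month
monthly = (
    records
    .groupby('month')['decimalLatitude']
    .agg(['count', 'mean'])
    .rename(columns={'count': 'n_records', 'mean': 'mean_latitude'})
)
print(monthly)
```

A table like `monthly` could feed a line or point plot with month on the x-axis, whichever plotting library is used.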



#### Year 2022

In [54]:
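# Load the GBIF data
# NOTE: gbif_path must point to the 2022 download here. Re-run the download
# cells with "year = 2022" and a '2022' directory before running this cell.
gbif_df_2022 = pd.read_csv(
    gbif_path, 
    delimiter='\t',
    index_col='gbifID',
    on_bad_lines='skip',
    usecols=['gbifID', 'countryCode', 'stateProvince', 'decimalLatitude', 'decimalLongitude', 'month', 'year']
)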

gbif_df_2022.head(2)

| gbifID | countryCode | stateProvince | decimalLatitude | decimalLongitude | month | year |
|---|---|---|---|---|---|---|
| 4901985213 | US | California | 38.341145 | -119.924987 | 6 | 2022 |
| 4901604313 | US | California | 38.341145 | -119.924987 | 6 | 2022 |


In [55]:
gbif_df_2022.value_counts()

countryCode  stateProvince  decimalLatitude  decimalLongitude  month  year
US           California     34.360283        -118.395890       4      2022    170
             Arizona        32.392628        -110.702470       8      2022    133
             Pennsylvania   39.855350        -75.445404        11     2022    121
             Arizona        32.417030        -110.725080       8      2022    114
             California     32.553860        -117.084620       5      2022    103
                                                                             ... 
             Washington     47.500560        -123.282100       5      2022      1
                            47.472195        -123.838000       7      2022      1
                            47.469204        -122.914185       5      2022      1
                            47.468520        -123.845520       7      2022      1
                            47.464817        -123.852104       7      2022      1
Name: count, Length: 82

#### Year 2021

In [60]:
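# Load the GBIF data
# NOTE: gbif_path must point to the 2021 download here. Re-run the download
# cells with "year = 2021" and a '2021' directory before running this cell.
gbif_df_2021 = pd.read_csv(
    gbif_path, 
    delimiter='\t',
    index_col='gbifID',
    on_bad_lines='skip',
    usecols=['gbifID', 'countryCode', 'stateProvince', 'decimalLatitude', 'decimalLongitude', 'month', 'year']
)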

gbif_df_2021.head(2)

| gbifID | countryCode | stateProvince | decimalLatitude | decimalLongitude | month | year |
|---|---|---|---|---|---|---|
| 4936044596 | US | California | 40.893766 | -123.769021 | 5 | 2021 |
| 4908633473 | US | California | 40.190598 | -121.102956 | 5 | 2021 |


In [61]:
gbif_df_2021.value_counts()

countryCode  stateProvince  decimalLatitude  decimalLongitude  month  year
US           California     34.360283        -118.39589        4      2021    179
                            32.553860        -117.08462        5      2021    145
                            37.995620        -122.97821        9      2021    125
             Arizona        31.727170        -110.88075        4      2021    103
                            31.917200        -109.27910        8      2021    101
                                                                             ... 
GT           Alta Verapaz   15.306603        -90.45372         10     2021      1
                            15.412526        -90.40979         8      2021      1
US           Washington     47.031920        -123.09377        7      2021      1
                                                               8      2021      1
                            47.040035        -122.49458        5      2021      1
Name: count, Length: 78
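
With three yearly tables loaded, it can help to compare them side by side. The sketch below stacks small hypothetical stand-ins for `gbif_df_2021`, `gbif_df_2022`, and `gbif_df_2023` and counts records per country and year. The sample rows are invented for illustration and do not reflect the real counts.

```python
import pandas as pd

# Hypothetical stand-ins for the three yearly DataFrames
frames = {
    2021: pd.DataFrame({'countryCode': ['US', 'GT'], 'year': [2021, 2021]}),
    2022: pd.DataFrame({'countryCode': ['US', 'US'], 'year': [2022, 2022]}),
    2023: pd.DataFrame({'countryCode': ['US', 'CA', 'MX'],
                        'year': [2023, 2023, 2023]}),
}

# Stack the yearly tables into one
all_years = pd.concat(list(frames.values()), ignore_index=True)

# Rows are countries, columns are years, values are record counts
counts = pd.crosstab(all_years['countryCode'], all_years['year'])
print(counts)
```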

#### References

Afzal, P. (2024, January 4). *Endangered Willow Flycatchers in San Diego are adapting to climate change.* Cornell Lab of Ornithology. https://www.allaboutbirds.org/news/endangered-willow-flycatchers-in-san-diego-are-adapting-to-climate-change 

Bird Genoscape Project. (n.d.). *Willow Flycatcher.* https://www.birdgenoscape.org/willow-flycatcher 

Cornell Lab of Ornithology. (n.d.). *Willow Flycatcher life history.* All About Birds. https://www.allaboutbirds.org/guide/Willow_Flycatcher/lifehistory

National Audubon Society. (n.d.). *Willow Flycatcher.* Audubon. https://www.audubon.org/field-guide/bird/willow-flycatcher

Tahoe National Forest. (n.d.). *Willow Flycatcher - introduction.* U.S. Forest Service. https://www.fs.usda.gov/detail/tahoe/landmanagement/resourcemanagement/?cid=stelprdb5357314#:~:text=The%20scientific%20name%20for%20willow,t 





