# US Gap Analysis Project - WV Breeding Bird Atlas Data Comparison
Nathan Tarr and Jessie Jordan

## Species Lists
We investigated the agreement between the WV Breeding Bird Atlas (WVBBA; 2011-2015) and USGS Gap Analysis Project (GAP) data on which species occupy West Virginia (WV).  We also examined which species were reported to eBird between 2011 and 2015.

This document demonstrates a comparison of three species lists:

__Gap Analysis Project__ -- we queried the GAP range data to retrieve a list of species that GAP predicts occur within WV during summer or winter.

__WVBBA__ -- we summarized the species detected during WVBBA surveys.

__eBird__ -- we used the R rebird package to pull a list of species and their frequencies of detection from the eBird API for WV during May and June of 2011 through 2015.  The table returned had the following columns:

"monthQt": month and week (eBird divides each month into four weeks)

"comName": species common name

"frequency": proportion of times the species was seen in the specified week

"sampleSize": number of complete eBird checklists submitted for the specified week

We collapsed each species' weekly frequencies of detection into a single value by taking their mean.
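
The aggregation described above can be sketched on a toy table. This is only an illustration: the rows and the `monthQt` value format are made up, and only the column names come from the rebird output described above. Note that the mean is unweighted, so weeks with few checklists count as much as weeks with many.

```python
import pandas as pd

# Hypothetical weekly records; values and the monthQt format are invented.
weekly = pd.DataFrame({
    "monthQt": ["May-1", "May-2", "June-1", "May-1", "June-1"],
    "comName": ["Wood Thrush", "Wood Thrush", "Wood Thrush", "Osprey", "Osprey"],
    "frequency": [0.30, 0.40, 0.50, 0.02, 0.04],
    "sampleSize": [100, 120, 80, 100, 80],
})

# Unweighted mean of the weekly frequencies for each species.
mean_freq = (weekly.groupby("comName")["frequency"]
             .mean()
             .rename("eBird_mean_freq")
             .reset_index())
print(mean_freq)
```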

### WVBBA and GAP
We combined a list of species that GAP predicted to be in WV during the breeding season with the WVBBA data by linking records on common names.  We manually resolved the names that failed to match because of typographical differences; very few existed.
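
One way to surface names that fail to link is an outer merge with pandas' `indicator` option. The sketch below uses hypothetical name lists with a deliberate misspelling. It is not necessarily how the mismatches were found for this analysis, where they were resolved by hand.

```python
import pandas as pd

# Hypothetical name lists; "Blue Grosbeek" is a deliberate typo.
gap = pd.DataFrame({"common_name": ["American Kestrel", "Blue Grosbeak", "Common Grackle"]})
wvbba = pd.DataFrame({"common_name": ["American Kestrel", "Blue Grosbeek", "Common Grackle"]})

# indicator=True adds a "_merge" column: "both", "left_only", or "right_only".
merged = pd.merge(gap, wvbba, on="common_name", how="outer", indicator=True)

# Names present in only one list are candidates for manual review.
unmatched = merged[merged["_merge"] != "both"]
print(unmatched)
```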

In [1]:
import pandas as pd
import repo_functions as fun 
import numpy as np
pd.set_option('display.width', 2000)
pd.set_option('display.max_colwidth', 400)
pd.set_option('display.max_rows', 400)
pd.set_option('display.max_columns', 15)

# Read in the spp crosswalk
crosswalk = pd.read_csv(fun.dataDir + "SpeciesLists/WV_GAP_Atlas2.csv", header=0, 
                        names=["GAP_sci_name", "GAP_code", "common_name", "GAP_habitat(ha)", "WVBBA_sci_name", 
                               "WVBBA_individuals", "WVBBA_points", "WVBBA_rate", "%_of_blocks", "modeled", "WV_code", "notes"],
                       dtype={"GAP_sci_name":str, "GAP_code":str, "common_name":str, "GAP_habitat(ha)":float, 
                               "WVBBA_sci_name":str, "%_of_blocks":float, "modeled":int, "WV_code":str, "notes":str})
print("The first 5 records")
crosswalk.head().T

The first 5 records


,0,1,2,3,4
GAP_sci_name,Falco sparverius,Thryomanes bewickii,Passerina caerulea,Quiscalus quiscula,Geothlypis trichas
GAP_code,bamkex,bbewrx,bblgrx,bcogrx,bcoyex
common_name,American Kestrel,Bewick's Wren,Blue Grosbeak,Common Grackle,Common Yellowthroat
GAP_habitat(ha),1.91945e+06,1454.67,1.66457e+06,2.09797e+06,4.31626e+06
WVBBA_sci_name,Falco sparverius,,Passerina caerulea,Quiscalus quiscula,Geothlypis trichas
WVBBA_individuals,,,,841,734
WVBBA_points,,,,274,545
WVBBA_rate,,,,8.6,17.1
%_of_blocks,0.4,,0.3,7.2,14.3
modeled,0,0,0,1,1


#### Remove winter-only species from GAP list
The above table (crosswalk) was developed using a list of species that inhabit WV during summer or winter.  For our comparison, we needed to remove "winter-only" species that were included because of GAP winter data.

In [2]:
# Bring in list of GAP summer species
GAP_summer = pd.read_csv(fun.dataDir + "SpeciesLists/GAP_birds_in_WV_Summer.txt", delimiter="\t", header=0)

In [3]:
# Drop a species if it is a GAP species not in the GAP summer list and not detected by WVBBA
winter_only = crosswalk[(crosswalk["GAP_sci_name"].isin(GAP_summer['strScientificName']) == False) & 
                        (crosswalk["WV_code"].isnull() == True)]
crosswalk = crosswalk[crosswalk['common_name'].isin(winter_only['common_name']) == False]
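
The winter-only rule can be illustrated on toy data. In this sketch the species, scientific names, and codes are placeholders, and only the column names follow the notebook. A row is dropped only when it is both absent from the GAP summer list and lacking a WVBBA code.

```python
import pandas as pd

# Hypothetical crosswalk rows; only the column names follow the notebook.
toy = pd.DataFrame({
    "common_name": ["Species A", "Species B", "Species C", "Species D"],
    "GAP_sci_name": ["Genus a", "Genus b", "Genus c", "Genus d"],
    "WV_code": ["AAAA", None, None, "DDDD"],
})
# Hypothetical GAP summer list: only species A and B have summer range.
toy_summer = pd.DataFrame({"strScientificName": ["Genus a", "Genus b"]})

not_in_summer = ~toy["GAP_sci_name"].isin(toy_summer["strScientificName"])
not_in_wvbba = toy["WV_code"].isnull()

# Keep every row except those that fail both tests (winter-only species).
kept = toy[~(not_in_summer & not_in_wvbba)]
print(kept["common_name"].tolist())
```

Species D stays in the table even though it is missing from the summer list, because WVBBA detected it.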

In [4]:
# Update habitat area estimate for species that also winter in WV
crosswalk = pd.merge(crosswalk, GAP_summer.filter(["strCommonName", "intHa"], axis=1),
                     left_on='common_name', right_on='strCommonName', how='inner')
crosswalk["GAP_habitat(ha)"] = crosswalk["intHa"]
crosswalk.drop(["intHa"], axis=1, inplace=True)
crosswalk.head().T

,0,1,2,3,4
GAP_sci_name,Falco sparverius,Thryomanes bewickii,Passerina caerulea,Quiscalus quiscula,Geothlypis trichas
GAP_code,bamkex,bbewrx,bblgrx,bcogrx,bcoyex
common_name,American Kestrel,Bewick's Wren,Blue Grosbeak,Common Grackle,Common Yellowthroat
GAP_habitat(ha),1.877e+06,1454.67,1.66457e+06,2.09797e+06,4.31626e+06
WVBBA_sci_name,Falco sparverius,,Passerina caerulea,Quiscalus quiscula,Geothlypis trichas
WVBBA_individuals,,,,841,734
WVBBA_points,,,,274,545
WVBBA_rate,,,,8.6,17.1
%_of_blocks,0.4,,0.3,7.2,14.3
modeled,0,0,0,1,1


### Exceptions
Some species entries have been identified as problematic, so we excluded them from this summary.

__*Solitary Vireo*__ -- this concept is equivalent to Blue-headed Vireo, which is also in the tables.

__*Brewster's Warbler*__ and __*Lawrence's Warbler*__ -- these hybrids were not modeled by GAP.

__*Slate-colored Junco*__ -- equivalent to Dark-eyed Junco, which is also in the tables.

In [5]:
drop = ["Solitary Vireo", "Brewster's Warbler", "Lawrence's Warbler", "Slate-colored Junco"]
crosswalk = crosswalk[crosswalk["common_name"].isin(drop) == False]

### eBird

In [6]:
# eBird detection frequencies for WV, 2011 to 2015, pulled via rebird
eBird =(pd.read_csv(fun.dataDir + "SpeciesLists/ebird_WV_2011_2015.csv")
        [lambda x: x['frequency'] > 0]
        [lambda x: x['comName'].str.contains(' sp.', regex=False) == False]
        [lambda x: x['monthQt'].str.contains('May|June')]
        [lambda x: x['comName'].str.contains('hybrid') == False]
        [lambda x: x['comName'].str.contains('/') == False]
        .drop(['monthQt'], axis=1)
        .groupby('comName').mean()
        .drop(['sampleSize'], axis=1)
        .rename(columns={'frequency': 'eBird_mean_freq'})
        .reset_index()
        .sort_values(by=['eBird_mean_freq'], ascending=False))
eBird.loc[eBird[eBird['comName'] == 'Black-crowned Night-Heron'].index, 'comName'] = 'Black-crowned Night-heron'
eBird.loc[eBird[eBird['comName'] == 'Eastern Whip-poor-will'].index, 'comName'] = 'Whip-poor-will'
eBird.loc[eBird[eBird['comName'] == 'Eurasian Collared-Dove'].index, 'comName'] = 'Eurasian Collared-dove'
eBird['COMMON_NAME'] = [x.upper() for x in eBird['comName']]
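
When filtering names on a pattern such as `' sp.'`, keep in mind that `Series.str.contains` treats the pattern as a regular expression unless `regex=False` is passed, so an unescaped `.` matches any character. The names below are hypothetical and chosen only to show the difference.

```python
import pandas as pd

# Hypothetical names; the lowercase "sparrow" is contrived to trigger the regex match.
names = pd.Series(["Empidonax sp.", "House sparrow", "Wood Thrush"])

# As a regex, "." matches any character, so " spa" in "House sparrow" also matches.
regex_hits = names.str.contains(" sp.")

# As a literal string, only names containing " sp." exactly are matched.
literal_hits = names.str.contains(" sp.", regex=False)

print(regex_hits.tolist())
print(literal_hits.tolist())
```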

We merged the GAP-WVBBA crosswalk with the eBird species list, matching on upper-cased common names, to facilitate comparison of all three lists.

In [7]:
# Merge the eBird, WV, and GAP lists
crosswalk["COMMON"] = [x.upper() for x in crosswalk['common_name']]
df = (pd.merge(crosswalk, eBird, left_on='COMMON', right_on='COMMON_NAME',how='outer')
      .drop(["COMMON", "COMMON_NAME"], axis=1)
      .sort_values(by=['common_name']))
df.to_csv(fun.dataDir + "SpeciesLists/merged_spp_lists.csv")
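
Each question in the sections that follow tests a pattern of missing values in the merged table: a non-null `GAP_code` marks a GAP species, a non-null `%_of_blocks` marks a WVBBA detection, and a non-null `comName` marks an eBird report. As a sketch of that logic, the toy example below derives one boolean flag per list. The rows and the flag names (`in_GAP`, `in_WVBBA`, `in_eBird`) are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical merged rows; values are invented, column names follow the notebook.
toy_df = pd.DataFrame({
    "common_name": ["Species A", "Species B", "Species C", None],
    "GAP_code": ["bspax", "bspbx", "bspcx", None],
    "%_of_blocks": [12.5, np.nan, np.nan, np.nan],
    "comName": ["Species A", "Species B", None, "Species D"],
})

# One boolean flag per list, derived from which columns are populated.
flags = pd.DataFrame({
    "in_GAP": toy_df["GAP_code"].notnull(),
    "in_WVBBA": toy_df["%_of_blocks"].notnull(),
    "in_eBird": toy_df["comName"].notnull(),
})
# Label rows with whichever name is available.
flags.index = toy_df["common_name"].fillna(toy_df["comName"]).tolist()

# Example query: on the GAP and eBird lists but not detected by WVBBA.
gap_and_ebird_not_wvbba = flags[
    flags["in_GAP"] & ~flags["in_WVBBA"] & flags["in_eBird"]
].index.tolist()
print(flags)
print(gap_and_ebird_not_wvbba)
```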

### Which species were on the eBird and GAP list but not detected by WVBBA?

In [8]:
a = (df[(df['%_of_blocks'].isnull() == True) & 
        (df['GAP_code'].isnull() == False) &
        (df['comName'].isnull() == False)]
     .drop(["WVBBA_sci_name", "WVBBA_individuals", "WVBBA_points", "WVBBA_rate", "comName", "%_of_blocks",
           "GAP_code", "modeled", "WV_code", "notes"], axis=1)
     .sort_values("GAP_habitat(ha)", ascending=False)
     .set_index(["common_name"]))
print(str(len(a)) + " species")
a

16 species


common_name,GAP_sci_name,GAP_habitat(ha),strCommonName,eBird_mean_freq
Cattle Egret,Bubulcus ibis,1079016.39,Cattle Egret,0.00117
Olive-sided Flycatcher,Contopus cooperi,641097.45,Olive-sided Flycatcher,0.004344
Northern Saw-whet Owl,Aegolius acadicus,439404.03,Northern Saw-whet Owl,0.001441
Chuck-will's-widow,Caprimulgus carolinensis,188641.35,Chuck-will's-widow,0.001556
Barn Owl,Tyto alba,141250.86,Barn Owl,0.0015
Long-eared Owl,Asio otus,139664.34,Long-eared Owl,0.000655
Red Crossbill,Loxia curvirostra,93662.37,Red Crossbill,0.003942
Upland Sandpiper,Bartramia longicauda,88442.28,Upland Sandpiper,0.001187
Peregrine Falcon,Falco peregrinus,46388.61,Peregrine Falcon,0.002353
Pied-billed Grebe,Podilymbus podiceps,42171.03,Pied-billed Grebe,0.006013


### Which species were on the eBird list but not GAP or WVBBA?

In [9]:
b = (df[(df['%_of_blocks'].isnull() == True) & 
     (df['GAP_code'].isnull() == True) &
     (df['comName'].isnull() == False)]
     .sort_values("eBird_mean_freq", ascending=False)
     .set_index(["comName"]))
b[["eBird_mean_freq"]]

comName,eBird_mean_freq
Yellow-rumped Warbler,0.10598
Spotted Sandpiper,0.07156
White-throated Sparrow,0.053249
Blackpoll Warbler,0.046135
Solitary Sandpiper,0.041571
Tennessee Warbler,0.034274
Ruby-crowned Kinglet,0.029439
White-crowned Sparrow,0.028275
Osprey,0.027728
Double-crested Cormorant,0.026005


### Were any species detected by WVBBA but not submitted to eBird or predicted present by GAP?

In [10]:
c = (df[(df['%_of_blocks'].isnull() == False) & 
     (df['GAP_code'].isnull() == True) &
     (df['comName'].isnull() == True)])
if len(c) > 0:
    print(c)
else:
    print("No")

No


### Were any species predicted present by GAP but not detected by WVBBA or submitted to eBird?

In [11]:
d = (df[(df['%_of_blocks'].isnull() == True) & 
     (df['GAP_code'].isnull() == False) &
     (df['comName'].isnull() == True)]
    .set_index(["common_name"])
    .sort_values(["GAP_habitat(ha)"], ascending=False))
if len(d) > 0:
    print(d[["GAP_code", "GAP_sci_name", "GAP_habitat(ha)"]])
else:
    print("No")

                       GAP_code           GAP_sci_name  GAP_habitat(ha)
common_name                                                            
Lark Sparrow             blaspx   Chondestes grammacus         34328.16
Eurasian Collared-dove   beucdx  Streptopelia decaocto          8702.73
Bewick's Wren            bbewrx    Thryomanes bewickii          1454.67
King Rail                bkirax         Rallus elegans          1255.32


### Which species were predicted present by GAP but not detected by WVBBA (ignoring eBird)?

In [12]:
e = (df[(df['%_of_blocks'].isnull() == True) & 
     (df['GAP_code'].isnull() == False)]
    .set_index(["common_name"])
    .sort_values(["common_name"], ascending=True))
if len(e) > 0:
    print(e[["GAP_code", "GAP_sci_name", "GAP_habitat(ha)"]])
else:
    print("No")

                          GAP_code              GAP_sci_name  GAP_habitat(ha)
common_name                                                                  
Barn Owl                    bbanox                 Tyto alba        141250.86
Bewick's Wren               bbewrx       Thryomanes bewickii          1454.67
Black-crowned Night-heron   bbcnhx     Nycticorax nycticorax          7736.94
Blue-winged Teal            bbwtex              Anas discors         33558.84
Cattle Egret                bcaegx             Bubulcus ibis       1079016.39
Chuck-will's-widow          bcwwix  Caprimulgus carolinensis        188641.35
Eurasian Collared-dove      beucdx     Streptopelia decaocto          8702.73
Hooded Merganser            bhomex     Lophodytes cucullatus         38241.27
King Rail                   bkirax            Rallus elegans          1255.32
Lark Sparrow                blaspx      Chondestes grammacus         34328.16
Long-eared Owl              bleowx                 Asio otus    