### Working with Web APIs - eBird

[eBird.org](http://ebird.org/content/ebird/) is an extremely popular website for entering bird sightings. It was started by the [Cornell Lab of Ornithology](http://www.birds.cornell.edu) and the [National Audubon Society](https://www.audubon.org/).

> A real-time, online checklist program, eBird has revolutionized the way that the birding
> community reports and accesses information about birds. Launched in 2002 by the Cornell 
> Lab of Ornithology and National Audubon Society, eBird provides rich data sources for
> basic information on bird abundance and distribution at a variety of spatial and temporal
> scales.

> eBird’s goal is to maximize the utility and accessibility of the vast numbers of bird 
> observations made each year by recreational and professional bird watchers. It is
> amassing one of the largest and fastest growing biodiversity data resources in existence.
> For example, in March 2012, participants reported more than 3.1 million bird observations
> across North America!

> The observations of each participant join those of others in an international network of eBird users.
> eBird then shares these observations with a global community of educators, 
> land managers, ornithologists, and conservation biologists. In time these data will
> become the foundation for a better understanding of bird distribution across the western
> hemisphere and beyond.

Not only does eBird make it easy for you to enter sightings and manage your own lists of birds seen, it has a nice set of tools for exploring the massive amount of data it collects.

* Summary graphs and tables
* Search for recent sightings in "hotspots" or by any location
* Interactive species maps
* ... and even more goodies

In addition, [they have an API](https://confluence.cornell.edu/display/CLOISAPI/eBird+API+1.1) that makes it possible for others to create custom apps that use the sightings data. For example, I was talking with a fellow birder friend and mentioned that it would be cool if there was an app for identifying good birding spots associated with a planned road trip. The next day he emailed me and said, "Hey, check out this app I found!"

[http://hotspotbirding.com/roadtrip](http://hotspotbirding.com/roadtrip) - unfortunately, it's
gone the way of the Ivory-billed Woodpecker.

The whole site is dedicated to using data and computation to help birders bird - [http://hotspotbirding.com/](http://hotspotbirding.com/). It's bird geek heaven. All because eBird has an API. Another simple but useful use of the eBird API is at [birdsearch.org](http://birdsearch.org/).



A nice overview of web scraping of eBird data can be found at [http://www.programmingforbiologists.org/lectures/web-data/](http://www.programmingforbiologists.org/lectures/web-data/). I've borrowed some things from there for the following example.



Install ebird-api for Python: https://pypi.python.org/pypi/ebird-api/2.1.0

Get api key for ebird.org from 

In [1]:
# Get api key from ebird.org
api_key = '83svj2q1mga6'
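
Pasting a key straight into a notebook makes it easy to share the key by accident. Here's a minimal sketch of one alternative: read the key from an environment variable. The variable name `EBIRD_API_KEY` and the helper `load_api_key` are my own inventions for illustration, not anything eBird requires.

```python
import os


def load_api_key(env_var="EBIRD_API_KEY"):
    """Return the API key stored in an environment variable.

    The variable name EBIRD_API_KEY is an assumption made for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your eBird API key"
        )
    return key
```

With the variable set in your shell, `api_key = load_api_key()` could stand in for the hard-coded assignment above.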

For now, let's put the hotspot IDs into a dictionary. There should be a way to download a table of hotspots from eBird; failing that, the IDs could certainly be scraped from https://ebird.org/region/US-MI-125/hotspots?yr=all&m=.

In [2]:
hotspot_ids = {'Bear Creek Nature Park':'L2776037',
              'Cranberry Lake Park': 'L2776024',
              'Charles Ilsley Park': 'L2905470',
              'Draper Twin Lake Park': 'L1581963'}

scmp_ids = {'SCMP-North': 'L389646'}

In [3]:
for park, loc_id in hotspot_ids.items():
    print(park, loc_id)

Bear Creek Nature Park L2776037
Cranberry Lake Park L2776024
Charles Ilsley Park L2905470
Draper Twin Lake Park L1581963


In [4]:
# Just the location IDs, in the same order as the dictionary
loc_ids = list(hotspot_ids.values())
loc_ids

['L2776037', 'L2776024', 'L2905470', 'L1581963']
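
Typing hotspot IDs by hand gets old fast. The observation records that come back from the API (shown later in this notebook) each carry a `locName` and a `locId`, so a name-to-ID dictionary can be rebuilt from any batch of records. This is only a sketch: the function name `hotspots_from_records` and the sample records are made up for illustration.

```python
def hotspots_from_records(records):
    """Map each location name to its location ID.

    Assumes each record is a dict with 'locName' and 'locId' keys,
    as in the observation output shown in this notebook.
    """
    mapping = {}
    for rec in records:
        name = rec.get("locName")
        loc_id = rec.get("locId")
        if name and loc_id:
            mapping[name] = loc_id
    return mapping


# Hypothetical sample shaped like the API output
sample_records = [
    {"locName": "Cranberry Lake Park", "locId": "L2776024", "comName": "Canada Goose"},
    {"locName": "Cranberry Lake Park", "locId": "L2776024", "comName": "Mute Swan"},
    {"locName": "Draper Twin Lake Park", "locId": "L1581963", "comName": "Blue Jay"},
]
print(hotspots_from_records(sample_records))
```

This only finds hotspots that already appear in some result set, so it complements a full hotspot table rather than replacing one.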

In [6]:
import pandas as pd
import requests
import time  # used to add a 0.5 second delay between API calls

In [7]:
start_date = pd.Timestamp('20180301')
end_date = pd.Timestamp('20191130')
num_days = (end_date - start_date).days + 1
rng = pd.date_range(start_date, periods=num_days, freq='D')

In [8]:
for d in rng:
    print(d.year, d.month, d.day)

2018 3 1
2018 3 2
2018 3 3
2018 3 4
2018 3 5
2018 3 6
2018 3 7
2018 3 8
2018 3 9
2018 3 10
2018 3 11
2018 3 12
2018 3 13
2018 3 14
2018 3 15
2018 3 16
2018 3 17
2018 3 18
2018 3 19
2018 3 20
2018 3 21
2018 3 22
2018 3 23
2018 3 24
2018 3 25
2018 3 26
2018 3 27
2018 3 28
2018 3 29
2018 3 30
2018 3 31
2018 4 1
2018 4 2
2018 4 3
2018 4 4
2018 4 5
2018 4 6
2018 4 7
2018 4 8
2018 4 9
2018 4 10
2018 4 11
2018 4 12
2018 4 13
2018 4 14
2018 4 15
2018 4 16
2018 4 17
2018 4 18
2018 4 19
2018 4 20
2018 4 21
2018 4 22
2018 4 23
2018 4 24
2018 4 25
2018 4 26
2018 4 27
2018 4 28
2018 4 29
2018 4 30
2018 5 1
2018 5 2
2018 5 3
2018 5 4
2018 5 5
2018 5 6
2018 5 7
2018 5 8
2018 5 9
2018 5 10
2018 5 11
2018 5 12
2018 5 13
2018 5 14
2018 5 15
2018 5 16
2018 5 17
2018 5 18
2018 5 19
2018 5 20
2018 5 21
2018 5 22
2018 5 23
2018 5 24
2018 5 25
2018 5 26
2018 5 27
2018 5 28
2018 5 29
2018 5 30
2018 5 31
2018 6 1
2018 6 2
2018 6 3
2018 6 4
2018 6 5
2018 6 6
2018 6 7
2018 6 8
2018 6 9
2018 6 10
2018 6 11
2018 6

In [23]:


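# Recent observations at a single hotspot (SCMP-North)
url1 = "https://ebird.org/ws2.0/data/obs/L389646/recent?key=83svj2q1mga6"
# Recent observations for a whole region (Michigan)
url2 = "https://ebird.org/ws2.0/data/obs/US-MI/recent?key=83svj2q1mga6"
# Observations at the hotspot on a specific date, with full detail
url3 = 'https://ebird.org/ws2.0/data/obs/L389646/historic/2018/2/27?rank=mrec&detail=full&cat=species&key=83svj2q1mga6'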


recent_observations = requests.get(url3)
print(recent_observations) # We get a Response object

recent_observations.json() # Check out the JSON formatted data that was returned

<Response [200]>


[{'checklistId': 'CL25924',
  'comName': 'Canada Goose',
  'countryCode': 'US',
  'countryName': 'United States',
  'firstName': 'Heather',
  'hasComments': False,
  'hasRichMedia': False,
  'howMany': 45,
  'lastName': 'Slayton',
  'lat': 42.7606883,
  'lng': -83.0727124,
  'locID': 'L389646',
  'locId': 'L389646',
  'locName': 'Stony Creek Metropark--north (Macomb Co.)',
  'locationPrivate': False,
  'obsDt': '2018-02-27 11:37',
  'obsId': 'OBS582845729',
  'obsReviewed': False,
  'obsValid': True,
  'presenceNoted': False,
  'sciName': 'Branta canadensis',
  'speciesCode': 'cangoo',
  'subId': 'S43240458',
  'subnational1Code': 'US-MI',
  'subnational1Name': 'Michigan',
  'subnational2Code': 'US-MI-099',
  'subnational2Name': 'Macomb',
  'userDisplayName': 'Heather Slayton'},
 {'checklistId': 'CL25924',
  'comName': 'Mute Swan',
  'countryCode': 'US',
  'countryName': 'United States',
  'firstName': 'Heather',
  'hasComments': False,
  'hasRichMedia': False,
  'howMany': 7,
  'lastN
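
The historic URL above has a regular shape: base URL, location ID, `/historic/`, the year/month/day, then a query string. Here's a small sketch that assembles it from its parts and lets `urllib.parse.urlencode` handle the query string. The helper name `historic_obs_url` is mine, the default parameters are simply the ones used in `url3`, and `YOUR_API_KEY` is a placeholder.

```python
from urllib.parse import urlencode

URL_BASE_OBS = "https://ebird.org/ws2.0/data/obs/"


def historic_obs_url(loc_id, year, month, day, api_key, **params):
    """Build a historic-observations URL like url3 in the cell above.

    Defaults mirror the query string used in this notebook; extra keyword
    arguments override or extend them.
    """
    query = {"rank": "mrec", "detail": "full", "cat": "species"}
    query.update(params)
    query["key"] = api_key  # placed last, as in the hand-written URLs
    return f"{URL_BASE_OBS}{loc_id}/historic/{year}/{month}/{day}?{urlencode(query)}"


print(historic_obs_url("L389646", 2018, 2, 27, "YOUR_API_KEY"))
```

Keeping the URL logic in one function means a change to the query parameters only has to be made in one place.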

In [46]:
# Base URL for eBird API 2.0
url_base_obs = 'https://ebird.org/ws2.0/data/obs/'

# Create a list to hold the individual dictionaries of observations
observations = []

# Loop over the locations of interest and dates of interest
for loc_id in loc_ids:
    for d in rng:
        time.sleep(0.5) # time delay
        ymd = '{}/{}/{}'.format(d.year, d.month, d.day)
        # Build the URL
        url_obs = url_base_obs + loc_id + '/historic/' + ymd + \
        '?rank=mrec&detail=full&cat=species&key=' + api_key
        print(url_obs)
        # Get the observations for one location and date
        obs = requests.get(url_obs)
        # Append the new observations to the master list
        observations.extend(obs.json())

# Convert the list of dictionaries to a pandas dataframe        
obs_df = pd.DataFrame(observations)
# Check out the structure of the dataframe
print(obs_df.info())
# Check out the first few rows
obs_df.head()
# Export the dataframe to a csv file
obs_df.to_csv("observations.csv", index=False)

https://ebird.org/ws2.0/data/obs/L2905470/historic/2015/1/1?rank=mrec&detail=full&cat=species&key=83svj2q1mga6
https://ebird.org/ws2.0/data/obs/L2905470/historic/2015/1/2?rank=mrec&detail=full&cat=species&key=83svj2q1mga6
https://ebird.org/ws2.0/data/obs/L2905470/historic/2015/1/3?rank=mrec&detail=full&cat=species&key=83svj2q1mga6
https://ebird.org/ws2.0/data/obs/L2905470/historic/2015/1/4?rank=mrec&detail=full&cat=species&key=83svj2q1mga6
https://ebird.org/ws2.0/data/obs/L2905470/historic/2015/1/5?rank=mrec&detail=full&cat=species&key=83svj2q1mga6
https://ebird.org/ws2.0/data/obs/L2905470/historic/2015/1/6?rank=mrec&detail=full&cat=species&key=83svj2q1mga6
https://ebird.org/ws2.0/data/obs/L2905470/historic/2015/1/7?rank=mrec&detail=full&cat=species&key=83svj2q1mga6
https://ebird.org/ws2.0/data/obs/L2905470/historic/2015/1/8?rank=mrec&detail=full&cat=species&key=83svj2q1mga6
https://ebird.org/ws2.0/data/obs/L2905470/historic/2015/1/9?rank=mrec&detail=full&cat=species&key=83svj2q1mga6
h
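
The loop above assumes every request succeeds. With hundreds of location and date combinations, a single bad response can stop the whole run at `obs.json()`. Here's a sketch of a more defensive version. To keep it runnable without a network connection, the actual download is passed in as a function; the names `collect_observations` and `fetch` are hypothetical, and `fetch` is assumed to return a `(status_code, payload)` pair.

```python
import time


def collect_observations(loc_ids, dates, fetch, delay=0.5):
    """Gather observations for every (location, date) pair.

    fetch(loc_id, date) is assumed to return (status_code, payload), where
    payload is a list of observation dicts when the request succeeded.
    Returns the combined observations plus a list of failed requests.
    """
    observations, failures = [], []
    for loc_id in loc_ids:
        for d in dates:
            status, payload = fetch(loc_id, d)
            if status == 200 and isinstance(payload, list):
                observations.extend(payload)
            else:
                # Remember what failed so it can be retried later
                failures.append((loc_id, d, status))
            time.sleep(delay)  # pause between calls to be polite to the server
    return observations, failures
```

In this notebook, `fetch` could be a small function that builds the URL, calls `requests.get`, and returns the response's `status_code` along with `json()` when the status is 200.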

In [34]:
observations

[{'checklistId': 'CL24105',
  'comName': 'Canada Goose',
  'countryCode': 'US',
  'countryName': 'United States',
  'firstName': 'Benjamin',
  'hasComments': False,
  'hasRichMedia': False,
  'howMany': 6,
  'lastName': 'VanderWeide',
  'lat': 42.7909918,
  'lng': -83.1456256,
  'locID': 'L2776024',
  'locId': 'L2776024',
  'locName': 'Cranberry Lake Park',
  'locationPrivate': False,
  'obsDt': '2017-04-12 08:07',
  'obsId': 'OBS484737507',
  'obsReviewed': False,
  'obsValid': True,
  'presenceNoted': False,
  'sciName': 'Branta canadensis',
  'speciesCode': 'cangoo',
  'subId': 'S35894830',
  'subnational1Code': 'US-MI',
  'subnational1Name': 'Michigan',
  'subnational2Code': 'US-MI-125',
  'subnational2Name': 'Oakland',
  'userDisplayName': 'Benjamin VanderWeide'},
 {'checklistId': 'CL24105',
  'comName': 'Mute Swan',
  'countryCode': 'US',
  'countryName': 'United States',
  'firstName': 'Benjamin',
  'hasComments': False,
  'hasRichMedia': False,
  'howMany': 1,
  'lastName': 'Va

https://ebird.org/ws2.0/data/obs/US-MI/historic/2018/02/19/

#### Recent sightings near Oakland University

Here's an example of a URL structured according to the API (v1.1) that will return JSON-formatted
data for recent sightings near Oakland University [42.6756294,-83.2373724].

[http://ebird.org/ws1.1/data/obs/geo/recent?lng=-83.2373724&lat=42.6756294&fmt=json](http://ebird.org/ws1.1/data/obs/geo/recent?lng=-83.2373724&lat=42.6756294&fmt=json)

Earlier in this notebook we saw how easy it was to use the Requests module to process JSON data with Python.

In [None]:
import requests

url = "http://ebird.org/ws1.1/data/obs/geo/recent?lng=-83.2373724&lat=42.6756294&fmt=json"
recent_observations = requests.get(url)
print(recent_observations) # We get a Response object

recent_observations.json() # Check out the JSON formatted data that was returned

In [None]:
for observation in recent_observations.json():
    print(observation['comName'], ",", observation['sciName'], ",", observation.get('howMany', 1))
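
The same loop can do more than print. As a rough sketch, here's a tally of how many individuals were reported per species, using the same `get('howMany', 1)` fallback for records that have no count. The function name `tally_by_species` and the sample records are made up for illustration.

```python
from collections import Counter


def tally_by_species(records):
    """Sum the reported counts per common name.

    Records without a 'howMany' value are counted as one bird,
    matching the fallback used in the print loop above.
    """
    totals = Counter()
    for rec in records:
        totals[rec["comName"]] += rec.get("howMany", 1)
    return totals


# Hypothetical records shaped like the API output
sample = [
    {"comName": "Canada Goose", "howMany": 6},
    {"comName": "Mute Swan", "howMany": 1},
    {"comName": "Canada Goose", "howMany": 45},
    {"comName": "Mallard"},  # no count reported
]
totals = tally_by_species(sample)
print(totals["Canada Goose"])  # 51
```

Calling `tally_by_species(recent_observations.json())` would apply the same tally to a live response.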

The simple structure of this JSON data (it's a list of dictionaries) makes it really easy to create a pandas DataFrame from it: just pass the list to `pd.DataFrame()`.

In [38]:
obs_df = pd.DataFrame(observations)
print(obs_df.info())
obs_df.head()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 88 entries, 0 to 87
Data columns (total 28 columns):
checklistId         88 non-null object
comName             88 non-null object
countryCode         88 non-null object
countryName         88 non-null object
firstName           88 non-null object
hasComments         88 non-null bool
hasRichMedia        88 non-null bool
howMany             88 non-null int64
lastName            88 non-null object
lat                 88 non-null float64
lng                 88 non-null float64
locID               88 non-null object
locId               88 non-null object
locName             88 non-null object
locationPrivate     88 non-null bool
obsDt               88 non-null object
obsId               88 non-null object
obsReviewed         88 non-null bool
obsValid            88 non-null bool
presenceNoted       88 non-null bool
sciName             88 non-null object
speciesCode         88 non-null object
subId               88 non-null object
subnational

|  | checklistId | comName | countryCode | countryName | firstName | hasComments | hasRichMedia | howMany | lastName | lat | ... | obsValid | presenceNoted | sciName | speciesCode | subId | subnational1Code | subnational1Name | subnational2Code | subnational2Name | userDisplayName |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | CL24105 | Canada Goose | US | United States | Benjamin | False | False | 6 | VanderWeide | 42.790992 | ... | True | False | Branta canadensis | cangoo | S35894830 | US-MI | Michigan | US-MI-125 | Oakland | Benjamin VanderWeide |
| 1 | CL24105 | Mute Swan | US | United States | Benjamin | False | False | 1 | VanderWeide | 42.790992 | ... | True | False | Cygnus olor | mutswa | S35894830 | US-MI | Michigan | US-MI-125 | Oakland | Benjamin VanderWeide |
| 2 | CL24105 | Mallard | US | United States | Benjamin | False | False | 4 | VanderWeide | 42.790992 | ... | True | False | Anas platyrhynchos | mallar | S35894830 | US-MI | Michigan | US-MI-125 | Oakland | Benjamin VanderWeide |
| 3 | CL24105 | Ring-necked Duck | US | United States | Benjamin | False | False | 2 | VanderWeide | 42.790992 | ... | True | False | Aythya collaris | rinduc | S35894830 | US-MI | Michigan | US-MI-125 | Oakland | Benjamin VanderWeide |
| 4 | CL24105 | Hooded Merganser | US | United States | Benjamin | False | False | 1 | VanderWeide | 42.790992 | ... | True | False | Lophodytes cucullatus | hoomer | S35894830 | US-MI | Michigan | US-MI-125 | Oakland | Benjamin VanderWeide |


If you find yourself working with a particular API, you might look for a Python module that acts as a "wrapper" and makes the API easier to use. For example, there's a [nice wrapper for the eBird API](https://github.com/carsonmcdonald/python-ebird-wrapper).

In [40]:
obs_df.to_csv("test_obs_df.csv", index=False)

In [None]:
from EBird import EBird

ebird = EBird()
ebird.recent_notable_observations_geo(42.6756294, -83.2373724)

https://pypi.python.org/pypi/ebird-api/2.1.0

In [3]:
from ebird.api import location_observations, location_species, location_notable
from ebird.api import hotspot_observations, hotspot_species, hotspot_notable

In [18]:
hotspots = list(hotspot_ids.values())

In [19]:
records = hotspot_observations(hotspots, back=30)

In [20]:
records

[{'comName': 'Mourning Dove',
  'howMany': 3,
  'lat': 42.7637414,
  'lng': -83.1172353,
  'locID': 'L1581963',
  'locId': 'L1581963',
  'locName': 'Draper Twin Lake Park',
  'locationPrivate': False,
  'obsDt': '2018-02-21 10:40',
  'obsReviewed': False,
  'obsValid': True,
  'sciName': 'Zenaida macroura'},
 {'comName': 'Blue Jay',
  'howMany': 2,
  'lat': 42.7637414,
  'lng': -83.1172353,
  'locID': 'L1581963',
  'locId': 'L1581963',
  'locName': 'Draper Twin Lake Park',
  'locationPrivate': False,
  'obsDt': '2018-02-21 10:40',
  'obsReviewed': False,
  'obsValid': True,
  'sciName': 'Cyanocitta cristata'},
 {'comName': 'American Crow',
  'howMany': 4,
  'lat': 42.7637414,
  'lng': -83.1172353,
  'locID': 'L1581963',
  'locId': 'L1581963',
  'locName': 'Draper Twin Lake Park',
  'locationPrivate': False,
  'obsDt': '2018-02-21 10:40',
  'obsReviewed': False,
  'obsValid': True,
  'sciName': 'Corvus brachyrhynchos'},
 {'comName': 'Black-capped Chickadee',
  'howMany': 2,
  'lat': 42.

In [23]:
# Get all the records for Pine Siskin (Spinus pinus) in the past 2 weeks. Include
# records that have not been reviewed and return all the fields available.
records = hotspot_species(
    'Spinus pinus', hotspots, provisional=True, detail='full')



In [24]:
records

[{'checklistID': 'CL24105',
  'checklistId': 'CL24105',
  'comName': 'Pine Siskin',
  'countryCode': 'US',
  'countryName': 'United States',
  'firstName': 'Benjamin',
  'hasComments': False,
  'hasRichMedia': False,
  'howMany': 3,
  'lastName': 'VanderWeide',
  'lat': 42.7637414,
  'lng': -83.1172353,
  'locID': 'L1581963',
  'locId': 'L1581963',
  'locName': 'Draper Twin Lake Park',
  'locationPrivate': False,
  'obsDt': '2018-02-21 10:40',
  'obsID': 'OBS580895700',
  'obsId': 'OBS580895700',
  'obsReviewed': False,
  'obsValid': True,
  'presenceNoted': False,
  'sciName': 'Spinus pinus',
  'subID': 'S43084576',
  'subId': 'S43084576',
  'subnational1Code': 'US-MI',
  'subnational1Name': 'Michigan',
  'subnational2Code': 'US-MI-125',
  'subnational2Name': 'Oakland',
  'userDisplayName': 'Benjamin VanderWeide'}]

In [None]:
# Get the most recent sightings of locally or nationally rare birds for the past
# 30 days. Include all the fields available.
records = hotspot_notable(hotspots, back=30, detail='full')