#### Table of Contents
1. [Introduction](#birds-around-my-area-🕊)
2. [Constructing the data pipeline](#constructing-the-data-pipeline)

# Birds Around My Area 🕊
This notebook is an exploratory analysis of bird data provided by Cornell's public eBird API.

The following steps are taken to construct the data pipeline:
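1. Convert a location name to latitude and longitude coordinates with `geopy`.
2. Convert the coordinates to an eBird region code using the FCC Area API.
3. Fetch the past week of observations for that region from the eBird API and save them as a CSV file.
4. Upload the CSV file to an AWS S3 bucket.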

Pandas and matplotlib are used to further aid in data visualization.

## Constructing the data pipeline

### Get region codes and coordinates using `geopy` and the FCC Area API
We'll be looking at bird data from Los Angeles.

In order to get the correct `regionCode` for Los Angeles, we take the following steps:
1. Use the `geopy` Python library to convert a given location name to its latitude and longitude coordinates.
2. Use the [FCC Area API](https://geo.fcc.gov/api/census/) to convert the coordinates to region codes.
    - Note: eBird region codes follow [ISO 3166-2](https://en.wikipedia.org/wiki/ISO_3166-2#Current_codes) guidelines.

In [4]:
import requests
import json
from geopy import geocoders

def getLatLong(location):
    """Fetches latitude and longitude of an area.
    
    Args:
        location (str): The location
    Returns:
        tuple: A tuple (lat, long) of the location
        
    """
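    geolocator = geocoders.Nominatim(user_agent="Geo Locate")
    result = geolocator.geocode(location)

    # geocode() returns None when the location cannot be resolved
    if result is None:
        raise ValueError(f"Could not geocode location: {location!r}")

    return (result.latitude, result.longitude)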
    

def getRegionCode(lat, long):
    """Fetches eBird regionCode using the FCC Area API
    
    Args:
        lat (float): latitude
        long (float): longitude 
    Returns: 
        str: regionCode
    
    """
    
    # Use the publicly available FCC Area API to get codes for a given coordinate
    censusUrl = 'https://geo.fcc.gov/api/census/area'
    params = {'lat': lat, 'lon': long, 'format': 'json'}
    
    # Send the GET request and fail loudly on an HTTP error
    resp = requests.get(censusUrl, params=params)
    resp.raise_for_status()
    
    # Parse the response. All API values are contained in the 'results' list
    response = resp.json()['results'][0]
    
    # regionCode follows ISO 3166-2 guidelines. A county-level code consists of three parts:
    # 1. The ISO 3166-1 alpha-2 code of the country ('US').
    # 2. The two-letter state code.
    # 3. The three-digit in-state county FIPS code, i.e. the last three digits of the
    #    five-digit 'county_fips' value (the first two digits are the state FIPS code).
    fips = response['county_fips']
    regionCode = 'US-' + response['state_code'] + '-' + fips[2:5]
    
    return regionCode
    
lat_long = getLatLong('Angeles National Forest Los Angeles')
regionCode = getRegionCode(lat_long[0], lat_long[1])
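
To make the FIPS slicing concrete, the sketch below builds a region code from a hand-written dictionary instead of a live API call. The helper name `build_region_code` and the sample dictionary are assumptions for illustration. The sample uses `06037`, the five-digit FIPS code for Los Angeles County, and contains only the two fields that `getRegionCode()` reads.

```python
def build_region_code(result):
    """Assemble an eBird-style county region code from one FCC 'results' entry.

    `result` is assumed to carry the 'state_code' and 'county_fips' keys
    that getRegionCode() reads.
    """
    county_fips = result['county_fips']
    # The first two digits are the state FIPS code; the last three identify the county
    return 'US-' + result['state_code'] + '-' + county_fips[2:5]


# Hand-written sample, not a recorded API response
sample = {'state_code': 'CA', 'county_fips': '06037'}
print(build_region_code(sample))  # US-CA-037
```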

### Connecting to the eBird API to get bird data in JSON format
Note: The Python wrapper [ebird-api](https://pypi.org/project/ebird-api/) must be installed. Documentation for the eBird API, including instructions on how to sign up for a key, can be found [here](https://documenter.getpostman.com/view/664302/S1ENwy59#4e020bc2-fc67-4fb6-a926-570cedefcc34).

Equipped with our `regionCode`, we can now search for bird data by region. We can also use the function `getLatLong()` to obtain coordinates for a given region.
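
Because the API key is a credential, one option is to read it from an environment variable rather than writing it into the notebook. The sketch below assumes a variable named `EBIRD_API_KEY`. That name is chosen here for illustration and is not something the eBird API or the `ebird-api` package requires, and `load_api_key` is likewise a hypothetical helper.

```python
import os


def load_api_key(env_var='EBIRD_API_KEY'):
    """Return the API key stored in the given environment variable.

    The variable name is a hypothetical convention for this notebook.
    """
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f'Set the {env_var} environment variable to your eBird API key')
    return key
```

With the variable set in the shell that launches Jupyter, `API_KEY = load_api_key()` could stand in for a hard-coded assignment.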

In [58]:
from ebird.api import get_observations
import pandas as pd
import datetime as dt

API_KEY = 'b0e60cbbp1n6'

def getWeeklyBirdData(regionCode):
    """Fetches weekly bird data with a GET request to the eBird API, saves it as a CSV file,
    and returns it as a DataFrame
    
    Args:
        regionCode (str): region code for a given location
    Returns:
        pandas.DataFrame: the observations, one row per record
        
    """
    
    # Get observations for the given region for the past week
    records = get_observations(API_KEY, regionCode, back=7)
    df = pd.DataFrame(records)
    df.to_csv(getName())
    
    return df
    
def getName():
    """Returns the name for the bird data file. Has the following structure:
    bird_data_ + WEEK_AGO + _ + TODAY + .csv, where
    TODAY = the date (YYYY-MM-DD) on which the function is run and
    WEEK_AGO = the date seven days before TODAY
    
    Returns:
        str: the data file name
        
    """
    today = dt.date.today()
    week_ago = today - dt.timedelta(days=7)
    
    res = 'bird_data_' + str(week_ago) + '_' + str(today) + '.csv'
    
    return res

bird_df = getWeeklyBirdData(regionCode)

| Unnamed: 0 | speciesCode | comName | sciName | locId | locName | obsDt | howMany | lat | lng | obsValid | obsReviewed | locationPrivate | subId |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | hooori | Hooded Oriole | Icterus cucullatus | L9091524 | 28903 Dargan Street, Agoura Hills, California,... | 2022-08-23 08:28 | 2.0 | 34.163094 | -118.755984 | True | False | True | S117439878 |
| 1 | houfin | House Finch | Haemorhous mexicanus | L9091524 | 28903 Dargan Street, Agoura Hills, California,... | 2022-08-23 08:28 | 7.0 | 34.163094 | -118.755984 | True | False | True | S117439878 |
| 2 | allhum | Allen's Hummingbird | Selasphorus sasin | L9091524 | 28903 Dargan Street, Agoura Hills, California,... | 2022-08-23 08:28 | 3.0 | 34.163094 | -118.755984 | True | False | True | S117439878 |
| 3 | annhum | Anna's Hummingbird | Calypte anna | L9091524 | 28903 Dargan Street, Agoura Hills, California,... | 2022-08-23 08:28 | 3.0 | 34.163094 | -118.755984 | True | False | True | S117439878 |
| 4 | coshum | Costa's Hummingbird | Calypte costae | L9091524 | 28903 Dargan Street, Agoura Hills, California,... | 2022-08-23 08:28 | 1.0 | 34.163094 | -118.755984 | True | False | True | S117439878 |
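
As a first look at data like this, the sketch below reads a few of the columns shown above from an in-memory string and totals `howMany` per species. The `Unnamed: 0` column is the DataFrame index that `to_csv` wrote out. Passing `index_col=0` to `read_csv` restores it as the index instead of a data column. The trimmed column set and the variable names are assumptions for illustration.

```python
import io

import pandas as pd

# A subset of the columns from the rows above, laid out the way to_csv() writes them:
# the first, unnamed column holds the DataFrame index
csv_text = """,speciesCode,comName,howMany
0,hooori,Hooded Oriole,2.0
1,houfin,House Finch,7.0
2,allhum,Allen's Hummingbird,3.0
3,annhum,Anna's Hummingbird,3.0
4,coshum,Costa's Hummingbird,1.0
"""

# index_col=0 keeps the saved index from reappearing as an 'Unnamed: 0' column
birds = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Total individuals reported per species, largest first
counts = birds.groupby('comName')['howMany'].sum().sort_values(ascending=False)
print(counts)
```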


## Connecting to an existing S3 bucket and storing CSV files in it
Using the Python library `Boto3`, we connect to AWS S3. We'll be storing CSV files in the *birds-around-my-area* bucket.
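
One way to keep the upload step testable without AWS credentials is to pass the S3 client in as an argument. The sketch below is a variant under that assumption, not the notebook's own function: `upload_then_remove` is a hypothetical helper, and `s3_client` is expected to behave like `boto3.client('s3')`, whose `upload_file(Filename, Bucket, Key)` method it calls. The local file is deleted only after the upload call returns without raising.

```python
import os


def upload_then_remove(s3_client, csv_path, bucket_name):
    """Upload csv_path to bucket_name, then delete the local copy.

    The object key is the file's base name. If upload_file() raises,
    os.remove() is never reached and the local file is kept.
    """
    key = os.path.basename(csv_path)
    s3_client.upload_file(csv_path, bucket_name, key)
    os.remove(csv_path)
    return key


# Example call, assuming AWS credentials are configured for boto3
# ('bird_data.csv' is a placeholder file name):
#   import boto3
#   upload_then_remove(boto3.client('s3'), 'bird_data.csv', 'birds-around-my-area')
```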

In [60]:
import boto3
import os

def storeToBucket(csvFile, bucket_name):
    """Stores csv file to the specified bucket
    
    Args:
        csvFile (str): name of the csv file
        bucket_name (str): name of the S3 bucket to connect and store data to
            
    """
    # Creating the connection
    session = boto3.Session(profile_name='default')
    s3 = session.resource('s3')

    # Storing data; the context manager closes the file once the upload finishes
    with open(csvFile, 'rb') as f:
        s3.Object(bucket_name, csvFile).put(Body=f)

    # Delete .csv in local folder
    os.remove(csvFile)
    
storeToBucket(getName(), 'birds-around')



In [54]:
import pandas as pd
import datetime as dt

pd.Timestamp('today').week

today = dt.date.today()
week_ago = today - dt.timedelta(days=7)



<bound method Timestamp.weekday of Timestamp('2022-08-23 14:34:50.268800')>
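
The output above is the repr of a bound method, which is what a notebook displays when `weekday` is referenced without parentheses. `Timestamp.week`, by contrast, is a property and returns an integer directly. If the file name were to carry a year and week number, a standard-library sketch could look like the following. The helper `weeklyFileName` and its naming pattern are assumptions, not part of the notebook.

```python
import datetime as dt


def weeklyFileName(day):
    """Build a file name from the ISO year and ISO week number of `day`."""
    # isocalendar() yields (ISO year, ISO week number, ISO weekday)
    year, week, _ = day.isocalendar()
    return f'bird_data_{year}_W{week:02d}.csv'


# 23 August 2022, the date in the output above, falls in ISO week 34
print(weeklyFileName(dt.date(2022, 8, 23)))  # bird_data_2022_W34.csv
```

Note that the ISO year can differ from the calendar year for dates near 1 January, so the two values should always be taken from the same `isocalendar()` call.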