Run the cell below to set up the notebook (press _shift+return_ in the cell, or click the Run button in the menu).
Do you see a number in the left margin of the cell below? If so, click on _Kernel->Restart and Clear Output_

In [1]:
%%capture
!pip install --no-cache-dir shapely
!pip install -U folium

import os
import time
import folium
from datetime import datetime
from shapely.geometry import Point, mapping
from shapely.geometry.polygon import Polygon
import matplotlib as mpl
from matplotlib.collections import PatchCollection
import matplotlib.pyplot as plot
import numpy as np
import pandas as pd
import requests
from datascience import *
from shapely import geometry as sg, wkt
from scripts.espm_module import *
import json
import random
from IPython.display import display, HTML
import ipywidgets as widgets
import urllib3
%matplotlib inline


# ESPM / IB 105 Natural History Museums and Data Science

Work through this notebook after successfully completing the BNHM notebook that shows how to access species occurrence data. This notebook shows how to integrate diverse data sets to visualize correlations and discover patterns that address questions about species’ responses to environmental change. We will use programmatic tools to work with Berkeley resources such as biodiversity data from biocollections and online databases, field stations, climate models, and other environmental data.

## Table of Contents

4 - [Cal-Adapt API](#adapt)

---

# Part 4: Cal-Adapt API<a id='adapt'></a>

Let's retrieve the *Argia agrioides* data again with the GBIF API:

In [None]:
req = GBIFRequest()  # creating a request to the API
params = {'scientificName': 'Argia agrioides'}  # setting our parameters (the specific species we want)
pages = req.get_pages(params)  # using those parameters to complete the request
records = [rec for page in pages for rec in page['results'] if rec.get('decimalLatitude')]  # sift out valid records
records[:5]  # print first 5 records
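
The list comprehension above flattens the pages into one list and keeps only records whose `decimalLatitude` is truthy. The sketch below repeats that pattern on made-up data; the `pages` list is a hypothetical stand-in for what `req.get_pages` yields, not real GBIF output. It also shows one subtlety: a truthiness test drops a latitude of exactly `0.0`, while an explicit `is not None` check keeps it.

```python
# Hypothetical pages shaped like the paginated responses used above:
# each page is a dict whose 'results' key holds a list of record dicts.
pages = [
    {'results': [
        {'gbifID': '1', 'decimalLatitude': 37.87, 'decimalLongitude': -122.26},
        {'gbifID': '2'},  # no coordinates at all
    ]},
    {'results': [
        {'gbifID': '3', 'decimalLatitude': None, 'decimalLongitude': None},
        {'gbifID': '4', 'decimalLatitude': 0.0, 'decimalLongitude': 10.0},
    ]},
]

# Same pattern as the cell above: flatten the pages, keep truthy latitudes.
truthy = [rec for page in pages for rec in page['results']
          if rec.get('decimalLatitude')]

# Stricter variant: keep every record whose latitude is present, including 0.0.
present = [rec for page in pages for rec in page['results']
           if rec.get('decimalLatitude') is not None]

print([rec['gbifID'] for rec in truthy])
print([rec['gbifID'] for rec in present])
```

For a species observed in California the two filters should agree, since no record sits on the equator.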

We'll make a `DataFrame` again for later use:

In [None]:
records_df = pd.DataFrame(records)
records_df.head()

Now we will use the [Cal-Adapt](http://api.cal-adapt.org/api/) web API to work with time series raster data. For each of our *Argia agrioides* records, we request the entire time series at that record's location and combine the results into a single Pandas `DataFrame`:

In [None]:
%pip install -q intake-esm s3fs


In [None]:
req = CalAdaptRequest()
records_g = [dict(rec, geometry=sg.Point(rec['decimalLongitude'], rec['decimalLatitude']))
             for rec in records]
ca_df = req.concat_features(records_g, 'gbifID')
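
The internals of `concat_features` are not shown in this notebook, so the sketch below only illustrates the layout it appears to produce: one column per record, labeled by ID, with one row per year. It uses `pd.concat` on made-up series; the IDs and temperatures are invented for illustration.

```python
import pandas as pd

# Hypothetical yearly temperature series for two records, keyed by ID.
years = pd.Index([2006, 2007, 2008], name='year')
series_by_id = {
    '1001': pd.Series([70.1, 70.4, 70.9], index=years),
    '1002': pd.Series([65.3, 65.2, 66.0], index=years),
}

# Passing a dict with axis=1 places each series in its own column,
# using the dict keys as column labels.
wide = pd.concat(series_by_id, axis=1)

print(wide.shape)
print(list(wide.columns))
```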

Let's look at the first five rows:

In [None]:
ca_df.head()

In [None]:
len(ca_df.columns), len(ca_df)

This looks like the time series data we want: each column is one record, labeled by its unique ID number, and each row is one year. Every record has a projected temperature in degrees Fahrenheit for each of the 273 years. We can plot the projections for the first nine records:

In [None]:
# Make a line plot using the first 9 columns of dataframe
ca_df.iloc[:,:9].plot()

# Use matplotlib to title your plot.
plot.title('Argia agrioides - %s' % req.slug)

# Use matplotlib to add labels to the x and y axes of your plot.
plot.xlabel('Year', fontsize=18)
plot.ylabel('Degrees (Fahrenheit)', fontsize=16)


It looks like temperature is increasing across the board wherever these observations are occurring. We can calculate the average temperature for each year across observations in California:

In [None]:
tmax_means = ca_df.mean(axis=1)
tmax_means
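
To make the `axis=1` behavior concrete, here is a minimal sketch on a made-up table (the IDs and values are invented). With `axis=1`, pandas averages across the columns of each row, so the result has one value per year. Missing values are skipped by default.

```python
import numpy as np
import pandas as pd

# Toy table: rows are years, columns are hypothetical record IDs.
toy = pd.DataFrame(
    {'1001': [70.0, 72.0], '1002': [66.0, np.nan], '1003': [68.0, 74.0]},
    index=[2006, 2007],
)

# Average across records within each year; NaN entries are ignored.
row_means = toy.mean(axis=1)
print(row_means)
```

The 2007 mean is taken over the two available values only, which is worth keeping in mind if some records lack data for part of the time range.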

What's happening to the average temperature that *Argia agrioides* is going to experience in the coming years across California?

In [None]:
tmax_means.plot()
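
One way to put a number on the trend in a plot like this is to fit a straight line to the yearly means. The sketch below does so with `np.polyfit` on a made-up series that warms by exactly 0.05 degrees per year. It assumes the index holds integer years, which may not match how `tmax_means` is actually indexed.

```python
import numpy as np
import pandas as pd

# Toy yearly means (invented values), indexed by integer year.
years = np.arange(2000, 2010)
toy_means = pd.Series(70.0 + 0.05 * (years - 2000), index=years)

# Measure time in years since the first year to keep the fit well conditioned.
x = toy_means.index.to_numpy() - toy_means.index[0]

# A degree-1 fit returns the slope first, then the intercept.
slope, intercept = np.polyfit(x, toy_means.to_numpy(), 1)
print(f"warming rate: {slope:.3f} degrees F per year")
```

A straight line is only a rough summary; projected temperatures need not change at a constant rate.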

Is there a temperature at which *Argia agrioides* cannot survive? Is there one at which they particularly thrive?

---

What if we look specifically at the field stations and reserves? We can reuse the code that checked whether a record was within a station, and then map those `gbifID`s back to this temperature dataset:

In [None]:
records_df["point"] = records_df.apply(lambda row: make_point(row), axis=1)
records_df["station"] = records_df.apply(lambda row: in_station(reserves, row), axis=1)
in_stations_df = records_df[records_df["station"] != False]
in_stations_df[['gbifID', 'station']].head()
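
The helpers `make_point` and `in_station` come from `scripts.espm_module` and are not shown here, so their real behavior may differ. As a rough sketch of the idea, the code below uses shapely's `Polygon.contains` to test whether a point falls inside any of a set of outlines. The names `toy_make_point`, `toy_in_station`, and `toy_reserves` are hypothetical, as are the square outlines. Returning `False` for a miss is an assumption based on the `!= False` filter above.

```python
from shapely.geometry import Point, Polygon

# Hypothetical reserve outlines as (longitude, latitude) corner lists.
toy_reserves = {
    'Toy Reserve A': Polygon([(-120.0, 34.0), (-119.0, 34.0),
                              (-119.0, 35.0), (-120.0, 35.0)]),
    'Toy Reserve B': Polygon([(-122.0, 37.0), (-121.0, 37.0),
                              (-121.0, 38.0), (-122.0, 38.0)]),
}

def toy_make_point(row):
    """Build a shapely Point in (x=longitude, y=latitude) order."""
    return Point(row['decimalLongitude'], row['decimalLatitude'])

def toy_in_station(reserves, row):
    """Return the name of the first outline containing the record, else False."""
    point = toy_make_point(row)
    for name, outline in reserves.items():
        if outline.contains(point):
            return name
    return False

inside = {'decimalLongitude': -119.5, 'decimalLatitude': 34.5}
outside = {'decimalLongitude': -100.0, 'decimalLatitude': 40.0}
print(toy_in_station(toy_reserves, inside))
print(toy_in_station(toy_reserves, outside))
```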

Recall that the column headers of `ca_df` are the `gbifID` values:

In [None]:
ca_df.head()

Now we subset the temperature dataset for only the observations that occur within the bounds of a reserve or field station:

In [None]:
# Convert each gbifID to a string so it can be used as a column label of ca_df
station_obs = [str(gbif_id) for gbif_id in in_stations_df['gbifID']]
ca_df[station_obs]
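
Selecting columns with a list of labels raises a `KeyError` if any label is missing, for example when a record has no time series. The sketch below, on made-up data, converts integer IDs to strings and keeps only the labels that are present before selecting. Whether the real IDs are integers or strings is an assumption here.

```python
import pandas as pd

# Toy temperature table with string column labels (invented values).
toy_ca = pd.DataFrame(
    {'1001': [70.0, 71.0], '1002': [65.0, 66.0]},
    index=[2006, 2007],
)

# Hypothetical integer IDs; 9999 has no matching column.
toy_ids = [1002, 9999]

# Convert to strings, then keep only labels that exist in the table.
labels = [str(gbif_id) for gbif_id in toy_ids]
available = [label for label in labels if label in toy_ca.columns]

subset = toy_ca[available]
print(list(subset.columns))
```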

Let's graph these observations from Santa Cruz Island against the average temperature across California where this species was observed:

In [None]:
plot.plot(tmax_means)
plot.plot(ca_df[station_obs])

# Use matplotlib to title your plot.
plot.title('Argia agrioides and temperatures on Santa Cruz Island')

# Use matplotlib to add labels to the x and y axes of your plot.
plot.xlabel('Year', fontsize=18)
plot.ylabel('Degrees (Fahrenheit)', fontsize=16)
plot.legend(["CA Average", "Santa Cruz Island"])
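
To read the comparison as numbers rather than from a plot, one option is to subtract the statewide yearly mean from the mean of the station columns. The sketch below does this on made-up data; `toy_ca` and `toy_station_obs` are hypothetical stand-ins for `ca_df` and `station_obs`.

```python
import pandas as pd

# Toy table: rows are years, columns are hypothetical record IDs.
toy_ca = pd.DataFrame(
    {'1001': [70.0, 72.0], '1002': [66.0, 68.0], '1003': [62.0, 64.0]},
    index=[2006, 2007],
)
toy_station_obs = ['1003']  # records that fall inside a station

# Yearly mean over all records, and over station records only.
statewide = toy_ca.mean(axis=1)
station = toy_ca[toy_station_obs].mean(axis=1)

# Negative values mean the station sites are cooler than the overall mean.
gap = station - statewide
print(gap)
```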

What does this tell you about Santa Cruz Island? As time goes on and the temperature increases, might Santa Cruz Island serve as a refuge for *Argia agrioides*?