# 11/21/22 Live Coding Demo 

# **Detecting sea level rise from Florida tide gauge records**
### Practice with Pandas time series analysis
<img src="https://drive.google.com/uc?export=view&id=1hSIOjtYu1jHM1sfi32KjCSwxi1ADybyR" width="1200" />

**Data description:** A netCDF file with hourly tide gauge records from Key West, FL (from 1913 to present). The tide gauge measurements are of relative sea level (RSL), which includes both sea level rise (from ice melt and thermal expansion) and local vertical land motion (from subsidence and isostatic rebound).

**Data source:** [University of Hawaii Sea Level Center](https://uhslc.soest.hawaii.edu/datainfo/)

In [None]:
# Import NumPy, netCDF4, pandas, SciPy, xarray, Matplotlib, and datetime helpers
import numpy as np
import netCDF4
import pandas as pd
from scipy import interpolate, stats
import xarray as xr
import matplotlib.pyplot as plt
from datetime import datetime, timedelta

# **Smoothing Data** 

# Step 1: 
Open the Key West data file and convert the sea level record to a Pandas series. Display the new Pandas series.
Note: Drop the record id.
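
One possible way to approach this step is sketched below. It builds a small synthetic dataset rather than reading the real file, and it assumes the file stores a variable named `sea_level` with dimensions `(record_id, time)`; check the actual names by printing the dataset returned by `xr.open_dataset(filepath)`. The station id and all values are placeholders.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the tide gauge file. The variable name "sea_level",
# the (record_id, time) layout, and the id value are assumptions.
time = pd.date_range("2000-01-01", periods=48, freq="h")
ds = xr.Dataset(
    {"sea_level": (("record_id", "time"), np.arange(48.0).reshape(1, 48))},
    coords={"record_id": [1], "time": time},
)

# Pick the single record; drop=True discards the record_id coordinate,
# leaving a 1-D array that converts to a Series indexed by time.
sea_level = ds["sea_level"].isel(record_id=0, drop=True).to_series()
print(sea_level.head())
```

With the real file, `ds` would come from `xr.open_dataset(filepath)` instead of the synthetic constructor.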

In [None]:
# read in the filepath 
filepath = "data/tide_gauge_key_west_fl.nc"
# open with xarray
...
# data cleaning
...

# Step 2. Plot the time series from Key West.
- Consider using `xlim` and `ylim` to zoom in on a shorter time frame (e.g., eight months of the Key West tide gauge record).
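
As a rough sketch of the zooming idea, the example below plots a synthetic hourly series and restricts the axes to an eight-month window. The series, the date window, and the y-limits are all placeholders chosen for illustration, not values from the Key West record.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic hourly series standing in for the Key West record (placeholder values)
time = pd.date_range("2000-01-01", "2001-12-31 23:00", freq="h")
sea_level = pd.Series(
    1500 + 300 * np.sin(2 * np.pi * np.arange(len(time)) / 12.42), index=time
)

fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(sea_level.index, sea_level.values, color="k", linewidth=0.5)

# Zoom in on eight months by restricting the x-axis to a date window
ax.set_xlim(pd.Timestamp("2000-03-01"), pd.Timestamp("2000-11-01"))
ax.set_ylim(1000, 2000)
ax.set_xlabel("Time")
ax.set_ylabel("Relative sea level")
ax.grid(True)
```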





In [None]:
...

# Step 3. Daily averages using `.resample()`


*   Calculate the daily average of the time series using `.resample()`.
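
A minimal sketch of daily resampling on a tiny synthetic hourly series (three days of placeholder values, named `hourly` here only for illustration):

```python
import numpy as np
import pandas as pd

# Three days of synthetic hourly values: 0, 1, 2, ..., 71
time = pd.date_range("2000-01-01", periods=24 * 3, freq="h")
hourly = pd.Series(np.arange(len(time), dtype=float), index=time)

# Group the hourly values into calendar days ("D") and average each group
daily_mean = hourly.resample("D").mean()
print(daily_mean)
```

Each output value is the mean of the 24 hourly values that fall on that calendar day.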






In [None]:
# resampling by day
...
# plot
...

# Step 4. Monthly running means using `.rolling()`


* Calculate the monthly running mean of the hourly time series using `.rolling()`.

* Save these as two new variables. Display one of them to check that it worked.

* Make a new plot using the monthly rolling values
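
The sketch below shows one way to take a running mean over roughly a month of hourly data. It assumes a "month" means 30 days (720 hourly samples) and centers the window on each point; both choices are assumptions, and the series is synthetic.

```python
import numpy as np
import pandas as pd

# 90 days of synthetic hourly values with a tide-like oscillation
time = pd.date_range("2000-01-01", periods=24 * 90, freq="h")
hourly = pd.Series(np.sin(2 * np.pi * np.arange(len(time)) / 12.42), index=time)

# Assumption: treat one month as 30 days of hourly samples
window = 24 * 30

# Centered running mean; points without a full window around them are NaN
monthly_rolling = hourly.rolling(window=window, center=True).mean()
print(monthly_rolling.dropna().head())
```

Unlike `.resample()`, the result keeps the hourly time step, so it has the same length as the input. A time-based window such as `.rolling("30D")` is another option when the index is a `DatetimeIndex`.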



In [None]:
# calculate rolling monthly mean
...
# plot
...

# **Interpolating**

# Step 1:

* Load the NASA_GISS_global_temp CSV using pandas.
Note: Set `index_col` to 'Year'.
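
For illustration, here is how `index_col` behaves on a tiny in-memory CSV. The column name `Temp_anomaly` and the values are made-up placeholders; the real file's columns may differ.

```python
import io
import pandas as pd

# In-memory stand-in for the CSV file (hypothetical column name, placeholder values)
csv_text = "Year,Temp_anomaly\n1880,0.0\n1881,0.1\n1882,0.2\n"

# index_col="Year" makes the Year column the row index instead of a data column
global_temp = pd.read_csv(io.StringIO(csv_text), index_col="Year")
print(global_temp)
```

With the real file, pass `filepath` in place of the `io.StringIO` object.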

In [None]:
# get filepath to temperature dataset
filepath = "data/NASA_GISS_global_temp.csv"

# load dataframe with index set to 'Year'
...

# Step 2:
* Create a line plot of temperature anomaly versus year.


In [6]:
# Make a black solid line plot of the original 1880-2019 time series in "global_temp".
# Add a grid and axis labels.
...

# Step 3: Removing a portion of data 
* Recreate the original line plot
* Create a copy of "global_temp" and save it as a new variable, "gt_missing_1990s"
* Drop the 1990s data from the new copy
* Add scatter points for the incomplete time series in "gt_missing_1990s"
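
The copy-and-drop part of this step might look like the sketch below. It uses a synthetic stand-in for "global_temp" with a hypothetical `Temp_anomaly` column; only the variable names "gt_missing_1990s" and "years_missing_1990s" come from the exercise.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for global_temp, indexed by Year (placeholder values)
years = np.arange(1880, 2020)
global_temp = pd.DataFrame(
    {"Temp_anomaly": np.linspace(-0.5, 1.0, len(years))},
    index=pd.Index(years, name="Year"),
)

# .copy() makes an independent DataFrame, so the original stays intact
gt_missing_1990s = global_temp.copy()

# Years to remove: 1990 through 1999 (the stop value 2000 is excluded)
years_missing_1990s = np.arange(1990, 2000)
gt_missing_1990s = gt_missing_1990s.drop(index=years_missing_1990s)

print(len(global_temp), len(gt_missing_1990s))
```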


In [None]:
# copy over plot
...

# make copy of data
...

# Delete years 1990-1999 from dataframe copy
...

# On the same plot, add red scatter points for the incomplete time series in "gt_missing_1990s".
...

# Add label arguments to your plot functions above, then add a legend.
...

# Step 4: Interpolate over the missing data
* Use SciPy's interp1d() to linearly interpolate the data in "gt_missing_1990s" to the missing years, which are stored in the variable "years_missing_1990s".
* On the same plot, add scatter points for the temperatures interpolated to years in "years_missing_1990s".
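
A hedged sketch of the interpolation call, using synthetic data that rises linearly with year so the result is easy to check. The `Temp_anomaly` column name is hypothetical.

```python
import numpy as np
import pandas as pd
from scipy import interpolate

# Synthetic series that rises linearly with year (placeholder values)
years = np.arange(1880, 2020)
global_temp = pd.DataFrame(
    {"Temp_anomaly": 0.01 * (years - 1880)}, index=pd.Index(years, name="Year")
)

years_missing_1990s = np.arange(1990, 2000)
gt_missing_1990s = global_temp.drop(index=years_missing_1990s)

# interp1d returns a function built from the known (year, temperature) pairs
interp_func = interpolate.interp1d(
    gt_missing_1990s.index.values,
    gt_missing_1990s["Temp_anomaly"].values,
    kind="linear",
)

# Evaluate that function at the missing years
temps_interp_1990s = interp_func(years_missing_1990s)
print(temps_interp_1990s)
```

The interpolated values can then be added to the plot with something like `plt.scatter(years_missing_1990s, temps_interp_1990s)`.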

In [None]:
# Copy/paste plot from step 3
...

# Use SciPy's interp1d() to linearly interpolate the data
...

# add to plot
...
# Add label arguments to your plot functions above, then add a legend.
...

# Step 5: Run linear regression over data (if time permits)
* Use `scipy.stats.linregress()` to calculate the linear regression of temperature over all years.
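
As a minimal sketch, the regression and the fitted trend line could be computed as follows. The data here are synthetic and lie exactly on a line, so the recovered slope matches the one used to generate them.

```python
import numpy as np
from scipy import stats

# Synthetic temperatures lying exactly on a line (placeholder values)
years = np.arange(1880, 2020)
temps = 0.01 * (years - 1880) - 0.5

# linregress returns the slope, intercept, r-value, p-value, and standard error
result = stats.linregress(years, temps)

# Evaluate the fitted line at every year so it can be drawn on the plot
trend_line = result.slope * years + result.intercept
print(result.slope, result.intercept)
```

The trend can be overlaid with something like `plt.plot(years, trend_line)`; the slope is in temperature units per year.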

In [None]:
# copy/paste plot from step 4
...

# calculate linear regression
...

# add to plot