# 08 Demo: Using snotel_ccss_stations to get SNOTEL data

UW Geospatial Data Analysis  
CEE467/CEWA567  
Eric Gagliano  

**Thanks for checking out this notebook! My hope is that this repository makes it easier to retrieve daily SNOTEL and CCSS data without clunky downloads and conversions. Snow depth, SWE, and PRCPSA are in meters; temperatures are in degrees Celsius. The only required packages are geopandas and pandas; the rest of the imports are for applications :)**

In [None]:
import pandas as pd
import geopandas as gpd
import numpy as np
import matplotlib.pyplot as plt
import datetime
import tqdm
import contextily as ctx

## View all SNOTEL & CCSS stations
- the [SNOwpack TELemetry (SNOTEL) network](https://www.nrcs.usda.gov/wps/portal/wcc/home/aboutUs/monitoringPrograms/automatedSnowMonitoring/) includes over 800 automated weather stations in the Western U.S. for mountain snowpack observation
- the [CCSS program](https://water.ca.gov/Programs/Flood-Management/Flood-Data/Snow-Surveys) manages a network of 130 automated snow sensors located in the Sierra Nevada and Shasta-Trinity Mountains

### Read the geojson stored at https://raw.githubusercontent.com/egagli/snotel_ccss_stations/main/all_stations.geojson
- a daily recurring GitHub Action should keep the `endDate` column up to date
- set the index of the geodataframe to the code column
- let's only look at sites for which we have data

In [None]:
all_stations_gdf = gpd.read_file('https://raw.githubusercontent.com/egagli/snotel_ccss_stations/main/all_stations.geojson').set_index('code')
all_stations_gdf = all_stations_gdf[all_stations_gdf['csvData']==True]

In [None]:
all_stations_gdf

### Use geopandas `GeoDataFrame.explore()` on the `all_stations_gdf` geodataframe to interactively view the stations 
- color by network: red is SNOTEL, blue is CCSS.

In [None]:
all_stations_gdf.astype(dict(beginDate=str, endDate=str)).explore(column='network',cmap='bwr')

## Read a single CSV: *Paradise, WA*
- check out information about the [SNOTEL station near Mt. Rainier at Paradise, WA](https://wcc.sc.egov.usda.gov/nwcc/site?sitenum=679)
- cool plots available at the [Northwest River Forecast Center website](https://www.nwrfc.noaa.gov/snow/snowplot.cgi?AFSW1)

### Place a station code (which you can find in this interactive plot, or by other means) in the url: https://raw.githubusercontent.com/egagli/snotel_ccss_stations/main/data/{station_id}.csv
- for SNOTEL stations, this will be of the form {unique number}_{two letter state abbreviation}_SNTL (e.g. 679_WA_SNTL).   
- for CCSS stations, this will be a three letter code (e.g. BLK).   
- use `pd.read_csv()` with `index_col='datetime'` and `parse_dates=True` so we interpret the datetime column as pandas datetime objects
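
As a rough illustration of the two code formats described above, here is a minimal sketch that guesses the network from the shape of a station code and builds the CSV URL. The regular expressions, the helper names `guess_network` and `station_csv_url`, and the returned labels are assumptions made for this example; they are not part of the repository, and real codes may not all fit these patterns.

```python
import re

# Patterns inferred from the bullet points above; treat them as assumptions
SNOTEL_PATTERN = re.compile(r'^\d+_[A-Z]{2}_SNTL$')  # e.g. 679_WA_SNTL
CCSS_PATTERN = re.compile(r'^[A-Z]{3}$')             # e.g. BLK

BASE_URL = 'https://raw.githubusercontent.com/egagli/snotel_ccss_stations/main/data'


def guess_network(station_id):
    """Return 'SNOTEL', 'CCSS', or None based only on the shape of the code."""
    if SNOTEL_PATTERN.match(station_id):
        return 'SNOTEL'
    if CCSS_PATTERN.match(station_id):
        return 'CCSS'
    return None


def station_csv_url(station_id):
    """Fill the station code into the URL template from the heading above."""
    return f'{BASE_URL}/{station_id}.csv'


print(guess_network('679_WA_SNTL'), station_csv_url('679_WA_SNTL'))
print(guess_network('BLK'), station_csv_url('BLK'))
```

A check like this only looks at the string; whether a CSV actually exists for a given code is still best confirmed against `all_stations_gdf`.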

In [None]:
station_id = '679_WA_SNTL'
paradise_snotel = pd.read_csv(f'https://raw.githubusercontent.com/egagli/snotel_ccss_stations/main/data/{station_id}.csv',index_col='datetime', parse_dates=True)

In [None]:
paradise_snotel
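
To see what `index_col='datetime'` and `parse_dates=True` do without touching the network, the same call can be tried on a tiny in-memory CSV. The rows below are made up for illustration (only the `SNWD` and `WTEQ` column names come from this notebook), and `demo_df` is a hypothetical name.

```python
import io

import pandas as pd

# Made-up rows shaped like a station CSV; the values are illustrative only
csv_text = """datetime,SNWD,WTEQ
2018-01-01,2.50,0.80
2018-01-02,2.55,0.82
2018-01-03,2.60,0.85
"""

demo_df = pd.read_csv(io.StringIO(csv_text), index_col='datetime', parse_dates=True)

# The index is parsed into a DatetimeIndex, so date strings can be used for slicing
print(type(demo_df.index).__name__)
print(demo_df.loc['2018-01-02':'2018-01-03', 'SNWD'])
```

Because the index holds real timestamps rather than strings, the same kind of date-based slicing should also work on `paradise_snotel`.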

### Try a simple plot of snow depth and SWE
- select the column of interest and use pandas' built-in `Series.plot()`

In [None]:
f,ax=plt.subplots(figsize=(12,5))

paradise_snotel['SNWD'].plot(ax=ax,label='snow depth')
paradise_snotel['WTEQ'].plot(ax=ax,label='snow water equivalent')

ax.set_xlim(pd.to_datetime(['2017-10-01','2018-09-30']))

ax.grid()
ax.legend()

ax.set_xlabel('time')
ax.set_ylabel('snow depth / SWE [meters]')
ax.set_title('Snow depth and SWE at Paradise, WA \n(water year 2018)')

f.tight_layout()
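
The x-limits in the plot above span water year 2018, which runs from October 1, 2017 through September 30, 2018. A small helper can compute those bounds for any water year; `water_year_bounds` is a hypothetical name introduced for this sketch, not something the repository provides.

```python
import pandas as pd


def water_year_bounds(water_year):
    """Return (start, end) timestamps for a water year: Oct 1 of the prior year to Sep 30."""
    start = pd.Timestamp(year=water_year - 1, month=10, day=1)
    end = pd.Timestamp(year=water_year, month=9, day=30)
    return start, end


start, end = water_year_bounds(2018)
print(start.date(), end.date())
```

With a datetime-indexed dataframe such as `paradise_snotel`, these bounds could also be used to slice a single season, for example `paradise_snotel.loc[start:end]`.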

## Read a variable from multiple CSVs by looping over a subset of the geodataframe: *CCSS stations*
- the Sierra Nevada [received a historic amount of snow in 2023](https://www.nps.gov/articles/000/sien-sierranevadamonitor-spring2023.htm)
- let's explore the magnitude of this season by comparing it to the median snowpack

### Select the stations we are interested in, loop through them and add each station's data to a dictionary with the station code as the key, then build a dataframe using `pd.DataFrame.from_dict()`
- create a geodataframe `ccss_stations_gdf` of only CCSS stations from `all_stations_gdf` by selecting the rows where the `network` column equals `CCSS`
- loop through the CCSS stations and create a dataframe `ccss_stations_snwd_df`

In [None]:
ccss_stations_gdf = all_stations_gdf[all_stations_gdf['network']=='CCSS']

In [None]:
ccss_stations_gdf

In [None]:
%%time 
station_dict = {}

for station in tqdm.tqdm(ccss_stations_gdf.index):
    try:
        tmp = pd.read_csv(f'https://raw.githubusercontent.com/egagli/snotel_ccss_stations/main/data/{station}.csv',index_col='datetime',parse_dates=True)['SNWD']
        station_dict[station] = tmp
    except Exception as e:
        # catch ordinary errors only, so a keyboard interrupt still stops the loop
        print(f'failed to retrieve {station}: {e}')

ccss_stations_snwd_df = pd.DataFrame.from_dict(station_dict).dropna(how='all')

In [None]:
ccss_stations_snwd_df
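
To make the last step of the loop more concrete, here is a toy sketch of what `pd.DataFrame.from_dict()` followed by `dropna(how='all')` does when the series cover different dates. The station codes `AAA` and `BBB` and all values are made up for illustration.

```python
import pandas as pd

nan = float('nan')

# Two made-up snow depth series with partially overlapping dates
series_a = pd.Series(
    [1.0, 1.2, nan],
    index=pd.to_datetime(['2023-01-01', '2023-01-02', '2023-01-03']),
)
series_b = pd.Series(
    [nan, 2.0],
    index=pd.to_datetime(['2023-01-03', '2023-01-04']),
)

# Each dictionary key becomes a column; rows are the union of all dates
toy_df = pd.DataFrame.from_dict({'AAA': series_a, 'BBB': series_b})

# Drop only the dates where every station is missing
cleaned = toy_df.dropna(how='all')

print(toy_df.shape, cleaned.shape)
```

Dates that only some stations report are kept with `NaN` in the other columns, so `ccss_stations_snwd_df` can be expected to contain gaps wherever a station has no record.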