# Winter 2022-2023 Wind Events at Kettle Ponds

Author: Daniel Hogan
Created: January 10, 2024

This notebook will start to address three main questions (with sub-focuses discussed below):
1) What events had the highest percentile of wind speeds?
2) What were the general storm characteristics? Were they related?
3) How much sublimation over the season came from these events?

### Imports


In [242]:
# general
import os
import datetime as dt
import json
# data 
import xarray as xr 
from sublimpy import utils, variables, tidy
import numpy as np
import pandas as pd
from act import discovery, plotting
# plotting
import matplotlib.pyplot as plt
import plotly.express as px 
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import cufflinks as cf
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
# helper tools
from scripts.get_sail_data import get_sail_data
from scripts.helper_funcs import create_windrose_df
import scripts.helper_funcs as hf
from metpy import calc, units
# make plotly work 
init_notebook_mode(connected=True)
cf.go_offline()

## 1. What events at Kettle Ponds had the highest percentile of wind speeds?
We will begin to address this by looking at daily average wind speeds from the SOS data to classify the windiest days (above the 90th percentile) during the main snow season, which we will call December 1, 2022 to May 1, 2023. This may need to be broken down further to capture individual wind events, but we'll start with this. We'll begin by comparing the days for 3-20 m wind speeds and comparing tower-to-tower to make sure we have consistency at our location.

1) We'll first make some box plots of daily average wind speeds at each height and each tower to look at the total distribution over winter
2) We'll make a timeseries plot for each height bin and mark out the highest percentile of wind speeds for the year
3) We'll also make a correlation plot with tower on one axis and measurement height on the other.
4) We'll then filter to the days with the highest 10% of wind speeds over our period
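
As a rough illustration of step 4, the sketch below picks out the days whose daily mean wind speed exceeds a chosen quantile of all daily means. It runs on synthetic data, and the function name `windy_days` is hypothetical (it is not part of `sublimpy` or the helper scripts). The quantile `q` is a free parameter: 0.90 matches the plan above, while the cells further down use 0.95.

```python
import numpy as np
import pandas as pd

# Synthetic 5-minute wind speeds (m/s) standing in for one SOS column; not real data.
rng = np.random.default_rng(0)
times = pd.date_range("2022-12-01", "2023-04-30 23:55", freq="5min")
speeds = pd.Series(rng.gamma(shape=2.0, scale=1.5, size=len(times)), index=times)


def windy_days(spd, q=0.90):
    """Return the daily means that exceed the q-th quantile of all daily means."""
    daily = spd.resample("1D").mean()
    threshold = daily.quantile(q)
    return daily[daily > threshold]


top_days = windy_days(speeds, q=0.90)
n_days = speeds.resample("1D").mean().size
print(f"{len(top_days)} of {n_days} days exceed the 90th percentile")
```

Because the threshold is computed per column, different towers or heights can flag slightly different days, which is why the results are combined across locations later on.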

In [2]:
start_date = '20221129'
end_date = '20230507'

In [243]:
%%capture
# Let's begin by downloading the SOS data and storing it in the /storage/ directory
output_dir = '/storage/dlhogan/synoptic_sublimation/sos_data/'
if not os.path.exists(output_dir):
    os.makedirs(output_dir)
sos_5min_ds = utils.download_sos_data(
                        start_date=start_date,
                        end_date=end_date,
                        variable_names=variables.DEFAULT_VARIABLES+hf.WATER_VAPOR_VARIABLES,
                        local_download_dir=output_dir,
                        cache=True
                    );  

**NOTE**: No filtering is done here. Will update with filters in the future.

In [21]:
# only get the wind variables and convert to dataframe
sos_5min_wind_df = sos_5min_ds[hf.WIND_VARIABLES].to_dataframe()
# resample to daily, get the mean for spd_*, u_*, v_* and w_* and the median for dir_*
# make a dictionary of the aggregations by iterating through the wind variables
sos_daily_avg_dict = {}
for var in hf.WIND_VARIABLES:
    if 'dir' in var:
        sos_daily_avg_dict[var] = 'median'
    else:
        sos_daily_avg_dict[var] = 'mean'
sos_daily_avg_df = sos_5min_wind_df.resample('1D').agg(sos_daily_avg_dict)
# reset index for sos_daily_avg_df
sos_daily_avg_df = sos_daily_avg_df.reset_index()

# resample to daily, get the max for spd_*, u_*, v_* and w_*
# make a dictionary of the aggregations by iterating through the wind variables
sos_daily_max_dict = {}
for var in hf.WIND_VARIABLES:
    if 'dir_' in var:
        continue
    else:
        sos_daily_max_dict[var] = 'max'
sos_daily_max_df = sos_5min_wind_df.resample('1D').agg(sos_daily_max_dict)

# find the wind direction during the max wind speed and add it to the daily max dataframe
spd_5min_df = sos_5min_wind_df.filter(regex='spd_*').dropna()
idx = spd_5min_df.groupby(spd_5min_df.index.date).idxmax(skipna=True)
for dir_var in sos_5min_wind_df.filter(regex='dir_*').columns:
    # for the column, extract everything after the first underscore
    loc = dir_var.split('_', 1)[1]
    spd_loc = 'spd_' + loc
    # create a column with the max wind direction
    dates = sos_5min_wind_df.loc[idx[spd_loc].values, dir_var].index.date
    # fill a new dir_var column with nan
    sos_daily_max_df[dir_var] = np.nan
    sos_daily_max_df.loc[dates, dir_var] = sos_5min_wind_df.loc[idx[spd_loc].values, dir_var].values
# reset index for sos_daily_max_df
sos_daily_max_df = sos_daily_max_df.reset_index()

In [22]:
# for daily values greater than 25 m/s in the spd_* columns, fill with nan
for spd_var in sos_daily_avg_df.filter(regex='spd_*').columns:
    dir_var = 'dir_' + spd_var.split('_', 1)[1]
    # build the masks before overwriting the speeds; otherwise the direction
    # masks would be computed from already-NaN speeds and never trigger
    avg_mask = sos_daily_avg_df[spd_var] > 25
    max_mask = sos_daily_max_df[spd_var] > 25
    # mask the speed and the matching direction together
    sos_daily_avg_df.loc[avg_mask, [spd_var, dir_var]] = np.nan
    sos_daily_max_df.loc[max_mask, [spd_var, dir_var]] = np.nan

In [23]:
sos_daily_avg_tidy_df = tidy.get_tidy_dataset(sos_daily_avg_df, hf.WIND_VARIABLES)
sos_daily_max_tidy_df = tidy.get_tidy_dataset(sos_daily_max_df, hf.WIND_VARIABLES)
# filter to only spd variables
sos_daily_avg_tidy_df = sos_daily_avg_tidy_df[sos_daily_avg_tidy_df['variable'].str.contains('spd_')]
sos_daily_max_tidy_df = sos_daily_max_tidy_df[sos_daily_max_tidy_df['variable'].str.contains('spd_')]

Let's start to get an understanding of daily wind speeds by plotting the average and max wind speed on each day as a time series.

In [24]:
# create a color dictionary for the unique height values in the sos_tidy_df
color_values = ['spd_1m_uw','spd_1m_d','spd_1m_ue',  'spd_2m_c', 'spd_3m_uw','spd_3m_c', 'spd_3m_ue','spd_3m_d', 'spd_5m_c',
                             'spd_10m_uw','spd_10m_c','spd_10m_ue','spd_10m_d','spd_15m_c', 'spd_20m_c']
n_colors = len(color_values)                      
color_scale = px.colors.sample_colorscale("viridis_r", [n/(n_colors -1) for n in range(n_colors)])
color_dict = dict(zip(color_values,color_scale))
fig = go.Figure()
fig = make_subplots(rows=2, 
                    cols=1, 
                    shared_xaxes=True, 
                    vertical_spacing=0.04, 
                    subplot_titles=('Daily Average Wind Speed', 'Daily Max 5 min Wind Speed'))
for variable in sos_daily_avg_df.filter(regex='spd_*').columns:
    fig.add_trace(go.Scatter(
        x=sos_daily_avg_df['time'], 
        y=sos_daily_avg_df[variable],
        name=f"{variable}",
        marker_color=color_dict[variable],
        connectgaps=False
    ),
    row=1, col=1)
    fig.add_trace(go.Scatter(
        x=sos_daily_max_df['time'], 
        y=sos_daily_max_df[variable],
        name=f"{variable}",
        marker_color=color_dict[variable],
        connectgaps=False,
        showlegend=False
    ),
    row=2, col=1)
# add a horizontal line in the first plot at 5 m/s
fig.add_hline(y=5, row=1, col=1, line_dash="dash", line_color="black")
# add an annotation for the horizontal line
fig.add_annotation(xref="paper", yref="y", x=dt.date(2023,3,5), y=5.5,
            text="<b>5 m/s</b>",
            showarrow=False,
            # increase font size
            font=dict(
                size=16,
                color="black",
                ),
            row=1, col=1)
# update traces to not connect gaps
fig.update_traces(connectgaps=False)
# update layout
fig.update_layout(
    title='Daily Wind Speeds at Kettle Ponds',
    xaxis_title='Date',
    # set 1st yaxis title
    yaxis1=dict(
        title='Wind Speed (m/s)',
        range=[0, 12],
    ),
    # set 2nd yaxis title
    yaxis2=dict(
        title='Wind Speed (m/s)',
        range=[0, 25],
    ),
    legend_title_text='Wind Speeds',
    height=800,
    width=800,
    template='plotly_white'
)
fig

We see strong relationships between locations, although the lower levels are not as clean. The maximum wind speeds for the winter occurred during the December 2022 wind event, but numerous other days exceeded a daily average of 5 m/s, many of them with max "gusts" over 10 m/s.

### Wind Speed Plots

In [25]:
# Make a boxplot of the daily average wind speeds at each height and each tower
fig = px.box(sos_daily_avg_tidy_df, 
             x='variable', 
             y='value', 
             color='height',
             title='Daily Average Wind Speeds at Kettle Ponds',
             # show time in the hover
             hover_data=['time'],
             template='plotly_dark',
             height=500,
             width=1100,
             # widen the box
            boxmode='overlay',
            notched=True,
            points='all',
            category_orders={
                "variable":['spd_1m_uw','spd_1m_d','spd_1m_ue',  'spd_2m_c', 'spd_3m_uw','spd_3m_c', 'spd_3m_ue','spd_3m_d', 'spd_5m_c',
                             'spd_10m_uw','spd_10m_c','spd_10m_ue','spd_10m_d','spd_15m_c', 'spd_20m_c']
            }
            )
# add jitter
fig.update_traces(jitter=0.5, marker=dict(size=2))
# add labels to the x and y axis
fig.update_xaxes(title_text='Measurement Locations')
fig.update_yaxes(title_text='Wind Speed (m/s)')

fig

Wind speeds generally agree across these locations, but we should look at correlations as well. The 10 m level looks like the best and most consistent across space and time. It will also be consistent with other measurements of wind speed.


In [26]:
# Make a boxplot of the daily max (5-min) wind speeds at each height and each tower
fig = px.box(sos_daily_max_tidy_df, 
             x='variable', 
             y='value', 
             color='height',
             title='Daily Max (5-min) Wind Speeds at Kettle Ponds',
             # show time in the hover
             hover_data=['time'],
             template='plotly_dark',
             height=500,
             width=1100,
             # widen the box
            boxmode='overlay',
            notched=True,
            points='all',
            category_orders={
                "variable":['spd_1m_uw','spd_1m_d','spd_1m_ue',  'spd_2m_c', 'spd_3m_uw','spd_3m_c', 'spd_3m_ue','spd_3m_d', 'spd_5m_c',
                             'spd_10m_uw','spd_10m_c','spd_10m_ue','spd_10m_d','spd_15m_c', 'spd_20m_c']
            }
            )
# add jitter
fig.update_traces(jitter=0.5, marker=dict(size=2))
# add labels to the x and y axis
fig.update_xaxes(title_text='Measurement Locations')
fig.update_yaxes(title_text='Wind Speed (m/s)')

fig

Maxes are generally consistent at different levels, but the 10 m level on tower d has an outlier on the 22nd. It could be real, but likely is not, as it was higher than any wind speed measured at any other location.

### Wind Direction Plots

In [27]:
# Make the same plot for wind direction variables
sos_daily_avg_dir_tidy_df = tidy.get_tidy_dataset(sos_daily_avg_df, hf.WIND_VARIABLES)
sos_daily_max_dir_tidy_df = tidy.get_tidy_dataset(sos_daily_max_df, hf.WIND_VARIABLES)
# filter to only dir variables
sos_daily_avg_dir_tidy_df = sos_daily_avg_dir_tidy_df[sos_daily_avg_dir_tidy_df['variable'].str.contains('dir_')]
sos_daily_max_dir_tidy_df = sos_daily_max_dir_tidy_df[sos_daily_max_dir_tidy_df['variable'].str.contains('dir_')]

In [31]:
# Make a boxplot of the daily median wind direction at each height and each tower
fig = px.box(sos_daily_avg_dir_tidy_df, 
             x='variable', 
             y='value', 
             color='height',
             title='Daily Median Wind Direction at Kettle Ponds',
             # show time in the hover
             hover_data=['time'],
             template='plotly_dark',
             height=500,
             width=1100,
             # widen the box
            boxmode='overlay',
            notched=True,
            points='all',
            category_orders={
                "variable":['dir_1m_uw','dir_1m_d','dir_1m_ue',  'dir_2m_c', 'dir_3m_uw','dir_3m_c', 'dir_3m_ue','dir_3m_d', 'dir_5m_c',
                             'dir_10m_uw','dir_10m_c','dir_10m_ue','dir_10m_d','dir_15m_c', 'dir_20m_c']
            }
            )
# add jitter
fig.update_traces(jitter=0.5, marker=dict(size=2))
# add labels to the x and y axis
fig.update_xaxes(title_text='Measurement Locations')
fig.update_yaxes(title_text='Wind direction (degrees)')
fig

Box plots are not the best way to show this, but the median wind direction is somewhere around 300 degrees (WNW). It becomes more northerly at lower measurement heights and more westerly at higher ones. Wind is pretty much never easterly and rarely southerly.
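
One caveat for direction statistics: direction is circular, so a plain mean or median of degrees can mislead when values straddle north (0/360). A minimal sketch of a circular mean is shown below, assuming unweighted directions in degrees; the function name `circular_mean_deg` is hypothetical and is not part of `sublimpy` or the helper scripts.

```python
import numpy as np


def circular_mean_deg(directions_deg):
    """Mean of compass directions in degrees, computed on the unit circle."""
    rad = np.deg2rad(np.asarray(directions_deg, dtype=float))
    # average the unit vectors (ignoring NaNs), then convert back to an angle
    mean_sin = np.nanmean(np.sin(rad))
    mean_cos = np.nanmean(np.cos(rad))
    return np.rad2deg(np.arctan2(mean_sin, mean_cos)) % 360.0


# 350 and 10 degrees straddle north: their arithmetic mean is 180 (south),
# while the circular mean points north
sample = [350.0, 10.0]
print(np.mean(sample), circular_mean_deg(sample))
```

With a prevailing direction near 300 degrees the difference from the plain median is probably small here, but it could matter on days when the wind swings through north.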


In [34]:
uw_daily_avg_df = sos_daily_avg_df[['spd_10m_uw', 'dir_10m_uw']]
# create a windrose dataframe
uw_daily_avg_df = create_windrose_df(uw_daily_avg_df, wind_dir_var='dir_10m_uw', wind_spd_var='spd_10m_uw')

# create a wind rose plot using bar_polar
fig = px.bar_polar(uw_daily_avg_df, 
                   r="frequency", 
                   theta="direction", 
                   color="speed", template="plotly_dark",
                   color_discrete_sequence= px.colors.sequential.Plasma_r,
                   width= 800,
                   height=500,
                   barnorm='percent',
                  )
# update the radial axis to show percentages
fig.update_layout(
    title="Daily Median Wind Rose at 10 m on UW Tower",
    font_size=16,
    polar_radialaxis=dict(
         ticksuffix='%',
         tickfont_size=14,
         showline=False,
         showticklabels=True,
         showgrid=True,
         angle=45,
         range=[0, 80]
      ),
)



In [30]:
# Make a boxplot of the wind direction at the time of the daily max wind speed at each height and each tower
fig = px.box(sos_daily_max_dir_tidy_df, 
             x='variable', 
             y='value', 
             color='height',
             title='Daily Wind Direction During Daily Max Wind at Kettle Ponds',
             # show time in the hover
             hover_data=['time'],
             template='plotly_dark',
             height=500,
             width=1100,
             # widen the box
            boxmode='overlay',
            notched=True,
            points='all',
            category_orders={
                "variable":['dir_1m_uw','dir_1m_d','dir_1m_ue',  'dir_2m_c', 'dir_3m_uw','dir_3m_c', 'dir_3m_ue','dir_3m_d', 'dir_5m_c',
                             'dir_10m_uw','dir_10m_c','dir_10m_ue','dir_10m_d','dir_15m_c', 'dir_20m_c']
            }
            )
# add jitter
fig.update_traces(jitter=0.5, marker=dict(size=2))
# add labels to the x and y axis
fig.update_xaxes(title_text='Measurement Locations')
fig.update_yaxes(title_text='Wind Direction (deg)')

fig

This is interesting. When the daily maximum wind speed occurs, winter winds come almost exclusively from the NW, with some southeasterly components as well. Need wind roses to verify.
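
For reference, a wind rose is just a frequency table over compass sectors and speed bins. The sketch below builds such a table, assuming 16 sectors and arbitrary speed bin edges. `windrose_table` is a hypothetical helper for illustration only and is not the notebook's `create_windrose_df`, whose internals are not shown here.

```python
import numpy as np
import pandas as pd

SECTORS = ["N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
           "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW"]


def windrose_table(df, dir_col, spd_col, spd_edges=(0, 2, 4, 6, 8, np.inf)):
    """Percent of samples in each (compass sector, speed bin) pair."""
    data = df[[dir_col, spd_col]].dropna()
    # shift by half a sector (11.25 deg) so that "N" spans 348.75 to 11.25 deg
    sector_idx = (((data[dir_col] + 11.25) % 360) // 22.5).astype(int).to_numpy()
    direction = pd.Categorical.from_codes(sector_idx, categories=SECTORS, ordered=True)
    # left-closed speed bins: [0, 2), [2, 4), ...
    speed = pd.cut(data[spd_col].to_numpy(), bins=list(spd_edges), right=False)
    out = pd.DataFrame({"direction": direction, "speed": speed})
    counts = out.groupby(["direction", "speed"], observed=False).size()
    return (100 * counts / len(data)).rename("frequency").reset_index()


# tiny made-up example
demo = pd.DataFrame({"dir": [0, 350, 300, 300, 90], "spd": [1, 1, 5, 5, 9]})
print(windrose_table(demo, "dir", "spd").query("frequency > 0"))
```

A table like this could be passed to `px.bar_polar` with `r="frequency"`, `theta="direction"`, and `color="speed"`, likely after converting the speed intervals to strings.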

In [33]:
uw_daily_max_df = sos_daily_max_df[['spd_10m_uw', 'dir_10m_uw']]
# create a windrose dataframe
uw_daily_max_df = create_windrose_df(uw_daily_max_df, wind_dir_var='dir_10m_uw', wind_spd_var='spd_10m_uw')

# create a wind rose plot using bar_polar
fig = px.bar_polar(uw_daily_max_df, 
                   r="frequency", 
                   theta="direction", 
                   color="speed", template="plotly_dark",
                   color_discrete_sequence= px.colors.sequential.Plasma_r,
                   width= 800,
                   height=500,
                   barnorm='percent',
                  )
# update the radial axis to show percentages
fig.update_layout(
    title="Daily Max Wind Rose at 10 m on UW Tower",
    font_size=16,
    polar_radialaxis=dict(
         ticksuffix='%',
         tickfont_size=14,
         showline=False,
         showticklabels=True,
         showgrid=True,
         angle=45,
         range=[0, 80]
      ),
)



### Wind Speed and Direction Correlation Heatmaps

In [93]:
# compute the correlation matrix between the daily average wind speeds at every height and tower
sos_heatmap_wind_spd_df = sos_daily_avg_df.filter(regex='spd_*').corr()
# make upper triangle nan
sos_heatmap_wind_spd_df = sos_heatmap_wind_spd_df.where(np.tril(np.ones(sos_heatmap_wind_spd_df.shape)).astype('bool'))

# create a heatmap of the wind speed correlations between heights and towers
fig = px.imshow(sos_heatmap_wind_spd_df, 
                labels=dict(x="Measurement Locations", y="Measurement Locations", color="Correlation"),
                color_continuous_scale=px.colors.sequential.RdBu_r,
                title='Correlation Matrix of Wind Speeds at Kettle Ponds',
                # add min and max values to the colorbar
                zmin=0,
                zmax=1,
                # make it a triangle
                origin='upper',
                width=800,
                height=500,                
                )
fig.update_traces(xgap=1, ygap=1,   hoverongaps=False)
fig


In [94]:
# compute the correlation matrix between the daily median wind directions at every height and tower
sos_heatmap_wind_dir_df = sos_daily_avg_df.filter(regex='dir_*').corr()
# make upper triangle nan
sos_heatmap_wind_dir_df = sos_heatmap_wind_dir_df.where(np.tril(np.ones(sos_heatmap_wind_dir_df.shape)).astype('bool'))

# create a heatmap of the wind direction correlations between heights and towers
fig = px.imshow(sos_heatmap_wind_dir_df, 
                labels=dict(x="Measurement Locations", y="Measurement Locations", color="Correlation"),
                color_continuous_scale=px.colors.sequential.RdBu_r,
                title='Correlation Matrix of Wind Directions at Kettle Ponds',
                # add min and max values to the colorbar
                zmin=0,
                zmax=1,
                # make it a triangle
                origin='upper',
                width=800,
                height=500,                
                )
fig.update_traces(xgap=1, ygap=1,   hoverongaps=False)

Both heatmaps show very high correlations across measurement locations, for wind speed as well as for wind direction. So I think sticking to the 10 m measurements across towers will be fine.

### Now let's separate out the days above the 95th percentile of wind speed and see how correlated these days are across the locations

In [183]:
# get the 95th percentile of daily average wind speed for each of the 10 m spd columns
spd_95th_percentile = sos_daily_avg_df.filter(regex='spd_10m_').quantile(.95)
avg_spd_95th_percentile_dict = {}
max_spd_95th_percentile_dict = {}
for loc,spd in spd_95th_percentile.items():
    # get the rows where the daily average exceeds the 95th percentile at this location
    avg_spd_95th_percentile_dict[loc] = sos_daily_avg_df[sos_daily_avg_df[loc] > spd]
    # get the daily max rows for those same days (both dataframes share the same row order)
    max_spd_95th_percentile_dict[loc] = sos_daily_max_df[sos_daily_avg_df[loc] > spd]
    # drop duplicate index values
    avg_spd_95th_percentile_dict[loc] = avg_spd_95th_percentile_dict[loc].drop_duplicates(subset='time', keep='first')
    max_spd_95th_percentile_dict[loc] = max_spd_95th_percentile_dict[loc].drop_duplicates(subset='time', keep='first')

In [203]:
# combine the dataframes into an xarray dataset for the avg spd 95th percentile
sos_avg_spd_95th_percentile_ds = xr.concat([df.set_index('time').to_xarray() for df in avg_spd_95th_percentile_dict.values()], dim='time')
# drop duplicate time values
sos_avg_spd_95th_percentile_ds = sos_avg_spd_95th_percentile_ds.drop_duplicates('time', keep='first').sortby('time')
# convert to dataframe
sos_avg_spd_95th_percentile_df = sos_avg_spd_95th_percentile_ds.to_dataframe()

# combine the dataframes into an xarray dataset for the max spd 95th percentile
sos_max_spd_95th_percentile_ds = xr.concat([df.set_index('time').to_xarray() for df in max_spd_95th_percentile_dict.values()], dim='time')
# drop duplicate time values
sos_max_spd_95th_percentile_ds = sos_max_spd_95th_percentile_ds.drop_duplicates('time', keep='first').sortby('time')
# convert to dataframe
sos_max_spd_95th_percentile_df = sos_max_spd_95th_percentile_ds.to_dataframe()


In [209]:
print('Days with avg spd > 95th percentile:')
# print the date for these days
for day in sos_avg_spd_95th_percentile_df.index.date:
    print(day)

Days with avg spd > 95th percentile:
2022-11-29
2022-12-13
2022-12-14
2022-12-22
2023-01-04
2023-01-25
2023-02-09
2023-03-16
2023-04-15
2023-04-28


In [204]:
# scatter plot the 95th percentile wind speeds for each spd_* column
fig = go.Figure()
for loc in sos_avg_spd_95th_percentile_df.filter(regex='spd_10m_*').columns:
    fig.add_trace(go.Scatter(
        x=sos_avg_spd_95th_percentile_df.index, 
        y=sos_avg_spd_95th_percentile_df[loc],
        name=f"avg at {loc.split('_', 2)[1]} on {loc.split('_', 2)[2]}",
        mode='lines+markers',
        marker_color=color_dict[loc],
        marker=dict(size=10,
                    symbol='circle'),
        connectgaps=False
    ))
    # add the max spd 95th percentile as a dashed line
    fig.add_trace(go.Scatter(
        x=sos_max_spd_95th_percentile_df.index, 
        y=sos_max_spd_95th_percentile_df[loc],
        name=f"max at {loc.split('_', 2)[1]} on {loc.split('_', 2)[2]}",
        mode='markers+lines',
        marker_color=color_dict[loc],
        marker=dict(size=10,
                    symbol='square'),
        connectgaps=False,
        line=dict(dash='dash')
    ))
fig.update_layout(
    title='95th Percentile Wind Speeds at Kettle Ponds',
    xaxis_title='Date',
    yaxis_title='Wind Speed (m/s)',
    legend_title_text='Wind Speeds',
    height=400,
    width=800,
    template='plotly_white'
)

In [216]:
# make windrose plots of the 10 m spd and dir columns for each tower (daily averages)
for tower in ['uw','c','ue','d']:
    tower_10m_90th_percentile_df = sos_avg_spd_95th_percentile_df[[f'spd_10m_{tower}', f'dir_10m_{tower}']]
    # create a windrose dataframe
    tower_10m_90th_percentile_df = create_windrose_df(tower_10m_90th_percentile_df, wind_dir_var=f'dir_10m_{tower}', wind_spd_var=f'spd_10m_{tower}')
    fig = px.bar_polar(tower_10m_90th_percentile_df,
                        r="frequency", 
                        theta="direction", 
                        color="speed", template="plotly_dark",
                        color_discrete_sequence= px.colors.sequential.Plasma_r,
                        width= 600,
                        height=600,
                        barnorm='percent',
                        
                        )
    # update the radial axis to show percentages
    fig.update_layout(
        title="Days above 95th Percentile Wind Speeds<br>"+f"10 m on {tower.upper()} Tower",
        # add space below title
        margin=dict(t=100),
        font_size=16,
        polar_radialaxis=dict(
            ticksuffix='%',
            tickfont_size=14,
            showline=False,
            showticklabels=True,
            showgrid=True,
            angle=45,
            range=[0, 80]
        ),
    )
    # move legend more to the right
    fig.update_layout(legend=dict(
        orientation="v",
        xanchor="right",
        x=1.5
    ))
    fig.show()









In [219]:
# make windrose plots of the 10 m spd and dir columns for each tower (daily maxes)
for tower in ['uw','c','ue','d']:
    tower_10m_90th_percentile_df = sos_max_spd_95th_percentile_df[[f'spd_10m_{tower}', f'dir_10m_{tower}']]
    # create a windrose dataframe
    tower_10m_90th_percentile_df = create_windrose_df(tower_10m_90th_percentile_df, wind_dir_var=f'dir_10m_{tower}', wind_spd_var=f'spd_10m_{tower}')
    fig = px.bar_polar(tower_10m_90th_percentile_df,
                        r="frequency", 
                        theta="direction", 
                        color="speed", template="plotly_dark",
                        color_discrete_sequence= px.colors.sequential.Plasma_r,
                        width= 600,
                        height=600,
                        barnorm='percent',
                        
                        )
    # update the radial axis to show percentages
    fig.update_layout(
        title="Days above 95th Percentile Max Wind Speeds<br>"+f"10 m on {tower.upper()} Tower",
        # add space below title
        margin=dict(t=100),
        font_size=16,
        polar_radialaxis=dict(
            ticksuffix='%',
            tickfont_size=14,
            showline=False,
            showticklabels=True,
            showgrid=True,
            angle=45,
            range=[0, 80]
        ),
    )
    # move legend more to the right
    fig.update_layout(legend=dict(
        orientation="v",
        xanchor="right",
        x=1.5
    ))
    fig.show()









## 2. What were some of the storm characteristics at the surface? Are they related?
This section will likely produce another notebook to focus on upper-level dynamics, but we want to get an idea of what the storm was like. We'll look at correlations of:
- wind speed
- wind direction
- relative humidity
- 2m temperature

We'll also take a look at the SAIL radiosondes from those days to try to get a picture of what was happening at upper levels. Perhaps we'll make a mean radiosonde profile by binning on pressure and taking the mean within each bin? Have to figure that out.
We can also start to explore some of the Doppler lidar data.

Eventually (not in this notebook), I want to understand what the precipitation timing was like and whether it matters. I would think that windy storms where snow falls first and then blows around could be the most important.

Let's begin by getting the 5-minute data from each of these selected days. Then we'll make line plots for wind speed, pressure, temperature, and vapor density.
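
The idea of overlaying selected days on a common time-of-day axis can be sketched with plain pandas, as below. This is an illustration on synthetic data with made-up dates, not the notebook's xarray-based approach. The column name `spd_10m_uw` is reused only as a label.

```python
import numpy as np
import pandas as pd

# Synthetic 5-minute series standing in for one SOS variable; not real data.
times = pd.date_range("2022-12-13", "2022-12-15 23:55", freq="5min")
series = pd.Series(np.arange(len(times), dtype=float), index=times, name="spd_10m_uw")

# Hypothetical subset of days to compare
selected = [pd.Timestamp("2022-12-13").date(), pd.Timestamp("2022-12-14").date()]

frame = series.to_frame()
frame["date"] = frame.index.date
frame["time_of_day"] = frame.index.time
frame = frame[frame["date"].isin(selected)]

# rows: time of day, columns: date, so each column is one day's trace
by_day = frame.pivot(index="time_of_day", columns="date", values="spd_10m_uw")
# composite trace across the selected days
mean_trace = by_day.mean(axis=1)
print(by_day.shape)
```

Each column of `by_day` can then be drawn as one line, with `mean_trace` on top as the composite.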

In [244]:
# get a list of the windy days
windy_days = sos_avg_spd_95th_percentile_df.index.date
# iterate through the days, make a ds for each day, and store them in a list
sos_daily_ds_list = []
for day in windy_days:
    # get the ds for the day
    ds = sos_5min_ds.sel(time=slice(f'{day} 00:00:00', f'{day} 23:59:59'))
    # make the time dimension only have the time of day
    ds['time'] = ds['time'].dt.time
    # add the date as a new coordinate
    ds.coords['date'] = day
    # add the ds to the list
    sos_daily_ds_list.append(ds)

In [293]:
sos_5min_windy_ds = xr.concat(sos_daily_ds_list, dim='date')
# get wind speeds for each height
sos_5min_windy_wind_df = sos_5min_windy_ds[hf.WIND_VARIABLES].to_dataframe().reset_index()
# get temperature and rh for each height
sos_5min_windy_temp_df = sos_5min_windy_ds[hf.TEMPERATURE_VARIABLES].to_dataframe().reset_index()
# get water vapor for each height
sos_5min_windy_wv_df = sos_5min_windy_ds[hf.WATER_VAPOR_VARIABLES].to_dataframe().reset_index()
# get the pressure for each height
sos_5min_windy_press_df = sos_5min_windy_ds[hf.PRESSURE_VARIABLES].to_dataframe().reset_index()

In [261]:
# define function to calculate day of the water year
def get_day_of_water_year(date):
    # if input is type dt.date convert to dt.datetime
    if type(date) == dt.date:
        date = dt.datetime(date.year, date.month, date.day)
    # if the date is before october 1st, subtract one from the year
    if date.month < 10:
        year = date.year - 1
    else:
        year = date.year
    # create a datetime object for october 1st of the year
    oct_1 = dt.datetime(year, 10, 1)
    # calculate the day of water year
    day_of_water_year = (date - oct_1).days + 1
    return day_of_water_year
# test it and print result
print(get_day_of_water_year(dt.date(2022, 9, 10)))

345


In [276]:
# for each date, plot the wind speed at 10m on uw tower over the day. Color by the day of water year
# also plot the mean wind speed for all the days in bold 

# first set up color ramp between 0 and 365 with viridis
n_colors = 366
color_scale = px.colors.sample_colorscale("IceFire", [n/(n_colors -1) for n in range(n_colors)])
color_dict = dict(zip(range(0,366),color_scale))

# groupby time on uw_10m and get the mean
sos_5min_windy_time_avg = sos_5min_windy_wind_df['spd_10m_uw'].groupby(sos_5min_windy_wind_df.time).mean()
# iterate through the dates and plot the wind speed at 10m on uw tower over the day. Color by the day of water year
fig = go.Figure()
for date in sos_5min_windy_wind_df.date.unique():
    # get the day of water year
    day_of_water_year = get_day_of_water_year(date)
    # get the data for the date
    df = sos_5min_windy_wind_df[sos_5min_windy_wind_df.date == date]
    # plot the data
    fig.add_trace(go.Scatter(
        x=df['time'], 
        y=df['spd_10m_uw'],
        name=f"{date}",
        marker_color=color_dict[day_of_water_year],
        connectgaps=False
    ))
# add the mean line with a heavier lineweight
fig.add_trace(go.Scatter(
    x=sos_5min_windy_time_avg.index, 
    y=sos_5min_windy_time_avg.values,
    name=f"mean",
    marker_color='black',
    connectgaps=False,
    line=dict(dash='dash'),
    line_width=3
))
# update to show all hover data and not just the closest point
fig.update_layout(hovermode="x unified",
                  yaxis_title='Wind Speed (m/s)',
                    xaxis_title='Time of Day (UTC)',
                    title='10 m Wind Speeds on UW Tower on days above the 95th percentile of wind speed',
)
fig

In [275]:
# do the same plot as above but for wind direction
# groupby time on uw_10m and get the mean
sos_5min_windy_time_avg = sos_5min_windy_wind_df['dir_10m_uw'].groupby(sos_5min_windy_wind_df.time).mean()
# iterate through the dates and plot the wind direction at 10m on uw tower over the day. Color by the day of water year
fig = go.Figure()
for date in sos_5min_windy_wind_df.date.unique():
    # get the day of water year
    day_of_water_year = get_day_of_water_year(date)
    # get the data for the date
    df = sos_5min_windy_wind_df[sos_5min_windy_wind_df.date == date]
    # plot the data
    fig.add_trace(go.Scatter(
        x=df['time'], 
        y=df['dir_10m_uw'],
        name=f"{date}",
        mode='markers',
        marker_color=color_dict[day_of_water_year],
        connectgaps=False
    ))

# update to show all hover data and not just the closest point
fig.update_layout(hovermode="x unified",
                  yaxis_title='Wind Direction',
                    xaxis_title='Time of Day',
                    title='10 m Wind Direction on UW Tower on days above the 95th percentile of wind speed',
)
# update the y-axis labels to be the cardinal directions and inbetween directions
fig.update_yaxes(tickvals=[0, 45, 90, 135, 180, 225, 270, 315, 360],
                 ticktext=['N', 'NE', 'E', 'SE', 'S', 'SW', 'W', 'NW', 'N'])
fig

In [282]:
# make the same plot but do it on the temperature data, we'll use 3m on c
sos_5min_windy_time_avg = sos_5min_windy_temp_df['T_3m_c'].groupby(sos_5min_windy_temp_df.time).mean()
# iterate through the dates and plot the 3m temperature on tower c over the day. Color by the day of water year
fig = go.Figure()
for date in sos_5min_windy_temp_df.date.unique():
    # get the day of water year
    day_of_water_year = get_day_of_water_year(date)
    # get the data for the date
    df = sos_5min_windy_temp_df[sos_5min_windy_temp_df.date == date]
    # plot the data
    fig.add_trace(go.Scatter(
        x=df['time'], 
        y=df['T_3m_c'],
        name=f"{date}",
        mode='lines',
        marker_color=color_dict[day_of_water_year],
        connectgaps=False
    ))
# add the average
fig.add_trace(go.Scatter(
    x=sos_5min_windy_time_avg.index, 
    y=sos_5min_windy_time_avg.values,
    name=f"mean",
    marker_color='black',
    connectgaps=False,
    line=dict(dash='dash'),
    line_width=3
))

# update to show all hover data and not just the closest point
fig.update_layout(hovermode="x unified",
                  yaxis_title='Temperature (C)',
                    xaxis_title='Time of Day',
                    title='3m Temperature at Tower C on days above the 95th percentile of wind speed',
)



In [297]:
# for sos_5min_windy_wv_df, set non-positive water vapor values to nan
# (mask only the h2o column: masking whole rows also turns `date` into NaN, which makes
# get_day_of_water_year raise "AttributeError: 'float' object has no attribute 'month'")
sos_5min_windy_wv_df['h2o_5m_c'] = sos_5min_windy_wv_df['h2o_5m_c'].where(sos_5min_windy_wv_df['h2o_5m_c'] > 0)
# make the same plot but do it on the water vapor data, we'll use 5m on c
sos_5min_windy_time_avg = sos_5min_windy_wv_df['h2o_5m_c'].groupby(sos_5min_windy_wv_df.time).mean()
# iterate through the dates and plot the water vapor at 5m on tower c over the day. Color by the day of water year
fig = go.Figure()
for date in sos_5min_windy_wv_df.date.unique():
    # get the day of water year
    day_of_water_year = get_day_of_water_year(date)
    # get the data for the date
    df = sos_5min_windy_wv_df[sos_5min_windy_wv_df.date == date]
    # plot the data
    fig.add_trace(go.Scatter(
        x=df['time'], 
        y=df['h2o_5m_c'],
        name=f"{date}",
        mode='lines',
        marker_color=color_dict[day_of_water_year],
        connectgaps=False
    ))
# add the average
fig.add_trace(go.Scatter(
    x=sos_5min_windy_time_avg.index, 
    y=sos_5min_windy_time_avg.values,
    name=f"mean",
    marker_color='black',
    connectgaps=False,
    line=dict(dash='dash'),
    line_width=3
))

# update to show all hover data and not just the closest point
fig.update_layout(hovermode="x unified",
                  yaxis_title='Water Vapor (g/m^3)',
                    xaxis_title='Time of Day',
                    title='5m Water Vapor at Tower C on days above the 95th percentile of wind speed',
                    # set yaxis to 0 and 5
                    yaxis=dict(range=[0, 5])
)

## 3. How much sublimation over the season came from these events?
We will address this question by calculating hourly sublimation totals from SOS and SAIL over the winter period (dates may need to be adjusted to match what Eli used in his calculations). Then, for each of the windy days identified above, we'll get the total sublimation on that day.
1) First, make a timeseries plot of cumulative sublimation over the year. Add horizontal boxes that mark the days of each wind event.
2) Filter the hourly sublimation totals to just the days we want to include and sum the total.
3) Check how well the days with the most sublimation correspond with these windy days.
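
The three steps above can be sketched as follows. Everything here is an assumption for illustration: the hourly values are random numbers, not SOS or SAIL sublimation estimates, and the event days are made up. In practice the series would come from the filtered sonic data and the days from the windy-day list.

```python
import numpy as np
import pandas as pd

# Synthetic hourly sublimation (mm per hour); illustration only, not real data.
hours = pd.date_range("2022-12-01", "2023-04-30 23:00", freq="60min")
rng = np.random.default_rng(1)
sublimation = pd.Series(rng.uniform(0.0, 0.02, size=len(hours)), index=hours)

# Hypothetical event days
event_days = pd.to_datetime(["2022-12-22", "2023-01-04"])

# step 1: cumulative curve for the season (this is what would be plotted)
cumulative = sublimation.cumsum()

# step 2: daily totals, then the total and seasonal fraction on event days
daily_total = sublimation.resample("1D").sum()
on_event = daily_total.index.isin(event_days)
event_total = daily_total[on_event].sum()
fraction = event_total / daily_total.sum()

# step 3: where do the event days rank among all days (1 = most sublimation)?
rank = daily_total.rank(ascending=False)[on_event]

print(f"{100 * fraction:.1f}% of seasonal sublimation fell on {on_event.sum()} event days")
```

Comparing `fraction` with the share of days the events occupy (here 2 of 151) gives a quick sense of whether windy days contribute disproportionately.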

From ELI: 
Some of the sonics look fine. Some sonics mess up the estimates. Filtering is important.