# Winter 2022 & 2023 Wind Events as Measured by SAIL

Author: Daniel Hogan
Created: January 17, 2024

This notebook will start to address three main questions (with sub-focuses discussed below):
1) What events had the highest percentile of wind speeds?
2) What were the general storm characteristics? Were they related?
3) How much sublimation over the season came from these events?

### Imports


In [1]:
# general
import os
import glob
import datetime as dt
import json
# data 
import xarray as xr 
from sublimpy import utils, variables, tidy, turbulence
import numpy as np
import pandas as pd
from act import discovery, plotting
# plotting
import matplotlib.pyplot as plt
from metpy.cbook import get_test_data
from metpy.plots import add_metpy_logo, SkewT
import plotly.express as px 
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import cufflinks as cf
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
# helper tools
from scripts.get_sail_data import get_sail_data
from scripts.helper_funcs import create_windrose_df, simple_sounding, mean_sounding
import scripts.helper_funcs as hf
from metpy import calc, units
# make plotly work 
init_notebook_mode(connected=True)
cf.go_offline()

nctoolkit is using Climate Data Operators version 2.3.0


## 1. What events had the highest percentile of wind speeds?
We will begin to address this by looking at daily average wind speeds during the SAIL campaign. For the 2023 winter, I'll define winter as the same period as the SOS campaign (November 29 to May 7). For the 2022 winter, I'll define it as the period when at least 6" of snow was on the ground according to billy barr's measurements. This includes a storm that brought his site up from 1" to 6", because I want to see how that storm influenced the season (not much, I imagine). The 2022 winter therefore spans December 6 to May 1. The end of winter is loosely defined as when snowmelt rates really start to pick up: in 2022 we lost 41 cm of snow between April 23 and May 1, and a similar thing happened in 2023 between April 29 and May 7. This definition could be more direct, but we'll start with it. Because the 2022 winter is shorter, we expect less sublimation to occur in it.

The goal is to classify the top 90th percentile of windy days during the main snow season for 2022 and 2023. This may need to be broken down further to capture wind events, but we'll start with this. We'll begin by comparing the days for 3-20 m wind speeds, tower to tower, to make sure we have consistency at our location.

1) We'll first make some box plots of daily average wind speeds at Gothic from SAIL, billy barr's data, and the 10 m UW tower wind speed.
2) We'll make a timeseries plot for each height bin and mark out the highest percentile of wind speeds for the year
3) We'll then filter to the days with the highest 5% of wind speeds over each winter and see how they compare to each other.
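
As a rough sketch of the percentile filtering described in the steps above, the snippet below flags days whose daily mean wind speed exceeds a chosen quantile. The wind speeds are synthetic and the helper `flag_windy_days` is a hypothetical name, not part of this notebook or of `scripts.helper_funcs`.

```python
import numpy as np
import pandas as pd

# Synthetic daily-mean wind speeds (m/s); NOT SAIL data, for illustration only.
rng = np.random.default_rng(0)
dates = pd.date_range("2021-12-06", "2022-05-01", freq="D")
daily_wspd = pd.Series(rng.gamma(shape=4.0, scale=0.6, size=len(dates)), index=dates)

def flag_windy_days(series, q=0.95):
    """Return the threshold at quantile q and the days strictly above it."""
    threshold = series.quantile(q)
    return threshold, series[series > threshold]

threshold, windy_days = flag_windy_days(daily_wspd, q=0.95)
print(f"threshold: {threshold:.2f} m/s, windy days: {len(windy_days)}")
```

Passing `q=0.90` instead would give the 90th-percentile version of the same filter.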


### Setup to download SAIL data

In [2]:
# Function to load ARM credentials
def load_arm_credentials(credential_path):
    with open(credential_path, 'r') as f:
        credentials = json.load(f)
    return credentials
# Location of ARM credentials
credential_path = '/home/dlhogan/.act_config.json'
credentials = load_arm_credentials(credential_path)
# api token and username for ARM
api_username = credentials.get('username')
api_token = credentials.get('token')

sail_datastream_dict = {
    # "radiosonde":"gucsondewnpnM1.b1",
    "met":"gucmetM1.b1",
    "eddy_covariance_kettle_ponds":"guc30qcecorS3.s1",
    # "wind_profiler":"guc915rwpwindconM1.a1",
    # "doppler_lidar":"gucdlprofwind4newsM1.c1",
    "laser_disdrometer_gothic":"gucldM1.b1",
    # "laser_disdrometer_mt_cb":"gucldS2.b1",
}

In [3]:
winter_22 = ('20211206','20220501')
winter_23 = ('20221129','20230507')

In [4]:
# Set the location of the data folder where this data will be stored
winter_22_folder = 'winter_21_22'
winter_23_folder = 'winter_22_23'
# change to location of data folder on your machine
storage_directory = '/storage/dlhogan/synoptic_sublimation/'
# create the sail_data folder, a folder for each winter, and a radiosonde
# subfolder in each; exist_ok=True makes this a no-op if they already exist
for winter_folder in (winter_22_folder, winter_23_folder):
    os.makedirs(os.path.join(storage_directory, 'sail_data', winter_folder, 'radiosonde'),
                exist_ok=True)

### Download winter 2022 data from SAIL
For now we will just get met, ecor, and laser disdrometer data

In [9]:
# load in the winter 22 data
sail_winter_22_folder = os.path.join(storage_directory,'sail_data',winter_22_folder)
# create empty data dictionary
w22_data_loc_dict = {}
# Iterate through the dictionary and pull the data for each datastream
for k,v in sail_datastream_dict.items():
    if (k == 'radiosonde') and (len(os.listdir(os.path.join(sail_winter_22_folder,"radiosonde"))) > 0):
        print("Radiosonde data downloaded. Data files include:")
        # list file names in the radiosonde folder
        for file in os.listdir(os.path.join(sail_winter_22_folder,"radiosonde")):
            print(file)
        print('-------------------')
    # Check if the file already exists
    elif (os.path.exists(f'{sail_winter_22_folder}/{k}_{winter_22[0]}_{winter_22[1]}.nc')): 
        print(f'{k}_{winter_22[0]}_{winter_22[1]}.nc already exists')
        print('-------------------')
        # add the filename to the dictionary which can be used if we want to load the data
        w22_data_loc_dict[k] = os.path.join(sail_winter_22_folder,f'{k}_{winter_22[0]}_{winter_22[1]}.nc')
        continue
    else:
        # explicitly download radiosonde data because they are a lot easier to process and think about when in individual files
        if k == 'radiosonde':
            discovery.download_data(
                api_username,
                api_token,
                v,
                startdate=winter_22[0],
                enddate=winter_22[1],
                output=os.path.join(sail_winter_22_folder, 'radiosonde')
            )
        else:
            ds = get_sail_data(api_username,
                        api_token,
                        v,
                        startdate=winter_22[0],
                        enddate=winter_22[1])
            ds.to_netcdf(f'{sail_winter_22_folder}/{k}_{winter_22[0]}_{winter_22[1]}.nc')
            w22_data_loc_dict[k] = os.path.join(sail_winter_22_folder,f'{k}_{winter_22[0]}_{winter_22[1]}.nc')

met_20211206_20220501.nc already exists
-------------------
eddy_covariance_kettle_ponds_20211206_20220501.nc already exists
-------------------
laser_disdrometer_gothic_20211206_20220501.nc already exists
-------------------


### Download winter 2023 data from SAIL
For now we will just get met, ecor, and laser disdrometer data

In [12]:
# load in the winter 23 data
sail_winter_23_folder = os.path.join(storage_directory,'sail_data',winter_23_folder)
# create empty data dictionary
w23_data_loc_dict = {}
# Iterate through the dictionary and pull the data for each datastream
for k,v in sail_datastream_dict.items():
    if (k == 'radiosonde') and (len(os.listdir(os.path.join(sail_winter_23_folder,"radiosonde"))) > 0):
        print("Radiosonde data downloaded. Data files include:")
        # list file names in the radiosonde folder
        for file in os.listdir(os.path.join(sail_winter_23_folder,"radiosonde")):
            print(file)
        print('-------------------')
    # Check if the file already exists
    elif (os.path.exists(f'{sail_winter_23_folder}/{k}_{winter_23[0]}_{winter_23[1]}.nc')): 
        print(f'{k}_{winter_23[0]}_{winter_23[1]}.nc already exists')
        print('-------------------')
        # add the filename to the dictionary which can be used if we want to load the data
        w23_data_loc_dict[k] = os.path.join(sail_winter_23_folder,f'{k}_{winter_23[0]}_{winter_23[1]}.nc')
        continue
    else:
        # explicitly download radiosonde data because they are a lot easier to process and think about when in individual files
        if k == 'radiosonde':
            discovery.download_data(
                api_username,
                api_token,
                v,
                startdate=winter_23[0],
                enddate=winter_23[1],
                output=os.path.join(sail_winter_23_folder, 'radiosonde')
            )
        else:
            ds = get_sail_data(api_username,
                        api_token,
                        v,
                        startdate=winter_23[0],
                        enddate=winter_23[1])
            ds.to_netcdf(f'{sail_winter_23_folder}/{k}_{winter_23[0]}_{winter_23[1]}.nc')
            w23_data_loc_dict[k] = os.path.join(sail_winter_23_folder,f'{k}_{winter_23[0]}_{winter_23[1]}.nc')

met_20221129_20230507.nc already exists
-------------------
eddy_covariance_kettle_ponds_20221129_20230507.nc already exists
-------------------
laser_disdrometer_gothic_20221129_20230507.nc already exists
-------------------


In [13]:
# load in SAIL winter data for winters 2022 and 2023
w22_sail_met = xr.open_dataset(w22_data_loc_dict['met'])
w23_sail_met = xr.open_dataset(w23_data_loc_dict['met'])

In [33]:
# let's build a function to qc all the data quickly
VARIABLES = ['atmos_pressure','temp_mean', 'rh_mean','vapor_pressure_mean','wspd_arith_mean','wdir_vec_mean']
def qc_sail_met(ds, variables):
    # first let's add qc to all the variables as a new list
    qc_variables = ['qc_'+v for v in variables]
    ds_qc = ds[variables+qc_variables]
    # now if any qc variable is not 0, we will set the variable to nan
    for v in variables:
        ds_qc[v] = ds_qc[v].where(ds_qc['qc_'+v]==0, other=np.nan)
    # now let's drop the qc variables
    ds_qc = ds_qc.drop_vars(qc_variables)
    # now let's set extreme values for certain variables
    temp_max = 30
    temp_min = -40
    rh_min = 0
    wspd_max = 40 # max in m/s
    vapor_pressure_max = 7.835 # max in kPa; a little above the saturation vapor pressure at 40C (about 7.4 kPa)
    # now let's filter out the extremes and fill with nan
    ds_qc['temp_mean'] = ds_qc['temp_mean'].where((ds_qc['temp_mean']<temp_max) & (ds_qc['temp_mean']>temp_min), other=np.nan)
    ds_qc['rh_mean'] = ds_qc['rh_mean'].where(ds_qc['rh_mean']>rh_min, other=np.nan)
    ds_qc['wspd_arith_mean'] = ds_qc['wspd_arith_mean'].where(ds_qc['wspd_arith_mean']<wspd_max, other=np.nan)
    ds_qc['vapor_pressure_mean'] = ds_qc['vapor_pressure_mean'].where(ds_qc['vapor_pressure_mean']<vapor_pressure_max, other=np.nan)
    # count how many data points are nan for each variable and print that out
    for v in variables:
        print(f'{v} has {ds_qc[v].isnull().sum().values} nan values')
    print('-------------------')
    # return the qc'd dataset
    return ds_qc
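
To make the two masking steps in `qc_sail_met` easier to follow, here is a minimal sketch of the same flag-then-range pattern on a toy pandas DataFrame. It uses pandas rather than xarray only to stay small; the values are made up and only the column names mirror the met datastream.

```python
import pandas as pd

# Toy record: a value column and its QC flag column (0 = passed all checks).
df = pd.DataFrame({
    "wspd_arith_mean": [2.1, 3.4, 55.0, 4.0],
    "qc_wspd_arith_mean": [0, 2, 0, 0],
})

WSPD_MAX = 40  # m/s, same upper bound as in the cell above

# Step 1: keep a value only where its QC flag is 0; everything else becomes NaN.
wspd = df["wspd_arith_mean"].where(df["qc_wspd_arith_mean"] == 0)
# Step 2: keep a value only where it is below the physical limit.
wspd = wspd.where(wspd < WSPD_MAX)

print(wspd.tolist())           # second and third entries are NaN
print(int(wspd.isna().sum()))  # number of masked values
```

The second entry is masked by its QC flag and the third by the range check, so both mechanisms contribute to the NaN counts printed by the function.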


In [34]:
w22_sail_met_qc_ds = qc_sail_met(w22_sail_met, VARIABLES)
w23_sail_met_qc_ds = qc_sail_met(w23_sail_met, VARIABLES)

atmos_pressure has 0 nan values
temp_mean has 0 nan values
rh_mean has 0 nan values
vapor_pressure_mean has 0 nan values
wspd_arith_mean has 8 nan values
wdir_vec_mean has 0 nan values
-------------------
atmos_pressure has 0 nan values
temp_mean has 0 nan values
rh_mean has 0 nan values
vapor_pressure_mean has 0 nan values
wspd_arith_mean has 0 nan values
wdir_vec_mean has 0 nan values
-------------------


In [57]:
# make the index the day of water year, counted from October 1 (= 0)
# the fixed offsets (274 and 91) assume a non-leap year
dowy = lambda x: x.dayofyear - 274 if x.dayofyear >= 274 else x.dayofyear + 91
# get the max wind speed for each day
w22_max_wspd = w22_sail_met_qc_ds['wspd_arith_mean'].groupby('time.date').max().to_dataframe().reset_index()
w23_max_wspd = w23_sail_met_qc_ds['wspd_arith_mean'].groupby('time.date').max().to_dataframe().reset_index()
# get the average wind speed for each day
w22_mean_wspd = w22_sail_met_qc_ds['wspd_arith_mean'].groupby('time.date').mean().to_dataframe().reset_index()
w23_mean_wspd = w23_sail_met_qc_ds['wspd_arith_mean'].groupby('time.date').mean().to_dataframe().reset_index()

# add dowy as index
w22_mean_wspd.index = pd.to_datetime(w22_mean_wspd['date']).apply(dowy)
w23_mean_wspd.index = pd.to_datetime(w23_mean_wspd['date']).apply(dowy)
# convert date column to datetime
w22_mean_wspd['date'] = pd.to_datetime(w22_mean_wspd['date'])
w23_mean_wspd['date'] = pd.to_datetime(w23_mean_wspd['date'])
# rename the index dowy
w22_mean_wspd.index.name = 'dowy'
w23_mean_wspd.index.name = 'dowy'

# add dowy as index
w22_max_wspd.index = pd.to_datetime(w22_max_wspd['date']).apply(dowy)
w23_max_wspd.index = pd.to_datetime(w23_max_wspd['date']).apply(dowy)
# convert date column to datetime
w22_max_wspd['date'] = pd.to_datetime(w22_max_wspd['date'])
w23_max_wspd['date'] = pd.to_datetime(w23_max_wspd['date'])
# rename the index dowy
w22_max_wspd.index.name = 'dowy'
w23_max_wspd.index.name = 'dowy'
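
The day-of-water-year index used above can also be computed by subtracting the most recent October 1, which avoids hard-coded day-of-year offsets. This is an alternative sketch, not the notebook's method, and `day_of_water_year` is a hypothetical helper name.

```python
import datetime as dt

def day_of_water_year(date):
    """Days elapsed since the most recent October 1 (October 1 -> 0)."""
    start_year = date.year if date.month >= 10 else date.year - 1
    return (date - dt.date(start_year, 10, 1)).days

for d in (dt.date(2021, 10, 1), dt.date(2021, 12, 6),
          dt.date(2022, 1, 1), dt.date(2022, 5, 1)):
    print(d, day_of_water_year(d))
```

For the dates in these two winters, this gives the same values as the `dowy` lambda (for example, 66 for December 6 and 212 for May 1).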

In [68]:
# plot the mean and max wind speed for each day

fig = make_subplots(rows=2, cols=1, shared_xaxes=True, vertical_spacing=0.1, subplot_titles=('Mean Daily Wind Speed','Max Daily Wind Speed'))
fig.add_trace(go.Scatter(x=w22_mean_wspd.index, 
                         y=w22_mean_wspd['wspd_arith_mean'], 
                         name='2021-2022',
                         marker_color='blue'),
                         row=1, col=1)
fig.add_trace(go.Scatter(x=w23_mean_wspd.index,
                            y=w23_mean_wspd['wspd_arith_mean'],
                            name='2022-2023',
                            marker_color='red'),
                            row=1, col=1)
fig.add_trace(go.Scatter(x=w22_max_wspd.index, 
                         y=w22_max_wspd['wspd_arith_mean'], 
                         name='2021-2022',
                         marker_color='blue',
                         showlegend=False,),
                         row=2, col=1)
fig.add_trace(go.Scatter(x=w23_max_wspd.index,
                            y=w23_max_wspd['wspd_arith_mean'],
                            name='2022-2023',
                            marker_color='red',
                            showlegend=False,),
                            row=2, col=1)
fig.update_layout(
    title = 'Mean and Max (1-minute) Daily Wind Speed from SAIL MET Station in Gothic, CO',
    xaxis2_title='Day of Water Year',
    yaxis1_title='Wind Speed (m/s)',
    yaxis2_title='Wind Speed (m/s)',
    legend_title='Winter',
    # update the ylims
    yaxis1_range=[0, 10],
    yaxis2_range=[0, 40],
    width=1000,
    height=600,
)
# update the xaxis to show months
fig.update_xaxes(
    ticktext=['Nov','Dec','Jan','Feb','Mar','Apr','May'],
    tickvals=[31,61,92,123,151,182,212],  # day of water year on the 1st of each month (non-leap year)
)
# format the hover text
fig.update_traces(hovertemplate='Day of Water Year: %{x}<br>Wind Speed: %{y:.2f} m/s')



In [73]:
# print the mean wind speed for each winter
print(f'Winter 2021-2022 mean daily wind speed: {w22_mean_wspd["wspd_arith_mean"].mean():.2f} m/s')
print(f'Winter 2022-2023 mean daily wind speed: {w23_mean_wspd["wspd_arith_mean"].mean():.2f} m/s')
# print the standard deviation of the wind speed for each winter
print(f'Winter 2021-2022 standard deviation of daily wind speed: {w22_mean_wspd["wspd_arith_mean"].std():.2f} m/s')
print(f'Winter 2022-2023 standard deviation of daily wind speed: {w23_mean_wspd["wspd_arith_mean"].std():.2f} m/s')
# print the 95th percentile of the wind speed for each winter
print(f'Winter 2021-2022 95th percentile of daily wind speed: {w22_mean_wspd["wspd_arith_mean"].quantile(0.95):.2f} m/s')
print(f'Winter 2022-2023 95th percentile of daily wind speed: {w23_mean_wspd["wspd_arith_mean"].quantile(0.95):.2f} m/s')

Winter 2021-2022 mean daily wind speed: 2.41 m/s
Winter 2022-2023 mean daily wind speed: 2.25 m/s
Winter 2021-2022 standard deviation of daily wind speed: 1.18 m/s
Winter 2022-2023 standard deviation of daily wind speed: 1.09 m/s
Winter 2021-2022 95th percentile of daily wind speed: 4.85 m/s
Winter 2022-2023 95th percentile of daily wind speed: 4.28 m/s
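
Once days above a threshold such as the 95th percentile are identified, consecutive windy days could be merged into single wind events. The sketch below shows one way to do that; the dates are hypothetical and `group_into_events` is an illustrative helper, not something defined elsewhere in this notebook.

```python
import pandas as pd

# Hypothetical dates flagged as windy (not taken from the SAIL record).
windy_dates = pd.to_datetime(
    ["2022-01-05", "2022-01-06", "2022-01-07", "2022-02-20", "2022-03-02", "2022-03-03"]
)

def group_into_events(dates):
    """Group dates into runs of consecutive calendar days."""
    s = pd.Series(dates).sort_values().reset_index(drop=True)
    # A new event starts whenever the gap to the previous date exceeds one day.
    event_id = (s.diff() > pd.Timedelta(days=1)).cumsum()
    return [(g.iloc[0], g.iloc[-1], len(g)) for _, g in s.groupby(event_id)]

events = group_into_events(windy_dates)
for start, end, n_days in events:
    print(f"{start.date()} to {end.date()}: {n_days} day(s)")
```

Here the six dates collapse into three events of 3, 1, and 2 days.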


## 2. What were the general storm characteristics? Were they related?
Again, we'll characterize these events for each winter using the SAIL met tower only and compare across the winters. This can be extended to other locations in the future. We'll plot:
- specific humidity
- temperature (line plot)
- wind speed (line plot and box plot)
- wind direction (box plots and wind rose)
- pressure
- precipitation rate (from laser disdrometer)

For the top windy days each winter:
- We'll grab the radiosondes and make those plots
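
Specific humidity is not among the met variables loaded above, so it would need to be derived. A minimal sketch follows, assuming vapor pressure and atmospheric pressure are in the same units (kPa is assumed here); the function name and the input values are illustrative only.

```python
EPSILON = 0.622  # ratio of the molar masses of water vapor and dry air

def specific_humidity(vapor_pressure, pressure):
    """Specific humidity (kg/kg) from vapor pressure and total pressure in the same units."""
    return EPSILON * vapor_pressure / (pressure - (1.0 - EPSILON) * vapor_pressure)

# Illustrative values in kPa, not measurements.
q = specific_humidity(vapor_pressure=0.3, pressure=72.0)
print(f"{q * 1000:.2f} g/kg")
```

The same expression works element-wise on arrays, so it could be applied to `vapor_pressure_mean` and `atmos_pressure` directly if their units match.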

## 3. How much sublimation over the season came from these events?
We will address this question by calculating hourly sublimation totals from SOS and SAIL over the winter period (the dates may need to be adjusted to match the period Eli used in his calculations). Then, for each of the windy days identified above, we'll get the total sublimation on those specific days.
1) First, make a timeseries plot of cumulative sublimation over the year. Add horizontal boxes that mark the days of each wind event.
2) Filter the hourly sublimation totals to just the days we want to include and sum the total.
3) Assess how well the days with the most sublimation correspond with these windy days.
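
The filtering and summing in these steps could look roughly like the sketch below. The hourly sublimation values and the windy days are placeholders chosen so the arithmetic is easy to check; they are not SOS or SAIL measurements.

```python
import pandas as pd

# Synthetic hourly sublimation (mm) over ten days; placeholder values only.
hours = pd.date_range("2022-12-01", periods=240, freq=pd.Timedelta(hours=1))
hourly_sublimation = pd.Series(0.01, index=hours)

# Pretend two days were windy and sublimated four times faster.
windy_days = pd.to_datetime(["2022-12-03", "2022-12-07"])
is_windy = hourly_sublimation.index.normalize().isin(windy_days)
hourly_sublimation.loc[is_windy] *= 4

# Cumulative series for the timeseries plot, and the share from windy days.
cumulative = hourly_sublimation.cumsum()
windy_total = hourly_sublimation[is_windy].sum()
season_total = hourly_sublimation.sum()
fraction = windy_total / season_total
print(f"windy days account for {fraction:.1%} of total sublimation")
```

With these made-up numbers, two of the ten days account for half of the total, which is the kind of disproportion this section is meant to test for in the real data.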