# Collection of Data for Snowpack Layer Prediction and Analysis
### Jaymin West

### Spring, 2023

#### The purpose of this project is to build a model that predicts the current snow conditions in an area from the weather that area has seen, and then predicts the area's avalanche risk from those conditions. By “snow conditions” I mean the layers that exist within the snowpack, such as hard, frozen layers or soft, dry layers. Understanding the layers within a snowpack is essential to predicting an area's avalanche risk. Currently, avalanche risk assessment and snowpack analysis are done entirely by professional avalanche forecasters who go into the field, dig snow pits, analyze the snowpack, and create a risk assessment from what they find. The goal of this project is not to replace these highly educated individuals but to create a tool that may help those interested in snowpacks better understand the conditions they may face.

In [1]:
import importlib, os, utils, urllib
from datetime import datetime
import matplotlib.pyplot as plt
from meteostat import Stations, Daily, Point, Hourly
import pandas as pd
from sklearn import tree

### Getting Weather Data:

In [2]:
# Getting station data:
stations = Stations()
stations = stations.nearby(47.3923, -121.4001, 32000)  # Stations within 32 km (~20 mi) of Snoqualmie Pass
station = stations.fetch()

# Getting Hourly Data for the 2022-2023 season:
start = datetime(2022, 10, 1)
end = datetime(2023, 3, 31)

# Collecting hourly data from October 1st, 2022 to March 31st, 2023 and aggregating it to daily values:
weather_data = Hourly(station, start=start, end=end)
weather_data = weather_data.normalize()
weather_data = weather_data.aggregate('1D', spatial=True)  # Aggregating data over time and spatially (averaging all stations' data)
weather_data = weather_data.fetch()
# Keeping only the columns that contain data:
weather_data = weather_data[['temp', 'dwpt', 'rhum', 'prcp', 'wdir', 'wspd', 'pres', 'coco']]
weather_data = weather_data.reset_index()
# Writing it to a csv file:
weather_data.to_csv('input_data/weather_data/stevens_pass_22-23_data.csv')
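
To make the daily aggregation step concrete, the sketch below reproduces the idea with plain pandas on a few made-up hourly readings. It is an illustration only: the values are synthetic, and averaging temperature while totaling precipitation is an assumption for this example, not a description of the rules meteostat applies to each column.

```python
import pandas as pd

# Synthetic readings spanning two days (values are made up for illustration).
hourly = pd.DataFrame(
    {
        "temp": [-2.0, -1.0, 0.0, 1.0, -4.0, -3.0],
        "prcp": [0.5, 0.0, 1.5, 0.0, 2.0, 1.0],
    },
    index=pd.to_datetime(
        [
            "2022-10-01 00:00", "2022-10-01 06:00",
            "2022-10-01 12:00", "2022-10-01 18:00",
            "2022-10-02 00:00", "2022-10-02 12:00",
        ]
    ),
)

# Assumed aggregation rules: average the temperature, total the precipitation.
daily = hourly.resample("1D").agg({"temp": "mean", "prcp": "sum"})
print(daily)
```

Each output row covers one calendar day, which is the granularity the avalanche risk ratings are later merged on.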

### Scraping Avalanche Data:

In [13]:
# Calling utility function to scrape avalanche data:
date_risks = utils.scrape_avalanche_data('https://nwac.us/avalanche-forecast/#/archive', 'Snoqualmie Pass')

### Combining Weather and Avalanche Risk Data:

In [12]:
# Formatting the date_risks dataframe:
date_risks['risk'] = date_risks['risk'].replace({'no rating': -1, 'low': 0, 'moderate': 1, 'considerable': 2, 'high': 3, 'extreme': 4})
# Merging the weather data with the avalanche risk data
weather_and_risk_df = weather_data.merge(date_risks)
# Formatting the time column:
weather_and_risk_df['time'] = weather_and_risk_df['time'].apply(lambda x: datetime.strftime(x, '%Y-%m-%d'))
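
Because `merge` is called without a key, pandas joins on whatever column names the two frames happen to share. A more explicit version might look like the sketch below. The two small frames are hypothetical stand-ins, and the assumption that the scraped data has `time` and `risk` columns is mine; the actual output of `utils.scrape_avalanche_data` is not shown in this notebook.

```python
import pandas as pd

# Hypothetical stand-ins for weather_data and date_risks (made-up values;
# the column names of the scraped data are an assumption).
weather = pd.DataFrame(
    {
        "time": pd.to_datetime(["2023-01-01", "2023-01-02", "2023-01-03"]),
        "temp": [-3.1, -1.4, 0.6],
    }
)
risks = pd.DataFrame(
    {
        "time": ["2023-01-01", "2023-01-02", "2023-01-04"],
        "risk": ["moderate", "considerable", "high"],
    }
)

# Ordinal encoding of the danger ratings, same mapping as in the cell above.
RISK_LEVELS = {"no rating": -1, "low": 0, "moderate": 1,
               "considerable": 2, "high": 3, "extreme": 4}
risks["risk"] = risks["risk"].map(RISK_LEVELS)

# Give both key columns the same dtype before merging.
risks["time"] = pd.to_datetime(risks["time"])

# Name the join key and have pandas check that each date is unique on both sides.
merged = weather.merge(risks, on="time", how="inner", validate="one_to_one")
print(merged)
```

With an inner join, dates that appear in only one of the two frames are dropped, so the row count of the merged frame is worth checking against expectations.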


### Creating a model to predict the current avalanche danger

In [None]:
X = weather_and_risk_df[['temp', 'dwpt', 'rhum', 'prcp', 'wdir', 'wspd', 'pres', 'coco']] # Features
y = weather_and_risk_df['risk'] # Labels

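# Making a decision tree with sklearn, trained on all but the last 46 rows:
clf = tree.DecisionTreeClassifier()
clf = clf.fit(X[:-46], y[:-46])
# Note: this score is computed on every row, including the training rows, so it is not a held-out accuracy
print("Decision Tree Score: ", clf.score(X, y))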

Decision Tree Score:  0.7835820895522388
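
The tree above is fit on all but the last 46 rows but scored on every row, so the reported score mixes training rows with held-out rows. A minimal sketch of scoring the two parts separately is shown below. It runs on synthetic data: the feature values, the labeling rule, and the holdout size of 3 are made up for illustration and are not the project's data.

```python
from sklearn import tree

# Synthetic, time-ordered rows of [temp, prcp] (made-up values).
X = [[-5, 0], [-3, 12], [-1, 2], [0, 9], [1, 0],
     [2, 15], [-4, 7], [-2, 1], [3, 11], [4, 3]]
# Synthetic labels: 1 when precipitation exceeds 5, otherwise 0.
y = [1 if prcp > 5 else 0 for _, prcp in X]

holdout = 3  # number of most recent rows kept out of training (46 in the notebook)

X_train, y_train = X[:-holdout], y[:-holdout]
X_test, y_test = X[-holdout:], y[-holdout:]

clf = tree.DecisionTreeClassifier(random_state=0)
clf.fit(X_train, y_train)

train_score = clf.score(X_train, y_train)  # accuracy on rows the tree has seen
test_score = clf.score(X_test, y_test)     # accuracy on rows it has not seen
print("Train accuracy:", train_score)
print("Held-out accuracy:", test_score)
```

Keeping the split chronological, as the notebook already does, avoids training on days that come after the ones being predicted.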


# Snowpilot Data Collection

![Example Snowpilot snow pit layer profile](https://snowpilot.org/sites/default/files/snowpit-profiles/layers-54015.png)

In [None]:
importlib.reload(utils)

sp_data = []
for filename in os.listdir("input_data/snowpilot_data"):
    path = os.path.join("input_data/snowpilot_data", filename)
    pit = utils.snowpilot_xml_to_dict(path)  # Parsing each file only once

    timestamp = int(pit['@timestamp']) / 1000  # Converting from milliseconds to seconds
    date = datetime.fromtimestamp(timestamp).strftime('%Y-%m-%d')

    # Mapping each layer number to its [startDepth, endDepth, hardness1] values:
    layers_dict = {}
    for layer in pit['Layer']:
        layers_dict[layer['@layerNumber']] = [layer['@startDepth'], layer['@endDepth'], layer['@hardness1']]

    chart_df = pd.DataFrame.from_dict(layers_dict, orient='index', columns=['startDepth', 'endDepth', 'hardness1'])
    sp_data.append((date, chart_df))
sp_data

[('2023-01-13',
    startDepth endDepth hardness1
  1        134      105         F
  2        105       91        4F
  3         91       77        1F
  4         77       74        1F
  5         74       69        4F
  6         69       55        1F
  7         55       43        4F
  8         43       10        1F
  9         10        0        1F),
 ('2022-12-16',
     startDepth endDepth hardness1
  1         110       97         F
  2          97       96        4F
  3          96       86        4F
  4          86       85        1F
  5          85       75        1F
  6          75       49         P
  7          49       44         P
  8          44       42       1F+
  9          42       39         P
  10         39       15        1F
  11         15       12         P
  12         12        0        P-),
 ('2023-03-19',
     startDepth endDepth hardness1
  1         252      247        1F
  2         247      242         F
  3         242      239         P
  4         2
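
The `hardness1` column above holds hand-hardness codes (F, 4F, 1F, P, sometimes with a `+` or `-` modifier), which a model cannot use directly. One possible numeric encoding is sketched below. The rank values, the one-third step for modifiers, and the assumption that depths arrive as strings are illustrative choices of mine, not part of the Snowpilot format.

```python
# Ordinal ranks for the standard hand-hardness scale, softest to hardest:
# fist, four fingers, one finger, pencil, knife, ice.
HARDNESS_RANK = {"F": 1, "4F": 2, "1F": 3, "P": 4, "K": 5, "I": 6}


def hardness_to_number(code: str) -> float:
    """Convert a code such as '1F+' or 'P-' to a number (assumed encoding)."""
    code = code.strip()
    offset = 0.0
    if code.endswith("+"):
        offset, code = 1 / 3, code[:-1]
    elif code.endswith("-"):
        offset, code = -1 / 3, code[:-1]
    return HARDNESS_RANK[code] + offset


def layer_thickness(start_depth: str, end_depth: str) -> float:
    """Thickness of a layer, regardless of which depth value is larger."""
    return abs(float(start_depth) - float(end_depth))


# Three rows copied from the 2022-12-16 profile shown above.
layers = [("44", "42", "1F+"), ("42", "39", "P"), ("12", "0", "P-")]
for start, end, code in layers:
    print(layer_thickness(start, end), round(hardness_to_number(code), 2))
```

An ordinal encoding keeps the softest-to-hardest ordering, which a plain category label would lose.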

## Ideas:

- Really only need to predict the hardness of the snow layers; the graph can be made from everything else
- Snow depth can be retrieved from the Snowpilot data
- Take essentially all of the available attributes from the weather data and use a decision tree to find the most influential factors in determining layer hardness
    - Will need to look at (some) historical data for the best results here
- End result does not have to be in the same format as the Snowpilot charts
    - Could have the layers be color coded
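
The idea of using a decision tree to find the most influential weather factors could start from the `feature_importances_` attribute that scikit-learn exposes on a fitted tree. The sketch below uses synthetic data: the feature names are borrowed from the weather columns, but the values and the hardness labels are made up, with the labels deliberately depending only on temperature.

```python
from sklearn import tree

feature_names = ["temp", "prcp", "wspd"]

# Synthetic rows of [temp, prcp, wspd] (made-up values).
X = [[-6, 0, 10], [-4, 8, 12], [-1, 1, 30], [0, 9, 5],
     [2, 0, 22], [3, 12, 8], [-3, 7, 18], [1, 2, 14]]
# Synthetic hardness labels that depend only on temperature.
y = ["P" if temp <= 0 else "F" for temp, _, _ in X]

clf = tree.DecisionTreeClassifier(random_state=0).fit(X, y)

# Impurity-based importances, one per feature, normalized to sum to 1.
ranked = sorted(
    zip(feature_names, clf.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranked:
    print(f"{name}: {importance:.2f}")
```

On real data these importances can be unstable for small samples, which is one more reason to bring in the historical data mentioned above.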