# 01: Introduction to xsnow and Loading Data

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Austfi/xsnowForPatrol/blob/main/notebooks/01_introduction_and_loading_data.ipynb)

This notebook introduces xsnow and shows how to load snowpack data with it.

## What You'll Learn

- What xsnow is and why it's useful
- How xsnow's 5-dimensional data model works
- How to load sample data
- How to explore dataset structure and metadata
- How to create your first visualization

> **Note**: If you're new to Python, NumPy, pandas, or xarray, check out the optional reference notebook `00_python_fundamentals_reference.ipynb` for a quick introduction to these libraries.


## Installation (For Colab Users)

If you're using Google Colab, run the cell below to install xsnow and its dependencies. If you're running locally and have already installed xsnow, you can skip this cell.

In [None]:

%pip install -q numpy pandas xarray matplotlib seaborn dask netcdf4
%pip install -q git+https://gitlab.com/avacollabra/postprocessing/xsnow
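
After installing, it can help to confirm which versions actually ended up in the environment, since compiled packages built against NumPy 1.x may fail to import under NumPy 2.x. The sketch below uses only the standard library. The helper name `installed_versions` is made up for this example, and it assumes the xsnow distribution is registered under the name `xsnow`.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package name: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# "xsnow" as the distribution name is an assumption
report = installed_versions(["numpy", "pandas", "xarray", "dask", "xsnow"])
for name, ver in report.items():
    print(f"{name:>8}: {ver or 'not installed'}")
```

If NumPy reports a 2.x version and a later import fails with a "compiled using NumPy 1.x" message, restarting the kernel or installing `numpy<2` are the usual things to try.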



## Part 1: What is xsnow?

**xsnow** is a Python library designed to make working with snowpack simulation data efficient and intuitive. It's built for data from the SNOWPACK model (and other snow models), which outputs detailed information about snow layers over time.

### Why xsnow?

- **Handles complex file formats**: SNOWPACK outputs come in specialized formats (.pro, .smet) that xsnow can parse automatically
- **Organized data structure**: Instead of juggling hundreds of separate files, xsnow organizes everything into a single, coherent dataset
- **Powerful analysis tools**: Built-in functions for common snowpack analyses (SWE, weak layers, stability indices)
- **Built on proven libraries**: Uses xarray, NumPy, and pandas under the hood, so you get their full power

### The Big Picture

Think of xsnow as a translator: it takes raw SNOWPACK output files and converts them into a format that's easy to work with in Python. Instead of manually parsing text files, you get a clean, organized dataset where you can ask questions like "Show me all weak layers on north-facing slopes after February 1st" with simple code.

**What you get with xsnow:**
- Automatic file parsing with `xsnow.read()`
- Unified data structure (xsnowDataset) for all your data
- Label-based indexing: `ds.sel(location="VIR1A", time="2024-02-01")`
- Built-in analysis functions: `ds.compute_swe()`, `ds.find_weak_layers()`
- Automatic alignment and broadcasting when combining datasets
- Rich metadata preserved from original files
- Access to full xarray ecosystem for advanced operations


## Part 2: Understanding xsnow's Data Model

xsnow organizes snowpack data using **5 key dimensions**. This might sound complex, but it's actually very logical once you understand it.


### The 5 Dimensions

1. **location**: The site or grid point (e.g., "VIR1A", "Station_1")
2. **time**: When the profile was measured/simulated
3. **slope**: Different slope aspects at the same location (north-facing, south-facing, etc.)
4. **realization**: Different model runs or scenarios (for ensemble runs)
5. **layer**: The vertical layers within the snowpack (layer 0 = surface, higher numbers = deeper)

### Why This Structure?

This structure allows you to ask powerful questions:
- "Show me density profiles for all locations on February 1st"
- "Compare north vs south-facing slopes"
- "Find weak layers across all time steps"

All of these can be answered without writing explicit loops.

### Profile-level vs Layer-level Variables

- **Profile-level**: Properties of the entire snowpack (e.g., total snow height HS). These don't vary by layer.
- **Layer-level**: Properties of individual layers (e.g., density, temperature). These vary by layer.
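
To make the distinction concrete before any real data is loaded, here is a small synthetic stand-in built with plain xarray. It is not xsnow output: the site names, sizes, and random values are invented, and only the five dimension names and the variable names `density` and `HS` come from the description above.

```python
import numpy as np
import xarray as xr

# The five dimensions described above; the sizes are arbitrary
dims = ("location", "time", "slope", "realization", "layer")
shape = (2, 3, 1, 1, 4)
rng = np.random.default_rng(0)

toy = xr.Dataset(
    data_vars={
        # Layer-level variable: includes the 'layer' dimension
        "density": (dims, rng.uniform(100.0, 400.0, size=shape)),
        # Profile-level variable: same dimensions minus 'layer'
        "HS": (dims[:-1], rng.uniform(0.5, 2.0, size=shape[:-1])),
    },
    coords={
        "location": ["site_A", "site_B"],  # hypothetical site names
        "time": np.array(
            ["2024-02-01", "2024-02-02", "2024-02-03"], dtype="datetime64[ns]"
        ),
        "layer": np.arange(4),
    },
)

# Classify variables by whether they carry the 'layer' dimension
layer_level = [name for name, var in toy.data_vars.items() if "layer" in var.dims]
profile_level = [name for name, var in toy.data_vars.items() if "layer" not in var.dims]
print("layer-level:", layer_level)
print("profile-level:", profile_level)
```

Checking for `"layer" in var.dims` is a quick way to tell the two kinds of variable apart in any xarray-based dataset.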

Let's see this in action once we load some data!


## Part 3: Loading Data with xsnow

Now let's actually load some data! xsnow can read SNOWPACK output files in `.pro` (profile) and `.smet` (meteorological) formats.

### Understanding File Formats

- **`.pro` files**: Contain time series of snow profiles with layer-by-layer data
- **`.smet` files**: Contain time series of scalar variables (temperature, precipitation, etc.) without layers
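
To give a feel for what "scalar time series without layers" looks like on disk, the sketch below parses a tiny, simplified SMET-like text with the standard library. The station name and values are invented, the helper `parse_smet_like` is hypothetical, and real `.smet` files carry more header fields than shown here. In practice xsnow does this parsing for you.

```python
import io

# Simplified, invented example in the general shape of a SMET file
SMET_TEXT = """\
SMET 1.1 ASCII
[HEADER]
station_id = DEMO
nodata = -999
fields = timestamp TA HS
[DATA]
2024-02-01T00:00 -5.2 1.10
2024-02-01T01:00 -5.6 -999
"""


def parse_smet_like(text):
    """Split the text into a header dict and a list of per-timestamp records."""
    header, rows, section = {}, [], None
    for raw in io.StringIO(text):
        line = raw.strip()
        if not line or line.startswith("SMET"):
            continue  # skip blank lines and the signature line
        if line in ("[HEADER]", "[DATA]"):
            section = line
        elif section == "[HEADER]":
            key, _, value = line.partition("=")
            header[key.strip()] = value.strip()
        elif section == "[DATA]":
            rows.append(line.split())

    fields = header["fields"].split()
    nodata = header.get("nodata")
    records = []
    for row in rows:
        record = {"timestamp": row[0]}
        for name, token in zip(fields[1:], row[1:]):
            # Map the nodata marker to None, everything else to float
            record[name] = None if token == nodata else float(token)
        records.append(record)
    return header, records


header, records = parse_smet_like(SMET_TEXT)
print(header["station_id"], len(records))
print(records[1])
```

Each record holds one value per field and timestamp, with no layer axis, which is why such variables end up as profile-level data.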

### Using xsnow Sample Data

xsnow includes built-in sample data that we can use for learning! This makes it easy to get started without needing to download or generate your own files. We'll use `xsnow.single_profile_timeseries()` to load the sample data.


In [None]:
import xsnow
import matplotlib.pyplot as plt

# Load sample data
ds = xsnow.single_profile_timeseries()





### Plot a Snow Profile!

Before covering visualization in detail, let's plot one snow profile right away to see data pulled out of an xsnow dataset and visualized.


In [1]:
# Select a single profile (first location, first time, first slope, first realization)
# ds.isel selects by integer position along each named dimension
profile = ds.isel(location=0, time=0, slope=0, realization=0)

# Get depth and temperature
# 'z' is a coordinate, so it is read from profile.coords; 'temperature' is a data variable
depth = profile.coords['z'].values  # depth below the surface: 0 at the surface, negative downward
temp = profile['temperature'].values

# Create a simple plot
fig, ax = plt.subplots(figsize=(6, 8))
ax.plot(temp, depth, 'r-', linewidth=2, label='Temperature')
ax.axvline(x=0, color='k', linestyle='--', alpha=0.3, label='Freezing Point')
ax.set_xlabel('Temperature (°C)', fontsize=12)
ax.set_ylabel('Depth from surface (m)', fontsize=12)
ax.set_title('Your First Snow Profile!', fontsize=14, fontweight='bold')
ax.grid(True, alpha=0.3)
ax.legend()
# No ax.invert_yaxis() here: z is negative downward, so the default axis already puts the surface (z = 0) at the top
plt.tight_layout()
plt.show()

print("Congratulations! You just pulled a profile out of a dataset and plotted it")




### Understanding the Dataset Structure

**Two Bites Pattern**: Let's understand datasets in two ways:

1. **Conceptually (First Bite)**: An xsnowDataset is like a smart container that organizes all your snowpack data. It knows about locations, times, layers, and keeps everything connected. Think of it as a multi-dimensional spreadsheet where each "sheet" is a different variable (density, temperature, etc.), and the rows/columns are organized by location, time, and depth.

2. **Technically (Second Bite)**: When you load data with `xsnow.read()`, or with a sample-data helper such as `xsnow.single_profile_timeseries()`, you get an **xsnowDataset**. This is a special wrapper around an xarray Dataset that's designed for snowpack data. It extends xarray's functionality with snowpack-specific methods and metadata.

### What's in an xsnowDataset?

An xsnowDataset contains several key components:

1. **Dimensions**: The axes of your data (location, time, slope, realization, layer)
   - Define the shape and size of your data
   - Example: `location: 3, time: 100, layer: 20` means 3 locations, 100 time steps, up to 20 layers

2. **Coordinates**: Labels for each dimension
   - Provide meaningful names/values for each position along a dimension
   - Example: location names like "VIR1A", time values like "2024-01-15", layer indices 0-19

3. **Data Variables**: The actual data arrays
   - Each variable (density, temperature, etc.) is stored as a DataArray
   - Can be profile-level (no layer dimension) or layer-level (has layer dimension)
   - Example: `density` has dimensions `(location, time, slope, realization, layer)`

4. **Attributes**: Metadata about variables and the dataset
   - Units, descriptions, source information
   - Example: `density.attrs['units'] = 'kg/m³'`

**Key Relationship**: `xsnowDataset` is a wrapper around `xarray.Dataset`. This means:
- You can use most xarray methods directly: `ds.mean()`, `ds.sel()`, `ds.groupby()`, etc.
- xsnow adds snowpack-specific methods and metadata
- You can access the underlying xarray Dataset if needed (though usually not necessary)
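
Because these are ordinary xarray methods, you can try them on any small xarray object. The sketch below uses a hand-made `temperature` array with invented values rather than an xsnowDataset, and shows label-based selection with `sel()` and a reduction with `mean()`.

```python
import numpy as np
import xarray as xr

# Invented values: 2 time steps x 3 layers
temperature = xr.DataArray(
    np.array([[-8.0, -6.0, -4.0], [-5.0, -3.0, -1.0]]),
    dims=("time", "layer"),
    coords={
        "time": np.array(["2024-02-01", "2024-02-02"], dtype="datetime64[ns]"),
        "layer": [0, 1, 2],
    },
    name="temperature",
)
toy = temperature.to_dataset()

# Label-based selection: pick a time step by its date, not its position
one_day = toy.sel(time="2024-02-01")

# Reduction along a named dimension: mean over all layers per time step
layer_mean = toy["temperature"].mean(dim="layer")

print(one_day["temperature"].values)
print(layer_mean.values)
```

Naming the dimension in `mean(dim="layer")` avoids having to remember which axis number holds the layers.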

Let's explore what's inside:


In [None]:
# Print the dataset structure
print("Dataset dimensions:")
for dim, size in ds.sizes.items():
    print(f"  {dim}: {size}")

print("\nCoordinates:")
for coord_name in list(ds.coords.keys())[:5]:
    coord = ds.coords[coord_name]
    print(f"  {coord_name}: shape {coord.shape}")

print("\nData variables:")
for var_name in list(ds.data_vars.keys())[:10]:
    var = ds[var_name]
    print(f"  {var_name}: dims {var.dims}")

# Show the HTML representation (xarray's built-in display)
print("\n" + "="*60)
print("HTML Representation of xsnowDataset:")
print("="*60)
print("(In Jupyter, this displays as a rich HTML table)")
print("\nThe dataset object itself:")
ds


### Inspecting Specific Variables

Let's look at individual variables to understand the difference between profile-level and layer-level data:


In [None]:
# Layer-level variable (has 'layer' dimension)
density = ds['density']
print(f"Density dimensions: {density.dims}")

# Profile-level variable (no 'layer' dimension)
hs = ds['HS']
print(f"Snow height (HS) dimensions: {hs.dims}")


### Understanding Metadata

xsnow attaches useful metadata to variables (like units, descriptions). Let's check:


In [None]:
# Check attributes (metadata) for a variable
print("Density variable attributes:")
for key, value in ds['density'].attrs.items():
    print(f"  {key}: {value}")

# Check dataset-level attributes
print("\nDataset-level attributes:")
for key, value in ds.attrs.items():
    print(f"  {key}: {value}")


### The Special 'z' Coordinate

xsnow automatically computes a depth coordinate `z` that represents depth below the snow surface:
- `z = 0` at the snow surface
- `z` is negative downward (so `z = -0.5` means 50 cm below surface)

This makes it easy to select or compare layers by their depth below the surface, regardless of the total snow height.
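
The sign convention can be illustrated with a few lines of NumPy. This is only a sketch of the convention described above, not xsnow's actual computation: the layer thicknesses are invented, and whether xsnow's `z` refers to the top, middle, or bottom of each layer is not stated here, so both boundaries are shown.

```python
import numpy as np

# Invented layer thicknesses in metres, ordered from the surface downward
thickness = np.array([0.10, 0.20, 0.20])

# Depth of each layer's lower boundary: 0 at the surface, negative downward
z_bottom = -np.cumsum(thickness)

# Depth of each layer's upper boundary (the first layer starts at z = 0)
z_top = z_bottom + thickness

# Example query: layers whose top lies within the upper 25 cm
shallow = z_top > -0.25

print("z_bottom:", z_bottom)
print("z_top:   ", z_top)
print("shallow: ", shallow)
```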


In [None]:
# Loading multiple files (for reference - we'll cover this in detail in notebook 05)
# xsnow can load and merge multiple files at once:
# ds = xsnow.read(["data/station1.pro", "data/station2.pro"])  # List of files
# ds = xsnow.read("data/")  # Entire directory
