# 01: Introduction to xsnow and Loading Data

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Austfi/xsnowForPatrol/blob/main/notebooks/01_introduction_and_loading_data.ipynb)

This notebook introduces you to xsnow, the Python libraries it builds on, and how to load snowpack data.

## What You'll Learn

- What xsnow is and why it's useful
- Python fundamentals: NumPy, pandas, and xarray basics
- Understanding xsnow's 5-dimensional data model
- How to load .pro and .smet files
- Exploring dataset structure and metadata


## Installation (For Colab Users)

If you're using Google Colab, run the cell below to install xsnow and dependencies. If you're running locally and have already installed xsnow, you can skip this cell.

In [None]:

%pip install -q numpy pandas xarray matplotlib seaborn dask netcdf4
%pip install -q git+https://gitlab.com/avacollabra/postprocessing/xsnow



## Part 1: What is xsnow?

**xsnow** is a Python library designed to make working with snowpack simulation data efficient and intuitive. It's built specifically for data from the SNOWPACK model (and other snow models), which outputs detailed information about snow layers over time.

### Why xsnow?

- **Handles complex file formats**: SNOWPACK outputs come in specialized formats (.pro, .smet) that xsnow can parse automatically
- **Organized data structure**: Instead of juggling hundreds of separate files, xsnow organizes everything into a single, coherent dataset
- **Powerful analysis tools**: Built-in functions for common snowpack analyses (SWE, weak layers, stability indices)
- **Built on proven libraries**: Uses xarray, NumPy, and pandas under the hood, so you get their full power

### The Big Picture

Think of xsnow as a translator: it takes raw SNOWPACK output files and converts them into a format that's easy to work with in Python. Instead of manually parsing text files, you get a clean, organized dataset where you can ask questions like "Show me all weak layers on north-facing slopes after February 1st" with simple code.


## Part 2: Python Fundamentals for xsnow

xsnow builds on several important Python libraries. Let's get familiar with the basics you'll need.

### NumPy: Working with Arrays

NumPy provides arrays (like lists, but faster and more powerful) for numerical computing.


In [None]:
import numpy as np

# Create a simple array
temperatures = np.array([-5, -3, -1, 0, -2])

# Arrays can be multi-dimensional
# Here: 3 layers (rows), each with 3 temperature readings (columns)
layer_temps = np.array([[-5, -3, -1],  # Layer 0 (surface)
                        [-3, -2, -1],  # Layer 1
                        [-2, -1, 0]])  # Layer 2
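
To see why arrays are convenient, the short sketch below inspects the shape of `layer_temps` and averages it along each axis. It redefines the array so it runs on its own. Reading rows as layers and columns as time steps is an assumption made for illustration.

```python
import numpy as np

# Rows are layers, columns are time steps (an illustrative assumption)
layer_temps = np.array([[-5, -3, -1],   # Layer 0 (surface)
                        [-3, -2, -1],   # Layer 1
                        [-2, -1, 0]])   # Layer 2

print(layer_temps.shape)  # (number of layers, number of time steps)

# Mean over time for each layer (collapse axis 1)
mean_per_layer = layer_temps.mean(axis=1)

# Mean over layers for each time step (collapse axis 0)
mean_per_time = layer_temps.mean(axis=0)

# Boolean mask: which values are at or above freezing?
at_or_above_freezing = layer_temps >= 0

print(mean_per_layer)
print(mean_per_time)
print(at_or_above_freezing.sum())  # number of values >= 0
```

No loops are needed: NumPy applies each operation to the whole array at once.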


### Pandas: Working with Tables

Pandas is great for tabular data (like spreadsheets). While xsnow uses xarray (which is more powerful for multi-dimensional data), understanding pandas helps.


In [None]:
import pandas as pd

# Create a simple table (DataFrame)
data = {
    'time': ['2024-01-01', '2024-01-02', '2024-01-03'],
    'snow_depth': [50, 55, 60],  # cm
    'temperature': [-5, -3, -1]   # °C
}
df = pd.DataFrame(data)
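
As a small follow-up, the sketch below rebuilds the same table, converts the `time` strings to real timestamps, and filters rows. The 52 cm threshold is arbitrary and only there for illustration.

```python
import pandas as pd

data = {
    'time': ['2024-01-01', '2024-01-02', '2024-01-03'],
    'snow_depth': [50, 55, 60],   # cm
    'temperature': [-5, -3, -1],  # °C
}
df = pd.DataFrame(data)

# Convert strings to timestamps and use them as the index
df['time'] = pd.to_datetime(df['time'])
df = df.set_index('time')

# Select rows by condition (arbitrary threshold for illustration)
deep = df[df['snow_depth'] > 52]

# Day-to-day change in snow depth (first row has no predecessor, so NaN)
df['depth_change'] = df['snow_depth'].diff()

print(deep)
print(df)
```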


### xarray: Multi-Dimensional Labeled Arrays

**xarray is the foundation of xsnow!** It's like pandas, but for multi-dimensional data. That makes it a good fit for snowpack data, which has:
- Multiple locations
- Multiple time steps
- Multiple layers
- Multiple variables (density, temperature, etc.)

Let's see a simple example:


In [None]:
import xarray as xr

# Create a simple xarray DataArray (like a NumPy array with labels)
# Let's say we have temperature data for 2 locations, 3 time steps, 2 layers
temps = np.array([[[-5, -3],   # Location 0, Time 0, Layers [0, 1]
                   [-4, -2],   # Location 0, Time 1
                   [-3, -1]],  # Location 0, Time 2
                  [[-6, -4],   # Location 1, Time 0
                   [-5, -3],   # Location 1, Time 1
                   [-4, -2]]]) # Location 1, Time 2

# Create labeled dimensions
da = xr.DataArray(
    temps,
    dims=['location', 'time', 'layer'],
    coords={
        'location': ['Station_A', 'Station_B'],
        'time': pd.date_range('2024-01-01', periods=3, freq='D'),
        'layer': [0, 1]  # Layer 0 = surface, Layer 1 = deeper
    },
    name='temperature'
)



**Key xarray Concepts:**
- **Dimensions**: The axes of your data (location, time, layer)
- **Coordinates**: Labels for each dimension (e.g., station names, dates)
- **DataArray**: A single variable with dimensions (like our temperature above)
- **Dataset**: A collection of DataArrays that share dimensions

xsnow uses xarray's Dataset structure to organize all snowpack variables together!
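
The concepts above can be sketched in a few lines: label-based selection, a reduction over a named dimension, and a Dataset that groups two variables. The code rebuilds the temperature array so it runs on its own. The density values are made up purely for illustration.

```python
import numpy as np
import pandas as pd
import xarray as xr

temps = np.array([[[-5, -3], [-4, -2], [-3, -1]],
                  [[-6, -4], [-5, -3], [-4, -2]]])

da = xr.DataArray(
    temps,
    dims=['location', 'time', 'layer'],
    coords={
        'location': ['Station_A', 'Station_B'],
        'time': pd.date_range('2024-01-01', periods=3, freq='D'),
        'layer': [0, 1],
    },
    name='temperature',
)

# Select by label instead of by integer position
station_a_surface = da.sel(location='Station_A', layer=0)

# Reduce over a named dimension: mean over time for each location and layer
time_mean = da.mean(dim='time')

# A Dataset groups DataArrays that share dimensions.
# The constant density of 250.0 is a made-up placeholder.
density = xr.full_like(da, 250.0, dtype=float).rename('density')
ds_demo = xr.Dataset({'temperature': da, 'density': density})

print(station_a_surface.values)
print(time_mean)
print(list(ds_demo.data_vars))
```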


## Part 3: Understanding xsnow's Data Model

xsnow organizes snowpack data using **5 key dimensions**. This might sound complex, but it's actually very logical once you understand it.

### The 5 Dimensions

1. **location**: The site or grid point (e.g., "VIR1A", "Station_1")
2. **time**: When the profile was measured/simulated
3. **slope**: Different slope aspects at the same location (north-facing, south-facing, etc.)
4. **realization**: Different model runs or scenarios (for ensemble runs)
5. **layer**: The vertical layers within the snowpack (layer 0 = surface, higher numbers = deeper)

### Why This Structure?

This structure allows you to ask powerful questions:
- "Show me density profiles for all locations on February 1st"
- "Compare north vs south-facing slopes"
- "Find weak layers across all time steps"
- All without writing loops!
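
To make the loop-free style concrete, the sketch below builds a small synthetic array with the five dimension names listed above and answers these questions with plain xarray calls. The coordinate labels (such as `'north'` and `'south'`), the random density values, and the 150 kg/m³ threshold are all invented for illustration. A real xsnow dataset may label slopes differently.

```python
import numpy as np
import pandas as pd
import xarray as xr

rng = np.random.default_rng(0)
dims = ['location', 'time', 'slope', 'realization', 'layer']
coords = {
    'location': ['Station_A', 'Station_B'],
    'time': pd.date_range('2024-01-30', periods=4, freq='D'),
    'slope': ['north', 'south'],   # hypothetical labels
    'realization': [0],
    'layer': [0, 1, 2],
}
shape = tuple(len(coords[d]) for d in dims)
density = xr.DataArray(
    rng.uniform(100, 400, size=shape),  # synthetic values in kg/m^3
    dims=dims, coords=coords, name='density',
)

# "Density profiles for all locations on February 1st"
feb1 = density.sel(time=pd.Timestamp('2024-02-01'))

# "Compare north vs south-facing slopes": mean difference per location
diff = (density.sel(slope='north') - density.sel(slope='south')).mean(
    dim=['time', 'realization', 'layer'])

# "Find low-density layers across all time steps": count per time step
n_low = (density < 150).sum(
    dim=['location', 'slope', 'realization', 'layer'])

print(feb1.dims)
print(diff.values)
print(n_low.values)
```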

### Profile-level vs Layer-level Variables

- **Profile-level**: Properties of the entire snowpack (e.g., total snow height HS). These don't vary by layer.
- **Layer-level**: Properties of individual layers (e.g., density, temperature). These vary by layer.
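
Before loading real data, the distinction can be sketched on a tiny synthetic xarray Dataset: a variable is layer-level exactly when `'layer'` appears in its dimensions. The values and the reduced set of dimensions (only location, time, and layer) are simplifications for illustration.

```python
import numpy as np
import pandas as pd
import xarray as xr

times = pd.date_range('2024-01-01', periods=2, freq='D')

ds_demo = xr.Dataset(
    data_vars={
        # Profile-level: one value per (location, time)
        'HS': (('location', 'time'), np.array([[0.50, 0.55]])),
        # Layer-level: one value per (location, time, layer)
        'density': (('location', 'time', 'layer'),
                    np.array([[[120.0, 250.0, 310.0],
                               [130.0, 255.0, 312.0]]])),
    },
    coords={'location': ['Station_A'], 'time': times, 'layer': [0, 1, 2]},
)

# Classify variables by whether they carry the 'layer' dimension
layer_level = [name for name, var in ds_demo.data_vars.items()
               if 'layer' in var.dims]
profile_level = [name for name, var in ds_demo.data_vars.items()
                 if 'layer' not in var.dims]

print('Layer-level:', layer_level)
print('Profile-level:', profile_level)
```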

Let's see this in action once we load some data!


## Part 4: Loading Data with xsnow

Now let's actually load some data! xsnow can read SNOWPACK output files in `.pro` (profile) and `.smet` (meteorological) formats.

### Understanding File Formats

- **`.pro` files**: Contain time series of snow profiles with layer-by-layer data
- **`.smet` files**: Contain time series of scalar variables (temperature, precipitation, etc.) without layers
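
To give a feel for what a `.smet` file contains, here is a minimal sketch that parses a hand-written sample with pandas. It assumes the commonly documented SMET layout: a signature line, a `[HEADER]` section of `key = value` pairs including `fields` and `nodata`, then a `[DATA]` section of whitespace-separated rows. The sample content is invented and the parser ignores most of the format. In practice xsnow reads these files for you, and this is not how xsnow does it internally.

```python
import io
import pandas as pd

# Hand-written sample, not real SNOWPACK output
smet_text = """SMET 1.1 ASCII
[HEADER]
station_id = DEMO
nodata = -999
fields = timestamp TA HS
[DATA]
2024-01-01T00:00:00 268.15 0.50
2024-01-01T01:00:00 267.95 -999
2024-01-01T02:00:00 267.65 0.52
"""

header_part, data_part = smet_text.split('[DATA]\n')

# Header lines have the form "key = value"
header = {}
for line in header_part.splitlines():
    if '=' in line:
        key, value = line.split('=', 1)
        header[key.strip()] = value.strip()

fields = header['fields'].split()

# Data rows are whitespace-separated; the nodata marker becomes NaN
df = pd.read_csv(
    io.StringIO(data_part),
    sep=r'\s+',
    names=fields,
    na_values=[header['nodata']],
    parse_dates=['timestamp'],
    index_col='timestamp',
)

print(header['station_id'])
print(df)
```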

### Using xsnow Sample Data

xsnow includes built-in sample data that we can use for learning, so you can get started without downloading or generating your own files. We'll call `xsnow.single_profile_timeseries()` to load a sample dataset.




In [None]:
import xsnow
import matplotlib.pyplot as plt

# Load sample data
print("Loading xsnow sample data...")
try:
    ds = xsnow.single_profile_timeseries()
    print("✅ Data loaded successfully")
    print("Dataset summary:")
    print(ds)
    print(f"\nDataset dimensions: {dict(ds.dims)}")
except Exception as e:
    print(f"⚠️ Error loading sample data: {e}")
    print("Make sure xsnow is properly installed:")
    print("  pip install git+https://gitlab.com/avacollabra/postprocessing/xsnow")
    raise


## 🎯 Quick Win: Your First Snow Profile Plot!

**Win-Day-One Pattern**: Let's create an impressive visualization right away! This gives you immediate satisfaction and shows what xsnow can do.

Even though we haven't covered visualization in detail yet, let's plot your first snow profile to see the data come alive:


In [None]:
# Quick Win: Plot your first snow profile!
# Select a single profile (first location, first time)
profile = ds.isel(location=0, time=0, slope=0, realization=0)

# Get depth and temperature
depth = -profile.coords['z'].values  # Convert to positive depth
temp = profile['temperature'].values

# Create a simple plot
fig, ax = plt.subplots(figsize=(6, 8))
ax.plot(temp, depth, 'r-', linewidth=2, label='Temperature')
ax.axvline(x=0, color='k', linestyle='--', alpha=0.3, label='Freezing Point')
ax.set_xlabel('Temperature (°C)', fontsize=12)
ax.set_ylabel('Depth from surface (m)', fontsize=12)
ax.set_title('Your First Snow Profile!', fontsize=14, fontweight='bold')
ax.grid(True, alpha=0.3)
ax.legend()
ax.invert_yaxis()  # Surface at top
plt.tight_layout()
plt.show()

print("🎉 Congratulations! You just created your first snow profile visualization!")


### Understanding the Dataset Structure

**Two Bites Pattern**: Let's understand datasets in two ways:

1. **Conceptually (First Bite)**: An xsnowDataset is like a smart container that organizes all your snowpack data. It knows about locations, times, layers, and keeps everything connected. Think of it as a multi-dimensional spreadsheet where each "sheet" is a different variable (density, temperature, etc.), and the rows/columns are organized by location, time, and depth.

2. **Technically (Second Bite)**: When you load data with `xsnow.read()`, you get an **xsnowDataset**. This is a special wrapper around an xarray Dataset that's designed for snowpack data. It extends xarray's functionality with snowpack-specific methods and metadata.

Let's explore what's inside:


In [None]:
# Print the dataset structure
print("Dataset dimensions:")
for dim, size in ds.dims.items():
    print(f"  {dim}: {size}")

print("\nCoordinates:")
for coord_name in list(ds.coords.keys())[:5]:
    coord = ds.coords[coord_name]
    print(f"  {coord_name}: shape {coord.shape}")

print("\nData variables:")
for var_name in list(ds.data_vars.keys())[:10]:
    var = ds[var_name]
    print(f"  {var_name}: dims {var.dims}")


### Inspecting Specific Variables

Let's look at individual variables to understand the difference between profile-level and layer-level data:


In [None]:
# Layer-level variable (has 'layer' dimension)
density = ds['density']
print(f"Density dimensions: {density.dims}")

# Profile-level variable (no 'layer' dimension)
hs = ds['HS']
print(f"Snow height (HS) dimensions: {hs.dims}")


### Understanding Metadata

xsnow attaches useful metadata to variables (like units, descriptions). Let's check:


In [None]:
# Check attributes (metadata) for a variable
print("Density variable attributes:")
for key, value in ds['density'].attrs.items():
    print(f"  {key}: {value}")

# Check dataset-level attributes
print("\nDataset-level attributes:")
for key, value in ds.attrs.items():
    print(f"  {key}: {value}")
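
Because `.attrs` is an ordinary dictionary in xarray, a small helper can collect units for every variable at once. The sketch below uses a synthetic Dataset. The attribute keys `units` and `long_name` follow a common convention, and whether xsnow uses exactly these keys is an assumption. `describe_units` is a hypothetical helper, not part of xsnow.

```python
import numpy as np
import xarray as xr

density = xr.DataArray(
    np.array([120.0, 250.0, 310.0]),
    dims=['layer'],
    name='density',
    attrs={'units': 'kg m-3', 'long_name': 'Snow density'},  # illustrative
)
# No attributes on purpose, to show the fallback
temperature = xr.DataArray(np.array([-5.0, -3.0, -1.0]), dims=['layer'],
                           name='temperature')
ds_demo = xr.Dataset({'density': density, 'temperature': temperature})


def describe_units(dataset):
    """Map each variable name to its 'units' attribute, if present."""
    return {name: var.attrs.get('units', 'unknown')
            for name, var in dataset.data_vars.items()}


units = describe_units(ds_demo)
print(units)
```

Using `.get()` with a fallback avoids a `KeyError` when a variable carries no units.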


### Loading Multiple Files

xsnow can load and merge multiple files at once:


In [None]:
# Example: Loading multiple files
# You can load multiple files like this:

# List of files
# ds = xsnow.read(["data/station1.pro", "data/station2.pro"])

# Entire directory
# ds = xsnow.read("data/")

# xsnow will automatically:
# - Merge data from different files
# - Align them by location and time
# - Combine profile and meteorological data
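
The idea of aligning stations by location and time can be illustrated with plain xarray. The sketch below builds two synthetic one-station datasets with different time ranges and stacks them along `location`. The helper `make_station` and all values are hypothetical. This shows the concept only, not how xsnow merges files internally.

```python
import numpy as np
import pandas as pd
import xarray as xr


def make_station(name, start, values):
    """Build a tiny one-station Dataset with synthetic snow heights."""
    times = pd.date_range(start, periods=len(values), freq='D')
    return xr.Dataset(
        {'HS': (('location', 'time'), np.array([values], dtype=float))},
        coords={'location': [name], 'time': times},
    )


station1 = make_station('station1', '2024-01-01', [0.50, 0.55, 0.60])
station2 = make_station('station2', '2024-01-02', [0.40, 0.45, 0.47])

# Stack along 'location'; join='outer' keeps the union of time stamps
# and fills gaps with NaN
combined = xr.concat([station1, station2], dim='location', join='outer')

print(combined['HS'].to_pandas())
```

Time steps that exist for only one station show up as `NaN` for the other, so nothing is silently dropped.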


### The Special 'z' Coordinate

xsnow automatically computes a depth coordinate `z` that represents depth below the snow surface:
- `z = 0` at the snow surface
- `z` is negative downward (so `z = -0.5` means 50 cm below surface)

This is very useful for analysis!


In [None]:
z = ds.coords['z']
print(z)
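
To build intuition for such a coordinate, the sketch below derives depths from synthetic layer thicknesses with NumPy. It assumes layers are ordered from the surface downward. Whether xsnow's `z` refers to the top, middle, or bottom of a layer is not stated here, so both interfaces are computed.

```python
import numpy as np

# Synthetic layer thicknesses in metres, ordered from the surface downward
thickness = np.array([0.10, 0.25, 0.15])

# Depth of each layer's bottom interface, negative downward
z_bottom = -np.cumsum(thickness)

# Depth of each layer's top interface: the first layer starts at the surface
z_top = np.concatenate(([0.0], z_bottom[:-1]))

# Total snow height is the magnitude of the deepest interface
total_height = -z_bottom[-1]

print(z_top)
print(z_bottom)
print(total_height)
```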


## Summary

✅ **What we learned:**

1. **xsnow** is a Python library for working with snowpack simulation data
2. **Python fundamentals**: NumPy (arrays), pandas (tables), xarray (multi-dimensional labeled arrays)
3. **xsnow's data model**: 5 dimensions (location, time, slope, realization, layer)
4. **Loading data**: Use `xsnow.read()` to load .pro and .smet files
5. **Dataset structure**: xsnowDataset contains dimensions, coordinates, and data variables
6. **Two types of variables**: Profile-level (like HS) and layer-level (like density)
7. **Metadata**: Check `.attrs` for units and descriptions

## Key Concepts to Remember

- **xsnowDataset** = wrapper around xarray Dataset, specialized for snowpack data
- **Dimensions** = the axes of your data (location, time, slope, realization, layer)
- **Coordinates** = labels for dimensions (station names, dates, etc.)
- **Profile-level** = one value per profile (no layer dimension)
- **Layer-level** = one value per layer per profile (has layer dimension)

## Next Steps

Ready to start working with the data? Move on to:
- **02_basic_operations_and_analysis.ipynb**: Learn how to select, filter, and analyze your data

## Exercises (Try These!)

1. Load a sample .pro file and print its dimensions
2. List all the data variables in your dataset
3. Check the units for density and temperature
4. Find the time range of your data
5. Count how many layers the deepest profile has
