# 08: Advanced xarray Techniques

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Austfi/xsnowForPatrol/blob/main/notebooks/08_advanced_xarray_techniques.ipynb)

This notebook covers advanced xarray techniques including broadcasting, alignment, groupby operations, and resampling.

## What You'll Learn

- Broadcasting and alignment: working with datasets of different shapes
- GroupBy operations: analyzing data by groups
- Resampling: time-based resampling to different frequencies
- Advanced data manipulation patterns

> **Note**: This is a reference notebook covering advanced xarray topics. The main tutorial notebooks focus on core xsnow functionality.

## Installation (For Colab Users)

If you're using Google Colab, run the cell below to install xsnow and dependencies.

In [None]:
%pip install -q numpy pandas xarray matplotlib seaborn
%pip install -q git+https://gitlab.com/avacollabra/postprocessing/xsnow


In [None]:
import xsnow
import numpy as np
import pandas as pd
import xarray as xr
import matplotlib.pyplot as plt

# Load sample data
ds = xsnow.single_profile_timeseries()
print("✅ Data loaded successfully")

%matplotlib inline


## Part 1: Broadcasting and Alignment

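xarray broadcasts by dimension name rather than by axis position, so DataArrays of different shapes combine without manual reshaping: a dimension missing from one operand is expanded to match the other.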
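
As a minimal sketch of name-based broadcasting, the snippet below uses small synthetic arrays rather than the xsnow sample data. The names `density`, `per_time_mean`, and `layer_weights` and all values are made up for illustration.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for a field with dims (time, layer); values are made up.
density = xr.DataArray(
    np.arange(12, dtype=float).reshape(3, 4),
    dims=("time", "layer"),
    coords={"time": [0, 1, 2], "layer": [0, 1, 2, 3]},
)

# One value per time step: the 'layer' dimension has been reduced away.
per_time_mean = density.mean(dim="layer")

# The subtraction expands per_time_mean along 'layer', matched by name.
centered = density - per_time_mean

# A (layer,) array combines with a (time, layer) array the same way.
layer_weights = xr.DataArray(
    [1.0, 2.0, 3.0, 4.0], dims="layer", coords={"layer": [0, 1, 2, 3]}
)
weighted = density * layer_weights

# Arrays with no shared dimension broadcast to their outer product.
outer = per_time_mean * layer_weights

print(centered.dims, centered.shape)
print(weighted.dims, weighted.shape)
print(outer.dims, outer.shape)
```

Because matching is by name, none of these operations needs `reshape` or `np.newaxis`.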

In [None]:
# Example: Center each profile on its mean density (subtract the mean over layers)
mean_density_per_profile = ds['density'].mean(dim='layer')
normalized_density = ds['density'] - mean_density_per_profile

print(f"Original density shape: {ds['density'].shape}")
print(f"Mean per profile shape: {mean_density_per_profile.shape}")
print(f"Normalized density shape: {normalized_density.shape}")
print(f"Normalized mean (should be ~0): {normalized_density.mean().values:.6f}")


### Alignment: Putting Data on the Same Grid

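In arithmetic between labeled objects, xarray aligns on coordinate values and by default keeps only the labels common to both operands (an inner join). `xr.align` exposes the same step explicitly.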
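
To illustrate what automatic alignment does during arithmetic, here is a small sketch with two synthetic daily series that overlap on only two dates. The arrays `a` and `b` and their values are illustrative and not part of xsnow.

```python
import pandas as pd
import xarray as xr

# Two synthetic daily series whose time ranges only partly overlap.
times_a = pd.date_range("2024-01-01", periods=4, freq="D")  # Jan 1 to Jan 4
times_b = pd.date_range("2024-01-03", periods=4, freq="D")  # Jan 3 to Jan 6

a = xr.DataArray([1.0, 2.0, 3.0, 4.0], dims="time", coords={"time": times_a})
b = xr.DataArray([10.0, 20.0, 30.0, 40.0], dims="time", coords={"time": times_b})

# Values are paired by timestamp, not by position, and only the
# timestamps present in both operands survive.
total = a + b

print(total.time.values)
print(total.values)
```

Only Jan 3 and Jan 4 remain in `total`, so a silently shortened result is usually a hint that the inputs did not share all of their coordinates.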

In [None]:
# Example: Manual alignment using xarray's align function
ds1 = ds.isel(location=0, time=slice(0, 10))
ds2 = ds.isel(location=0, time=slice(5, 15))

# Align them to have the same coordinates (intersection by default)
ds1_aligned, ds2_aligned = xr.align(ds1, ds2, join='inner')
print(f"After alignment: Both datasets have {len(ds1_aligned.coords['time'])} time steps")
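
The `join` argument controls which labels survive alignment. The sketch below mirrors the two overlapping ten-step windows above with synthetic series (the names `first` and `second` are illustrative) and compares the result for each join mode.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Two synthetic ten-day series offset by five days, like slices 0:10 and 5:15.
first = xr.DataArray(
    np.arange(10.0),
    dims="time",
    coords={"time": pd.date_range("2024-01-01", periods=10, freq="D")},
)
second = xr.DataArray(
    np.arange(10.0),
    dims="time",
    coords={"time": pd.date_range("2024-01-06", periods=10, freq="D")},
)

# Number of time steps after alignment for each join mode.
sizes = {}
for join in ("inner", "outer", "left", "right"):
    first_al, second_al = xr.align(first, second, join=join)
    sizes[join] = first_al.sizes["time"]

# An outer join keeps every label and fills the gaps with NaN.
first_outer, second_outer = xr.align(first, second, join="outer")
n_missing = int(first_outer.isnull().sum())

print(sizes)
print(f"NaN entries added to 'first' by the outer join: {n_missing}")
```

`inner` keeps the intersection, `outer` the union, and `left` / `right` keep the labels of the first or second argument respectively.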


## Part 2: GroupBy Operations

GroupBy follows a "split-apply-combine" strategy for analyzing data by groups.
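
The three steps can be made visible on a small synthetic series before applying them to the snow data. This is a sketch only: `values` and its contents are made up so that the group means are easy to check.

```python
import pandas as pd
import xarray as xr

# Synthetic daily series for January to March 2024; values are made up.
time = pd.date_range("2024-01-01", "2024-03-31", freq="D")
values = xr.DataArray(
    time.month.values * 100.0 + time.day.values,
    dims="time",
    coords={"time": time},
    name="density",
)

grouped = values.groupby("time.month")

# Split and apply by hand: iterating yields (label, sub-array) pairs.
manual_means = {int(label): float(group.mean()) for label, group in grouped}

# Apply and combine in one call: one value per group along a new 'month' dimension.
monthly_mean = grouped.mean()

print(manual_means)
print(monthly_mean.values)
```

Both routes give the same numbers; the one-call form is what the cells in this part use.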

In [None]:
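# Group by location and compute mean density
# dim=... reduces over every dimension in each group (including any slope or
# realization dimensions), so each location yields a single scalar
mean_density_by_location = ds['density'].groupby('location').mean(...)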
print("Mean density by location:")
for loc in mean_density_by_location.location.values:
    mean_dens = mean_density_by_location.sel(location=loc).values
    print(f"  Location {loc}: {mean_dens:.1f} kg/m³")


In [None]:
# Group by month to see seasonal patterns
# dim=... reduces over every dimension in each group, so each month yields a single scalar
mean_density_by_month = ds['density'].groupby('time.month').mean(...)
print("Mean density by month:")
for month in mean_density_by_month.month.values:
    mean_dens = mean_density_by_month.sel(month=month).values
    month_name = pd.Timestamp(2000, month, 1).strftime('%B')
    print(f"  {month_name}: {mean_dens:.1f} kg/m³")
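
A common follow-up to grouped means is to subtract each group's mean from its own members, which leaves the deviation from the monthly average. The sketch below shows this GroupBy arithmetic on a synthetic series; `values` is made up and stands in for any variable with a `time` coordinate.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic daily series for January to March 2024; values are made up.
time = pd.date_range("2024-01-01", "2024-03-31", freq="D")
values = xr.DataArray(
    time.month.values * 100.0 + time.day.values,
    dims="time",
    coords={"time": time},
    name="density",
)

# One mean per calendar month.
monthly_mean = values.groupby("time.month").mean()

# GroupBy arithmetic: each time step has the mean of its own month subtracted.
anomaly = values.groupby("time.month") - monthly_mean

# The result keeps the original time axis, and each month now averages to zero.
residual = float(np.abs(anomaly.groupby("time.month").mean()).max())
print(anomaly.sizes["time"], residual)
```

Unlike a reduction, this operation returns an array with the same length as the input.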


## Part 3: Resampling

Resample time series data to different frequencies (daily, weekly, monthly, etc.).
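
As a small check of what resampling does, the sketch below builds three days of synthetic hourly values and reduces them to one value per day. The series `hs` here is made up and independent of the xsnow data.

```python
import numpy as np
import pandas as pd
import xarray as xr

# 72 synthetic hourly samples (three full days); values are simply 0..71.
time = pd.date_range("2024-01-01", periods=72, freq=pd.Timedelta(hours=1))
hs = xr.DataArray(np.arange(72.0), dims="time", coords={"time": time}, name="HS")

# Each daily bin collects the 24 hourly samples that fall inside it.
daily_mean = hs.resample(time="1D").mean()
daily_max = hs.resample(time="1D").max()

print(len(hs.time), "->", len(daily_mean.time))
print(daily_mean.values)
print(daily_max.values)
```

The frequency string follows pandas offset aliases, so `'1D'` can be swapped for other intervals supported by pandas.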

In [None]:
# Get snow height time series
hs_series = ds['HS'].isel(location=0, slope=0, realization=0)

# Resample to daily averages
hs_daily = hs_series.resample(time='1D').mean()
print(f"Original time steps: {len(hs_series.time)}")
print(f"After daily resampling: {len(hs_daily.time)}")


In [None]:
# Resample with different aggregation methods
hs_daily_mean = hs_series.resample(time='1D').mean()
hs_daily_max = hs_series.resample(time='1D').max()
hs_daily_min = hs_series.resample(time='1D').min()

print("Daily resampling with different methods (first day shown):")
print(f"  Mean: {hs_daily_mean.values[0]:.3f} m")
print(f"  Max:  {hs_daily_max.values[0]:.3f} m")
print(f"  Min:  {hs_daily_min.values[0]:.3f} m")
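
When several statistics are needed together, one option is to reuse a single resampler and collect the results in a Dataset. This is a sketch on synthetic data; the variable names `hs_mean`, `hs_max`, and `hs_min` are illustrative choices, not xsnow conventions.

```python
import numpy as np
import pandas as pd
import xarray as xr

# 48 synthetic hourly samples (two full days); values are simply 0..47.
time = pd.date_range("2024-01-01", periods=48, freq=pd.Timedelta(hours=1))
hs = xr.DataArray(np.arange(48.0), dims="time", coords={"time": time}, name="HS")

# Build the resampler once and apply several reductions to it.
resampler = hs.resample(time="1D")
daily_stats = xr.Dataset(
    {
        "hs_mean": resampler.mean(),
        "hs_max": resampler.max(),
        "hs_min": resampler.min(),
    }
)

# Derived quantities follow from ordinary arithmetic on the Dataset variables.
daily_range = daily_stats["hs_max"] - daily_stats["hs_min"]

print(daily_stats)
print(daily_range.values)
```

Keeping the statistics in one Dataset means they share a single daily time coordinate and can be selected or plotted together.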


## Summary

✅ **What we learned:**

1. **Broadcasting**: Automatic shape matching in xarray operations
2. **Alignment**: Matching coordinates between datasets
3. **GroupBy**: Split-apply-combine pattern for grouped analysis
4. **Resampling**: Time-based aggregation to different frequencies

## Key Techniques

- **Broadcasting**: Operations between arrays of different shapes
- **`xr.align()`**: Align datasets to matching coordinates
- **`.groupby()`**: Group data and apply operations
- **`.resample()`**: Resample time series to different frequencies

## Next Steps

Return to the main tutorial notebooks to continue learning xsnow.