### Version compatibility check

This notebook compares the xsnow package installed in your environment with the documentation version it was written for. The helper below calls `scripts/check_docs_version.py` so you can confirm that the package and docs align before continuing.


In [None]:
from __future__ import annotations

import subprocess
import sys
from pathlib import Path
import warnings


def _find_script() -> Path | None:
    current = Path.cwd().resolve()
    for candidate in [current, *current.parents]:
        script = candidate / "scripts" / "check_docs_version.py"
        if script.exists():
            return script
    return None


def get_docs_version() -> tuple[str | None, str | None]:
    script_path = _find_script()
    if script_path is None:
        return None, "scripts/check_docs_version.py was not found"
    try:
        completed = subprocess.run(
            [sys.executable, str(script_path)],
            check=True,
            capture_output=True,
            text=True,
        )
    except subprocess.CalledProcessError as exc:
        output = (exc.stdout or "") + (exc.stderr or "")
        return None, output.strip() or str(exc)
    return completed.stdout.strip() or None, None


docs_version, docs_error = get_docs_version()

try:
    import xsnow
    package_version = xsnow.__version__
except Exception as exc:  # pylint: disable=broad-except
    xsnow = None  # type: ignore[assignment]
    package_version = None
    package_error = str(exc)
else:
    package_error = None

print(f"xsnow package version: {package_version if package_version else 'not installed'}")
if package_error and not package_version:
    print(f"Import error: {package_error}")

if docs_version:
    print(f"xsnow docs version: {docs_version}")
else:
    message = "xsnow docs version: unavailable"
    if docs_error:
        message += f" ({docs_error})"
    print(message)

if docs_version and package_version and docs_version != package_version:
    warnings.warn(
        "xsnow package version differs from the documentation version. "
        "Consider aligning them before executing the notebook.",
        stacklevel=2,
    )

# 05: Working with Custom Data

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Austfi/xsnowForPatrol/blob/main/notebooks/05_working_with_custom_data.ipynb)

This notebook shows you how to prepare and load your own SNOWPACK output files into xsnow.

## What You'll Learn

- Preparing your own .pro and .smet files
- File format requirements
- Loading custom data
- Troubleshooting common issues
- Merging multiple data sources
- Data validation


## Installation (For Colab Users)

Set `INSTALL_XSNOW = True` in the next cell if you need to install xsnow. When enabled you can pick `INSTALL_METHOD = "pip"` to install published packages or `INSTALL_METHOD = "dev"` to work from a local clone. The cell also installs the supporting scientific Python stack used throughout the course.


In [None]:
import subprocess
import sys
from pathlib import Path

INSTALL_XSNOW = False  # Set to True to install or update xsnow in this environment.
INSTALL_METHOD = "pip"  # Choose "pip" for a package install, or "dev" for a developer clone.
DEV_REPO_URL = "https://gitlab.com/avacollabra/postprocessing/xsnow.git"
DEV_CLONE_DIR = Path.home() / "xsnow-dev"


def _run(cmd: list[str]) -> None:
    print(f"$ {' '.join(cmd)}")
    subprocess.check_call(cmd)


try:
    import xsnow
    print(f"xsnow {xsnow.__version__} is already available.")
except Exception as exc:  # pylint: disable=broad-except
    xsnow = None  # type: ignore[assignment]
    print(f"xsnow is not currently available: {exc}")
    if not INSTALL_XSNOW:
        print("Set INSTALL_XSNOW = True and re-run this cell to install xsnow (pip or dev clone).")
    else:
        try:
            if INSTALL_METHOD == "pip":
                _run([sys.executable, "-m", "pip", "install", "--quiet", "numpy", "pandas", "xarray", "matplotlib", "seaborn", "dask", "netcdf4"])
                _run([sys.executable, "-m", "pip", "install", "--quiet", "git+https://gitlab.com/avacollabra/postprocessing/xsnow"])
            elif INSTALL_METHOD == "dev":
                if not DEV_CLONE_DIR.exists():
                    _run(["git", "clone", DEV_REPO_URL, str(DEV_CLONE_DIR)])
                _run([sys.executable, "-m", "pip", "install", "--quiet", "-e", str(DEV_CLONE_DIR)])
            else:
                raise ValueError(f"Unsupported INSTALL_METHOD: {INSTALL_METHOD}")
        except subprocess.CalledProcessError as install_error:
            raise RuntimeError("xsnow installation command failed") from install_error
        import xsnow  # noqa: F401  # pylint: disable=import-outside-toplevel
        print(f"xsnow {xsnow.__version__} installed successfully.")
else:
    INSTALL_XSNOW = INSTALL_XSNOW  # no-op so variable is defined for later cells

In [None]:
import xsnow
import os
import glob



In [None]:
# Example: Explore xsnow sample data
import xsnow

print("xsnow provides two main sample datasets:")
print()

# Example 1: Single profile
print("1. Single profile (one snapshot):")
try:
    ds_single = xsnow.single_profile()
    print(f"   ✅ Loaded! Dimensions: {dict(ds_single.dims)}")
except Exception as e:
    print(f"   ❌ Error: {e}")

print()
# Example 2: Time series
print("2. Time series (multiple snapshots over time):")
try:
    ds_timeseries = xsnow.single_profile_timeseries()
    print(f"   ✅ Loaded! Dimensions: {dict(ds_timeseries.dims)}")
except Exception as e:
    print(f"   ❌ Error: {e}")


## Part 1: File Format Requirements

xsnow can read SNOWPACK output files in these formats:

### .pro Files (Profile Time Series)

- **Format**: SNOWPACK profile format (legacy)
- **Contains**: Time series of snow profiles with layer-by-layer data
- **Required**: Header with station metadata, profile data blocks
- **Generated by**: SNOWPACK when `PROF_FORMAT = PRO` in .ini file
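
To make the "profile data blocks" idea concrete, here is a minimal sketch that counts the profiles in .pro-style text. It assumes the legacy layout in which the file is split into `[STATION_PARAMETERS]`, `[HEADER]`, and `[DATA]` sections, and that each profile in the data section starts with a line whose code is `0500` (the timestamp). The helper `count_pro_profiles` and the shortened sample text are hypothetical, so compare them with a file from your own SNOWPACK run before relying on them.

```python
def count_pro_profiles(lines):
    """Count profile blocks in the [DATA] section of .pro-style text.

    Assumption: every profile starts with a '0500,<timestamp>' line.
    """
    in_data = False
    count = 0
    for raw in lines:
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            # Section marker such as [HEADER] or [DATA]
            in_data = line.upper() == "[DATA]"
            continue
        if in_data and line.startswith("0500,"):
            count += 1
    return count


# Synthetic, heavily shortened sample (not real model output)
sample_pro = """[STATION_PARAMETERS]
StationName= Example
[HEADER]
0500,Date
0501,nElems,height
[DATA]
0500,01.12.2023 00:00:00
0501,2,10.0,20.0
0500,01.12.2023 01:00:00
0501,2,10.0,21.0
"""

n_profiles = count_pro_profiles(sample_pro.splitlines())
print(f"Profiles found: {n_profiles}")
```

The `0500,Date` line in the header section is deliberately not counted, because it only describes the code and is not a profile.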

### .smet Files (Meteorological Time Series)

- **Format**: SMET (MeteoIO format)
- **Contains**: Time series of scalar variables (no layers)
- **Required**: SMET header with field descriptions, time series data
- **Generated by**: SNOWPACK or MeteoIO for meteorological data
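
The SMET header can be inspected with a few lines of plain Python. The sketch below assumes the usual SMET layout: a signature line, a `[HEADER]` section of `key = value` pairs (including a `fields` entry that names the columns), and a `[DATA]` section. The helper `parse_smet_header` and the sample text are illustrative only and are not part of xsnow.

```python
def parse_smet_header(lines):
    """Return the key/value pairs found between [HEADER] and [DATA]."""
    header = {}
    in_header = False
    for raw in lines:
        line = raw.strip()
        if line.upper() == "[HEADER]":
            in_header = True
        elif line.upper() == "[DATA]":
            break  # the header ends where the data section starts
        elif in_header and "=" in line:
            key, value = line.split("=", 1)
            header[key.strip()] = value.strip()
    return header


# Synthetic sample (made-up station and values)
sample_smet = """SMET 1.1 ASCII
[HEADER]
station_id = EX1
fields = timestamp TA RH
nodata = -999
[DATA]
2023-12-01T00:00:00 268.15 0.85
"""

smet_header = parse_smet_header(sample_smet.splitlines())
smet_fields = smet_header.get("fields", "").split()
print(f"Fields: {smet_fields}")
```

If `fields` comes back empty for one of your files, the header is probably incomplete and the file is unlikely to load cleanly.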

### Other Formats

xsnow may support other formats (check documentation):
- NetCDF (if SNOWPACK outputs to NetCDF)
- Other SNOWPACK output formats


## Part 2: Preparing Your Files

### Step 1: Generate SNOWPACK Output

If you're running SNOWPACK yourself:

1. **Configure SNOWPACK** (via Inishell or .ini file):
   - Set `PROF_FORMAT = PRO` to generate .pro files
   - Configure which variables to output
   - Set output directory

2. **Run SNOWPACK** simulation

3. **Check output files**:
   - Look for `.pro` files in output directory
   - Check for `.smet` files if configured
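
Before running a long simulation, you can confirm that the profile format is set as expected. The following is a rough sketch that reads .ini-style text with the standard-library `configparser` and looks for `PROF_FORMAT` in any section. The helper name `find_prof_format` and the `[OUTPUT]` section in the sample are assumptions; real SNOWPACK configuration files may contain constructs that `configparser` does not accept.

```python
import configparser


def find_prof_format(ini_text):
    """Return the PROF_FORMAT value from .ini-style text, or None if absent."""
    parser = configparser.ConfigParser(
        strict=False,
        interpolation=None,
        inline_comment_prefixes=("#", ";"),
    )
    parser.read_string(ini_text)
    for section in parser.sections():
        # configparser lower-cases option names by default
        if parser.has_option(section, "prof_format"):
            return parser.get(section, "prof_format").strip().upper()
    return None


# Hypothetical configuration fragment (section name is an assumption)
sample_ini = """
[OUTPUT]
PROF_FORMAT = PRO   # legacy profile format
"""

prof_format = find_prof_format(sample_ini)
if prof_format == "PRO":
    print("Profiles will be written as .pro files.")
else:
    print(f"PROF_FORMAT is {prof_format!r}; .pro files may not be produced.")
```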

### Step 2: Verify File Format

Let's check if your files are in the correct format:


In [None]:
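# Check for .pro and .smet files in the data directory
data_dir = "data"
pro_files = glob.glob(os.path.join(data_dir, "*.pro"))
smet_files = glob.glob(os.path.join(data_dir, "*.smet"))

print(f"Found {len(pro_files)} .pro file(s) in {data_dir}/")
for f in pro_files[:5]:  # Show first 5
    print(f"  - {f}")

print(f"Found {len(smet_files)} .smet file(s) in {data_dir}/")
for f in smet_files[:5]:  # Show first 5
    print(f"  - {f}")

# Quick format check: print the first lines of the first .pro file
if pro_files:
    first_file = pro_files[0]
    with open(first_file, 'r') as f:
        first_lines = [f.readline() for _ in range(10)]
    print(f"First lines of {first_file}:")
    for i, line in enumerate(first_lines[:5]):
        print(f"  {i + 1}: {line.rstrip()}")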


## Part 3: Loading Your Custom Data

Now let's load your files:


In [None]:
# Method 1: Load a single file
if pro_files:
    try:
        ds = xsnow.read(pro_files[0])
        print(f"✅ Loaded {pro_files[0]}")
        print(ds)
    except Exception as e:
        print(f"❌ Error loading file: {e}")
        ds = None
else:
    print("No .pro files found in the data directory.")
    ds = None


### Loading Multiple Files

You can load multiple files at once:


In [None]:
# Method 2: Load multiple files
if len(pro_files) > 1:
    try:
        ds_multi = xsnow.read(pro_files[:3])  # Load first 3 files
        print(f"✅ Loaded {len(pro_files[:3])} files")
    except Exception as e:
        print(f"❌ Error loading multiple files: {e}")
else:
    # Fewer than two .pro files available: print usage examples instead
    print("""
    Loading multiple files:

    # List of files
    ds = xsnow.read(['data/file1.pro', 'data/file2.pro'])

    # All files in directory
    ds = xsnow.read('data/')

    # Mix of .pro and .smet
    ds = xsnow.read(['data/profile.pro', 'data/meteo.smet'])
    """)


## Part 4: Troubleshooting Common Issues

### Issue 1: File Not Found

**Error**: `FileNotFoundError` or similar

**Solutions**:
- Check file path is correct
- Use absolute paths if relative paths don't work
- Verify file exists: `os.path.exists('path/to/file.pro')`
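
When a relative path fails, it usually helps to see which absolute path Python actually tried. The sketch below uses `pathlib` for this; `resolve_data_file` is a hypothetical helper and not part of xsnow. It returns the absolute path if the file exists and otherwise raises a `FileNotFoundError` that lists what is in the target directory.

```python
from pathlib import Path


def resolve_data_file(path):
    """Return the absolute path of an existing file or raise a helpful error."""
    resolved = Path(path).expanduser().resolve()
    if resolved.is_file():
        return resolved
    parent = resolved.parent
    if parent.is_dir():
        # Show a few neighbours so typos in the file name are easy to spot
        nearby = sorted(p.name for p in parent.iterdir())[:10]
        hint = f"Files in {parent}: {nearby}"
    else:
        hint = f"Directory does not exist: {parent}"
    raise FileNotFoundError(f"{resolved} not found. {hint}")


try:
    checked_path = resolve_data_file("data/your_file.pro")
    print(f"Found: {checked_path}")
except FileNotFoundError as err:
    print(err)
```

The printed absolute path also shows the notebook's working directory, which is a common source of confusion in Colab.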


In [None]:
# Example: Check if file exists before loading
test_file = "data/your_file.pro"
if os.path.exists(test_file):
    print(f"✅ File exists: {test_file}")
    # ds = xsnow.read(test_file)
else:
    print(f"❌ File not found: {test_file}")


### Issue 2: Format Not Recognized

**Error**: File format not supported or parsing errors

**Solutions**:
- Verify file is actual .pro or .smet format (not just renamed)
- Check file header matches expected format
- Try opening file in text editor to inspect structure
- Check SNOWPACK version compatibility
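
A renamed file can often be spotted from its first non-empty line. The sketch below is a rough heuristic, not xsnow's own detection logic: it assumes that SMET files start with a signature line beginning with `SMET` and that legacy .pro files start with `[STATION_PARAMETERS]`. The function name `sniff_snowpack_format` is made up for this example.

```python
def sniff_snowpack_format(first_lines):
    """Guess the file type from its first non-empty line."""
    for raw in first_lines:
        line = raw.strip()
        if not line:
            continue  # skip leading blank lines
        if line.upper().startswith("SMET"):
            return "smet"
        if line.upper() == "[STATION_PARAMETERS]":
            return "pro"
        return "unknown"
    return "empty"


# Made-up header snippets for illustration
print(sniff_snowpack_format(["SMET 1.1 ASCII", "[HEADER]"]))
print(sniff_snowpack_format(["[STATION_PARAMETERS]", "StationName= Example"]))
print(sniff_snowpack_format(["time,density", "0,250"]))
```

For a real file, pass in the first few lines you read from disk; a result of `unknown` for a file ending in .pro suggests it was renamed or exported by another tool.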


In [None]:
# Inspect file header
if pro_files:
    with open(pro_files[0], 'r') as f:
        header_lines = [f.readline().strip() for _ in range(20)]
    for i, line in enumerate(header_lines):
        if line:  # Skip empty lines
            print(f"{i + 1:2d}: {line}")


### Issue 3: Missing Variables

**Problem**: Expected variables not in dataset

**Solutions**:
- Check SNOWPACK output configuration
- Verify variables were enabled in SNOWPACK .ini file
- Some variables may be computed by xsnow (like HS, z)
- Check variable names match xsnow's expected names
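
A variable is sometimes present under a slightly different name or capitalization. As a small sketch, the standard-library `difflib` module can suggest close matches for each expected name that is missing. The helper `report_missing` and the list of available names are made up for illustration; with a loaded dataset you would pass `list(ds.data_vars)` instead.

```python
import difflib


def report_missing(available, expected):
    """Map each expected name that is absent to similar available names."""
    available = list(available)
    # Compare in lower case so that 'HS' can still suggest 'hs'
    by_lower = {name.lower(): name for name in available}
    report = {}
    for name in expected:
        if name in available:
            continue
        close = difflib.get_close_matches(
            name.lower(), list(by_lower), n=3, cutoff=0.6
        )
        report[name] = [by_lower[c] for c in close]
    return report


available_names = ["density", "temperature", "grain_size", "hs"]  # made-up list
missing = report_missing(available_names, ["density", "HS", "grain_type"])
for name, suggestions in missing.items():
    print(f"{name}: not found, similar names: {suggestions}")
```

An empty suggestion list means nothing similar exists, which points to the SNOWPACK output configuration and not to a naming difference.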


In [None]:
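if ds is None:
    print("No dataset loaded; load a file in Part 3 first.")
else:
    print("Variables in dataset:")
    for var in list(ds.data_vars.keys())[:20]:  # Show first 20
        print(f"  - {var}")

    # Check for common variables
    common_vars = ['density', 'temperature', 'HS', 'grain_type', 'grain_size']
    for var in common_vars:
        status = "✅ found" if var in ds.data_vars else "❌ missing"
        print(f"  {var}: {status}")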


### Issue 4: Time Alignment Problems

**Problem**: Multiple files have different time ranges or frequencies

**Solutions**:
- xsnow will try to align times automatically
- Check time ranges: `ds.coords['time'].values`
- Resample if needed: `ds.resample(time='1h').mean()` (recent pandas versions deprecate the uppercase `'1H'` alias)
- Manually select overlapping time periods
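
Selecting the overlapping period by hand comes down to taking the later of the two start times and the earlier of the two end times. The sketch below shows this with plain `datetime` objects; `overlapping_period` and both time axes are made up for illustration.

```python
from datetime import datetime, timedelta


def overlapping_period(times_a, times_b):
    """Return (start, end) of the period covered by both series, or None."""
    start = max(min(times_a), min(times_b))
    end = min(max(times_a), max(times_b))
    return (start, end) if start <= end else None


# Synthetic time axes: hourly profiles and 30-minute meteo records
profile_times = [datetime(2024, 1, 1) + timedelta(hours=h) for h in range(25)]
meteo_times = [
    datetime(2024, 1, 1, 12) + timedelta(minutes=30 * i) for i in range(73)
]

overlap = overlapping_period(profile_times, meteo_times)
if overlap is None:
    print("The two series do not overlap in time.")
else:
    print(f"Overlap: {overlap[0]} to {overlap[1]}")
```

With real data you could pass the `time` coordinate values of each dataset. Assuming the dataset supports xarray-style label selection, something like `ds.sel(time=slice(start, end))` would then restrict it to the shared period.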


In [None]:
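if ds is None:
    print("No dataset loaded; load a file in Part 3 first.")
else:
    times = ds.coords['time'].values
    print(f"Number of time steps: {len(times)}")

    # Check time frequency
    if len(times) > 1:
        time_diff = times[1] - times[0]
        print(f"Time range: {times[0]} to {times[-1]}")
        print(f"Step between first two records: {time_diff}")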


## Part 5: Data Validation

After loading, validate your data:


In [None]:

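import numpy as np

if ds is None:
    print("No dataset loaded; load a file in Part 3 first.")
else:
    # Check for NaN values
    if 'density' in ds.data_vars:
        nan_count = int(ds['density'].isnull().sum().values)
        total_count = ds['density'].size
        if nan_count > 0:
            print(f"Warning: density has {nan_count} NaN values out of {total_count}")
        else:
            print("density has no NaN values")

    # Check for reasonable value ranges (bounds are rough sanity limits)
    if 'density' in ds.data_vars:
        density_vals = ds['density'].values
        valid_vals = density_vals[~np.isnan(density_vals)]
        if len(valid_vals) > 0:
            print(f"density range: {valid_vals.min():.1f} to {valid_vals.max():.1f}")
            if valid_vals.min() < 0 or valid_vals.max() > 1000:
                print("Warning: density has values outside 0 to 1000")

    if 'temperature' in ds.data_vars:
        temp_vals = ds['temperature'].values
        valid_vals = temp_vals[~np.isnan(temp_vals)]
        if len(valid_vals) > 0:
            print(f"temperature range: {valid_vals.min():.1f} to {valid_vals.max():.1f}")
            if valid_vals.min() < -50 or valid_vals.max() > 10:
                print("Warning: temperature has values outside -50 to 10")

    # Check dimensions
    print("Dimensions:")
    for dim, size in ds.dims.items():
        print(f"  {dim}: {size}")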



## Part 6: Merging Profile and Meteorological Data

Pass a .pro file and a .smet file to `xsnow.read()` in one list to combine layer-level and scalar variables in a single dataset:


In [None]:
# Example: Merge profile and meteo data
if pro_files and smet_files:
    try:
        # Load both types
        ds_combined = xsnow.read([pro_files[0], smet_files[0]])

        # Variables that vary by layer
        layer_vars = [v for v in ds_combined.data_vars if 'layer' in ds_combined[v].dims]
        print(f"Layer-level variables ({len(layer_vars)}):")
        for v in layer_vars[:5]:
            print(f"  - {v}")

        # Variables with one value per profile (no layer dimension)
        profile_vars = [v for v in ds_combined.data_vars if 'layer' not in ds_combined[v].dims]
        print(f"Profile-level variables ({len(profile_vars)}):")
        for v in profile_vars[:5]:
            print(f"  - {v}")
    except Exception as e:
        print(f"Error merging: {e}")
else:
    # No matching files available: print usage examples instead
    print("""
    Merging profile and meteo data:

    # Load both at once
    ds = xsnow.read(['data/profile.pro', 'data/meteo.smet'])

    # Or load separately and merge
    ds_pro = xsnow.read('data/profile.pro')
    ds_met = xsnow.read('data/meteo.smet')
    ds_combined = xsnow.merge([ds_pro, ds_met])  # If merge function exists
    """)


## Summary

✅ **What we learned:**

1. **File formats**: .pro (profiles) and .smet (meteorological) files
2. **Loading custom data**: Use `xsnow.read()` with your file paths
3. **Multiple files**: Load lists of files or entire directories
4. **Troubleshooting**: Common issues and solutions
5. **Validation**: Check data quality and ranges
6. **Merging**: Combine profile and meteo data

## Key Tips

- **File paths**: Use absolute paths if relative paths cause issues
- **Format verification**: Inspect file headers to ensure correct format
- **Variable names**: Check xsnow documentation for expected variable names
- **Time alignment**: xsnow tries to align times automatically when merging; check the resulting time range yourself
- **Data quality**: Always validate loaded data

## Next Steps

Now that you can load your own data:
- Apply analysis techniques from previous notebooks
- Create visualizations with your data
- Or learn to extend xsnow: **06_extending_xsnow.ipynb**

## Exercises

1. Load one of your own .pro files and inspect its structure
2. Check for missing variables and verify data ranges
3. Load multiple files and compare their time ranges
4. Merge a .pro and .smet file if you have both
5. Validate your data and identify any quality issues
