# All Density Calculation Pathways

This notebook provides a comprehensive analysis of snow density calculation methods at both **layer-level** and **slab-level** scales.

## Table of Contents

1. [Load Snow Pit Data](#1-load-snow-pit-data)
2. [Find All Density Calculation Pathways](#2-find-all-density-calculation-pathways)
3. [Layer-Level Analysis](#3-layer-level-analysis)
   - 3.1 [Create Layers](#31-create-layers)
   - 3.2 [Execute All Pathways on Individual Layers](#32-execute-all-pathways-on-individual-layers)
   - 3.3 [Extract Density Values](#33-extract-density-values)
4. [Slab-Level Analysis with ECTP Failure Layers](#4-slab-level-analysis-with-ectp-failure-layers)
   - 4.1 [Create Slabs from ECTP Failure Layers](#41-create-slabs-from-ectp-failure-layers)
   - 4.2 [Execute Pathways on ECTP Slabs](#42-execute-pathways-on-ectp-slabs)
   - 4.3 [Extract Density Values](#43-extract-density-values)
5. [Compare Layer-Level vs Slab-Level Results](#5-compare-layer-level-vs-slab-level-results)

## Workflow

This notebook implements a dual-level analysis approach:

### Sections 1-2: Data Preparation
1. **Load Data**: Parse all snow pit CAAML files from `examples/data/`
2. **Find Pathways**: Use a graph algorithm to discover all possible density calculation methods

### Section 3: Layer-Level Analysis
3. **Create Layers**: Aggregate all individual layers into a flat list (independent of parent pits)
4. **Execute**: Run each pathway on each layer independently as single-layer "slabs"
5. **Extract Results**: Organize layer-level density calculations into DataFrames

### Section 4: Slab-Level Analysis
6. **Create ECTP Slabs**: Build multi-layer slabs using ECTP (Extended Column Test with Propagation) failure layers
7. **Execute**: Run all pathways on each complete slab
8. **Extract Results**: Organize slab-level density calculations into DataFrames

### Section 5: Comparison
9. **Compare & Analyze**: Evaluate and compare success rates between layer-level and slab-level approaches

## Key Approach

### Layer-Level Analysis
**Each layer is analyzed independently**, regardless of which pit it came from:
- Different layers in the same pit may have different measurements available
- Some methods may work for certain layers but not others in the same pit
- Reveals method applicability given specific measurement combinations
- Provides fundamental understanding of method coverage

### Slab-Level Analysis  
**Multi-layer slabs are analyzed as complete structures**:
- Slabs contain all layers above ECTP failure layers (weak layers)
- Success requires **ALL layers** in a slab to have successful density calculations
- More realistic for avalanche applications where complete profiles are needed
- Stricter criterion reveals which methods work for entire avalanche-relevant structures

**Target Parameter**: `density` - snow layer density in kg/m³

Both analyses use the SnowPyt-MechParams execution engine with dynamic programming (caching) for efficient computation across multiple pathways.
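The difference between the two success criteria can be illustrated with a minimal sketch on made-up data. The slab names and per-layer outcomes below are hypothetical and do not come from the dataset; only the rule itself (a slab counts as successful when it is non-empty and every layer has a density) follows the description above.

```python
# Hypothetical per-layer outcomes for one pathway:
# True means a density value was computed for that layer.
toy_slabs = {
    "slab_a": [True, True, True],
    "slab_b": [True, False, True],
    "slab_c": [],  # a slab with no layers never counts as successful
}


def slab_succeeds(layer_outcomes):
    """Return True only if the slab is non-empty and every layer succeeded."""
    return len(layer_outcomes) > 0 and all(layer_outcomes)


# Slab-level view: all-or-nothing per slab.
slab_success = {name: slab_succeeds(flags) for name, flags in toy_slabs.items()}
slab_success_rate = sum(slab_success.values()) / len(toy_slabs)

# Layer-level view: every layer counts on its own.
n_layers = sum(len(flags) for flags in toy_slabs.values())
layer_success_rate = sum(sum(flags) for flags in toy_slabs.values()) / n_layers
```

In this toy case five of six layers succeed, but only one of three slabs does, which is the kind of gap the stricter slab-level criterion is meant to expose.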

In [1]:
# Standard library imports
import os
from pathlib import Path
from typing import List, Dict, Any
import warnings
warnings.filterwarnings('ignore')

# Scientific computing
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from uncertainties import ufloat

# SnowPyt-MechParams imports
from snowpyt_mechparams.snowpilot import parse_caaml_directory
from snowpyt_mechparams.data_structures import Pit, Layer
from snowpyt_mechparams.graph import graph, density
from snowpyt_mechparams.algorithm import find_parameterizations
from snowpyt_mechparams.execution import ExecutionEngine
from snowpyt_mechparams.execution.config import ExecutionConfig

# Configure plotting
plt.style.use('default')
sns.set_palette("husl")
plt.rcParams['figure.figsize'] = (12, 6)
plt.rcParams['font.size'] = 10

print("✓ All imports successful")

✓ All imports successful


## 1. Load Snow Pit Data

Load all CAAML XML files from `examples/data/` and create Pit objects. We'll extract individual layers from these pits for independent analysis.

In [2]:
# Path to data directory
data_dir = Path("data")

# Parse all CAAML files in the directory
print(f"Loading CAAML files from: {data_dir.absolute()}")
snow_pits_raw = parse_caaml_directory(str(data_dir))

# Create Pit objects from the parsed data
pits = [Pit.from_snow_pit(sp) for sp in snow_pits_raw]

print(f"\n✓ Successfully loaded {len(pits)} snow pits")
print(f"  Total layers across all pits: {sum(len(pit.layers) for pit in pits)}")

Loading CAAML files from: /Users/marykate/Desktop/Snow/SnowPyt-MechParams/examples/data

✓ Successfully loaded 50278 snow pits
  Total layers across all pits: 371429


## 2. Find All Density Calculation Pathways

Use the parameter graph and the `find_parameterizations` algorithm to discover all possible ways to calculate density.
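As a rough illustration of what enumerating pathways means, the sketch below walks a toy dependency table and lists every combination of methods that reaches a target parameter. It is not the SnowPyt-MechParams algorithm: `TOY_GRAPH`, `enumerate_pathways`, and the shortened input names are hypothetical and exist only for this example.

```python
from itertools import product

# Hypothetical toy graph: each parameter maps to its alternative
# (method, required inputs) pairs. Measured inputs have no alternatives.
TOY_GRAPH = {
    "density": [
        ("data_flow", ["measured_density"]),
        ("geldsetzer", ["hand_hardness", "grain_form"]),
        ("kim_jamieson_table2", ["hand_hardness", "grain_form"]),
    ],
    "measured_density": [],
    "hand_hardness": [],
    "grain_form": [],
}


def enumerate_pathways(graph, target):
    """Return every pathway to `target` as a list of (parameter, method) steps."""
    alternatives = graph[target]
    if not alternatives:
        return [[]]  # measured input: one pathway with no method steps
    pathways = []
    for method, inputs in alternatives:
        # Each input may itself be reachable in several ways; combine them all.
        sub_pathways = [enumerate_pathways(graph, name) for name in inputs]
        for combo in product(*sub_pathways):
            steps = [step for part in combo for step in part]
            pathways.append(steps + [(target, method)])
    return pathways


toy_pathways = enumerate_pathways(TOY_GRAPH, "density")
for steps in toy_pathways:
    print(steps)
```

Each alternative method for `density` yields one pathway here because the toy inputs are all direct measurements; a deeper graph would multiply the combinations.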

In [3]:
# Get the density node from the graph
density_node = graph.get_node("density")

# Find all parameterizations (pathways) for calculating density
pathways = find_parameterizations(graph, density_node)

print(f"Found {len(pathways)} pathways for calculating density:\n")
for i, pathway in enumerate(pathways, 1):
    print(f"Pathway {i}:")
    print(pathway)
    print()

Found 4 pathways for calculating density:

Pathway 1:
branch 1: snow_pit -- data_flow --> measured_density -- data_flow --> density

Pathway 2:
branch 1: snow_pit -- data_flow --> measured_hand_hardness -- data_flow --> merge_hand_hardness_grain_form
branch 2: snow_pit -- data_flow --> measured_grain_form -- data_flow --> merge_hand_hardness_grain_form
merge branch 1, branch 2: merge_hand_hardness_grain_form -- geldsetzer --> density

Pathway 3:
branch 1: snow_pit -- data_flow --> measured_hand_hardness -- data_flow --> merge_hand_hardness_grain_form
branch 2: snow_pit -- data_flow --> measured_grain_form -- data_flow --> merge_hand_hardness_grain_form
merge branch 1, branch 2: merge_hand_hardness_grain_form -- kim_jamieson_table2 --> density

Pathway 4:
branch 1: snow_pit -- data_flow --> measured_hand_hardness -- data_flow --> merge_hand_hardness_grain_form
branch 2: snow_pit -- data_flow --> measured_grain_form -- data_flow --> merge_hand_hardness_grain_form
branch 3: snow_pit -- da

## 3. Layer-Level Analysis

Analyze density calculation pathways at the individual layer level, treating each layer independently regardless of its parent pit.

### 3.1 Create Layers

Aggregate all layers from all pits into a flat list for independent analysis.

In [4]:
# Create an aggregated list of all layers with metadata
# Each entry: (layer, pit_id, layer_index_in_pit)
all_layers = []

for pit in pits:
    for layer_idx, layer in enumerate(pit.layers):
        # Handle slope_angle safely (might be None, NaN, or numeric)
        try:
            slope_angle = float(pit.slope_angle) if pit.slope_angle is not None and not np.isnan(pit.slope_angle) else 0.0
        except (TypeError, ValueError):
            slope_angle = 0.0
        
        all_layers.append({
            'layer': layer,
            'pit_id': pit.pit_id,
            'layer_index': layer_idx,
            'slope_angle': slope_angle
        })

print(f"✓ Collected {len(all_layers)} individual layers from {len(pits)} pits")

✓ Collected 371429 individual layers from 50278 pits
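The slope-angle handling in the cell above could be factored into a small helper. The following is a sketch only: `safe_slope_angle` is a hypothetical name, not part of SnowPyt-MechParams. Unlike the inline version, it also accepts numeric strings such as `"38"`.

```python
import math


def safe_slope_angle(value, default=0.0):
    """Convert a slope angle to float, falling back to `default`.

    Missing values (None), NaN, and non-numeric values all map to `default`,
    matching the 0.0 fallback used in the notebook cell.
    """
    try:
        angle = float(value)
    except (TypeError, ValueError):
        return default
    return default if math.isnan(angle) else angle


print(safe_slope_angle(None))          # missing value
print(safe_slope_angle(float("nan")))  # NaN
print(safe_slope_angle(32))            # ordinary numeric value
```

Note that falling back to 0.0 treats a missing slope angle as flat terrain, so any later step that depends on the angle inherits that assumption.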


### 3.2 Execute All Pathways on Individual Layers

Run each density pathway on every layer independently. Each layer is treated as a single-layer "slab" for execution.

In [5]:
# Initialize the execution engine
engine = ExecutionEngine(graph)

# Exclude method uncertainty so that reported uncertainty reflects only
# propagated input measurement uncertainty, not the regression standard error.
config = ExecutionConfig(include_method_uncertainty=False)

# Store all results
# Structure: all_results[layer_unique_id] = {...}
all_results: Dict[str, Any] = {}

print("Executing all density pathways on all layers...")
print(f"This will execute {len(pathways)} pathways on {len(all_layers)} layers\n")

# Execute all pathways on each individual layer
successful_layers = 0
failed_layers = 0

from snowpyt_mechparams.data_structures import Slab

for layer_idx, layer_info in enumerate(all_layers, 1):
    
    try:
        # Create a single-layer "slab" for this layer
        # This allows the execution engine to work with individual layers
        slab = Slab(
            layers=[layer_info['layer']],  # Single layer
            angle=layer_info['slope_angle'],
            pit_id=layer_info['pit_id']
        )
        
        # Execute ALL pathways on this single layer
        results = engine.execute_all(slab, "density", config=config)
        
        # Create unique ID for this layer
        layer_id = f"{layer_info['pit_id']}_L{layer_info['layer_index']}"
        
        # Store results with layer metadata
        all_results[layer_id] = {
            'execution_results': results,
            'pit_id': layer_info['pit_id'],
            'layer_index': layer_info['layer_index']
        }
        successful_layers += 1
            
    except Exception:
        # Count layers that raise errors and move on (the error itself is discarded)
        failed_layers += 1

print(f"\n✓ Execution complete!")
print(f"  Successful: {successful_layers} layers ({100*successful_layers/len(all_layers):.1f}%)")
print(f"  Failed:     {failed_layers} layers ({100*failed_layers/len(all_layers):.1f}%)")

Executing all density pathways on all layers...
This will execute 4 pathways on 371429 layers


✓ Execution complete!
  Successful: 371429 layers (100.0%)
  Failed:     0 layers (0.0%)
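The loop above only counts failures. If the failure count were ever non-zero, it could help to keep the error messages as well. The sketch below shows one way to do that on toy inputs; `run_with_failure_log` and `toy_run` are hypothetical helpers and do not call the execution engine.

```python
from collections import Counter


def run_with_failure_log(items, run_one):
    """Apply `run_one` to each item, keeping results and error messages apart."""
    results, failures = {}, {}
    for key, item in items.items():
        try:
            results[key] = run_one(item)
        except Exception as exc:  # broad on purpose, as in the notebook cell
            failures[key] = f"{type(exc).__name__}: {exc}"
    return results, failures


def toy_run(value):
    """Stand-in for a per-layer computation that rejects negative input."""
    if value < 0:
        raise ValueError("negative input")
    return value * 2


results, failures = run_with_failure_log({"a": 1, "b": -1, "c": 3}, toy_run)

# Summarize failures by exception type for a quick overview.
failure_types = Counter(msg.split(":")[0] for msg in failures.values())
print(results)
print(failures)
print(failure_types)
```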


### 3.3 Extract Density Values

Organize all calculated layer-level density values into a structured DataFrame for analysis.

In [6]:
# Extract all density values into a structured format
density_data = []

for layer_id, result_info in all_results.items():
    execution_results = result_info['execution_results']
    pit_id = result_info['pit_id']
    layer_index = result_info['layer_index']
    
    # Iterate through each pathway result for this layer
    for pathway_desc, pathway_result in execution_results.pathways.items():
        # Extract density values from computation trace
        for trace in pathway_result.computation_trace:
            if trace.parameter == "density" and trace.success and trace.output is not None:
                # Extract nominal value and uncertainty if present
                if hasattr(trace.output, 'nominal_value'):
                    # ufloat from uncertainties package
                    density_value = trace.output.nominal_value
                    density_std = trace.output.std_dev
                else:
                    # Handle various output types (scalar, list, array)
                    try:
                        # If it's a list or array, take the first element
                        if isinstance(trace.output, (list, tuple)):
                            density_value = float(trace.output[0]) if len(trace.output) > 0 else None
                        else:
                            density_value = float(trace.output)
                        density_std = 0.0
                    except (TypeError, ValueError, IndexError):
                        # Skip values that can't be converted
                        continue
                
                # Only add if we got a valid density value
                if density_value is not None:
                    density_data.append({
                        'layer_id': layer_id,
                        'pathway_description': pathway_desc,
                        'pathway_method': pathway_result.methods_used.get('density', 'unknown'),
                        'pit_id': pit_id,
                        'layer_index': layer_index,
                        'density': density_value,
                        'density_std': density_std,
                        'cached': trace.cached
                    })

# Create DataFrame
df_density = pd.DataFrame(density_data)

# Compute per-row relative uncertainty (std_dev / nominal_value), skipping zero-density rows
df_density['density_relative_uncertainty'] = np.where(
    df_density['density'] != 0,
    df_density['density_std'] / df_density['density'],
    np.nan
)

print(f"Extracted {len(df_density)} density calculations")
print(f"  Across {df_density['pathway_description'].nunique()} unique pathways")
print(f"  From {df_density['layer_id'].nunique()} unique layers")
print(f"  Spanning {df_density['pit_id'].nunique()} pits")
print(f"\nSuccessfully calculated layers by pathway:")
print(f"  {'Method':<30s} {'Layers':>8s} {'Coverage':>9s} {'Avg Rel. Uncertainty':>22s}")
print(f"  {'-'*71}")
for method in sorted(df_density['pathway_method'].unique()):
    mask = df_density['pathway_method'] == method
    unique_layers = df_density[mask]['layer_id'].nunique()
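    # Coverage is relative to layers with at least one successful density
    # calculation (any pathway), not to all layers in the dataset.
    pct = 100 * unique_layers / df_density['layer_id'].nunique()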
    avg_rel_unc = df_density[mask]['density_relative_uncertainty'].mean()
    print(f"  - {method:<28s} {unique_layers:>8d} ({pct:>5.1f}%)    {avg_rel_unc:>18.1%}")

print()
print("  Note: Uncertainty reflects propagated input measurement uncertainties only")
print("  (method regression standard error excluded). For 'data_flow' (direct")
print("  measurement), this is purely the ±10% density measurement uncertainty.")
print("  For empirical methods, it reflects propagated hand hardness index (±0.67)")
print("  and grain size (±0.5 mm) measurement uncertainties.")

Extracted 551063 density calculations
  Across 4 unique pathways
  From 247294 unique layers
  Spanning 45841 pits

Successfully calculated layers by pathway:
  Method                           Layers  Coverage   Avg Rel. Uncertainty
  -----------------------------------------------------------------------
  - data_flow                       10468 (  4.2%)                 10.0%
  - geldsetzer                     200676 ( 81.1%)                 17.1%
  - kim_jamieson_table2            235522 ( 95.2%)                 16.4%
  - kim_jamieson_table5            104397 ( 42.2%)                 18.4%

  Note: Uncertainty reflects propagated input measurement uncertainties only
  (method regression standard error excluded). For 'data_flow' (direct
  measurement), this is purely the ±10% density measurement uncertainty.
  For empirical methods, it reflects propagated hand hardness index (±0.67)
  and grain size (±0.5 mm) measurement uncertainties.
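The coverage column in this table is computed against the 247294 layers that have at least one successful density calculation, whereas the comparison in Section 5 divides by all 371429 layers. The short check below reproduces both sets of percentages from the counts printed above; the variable and function names are illustrative only.

```python
# Counts copied from the output above.
layers_with_any_density = 247_294
all_layers_total = 371_429
layers_by_method = {
    "data_flow": 10_468,
    "geldsetzer": 200_676,
    "kim_jamieson_table2": 235_522,
    "kim_jamieson_table5": 104_397,
}


def coverage_pct(count, denominator):
    """Coverage as a percentage, rounded to one decimal like the printed table."""
    return round(100 * count / denominator, 1)


# For each method: (coverage among layers with any density, coverage among all layers)
coverage = {
    method: (
        coverage_pct(count, layers_with_any_density),
        coverage_pct(count, all_layers_total),
    )
    for method, count in layers_by_method.items()
}

for method, (among_calculated, among_all) in coverage.items():
    print(f"{method:<22s} {among_calculated:>5.1f}%  {among_all:>5.1f}%")
```

The two columns describe the same layer counts, so a value such as 4.2% versus 2.8% for `data_flow` reflects only the choice of denominator.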


---

## 4. Slab-Level Analysis with ECTP Failure Layers

Now analyze density pathways at the slab level, using actual slabs created from ECTP (Extended Column Test with Propagation) failure layers.

### 4.1 Create Slabs from ECTP Failure Layers

In [7]:
# Create slabs from pits using ECTP failure layers
# Each slab contains all layers above a weak layer identified by an ECTP test
ectp_slabs = []

print("Creating slabs from ECTP failure layers...")
pits_with_ectp = 0
pits_without_ectp = 0

for pit in pits:
    # Create slabs using ECTP_failure_layer definition
    slabs = pit.create_slabs(weak_layer_def="ECTP_failure_layer")
    
    if slabs:
        pits_with_ectp += 1
        for slab in slabs:
            ectp_slabs.append({
                'slab': slab,
                'pit_id': pit.pit_id,
                'slab_id': slab.slab_id,
                'n_layers': len(slab.layers),
                'weak_layer_depth': slab.weak_layer.depth_top if slab.weak_layer else None
            })
    else:
        pits_without_ectp += 1

print(f"\n✓ Slab creation complete!")
print(f"  Pits with ECTP tests:     {pits_with_ectp}")
print(f"  Pits without ECTP tests:  {pits_without_ectp}")
print(f"  Total slabs created:      {len(ectp_slabs)}")


Creating slabs from ECTP failure layers...

✓ Slab creation complete!
  Pits with ECTP tests:     12347
  Pits without ECTP tests:  37931
  Total slabs created:      14776


### 4.2 Execute Pathways on ECTP Slabs

In [8]:
# Execute all pathways on each ECTP slab
slab_results: Dict[str, Any] = {}

print("Executing all density pathways on ECTP slabs...")
print(f"This will execute {len(pathways)} pathways on {len(ectp_slabs)} slabs\n")

successful_slabs = 0
failed_slabs = 0

for slab_idx, slab_info in enumerate(ectp_slabs, 1):
    
    slab = slab_info['slab']
    slab_id = slab_info['slab_id']
    
    try:
        # Execute ALL pathways on this slab
        results = engine.execute_all(slab, "density", config=config)
        
        # Store results with slab metadata
        slab_results[slab_id] = {
            'execution_results': results,
            'pit_id': slab_info['pit_id'],
            'n_layers': slab_info['n_layers'],
            'weak_layer_depth': slab_info['weak_layer_depth']
        }
        successful_slabs += 1
            
    except Exception:
        # Count slabs that raise errors and move on (the error itself is discarded)
        failed_slabs += 1

print(f"\n✓ Execution complete!")
print(f"  Successful: {successful_slabs} slabs ({100*successful_slabs/len(ectp_slabs):.1f}%)")
print(f"  Failed:     {failed_slabs} slabs ({100*failed_slabs/len(ectp_slabs):.1f}%)")

Executing all density pathways on ECTP slabs...
This will execute 4 pathways on 14776 slabs


✓ Execution complete!
  Successful: 14776 slabs (100.0%)
  Failed:     0 slabs (0.0%)


### 4.3 Extract Density Values

Extract slab-level density calculations and analyze success rates by pathway.

In [9]:
# Count successful slabs per pathway
# A slab is considered successful for a pathway if ALL layers have successful density calculations

pathway_slab_success = {}

for slab_id, result_info in slab_results.items():
    execution_results = result_info['execution_results']
    n_layers_in_slab = result_info['n_layers']
    
    # Check each pathway for this slab
    for pathway_desc, pathway_result in execution_results.pathways.items():
        # Get the method name for this pathway
        method_name = pathway_result.methods_used.get('density', 'unknown')
        
        if method_name not in pathway_slab_success:
            pathway_slab_success[method_name] = {
                'successful_slabs': 0,
                'failed_slabs': 0,
                'total_layers_calculated': 0,
                'slab_ids': []
            }
        
        # Check if this pathway succeeded for this slab
        # Count how many layers have successful density calculations
        successful_layers = sum(
            1 for trace in pathway_result.computation_trace
            if trace.parameter == "density" and trace.success and trace.output is not None
        )
        
        # Success requires ALL layers in the slab to have successful calculations
        if successful_layers == n_layers_in_slab and successful_layers > 0:
            pathway_slab_success[method_name]['successful_slabs'] += 1
            pathway_slab_success[method_name]['total_layers_calculated'] += successful_layers
            pathway_slab_success[method_name]['slab_ids'].append(slab_id)
        else:
            pathway_slab_success[method_name]['failed_slabs'] += 1

# Display results
print("=" * 80)
print("SLAB-LEVEL ANALYSIS SUMMARY (ECTP Failure Layers)")
print("=" * 80)

print(f"\n📊 Slab Dataset Overview:")
print(f"   • Total slabs created:             {len(ectp_slabs)}")
print(f"   • Slabs successfully processed:    {successful_slabs}")
print(f"   • Total layers in all slabs:       {sum(s['n_layers'] for s in ectp_slabs)}")

print(f"\n🔬 Success Rate by Pathway:")
print(f"{'Method':<30s} {'Successful Slabs':<20s} {'Success Rate':<15s} {'Layers Calculated':<20s}")
print("-" * 85)

for method in sorted(pathway_slab_success.keys()):
    stats = pathway_slab_success[method]
    total = stats['successful_slabs'] + stats['failed_slabs']
    success_rate = 100 * stats['successful_slabs'] / total if total > 0 else 0
    
    print(f"{method:<30s} "
          f"{stats['successful_slabs']:>6d} / {total:<6d}     "
          f"{success_rate:>6.1f}%         "
          f"{stats['total_layers_calculated']:>8d}")

print("\n" + "=" * 80)

SLAB-LEVEL ANALYSIS SUMMARY (ECTP Failure Layers)

📊 Slab Dataset Overview:
   • Total slabs created:             14776
   • Slabs successfully processed:    14776
   • Total layers in all slabs:       58126

🔬 Success Rate by Pathway:
Method                         Successful Slabs     Success Rate    Layers Calculated   
-------------------------------------------------------------------------------------
data_flow                         109 / 14776         0.7%              317
geldsetzer                       4539 / 14776        30.7%            13252
kim_jamieson_table2              5951 / 14776        40.3%            19838
kim_jamieson_table5              1145 / 14776         7.7%             3042



## 5. Compare Layer-Level vs Slab-Level Results

Compare and analyze the results from both analysis approaches.

In [10]:
# Compare layer-level vs slab-level analysis
print("=" * 80)
print("COMPARISON: LAYER-LEVEL vs SLAB-LEVEL ANALYSIS")
print("=" * 80)

print("\n📊 Dataset Comparison:")
print(f"   {'Metric':<40s} {'Layer-Level':<20s} {'Slab-Level':<20s}")
print("-" * 80)
print(f"   {'Total data units analyzed':<40s} {len(all_layers):<20d} {len(ectp_slabs):<20d}")
print(f"   {'Average layers per unit':<40s} {1.0:<20.1f} {np.mean([s['n_layers'] for s in ectp_slabs]):<20.1f}")
print(f"   {'Total layers across all units':<40s} {len(all_layers):<20d} {sum(s['n_layers'] for s in ectp_slabs):<20d}")

print("\n🔬 Coverage Comparison by Method:")
print(f"{'Method':<30s} {'Layers (Individual)':<25s} {'Slabs (ECTP)':<25s}")
print("-" * 80)

# Get common methods
all_methods = set(df_density['pathway_method'].unique()) | set(pathway_slab_success.keys())

for method in sorted(all_methods):
    # Layer-level coverage
    if method in df_density['pathway_method'].values:
        layer_count = df_density[df_density['pathway_method'] == method]['layer_id'].nunique()
        layer_pct = 100 * layer_count / len(all_layers)
        layer_str = f"{layer_count:>6d} ({layer_pct:>5.1f}%)"
    else:
        layer_str = "0 (  0.0%)"
    
    # Slab-level coverage
    if method in pathway_slab_success:
        slab_success = pathway_slab_success[method]['successful_slabs']
        slab_total = slab_success + pathway_slab_success[method]['failed_slabs']
        slab_pct = 100 * slab_success / slab_total if slab_total > 0 else 0
        slab_str = f"{slab_success:>6d} / {slab_total:<6d} ({slab_pct:>5.1f}%)"
    else:
        slab_str = "0 /      0 (  0.0%)"
    
    print(f"{method:<30s} {layer_str:<25s} {slab_str:<25s}")

print("\n" + "=" * 80)
print("Layer-level analysis treats each layer independently.")
print("Slab-level analysis evaluates groups of layers above ECTP failure layers.")
print("Slab success requires ALL layers in the slab to have successful calculations.")
print("=" * 80)

COMPARISON: LAYER-LEVEL vs SLAB-LEVEL ANALYSIS

📊 Dataset Comparison:
   Metric                                   Layer-Level          Slab-Level          
--------------------------------------------------------------------------------
   Total data units analyzed                371429               14776               
   Average layers per unit                  1.0                  3.9                 
   Total layers across all units            371429               58126               

🔬 Coverage Comparison by Method:
Method                         Layers (Individual)       Slabs (ECTP)             
--------------------------------------------------------------------------------
data_flow                       10468 (  2.8%)              109 / 14776  (  0.7%) 
geldsetzer                     200676 ( 54.0%)             4539 / 14776  ( 30.7%) 
kim_jamieson_table2            235522 ( 63.4%)             5951 / 14776  ( 40.3%) 
kim_jamieson_table5            104397 ( 28.1%)      