# All Elastic Modulus Calculation Pathways

This notebook analyzes snow elastic modulus calculation methods at both **layer-level** and **slab-level** scales.

## Table of Contents

1. [Load Snow Pit Data](#1-load-snow-pit-data)
2. [Find All Elastic Modulus Calculation Pathways](#2-find-all-elastic-modulus-calculation-pathways)
3. [Layer-Level Analysis](#3-layer-level-analysis)
4. [Slab-Level Comparison (ECTP)](#4-slab-level-comparison-ectp)
5. [Sankey Diagrams: kim_jamieson_table2 → wautier](#5-sankey-diagrams-kim_jamieson_table2--wautier)

**Target Parameter**: `elastic_modulus` — snow layer elastic modulus in Pa

Reported uncertainties are propagated from input measurement uncertainties only; the regression standard error of each method is excluded. The assumed input uncertainties are ±10% for directly measured density, ±0.67 for hand hardness index, and ±0.5 mm for grain size.
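
To make "propagated from input measurement uncertainties" concrete, the following is a minimal sketch of first-order (linear) error propagation through a hypothetical power law `E = coeff * rho**exponent`. The helper `propagate_power_law` and the values of `coeff` and `exponent` are placeholders chosen for illustration; they are not taken from any method used in this notebook, and the library may propagate uncertainty differently.

```python
def propagate_power_law(rho, rho_rel_unc, coeff, exponent):
    """First-order propagation through E = coeff * rho**exponent.

    Returns (nominal E, absolute standard deviation of E). For a power law
    the relative uncertainty scales with the exponent:
    sigma_E / E = |exponent| * sigma_rho / rho.
    """
    e_nominal = coeff * rho ** exponent
    e_rel_unc = abs(exponent) * rho_rel_unc
    return e_nominal, e_nominal * e_rel_unc


# Placeholder inputs: a 250 kg/m^3 density with the ±10% input uncertainty
# stated above. coeff and exponent are illustrative, not a published fit.
e_val, e_std = propagate_power_law(rho=250.0, rho_rel_unc=0.10, coeff=1.0, exponent=3.0)
print(f"relative uncertainty of E: {e_std / e_val:.0%}")
```

Under these assumptions a 10% density uncertainty grows to roughly 30% in the modulus, which is why steep density dependence can dominate the propagated uncertainty even before any method error is considered.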

In [1]:
from pathlib import Path
from typing import Dict, Any
import warnings
warnings.filterwarnings('ignore')  # silence all warnings to keep cell output readable

import numpy as np
import pandas as pd

from snowpyt_mechparams.snowpilot import parse_caaml_directory
from snowpyt_mechparams.data_structures import Pit, Slab
from snowpyt_mechparams.graph import graph
from snowpyt_mechparams.algorithm import find_parameterizations
from snowpyt_mechparams.execution import ExecutionEngine
from snowpyt_mechparams.execution.config import ExecutionConfig

## 1. Load Snow Pit Data

In [2]:
snow_pits_raw = parse_caaml_directory(str(Path("data")))
pits = [Pit.from_snow_pit(sp) for sp in snow_pits_raw]

print(f"Loaded {len(pits)} snow pits ({sum(len(pit.layers) for pit in pits)} layers)")

Loaded 50278 snow pits (371429 layers)


## 2. Find All Elastic Modulus Calculation Pathways

In [3]:
pathways = find_parameterizations(graph, graph.get_node("elastic_modulus"))

print(f"Found {len(pathways)} pathways for calculating elastic_modulus:\n")
for i, pathway in enumerate(pathways, 1):
    print(f"Pathway {i}:")
    print(pathway)
    print()

Found 16 pathways for calculating elastic_modulus:

Pathway 1:
branch 1: snow_pit -- data_flow --> measured_density -- data_flow --> density -- data_flow --> merge_density_grain_form
branch 2: snow_pit -- data_flow --> measured_grain_form -- data_flow --> merge_density_grain_form
merge branch 1, branch 2: merge_density_grain_form -- bergfeld --> elastic_modulus

Pathway 2:
branch 1: snow_pit -- data_flow --> measured_hand_hardness -- data_flow --> merge_hand_hardness_grain_form
branch 2: snow_pit -- data_flow --> measured_grain_form -- data_flow --> merge_hand_hardness_grain_form
branch 3: snow_pit -- data_flow --> measured_grain_form -- data_flow --> merge_density_grain_form
merge branch 1, branch 2: merge_hand_hardness_grain_form -- geldsetzer --> density
merge branch 1, branch 2, branch 3: merge_density_grain_form -- bergfeld --> elastic_modulus

Pathway 3:
branch 1: snow_pit -- data_flow --> measured_hand_hardness -- data_flow --> merge_hand_hardness_grain_form
branch 2: snow_pit -

## 3. Layer-Level Analysis

Each layer is analyzed independently as a single-layer slab, regardless of its parent pit.

In [4]:
engine = ExecutionEngine(graph)
config = ExecutionConfig(include_method_uncertainty=False)

# Build flat list of (layer, slope_angle, pit_id, layer_index)
layer_infos = []
for pit in pits:
    try:
        angle = float(pit.slope_angle) if pit.slope_angle is not None and not np.isnan(pit.slope_angle) else 0.0
    except (TypeError, ValueError):
        angle = 0.0
    for idx, layer in enumerate(pit.layers):
        layer_infos.append((layer, angle, pit.pit_id, idx))

# Execute all pathways on each layer as a single-layer slab
all_results: Dict[str, Any] = {}
for layer, angle, pit_id, layer_idx in layer_infos:
    slab = Slab(layers=[layer], angle=angle, pit_id=pit_id)
    results = engine.execute_all(slab, "elastic_modulus", config=config)
    all_results[f"{pit_id}_L{layer_idx}"] = {
        'execution_results': results,
        'pit_id': pit_id,
    }

print(f"Executed {len(pathways)} pathways on {len(layer_infos)} layers")

Executed 16 pathways on 371429 layers


In [None]:
e_mod_data = []
layer_step_counts: dict = {}  # pathway → {'density': n, 'e_mod': n}

for layer_id, info in all_results.items():
    for pathway_result in info['execution_results'].pathways.values():
        density_method = pathway_result.methods_used.get('density', 'unknown')
        e_mod_method   = pathway_result.methods_used.get('elastic_modulus', 'unknown')
        full_pathway   = f"{density_method} → {e_mod_method}"

        if full_pathway not in layer_step_counts:
            layer_step_counts[full_pathway] = {'density': 0, 'e_mod': 0}

        sc = layer_step_counts[full_pathway]
        traces = pathway_result.computation_trace
        got_density = any(t.parameter == 'density'         and t.success and t.output is not None for t in traces)
        got_e_mod   = any(t.parameter == 'elastic_modulus' and t.success and t.output is not None for t in traces)
        if got_density: sc['density'] += 1
        if got_e_mod:   sc['e_mod']   += 1

        for trace in traces:
            if trace.parameter == 'elastic_modulus' and trace.success and trace.output is not None:
                out = trace.output
                if hasattr(out, 'nominal_value'):
                    val, std = out.nominal_value, out.std_dev
                else:
                    try:
                        val, std = float(out), 0.0
                    except (TypeError, ValueError):
                        continue
                e_mod_data.append({
                    'layer_id': layer_id,
                    'pit_id': info['pit_id'],
                    'full_pathway': full_pathway,
                    'elastic_modulus': val,
                    'elastic_modulus_std': std,
                })

df_e_mod = pd.DataFrame(e_mod_data)
df_e_mod['rel_unc'] = np.where(
    df_e_mod['elastic_modulus'] != 0,
    df_e_mod['elastic_modulus_std'] / df_e_mod['elastic_modulus'],
    np.nan,
)

# Summary table sorted by layer count descending
total_layers = len(layer_infos)
summary = (
    df_e_mod.groupby('full_pathway')
    .agg(layers=('layer_id', 'nunique'), avg_rel_unc=('rel_unc', 'mean'))
    .sort_values('layers', ascending=False)
    .reset_index()
)

print(f"  {'Full Pathway':<50s} {'Layers':>35s} {'Avg Rel. Uncertainty':>22s}")
print(f"  {'-'*109}")
for _, row in summary.iterrows():
    layer_str = f"{int(row['layers'])} / {total_layers} ({int(row['layers'])/total_layers:.1%})"
    print(f"  {row['full_pathway']:<50s} {layer_str:>35s}    {row['avg_rel_unc']:>18.1%}")
print()
print("  Note: Uncertainty is propagated from input measurement uncertainties only.")


## 4. Slab-Level Comparison (ECTP)

In [6]:
# Create ECTP slabs
ectp_slabs = []
for pit in pits:
    for slab in pit.create_slabs(weak_layer_def="ECTP_failure_layer"):
        ectp_slabs.append({'slab': slab, 'n_layers': len(slab.layers)})

print(f"Created {len(ectp_slabs)} ECTP slabs")

Created 14776 ECTP slabs


In [None]:
# Execute all elastic modulus pathways on each ECTP slab and count successes
# A slab succeeds for a pathway if ALL its layers have successful calculations
pathway_slab_success: dict = {}
slab_step_counts: dict = {}  # pathway → {'density': n, 'e_mod': n}

for info in ectp_slabs:
    slab = info['slab']
    n = info['n_layers']
    results = engine.execute_all(slab, 'elastic_modulus', config=config)
    for pathway_result in results.pathways.values():
        density_method = pathway_result.methods_used.get('density', 'unknown')
        e_mod_method   = pathway_result.methods_used.get('elastic_modulus', 'unknown')
        full_pathway   = f"{density_method} → {e_mod_method}"

        if full_pathway not in pathway_slab_success:
            pathway_slab_success[full_pathway] = 0
            slab_step_counts[full_pathway] = {'density': 0, 'e_mod': 0}

        sc = slab_step_counts[full_pathway]
        traces = pathway_result.computation_trace
        # Count successful per-layer traces; a slab passes a step only if every layer succeeded
        n_density_ok = sum(1 for t in traces if t.parameter == 'density'         and t.success and t.output is not None)
        n_e_mod_ok   = sum(1 for t in traces if t.parameter == 'elastic_modulus' and t.success and t.output is not None)
        if n_density_ok == n:
            sc['density'] += 1
        if n_e_mod_ok == n:
            sc['e_mod'] += 1
            pathway_slab_success[full_pathway] += 1

print(f'Executed pathways on {len(ectp_slabs)} slabs')


### Layer-Level vs Slab-Level Comparison

In [8]:
all_pathways = sorted(
    set(df_e_mod['full_pathway'].unique()) | set(pathway_slab_success.keys()),
    key=lambda p: df_e_mod[df_e_mod['full_pathway'] == p]['layer_id'].nunique() if p in df_e_mod['full_pathway'].values else 0,
    reverse=True,
)

total_layers = len(layer_infos)
total_slabs = len(ectp_slabs)

print(f"  {'Full Pathway':<50s} {'Layers':>22s} {'Slabs (ECTP)':>25s}")
print(f"  {'-'*99}")
for pathway in all_pathways:
    layer_n = df_e_mod[df_e_mod['full_pathway'] == pathway]['layer_id'].nunique() if pathway in df_e_mod['full_pathway'].values else 0
    layer_cov = layer_n / total_layers
    slab_n = pathway_slab_success.get(pathway, 0)
    slab_cov = slab_n / total_slabs
    print(f"  {pathway:<50s} {layer_n:>6d} ({layer_cov:>5.1%})    {slab_n:>6d} / {total_slabs} ({slab_cov:>5.1%})")

print()
print("  Slab success requires ALL layers in the slab to have successful calculations.")

  Full Pathway                                                       Layers              Slabs (ECTP)
  ---------------------------------------------------------------------------------------------------
  kim_jamieson_table2 → wautier                      205232 (55.3%)      2092 / 14776 (14.2%)
  geldsetzer → schottner                             181780 (48.9%)      2780 / 14776 (18.8%)
  kim_jamieson_table2 → schottner                    181780 (48.9%)      2780 / 14776 (18.8%)
  geldsetzer → wautier                               170727 (46.0%)      1607 / 14776 (10.9%)
  kim_jamieson_table2 → kochle                       102219 (27.5%)       525 / 14776 ( 3.6%)
  kim_jamieson_table5 → wautier                       93515 (25.2%)       552 / 14776 ( 3.7%)
  kim_jamieson_table2 → bergfeld                      88658 (23.9%)      1241 / 14776 ( 8.4%)
  geldsetzer → bergfeld                               88032 (23.7%)      1225 / 14776 ( 8.3%)
  kim_jamieson_table5 → schottner           

## 5. Sankey Diagrams: kim_jamieson_table2 → wautier

Data loss at each step of the calculation chain for the `kim_jamieson_table2 → wautier` pathway.
The left diagram shows layer-level calculations; the right diagram shows slab-level calculations
(where a slab only passes if **all** its layers succeed).
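
The gap between the two diagrams comes down to how success is counted. The sketch below uses made-up per-layer success flags to show the counting rule; the toy data and the helper `funnel_counts` are hypothetical and not part of `snowpyt_mechparams`.

```python
# Toy data: each slab is a list of layers, and each layer records whether the
# density step and the elastic-modulus step succeeded. All values are made up.
toy_slabs = [
    [{"density": True, "e_mod": True}, {"density": True, "e_mod": True}],
    [{"density": True, "e_mod": True}, {"density": True, "e_mod": False}],
    [{"density": False, "e_mod": False}, {"density": True, "e_mod": True}],
]


def funnel_counts(units):
    """Return (total, pass density, pass E-mod) for a list of units.

    Each unit is a list of layers; a unit passes a step only if ALL of its
    layers pass that step.
    """
    total = len(units)
    n_density = sum(all(layer["density"] for layer in unit) for unit in units)
    n_e_mod = sum(all(layer["e_mod"] for layer in unit) for unit in units)
    return total, n_density, n_e_mod


# Layer level: treat every layer as its own single-layer unit.
layer_units = [[layer] for slab in toy_slabs for layer in slab]
layer_funnel = funnel_counts(layer_units)

# Slab level: the all-layers rule applies to each multi-layer slab.
slab_funnel = funnel_counts(toy_slabs)

print("layers:", layer_funnel)
print("slabs: ", slab_funnel)
```

In this toy case 4 of 6 layers reach an elastic modulus but only 1 of 3 slabs does, because a single failing layer disqualifies its whole slab. The same rule explains why slab coverage in the comparison table is well below layer coverage.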


In [None]:
from plotly.subplots import make_subplots
import plotly.graph_objects as go

TARGET          = "kim_jamieson_table2 → wautier"
METHOD_DENSITY  = "kim_jamieson_table2"
METHOD_EMOD     = "wautier"

# ── Layer-level counts ────────────────────────────────────────────────────
lsc          = layer_step_counts[TARGET]
l_total      = len(layer_infos)
l_density    = lsc['density']
l_e_mod      = lsc['e_mod']
l_fail_dens  = l_total   - l_density
l_fail_emod  = l_density - l_e_mod

# ── Slab-level counts ─────────────────────────────────────────────────────
ssc          = slab_step_counts[TARGET]
s_total      = len(ectp_slabs)
s_density    = ssc['density']
s_e_mod      = ssc['e_mod']
s_fail_dens  = s_total   - s_density
s_fail_emod  = s_density - s_e_mod

def pct(n, d):
    """Format n/d as a percentage; return a dash placeholder when d is zero."""
    return f'{n/d:.1%}' if d else '—'

BLUE  = 'rgba( 68, 114, 196, 0.85)'
GREEN = 'rgba( 84, 168, 104, 0.85)'
GREY  = 'rgba(170, 170, 170, 0.65)'

def make_sankey(total, n_density, fail_density, n_emod, fail_emod, label_prefix):
    """Build a Sankey trace for a 2-step chain: All → density → elastic modulus."""
    # Nodes: 0=All, 1=Pass density, 2=Fail density, 3=Pass E-mod, 4=Fail E-mod
    node_labels = [
        f'All {label_prefix}<br>{total:,}',
        f'Pass density<br>{n_density:,} ({pct(n_density, total)})',
        f'Fail density<br>{fail_density:,} ({pct(fail_density, total)})',
        f'Pass E-mod<br>{n_emod:,} ({pct(n_emod, total)})',
        f'Fail E-mod<br>{fail_emod:,} ({pct(fail_emod, n_density)} of pass density)',
    ]
    node_x   = [0.01, 0.50, 0.50, 0.99, 0.99]
    node_y   = [0.40, 0.22, 0.82, 0.22, 0.72]
    node_col = [BLUE, BLUE, GREY, GREEN, GREY]
    return dict(
        node=dict(
            label=node_labels, x=node_x, y=node_y, color=node_col,
            pad=14, thickness=20, line=dict(color='white', width=0.5),
        ),
        link=dict(
            source=[0, 0, 1, 1],
            target=[1, 2, 3, 4],
            value=[n_density, fail_density, n_emod, fail_emod],
            color=[
                'rgba( 68, 114, 196, 0.22)',
                'rgba(170, 170, 170, 0.20)',
                'rgba( 84, 168, 104, 0.22)',
                'rgba(170, 170, 170, 0.20)',
            ],
        ),
    )

layer_sankey = make_sankey(l_total, l_density, l_fail_dens, l_e_mod, l_fail_emod, 'layers')
slab_sankey  = make_sankey(s_total, s_density, s_fail_dens, s_e_mod, s_fail_emod, 'slabs')

fig = make_subplots(
    rows=1, cols=2,
    subplot_titles=[
        f'Layer-level  ({l_total:,} layers)',
        f'Slab-level  ({s_total:,} ECTP slabs)',
    ],
    specs=[[{'type': 'sankey'}, {'type': 'sankey'}]],
    horizontal_spacing=0.08,
)

fig.add_trace(go.Sankey(arrangement='snap', **layer_sankey), row=1, col=1)
fig.add_trace(go.Sankey(arrangement='snap', **slab_sankey),  row=1, col=2)

# Method annotations: one pair per subplot, positioned in paper coordinates.
# Left subplot occupies x=[0, 0.46], right subplot x=[0.54, 1.0].
# In each subplot the density label sits at the horizontal midpoint and the
# E-mod label near the right edge.
annots = []
for label_col_density, label_col_emod in [
    (0.23, 0.44),   # left subplot
    (0.77, 0.98),   # right subplot
]:
    annots += [
        dict(
            x=label_col_density, y=-0.12, xref='paper', yref='paper',
            ax=0, ay=-25, axref='pixel', ayref='pixel',
            showarrow=True, arrowhead=2, arrowsize=1, arrowwidth=1.5,
            arrowcolor='rgba(68,114,196,0.7)',
            text=f'<b>{METHOD_DENSITY}</b><br><span style="font-size:10px">density</span>',
            font=dict(size=10, color='rgba(68,114,196,1)'), align='center',
        ),
        dict(
            x=label_col_emod, y=-0.12, xref='paper', yref='paper',
            ax=0, ay=-25, axref='pixel', ayref='pixel',
            showarrow=True, arrowhead=2, arrowsize=1, arrowwidth=1.5,
            arrowcolor='rgba(84,168,104,0.7)',
            text=f'<b>{METHOD_EMOD}</b><br><span style="font-size:10px">elastic modulus</span>',
            font=dict(size=10, color='rgba(84,168,104,1)'), align='center',
        ),
    ]

fig.update_layout(
    title=dict(
        text=(
            f'<b>{TARGET}: data loss at each calculation step</b><br>'
            '<sup>Left: each layer calculated independently — '
            'Right: slab passes only if ALL layers succeed</sup>'
        ),
        x=0.5, xanchor='center', font=dict(size=13),
    ),
    font=dict(size=11),
    width=1000, height=520,
    margin=dict(l=20, r=20, t=100, b=110),
    annotations=annots,
)

fig.show()
