# D11 Bending Stiffness Comparison Across Calculation Pathways

This notebook compares D11 (bending stiffness) values calculated via all possible parameterization pathways for ECTP slabs in the snow pilot dataset.

## Goals

1. Execute all pathways for each ECTP slab
2. Analyze data loss along different pathways
3. Compare D11 statistics by pathway
4. Compare D11 statistics across pathways (per slab)
5. Identify sources of variability

## Dataset

- **ECTP slabs**: Slabs from Extended Column Tests with Propagation
- **Slab definition**: Layers above the failure layer
- **Expected slabs**: ~14,776 from ~12,347 snow pits

## D11 Calculation

D11 (bending stiffness) requires:
- **Density** (ρ) for elastic modulus calculation (4 methods)
- **Elastic modulus** (E) on all layers (4 methods)
- **Poisson's ratio** (ν) on all layers (2 methods)
- **Layer positions** (depth_top, thickness) - already available
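
As a rough illustration of how these inputs combine, the sketch below computes a bending stiffness for a layered slab with classical laminate theory, integrating the plane-strain modulus E/(1 - ν²) through the thickness about the geometric mid-plane. The function name `d11_midplane`, the units (mm and MPa, giving N·mm), and the choice of reference plane are assumptions made for this example; the `snowpyt_mechparams` implementation may differ, for instance in its reference plane or its uncertainty handling.

```python
def d11_midplane(thickness, elastic_modulus, poissons_ratio):
    """Bending stiffness per unit width for a stack of isotropic layers.

    Layers are listed top to bottom. Thickness in mm and modulus in MPa
    (assumed units) give a result in N*mm.
    """
    total = sum(thickness)
    z_top = -total / 2.0  # measure z from the geometric mid-plane (assumption)
    d11 = 0.0
    for h, e, nu in zip(thickness, elastic_modulus, poissons_ratio):
        z_bot = z_top + h
        q11 = e / (1.0 - nu**2)  # plane-strain modulus of this layer
        d11 += q11 * (z_bot**3 - z_top**3) / 3.0
        z_top = z_bot
    return d11


# Sanity check: one homogeneous layer reduces to E*h^3 / (12*(1 - nu^2)).
single = d11_midplane([300.0], [5.0], [0.25])
closed_form = 5.0 * 300.0**3 / (12.0 * (1.0 - 0.25**2))

# Splitting the same slab into two identical layers must not change the result.
split = d11_midplane([150.0, 150.0], [5.0, 5.0], [0.25, 0.25])
```

Because every layer contributes through both E and ν, a single layer with a missing value makes the whole-slab D11 undefined, which is why the notebook tracks missing parameters per layer.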

Number of pathways = (# density methods) × (# E methods) × (# ν methods) = 4 × 4 × 2 = **32 unique pathways**

**Key insight**: The Srivastava method for Poisson's ratio takes hand hardness and grain form as direct inputs, not a calculated density. Because ν never branches on the density method, the pathway count stays at 32 instead of growing to the 80 that would result if E and ν each calculated density independently.
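
The count in the formula above can be checked by enumerating every combination of one density method, one E method, and one ν method. The method labels below are placeholders invented for this sketch, not the names used in `snowpyt_mechparams`.

```python
from itertools import product

# Placeholder labels (hypothetical); only the counts 4, 4, 2 come from the notebook.
density_methods = ["rho_1", "rho_2", "rho_3", "rho_4"]
modulus_methods = ["E_1", "E_2", "E_3", "E_4"]
poisson_methods = ["nu_1", "nu_2"]

# Each pathway picks exactly one method per parameter.
pathways = list(product(density_methods, modulus_methods, poisson_methods))

print(len(pathways))   # 32
print(pathways[0])     # ('rho_1', 'E_1', 'nu_1')
```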

## 1. Setup and Imports

In [None]:
import sys
from pathlib import Path
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from tqdm import tqdm
import warnings
warnings.filterwarnings('ignore')

# Add src to path if needed
sys.path.insert(0, str(Path.cwd().parent / 'src'))

from snowpyt_mechparams import ExecutionEngine
from snowpyt_mechparams.graph import graph
from snowpyt_mechparams.algorithm import find_parameterizations
from snowpyt_mechparams.snowpilot_utils import parse_caaml_file
from snowpyt_mechparams.data_structures import Pit

# Set plotting style
sns.set_style("whitegrid")
plt.rcParams['figure.figsize'] = (14, 8)

print("✓ Imports successful")

## 2. Verify Graph Structure and Pathway Count

In [None]:
# Check that D11 node exists
D11_node = graph.get_node("D11")
print(f"D11 node: {D11_node}")
print(f"D11 node type: {D11_node.type}")

# Find all pathways to D11
all_D11_pathways = find_parameterizations(graph, D11_node)
print(f"\nTotal D11 calculation pathways: {len(all_D11_pathways)}")

# Show first 5 pathways as examples
print("\nExample pathways:")
for i, param in enumerate(all_D11_pathways[:5], 1):
    print(f"\n{i}. {param}")

## 3. Load ECTP Slabs from Dataset

In [None]:
# Set data directory path
data_dir = Path.cwd() / 'data'
print(f"Data directory: {data_dir}")
print(f"Exists: {data_dir.exists()}")

# Get all CAAML files
caaml_files = list(data_dir.glob('snowpits-*-caaml.xml'))
print(f"\nFound {len(caaml_files):,} CAAML files")

In [None]:
# Parse CAAML files and create Pits
print("Parsing CAAML files and creating Pits...")

pits = []
failed_files = []

for filepath in tqdm(caaml_files, desc="Processing files"):
    try:
        # Parse CAAML file to get snowpylot SnowPit
        snow_pit = parse_caaml_file(str(filepath))
        
        # Create Pit object
        pit = Pit.from_snow_pit(snow_pit)
        pits.append(pit)
        
    except Exception as e:
        failed_files.append((filepath.name, str(e)))

print(f"\nSuccessfully parsed: {len(pits):,} pits")
print(f"Failed: {len(failed_files):,} files")

In [None]:
# Create slabs using ECTP failure layers
print("Creating slabs from ECTP failures...")

all_slabs = []
pits_with_ectp = 0

for pit in tqdm(pits, desc="Creating slabs"):
    # Create slabs based on ECTP failures
    slabs = pit.create_slabs(weak_layer_def="ECTP_failure_layer")
    
    if slabs:
        pits_with_ectp += 1
        all_slabs.extend(slabs)

print(f"\nTotal pits processed: {len(pits):,}")
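# Guard against an empty pit list (e.g. wrong data directory) to avoid dividing by zero
ectp_pct = 100 * pits_with_ectp / len(pits) if pits else 0.0
print(f"Pits with ECTP failures: {pits_with_ectp:,} ({ectp_pct:.1f}%)")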
print(f"Total ECTP slabs created: {len(all_slabs):,}")

## 4. Execute All Pathways for D11

This section executes all 32 pathways for each slab. The execution engine uses **dynamic programming**: within each slab it caches computed values so that pathways sharing a sub-path reuse them instead of recalculating.

**Expected executions**: ~14,776 slabs × 32 pathways = ~473,000 pathway executions
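
To make the caching idea concrete, here is a minimal sketch of per-slab memoization. It is not the `ExecutionEngine` implementation; the cache keys and the stand-in "calculation" are assumptions chosen only to count how many distinct intermediate values 32 pathways actually need.

```python
from itertools import product


def run_all_pathways():
    """Walk 4 x 4 x 2 pathways for one slab, caching shared intermediate values."""
    cache = {}       # reset for every slab in a real run
    evaluations = 0  # how many values were actually computed

    def compute(key):
        nonlocal evaluations
        if key not in cache:
            evaluations += 1
            cache[key] = key  # stand-in for a real layer calculation
        return cache[key]

    pathways = 0
    for rho_m, e_m, nu_m in product(range(4), range(4), range(2)):
        compute(("density", rho_m))
        # E depends on which density method fed it, so both indices form the key.
        compute(("elastic_modulus", rho_m, e_m))
        # Poisson's ratio is keyed on its own method only.
        compute(("poissons_ratio", nu_m))
        pathways += 1
    return pathways, evaluations


pathways, evaluations = run_all_pathways()
print(pathways, evaluations)  # 32 22
```

Without the cache, 32 pathways with three parameters each would trigger 96 evaluations; with it, only 4 + 16 + 2 = 22 distinct values are computed in this toy setup.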

In [None]:
# Initialize execution engine
engine = ExecutionEngine(graph)

print(f"Executing all pathways for {len(all_slabs):,} slabs...")
print(f"Total pathway executions: {len(all_slabs) * len(all_D11_pathways):,}")
print("\nNote: Dynamic programming caches computed values within each slab.")
print("This avoids redundant calculations when pathways share common sub-paths.")

In [None]:
# Execute all pathways for each slab
results_data = []

for slab_idx, slab in enumerate(tqdm(all_slabs, desc="Executing pathways")):
    try:
        # Execute all pathways (uses dynamic programming internally)
        results = engine.execute_all(
            slab=slab,
            target_parameter='D11',
            include_plate_theory=True
        )

        # Record results for each pathway
        for pathway_desc, result in results.results.items():
            # Record even failed pathways (for data loss analysis)
            record = {
                'pit_id': slab.pit_id,
                'slab_id': slab.slab_id,
                'slab_index': slab_idx,
                'pathway_description': pathway_desc,
                'methods_used': str(result.methods_used),
                'success': result.success,
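                # Test against None explicitly so a valid D11 is never dropped by a truthiness check
                'D11': result.slab_result.D11.nominal_value if (result.slab_result is not None and result.slab_result.D11 is not None) else None,
                'D11_uncertainty': result.slab_result.D11.std_dev if (result.slab_result is not None and result.slab_result.D11 is not None) else None,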
                'num_layers': len(slab.layers),
                'slab_thickness_cm': slab.total_thickness,
                'slope_angle_deg': slab.angle,
            }

            # Add failure analysis data
            if not result.success or (result.slab_result and result.slab_result.D11 is None):
                # Determine why it failed
                # Count layers with missing elastic_modulus or poissons_ratio
                missing_E = sum(1 for lr in result.layer_results if lr.layer.elastic_modulus is None)
                missing_nu = sum(1 for lr in result.layer_results if lr.layer.poissons_ratio is None)
                missing_thickness = sum(1 for lr in result.layer_results if lr.layer.thickness is None)

                record['failure_reason'] = 'incomplete_layer_params'
                record['layers_missing_E'] = missing_E
                record['layers_missing_nu'] = missing_nu
                record['layers_missing_thickness'] = missing_thickness
                record['success'] = False

            results_data.append(record)

    except Exception as e:
        # Record complete failure
        results_data.append({
            'pit_id': slab.pit_id,
            'slab_id': slab.slab_id,
            'slab_index': slab_idx,
            'pathway_description': 'EXECUTION_ERROR',
            'success': False,
            'failure_reason': str(e),
        })

# Create DataFrame
df_results = pd.DataFrame(results_data)

print(f"\nTotal pathway executions: {len(df_results):,}")
print(f"Successful calculations: {df_results['success'].sum():,}")
print(f"Failed calculations: {(~df_results['success']).sum():,}")
print(f"Success rate: {100 * df_results['success'].mean():.1f}%")

In [None]:
# Save raw results
output_file = 'D11_pathway_comparison_raw.csv'
df_results.to_csv(output_file, index=False)
print(f"\nRaw results saved to: {output_file}")
print(f"File size: {Path(output_file).stat().st_size / 1024 / 1024:.1f} MB")

## 5. Data Loss Analysis by Pathway

Analyze which pathways have higher success rates and why others fail.
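
The success-rate calculation amounts to counting successes and attempts per pathway. A small sketch on made-up data (the pathway labels `A` and `B` and the values are invented for illustration) shows the same idea using pandas named aggregation, which yields flat column names directly:

```python
import pandas as pd

# Toy results: three attempts per pathway (fabricated for illustration only).
toy = pd.DataFrame({
    "pathway_description": ["A", "A", "A", "B", "B", "B"],
    "success": [True, True, False, True, False, False],
})

rates = (
    toy.groupby("pathway_description")["success"]
       .agg(successful="sum", total="count")  # named aggregation
       .reset_index()
)
rates["success_rate_%"] = 100 * rates["successful"] / rates["total"]

print(rates.to_string(index=False))
```

Here pathway `A` succeeds in 2 of 3 attempts and `B` in 1 of 3, so the rates are about 66.7% and 33.3%.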

In [None]:
print("="*80)
print("DATA LOSS ANALYSIS BY PATHWAY")
print("="*80)

# Group by pathway
pathway_stats = df_results.groupby('pathway_description').agg({
    'success': ['sum', 'count'],
    'D11': 'count'
}).reset_index()

pathway_stats.columns = ['pathway', 'successful', 'total', 'D11_computed']
pathway_stats['success_rate_%'] = 100 * pathway_stats['successful'] / pathway_stats['total']
pathway_stats['failure_rate_%'] = 100 - pathway_stats['success_rate_%']

# Sort by success rate
pathway_stats_sorted = pathway_stats.sort_values('success_rate_%', ascending=False)

# With only 32 pathways, separate top-20 and bottom-20 listings would overlap, so print the full table
print("\nPathway Success Rates (all pathways, best first):")
print(pathway_stats_sorted.to_string(index=False))

In [None]:
# Visualize success rates
fig, ax = plt.subplots(figsize=(14, 10))
pathways_to_plot = pathway_stats_sorted.head(30)  # Top 30

y_pos = np.arange(len(pathways_to_plot))
ax.barh(y_pos, pathways_to_plot['success_rate_%'], alpha=0.7, color='steelblue')
ax.set_yticks(y_pos)
ax.set_yticklabels(pathways_to_plot['pathway'], fontsize=7)
ax.set_xlabel('Success Rate (%)', fontsize=12)
ax.set_title('D11 Calculation Success Rate by Pathway (Top 30)', fontsize=14, fontweight='bold')
ax.grid(axis='x', alpha=0.3)
ax.invert_yaxis()
plt.tight_layout()
plt.savefig('D11_success_rates_by_pathway.png', dpi=300, bbox_inches='tight')
plt.show()

print("✓ Figure saved: D11_success_rates_by_pathway.png")

In [None]:
# Analyze failure reasons
print("\n" + "="*80)
print("FAILURE REASON ANALYSIS")
print("="*80)

failed_df = df_results[~df_results['success']]
if len(failed_df) > 0 and 'failure_reason' in failed_df.columns:
    failure_counts = failed_df['failure_reason'].value_counts()
    print("\nFailure Reasons:")
    for reason, count in failure_counts.items():
        pct = 100 * count / len(failed_df)
        print(f"  {reason}: {count:,} ({pct:.1f}%)")
    
    # Analyze missing layer parameters
    if 'layers_missing_E' in failed_df.columns:
        print("\nMissing Layer Parameters (among failures):")
        print(f"  Average layers missing E: {failed_df['layers_missing_E'].mean():.2f}")
        print(f"  Average layers missing ν: {failed_df['layers_missing_nu'].mean():.2f}")
        print(f"  Average layers missing thickness: {failed_df['layers_missing_thickness'].mean():.2f}")

## 6. D11 Statistics by Individual Pathway

Compare D11 distributions across different calculation pathways.

In [None]:
print("="*80)
print("D11 STATISTICS BY PATHWAY")
print("="*80)

# Filter to successful calculations only
df_success = df_results[df_results['success'] == True].copy()

print(f"\nSuccessful D11 calculations: {len(df_success):,}")
print(f"Unique slabs with at least one successful pathway: {df_success['slab_id'].nunique():,}")
print(f"Unique pathways that succeeded: {df_success['pathway_description'].nunique()}")

# Calculate statistics for each pathway
pathway_D11_stats = df_success.groupby('pathway_description')['D11'].agg([
    'count',
    'mean',
    'median',
    'std',
    'min',
    'max'
]).reset_index()

pathway_D11_stats.columns = ['pathway', 'n', 'mean_D11', 'median_D11', 'std_D11', 'min_D11', 'max_D11']
pathway_D11_stats = pathway_D11_stats.sort_values('mean_D11', ascending=False)

print("\nD11 Statistics by Pathway (N·mm) - Top 20:")
print(pathway_D11_stats.head(20).to_string(index=False))

In [None]:
# Save pathway statistics
pathway_D11_stats.to_csv('D11_statistics_by_pathway.csv', index=False)
print("\n✓ Pathway statistics saved to: D11_statistics_by_pathway.csv")

In [None]:
# Box plot comparison for top pathways by sample size
fig, ax = plt.subplots(figsize=(16, 10))

# Get top 20 pathways by sample size
top_pathways = pathway_D11_stats.nlargest(20, 'n')['pathway'].tolist()
df_top = df_success[df_success['pathway_description'].isin(top_pathways)]

# Create box plot
bp = df_top.boxplot(column='D11', by='pathway_description', ax=ax, rot=90, patch_artist=True)
ax.set_ylabel('D11 (N·mm)', fontsize=12)
ax.set_xlabel('Pathway', fontsize=12)
ax.set_title('D11 Distribution by Pathway (Top 20 by Sample Size)', fontsize=14, fontweight='bold')
plt.suptitle('')  # Remove automatic title
plt.xticks(fontsize=8)
plt.tight_layout()
plt.savefig('D11_boxplot_by_pathway.png', dpi=300, bbox_inches='tight')
plt.show()

print("✓ Figure saved: D11_boxplot_by_pathway.png")

In [None]:
# Violin plot for top pathways
fig, ax = plt.subplots(figsize=(16, 10))

# Use top 10 for cleaner visualization
top_10_pathways = pathway_D11_stats.nlargest(10, 'n')['pathway'].tolist()
df_top10 = df_success[df_success['pathway_description'].isin(top_10_pathways)]

sns.violinplot(data=df_top10, x='pathway_description', y='D11', ax=ax, inner='box')
ax.set_xlabel('Pathway', fontsize=12)
ax.set_ylabel('D11 (N·mm)', fontsize=12)
ax.set_title('D11 Distribution by Pathway (Violin Plot - Top 10)', fontsize=14, fontweight='bold')
plt.xticks(rotation=90, fontsize=8)
plt.tight_layout()
plt.savefig('D11_violin_by_pathway.png', dpi=300, bbox_inches='tight')
plt.show()

print("✓ Figure saved: D11_violin_by_pathway.png")

## 7. D11 Variability Across Pathways (Per Slab)

For each slab, analyze how D11 varies across different calculation pathways.
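
The variability metric used in this section is the coefficient of variation (CV), the standard deviation of D11 across pathways divided by the mean. The sketch below applies it to invented numbers; note that pandas computes the sample standard deviation (`ddof=1`) by default, so a slab with only one successful pathway gets `NaN`.

```python
import pandas as pd

# Toy data (fabricated): slab s1 has three pathway results, s2 has one.
toy = pd.DataFrame({
    "slab_id": ["s1", "s1", "s1", "s2"],
    "D11": [90.0, 100.0, 110.0, 50.0],
})

per_slab = toy.groupby("slab_id")["D11"].agg(["count", "mean", "std"])
per_slab["cv"] = per_slab["std"] / per_slab["mean"]

print(per_slab)
```

For `s1` the mean is 100 and the sample standard deviation is 10, giving a CV of 0.1; `s2` has an undefined CV, which is why the histograms below call `.dropna()`.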

In [None]:
print("="*80)
print("D11 VARIABILITY ACROSS PATHWAYS (PER SLAB)")
print("="*80)

# For each slab, calculate statistics across all successful pathways
slab_D11_variability = df_success.groupby('slab_id')['D11'].agg([
    'count',  # Number of successful pathways
    'mean',   # Mean D11 across pathways
    'std',    # Std dev across pathways (variability)
    'min',    # Min D11
    'max',    # Max D11
]).reset_index()

slab_D11_variability.columns = ['slab_id', 'n_pathways', 'mean_D11', 'std_D11', 'min_D11', 'max_D11']
slab_D11_variability['range_D11'] = slab_D11_variability['max_D11'] - slab_D11_variability['min_D11']
slab_D11_variability['cv_D11'] = slab_D11_variability['std_D11'] / slab_D11_variability['mean_D11']  # Coefficient of variation

# Merge with slab properties
slab_props = df_success[['slab_id', 'num_layers', 'slab_thickness_cm', 'slope_angle_deg']].drop_duplicates()
slab_D11_variability = slab_D11_variability.merge(slab_props, on='slab_id')

print(f"\nSlabs with successful D11 calculations: {len(slab_D11_variability):,}")
print(f"Average successful pathways per slab: {slab_D11_variability['n_pathways'].mean():.1f}")

print("\nD11 Variability Statistics:")
print(slab_D11_variability[['mean_D11', 'std_D11', 'cv_D11', 'range_D11']].describe())

In [None]:
# Save slab-level statistics
slab_D11_variability.to_csv('D11_variability_by_slab.csv', index=False)
print("\n✓ Slab variability statistics saved to: D11_variability_by_slab.csv")

In [None]:
# Histogram of coefficient of variation and other variability metrics
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# CV histogram
axes[0, 0].hist(slab_D11_variability['cv_D11'].dropna(), bins=50, edgecolor='black', alpha=0.7, color='steelblue')
axes[0, 0].set_xlabel('Coefficient of Variation', fontsize=10)
axes[0, 0].set_ylabel('Frequency', fontsize=10)
axes[0, 0].set_title('D11 Coefficient of Variation Across Pathways', fontsize=11, fontweight='bold')
axes[0, 0].axvline(slab_D11_variability['cv_D11'].median(), color='red', linestyle='--', 
                   label=f'Median: {slab_D11_variability["cv_D11"].median():.3f}')
axes[0, 0].legend()

# Std dev histogram
axes[0, 1].hist(slab_D11_variability['std_D11'].dropna(), bins=50, edgecolor='black', alpha=0.7, color='coral')
axes[0, 1].set_xlabel('Standard Deviation (N·mm)', fontsize=10)
axes[0, 1].set_ylabel('Frequency', fontsize=10)
axes[0, 1].set_title('D11 Standard Deviation Across Pathways', fontsize=11, fontweight='bold')

# Range histogram
axes[1, 0].hist(slab_D11_variability['range_D11'].dropna(), bins=50, edgecolor='black', alpha=0.7, color='mediumseagreen')
axes[1, 0].set_xlabel('Range (N·mm)', fontsize=10)
axes[1, 0].set_ylabel('Frequency', fontsize=10)
axes[1, 0].set_title('D11 Range Across Pathways', fontsize=11, fontweight='bold')

# Number of successful pathways
axes[1, 1].hist(slab_D11_variability['n_pathways'], 
                bins=range(int(slab_D11_variability['n_pathways'].min()), 
                          int(slab_D11_variability['n_pathways'].max())+2), 
                edgecolor='black', alpha=0.7, color='mediumpurple')
axes[1, 1].set_xlabel('Number of Successful Pathways', fontsize=10)
axes[1, 1].set_ylabel('Frequency', fontsize=10)
axes[1, 1].set_title('Successful Pathways per Slab', fontsize=11, fontweight='bold')

plt.tight_layout()
plt.savefig('D11_variability_distributions.png', dpi=300, bbox_inches='tight')
plt.show()

print("✓ Figure saved: D11_variability_distributions.png")

## 8. Relationship Between Variability and Slab Properties

In [None]:
print("="*80)
print("D11 VARIABILITY vs SLAB PROPERTIES")
print("="*80)

# Create scatter plots
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# CV vs slab thickness
axes[0, 0].scatter(slab_D11_variability['slab_thickness_cm'], slab_D11_variability['cv_D11'], 
                   alpha=0.3, s=10, color='steelblue')
axes[0, 0].set_xlabel('Slab Thickness (cm)', fontsize=10)
axes[0, 0].set_ylabel('Coefficient of Variation', fontsize=10)
axes[0, 0].set_title('D11 Variability vs Slab Thickness', fontsize=11, fontweight='bold')
axes[0, 0].grid(alpha=0.3)

# CV vs number of layers
axes[0, 1].scatter(slab_D11_variability['num_layers'], slab_D11_variability['cv_D11'], 
                   alpha=0.3, s=10, color='coral')
axes[0, 1].set_xlabel('Number of Layers', fontsize=10)
axes[0, 1].set_ylabel('Coefficient of Variation', fontsize=10)
axes[0, 1].set_title('D11 Variability vs Number of Layers', fontsize=11, fontweight='bold')
axes[0, 1].grid(alpha=0.3)

# CV vs slope angle
axes[1, 0].scatter(slab_D11_variability['slope_angle_deg'], slab_D11_variability['cv_D11'], 
                   alpha=0.3, s=10, color='mediumseagreen')
axes[1, 0].set_xlabel('Slope Angle (degrees)', fontsize=10)
axes[1, 0].set_ylabel('Coefficient of Variation', fontsize=10)
axes[1, 0].set_title('D11 Variability vs Slope Angle', fontsize=11, fontweight='bold')
axes[1, 0].grid(alpha=0.3)

# Mean D11 vs number of successful pathways
axes[1, 1].scatter(slab_D11_variability['n_pathways'], slab_D11_variability['mean_D11'], 
                   alpha=0.3, s=10, color='mediumpurple')
axes[1, 1].set_xlabel('Number of Successful Pathways', fontsize=10)
axes[1, 1].set_ylabel('Mean D11 (N·mm)', fontsize=10)
axes[1, 1].set_title('Mean D11 vs Number of Successful Pathways', fontsize=11, fontweight='bold')
axes[1, 1].grid(alpha=0.3)

plt.tight_layout()
plt.savefig('D11_variability_vs_properties.png', dpi=300, bbox_inches='tight')
plt.show()

print("✓ Figure saved: D11_variability_vs_properties.png")

In [None]:
# Calculate correlations
print("\nCorrelations with D11 Coefficient of Variation:")

# Remove rows with NaN in relevant columns for correlation
corr_data = slab_D11_variability[[
    'slab_thickness_cm', 'num_layers', 'slope_angle_deg', 
    'n_pathways', 'cv_D11'
]].dropna()

correlations = {
    'Slab Thickness': corr_data[['slab_thickness_cm', 'cv_D11']].corr().iloc[0, 1],
    'Number of Layers': corr_data[['num_layers', 'cv_D11']].corr().iloc[0, 1],
    'Slope Angle': corr_data[['slope_angle_deg', 'cv_D11']].corr().iloc[0, 1],
    'Number of Pathways': corr_data[['n_pathways', 'cv_D11']].corr().iloc[0, 1],
}

for prop, corr in correlations.items():
    print(f"  {prop}: {corr:.3f}")

## 9. Pairwise Pathway Comparison

For slabs where multiple pathways succeeded, compare D11 values pairwise to see how different methods correlate.
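
One compact way to express this comparison, shown here on invented data, is to pivot the long results table into one column per pathway and keep only slabs where both pathways produced a value. The labels `A` and `B` are placeholders; `DataFrame.pivot` raises an error if a (slab, pathway) pair appears more than once, so this sketch assumes unique pairs.

```python
import pandas as pd

# Toy long-format results (fabricated): slab s4 only succeeded on pathway A.
toy = pd.DataFrame({
    "slab_id": ["s1", "s1", "s2", "s2", "s3", "s3", "s4"],
    "pathway_description": ["A", "B", "A", "B", "A", "B", "A"],
    "D11": [10.0, 20.0, 20.0, 40.0, 30.0, 60.0, 99.0],
})

# One row per slab, one column per pathway.
wide = toy.pivot(index="slab_id", columns="pathway_description", values="D11")

# Keep only slabs where both pathways succeeded, then correlate.
both = wide.dropna(subset=["A", "B"])
r = both["A"].corr(both["B"])

print(len(both), round(r, 3))
```

A high correlation only shows that two pathways rank slabs similarly. In this toy case pathway `B` is exactly twice `A`, so the correlation is perfect even though every point sits off the 1:1 line, which is why the scatter plots below draw that line as well.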

In [None]:
print("="*80)
print("PAIRWISE PATHWAY COMPARISON")
print("="*80)

# Get most common pathways
top_n_pathways = 6
most_common_pathways = df_success['pathway_description'].value_counts().head(top_n_pathways).index.tolist()

print(f"\nComparing top {top_n_pathways} most common pathways:")
for i, pw in enumerate(most_common_pathways, 1):
    n = df_success[df_success['pathway_description'] == pw].shape[0]
    print(f"{i}. {pw[:80]}... (n={n})")

In [None]:
# Create pairwise scatter plots
fig, axes = plt.subplots(2, 2, figsize=(14, 12))
axes = axes.flatten()

comparison_idx = 0
for i in range(len(most_common_pathways)):
    for j in range(i+1, len(most_common_pathways)):
        if comparison_idx >= 4:
            break

        pw1 = most_common_pathways[i]
        pw2 = most_common_pathways[j]

        # Find slabs where both succeeded
        slabs_pw1 = set(df_success[df_success['pathway_description'] == pw1]['slab_id'])
        slabs_pw2 = set(df_success[df_success['pathway_description'] == pw2]['slab_id'])
        common_slabs = slabs_pw1.intersection(slabs_pw2)

        if len(common_slabs) > 10:  # Only compare if sufficient overlap
            # Get D11 values for common slabs
            df_pw1 = df_success[(df_success['pathway_description'] == pw1) & 
                                (df_success['slab_id'].isin(common_slabs))]
            df_pw2 = df_success[(df_success['pathway_description'] == pw2) & 
                                (df_success['slab_id'].isin(common_slabs))]

            # Merge on slab_id
            df_comparison = df_pw1[['slab_id', 'D11']].merge(
                df_pw2[['slab_id', 'D11']],
                on='slab_id',
                suffixes=('_1', '_2')
            )

            # Scatter plot
            ax = axes[comparison_idx]
            ax.scatter(df_comparison['D11_1'], df_comparison['D11_2'], alpha=0.5, s=10)

            # Add 1:1 line
            min_val = min(df_comparison['D11_1'].min(), df_comparison['D11_2'].min())
            max_val = max(df_comparison['D11_1'].max(), df_comparison['D11_2'].max())
            ax.plot([min_val, max_val], [min_val, max_val], 'r--', alpha=0.5, label='1:1 line')

            # Calculate correlation
            corr = df_comparison[['D11_1', 'D11_2']].corr().iloc[0, 1]

            ax.set_xlabel(f'Pathway {i+1} D11 (N·mm)', fontsize=9)
            ax.set_ylabel(f'Pathway {j+1} D11 (N·mm)', fontsize=9)
            ax.set_title(f'Pathways {i+1} vs {j+1} (n={len(common_slabs)}, r={corr:.3f})', fontsize=10, fontweight='bold')
            ax.legend(fontsize=8)
            ax.grid(alpha=0.3)

            comparison_idx += 1

# Remove unused subplots
for idx in range(comparison_idx, 4):
    fig.delaxes(axes[idx])

plt.tight_layout()
plt.savefig('D11_pairwise_pathway_comparison.png', dpi=300, bbox_inches='tight')
plt.show()

print("\n✓ Figure saved: D11_pairwise_pathway_comparison.png")

## 10. Summary and Conclusions

In [None]:
print("="*80)
print("SUMMARY")
print("="*80)

print(f"\n{'Total slabs analyzed:':<45} {len(all_slabs):,}")
print(f"{'Total pathway executions:':<45} {len(df_results):,}")
print(f"{'Successful D11 calculations:':<45} {df_success.shape[0]:,}")
print(f"{'Overall success rate:':<45} {100 * df_results['success'].mean():.1f}%")
print(f"{'Slabs with at least one successful pathway:':<45} {slab_D11_variability.shape[0]:,}")
print(f"{'Unique pathways that succeeded:':<45} {df_success['pathway_description'].nunique()}")

print(f"\n{'Mean D11 (across all calculations):':<45} {df_success['D11'].mean():.1f} N·mm")
print(f"{'Median D11:':<45} {df_success['D11'].median():.1f} N·mm")
print(f"{'Std dev D11:':<45} {df_success['D11'].std():.1f} N·mm")
print(f"{'Range D11:':<45} {df_success['D11'].min():.1f} - {df_success['D11'].max():.1f} N·mm")

print(f"\n{'Mean pathway variability (CV):':<45} {slab_D11_variability['cv_D11'].mean():.3f}")
print(f"{'Median pathway variability (CV):':<45} {slab_D11_variability['cv_D11'].median():.3f}")
print(f"{'Max pathway variability (CV):':<45} {slab_D11_variability['cv_D11'].max():.3f}")

print(f"\n{'Average successful pathways per slab:':<45} {slab_D11_variability['n_pathways'].mean():.1f}")
print(f"{'Max successful pathways for a single slab:':<45} {slab_D11_variability['n_pathways'].max():.0f}")

print("\n" + "="*80)
print("EXPORTED FILES")
print("="*80)
print("CSV Files:")
print("  1. D11_pathway_comparison_raw.csv - All pathway execution results")
print("  2. D11_statistics_by_pathway.csv - Summary statistics by pathway")
print("  3. D11_variability_by_slab.csv - Variability statistics by slab")
print("\nFigures (PNG):")
print("  4. D11_success_rates_by_pathway.png - Success rate comparison")
print("  5. D11_boxplot_by_pathway.png - D11 distributions (box plot)")
print("  6. D11_violin_by_pathway.png - D11 distributions (violin plot)")
print("  7. D11_variability_distributions.png - CV, std, range histograms")
print("  8. D11_variability_vs_properties.png - Variability vs slab properties")
print("  9. D11_pairwise_pathway_comparison.png - Pairwise pathway correlations")

print("\n" + "="*80)
print("KEY FINDINGS")
print("="*80)

# Find pathway with highest success rate
best_pathway = pathway_stats_sorted.iloc[0]
print(f"\n1. Highest Success Rate Pathway:")
print(f"   {best_pathway['pathway'][:80]}...")
print(f"   Success rate: {best_pathway['success_rate_%']:.1f}%")

# Summarize how much D11 varies across pathways (mean per-slab CV)
if len(slab_D11_variability) > 0:
    print(f"\n2. Pathway Variability:")
    print(f"   Mean CV across slabs: {slab_D11_variability['cv_D11'].mean():.3f}")
    print(f"   On average, the spread across pathways is ~{100*slab_D11_variability['cv_D11'].mean():.1f}% of a slab's mean D11")

# Correlation insights
print(f"\n3. Property Correlations:")
strongest_corr = max(correlations.items(), key=lambda x: abs(x[1]))
print(f"   Strongest correlation with CV: {strongest_corr[0]} (r={strongest_corr[1]:.3f})")

print("\n" + "="*80)