# Hierarchy Health Dashboard

**Purpose**: Quick visual dashboard for hierarchy health monitoring.

**This notebook**:
- ⚡ Fast loading and minimal code
- 📊 Executive summary at-a-glance
- 🎯 Focus on actionable insights
- 🔍 Quick health checks
- 📈 Compare multiple training runs

**Use Cases**:
- Quick health check after training
- Monitor training experiments
- Compare configurations side-by-side
- Identify issues immediately

**For detailed analysis**: Use `hierarchy_metrics.ipynb`

## Quick Setup

In [None]:
from tools.hierarchy_metrics import MetricsReport, MetricsConfig
from tools.hierarchy_metrics.visualization import (
    plot_health_summary,
    plot_compression_ratios,
    plot_pattern_counts,
    plot_coverage_heatmap,
    plot_training_dynamics
)

import matplotlib.pyplot as plt
%matplotlib inline

# The 'seaborn-v0_8-*' style names require Matplotlib 3.6 or newer
plt.style.use('seaborn-v0_8-darkgrid')

print("✓ Ready for dashboard")

## Load Report

Set `GRAPH_DB_PATH` to your latest hierarchy graph database.

In [None]:
# Path to the latest hierarchy graph database
GRAPH_DB_PATH = './metrics/hierarchy_graph.db'

# Quick config (minimal analysis)
config = MetricsConfig(
    compute_pattern_diversity=False,  # Skip expensive computation
    prediction_test_size=50,          # Fewer prediction samples
)

# Generate report
report = MetricsReport.generate(
    graph_db_path=GRAPH_DB_PATH,
    config=config,
    verbose=False  # Quiet mode
)

print("✓ Report loaded")

## 🎯 Executive Summary

In [None]:
summary = report.metrics_summary

# Overall health with emoji
health_emoji = {
    'excellent': '🟢 EXCELLENT',
    'good': '🟢 GOOD',
    'warning': '🟡 WARNING',
    'poor': '🟠 POOR',
    'critical': '🔴 CRITICAL'
}

print("═" * 80)
# Use .get() so an unexpected status value does not raise a KeyError
print(f"OVERALL HEALTH: {health_emoji.get(summary.overall_health.value, '❓ UNKNOWN')}")
print("═" * 80)

# Category breakdown
print("\n📊 CATEGORY BREAKDOWN\n")
categories = [
    ('Compression', summary.compression),
    ('Connectivity', summary.connectivity),
    ('Information', summary.information),
    ('Prediction', summary.prediction),
    ('Training Dynamics', summary.training_dynamics)
]

for name, status in categories:
    emoji = health_emoji.get(status.value, '❓ UNKNOWN')
    print(f"  {name:20s}: {emoji}")

# Key metrics
print("\n📈 KEY METRICS\n")
total_patterns = sum(report.compression.pattern_counts.values())
print(f"  Total Patterns: {total_patterns:,}")

if report.compression.compression_ratios:
    avg_compression = sum(report.compression.compression_ratios.values()) / len(report.compression.compression_ratios)
    print(f"  Avg Compression: {avg_compression:.1f}x")

if report.connectivity.reusability:
    # Only the lowest level (node0) is reported here
    node0_orphan_rate = report.connectivity.reusability.get('node0', {}).get('orphan_rate', 0)
    print(f"  Orphan Rate (node0): {node0_orphan_rate:.1%}")

if report.information.constraint_effectiveness:
    avg_effectiveness = sum(report.information.constraint_effectiveness.values()) / len(report.information.constraint_effectiveness)
    print(f"  Constraint Effectiveness: {avg_effectiveness:.1%}")

# Issues and recommendations
if summary.critical_issues:
    print("\n⚠️  CRITICAL ISSUES\n")
    for issue in summary.critical_issues[:3]:
        print(f"  • {issue}")

if summary.recommendations:
    print("\n💡 TOP RECOMMENDATIONS\n")
    for rec in summary.recommendations[:3]:
        print(f"  → {rec}")

print("\n" + "═" * 80)
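
The key metrics above average the values of plain dictionaries and guard against empty ones with `if` checks. A minimal sketch of the same idea as a reusable helper follows. The name `mean_of_values` and the sample level-pair keys are hypothetical and are not part of `tools.hierarchy_metrics`.

```python
def mean_of_values(metric_by_key):
    """Return the arithmetic mean of a dict's values, or None if it is empty."""
    if not metric_by_key:
        return None
    return sum(metric_by_key.values()) / len(metric_by_key)


# Hypothetical compression ratios keyed by level pair
sample_ratios = {"node0->node1": 4.0, "node1->node2": 6.0}

avg = mean_of_values(sample_ratios)
if avg is not None:
    print(f"  Avg Compression: {avg:.1f}x")
```

Returning `None` for an empty dictionary keeps the "skip this line when there is no data" behaviour of the cell above without risking a division by zero.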

## 📊 Visual Dashboard

In [None]:
# Overall health dashboard
plot_health_summary(report, figsize=(14, 10))

## 🔍 Quick Metrics

In [None]:
# Compression and coverage (side by side)
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 5))

# Compression ratios
ratios = report.compression.compression_ratios
pairs = list(ratios.keys())
values = list(ratios.values())

ax1.bar(pairs, values, color='steelblue', alpha=0.7, edgecolor='black')
ax1.set_ylabel('Compression Ratio', fontsize=12)
ax1.set_title('Compression Ratios', fontsize=14, fontweight='bold')
ax1.grid(axis='y', alpha=0.3)
plt.setp(ax1.xaxis.get_majorticklabels(), rotation=45, ha='right')

# Coverage
coverage = report.connectivity.coverage
cov_pairs = list(coverage.keys())
cov_values = list(coverage.values())

ax2.barh(cov_pairs, cov_values, color='coral', alpha=0.7, edgecolor='black')
ax2.set_xlabel('Coverage Rate', fontsize=12)
ax2.set_title('Coverage', fontsize=14, fontweight='bold')
ax2.set_xlim([0, 1])
ax2.grid(axis='x', alpha=0.3)

plt.tight_layout()
plt.show()

In [None]:
# Pattern counts (log scale)
plot_pattern_counts(report, figsize=(12, 5), log_scale=True)

## 📈 Training Progress

In [None]:
# Training dynamics (if available)
if report.training_dynamics:
    plot_training_dynamics(report, figsize=(16, 6))
    
    # Quick stats
    dynamics = report.training_dynamics
    print(f"Growth Exponent: {dynamics.growth_exponent:.3f} (target: 0.5-0.7)")
    print(f"Reusability Trend: {dynamics.reusability_trend_slope:.4f} (positive is good)")
else:
    print("No training dynamics captured (enable checkpoints during training)")
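
The cell above reports `growth_exponent` but does not show how such a value can be obtained. One common approach, assuming the pattern count grows roughly as a power law of the number of samples seen (`count ≈ a * samples ** b`), is a least-squares fit in log-log space, where the slope is the exponent `b`. The sketch below illustrates that approach with synthetic data. It is not necessarily how `tools.hierarchy_metrics` computes the value, and `fit_growth_exponent` is a hypothetical name.

```python
import math


def fit_growth_exponent(samples_seen, pattern_counts):
    """Least-squares slope of log(pattern count) against log(samples seen)."""
    xs = [math.log(n) for n in samples_seen]
    ys = [math.log(c) for c in pattern_counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance


# Synthetic checkpoints that follow count = 3 * samples ** 0.6 exactly
samples = [100, 1_000, 10_000, 100_000]
counts = [3 * n ** 0.6 for n in samples]

print(f"Growth Exponent: {fit_growth_exponent(samples, counts):.3f}")
```

Under this reading, an exponent near 1.0 means nearly every new sample adds new patterns (linear growth), while a lower exponent means existing patterns are being reused.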

## 🔧 Quick Actions

Based on current health status:

In [None]:
# Actionable insights
print("RECOMMENDED ACTIONS\n" + "="*80)

summary = report.metrics_summary

# Check compression
if summary.compression.value in ['poor', 'critical']:
    print("\n🔧 COMPRESSION ISSUES DETECTED")
    print("  → Adjust chunk_size (try 5-15)")
    print("  → Consider adding/removing hierarchy levels")
    
# Check connectivity
if summary.connectivity.value in ['poor', 'critical']:
    print("\n🔧 CONNECTIVITY ISSUES DETECTED")
    
    # Check orphan rate (.get() tolerates levels without an 'orphan_rate' entry)
    for level, stats in report.connectivity.reusability.items():
        if stats.get('orphan_rate', 0) > 0.2:  # >20%
            print(f"  → High orphan rate in {level}: Increase training data")
    
    # Check coverage
    for pair, rate in report.connectivity.coverage.items():
        if rate < 0.5:  # <50%
            print(f"  → Low coverage {pair}: Review pattern composition logic")

# Check information
if summary.information.value in ['poor', 'critical']:
    print("\n🔧 CONSTRAINT EFFECTIVENESS ISSUES")
    print("  → Upper levels may not be constraining lower levels")
    print("  → Consider different chunk_size per level")

# Check training dynamics
if report.training_dynamics:
    if report.training_dynamics.growth_exponent > 0.9:
        print("\n🔧 LINEAR GROWTH DETECTED")
        print("  → Increase chunk_size to encourage pattern reuse")
        print("  → Review training data quality")

# All good?
if summary.overall_health.value in ['excellent', 'good']:
    print("\n✅ HIERARCHY IS HEALTHY")
    print("  → Consider scaling up training data")
    print("  → Ready for production use or generation experiments")

print("\n" + "="*80)
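
The connectivity thresholds used above (orphan rate above 20%, coverage below 50%) can also be expressed as a small pure function over plain dictionaries, which makes them easy to unit-test or tune outside the notebook. This is a sketch under the assumption that `reusability` maps level names to stat dictionaries and `coverage` maps level pairs to rates, as the cell above suggests. The function name, parameter names, and sample data are hypothetical.

```python
def connectivity_actions(reusability, coverage, max_orphan_rate=0.2, min_coverage=0.5):
    """Return recommended actions for levels or pairs that cross the thresholds."""
    actions = []
    for level, stats in reusability.items():
        # A missing 'orphan_rate' entry is treated as 0 (no orphans reported)
        if stats.get("orphan_rate", 0.0) > max_orphan_rate:
            actions.append(f"High orphan rate in {level}: increase training data")
    for pair, rate in coverage.items():
        if rate < min_coverage:
            actions.append(f"Low coverage {pair}: review pattern composition logic")
    return actions


# Hypothetical sample data in the assumed shape
sample_reusability = {"node0": {"orphan_rate": 0.35}, "node1": {"orphan_rate": 0.05}}
sample_coverage = {"node0->node1": 0.8, "node1->node2": 0.3}

for action in connectivity_actions(sample_reusability, sample_coverage):
    print(f"  → {action}")
```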

## 📊 Compare Multiple Runs (Optional)

Load and compare metrics from different training runs.

In [None]:
# Example: Compare multiple runs
# Uncomment and adjust paths as needed

# run_paths = [
#     './metrics/hierarchy_graph_run1.db',
#     './metrics/hierarchy_graph_run2.db',
#     './metrics/hierarchy_graph_run3.db',
# ]

# reports = []
# for path in run_paths:
#     try:
#         r = MetricsReport.generate(graph_db_path=path, config=config, verbose=False)
#         reports.append(r)
#     except Exception as e:
#         print(f"Could not load {path}: {e}")

# if reports:
#     # Compare compression ratios
#     fig, ax = plt.subplots(figsize=(14, 6))
    
#     for i, r in enumerate(reports):
#         pairs = list(r.compression.compression_ratios.keys())
#         values = list(r.compression.compression_ratios.values())
        
#         x_pos = [j + i*0.25 for j in range(len(pairs))]
#         ax.bar(x_pos, values, width=0.2, label=f'Run {i+1}', alpha=0.7)
    
#     ax.set_xlabel('Level Pair', fontsize=12)
#     ax.set_ylabel('Compression Ratio', fontsize=12)
#     ax.set_title('Compression Ratio Comparison', fontsize=14, fontweight='bold')
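#     # Center each tick under its group of bars (equals j + 0.25 for three runs)
#     ax.set_xticks([j + 0.125 * (len(reports) - 1) for j in range(len(pairs))])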
#     ax.set_xticklabels(pairs, rotation=45, ha='right')
#     ax.legend()
#     ax.grid(axis='y', alpha=0.3)
    
#     plt.tight_layout()
#     plt.show()

print("Uncomment code above to compare multiple runs")
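
The commented example spaces the bars with a fixed offset of 0.25 per run, which suits three runs. For an arbitrary number of runs, the bar positions can be derived from the run count so that every group stays centered on its tick. The following is a minimal sketch of that arithmetic using only the standard library. `grouped_bar_positions` is a hypothetical helper, not part of the metrics package.

```python
def grouped_bar_positions(n_groups, n_runs, group_width=0.8):
    """Return (positions, bar_width) for a grouped bar chart.

    positions[run][group] is the x coordinate of that run's bar, with each
    group of bars centered on the integer tick 0, 1, ..., n_groups - 1.
    """
    bar_width = group_width / n_runs
    positions = []
    for run in range(n_runs):
        # Offset of this run's bar from the center of its group
        offset = (run - (n_runs - 1) / 2) * bar_width
        positions.append([group + offset for group in range(n_groups)])
    return positions, bar_width


positions, bar_width = grouped_bar_positions(n_groups=2, n_runs=3)
for run_index, xs in enumerate(positions):
    print(f"Run {run_index + 1}: x = {[round(x, 3) for x in xs]}")
```

With these positions, each run can be drawn with `ax.bar(positions[i], values, width=bar_width)` and the ticks placed at `range(n_groups)`.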

## 💾 Quick Export

In [None]:
# Export summary to JSON
report.export_json('./metrics/quick_summary.json')
print("✓ Exported to ./metrics/quick_summary.json")

# Print file size
import os
size_kb = os.path.getsize('./metrics/quick_summary.json') / 1024
print(f"  File size: {size_kb:.1f} KB")
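
Whether `export_json` creates a missing output directory is not stated here, so it may be safer to create `./metrics` before exporting. A minimal sketch of that precaution follows, using a temporary directory and a plain dictionary in place of the real report. `prepare_export_path` is a hypothetical helper.

```python
import json
import tempfile
from pathlib import Path


def prepare_export_path(path):
    """Create the parent directory of `path` if needed and return it as a Path."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    return target


# Demonstration in a throwaway directory; a plain dict stands in for the report
with tempfile.TemporaryDirectory() as tmp:
    out = prepare_export_path(Path(tmp) / "metrics" / "quick_summary.json")
    out.write_text(json.dumps({"overall_health": "good"}))
    print(f"  File size: {out.stat().st_size / 1024:.2f} KB")
```

In the notebook, the same helper could be called on `'./metrics/quick_summary.json'` just before `report.export_json(...)`.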

## Next Steps

### 🔍 Need More Detail?
Open **`hierarchy_metrics.ipynb`** for:
- All 15 metrics in detail
- Information-theoretic analysis
- Pattern diversity and coherence
- Complete interpretation guide

### 🎯 Optimize Configuration
Based on dashboard insights:
1. Adjust `chunk_size` if compression ratios are off target
2. Increase training data if the orphan rate is high
3. Review the architecture if constraint effectiveness is low

### 🚀 Scale Up
If health is good:
- Increase `max_samples` in training
- Try larger datasets (C4, RefinedWeb)
- Experiment with text generation

### 📊 Monitor Over Time
- Run this dashboard after each training session
- Track health trends
- Compare configurations systematically
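
One lightweight way to track health trends is to append a row per run to a CSV history file and reload it when comparing sessions. The sketch below assumes a fixed set of columns and takes plain values rather than a report object. The function names, column names, and file location are all hypothetical choices for illustration.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path

# Assumed columns; extend as needed, but keep them stable across runs
FIELDS = ["timestamp", "run", "overall_health", "avg_compression", "orphan_rate"]


def append_health_record(history_path, run, overall_health, avg_compression, orphan_rate):
    """Append one run's headline metrics to a CSV file, writing the header once."""
    path = Path(history_path)
    write_header = not path.exists()
    with path.open("a", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if write_header:
            writer.writeheader()
        writer.writerow({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "run": run,
            "overall_health": overall_health,
            "avg_compression": avg_compression,
            "orphan_rate": orphan_rate,
        })


def load_health_history(history_path):
    """Return all recorded runs as a list of dicts (values are strings)."""
    with Path(history_path).open(newline="") as handle:
        return list(csv.DictReader(handle))
```

After each dashboard run, a call such as `append_health_record('./metrics/health_history.csv', 'run1', summary.overall_health.value, avg_compression, node0_orphan_rate)` would record the values computed in the executive summary cell, provided those variables were set.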