# Hypothesis Evaluation Charts Example

This notebook demonstrates how to use the hypothesis evaluation chart functionality to visualize evaluation results from the database.

In [None]:
# Import required libraries
import sys
import os
from IPython.display import Image, display
import base64
from typing import Dict, Any, List, Optional

# Add the src directory to the path if needed
if '..' not in sys.path:
    sys.path.append('..')

In [None]:
# Import the chart generation functionality
from HypothesisEvaluatorAgent.evaluation_charts import display_hypothesis_evaluation_chart, display_evaluation_statistics
from HypothesisEvaluatorAgent.database_tools import get_hypothesis_evaluations

## Helper Function to Display Charts in the Notebook

This function takes the base64-encoded image from the chart generation function and displays it in the notebook.

In [None]:
def display_chart(chart_result: Dict[str, Any]) -> None:
    """
    Display a chart in the notebook from the chart result dictionary.
    
    Args:
        chart_result: Dictionary containing chart data with base64-encoded image
    """
    if not chart_result["success"]:
        print(f"Error: {chart_result['message']}")
        return
    
    # Get the base64-encoded image
    img_data = chart_result["chart_data"]["image_base64"]
    
    # Display the image
    display(Image(data=base64.b64decode(img_data)))
    
    # Print chart metadata
    print(f"Chart Type: {chart_result['chart_type']}")
    print(f"Hypotheses Displayed: {chart_result['hypothesis_count']}")
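
For reference, the helper above expects a result dictionary with `success`, `chart_type`, `hypothesis_count`, and a nested `chart_data["image_base64"]` entry. The sketch below builds a stand-in result and decodes it without IPython, which may be useful when testing outside a notebook. The function name `decode_chart_image` is hypothetical, the keys are inferred from the helper rather than from the library's documentation, and the payload is placeholder bytes, not a real chart.

```python
import base64
from typing import Any, Dict, Optional

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_chart_image(chart_result: Dict[str, Any]) -> Optional[bytes]:
    """Return the decoded image bytes, or None if the result reports failure."""
    if not chart_result.get("success"):
        return None
    encoded = chart_result.get("chart_data", {}).get("image_base64")
    if not encoded:
        return None
    return base64.b64decode(encoded)


# Stand-in result (assumed shape): signature bytes plus filler, NOT a renderable image.
fake_result = {
    "success": True,
    "chart_type": "radar",
    "hypothesis_count": 3,
    "chart_data": {
        "image_base64": base64.b64encode(PNG_SIGNATURE + b"placeholder").decode("ascii"),
    },
}

image_bytes = decode_chart_image(fake_result)
failed_bytes = decode_chart_image({"success": False, "message": "no data"})
```

Returning `None` on failure, rather than raising, mirrors the early return in `display_chart`.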

## Retrieve All Hypothesis Evaluations

First, retrieve hypothesis evaluations from the database. The call below uses `limit=50`, so at most 50 evaluations are returned.

In [None]:
# Get up to 50 hypothesis evaluations
evaluation_results = get_hypothesis_evaluations(limit=50)

if evaluation_results["success"]:
    evaluations = evaluation_results["evaluations"]
    print(f"Retrieved {len(evaluations)} hypothesis evaluations")
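else:
    # Fall back to an empty list so later cells do not hit a NameError
    evaluations = []
    print(f"Error retrieving evaluations: {evaluation_results.get('error', 'unknown error')}")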

## Display Evaluation Statistics

Let's display the mean and standard deviation for each quality score across up to 50 evaluations, along with the score distribution and the highest and lowest scores.

In [None]:
# Get evaluation statistics
stats_result = display_evaluation_statistics(limit=50)

if stats_result["success"]:
    # Display statistics
    stats = stats_result["statistics"]
    print(f"Statistics for {stats['count']} hypothesis evaluations:\n")
    
    print("Mean Scores:")
    for criterion, score in stats["average_scores"].items():
        print(f"  {criterion.capitalize()}: {score}")
    
    print("\nStandard Deviations:")
    for criterion, std in stats["standard_deviations"].items():
        print(f"  {criterion.capitalize()}: {std}")
    
    print("\nScore Distribution:")
    for range_name, count in stats["score_distribution"].items():
        print(f"  {range_name}: {count} hypotheses")
    
    print(f"\nHighest Score: {stats['highest_score']}")
    print(f"Lowest Score: {stats['lowest_score']}")
else:
    print(f"Error retrieving statistics: {stats_result['message']}")
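
To make the reported numbers concrete, here is a minimal sketch of how per-criterion means and standard deviations could be computed from a list of evaluation records. It is not the library's implementation: the criterion names, the record layout, and the choice of the sample standard deviation (`statistics.stdev`) are all assumptions made for illustration. Only the output keys `count`, `average_scores`, and `standard_deviations` mirror the cell above.

```python
from statistics import mean, stdev
from typing import Any, Dict, List

# Hypothetical criterion names; the real set is whatever the evaluator stores.
CRITERIA = ["testability", "specificity", "relevance"]


def summarize(evaluations: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Compute the per-criterion mean and sample standard deviation."""
    summary: Dict[str, Any] = {
        "count": len(evaluations),
        "average_scores": {},
        "standard_deviations": {},
    }
    for criterion in CRITERIA:
        scores = [e[criterion] for e in evaluations if criterion in e]
        if not scores:
            continue
        summary["average_scores"][criterion] = round(mean(scores), 2)
        # stdev needs at least two data points; report 0.0 otherwise.
        summary["standard_deviations"][criterion] = (
            round(stdev(scores), 2) if len(scores) > 1 else 0.0
        )
    return summary


# Made-up records for illustration only.
sample = [
    {"testability": 3, "specificity": 4, "relevance": 5},
    {"testability": 4, "specificity": 4, "relevance": 3},
    {"testability": 5, "specificity": 4, "relevance": 4},
]
summary = summarize(sample)
```

A criterion on which every hypothesis scores the same (here `specificity`) has a standard deviation of zero, so a low value in the output indicates agreement across hypotheses rather than a high or low score.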

## Generate Statistics Chart

Now let's generate a chart showing the mean and standard deviation for each quality score.

In [None]:
# Create output directory if it doesn't exist
os.makedirs("charts", exist_ok=True)

# Generate statistics chart
stats_chart = display_evaluation_statistics(
    output_path="charts/hypothesis_statistics.png"
)

# Display the chart if available
if stats_chart["success"] and "chart_data" in stats_chart:
    # Get the base64-encoded image
    img_data = stats_chart["chart_data"]["image_base64"]
    
    # Display the image
    display(Image(data=base64.b64decode(img_data)))
    
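    # Confirm the file was saved
    file_path = stats_chart["chart_data"].get("file_path")
    if file_path:
        print(f"Chart saved to: {file_path}")
elif not stats_chart["success"]:
    print(f"Error generating statistics chart: {stats_chart.get('message', 'unknown error')}")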

## Generate and Display Different Chart Types

Now let's generate and display different types of charts for the hypothesis evaluations.

### 1. Radar Chart

Radar charts plot all five criterion scores for each hypothesis on a radar (spider) plot. This is useful for visualizing the strengths and weaknesses of individual hypotheses.

In [None]:
# Generate radar chart
radar_chart = display_hypothesis_evaluation_chart(chart_type="radar", limit=6)

# Display the chart
display_chart(radar_chart)

### 2. Bar Chart

Bar charts compare scores across hypotheses for each criterion. This is useful for comparing how different hypotheses perform on specific criteria.

In [None]:
# Generate bar chart
bar_chart = display_hypothesis_evaluation_chart(chart_type="bar", limit=8)

# Display the chart
display_chart(bar_chart)

### 3. Heatmap

Heatmaps show scores for all hypotheses and criteria in a color-coded grid. This is useful for identifying patterns across multiple evaluations.

In [None]:
# Generate heatmap
heatmap = display_hypothesis_evaluation_chart(chart_type="heatmap", limit=15)

# Display the chart
display_chart(heatmap)

### 4. Comparison Chart

Comparison charts show the overall score of each hypothesis in ranked order. This is useful for quickly identifying the best hypotheses.

In [None]:
# Generate comparison chart
comparison_chart = display_hypothesis_evaluation_chart(chart_type="comparison", limit=20)

# Display the chart
display_chart(comparison_chart)
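
The ranking behind a comparison chart can be sketched in a few lines. This is an illustration under assumptions, not the library's code: the field names `hypothesis_id` and `overall_score` and the function `rank_by_overall_score` are hypothetical.

```python
from typing import Any, Dict, List


def rank_by_overall_score(
    evaluations: List[Dict[str, Any]], limit: int = 20
) -> List[Dict[str, Any]]:
    """Sort evaluations from highest to lowest overall score and keep the top `limit`."""
    ranked = sorted(evaluations, key=lambda e: e["overall_score"], reverse=True)
    return ranked[:limit]


# Made-up records for illustration only.
sample = [
    {"hypothesis_id": 1, "overall_score": 3.2},
    {"hypothesis_id": 2, "overall_score": 4.6},
    {"hypothesis_id": 3, "overall_score": 4.1},
]
top_two_ids = [e["hypothesis_id"] for e in rank_by_overall_score(sample, limit=2)]
```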

## Filtering Hypotheses

You can restrict which hypotheses appear in a chart, either by listing specific hypothesis IDs or by setting a minimum overall score.

### Filter by Specific Hypothesis IDs

In [None]:
# Generate chart for specific hypotheses
specific_chart = display_hypothesis_evaluation_chart(
    hypothesis_ids=[1, 2, 3],  # Replace with actual hypothesis IDs
    chart_type="radar"
)

# Display the chart
display_chart(specific_chart)

### Filter by Score Range

In [None]:
# Generate chart for hypotheses with high scores
high_score_chart = display_hypothesis_evaluation_chart(
    min_overall_score=4.0,  # Only include hypotheses with scores >= 4.0
    chart_type="comparison"
)

# Display the chart
display_chart(high_score_chart)
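
The two filters can be pictured as simple predicates applied to each evaluation record. The sketch below is a client-side illustration only: the library presumably filters in the database query, and the field names and the function `filter_evaluations` are assumptions. The inclusive `>=` comparison follows the comment in the cell above.

```python
from typing import Any, Dict, Iterable, List, Optional


def filter_evaluations(
    evaluations: List[Dict[str, Any]],
    hypothesis_ids: Optional[Iterable[int]] = None,
    min_overall_score: Optional[float] = None,
) -> List[Dict[str, Any]]:
    """Keep records that match every filter that was supplied."""
    wanted_ids = set(hypothesis_ids) if hypothesis_ids is not None else None
    selected = []
    for evaluation in evaluations:
        if wanted_ids is not None and evaluation["hypothesis_id"] not in wanted_ids:
            continue
        # Inclusive threshold: a score equal to the minimum is kept.
        if min_overall_score is not None and evaluation["overall_score"] < min_overall_score:
            continue
        selected.append(evaluation)
    return selected


# Made-up records for illustration only.
sample = [
    {"hypothesis_id": 1, "overall_score": 3.2},
    {"hypothesis_id": 2, "overall_score": 4.6},
    {"hypothesis_id": 3, "overall_score": 4.0},
]
high_ids = [e["hypothesis_id"] for e in filter_evaluations(sample, min_overall_score=4.0)]
both_ids = [
    e["hypothesis_id"]
    for e in filter_evaluations(sample, hypothesis_ids=[1, 2], min_overall_score=4.0)
]
```

In this sketch, supplying both filters combines them with a logical AND. Whether the real functions accept both parameters together is not shown in this notebook.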

### Statistics for Filtered Hypotheses

You can also get statistics for a filtered set of hypotheses.

In [None]:
# Get statistics for high-scoring hypotheses
high_score_stats = display_evaluation_statistics(
    min_overall_score=4.0,
    output_path="charts/high_score_statistics.png"
)

# Display the chart if available
if high_score_stats["success"] and "chart_data" in high_score_stats:
    # Get the base64-encoded image
    img_data = high_score_stats["chart_data"]["image_base64"]
    
    # Display the image
    display(Image(data=base64.b64decode(img_data)))
    
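    # Print how many hypotheses the statistics cover
    # (uses the same "statistics" -> "count" keys as the earlier statistics cell)
    print(f"Statistics for {high_score_stats['statistics']['count']} high-scoring hypotheses")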

## Saving Charts to Files

You can save charts to files by providing an output path.

In [None]:
# Create output directory if it doesn't exist
os.makedirs("charts", exist_ok=True)

# Generate and save chart
saved_chart = display_hypothesis_evaluation_chart(
    chart_type="heatmap",
    output_path="charts/hypothesis_heatmap.png"
)

# Display the chart
display_chart(saved_chart)

# Confirm the file was saved (skip the check if chart generation failed)
if saved_chart["success"] and saved_chart["chart_data"].get("file_path"):
    print(f"Chart saved to: {saved_chart['chart_data']['file_path']}")
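
If you already hold a base64-encoded image and want to write it to disk yourself, something like the following would work. It is a sketch that is independent of the chart functions: `save_base64_image` is a hypothetical helper, and the payload is placeholder bytes rather than a real PNG.

```python
import base64
import tempfile
from pathlib import Path

RAW = b"placeholder image bytes"  # stands in for real PNG data


def save_base64_image(image_base64: str, output_path: str) -> Path:
    """Decode a base64 string and write the bytes to output_path."""
    path = Path(output_path)
    # Create the parent directory (for example "charts/") if it is missing.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(base64.b64decode(image_base64))
    return path


payload = base64.b64encode(RAW).decode("ascii")
with tempfile.TemporaryDirectory() as tmp_dir:
    saved_path = save_base64_image(payload, str(Path(tmp_dir) / "charts" / "example.png"))
    saved_exists = saved_path.exists()
    saved_size = saved_path.stat().st_size
```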

## Conclusion

This notebook showed how to retrieve hypothesis evaluations, compute summary statistics, generate radar, bar, heatmap, and comparison charts, filter hypotheses by ID or score, and save charts to files. You can use these charts and statistics to assess the quality of your chaos engineering hypotheses and identify areas for improvement.