# 7. Concept Evaluation

This notebook demonstrates how to evaluate and rank the discovered mathematical concepts based on their complexity and explanatory power.

## 7.1 Importing Required Modules

In [None]:
import sys
import os

# Add the src directory to the Python path
sys.path.append(os.path.abspath(os.path.join(os.getcwd(), '..', 'src')))

from probabilistic_model import mathematical_concept_model
from utils import evaluate_concept, concept_complexity, rank_concepts
import numpy as np  # used below for mean/median/percentile statistics
import torch
import matplotlib.pyplot as plt
import seaborn as sns

print("Imports complete!")

## 7.2 Generating and Evaluating Concepts

In [None]:
# Generate concepts
input_data = torch.randn(100)
concepts, observations = mathematical_concept_model(input_data)

# Flatten the list of concepts
flat_concepts = [c for level in concepts for c in level]

# Evaluate concepts
evaluated_concepts = [(c, evaluate_concept(c, input_data)) for c in flat_concepts]

print(f"Generated and evaluated {len(evaluated_concepts)} concepts.")

## 7.3 Ranking Concepts

In [None]:
# Rank concepts
ranked_concepts = rank_concepts(flat_concepts, input_data)

print("Top 5 Concepts:")
for i, (concept, score) in enumerate(ranked_concepts[:5], 1):
    print(f"{i}. Concept: {concept}, Score: {score:.4f}")
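
The internals of `rank_concepts` are not shown in this notebook. As a rough sketch of the ranking step, assuming that each concept gets a single numeric score and that lower scores are better (as the plot label in section 7.4 suggests), ranking could be a plain ascending sort. The helper `rank_by_score` and the `toy_errors` table are hypothetical illustrations, not part of `utils`.

```python
from typing import Callable, Iterable, List, Tuple


def rank_by_score(
    concepts: Iterable[str], score_fn: Callable[[str], float]
) -> List[Tuple[str, float]]:
    """Return (concept, score) pairs sorted ascending, best (lowest) score first."""
    scored = [(c, score_fn(c)) for c in concepts]
    return sorted(scored, key=lambda pair: pair[1])


# Made-up scores standing in for evaluate_concept output (assumption).
toy_errors = {"x + y": 0.42, "x * y": 0.10, "sin(x)": 0.77}

# Iterating a dict yields its keys; dict.get looks up each key's score.
toy_ranking = rank_by_score(toy_errors, toy_errors.get)
for position, (concept, score) in enumerate(toy_ranking, 1):
    print(f"{position}. Concept: {concept}, Score: {score:.4f}")
```

If the real scoring were "higher is better", the sort would need `reverse=True`, so it is worth checking the convention in `utils` before reading the top-5 list.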

## 7.4 Visualizing Concept Complexity vs. Explanatory Power

In [None]:
def plot_complexity_vs_explanatory_power(concepts, scores):
    complexities = [concept_complexity(c) for c in concepts]
    
    plt.figure(figsize=(10, 6))
    sns.scatterplot(x=complexities, y=scores)
    plt.xlabel('Concept Complexity')
    plt.ylabel('Explanatory Power (lower is better)')
    plt.title('Concept Complexity vs. Explanatory Power')
    
    # Annotate the first five points (the top-ranked concepts when inputs are in ranked order)
    for i in range(min(5, len(concepts))):
        plt.annotate(f'C{i}', (complexities[i], scores[i]), xytext=(5, 5), 
                     textcoords='offset points')
    
    plt.tight_layout()
    plt.show()

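# Unpack the ranked (concept, score) pairs into parallel tuples.
# Note: this rebinds `concepts` from the model's nested output to the flat, ranked concepts.
concepts, scores = zip(*ranked_concepts)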
plot_complexity_vs_explanatory_power(concepts, scores)

## 7.5 Analyzing Concept Distribution

In [None]:
def analyze_concept_distribution(scores):
    plt.figure(figsize=(10, 6))
    sns.histplot(scores, kde=True)
    plt.xlabel('Explanatory Power Score')
    plt.ylabel('Frequency')
    plt.title('Distribution of Concept Scores')
    plt.tight_layout()
    plt.show()

    print(f"Mean score: {np.mean(scores):.4f}")
    print(f"Median score: {np.median(scores):.4f}")
    print(f"Standard deviation: {np.std(scores):.4f}")

analyze_concept_distribution(scores)

## 7.6 Identifying Promising Concepts

In [None]:
def identify_promising_concepts(concepts, scores, threshold_percentile=10):
    threshold = np.percentile(scores, threshold_percentile)
    promising_concepts = [(c, s) for c, s in zip(concepts, scores) if s <= threshold]
    
    print(f"Promising concepts (top {threshold_percentile}%):")
    for i, (concept, score) in enumerate(promising_concepts, 1):
        print(f"{i}. Concept: {concept}, Score: {score:.4f}")
    
    return promising_concepts

promising_concepts = identify_promising_concepts(concepts, scores)
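
To make the percentile cut-off concrete, here is a small standalone sketch on made-up scores. It uses the standard library's `statistics.quantiles` with `method="inclusive"`, which is intended to mirror the linear interpolation that `np.percentile` applies by default. The function name `select_promising` and the toy scores are assumptions for illustration only, and the percentile is restricted to whole numbers from 1 to 99.

```python
from statistics import quantiles


def select_promising(scores, threshold_percentile=10):
    """Keep scores at or below the given percentile (lower is better)."""
    # quantiles(..., n=100) returns the 1st..99th percentile cut points;
    # method="inclusive" interpolates linearly between the sorted data points.
    cut_points = quantiles(scores, n=100, method="inclusive")
    threshold = cut_points[threshold_percentile - 1]
    kept = [s for s in scores if s <= threshold]
    return threshold, kept


# Ten made-up scores (assumption), evenly spaced from 0.1 to 1.0.
toy_scores = [0.9, 0.1, 0.5, 0.3, 0.7, 0.2, 0.8, 0.4, 0.6, 1.0]
threshold, kept = select_promising(toy_scores)
print(f"threshold={threshold:.2f}, kept={kept}")
```

With only ten scores, the 10th-percentile threshold falls between the two lowest values, so a single concept survives. On small concept sets it may be worth raising `threshold_percentile` to get a usable shortlist.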

## 7.7 Concept Improvement Suggestions

In [None]:
def suggest_improvements(concept, score, all_scores):
    # Compare one concept against the full score distribution (lower scores are better)
    complexity = concept_complexity(concept)
    median_score = np.median(all_scores)
    
    if complexity > 10 and score > median_score:
        return "Consider simplifying this concept to improve its explanatory power."
    elif complexity < 5 and score > median_score:
        return "This concept might be too simple. Consider combining it with other concepts."
    elif score <= np.percentile(all_scores, 10):
        return "This concept shows promise. Consider refining it further or exploring similar concepts."
    else:
        return "This concept performs adequately. No specific improvements suggested."

print("Improvement suggestions for top 5 concepts:")
for concept, score in ranked_concepts[:5]:
    print(f"Concept: {concept}")
    # Pass the score distribution explicitly instead of relying on a global variable
    print(f"Suggestion: {suggest_improvements(concept, score, scores)}\n")

This notebook demonstrates how to evaluate and rank the mathematical concepts discovered by our system. It shows how to:

1. Generate and evaluate concepts based on their explanatory power
2. Rank concepts to identify the most promising ones
3. Visualize the trade-off between concept complexity and explanatory power
4. Analyze the distribution of concept scores
5. Identify particularly promising concepts
6. Suggest potential improvements for concepts

This evaluation process is crucial for guiding further exploration and refinement in the mathematical invention system. By quantifying the quality of discovered concepts, we can focus our efforts on the most promising areas and iteratively improve our results.

The trade-off between complexity and explanatory power is a key consideration in mathematical discovery. Simple concepts that explain a lot of data are particularly valuable, as they often represent fundamental principles. However, more complex concepts might be necessary to capture subtle mathematical relationships.
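
One way to make this trade-off explicit is to look for concepts that are not beaten on both axes at once, that is, a Pareto front over (complexity, score). The sketch below is an illustration under stated assumptions rather than part of the system: it assumes lower is better for both complexity and score, and the function `pareto_front` and the toy points are hypothetical.

```python
from typing import List, Tuple


def pareto_front(points: List[Tuple[str, float, float]]) -> List[str]:
    """Return names of points not dominated on (complexity, score).

    A point is dominated if another point is at least as good on both
    axes and strictly better on at least one (lower is better on both).
    """
    front = []
    for name, complexity, score in points:
        dominated = any(
            (other_c <= complexity and other_s <= score)
            and (other_c < complexity or other_s < score)
            for _, other_c, other_s in points
        )
        if not dominated:
            front.append(name)
    return front


# Made-up (name, complexity, score) triples for illustration (assumption).
toy_points = [
    ("A", 2, 0.50),   # simple, moderate score
    ("B", 5, 0.20),   # mid complexity, good score
    ("C", 6, 0.25),   # worse than B on both axes
    ("D", 12, 0.05),  # complex, best score
    ("E", 3, 0.60),   # worse than A on both axes
]
front = pareto_front(toy_points)
print(front)
```

Here `C` and `E` drop out because `B` and `A` respectively are both simpler and better scoring, while `A`, `B`, and `D` each represent a different balance between simplicity and fit.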

By using this evaluation framework, we can systematically assess the quality of our discovered concepts and guide the system towards more meaningful mathematical inventions.