# Complete Confusion Demos

This notebook demonstrates the basic functionality of the Complete Confusion library, which generates performance metrics and visualizations for classification models. It walks through three examples:

1. **Basic usage** - minimal example with predefined data
2. **Multi-class classification** - with synthetic data and Random Forest
3. **Binary classification** - with synthetic two-class data and a smaller Random Forest

Each example generates an HTML report containing:
- Interactive confusion matrix
- Performance metrics (precision, recall, F1-score)
- Per-class statistics
- Detailed visualizations

The generated HTML files can be opened in any web browser for interactive exploration of the results.
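As a small convenience, a report can also be opened from Python rather than by hand. The sketch below uses only the standard library. The helper names `report_path` and `open_report` are hypothetical, and the file name `confusion_matrix.html` is an assumption taken from the messages printed by the examples in this notebook.

```python
import webbrowser
from pathlib import Path


def report_path(output_dir: str) -> Path:
    """Return the absolute path of the report inside output_dir.

    Assumes the report is named confusion_matrix.html, as the printed
    messages in this notebook suggest.
    """
    return (Path(output_dir) / "confusion_matrix.html").resolve()


def open_report(output_dir: str) -> bool:
    """Open the report in the default browser; return False if it is missing."""
    path = report_path(output_dir)
    if not path.is_file():
        return False
    return webbrowser.open(path.as_uri())


# Usage, once a report has been generated:
# open_report("example_output")
```

Returning `False` for a missing file keeps the helper safe to call before any report exists.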

## Import Libraries

First, let's import the necessary libraries:

In [28]:
import complete_confusion as cc
import numpy as np
import os
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

## Basic Example

Let's start with a simple example using predefined data:

In [23]:
# Example data
predictions = [0, 1, 0, 2, 1, 2, 0, 1, 2, 0]
true_labels = [0, 1, 0, 2, 0, 2, 2, 1, 1, 0]
class_names = ["Class A", "Class B", "Class C"]

# Create output directory
os.makedirs("example_output", exist_ok=True)

# Generate performance metrics HTML report
cc.save_performance_metrics_to_html(
    predictions, 
    true_labels, 
    class_names,
    output_path="example_output"
)

print("Basic example report generated: example_output/confusion_matrix.html")

Basic example report generated: example_output/confusion_matrix.html
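To sanity-check the numbers shown in the report, the same data can be passed to scikit-learn's own metric functions. This is an independent cross-check, not a description of how Complete Confusion computes its metrics. Note that scikit-learn expects the true labels first, whereas the call above passes the predictions first.

```python
from sklearn.metrics import classification_report, confusion_matrix

# Same data as the basic example
predictions = [0, 1, 0, 2, 1, 2, 0, 1, 2, 0]
true_labels = [0, 1, 0, 2, 0, 2, 2, 1, 1, 0]
class_names = ["Class A", "Class B", "Class C"]

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(true_labels, predictions)
print(cm)

# Per-class precision, recall, and F1-score as plain text
report = classification_report(
    true_labels, predictions, target_names=class_names
)
print(report)
```

Seven of the ten samples fall on the diagonal of the matrix, so the overall accuracy for this toy data is 0.7.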


## Realistic Example with Synthetic Data

Now let's create a more realistic example using scikit-learn to generate synthetic data and train a model:

In [24]:
# Generate synthetic dataset
X, y = make_classification(
    n_samples=1000,
    n_features=20,
    n_informative=15,
    n_redundant=3,
    n_classes=4,
    random_state=42
)

# Split the data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

print(f"Training set size: {len(X_train)}")
print(f"Test set size: {len(X_test)}")
print(f"Number of features: {X.shape[1]}")
print(f"Number of classes: {len(np.unique(y))}")

Training set size: 700
Test set size: 300
Number of features: 20
Number of classes: 4
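The split above is purely random, so class proportions in the test set can drift slightly from those in the full dataset. An optional variation, sketched below, passes `stratify=y` to `train_test_split` and prints the per-class counts. The variable names `X_tr`, `X_te`, `y_tr`, and `y_te` are chosen here only to avoid overwriting the notebook's own split.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Same synthetic dataset as in the cell above
X, y = make_classification(
    n_samples=1000,
    n_features=20,
    n_informative=15,
    n_redundant=3,
    n_classes=4,
    random_state=42,
)

# stratify=y keeps class proportions roughly equal in both splits
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Count samples per class in each split
train_counts = np.bincount(y_tr)
test_counts = np.bincount(y_te)
print("Train counts per class:", train_counts)
print("Test counts per class: ", test_counts)
```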


In [25]:
# Train a Random Forest model
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Make predictions
predictions = model.predict(X_test)

print("Model trained successfully!")
print(f"Predictions shape: {predictions.shape}")

Model trained successfully!
Predictions shape: (300,)


In [26]:
# Define one class name per class in the dataset
class_names = [f"Category_{i}" for i in range(len(np.unique(y)))]

# Generate comprehensive performance report
cc.save_performance_metrics_to_html(
    predictions,
    y_test,
    class_names,
    output_path="random_forest_performance"
)

print("Random Forest performance report generated: random_forest_performance/confusion_matrix.html")

Random Forest performance report generated: random_forest_performance/confusion_matrix.html
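For a quick look at the same kind of per-class statistics without opening the HTML file, scikit-learn's `precision_recall_fscore_support` can be run on the Random Forest predictions. The sketch below rebuilds the dataset and model so that it runs on its own. It is a cross-check under the assumption that the report uses the standard definitions of these metrics.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split

# Recreate the dataset, split, and model from the cells above
X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=15,
    n_redundant=3, n_classes=4, random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
predictions = model.predict(X_test)

# One value per class, in label order 0..3
precision, recall, f1, support = precision_recall_fscore_support(
    y_test, predictions, zero_division=0
)
accuracy = accuracy_score(y_test, predictions)

print(f"Overall accuracy: {accuracy:.3f}")
for i in range(len(support)):
    print(
        f"Category_{i}: precision={precision[i]:.3f} "
        f"recall={recall[i]:.3f} f1={f1[i]:.3f} support={support[i]}"
    )
```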


## Binary Classification Example

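The same workflow applies to binary classification. The cell below generates a two-class dataset, trains a smaller Random Forest, and writes a report: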

In [27]:
# Generate binary classification data
X_binary, y_binary = make_classification(
    n_samples=500,
    n_features=10,
    n_informative=8,
    n_redundant=1,
    n_classes=2,
    n_clusters_per_class=1,
    random_state=42
)

# Split the data
X_train_bin, X_test_bin, y_train_bin, y_test_bin = train_test_split(
    X_binary, y_binary, test_size=0.3, random_state=42
)

# Train model
binary_model = RandomForestClassifier(n_estimators=50, random_state=42)
binary_model.fit(X_train_bin, y_train_bin)

# Make predictions
binary_predictions = binary_model.predict(X_test_bin)

# Generate report
binary_class_names = ["Negative", "Positive"]
cc.save_performance_metrics_to_html(
    binary_predictions,
    y_test_bin,
    binary_class_names,
    output_path="binary_classification_performance"
)

print("Binary classification report generated: binary_classification_performance/confusion_matrix.html")

Binary classification report generated: binary_classification_performance/confusion_matrix.html
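In the binary case the confusion matrix reduces to four counts: true negatives, false positives, false negatives, and true positives. The sketch below extracts them with scikit-learn, treating label 0 as "Negative" and label 1 as "Positive" to match the class names above. It rebuilds the data and model so that it runs on its own, and it is a cross-check rather than a description of the library's internals.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Recreate the binary dataset, split, and model from the cell above
X_binary, y_binary = make_classification(
    n_samples=500, n_features=10, n_informative=8, n_redundant=1,
    n_classes=2, n_clusters_per_class=1, random_state=42,
)
X_train_bin, X_test_bin, y_train_bin, y_test_bin = train_test_split(
    X_binary, y_binary, test_size=0.3, random_state=42
)
binary_model = RandomForestClassifier(n_estimators=50, random_state=42)
binary_model.fit(X_train_bin, y_train_bin)
binary_predictions = binary_model.predict(X_test_bin)

# With labels=[0, 1], ravel() yields tn, fp, fn, tp in that order
cm = confusion_matrix(y_test_bin, binary_predictions, labels=[0, 1])
tn, fp, fn, tp = cm.ravel()

# Guard against empty denominators
sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
specificity = tn / (tn + fp) if (tn + fp) else 0.0

print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
print(f"Sensitivity (recall of Positive): {sensitivity:.3f}")
print(f"Specificity (recall of Negative): {specificity:.3f}")
```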
