# Model Evaluation & Reporting

Learn how to comprehensively evaluate your models using MKYZ's evaluation tools.

**Topics covered:**
- Classification metrics
- Cross-validation strategies
- Model reports
- Exporting results
- Checking target balance

In [1]:
import mkyz

mkyz package initialized. Version: 0.2.1


## 1. Prepare Data and Train Model

In [2]:
# Load and prepare the Titanic dataset
data = mkyz.prepare_data(
    'data/titanic.csv',
    target_column='Survived',
    test_size=0.2,
    random_state=42
)

X_train, X_test, y_train, y_test, df, target, num_cols, cat_cols = data

print(f"Training set: {X_train.shape[0]} samples")
print(f"Test set: {X_test.shape[0]} samples")

INFO:mkyz.data_processing:First 5 rows of the dataset:
INFO:mkyz.data_processing:   PassengerId  Survived  Pclass  \
0            1         0       3   
1            2         1       1   
2            3         1       3   
3            4         1       1   
4            5         0       3   

                                                Name     Sex   Age  SibSp  \
0                            Braund, Mr. Owen Harris    male  22.0      1   
1  Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0      1   
2                             Heikkinen, Miss. Laina  female  26.0      0   
3       Futrelle, Mrs. Jacques Heath (Lily May Peel)  female  35.0      1   
4                           Allen, Mr. William Henry    male  35.0      0   

   Parch            Ticket     Fare Cabin Embarked  
0      0         A/5 21171   7.2500   NaN        S  
1      0          PC 17599  71.2833   C85        C  
2      0  STON/O2. 3101282   7.9250   NaN        S  
3      0            113803 

Training set: 576 samples
Test set: 145 samples


In [3]:
# Train a Random Forest classifier
model = mkyz.train(
    data,
    task='classification',
    model='rf',
    n_estimators=100,
    random_state=42
)

print("Model trained successfully!")

Model trained successfully!


## 2. Classification Metrics

Calculate comprehensive metrics for classification problems.

In [4]:
# Make predictions
predictions = model.predict(X_test)

# Calculate metrics
metrics = mkyz.classification_metrics(y_test, predictions)

print("Classification Metrics:")
print("=" * 40)
for metric, value in metrics.items():
    print(f"  {metric}: {value:.4f}")

Classification Metrics:
  accuracy: 0.8000
  precision: 0.7980
  recall: 0.8000
  f1_score: 0.7915
  mcc: 0.5405
  cohen_kappa: 0.5283
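To make the less familiar metrics concrete, here is a minimal standard-library sketch that computes accuracy, MCC, and Cohen's kappa from binary confusion counts. The function name `binary_metrics` and the toy labels are illustrative assumptions; this is not MKYZ's implementation, only the textbook definitions.

```python
from math import sqrt


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, MCC and Cohen's kappa for a binary problem (illustrative helper)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    n = len(pairs)

    accuracy = (tp + tn) / n

    # MCC: correlation between true and predicted labels, in [-1, 1].
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    # Cohen's kappa: agreement beyond what the label frequencies alone would give.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - expected) / (1 - expected) if expected != 1 else 0.0

    return {"accuracy": accuracy, "mcc": mcc, "cohen_kappa": kappa}


# Toy labels, made up for illustration (not the Titanic data).
toy_true = [1, 1, 1, 0, 0, 0, 0, 1]
toy_pred = [1, 1, 0, 0, 0, 0, 1, 1]
toy_metrics = binary_metrics(toy_true, toy_pred)
print(toy_metrics)
```

In the output above, recall equals accuracy (0.8000). Support-weighted recall always equals accuracy, so this is consistent with weighted averaging across classes, although the notebook does not state which averaging MKYZ applies.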


In [5]:
# Get probabilities for ROC-AUC calculation
y_proba = model.predict_proba(X_test)[:, 1]  # Probability of positive class

# Calculate metrics with probabilities
metrics_with_proba = mkyz.classification_metrics(
    y_test, 
    predictions, 
    y_proba=y_proba
)

print("\nMetrics with ROC-AUC:")
print("=" * 40)
for metric, value in metrics_with_proba.items():
    print(f"  {metric}: {value:.4f}")


Metrics with ROC-AUC:
  accuracy: 0.8000
  precision: 0.7980
  recall: 0.8000
  f1_score: 0.7915
  mcc: 0.5405
  cohen_kappa: 0.5283
  roc_auc: 0.8444
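ROC-AUC needs scores rather than hard labels because it measures ranking quality: it is the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as half. The sketch below computes it by direct pairwise comparison. The function name `pairwise_roc_auc` and the toy scores are assumptions for illustration; the quadratic loop is only suitable for small inputs.

```python
def pairwise_roc_auc(y_true, scores, positive=1):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one sample from each class")

    # A pair counts 1 if the positive scores higher, 0.5 on a tie, 0 otherwise.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example: three of the four positive/negative pairs are ordered correctly.
toy_auc = pairwise_roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(f"{toy_auc:.2f}")
```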


## 3. Cross-Validation

Evaluate your model with different cross-validation strategies.

In [6]:
# Basic cross-validation with Stratified K-Fold
cv_results = mkyz.cross_validate(
    model, 
    X_train, 
    y_train,
    cv=mkyz.CVStrategy.STRATIFIED,
    n_splits=5
)

print("Stratified 5-Fold Cross-Validation:")
print("=" * 40)
print(f"  Mean Score: {cv_results['mean_test_score']:.4f}")
print(f"  Std Dev: {cv_results['std_test_score']:.4f}")
print(f"  Fold Scores: {[f'{s:.4f}' for s in cv_results['test_score']]}")

Stratified 5-Fold Cross-Validation:
  Mean Score: 0.8264
  Std Dev: 0.0175
  Fold Scores: ['0.8362', '0.8348', '0.8348', '0.7913', '0.8348']
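The summary values can be checked against the fold scores by hand. The short sketch below recomputes them from the rounded fold scores printed above, using only the standard library. It assumes the reported spread is a population standard deviation; the notebook does not say so, but that choice matches the printed 0.0175, whereas the sample standard deviation would be about 0.0196.

```python
from statistics import mean, pstdev

# Fold scores copied from the output above (already rounded to four decimals).
fold_scores = [0.8362, 0.8348, 0.8348, 0.7913, 0.8348]

mean_score = mean(fold_scores)
std_score = pstdev(fold_scores)  # population standard deviation (divides by n)

print(f"Mean Score: {mean_score:.4f}")
print(f"Std Dev: {std_score:.4f}")
```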


In [7]:
# Cross-validation with training scores
cv_results_full = mkyz.cross_validate(
    model, 
    X_train, 
    y_train,
    cv=mkyz.CVStrategy.STRATIFIED,
    n_splits=5,
    return_train_score=True
)

print("\nWith Training Scores (to detect overfitting):")
print("=" * 40)
print(f"  Train Score: {cv_results_full['mean_train_score']:.4f}")
print(f"  Test Score: {cv_results_full['mean_test_score']:.4f}")

# Check for overfitting
gap = cv_results_full['mean_train_score'] - cv_results_full['mean_test_score']
if gap > 0.1:
    print(f"  ⚠️ Warning: Large gap ({gap:.4f}) suggests overfitting!")
else:
    print(f"  ✅ Good: Gap ({gap:.4f}) is acceptable.")


With Training Scores (to detect overfitting):
  Train Score: 1.0000
  Test Score: 0.8264
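The recorded output stops after the two scores, so it is worth working out the gap check by hand. The sketch below applies the same rule as the cell above to the printed scores. The helper name `overfitting_gap` is hypothetical, and the 0.1 threshold is simply the value used in this notebook, not a universal rule.

```python
def overfitting_gap(train_score, test_score, threshold=0.1):
    """Return the train/test gap and whether it exceeds the threshold."""
    gap = train_score - test_score
    return gap, gap > threshold


# Scores copied from the output above.
gap, flagged = overfitting_gap(1.0000, 0.8264)
print(f"gap={gap:.4f}, flagged={flagged}")
```

With these scores the gap is 0.1736, so the warning branch applies. A perfect training score is common for a Random Forest with fully grown trees, so the gap is best read alongside the cross-validated test score rather than on its own.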


## 4. CV Strategy Comparison

Compare different cross-validation strategies.

In [8]:
# Compare different CV strategies
strategies = [
    (mkyz.CVStrategy.STRATIFIED, "Stratified K-Fold"),
    (mkyz.CVStrategy.KFOLD, "K-Fold"),
]

print("CV Strategy Comparison:")
print("=" * 50)

for strategy, name in strategies:
    results = mkyz.cross_validate(
        model, X_train, y_train,
        cv=strategy,
        n_splits=5
    )
    print(f"  {name}: {results['mean_test_score']:.4f} ± {results['std_test_score']:.4f}")

CV Strategy Comparison:
  Stratified K-Fold: 0.8264 ± 0.0175
  K-Fold: 0.8160 ± 0.0187
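The difference between the two strategies is how samples are assigned to folds: plain K-Fold splits by position, while stratified K-Fold keeps each fold's class proportions close to those of the full dataset. The following sketch shows the stratified idea with a simple round-robin assignment per class. The function name `stratified_folds` is hypothetical, there is no shuffling, and library implementations differ in detail.

```python
from collections import defaultdict


def stratified_folds(labels, n_splits):
    """Assign sample indices to folds so each fold mirrors the class mix."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(n_splits)]
    # Deal each class's indices out round-robin, like dealing cards.
    for indices in by_class.values():
        for position, idx in enumerate(indices):
            folds[position % n_splits].append(idx)
    return folds


# Toy target with a 2:1 class ratio, made up for illustration.
toy_labels = [0] * 10 + [1] * 5
toy_folds = stratified_folds(toy_labels, n_splits=5)
for fold in toy_folds:
    positives = sum(toy_labels[i] for i in fold)
    print(f"fold size={len(fold)}, positives={positives}")
```

Every fold in the toy example ends up with two samples of class 0 and one of class 1, matching the overall 2:1 ratio.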


## 5. Model Report

Generate comprehensive model reports.

In [9]:
# Create a model report
report = mkyz.ModelReport(
    model=model,
    X_test=X_test,
    y_test=y_test,
    task='classification',
    model_name='Random Forest - Titanic Survival'
)

# Generate the report
report.generate()

# Print summary
print(report.summary())

Model Report: Random Forest - Titanic Survival
Task: Classification
Samples: 145
Features: 1420

Metrics:
----------------------------------------
  accuracy: 0.8000
  precision: 0.7980
  recall: 0.8000
  f1_score: 0.7915
  mcc: 0.5405
  cohen_kappa: 0.5283
  roc_auc: 0.8444

Top 10 Features:
----------------------------------------
  1. feature_724: 0.0962
  2. feature_725: 0.0728
  3. feature_2: 0.0649
  4. feature_1: 0.0467
  5. feature_0: 0.0462
  6. feature_1347: 0.0232
  7. feature_1406: 0.0231
  8. feature_1405: 0.0181
  9. feature_1404: 0.0096
  10. feature_1413: 0.0089
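The report lists features by index (`feature_724` and so on), which is hard to interpret. If the column names of the encoded feature matrix are available, a ranking like the one above can be labeled with them. The sketch below is a hypothetical helper, not part of MKYZ; the importances and names in the example are made up.

```python
def top_features(importances, names=None, k=10):
    """Rank features by importance, falling back to index-based names."""
    if names is None:
        names = [f"feature_{i}" for i in range(len(importances))]
    if len(names) != len(importances):
        raise ValueError("names and importances must have the same length")

    ranked = sorted(zip(names, importances), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Made-up importances and column names, for illustration only.
toy_importances = [0.05, 0.40, 0.15, 0.30, 0.10]
toy_names = ["Pclass", "Sex_male", "Age", "Fare", "SibSp"]
toy_ranking = top_features(toy_importances, toy_names, k=3)
for rank, (name, value) in enumerate(toy_ranking, start=1):
    print(f"  {rank}. {name}: {value:.4f}")
```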


In [10]:
# Access report metrics programmatically
print("\nAccessing Report Metrics:")
print("=" * 40)

report_metrics = report.metrics
for key, value in report_metrics.items():
    if isinstance(value, float):
        print(f"  {key}: {value:.4f}")


Accessing Report Metrics:
  accuracy: 0.8000
  precision: 0.7980
  recall: 0.8000
  f1_score: 0.7915
  mcc: 0.5405
  cohen_kappa: 0.5283
  roc_auc: 0.8444


In [11]:
# Export report to HTML
import os

# exist_ok=True creates the directory only if it is missing
os.makedirs('reports', exist_ok=True)

report_path = report.export_html('reports/classification_report.html')
print(f"Report exported to: {report_path}")

Report exported to: c:\Users\mmust\Desktop\mkyz\examples\reports\classification_report.html


## 6. Check Target Balance

In [12]:
# Check if the target variable is balanced
balance_info = mkyz.check_target_balance(y_train)

print("Target Balance Analysis:")
print("=" * 40)
print(f"  Number of Classes: {balance_info['n_classes']}")
print(f"  Is Imbalanced: {balance_info['is_imbalanced']}")
print(f"\n  Class Distribution:")
# Each value is a dict holding the class count and its percentage
for cls, stats in balance_info['class_distribution'].items():
    print(f"    Class {cls}: {stats}")

if balance_info.get('recommendation'):
    print(f"\n  💡 Recommendation: {balance_info['recommendation']}")

Target Balance Analysis:
  Number of Classes: 2
  Is Imbalanced: False

  Class Distribution:
    Class 0: {'count': 383, 'percentage': 66.49305555555556}
    Class 1: {'count': 193, 'percentage': 33.50694444444444}
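The class distribution above can be reproduced with the standard library, which is handy for a quick check outside MKYZ. The sketch below mirrors the `count`/`percentage` layout of the output. The function name `class_distribution` is hypothetical, and the rule MKYZ uses to set `is_imbalanced` is not shown in this notebook, so the sketch does not attempt to replicate it.

```python
from collections import Counter


def class_distribution(labels):
    """Count and percentage per class, in the same layout as the output above."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {
        cls: {"count": n, "percentage": 100 * n / total}
        for cls, n in sorted(counts.items())
    }


# Labels rebuilt from the counts printed above (383 of class 0, 193 of class 1).
toy_target = [0] * 383 + [1] * 193
distribution = class_distribution(toy_target)
for cls, stats in distribution.items():
    print(f"    Class {cls}: {stats['count']} ({stats['percentage']:.1f}%)")
```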


## Summary

In this notebook, we learned:

1. **Classification Metrics** - Accuracy, precision, recall, F1, MCC, Cohen's kappa, ROC-AUC
2. **Cross-Validation** - Different CV strategies for reliable evaluation
3. **Model Reports** - Comprehensive reports with export options
4. **Target Balance** - Checking for class imbalance

### Key Takeaways

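- Use cross-validation rather than a single split for reliable estimates; the spread across folds (here ± 0.0175) shows how stable the score is
- Compare training and test scores to detect overfitting; here a training score of 1.0000 against a test score of 0.8264 gives a gap of 0.1736, above the 0.1 threshold used in this notebook
- Use stratified CV for classification problems so each fold keeps the class proportions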
- Export reports for documentation and sharing