# Chapter 7: Working with Tabular Data - Complete Code Examples

This notebook contains all the code examples from Chapter 7. It provides a comprehensive guide to using AutoGluon's `TabularPredictor` for classification and regression tasks, covering everything from basic setup to advanced customization and a full project implementation.

**AutoGluon Version: 1.5.0** (latest stable release as of this writing)

### Contents:
1. Environment Setup
2. TabularPredictor Basics
3. AutoGluon's Automatic Data Processing
4. Advanced Customization
5. Project: Titanic Survival Prediction
6. Model Interpretability and SHAP Analysis
7. Data Pipeline Consistency
8. Monitoring and Maintaining Models in Production
9. Summary and Best Practices

---
## 1. Environment Setup

In [None]:
"""
Required installations before running this notebook:

Lightweight (~500 MB, ~30 dependencies):
    pip install autogluon.tabular

Standard (~1.5 GB, ~60 dependencies) - Recommended:
    pip install autogluon.tabular[all]

Full (~5-8 GB, ~150+ dependencies):
    pip install autogluon

Additional packages for this notebook:
    pip install pandas numpy matplotlib seaborn scikit-learn psutil shap
"""

# Core imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix
import warnings
import logging
import os
warnings.filterwarnings('ignore')

# Ensure output directory exists for saved figures
os.makedirs('./ag_models', exist_ok=True)

# Set random seed for AutoGluon reproducibility
os.environ['AG_SEED'] = '42'

# AutoGluon import
try:
    from autogluon.tabular import TabularPredictor
    import autogluon.tabular as ag_tabular
    print(f"AutoGluon Tabular version: {ag_tabular.__version__} successfully imported!")
except ImportError:
    print("Please install AutoGluon: pip install autogluon.tabular[all]")

# System check
import psutil
import platform
print(f"Platform: {platform.system()} {platform.release()}")
print(f"CPU Cores: {psutil.cpu_count()}")
print(f"Available RAM: {psutil.virtual_memory().available / (1024**3):.1f} GB")
print("\nNote: For tabular data, GPUs provide minimal benefit. Tree-based models run on CPU.")

---
## 2. TabularPredictor Basics - Adult/Census Income Dataset

In [None]:
def load_adult_dataset():
    """Load and prepare the Adult/Census Income dataset."""
    train_url = "https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data"
    test_url = "https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.test"
    columns = [
        'age', 'workclass', 'fnlwgt', 'education', 'education_num',
        'marital_status', 'occupation', 'relationship', 'race', 'sex',
        'capital_gain', 'capital_loss', 'hours_per_week', 'native_country', 'income'
    ]
    try:
        print("Attempting to load Adult dataset from UCI repository...")
        train_data = pd.read_csv(train_url, names=columns, skipinitialspace=True, na_values='?')
        test_data = pd.read_csv(test_url, names=columns, skipinitialspace=True, skiprows=1, na_values='?')
        data = pd.concat([train_data, test_data], ignore_index=True)
        data['income'] = data['income'].str.replace('.', '', regex=False).str.strip()
        print(f"Adult dataset loaded successfully: {data.shape}")
        return data
    except Exception as e:
        print(f"Could not load real dataset ({e}). A local copy may be required.")
        return pd.DataFrame(columns=columns)

adult_data = load_adult_dataset()
if adult_data.empty:
    raise RuntimeError(
        "Dataset could not be loaded. Please check your internet connection "
        "or provide a local copy of the Adult/Census Income dataset."
    )
train_data, test_data = train_test_split(adult_data, test_size=0.2, random_state=42)

print("\nDataset Info:")
print(f"Training samples: {len(train_data)}, Test samples: {len(test_data)}")
print("\nTarget Distribution:")
print(train_data['income'].value_counts())

In [None]:
print("Starting basic AutoGluon training...")

# Initialize and train the TabularPredictor
predictor = TabularPredictor(
    label='income',
    eval_metric='roc_auc',
    path='./ag_models/adult_income'
)

# Fit the model with a time limit
predictor.fit(train_data, time_limit=120)  # 2-minute time limit for a quick run

### Exploring the Predictor Output

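The leaderboard's `score_val` column reports validation performance: typically a holdout split with default settings, or out-of-fold (cross-validation) predictions when bagging is enabled. Either way, validation scores can overestimate true generalization performance, especially with small datasets, because they also guide model selection and ensembling. The `score_test` column, computed on the test data passed to `leaderboard()`, is what matters for deployment decisions.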

In [None]:
# Display the model leaderboard
print("Model Leaderboard:")
leaderboard = predictor.leaderboard(test_data, silent=True)
print(leaderboard)

# Display feature importance
print("\nFeature Importance:")
feature_importance = predictor.feature_importance(test_data)
print(feature_importance.head(10))

### Binary Classification in Detail

**Metric selection guide for citizen data scientists:**
- **Accuracy**: Use when all errors are equally bad
- **ROC-AUC**: Use when you need to rank items by probability
- **F1**: Use when you need to balance precision and recall, typically because the positive class is rare and both missed positives and false alarms are costly
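
To make the trade-offs in this guide concrete, the sketch below computes accuracy, precision, recall, and F1 from raw confusion-matrix counts. `binary_metrics` is a hypothetical helper written for this illustration, and the counts are made-up numbers for an imbalanced problem, not results from the Adult dataset.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute common binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or never present
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Illustrative counts (assumed, not measured): 100 positives among 1,000 rows
metrics = binary_metrics(tp=40, fp=20, fn=60, tn=880)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts, accuracy is 0.92 even though the model finds only 40% of the positives; F1 (0.50) exposes that weakness.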

In [None]:
# Get detailed evaluation metrics
print("Detailed Evaluation Metrics:")
eval_metrics = predictor.evaluate(test_data, silent=True)
for metric, value in eval_metrics.items():
    print(f"{metric}: {value:.4f}")

# Generate predictions and create a confusion matrix
y_pred = predictor.predict(test_data)
y_true = test_data['income']

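# Pass labels explicitly so matrix rows/columns match the tick labels below
cm = confusion_matrix(y_true, y_pred, labels=predictor.class_labels)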
plt.figure(figsize=(7, 5))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
            xticklabels=predictor.class_labels, yticklabels=predictor.class_labels)
plt.title('Confusion Matrix')
plt.ylabel('True Label')
plt.xlabel('Predicted Label')
plt.savefig('./ag_models/confusion_matrix.png', dpi=150, bbox_inches='tight')
plt.show()

# Display classification report
print("\nClassification Report:")
print(classification_report(y_true, y_pred))

### Probability Calibration Note

AutoGluon's probability estimates may not be well-calibrated out of the box. If you're using these probabilities for decision thresholds or risk scoring, consider applying calibration techniques.
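
One possible calibration approach is isotonic regression fitted on held-out predictions. The sketch below uses synthetic, deliberately overconfident scores as a stand-in for model output; the data, the variable names, and the `brier` helper are assumptions for illustration. With a trained predictor, the raw scores would come from something like `predictor.predict_proba(val_data)[predictor.positive_class]` on data the model did not train on.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

def brier(p, y):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return float(np.mean((np.asarray(p) - np.asarray(y)) ** 2))

rng = np.random.default_rng(0)

# Synthetic stand-in data: true probabilities and the outcomes drawn from them
true_p = rng.uniform(0.05, 0.95, size=2000)
y = (rng.uniform(size=2000) < true_p).astype(int)

# Overconfident raw scores: pushed away from 0.5 toward 0 and 1
raw_p = np.clip(0.5 + 1.6 * (true_p - 0.5), 0.0, 1.0)

# Fit the calibrator on one half, apply it to the other half
raw_cal, y_cal = raw_p[:1000], y[:1000]
raw_test, y_test = raw_p[1000:], y[1000:]

calibrator = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
calibrator.fit(raw_cal, y_cal)
calibrated_test = calibrator.predict(raw_test)

print(f"Brier score (raw):        {brier(raw_test, y_test):.4f}")
print(f"Brier score (calibrated): {brier(calibrated_test, y_test):.4f}")
```

A lower Brier score on the held-out half suggests the calibrated probabilities are closer to observed frequencies. Fit the calibrator on data that was not used for training, otherwise it learns the model's optimism.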

In [None]:
# Prediction probabilities example
sample_data = test_data.head(5)

# Basic predictions (class labels)
predictions = predictor.predict(sample_data)
print("Class predictions:")
print(predictions.values)

# Prediction probabilities for confidence scoring
probabilities = predictor.predict_proba(sample_data)
print("\nPrediction probabilities:")
print(probabilities)

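# Note: For production use with thresholds, consider Platt scaling or isotonic regression
# fitted on held-out predictions. TabularPredictor is not a scikit-learn estimator, so
# CalibratedClassifierCV cannot wrap it directly; sklearn.isotonic.IsotonicRegression can
# be fitted on the predicted probabilities instead.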

### Multi-Class Classification Example

In [None]:
print("Preparing data for Multi-Class Classification (predicting 'workclass')...")
multiclass_data = adult_data.dropna(subset=['workclass']).copy()

mc_train, mc_test = train_test_split(multiclass_data, test_size=0.2, random_state=42)

# Train a multi-class predictor
mc_predictor = TabularPredictor(
    label='workclass',
    eval_metric='accuracy',
    path='./ag_models/multi_class'
).fit(mc_train, time_limit=120)

# Evaluate the multi-class model
mc_performance = mc_predictor.evaluate(mc_test)
print(f"\nMulti-class model performance: {mc_performance}")

### Regression Example

**RMSE Note**: RMSE is the square root of the average squared error, making it more sensitive to large errors than MAE. If you have outliers, RMSE will be pulled higher. When outliers matter, use RMSE; when you want robustness to outliers, prefer MAE.
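
The difference is easiest to see on a tiny example. In the sketch below, two sets of predictions have the same total absolute error, but one concentrates it in a single large miss. The values are invented for illustration, and `rmse` and `mae` are small helpers defined here rather than AutoGluon functions.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error: squares each error before averaging."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def mae(y_true, y_pred):
    """Mean absolute error: averages error magnitudes directly."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(diff)))

y_true = np.array([40, 40, 40, 40, 40])
y_pred_spread = np.array([38, 42, 38, 42, 40])    # four errors of 2 hours
y_pred_outlier = np.array([40, 40, 40, 40, 48])   # one error of 8 hours

for name, y_pred in [("spread", y_pred_spread), ("outlier", y_pred_outlier)]:
    print(f"{name}: MAE={mae(y_true, y_pred):.2f}, RMSE={rmse(y_true, y_pred):.2f}")
```

Both prediction sets have an MAE of 1.6, but RMSE rises from about 1.79 to about 3.58 when the error is concentrated in one prediction.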

In [None]:
print("Preparing data for Regression (predicting 'hours_per_week')...")
regression_data = adult_data.dropna(subset=['hours_per_week']).copy()
reg_train, reg_test = train_test_split(regression_data, test_size=0.2, random_state=42)

# Train a regression predictor
reg_predictor = TabularPredictor(
    label='hours_per_week',
    problem_type='regression',
    eval_metric='root_mean_squared_error',
    path='./ag_models/regression'
).fit(reg_train, time_limit=120)

# Evaluate and plot regression results
reg_metrics = reg_predictor.evaluate(reg_test)
print(f"\nRegression Metrics: {reg_metrics}")

reg_predictions = reg_predictor.predict(reg_test)
plt.figure(figsize=(8, 6))
plt.scatter(reg_test['hours_per_week'], reg_predictions, alpha=0.3)
plt.plot([0, 100], [0, 100], 'r--')
plt.xlabel('Actual Hours per Week')
plt.ylabel('Predicted Hours per Week')
plt.title('Regression: Predicted vs. Actual')
plt.grid(True)
plt.savefig('./ag_models/regression_scatter.png', dpi=150, bbox_inches='tight')
plt.show()

---
## 3. AutoGluon's Automatic Data Processing

AutoGluon handles most data processing automatically. This section demonstrates its capabilities.

**Advanced insight**: The missingness pattern itself is often informative. In healthcare data, a missing lab test might indicate the doctor didn't think it was necessary. Consider adding explicit "is_missing" indicator features before training if missingness carries meaning in your domain.
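
A minimal sketch of such indicator features follows. `add_missing_indicators` is a hypothetical helper, and the column names and values are invented. The column list is passed explicitly so that training and inference always produce the same set of indicator columns, even when an inference batch happens to contain no missing values.

```python
import numpy as np
import pandas as pd

def add_missing_indicators(df, columns, suffix="_is_missing"):
    """Return a copy of df with a 0/1 missingness flag for each listed column."""
    df = df.copy()
    for col in columns:
        df[f"{col}{suffix}"] = df[col].isna().astype(int)
    return df

# Made-up rows for illustration
demo = pd.DataFrame({
    "age": [34, 51, 29],
    "lab_test": [1.2, np.nan, np.nan],
    "occupation": ["Sales", None, "Tech-support"],
})

# Flag only the columns where missingness is believed to carry meaning
demo_flagged = add_missing_indicators(demo, columns=["lab_test", "occupation"])
print(demo_flagged)
```

The original columns are left untouched, so AutoGluon can still apply its own missing-value handling to them.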

In [None]:
# Create data with various data types and missing values to demonstrate automatic handling
processing_demo_data = adult_data.sample(1000, random_state=42).copy()

print("Original Missing Values:")
print(processing_demo_data.isnull().sum()[processing_demo_data.isnull().sum() > 0])

# Add a synthetic datetime feature (derived from age, purely for demonstration)
processing_demo_data['hire_date'] = pd.to_datetime('2022-01-01') - pd.to_timedelta(
    processing_demo_data['age'] * 365, 'd'
)

print("\nTraining a model on this mixed-type, missing-value data...")
# AutoGluon will automatically detect data types, handle missing values, and engineer features
processing_predictor = TabularPredictor(label='income', path='./ag_models/data_processing_demo')
processing_predictor.fit(processing_demo_data, time_limit=60, verbosity=2)

print("\nAutoGluon automatically handled:")
print("- Missing values in 'workclass' and 'occupation'")
print("- High-cardinality categorical features like 'native_country'")
print("- Datetime feature 'hire_date' by extracting useful components (year, month, etc.)")

---
## 4. Advanced Customization

**Warning**: Overly conservative hyperparameter settings can underfit, especially with smaller datasets. Match model complexity to your data size and signal-to-noise ratio.

In [None]:
print("Demonstrating Advanced Customization...")

# 1. Custom Hyperparameters for specific models
hyperparameters = {
    'GBM': [
        {'num_boost_round': 10000, 'learning_rate': 0.05, 'num_leaves': 26},
        {'num_boost_round': 5000, 'learning_rate': 0.1, 'num_leaves': 32},
    ],
    'RF': [
        {'n_estimators': 300, 'max_depth': 15, 'n_jobs': -1},
    ],
}

# 2. Custom Ensemble Configuration
custom_predictor = TabularPredictor(
    label='income',
    path='./ag_models/custom_config'
).fit(
    train_data,
    time_limit=180,
    hyperparameters=hyperparameters,
    num_bag_folds=5,
    num_bag_sets=1,
    num_stack_levels=1,
    verbosity=2
)

print("\nCustom Model Leaderboard:")
custom_leaderboard = custom_predictor.leaderboard(test_data, silent=True)
print(custom_leaderboard)

---
## 5. Project: Titanic Survival Prediction

**Note on educational datasets**: The Titanic dataset is perfect for learning, but it's been heavily used in ML education for decades. Performance numbers here may be optimistic compared to truly novel problems where you don't have accumulated community wisdom.

In [None]:
print("Starting Titanic Survival Prediction Project...")

# Load Titanic dataset from seaborn (consistent with chapter text)
def load_titanic_dataset():
    """Load Titanic dataset with comprehensive preprocessing"""
    try:
        titanic = sns.load_dataset('titanic')
        
        # Drop columns that leak the target or duplicate other features:
        # - 'alive': string version of 'survived' (direct leakage!)
        # - 'class': redundant with 'pclass'
        # - 'who': derived from sex/age (redundant, not target leakage)
        # - 'adult_male': derived from sex/age (redundant)
        leaky_columns = ['alive', 'class', 'who', 'adult_male']
        titanic = titanic.drop(columns=[c for c in leaky_columns if c in titanic.columns])
        
        print(f"Titanic dataset loaded: {titanic.shape}")
        print(f"Columns: {titanic.columns.tolist()}")
        return titanic
    except Exception as e:
        print(f"Could not load Titanic dataset: {e}")
        return None

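titanic_data = load_titanic_dataset()
if titanic_data is None:
    raise RuntimeError("Titanic dataset could not be loaded. Check your internet connection.")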

print(f"\nSurvival rate: {titanic_data['survived'].mean():.2%}")
print("\nSurvival by key factors:")
print("By Sex:")
print(titanic_data.groupby('sex')['survived'].mean())
print("\nBy Class:")
print(titanic_data.groupby('pclass')['survived'].mean())

### Custom Feature Engineering for Titanic

**IMPORTANT**: This exact function must be used in both training and inference. Any differences between training and serving feature engineering will cause silent model degradation.

In [None]:
def engineer_titanic_features(df):
    """Engineer domain-specific features for Titanic dataset
    
    IMPORTANT: This exact function must be used in both training and inference.
    Any differences between training and serving feature engineering will cause
    silent model degradation. Consider packaging this function with your model
    or using a feature store to ensure consistency.
    """
    df = df.copy()
    
    # Family size features
    if 'sibsp' in df.columns and 'parch' in df.columns:
        df['family_size'] = df['sibsp'] + df['parch'] + 1
        df['is_alone'] = (df['family_size'] == 1).astype(int)
    
    # Age groups
    if 'age' in df.columns:
        df['age_group'] = pd.cut(
            df['age'],
            bins=[0, 12, 20, 40, 60, 100],
            labels=['child', 'teen', 'adult', 'middle_age', 'senior']
        )
    
    # Fare per person and class interactions
    if 'fare' in df.columns and 'family_size' in df.columns:
        df['fare_per_person'] = df['fare'] / df['family_size']
    
    if 'pclass' in df.columns and 'sex' in df.columns:
        df['class_sex'] = df['pclass'].astype(str) + '_' + df['sex']
    
    return df

# Split data
titanic_train = titanic_data.sample(frac=0.8, random_state=42)
titanic_test = titanic_data.drop(titanic_train.index)

print(f"Training samples: {len(titanic_train)}, Test samples: {len(titanic_test)}")

In [None]:
# Train baseline model (no feature engineering)
print("Training baseline model...")
titanic_predictor_baseline = TabularPredictor(
    label='survived',
    eval_metric='roc_auc',
    path='./ag_models/titanic_baseline'
)
titanic_predictor_baseline.fit(titanic_train, time_limit=120)

# Evaluate baseline
baseline_eval = titanic_predictor_baseline.evaluate(titanic_test, silent=True)
print(f"\nBaseline Titanic ROC-AUC: {baseline_eval['roc_auc']:.4f}")

In [None]:
# Train improved model with feature engineering
print("Training improved model with feature engineering...")

titanic_train_eng = engineer_titanic_features(titanic_train)
titanic_test_eng = engineer_titanic_features(titanic_test)

titanic_predictor_improved = TabularPredictor(
    label='survived',
    eval_metric='roc_auc',
    path='./ag_models/titanic_improved'
)
titanic_predictor_improved.fit(titanic_train_eng, time_limit=120)

# Evaluate improved model
improved_eval = titanic_predictor_improved.evaluate(titanic_test_eng, silent=True)
print(f"\nImproved Titanic ROC-AUC: {improved_eval['roc_auc']:.4f}")
print(f"Improvement: {improved_eval['roc_auc'] - baseline_eval['roc_auc']:.4f}")

---
## 6. Model Interpretability and SHAP Analysis

In [None]:
# Feature importance analysis (Figure 7-6)
print("Feature Importance Analysis")
importance = titanic_predictor_improved.feature_importance(titanic_test_eng)
print("\nTop 10 Most Important Features:")
print(importance.head(10))

# Create Figure 7-6: Feature importance visualization
plt.figure(figsize=(10, 6))
top_importance = importance.head(10)
colors = sns.color_palette("Blues_r", len(top_importance))
bars = plt.barh(range(len(top_importance)), top_importance['importance'].values, color=colors)

plt.yticks(range(len(top_importance)), top_importance.index)
plt.xlabel('Importance Score')
plt.title('Feature Importance Analysis\nRelative Contribution to Survival Predictions', fontweight='bold')

# Add value labels
for i, (bar, val) in enumerate(zip(bars, top_importance['importance'].values)):
    plt.text(bar.get_width() + 0.005, bar.get_y() + bar.get_height()/2,
             f'{val:.3f}', va='center', fontsize=10)

plt.gca().invert_yaxis()
plt.tight_layout()
plt.savefig('./ag_models/figure_7_6_feature_importance.png', dpi=300, bbox_inches='tight', facecolor='white')
plt.show()
print("\nFigure 7-6 saved to: ./ag_models/figure_7_6_feature_importance.png")

In [None]:
# Model leaderboard comparison
print("\nModel Performance Comparison (Leaderboard):")
leaderboard = titanic_predictor_improved.leaderboard(titanic_test_eng, silent=True)
print(leaderboard.head(10))

### SHAP Analysis for Individual Predictions

SHAP (SHapley Additive exPlanations) provides both global and local interpretability.
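
To show what a Shapley value is, the sketch below computes exact values for a toy scoring function by enumerating every feature subset. It illustrates the definition only and is not how the `shap` library computes values for real models, which relies on model-specific algorithms or sampling. The `toy_model` function, its coefficients, and the baseline row are all invented for this example; a feature that is "absent" from a subset takes its baseline value.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance x relative to a baseline row."""
    n = len(x)

    def value(subset):
        # Features in the subset take their real value; the rest take the baseline
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i when added to subset s
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

def toy_model(z):
    """Hypothetical survival score with an interaction between sex and class."""
    is_female, first_class, fare = z
    return 0.1 + 0.4 * is_female + 0.1 * first_class + 0.2 * is_female * first_class + 0.001 * fare

x = [1, 1, 100.0]        # passenger being explained
baseline = [0, 0, 30.0]  # reference passenger

phi = shapley_values(toy_model, x, baseline)
for name, contribution in zip(["is_female", "first_class", "fare"], phi):
    print(f"{name}: {contribution:+.3f}")
print(f"sum of contributions: {sum(phi):.3f}")
print(f"score difference:     {toy_model(x) - toy_model(baseline):.3f}")
```

The contributions add up to the difference between the explained score and the baseline score, which is the additivity property that makes local explanations sum to the prediction. The interaction term is split evenly between the two features involved.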

In [None]:
# SHAP Analysis (requires shap package)
try:
    import shap
    
    # Get the best single model (not ensemble) for SHAP analysis
    best_model = leaderboard[~leaderboard['model'].str.contains('Ensemble')].iloc[0]['model']
    print(f"Using model for SHAP: {best_model}")
    
    # Create explainer
    # Note: For tree-based models, use TreeExplainer for efficiency
    sample_for_shap = titanic_test_eng.drop('survived', axis=1).head(100)
    
    # Get predictions for SHAP baseline
    predictions = titanic_predictor_improved.predict_proba(sample_for_shap, model=best_model)
    
    # Compute SHAP values
    try:
        # Try TreeExplainer first (fast, works with tree-based models)
        # Load the model from the Titanic predictor (private API; may change between releases)
        internal_model = titanic_predictor_improved._trainer.load_model(best_model)
        explainer = shap.TreeExplainer(internal_model)
        print("Using TreeExplainer (fast path for tree-based models)")
    except Exception:
        # Fall back to KernelExplainer (model-agnostic but slower)
        print("TreeExplainer not compatible; falling back to KernelExplainer (this may take a minute)...")
        explainer = shap.KernelExplainer(
            lambda x: titanic_predictor_improved.predict_proba(
                pd.DataFrame(x, columns=sample_for_shap.columns),
                model=best_model  # explain the same model named above
            ).values,
            shap.sample(sample_for_shap, 50)
        )
    
    shap_values = explainer(sample_for_shap)
    
    # Summary plot showing feature impact across all predictions
    shap.summary_plot(shap_values, sample_for_shap, show=False)
    plt.tight_layout()
    plt.show()
    
    print("\nSHAP analysis complete.")
    print("You can also create individual prediction explanations with:")
    print("  shap.waterfall_plot(shap_values[0])")
    
except ImportError:
    print("SHAP not installed. Install with: pip install shap")
except Exception as e:
    print(f"SHAP analysis encountered an issue: {e}")
    print("This is normal - SHAP requires specific model access patterns.")
    print("Feature importance (shown above) provides an alternative view of feature contributions.")

### Production-Ready Prediction Function

In [None]:
def predict_passenger_survival(passenger_df, predictor=None):
    """Production-ready prediction function for Titanic survival
    
    For production deployment, includes:
    - Input validation (expected columns, value ranges)
    - Prediction logging for monitoring
    - Structured error responses for debugging
    """
    logger = logging.getLogger(__name__)
    
    if predictor is None:
        predictor = titanic_predictor_improved
    
    try:
        # Input validation
        required_cols = ['pclass', 'sex', 'age', 'sibsp', 'parch', 'fare', 'embarked']
        missing_cols = [c for c in required_cols if c not in passenger_df.columns]
        
        if missing_cols:
            logger.warning(f"Missing expected columns (prediction may fail): {missing_cols}")
        
        # Apply same feature engineering as training
        engineered_df = engineer_titanic_features(passenger_df)
        
        # Make prediction
        prediction = predictor.predict(engineered_df)
        probability = predictor.predict_proba(engineered_df)
        
        # Get survival probability (class 1)
        if probability.shape[1] > 1:
            survival_prob = float(probability.iloc[0, 1])
        else:
            survival_prob = float(probability.iloc[0, 0])
        
        result = {
            'prediction': int(prediction.iloc[0]),
            'survival_probability': survival_prob,
            'confidence': 'high' if survival_prob > 0.8 or survival_prob < 0.2 else 'medium'
        }
        
        # Log prediction for monitoring
        logger.info(f"Prediction made: {result}")
        return result
    
    except Exception as e:
        logger.error(f"Prediction failed: {str(e)}", exc_info=True)
        return {'error': str(e), 'error_type': type(e).__name__}

# Test the prediction function
sample_passenger = pd.DataFrame({
    'pclass': [1],
    'sex': ['female'],
    'age': [29.0],
    'sibsp': [0],
    'parch': [0],
    'fare': [211.34],
    'embarked': ['S'],
    'deck': ['C'],
    'embark_town': ['Southampton'],
    'alone': [True]
})

result = predict_passenger_survival(sample_passenger)
print(f"\nSample prediction for 1st class female passenger:")
print(f"  Prediction: {'Survived' if result.get('prediction') == 1 else 'Did not survive'}")
print(f"  Survival probability: {result.get('survival_probability', 0):.2%}")
print(f"  Confidence: {result.get('confidence', 'N/A')}")

---
## 7. Data Pipeline Consistency

A critical but often overlooked aspect of production ML is ensuring your data pipeline is identical in training and inference.
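
One lightweight safeguard is to record the feature schema at training time and validate every inference batch against it. The sketch below is one way to do that with pandas; `capture_schema` and `validate_schema` are hypothetical helpers, and the tiny frames are invented stand-ins for real training and serving data.

```python
import pandas as pd

def capture_schema(df):
    """Record column names and dtypes of the training features."""
    return {col: str(dtype) for col, dtype in df.dtypes.items()}

def validate_schema(df, expected_schema):
    """Return a list of problems; an empty list means the frame matches."""
    problems = []
    for col, dtype in expected_schema.items():
        if col not in df.columns:
            problems.append(f"missing column: {col}")
        elif str(df[col].dtype) != dtype:
            problems.append(
                f"dtype mismatch for {col}: expected {dtype}, got {df[col].dtype}"
            )
    extra = [c for c in df.columns if c not in expected_schema]
    if extra:
        problems.append(f"unexpected columns: {extra}")
    return problems

# Invented stand-in for the training features
train_features = pd.DataFrame({
    "age": [22.0, 38.0],
    "fare": [7.25, 71.28],
    "sex": ["male", "female"],
})
schema = capture_schema(train_features)

good_batch = pd.DataFrame({"age": [30.0], "fare": [8.05], "sex": ["female"]})
bad_batch = pd.DataFrame({"age": ["30"], "sex": ["female"]})  # age as text, fare missing

print("good batch:", validate_schema(good_batch, schema))
print("bad batch: ", validate_schema(bad_batch, schema))
```

A check like this catches structural mismatches early, before they turn into silent prediction errors. It does not detect changes in value distributions, which need separate monitoring.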

In [None]:
# Demonstrating pipeline consistency issues

print("Data Pipeline Consistency Demonstration")
print("=" * 50)

# Calculate training statistics (these should be SAVED)
training_stats = {
    'age_median': titanic_train['age'].median(),
    'fare_median': titanic_train['fare'].median(),
    'embarked_mode': titanic_train['embarked'].mode()[0] if 'embarked' in titanic_train.columns else 'S'
}

print("\nTraining statistics (save these for inference):")
for stat, value in training_stats.items():
    print(f"  {stat}: {value}")

# WRONG: Recalculating statistics on inference data
inference_stats_wrong = {
    'age_median': titanic_test['age'].median(),
    'fare_median': titanic_test['fare'].median(),
}

print("\nWRONG - Inference statistics (would cause training/serving skew):")
for stat, value in inference_stats_wrong.items():
    print(f"  {stat}: {value}")

print("\n" + "=" * 50)
print("Common pipeline bugs to avoid:")
print("- Recalculating statistics (mean, median) on inference data")
print("- Different order of feature engineering steps")
print("- Missing edge case handling not present in training")
print("- Version mismatches in preprocessing libraries")

In [None]:
# Example: Saving and loading preprocessing statistics
import json

def save_preprocessing_stats(stats, filepath):
    """Save preprocessing statistics for use during inference"""
    # Convert numpy types to Python types for JSON serialization
    serializable_stats = {}
    for k, v in stats.items():
        if hasattr(v, 'item'):
            serializable_stats[k] = v.item()
        else:
            serializable_stats[k] = v
    
    with open(filepath, 'w') as f:
        json.dump(serializable_stats, f, indent=2)
    print(f"Saved preprocessing stats to {filepath}")

def load_preprocessing_stats(filepath):
    """Load preprocessing statistics for inference"""
    with open(filepath, 'r') as f:
        return json.load(f)

# Save the training statistics
save_preprocessing_stats(training_stats, './ag_models/preprocessing_stats.json')

# In production, load and use these
loaded_stats = load_preprocessing_stats('./ag_models/preprocessing_stats.json')
print(f"\nLoaded stats: {loaded_stats}")
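
The saved statistics only help if inference code actually uses them. The sketch below shows one way to apply them when imputing missing values explicitly. `apply_preprocessing_stats` is a hypothetical helper, and the statistic values are illustrative placeholders rather than numbers computed from the Titanic data. AutoGluon handles missing values on its own, so a step like this matters mainly when your pipeline imputes before calling the predictor.

```python
import json

import numpy as np
import pandas as pd

def apply_preprocessing_stats(df, stats):
    """Fill missing values using statistics computed on the training set."""
    df = df.copy()
    fill_values = {
        "age": stats["age_median"],
        "fare": stats["fare_median"],
        "embarked": stats["embarked_mode"],
    }
    for col, value in fill_values.items():
        if col in df.columns:
            df[col] = df[col].fillna(value)
    return df

# Placeholder values standing in for the JSON file written at training time
stats = json.loads('{"age_median": 28.0, "fare_median": 14.45, "embarked_mode": "S"}')

incoming = pd.DataFrame({
    "age": [np.nan, 40.0],
    "fare": [7.9, np.nan],
    "embarked": [None, "C"],
})
imputed = apply_preprocessing_stats(incoming, stats)
print(imputed)
```

Because the fill values come from the saved file, a batch of one row is imputed exactly the same way as a batch of thousands.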

---
## 8. Monitoring and Maintaining Models in Production

**Model drift is inevitable.** Data distributions change, user behavior evolves, business rules shift. A model trained in 2024 may perform poorly by 2025.

In [None]:
# Monitoring example: Track prediction distributions

def create_monitoring_baseline(predictor, reference_data):
    """Create baseline statistics for monitoring"""
    predictions = predictor.predict(reference_data)
    probabilities = predictor.predict_proba(reference_data)
    
    baseline = {
        'prediction_distribution': predictions.value_counts(normalize=True).to_dict(),
        'mean_probability_class_1': float(probabilities.iloc[:, 1].mean()) if probabilities.shape[1] > 1 else float(probabilities.iloc[:, 0].mean()),
        'std_probability_class_1': float(probabilities.iloc[:, 1].std()) if probabilities.shape[1] > 1 else float(probabilities.iloc[:, 0].std()),
        'n_samples': len(reference_data)
    }
    return baseline

def check_for_drift(predictor, new_data, baseline, threshold=0.1):
    """Check if new predictions differ significantly from baseline"""
    new_predictions = predictor.predict(new_data)
    new_probabilities = predictor.predict_proba(new_data)
    
    new_mean = float(new_probabilities.iloc[:, 1].mean()) if new_probabilities.shape[1] > 1 else float(new_probabilities.iloc[:, 0].mean())
    
    drift_score = abs(new_mean - baseline['mean_probability_class_1'])
    
    alerts = []
    if drift_score > threshold:
        alerts.append(f"ALERT: Prediction mean shifted by {drift_score:.3f} (threshold: {threshold})")
    
    return {
        'drift_score': drift_score,
        'new_mean': new_mean,
        'baseline_mean': baseline['mean_probability_class_1'],
        'alerts': alerts
    }

# Create monitoring baseline from training data
monitoring_baseline = create_monitoring_baseline(titanic_predictor_improved, titanic_train_eng)
print("Monitoring Baseline Created:")
print(f"  Prediction distribution: {monitoring_baseline['prediction_distribution']}")
print(f"  Mean survival probability: {monitoring_baseline['mean_probability_class_1']:.3f}")

# Check test data for drift (simulating production monitoring)
drift_check = check_for_drift(titanic_predictor_improved, titanic_test_eng, monitoring_baseline)
print(f"\nDrift Check Results:")
print(f"  Drift score: {drift_check['drift_score']:.3f}")
print(f"  New mean: {drift_check['new_mean']:.3f}, Baseline: {drift_check['baseline_mean']:.3f}")
if drift_check['alerts']:
    for alert in drift_check['alerts']:
        print(f"  {alert}")
else:
    print("  No drift alerts - predictions within expected range")
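
A shift in the mean can miss changes in the shape of the score distribution. The Population Stability Index (PSI) is a common complement that compares binned distributions. The sketch below is a minimal version for scores in [0, 1]; the function name, the equal-width binning, the `eps` floor for empty bins, and the synthetic beta-distributed scores are all assumptions made for illustration.

```python
import numpy as np

def population_stability_index(expected, actual, n_bins=10, eps=1e-6):
    """PSI between two samples of scores in [0, 1], using equal-width bins."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    expected_counts, _ = np.histogram(expected, bins=edges)
    actual_counts, _ = np.histogram(actual, bins=edges)
    # Floor the fractions so empty bins do not produce log(0) or division by zero
    expected_frac = np.clip(expected_counts / expected_counts.sum(), eps, None)
    actual_frac = np.clip(actual_counts / actual_counts.sum(), eps, None)
    return float(np.sum((actual_frac - expected_frac) * np.log(actual_frac / expected_frac)))

rng = np.random.default_rng(42)
baseline_scores = rng.beta(2, 5, size=5000)  # synthetic reference scores
same_scores = rng.beta(2, 5, size=5000)      # new sample, same distribution
shifted_scores = rng.beta(5, 2, size=5000)   # new sample, shifted distribution

psi_same = population_stability_index(baseline_scores, same_scores)
psi_shifted = population_stability_index(baseline_scores, shifted_scores)
print(f"PSI (same distribution):    {psi_same:.4f}")
print(f"PSI (shifted distribution): {psi_shifted:.4f}")
```

A widely used rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift, but treat those cutoffs as starting points to tune for your own data volume and risk tolerance.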

In [None]:
# Feature distribution monitoring

def monitor_feature_distributions(train_data, new_data, features_to_monitor):
    """Compare feature distributions between training and new data"""
    results = {}
    
    for feature in features_to_monitor:
        if feature not in train_data.columns or feature not in new_data.columns:
            continue
            
        if pd.api.types.is_numeric_dtype(train_data[feature]):
            # Numerical feature
            train_mean = train_data[feature].mean()
            new_mean = new_data[feature].mean()
            train_std = train_data[feature].std()
            
            # Z-score of the shift
            if train_std > 0:
                shift_z = abs(new_mean - train_mean) / train_std
            else:
                shift_z = 0
            
            results[feature] = {
                'type': 'numerical',
                'train_mean': train_mean,
                'new_mean': new_mean,
                'shift_z_score': shift_z,
                'alert': shift_z > 2  # Alert if shift > 2 standard deviations
            }
    
    return results

# Monitor key features
features_to_monitor = ['age', 'fare', 'sibsp', 'parch']
feature_drift = monitor_feature_distributions(titanic_train, titanic_test, features_to_monitor)

print("Feature Distribution Monitoring:")
print("=" * 50)
for feature, stats in feature_drift.items():
    alert_marker = "[!]" if stats['alert'] else "[OK]"
    print(f"{alert_marker} {feature}:")
    print(f"    Train mean: {stats['train_mean']:.2f}, New mean: {stats['new_mean']:.2f}")
    print(f"    Shift (z-score): {stats['shift_z_score']:.2f}")
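
The function above only covers numerical features. For categorical features, one simple option is the total variation distance between category frequencies, sketched below. `categorical_shift` and its `_frequencies` helper are hypothetical, the `"<missing>"` label is an arbitrary placeholder, and the example series are invented.

```python
import pandas as pd

def _frequencies(col):
    """Relative frequency of each category, counting missing values as their own category."""
    labelled = col.astype(object).where(col.notna(), "<missing>")
    return labelled.value_counts(normalize=True)

def categorical_shift(train_col, new_col):
    """Total variation distance between two categorical distributions (0 = identical, 1 = disjoint)."""
    train_freq = _frequencies(train_col)
    new_freq = _frequencies(new_col)
    categories = train_freq.index.union(new_freq.index)
    train_freq = train_freq.reindex(categories, fill_value=0.0)
    new_freq = new_freq.reindex(categories, fill_value=0.0)
    return float(0.5 * (train_freq - new_freq).abs().sum())

# Invented example: embarkation ports at training time versus in new data
train_embarked = pd.Series(["S", "S", "C", "Q"])
new_embarked = pd.Series(["S", "C", "C", None])

shift = categorical_shift(train_embarked, new_embarked)
print(f"Total variation distance: {shift:.2f}")
```

The alert threshold for this distance is a judgment call; it is worth checking how much the value fluctuates between random samples of your training data before choosing one.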

### Monitoring Tools Integration

While AutoGluon doesn't include built-in monitoring, you can integrate with:
- **MLflow**: Track experiments, model versions, deployment metadata
- **Weights & Biases**: Visualize training runs and performance
- **Amazon SageMaker Model Monitor**: Monitors SageMaker endpoints, including ones that serve AutoGluon models
- **Custom dashboards**: Grafana dashboards tracking prediction statistics

In [None]:
# Example: Logging to MLflow (if installed)
try:
    import mlflow
    
    # Log model metrics
    with mlflow.start_run(run_name="titanic_autogluon"):
        mlflow.log_params({
            'model_type': 'AutoGluon TabularPredictor',
            'autogluon_version': ag_tabular.__version__,
            'training_samples': len(titanic_train)
        })
        mlflow.log_metrics({
            'baseline_roc_auc': baseline_eval['roc_auc'],
            'improved_roc_auc': improved_eval['roc_auc']
        })
        print("Logged to MLflow successfully")
        
except ImportError:
    print("MLflow not installed. Install with: pip install mlflow")
except Exception as e:
    print(f"MLflow logging skipped: {e}")

---
## 9. Summary and Best Practices

This notebook demonstrated the power and simplicity of AutoGluon 1.5.0 for tabular data.

In [None]:
# Final summary
print("="*60)
print("CHAPTER 7 SUMMARY - Working with Tabular Data")
print("="*60)

print("\n1. AUTOGLUON VERSION: 1.5.0")

print("\n2. KEY RESULTS:")
print(f"   Adult Income Dataset (Binary Classification):")
print(f"   - Trained in ~2 minutes")
print(f"   - Multiple models evaluated automatically")

print(f"\n   Titanic Dataset:")
print(f"   - Baseline ROC-AUC: {baseline_eval['roc_auc']:.4f}")
print(f"   - Improved ROC-AUC: {improved_eval['roc_auc']:.4f}")
print(f"   - Feature engineering improvement: {improved_eval['roc_auc'] - baseline_eval['roc_auc']:+.4f}")

print("\n3. KEY BEST PRACTICES:")
best_practices = [
    "Start Simple: Use defaults before customizing",
    "Provide Enough Time: More time usually means better models",
    "Trust the Ensemble: AutoGluon's ensembles usually win",
    "Feature Engineering: Domain features often help most",
    "Pipeline Consistency: Same preprocessing in train & inference",
    "Monitor Continuously: Model drift is inevitable"
]
for i, practice in enumerate(best_practices, 1):
    print(f"   {i}. {practice}")

print("\n4. FILES CREATED:")
print("   - ./ag_models/figure_7_6_feature_importance.png")
print("   - ./ag_models/preprocessing_stats.json")
print("   - Various model directories")

print("\n" + "="*60)
print("Congratulations on completing Chapter 7!")
print("="*60)

### What You've Accomplished

You have now learned how to:

- Set up and use `TabularPredictor` for classification and regression tasks
- Leverage AutoGluon's automatic data processing
- Customize models, hyperparameters, and ensembles
- Build and evaluate an end-to-end ML project (Titanic)
- Interpret model behavior with feature importance and SHAP
- Ensure data pipeline consistency between training and inference
- Monitor models in production for drift

### Next Steps

1. **Apply to Your Own Data**: Try `TabularPredictor` on your own CSV files
2. **Experiment with Presets**: Compare the `medium_quality`, `high_quality`, and `best_quality` presets by passing each one to `fit(presets=...)` in a separate run
3. **Deploy a Model**: Use the `predict_passenger_survival` function as a template for a Flask API
4. **Set Up Monitoring**: Implement drift detection for production models