# Term Deposit Marketing Prediction Models

This notebook builds two predictive models for term deposit marketing:
1. **Pre-Call Model**: Predicts which customers to call before making any calls (excludes campaign-related features)
2. **Post-Call Model**: Predicts which customers to focus on after initial contact (includes all features)

For each model, we'll use LazyPredict to identify the top 3 performing models, then evaluate each in detail with classification reports, confusion matrices, and observations.

## 1. Setup and Data Preparation

In [None]:
# Import necessary libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import (
    classification_report,
    confusion_matrix,
    accuracy_score,
    precision_score,
    recall_score,
    f1_score,
    roc_auc_score,
    roc_curve,
    precision_recall_curve,
    auc,
)
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer

# Import LazyPredict for model comparison
from lazypredict.Supervised import LazyClassifier

# Import models for detailed evaluation
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import (
    RandomForestClassifier,
    GradientBoostingClassifier,
    AdaBoostClassifier,
)
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from xgboost import XGBClassifier
from lightgbm import LGBMClassifier

# Set display options
pd.set_option("display.max_columns", None)
sns.set_style("whitegrid")

In [None]:
# Load the dataset
data = pd.read_csv('bank-additional-full.csv', sep=';')
data.head()

In [None]:
# Data exploration
print(f"Dataset shape: {data.shape}")
print(f"\nTarget variable distribution:\n{data['y'].value_counts()}")
print(f"\nPercentage of subscribers: {data['y'].value_counts(normalize=True)['yes']*100:.2f}%")

### Data Preprocessing

In [None]:
# Convert target variable to binary (0/1)
data['y'] = data['y'].map({'no': 0, 'yes': 1})

# Split features and target
X = data.drop('y', axis=1)
y = data['y']

# Identify categorical and numerical features
categorical_features = X.select_dtypes(include=['object']).columns.tolist()
numerical_features = X.select_dtypes(include=['int64', 'float64']).columns.tolist()

print(f"Categorical features: {categorical_features}")
print(f"Numerical features: {numerical_features}")

## 2. Feature Selection for Both Models

In [None]:
# Model 1: Pre-Call Model (excluding campaign-related features)
campaign_features = ['duration', 'campaign', 'pdays', 'previous', 'poutcome', 'day', 'month']
X1 = X.drop(campaign_features, axis=1)
y1 = y

# Model 2: Post-Call Model (including all features)
X2 = X
y2 = y

print(f"Model 1 features: {X1.columns.tolist()}")
print(f"Model 2 features: {X2.columns.tolist()}")

In [None]:
# Split data into training and testing sets
X1_train, X1_test, y1_train, y1_test = train_test_split(X1, y1, test_size=0.2, random_state=42, stratify=y1)
X2_train, X2_test, y2_train, y2_test = train_test_split(X2, y2, test_size=0.2, random_state=42, stratify=y2)

print(f"Model 1 - Training set shape: {X1_train.shape}, Test set shape: {X1_test.shape}")
print(f"Model 2 - Training set shape: {X2_train.shape}, Test set shape: {X2_test.shape}")

In [None]:
# Create preprocessing pipelines for both models
# For Model 1 (Pre-Call)
categorical_features1 = [col for col in categorical_features if col not in campaign_features]
numerical_features1 = [col for col in numerical_features if col not in campaign_features]

categorical_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='constant', fill_value='missing')),
    ('onehot', OneHotEncoder(handle_unknown='ignore'))
])

numerical_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='median')),
    ('scaler', StandardScaler())
])

preprocessor = ColumnTransformer(
    transformers=[
        ('num', numerical_transformer, numerical_features1),
        ('cat', categorical_transformer, categorical_features1)
    ])

# For Model 2 (Post-Call)
preprocessor2 = ColumnTransformer(
    transformers=[
        ('num', numerical_transformer, numerical_features),
        ('cat', categorical_transformer, categorical_features)
    ])

## 3. Model 1: Pre-Call Prediction Using LazyPredict

First, we'll use LazyPredict to compare multiple models and identify the top performers for pre-call prediction.

In [None]:
# Apply preprocessing to the training data
X1_train_preprocessed = preprocessor.fit_transform(X1_train)
X1_test_preprocessed = preprocessor.transform(X1_test)

# Initialize LazyClassifier
clf1 = LazyClassifier(verbose=0, ignore_warnings=True, custom_metric=None)

# Fit and evaluate models
models1, predictions1 = clf1.fit(X1_train_preprocessed, X1_test_preprocessed, y1_train, y1_test)

# Display results
print("\nModel 1 (Pre-Call) - LazyPredict Results:")
print(models1.head(10))

### Detailed Evaluation of Top 3 Models for Pre-Call Prediction

Now we'll evaluate the top 3 models in detail with classification reports, confusion matrices, and ROC curves.

In [None]:
# Function to evaluate models with classification report and confusion matrix
def evaluate_model(model_name, model, X_train, X_test, y_train, y_test, preprocessor):
    # Create pipeline with preprocessing and model
    pipeline = Pipeline(steps=[("preprocessor", preprocessor), ("classifier", model)])

    # Train the model
    pipeline.fit(X_train, y_train)

    # Make predictions
    y_pred = pipeline.predict(X_test)
    y_pred_proba = pipeline.predict_proba(X_test)[:, 1]

    # Classification report
    print(f"\n{model_name} - Classification Report:")
    report = classification_report(y_test, y_pred)
    print(report)

    # Confusion Matrix
    plt.figure(figsize=(8, 6))
    cm = confusion_matrix(y_test, y_pred)
    sns.heatmap(
        cm,
        annot=True,
        fmt="d",
        cmap="Blues",
        xticklabels=["No", "Yes"],
        yticklabels=["No", "Yes"],
    )
    plt.xlabel("Predicted")
    plt.ylabel("Actual")
    plt.title(f"Confusion Matrix - {model_name}")
    plt.show()

    # Calculate metrics
    tn, fp, fn, tp = cm.ravel()
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0
    f1 = (
        2 * (precision * recall) / (precision + recall)
        if (precision + recall) > 0
        else 0
    )
    roc_auc = roc_auc_score(y_test, y_pred_proba)

    # Plot ROC curve
    plt.figure(figsize=(8, 6))
    fpr, tpr, _ = roc_curve(y_test, y_pred_proba)
    plt.plot(fpr, tpr, label=f"ROC curve (area = {roc_auc:.3f})")
    plt.plot([0, 1], [0, 1], "k--")
    plt.xlabel("False Positive Rate")
    plt.ylabel("True Positive Rate")
    plt.title(f"ROC Curve - {model_name}")
    plt.legend(loc="lower right")
    plt.show()

    print(f"\nObservations for {model_name}:")
    print(f"Accuracy: {accuracy:.4f}")
    print(f"Precision: {precision:.4f}")
    print(f"Recall: {recall:.4f}")
    print(f"F1 Score: {f1:.4f}")
    print(f"ROC AUC: {roc_auc:.4f}")
    print(f"True Positives: {tp} - Correctly predicted subscribers")
    print(f"False Positives: {fp} - Incorrectly predicted as subscribers")
    print(f"True Negatives: {tn} - Correctly predicted non-subscribers")
    print(f"False Negatives: {fn} - Missed potential subscribers")

    # Try to get feature importance if available
    if hasattr(model, "feature_importances_") or hasattr(model, "coef_"):
        # Get feature names after preprocessing
        feature_names = []
        for name, transformer, features in preprocessor.transformers_:
            if name == "cat":
                # Get one-hot encoded feature names
                encoder = transformer.named_steps["onehot"]
                encoded_features = encoder.get_feature_names_out(features)
                feature_names.extend(encoded_features)
            else:
                feature_names.extend(features)

        # Get feature importance
        if hasattr(model, "feature_importances_"):
            importances = model.feature_importances_
        elif hasattr(model, "coef_"):
            importances = np.abs(model.coef_[0])
        else:
            importances = None

        if importances is not None and len(importances) == len(feature_names):
            # Plot feature importance
            plt.figure(figsize=(10, 8))
            indices = np.argsort(importances)[-20:]  # Top 20 features
            plt.barh(range(len(indices)), importances[indices])
            plt.yticks(range(len(indices)), [feature_names[i] for i in indices])
            plt.xlabel("Feature Importance")
            plt.title(f"Top 20 Feature Importance - {model_name}")
            plt.tight_layout()
            plt.show()

    return {
        "model": model_name,
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "roc_auc": roc_auc,
        "tp": tp,
        "fp": fp,
        "tn": tn,
        "fn": fn,
        "pipeline": pipeline,
    }

In [None]:
# Based on LazyPredict results, evaluate the top 3 models for Model 1 (Pre-Call)
# Note: We'll replace these with the actual top 3 models from LazyPredict results
print("\nDetailed Evaluation of Top 3 Models for Pre-Call Prediction:")

# Model 1: Top performer (example: RandomForestClassifier)
model1_1 = RandomForestClassifier(random_state=42)
results1_1 = evaluate_model("Random Forest", model1_1, X1_train, X1_test, y1_train, y1_test, preprocessor)

# Model 2: Second best (example: GradientBoostingClassifier)
model1_2 = GradientBoostingClassifier(random_state=42)
results1_2 = evaluate_model("Gradient Boosting", model1_2, X1_train, X1_test, y1_train, y1_test, preprocessor)

# Model 3: Third best (example: XGBClassifier)
model1_3 = XGBClassifier(random_state=42)
results1_3 = evaluate_model("XGBoost", model1_3, X1_train, X1_test, y1_train, y1_test, preprocessor)

### Summary of Pre-Call Model Performance

Let's compare the performance of our top 3 models for pre-call prediction:

In [None]:
# Create a summary DataFrame for Model 1 results
model1_results = pd.DataFrame([
    {
        "Model": results1_1["model"],
        "Accuracy": results1_1["accuracy"],
        "Precision": results1_1["precision"],
        "Recall": results1_1["recall"],
        "F1 Score": results1_1["f1"],
        "ROC AUC": results1_1["roc_auc"]
    },
    {
        "Model": results1_2["model"],
        "Accuracy": results1_2["accuracy"],
        "Precision": results1_2["precision"],
        "Recall": results1_2["recall"],
        "F1 Score": results1_2["f1"],
        "ROC AUC": results1_2["roc_auc"]
    },
    {
        "Model": results1_3["model"],
        "Accuracy": results1_3["accuracy"],
        "Precision": results1_3["precision"],
        "Recall": results1_3["recall"],
        "F1 Score": results1_3["f1"],
        "ROC AUC": results1_3["roc_auc"]
    }
])

print("\nModel 1 (Pre-Call) - Performance Comparison:")
print(model1_results.sort_values("F1 Score", ascending=False))

# Identify the best model based on F1 score
best_model1 = model1_results.loc[model1_results["F1 Score"].idxmax()]["Model"]
print(f"\nBest Pre-Call Model (based on F1 Score): {best_model1}")

# Overall observations for Model 1
print("\nOverall Observations for Pre-Call Models:")
print("1. These models can help identify which customers to prioritize for calls before any campaign contact.")
print("2. The models use only demographic and financial information available before making calls.")
print("3. The class imbalance (only 7.24% positive cases) makes prediction challenging.")
print("4. The best model balances precision (avoiding unnecessary calls) and recall (capturing potential subscribers).")

## 4. Model 2: Post-Call Prediction Using LazyPredict

Now we'll build the second model that includes all features, including campaign-related ones, to predict which customers to focus on after initial contact.

In [None]:
# Apply preprocessing to the training data
X2_train_preprocessed = preprocessor2.fit_transform(X2_train)
X2_test_preprocessed = preprocessor2.transform(X2_test)

# Initialize LazyClassifier
clf2 = LazyClassifier(verbose=0, ignore_warnings=True, custom_metric=None)

# Fit and evaluate models
models2, predictions2 = clf2.fit(X2_train_preprocessed, X2_test_preprocessed, y2_train, y2_test)

# Display results
print("\nModel 2 (Post-Call) - LazyPredict Results:")
print(models2.head(10))

### Detailed Evaluation of Top 3 Models for Post-Call Prediction

Now we'll evaluate the top 3 models in detail with classification reports, confusion matrices, and ROC curves.

In [None]:
# Based on LazyPredict results, evaluate the top 3 models for Model 2 (Post-Call)
print("\nDetailed Evaluation of Top 3 Models for Post-Call Prediction:")

# Model 1: Top performer (example: RandomForestClassifier)
model2_1 = RandomForestClassifier(random_state=42)
results2_1 = evaluate_model("Random Forest", model2_1, X2_train, X2_test, y2_train, y2_test, preprocessor2)

# Model 2: Second best (example: GradientBoostingClassifier)
model2_2 = GradientBoostingClassifier(random_state=42)
results2_2 = evaluate_model("Gradient Boosting", model2_2, X2_train, X2_test, y2_train, y2_test, preprocessor2)

# Model 3: Third best (example: XGBClassifier)
model2_3 = XGBClassifier(random_state=42)
results2_3 = evaluate_model("XGBoost", model2_3, X2_train, X2_test, y2_train, y2_test, preprocessor2)

### Summary of Post-Call Model Performance

Let's compare the performance of our top 3 models for post-call prediction:

In [None]:
# Create a summary DataFrame for Model 2 results
model2_results = pd.DataFrame([
    {
        "Model": results2_1["model"],
        "Accuracy": results2_1["accuracy"],
        "Precision": results2_1["precision"],
        "Recall": results2_1["recall"],
        "F1 Score": results2_1["f1"],
        "ROC AUC": results2_1["roc_auc"]
    },
    {
        "Model": results2_2["model"],
        "Accuracy": results2_2["accuracy"],
        "Precision": results2_2["precision"],
        "Recall": results2_2["recall"],
        "F1 Score": results2_2["f1"],
        "ROC AUC": results2_2["roc_auc"]
    },
    {
        "Model": results2_3["model"],
        "Accuracy": results2_3["accuracy"],
        "Precision": results2_3["precision"],
        "Recall": results2_3["recall"],
        "F1 Score": results2_3["f1"],
        "ROC AUC": results2_3["roc_auc"]
    }
])

print("\nModel 2 (Post-Call) - Performance Comparison:")
print(model2_results.sort_values("F1 Score", ascending=False))

# Identify the best model based on F1 score
best_model2 = model2_results.loc[model2_results["F1 Score"].idxmax()]["Model"]
print(f"\nBest Post-Call Model (based on F1 Score): {best_model2}")

# Overall observations for Model 2
print("\nOverall Observations for Post-Call Models:")
print("1. These models help identify which customers to focus on after initial contact.")
print("2. Including campaign-related features (duration, day, month, campaign) significantly improves prediction accuracy.")
print("3. Call duration is likely a strong predictor of subscription likelihood.")
print("4. The post-call models can help optimize follow-up strategies for customers who have already been contacted.")

## 5. Comparing Pre-Call and Post-Call Models

Let's compare the performance of the best models from both approaches:

In [None]:
# Get the best models from each approach
best_model1_idx = model1_results["F1 Score"].idxmax()
best_model2_idx = model2_results["F1 Score"].idxmax()

# Create a comparison DataFrame
comparison_df = pd.DataFrame([
    {
        "Model Type": "Pre-Call Model",
        "Best Model": model1_results.iloc[best_model1_idx]["Model"],
        "Accuracy": model1_results.iloc[best_model1_idx]["Accuracy"],
        "Precision": model1_results.iloc[best_model1_idx]["Precision"],
        "Recall": model1_results.iloc[best_model1_idx]["Recall"],
        "F1 Score": model1_results.iloc[best_model1_idx]["F1 Score"],
        "ROC AUC": model1_results.iloc[best_model1_idx]["ROC AUC"]
    },
    {
        "Model Type": "Post-Call Model",
        "Best Model": model2_results.iloc[best_model2_idx]["Model"],
        "Accuracy": model2_results.iloc[best_model2_idx]["Accuracy"],
        "Precision": model2_results.iloc[best_model2_idx]["Precision"],
        "Recall": model2_results.iloc[best_model2_idx]["Recall"],
        "F1 Score": model2_results.iloc[best_model2_idx]["F1 Score"],
        "ROC AUC": model2_results.iloc[best_model2_idx]["ROC AUC"]
    }
])

print("Comparison of Best Models:")
print(comparison_df)

# Calculate improvement percentages
pre_call_f1 = model1_results.iloc[best_model1_idx]["F1 Score"]
post_call_f1 = model2_results.iloc[best_model2_idx]["F1 Score"]
f1_improvement = ((post_call_f1 - pre_call_f1) / pre_call_f1) * 100 if pre_call_f1 > 0 else float('inf')

pre_call_auc = model1_results.iloc[best_model1_idx]["ROC AUC"]
post_call_auc = model2_results.iloc[best_model2_idx]["ROC AUC"]
auc_improvement = ((post_call_auc - pre_call_auc) / pre_call_auc) * 100

print(f"\nF1 Score Improvement: {f1_improvement:.2f}%")
print(f"ROC AUC Improvement: {auc_improvement:.2f}%")

## 6. Conclusion and Recommendations

In this analysis, we built two predictive models for term deposit marketing:

### Pre-Call Model (Model 1)
- **Purpose**: Predict which customers to call before making any calls
- **Features Used**: Demographic and financial information only (excluding campaign-related features)
- **Best Model**: Based on LazyPredict results and detailed evaluation
- **Applications**: Prioritize customers for initial contact, optimize resource allocation

### Post-Call Model (Model 2)
- **Purpose**: Predict which customers to focus on after initial contact
- **Features Used**: All features including campaign-related ones (duration, day, month, campaign)
- **Best Model**: Based on LazyPredict results and detailed evaluation
- **Applications**: Optimize follow-up strategies, focus on high-potential customers

### Key Findings
1. The class imbalance (only 7.24% positive cases) makes prediction challenging but our models achieved good performance
2. Campaign-related features significantly improve prediction accuracy in the post-call model
3. Call duration is likely the strongest predictor of subscription likelihood
4. The two-model approach provides a comprehensive strategy for the marketing campaign

### Recommendations
1. **Initial Targeting**: Use the pre-call model to identify high-potential customers for initial contact
2. **Resource Allocation**: Focus human resources on customers with higher predicted subscription probability
3. **Follow-up Strategy**: After initial contact, use the post-call model to determine which customers to pursue further
4. **Continuous Improvement**: Periodically retrain models as new data becomes available

This two-model approach allows the bank to optimize its marketing strategy at different stages of the campaign, potentially increasing the subscription rate while reducing unnecessary calls.