#  Business Understanding

##  Project Overview

This project aims to build a predictive model that determines whether a client will subscribe to a **bank term deposit**, based on data collected from direct marketing campaigns conducted via phone calls. The goal is to empower the bank’s marketing team with insights and tools to better target potential subscribers, improve campaign efficiency, and reduce operational costs.

---

##  Business Problem

The bank invests in direct marketing campaigns, but not every contact leads to a subscription. By identifying the characteristics and behaviors of clients who are most likely to subscribe, the bank can focus its resources on high-probability leads and thereby improve its **Return on Marketing Investment (ROMI)**.
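
To make the ROMI argument concrete, here is a minimal sketch using the common definition ROMI = (revenue attributable to the campaign - campaign cost) / campaign cost. Every figure in it (call volumes, conversion rates, value per subscription, cost per call) is a hypothetical placeholder rather than a number from the bank's data, and `romi` is an illustrative helper, not part of the project code.

```python
def romi(calls, conversion_rate, value_per_subscription, cost_per_call):
    """Return on Marketing Investment for a phone campaign.

    ROMI = (revenue attributable to the campaign - campaign cost) / campaign cost
    """
    revenue = calls * conversion_rate * value_per_subscription
    cost = calls * cost_per_call
    return (revenue - cost) / cost


# Hypothetical scenario 1: call every client on the list.
untargeted = romi(calls=1000, conversion_rate=0.10,
                  value_per_subscription=40.0, cost_per_call=5.0)

# Hypothetical scenario 2: call only the leads a model ranks highest,
# assuming that subset converts at a higher rate.
targeted = romi(calls=300, conversion_rate=0.25,
                value_per_subscription=40.0, cost_per_call=5.0)

print(f"Untargeted ROMI: {untargeted:.2f}")
print(f"Targeted ROMI:   {targeted:.2f}")
```

Under these assumed numbers, the targeted campaign makes fewer calls and wins fewer subscriptions in total, yet returns more per unit of spend. That trade-off between volume and efficiency is what the model is meant to inform.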

---

##  Project Goals

- **Predict Client Behavior**: Build a machine learning model to predict whether a client will say *“yes”* or *“no”* to subscribing to a term deposit.
- **Discover Key Drivers**: Identify and interpret the factors that significantly influence a client's decision to subscribe.
- **Support Decision-Making**: Provide actionable recommendations that the marketing team can use to optimize targeting strategies and resource allocation.

---

##  Success Criteria

- A classification model that performs well on held-out test data, measured by:
  - Accuracy
  - Precision
  - Recall
  - F1-Score
- A well-documented and reproducible end-to-end analysis using version control.
- Business insights that are interpretable and actionable by stakeholders.
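
The four metrics above can disagree sharply when one class is rare. The sketch below assumes subscribers are the minority class (the SMOTE import later in this notebook suggests so, but the data has not been inspected yet) and uses made-up labels, not project results. `classification_metrics` is an illustrative helper written with the standard library only; in the project itself the `sklearn.metrics` functions would normally be used instead.

```python
def classification_metrics(y_true, y_pred, positive="yes"):
    """Compute accuracy, precision, recall and F1 from label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    accuracy = (tp + tn) / len(pairs)
    # Convention used here: a metric with a zero denominator is reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up example: 100 clients, of whom 10 subscribed.
y_true = ["yes"] * 10 + ["no"] * 90

# Baseline that predicts "no" for everyone.
always_no = ["no"] * 100

# Toy model: 6 true positives, 4 false negatives,
# 8 false positives, 82 true negatives.
toy_model = ["yes"] * 6 + ["no"] * 4 + ["yes"] * 8 + ["no"] * 82

baseline = classification_metrics(y_true, always_no)
toy = classification_metrics(y_true, toy_model)

print("always 'no':", baseline)
print("toy model:  ", toy)
```

In this toy setup the "always no" baseline scores higher accuracy than the toy model while finding no subscribers at all, which is why precision, recall and F1 are listed alongside accuracy.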

---

##  Key Business Questions

1. What are the most influential factors that determine whether a client subscribes?
2. Are certain client demographics (age, job, education) more likely to subscribe?
3. Does the timing or duration of the last contact influence subscription rates?
4. Are previous interactions with the bank predictive of subscription?
5. Can we identify client segments with higher subscription potential?

---

##  Hypothesis Statement

**H₀ (Null Hypothesis):** Duration of the last contact has **no significant effect** on a client’s likelihood to subscribe to a term deposit.  
**H₁ (Alternative Hypothesis):** Duration of the last contact **significantly affects** a client’s likelihood to subscribe.

This hypothesis will be tested with statistical tests and visualizations during the exploratory data analysis (EDA) phase.
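
One possible way to test H₀ is a two-sided permutation test on the difference in mean call duration between subscribers and non-subscribers. The sketch below is an assumption about how the test could be run, not the method the project commits to. The durations are synthetic values generated for illustration, the function name `permutation_test_mean_diff` is hypothetical, and the 0.05 significance level is a conventional choice that the hypothesis statement does not specify.

```python
import random


def permutation_test_mean_diff(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Returns the observed difference (mean_a - mean_b) and a p-value.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = sum(group_a) / n_a - sum(group_b) / len(group_b)

    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_permutations):
        # Under H0 the group labels are exchangeable, so shuffle and re-split.
        rng.shuffle(pooled)
        perm_a, perm_b = pooled[:n_a], pooled[n_a:]
        diff = sum(perm_a) / len(perm_a) - sum(perm_b) / len(perm_b)
        if abs(diff) >= abs(observed):
            extreme += 1

    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Synthetic call durations in seconds (illustrative only, not real data).
data_rng = random.Random(42)
duration_yes = [max(0.0, data_rng.gauss(550, 150)) for _ in range(60)]
duration_no = [max(0.0, data_rng.gauss(220, 120)) for _ in range(240)]

observed_diff, p_value = permutation_test_mean_diff(duration_yes, duration_no)

ALPHA = 0.05  # assumed significance level
print(f"Observed mean difference: {observed_diff:.1f} s")
print(f"Permutation p-value: {p_value:.4f}")
print("Reject H0" if p_value < ALPHA else "Fail to reject H0")
```

A permutation test is used here because it makes no assumption about the distribution of call durations. With the real data, a Mann-Whitney U test or a logistic regression on duration would be reasonable alternatives.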


## Imports

```python
# 📦 Core Libraries
import numpy as np
import pandas as pd

# 📊 Visualization Libraries
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px

# ⚙️ Data Preprocessing
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.pipeline import Pipeline

# 🧠 Modeling
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from xgboost import XGBClassifier

# 🧪 Evaluation
from sklearn.metrics import (
    classification_report,
    confusion_matrix,
    accuracy_score,
    precision_score,
    recall_score,
    f1_score,
    roc_auc_score,
    roc_curve
)

# ⚖️ Imbalanced Data Handling
from imblearn.over_sampling import SMOTE

# 💾 Model Persistence
import joblib

# 📋 General Utilities
import warnings
warnings.filterwarnings("ignore")

# 📊 Statistical Testing & Model Interpretability (Optional but added for completeness)
import statsmodels.api as sm  # For hypothesis testing/logistic regression summary
import shap  # SHAP values for model explainability
import eli5  # Feature importance explanations (works well with sklearn)
```
