# **1. Data Preprocessing**

In [1]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
from sklearn.preprocessing import StandardScaler

In [10]:
df = pd.read_csv('BankCustomerData.csv')

In [4]:
df_dummies = pd.get_dummies(df, drop_first=True)

In [34]:
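# Caution: this label is derived from age (1 if under 50), not from a recorded subscription outcome
df_dummies['subscribe'] = (df['age']<50).astype(int)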
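
The `subscribe` column above is computed from `age`, so it marks customers under 50 rather than customers who subscribed. The sketch below contrasts that with a label taken from a recorded outcome. It uses a tiny made-up frame, and the column name `term_deposit` with `yes`/`no` values is a hypothetical stand-in, not a column confirmed to exist in `BankCustomerData.csv`.

```python
import pandas as pd

# Made-up rows for illustration only; `term_deposit` is a hypothetical column name
demo = pd.DataFrame({
    "age": [25, 61, 43, 55],
    "term_deposit": ["yes", "no", "no", "yes"],
})

# Label as built in the notebook: 1 when the customer is under 50
age_label = (demo["age"] < 50).astype(int)

# Label taken from a recorded outcome instead: 1 when the customer subscribed
outcome_label = (demo["term_deposit"] == "yes").astype(int)

print(pd.DataFrame({"age_label": age_label, "outcome_label": outcome_label}))
```

The two labels disagree on some of the rows, which is why the choice of target determines what the model's scores actually measure.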

# **2. Feature Selection**

In [35]:
x = df_dummies.drop(['age', 'balance', 'day', 'duration','campaign', 'pdays', 'subscribe'],axis=1)
y = df_dummies['subscribe']

# **3. Data Splitting**

In [36]:
x_train, x_test, y_train, y_test = train_test_split(x,y,test_size=0.3,random_state=42)
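
The split above is random but not stratified. When the classes are imbalanced, passing `stratify` to `train_test_split` keeps the class proportions the same in both parts. A minimal sketch on synthetic data (the arrays `X_demo` and `y_demo` are illustrative, not the bank data):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced data: roughly 78% of labels are 1
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(1000, 4))
y_demo = (rng.random(1000) < 0.78).astype(int)

# stratify=y_demo preserves the class ratio in the train and test parts
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=42, stratify=y_demo
)

train_rate = y_tr.mean()
test_rate = y_te.mean()
print(f"share of class 1 in train: {train_rate:.3f}, in test: {test_rate:.3f}")
```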

# **4. Model Training**

In [37]:
model = LogisticRegression()
model.fit(x_train,y_train)
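
`StandardScaler` is imported at the top of the notebook but not used in the cell above. One possible way to bring it in is a pipeline that scales the features before fitting, so the scaler is fit on the training data only. The sketch below runs on synthetic data; `X_demo`, `y_demo`, and `max_iter=1000` are illustrative choices, not settings from the notebook.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic features on very different scales (illustrative only)
rng = np.random.default_rng(0)
X_demo = rng.normal(loc=[0, 1000], scale=[1, 500], size=(500, 2))
y_demo = (X_demo[:, 0] + (X_demo[:, 1] - 1000) / 500 > 0).astype(int)

# Scale first, then fit; max_iter is raised as a guard against convergence warnings
scaled_model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scaled_model.fit(X_demo, y_demo)

train_accuracy = scaled_model.score(X_demo, y_demo)
print(f"training accuracy: {train_accuracy:.3f}")
```

With the notebook's own variables, the same pipeline object could be fit with `x_train, y_train` and used for `predict(x_test)` in place of `model`.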

# **5. Model Evaluation**

In [38]:
y_pred = model.predict(x_test)

In [39]:
accuracy = accuracy_score(y_test,y_pred)
conf_matrix = confusion_matrix(y_test,y_pred)
class_report = classification_report(y_test,y_pred)

print(f"Accuracy: {accuracy}")
print("Confusion Matrix: ")
print(conf_matrix)
print("Classification Report ")
print(class_report)

Accuracy: 0.9790685504971219
Confusion Matrix: 
[[ 775   80]
 [   0 2967]]
Classification Report 
              precision    recall  f1-score   support

           0       1.00      0.91      0.95       855
           1       0.97      1.00      0.99      2967

    accuracy                           0.98      3822
   macro avg       0.99      0.95      0.97      3822
weighted avg       0.98      0.98      0.98      3822
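
The figures in the classification report can be re-derived from the confusion matrix, which is a useful sanity check. The sketch below copies the matrix printed above and recomputes accuracy, precision, and recall by hand (rows are true classes, columns are predicted classes, as in scikit-learn's `confusion_matrix`).

```python
import numpy as np

# Confusion matrix copied from the output above
conf_matrix = np.array([[775, 80],
                        [0, 2967]])
tn, fp, fn, tp = conf_matrix.ravel()

accuracy = (tn + tp) / conf_matrix.sum()
precision_0 = tn / (tn + fn)   # of rows predicted 0, how many are truly 0
recall_0 = tn / (tn + fp)      # of true class 0 rows, how many were found
precision_1 = tp / (tp + fp)   # of rows predicted 1, how many are truly 1
recall_1 = tp / (tp + fn)      # of true class 1 rows, how many were found

print(f"accuracy    = {accuracy:.4f}")
print(f"class 0: precision = {precision_0:.2f}, recall = {recall_0:.2f}")
print(f"class 1: precision = {precision_1:.2f}, recall = {recall_1:.2f}")
```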



# **6. Conclusion**

The logistic regression model reached an accuracy of 97.91% on the 3,822-row test set. It correctly classified 775 of the 855 class 0 instances (recall 0.91) and all 2,967 class 1 instances (recall 1.00); all 80 errors are class 0 rows predicted as class 1. These figures need careful interpretation, because the `subscribe` label was constructed during preprocessing as `age < 50` rather than taken from a recorded term deposit outcome. As built, the model shows that the remaining features separate customers under 50 from those aged 50 and over very well; it does not yet show which customers will subscribe to a term deposit. Class 1 also makes up about 78% of the test set, so the accuracy should be read against that baseline rather than against 50%. Before the bank uses the model to target potential subscribers or to reallocate campaign resources, it should be retrained and re-evaluated against the actual subscription outcome.
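
To put the 97.91% accuracy in context, it helps to compare it with a classifier that always predicts the majority class. The sketch below computes that baseline from the support counts in the classification report; it is plain arithmetic, not a re-run of the model.

```python
# Support counts per class, copied from the classification report
support = {0: 855, 1: 2967}
total = sum(support.values())

# A classifier that always predicts the most frequent class
majority_class = max(support, key=support.get)
baseline_accuracy = support[majority_class] / total

# Accuracy reported by the notebook
model_accuracy = 0.9790685504971219
improvement = model_accuracy - baseline_accuracy

print(f"majority class: {majority_class}")
print(f"baseline accuracy: {baseline_accuracy:.3f}")
print(f"improvement over baseline: {improvement:.3f}")
```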