**1. Data Processing**

In [None]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
from sklearn.preprocessing import StandardScaler

data = pd.read_csv('/content/sample_data/BankCustomerData.csv')
print(data.head())

print(data.isnull().sum())

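# One-hot encode the categorical columns; drop_first removes one redundant dummy per category
data_dummies = pd.get_dummies(data, drop_first=True)
# Binary target: 1 if the customer's balance is positive, otherwise 0
data_dummies['balance required'] = (data['balance'] > 0).astype(int)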


   age           job  marital  education default  balance housing loan  \
0   58    management  married   tertiary      no     2143     yes   no   
1   44    technician   single  secondary      no       29     yes   no   
2   33  entrepreneur  married  secondary      no        2     yes  yes   
3   47   blue-collar  married    unknown      no     1506     yes   no   
4   33       unknown   single    unknown      no        1      no   no   

   contact  day month  duration  campaign  pdays  previous poutcome  \
0  unknown    5   may       261         1     -1         0  unknown   
1  unknown    5   may       151         1     -1         0  unknown   
2  unknown    5   may        76         1     -1         0  unknown   
3  unknown    5   may        92         1     -1         0  unknown   
4  unknown    5   may       198         1     -1         0  unknown   

  term_deposit  
0           no  
1           no  
2           no  
3           no  
4           no  
age             0
job     

**2. Feature Selection**

In [None]:
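# Features: everything except the target, the 'balance' column it is derived from, and 'duration'
x = data_dummies.drop(['balance', 'duration', 'balance required'], axis=1)
# Target: the binary positive-balance flag
y = data_dummies['balance required']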

**3. Data Splitting**

In [None]:
x_train, x_test, y_train, y_test = train_test_split(x,y, test_size=0.2, random_state=80)
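
The classification report in the evaluation section shows an imbalanced target (1433 samples of class 0 against 7095 of class 1 in the test set). One option worth considering, not something the notebook does, is to pass `stratify` to `train_test_split` so both splits keep the same class ratio. The sketch below uses made-up data with a similar ratio; `X_demo` and `y_demo` are hypothetical names.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Made-up data with roughly the same class ratio as the notebook's target (assumption)
rng = np.random.default_rng(80)
X_demo = rng.normal(size=(1000, 3))
y_demo = np.array([1] * 830 + [0] * 170)

# stratify=y_demo keeps the share of each class the same in both splits
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=80, stratify=y_demo
)

print(f"Positive share in train: {y_tr.mean():.3f}")
print(f"Positive share in test:  {y_te.mean():.3f}")
```

In the notebook this would amount to adding `stratify=y` to the existing call.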

**4. Model Training**

In [None]:
scaler = StandardScaler()
x_train_scaled = scaler.fit_transform(x_train)
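# Reuse the mean and standard deviation learned from the training set; refitting on the test set leaks test statistics
x_test_scaled = scaler.transform(x_test)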

model = LogisticRegression()
model.fit(x_train_scaled, y_train)
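
One way to make sure the scaler only ever learns from training rows is to wrap it and the classifier in a scikit-learn pipeline. The sketch below is a minimal illustration on synthetic data from `make_classification`, used here as a stand-in for the bank dataset (an assumption); the names `X_demo`, `y_demo`, and `pipe` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the bank data (assumption, for illustration only)
X_demo, y_demo = make_classification(n_samples=500, n_features=8, random_state=80)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=80
)

# fit() fits the scaler on the training rows only, then trains the classifier
pipe = make_pipeline(StandardScaler(), LogisticRegression())
pipe.fit(X_tr, y_tr)

# score() scales the test rows with the stored training mean and std
demo_accuracy = pipe.score(X_te, y_te)
fitted_scaler = pipe.named_steps["standardscaler"]
print(f"Demo accuracy: {demo_accuracy:.3f}")
```

With a pipeline, there is no separate `fit_transform` call on the test set to get wrong, and the same object can be passed to cross-validation helpers.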

**5. Model Evaluation**



In [None]:
y_pred = model.predict(x_test_scaled)

accuracy = accuracy_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)
class_report = classification_report(y_test, y_pred)

print(f"Accuracy: {accuracy}")
print("Confusion Matrix: ")
print(conf_matrix)
print("Classification Report: ")
print(class_report)

Accuracy: 0.8394699812382739
Confusion Matrix: 
[[ 118 1315]
 [  54 7041]]
Classification Report: 
              precision    recall  f1-score   support

           0       0.69      0.08      0.15      1433
           1       0.84      0.99      0.91      7095

    accuracy                           0.84      8528
   macro avg       0.76      0.54      0.53      8528
weighted avg       0.82      0.84      0.78      8528



**8. Conclusion**

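Judging from the confusion matrix, true positives (7,041) far outnumber every other outcome, and overall accuracy is about 84%. That figure is less impressive than it looks, though: 7,095 of the 8,528 test customers (about 83%) belong to class 1, so always predicting class 1 would score almost the same, and the model correctly identifies only 118 of the 1,433 class-0 customers (recall 0.08). I can't say much about the bank's marketing strategies, since experimenting with the variables, such as including or excluding the campaign contacts, made no noticeable difference in the model's accuracy.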
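
As a sanity check on the accuracy figure, the sketch below recomputes it from the confusion matrix printed in the evaluation output and compares it with a majority-class baseline (always predicting class 1). The variable names are illustrative and not part of the notebook.

```python
# Confusion matrix copied from the evaluation output
# rows = actual class, columns = predicted class
conf = [[118, 1315],
        [54, 7041]]

total = sum(sum(row) for row in conf)
accuracy = (conf[0][0] + conf[1][1]) / total

# Accuracy obtained by always predicting the most common actual class
majority_baseline = max(sum(row) for row in conf) / total

# Recall per class: correct predictions / actual members of that class
recall_0 = conf[0][0] / sum(conf[0])
recall_1 = conf[1][1] / sum(conf[1])

# Balanced accuracy is the mean of the per-class recalls
balanced_accuracy = (recall_0 + recall_1) / 2

print(f"Accuracy:          {accuracy:.3f}")
print(f"Majority baseline: {majority_baseline:.3f}")
print(f"Recall class 0:    {recall_0:.3f}")
print(f"Recall class 1:    {recall_1:.3f}")
print(f"Balanced accuracy: {balanced_accuracy:.3f}")
```

The balanced accuracy comes out close to the 0.54 macro-average recall in the classification report, which is a more informative summary than raw accuracy when one class dominates.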