# 🔍 NZ Banking – Churn Prediction Model

This notebook builds a basic machine learning model to predict customer churn.

We’ll use:
- Cleaned data from earlier notebooks
- Logistic Regression for binary classification
- Accuracy score, classification report, and confusion matrix for evaluation


In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report


In [None]:
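# Upload the cleaned CSV from your machine (this cell only works in Google Colab)
from google.colab import files
import io

uploaded = files.upload()

# The uploaded file must be named exactly 'cleaned_nz_banking_data.csv'
df = pd.read_csv(io.BytesIO(uploaded['cleaned_nz_banking_data.csv']))
df.head()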


In [None]:
# Select input features
features = ['Age', 'Tenure', 'AccountBalance', 'IsActive', 'NumOfProducts', 'CreditScore']
X = df[features]

# Target: Churn (0 = No, 1 = Yes)
y = df['Churn']


In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

print("Training samples:", X_train.shape[0])
print("Testing samples:", X_test.shape[0])
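
Churn datasets are often imbalanced, so a plain random split can leave the test set with a different churn rate than the training set. One option, not used in the cell above, is to pass `stratify=y` to `train_test_split`. The sketch below shows the effect on made-up data; `X_demo` and `y_demo` are hypothetical stand-ins for `X` and `y`, not the banking dataset.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 1,000 customers, 6 features, 20% churners.
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(1000, 6))
y_demo = np.array([1] * 200 + [0] * 800)

# stratify=y_demo keeps the churn proportion the same in both splits.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

train_rate = y_tr.mean()
test_rate = y_te.mean()
print(f"Churn rate (train): {train_rate:.3f}")
print(f"Churn rate (test):  {test_rate:.3f}")
```

In the notebook itself this would mean adding `stratify=y` to the existing call, assuming `Churn` contains only 0 and 1.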


In [None]:
# max_iter is raised from the default of 100 to give the solver room to converge
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
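
Features such as `AccountBalance` and `Age` live on very different scales, which can slow down convergence for Logistic Regression. A common remedy, shown here only as a sketch, is to standardize the features inside a pipeline. The data below is synthetic and the names `X_demo`, `y_demo`, and `scaled_model` are hypothetical; the notebook's own model above does not scale its inputs.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical data: an age-like column and a balance-like column
# on very different scales.
rng = np.random.default_rng(0)
age = rng.uniform(18, 80, size=500)
balance = rng.uniform(0, 200_000, size=500)
X_demo = np.column_stack([age, balance])

# Made-up label: older customers are more likely to churn, plus noise.
y_demo = (age + rng.normal(0, 10, size=500) > 50).astype(int)

# The scaler is fitted on the training data only, then the model is fitted
# on the standardized features.
scaled_model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scaled_model.fit(X_demo, y_demo)

train_acc = scaled_model.score(X_demo, y_demo)
print(f"Training accuracy on synthetic data: {train_acc:.3f}")
```

Because the scaler sits inside the pipeline, calling `scaled_model.predict(...)` on new data applies the same scaling automatically.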


In [None]:
y_pred = model.predict(X_test)

print("Accuracy:", accuracy_score(y_test, y_pred))
print("\nClassification Report:\n", classification_report(y_test, y_pred))
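
The imports at the top include `confusion_matrix`, which breaks the predictions down into true negatives, false positives, false negatives, and true positives. Here is a minimal sketch on a tiny hand-made example; `y_true_demo` and `y_pred_demo` are hypothetical labels, not output from the model above.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical labels for 10 customers (0 = stayed, 1 = churned).
y_true_demo = [0, 0, 0, 0, 1, 1, 1, 0, 1, 0]
y_pred_demo = [0, 0, 1, 0, 1, 0, 1, 0, 1, 0]

# Rows are actual classes, columns are predicted classes.
cm = confusion_matrix(y_true_demo, y_pred_demo)
print(cm)

# For binary labels, ravel() returns the cells in this order.
tn, fp, fn, tp = cm.ravel()
print("True negatives:", tn, "| False positives:", fp)
print("False negatives:", fn, "| True positives:", tp)

# Precision and recall for the churn class follow from these counts.
precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(f"Precision: {precision:.2f}, Recall: {recall:.2f}")
```

With the notebook's own variables, the call would be `confusion_matrix(y_test, y_pred)`, and `sns.heatmap(cm, annot=True, fmt="d")` is one way to display the result as a chart.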


## ✅ Summary

- Logistic Regression accuracy: ~X%
- Precision and recall for the churn class (label 1) show how many predicted churners really churned and how many real churners the model caught
- Next step: Try other models (Random Forest, Decision Tree, etc.)
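
As a rough sketch of that next step, the loop below fits several classifiers on the same split and compares them. It runs on synthetic data from `make_classification`, so the scores say nothing about the banking dataset; `X_demo`, `y_demo`, `candidates`, and `scores` are hypothetical names introduced for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data: 6 features, roughly 20% positives ("churners").
X_demo, y_demo = make_classification(
    n_samples=1000, n_features=6, n_informative=4, n_redundant=0,
    weights=[0.8, 0.2], random_state=42,
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

candidates = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

# Fit each model on the same training data and score it on the same test data.
scores = {}
for name, clf in candidates.items():
    clf.fit(X_tr, y_tr)
    preds = clf.predict(X_te)
    scores[name] = {
        "accuracy": accuracy_score(y_te, preds),
        "recall": recall_score(y_te, preds),
    }
    print(f"{name}: accuracy={scores[name]['accuracy']:.3f}, "
          f"recall={scores[name]['recall']:.3f}")
```

To try this on the real data, swap in the notebook's `X_train`, `X_test`, `y_train`, and `y_test` for the synthetic split.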

This simple model is your first step in real-world churn prediction!
