
# Churn Prediction and Customer Segmentation

This notebook trains a **supervised Logistic Regression model** for churn prediction and an **unsupervised KMeans model** for customer segmentation, then saves both models locally with `joblib`.

## Steps:
1. Data Loading
2. Model Training
3. Model Saving



In [None]:
!pip install scikit-learn joblib pandas

In [None]:
# Import the required libraries
import joblib
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score
import pandas as pd


## Step 1: Data Loading

We begin by loading two datasets: a labeled one (`churn.csv`) for supervised churn prediction and an unlabeled one (`segmentation.csv`) for unsupervised customer segmentation.


In [None]:
# Load the supervised dataset
supervised_data = pd.read_csv(r'D:\db_Contribute_SDK2\azureml-examples\sdk\python\endpoints\online\aks_multi_model_deployment_kubernetes_endpoint\artifacts\data\churn.csv')
# Load the unsupervised dataset
unsupervised_data = pd.read_csv(r'D:\db_Contribute_SDK2\azureml-examples\sdk\python\endpoints\online\aks_multi_model_deployment_kubernetes_endpoint\artifacts\data\segmentation.csv')
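The absolute Windows paths above only resolve on the machine where the notebook was written. A minimal sketch of a more portable loader follows, assuming the CSV files live in an `artifacts/data` directory relative to the notebook. The `DATA_DIR` constant and the `load_dataset` helper are hypothetical names, not part of the original notebook.

```python
from pathlib import Path

import pandas as pd

# Assumption: the CSV files sit in ./artifacts/data relative to the notebook.
DATA_DIR = Path("artifacts") / "data"


def load_dataset(name: str, data_dir: Path = DATA_DIR) -> pd.DataFrame:
    """Read `<data_dir>/<name>` into a DataFrame, failing early if it is missing or has no rows."""
    path = Path(data_dir) / name
    if not path.is_file():
        raise FileNotFoundError(f"Dataset not found: {path}")
    frame = pd.read_csv(path)
    if frame.empty:
        raise ValueError(f"Dataset has no rows: {path}")
    return frame
```

With this helper, the two loads would become `load_dataset("churn.csv")` and `load_dataset("segmentation.csv")`, and a wrong working directory surfaces as a clear error instead of a long traceback.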



## Step 2: Model Training

## Step 2.1: Logistic Regression for churn prediction



In [None]:
# Separate features and target
X = supervised_data.drop('churn', axis=1)
y = supervised_data['churn']

# Split the data for training and testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Standardize the features; the scaler is fit on the training split only to avoid leaking test data
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Train a Logistic Regression model
supervised_model = LogisticRegression(random_state=42)
supervised_model.fit(X_train_scaled, y_train)

# Test the supervised model
y_pred = supervised_model.predict(X_test_scaled)
accuracy = accuracy_score(y_test, y_pred)
print(f"Supervised Model Accuracy: {accuracy * 100:.2f}%")

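# Save the supervised model locally (the fitted scaler is not saved here,
# so inputs must be standardized the same way before calling predict)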
joblib.dump(supervised_model, 'churn.joblib')
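Because the model above is fit on standardized features, any consumer of `churn.joblib` must apply the same fitted scaler before calling `predict`. One way to remove that coupling is to bundle both steps in a scikit-learn `Pipeline` and save that single object. The sketch below uses a small synthetic dataset in place of `churn.csv`. The `tenure` and `monthly_charges` columns, the labeling rule, and the `churn_pipeline.joblib` file name are assumptions made for illustration only.

```python
import joblib
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for churn.csv (hypothetical columns, illustration only)
rng = np.random.default_rng(42)
demo = pd.DataFrame({
    "tenure": rng.integers(1, 72, size=200),
    "monthly_charges": rng.uniform(20, 120, size=200),
})
# Arbitrary labeling rule so there is something to learn: short tenure and high charges
demo["churn"] = ((demo["tenure"] < 24) & (demo["monthly_charges"] > 60)).astype(int)

X = demo.drop("churn", axis=1)
y = demo["churn"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# The pipeline fits the scaler on the training split and reapplies it at predict time
churn_pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("model", LogisticRegression(random_state=42)),
])
churn_pipeline.fit(X_train, y_train)

# Persist scaler and model together, then reload to check the round trip
joblib.dump(churn_pipeline, "churn_pipeline.joblib")
reloaded = joblib.load("churn_pipeline.joblib")
predictions = reloaded.predict(X_test)  # raw, unscaled features go in
```

A scoring script can then call `joblib.load` once and pass raw feature rows directly. This changes the saved artifact, so any code that expects a bare `LogisticRegression` object would need to be updated accordingly.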


## Step 2.2: KMeans for customer segmentation



In [None]:

# Train a KMeans clustering model (unsupervised)
unsupervised_model = KMeans(n_clusters=3, random_state=42)
unsupervised_model.fit(unsupervised_data)

# Save the unsupervised model locally
joblib.dump(unsupervised_model, 'segmentation.joblib')
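KMeans relies on Euclidean distance, so features on larger scales dominate the clustering, and `n_clusters=3` is a choice worth checking against the data. The sketch below standardizes the features and compares silhouette scores for several candidate values of k. It runs on synthetic blobs rather than `segmentation.csv`, and the helper name `best_k_by_silhouette` is hypothetical.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def best_k_by_silhouette(features, candidate_ks, random_state=42):
    """Return (best_k, scores), where scores maps each k to its silhouette score."""
    scaled = StandardScaler().fit_transform(features)
    scores = {}
    for k in candidate_ks:
        model = KMeans(n_clusters=k, n_init=10, random_state=random_state)
        labels = model.fit_predict(scaled)
        scores[k] = silhouette_score(scaled, labels)
    # The highest silhouette score marks the most compact, best-separated clustering
    best_k = max(scores, key=scores.get)
    return best_k, scores


# Synthetic stand-in for segmentation.csv: three well-separated groups (illustration only)
demo_features, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0], [10, 10], [-10, 10]],
    cluster_std=0.5,
    random_state=42,
)
best_k, scores = best_k_by_silhouette(demo_features, candidate_ks=range(2, 7))
print(best_k, scores)
```

On real customer data the scores are rarely this clear-cut, so the silhouette comparison is better treated as one input alongside business constraints on how many segments are useful.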

The following cell generates a `requirements.txt` file that pins the installed versions of the required packages.

Packages:
- joblib
- scikit-learn
- pandas
- numpy

Functionality:
- Iterates over the list of packages.
- Retrieves the installed version of each package with `importlib.metadata.version`, the standard-library replacement for the deprecated `pkg_resources`.
- Writes each package name and version to `requirements.txt` in the format `package==version`.

In [10]:
from importlib.metadata import version

packages = ['joblib', 'scikit-learn', 'pandas', 'numpy']

# Pin each package to the version installed in the current environment
with open('requirements.txt', 'w') as f:
    for package in packages:
        f.write(f"{package}=={version(package)}\n")
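If one of the listed packages is not installed, the version lookup raises an exception partway through and leaves a partially written file. A minimal sketch of a more defensive variant follows, using the standard-library `importlib.metadata` module. The helper name `pinned_requirements` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def pinned_requirements(packages):
    """Return (pins, missing): 'name==version' lines and names that are not installed."""
    pins, missing = [], []
    for package in packages:
        try:
            pins.append(f"{package}=={version(package)}")
        except PackageNotFoundError:
            missing.append(package)
    return pins, missing


pins, missing = pinned_requirements(["joblib", "scikit-learn", "pandas", "numpy"])
if missing:
    print(f"Not installed, skipped: {', '.join(missing)}")

# Write the file only after every lookup has finished
with open("requirements.txt", "w") as f:
    f.write("\n".join(pins) + ("\n" if pins else ""))
```

Collecting the pins first and writing once keeps `requirements.txt` consistent even when a lookup fails, and the `missing` list makes skipped packages visible instead of silently omitting them.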