<a href="https://colab.research.google.com/github/amzad-786githumb/AI_and_ML_by-Microsoft/blob/main/7_Implementing_a_model_for_business_deployment.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<h2>Tasks:</h2>


*   Develop an ML model suitable for solving a business problem.
*   Train the model using a provided dataset.
*   Optimize the model for deployment, considering factors such as scalability, efficiency, and integration.
*   Prepare the model for deployment in a production environment.





<h2>Business scenario</h2>
<b>Project name:</b> Customer Churn Prediction

<b>Business overview:</b> You have been hired by a telecommunications company to develop a machine-learning model that predicts customer churn. The company wants to identify customers who are likely to cancel their service so they can take proactive steps to retain them. The model you develop will be integrated into the company’s customer relationship management (CRM) system and used by the marketing team to target at-risk customers with retention offers.



<h2>Project requirements</h2>

<b>1. Predictive accuracy:</b> The model must accurately predict whether a customer is likely to churn based on historical data.

<b>2. Scalability:</b> The model should be able to handle a large volume of data, as the company has millions of customers.

<b>3. Integration:</b> The model needs to be easily integrated into the company’s existing CRM system, which is built on a Python-based backend.

<b>4. Efficiency:</b> The model should be optimized for real-time or near-real-time predictions to allow timely interventions by the marketing team.

<h3>1. Set up your environment</h3>

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split

<h3>2. Load the dataset</h3>

In [None]:
data = pd.read_csv("/content/Dataset.csv")

In [None]:
#printing first few rows
print(data.head())

   CustomerID  Tenure  MonthlyCharges  TotalCharges        Contract  \
0        1001       5            70.0         350.0  Month-to-month   
1        1002      10            85.5         850.5        Two year   
2        1003       3            55.3         165.9        One year   
3        1004       8            90.0         720.0  Month-to-month   
4        1005       2            65.2         130.4        One year   

      PaymentMethod  Churn  
0  Electronic check      1  
1      Mailed check      0  
2  Electronic check      1  
3       Credit card      0  
4  Electronic check      1  


In [None]:
#checking the null values and data types
print(data.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 7 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   CustomerID      5 non-null      int64  
 1   Tenure          5 non-null      int64  
 2   MonthlyCharges  5 non-null      float64
 3   TotalCharges    5 non-null      float64
 4   Contract        5 non-null      object 
 5   PaymentMethod   5 non-null      object 
 6   Churn           5 non-null      int64  
dtypes: float64(2), int64(3), object(2)
memory usage: 412.0+ bytes
None


<h3>3. Preprocess the data</h3>

In [None]:
# Drop the identifier column; CustomerID is not used as a feature
data = data.drop(columns=['CustomerID'])

# Drop rows that contain missing values
data = data.dropna()
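Before dropping rows, it can help to see how many values are missing and how many rows `dropna()` would remove. The sketch below uses a small made-up frame (`demo`, not the real dataset) with the same kind of columns; the values are illustrative assumptions only.

```python
import numpy as np
import pandas as pd

# Made-up rows for illustration; the real data comes from Dataset.csv.
demo = pd.DataFrame({
    "Tenure": [5, 10, np.nan, 8],
    "MonthlyCharges": [70.0, np.nan, 55.3, 90.0],
    "Churn": [1, 0, 1, 0],
})

# Count missing values per column.
missing_per_column = demo.isna().sum()

# Compare row counts before and after dropping incomplete rows.
rows_before = len(demo)
rows_after = len(demo.dropna())

print(missing_per_column)
print(f"dropna() would remove {rows_before - rows_after} of {rows_before} rows")
```

If a large share of rows would be lost, imputing values may be a better choice than dropping them.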

In [None]:
# Encode categorical columns as dummy variables;
# drop_first=True drops one level per column to avoid a redundant feature

data = pd.get_dummies(data, drop_first=True)
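When the model is later called from the CRM, new records must be encoded into the same columns that were produced at training time. The sketch below shows one possible way to do that with `DataFrame.reindex`; the toy frames and the `feature_columns` name are assumptions made for illustration, not part of the notebook.

```python
import pandas as pd

# Toy training rows using the same categorical column name as the dataset.
train = pd.DataFrame({
    "Tenure": [5, 10, 3],
    "Contract": ["Month-to-month", "Two year", "One year"],
})
train_encoded = pd.get_dummies(train, drop_first=True)
feature_columns = list(train_encoded.columns)

# A single new record only contains one category, so it is encoded
# without drop_first (otherwise its only dummy column would be dropped).
new_row = pd.DataFrame({"Tenure": [7], "Contract": ["Two year"]})
new_encoded = pd.get_dummies(new_row)

# Align to the training layout; dummy columns absent from the new data become 0.
new_aligned = new_encoded.reindex(columns=feature_columns, fill_value=0)

print(feature_columns)
print(new_aligned)
```

Storing `feature_columns` next to the saved model keeps training and serving in agreement about column order.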

In [None]:
# Split the dataset into features (X) and target (y)

X = data.drop(columns=['Churn'])
y = data['Churn']

# Hold out 20% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
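Note that the sample file shown above has only five rows, so a 20% hold-out leaves a single test row. On a larger dataset, churn labels are often imbalanced, and a stratified split keeps the churn rate similar in both partitions. The sketch below uses synthetic stand-in data (`X_demo`, `y_demo` are assumptions, not the real dataset) to show the `stratify` argument of `train_test_split`.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 100 rows, 30% churners.
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(100, 3))
y_demo = np.array([1] * 30 + [0] * 70)

# stratify=y_demo preserves the class proportions in both splits.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

print(f"train churn rate: {y_tr.mean():.2f}, test churn rate: {y_te.mean():.2f}")
```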

<h3>4. Development to deployment</h3>

<h4><b>a.) TensorFlow</b></h4>

In [None]:
#import the libraries

import tensorflow as tf

In [None]:
#build the model

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(X_train.shape[1],)),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])



In [None]:
#Compile and train the model

model.compile(optimizer='adam',
              loss = 'binary_crossentropy',
              metrics=['accuracy'])

model.fit(X_train, y_train, epochs=10, validation_data=(X_test, y_test))

Epoch 1/10
1/1 ━━━━━━━━━━━━━━━━━━━━ 2s 2s/step - accuracy: 0.7500 - loss: 5.3159 - val_accuracy: 0.0000e+00 - val_loss: 47.3543
Epoch 2/10
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 70ms/step - accuracy: 0.5000 - loss: 13.2273 - val_accuracy: 0.0000e+00 - val_loss: 49.3778
Epoch 3/10
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 67ms/step - accuracy: 0.7500 - loss: 25.8903 - val_accuracy: 0.0000e+00 - val_loss: 46.2461
Epoch 4/10
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 68ms/step - accuracy: 0.7500 - loss: 52.0239 - val_accuracy: 0.0000e+00 - val_loss: 39.6660
Epoch 5/10
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 69ms/step - accuracy: 0.2500 - loss: 66.9574 - val_accuracy: 0.0000e+00 - val_loss: 35.1828
Epoch 6/10
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 66ms/step - accuracy: 0.2500 - loss: 63.0488 - val_accuracy: 0.0000e+00 - val_loss: 30.3483
Epoch 7/10
...

<keras.src.callbacks.history.History at 0x79a747c09e20>

In [None]:
#evaluate and optimize the model

test_loss, test_acc = model.evaluate(X_test, y_test, verbose=2)
print(f'Test accuracy: {test_acc}')

1/1 - 0s - 38ms/step - accuracy: 0.0000e+00 - loss: 5.4273
Test accuracy: 0.0
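With a single test row, the accuracy above can only be 0.0 or 1.0, so it says little about the model. The large loss values in the training log may also be related to feeding unscaled inputs to the network (Tenure is in single digits while TotalCharges is in the hundreds). A minimal sketch of standardizing the numeric columns, assuming scikit-learn's `StandardScaler`; the notebook itself does not scale, and the train/test assignment of rows below is arbitrary.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Numeric columns (Tenure, MonthlyCharges, TotalCharges) from the rows printed earlier.
# Which row ends up in the test set is chosen arbitrarily for this illustration.
X_train_num = np.array([
    [5, 70.0, 350.0],
    [10, 85.5, 850.5],
    [3, 55.3, 165.9],
    [8, 90.0, 720.0],
])
X_test_num = np.array([[2, 65.2, 130.4]])

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train_num)  # statistics come from training rows only
X_test_scaled = scaler.transform(X_test_num)        # reuse the same statistics on test rows

print(X_train_scaled.mean(axis=0).round(6))
print(X_train_scaled.std(axis=0).round(6))
```

The fitted scaler would need to be saved and applied to incoming records at prediction time as well.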


In [None]:
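#Optimize for deployment: convert to TensorFlow Lite with default optimizations
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()
# Write the converted model to disk (file name chosen here for illustration)
with open('churn_model.tflite', 'wb') as f:
    f.write(tflite_model)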

Saved artifact at '/tmp/tmp5c8kcidr'. The following endpoints are available:

* Endpoint 'serve'
  args_0 (POSITIONAL_ONLY): TensorSpec(shape=(None, 7), dtype=tf.float32, name='keras_tensor_25')
Output Type:
  TensorSpec(shape=(None, 1), dtype=tf.float32, name=None)
Captures:
  133759369413072: TensorSpec(shape=(), dtype=tf.resource, name=None)
  133759369414032: TensorSpec(shape=(), dtype=tf.resource, name=None)
  133759369414992: TensorSpec(shape=(), dtype=tf.resource, name=None)
  133759369414224: TensorSpec(shape=(), dtype=tf.resource, name=None)
  133759369415376: TensorSpec(shape=(), dtype=tf.resource, name=None)
  133759369415184: TensorSpec(shape=(), dtype=tf.resource, name=None)


In [None]:
#Prepare the model for deployment
model.save('churn_model.h5')



<h4><b>b.) PyTorch</b></h4>

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim

In [None]:
class ChurnModel(nn.Module):
    def __init__(self):
        super(ChurnModel, self).__init__()
        self.fc1 = nn.Linear(X_train.shape[1], 64)
        self.fc2 = nn.Linear(64, 32)
        self.fc3 = nn.Linear(32, 1)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = nn.functional.dropout(x, 0.5, training=self.training)
        x = torch.relu(self.fc2(x))
        x = torch.sigmoid(self.fc3(x))
        return x

model_1 = ChurnModel()

In [None]:
#Define the loss and optimizer, then train the model

criterion = nn.BCELoss()
# Optimize the PyTorch model (model_1), not the Keras model defined earlier
optimizer = optim.Adam(model_1.parameters(), lr=0.001)

# Convert boolean columns to float once, then build the training tensors
X_train_tensor = torch.tensor(X_train.astype(float).values).float()
y_train_tensor = torch.tensor(y_train.values).float()

# Training loop (simplified example: one full-batch update per epoch)
for epoch in range(10):
    model_1.train()
    optimizer.zero_grad()
    outputs = model_1(X_train_tensor)
    loss = criterion(outputs.squeeze(1), y_train_tensor)
    loss.backward()
    optimizer.step()

In [None]:
model_1.eval()
# Convert boolean columns to float before converting to tensor
X_test_numeric = X_test.astype(float)
outputs = model_1(torch.tensor(X_test_numeric.values).float())
predictions = (outputs.squeeze().detach().numpy() > 0.5).astype(int)
accuracy = np.mean(predictions == y_test.values)
print(f'Test accuracy: {accuracy}')

Test accuracy: 1.0


In [None]:
#Optimize for deployment
# Apply dynamic quantization
quantized_model = torch.quantization.quantize_dynamic(
    model_1, {torch.nn.Linear}, dtype=torch.qint8
)

For migrations of users: 
1. Eager mode quantization (torch.ao.quantization.quantize, torch.ao.quantization.quantize_dynamic), please migrate to use torchao eager mode quantize_ API instead 
2. FX graph mode quantization (torch.ao.quantization.quantize_fx.prepare_fx,torch.ao.quantization.quantize_fx.convert_fx, please migrate to use torchao pt2e quantization API instead (prepare_pt2e, convert_pt2e) 
3. pt2e quantization has been migrated to torchao (https://github.com/pytorch/ao/tree/main/torchao/quantization/pt2e) 
see https://github.com/pytorch/ao/issues/2259 for more details
  quantized_model = torch.quantization.quantize_dynamic(


In [None]:
#Prepare the model for deployment
# Save the weights of the PyTorch model (model_1), not the Keras model
torch.save(model_1.state_dict(), 'churn_model.pth')
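A saved `state_dict` holds only the weights, so the serving code has to rebuild the architecture before loading them. The sketch below shows one way this round trip might look. It redefines the network with an explicit `n_features` argument (an assumption; the notebook reads the size from `X_train`), uses an untrained model as a stand-in for the trained one, and writes to a temporary directory.

```python
import os
import tempfile

import torch
import torch.nn as nn


class ChurnModel(nn.Module):
    # Same layer sizes as above; n_features is passed in explicitly here (an assumption).
    def __init__(self, n_features):
        super().__init__()
        self.fc1 = nn.Linear(n_features, 64)
        self.fc2 = nn.Linear(64, 32)
        self.fc3 = nn.Linear(32, 1)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = nn.functional.dropout(x, 0.5, training=self.training)
        x = torch.relu(self.fc2(x))
        return torch.sigmoid(self.fc3(x))


n_features = 7  # matches the input shape reported in the TFLite converter output
trained = ChurnModel(n_features)  # untrained stand-in for the trained model
trained.eval()

path = os.path.join(tempfile.mkdtemp(), "churn_model.pth")
torch.save(trained.state_dict(), path)

# In the serving process: rebuild the architecture, then load the weights.
restored = ChurnModel(n_features)
restored.load_state_dict(torch.load(path))
restored.eval()  # disables dropout so predictions are deterministic

sample = torch.rand(3, n_features)  # random stand-in for three encoded customers
with torch.no_grad():
    same_output = torch.allclose(trained(sample), restored(sample))
print(same_output)
```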

<h4><b>c.) Scikit-learn</b></h4>

In [None]:
#importing the libraries
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
import joblib

In [None]:
#building the model
model_2 = RandomForestClassifier(n_estimators=100, random_state=42)

In [None]:
model_2.fit(X_train, y_train)

In [None]:
#evaluate and optimize the model

predictions = model_2.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f'Test accuracy: {accuracy}')

Test accuracy: 1.0


In [None]:
#Optimize for deployment
# Simplify model by limiting its maximum depth
pruned_model = RandomForestClassifier(n_estimators=100, random_state=42, max_depth=10, max_features='sqrt')

pruned_model.fit(X_train, y_train)
pruned_predictions = pruned_model.predict(X_test)
pruned_accuracy = accuracy_score(y_test, pruned_predictions)
print(f'Pruned Test accuracy: {pruned_accuracy}')

Pruned Test accuracy: 1.0


In [None]:
#Prepare the model for deployment

joblib.dump(pruned_model, 'churn_model.pkl')

['churn_model.pkl']
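On the CRM side, the Python backend would load the saved file once and reuse it for each prediction request. The sketch below illustrates that flow under stated assumptions: the data is synthetic, the model is a small stand-in forest, the file is written to a temporary directory, and `churn_probability` is a hypothetical helper rather than part of any existing CRM code.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the training data; real features would come from the CRM.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 4))
y_demo = (X_demo[:, 0] > 0).astype(int)

clf = RandomForestClassifier(n_estimators=20, max_depth=5, random_state=42)
clf.fit(X_demo, y_demo)

# Save the model, then load it back as the serving process would.
model_path = os.path.join(tempfile.mkdtemp(), "churn_model.pkl")
joblib.dump(clf, model_path)
loaded = joblib.load(model_path)


def churn_probability(model, features):
    """Hypothetical helper: return P(churn) for each row of `features`."""
    return model.predict_proba(features)[:, 1]


probs = churn_probability(loaded, X_demo[:5])
same = np.array_equal(loaded.predict(X_demo), clf.predict(X_demo))
print(probs.shape, same)
```

Returning probabilities rather than hard labels lets the marketing team choose its own threshold for retention offers. Pickle-based files should only be loaded from trusted sources and with a compatible scikit-learn version.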