Problem: Predicting Telecom Customer Churn

# Introduction to the Business Scenario
You are a data scientist at a telecom company facing a growing challenge with customer churn. 
Over the past year, the company has observed a significant number of customers switching to competitors, leading to reduced revenue and an increase in customer acquisition costs. 
Management is keen to identify at-risk customers early and implement targeted retention strategies to minimize churn.

As part of the solution, you have been tasked with leveraging machine learning to predict customer churn. By accurately identifying customers who are likely to leave, the company can take proactive steps, such as offering incentives, discounts, or improved services, to retain these customers.

# About This Dataset
The dataset represents telecom customer records and includes demographic, account, and service information. It is designed to help predict whether a customer will churn or not. The dataset includes 7,043 unique customer entries with 21 attributes.

Features
- Customer Demographics:
    - gender: Gender of the customer (Male/Female).
    - SeniorCitizen: Indicates whether the customer is a senior citizen (1 for Yes, 0 for No).
    - Partner: Indicates if the customer has a partner (Yes/No).
    - Dependents: Indicates if the customer has dependents (Yes/No).
- Account Information:
    - tenure: Number of months the customer has been with the company.
    - MonthlyCharges: Monthly charges incurred by the customer.
    - TotalCharges: Total charges incurred by the customer.
- Services Availed:
    - PhoneService: Indicates if the customer has a phone service (Yes/No).
    - MultipleLines: Indicates if the customer has multiple lines (Yes, No, No phone service).
    - InternetService: Type of internet service (DSL, Fiber optic, No).
    - OnlineSecurity, OnlineBackup, DeviceProtection, TechSupport: Whether the customer has availed of these specific internet services (Yes, No, No internet service).
    - StreamingTV, StreamingMovies: Indicates if the customer uses streaming services (Yes, No, No internet service).
- Contract and Payment Information:
    - Contract: Type of contract (Month-to-month, One year, Two year).
    - PaperlessBilling: Indicates if the customer uses paperless billing (Yes/No).
    - PaymentMethod: Payment method used by the customer (Electronic check, Mailed check, Bank transfer (automatic), Credit card (automatic)).
- Target Variable:
    - Churn: Indicates whether the customer has churned (Yes/No).

Dataset Overview

This dataset was made publicly available on Kaggle under the name "Telco Customer Churn" by Blastchar. It provides valuable insights for building a predictive model to address customer retention challenges in the telecom industry.

- Rows: 7,043
- Columns: 21
- Target Variable: Churn (binary classification: Yes/No)

# Step 1: Problem formulation and data collection

Start this project off by writing a few sentences below that summarize the business problem and the business goal you're trying to achieve in this scenario. Include a business metric you would like your team to aspire toward. With that information defined, clearly write out the machine learning problem statement. Finally, add a comment or two about the type of machine learning this represents.



### Read through a business scenario and:

### 1. Determine if and why ML is an appropriate solution to deploy.
\# Write your answer here

### 2. Formulate the business problem, success metrics, and desired ML output.
\# Write your answer here

### 3. Identify the type of ML problem you’re dealing with.
\# Write your answer here

### 4. Analyze the appropriateness of the data you’re working with.
\# Write your answer here


### Setup

Now that we have decided where to focus our energy, let's set things up so you can start working on solving the problem.


Replace **`<LabBucketName>`** with the resource name that was provided with your lab account.

In [None]:
import json
import boto3
import sagemaker
import pandas as pd
from sagemaker import get_execution_role
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Initialize SageMaker session
sagemaker_session = sagemaker.Session()
role = get_execution_role()
bucket = '<LabBucketName>'  # specify your bucket name
prefix = 'telco-churn-example'

sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /home/ec2-user/.config/sagemaker/config.yaml


# Step 2: Data preprocessing   
In this data preprocessing phase, take the opportunity to explore your data to better understand it. First, import the necessary libraries and read the data into a pandas DataFrame. Then look at the shape of the dataset and at the columns and their types (numerical, categorical). Consider computing basic statistics on the features to get a sense of their means and ranges. Take a close look at your target column and determine its distribution.
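
The exploration steps described above can be sketched on a small stand-in table. This is a minimal illustration, not the lab solution: the `toy` frame and its values are hypothetical and only mimic a few columns of the real dataset.

```python
import pandas as pd

# Hypothetical toy records that mimic a few columns of the Telco dataset.
toy = pd.DataFrame({
    "tenure": [1, 34, 2, 45, 8, 22],
    "MonthlyCharges": [29.85, 56.95, 53.85, 42.30, 99.65, 89.10],
    "Contract": ["Month-to-month", "One year", "Month-to-month",
                 "One year", "Month-to-month", "Two year"],
    "Churn": ["No", "No", "Yes", "No", "Yes", "No"],
})

print(toy.shape)       # (rows, columns)
print(toy.dtypes)      # numeric columns vs. text (categorical) columns
print(toy.describe())  # count, mean, std, min, max for numeric columns

# Share of each target class; a skewed split signals class imbalance.
churn_share = toy["Churn"].value_counts(normalize=True)
print(churn_share)
```

The same four calls work unchanged on the full dataset once it is loaded.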

### Specific questions to consider
1. What can you deduce from the basic statistics you ran on the features? 

2. What can you deduce from the distributions of the target classes?

3. Is there anything else you deduced from exploring the data?



# Load dataset
data = pd.read_csv('<CODE>')

Check the dataframe by printing the first 5 rows of the dataset.  

In [None]:
# Enter your code here
<CODE>

**Question**: What can you find out about the column types and the null values? How many columns are numerical or categorical? 

In [None]:
<CODE>

Check for missing values in the dataset.
Convert the `TotalCharges` column to numeric values, replacing any missing values with the median.
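
As a hedged illustration of the conversion, the sketch below uses a hypothetical four-row column in which one entry is a blank string. It assumes that unparseable entries should become missing values first and then be filled with the median of the remaining values.

```python
import pandas as pd

# Hypothetical sample; the blank string stands in for a missing total.
toy = pd.DataFrame({"TotalCharges": ["29.85", "1889.5", " ", "108.15"]})

# errors="coerce" turns anything unparseable into NaN instead of raising.
toy["TotalCharges"] = pd.to_numeric(toy["TotalCharges"], errors="coerce")
n_missing = int(toy["TotalCharges"].isna().sum())

# Fill the gaps with the median of the values that did parse.
median_value = toy["TotalCharges"].median()
toy["TotalCharges"] = toy["TotalCharges"].fillna(median_value)

print(n_missing, median_value)
```

Counting the missing values before filling them gives you a quick check on how much of the column was affected.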

In [None]:
# Convert 'TotalCharges' to numeric, and handle non-numeric values
<CODE>

Drop the columns that do not provide any value to the solution.

In [None]:

# Drop 'customerID' column
<CODE>


Convert binary variables like Churn (Yes/No) into numeric values (0/1). Apply one-hot encoding to non-binary categorical variables like Contract (Month-to-month, One year, Two year).
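
A minimal sketch of both encodings on a hypothetical three-row frame follows. The column selection here is illustrative; on the real data you would decide which columns are binary and which are multi-category.

```python
import pandas as pd

# Hypothetical rows with two Yes/No columns and one multi-category column.
toy = pd.DataFrame({
    "Partner": ["Yes", "No", "No"],
    "Contract": ["Month-to-month", "One year", "Two year"],
    "Churn": ["No", "Yes", "No"],
})

# Binary Yes/No columns become 1/0.
for col in ["Partner", "Churn"]:
    toy[col] = toy[col].map({"Yes": 1, "No": 0})

# Multi-category columns become one indicator column per category.
encoded = pd.get_dummies(toy, columns=["Contract"], dtype=int)
print(encoded.columns.tolist())
```

Passing `drop_first=True` to `get_dummies` would drop one indicator per category to avoid redundant columns; whether that helps depends on the model.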


In [None]:
# Encode binary categorical columns
<CODE>

# One-hot encode multi-category columns
<CODE>

Scale numerical columns (tenure, MonthlyCharges, TotalCharges) using StandardScaler.
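
The effect of `StandardScaler` can be checked on a hypothetical two-column frame: after scaling, each column should have a mean of about 0 and a standard deviation of about 1.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical values on very different scales.
toy = pd.DataFrame({
    "tenure": [1.0, 12.0, 24.0, 60.0],
    "MonthlyCharges": [20.0, 55.5, 70.0, 110.0],
})

num_cols = ["tenure", "MonthlyCharges"]
scaler = StandardScaler()
toy[num_cols] = scaler.fit_transform(toy[num_cols])

# StandardScaler uses the population standard deviation (ddof=0).
means = toy[num_cols].mean().to_numpy()
stds = toy[num_cols].std(ddof=0).to_numpy()
print(np.round(means, 6), np.round(stds, 6))
```

This lab scales the full dataset before splitting. In a stricter workflow you might fit the scaler on the training split only and reuse it for validation and test data.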

In [None]:

# Standardize numerical columns
<CODE>


Observe the dataset.

In [None]:
<CODE>

# Step 3: Model training and evaluation
Set up the SageMaker session and role. Specify your bucket and a prefix to store the output.
Let's start by instantiating the LinearLearner estimator with the `predictor_type='binary_classifier'` parameter and one ml.m5.large instance.

In [None]:
import boto3
import sagemaker
from sagemaker.amazon.amazon_estimator import RecordSet
from sagemaker.inputs import TrainingInput
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, classification_report
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import pandas as pd
from sagemaker.amazon.linear_learner import LinearLearner 

# Set up SageMaker session and role
sagemaker_session = sagemaker.Session()
role = sagemaker.get_execution_role()  # Ensure this role has the required permissions
bucket = '<LabBucketName>'       # Replace with your S3 bucket name
prefix = 'telco-churn-linear-learner'         # S3 prefix for data storage

# Instantiate the LinearLearner estimator with an instance type of ml.m5.large
linear = LinearLearner(
    role=role,
    instance_count=<CODE>,
    instance_type='<CODE>',
    predictor_type='<CODE>',
    output_path=f's3://{bucket}/{prefix}/output',
    sagemaker_session=sagemaker_session
)


SageMaker's Linear Learner requires numeric input data.
Convert the numerical columns to float32.
Split the data into features and target, then split it further into training, validation, and test sets.
Convert all of these datasets to NumPy arrays of type float32.
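
One way to get three sets is to call `train_test_split` twice. The sketch below uses synthetic data, and the 70/15/15 ratio and stratified sampling are assumptions chosen for illustration, not requirements of the lab.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 100 rows, 5 features, 25% positive class.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = np.array([0] * 75 + [1] * 25)

# 70% train, then split the remaining 30% evenly into validation and test.
X_train, X_temp, y_train, y_temp = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(
    X_temp, y_temp, test_size=0.50, stratify=y_temp, random_state=42)

# Cast to float32 before handing the arrays to the estimator.
train_features = X_train.astype("float32")
train_labels = y_train.astype("float32")
print(len(X_train), len(X_val), len(X_test), train_features.dtype)
```

Stratifying keeps the churn rate similar across the three sets, which matters when the positive class is small.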

In [None]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split

# Assume `telco_data` is the dataset, and `Churn` is the target column.
telco_data = data
# Step 1: Convert all numerical columns to float32
<CODE>

# Ensure the target variable ('Churn') is also float32
<CODE>

# Step 2: Split data into features (X) and target (y)
<CODE>

# Step 3: Split the dataset into train, validation, and test sets
X_train, X_temp, y_train, y_temp = <CODE>
X_val, X_test, y_val, y_test = <CODE>

# Step 4: Convert train, validation, and test sets to NumPy arrays with float32
train_features = <CODE>
train_labels = <CODE>
val_features = <CODE>
val_labels = <CODE>
test_features = <CODE>
test_labels = <CODE>

# Step 5: Verify the data types
print(f"Train features dtype: {train_features.dtype}, Train labels dtype: {train_labels.dtype}")
print(f"Validation features dtype: {val_features.dtype}, Validation labels dtype: {val_labels.dtype}")
print(f"Test features dtype: {test_features.dtype}, Test labels dtype: {test_labels.dtype}")



Linear learner accepts training data in protobuf or CSV content types, and accepts inference requests in protobuf, CSV, or JSON content types. Training data has features and ground-truth labels, while the data in an inference request has only features. In a production pipeline, it is recommended to convert the data to the Amazon SageMaker protobuf format and store it in Amazon S3. However, to get up and running quickly, AWS provides the convenient method `record_set` for converting and uploading when the dataset is small enough to fit in local memory. It accepts NumPy arrays like the ones you already have, so let's use it here. The `RecordSet` object will keep track of the temporary Amazon S3 location of your data. Use the `estimator.record_set` function to create train, validation, and test records. Then, use the `estimator.fit` function to start your training job.

In [12]:
# Create RecordSet for train, validation, and test datasets
train_records = linear.record_set(train_features, train_labels, channel='train')
val_records = linear.record_set(val_features, val_labels, channel='validation')
test_records = linear.record_set(test_features, test_labels, channel='test')


In [None]:
# Train the model
<CODE>

### Model Evaluation
In this section, you'll evaluate your trained model. First, use the `estimator.deploy` function with `initial_instance_count=1` and `instance_type='ml.m5.large'` to deploy your model on Amazon SageMaker.

In [None]:
# Deploy the model
linear_predictor = linear.deploy(
    initial_instance_count=1,
    instance_type='ml.m5.large'
)


Check if your endpoint is ready.

In [None]:
import boto3

sm_client = boto3.client('sagemaker')

# Check endpoint status; the name is read from the predictor returned by deploy()
response = sm_client.describe_endpoint(EndpointName=linear_predictor.endpoint_name)
print("Endpoint status:", response['EndpointStatus'])


Serialize the test features in CSV format and verify the data.
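
A minimal sketch of CSV serialization with NumPy follows, using a hypothetical 2x3 feature array. It assumes the payload should contain one row per record, no header, and no label column.

```python
import io
import numpy as np

# Hypothetical feature rows; an inference payload carries features only.
features = np.array([[1.0, 0.0, 0.5], [0.0, 1.0, -1.25]], dtype="float32")

# Write the array to an in-memory text buffer as comma-separated rows.
buffer = io.StringIO()
np.savetxt(buffer, features, delimiter=",", fmt="%g")
csv_payload = buffer.getvalue()

print(type(csv_payload))
print(csv_payload)
```

The `fmt="%g"` choice keeps the text compact; any numeric format that the endpoint can parse would do.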

In [None]:
import io

# Serialize test features into CSV format
<CODE>

# Verify the serialized CSV data
print(type(test_csv_data))  # Ensure this is a string
print(test_csv_data[:500])  # Preview first 500 characters



<class 'str'>
1.0,0.0,1.0,0.0,1.0436162,1.0,1.0,-0.16158292,0.5400106,0.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0
0.0,0.0,0.0,0.0,-1.1145631,1.0,0.0,0.34526518,-0.83407307,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,1.0,0.0,1.0
1.0,0.0,1.0,1.0,0.8807347,1.0,1.0,-1.3298262,-0.41488805,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,1.0,0.0,1.0,0.0,1.0,0.0,1.0,0.0,1.0,0.0
0.0,0.0,1.0,1.0,1.6137012,1.0,1.0,1.6381432,2.7311187,1.0,0.0,0.0,1.0,1.0,0.0


In [None]:
from sagemaker.predictor import Predictor

# Create a predictor for the endpoint
<CODE>

# Perform inference
<CODE>

In [None]:
import json

# Example raw predictions (byte string from SageMaker endpoint)
raw_predictions = predictions
# Step 1: Decode and parse the byte string
decoded_predictions = <CODE>
parsed_predictions = <CODE>
# Step 2: Extract scores and predicted labels
predictions_list = <CODE>
scores = <CODE>
predicted_labels = <CODE>
# Step 3: Convert scores to binary labels using a threshold
threshold = 0.5
y_pred = <CODE>

# Print results
print("Scores:", scores[0])
print("Predicted Labels (from model):", predicted_labels[0])
print("Binary Labels (custom threshold):", y_pred[0])
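
The decode, parse, and threshold steps can be tried offline on a hand-written payload. The response shape below (a `predictions` list whose items carry `score` and `predicted_label`) is an assumption about the endpoint's JSON reply, and the threshold of 0.4 is a hypothetical value chosen to show how a custom cutoff changes the labels.

```python
import json

# Hypothetical response body, hand-written to mirror a binary-classifier reply.
raw_predictions = (
    b'{"predictions": ['
    b'{"score": 0.91, "predicted_label": 1}, '
    b'{"score": 0.12, "predicted_label": 0}, '
    b'{"score": 0.47, "predicted_label": 0}]}'
)

# Decode the bytes, then parse the JSON text.
parsed = json.loads(raw_predictions.decode("utf-8"))
scores = [p["score"] for p in parsed["predictions"]]
predicted_labels = [p["predicted_label"] for p in parsed["predictions"]]

# A lower threshold flags more customers, trading precision for recall.
threshold = 0.4
y_pred = [1 if s >= threshold else 0 for s in scores]
print(scores, predicted_labels, y_pred)
```

Here the third customer is labeled 0 by the model's own cutoff but 1 under the custom threshold.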



### Understanding the Confusion Matrix for Telecom Churn
The confusion matrix helps us evaluate how well our model predicts customer churn by breaking down predictions into four critical business scenarios:

1. True Negatives: Loyal customers correctly identified
  - These are customers we correctly predicted would stay
  - No unnecessary retention resources spent
  - Represents efficient resource allocation

2. False Positives: Unnecessary interventions
  - Customers wrongly flagged as likely to leave
  - Represents potentially wasted retention budget
  - Could annoy satisfied customers with unnecessary retention offers

3. False Negatives: Missed opportunities
  - Customers who left without us predicting it
  - Most costly error: lost revenue + customer acquisition costs
  - No chance to intervene with retention offers

4. True Positives: Successful early warnings
  - Correctly identified customers at risk of leaving
  - Allows proactive retention measures
  - Prime opportunities for targeted intervention

Business Insights to Look For:
- Ratio of correctly identified churners vs total actual churners
- Efficiency rate of retention targeting (true positives vs all positive predictions)
- Number of missed churning customers and potential revenue impact
- Balance between identifying loyal customers vs identifying churners

Key Business Decisions:
1. Model threshold adjustment based on costs of false positives vs false negatives
2. Resource allocation for retention campaigns
3. Cost-benefit analysis of retention actions vs potential customer loss
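
The four cells and two of the ratios listed above can be computed directly with scikit-learn. The labels below are hypothetical and chosen so that the counts are easy to verify by hand.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical ground truth and predictions (1 = churned, 0 = stayed).
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0]

# With labels=[0, 1], ravel() yields TN, FP, FN, TP in that order.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()

recall = tp / (tp + fn)     # share of actual churners that were caught
precision = tp / (tp + fp)  # share of flagged customers who really churned
print(tn, fp, fn, tp, recall, precision)
```

Recall corresponds to the first business insight in the list, and precision to the second.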

In [None]:
from sklearn.metrics import confusion_matrix, classification_report
import matplotlib.pyplot as plt
import seaborn as sns

# Generate confusion matrix
<CODE>

# Plot confusion matrix
<CODE>
# Print classification report
<CODE>


### Understanding ROC-AUC
The ROC (Receiver Operating Characteristic) curve plots the True Positive Rate against False Positive Rate at various thresholds. It's particularly useful for churn prediction because:
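- The curve itself does not change with the class ratio, which helps in churn data where most customers don't churn. Under heavy imbalance it can still look optimistic, so read it alongside the precision-recall curve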
- AUC score range interpretation:
  - 0.9-1.0: Excellent prediction
  - 0.8-0.9: Good prediction
  - 0.7-0.8: Fair prediction
  - < 0.7: Poor prediction
- Helps choose optimal threshold for classifying churners

Look for a curve that rises sharply and stays close to the top-left corner.
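
As a small worked example, the sketch below computes the ROC curve and its AUC for four hypothetical customers. Three of the four positive-negative pairs are ranked correctly, so the AUC is 0.75.

```python
from sklearn.metrics import roc_curve, auc

# Hypothetical labels and churn scores for four customers.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

# roc_curve sweeps the threshold and returns one (FPR, TPR) point per step.
fpr, tpr, thresholds = roc_curve(y_true, scores)
roc_auc = auc(fpr, tpr)
print(f"ROC-AUC: {roc_auc:.2f}")
```

The AUC can be read as the probability that a randomly chosen churner receives a higher score than a randomly chosen non-churner.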

In [None]:
from sklearn.metrics import roc_curve, auc
import matplotlib.pyplot as plt
import json

# Parse the predictions from bytes to JSON
<CODE>

# Extract probability scores for the positive class (churn)
<CODE>

# Calculate ROC curve and ROC-AUC score
<CODE>

# Plot ROC curve
<CODE>

# Print ROC-AUC score
<CODE>

### Understanding Precision-Recall Curve
The Precision-Recall curve is especially relevant for churn prediction because:
- Precision: Of all customers we predicted would churn, how many actually did?
- Recall: Of all actual churners, how many did we catch?

This helps balance between:
- Not wasting resources on retention efforts (precision)
- Not missing potential churners (recall)

For telecom churn, high recall might be more important as the cost of losing a customer (false negative) is typically higher than the cost of a retention action (false positive).
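
The trade-off can be made concrete by asking for the highest threshold that still reaches a target recall. The labels, scores, and the target recall of 1.0 below are hypothetical values for illustration.

```python
from sklearn.metrics import precision_recall_curve, average_precision_score

# Hypothetical labels and churn scores for four customers.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

precision, recall, thresholds = precision_recall_curve(y_true, scores)
ap = average_precision_score(y_true, scores)

# recall has one more entry than thresholds; drop the final (recall=0) point.
target_recall = 1.0
reaches_target = recall[:-1] >= target_recall
best_threshold = thresholds[reaches_target].max()
print(f"Average precision: {ap:.3f}, threshold for full recall: {best_threshold}")
```

Catching every churner here requires a threshold of 0.35, which also flags one loyal customer. That is the cost of high recall.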

In [None]:
from sklearn.metrics import precision_recall_curve, average_precision_score
import matplotlib.pyplot as plt

# Parse predictions (assuming same data structure as before)
<CODE>

# Calculate Precision-Recall curve
<CODE>

# Plot
<CODE>

# Print the score
<CODE>

### Key questions to consider:
1. How does your model's performance on the test set compare to the training set? What can you deduce from this comparison? 

2. Are there obvious differences between the outcomes of metrics like accuracy, precision, and recall? If so, why might you be seeing those differences? 

3. Given your business situation and goals, which metric(s) is most important for you to consider here? Why?

4. Is the outcome for the metric(s) you consider most important sufficient for what you need from a business standpoint? If not, what are some things you might change in your next iteration (in the feature engineering section, which is coming up next)? 

In [None]:
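import boto3

# Specify the endpoint name (read from the predictor created at deployment)
endpoint_name = linear_predictor.endpoint_name

# Delete the endpoint to stop incurring charges
sm_client = boto3.client('sagemaker')
sm_client.delete_endpoint(EndpointName=endpoint_name)
print(f"Deletion requested for endpoint '{endpoint_name}'.")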


# Step 4: Feature engineering

You've now gone through one iteration of training and evaluating your model. Given that the outcome you reached for your model the first time probably wasn't sufficient for solving your business problem, what are some things you could change about your data to possibly improve model performance?

Let's inspect the dataset once again and decide what can be done to improve the performance.

In [None]:
import pandas as pd

# Load the Telco Customer Churn dataset
file_path = 'Telco-Customer-Churn.csv'
telco_data = pd.read_csv(file_path)

# Inspect the dataset
<CODE>


In [None]:
# Check class distribution
<CODE>
# Plot class distribution
<CODE>

There are two main ways to handle imbalanced datasets:

- Oversample to add more positive samples
    - Random oversampling
    - [Synthetic minority oversampling technique (SMOTE)](https://arxiv.org/abs/1106.1813)
- Undersample to reduce the negative samples
    - Random undersampling
    - Generate centroids using clustering methods
    
The class distribution shows that negative examples (customers who stayed) outnumber the positive ones.
We will use SMOTE to increase the number of positive examples. We will separate the features and target, split the data, and then check the class distribution.
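
For comparison, the simplest option in the list, random oversampling, can be sketched with NumPy alone. This is not SMOTE: it duplicates existing minority rows instead of synthesizing new ones. The data below is synthetic.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical imbalanced training set: 8 non-churners, 2 churners.
X_train = rng.normal(size=(10, 3))
y_train = np.array([0] * 8 + [1] * 2)

minority_idx = np.flatnonzero(y_train == 1)
majority_idx = np.flatnonzero(y_train == 0)

# Draw minority rows with replacement until both classes are the same size.
n_extra = len(majority_idx) - len(minority_idx)
extra_idx = rng.choice(minority_idx, size=n_extra, replace=True)
X_balanced = np.vstack([X_train, X_train[extra_idx]])
y_balanced = np.concatenate([y_train, y_train[extra_idx]])

print(np.bincount(y_balanced))  # [8 8]
```

Whichever technique you use, resample only the training split so that the test set keeps the real-world class ratio.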

In [27]:
!pip install imblearn

Collecting imblearn
  Downloading imblearn-0.0-py2.py3-none-any.whl.metadata (355 bytes)
Collecting imbalanced-learn (from imblearn)
  Downloading imbalanced_learn-0.12.4-py3-none-any.whl.metadata (8.3 kB)
Downloading imblearn-0.0-py2.py3-none-any.whl (1.9 kB)
Downloading imbalanced_learn-0.12.4-py3-none-any.whl (258 kB)
Installing collected packages: imbalanced-learn, imblearn
Successfully installed imbalanced-learn-0.12.4 imblearn-0.0


In [None]:
from sklearn.model_selection import train_test_split
from imblearn.over_sampling import SMOTE

# Separate features and target
<CODE>

# Perform train-test split
X_train, X_test, y_train, y_test = <CODE>

# Check class distribution in training data before SMOTE
<CODE>


In [None]:
# Initialize SMOTE
smote = SMOTE(random_state=42)

# Apply SMOTE to the training data
X_train_smote, y_train_smote = smote.fit_resample(X_train, y_train)

# Check class distribution after SMOTE
<CODE>


In [None]:
# Plot the class distribution after SMOTE
<CODE>


In [None]:
from sagemaker.amazon.common import write_numpy_to_dense_tensor
import os
import numpy as np

# Convert training data to NumPy arrays in float32 format
X_train_np = <CODE> # Convert features
y_train_np = <CODE> # Convert labels

# Save as a protobuf file
train_file = 'train_data'
with open(train_file, 'wb') as f:
    write_numpy_to_dense_tensor(f, X_train_np, y_train_np)

# Check file size for validation
print(f"Training data saved: {os.path.getsize(train_file)} bytes")


Training data saved: 1216992 bytes


In [None]:
train_data_s3_path = sagemaker_session.upload_data(
    path=train_file,
    bucket='<LabBucketName>',
    key_prefix='linear-learner-smote/train'
)
print("Training data S3 path:", train_data_s3_path)



Training data S3 path: s3://pranav-churn1/linear-learner-smote/train/train_data


Define a LinearLearner estimator with an instance count of 1 and the `ml.m5.large` instance type.

In [None]:
# Instantiate LinearLearner
linear_learner = LinearLearner(
    role=role,
    instance_count=<CODE>,
    instance_type='<CODE>',
    predictor_type='<CODE>',
    output_path=f's3://{bucket}/{prefix}/output',
    sagemaker_session=sagemaker_session,
    hyperparameters={
        'predictor_type': 'binary_classifier',
        'feature_dim': feature_dim,
        'epochs': 50,            # Fixed epochs
        'optimizer': 'adam',     # Fixed optimizer
        'mini_batch_size': 200,  # Fixed mini_batch_size
    }
)


Define the SageMaker session and the bucket for model artifacts.

In [None]:
# SageMaker session and role
sagemaker_session = sagemaker.Session()
role = sagemaker.get_execution_role()

# S3 bucket for model artifacts
bucket = '<LabBucketName>'
prefix = 'linear-learner-churn'


In [None]:
import numpy as np
from sagemaker.inputs import TrainingInput
import io
import sagemaker.amazon.common as smac

# Combine features and labels into protobuf format
<CODE>


# Upload the data to S3
<CODE>

# Create the S3 path for your training data
<CODE>

# Create TrainingInput object
<CODE>

# Now fit the model using the new input format
<CODE>

In [37]:
# Deploy the trained model
predictor = linear_learner.deploy(
    initial_instance_count=1,
    instance_type='ml.m5.large'
)


INFO:sagemaker.image_uris:Same images used for training and inference. Defaulting to image scope: inference.
INFO:sagemaker.image_uris:Ignoring unnecessary instance type: None.
INFO:sagemaker:Creating model with name: linear-learner-2024-11-27-07-11-01-826
INFO:sagemaker:Creating endpoint-config with name linear-learner-2024-11-27-07-11-01-826
INFO:sagemaker:Creating endpoint with name linear-learner-2024-11-27-07-11-01-826


-------!

Serialize the features in CSV format and display the predictions. Use a threshold of 0.5 to convert scores into binary labels.

In [None]:
import io

# Serialize test features into CSV format
<CODE>

# Verify the serialized CSV data
print(type(test_csv_data))  # Ensure this is a string
print(test_csv_data[:500])  # Preview first 500 characters

from sagemaker.predictor import Predictor

# Create a predictor for the endpoint
<CODE>

# Perform inference
<CODE>


# Example raw predictions (byte string from SageMaker endpoint)
raw_predictions = predictions
# Step 1: Decode and parse the byte string
decoded_predictions = <CODE>
parsed_predictions = <CODE>

# Step 2: Extract scores and predicted labels
predictions_list = <CODE>
scores = <CODE>
predicted_labels = <CODE>

# Step 3: Convert scores to binary labels using a threshold
threshold = 0.5
y_pred = <CODE>

# Print results
print("Scores:", scores[0])
print("Predicted Labels (from model):", predicted_labels[0])
print("Binary Labels (custom threshold):", y_pred[0])


Plot the confusion matrix and print the classification report.

In [None]:
import matplotlib.pyplot as plt
import numpy as np

def plot_confusion_matrix(cm, classes, title='Confusion Matrix', cmap=plt.cm.Blues):
    <CODE>


# Print classification report
    <CODE>

Plot the ROC curve and report the AUC.

In [None]:
from sklearn.metrics import roc_curve, auc
import matplotlib.pyplot as plt
import json

<CODE>

Plot the precision-recall curve.

In [None]:
from sklearn.metrics import precision_recall_curve, average_precision_score
import matplotlib.pyplot as plt

<CODE>

What can you deduce from the output above? Has the model's performance improved?
Does it meet your business goals?
Do you know of any other techniques that could improve model performance?
Optional step: Apply hyperparameter tuning to the solution above to check whether it improves performance.

# Step 5: Hyperparameter Tuning

When you build complex machine learning systems like deep learning neural networks, exploring all of the possible combinations is impractical. Hyperparameter tuning can accelerate your productivity by trying many variations of a model. It looks for the best model automatically by focusing on the most promising combinations of hyperparameter values within the ranges that you specify. To get good results, you must choose the right ranges to explore.

Refer to this [documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning-how-it-works.html) to explore the hyperparameter tuning strategies available in Amazon SageMaker.
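
To build intuition for searching within a range, the sketch below runs a small random search locally. It is an analogue under stated assumptions, not the SageMaker tuner: it swaps in scikit-learn's `LogisticRegression` on synthetic data, samples one hyperparameter (`C`) log-uniformly, and scores each trial on a held-out validation set.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic, mildly imbalanced data standing in for the churn features.
X, y = make_classification(n_samples=400, n_features=8,
                           weights=[0.75, 0.25], random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

rng = np.random.default_rng(0)
trials = []
for _ in range(10):
    # Sample the regularization strength log-uniformly from [1e-3, 1e2].
    c = 10 ** rng.uniform(-3, 2)
    model = LogisticRegression(C=c, max_iter=1000).fit(X_tr, y_tr)
    trials.append((model.score(X_val, y_val), c))

# Keep the trial with the highest validation accuracy.
best_score, best_c = max(trials)
print(f"best validation accuracy={best_score:.3f} at C={best_c:.4g}")
```

A managed tuning job follows the same loop of proposing values, training, and scoring on validation data, but runs the trials as separate training jobs and can use smarter strategies than pure random sampling.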


In [None]:
# Import necessary modules
import os
import numpy as np
from sagemaker.tuner import HyperparameterTuner, IntegerParameter, ContinuousParameter, CategoricalParameter
from sagemaker.amazon.common import write_numpy_to_dense_tensor
from sklearn.model_selection import train_test_split

# Split the SMOTE data into training and validation sets
<CODE>

# Convert to NumPy arrays in float32 format
<CODE>

# Save training data to a local file in RecordIO protobuf format
<CODE>

# Save validation data to a local file in RecordIO protobuf format
<CODE>

# Upload the data to S3
<CODE>

# Specify the data channels
<CODE>

# Define hyperparameter ranges (only tunable hyperparameters)
hyperparameter_ranges = {
    'learning_rate': ContinuousParameter(0.0001, 0.2),
    'l1': ContinuousParameter(0.0, 1.0),
    'wd': ContinuousParameter(0.0, 1.0),  # 'wd' is weight decay (L2 regularization)
    'use_bias': CategoricalParameter(['True', 'False']),  # Use strings for booleans
    'positive_example_weight_mult': ContinuousParameter(0.5, 1.0)
}

# Define the objective metric
objective_metric_name = 'validation:binary_classification_accuracy'

# Linear Learner is a built-in algorithm, so its metrics are predefined.
# Custom metric definitions (name + log regex) are only needed for
# tuning jobs that do not use an Amazon algorithm.

# Create a HyperparameterTuner
tuner = HyperparameterTuner(
    estimator=linear_learner,
    objective_metric_name=objective_metric_name,
    objective_type='Maximize',  # We aim to maximize accuracy
    hyperparameter_ranges=hyperparameter_ranges,
    max_jobs=20,                # Total number of training jobs
    max_parallel_jobs=2         # How many training jobs can run in parallel
)

# Start hyperparameter tuning
tuner.fit(inputs=data_channels)

In [None]:
# Deploy the best model
best_estimator = tuner.best_estimator()
predictor = best_estimator.deploy(
    initial_instance_count=1,
    instance_type='ml.m5.large'
)

Now apply the same evaluation techniques as in the previous steps.

### Final Model Analysis & Conclusions

After hyperparameter tuning, it's crucial to analyze how our model performance changed:

1. Impact of Hyperparameter Tuning:
   - While we found the "best" model through tuning, our metrics show that the improvements were modest
   - This suggests that the basic model architecture might be a limiting factor, not just the parameters
   - Sometimes simpler models have an inherent performance ceiling that tuning alone can't overcome

2. Evaluation Metrics Review:
   - **Confusion Matrix**: Still shows imbalance in prediction capabilities
     - The model remains better at identifying non-churners than churners
     - This is a common challenge in churn prediction due to class imbalance
   
   - **ROC-AUC**: A good but not excellent score
     - Shows model performs better than random but has room for improvement
     - Suggests we might need to look beyond just tuning hyperparameters
   
   - **Precision-Recall**: Trade-off remains significant
     - Difficulty in simultaneously achieving high precision and recall
     - Indicates we might need different approaches for better balance

3. Next Steps to Consider:
   - Feature engineering might have more impact than further parameter tuning
   - Consider trying different model architectures (e.g., ensemble methods, deep learning)
   - Collect more data or different types of features
   - Address class imbalance through advanced sampling techniques
   - Consider business-specific cost functions instead of standard metrics
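
The last suggestion can be sketched as a threshold search that minimizes total error cost instead of maximizing accuracy. The labels, scores, candidate thresholds, and both cost figures below are hypothetical; real values would come from your finance and retention teams.

```python
import numpy as np

# Hypothetical ground truth (1 = churned) and model scores.
y_true = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1])
scores = np.array([0.05, 0.10, 0.20, 0.30, 0.45, 0.60, 0.35, 0.55, 0.70, 0.90])

COST_FALSE_NEGATIVE = 500.0  # hypothetical: revenue lost when a churner is missed
COST_FALSE_POSITIVE = 50.0   # hypothetical: cost of an unneeded retention offer

def total_cost(threshold):
    """Cost of all misclassifications at the given score threshold."""
    flagged = scores >= threshold
    fn = np.sum((y_true == 1) & ~flagged)
    fp = np.sum((y_true == 0) & flagged)
    return float(fn * COST_FALSE_NEGATIVE + fp * COST_FALSE_POSITIVE)

candidates = [0.25, 0.40, 0.50, 0.65, 0.80]
costs = [total_cost(t) for t in candidates]
best_threshold = candidates[int(np.argmin(costs))]
print(dict(zip(candidates, costs)), best_threshold)
```

Because a missed churner is assumed to cost ten times more than a wasted offer, the cheapest threshold here is the lowest one, even though it flags three loyal customers.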

Remember: In real-world churn prediction, a modest improvement in model performance can translate to significant business value. The goal isn't always to achieve perfect metrics, but to create a model that provides actionable insights and positive ROI for retention efforts. 