# **Experiment Notebook**



In [1]:
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)

<hr>

## A. Project


In [2]:
student_name = 'Yi Xiao'

In [3]:
student_id = '14356721'

In [4]:
experiment_id = '1'

<hr>

## B. Experiment Description


In [5]:
experiment_hypothesis = 'The hypothesis for this experiment is that training a RandomForestClassifier on the given dataset will improve model performance in terms of F1-Score and accuracy compared to the baseline model. '

In [6]:
experiment_expectations = 'The RandomForest model, known for its ability to handle class imbalances and complex patterns in data, should lead to higher F1-Scores by better predicting churners while minimizing false positives and false negatives.'

<hr>

## C. Data Understanding


### C.0 Import Packages

In [7]:
# Pandas for data handling
import pandas as pd

# Scikit Learn for ML training
import sklearn

# Altair for plotting
import altair as alt

# <fill_this>
#import 

<hr>

### C.1   Load Datasets

In [8]:
# Load training set
# Do not change this code

X_train = pd.read_csv('X_train.csv')
y_train = pd.read_csv('y_train.csv')

In [9]:
# Load validation set
# Do not change this code

X_val = pd.read_csv('X_val.csv')
y_val = pd.read_csv('y_val.csv')

In [10]:
# Load testing set
# Do not change this code

X_test = pd.read_csv('X_test.csv')
y_test = pd.read_csv('y_test.csv')

<hr>

<hr>

## D. Feature Selection


In [11]:
feature_selection_executive_summary = 'Use the same list of features from experiment 0.'

In [12]:
features_list = ['AccountAge', 'MonthlyCharges', 'TotalCharges', 'ViewingHoursPerWeek',
       'AverageViewingDuration', 'ContentDownloadsPerMonth']

<hr>

## E. Data Preparation

In [13]:
data_preparation_executive_summary = 'No major data issues were found that could impact training.'

> Rationale: No major data issues were found that could impact training.

> Results: No major data issues were found that could impact training.

<hr>

## F. Feature Engineering

In [14]:
data_preparation_executive_summary_2 = 'No additional feature engineering was performed at this stage. '

> Rationale: Further feature engineering may be explored in subsequent experiments.

> Results: No additional feature engineering was performed at this stage. 

<hr>

## G. Train Machine Learning Model

In [15]:
train_model_executive_summary = 'The Random Forest algorithm was selected for training, and I used grid search to identify the best parameters, the result shows a better f1 and accuracy score than the baseline model. '

### G.1 Import Algorithm

> Rationale: RF aggregates the results of multiple decision trees to provide better generalisation and reduce overfitting. It is well-suited for imbalanced classification problems like churn prediction.

In [16]:
from sklearn.ensemble import RandomForestClassifier

rf_model = RandomForestClassifier(random_state=42)

<hr>

### G.2 Set Hyperparameters

> Rationale: Values of 100 and 150 to ensure that the model has enough trees. For max_depth, values of 10 and 15 were selected to limit the depth of each tree to control overfitting. 

In [17]:
param_grid_simple = {'n_estimators': [100, 150],
                     'max_depth': [10, 15]}

<hr>

### G.3 Fit Model

In [18]:
from sklearn.model_selection import GridSearchCV

clf_simple = RandomForestClassifier(random_state=42)

grid_search_simple = GridSearchCV(clf_simple, param_grid_simple, cv=3)
grid_search_simple.fit(X_train, y_train.values.ravel())

rf_train_preds = grid_search_simple.predict(X_train)
rf_val_preds = grid_search_simple.predict(X_val)
rf_test_preds = grid_search_simple.predict(X_test)

<hr>

### G.4 Model Technical Performance

In [19]:
from sklearn.metrics import accuracy_score, f1_score

rf_train_preds = grid_search_simple.predict(X_train)
rf_val_preds = grid_search_simple.predict(X_val)
rf_test_preds = grid_search_simple.predict(X_test)

rf_accuracy_train = accuracy_score(y_train, rf_train_preds)
rf_f1_train = f1_score(y_train, rf_train_preds, average='weighted')

rf_accuracy_val = accuracy_score(y_val, rf_val_preds)
rf_f1_val = f1_score(y_val, rf_val_preds, average='weighted')

rf_accuracy_test = accuracy_score(y_test, rf_test_preds)
rf_f1_test = f1_score(y_test, rf_test_preds, average='weighted')

print(f"Train Set - Accuracy: {rf_accuracy_train:.4f}, F1-Score: {rf_f1_train:.4f}")
print(f"Validation Set - Accuracy: {rf_accuracy_val:.4f}, F1-Score: {rf_f1_val:.4f}")
print(f"Test Set - Accuracy: {rf_accuracy_test:.4f}, F1-Score: {rf_f1_test:.4f}")

Train Set - Accuracy: 0.8564, F1-Score: 0.8155
Validation Set - Accuracy: 0.8217, F1-Score: 0.7634
Test Set - Accuracy: 0.8249, F1-Score: 0.7661


> Results: The RandomForestClassifier outperforms the baseline in all metrics.

<hr>

### G.5 Business Impact from Current Model Performance

In [24]:
avg_subscription_fee = X_test['MonthlyCharges'].mean()
discount_per_month = (X_test['MonthlyCharges'].mean())*0.5
discount_duration_months = 3
retention_offer_cost_per_customer = discount_per_month * discount_duration_months  
lost_revenue_per_churn = X_test['MonthlyCharges'].mean()

predicted_churners = sum(rf_test_preds == 1)  
actual_churners = sum((y_val == 1).values)  
false_negatives = actual_churners - predicted_churners  

total_intervention_cost = predicted_churners * retention_offer_cost_per_customer
total_lost_revenue_from_false_negatives = false_negatives * lost_revenue_per_churn

print(f"Total predicted churners: {predicted_churners}")
print(f"Total cost of retention interventions: ${total_intervention_cost}")
print(f"Total lost revenue from undetected churners (false negatives): ${total_lost_revenue_from_false_negatives}")


Total predicted churners: 169
Total cost of retention interventions: $3168.337208113153
Total lost revenue from undetected churners (false negatives): $[13235.77555579]


> Results: Total 169 churners were detected, and cost of retaining these customers who will churn will be $3168.

<hr>

## H. Experiment Outcomes

In [21]:
final_experiment_outcome = 'Hypothesis Confirmed'

> Key Learnings: The RandomForestClassifier outperformed the baseline model across all datasets, particularly precision and F1-score. This suggests that Random Forest captures more patterns in the data compared to the baseline model, which simply predicts the most frequent class.

> Recommendations for Next Experiment:  Experimenting with ensemble methods such as XGBoost.

<hr>

<a style='text-decoration:none;line-height:16px;display:flex;color:#5B5B62;padding:10px;justify-content:end;' href='https://deepnote.com?utm_source=created-in-deepnote-cell&projectId=803b396d-170a-478e-bdcc-2487a2a4ebcf' target="_blank">
 </img>
Created in <span style='font-weight:600;margin-left:4px;'>Deepnote</span></a>