# <ins>Workshop 5</ins>

# Challenge: Assigning a Churn Risk Score to Enhance Customer Retention

<ins>**Overview:**</ins>

In the fast-moving digital services market, customer retention is critical: high churn rates can significantly impede sustainability and growth. This project uses machine learning to analyze customer behavior and interactions and assigns each customer a churn risk score, so that potential churn can be predicted and mitigated. The score supports targeted retention strategies intended to improve customer loyalty and reduce turnover.

<ins>**Problem Statement:**</ins>

The main objective of this challenge is to develop a predictive model capable of calculating a churn risk score from 1 to 5 for each customer, based on their interaction and transaction data. This score will help identify customers who are at risk of ending their services, allowing the business to proactively engage with them through tailored retention efforts. By understanding the patterns that lead to customer churn, companies can take actionable steps to retain their most vulnerable customers and thereby improve their overall retention rates.

<img src="https://www.rulex.ai/wp-content/uploads/2022/10/predicting-customer-churn-machinelearning.png" width="650" height="515" alt="Predicting customer churn with machine learning">

## Dataset description

<ins>**The dataset folder contains the following files:**</ins>

* train.csv: 25080 rows x 25 columns
* test.csv: 10749 rows x 24 columns
* sample_submission.csv: 5 rows x 2 columns

<ins>**The columns provided in the dataset are as follows:**</ins>

* customer_id - Represents the unique identification number of a customer
* Name - Represents the name of a customer
* age - Represents the age of a customer
* gender - Represents the gender of a customer
* security_no - Represents a unique security number that is used to identify a person
* region_category - Represents the region that a customer belongs to
* membership_category - Represents the category of the membership that a customer is using
* joining_date - Represents the date when a customer became a member
* joined_through_referral - Represents whether a customer joined using any referral code or ID
* referral_id - Represents a referral ID
* preferred_offer_types - Represents the type of offer that a customer prefers
* medium_of_operation - Represents the medium of operation that a customer uses for transactions
* internet_option - Represents the type of internet service a customer uses
* last_visit_time - Represents the last time a customer visited the website
* days_since_last_login - Represents the number of days since a customer last logged into the website
* avg_time_spent - Represents the average time spent by a customer on the website
* avg_transaction_value - Represents the average transaction value of a customer
* avg_frequency_login_days - Represents the number of times a customer has logged in to the website
* points_in_wallet - Represents the points awarded to a customer on each transaction
* used_special_discount - Represents whether a customer uses special discounts offered
* offer_application_preference - Represents whether a customer prefers offers
* past_complaint - Represents whether a customer has raised any complaints
* complaint_status - Represents whether the complaints raised by a customer were resolved
* feedback - Represents the feedback provided by a customer
* churn_risk_score - Represents the churn risk score that ranges from 1 to 5

## Objectives

* **Understand Key Factors Influencing Churn**: Identify which features most significantly predict customer churn to focus retention efforts effectively.

* **Segment High-Risk Customers**: Pinpoint customer segments with higher churn scores to tailor specific engagement strategies.

* **Predict Customer Churn Scores**: Develop a predictive model that assigns a churn risk score based on customer data, aiding in early intervention.

* **Actionable Retention Strategies**: Use model insights to implement targeted actions designed to enhance customer loyalty and reduce churn rates.

## Submission

<ins>**Evaluation metric**</ins>

`score = 100 * metrics.f1_score(actual, predicted, average="macro")`
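To make the metric concrete, the following is a minimal pure-Python sketch of what a macro-averaged F1 computes: the F1 of each class is calculated separately and then averaged with equal weight, so rare risk scores count as much as common ones. The helper name `macro_f1` and the toy labels are illustrative assumptions. The official score uses scikit-learn's `metrics.f1_score`, as stated above.

```python
def macro_f1(actual, predicted):
    """Unweighted mean of per-class F1 over every class seen in either list."""
    labels = sorted(set(actual) | set(predicted))
    f1_per_class = []
    for label in labels:
        pairs = list(zip(actual, predicted))
        tp = sum(1 for a, p in pairs if a == label and p == label)
        fp = sum(1 for a, p in pairs if a != label and p == label)
        fn = sum(1 for a, p in pairs if a == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); fall back to 0 if the class never appears
        f1_per_class.append(2 * tp / denom if denom else 0.0)
    return sum(f1_per_class) / len(f1_per_class)


# Toy labels (hypothetical, not taken from the dataset)
actual = [1, 2, 3, 3, 4, 5]
predicted = [1, 2, 3, 4, 4, 5]

score = 100 * macro_f1(actual, predicted)
print(round(score, 2))  # 86.67
```

One misclassified sample lowers the F1 of two classes here (3 and 4), which illustrates why per-class errors matter more under macro averaging than under plain accuracy.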

<ins>**Result submission guidelines**</ins>

- The index is customer_id and the target is the churn_risk_score column.
- The submission file must be submitted in .csv format only.
- The size of this submission file must be 10749 x 1 (one churn_risk_score column, with customer_id as the index).
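The guidelines above can be sketched with pandas as follows. This is only an illustration under stated assumptions: the IDs and predictions are hypothetical stand-ins for real model output, and the CSV is written to an in-memory buffer rather than to disk.

```python
import io

import pandas as pd

# Hypothetical IDs and predictions standing in for real model output
customer_ids = ["id_a", "id_b", "id_c"]
predictions = [3, 4, 1]

# customer_id is the index, churn_risk_score is the only column
submission = pd.DataFrame(
    {"churn_risk_score": predictions},
    index=pd.Index(customer_ids, name="customer_id"),
)

# Shape is (n_rows, 1) because the index is not counted as a column
print(submission.shape)

# Write to a buffer here; on disk this would be something like
# submission.to_csv("Predictions_Martina_Daniel.csv")
buffer = io.StringIO()
submission.to_csv(buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

With the real test set, the same construction would yield a frame of shape (10749, 1), while the CSV file itself contains both the customer_id and churn_risk_score fields.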

<ins>**Upload your CSV file, named with your names, to the following Drive folder**</ins>

Example: Predictions_Martina_Daniel.csv

https://drive.google.com/drive/folders/1FfUT50oAzMlBXf0SL5J16G0lwStAQ7Vs?usp=sharing


In [None]:
import pandas as pd

In [None]:
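# Load the training and test sets (paths are relative to the notebook)
data_train = pd.read_csv('dataset/train.csv')
data_test = pd.read_csv('dataset/test.csv')

# Preview the first five rows of each set
display(data_train.head())
display(data_test.head())

# Check the shapes: train has one extra column, churn_risk_score
print(data_train.shape)
print(data_test.shape)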

| | customer_id | Name | age | gender | security_no | region_category | membership_category | joining_date | joined_through_referral | referral_id | ... | avg_time_spent | avg_transaction_value | avg_frequency_login_days | points_in_wallet | used_special_discount | offer_application_preference | past_complaint | complaint_status | feedback | churn_risk_score |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | fffe43004900440031003400390031003800 | Yolonda Macrae | 29 | F | V1GWL9V | City | Gold Membership | 2016-04-29 | No | xxxxxxxx | ... | 30.04 | 2442.28 | 10.0 | 749.99 | Yes | No | Yes | Solved in Follow-up | Poor Website | 3 |
| 1 | fffe43004900440035003400390039003300 | Faustina Fortney | 51 | M | TSOIJHD | | No Membership | 2015-07-21 | ? | CID14933 | ... | 32.43 | 39295.25 | 22.0 | 624.59 | Yes | No | No | Not Applicable | No reason specified | 4 |
| 2 | fffe43004900440031003200310034003700 | Katrina Ormsby | 17 | M | OWT9O81 | City | Gold Membership | 2017-01-02 | No | xxxxxxxx | ... | -664.450276 | 27455.61 | 8.0 | 767.9 | Yes | No | Yes | No Information Available | Quality Customer Care | 1 |
| 3 | fffe4300490044003100340030003000 | Danny Earp | 39 | F | RSJ3WS2 | Town | Gold Membership | 2017-08-29 | No | xxxxxxxx | ... | 92.4 | 6839.27 | 5.0 | 717.91 | No | Yes | No | Not Applicable | User Friendly Website | 2 |
| 4 | fffe43004900440035003300300035003400 | Marilynn Arboleda | 19 | F | MV340UY | Town | Silver Membership | 2017-02-19 | Yes | CID40828 | ... | 147.37 | 3168.96 | 7.0 | 1145.343275 | Yes | No | No | Not Applicable | Poor Product Quality | 3 |


| | customer_id | Name | age | gender | security_no | region_category | membership_category | joining_date | joined_through_referral | referral_id | ... | days_since_last_login | avg_time_spent | avg_transaction_value | avg_frequency_login_days | points_in_wallet | used_special_discount | offer_application_preference | past_complaint | complaint_status | feedback |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | fffe43004900440036003200330035003700 | Sharri Bevel | 54 | F | CDRMCUM | City | Basic Membership | 2015-10-30 | No | xxxxxxxx | ... | 15 | 225.39 | 19197.37 | 12.0 | 581.98 | No | Yes | No | Not Applicable | Poor Product Quality |
| 1 | fffe43004900440036003300310035003000 | Kelsie Godbout | 12 | F | VRTMJMK | City | Silver Membership | 2016-03-13 | Yes | CID52941 | ... | 20 | 228.3 | 45714.19 | 17.0 | | Yes | No | No | Not Applicable | Poor Product Quality |
| 2 | fffe43004900440032003300330033003500 | Frederica Abel | 14 | M | CDAIQNS | Town | Basic Membership | 2016-10-08 | No | xxxxxxxx | ... | 22 | 151.88 | 7647.54 | 25.0 | 661.78 | No | Yes | Yes | No Information Available | No reason specified |
| 3 | fffe43004900440032003200390035003600 | Ranee Benjamin | 22 | M | D61MX1W | Town | No Membership | 2017-07-09 | No | xxxxxxxx | ... | 13 | 30.41 | 20749.5 | 29.0 | 169.409762 | No | Yes | Yes | No Information Available | Poor Product Quality |
| 4 | fffe43004900440033003200300039003200 | Livia Sarmiento | 50 | F | UX3O9WM | City | Silver Membership | 2017-12-04 | Yes | CID10753 | ... | 10 | 109.35 | 25880.01 | 6.0 | 718.07 | Yes | No | Yes | Solved in Follow-up | No reason specified |


(25080, 25)
(10749, 24)
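The preview above shows a few values that look like placeholders rather than real data: `?` in joined_through_referral, `xxxxxxxx` in referral_id, and a negative avg_time_spent. A possible first cleaning pass is sketched below on a small frame that mirrors those rows. Treating these values as missing is an assumption, not something the dataset description states, and the variable names `sample` and `cleaned` are illustrative.

```python
import pandas as pd

# Small frame mirroring values visible in the training preview
sample = pd.DataFrame({
    "joined_through_referral": ["No", "?", "No"],
    "referral_id": ["xxxxxxxx", "CID14933", "xxxxxxxx"],
    "avg_time_spent": [30.04, 32.43, -664.450276],
    "joining_date": ["2016-04-29", "2015-07-21", "2017-01-02"],
})

cleaned = sample.copy()

# Assumption: "?" means the referral status is unknown
cleaned["joined_through_referral"] = cleaned["joined_through_referral"].mask(
    cleaned["joined_through_referral"] == "?"
)
# Assumption: "xxxxxxxx" means no referral ID was recorded
cleaned["referral_id"] = cleaned["referral_id"].mask(
    cleaned["referral_id"] == "xxxxxxxx"
)
# Assumption: a negative average time is invalid and should become missing
cleaned["avg_time_spent"] = cleaned["avg_time_spent"].where(
    cleaned["avg_time_spent"] >= 0
)
# Parse dates so that features such as membership length can be derived later
cleaned["joining_date"] = pd.to_datetime(cleaned["joining_date"])

# Count the missing values introduced per column
print(cleaned.isna().sum())
```

Whether to impute, drop, or keep these values as their own category is a modeling decision; the sketch only makes the suspicious entries explicit so they can be handled consistently in train and test.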
