# **Customer Travel Experience**


## Business Problem 

Objective:
The Tour & Travels company is experiencing challenges in retaining its customers, which leads to significant financial losses. The company seeks to predict customer churn based on various factors, including demographic information, service usage, and engagement with the company's offerings. By developing a predictive model, the company aims to proactively identify customers at risk of churning and implement targeted retention strategies to reduce customer attrition and save costs.

Dataset Overview:
The dataset contains the following features:

- Age: The age of the customer.
- FrequentFlyer: Indicates whether the customer is enrolled in a frequent flyer program.
- AnnualIncomeClass: The income category of the customer.
- ServicesOpted: The number of services the customer has opted for (stored as an integer).
- AccountSyncedToSocialMedia: Indicates whether the customer's account is synced to social media.
- BookedHotelOrNot: Indicates whether the customer has booked a hotel through the company's platform.
- Target: The target variable indicating whether the customer churned (1) or not (0).

Goal:
The goal is to build a robust predictive model using machine learning techniques to accurately classify customers as potential churners or loyal customers. This model will help the Tour & Travels company implement preemptive measures to retain customers and minimize revenue loss.

Expected Outcome:
The project aims to deliver a predictive model that:

- Identifies key indicators contributing to customer churn.
- Provides actionable insights for customer retention strategies.
- Improves the company's ability to retain customers, ultimately increasing profitability.

This solution will empower the company to allocate resources more effectively and enhance overall customer satisfaction.


## 1. Business Understanding

The primary objective of this project is to reduce customer churn by leveraging data analytics and predictive modeling. By accurately predicting which customers are at risk of churning, the company can:

- Implement Targeted Retention Campaigns: Develop personalized offers and services for at-risk customers to improve retention.
- Optimize Marketing Spend: Allocate resources more efficiently by focusing on customers with a higher likelihood of staying.
- Increase Customer Lifetime Value (CLV): Retain customers for a longer duration, thereby enhancing their overall contribution to the company's revenue.

Key Success Metrics: To measure the success of this project, the following key performance indicators (KPIs) will be considered:

- Churn Prediction Accuracy: The ability of the predictive model to correctly identify customers who will churn.
- Reduction in Churn Rate: A decrease in the overall churn rate after implementing the model's insights.
- Retention Campaign ROI: The return on investment from retention campaigns driven by the model's predictions.
- Customer Lifetime Value (CLV) Improvement: An increase in the average CLV as a result of reduced churn.

Strategic Impact: Successfully predicting customer churn and taking proactive measures will enable the company to:

- Enhance customer loyalty and satisfaction.
- Maintain a competitive edge in the market by reducing customer attrition.
- Improve overall business profitability by retaining more customers and reducing the costs associated with acquiring new ones.



## 2. Data Understanding

The dataset describes customers of a Tour & Travels company and is intended to help predict whether a customer will churn. The features in the dataset include:

- Age: The age of the customer.
- FrequentFlyer: Indicates whether the customer is enrolled in a frequent flyer program (Yes, No, or No Record).
- AnnualIncomeClass: The income category of the customer (Low Income, Middle Income, or High Income).
- ServicesOpted: The number of services the customer has opted for (an integer from 1 to 6 in this dataset).
- AccountSyncedToSocialMedia: Indicates whether the customer's account is linked to social media (Yes/No).
- BookedHotelOrNot: Indicates whether the customer has booked a hotel through the company's platform (Yes/No).
- Target: The target variable indicating whether the customer churned (1) or not (0).

This dataset will be used to develop a predictive model that identifies customers at risk of churning, so the company can take preventive action.


## 3. Data Preparation

In [1]:
import pandas as pd 
import numpy as np
from scipy import stats  # bind the scipy.stats submodule, not the top-level scipy package
from scipy.stats import zscore
import matplotlib.pyplot as plt
import seaborn as sns

In [2]:
data = pd.read_csv("Customertravel.csv")
data.head()

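|   | Age | FrequentFlyer | AnnualIncomeClass | ServicesOpted | AccountSyncedToSocialMedia | BookedHotelOrNot | Target |
|---|-----|---------------|-------------------|---------------|----------------------------|------------------|--------|
| 0 | 34 | No | Middle Income | 6 | No | Yes | 0 |
| 1 | 34 | Yes | Low Income | 5 | Yes | No | 1 |
| 2 | 37 | No | Middle Income | 3 | Yes | No | 0 |
| 3 | 30 | No | Middle Income | 2 | No | No | 0 |
| 4 | 30 | No | Low Income | 1 | No | No | 0 |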


In [3]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 954 entries, 0 to 953
Data columns (total 7 columns):
 #   Column                      Non-Null Count  Dtype 
---  ------                      --------------  ----- 
 0   Age                         954 non-null    int64 
 1   FrequentFlyer               954 non-null    object
 2   AnnualIncomeClass           954 non-null    object
 3   ServicesOpted               954 non-null    int64 
 4   AccountSyncedToSocialMedia  954 non-null    object
 5   BookedHotelOrNot            954 non-null    object
 6   Target                      954 non-null    int64 
dtypes: int64(3), object(4)
memory usage: 52.3+ KB


In [4]:
data.shape

(954, 7)

In [5]:
data.describe()

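|       | Age | ServicesOpted | Target |
|-------|-----|---------------|--------|
| count | 954.0 | 954.0 | 954.0 |
| mean | 32.109015 | 2.437107 | 0.234801 |
| std | 3.337388 | 1.606233 | 0.424097 |
| min | 27.0 | 1.0 | 0.0 |
| 25% | 30.0 | 1.0 | 0.0 |
| 50% | 31.0 | 2.0 | 0.0 |
| 75% | 35.0 | 4.0 | 0.0 |
| max | 38.0 | 6.0 | 1.0 |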


In [6]:
data["Age"].mean()

32.109014675052414

In [7]:
data['FrequentFlyer'].value_counts()

FrequentFlyer
No           608
Yes          286
No Record     60
Name: count, dtype: int64
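
The counts above show a third value, `No Record`, alongside `Yes` and `No`. Because it is stored as an ordinary string, a null check such as `isna()` will not flag it. The sketch below shows two possible ways to handle it: treat it as missing, or keep it as its own category. The `sample` frame is a hypothetical stand-in for `data`, and which option is appropriate is a modeling decision the notebook does not settle.

```python
import pandas as pd

# Hypothetical sample containing the three FrequentFlyer values seen above.
sample = pd.DataFrame({"FrequentFlyer": ["No", "Yes", "No Record", "No"]})

# Option 1: mask "No Record" so it becomes a missing value visible to isna().
as_missing = sample["FrequentFlyer"].mask(sample["FrequentFlyer"] == "No Record")
missing_count = int(as_missing.isna().sum())

# Option 2: keep "No Record" as an explicit third category level.
as_category = sample["FrequentFlyer"].astype("category")
category_levels = sorted(as_category.cat.categories)

print(missing_count)
print(category_levels)
```

Keeping the value as a category preserves all 954 rows; masking it makes the gap explicit but then requires an imputation or row-dropping step.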

In [8]:
data['AnnualIncomeClass'].value_counts()

AnnualIncomeClass
Middle Income    409
Low Income       386
High Income      159
Name: count, dtype: int64
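
The three income classes have a natural order (Low, Middle, High), which a model can only use if the encoding preserves it. A minimal sketch of one way to do this with an ordered categorical follows. It assumes the order listed in `income_order` is the intended one; the `sample` frame and the `IncomeRank` column name are hypothetical.

```python
import pandas as pd

# Assumed ordering of the three classes shown in the value_counts() output.
income_order = ["Low Income", "Middle Income", "High Income"]

# Hypothetical sample standing in for data["AnnualIncomeClass"].
sample = pd.DataFrame(
    {"AnnualIncomeClass": ["Middle Income", "Low Income", "High Income"]}
)

# Build an ordered categorical and expose its integer codes (0 = lowest class).
ordered = pd.Categorical(
    sample["AnnualIncomeClass"], categories=income_order, ordered=True
)
sample["IncomeRank"] = ordered.codes

print(sample)
```

Any value not listed in `income_order` would receive the code -1, which makes unexpected labels easy to spot.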

In [9]:
data['BookedHotelOrNot'].value_counts()

BookedHotelOrNot
No     576
Yes    378
Name: count, dtype: int64

In [10]:
data['Target'].value_counts()

Target
0    730
1    224
Name: count, dtype: int64
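
These counts show that the target is imbalanced, which matters when judging prediction accuracy. The sketch below uses only the two counts printed above to compute the churn rate and the accuracy of a trivial model that always predicts the majority class (no churn). The variable names are illustrative.

```python
# Class counts copied from the value_counts() output above.
target_counts = {0: 730, 1: 224}

total = sum(target_counts.values())

# Share of customers who churned (Target == 1).
churn_rate = target_counts[1] / total

# Accuracy of a trivial model that always predicts the majority class.
majority_baseline = max(target_counts.values()) / total

print(f"churn rate: {churn_rate:.3f}")
print(f"majority-class baseline accuracy: {majority_baseline:.3f}")
```

A model is only useful if it beats this baseline, and because the baseline never identifies a single churner, recall and precision on class 1 are worth reporting alongside accuracy.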

### Data Cleaning

In [11]:
data.isna().sum()

Age                           0
FrequentFlyer                 0
AnnualIncomeClass             0
ServicesOpted                 0
AccountSyncedToSocialMedia    0
BookedHotelOrNot              0
Target                        0
dtype: int64

In [12]:
data.duplicated()

0      False
1      False
2      False
3      False
4      False
       ...  
949     True
950    False
951     True
952     True
953     True
Length: 954, dtype: bool
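
The boolean Series above marks each row that repeats an earlier one, but it does not say how many there are. A short sketch of counting and optionally dropping duplicates is shown below on a hypothetical `sample` frame, not the real `data`. Note that the dataset has no customer identifier, so identical rows may be distinct customers who share the same attributes; dropping them is an assumption, not a given.

```python
import pandas as pd

# Hypothetical sample: the last row repeats the first one exactly.
sample = pd.DataFrame(
    {
        "Age": [34, 34, 30, 34],
        "FrequentFlyer": ["No", "Yes", "No", "No"],
        "Target": [0, 1, 0, 0],
    }
)

# Count rows that are exact repeats of an earlier row.
duplicate_count = int(sample.duplicated().sum())

# Drop the repeats, keeping the first occurrence, and renumber the index.
deduplicated = sample.drop_duplicates().reset_index(drop=True)

print(duplicate_count)
print(deduplicated.shape)
```

Applied to the notebook's frame, the same calls would be `data.duplicated().sum()` and `data.drop_duplicates()`.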

In [13]:
data.columns

Index(['Age', 'FrequentFlyer', 'AnnualIncomeClass', 'ServicesOpted',
       'AccountSyncedToSocialMedia', 'BookedHotelOrNot', 'Target'],
      dtype='object')