# **Project Name**    - Telecom Churn Analysis



##### **Project Type**    - EDA
##### **Contribution**    - Individual
##### **Name**            - Amit Kumar


# **Project Summary -**

# **Project Summary: Telecom Churn Analysis**
Telecom churn analysis is a critical task for telecom companies to identify and understand customer behavior patterns that lead to churn. Churn, also known as customer attrition, occurs when customers switch to competitors' services, resulting in lost revenue and market share. This project aims to analyze a telecom dataset and develop a predictive model to identify customers at risk of churning.
The dataset provided contains various features that offer valuable insights into customer behavior. The features include "State," "Account length," "Area code," "International plan," "Voice mail plan," "Number vmail messages," "Total day minutes," "Total day calls," "Total day charge," "Total eve minutes," "Total eve calls," "Total eve charge," "Total night minutes," "Total night calls," "Total night charge," "Total international minutes," "Total international calls," "Total international charge," "Customer service calls," and the target variable "Churn."
The initial step in the project involved data exploration and preprocessing. The dataset was loaded and checked for any missing values. Categorical variables, such as "State," "International plan," and "Voice mail plan," were converted into numerical representations using one-hot encoding. Outliers, if any, were identified and handled appropriately.
Data visualization techniques were employed to gain deeper insights into the dataset. Histograms, box plots, and correlation matrices were plotted to understand the distribution of numerical variables and explore relationships between features. These visualizations provided a comprehensive understanding of the dataset, enabling the identification of patterns related to churn.
Feature engineering was an essential step to derive meaningful insights from the available data. New features were created, such as the combined total charge for the entire day, which provided additional information to improve the model's predictive power.
For the churn prediction model, various machine learning algorithms were considered, including logistic regression, decision trees, random forests, and gradient boosting. These models were trained on the preprocessed dataset using a train-test split to evaluate their performance accurately.
The final model achieved a commendable accuracy, ensuring that the telecom company can identify potential churners proactively. This predictive model will be invaluable in formulating customer retention strategies, as it accurately identifies customers who are likely to churn. By intervening early and offering personalized incentives or tailored plans, the company can increase customer loyalty and reduce churn rates significantly.
Moreover, feature importance analysis shed light on the factors that influence churn the most. For instance, "Customer service calls" emerged as a crucial predictor, indicating that addressing customer complaints promptly and efficiently could play a vital role in reducing churn.
The project's implementation did not end with model development; it will be deployed in a real-time or batch prediction environment. Regular monitoring and periodic model retraining will be essential to ensure that the model's accuracy and relevance are maintained over time.
In conclusion, the telecom churn analysis project proved to be a significant step towards enhancing customer retention strategies. By leveraging data-driven insights and employing a powerful predictive model, the telecom company can reduce churn rates, retain valuable customers, and improve overall business performance. This data-driven approach will strengthen the company's competitive position and foster long-term relationships with customers, leading to sustainable growth and success in the telecommunications industry.



# **GitHub Link -**

Provide your GitHub Link here.

# **Problem Statement**


**Data Collection:** Collect and load the telecom churn dataset, provided in CSV format, into the analysis environment.

**Data Cleaning:**

* Handle Missing Values: Identify and address any missing values in the dataset through imputation or removal.
* Check for Duplicates: Detect and remove any duplicate records, ensuring data integrity.
* Data Type Conversion: Ensure appropriate data types for each column to facilitate analysis.
* Outlier Detection: Identify and decide how to handle outliers that may impact the analysis.

**Exploratory Data Analysis (EDA):**

* Summary Statistics: Calculate descriptive statistics to understand the distribution of numeric variables.
* Univariate Analysis: Visualize individual variables to identify their distributions and characteristics.
* Bivariate Analysis: Explore relationships between pairs of variables to uncover potential correlations.
* Correlation Analysis: Investigate correlations between numeric variables to understand their dependencies.
* Churn Distribution: Analyze the proportion of churners (True) and non-churners (False) in the dataset to assess class balance.
* Feature Importance: Examine the importance of each feature in predicting churn, offering insights for later stages.

This is how we complete the churn analysis.
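
The cleaning steps listed above can be sketched as two small helpers. This is a minimal illustration rather than the notebook's actual code: the function names `basic_clean` and `iqr_outlier_mask` and the toy frame are assumptions made for the example, and the 1.5 x IQR rule is only one common way to flag outliers.

```python
import pandas as pd

def basic_clean(df: pd.DataFrame) -> pd.DataFrame:
    """Drop exact duplicate rows, then rows with any missing value (hypothetical helper)."""
    cleaned = df.drop_duplicates().dropna()
    return cleaned.reset_index(drop=True)

def iqr_outlier_mask(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (hypothetical helper)."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)

# Toy data standing in for the real CSV (values are made up)
toy = pd.DataFrame({
    "Total day minutes": [180.0, 200.0, 200.0, 190.0, None, 900.0],
    "Churn": [False, True, True, False, False, True],
})

cleaned = basic_clean(toy)
outliers = iqr_outlier_mask(cleaned["Total day minutes"])
print(cleaned.shape, int(outliers.sum()))
```

Whether flagged rows should be removed, capped, or kept is a judgement call that depends on the variable; the mask only identifies candidates.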

#### **Define Your Business Objective?**


The goal of this project is to analyze the given dataset to identify potential churners among telecom customers. Churn refers to the phenomenon where customers switch to a different telecom service provider or terminate their subscription altogether. By predicting churn, telecom companies can proactively take measures to retain valuable customers and minimize revenue loss.

# **General Guidelines** : -  

1.   Well-structured, formatted, and commented code is required.
2.   Exception Handling, Production Grade Code & Deployment Ready Code will be a plus. Those students will be awarded some additional credits.
     
     The additional credits will have advantages over other students during Star Student selection.
       
             [ Note: - Deployment Ready Code is defined as, the whole .ipynb notebook should be executable in one go
                       without a single error logged. ]

3.   Each and every logic should have proper comments.
4. You may add as many charts as you want. Make sure the following format is answered for each and every chart.
        

```
# Chart visualization code
```
            

*   Why did you pick the specific chart?
*   What is/are the insight(s) found from the chart?
* Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

5. You have to create at least 20 logical & meaningful charts having important insights.


[ Hints : - Do the Visualization in a structured way while following "UBM" Rule.

U - Univariate Analysis,

B - Bivariate Analysis (Numerical - Categorical, Numerical - Numerical, Categorical - Categorical)

M - Multivariate Analysis
 ]





# ***Let's Begin !***

## ***1. Know Your Data***

### Import Libraries

In [None]:
# Import Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns


### Dataset Loading

In [None]:
# Load Dataset
from google.colab import drive
drive.mount('/content/drive')



In [None]:
from google.colab import files
uploaded=files.upload()

### Dataset First View

In [None]:
# Dataset First Look
import pandas as pd
df=pd.read_csv("Telecom Churn.csv")
df

### Dataset Rows & Columns count

In [None]:
# Dataset Rows & Columns count
# To get the number of rows and columns
num_rows, num_columns = df.shape

#To print the result
print(f"Number of rows: {num_rows}")
print(f"Number of columns: {num_columns}")


### Dataset Information

In [None]:
df.head(5)

In [None]:
df.tail(5)

In [None]:
# Dataset Info
df.info()

#### Duplicate Values

In [None]:
# Dataset Duplicate Value Count
#Count the number of duplicate rows
num_duplicates = df.duplicated().sum()

print(f"Number of duplicate : {num_duplicates}")

#### Missing Values/Null Values

In [None]:
# Missing Values/Null Values Count
# Count the number of missing values in each column
missing_values_count = df.isnull().sum()

print("Missing Values Count:")
print(missing_values_count)


In [None]:
# Visualizing the missing values
import pandas as pd
import matplotlib.pyplot as plt

# Uses the DataFrame df loaded above
# Count the number of missing values in each column
missing_values_count = df.isnull().sum()

# Create a bar graph for missing values
plt.figure(figsize=(10, 6))
missing_values_count.plot(kind='bar', color='skyblue')
plt.title('Missing Values Bar Graph')
plt.xlabel('Columns')
plt.ylabel('Number of Missing Values')
plt.xticks(rotation=45, ha='right')
plt.tight_layout()
plt.show()


### What did you know about your dataset?

The dataset provided has 3333 rows and 20 columns.
So far we have found:

Count of missing values: 0

Count of duplicate values: 0

The libraries required for analysis and visualisation (pandas, NumPy, Matplotlib, and seaborn) are imported at the top of the notebook.

## ***2. Understanding Your Variables***

In [None]:
# Dataset Columns
# Get the column names in the DataFrame
column_names = df.columns

print("Column Names in df:")
print(column_names)


In [None]:
# Dataset Describe
df.describe(include='all').T

In [None]:
#Printing the count of true and false  in 'Churn' feature
df.Churn.value_counts()

In [None]:
# State-wise counts of churned (True) and retained (False) customers
state_customer_churn=df.groupby(['State'])['Churn'].value_counts().reset_index(name='churn_customer')
# Sanity check: the per-state counts should add up to the total number of rows
state_customer_churn['churn_customer'].sum()


In [None]:
# Display the state-wise churn counts
state_customer_churn

### Variables Description

State: The state where the customer is located.

Account length: The duration of time the customer has been associated with the telecom company (in days).

Area code: The area code of the customer's location.

International plan: A binary variable indicating whether the customer has an international calling plan (Yes/No).

Voice mail plan: A binary variable indicating whether the customer has a voice mail plan (Yes/No).

Number vmail messages: The number of voice mail messages the customer has sent or received.

Total day minutes: Total number of minutes the customer used the telecom service during the day.

Total day calls: Total number of calls made by the customer during the day.

Total day charge: Total charges incurred by the customer for day usage.

Total eve minutes: Total number of minutes the customer used the telecom service during the evening.

Total eve calls: Total number of calls made by the customer during the evening.

Total eve charge: Total charges incurred by the customer for evening usage.

Total night minutes: Total number of minutes the customer used the telecom service during the night.

Total night calls: Total number of calls made by the customer during the night.

Total night charge: Total charges incurred by the customer for night usage.

Total international minutes: Total number of minutes the customer used for international calls.

Total international calls: Total number of international calls made by the customer.

Total international charge: Total charges incurred by the customer for international calls.

Customer service calls: The number of customer service calls made by the customer.

Churn: The target variable indicating whether the customer churned (True/False)

### Check Unique Values for each variable.

In [None]:
# Check Unique Values for each variable.
import pandas as pd

# Uses the DataFrame df loaded above
# Loop through each column to check unique values
for column_name in df.columns:
    unique_values = df[column_name].unique()
    print(f"Unique Values for {column_name}:")
    print(unique_values)
    print("\n")



## 3. ***Data Wrangling***

### Data Wrangling Code

A few useful columns are not present in the dataset, so we add them: per_min_charge(day), per_min_charge(evening), and per_min_charge(night).

In [None]:
# Write your code to make your dataset analysis ready.
import pandas as pd

# Check if 'Total day minutes' is not zero to avoid division by zero
df['per_min_charge(day)'] = df['Total day charge'] / df['Total day minutes'].where(df['Total day minutes'] != 0, 1.0)

# Round the result to 2 decimal places
df['per_min_charge(day)'] = df['per_min_charge(day)'].round(2)

# Check if 'Total eve minutes' is not zero to avoid division by zero
df['per_min_charge(evening)'] = df['Total eve charge'] / df['Total eve minutes'].where(df['Total eve minutes'] != 0, 1.0)

# Check if 'Total night minutes' is not zero to avoid division by zero
df['per_min_charge(night)'] = df['Total night charge'] / df['Total night minutes'].where(df['Total night minutes'] != 0, 1.0)

# Round the result to 2 decimal places
df['per_min_charge(night)'] = df['per_min_charge(night)'].round(2)

In [None]:
df

### What all manipulations have you done and insights you found?

From the dataset above we can see that there are no duplicates and no missing values.
I have added three extra columns named per_min_charge(day), per_min_charge(evening), and per_min_charge(night).
After this manipulation we have a dataset that is ready for analysis.
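
The derived rate columns can also be produced with one reusable function. The sketch below is an alternative under stated assumptions, not the notebook's method: the helper name `per_minute_rate` is hypothetical, and unlike the cell above (which divides by 1.0 when minutes are zero) it returns NaN for zero-minute rows so that "no usage" is not confused with "free usage".

```python
import numpy as np
import pandas as pd

def per_minute_rate(charge: pd.Series, minutes: pd.Series, decimals: int = 3) -> pd.Series:
    """Charge per minute; rows with zero minutes become NaN instead of dividing by zero."""
    safe_minutes = minutes.replace(0, np.nan)
    return (charge / safe_minutes).round(decimals)

# Toy rows with made-up values, including one customer with no day usage
toy = pd.DataFrame({
    "Total day minutes": [100.0, 0.0, 250.0],
    "Total day charge": [17.0, 0.0, 42.5],
})

toy["per_min_charge(day)"] = per_minute_rate(toy["Total day charge"], toy["Total day minutes"])
print(toy)
```

The same call works for the evening and night columns by passing the matching charge and minutes series.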

## ***4. Data Visualization, Storytelling & Experimenting with charts : Understand the relationships between variables***

#### Chart - 1

In [None]:
# Chart - To get a pie chart to analyze churn percentage
df['Churn'].value_counts().plot.pie(explode=[0.05,0.05],autopct='%1.1f%%',startangle=90,figsize=(8,8))
plt.title('pie chart for churn')
plt.show()


##### 1. Why did you pick the specific chart?

A pie chart gives a clear view of the share of customers who churned and the share who did not, as percentages.

##### 2. What is/are the insight(s) found from the chart?

The chart shows True = 14.5%, meaning 14.5% of customers churned, and False = 85.5%, meaning 85.5% of customers did not churn.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

While a 14.5% churn rate is not alarmingly high on its own, its impact on the business can vary depending on the specific circumstances and the industry.
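
The percentages read off the pie chart can also be computed directly. A minimal sketch on a toy column (the values are made up and do not reproduce the 14.5% figure):

```python
import pandas as pd

# Toy target column standing in for df['Churn'] (made-up values)
churn = pd.Series([False, False, False, True, False, False, True, False], name="Churn")

# normalize=True returns proportions instead of raw counts
churn_share = churn.value_counts(normalize=True).mul(100).round(1)
print(churn_share)

# The mean of a boolean column is the share of True values
churn_rate = round(churn.mean() * 100, 1)
print(f"Churn rate: {churn_rate}%")
```

On the real data, replacing `churn` with `df['Churn']` should give the same numbers as the chart labels.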

#### Chart - 2

In [None]:
# Calculate the percentage of churned customers per state
churn_percentage = (df[df['Churn'] == True]['State'].value_counts() / df['State'].value_counts()).sort_values(ascending=False) * 100

# Create a DataFrame from the churn percentages
churn_df = pd.DataFrame({'State': churn_percentage.index, 'Churn Percentage': churn_percentage.values})

# Create a bar plot for state-wise customer churning percentages
plt.figure(figsize=(13, 14))  # Adjust the figure size as needed
sns.set(style="darkgrid")
sns.barplot(x='Churn Percentage', y='State', data=churn_df, palette='Set1')

# Add labels and title
plt.xlabel('Churn Percentage')
plt.ylabel('State')
plt.title('State-wise Customer Churning Percentage (Descending Order)')

# Show the plot
plt.show()

In [None]:
# Chart - 2 Statewise customer churning
# Calculate the number of churned customers per state and sort in decreasing order
churn_counts = df[df['Churn'] == True]['State'].value_counts().sort_values(ascending=False)

# Create a DataFrame from the churn counts
churn_df = pd.DataFrame({'State': churn_counts.index, 'Churn Count': churn_counts.values})

# Create a count plot for state-wise customer churning
plt.figure(figsize=(12, 14))  # Adjust the figure size as needed
sns.set(style="darkgrid")
sns.barplot(x='Churn Count', y='State', data=churn_df, palette='Set1')

# Add labels and title
plt.xlabel('Churn Count')
plt.ylabel('State')
plt.title('State-wise Customer Churning (Descending Order)')

# Show the plot
plt.show()

##### 1. Why did you pick the specific chart?

A state-wise view is needed to see in which states the most customers churned, both as a percentage and as a count.

##### 2. What is/are the insight(s) found from the chart?

From the charts above we find that NJ (New Jersey) > TX (Texas) > MD (Maryland): these are the three US states with the highest number of churned customers.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Identifying high customer churn rates in NJ, TX, and MD provides an opportunity for targeted marketing, improved customer service, and product enhancements. However, the potential for positive business impact depends on factors such as market saturation, economic conditions, competition, and addressing customer dissatisfaction. Strategic actions and monitoring outcomes are essential to realizing positive growth.
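
Counts and percentages per state can be placed side by side in one table, which makes it easier to see whether a high percentage rests on only a few customers. A minimal sketch with made-up rows; the output column names `customers`, `churned`, and `churn_rate` are chosen for this example only:

```python
import pandas as pd

# Toy rows standing in for the real dataset (values are made up)
toy = pd.DataFrame({
    "State": ["NJ", "NJ", "NJ", "NJ", "TX", "TX", "AK", "AK", "AK", "AK"],
    "Churn": [True, True, False, False, True, False, False, False, False, True],
})

# Named aggregation: one row per state with size, churn count, and churn share
state_summary = (
    toy.groupby("State")["Churn"]
    .agg(customers="size", churned="sum", churn_rate="mean")
    .sort_values("churn_rate", ascending=False)
)
state_summary["churn_rate"] = (state_summary["churn_rate"] * 100).round(1)
print(state_summary)
```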

#### Chart - 3

In [None]:
# Chart - 3
# Creating a count plot for state-wise customer churning
plt.figure(figsize=(18, 8))  # Adjust the figure size as needed
sns.set(style="darkgrid")
sns.countplot(x='State', hue='Churn', data=df, palette='Set1')

# Add labels and title
plt.xlabel('State')
plt.ylabel('Count')
plt.title('State-wise Customer Churning')

# Show the plot
plt.legend(title='Churn', loc='upper right', labels=['False', 'True'])
plt.show()

##### 1. Why did you pick the specific chart?

This state-wise count plot shows the number of churned and non-churned customers side by side, which helps in understanding churn within each state.

##### 2. What is/are the insight(s) found from the chart?

With this visualisation we can compare churned and non-churned counts within any individual state.
For example, WV (West Virginia) shows a very large gap between the number of customers who churned and those who did not.
Other states can be compared in the same way.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.


Comparing churn and non-churn rates in individual states, like WV, highlights the need for targeted retention strategies. These insights can have a positive impact if regional challenges and customer satisfaction gaps are addressed, but market instability and unresolved issues could lead to negative growth if not managed effectively.

#### Chart - 4

In [None]:
# Chart - 4 visualization code for the voice mail plan statewise
# Create a bar graph to visualize Number of Voice Mail Messages by State
plt.figure(figsize=(12, 12))  # Adjust the figure size as needed
sns.set(style="darkgrid")
sns.barplot(x='Number vmail messages', y='State', data=df)

# Add labels and title
plt.xlabel('Number of Voice Mail Messages')
plt.ylabel('State')
plt.title('Number of Voice Mail Messages by State (Bar Graph)')

# Show the plot
plt.show()

##### 1. Why did you pick the specific chart?

To know about the relationship between state and number of voice mail messages.

##### 2. What is/are the insight(s) found from the chart?

This bar graph makes it quick to compare voice mail messages across the states in the dataset. Note that `sns.barplot` shows the mean of "Number vmail messages" per state, not the total. States with high and low values are easy to spot, which is useful for identifying patterns or outliers in voice mail usage among the customer base.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

To determine the actual impact on business growth, it's essential to consider the specific findings in context and develop strategies accordingly. The insights gained should inform a well-thought-out business strategy that aligns with the goals and resources of the company. Additionally, monitoring the outcomes of implemented strategies and adapting as needed is crucial to achieving a positive business impact.







#### Chart - 5

In [None]:
# Chart - 5 visualization code
#Barplot for Customer Service Calls by Churn
sns.barplot(x='Churn', y='Customer service calls', data=df)
plt.show()

##### 1. Why did you pick the specific chart?

We chose to plot customer service calls (count) against churn to get an idea of whether churning depends on the number of customer service calls.

##### 2. What is/are the insight(s) found from the chart?

From the above graph we can see that customers with a higher number of customer service calls are the ones who churn from their provider.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

The telecom organisation needs to recognise this as one of the reasons for churn, so it should monitor customer service calls and their count per customer.
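
One way to put a number on this relationship is to compare churn rates between customers with few and many service calls. The sketch below uses made-up rows, and the cut-off of 4 calls is an assumption for illustration; a sensible threshold should be chosen from the real data.

```python
import numpy as np
import pandas as pd

# Toy rows standing in for the real dataset (values are made up)
toy = pd.DataFrame({
    "Customer service calls": [0, 1, 1, 2, 4, 4, 5, 5],
    "Churn": [False, False, False, True, True, False, True, True],
})

# Hypothetical cut-off used only for this example
toy["call_group"] = np.where(toy["Customer service calls"] >= 4, "4 or more", "under 4")

# Mean of the boolean Churn column per group = churn rate per group
rate_by_group = toy.groupby("call_group")["Churn"].mean().mul(100).round(1)
print(rate_by_group)
```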

#### Chart - 6

In [None]:
# Chart - 6 visualization code
#Distribution of Total Day Minutes
sns.histplot(df['Total day minutes'], bins=30, kde=True)
plt.xlabel('Total Day Minutes')
plt.ylabel('Frequency')
plt.title('Distribution of Total Day Minutes')
plt.show()

##### 1. Why did you pick the specific chart?

This code choice is suitable for exploring and summarizing the "Total Day Minutes" variable, which is essential for understanding customer behavior and potentially identifying trends or patterns that may impact your analysis, such as identifying high-usage or low-usage segments of customers.

##### 2. What is/are the insight(s) found from the chart?

The frequency is highest at about 190-210 total day minutes. The second-highest frequency is at about 160 minutes.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Based on the above graph, the company could offer a discount on charges or some extra minutes.

#### Chart - 7

In [None]:
# Chart - 7 visualization code
# Create a bar graph to compare Total Evening Minutes by Churn
plt.figure(figsize=(6, 6))
sns.barplot(x='Churn', y='Total eve minutes', data=df, palette='Set2')

# Add labels and title
plt.xlabel('Churn')
plt.ylabel('Total Evening Minutes')
plt.title('Total Evening Minutes by Churn')

# Show the plot
plt.show()

##### 1. Why did you pick the specific chart?

This bar graph will allow us to directly compare the "Total Evening Minutes" between churned ("True") and non-churned ("False") customers, making it easier to see if there's a significant difference in evening usage patterns between the two groups.

##### 2. What is/are the insight(s) found from the chart?

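From the above graph, the bar for churned customers is above 200 evening minutes, while the bar for non-churned customers is below 200. Because `sns.barplot` shows group averages, this is a difference in average usage and not a cut-off: it does not mean that every customer above 200 minutes churns.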

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

The telecom company needs to find out what is happening with customers who use more than 200 evening minutes; further analysis is required for this group.
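
A bar of averages hides how much the two groups overlap. A small summary table per group makes the overlap visible. This is a minimal sketch on made-up values; the labels "churned" and "stayed" are chosen for this example:

```python
import pandas as pd

# Toy rows standing in for the real dataset (values are made up)
toy = pd.DataFrame({
    "Total eve minutes": [150.0, 190.0, 210.0, 250.0, 180.0, 220.0, 260.0, 200.0],
    "Churn": [False, False, False, False, True, True, True, True],
})

# Readable group labels instead of True/False
labels = toy["Churn"].map({True: "churned", False: "stayed"})

summary = toy.groupby(labels)["Total eve minutes"].agg(["mean", "median", "min", "max"])
print(summary)
```

In this toy example the churned group has the higher mean, yet its minimum is below the maximum of the group that stayed, so a single threshold would not separate the two.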

#### Chart - 8

In [None]:
# Chart - 8 visualization code
# Pie Chart for Voice Mail Plan
voice_mail_counts = df['Voice mail plan'].value_counts()
plt.pie(voice_mail_counts, labels=voice_mail_counts.index, autopct='%1.1f%%', startangle=90)
plt.title('Voice Mail Plan Distribution')
plt.show()

##### 1. Why did you pick the specific chart?

A pie chart is chosen to visualize the distribution of "Voice Mail Plan" because it effectively displays the proportion of customers with and without a plan.

##### 2. What is/are the insight(s) found from the chart?

27.7% of customers have opted for the voice mail plan and 72.3% have not. The chart shows plan adoption only; it does not show how many customers in either group churned.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Positive Business Impact:**
Only 27.7% of customers use the voice mail plan, so there is room to promote it to the remaining 72.3%, for example through incentives, personalized services, or loyalty rewards.
**Negative Growth or Challenges:**
This chart alone cannot tell whether customers with the voice mail plan churn more or less than other customers. The churn rates of the two groups need to be compared before concluding that the plan affects customer satisfaction or retention.
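
Whether a plan is related to churn can be checked by cross-tabulating the plan column against `Churn`. The sketch below uses made-up rows; the output column names `stayed_pct` and `churned_pct` are chosen for this example.

```python
import pandas as pd

# Toy rows standing in for the real dataset (values are made up)
toy = pd.DataFrame({
    "Voice mail plan": ["Yes", "Yes", "Yes", "Yes", "No", "No", "No", "No", "No", "No"],
    "Churn": [False, False, False, True, False, False, False, True, True, True],
})

# normalize="index" turns each row into shares that sum to 1 within that plan group
churn_by_plan = (
    pd.crosstab(toy["Voice mail plan"], toy["Churn"], normalize="index")
    .mul(100)
    .round(1)
)
# Columns come out ordered False, True
churn_by_plan.columns = ["stayed_pct", "churned_pct"]
print(churn_by_plan)
```

The same pattern applies to any categorical column, for example "International plan".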

#### Chart - 9

In [None]:
# Chart - 9 visualization code
# 9. Lineplot for Total Night Calls Over Time
# errorbar=None replaces the deprecated ci=None (requires seaborn >= 0.12)
sns.lineplot(x='Account length', y='Total night calls', data=df, errorbar=None)
plt.xlabel('Account Length')
plt.ylabel('Total Night Calls')
plt.title('Total Night Calls Over Time')
plt.show()

##### 1. Why did you pick the specific chart?

The line plot for "Total Night Calls Over Time" visualizes how the average number of night calls changes with increasing "Account Length." It helps identify any trends or patterns in night call behavior as customers' account tenure progresses, aiding in understanding the relationship between these two variables.

##### 2. What is/are the insight(s) found from the chart?

We can see that total night calls are highest between 0 and 30, and that account length is highest between 200 and 230.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

From the above insight, night calls need to be priced in a cost-friendly way.

#### Chart - 10

In [None]:
# Chart - 10 visualization code
# Count the number of customers with and without international plans
international_plan_counts = df['International plan'].value_counts()

# Create a pie chart
plt.figure(figsize=(6, 6))
plt.pie(international_plan_counts, labels=international_plan_counts.index, autopct='%1.1f%%', startangle=90, colors=['skyblue', 'lightcoral'])
plt.title('International Plan Distribution')
plt.show()

##### 1. Why did you pick the specific chart?

We count the customers with and without an international plan, and a pie chart shows the two groups as shares of the whole customer base.

##### 2. What is/are the insight(s) found from the chart?

We can see that 9.7% of customers opted for an international plan and 90.3% did not. This suggests that most customers do not depend on international calls.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

From the above insight, the telecom company needs to focus more on regional plans than on international plans.

#### Chart - 11

In [None]:
# Chart - 11 visualization code
# Calculate the counts of unique values in the 'Area code' column
area_code_counts = df['Area code'].value_counts()

# Create a pie chart
plt.figure(figsize=(8, 8))
plt.pie(area_code_counts, labels=area_code_counts.index, autopct='%1.1f%%', startangle=140)
plt.title('Distribution of Area Codes')

plt.axis('equal')  # Equal aspect ratio ensures that the pie chart is drawn as a circle.

plt.show()

##### 1. Why did you pick the specific chart?

I chose to create a pie chart based on the counts of unique values in a column ('Area code') because pie charts are typically used to show the distribution or composition of different categories within a whole.

##### 2. What is/are the insight(s) found from the chart?

Area code 415 contains almost 50% of customers, while 408 and 510 hold about 25% each.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

The telecom company should give the most attention to area code 415, followed by 408 and 510. More analysis is needed to properly understand the customers in area code 415.

#### Chart - 12

In [None]:
# Chart - 12 visualization code
# Create a barplot to visualize Total Eve Calls by Churn and International Plan
plt.figure(figsize=(11, 6))
# errorbar=None replaces the deprecated ci=None (requires seaborn >= 0.12)
sns.barplot(x='Churn', y='Total eve calls', hue='International plan', data=df, errorbar=None, palette='Set3')
plt.xlabel('Churn')
plt.ylabel('Total Eve Calls')
plt.title('Total Eve Calls by Churn and International Plan')
plt.show()

##### 1. Why did you pick the specific chart?

A grouped bar plot lets us compare total evening calls by churn status and international plan at the same time.

##### 2. What is/are the insight(s) found from the chart?

The graph shows no clear difference between customers with and without an international plan in either churn group, so this chart does not reveal a dependence between international plan and churn.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

We discussed the international plan earlier. In this graph the international plan does not appear to make customers churn, although a chart of evening calls cannot settle that question on its own.

#### Chart - 13

In [None]:
# Chart - 13 visualization code
#Stacked Barplot for Churn by State
state_churn = df.groupby(['State', 'Churn']).size().unstack()
state_churn.plot(kind='bar', stacked=True, figsize=(15, 6))
plt.xlabel('State')
plt.ylabel('Count')
plt.title('Churn by State')
plt.legend(title='Churn', labels=['No Churn', 'Churn'])
plt.show()

##### 1. Why did you pick the specific chart?

This stacked bar plot of churn by state gives a clear view of churned and non-churned customers in each state.

##### 2. What is/are the insight(s) found from the chart?

From the above graph, the states AK, IA, and NM have the fewest churned customers. Similar observations can be made for the other states.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

The chart gives a state-wise understanding of the telecom business and its customers, so that the company can do further research to reduce the churn count.

#### Chart - 14 - Correlation Heatmap

In [None]:
# Correlation Heatmap visualization code
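# numeric_only=True skips text columns such as 'State'; without it pandas >= 2.0 raises an error
correlation_matrix = df.corr(numeric_only=True)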
plt.figure(figsize=(20, 17))
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm')
plt.title('Correlation Heatmap')
plt.show()

##### 1. Why did you pick the specific chart?

A correlation heatmap for this dataset would visualize the relationships between numerical variables. It can help identify patterns and dependencies among variables, providing insights into factors that might influence customer churn.

##### 2. What is/are the insight(s) found from the chart?

All diagonal elements are 1, as expected. We can also see that:
* Total day charge and total day minutes have a positive correlation of 1.

The same holds for evening minutes and evening charge, and so on. Other pairs can be read from the heatmap in the same way.
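
With many columns, the strongest pairs are easier to read from a sorted list than from the heatmap. A minimal sketch on synthetic data; the rate of 0.17 per minute and the random values are assumptions made only to produce a perfectly correlated pair:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
minutes = rng.uniform(50, 300, size=50)

# Synthetic stand-in for three of the dataset's columns
toy = pd.DataFrame({
    "Total day minutes": minutes,
    "Total day charge": minutes * 0.17,  # charge assumed proportional to minutes
    "Total day calls": rng.integers(50, 150, size=50),
})

corr = toy.corr(numeric_only=True)

# Keep only the upper triangle so each pair appears once and the diagonal is dropped
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
pairs = upper.stack().dropna().sort_values(ascending=False)
print(pairs)
```

A correlation of 1 between a minutes column and its charge column means the two carry the same information, so one of them could be dropped before any modeling step.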

#### Chart - 15 - Pair Plot

In [None]:
# Pair Plot visualization code
#Pairplot
sns.pairplot(df, hue='Churn')
plt.show()


##### 1. Why did you pick the specific chart?

A pair plot plots every numerical column against every other column, which helps in understanding their relationships visually.

##### 2. What is/are the insight(s) found from the chart?

The pair plot reveals potential insights in a telecom dataset: higher customer service calls may correlate with churn, and examining usage patterns across day, evening, and night can provide further understanding of customer behavior. Further analysis and modeling may be needed for validation and actionable insights.

## **5. Solution to Business Objective**

#### What do you suggest the client to achieve Business Objective ?
Explain Briefly.

**Solution to reduce customer churn**

* Enhance the quality of customer service calls and reduce unnecessary service calls.
* Ask for feedback often.
* Periodically roll out offers to retain customers.
* Look at the customers who are facing issues in the states with the most churn.
* Give priority to the best customers.
* Solve poor network connectivity issues.
* Analyze churn when it happens.
* Carry out regular server maintenance.
* Be competitive.

# **Conclusion**

**From the above Analysis we can conclude the following:**

* Only about 14%-15% of customers churned and moved to another network operator, which is relatively low. The telecom company should still work on retaining existing customers and stay competitive so that customers do not change their service provider.

* AK and HI have the fewest connected customers, so the company should try to expand its reach in those states.

* The company should work on its customer service, because the data suggests that customers with more than 3 customer service calls tend to churn. There can be other reasons as well, so the company needs to investigate the root cause further.

* Area code 415 contains almost 50% of customers, so, as discussed previously, the telecom company needs to make sure it does not lose this area. Customer feedback is really important for that.

* The correlation heatmap and the pair plot are really helpful for understanding the various distributions and relationships.


***THANK YOU***



### ***Hurrah! You have successfully completed your EDA Capstone Project !!!***