<a href="https://colab.research.google.com/github/nazarcoder123/Telecom_Churn_Analysis/blob/main/Sample_EDA_Submission_Template.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Project Name**    - 



##### **Project Type**    - EDA(Telecom Churn Analysis)
##### **Contribution**    - Individual

# **Project Summary -**

Customer churn in the telecom sector is a significant challenge that telecom companies face, as it directly impacts their revenue and profitability. Churn refers to customers discontinuing or terminating their services. Understanding the factors influencing churn and implementing effective strategies to mitigate it is crucial for telecom companies.

To address churn, telecom companies employ data-driven techniques such as Exploratory Data Analysis (EDA), customer segmentation, and predictive modeling. By leveraging machine learning algorithms, they build churn prediction models that identify customers at high risk of churn. This enables proactive intervention and targeted retention strategies to retain at-risk customers.


# **GitHub Link -**


https://github.com/nazarcoder123/Telecom_Churn_Analysis 

# **Problem Statement**



The problem at hand is to analyze customer churn in the telecom sector and develop effective strategies to mitigate it. Customer churn, defined as the discontinuation or termination of telecom services by customers, poses a significant challenge for telecom companies as it directly impacts their revenue and profitability.

The objective is to understand the factors influencing customer churn and identify patterns or indicators that can help predict and prevent churn. By analyzing historical customer data, including demographics, usage patterns, complaints, and service interactions, the goal is to uncover key drivers of churn and gain insights into customer behavior.

The problem statement also encompasses the need to explore and implement customer retention strategies, including enhancing service quality, improving customer experiences, offering competitive pricing plans, providing value-added services, and implementing effective customer relationship management (CRM) systems. The goal is to address the root causes of churn, improve customer satisfaction, and establish strong relationships with customers.

Ultimately, the objective is to develop a comprehensive understanding of customer churn in the telecom sector and devise strategies that enable telecom companies to retain customers, minimize churn rates, and achieve sustainable business growth in a highly competitive market.

#### **Define Your Business Objective?**

Answer Here.

The business objective for the telecom company is to reduce customer churn. Customer churn refers to the rate at which customers discontinue their services or switch to a competitor. The objective is to retain existing customers and minimize the number of customers who cancel their subscriptions or switch to other telecom providers. This objective is important because retaining customers is more cost-effective than acquiring new ones, and it contributes to the company's revenue stability and long-term growth. By reducing churn, the company can improve customer satisfaction, increase customer loyalty, and ultimately achieve higher profitability.

# **General Guidelines** : -  

1.   Well-structured, formatted, and commented code is required. 
2.   Exception Handling, Production Grade Code & Deployment Ready Code will be a plus. Those students will be awarded some additional credits. 
     
     The additional credits will have advantages over other students during Star Student selection.
       
             [ Note: - Deployment Ready Code is defined as, the whole .ipynb notebook should be executable in one go
                       without a single error logged. ]

3.   Each and every logic should have proper comments.
4. You may add as many charts as you want. Make sure that each chart answers the points in the following format.
        

```
# Chart visualization code
```
            

*   Why did you pick the specific chart?
*   What is/are the insight(s) found from the chart?
* Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

5. You have to create at least 20 logical & meaningful charts having important insights.


[ Hints : - Do the Visualization in a structured way while following the "UBM" Rule.

U - Univariate Analysis,

B - Bivariate Analysis (Numerical - Categorical, Numerical - Numerical, Categorical - Categorical)

M - Multivariate Analysis
 ]





# ***Let's Begin !***

## ***1. Know Your Data***

### Import Libraries

In [None]:
# Import Libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

### Dataset Loading

In [None]:
# Load Dataset
df = pd.read_csv("/content/Telecom Churn.csv")
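
The guidelines above ask for exception handling, so here is a minimal sketch of a guarded loader. The function name `load_churn_data` is hypothetical and not part of the original notebook; it only wraps `pd.read_csv` with two early checks.

```python
from pathlib import Path

import pandas as pd


def load_churn_data(path):
    """Read the churn CSV, failing early with a readable message."""
    csv_path = Path(path)
    # Stop before pandas is called if the file is not where we expect it
    if not csv_path.is_file():
        raise FileNotFoundError(f"Dataset not found at: {csv_path}")
    data = pd.read_csv(csv_path)
    # A header-only file loads without error, so check for rows explicitly
    if data.empty:
        raise ValueError(f"Dataset at {csv_path} contains no rows")
    return data
```

With this helper, the loading cell could read `df = load_churn_data("/content/Telecom Churn.csv")`.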

### Dataset First View

In [None]:
# Dataset First Look
df.head()

### Dataset Rows & Columns count

In [None]:
# Dataset Rows & Columns count
df.shape

### Dataset Information

In [None]:
# Dataset Info
df.info()

#### Duplicate Values

In [None]:
# Dataset Duplicate Value Count
df.duplicated().value_counts()

#### Missing Values/Null Values

In [None]:
# Missing Values/Null Values Count
df.isnull().sum()

In [None]:
# Visualizing the missing values
sns.heatmap(df.isnull(), cbar=False)
plt.title('Missing Values by Column')
plt.show()

### What did you know about your dataset?

Answer Here


The dataset consists of 3333 rows and 20 columns.

Duplicate Values: There are no duplicated rows in the dataset; all rows are unique.

Missing Values: The dataset does not contain any missing or null values; the missing-value count is 0 for every column.

Column Names: The dataset has the following columns in the given order:

1. 'State': The state of the customer.
2. 'Account length': The duration of the customer's account.
3. 'Area code': The area code of the customer's phone number.
4. 'International plan': Whether the customer has an international calling plan (Yes/No).
5. 'Voice mail plan': Whether the customer has a voice mail plan (Yes/No).
6. 'Number vmail messages': The number of voice mail messages the customer has.
7. 'Total day minutes': The total duration of daytime calls for the customer.
8. 'Total day calls': The total number of daytime calls made by the customer.
9. 'Total day charge': The total charge for daytime calls for the customer.
10. 'Total eve minutes': The total duration of evening calls for the customer.
11. 'Total eve calls': The total number of evening calls made by the customer.
12. 'Total eve charge': The total charge for evening calls for the customer.
13. 'Total night minutes': The total duration of nighttime calls for the customer.
14. 'Total night calls': The total number of nighttime calls made by the customer.
15. 'Total night charge': The total charge for nighttime calls for the customer.
16. 'Total intl minutes': The total duration of international calls for the customer.
17. 'Total intl calls': The total number of international calls made by the customer.
18. 'Total intl charge': The total charge for international calls for the customer.
19. 'Customer service calls': The number of customer service calls made by the customer.
20. 'Churn': A boolean value indicating whether the customer churned or not (True/False).
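
The row, column, duplicate, and missing-value figures reported for this dataset can be gathered in one pass. The sketch below runs on a small hypothetical frame named `toy` (made-up values, with one duplicate row and one missing cell added on purpose); on the real data the same expressions would be applied to `df`.

```python
import pandas as pd

# Hypothetical toy frame standing in for the real dataset
toy = pd.DataFrame({
    "State": ["KS", "OH", "KS", "NJ"],
    "Total day minutes": [265.1, 161.6, 265.1, None],
    "Churn": [False, False, False, True],
})

# Collect the basic data-quality figures in a single dictionary
summary = {
    "rows": toy.shape[0],
    "columns": toy.shape[1],
    "duplicate_rows": int(toy.duplicated().sum()),
    "missing_cells": int(toy.isnull().sum().sum()),
}
print(summary)
```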

## ***2. Understanding Your Variables***

In [None]:
# Dataset Columns
columns = df.columns.tolist()
columns

In [None]:
# Dataset Describe
df.describe()

### Variables Description 

Answer Here

'Account length': This variable represents the duration of the customer's account in the telecom company. It is a numeric variable.

'Area code': This variable represents the area code associated with the customer's phone number. It is a categorical variable.

'Number vmail messages': This variable represents the number of voice mail messages received by the customer. It is a numeric variable.

'Total day minutes': This variable represents the total duration of daytime calls made by the customer. It is a numeric variable.

'Total day calls': This variable represents the total number of daytime calls made by the customer. It is a numeric variable.

'Total day charge': This variable represents the total charge for daytime calls made by the customer. It is a numeric variable.

'Total eve minutes': This variable represents the total duration of evening calls made by the customer. It is a numeric variable.

'Total eve calls': This variable represents the total number of evening calls made by the customer. It is a numeric variable.

'Total eve charge': This variable represents the total charge for evening calls made by the customer. It is a numeric variable.

'Total night minutes': This variable represents the total duration of nighttime calls made by the customer. It is a numeric variable.

'Total night calls': This variable represents the total number of nighttime calls made by the customer. It is a numeric variable.

'Total night charge': This variable represents the total charge for nighttime calls made by the customer. It is a numeric variable.

'Total intl minutes': This variable represents the total duration of international calls made by the customer. It is a numeric variable.

'Total intl calls': This variable represents the total number of international calls made by the customer. It is a numeric variable.

'Total intl charge': This variable represents the total charge for international calls made by the customer. It is a numeric variable.

'Customer service calls': This variable represents the number of customer service calls made by the customer. It is a numeric variable.

The df.describe() output provides descriptive statistics for the numeric variables in the dataset: the count, mean, standard deviation, minimum, 25th percentile, 50th percentile (median), 75th percentile, and maximum. It gives an overview of the central tendency, dispersion, and range of each numeric variable.
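
By default `describe()` skips the text columns such as 'State' or 'International plan'. As a small sketch on hypothetical toy rows (the values are made up), passing `include="all"` adds the count of unique values and the most frequent value for those columns:

```python
import pandas as pd

# Hypothetical toy rows; the column names mirror the dataset, the values are made up
toy = pd.DataFrame({
    "International plan": ["No", "Yes", "No", "No"],
    "Total day minutes": [265.1, 161.6, 243.4, 299.4],
})

# include="all" adds unique/top/freq for text columns next to the numeric statistics
stats = toy.describe(include="all")
print(stats)
```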

### Check Unique Values for each variable.

In [None]:
# Check Unique Values for each variable.
for column in df.columns:
    unique_values = df[column].unique()
    print(f"Unique values in '{column}': {unique_values}")

## 3. ***Data Wrangling***

### Data Wrangling Code

In [None]:
# Write your code to make your dataset analysis ready.
missing_values = df.isnull().sum()
print("Missing Values:\n", missing_values)

# Check for duplicate rows
duplicate_rows = df.duplicated().sum()
print("Duplicate Rows:", duplicate_rows)

# Count how many customers belong to each state
print(df["State"].value_counts())

# Group the data by state and calculate churn rate
state_churn_rate = df.groupby('State')['Churn'].mean().sort_values(ascending=False)
print(state_churn_rate)

### What all manipulations have you done and insights you found?

Answer Here.

Checking for missing values: The code df.isnull().sum() calculates the number of missing values in each column. This helps identify if there are any columns with missing data.

Checking for duplicate rows: The code df.duplicated().sum() counts the number of duplicate rows in the dataset. This helps identify if there are any duplicate records present.

Counting occurrences of each state: The code df["State"].value_counts() counts the number of occurrences of each state in the dataset. This provides insights into the distribution of customers across different states.

Calculating churn rate by state: The code df.groupby('State')['Churn'].mean().sort_values(ascending=False) groups the data by state and calculates the average churn rate for each state. The results are sorted in descending order. This analysis helps identify states with higher churn rates, indicating potential areas of concern for the telecom company.
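
The churn rate works as a plain mean because 'Churn' is boolean: `True` counts as 1 and `False` as 0, so the mean is the share of churned customers. The sketch below uses hypothetical toy rows and also keeps the customer count per state, since a rate based on very few customers is less reliable.

```python
import pandas as pd

# Hypothetical toy rows; only the column names come from the dataset
toy = pd.DataFrame({
    "State": ["KS", "KS", "OH", "OH", "OH", "NJ"],
    "Churn": [True, False, False, False, True, True],
})

# Named aggregation: customers per state, churned customers, and churn rate
state_summary = (
    toy.groupby("State")["Churn"]
    .agg(customers="count", churned="sum", churn_rate="mean")
    .sort_values("churn_rate", ascending=False)
)
print(state_summary)
```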

## ***4. Data Visualization, Storytelling & Experimenting with charts : Understand the relationships between variables***

#### Chart - 1

In [None]:
# Chart - 1 visualization code

# Group the data by state and calculate churn rate
state_churn_rate = df.groupby('State')['Churn'].mean().sort_values(ascending=False)

# Plotting the bar chart
plt.figure(figsize=(12, 6))
state_churn_rate.plot(kind='bar')
plt.xlabel('State')
plt.ylabel('Churn Rate')
plt.title('Churn Rate by State')
plt.xticks(rotation=90)
plt.show()

##### 1. Why did you pick the specific chart?

Answer Here.

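A sorted bar chart was chosen because it compares a single value, the churn rate, across many categories (one bar per state) and makes the highest and lowest states easy to spot. The chart supports the following observations: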

Churn Rates Vary: The bar chart clearly shows that the churn rates vary across different states. Some states have relatively high churn rates, indicating a higher percentage of customers leaving the telecom service, while others have lower churn rates.

State-Level Insights: By examining the heights of the bars, we can identify the states with the highest churn rates. These states require closer attention and investigation to understand the underlying factors contributing to customer churn.

Priority Areas: The sorted bar chart helps prioritize areas for intervention. States with higher churn rates can be targeted for retention strategies, customer satisfaction improvement, or targeted marketing campaigns to reduce churn and retain customers.

Regional Patterns: The chart may reveal regional patterns in customer churn. States in the same geographical region might exhibit similar churn rates, indicating the presence of specific regional factors influencing customer behavior and retention.

Further Analysis: The bar chart serves as a starting point for further analysis and exploration. It can be used to identify potential correlations or relationships between churn rates and other variables, such as account length, international plan usage, or customer service calls.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

In summary, the chart provides valuable insights into the churn rates by state, enabling telecom companies to identify areas of concern, prioritize actions, and tailor strategies to reduce churn and enhance customer retention.

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

The gained insights from analyzing the churn rates by state can indeed help create a positive business impact. By understanding the variations in churn rates and identifying states with higher churn, telecom companies can develop targeted strategies to reduce churn and improve customer retention. This, in turn, can lead to increased customer satisfaction, loyalty, and ultimately, positive business growth. The insights allow companies to allocate resources effectively, focus on retention efforts, and tailor their approach based on the specific needs and challenges in different states.

#### Chart - 2

In [None]:
# Chart - 2 visualization code
# Select only the numerical columns
numerical_columns = df.select_dtypes(include='number')

# Set the figure size for better visualization
plt.figure(figsize=(10, 6))

# Create a boxplot for all the numerical variables
plt.boxplot(numerical_columns.values, vert=False, labels=numerical_columns.columns)

# Set the x-axis label
plt.xlabel('Values')

# Set the title
plt.title('Boxplot of Numerical Variables')

# Display the plot
plt.show()

##### 1. Why did you pick the specific chart?

Answer Here.
A boxplot helps identify outliers in each numerical variable of the dataset.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

By examining the boxplots, you can identify variables with significant variations, potential outliers, or unusual distributions, which can further guide your data analysis and decision-making process.
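
The points a boxplot draws beyond its whiskers follow the usual 1.5 × IQR rule, so the number of outliers per column can also be counted directly. This is a minimal sketch; the helper name `iqr_outlier_counts` and the toy values are hypothetical.

```python
import pandas as pd


def iqr_outlier_counts(frame):
    """Count values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] for each numeric column."""
    numeric = frame.select_dtypes(include="number")
    q1 = numeric.quantile(0.25)
    q3 = numeric.quantile(0.75)
    iqr = q3 - q1
    lower = q1 - 1.5 * iqr
    upper = q3 + 1.5 * iqr
    # Comparisons align the per-column bounds with the matching columns
    outside = (numeric < lower) | (numeric > upper)
    return outside.sum()


# Hypothetical toy values: one unusually high number of service calls
toy = pd.DataFrame({
    "Customer service calls": [1, 1, 2, 2, 1, 9],
    "Total day calls": [100, 95, 105, 98, 102, 101],
})
counts = iqr_outlier_counts(toy)
print(counts)
```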


##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.


Answer Here

Boxplots alone do not provide a comprehensive understanding of the factors driving customer churn or negative growth. They serve as a starting point and should be complemented with further analysis. Other techniques such as correlation analysis, predictive modeling, and customer feedback analysis can provide additional insights to support decision-making and drive positive business impact.

#### Chart - 3

In [None]:
# Chart - 3 visualization code

# Set the figure size
plt.figure(figsize=(10, 6))

# Create a histogram for "Account length"
plt.hist(df["Account length"], bins=30, edgecolor='black')
plt.xlabel("Account Length")
plt.ylabel("Frequency")
plt.title("Histogram of Account Length")

# Display the histogram
plt.show()

# Set the figure size
plt.figure(figsize=(10, 6))

# Create a histogram for "Total intl charge"
plt.hist(df["Total intl charge"], bins=30, edgecolor='black')
plt.xlabel("Total Intl Charge")
plt.ylabel("Frequency")
plt.title("Histogram of Total Intl Charge")

# Display the histogram
plt.show()

##### 1. Why did you pick the specific chart?

Answer Here.

I picked a histogram as it is a suitable chart for visualizing the distribution and frequency of numerical variables. Histograms provide a clear representation of the data's distribution by dividing it into bins or intervals and displaying the count or frequency of observations within each bin.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

The distribution of account lengths appears to be somewhat right-skewed, with a longer tail on the right side.
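
A visual impression of skew can be checked numerically with `Series.skew()`: values near 0 indicate a roughly symmetric distribution and positive values indicate a longer right tail. The two toy series below are hypothetical; on the real data the call would be `df["Account length"].skew()`.

```python
import pandas as pd

# Hypothetical toy series: one symmetric, one with a long right tail
symmetric = pd.Series([1, 2, 3, 4, 5])
right_skewed = pd.Series([1, 1, 1, 2, 10])

# Sample skewness: about 0 for symmetric data, positive for a right tail
print("symmetric:", symmetric.skew())
print("right-skewed:", right_skewed.skew())
```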

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

Understanding the distribution of account lengths can help the telecom company in tailoring their services and offerings based on customer preferences. They can identify customer segments with specific account length preferences and design targeted marketing campaigns or service plans to cater to their needs. This customer-centric approach can enhance customer satisfaction and loyalty, leading to positive business impact.

#### Chart - 4

In [None]:
# Chart - 4 visualization code
# Let's perform univariate analysis
# Select only the numeric columns
numeric_columns = df.select_dtypes(include=np.number)

# Iterate over each numeric column
for column in numeric_columns:
    # Set the figure size for better visualization
    plt.figure(figsize=(8, 6))
    
    # Create a histogram for the numeric column
    plt.hist(df[column], bins=10)
    
    # Set the x-axis label
    plt.xlabel(column)
    
    # Set the y-axis label
    plt.ylabel('Frequency')
    
    # Set the title
    plt.title(f'Histogram of {column}')
    
    # Display the plot
    plt.show()

##### 1. Why did you pick the specific chart?

Answer Here.

Histograms were chosen for the univariate analysis because every variable plotted here is numeric, and a histogram is the standard way to visualize the distribution of numeric data.

##### 2. What is/are the insight(s) found from the chart?

Answer Here.

Distribution Shape: The shape of the histogram can provide insights into the distribution of the data. It may exhibit characteristics such as normal (bell-shaped), skewed (positively or negatively), bimodal (having two peaks), or multimodal (having multiple peaks). These shape patterns can indicate the underlying data patterns and help understand the behavior of the variable.

Central Tendency: The central tendency of the data can be observed from the histogram. It can provide insights into the mean, median, and mode of the distribution. For normally distributed data, the peak of the histogram aligns with the mean value, while skewed distributions may have the peak shifted towards one side.

Outliers: Histograms can help identify outliers in the data. Outliers are data points that deviate significantly from the majority of the data. They appear as isolated bars or bins that are far away from the main distribution. Identifying outliers is important as they can impact statistical analysis and decision-making.

Spread and Variability: The width and height of the histogram bins can provide insights into the spread and variability of the data. A wider distribution indicates higher variability, while a narrower distribution suggests lower variability.

Data Range: The range of the data can be observed from the histogram. It shows the minimum and maximum values covered by the variable, providing insights into the data's extent.

These insights help in understanding the characteristics of the numeric variables and can guide further analysis, decision-making, and modeling processes. It is important to interpret the histograms in the context of the specific dataset and domain knowledge.

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

Detecting Anomalies: Histograms can help identify outliers in the data, which may indicate unusual or exceptional customer behavior. These outliers could represent potential fraud, system errors, or other abnormal activities. Detecting and addressing these anomalies promptly can minimize financial losses, maintain data integrity, and ensure a positive customer experience.

#### Chart - 5

In [None]:
# Chart - 5 visualization code
plt.scatter(df['Total day minutes'], df['Total day charge'])
plt.xlabel('Total day minutes')
plt.ylabel('Total day charge')
plt.title('Scatter Plot: Total day minutes vs Total day charge')
plt.show()

##### 1. Why did you pick the specific chart?

Answer Here.

The scatter plot is selected to visualize the relationship between two numerical variables, here total day minutes and total day charge.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

The scatter plot shows the overall relationship between total day minutes and total day charge: whether the charge rises with minutes, how tightly the points follow a line, and whether any customers deviate from the pattern.

##### 3. Will the gained insights help creating a positive business impact?

Answer Here.

Yes. Understanding the relationship between minutes and charge supports pricing analysis. If the points fall on a straight line, the day charge is a fixed per-minute rate applied to day minutes, so the two columns carry the same information and only one of them is needed when analyzing churn.
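
One way to check whether charge is a linear function of minutes is to fit a straight line and look at the correlation. The sketch below uses hypothetical toy data; the 0.2 per-minute rate is invented for illustration and is not taken from the dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: the 0.2 per-minute rate is invented for illustration
toy = pd.DataFrame({"Total day minutes": [100.0, 150.0, 200.0, 250.0]})
toy["Total day charge"] = toy["Total day minutes"] * 0.2

# Fit charge = slope * minutes + intercept with a degree-1 least-squares fit
slope, intercept = np.polyfit(toy["Total day minutes"], toy["Total day charge"], 1)

# A Pearson correlation close to 1 means the points lie near a straight line
correlation = toy["Total day minutes"].corr(toy["Total day charge"])

print(f"slope={slope:.3f}, intercept={intercept:.3f}, correlation={correlation:.3f}")
```

On the real data, a slope that matches a per-minute tariff together with an intercept near zero would support treating the charge column as redundant.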

#### Chart - 6

In [None]:
# Chart - 6 visualization code

# Count customers per churn status in a fixed order (False first, then True)
churn_counts = df['Churn'].value_counts().reindex([False, True], fill_value=0)

# Labels follow the same order as the counts above
labels = ['Non-Churned', 'Churned']

# Create a pie chart
plt.pie(churn_counts, labels=labels, autopct='%1.1f%%', startangle=90)

# Add a title to the chart
plt.title('Churned vs Non-Churned Customers')

# Display the chart
plt.show()

##### 1. Why did you pick the specific chart?

Answer Here.

I picked the pie chart to represent the churned vs non-churned customers because it effectively displays the proportion or distribution of two categories (churned and non-churned) as parts of a whole. The pie chart allows for easy visualization of the relative sizes of each category and provides a clear comparison between them.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

It shows what percentage of customers churned and what percentage did not.

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

Understanding Churn Rate: Knowing the overall churn rate is crucial for a telecom company. It helps in assessing the health of the customer base and identifying potential areas for improvement. If the churn rate is high, it indicates a need to focus on customer retention strategies and improving customer satisfaction to reduce churn.
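
The percentages shown on the pie chart can also be computed directly. A small sketch on a hypothetical toy series (on the real data the series would be `df['Churn']`):

```python
import pandas as pd

# Hypothetical toy series: one churned customer out of four
toy_churn = pd.Series([False, False, False, True], name="Churn")

# Share of each churn status; normalize=True returns fractions instead of counts
shares = toy_churn.value_counts(normalize=True)

# The mean of a boolean column is the overall churn rate
churn_rate = toy_churn.mean()

print(shares)
print(f"Overall churn rate: {churn_rate:.1%}")
```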

#### Chart - 7

In [None]:
# Chart - 7 visualization code
# Plot churned vs non-churned customer counts by International plan
plt.figure(figsize=(8, 6))
sns.countplot(x='International plan', hue='Churn', data=df)
plt.title('Churn Counts by International plan')
plt.xlabel('International plan')
plt.ylabel('Count')
plt.show()

##### 1. Why did you pick the specific chart?

Answer Here.

A grouped count plot was chosen because it shows, side by side, how many customers churned and how many stayed within each category of a categorical variable, here the international plan.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

International plan: By comparing the churn rates for customers with and without an international plan, you can observe if there are any significant differences. 

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

These visualizations can help the telecom company identify any variations in churn rates based on categorical variables and inform strategic decision-making. If there are notable differences in churn rates across different plans, the company can focus on improving those aspects of the service to retain more customers and enhance customer satisfaction.
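
A count plot shows absolute counts, so groups of different sizes are hard to compare by eye. As a sketch on hypothetical toy rows, `pd.crosstab` with `normalize="index"` turns the counts into a churn share within each plan group. The 'Churned'/'Retained' labels are introduced here only for readability.

```python
import pandas as pd

# Hypothetical toy rows; only the column names come from the dataset
toy = pd.DataFrame({
    "International plan": ["No", "No", "No", "No", "Yes", "Yes"],
    "Churn": [False, False, False, True, True, False],
})

# Readable labels for the boolean churn flag (introduced for this example)
status = toy["Churn"].map({True: "Churned", False: "Retained"})

# normalize="index" makes each row sum to 1, giving the share per plan group
rates = pd.crosstab(toy["International plan"], status, normalize="index")
print(rates)
```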

#### Chart - 8

In [None]:
# Chart - 8 visualization code
# Plot churned vs non-churned customer counts by Voice mail plan
plt.figure(figsize=(8, 6))
sns.countplot(x='Voice mail plan', hue='Churn', data=df)
plt.title('Churn Counts by Voice mail plan')
plt.xlabel('Voice mail plan')
plt.ylabel('Count')
plt.show()

##### 1. Why did you pick the specific chart?

Answer Here.

A grouped count plot was chosen because it compares churned and retained customers for those with and without a voice mail plan, which gives insight into the impact of voice mail services on customer retention. A higher churn share among customers with a voice mail plan may suggest issues with voice mail functionality, usage patterns, or customer preferences.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

Voice mail plan: Similarly, comparing the churn rates for customers with and without a voice mail plan can provide insights into the impact of voice mail services on customer retention. 

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

These visualizations can help the telecom company identify any variations in churn rates based on categorical variables and inform strategic decision-making. If there are notable differences in churn rates across different plans, the company can focus on improving those aspects of the service to retain more customers and enhance customer satisfaction.

#### Chart - 9

In [None]:
# Chart - 9 visualization code
# Select the relevant columns for the scatter plot matrix
columns = ['Total day minutes', 'Total eve minutes', 'Total night minutes', 'Total intl minutes']

# Create a scatter plot matrix
sns.pairplot(df[columns])

# Display the plot
plt.show()

##### 1. Why did you pick the specific chart?

Answer Here.

I picked the scatter plot matrix for multivariate analysis because it allows us to visualize the relationships between multiple variables simultaneously. In the scatter plot matrix, each variable is plotted against every other variable, resulting in a grid of scatter plots. This helps us to identify patterns, correlations, and trends between the variables.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

By using a scatter plot matrix, we can gain insights into the relationships between different numerical variables in the dataset. It helps us understand how variables interact with each other and if there are any apparent associations or dependencies among them. This visualization is particularly useful in identifying potential patterns or clusters in the data and can assist in identifying variables that may have a significant impact on the target variable, such as customer churn in this case.

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

In summary, the gained insights from multivariate analysis can help create a positive business impact by informing strategic decisions, improving customer retention strategies, and enhancing the overall customer experience. However, it is important to carefully analyze and address any negative insights or challenges that may arise to mitigate potential negative growth.
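
The relationships seen in a scatter plot matrix can be summarized as numbers with a correlation matrix. The sketch below uses hypothetical toy values in which the night minutes are exactly half the day minutes, so that pair correlates perfectly; on the real data the call would be `df[columns].corr()`.

```python
import pandas as pd

# Hypothetical toy values; night minutes are exactly half of day minutes
toy = pd.DataFrame({
    "Total day minutes": [100.0, 200.0, 300.0, 400.0],
    "Total eve minutes": [210.0, 190.0, 205.0, 195.0],
    "Total night minutes": [50.0, 100.0, 150.0, 200.0],
})

# Pairwise Pearson correlation between all numeric columns
corr_matrix = toy.corr()
print(corr_matrix.round(2))
```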



#### Chart - 10

In [None]:
# Chart - 10 visualization code

# Calculate the average of a numerical variable by area code
area_avg = df.groupby('Area code')['Total day minutes'].mean()

# Reset index
area_avg = area_avg.reset_index()

# Create a bar chart of average value by area code
# Area code is categorical, so plot it as text labels rather than as numbers
plt.figure(figsize=(10, 6))
plt.bar(area_avg['Area code'].astype(str), area_avg['Total day minutes'])
plt.xlabel('Area code')
plt.ylabel('Average Total day minutes')
plt.title('Average Total day minutes by Area code')
plt.grid(True)
plt.show()

##### 1. Why did you pick the specific chart?

Answer Here.

In this example, we are calculating the average value of the "Total day minutes" variable for each area code. The bar chart represents the average total day minutes for each area code, allowing us to compare the values and identify any variations across different area codes.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

It is important to note that these insights are based on the specific variable "Total day minutes" and may vary if different variables or datasets are used. Additionally, further analysis and exploration of the data may be necessary to gain more comprehensive insights.

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

It is important to note that the actual impact on the business would depend on the effectiveness of the strategies and actions implemented based on the gained insights. Regular monitoring, analysis, and adjustment of strategies based on customer feedback and market dynamics are crucial for realizing the positive business impact.

#### Chart - 11

In [None]:
# Chart - 11 visualization code
# Group the data by State and calculate the total number of customer service calls and churn count
state_data = df.groupby('State').agg({'Customer service calls': 'sum', 'Churn': 'sum'}).reset_index()

# Sort the data by the total number of customer service calls in descending order
state_data = state_data.sort_values('Customer service calls', ascending=False)

# Set the figure size
plt.figure(figsize=(12, 6))

# Create a bar chart for the total number of customer service calls
plt.bar(state_data['State'], state_data['Customer service calls'], label='Customer Service Calls')

# Overlay the churn count on the same axis (both bars start at zero, so this is an overlay, not a true stack)
plt.bar(state_data['State'], state_data['Churn'], label='Churn', color='red')

# Set the x-axis label
plt.xlabel('State')

# Set the y-axis label
plt.ylabel('Count')

# Set the title
plt.title('Customer Service Calls and Churn by State')

# Add a legend
plt.legend()

# Rotate the x-axis labels for better readability
plt.xticks(rotation=90)

# Display the plot
plt.show()

##### 1. Why did you pick the specific chart?

Answer Here.

This visualization shows the total number of customer service calls and the churn count for each state. The full bars represent the customer service calls, and the red bars drawn over them represent the churn count. By comparing the two, you can identify the states with more customer service calls and more churned customers.

##### 2. What is/are the insight(s) found from the chart?

Overall, there is a correlation between higher customer service calls and higher churn, emphasizing the importance of addressing customer concerns promptly to reduce churn rates.
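One way to check this relationship at the customer level is to compute the churn rate for each number of customer service calls. The sketch below uses a small hypothetical DataFrame (`toy_df`) with the same column names as the dataset; the values are invented for illustration and are not results from this analysis.

```python
import pandas as pd

# Hypothetical toy data: column names mirror the dataset, values are invented
toy_df = pd.DataFrame({
    'Customer service calls': [0, 1, 1, 2, 4, 4, 5, 5],
    'Churn': [False, False, False, False, True, False, True, True],
})

# For each number of service calls, count customers and compute the share that churned
churn_by_calls = (
    toy_df.groupby('Customer service calls')['Churn']
    .agg(customers='count', churn_rate='mean')
)
print(churn_by_calls)
```

Reporting the customer count next to the rate helps avoid over-reading groups that contain only a handful of customers.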

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

The correlation between higher customer service calls and higher churn suggests that customers who have more concerns or issues are more likely to churn. This highlights the importance of providing effective and efficient customer service to address customer needs and resolve any issues they may have. By addressing customer concerns promptly and effectively, businesses can improve customer satisfaction, increase customer loyalty, and ultimately reduce churn rates, leading to a positive impact on the business.

#### Chart - 12

In [None]:
# Chart - 12 visualization code

# Calculate the total charges and total minutes for each customer
df['Total_Charges'] = df['Total day charge'] + df['Total eve charge'] + df['Total night charge'] + df['Total intl charge']
df['Total_Minutes'] = df['Total day minutes'] + df['Total eve minutes'] + df['Total night minutes'] + df['Total intl minutes']

# Group the data by churn status and calculate the average total charges
avg_charges = df.groupby('Churn')['Total_Charges'].mean()

# Group the data by churn status and calculate the average total minutes
avg_minutes = df.groupby('Churn')['Total_Minutes'].mean()

# Plotting the bar plots
fig, axes = plt.subplots(1, 2, figsize=(10, 4))

# Average Total Charges by Churn Status
# (labels cast to str so boolean churn values are plotted as categories)
axes[0].bar(avg_charges.index.astype(str), avg_charges.values)
axes[0].set_xlabel('Churn')
axes[0].set_ylabel('Average Total Charges')
axes[0].set_title('Average Total Charges by Churn Status')

# Average Total Minutes by Churn Status
# (labels cast to str so boolean churn values are plotted as categories)
axes[1].bar(avg_minutes.index.astype(str), avg_minutes.values)
axes[1].set_xlabel('Churn')
axes[1].set_ylabel('Average Total Minutes')
axes[1].set_title('Average Total Minutes by Churn Status')

plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

I picked bar plots to visualize the average total charges and average total minutes based on churn status because they are effective in comparing values across different categories. Bar plots provide a clear visual representation of the average values for each category (churn or non-churn) and allow for easy comparison between the two groups.

##### 2. What is/are the insight(s) found from the chart?

The chart compares average total charges and average total minutes for churned and non-churned customers. Differences between the two groups suggest that both measures may be associated with customer churn. Further analysis is needed to understand the underlying reasons and to take appropriate actions to mitigate churn and improve customer retention.
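A bar of group means hides how large the gap is and whether a few heavy users drive it. As a rough sketch, the comparison can be summarised numerically with means, medians, and the difference in means. The DataFrame below (`toy_df`) is hypothetical and its values are invented; only the column names follow the code above.

```python
import pandas as pd

# Hypothetical toy data: values are invented for illustration only
toy_df = pd.DataFrame({
    'Total_Charges': [50.0, 60.0, 70.0, 80.0],
    'Total_Minutes': [500.0, 600.0, 650.0, 750.0],
    'Churn': [False, False, True, True],
})
cols = ['Total_Charges', 'Total_Minutes']

# Mean and median per churn group; a large mean/median gap hints at skewed usage
summary = toy_df.groupby('Churn')[cols].agg(['mean', 'median'])

# groupby sorts its keys, so row 0 is Churn == False and row 1 is Churn == True
group_means = toy_df.groupby('Churn')[cols].mean()
mean_gap = group_means.iloc[1] - group_means.iloc[0]

print(summary)
print(mean_gap)
```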

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

If these insights are not acted upon, there is a risk of negative growth. If the business fails to address concerns about high charges and low usage, customers may continue to churn, leading to a decline in revenue and customer base. It is essential for the business to take these insights seriously and implement strategies to retain customers and foster positive growth.

#### Chart - 13

In [None]:
# Chart - 13 visualization code
# Select the relevant columns for comparison
columns_to_compare = ['Total day minutes', 'Total eve minutes', 'Total night minutes', 'Total intl minutes', 'Customer service calls']

# Group the data by churn status and calculate the mean of the selected columns only
# (calling .mean() on all columns raises an error in pandas 2.0+ when non-numeric columns are present)
churn_data = df.groupby('Churn')[columns_to_compare].mean()

# Create a grouped bar plot on a single figure
ax = churn_data.plot(kind='bar', figsize=(10, 6))
ax.set_title('Comparison of Variables for Churned and Non-Churned Customers')
ax.set_xlabel('Churn')
# The columns mix minutes and call counts, so the axis is labelled generically
ax.set_ylabel('Average Value')
ax.set_xticks([0, 1])
ax.set_xticklabels(['Non-Churned', 'Churned'], rotation=0)
ax.legend(loc='upper right')
plt.show()

##### 1. Why did you pick the specific chart?

This grouped bar chart shows the average values of several variables (total day minutes, total eve minutes, total night minutes, total intl minutes, and customer service calls) for churned and non-churned customers.

##### 2. What is/are the insight(s) found from the chart?

By comparing these variables, you can identify patterns and understand which factors might be influencing churn. This information can guide the business in implementing strategies to reduce churn, such as improving customer service, offering attractive international plans, or optimizing pricing and usage plans based on the identified factors.
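Because minutes and call counts sit on very different scales, small variables such as customer service calls are hard to read on a shared axis. One possible way to compare them is the percentage difference between the churned and non-churned means. This is a sketch on a hypothetical `toy_df` with invented values, not output from the dataset.

```python
import pandas as pd

# Hypothetical toy data: values are invented for illustration only
toy_df = pd.DataFrame({
    'Total day minutes': [100.0, 200.0, 300.0, 300.0],
    'Customer service calls': [1, 1, 2, 4],
    'Churn': [False, False, True, True],
})
cols = ['Total day minutes', 'Customer service calls']

# Mean of each variable per churn group (rows sorted as False, then True)
group_means = toy_df.groupby('Churn')[cols].mean()

# Percentage difference of the churned mean relative to the non-churned mean
relative_diff_pct = (group_means.iloc[1] / group_means.iloc[0] - 1) * 100
print(relative_diff_pct.sort_values(ascending=False))
```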

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

If the company does not take appropriate actions based on the insights, it can lead to negative growth. Ignoring the factors contributing to customer churn can result in higher churn rates, a reduced customer base, and ultimately negative business growth. It is crucial for the company to leverage the insights gained from the analysis and implement strategies to retain customers and improve overall business performance.

#### Chart - 14 - Correlation Heatmap

In [None]:
# Correlation Heatmap visualization code

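# Compute the correlation matrix for numeric columns only
# (pandas 2.0+ raises an error if non-numeric columns are included)
corr = df.corr(numeric_only=True)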

# Set up the figure and axes
plt.figure(figsize=(10, 8))
ax = plt.axes()

# Create the heatmap
sns.heatmap(corr, annot=True, cmap="coolwarm", fmt=".2f", linewidths=0.5, ax=ax)

# Set the title
plt.title('Correlation Heatmap')

# Display the plot
plt.show()

##### 1. Why did you pick the specific chart?

I picked the correlation heatmap as it is a commonly used visualization technique to analyze the correlation between variables. It allows us to understand the strength and direction of the relationships between pairs of variables in the dataset.

##### 2. What is/are the insight(s) found from the chart?

The heatmap gives a better understanding of the relationships between variables, which can inform decision-making, such as identifying potential areas of improvement or focusing on customer service strategies to reduce churn.
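For a churn analysis, the column of the correlation matrix that matters most is the one for `Churn`. A minimal sketch of extracting and ranking it is shown below, again on a hypothetical `toy_df` with invented values. Casting `Churn` to integers is an assumption made here so that the flag is treated as numeric.

```python
import pandas as pd

# Hypothetical toy data: values are invented for illustration only
toy_df = pd.DataFrame({
    'State': ['TX', 'NJ', 'OH', 'TX'],
    'Total day minutes': [100.0, 300.0, 200.0, 250.0],
    'Customer service calls': [0, 1, 4, 5],
    'Churn': [False, False, True, True],
})

# Make the churn flag explicitly numeric (assumption: it is stored as a boolean)
numeric_df = toy_df.assign(Churn=toy_df['Churn'].astype(int))

# Correlation of every numeric column with Churn, ranked by absolute strength
churn_corr = (
    numeric_df.corr(numeric_only=True)['Churn']
    .drop('Churn')
    .sort_values(key=abs, ascending=False)
)
print(churn_corr)
```

Correlation here only measures linear association with a binary flag, so a low value does not rule out a non-linear relationship.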

#### Chart - 15 - Pair Plot 

In [None]:
# Pair Plot visualization code

# Select the numerical columns for pair plot
numerical_columns = df.select_dtypes(include='number')

# Create pair plot
sns.pairplot(numerical_columns)

##### 1. Why did you pick the specific chart?

The pair plot chart is selected for visualization as it provides a comprehensive view of the relationships between pairs of variables in the dataset. It allows us to visually analyze the pairwise correlations, distributions, and potential patterns or trends in the data.

##### 2. What is/are the insight(s) found from the chart?

It's important to note that the specific insights will depend on the dataset and variables being analyzed. Interpretation should be done in the context of the specific problem or domain.
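A pair plot over every numeric column can become large and slow to render. One option is to drop columns that are almost perfectly correlated with another column before plotting. The sketch below assumes, purely for illustration, that a charge column is a fixed multiple of the matching minutes column; the source does not confirm this, and `toy_df` and its values are hypothetical.

```python
import pandas as pd

# Hypothetical toy data: the charge is set to 0.17 * minutes as an assumption
toy_df = pd.DataFrame({
    'Total day minutes': [100.0, 150.0, 200.0, 250.0],
    'Total day charge': [17.0, 25.5, 34.0, 42.5],
    'Customer service calls': [1, 3, 0, 2],
})

# Absolute pairwise correlations between the numeric columns
corr = toy_df.corr().abs()

# Collect column pairs that are nearly perfectly correlated
redundant_pairs = [
    (a, b)
    for i, a in enumerate(corr.columns)
    for b in corr.columns[i + 1:]
    if corr.loc[a, b] > 0.99
]

# Keep the first column of each pair and drop the second before plotting
columns_for_pairplot = toy_df.drop(columns=[b for _, b in redundant_pairs])
print(redundant_pairs)
print(list(columns_for_pairplot.columns))
```

The reduced frame can then be passed to `sns.pairplot` in place of the full set of numeric columns.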

## **5. Solution to Business Objective**

#### What do you suggest the client to achieve Business Objective ? 
Explain Briefly.

Based on the analysis and insights gained from the data, I would suggest the following recommendations to the client to achieve their business objectives:

Improve Customer Service: The analysis showed a correlation between higher customer service calls and higher churn. It is important to focus on enhancing the customer service experience, addressing customer concerns promptly, and ensuring customer satisfaction to reduce churn rates.

Offer Competitive Pricing: The analysis revealed that higher charges can be a contributing factor to customer churn. It is advisable to review pricing strategies and consider offering competitive pricing plans or discounts to retain customers and attract new ones.

Enhance International Plan Features: The analysis indicated that customers with international plans have a higher churn rate. The client can evaluate the features and offerings of their international plans and consider enhancing them to provide more value to customers.

Evaluate Voice Mail Plan Usage: The analysis showed that customers with a voice mail plan have a slightly higher churn rate. It would be beneficial to assess the usage patterns and benefits of the voice mail plan and make adjustments or improvements as necessary.

Monitor and Address Regional Differences: The analysis highlighted regional variations in churn rates. It is important to closely monitor customer behavior and preferences in different geographical regions and tailor marketing strategies and retention efforts accordingly.

By implementing these recommendations, the client can aim to improve customer retention, reduce churn rates, and ultimately achieve their business objective of increasing customer loyalty and profitability.
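To support the regional recommendation, states can be compared on rates rather than raw totals, since totals grow with the number of customers in a state. The following sketch uses a hypothetical `toy_df` with invented values and the dataset's column names.

```python
import pandas as pd

# Hypothetical toy data: values are invented for illustration only
toy_df = pd.DataFrame({
    'State': ['TX', 'TX', 'TX', 'TX', 'NJ', 'NJ'],
    'Customer service calls': [1, 2, 0, 1, 3, 4],
    'Churn': [True, False, False, False, True, False],
})

# Per-state customer count, churn rate, and average service calls per customer
state_rates = (
    toy_df.groupby('State')
    .agg(
        customers=('Churn', 'size'),
        churn_rate=('Churn', 'mean'),
        calls_per_customer=('Customer service calls', 'mean'),
    )
    .sort_values('churn_rate', ascending=False)
)
print(state_rates)
```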

# **Conclusion**

In conclusion, the analysis of the telecom company's customer churn data provided valuable insights and recommendations to address the business objective of reducing churn and improving customer retention. Some key conclusions from the analysis are:

The overall churn rate in the company is X%, indicating a significant number of customers are leaving the company.

Certain factors such as international plan usage, customer service calls, and total charges have shown correlations with churn, suggesting their influence on customer retention.

Customers with international plans and those making higher customer service calls have higher churn rates, indicating the need to focus on improving service quality and value for these customer segments.

Pricing plays a role in customer churn, as higher charges are associated with increased likelihood of churn. Evaluating pricing strategies and offering competitive plans could help retain customers.

Geographical differences in churn rates suggest the need for targeted marketing and retention efforts tailored to specific regions.
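The overall churn rate referred to above can be computed directly from the `Churn` column. The sketch below shows the calculation on a hypothetical five-row `toy_df`; the printed figure describes the toy data only and is not the rate for the telecom dataset.

```python
import pandas as pd

# Hypothetical toy data: one churned customer out of five
toy_df = pd.DataFrame({'Churn': [True, False, False, False, False]})

# The mean of a boolean column is the share of True values
churn_rate_pct = toy_df['Churn'].mean() * 100
print(f"Overall churn rate: {churn_rate_pct:.1f}%")
```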

Overall, the gained insights provide actionable steps to reduce churn and improve business performance. By addressing customer concerns, enhancing service quality, offering competitive pricing, and catering to regional preferences, the telecom company can work towards increasing customer loyalty and achieving positive business impact. Regular monitoring of customer behavior and continuous improvement efforts will be essential to ensure long-term success in reducing churn and improving customer satisfaction.
