# Python Insights - Analysing Data with Python

### Case - Customer Cancellations

You were hired for a data project by a company with over 800,000 customers. The company recently realized that most of its customer base is inactive, meaning those customers have already canceled the service.

Needing to improve its results, the company wants to understand the main reasons behind these cancellations and identify the most effective actions to reduce this number.

### Step by Step
- Step 1: Import the database
- Step 2: Visualize the database (understand and identify issues)
- Step 3: Fix the issues in the database
- Step 4: Initial analysis
- Step 5: Analyze the reasons for customer cancellations

In [4]:
# Installing libraries
# !pip install pandas numpy openpyxl nbformat ipykernel plotly

In [None]:
# Step 1: Import the database
import pandas as pd
url='https://drive.google.com/uc?id=1sFnxOdcqnbC-hVYpRBdNA4Z26XWYCIes'
df_customers = pd.read_csv(url)

In [None]:
# Step 2: Visualize the database (understand and identify issues)
display(df_customers) # Showing the database

# Removing columns that aren't important
df_customers = df_customers.drop(columns="CustomerID")

display(df_customers) # Showing the database without the column

In [None]:
# Step 3: Fix the issues in the database
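# Identifying empty values
# info() prints its summary and returns None, so it is called directly rather than wrapped in display()
df_customers.info()

# Removing rows that contain empty values
df_customers = df_customers.dropna()
df_customers.info()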
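
Before dropping rows, it can help to see how many values are missing in each column and how many rows `dropna()` would remove. The following is a minimal sketch on a toy DataFrame; the values are invented for illustration and are not taken from the real dataset.

```python
import numpy as np
import pandas as pd

# Toy data, invented for illustration only (not the real dataset)
toy = pd.DataFrame({
    "dias_atraso": [5, np.nan, 25, 0],
    "duracao_contrato": ["Monthly", "Annual", None, "Quarterly"],
    "cancelou": [1, 0, 1, 0],
})

# Number of missing values in each column
missing_per_column = toy.isna().sum()
print(missing_per_column)

# Number of rows that dropna() removes (any row with at least one missing value)
toy_clean = toy.dropna()
rows_dropped = len(toy) - len(toy_clean)
print(f"Rows dropped: {rows_dropped}")
```

If the number of dropped rows is a large share of the table, filling or investigating the gaps may be preferable to dropping them.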

In [None]:
# Step 4: Initial analysis - how many customers canceled, and what percentage they represent
display(df_customers["cancelou"].value_counts())

# Percentage
display(df_customers["cancelou"].value_counts(normalize=True))
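
The counts and percentages can also be read side by side in a single table. This is a small sketch on an invented Series; it assumes `cancelou` is coded as 1 (canceled) and 0 (active), which the notebook does not state explicitly.

```python
import pandas as pd

# Toy data, invented for illustration; assumes cancelou is coded 1 = canceled, 0 = active
cancelou = pd.Series([1, 1, 0, 1, 0], name="cancelou")

# Both value_counts() results are indexed by the distinct values, so they align by label
summary = pd.DataFrame({
    "count": cancelou.value_counts(),
    "share": cancelou.value_counts(normalize=True),
})
print(summary)
```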

In [None]:
# Step 5: Analyze the reasons for customer cancellations
import plotly.express as px

# Creating graphs for each column from the database
for column in df_customers.columns:
    # Creates the graph
    graph = px.histogram(df_customers, x=column, color="cancelou", text_auto=True)
    # Shows the graph
    graph.show()
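
The histograms show the pattern visually; a numeric check per category can back it up. The sketch below uses invented toy data and assumes `cancelou` is coded as 1 (canceled) and 0 (active), so the mean of each group equals its cancellation rate.

```python
import pandas as pd

# Toy data, invented for illustration; assumes cancelou is coded 1 = canceled, 0 = active
toy = pd.DataFrame({
    "duracao_contrato": ["Monthly", "Monthly", "Annual", "Annual", "Quarterly", "Quarterly"],
    "cancelou": [1, 1, 0, 1, 0, 0],
})

# With a 0/1 target, the mean within each group is that group's cancellation rate
rate_by_contract = toy.groupby("duracao_contrato")["cancelou"].mean()
print(rate_by_contract.sort_values(ascending=False))
```

The same pattern applies to any categorical column; numeric columns such as `ligacoes_callcenter` would need to be grouped by value or binned first.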

### Analysis Conclusions

- All customers on monthly contracts canceled
    - Offer discounts on annual and quarterly plans.
- Customers who called the call center more than 4 times canceled
    - Create a process to resolve customer issues within a maximum of 3 calls.
- Customers who delayed payments by more than 20 days canceled
    - Implement a policy to resolve payment delays within 10 days (finance team).

**If we solve the issues above, how would the situation look?**

In [None]:
# duracao_contrato: "contract duration", keep only contracts that are not monthly
condition = df_customers["duracao_contrato"] != "Monthly" # Filtering out monthly customers
df_customers = df_customers[condition]

# ligacoes_callcenter: "call center calls", keep customers with 4 calls or fewer
condition = df_customers["ligacoes_callcenter"] <= 4 # Filtering customers who called at most 4 times
df_customers = df_customers[condition]

# dias_atraso: "days of payment delay", keep customers with delays of 20 days or fewer
condition = df_customers["dias_atraso"] <= 20 # Filtering customers whose payments were delayed at most 20 days
df_customers = df_customers[condition]

display(df_customers["cancelou"].value_counts())
display(df_customers["cancelou"].value_counts(normalize=True))

**This analysis suggests that if all three issues are resolved, the cancellation rate could drop from 56% to 18%.**
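
The filtering cell above overwrites `df_customers` at each step, so the original table is no longer available for comparison afterwards. As an alternative sketch, the three conditions can be combined into one boolean mask and applied to a copy, which keeps both rates computable in the same cell. The toy data is invented for illustration, and `cancelou` is assumed to be coded as 1 (canceled) and 0 (active).

```python
import pandas as pd

# Toy data, invented for illustration; assumes cancelou is coded 1 = canceled, 0 = active
toy = pd.DataFrame({
    "duracao_contrato": ["Monthly", "Annual", "Annual", "Quarterly", "Quarterly", "Annual"],
    "ligacoes_callcenter": [2, 6, 1, 3, 0, 4],
    "dias_atraso": [5, 10, 25, 12, 0, 20],
    "cancelou": [1, 1, 1, 0, 0, 1],
})

# One mask with all three conditions; each comparison needs its own parentheses
keep = (
    (toy["duracao_contrato"] != "Monthly")
    & (toy["ligacoes_callcenter"] <= 4)
    & (toy["dias_atraso"] <= 20)
)

# The original DataFrame stays untouched, so both rates can be compared
filtered = toy[keep]
rate_before = toy["cancelou"].mean()
rate_after = filtered["cancelou"].mean()
print(f"Cancellation rate before: {rate_before:.1%}, after: {rate_after:.1%}")
```

Note that the filtered rate describes the customers who already avoid the three conditions; it estimates what could happen, rather than guaranteeing it, if the remaining customers behaved the same way.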
