<a href="https://colab.research.google.com/github/mfadlisy/Telco-Customer-Churn-Analysis/blob/main/Notebooks/exploratory_data_analysis.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Load Data

In [52]:
import pandas as pd
import numpy as np

pd.set_option('display.max_columns', None)

In [53]:
data_path = '/content/drive/MyDrive/Learn Data/Project Case Study/Telco Customer Churn Analysis/data/telco-data/Telco_customer_churn'

telco_data = pd.read_excel(data_path + ".xlsx")
demographics_data = pd.read_excel(data_path + "_demographics.xlsx")
location_data = pd.read_excel(data_path + "_location.xlsx")
population_data = pd.read_excel(data_path + "_population.xlsx")
services_data = pd.read_excel(data_path + "_services.xlsx")
status_data = pd.read_excel(data_path + "_status.xlsx")

In [54]:
# Change Customer ID to CustomerID in certain datasets
datas = [demographics_data, location_data, population_data, services_data, status_data]
for data in datas:
    data.rename(columns={'Customer ID': 'CustomerID'}, inplace=True)

In [55]:
# Merge datasets
data_merge = telco_data.merge(demographics_data, on='CustomerID', suffixes=('','_merged'))
data_merge = data_merge.merge(location_data, on='CustomerID', suffixes=('','_merged'))
data_merge = data_merge.merge(population_data, on='Zip Code', suffixes=('','_merged'))
data_merge = data_merge.merge(services_data, on='CustomerID', suffixes=('','_merged'))
data_merge = data_merge.merge(status_data, on='CustomerID', suffixes=('','_merged'))

data_merge.columns

Index(['CustomerID', 'Count', 'Country', 'State', 'City', 'Zip Code',
       'Lat Long', 'Latitude', 'Longitude', 'Gender', 'Senior Citizen',
       'Partner', 'Dependents', 'Tenure Months', 'Phone Service',
       'Multiple Lines', 'Internet Service', 'Online Security',
       'Online Backup', 'Device Protection', 'Tech Support', 'Streaming TV',
       'Streaming Movies', 'Contract', 'Paperless Billing', 'Payment Method',
       'Monthly Charges', 'Total Charges', 'Churn Label', 'Churn Value',
       'Churn Score', 'CLTV', 'Churn Reason', 'Count_merged', 'Gender_merged',
       'Age', 'Under 30', 'Senior Citizen_merged', 'Married',
       'Dependents_merged', 'Number of Dependents', 'Count_merged',
       'Country_merged', 'State_merged', 'City_merged', 'Zip Code_merged',
       'Lat Long_merged', 'Latitude_merged', 'Longitude_merged', 'ID',
       'Population', 'Count_merged', 'Quarter', 'Referred a Friend',
       'Number of Referrals', 'Tenure in Months', 'Offer',
       'Phone Ser

In [56]:
# Drop the duplicate columns that the merges suffixed with '_merged'
data_merge = data_merge.drop(columns=data_merge.columns[data_merge.columns.str.contains('_merged')])
data_merge.columns

Index(['CustomerID', 'Count', 'Country', 'State', 'City', 'Zip Code',
       'Lat Long', 'Latitude', 'Longitude', 'Gender', 'Senior Citizen',
       'Partner', 'Dependents', 'Tenure Months', 'Phone Service',
       'Multiple Lines', 'Internet Service', 'Online Security',
       'Online Backup', 'Device Protection', 'Tech Support', 'Streaming TV',
       'Streaming Movies', 'Contract', 'Paperless Billing', 'Payment Method',
       'Monthly Charges', 'Total Charges', 'Churn Label', 'Churn Value',
       'Churn Score', 'CLTV', 'Churn Reason', 'Age', 'Under 30', 'Married',
       'Number of Dependents', 'ID', 'Population', 'Quarter',
       'Referred a Friend', 'Number of Referrals', 'Tenure in Months', 'Offer',
       'Avg Monthly Long Distance Charges', 'Internet Type',
       'Avg Monthly GB Download', 'Device Protection Plan',
       'Premium Tech Support', 'Streaming Music', 'Unlimited Data',
       'Monthly Charge', 'Total Refunds', 'Total Extra Data Charges',
       'Total Long Dist

In [57]:
# Group columns
demographics_category = ['CustomerID', 'Gender', 'Age', 'Senior Citizen', 'Partner', 'Dependents', 'Married', 'Number of Dependents']
location_category = ['Country', 'State', 'City', 'Zip Code', 'Population', 'Lat Long', 'Latitude', 'Longitude']
services_category = ['Quarter', 'Referred a Friend', 'Number of Referrals', 'Offer', 'Phone Service', 'Multiple Lines', 'Internet Service', 'Online Security', 'Online Backup', 'Device Protection', 'Tech Support', 'Streaming TV', 'Streaming Movies', 'Contract', 'Paperless Billing', 'Payment Method', 'Device Protection Plan', 'Premium Tech Support', 'Streaming Music', 'Unlimited Data']
finance_category = ['Monthly Charges', 'Total Charges', 'Monthly Charge', 'Total Refunds', 'Total Extra Data Charges', 'Total Long Distance Charges', 'Total Revenue']
condition_category = ['Tenure Months', 'Tenure in Months', 'Avg Monthly Long Distance Charges', 'Avg Monthly GB Download', 'Satisfaction Score']
churn_category = ['Churn Score', 'CLTV', 'Churn Reason', 'Churn Category', 'Customer Status', 'Churn Label', 'Churn Value']

grouped_data = data_merge[demographics_category+location_category+condition_category+finance_category+services_category+churn_category]

In [58]:
grouped_data.head()

Unnamed: 0,CustomerID,Gender,Age,Senior Citizen,Partner,Dependents,Married,Number of Dependents,Country,State,City,Zip Code,Population,Lat Long,Latitude,Longitude,Tenure Months,Tenure in Months,Avg Monthly Long Distance Charges,Avg Monthly GB Download,Satisfaction Score,Monthly Charges,Total Charges,Monthly Charge,Total Refunds,Total Extra Data Charges,Total Long Distance Charges,Total Revenue,Quarter,Referred a Friend,Number of Referrals,Offer,Phone Service,Multiple Lines,Internet Service,Online Security,Online Backup,Device Protection,Tech Support,Streaming TV,Streaming Movies,Contract,Paperless Billing,Payment Method,Device Protection Plan,Premium Tech Support,Streaming Music,Unlimited Data,Churn Score,CLTV,Churn Reason,Churn Category,Customer Status,Churn Label,Churn Value
0,3668-QPYBK,Male,37,No,No,No,No,0,United States,California,Los Angeles,90003,58198,"33.964131, -118.272783",33.964131,-118.272783,2,2,10.47,21,1,53.85,108.15,53.85,0.0,0,20.94,129.09,Q3,No,0,,Yes,No,DSL,Yes,Yes,No,No,No,No,Month-to-month,Yes,Mailed check,No,No,No,Yes,86,3239,Competitor made better offer,Competitor,Churned,Yes,1
1,2967-MXRAV,Male,29,No,Yes,No,Yes,0,United States,California,Los Angeles,90003,58198,"33.964131, -118.272783",33.964131,-118.272783,1,1,43.57,0,3,18.8,18.8,18.8,0.0,0,43.57,62.37,Q3,Yes,9,,Yes,No,No,No internet service,No internet service,No internet service,No internet service,No internet service,No internet service,One year,No,Mailed check,No,No,No,No,51,5160,,,Joined,No,0
2,9643-AVVWI,Female,49,No,Yes,Yes,Yes,3,United States,California,Los Angeles,90003,58198,"33.964131, -118.272783",33.964131,-118.272783,3,3,19.18,22,3,80.0,241.3,80.0,0.0,0,57.54,298.84,Q3,Yes,2,,Yes,No,Fiber optic,No,Yes,No,Yes,No,No,Month-to-month,Yes,Electronic check,No,Yes,No,Yes,76,4264,,,Joined,No,0
3,0060-FUALY,Female,60,No,Yes,No,Yes,0,United States,California,Los Angeles,90003,58198,"33.964131, -118.272783",33.964131,-118.272783,59,59,16.39,14,3,94.75,5597.65,94.75,0.0,0,967.01,6564.66,Q3,Yes,4,Offer B,Yes,Yes,Fiber optic,Yes,Yes,No,No,Yes,No,Month-to-month,Yes,Electronic check,No,No,No,Yes,26,5238,,,Stayed,No,0
4,9696-RMYBA,Male,56,No,No,No,No,0,United States,California,Los Angeles,90003,58198,"33.964131, -118.272783",33.964131,-118.272783,5,5,12.35,13,3,80.1,398.55,80.1,0.0,0,61.75,460.3,Q3,No,0,,Yes,No,Fiber optic,No,No,No,No,Yes,No,Month-to-month,Yes,Mailed check,No,No,No,Yes,22,5225,,,Stayed,No,0


# Data Cleaning

In [59]:
# Data information
grouped_data.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 7043 entries, 0 to 7042
Data columns (total 55 columns):
 #   Column                             Non-Null Count  Dtype  
---  ------                             --------------  -----  
 0   CustomerID                         7043 non-null   object 
 1   Gender                             7043 non-null   object 
 2   Age                                7043 non-null   int64  
 3   Senior Citizen                     7043 non-null   object 
 4   Partner                            7043 non-null   object 
 5   Dependents                         7043 non-null   object 
 6   Married                            7043 non-null   object 
 7   Number of Dependents               7043 non-null   int64  
 8   Country                            7043 non-null   object 
 9   State                              7043 non-null   object 
 10  City                               7043 non-null   object 
 11  Zip Code                           7043 non-null   int64

In [60]:
# Data description
grouped_data.describe()

Unnamed: 0,Age,Number of Dependents,Zip Code,Population,Latitude,Longitude,Tenure Months,Tenure in Months,Avg Monthly Long Distance Charges,Avg Monthly GB Download,Satisfaction Score,Monthly Charges,Monthly Charge,Total Refunds,Total Extra Data Charges,Total Long Distance Charges,Total Revenue,Number of Referrals,Churn Score,CLTV,Churn Value
count,7043.0,7043.0,7043.0,7043.0,7043.0,7043.0,7043.0,7043.0,7043.0,7043.0,7043.0,7043.0,7043.0,7043.0,7043.0,7043.0,7043.0,7043.0,7043.0,7043.0,7043.0
mean,46.509726,0.468692,93521.964646,21181.589238,36.282441,-119.79888,32.371149,32.386767,22.958954,20.515405,3.244924,64.761692,64.761692,1.962182,6.860713,749.099262,3034.379056,1.951867,58.699418,4400.295755,0.26537
std,16.750352,0.962802,1865.794555,20901.246553,2.455723,2.157889,24.559481,24.542061,15.448113,20.41894,1.201657,30.090047,30.090047,7.902614,25.104978,846.660055,2865.204542,3.001199,21.525131,1183.057152,0.441561
min,19.0,0.0,90001.0,11.0,32.555828,-124.301372,0.0,1.0,0.0,0.0,1.0,18.25,18.25,0.0,0.0,0.0,21.36,0.0,5.0,2003.0,0.0
25%,32.0,0.0,92102.0,2048.0,34.030915,-121.815412,9.0,9.0,9.21,3.0,3.0,35.5,35.5,0.0,0.0,70.545,605.61,0.0,40.0,3469.0,0.0
50%,46.0,0.0,93552.0,15975.0,36.391777,-119.730885,29.0,29.0,22.89,17.0,3.0,70.35,70.35,0.0,0.0,401.44,2108.64,0.0,61.0,4527.0,0.0
75%,60.0,0.0,95351.0,34146.0,38.224869,-118.043237,55.0,55.0,36.395,27.0,4.0,89.85,89.85,0.0,0.0,1191.1,4801.145,3.0,75.0,5380.5,1.0
max,80.0,9.0,96161.0,105285.0,41.962127,-114.192901,72.0,72.0,49.99,85.0,5.0,118.75,118.75,49.79,150.0,3564.72,11979.34,11.0,100.0,6500.0,1.0


In [61]:
# Check null values
grouped_data.isnull().sum()

CustomerID                              0
Gender                                  0
Age                                     0
Senior Citizen                          0
Partner                                 0
Dependents                              0
Married                                 0
Number of Dependents                    0
Country                                 0
State                                   0
City                                    0
Zip Code                                0
Population                              0
Lat Long                                0
Latitude                                0
Longitude                               0
Tenure Months                           0
Tenure in Months                        0
Avg Monthly Long Distance Charges       0
Avg Monthly GB Download                 0
Satisfaction Score                      0
Monthly Charges                         0
Total Charges                           0
Monthly Charge                    

In [62]:
# Check duplicate data
grouped_data.duplicated().sum()

0

## Drop unnecessary columns

After reviewing the previous steps, I found several columns that can be dropped to simplify the analysis and improve its quality:

- Columns with only one unique value: a column like this has no variation, so it cannot support any meaningful analysis. Dropping it filters the data down to the more relevant information.

- Columns with the same meaning: some columns duplicate the meaning of another column and only complicate the analysis. Dropping them speeds up the analysis and keeps the focus on the information that matters.
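The two criteria above can be checked before anything is dropped. The following is a minimal sketch on a made-up toy frame: the column names mirror the notebook, but the values and the helper `share_equal` are illustrative assumptions, not part of the original analysis.

```python
import pandas as pd

# Toy frame; column names mirror the notebook, values are made up.
toy = pd.DataFrame({
    "Country": ["United States"] * 4,
    "Tenure Months": [0, 5, 12, 30],
    "Tenure in Months": [1, 5, 12, 30],
    "Monthly Charges": [20.0, 55.5, 70.1, 99.9],
    "Monthly Charge": [20.0, 55.5, 70.1, 99.9],
})

# Criterion 1: columns with a single distinct value carry no variation.
n_unique = toy.nunique()
constant_cols = n_unique[n_unique == 1].index.tolist()

def share_equal(df, left, right):
    """Hypothetical helper: fraction of rows where two columns agree."""
    return (df[left] == df[right]).mean()

# Criterion 2: measure how often two similarly named columns agree.
charge_match = share_equal(toy, "Monthly Charges", "Monthly Charge")
tenure_match = share_equal(toy, "Tenure Months", "Tenure in Months")

print(constant_cols, charge_match, tenure_match)
```

A share of 1.0 suggests the two columns are interchangeable. Anything lower is worth a look at the disagreeing rows before choosing which column to keep.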

In [63]:
# Count the unique values in each column
unique_values = grouped_data.nunique()
unique_values

CustomerID                           7043
Gender                                  2
Age                                    62
Senior Citizen                          2
Partner                                 2
Dependents                              2
Married                                 2
Number of Dependents                   10
Country                                 1
State                                   1
City                                 1129
Zip Code                             1652
Population                           1592
Lat Long                             1652
Latitude                             1652
Longitude                            1651
Tenure Months                          73
Tenure in Months                       72
Avg Monthly Long Distance Charges    3584
Avg Monthly GB Download                50
Satisfaction Score                      5
Monthly Charges                      1585
Total Charges                        6531
Monthly Charge                    

In [64]:
# Drop columns that have only one unique value
cleaned_data = grouped_data.drop(columns=grouped_data.columns[unique_values == 1])

In [65]:
# Drop columns that have the same meaning as another column
drop_same_columns = ['Partner','Lat Long','Churn Label',
                     'Tenure Months','Monthly Charges']
cleaned_data = cleaned_data.drop(columns=drop_same_columns)

In [66]:
cleaned_data.head()

Unnamed: 0,CustomerID,Gender,Age,Senior Citizen,Dependents,Married,Number of Dependents,City,Zip Code,Population,Latitude,Longitude,Tenure in Months,Avg Monthly Long Distance Charges,Avg Monthly GB Download,Satisfaction Score,Total Charges,Monthly Charge,Total Refunds,Total Extra Data Charges,Total Long Distance Charges,Total Revenue,Referred a Friend,Number of Referrals,Offer,Phone Service,Multiple Lines,Internet Service,Online Security,Online Backup,Device Protection,Tech Support,Streaming TV,Streaming Movies,Contract,Paperless Billing,Payment Method,Device Protection Plan,Premium Tech Support,Streaming Music,Unlimited Data,Churn Score,CLTV,Churn Reason,Churn Category,Customer Status,Churn Value
0,3668-QPYBK,Male,37,No,No,No,0,Los Angeles,90003,58198,33.964131,-118.272783,2,10.47,21,1,108.15,53.85,0.0,0,20.94,129.09,No,0,,Yes,No,DSL,Yes,Yes,No,No,No,No,Month-to-month,Yes,Mailed check,No,No,No,Yes,86,3239,Competitor made better offer,Competitor,Churned,1
1,2967-MXRAV,Male,29,No,No,Yes,0,Los Angeles,90003,58198,33.964131,-118.272783,1,43.57,0,3,18.8,18.8,0.0,0,43.57,62.37,Yes,9,,Yes,No,No,No internet service,No internet service,No internet service,No internet service,No internet service,No internet service,One year,No,Mailed check,No,No,No,No,51,5160,,,Joined,0
2,9643-AVVWI,Female,49,No,Yes,Yes,3,Los Angeles,90003,58198,33.964131,-118.272783,3,19.18,22,3,241.3,80.0,0.0,0,57.54,298.84,Yes,2,,Yes,No,Fiber optic,No,Yes,No,Yes,No,No,Month-to-month,Yes,Electronic check,No,Yes,No,Yes,76,4264,,,Joined,0
3,0060-FUALY,Female,60,No,No,Yes,0,Los Angeles,90003,58198,33.964131,-118.272783,59,16.39,14,3,5597.65,94.75,0.0,0,967.01,6564.66,Yes,4,Offer B,Yes,Yes,Fiber optic,Yes,Yes,No,No,Yes,No,Month-to-month,Yes,Electronic check,No,No,No,Yes,26,5238,,,Stayed,0
4,9696-RMYBA,Male,56,No,No,No,0,Los Angeles,90003,58198,33.964131,-118.272783,5,12.35,13,3,398.55,80.1,0.0,0,61.75,460.3,No,0,,Yes,No,Fiber optic,No,No,No,No,Yes,No,Month-to-month,Yes,Mailed check,No,No,No,Yes,22,5225,,,Stayed,0


## Correct invalid values

### Handling non-numeric values

Converting the `Total Charges` column to float raises the error `could not convert string to float: ' '`. This shows that the column contains non-numeric (string) values that cannot be converted to float. To handle this, my first step is to inspect the rows that contain those non-numeric values.
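One way to locate such rows without knowing the offending string in advance is `pd.to_numeric` with `errors="coerce"`. This is a sketch on a made-up toy column, not the approach used in the cell that follows, which filters on the exact string `' '` from the error message.

```python
import pandas as pd

# Toy column; values are made up and include a blank string
# like the one reported in the error message.
toy = pd.DataFrame({"Total Charges": [108.15, " ", 241.3, "18.8"]})

# errors="coerce" turns anything unparseable into NaN instead of raising.
as_number = pd.to_numeric(toy["Total Charges"], errors="coerce")

# Rows that held a value but failed to parse are the problem rows.
bad_rows = toy[as_number.isna() & toy["Total Charges"].notna()]
print(bad_rows)
```

Because the filter does not depend on a specific string, it would also catch other unparseable entries, if any existed.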

In [67]:
total_charge_non_numeric = cleaned_data[cleaned_data['Total Charges'] == ' ']
print(f'Number of rows with a non-numeric Total Charges value: {total_charge_non_numeric.shape[0]}')
total_charge_non_numeric

Number of rows with a non-numeric Total Charges value: 11


Unnamed: 0,CustomerID,Gender,Age,Senior Citizen,Dependents,Married,Number of Dependents,City,Zip Code,Population,Latitude,Longitude,Tenure in Months,Avg Monthly Long Distance Charges,Avg Monthly GB Download,Satisfaction Score,Total Charges,Monthly Charge,Total Refunds,Total Extra Data Charges,Total Long Distance Charges,Total Revenue,Referred a Friend,Number of Referrals,Offer,Phone Service,Multiple Lines,Internet Service,Online Security,Online Backup,Device Protection,Tech Support,Streaming TV,Streaming Movies,Contract,Paperless Billing,Payment Method,Device Protection Plan,Premium Tech Support,Streaming Music,Unlimited Data,Churn Score,CLTV,Churn Reason,Churn Category,Customer Status,Churn Value
48,7644-OMVMY,Male,56,No,Yes,Yes,1,Los Angeles,90029,41713,34.089953,-118.294824,10,15.51,0,3,,19.85,0.0,0,155.1,353.6,Yes,5,,Yes,No,No,No internet service,No internet service,No internet service,No internet service,No internet service,No internet service,Two year,No,Mailed check,No,No,No,No,53,2019,,,Stayed,0
2298,4472-LVYGI,Female,43,No,No,Yes,0,San Bernardino,92408,12149,34.084909,-117.258107,10,0.0,20,3,,52.55,0.0,0,0.0,525.5,Yes,2,,No,No phone service,DSL,Yes,No,Yes,Yes,Yes,No,Two year,Yes,Bank transfer (automatic),Yes,Yes,No,Yes,36,2578,,,Stayed,0
2457,3115-CZMZD,Male,24,No,No,No,0,Independence,93526,734,36.869584,-118.189241,10,13.16,0,4,,20.25,0.0,0,131.6,334.1,No,0,,Yes,No,No,No internet service,No internet service,No internet service,No internet service,No internet service,No internet service,Two year,No,Mailed check,No,No,No,No,68,5504,,,Stayed,0
3787,5709-LVOEQ,Female,40,No,No,Yes,0,San Mateo,94401,32488,37.590421,-122.306467,10,31.09,5,3,,80.85,0.0,0,310.9,1119.4,Yes,8,,Yes,No,DSL,Yes,Yes,Yes,No,Yes,Yes,Two year,No,Mailed check,Yes,No,Yes,Yes,45,2048,,,Stayed,0
3852,4367-NUYAO,Male,39,No,Yes,Yes,1,Cupertino,95014,54431,37.306612,-122.080621,10,22.83,0,5,,25.75,0.0,0,228.3,485.8,Yes,5,,Yes,Yes,No,No internet service,No internet service,No internet service,No internet service,No internet service,No internet service,Two year,No,Mailed check,No,No,No,No,48,4950,,,Stayed,0
4684,2520-SGTTA,Female,23,No,Yes,Yes,3,Ben Lomond,95005,6407,37.078873,-122.090386,10,20.05,0,4,,20.0,0.0,0,200.5,400.5,Yes,4,Offer E,Yes,No,No,No internet service,No internet service,No internet service,No internet service,No internet service,No internet service,Two year,No,Mailed check,No,No,No,No,27,3763,,,Stayed,0
5156,4075-WKNIU,Female,25,No,Yes,Yes,3,Bell,90201,105285,33.970343,-118.171368,10,5.59,59,4,,73.35,0.0,0,55.9,789.4,Yes,6,Offer E,Yes,Yes,DSL,No,Yes,Yes,Yes,Yes,No,Two year,No,Mailed check,Yes,Yes,No,Yes,44,2342,,,Stayed,0
5281,2775-SEFEE,Male,56,No,Yes,No,1,Wilmington,90744,53323,33.782068,-118.262263,10,29.95,19,4,,61.9,0.0,0,299.5,918.5,No,0,,Yes,Yes,DSL,Yes,Yes,No,Yes,No,No,Two year,Yes,Bank transfer (automatic),No,Yes,No,Yes,65,5188,,,Stayed,0
5405,2923-ARZLG,Male,38,No,Yes,Yes,2,La Verne,91750,35530,34.144703,-117.770299,10,46.23,0,5,,19.7,0.0,0,462.3,659.84,Yes,5,Offer E,Yes,No,No,No internet service,No internet service,No internet service,No internet service,No internet service,No internet service,One year,Yes,Mailed check,No,No,No,No,69,4890,,,Stayed,0
5665,3213-VVOLG,Male,22,No,Yes,Yes,2,Sun City,92585,8692,33.739412,-117.173334,10,36.37,0,4,,25.35,0.0,0,363.7,617.2,Yes,3,,Yes,Yes,No,No internet service,No internet service,No internet service,No internet service,No internet service,No internet service,Two year,No,Mailed check,No,No,No,No,49,2299,,,Stayed,0


After inspecting the rows where `Total Charges` is non-numeric, I conclude that an input error produced these values. To handle this, I will fill in an appropriate value: compute `Tenure in Months * Monthly Charge` to get a total-charges estimate that is consistent with the rest of the data, then convert the `Total Charges` column to `float`.
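The imputation can be sketched on a toy frame with an explicit boolean mask. The values below are made up. The notebook's own cell assigns the full product Series and relies on `.loc` aligning it by index, while this variant selects the same rows on both sides to make the intent visible.

```python
import pandas as pd

# Toy frame; values are made up, one row has a blank Total Charges.
toy = pd.DataFrame({
    "Tenure in Months": [2, 10, 3],
    "Monthly Charge": [53.85, 19.85, 80.0],
    "Total Charges": [108.15, " ", 241.3],
})

# Rows whose Total Charges is the blank string.
mask = toy["Total Charges"] == " "

# Estimate: tenure multiplied by the monthly charge.
estimate = toy["Tenure in Months"] * toy["Monthly Charge"]

# Fill only the masked rows, then convert the whole column to float.
toy.loc[mask, "Total Charges"] = estimate[mask]
toy["Total Charges"] = toy["Total Charges"].astype(float)
print(toy)
```

The estimate ignores any price changes during the customer's tenure, so it is an approximation rather than a recovered true value.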

In [68]:
cleaned_data.loc[cleaned_data['Total Charges'] == ' ', 'Total Charges'] = cleaned_data['Tenure in Months'] * cleaned_data['Monthly Charge']

In [69]:
cleaned_data['Total Charges'] = cleaned_data['Total Charges'].astype(float)

### Handling missing values

There are missing values in the `Churn Category` and `Churn Reason` columns, which means the customer did not state why they churned. To handle this, I will replace the missing values with the string "No Reason".

Nataniel Klug, who has 25 years of experience in the telecommunications industry, explains in a comment on Medium:

> "Analyzing the reasons for customer churn is difficult because most customers tend to lie or give reasons that are not entirely honest during the process, and those who do state a reason for churning tend to give general reasons without clear detail."

Even so, I consider churn reasons a valuable source of insight. Information about why customers leave can help identify the areas of a product or service that need improvement, and can help in planning more effective marketing strategies to retain customers. Any insight obtained this way still needs to be validated.

In [70]:
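# Fill missing churn information with an explicit label
# (fillna avoids the np.NaN alias, which was removed in NumPy 2.0)
cleaned_data['Churn Category'] = cleaned_data['Churn Category'].fillna('No Reason')
cleaned_data['Churn Reason'] = cleaned_data['Churn Reason'].fillna('No Reason')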