# Telecom Churn Prediction

In the telecommunication industry, customers tend to change operators if not provided with attractive schemes and offers. It is very important for any telecom operator to prevent the present customers from churning to other operators. As a data scientist, your task in this case study would be to build an ML model which can predict if the customer will churn or not in a particular month based on the past data.

#### Objectives

The main goal of the case study is to build ML models to predict churn. The predictive model that you’re going to build will serve the following purposes:

- It will be used to predict whether a high-value customer will churn in the near future (i.e. the churn phase). Knowing this, the company can take action, such as offering special plans or discounts on recharges.

- It will be used to identify important variables that are strong predictors of churn. These variables may also indicate why customers choose to switch to other networks.

- Even though overall accuracy will be your primary evaluation metric, you should also report other metrics, such as precision and recall, for the different models, because different business objectives call for different metrics. For example, in this problem statement, one business goal could be to build an ML model that identifies the customers who will churn more reliably than the ones who will not. Make sure you mention which metric suits such scenarios.

- Recommend strategies to manage customer churn based on your observations.
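
The metric discussion in the objectives can be made concrete with a small sketch. The labels below are made-up toy values, not taken from the dataset, and the calls are the standard `sklearn.metrics` functions. When the goal is to catch as many churners as possible, recall on the churn class (label 1) is a natural candidate; precision matters more when retention offers are expensive.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Toy labels (illustrative only): 1 = churn, 0 = no churn
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

accuracy = accuracy_score(y_true, y_pred)    # share of all customers classified correctly
precision = precision_score(y_true, y_pred)  # share of predicted churners who really churned
recall = recall_score(y_true, y_pred)        # share of real churners the model caught

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

On these toy labels accuracy looks high even though only half of the actual churners are caught, which is why accuracy alone can be misleading when churners are a small minority.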

In [116]:
# Importing the libraries
import pandas as pd
import numpy as np

import warnings
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn import metrics

In [117]:
#output settings
pd.set_option('display.max_columns', 500)
pd.set_option('display.max_rows', 500)

warnings.filterwarnings("ignore")

In [118]:
# Reading the train and unseen test data
df = pd.read_csv('train.csv')
df_unseen = pd.read_csv('test (1).csv')

#### Data Understanding

In [119]:
#train data
df.shape

(69999, 172)

In [120]:
#test data
df_unseen.shape

(30000, 171)
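
The train set has one more column than the unseen set (172 vs. 171). A quick way to see which column differs is a set difference on the column names. The sketch below uses two tiny stand-in frames with hypothetical values, since it is meant to run on its own; on the real data the same two lines would be applied to `df` and `df_unseen`.

```python
import pandas as pd

# Tiny stand-ins for the real frames (hypothetical values)
train_demo = pd.DataFrame({"id": [0, 1], "arpu_8": [7.5, 43.0], "churn_probability": [0, 1]})
unseen_demo = pd.DataFrame({"id": [2, 3], "arpu_8": [64.4, 360.9]})

# Columns present in one frame but not in the other
only_in_train = sorted(set(train_demo.columns) - set(unseen_demo.columns))
only_in_unseen = sorted(set(unseen_demo.columns) - set(train_demo.columns))

print(only_in_train, only_in_unseen)
```

If the only difference is the target column (`churn_probability`), the two frames share the same feature set and can go through the same preprocessing.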

In [121]:
df.head()

Unnamed: 0,id,circle_id,loc_og_t2o_mou,std_og_t2o_mou,loc_ic_t2o_mou,last_date_of_month_6,last_date_of_month_7,last_date_of_month_8,arpu_6,arpu_7,arpu_8,onnet_mou_6,onnet_mou_7,onnet_mou_8,offnet_mou_6,offnet_mou_7,offnet_mou_8,roam_ic_mou_6,roam_ic_mou_7,roam_ic_mou_8,roam_og_mou_6,roam_og_mou_7,roam_og_mou_8,loc_og_t2t_mou_6,loc_og_t2t_mou_7,loc_og_t2t_mou_8,loc_og_t2m_mou_6,loc_og_t2m_mou_7,loc_og_t2m_mou_8,loc_og_t2f_mou_6,loc_og_t2f_mou_7,loc_og_t2f_mou_8,loc_og_t2c_mou_6,loc_og_t2c_mou_7,loc_og_t2c_mou_8,loc_og_mou_6,loc_og_mou_7,loc_og_mou_8,std_og_t2t_mou_6,std_og_t2t_mou_7,std_og_t2t_mou_8,std_og_t2m_mou_6,std_og_t2m_mou_7,std_og_t2m_mou_8,std_og_t2f_mou_6,std_og_t2f_mou_7,std_og_t2f_mou_8,std_og_t2c_mou_6,std_og_t2c_mou_7,std_og_t2c_mou_8,std_og_mou_6,std_og_mou_7,std_og_mou_8,isd_og_mou_6,isd_og_mou_7,isd_og_mou_8,spl_og_mou_6,spl_og_mou_7,spl_og_mou_8,og_others_6,og_others_7,og_others_8,total_og_mou_6,total_og_mou_7,total_og_mou_8,loc_ic_t2t_mou_6,loc_ic_t2t_mou_7,loc_ic_t2t_mou_8,loc_ic_t2m_mou_6,loc_ic_t2m_mou_7,loc_ic_t2m_mou_8,loc_ic_t2f_mou_6,loc_ic_t2f_mou_7,loc_ic_t2f_mou_8,loc_ic_mou_6,loc_ic_mou_7,loc_ic_mou_8,std_ic_t2t_mou_6,std_ic_t2t_mou_7,std_ic_t2t_mou_8,std_ic_t2m_mou_6,std_ic_t2m_mou_7,std_ic_t2m_mou_8,std_ic_t2f_mou_6,std_ic_t2f_mou_7,std_ic_t2f_mou_8,std_ic_t2o_mou_6,std_ic_t2o_mou_7,std_ic_t2o_mou_8,std_ic_mou_6,std_ic_mou_7,std_ic_mou_8,total_ic_mou_6,total_ic_mou_7,total_ic_mou_8,spl_ic_mou_6,spl_ic_mou_7,spl_ic_mou_8,isd_ic_mou_6,isd_ic_mou_7,isd_ic_mou_8,ic_others_6,ic_others_7,ic_others_8,total_rech_num_6,total_rech_num_7,total_rech_num_8,total_rech_amt_6,total_rech_amt_7,total_rech_amt_8,max_rech_amt_6,max_rech_amt_7,max_rech_amt_8,date_of_last_rech_6,date_of_last_rech_7,date_of_last_rech_8,last_day_rch_amt_6,last_day_rch_amt_7,last_day_rch_amt_8,date_of_last_rech_data_6,date_of_last_rech_data_7,date_of_last_rech_data_8,total_rech_data_6,total_rech_data_7,total_rech_data_8,max_rech_data_6,max_rech_data_7,max_rech_data_8,count_rech_2g_6,count_rech_2g_7,count_rech_2g_8,count_rech_3g_6,count_rech_3g_7,count_rech_3g_8,av_rech_amt_data_6,av_rech_amt_data_7,av_rech_amt_data_8,vol_2g_mb_6,vol_2g_mb_7,vol_2g_mb_8,vol_3g_mb_6,vol_3g_mb_7,vol_3g_mb_8,arpu_3g_6,arpu_3g_7,arpu_3g_8,arpu_2g_6,arpu_2g_7,arpu_2g_8,night_pck_user_6,night_pck_user_7,night_pck_user_8,monthly_2g_6,monthly_2g_7,monthly_2g_8,sachet_2g_6,sachet_2g_7,sachet_2g_8,monthly_3g_6,monthly_3g_7,monthly_3g_8,sachet_3g_6,sachet_3g_7,sachet_3g_8,fb_user_6,fb_user_7,fb_user_8,aon,aug_vbc_3g,jul_vbc_3g,jun_vbc_3g,churn_probability
0,0,109,0.0,0.0,0.0,6/30/2014,7/31/2014,8/31/2014,31.277,87.009,7.527,48.58,124.38,1.29,32.24,96.68,2.33,0.0,0.0,0.0,0.0,0.0,0.0,2.23,0.0,0.28,5.29,16.04,2.33,0.0,0.0,0.0,0.0,0.0,0.0,7.53,16.04,2.61,46.34,124.38,1.01,18.75,80.61,0.0,0.0,0.0,0.0,0.0,0.0,0.0,65.09,204.99,1.01,0.0,0.0,0.0,8.2,0.63,0.0,0.38,0.0,0.0,81.21,221.68,3.63,2.43,3.68,7.79,0.83,21.08,16.91,0.0,0.0,0.0,3.26,24.76,24.71,0.0,7.61,0.21,7.46,19.96,14.96,0.0,0.0,0.0,0.0,0.0,0.0,7.46,27.58,15.18,11.84,53.04,40.56,0.0,0.0,0.66,0.0,0.0,0.0,1.11,0.69,0.0,3,2,2,77,65,10,65,65,10,6/22/2014,7/10/2014,8/24/2014,65,65,0,,,,,,,,,,,,,,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,,,,1958,0.0,0.0,0.0,0
1,1,109,0.0,0.0,0.0,6/30/2014,7/31/2014,8/31/2014,0.0,122.787,42.953,0.0,0.0,0.0,0.0,25.99,30.89,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,22.01,29.79,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,30.73,31.66,0.0,0.0,0.0,0.0,30.73,31.66,1.68,19.09,10.53,1.41,18.68,11.09,0.35,1.66,3.4,3.44,39.44,25.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,3.44,39.44,25.04,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,3,4,5,0,145,50,0,145,50,6/12/2014,7/10/2014,8/26/2014,0,0,0,,7/8/2014,,,1.0,,,145.0,,,0.0,,,1.0,,,145.0,,0.0,352.91,0.0,0.0,3.96,0.0,,122.07,,,122.08,,,0.0,,0,0,0,0,0,0,0,1,0,0,0,0,,1.0,,710,0.0,0.0,0.0,0
2,2,109,0.0,0.0,0.0,6/30/2014,7/31/2014,8/31/2014,60.806,103.176,0.0,0.53,15.93,0.0,53.99,82.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.53,12.98,0.0,24.11,0.0,0.0,0.0,0.0,0.0,2.14,0.0,0.0,24.64,12.98,0.0,0.0,2.94,0.0,28.94,82.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,28.94,84.99,0.0,0.0,0.0,0.0,2.89,1.38,0.0,0.0,0.0,0.0,56.49,99.36,0.0,4.51,6.16,6.49,89.86,25.18,23.51,0.0,0.0,0.0,94.38,31.34,30.01,11.69,0.0,0.0,18.21,2.48,6.38,0.0,0.0,0.0,0.0,0.0,0.0,29.91,2.48,6.38,124.29,33.83,36.64,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,2,4,2,70,120,0,70,70,0,6/11/2014,7/22/2014,8/24/2014,70,50,0,,,,,,,,,,,,,,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,,,,882,0.0,0.0,0.0,0
3,3,109,0.0,0.0,0.0,6/30/2014,7/31/2014,8/31/2014,156.362,205.26,111.095,7.26,16.01,0.0,68.76,78.48,50.23,0.0,0.0,0.0,0.0,0.0,1.63,6.99,3.94,0.0,37.91,44.89,23.63,0.0,0.0,0.0,0.0,0.0,8.03,44.91,48.84,23.63,0.26,12.06,0.0,15.33,25.93,4.6,0.56,0.0,0.0,0.0,0.0,0.0,16.16,37.99,4.6,0.0,0.0,0.0,14.95,9.13,25.61,0.0,0.0,0.0,76.03,95.98,53.84,24.98,4.84,23.88,53.99,44.23,57.14,7.23,0.81,0.0,86.21,49.89,81.03,0.0,0.0,0.0,8.89,0.28,2.81,0.0,0.0,0.0,0.0,0.0,0.0,8.89,0.28,2.81,95.11,50.18,83.84,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2,4,3,160,240,130,110,110,50,6/15/2014,7/21/2014,8/25/2014,110,110,50,,,,,,,,,,,,,,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,,,,982,0.0,0.0,0.0,0
4,4,109,0.0,0.0,0.0,6/30/2014,7/31/2014,8/31/2014,240.708,128.191,101.565,21.28,4.83,6.13,56.99,38.11,9.63,53.64,0.0,0.0,15.73,0.0,0.0,10.16,4.83,6.13,36.74,19.88,4.61,11.99,1.23,5.01,0.0,9.85,0.0,58.91,25.94,15.76,0.0,0.0,0.0,4.35,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,4.35,0.0,0.0,0.0,0.0,0.0,0.0,17.0,0.0,0.0,0.0,0.0,63.26,42.94,15.76,5.44,1.39,2.66,10.58,4.33,19.49,5.51,3.63,6.14,21.54,9.36,28.31,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,21.54,9.36,28.31,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,13,10,8,290,136,122,50,41,30,6/25/2014,7/26/2014,8/30/2014,25,10,30,6/25/2014,7/23/2014,8/20/2014,7.0,7.0,6.0,25.0,41.0,25.0,7.0,6.0,6.0,0.0,1.0,0.0,175.0,191.0,142.0,390.8,308.89,213.47,0.0,0.0,0.0,0.0,35.0,0.0,0.0,35.12,0.0,0.0,0.0,0.0,0,0,0,7,6,6,0,0,0,0,1,0,1.0,1.0,1.0,647,0.0,0.0,0.0,0


In [122]:
df_unseen.head()

Unnamed: 0,id,circle_id,loc_og_t2o_mou,std_og_t2o_mou,loc_ic_t2o_mou,last_date_of_month_6,last_date_of_month_7,last_date_of_month_8,arpu_6,arpu_7,arpu_8,onnet_mou_6,onnet_mou_7,onnet_mou_8,offnet_mou_6,offnet_mou_7,offnet_mou_8,roam_ic_mou_6,roam_ic_mou_7,roam_ic_mou_8,roam_og_mou_6,roam_og_mou_7,roam_og_mou_8,loc_og_t2t_mou_6,loc_og_t2t_mou_7,loc_og_t2t_mou_8,loc_og_t2m_mou_6,loc_og_t2m_mou_7,loc_og_t2m_mou_8,loc_og_t2f_mou_6,loc_og_t2f_mou_7,loc_og_t2f_mou_8,loc_og_t2c_mou_6,loc_og_t2c_mou_7,loc_og_t2c_mou_8,loc_og_mou_6,loc_og_mou_7,loc_og_mou_8,std_og_t2t_mou_6,std_og_t2t_mou_7,std_og_t2t_mou_8,std_og_t2m_mou_6,std_og_t2m_mou_7,std_og_t2m_mou_8,std_og_t2f_mou_6,std_og_t2f_mou_7,std_og_t2f_mou_8,std_og_t2c_mou_6,std_og_t2c_mou_7,std_og_t2c_mou_8,std_og_mou_6,std_og_mou_7,std_og_mou_8,isd_og_mou_6,isd_og_mou_7,isd_og_mou_8,spl_og_mou_6,spl_og_mou_7,spl_og_mou_8,og_others_6,og_others_7,og_others_8,total_og_mou_6,total_og_mou_7,total_og_mou_8,loc_ic_t2t_mou_6,loc_ic_t2t_mou_7,loc_ic_t2t_mou_8,loc_ic_t2m_mou_6,loc_ic_t2m_mou_7,loc_ic_t2m_mou_8,loc_ic_t2f_mou_6,loc_ic_t2f_mou_7,loc_ic_t2f_mou_8,loc_ic_mou_6,loc_ic_mou_7,loc_ic_mou_8,std_ic_t2t_mou_6,std_ic_t2t_mou_7,std_ic_t2t_mou_8,std_ic_t2m_mou_6,std_ic_t2m_mou_7,std_ic_t2m_mou_8,std_ic_t2f_mou_6,std_ic_t2f_mou_7,std_ic_t2f_mou_8,std_ic_t2o_mou_6,std_ic_t2o_mou_7,std_ic_t2o_mou_8,std_ic_mou_6,std_ic_mou_7,std_ic_mou_8,total_ic_mou_6,total_ic_mou_7,total_ic_mou_8,spl_ic_mou_6,spl_ic_mou_7,spl_ic_mou_8,isd_ic_mou_6,isd_ic_mou_7,isd_ic_mou_8,ic_others_6,ic_others_7,ic_others_8,total_rech_num_6,total_rech_num_7,total_rech_num_8,total_rech_amt_6,total_rech_amt_7,total_rech_amt_8,max_rech_amt_6,max_rech_amt_7,max_rech_amt_8,date_of_last_rech_6,date_of_last_rech_7,date_of_last_rech_8,last_day_rch_amt_6,last_day_rch_amt_7,last_day_rch_amt_8,date_of_last_rech_data_6,date_of_last_rech_data_7,date_of_last_rech_data_8,total_rech_data_6,total_rech_data_7,total_rech_data_8,max_rech_data_6,max_rech_data_7,max_rech_data_8,count_rech_2g_6,count_rech_2g_7,count_rech_2g_8,count_rech_3g_6,count_rech_3g_7,count_rech_3g_8,av_rech_amt_data_6,av_rech_amt_data_7,av_rech_amt_data_8,vol_2g_mb_6,vol_2g_mb_7,vol_2g_mb_8,vol_3g_mb_6,vol_3g_mb_7,vol_3g_mb_8,arpu_3g_6,arpu_3g_7,arpu_3g_8,arpu_2g_6,arpu_2g_7,arpu_2g_8,night_pck_user_6,night_pck_user_7,night_pck_user_8,monthly_2g_6,monthly_2g_7,monthly_2g_8,sachet_2g_6,sachet_2g_7,sachet_2g_8,monthly_3g_6,monthly_3g_7,monthly_3g_8,sachet_3g_6,sachet_3g_7,sachet_3g_8,fb_user_6,fb_user_7,fb_user_8,aon,aug_vbc_3g,jul_vbc_3g,jun_vbc_3g
0,69999,109,0.0,0.0,0.0,6/30/2014,7/31/2014,8/31/2014,91.882,65.33,64.445,31.78,20.23,23.11,60.16,32.16,34.83,0.0,0.0,0.0,0.0,0.0,0.0,24.88,20.23,21.06,18.13,10.89,8.36,0.0,13.58,0.0,0.0,0.0,0.03,43.01,44.71,29.43,6.9,0.0,2.05,42.03,7.68,26.43,0.0,0.0,0.0,0.0,0.0,0.0,48.93,7.68,28.48,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,91.94,52.39,57.94,30.33,37.56,21.98,10.21,4.59,9.53,0.26,0.0,0.0,40.81,42.16,31.51,0.0,0.0,0.0,0.36,1.04,4.34,0.0,0.0,0.0,0.0,0.0,0.0,0.36,1.04,4.34,41.73,43.56,36.26,0.54,0.34,0.39,0.0,0.0,0.0,0.0,0.0,0.0,5,5,4,103,90,60,50,30,30,6/21/2014,7/26/2014,8/24/2014,30,30,0,,,,,,,,,,,,,,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,,,,1692,0.0,0.0,0.0
1,70000,109,0.0,0.0,0.0,6/30/2014,7/31/2014,8/31/2014,414.168,515.568,360.868,75.51,41.21,19.84,474.34,621.84,394.94,0.0,0.0,0.0,0.0,0.0,0.0,75.51,41.21,19.84,473.61,598.08,377.26,0.73,0.0,0.0,0.0,0.0,0.0,549.86,639.29,397.11,0.0,0.0,0.0,0.0,23.76,17.68,0.0,0.0,0.0,0.0,0.0,0.0,0.0,23.76,17.68,0.0,0.0,0.8,0.0,0.0,0.0,0.0,0.0,0.0,549.86,663.06,415.59,19.99,26.95,2.61,160.19,122.29,184.81,1.49,0.0,0.0,181.69,149.24,187.43,0.0,0.0,0.0,0.0,12.51,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,12.51,0.0,296.33,339.64,281.66,0.0,0.0,0.0,114.63,177.88,94.23,0.0,0.0,0.0,5,4,5,500,500,500,250,250,250,6/19/2014,7/16/2014,8/24/2014,250,0,0,,,,,,,,,,,,,,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,,,,2533,0.0,0.0,0.0
2,70001,109,0.0,0.0,0.0,6/30/2014,7/31/2014,8/31/2014,329.844,434.884,746.239,7.54,7.86,8.4,16.98,45.81,45.04,22.81,103.38,26.08,24.53,53.68,54.44,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,6,9,5,500,1000,1000,300,500,500,6/29/2014,7/27/2014,8/28/2014,0,0,0,,,,,,,,,,,,,,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,,,,277,525.61,758.41,241.84
3,70002,109,0.0,0.0,0.0,6/30/2014,7/31/2014,8/31/2014,43.55,171.39,24.4,5.31,2.16,0.0,40.04,205.01,24.01,0.0,0.0,0.0,0.0,0.0,0.0,5.31,0.0,0.0,2.94,98.61,20.51,0.0,0.0,2.35,0.0,6.18,0.0,8.26,98.61,22.86,0.0,2.16,0.0,37.09,94.36,0.0,0.0,0.0,0.0,0.0,0.0,0.0,37.09,96.53,0.0,0.0,0.0,0.0,0.0,12.03,1.15,0.0,0.0,0.0,45.36,207.18,24.01,58.11,54.64,23.04,487.94,449.83,506.94,0.0,0.38,1.64,546.06,504.86,531.64,0.0,4.26,0.0,9.63,11.88,8.83,0.0,0.0,0.0,0.0,0.0,0.0,9.63,16.14,8.83,555.69,522.44,549.13,0.0,0.0,0.0,0.0,1.43,8.65,0.0,0.0,0.0,3,5,2,110,260,0,110,150,0,6/25/2014,7/30/2014,8/24/2014,110,150,0,,,,,,,,,,,,,,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,,,,1244,0.0,0.0,0.0
4,70003,109,0.0,0.0,0.0,6/30/2014,7/31/2014,8/31/2014,306.854,406.289,413.329,450.93,609.03,700.68,60.94,23.84,74.16,0.0,0.0,0.0,0.0,0.0,0.0,0.45,0.78,14.56,2.39,2.66,10.94,0.0,0.0,0.0,0.0,0.0,0.0,2.84,3.44,25.51,450.48,608.24,686.11,58.54,21.18,63.18,0.0,0.0,0.0,0.0,0.0,0.0,509.03,629.43,749.29,0.0,0.0,0.0,0.71,5.39,4.96,2.2,0.0,0.0,514.79,638.28,779.78,0.0,0.36,9.91,10.13,9.23,7.69,0.0,0.0,0.0,10.13,9.59,17.61,29.71,92.36,107.39,13.88,13.96,32.46,0.0,0.0,1.61,0.0,0.0,0.0,43.59,106.33,141.48,53.73,115.93,159.26,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.16,11,7,8,356,490,546,90,130,130,6/29/2014,7/29/2014,8/30/2014,50,130,130,,,,,,,,,,,,,,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,,,,462,0.0,0.0,0.0


In [123]:
#dropping id column from train and unseen test as it is unique
df.drop("id", axis = 'columns', inplace = True)
df_unseen.drop("id", axis = 'columns', inplace = True)
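
The comment in the cell above states that `id` is unique. A minimal sketch of how that could be checked before dropping is shown below on a hypothetical three-row frame; `Series.is_unique` is a standard pandas attribute. Keeping a copy of the ids is an assumption on my part, useful only if predictions have to be matched back to customers later.

```python
import pandas as pd

# Hypothetical stand-in for the real frame
demo = pd.DataFrame({"id": [0, 1, 2], "circle_id": [109, 109, 109]})

# An identifier with one distinct value per row carries no predictive signal
id_is_unique = demo["id"].is_unique

# Keep the ids aside in case predictions must be labelled with them later (assumption)
saved_ids = demo["id"].copy()

if id_is_unique:
    demo = demo.drop(columns="id")

print(id_is_unique, list(demo.columns))
```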

In [124]:
df.head()

Unnamed: 0,circle_id,loc_og_t2o_mou,std_og_t2o_mou,loc_ic_t2o_mou,last_date_of_month_6,last_date_of_month_7,last_date_of_month_8,arpu_6,arpu_7,arpu_8,onnet_mou_6,onnet_mou_7,onnet_mou_8,offnet_mou_6,offnet_mou_7,offnet_mou_8,roam_ic_mou_6,roam_ic_mou_7,roam_ic_mou_8,roam_og_mou_6,roam_og_mou_7,roam_og_mou_8,loc_og_t2t_mou_6,loc_og_t2t_mou_7,loc_og_t2t_mou_8,loc_og_t2m_mou_6,loc_og_t2m_mou_7,loc_og_t2m_mou_8,loc_og_t2f_mou_6,loc_og_t2f_mou_7,loc_og_t2f_mou_8,loc_og_t2c_mou_6,loc_og_t2c_mou_7,loc_og_t2c_mou_8,loc_og_mou_6,loc_og_mou_7,loc_og_mou_8,std_og_t2t_mou_6,std_og_t2t_mou_7,std_og_t2t_mou_8,std_og_t2m_mou_6,std_og_t2m_mou_7,std_og_t2m_mou_8,std_og_t2f_mou_6,std_og_t2f_mou_7,std_og_t2f_mou_8,std_og_t2c_mou_6,std_og_t2c_mou_7,std_og_t2c_mou_8,std_og_mou_6,std_og_mou_7,std_og_mou_8,isd_og_mou_6,isd_og_mou_7,isd_og_mou_8,spl_og_mou_6,spl_og_mou_7,spl_og_mou_8,og_others_6,og_others_7,og_others_8,total_og_mou_6,total_og_mou_7,total_og_mou_8,loc_ic_t2t_mou_6,loc_ic_t2t_mou_7,loc_ic_t2t_mou_8,loc_ic_t2m_mou_6,loc_ic_t2m_mou_7,loc_ic_t2m_mou_8,loc_ic_t2f_mou_6,loc_ic_t2f_mou_7,loc_ic_t2f_mou_8,loc_ic_mou_6,loc_ic_mou_7,loc_ic_mou_8,std_ic_t2t_mou_6,std_ic_t2t_mou_7,std_ic_t2t_mou_8,std_ic_t2m_mou_6,std_ic_t2m_mou_7,std_ic_t2m_mou_8,std_ic_t2f_mou_6,std_ic_t2f_mou_7,std_ic_t2f_mou_8,std_ic_t2o_mou_6,std_ic_t2o_mou_7,std_ic_t2o_mou_8,std_ic_mou_6,std_ic_mou_7,std_ic_mou_8,total_ic_mou_6,total_ic_mou_7,total_ic_mou_8,spl_ic_mou_6,spl_ic_mou_7,spl_ic_mou_8,isd_ic_mou_6,isd_ic_mou_7,isd_ic_mou_8,ic_others_6,ic_others_7,ic_others_8,total_rech_num_6,total_rech_num_7,total_rech_num_8,total_rech_amt_6,total_rech_amt_7,total_rech_amt_8,max_rech_amt_6,max_rech_amt_7,max_rech_amt_8,date_of_last_rech_6,date_of_last_rech_7,date_of_last_rech_8,last_day_rch_amt_6,last_day_rch_amt_7,last_day_rch_amt_8,date_of_last_rech_data_6,date_of_last_rech_data_7,date_of_last_rech_data_8,total_rech_data_6,total_rech_data_7,total_rech_data_8,max_rech_data_6,max_rech_data_7,max_rech_data_8,count_rech_2g_6,count_rech_2g_7,count_rech_2g_8,count_rech_3g_6,count_rech_3g_7,count_rech_3g_8,av_rech_amt_data_6,av_rech_amt_data_7,av_rech_amt_data_8,vol_2g_mb_6,vol_2g_mb_7,vol_2g_mb_8,vol_3g_mb_6,vol_3g_mb_7,vol_3g_mb_8,arpu_3g_6,arpu_3g_7,arpu_3g_8,arpu_2g_6,arpu_2g_7,arpu_2g_8,night_pck_user_6,night_pck_user_7,night_pck_user_8,monthly_2g_6,monthly_2g_7,monthly_2g_8,sachet_2g_6,sachet_2g_7,sachet_2g_8,monthly_3g_6,monthly_3g_7,monthly_3g_8,sachet_3g_6,sachet_3g_7,sachet_3g_8,fb_user_6,fb_user_7,fb_user_8,aon,aug_vbc_3g,jul_vbc_3g,jun_vbc_3g,churn_probability
0,109,0.0,0.0,0.0,6/30/2014,7/31/2014,8/31/2014,31.277,87.009,7.527,48.58,124.38,1.29,32.24,96.68,2.33,0.0,0.0,0.0,0.0,0.0,0.0,2.23,0.0,0.28,5.29,16.04,2.33,0.0,0.0,0.0,0.0,0.0,0.0,7.53,16.04,2.61,46.34,124.38,1.01,18.75,80.61,0.0,0.0,0.0,0.0,0.0,0.0,0.0,65.09,204.99,1.01,0.0,0.0,0.0,8.2,0.63,0.0,0.38,0.0,0.0,81.21,221.68,3.63,2.43,3.68,7.79,0.83,21.08,16.91,0.0,0.0,0.0,3.26,24.76,24.71,0.0,7.61,0.21,7.46,19.96,14.96,0.0,0.0,0.0,0.0,0.0,0.0,7.46,27.58,15.18,11.84,53.04,40.56,0.0,0.0,0.66,0.0,0.0,0.0,1.11,0.69,0.0,3,2,2,77,65,10,65,65,10,6/22/2014,7/10/2014,8/24/2014,65,65,0,,,,,,,,,,,,,,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,,,,1958,0.0,0.0,0.0,0
1,109,0.0,0.0,0.0,6/30/2014,7/31/2014,8/31/2014,0.0,122.787,42.953,0.0,0.0,0.0,0.0,25.99,30.89,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,22.01,29.79,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,30.73,31.66,0.0,0.0,0.0,0.0,30.73,31.66,1.68,19.09,10.53,1.41,18.68,11.09,0.35,1.66,3.4,3.44,39.44,25.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,3.44,39.44,25.04,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,3,4,5,0,145,50,0,145,50,6/12/2014,7/10/2014,8/26/2014,0,0,0,,7/8/2014,,,1.0,,,145.0,,,0.0,,,1.0,,,145.0,,0.0,352.91,0.0,0.0,3.96,0.0,,122.07,,,122.08,,,0.0,,0,0,0,0,0,0,0,1,0,0,0,0,,1.0,,710,0.0,0.0,0.0,0
2,109,0.0,0.0,0.0,6/30/2014,7/31/2014,8/31/2014,60.806,103.176,0.0,0.53,15.93,0.0,53.99,82.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.53,12.98,0.0,24.11,0.0,0.0,0.0,0.0,0.0,2.14,0.0,0.0,24.64,12.98,0.0,0.0,2.94,0.0,28.94,82.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,28.94,84.99,0.0,0.0,0.0,0.0,2.89,1.38,0.0,0.0,0.0,0.0,56.49,99.36,0.0,4.51,6.16,6.49,89.86,25.18,23.51,0.0,0.0,0.0,94.38,31.34,30.01,11.69,0.0,0.0,18.21,2.48,6.38,0.0,0.0,0.0,0.0,0.0,0.0,29.91,2.48,6.38,124.29,33.83,36.64,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,2,4,2,70,120,0,70,70,0,6/11/2014,7/22/2014,8/24/2014,70,50,0,,,,,,,,,,,,,,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,,,,882,0.0,0.0,0.0,0
3,109,0.0,0.0,0.0,6/30/2014,7/31/2014,8/31/2014,156.362,205.26,111.095,7.26,16.01,0.0,68.76,78.48,50.23,0.0,0.0,0.0,0.0,0.0,1.63,6.99,3.94,0.0,37.91,44.89,23.63,0.0,0.0,0.0,0.0,0.0,8.03,44.91,48.84,23.63,0.26,12.06,0.0,15.33,25.93,4.6,0.56,0.0,0.0,0.0,0.0,0.0,16.16,37.99,4.6,0.0,0.0,0.0,14.95,9.13,25.61,0.0,0.0,0.0,76.03,95.98,53.84,24.98,4.84,23.88,53.99,44.23,57.14,7.23,0.81,0.0,86.21,49.89,81.03,0.0,0.0,0.0,8.89,0.28,2.81,0.0,0.0,0.0,0.0,0.0,0.0,8.89,0.28,2.81,95.11,50.18,83.84,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2,4,3,160,240,130,110,110,50,6/15/2014,7/21/2014,8/25/2014,110,110,50,,,,,,,,,,,,,,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,,,,982,0.0,0.0,0.0,0
4,109,0.0,0.0,0.0,6/30/2014,7/31/2014,8/31/2014,240.708,128.191,101.565,21.28,4.83,6.13,56.99,38.11,9.63,53.64,0.0,0.0,15.73,0.0,0.0,10.16,4.83,6.13,36.74,19.88,4.61,11.99,1.23,5.01,0.0,9.85,0.0,58.91,25.94,15.76,0.0,0.0,0.0,4.35,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,4.35,0.0,0.0,0.0,0.0,0.0,0.0,17.0,0.0,0.0,0.0,0.0,63.26,42.94,15.76,5.44,1.39,2.66,10.58,4.33,19.49,5.51,3.63,6.14,21.54,9.36,28.31,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,21.54,9.36,28.31,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,13,10,8,290,136,122,50,41,30,6/25/2014,7/26/2014,8/30/2014,25,10,30,6/25/2014,7/23/2014,8/20/2014,7.0,7.0,6.0,25.0,41.0,25.0,7.0,6.0,6.0,0.0,1.0,0.0,175.0,191.0,142.0,390.8,308.89,213.47,0.0,0.0,0.0,0.0,35.0,0.0,0.0,35.12,0.0,0.0,0.0,0.0,0,0,0,7,6,6,0,0,0,0,1,0,1.0,1.0,1.0,647,0.0,0.0,0.0,0


#### Data Understanding

<table align="left">
<tr><th>Acronyms</th><th>Description</th></tr>    
<tr><td>CIRCLE_ID</td><td>Telecom circle area to which the customer belongs</td></tr>
<tr><td>LOC</td><td>Local calls within the same telecom circle</td></tr>
<tr><td>STD</td><td>STD calls outside the calling circle</td></tr>
<tr><td>IC</td><td>Incoming calls</td></tr>
<tr><td>OG</td><td>Outgoing calls</td></tr>
<tr><td>T2T</td><td>Operator T to T, i.e. within the same operator (mobile to mobile)</td></tr>
<tr><td>T2M</td><td>Operator T to other operator mobile</td></tr>
<tr><td>T2O</td><td>Operator T to other operator fixed line</td></tr>
<tr><td>T2F</td><td>Operator T to fixed lines of T</td></tr>
<tr><td>T2C</td><td>Operator T to its own call center</td></tr>
<tr><td>ARPU</td><td>Average revenue per user</td></tr>
<tr><td>MOU</td><td>Minutes of usage (voice calls)</td></tr>
<tr><td>AON</td><td>Age on network: number of days the customer has been using the operator T network</td></tr>
<tr><td>ONNET</td><td>All kinds of calls within the same operator network</td></tr>
<tr><td>OFFNET</td><td>All kinds of calls outside the operator T network</td></tr>
<tr><td>ROAM</td><td>Indicates that the customer is in a roaming zone during the call</td></tr>
<tr><td>SPL</td><td>Special calls</td></tr>
<tr><td>ISD</td><td>ISD calls</td></tr>
<tr><td>RECH</td><td>Recharge</td></tr>
<tr><td>NUM</td><td>Number</td></tr>
<tr><td>AMT</td><td>Amount in local currency</td></tr>
<tr><td>MAX</td><td>Maximum</td></tr>
<tr><td>DATA</td><td>Mobile internet</td></tr>
<tr><td>3G</td><td>3G network</td></tr>
<tr><td>AV</td><td>Average</td></tr>
<tr><td>VOL</td><td>Mobile internet usage volume in MB</td></tr>
<tr><td>2G</td><td>2G network</td></tr>
<tr><td>PCK</td><td>Prepaid service schemes called PACKS</td></tr>
<tr><td>NIGHT</td><td>Scheme to use during specific night hours only</td></tr>
<tr><td>MONTHLY</td><td>Service schemes with validity equivalent to a month</td></tr>
<tr><td>SACHET</td><td>Service schemes with validity smaller than a month</td></tr>
<tr><td>*_6</td><td>KPI for the month of June</td></tr>
<tr><td>*_7</td><td>KPI for the month of July</td></tr>
<tr><td>*_8</td><td>KPI for the month of August</td></tr>
<tr><td>FB_USER</td><td>Service scheme to avail services of Facebook and similar social networking sites</td></tr>
<tr><td>VBC</td><td>Volume-based cost, when no specific scheme is purchased and the customer pays as per usage</td></tr>
</table>
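
The acronyms in the table combine into column names such as `loc_og_t2m_mou_8` (local, outgoing, operator T to other operator mobile, minutes of usage, August). As a reading aid, a column name can be split into its acronym tokens and its month suffix. The helper `split_column` below is hypothetical and not part of the notebook; note that the VBC columns carry the month as a name prefix (`aug_`, `jul_`, `jun_`) rather than a numeric suffix, so they come back with no month.

```python
import re

def split_column(name):
    """Split a name like 'loc_og_t2m_mou_8' into (acronym tokens, month number or None)."""
    match = re.fullmatch(r"(.+)_([678])", name)
    if match is None:
        # No numeric month suffix (e.g. 'aon', 'aug_vbc_3g')
        return name.split("_"), None
    return match.group(1).split("_"), int(match.group(2))

for col in ["loc_og_t2m_mou_8", "count_rech_2g_6", "aug_vbc_3g", "aon"]:
    print(col, "->", split_column(col))
```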

In [125]:
# Count the rows in which every value is null.
df.isnull().all(axis=1).sum()

0

In [126]:
# Count the columns in which every value is null.
df.isnull().all(axis=0).sum()

0
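
The two checks above differ only in the `axis` argument, which is easy to mix up. The sketch below, on a made-up 3x3 frame, shows what each one counts: `all(axis=1)` collapses across columns and flags fully empty rows, while `all(axis=0)` collapses across rows and flags fully empty columns.

```python
import numpy as np
import pandas as pd

# Made-up frame: row 1 is entirely null, column "b" is entirely null
demo = pd.DataFrame({
    "a": [1.0, np.nan, np.nan],
    "b": [np.nan, np.nan, np.nan],
    "c": [2.0, np.nan, 3.0],
})

empty_rows = demo.isnull().all(axis=1).sum()  # rows where every column is null
empty_cols = demo.isnull().all(axis=0).sum()  # columns where every row is null

print(empty_rows, empty_cols)
```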

In [127]:
df.tail()

Unnamed: 0,circle_id,loc_og_t2o_mou,std_og_t2o_mou,loc_ic_t2o_mou,last_date_of_month_6,last_date_of_month_7,last_date_of_month_8,arpu_6,arpu_7,arpu_8,onnet_mou_6,onnet_mou_7,onnet_mou_8,offnet_mou_6,offnet_mou_7,offnet_mou_8,roam_ic_mou_6,roam_ic_mou_7,roam_ic_mou_8,roam_og_mou_6,roam_og_mou_7,roam_og_mou_8,loc_og_t2t_mou_6,loc_og_t2t_mou_7,loc_og_t2t_mou_8,loc_og_t2m_mou_6,loc_og_t2m_mou_7,loc_og_t2m_mou_8,loc_og_t2f_mou_6,loc_og_t2f_mou_7,loc_og_t2f_mou_8,loc_og_t2c_mou_6,loc_og_t2c_mou_7,loc_og_t2c_mou_8,loc_og_mou_6,loc_og_mou_7,loc_og_mou_8,std_og_t2t_mou_6,std_og_t2t_mou_7,std_og_t2t_mou_8,std_og_t2m_mou_6,std_og_t2m_mou_7,std_og_t2m_mou_8,std_og_t2f_mou_6,std_og_t2f_mou_7,std_og_t2f_mou_8,std_og_t2c_mou_6,std_og_t2c_mou_7,std_og_t2c_mou_8,std_og_mou_6,std_og_mou_7,std_og_mou_8,isd_og_mou_6,isd_og_mou_7,isd_og_mou_8,spl_og_mou_6,spl_og_mou_7,spl_og_mou_8,og_others_6,og_others_7,og_others_8,total_og_mou_6,total_og_mou_7,total_og_mou_8,loc_ic_t2t_mou_6,loc_ic_t2t_mou_7,loc_ic_t2t_mou_8,loc_ic_t2m_mou_6,loc_ic_t2m_mou_7,loc_ic_t2m_mou_8,loc_ic_t2f_mou_6,loc_ic_t2f_mou_7,loc_ic_t2f_mou_8,loc_ic_mou_6,loc_ic_mou_7,loc_ic_mou_8,std_ic_t2t_mou_6,std_ic_t2t_mou_7,std_ic_t2t_mou_8,std_ic_t2m_mou_6,std_ic_t2m_mou_7,std_ic_t2m_mou_8,std_ic_t2f_mou_6,std_ic_t2f_mou_7,std_ic_t2f_mou_8,std_ic_t2o_mou_6,std_ic_t2o_mou_7,std_ic_t2o_mou_8,std_ic_mou_6,std_ic_mou_7,std_ic_mou_8,total_ic_mou_6,total_ic_mou_7,total_ic_mou_8,spl_ic_mou_6,spl_ic_mou_7,spl_ic_mou_8,isd_ic_mou_6,isd_ic_mou_7,isd_ic_mou_8,ic_others_6,ic_others_7,ic_others_8,total_rech_num_6,total_rech_num_7,total_rech_num_8,total_rech_amt_6,total_rech_amt_7,total_rech_amt_8,max_rech_amt_6,max_rech_amt_7,max_rech_amt_8,date_of_last_rech_6,date_of_last_rech_7,date_of_last_rech_8,last_day_rch_amt_6,last_day_rch_amt_7,last_day_rch_amt_8,date_of_last_rech_data_6,date_of_last_rech_data_7,date_of_last_rech_data_8,total_rech_data_6,total_rech_data_7,total_rech_data_8,max_rech_data_6,max_rech_data_7,max_rech_data_8,count_rech_2g_6,count_rech_2g_7,count_rech_2g_8,count_rech_3g_6,count_rech_3g_7,count_rech_3g_8,av_rech_amt_data_6,av_rech_amt_data_7,av_rech_amt_data_8,vol_2g_mb_6,vol_2g_mb_7,vol_2g_mb_8,vol_3g_mb_6,vol_3g_mb_7,vol_3g_mb_8,arpu_3g_6,arpu_3g_7,arpu_3g_8,arpu_2g_6,arpu_2g_7,arpu_2g_8,night_pck_user_6,night_pck_user_7,night_pck_user_8,monthly_2g_6,monthly_2g_7,monthly_2g_8,sachet_2g_6,sachet_2g_7,sachet_2g_8,monthly_3g_6,monthly_3g_7,monthly_3g_8,sachet_3g_6,sachet_3g_7,sachet_3g_8,fb_user_6,fb_user_7,fb_user_8,aon,aug_vbc_3g,jul_vbc_3g,jun_vbc_3g,churn_probability
69994,109,0.0,0.0,0.0,6/30/2014,7/31/2014,8/31/2014,15.76,410.924,329.136,0.0,7.36,10.93,0.0,488.46,381.64,14.96,0.0,0.0,0.0,0.0,0.0,0.0,2.44,7.19,0.0,60.64,89.66,0.0,0.0,0.0,0.0,2.43,0.86,0.0,63.09,96.86,0.0,4.91,3.73,0.0,414.61,290.14,0.0,0.0,0.0,0.0,0.0,0.0,0.0,419.53,293.88,0.0,0.0,0.0,0.0,14.05,1.83,0.0,0.0,0.0,0.0,496.68,392.58,0.0,26.59,33.84,0.0,172.33,223.91,0.0,1.06,0.0,0.0,199.99,257.76,0.0,0.0,0.0,0.0,21.99,11.79,0.0,0.0,0.0,0.0,0.0,0.0,0.0,21.99,11.79,0.0,221.99,269.56,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1,17,13,50,397,512,50,110,130,6/18/2014,7/31/2014,8/31/2014,50,20,130,,7/31/2014,8/21/2014,,7.0,1.0,,25.0,17.0,,6.0,1.0,,1.0,0.0,,135.0,17.0,0.0,244.59,144.31,0.0,0.0,0.0,,21.91,0.0,,60.61,48.0,,0.0,0.0,0,0,0,0,6,1,0,0,0,0,1,0,,1.0,1.0,221,0.0,0.0,0.0,0
69995,109,0.0,0.0,0.0,6/30/2014,7/31/2014,8/31/2014,160.083,289.129,265.772,116.54,196.46,232.63,49.53,96.28,48.06,0.0,0.0,0.0,0.0,0.0,0.0,7.18,30.11,9.06,37.53,73.84,47.34,2.01,0.0,0.0,0.0,4.01,0.0,46.73,103.96,56.41,109.36,166.34,223.56,9.98,18.41,0.53,0.0,0.0,0.0,0.0,0.0,0.0,119.34,184.76,224.09,0.0,0.0,0.0,0.13,4.01,0.18,0.0,0.0,0.0,166.21,292.74,280.69,30.48,28.48,23.09,21.78,35.18,28.79,2.38,0.21,0.0,54.64,63.88,51.89,16.63,39.23,66.28,8.96,9.31,17.24,0.0,0.0,0.0,0.0,0.0,0.0,25.59,48.54,83.53,80.24,112.43,136.01,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.08,5,11,9,200,313,308,90,44,44,6/28/2014,7/31/2014,8/27/2014,50,30,42,,,,,,,,,,,,,,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,,,,712,0.0,0.0,0.0,0
69996,109,0.0,0.0,0.0,6/30/2014,7/31/2014,8/31/2014,372.088,258.374,279.782,77.13,68.44,78.44,335.54,227.94,263.84,0.0,0.0,0.0,0.0,0.0,0.0,77.13,44.28,78.44,143.19,82.58,138.26,142.58,141.26,125.58,0.0,4.1,0.0,362.91,268.13,342.29,0.0,24.16,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,24.16,0.0,0.21,0.0,0.0,49.54,4.1,0.0,0.0,0.0,0.0,412.68,296.39,342.29,46.41,30.29,86.53,143.94,147.01,177.73,339.11,236.16,147.74,529.48,413.48,412.01,0.0,0.0,0.0,0.0,0.0,0.0,2.5,0.0,2.48,0.0,0.0,0.0,2.5,0.0,2.48,542.18,416.58,414.54,0.0,0.0,0.0,5.05,0.0,0.05,5.14,3.09,0.0,3,1,4,626,250,397,279,250,349,6/25/2014,7/30/2014,8/29/2014,279,250,48,,,,,,,,,,,,,,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,,,,879,0.0,0.0,0.0,0
69997,109,0.0,0.0,0.0,6/30/2014,7/31/2014,8/31/2014,238.575,245.414,145.062,14.01,7.64,6.71,30.34,16.68,12.56,25.06,0.0,0.0,4.58,0.0,0.0,10.88,7.64,6.71,4.44,6.66,8.84,7.99,1.45,2.86,0.0,0.0,0.0,23.33,15.76,18.43,2.15,0.0,0.0,14.3,8.56,0.85,0.0,0.0,0.0,0.0,0.0,0.0,16.45,8.56,0.85,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,39.78,24.33,19.28,11.36,3.64,1.04,0.66,1.68,3.94,0.34,4.28,2.81,12.38,9.61,7.81,3.7,4.61,1.3,2.74,2.01,7.36,0.0,0.0,1.28,0.0,0.0,0.0,6.44,6.63,9.94,18.83,16.24,17.76,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,5,3,2,379,252,145,200,252,145,6/29/2014,7/19/2014,8/26/2014,0,0,0,6/17/2014,7/13/2014,8/14/2014,1.0,1.0,1.0,179.0,252.0,145.0,0.0,0.0,0.0,1.0,1.0,1.0,179.0,252.0,145.0,46.25,57.61,44.64,1253.47,1774.18,658.19,150.67,212.18,122.08,150.67,212.17,122.07,0.0,0.0,0.0,0,0,0,0,0,0,1,1,1,0,0,0,1.0,1.0,1.0,277,664.25,1402.96,990.97,0
69998,109,0.0,0.0,0.0,6/30/2014,7/31/2014,8/31/2014,168.269,42.815,167.961,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2.21,4.31,0.96,2.68,38.71,31.69,0.43,5.78,0.0,5.33,48.81,32.66,0.0,0.0,0.0,0.0,16.28,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,16.28,0.0,8.13,65.09,33.58,0.0,0.0,0.0,0.0,0.0,0.55,2.8,0.0,0.36,2,2,2,198,50,198,198,50,198,6/19/2014,7/27/2014,8/25/2014,198,0,0,6/19/2014,,8/8/2014,1.0,,1.0,198.0,,198.0,1.0,,1.0,0.0,,0.0,198.0,,198.0,280.7,0.0,982.54,0.0,0.0,0.0,0.0,,0.0,0.0,,0.02,0.0,,0.0,1,0,1,0,0,0,0,0,0,0,0,0,1.0,,1.0,1876,0.0,0.0,0.0,0


In [128]:
# Percentage of null values per column, highest first
(df.isnull().sum() * 100 / len(df)).sort_values(ascending = False)

count_rech_2g_6             74.902499
arpu_2g_6                   74.902499
night_pck_user_6            74.902499
date_of_last_rech_data_6    74.902499
total_rech_data_6           74.902499
av_rech_amt_data_6          74.902499
max_rech_data_6             74.902499
count_rech_3g_6             74.902499
arpu_3g_6                   74.902499
fb_user_6                   74.902499
arpu_3g_7                   74.478207
night_pck_user_7            74.478207
date_of_last_rech_data_7    74.478207
total_rech_data_7           74.478207
max_rech_data_7             74.478207
fb_user_7                   74.478207
av_rech_amt_data_7          74.478207
count_rech_2g_7             74.478207
count_rech_3g_7             74.478207
arpu_2g_7                   74.478207
arpu_2g_8                   73.689624
night_pck_user_8            73.689624
arpu_3g_8                   73.689624
max_rech_data_8             73.689624
av_rech_amt_data_8          73.689624
date_of_last_rech_data_8    73.689624
fb_user_8   
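
With this many columns, it can help to pull out only those whose null share exceeds some cut-off instead of scanning the full listing. The sketch below uses a made-up four-row frame and an illustrative 70% threshold (my assumption, not a value from the notebook). `isnull().mean() * 100` gives the same percentages as `isnull().sum() * 100 / len(df)`.

```python
import numpy as np
import pandas as pd

# Made-up frame: fb_user_6 is 75% null, onnet_mou_6 is 25% null, arpu_6 has no nulls
demo = pd.DataFrame({
    "fb_user_6": [np.nan, np.nan, np.nan, 1.0],
    "arpu_6": [31.3, 0.0, 60.8, 156.4],
    "onnet_mou_6": [48.6, np.nan, 0.5, 7.3],
})

null_pct = (demo.isnull().mean() * 100).sort_values(ascending=False)

threshold = 70  # illustrative cut-off, in percent
high_null_cols = null_pct[null_pct > threshold].index.tolist()

print(null_pct)
print(high_null_cols)
```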

In [129]:
df.loc_ic_t2t_mou_8                          

0         7.79
1        10.53
2         6.49
3        23.88
4         2.66
         ...  
69994    33.84
69995    23.09
69996    86.53
69997     1.04
69998     0.96
Name: loc_ic_t2t_mou_8, Length: 69999, dtype: float64

In [130]:
# Fixing null values
# recharge count columns -- we can assume 0 recharge in place of NA
df.count_rech_2g_6 = df.count_rech_2g_6.fillna(0)
df.count_rech_3g_6 = df.count_rech_3g_6.fillna(0)
df.count_rech_2g_7 = df.count_rech_2g_7.fillna(0)
df.count_rech_3g_7 = df.count_rech_3g_7.fillna(0)
df.count_rech_3g_8 = df.count_rech_3g_8.fillna(0)
df.count_rech_2g_8 = df.count_rech_2g_8.fillna(0)

df.max_rech_data_6 = df.max_rech_data_6.fillna(0)
df.max_rech_data_7 = df.max_rech_data_7.fillna(0)
df.max_rech_data_8 = df.max_rech_data_8.fillna(0)

df.total_rech_data_6 = df.total_rech_data_6.fillna(0)
df.total_rech_data_7 = df.total_rech_data_7.fillna(0)
df.total_rech_data_8 = df.total_rech_data_8.fillna(0)

df.av_rech_amt_data_6 = df.av_rech_amt_data_6.fillna(0)
df.av_rech_amt_data_7 = df.av_rech_amt_data_7.fillna(0)
df.av_rech_amt_data_8 = df.av_rech_amt_data_8.fillna(0)

# Average revenue per user -- we can assume 0 in place of NA
df.arpu_3g_6 = df.arpu_3g_6.fillna(0)
df.arpu_2g_6 = df.arpu_2g_6.fillna(0)
df.arpu_2g_7 = df.arpu_2g_7.fillna(0)
df.arpu_3g_7 = df.arpu_3g_7.fillna(0)
df.arpu_3g_8 = df.arpu_3g_8.fillna(0)
df.arpu_2g_8 = df.arpu_2g_8.fillna(0)

# night pack columns -- we can assume 0 pack in place of NA values
df.night_pck_user_6 = df.night_pck_user_6.fillna(0)
df.night_pck_user_7 = df.night_pck_user_7.fillna(0)
df.night_pck_user_8 = df.night_pck_user_8.fillna(0)

# FB user columns -- we can assume 0 (not a user) in place of NA
df.fb_user_6 = df.fb_user_6.fillna(0)
df.fb_user_7 = df.fb_user_7.fillna(0)
df.fb_user_8 = df.fb_user_8.fillna(0)
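
The column-by-column assignments above all follow the same pattern: a column stem plus a month suffix (`_6`, `_7`, `_8`), filled with 0. A minimal sketch of the same idea as a loop, assuming that naming scheme holds. The toy frame `demo` and the names `zero_fill_stems` and `months` are illustrative and not part of the notebook.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the real data (hypothetical values)
demo = pd.DataFrame({
    "count_rech_2g_6": [1.0, np.nan],
    "count_rech_2g_7": [np.nan, 2.0],
    "count_rech_2g_8": [np.nan, np.nan],
    "fb_user_6": [np.nan, 1.0],
    "fb_user_7": [1.0, np.nan],
    "fb_user_8": [0.0, np.nan],
})

# Column stems where a missing value is assumed to mean "no recharge / not a user"
zero_fill_stems = ["count_rech_2g", "fb_user"]
months = ["6", "7", "8"]

# Build the full column names and keep only those present in the frame
cols = [f"{stem}_{m}" for stem in zero_fill_stems for m in months]
cols = [c for c in cols if c in demo.columns]

# Fill all selected columns in one step
demo[cols] = demo[cols].fillna(0)
```

For the real data, the same loop would run on `df` with the full list of stems used in the cell above (`count_rech_3g`, `max_rech_data`, `total_rech_data`, `av_rech_amt_data`, `arpu_2g`, `arpu_3g`, `night_pck_user`).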

In [136]:
# Local incoming/outgoing minutes-of-usage (MOU) columns -- we can assume 0 minutes in place of NA
df.loc_ic_t2t_mou_6 = df.loc_ic_t2t_mou_6.fillna(0)
df.loc_ic_t2t_mou_7 = df.loc_ic_t2t_mou_7.fillna(0)
df.loc_ic_t2t_mou_8 = df.loc_ic_t2t_mou_8.fillna(0)


df.loc_og_t2c_mou_6 = df.loc_og_t2c_mou_6.fillna(0)
df.loc_og_t2c_mou_7 = df.loc_og_t2c_mou_7.fillna(0)
df.loc_og_t2c_mou_8 = df.loc_og_t2c_mou_8.fillna(0)

df.loc_og_mou_6 = df.loc_og_mou_6.fillna(0)
df.loc_og_mou_7 = df.loc_og_mou_7.fillna(0)
df.loc_og_mou_8 = df.loc_og_mou_8.fillna(0)

df.loc_og_t2t_mou_6 = df.loc_og_t2t_mou_6.fillna(0)
df.loc_og_t2t_mou_7 = df.loc_og_t2t_mou_7.fillna(0)
df.loc_og_t2t_mou_8 = df.loc_og_t2t_mou_8.fillna(0)

df.loc_ic_t2m_mou_6 = df.loc_ic_t2m_mou_6.fillna(0)
df.loc_ic_t2m_mou_7 = df.loc_ic_t2m_mou_7.fillna(0)
df.loc_ic_t2m_mou_8 = df.loc_ic_t2m_mou_8.fillna(0)


df.loc_ic_t2f_mou_6 = df.loc_ic_t2f_mou_6.fillna(0)
df.loc_ic_t2f_mou_7 = df.loc_ic_t2f_mou_7.fillna(0)
df.loc_ic_t2f_mou_8 = df.loc_ic_t2f_mou_8.fillna(0)

df.loc_ic_t2o_mou = df.loc_ic_t2o_mou.fillna(0)
df.loc_og_t2o_mou = df.loc_og_t2o_mou.fillna(0)

df.loc_ic_mou_6 = df.loc_ic_mou_6.fillna(0)
df.loc_ic_mou_7 = df.loc_ic_mou_7.fillna(0)
df.loc_ic_mou_8 = df.loc_ic_mou_8.fillna(0)

df.loc_og_t2m_mou_6 = df.loc_og_t2m_mou_6.fillna(0)
df.loc_og_t2m_mou_7 = df.loc_og_t2m_mou_7.fillna(0)
df.loc_og_t2m_mou_8 = df.loc_og_t2m_mou_8.fillna(0)

df.loc_og_t2f_mou_6 = df.loc_og_t2f_mou_6.fillna(0)
df.loc_og_t2f_mou_7 = df.loc_og_t2f_mou_7.fillna(0)
df.loc_og_t2f_mou_8 = df.loc_og_t2f_mou_8.fillna(0)

# og_others, special, ISD and STD columns -- we can assume 0 minutes in place of NA
df.og_others_8 = df.og_others_8.fillna(0)

df.spl_og_mou_6 = df.spl_og_mou_6.fillna(0)
df.spl_og_mou_7 = df.spl_og_mou_7.fillna(0)
df.spl_og_mou_8 = df.spl_og_mou_8.fillna(0)

df.isd_og_mou_8 = df.isd_og_mou_8.fillna(0)
df.isd_og_mou_7 = df.isd_og_mou_7.fillna(0)
df.isd_og_mou_6 = df.isd_og_mou_6.fillna(0)

df.isd_ic_mou_6 = df.isd_ic_mou_6.fillna(0)
df.isd_ic_mou_7 = df.isd_ic_mou_7.fillna(0)
df.isd_ic_mou_8 = df.isd_ic_mou_8.fillna(0)

df.std_og_mou_6 = df.std_og_mou_6.fillna(0)
df.std_og_mou_7 = df.std_og_mou_7.fillna(0)
df.std_og_mou_8 = df.std_og_mou_8.fillna(0)

df.std_ic_mou_6 = df.std_ic_mou_6.fillna(0)
df.std_ic_mou_7 = df.std_ic_mou_7.fillna(0)
df.std_ic_mou_8 = df.std_ic_mou_8.fillna(0)

df.std_ic_t2f_mou_6 = df.std_ic_t2f_mou_6.fillna(0)
df.std_ic_t2f_mou_7 = df.std_ic_t2f_mou_7.fillna(0)

df.std_og_t2f_mou_6 = df.std_og_t2f_mou_6.fillna(0)
df.std_og_t2f_mou_7 = df.std_og_t2f_mou_7.fillna(0)
df.std_og_t2f_mou_8 = df.std_og_t2f_mou_8.fillna(0)

df.std_og_t2t_mou_6 = df.std_og_t2t_mou_6.fillna(0)
df.std_og_t2t_mou_7 = df.std_og_t2t_mou_7.fillna(0)
df.std_og_t2t_mou_8 = df.std_og_t2t_mou_8.fillna(0)


df.std_og_t2m_mou_6 = df.std_og_t2m_mou_6.fillna(0)
df.std_og_t2m_mou_7 = df.std_og_t2m_mou_7.fillna(0)
df.std_og_t2m_mou_8 = df.std_og_t2m_mou_8.fillna(0)



df.std_og_t2c_mou_6 = df.std_og_t2c_mou_6.fillna(0)
df.std_og_t2c_mou_7 = df.std_og_t2c_mou_7.fillna(0)
df.std_og_t2c_mou_8 = df.std_og_t2c_mou_8.fillna(0)

In [132]:
df.std_ic_t2t_mou_6 = df.std_ic_t2t_mou_6.fillna(0)
df.std_ic_t2t_mou_7 = df.std_ic_t2t_mou_7.fillna(0)
df.std_ic_t2t_mou_8 = df.std_ic_t2t_mou_8.fillna(0)

df.std_ic_t2o_mou_6 = df.std_ic_t2o_mou_6.fillna(0)
df.std_ic_t2o_mou_7 = df.std_ic_t2o_mou_7.fillna(0)
df.std_ic_t2o_mou_8 = df.std_ic_t2o_mou_8.fillna(0)


df.std_ic_t2m_mou_6 = df.std_ic_t2m_mou_6.fillna(0)
df.std_ic_t2m_mou_7 = df.std_ic_t2m_mou_7.fillna(0)
df.std_ic_t2m_mou_8 = df.std_ic_t2m_mou_8.fillna(0)

df.std_ic_t2f_mou_6 = df.std_ic_t2f_mou_6.fillna(0)
df.std_ic_t2f_mou_7 = df.std_ic_t2f_mou_7.fillna(0)
df.std_ic_t2f_mou_8 = df.std_ic_t2f_mou_8.fillna(0)

df.std_ic_mou_8 = df.std_ic_mou_8.fillna(0)


df.roam_og_mou_6 = df.roam_og_mou_6.fillna(0)
df.roam_og_mou_7 = df.roam_og_mou_7.fillna(0)
df.roam_og_mou_8 = df.roam_og_mou_8.fillna(0)


df.roam_ic_mou_6 = df.roam_ic_mou_6.fillna(0)
df.roam_ic_mou_7 = df.roam_ic_mou_7.fillna(0)
df.roam_ic_mou_8 = df.roam_ic_mou_8.fillna(0)


df.offnet_mou_6 = df.offnet_mou_6.fillna(0)
df.offnet_mou_7 = df.offnet_mou_7.fillna(0)
df.offnet_mou_8 = df.offnet_mou_8.fillna(0)

df.std_og_t2o_mou = df.std_og_t2o_mou.fillna(0)

df.ic_others_6 = df.ic_others_6.fillna(0)
df.ic_others_7 = df.ic_others_7.fillna(0)
df.ic_others_8 = df.ic_others_8.fillna(0)

df.og_others_6 = df.og_others_6.fillna(0)
df.og_others_7 = df.og_others_7.fillna(0)
df.og_others_8 = df.og_others_8.fillna(0)

df.spl_ic_mou_6 = df.spl_ic_mou_6.fillna(0)
df.spl_ic_mou_7 = df.spl_ic_mou_7.fillna(0)
df.spl_ic_mou_8 = df.spl_ic_mou_8.fillna(0)

df.onnet_mou_6 = df.onnet_mou_6.fillna(0)
df.onnet_mou_7 = df.onnet_mou_7.fillna(0)
df.onnet_mou_8 = df.onnet_mou_8.fillna(0)
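
The usage columns filled above share a suffix: `_mou` or `_others`, optionally followed by a month number. A sketch of selecting them by pattern instead of listing each one, using `DataFrame.filter(regex=...)`. The regular expression and the toy frame `demo` are assumptions made for illustration. On the real data the pattern would also match columns the notebook does not fill explicitly (for example `total_ic_mou_6`), which is harmless only if those columns have no missing values or if 0 is an acceptable fill for them too.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the real data (hypothetical values)
demo = pd.DataFrame({
    "onnet_mou_6": [12.5, np.nan],
    "loc_ic_t2o_mou": [np.nan, 0.0],
    "og_others_8": [np.nan, 1.2],
    "arpu_6": [np.nan, 210.0],  # not a usage column, so it should stay untouched
})

# Columns ending in _mou or _others, with an optional _6/_7/_8 month suffix
usage_cols = demo.filter(regex=r"(_mou|_others)(_[678])?$").columns

# Assume a missing usage value means zero minutes
demo[usage_cols] = demo[usage_cols].fillna(0)
```

Printing `usage_cols` before filling is a cheap way to confirm the pattern picks up exactly the intended columns.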

In [133]:
# Fixing null values for datetime columns: first convert the date columns to datetime
df.date_of_last_rech_data_6 = pd.to_datetime(df.date_of_last_rech_data_6)
df.date_of_last_rech_data_7 = pd.to_datetime(df.date_of_last_rech_data_7)
df.date_of_last_rech_data_8 = pd.to_datetime(df.date_of_last_rech_data_8)
df.date_of_last_rech_6 = pd.to_datetime(df.date_of_last_rech_6)
df.date_of_last_rech_7 = pd.to_datetime(df.date_of_last_rech_7)
df.date_of_last_rech_8 = pd.to_datetime(df.date_of_last_rech_8)
df.last_date_of_month_6 = pd.to_datetime(df.last_date_of_month_6)
df.last_date_of_month_7 = pd.to_datetime(df.last_date_of_month_7)
df.last_date_of_month_8 = pd.to_datetime(df.last_date_of_month_8)


# Assign the last date of the month in place of NA values
df.date_of_last_rech_data_6 = df.date_of_last_rech_data_6.fillna('2014-06-30')
df.date_of_last_rech_data_7 = df.date_of_last_rech_data_7.fillna('2014-07-31')
df.date_of_last_rech_data_8 = df.date_of_last_rech_data_8.fillna('2014-08-31')


df.date_of_last_rech_6 = df.date_of_last_rech_6.fillna('2014-06-30')
df.date_of_last_rech_7 = df.date_of_last_rech_7.fillna('2014-07-31')
df.date_of_last_rech_8 = df.date_of_last_rech_8.fillna('2014-08-31')

df.last_date_of_month_7 = df.last_date_of_month_7.fillna('2014-07-31')
df.last_date_of_month_8 = df.last_date_of_month_8.fillna('2014-08-31')
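
The conversion and fill steps above can be combined per month. A minimal sketch, assuming the same month-end dates as the cell above and using explicit `pd.Timestamp` values rather than strings. The toy frame `demo`, its date strings, and the names `month_end` and `date_stems` are illustrative and not part of the notebook.

```python
import pandas as pd

# Toy frame standing in for the real data (hypothetical values and date format)
demo = pd.DataFrame({
    "date_of_last_rech_6": ["6/22/2014", None],
    "date_of_last_rech_data_7": [None, "7/15/2014"],
})

# Last calendar day of each month, used as the fill value for missing dates
month_end = {
    "6": pd.Timestamp("2014-06-30"),
    "7": pd.Timestamp("2014-07-31"),
    "8": pd.Timestamp("2014-08-31"),
}
date_stems = ["date_of_last_rech", "date_of_last_rech_data"]

for month, last_day in month_end.items():
    for stem in date_stems:
        col = f"{stem}_{month}"
        if col in demo.columns:
            # Parse to datetime first, then replace NaT with the month-end date
            demo[col] = pd.to_datetime(demo[col]).fillna(last_day)
```

One consequence of this choice is that a filled value cannot be told apart from a genuine recharge on the last day of the month. If that distinction matters later, a separate indicator column recording which rows were missing would preserve it.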



In [137]:
# Percentage of null values per column, rechecked after imputation
(df.isnull().sum() * 100 / len(df)).sort_values(ascending = False)

circle_id                   0.0
last_day_rch_amt_8          0.0
max_rech_amt_6              0.0
max_rech_amt_7              0.0
max_rech_amt_8              0.0
date_of_last_rech_6         0.0
date_of_last_rech_7         0.0
date_of_last_rech_8         0.0
last_day_rch_amt_6          0.0
last_day_rch_amt_7          0.0
date_of_last_rech_data_6    0.0
count_rech_2g_7             0.0
date_of_last_rech_data_7    0.0
date_of_last_rech_data_8    0.0
total_rech_data_6           0.0
total_rech_data_7           0.0
total_rech_data_8           0.0
max_rech_data_6             0.0
max_rech_data_7             0.0
max_rech_data_8             0.0
total_rech_amt_8            0.0
total_rech_amt_7            0.0
total_rech_amt_6            0.0
total_rech_num_8            0.0
std_ic_mou_6                0.0
std_ic_mou_7                0.0
std_ic_mou_8                0.0
total_ic_mou_6              0.0
total_ic_mou_7              0.0
total_ic_mou_8              0.0
spl_ic_mou_6                0.0
spl_ic_m