# Min, Max and Range of Data

In [1]:
# import libraries
import pandas as pd
import numpy as np

This is the dataset for the **Customer Churn Problem**.

In [2]:
# importing dataset
data = pd.read_csv('churn_prediction.csv')

## Identification of Datatypes

In [3]:
data.dtypes

customer_id                         int64
vintage                             int64
age                                 int64
gender                             object
dependents                        float64
occupation                         object
city                              float64
customer_nw_category                int64
branch_code                         int64
current_balance                   float64
previous_month_end_balance        float64
average_monthly_balance_prevQ     float64
average_monthly_balance_prevQ2    float64
current_month_credit              float64
previous_month_credit             float64
current_month_debit               float64
previous_month_debit              float64
current_month_balance             float64
previous_month_balance            float64
churn                               int64
last_transaction                   object
dtype: object
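
Note that `last_transaction` is read in as `object` (plain strings), so its min and max would compare text rather than dates. A minimal sketch of parsing such a column, assuming the strings follow the `YYYY-MM-DD` pattern; the `toy` frame and its values are illustrative stand-ins, not rows from the real file:

```python
import pandas as pd

# Toy stand-in for the real data; values are illustrative only.
toy = pd.DataFrame({"last_transaction": ["2019-12-26", "2019-05-21", None]})

# errors="coerce" turns missing or unparseable entries into NaT instead of raising.
toy["last_transaction"] = pd.to_datetime(toy["last_transaction"], errors="coerce")

print(toy.dtypes)
print(toy["last_transaction"].min(), toy["last_transaction"].max())
```

Once the column is a datetime, `min()` and `max()` return the earliest and latest transaction dates and skip the missing entry.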

## Isolating numerical columns

The names of all **integer and float** columns are stored in `numerical_cols`, because min, max and range are only meaningful for **numerical variables**.

In [5]:
# Storing the names of all numerical columns in numerical_cols
numerical_cols = data.select_dtypes(include=['int','float']).columns

# checking
numerical_cols

Index(['customer_id', 'vintage', 'age', 'dependents', 'city',
       'customer_nw_category', 'branch_code', 'current_balance',
       'previous_month_end_balance', 'average_monthly_balance_prevQ',
       'average_monthly_balance_prevQ2', 'current_month_credit',
       'previous_month_credit', 'current_month_debit', 'previous_month_debit',
       'current_month_balance', 'previous_month_balance', 'churn'],
      dtype='object')
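
As an alternative to listing `'int'` and `'float'` separately, `select_dtypes` also accepts `'number'`, which selects all numeric dtypes at once. A small sketch on a hypothetical `toy` frame (the column values are made up for illustration):

```python
import numpy as np
import pandas as pd

# Toy frame with mixed integer widths, a float column and a text column.
toy = pd.DataFrame({
    "customer_id": np.array([1, 2, 3], dtype="int64"),
    "age": np.array([66, 35, 31], dtype="int32"),
    "gender": ["Male", "Male", "Female"],
    "current_balance": [1458.71, 5390.37, 3913.16],
})

# 'number' matches every numeric dtype, so the text column is dropped.
numeric_cols = toy.select_dtypes(include="number").columns
print(list(numeric_cols))
```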

## Min observation

In [8]:
# observation with minimum current balance
data[data['current_balance'] == data['current_balance'].min()]

Unnamed: 0,customer_id,vintage,age,gender,dependents,occupation,city,customer_nw_category,branch_code,current_balance,...,average_monthly_balance_prevQ,average_monthly_balance_prevQ2,current_month_credit,previous_month_credit,current_month_debit,previous_month_debit,current_month_balance,previous_month_balance,churn,last_transaction
12608,13467,2140,80,Male,0.0,retired,1096.0,1,27,-5503.96,...,1694.57,868.26,9471.01,2680.04,15229.44,7859.37,1050.17,2002.97,1,2019-12-26


- Customer's id is 13467
- Customer has the **minimum current balance**, which is -5503.96
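
The boolean mask used above returns every row that ties for the minimum. When only the first such row is needed, `idxmin` gives its index label directly. A minimal sketch on a hypothetical `toy` frame with made-up values:

```python
import pandas as pd

# Toy stand-in; values are illustrative only.
toy = pd.DataFrame({
    "customer_id": [101, 102, 103],
    "current_balance": [250.0, -75.5, 1200.0],
})

# idxmin returns the index label of the first row holding the minimum.
row_label = toy["current_balance"].idxmin()

# Double brackets keep the result as a one-row DataFrame.
min_row = toy.loc[[row_label]]
print(min_row)
```

The same pattern with `idxmax` would locate the observation with the maximum value.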

## Range

**Range of Age** in our data indicates the difference in age between the oldest and youngest customers.

In [9]:
# Range of Age
print(data['age'].min(), data['age'].max())

1 90


- Oldest Customer Age is 90
- Youngest Customer Age is 1
- Range is [1,90]
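
The range can also be expressed as a single number, the maximum minus the minimum. A short sketch, assuming a hypothetical `ages` series that contains the two extremes printed above plus a couple of made-up values in between:

```python
import pandas as pd

# 1 and 90 are the extremes reported above; 42 and 67 are illustrative fillers.
ages = pd.Series([1, 90, 42, 67], name="age")

# Range as a width: largest value minus smallest value.
age_range = ages.max() - ages.min()
print(age_range)  # 89
```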

## Max, Min, Range for each column

In [10]:
# Printing Max of every numerical column
data[numerical_cols].max()

customer_id                          30301.00
vintage                               2476.00
age                                     90.00
dependents                              52.00
city                                  1649.00
customer_nw_category                     3.00
branch_code                           4782.00
current_balance                    5905904.03
previous_month_end_balance         5740438.63
average_monthly_balance_prevQ      5700289.57
average_monthly_balance_prevQ2     5010170.10
current_month_credit              12269845.39
previous_month_credit              2361808.29
current_month_debit                7637857.36
previous_month_debit               1414168.06
current_month_balance              5778184.77
previous_month_balance             5720144.50
churn                                    1.00
dtype: float64

- Maximum value of vintage for a customer is 2476.
- Maximum age of a customer in our dataset is 90.
- Maximum number of dependents in our dataset is 52.
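- Maximum days since last transaction is 365 (this refers to a days_since_last_transaction feature, which is not among the numerical columns printed above).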
- Maximum values of **current_balance, previous_month_end_balance, average_monthly_balance_prevQ, current_month_balance, previous_month_balance** all lie between roughly 57 and 59 lakhs.
- Maximum value for current_month_credit is 12269845.39.
- Maximum value for previous_month_credit is 2361808.29.
- Maximum values for current_month_debit and previous_month_debit are 7637857.36 and 1414168.06, respectively.
- Features like **customer_id, city, customer_nw_category, branch_code, churn** should be treated as categorical variables, so their maximum values carry no numerical significance.
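
One way to keep such identifier-like columns out of the numerical summary is to cast them to the `category` dtype before selecting numeric columns. This is only a sketch under assumptions: the `toy` frame, its values, and the `id_like` list are hypothetical, and the real notebook may handle these columns differently.

```python
import pandas as pd

# Toy stand-in; values are illustrative only.
toy = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "branch_code": [755, 3214, 41],
    "churn": [0, 1, 0],
    "age": [66, 35, 31],
    "current_balance": [1458.71, 5390.37, 3913.16],
})

# Hypothetical list of columns whose numbers are labels, not quantities.
id_like = ["customer_id", "branch_code", "churn"]
toy = toy.astype({col: "category" for col in id_like})

# Categorical columns are no longer picked up as numeric.
numeric_cols = toy.select_dtypes(include="number").columns
print(toy[numeric_cols].max())
```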


In [11]:
# printing min of every numerical column
data[numerical_cols].min()

customer_id                           1.00
vintage                              73.00
age                                   1.00
dependents                            0.00
city                                  0.00
customer_nw_category                  1.00
branch_code                           1.00
current_balance                   -5503.96
previous_month_end_balance        -3149.57
average_monthly_balance_prevQ      1428.69
average_monthly_balance_prevQ2   -16506.10
current_month_credit                  0.01
previous_month_credit                 0.01
current_month_debit                   0.01
previous_month_debit                  0.01
current_month_balance             -3374.18
previous_month_balance            -5171.92
churn                                 0.00
dtype: float64
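
Both extremes can also be collected in one table with `agg`, which avoids calling `min()` and `max()` separately. A minimal sketch on a hypothetical `toy` frame; the extreme values are copied from the outputs above and the third row is made up:

```python
import pandas as pd

# Extremes taken from the printed outputs; the third row is an illustrative filler.
toy = pd.DataFrame({
    "age": [1, 90, 42],
    "current_balance": [-5503.96, 5905904.03, 1200.0],
})

# One row per column, with 'min' and 'max' as the two summary columns.
summary = toy.agg(["min", "max"]).T
print(summary)
```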

In [13]:
# printing the [min, max] range of every numerical column
for col in numerical_cols:
    print(f"range of {col}: [{data[col].min()}, {data[col].max()}]")

range of customer_id: [1, 30301]
range of vintage: [73, 2476]
range of age: [1, 90]
range of dependents: [0.0, 52.0]
range of city: [0.0, 1649.0]
range of customer_nw_category: [1, 3]
range of branch_code: [1, 4782]
range of current_balance: [-5503.96, 5905904.03]
range of previous_month_end_balance: [-3149.57, 5740438.63]
range of average_monthly_balance_prevQ: [1428.69, 5700289.57]
range of average_monthly_balance_prevQ2: [-16506.1, 5010170.1]
range of current_month_credit: [0.01, 12269845.39]
range of previous_month_credit: [0.01, 2361808.29]
range of current_month_debit: [0.01, 7637857.36]
range of previous_month_debit: [0.01, 1414168.06]
range of current_month_balance: [-3374.18, 5778184.77]
range of previous_month_balance: [-5171.92, 5720144.5]
range of churn: [0, 1]


- The range of current_month_credit is the widest among all features.
- The range of days_since_last_transaction is 1 year.
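
To compare ranges without reading the intervals by eye, the width of each range (max minus min) can be computed for every column and sorted. A short sketch on a hypothetical `toy` frame that holds only the min and max of three columns, copied from the output above:

```python
import pandas as pd

# Each column holds just its reported minimum and maximum.
toy = pd.DataFrame({
    "age": [1, 90],
    "current_month_credit": [0.01, 12269845.39],
    "current_month_debit": [0.01, 7637857.36],
})

# Width of the range for every column at once.
width = toy.max() - toy.min()
print(width.sort_values(ascending=False))

# Name of the column with the widest range.
print(width.idxmax())
```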