# All required imports

In [1]:
import pandas as pd
import numpy as np
import pandas_profiling

In [26]:
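# None disables truncation of long cell values; the older -1 is rejected by recent pandas versions
pd.set_option('display.max_colwidth', None)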
pd.set_option('display.max_rows', 219)

# General description of the data

- **application_{train|test}.csv**

This is the main table, broken into two files for Train (with TARGET) and Test (without TARGET).
Static data for all applications. One row represents one loan in our data sample.

- **bureau.csv**

All of the client's previous credits provided by other financial institutions that were reported to the Credit Bureau (for clients who have a loan in our sample).
For every loan in our sample, there are as many rows as the number of credits the client had in the Credit Bureau before the application date.

- **bureau_balance.csv**

Monthly balances of previous credits in Credit Bureau.
This table has one row for each month of history of every previous credit reported to Credit Bureau – i.e. the table has (#loans in sample * # of related previous credits * # of months where we have some history observable for the previous credits) rows.

- **POS_CASH_balance.csv**

Monthly balance snapshots of previous POS (point of sales) and cash loans that the applicant had with Home Credit.
This table has one row for each month of history of every previous credit in Home Credit (consumer credit and cash loans) related to loans in our sample – i.e. the table has (#loans in sample * # of related previous credits * # of months in which we have some history observable for the previous credits) rows.

- **credit_card_balance.csv**

Monthly balance snapshots of previous credit cards that the applicant has with Home Credit.
This table has one row for each month of history of every previous credit in Home Credit (consumer credit and cash loans) related to loans in our sample – i.e. the table has (#loans in sample * # of related previous credit cards * # of months where we have some history observable for the previous credit card) rows.

- **previous_application.csv**

All previous applications for Home Credit loans of clients who have loans in our sample.
There is one row for each previous application related to loans in our data sample.

- **installments_payments.csv**

Repayment history for the previously disbursed credits in Home Credit related to the loans in our sample.
There is a) one row for every payment that was made plus b) one row for each missed payment.
One row is equivalent to one payment of one installment OR one installment corresponding to one payment of one previous Home Credit credit related to loans in our sample.

- **HomeCredit_columns_description.csv**

This file contains descriptions for the columns in the various data files.
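
The row-count relationships described above can be checked on a toy example. The sketch below is only an illustration: the key columns `SK_ID_CURR` (a loan in the sample) and `SK_ID_BUREAU` (a previous bureau credit), as well as `MONTHS_BALANCE`, are assumed names that the descriptions above do not mention, and the tiny frames are made-up data rather than the real files.

```python
import pandas as pd

# Toy stand-ins for application_train.csv, bureau.csv and bureau_balance.csv.
# SK_ID_CURR / SK_ID_BUREAU / MONTHS_BALANCE are assumed column names.
application = pd.DataFrame({"SK_ID_CURR": [1, 2, 3]})
bureau = pd.DataFrame({
    "SK_ID_CURR": [1, 1, 2],
    "SK_ID_BUREAU": [10, 11, 12],
})
bureau_balance = pd.DataFrame({
    "SK_ID_BUREAU": [10, 10, 11, 12, 12, 12],
    "MONTHS_BALANCE": [-1, 0, 0, -2, -1, 0],
})

# bureau: one row per previous credit, so count credits per loan.
credits_per_loan = bureau.groupby("SK_ID_CURR").size().rename("n_bureau_credits")

# bureau_balance: one row per month of history, so count months per credit
# and then roll the counts up to the loan level.
months_per_credit = (
    bureau_balance.groupby("SK_ID_BUREAU").size().rename("n_months").reset_index()
)
months_per_loan = (
    bureau.merge(months_per_credit, on="SK_ID_BUREAU", how="left")
    .groupby("SK_ID_CURR")["n_months"]
    .sum()
)

# Left joins keep loans that have no bureau history at all (counts become 0).
summary = (
    application
    .merge(credits_per_loan.reset_index(), on="SK_ID_CURR", how="left")
    .merge(months_per_loan.reset_index(), on="SK_ID_CURR", how="left")
    .fillna(0)
    .astype(int)
)
```

Using left joins from the main table is a deliberate choice here: a client with no reported credits simply has no rows in the child tables, and an inner join would silently drop that loan.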

# Reading the data

First we simply read the data into memory, then build an overview report for each dataset, and only after that look at the relationships between the datasets.

In [2]:
train = pd.read_csv('../data/dataset/application_train.csv')

In [3]:
test = pd.read_csv('../data/dataset/application_test.csv')

In [4]:
bureau = pd.read_csv('../data/dataset/bureau.csv')

In [5]:
bureau_balance = pd.read_csv('../data/dataset/bureau_balance.csv')

In [6]:
credit_card_balance = pd.read_csv('../data/dataset/credit_card_balance.csv')

In [30]:
columns_description = pd.read_csv(
    filepath_or_buffer='../data/dataset/HomeCredit_columns_description.csv',
    encoding='ISO-8859-1',
    index_col=0
)
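
Once the description file is loaded, it can serve as a lookup table while exploring the other files, which is also where an unlimited `display.max_colwidth` helps. The helper below is a hypothetical sketch: it assumes the file has `Table`, `Row` and `Description` columns, which should be checked against the real file, and it runs on a made-up two-row frame.

```python
import pandas as pd

# Toy stand-in for HomeCredit_columns_description.csv; the column names
# Table / Row / Description are an assumption about the real file layout,
# and the description texts are placeholders.
toy_descriptions = pd.DataFrame({
    "Table": ["application_{train|test}.csv", "bureau.csv"],
    "Row": ["TARGET", "SOME_BUREAU_COLUMN"],
    "Description": [
        "Hypothetical description of TARGET",
        "Hypothetical description of SOME_BUREAU_COLUMN",
    ],
})


def describe_column(descriptions: pd.DataFrame, column: str) -> list:
    """Return every description recorded for the given column name."""
    # Strip whitespace in case the names in the file carry stray spaces.
    mask = descriptions["Row"].str.strip() == column
    return descriptions.loc[mask, "Description"].tolist()


target_description = describe_column(toy_descriptions, "TARGET")
```

A list is returned rather than a single string because the same column name may appear in more than one table.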

In [8]:
installments_payments = pd.read_csv('../data/dataset/installments_payments.csv')

In [9]:
POS_CASH_balance = pd.read_csv('../data/dataset/POS_CASH_balance.csv')

In [10]:
previous_application = pd.read_csv('../data/dataset/previous_application.csv')

In [11]:
sample_submission = pd.read_csv('../data/dataset/sample_submission.csv')
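
The individual `read_csv` cells above could also be collapsed into one helper that returns a dictionary of frames, which makes it easier to loop over the datasets when building a report for each of them. This is a minimal sketch: `load_tables` is a hypothetical name, and it does not handle files that need special options, such as the `encoding` and `index_col` used for the description file.

```python
from pathlib import Path

import pandas as pd


def load_tables(data_dir, names):
    """Read `<data_dir>/<name>.csv` for every name into a dict of DataFrames."""
    data_dir = Path(data_dir)
    return {name: pd.read_csv(data_dir / f"{name}.csv") for name in names}


# Possible usage with the paths from the cells above (not executed here):
# tables = load_tables('../data/dataset', ['application_train', 'application_test', 'bureau'])
# train = tables['application_train']
```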

# General description of the data

In [13]:
pandas_profiling.ProfileReport(train)
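
The full report is long for a table with 122 columns, so a compact per-column summary in plain pandas can be a useful companion. The sketch below reproduces only three of the reported figures (distinct count, missing count, missing share); `quick_overview` is a hypothetical helper and the toy frame is made up. Note also that newer releases of the profiling library appear to be published as `ydata-profiling`, so the import may need adjusting in a fresh environment.

```python
import numpy as np
import pandas as pd


def quick_overview(df: pd.DataFrame) -> pd.DataFrame:
    """Per-column distinct count, missing count and missing share (in %)."""
    return pd.DataFrame({
        # dropna=False counts a missing value as its own distinct value
        "distinct_count": df.nunique(dropna=False),
        "missing_n": df.isna().sum(),
        "missing_pct": (df.isna().mean() * 100).round(1),
    })


# Toy frame standing in for `train`; the values are made up.
toy = pd.DataFrame({
    "amount": [100.0, 250.0, np.nan, 100.0],
    "gender": ["F", "M", "F", "F"],
})
overview = quick_overview(toy)
```

On the real data this would be called as `quick_overview(train)` and could be sorted by `missing_pct` to find the sparsest columns first.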

Statistic,Value
Number of variables,122
Number of observations,307511
Total Missing (%),9.6%
Total size in memory,286.2 MiB
Average record size in memory,976.0 B

Variable type,Count
Numeric,39
Categorical,16
Boolean,33
Date,0
Text (Unique),0
Rejected,34
Unsupported,0

0,1
Distinct count,13673
Unique (%),4.4%
Missing (%),0.0%
Missing (n),12
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,27109
Minimum,1615.5
Maximum,258030
Zeros (%),0.0%

0,1
Minimum,1615.5
5-th percentile,9000.0
Q1,16524.0
Median,24903.0
Q3,34596.0
95-th percentile,53325.0
Maximum,258030.0
Range,256410.0
Interquartile range,18072.0

0,1
Standard deviation,14494
Coef of variation,0.53466
Kurtosis,7.7073
Mean,27109
MAD,10975
Skewness,1.5798
Sum,8335900000
Variance,210070000
Memory size,2.3 MiB
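
Several rows in the block above are derived from the others, which gives a quick way to sanity-check the report. The sketch below recomputes them from the rounded values printed for this variable, so small differences from the report in the last digits are expected.

```python
# Rounded values copied from the report block above.
mean = 27109
std = 14494
q1, q3 = 16524.0, 34596.0
minimum, maximum = 1615.5, 258030

coef_of_variation = std / mean      # standard deviation relative to the mean
interquartile_range = q3 - q1       # spread of the middle 50% of values
value_range = maximum - minimum     # full spread, driven by the extremes
variance = std ** 2                 # square of the standard deviation

print(round(coef_of_variation, 4), interquartile_range)
```

The coefficient of variation and the interquartile range agree with the reported 0.53466 and 18072.0; the range and variance differ slightly only because the inputs here are already rounded.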

Value,Count,Frequency (%),Unnamed: 3
9000.0,6385,2.1%,
13500.0,5514,1.8%,
6750.0,2279,0.7%,
10125.0,2035,0.7%,
37800.0,1602,0.5%,
11250.0,1459,0.5%,
26217.0,1453,0.5%,
20250.0,1345,0.4%,
12375.0,1339,0.4%,
31653.0,1269,0.4%,

Value,Count,Frequency (%),Unnamed: 3
1615.5,1,0.0%,
1980.0,2,0.0%,
1993.5,1,0.0%,
2052.0,1,0.0%,
2164.5,2,0.0%,

Value,Count,Frequency (%),Unnamed: 3
216589.5,1,0.0%,
220297.5,1,0.0%,
225000.0,23,0.0%,
230161.5,1,0.0%,
258025.5,1,0.0%,

0,1
Distinct count,5603
Unique (%),1.8%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,599030
Minimum,45000
Maximum,4050000
Zeros (%),0.0%

0,1
Minimum,45000
5-th percentile,135000
Q1,270000
Median,513530
Q3,808650
95-th percentile,1350000
Maximum,4050000
Range,4005000
Interquartile range,538650

0,1
Standard deviation,402490
Coef of variation,0.67191
Kurtosis,1.934
Mean,599030
MAD,316580
Skewness,1.2348
Sum,184210000000
Variance,162000000000
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
450000.0,9709,3.2%,
675000.0,8877,2.9%,
225000.0,8162,2.7%,
180000.0,7342,2.4%,
270000.0,7241,2.4%,
900000.0,6246,2.0%,
254700.0,4500,1.5%,
545040.0,4437,1.4%,
808650.0,4152,1.4%,
135000.0,3660,1.2%,

Value,Count,Frequency (%),Unnamed: 3
45000.0,230,0.1%,
47970.0,218,0.1%,
48519.0,1,0.0%,
49455.0,19,0.0%,
49500.0,40,0.0%,

Value,Count,Frequency (%),Unnamed: 3
3860019.0,1,0.0%,
3956274.0,1,0.0%,
4027680.0,1,0.0%,
4031032.5,1,0.0%,
4050000.0,8,0.0%,

0,1
Correlation,0.98697

0,1
Distinct count,2548
Unique (%),0.8%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,168800
Minimum,25650
Maximum,117000000
Zeros (%),0.0%

0,1
Minimum,25650
5-th percentile,67500
Q1,112500
Median,147150
Q3,202500
95-th percentile,337500
Maximum,117000000
Range,116970000
Interquartile range,90000

0,1
Standard deviation,237120
Coef of variation,1.4048
Kurtosis,191790
Mean,168800
MAD,66226
Skewness,391.56
Sum,51907000000
Variance,56227000000
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
135000.0,35750,11.6%,
112500.0,31019,10.1%,
157500.0,26556,8.6%,
180000.0,24719,8.0%,
90000.0,22483,7.3%,
225000.0,20698,6.7%,
202500.0,16341,5.3%,
67500.0,11147,3.6%,
270000.0,10827,3.5%,
81000.0,6001,2.0%,

Value,Count,Frequency (%),Unnamed: 3
25650.0,2,0.0%,
26100.0,3,0.0%,
26460.0,1,0.0%,
26550.0,2,0.0%,
27000.0,66,0.0%,

Value,Count,Frequency (%),Unnamed: 3
6750000.0,1,0.0%,
9000000.0,1,0.0%,
13500000.0,1,0.0%,
18000090.0,1,0.0%,
117000000.0,1,0.0%,

0,1
Distinct count,10
Unique (%),0.0%
Missing (%),13.5%
Missing (n),41519
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.0070002
Minimum,0
Maximum,9
Zeros (%),86.0%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,0
Q3,0
95-th percentile,0
Maximum,9
Range,9
Interquartile range,0

0,1
Standard deviation,0.11076
Coef of variation,15.822
Kurtosis,1151.9
Mean,0.0070002
MAD,0.013922
Skewness,27.044
Sum,1862
Variance,0.012267
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,264503,86.0%,
1.0,1292,0.4%,
2.0,106,0.0%,
3.0,45,0.0%,
4.0,26,0.0%,
5.0,9,0.0%,
6.0,8,0.0%,
9.0,2,0.0%,
8.0,1,0.0%,
(Missing),41519,13.5%,

Value,Count,Frequency (%),Unnamed: 3
0.0,264503,86.0%,
1.0,1292,0.4%,
2.0,106,0.0%,
3.0,45,0.0%,
4.0,26,0.0%,

Value,Count,Frequency (%),Unnamed: 3
4.0,26,0.0%,
5.0,9,0.0%,
6.0,8,0.0%,
8.0,1,0.0%,
9.0,2,0.0%,

0,1
Distinct count,6
Unique (%),0.0%
Missing (%),13.5%
Missing (n),41519
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.0064024
Minimum,0
Maximum,4
Zeros (%),86.0%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,0
Q3,0
95-th percentile,0
Maximum,4
Range,4
Interquartile range,0

0,1
Standard deviation,0.083849
Coef of variation,13.096
Kurtosis,254.24
Mean,0.0064024
MAD,0.012727
Skewness,14.534
Sum,1703
Variance,0.0070307
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,264366,86.0%,
1.0,1560,0.5%,
2.0,56,0.0%,
3.0,9,0.0%,
4.0,1,0.0%,
(Missing),41519,13.5%,

Value,Count,Frequency (%),Unnamed: 3
0.0,264366,86.0%,
1.0,1560,0.5%,
2.0,56,0.0%,
3.0,9,0.0%,
4.0,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.0,264366,86.0%,
1.0,1560,0.5%,
2.0,56,0.0%,
3.0,9,0.0%,
4.0,1,0.0%,

0,1
Distinct count,25
Unique (%),0.0%
Missing (%),13.5%
Missing (n),41519
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.2674
Minimum,0
Maximum,27
Zeros (%),72.3%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,0
Q3,0
95-th percentile,1
Maximum,27
Range,27
Interquartile range,0

0,1
Standard deviation,0.916
Coef of variation,3.4256
Kurtosis,90.435
Mean,0.2674
MAD,0.44681
Skewness,7.8048
Sum,71125
Variance,0.83906
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,222233,72.3%,
1.0,33147,10.8%,
2.0,5386,1.8%,
3.0,1991,0.6%,
4.0,1076,0.3%,
5.0,602,0.2%,
6.0,343,0.1%,
7.0,298,0.1%,
9.0,206,0.1%,
8.0,185,0.1%,

Value,Count,Frequency (%),Unnamed: 3
0.0,222233,72.3%,
1.0,33147,10.8%,
2.0,5386,1.8%,
3.0,1991,0.6%,
4.0,1076,0.3%,

Value,Count,Frequency (%),Unnamed: 3
19.0,3,0.0%,
22.0,1,0.0%,
23.0,1,0.0%,
24.0,1,0.0%,
27.0,1,0.0%,

0,1
Distinct count,12
Unique (%),0.0%
Missing (%),13.5%
Missing (n),41519
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.26547
Minimum,0
Maximum,261
Zeros (%),70.1%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,0
Q3,0
95-th percentile,2
Maximum,261
Range,261
Interquartile range,0

0,1
Standard deviation,0.79406
Coef of variation,2.9911
Kurtosis,43707
Mean,0.26547
MAD,0.43
Skewness,134.37
Sum,70614
Variance,0.63052
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,215417,70.1%,
1.0,33862,11.0%,
2.0,14412,4.7%,
3.0,1717,0.6%,
4.0,476,0.2%,
5.0,64,0.0%,
6.0,28,0.0%,
7.0,7,0.0%,
8.0,7,0.0%,
19.0,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.0,215417,70.1%,
1.0,33862,11.0%,
2.0,14412,4.7%,
3.0,1717,0.6%,
4.0,476,0.2%,

Value,Count,Frequency (%),Unnamed: 3
6.0,28,0.0%,
7.0,7,0.0%,
8.0,7,0.0%,
19.0,1,0.0%,
261.0,1,0.0%,

0,1
Distinct count,10
Unique (%),0.0%
Missing (%),13.5%
Missing (n),41519
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.034362
Minimum,0
Maximum,8
Zeros (%),83.7%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,0
Q3,0
95-th percentile,0
Maximum,8
Range,8
Interquartile range,0

0,1
Standard deviation,0.20468
Coef of variation,5.9567
Kurtosis,166.75
Mean,0.034362
MAD,0.066518
Skewness,9.2936
Sum,9140
Variance,0.041896
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,257456,83.7%,
1.0,8208,2.7%,
2.0,199,0.1%,
3.0,58,0.0%,
4.0,34,0.0%,
6.0,20,0.0%,
5.0,10,0.0%,
8.0,5,0.0%,
7.0,2,0.0%,
(Missing),41519,13.5%,

Value,Count,Frequency (%),Unnamed: 3
0.0,257456,83.7%,
1.0,8208,2.7%,
2.0,199,0.1%,
3.0,58,0.0%,
4.0,34,0.0%,

Value,Count,Frequency (%),Unnamed: 3
4.0,34,0.0%,
5.0,10,0.0%,
6.0,20,0.0%,
7.0,2,0.0%,
8.0,5,0.0%,

0,1
Distinct count,26
Unique (%),0.0%
Missing (%),13.5%
Missing (n),41519
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,1.9
Minimum,0
Maximum,25
Zeros (%),23.3%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,1
Q3,3
95-th percentile,6
Maximum,25
Range,25
Interquartile range,3

0,1
Standard deviation,1.8693
Coef of variation,0.98385
Kurtosis,1.969
Mean,1.9
MAD,1.4548
Skewness,1.2436
Sum,505380
Variance,3.4943
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,71801,23.3%,
1.0,63405,20.6%,
2.0,50192,16.3%,
3.0,33628,10.9%,
4.0,20714,6.7%,
5.0,12052,3.9%,
6.0,6967,2.3%,
7.0,3869,1.3%,
8.0,2127,0.7%,
9.0,1096,0.4%,

Value,Count,Frequency (%),Unnamed: 3
0.0,71801,23.3%,
1.0,63405,20.6%,
2.0,50192,16.3%,
3.0,33628,10.9%,
4.0,20714,6.7%,

Value,Count,Frequency (%),Unnamed: 3
20.0,1,0.0%,
21.0,1,0.0%,
22.0,1,0.0%,
23.0,1,0.0%,
25.0,1,0.0%,

0,1
Distinct count,2340
Unique (%),0.8%
Missing (%),50.7%
Missing (n),156061
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.11744
Minimum,0
Maximum,1
Zeros (%),0.2%

0,1
Minimum,0.0
5-th percentile,0.0082
Q1,0.0577
Median,0.0876
Q3,0.1485
95-th percentile,0.3268
Maximum,1.0
Range,1.0
Interquartile range,0.0908

0,1
Standard deviation,0.10824
Coef of variation,0.92166
Kurtosis,11.394
Mean,0.11744
MAD,0.073286
Skewness,2.6418
Sum,17786
Variance,0.011716
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0825,6663,2.2%,
0.0619,6332,2.1%,
0.0928,4404,1.4%,
0.0722,3986,1.3%,
0.0082,3507,1.1%,
0.0165,3027,1.0%,
0.1031,2892,0.9%,
0.1485,2769,0.9%,
0.0124,2721,0.9%,
0.0742,2231,0.7%,

Value,Count,Frequency (%),Unnamed: 3
0.0,751,0.2%,
0.001,197,0.1%,
0.0014,1,0.0%,
0.0015,6,0.0%,
0.0017,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.9814,9,0.0%,
0.9876,7,0.0%,
0.9897,1,0.0%,
0.9907,2,0.0%,
1.0,147,0.0%,

0,1
Correlation,0.93217

0,1
Correlation,0.90828

0,1
Distinct count,3781
Unique (%),1.2%
Missing (%),58.5%
Missing (n),179943
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.088442
Minimum,0
Maximum,1
Zeros (%),4.8%

0,1
Minimum,0.0
5-th percentile,0.0
Q1,0.0442
Median,0.0763
Q3,0.1122
95-th percentile,0.2237
Maximum,1.0
Range,1.0
Interquartile range,0.068

0,1
Standard deviation,0.082438
Coef of variation,0.93211
Kurtosis,25.93
Mean,0.088442
MAD,0.052361
Skewness,3.5663
Sum,11282
Variance,0.0067961
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,14745,4.8%,
0.0545,251,0.1%,
0.0818,251,0.1%,
0.0727,248,0.1%,
0.1091,246,0.1%,
0.0796,245,0.1%,
0.08,239,0.1%,
0.0805,230,0.1%,
0.0764,220,0.1%,
0.0793,211,0.1%,

Value,Count,Frequency (%),Unnamed: 3
0.0,14745,4.8%,
0.0001,99,0.0%,
0.0002,38,0.0%,
0.0003,8,0.0%,
0.0004,33,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.9677,1,0.0%,
0.9682,1,0.0%,
0.9694,2,0.0%,
0.9945,1,0.0%,
1.0,130,0.0%,

0,1
Correlation,0.97794

0,1
Correlation,0.9735

0,1
Distinct count,15
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.41705
Minimum,0
Maximum,19
Zeros (%),70.0%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,0
Q3,1
95-th percentile,2
Maximum,19
Range,19
Interquartile range,1

0,1
Standard deviation,0.72212
Coef of variation,1.7315
Kurtosis,7.9041
Mean,0.41705
MAD,0.58418
Skewness,1.9746
Sum,128248
Variance,0.52146
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0,215371,70.0%,
1,61119,19.9%,
2,26749,8.7%,
3,3717,1.2%,
4,429,0.1%,
5,84,0.0%,
6,21,0.0%,
7,7,0.0%,
14,3,0.0%,
19,2,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0,215371,70.0%,
1,61119,19.9%,
2,26749,8.7%,
3,3717,1.2%,
4,429,0.1%,

Value,Count,Frequency (%),Unnamed: 3
10,2,0.0%,
11,1,0.0%,
12,2,0.0%,
14,3,0.0%,
19,2,0.0%,

0,1
Distinct count,18
Unique (%),0.0%
Missing (%),0.0%
Missing (n),2
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,2.1527
Minimum,1
Maximum,20
Zeros (%),0.0%

0,1
Minimum,1
5-th percentile,1
Q1,2
Median,2
Q3,3
95-th percentile,4
Maximum,20
Range,19
Interquartile range,1

0,1
Standard deviation,0.91068
Coef of variation,0.42305
Kurtosis,2.802
Mean,2.1527
MAD,0.66587
Skewness,0.98754
Sum,661960
Variance,0.82934
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
2.0,158357,51.5%,
1.0,67847,22.1%,
3.0,52601,17.1%,
4.0,24697,8.0%,
5.0,3478,1.1%,
6.0,408,0.1%,
7.0,81,0.0%,
8.0,20,0.0%,
9.0,6,0.0%,
10.0,3,0.0%,

Value,Count,Frequency (%),Unnamed: 3
1.0,67847,22.1%,
2.0,158357,51.5%,
3.0,52601,17.1%,
4.0,24697,8.0%,
5.0,3478,1.1%,

Value,Count,Frequency (%),Unnamed: 3
13.0,1,0.0%,
14.0,2,0.0%,
15.0,1,0.0%,
16.0,2,0.0%,
20.0,2,0.0%,

0,1
Distinct count,3
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
F,202448
M,105059
XNA,4

Value,Count,Frequency (%),Unnamed: 3
F,202448,65.8%,
M,105059,34.2%,
XNA,4,0.0%,

0,1
Distinct count,3182
Unique (%),1.0%
Missing (%),69.9%
Missing (n),214865
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.044621
Minimum,0
Maximum,1
Zeros (%),2.7%

0,1
Minimum,0.0
5-th percentile,0.0
Q1,0.0078
Median,0.0211
Q3,0.0515
95-th percentile,0.1601
Maximum,1.0
Range,1.0
Interquartile range,0.0437

0,1
Standard deviation,0.076036
Coef of variation,1.704
Kurtosis,45.988
Mean,0.044621
MAD,0.042024
Skewness,5.4573
Sum,4133.9
Variance,0.0057814
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,8442,2.7%,
0.0079,544,0.2%,
0.0078,475,0.2%,
0.008,446,0.1%,
0.0077,414,0.1%,
0.0086,365,0.1%,
0.0014,345,0.1%,
0.007,343,0.1%,
0.0013,317,0.1%,
0.0069,314,0.1%,

Value,Count,Frequency (%),Unnamed: 3
0.0,8442,2.7%,
0.0001,45,0.0%,
0.0002,67,0.0%,
0.0003,84,0.0%,
0.0004,62,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.9601,1,0.0%,
0.9833,1,0.0%,
0.9906,2,0.0%,
0.9937,2,0.0%,
1.0,92,0.0%,

0,1
Correlation,0.97989

0,1
Correlation,0.97715

0,1
Distinct count,17460
Unique (%),5.7%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,-16037
Minimum,-25229
Maximum,-7489
Zeros (%),0.0%

0,1
Minimum,-25229
5-th percentile,-23204
Q1,-19682
Median,-15750
Q3,-12413
95-th percentile,-9407
Maximum,-7489
Range,17740
Interquartile range,7269

0,1
Standard deviation,4364
Coef of variation,-0.27212
Kurtosis,-1.0491
Mean,-16037
MAD,3728.4
Skewness,-0.11567
Sum,-4931552390
Variance,19044000
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
-13749,43,0.0%,
-13481,42,0.0%,
-18248,41,0.0%,
-10020,41,0.0%,
-15771,40,0.0%,
-10292,40,0.0%,
-14395,39,0.0%,
-14267,39,0.0%,
-13263,39,0.0%,
-11664,39,0.0%,

Value,Count,Frequency (%),Unnamed: 3
-25229,1,0.0%,
-25201,2,0.0%,
-25200,1,0.0%,
-25197,2,0.0%,
-25196,4,0.0%,

Value,Count,Frequency (%),Unnamed: 3
-7679,1,0.0%,
-7678,3,0.0%,
-7676,2,0.0%,
-7673,1,0.0%,
-7489,1,0.0%,

0,1
Distinct count,12574
Unique (%),4.1%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,63815
Minimum,-17912
Maximum,365243
Zeros (%),0.0%

0,1
Minimum,-17912.0
5-th percentile,-6742.5
Q1,-2760.0
Median,-1213.0
Q3,-289.0
95-th percentile,365240.0
Maximum,365243.0
Range,383155.0
Interquartile range,2471.0

0,1
Standard deviation,141280
Coef of variation,2.2138
Kurtosis,0.77161
Mean,63815
MAD,108560
Skewness,1.6643
Sum,19623828581
Variance,19959000000
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
365243,55374,18.0%,
-200,156,0.1%,
-224,152,0.0%,
-199,151,0.0%,
-230,151,0.0%,
-212,150,0.0%,
-229,143,0.0%,
-384,143,0.0%,
-231,140,0.0%,
-215,138,0.0%,

Value,Count,Frequency (%),Unnamed: 3
-17912,1,0.0%,
-17583,1,0.0%,
-17546,1,0.0%,
-17531,1,0.0%,
-17522,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
-3,3,0.0%,
-2,2,0.0%,
-1,1,0.0%,
0,2,0.0%,
365243,55374,18.0%,
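
In the tables above a single value, 365243, covers 18.0% of the rows while every other value is zero or negative, and it pulls the mean to 63815 although the median is -1213. That pattern suggests a placeholder rather than a real measurement; this is an interpretation of the numbers, not something the report states. Under that assumption, a minimal sketch of masking the value before computing statistics could look like this (the series is a made-up toy, not the real column):

```python
import numpy as np
import pandas as pd

SENTINEL = 365243  # the value that dominates the frequency table above

# Toy series standing in for the profiled column; apart from the sentinel,
# the values are illustrative only.
days = pd.Series([-200, -1213, SENTINEL, -2760, SENTINEL])

is_sentinel = days == SENTINEL
cleaned = days.where(~is_sentinel, np.nan)  # keep real values, blank out the sentinel

mean_raw = days.mean()          # distorted by the sentinel
mean_cleaned = cleaned.mean()   # NaN is skipped by default
```

Keeping `is_sentinel` as a separate boolean column may be worthwhile, since the fact that the value was a placeholder can itself carry information.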

0,1
Distinct count,6168
Unique (%),2.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,-2994.2
Minimum,-7197
Maximum,0
Zeros (%),0.0%

0,1
Minimum,-7197
5-th percentile,-4944
Q1,-4299
Median,-3254
Q3,-1720
95-th percentile,-375
Maximum,0
Range,7197
Interquartile range,2579

0,1
Standard deviation,1509.5
Coef of variation,-0.50412
Kurtosis,-1.1068
Mean,-2994.2
MAD,1316.2
Skewness,0.34933
Sum,-920750166
Variance,2278400
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
-4053,169,0.1%,
-4095,162,0.1%,
-4046,161,0.1%,
-4417,159,0.1%,
-4256,158,0.1%,
-4151,157,0.1%,
-4032,157,0.1%,
-4200,156,0.1%,
-4214,155,0.1%,
-4171,155,0.1%,

Value,Count,Frequency (%),Unnamed: 3
-7197,1,0.0%,
-6551,1,0.0%,
-6383,1,0.0%,
-6337,1,0.0%,
-6274,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
-4,57,0.0%,
-3,51,0.0%,
-2,50,0.0%,
-1,64,0.0%,
0,16,0.0%,

0,1
Distinct count,3774
Unique (%),1.2%
Missing (%),0.0%
Missing (n),1
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,-962.86
Minimum,-4292
Maximum,0
Zeros (%),12.3%

0,1
Minimum,-4292
5-th percentile,-2522
Q1,-1570
Median,-757
Q3,-274
95-th percentile,0
Maximum,0
Range,4292
Interquartile range,1296

0,1
Standard deviation,826.81
Coef of variation,-0.8587
Kurtosis,-0.30858
Mean,-962.86
MAD,696.28
Skewness,-0.71361
Sum,-296090000
Variance,683610
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,37672,12.3%,
-1.0,2812,0.9%,
-2.0,2318,0.8%,
-3.0,1763,0.6%,
-4.0,1285,0.4%,
-5.0,824,0.3%,
-6.0,537,0.2%,
-7.0,442,0.1%,
-8.0,278,0.1%,
-476.0,222,0.1%,

Value,Count,Frequency (%),Unnamed: 3
-4292.0,1,0.0%,
-4185.0,1,0.0%,
-4173.0,1,0.0%,
-4153.0,1,0.0%,
-4131.0,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
-4.0,1285,0.4%,
-3.0,1763,0.6%,
-2.0,2318,0.8%,
-1.0,2812,0.9%,
0.0,37672,12.3%,

0,1
Distinct count,15688
Unique (%),5.1%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,-4986.1
Minimum,-24672
Maximum,0
Zeros (%),0.0%

0,1
Minimum,-24672.0
5-th percentile,-11416.0
Q1,-7479.5
Median,-4504.0
Q3,-2010.0
95-th percentile,-330.0
Maximum,0.0
Range,24672.0
Interquartile range,5469.5

0,1
Standard deviation,3522.9
Coef of variation,-0.70654
Kurtosis,-0.32135
Mean,-4986.1
MAD,2915.4
Skewness,-0.59087
Sum,-1533300000
Variance,12411000
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
-1.0,113,0.0%,
-7.0,98,0.0%,
-6.0,96,0.0%,
-2.0,92,0.0%,
-4.0,92,0.0%,
-5.0,86,0.0%,
-9.0,84,0.0%,
-3.0,84,0.0%,
0.0,80,0.0%,
-21.0,80,0.0%,

Value,Count,Frequency (%),Unnamed: 3
-24672.0,1,0.0%,
-23738.0,1,0.0%,
-23416.0,1,0.0%,
-22928.0,1,0.0%,
-22858.0,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
-4.0,92,0.0%,
-3.0,84,0.0%,
-2.0,92,0.0%,
-1.0,113,0.0%,
0.0,80,0.0%,

0,1
Distinct count,11
Unique (%),0.0%
Missing (%),0.3%
Missing (n),1021
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.14342
Minimum,0
Maximum,34
Zeros (%),88.2%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,0
Q3,0
95-th percentile,1
Maximum,34
Range,34
Interquartile range,0

0,1
Standard deviation,0.4467
Coef of variation,3.1146
Kurtosis,126.31
Mean,0.14342
MAD,0.25393
Skewness,5.1835
Sum,43957
Variance,0.19954
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,271324,88.2%,
1.0,28328,9.2%,
2.0,5323,1.7%,
3.0,1192,0.4%,
4.0,253,0.1%,
5.0,56,0.0%,
6.0,11,0.0%,
7.0,1,0.0%,
8.0,1,0.0%,
34.0,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.0,271324,88.2%,
1.0,28328,9.2%,
2.0,5323,1.7%,
3.0,1192,0.4%,
4.0,253,0.1%,

Value,Count,Frequency (%),Unnamed: 3
5.0,56,0.0%,
6.0,11,0.0%,
7.0,1,0.0%,
8.0,1,0.0%,
34.0,1,0.0%,

0,1
Distinct count,10
Unique (%),0.0%
Missing (%),0.3%
Missing (n),1021
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.10005
Minimum,0
Maximum,24
Zeros (%),91.3%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,0
Q3,0
95-th percentile,1
Maximum,24
Range,24
Interquartile range,0

0,1
Standard deviation,0.36229
Coef of variation,3.6211
Kurtosis,86.563
Mean,0.10005
MAD,0.18327
Skewness,5.2779
Sum,30664
Variance,0.13125
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,280721,91.3%,
1.0,21841,7.1%,
2.0,3170,1.0%,
3.0,598,0.2%,
4.0,135,0.0%,
5.0,20,0.0%,
6.0,3,0.0%,
24.0,1,0.0%,
7.0,1,0.0%,
(Missing),1021,0.3%,

Value,Count,Frequency (%),Unnamed: 3
0.0,280721,91.3%,
1.0,21841,7.1%,
2.0,3170,1.0%,
3.0,598,0.2%,
4.0,135,0.0%,

Value,Count,Frequency (%),Unnamed: 3
4.0,135,0.0%,
5.0,20,0.0%,
6.0,3,0.0%,
7.0,1,0.0%,
24.0,1,0.0%,

0,1
Distinct count,258
Unique (%),0.1%
Missing (%),53.3%
Missing (n),163891
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.078942
Minimum,0
Maximum,1
Zeros (%),27.9%

0,1
Minimum,0.0
5-th percentile,0.0
Q1,0.0
Median,0.0
Q3,0.12
95-th percentile,0.36
Maximum,1.0
Range,1.0
Interquartile range,0.12

0,1
Standard deviation,0.13458
Coef of variation,1.7048
Kurtosis,7.8694
Mean,0.078942
MAD,0.09788
Skewness,2.4394
Sum,11338
Variance,0.018111
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,85718,27.9%,
0.08,9886,3.2%,
0.16,8806,2.9%,
0.24,6071,2.0%,
0.12,5593,1.8%,
0.04,4585,1.5%,
0.2,4072,1.3%,
0.32,2788,0.9%,
0.28,2272,0.7%,
0.4,1532,0.5%,

Value,Count,Frequency (%),Unnamed: 3
0.0,85718,27.9%,
0.002,1,0.0%,
0.0024,1,0.0%,
0.0048,3,0.0%,
0.0064,5,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.9,6,0.0%,
0.92,20,0.0%,
0.9332,2,0.0%,
0.96,81,0.0%,
1.0,158,0.1%,

0,1
Correlation,0.98283

0,1
Correlation,0.97884

0,1
Distinct count,3
Unique (%),0.0%
Missing (%),47.4%
Missing (n),145755

0,1
No,159428
Yes,2328
(Missing),145755

Value,Count,Frequency (%),Unnamed: 3
No,159428,51.8%,
Yes,2328,0.8%,
(Missing),145755,47.4%,

0,1
Distinct count,286
Unique (%),0.1%
Missing (%),50.3%
Missing (n),154828
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.14972
Minimum,0
Maximum,1
Zeros (%),0.1%

0,1
Minimum,0.0
5-th percentile,0.0345
Q1,0.069
Median,0.1379
Q3,0.2069
95-th percentile,0.3103
Maximum,1.0
Range,1.0
Interquartile range,0.1379

0,1
Standard deviation,0.10005
Coef of variation,0.66822
Kurtosis,11.593
Mean,0.14972
MAD,0.069965
Skewness,2.3997
Sum,22860
Variance,0.01001
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.1379,34007,11.1%,
0.069,22956,7.5%,
0.1034,19533,6.4%,
0.2069,19062,6.2%,
0.0345,15380,5.0%,
0.1724,9185,3.0%,
0.2759,7895,2.6%,
0.2414,4165,1.4%,
0.3448,2066,0.7%,
0.3103,2049,0.7%,

Value,Count,Frequency (%),Unnamed: 3
0.0,323,0.1%,
0.0055,1,0.0%,
0.0086,2,0.0%,
0.0114,1,0.0%,
0.0172,7,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.8621,14,0.0%,
0.8966,52,0.0%,
0.931,21,0.0%,
0.9655,25,0.0%,
1.0,153,0.0%,

0,1
Correlation,0.98068

0,1
Correlation,0.97774

0,1
Distinct count,114585
Unique (%),37.3%
Missing (%),56.4%
Missing (n),173378
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.50213
Minimum,0.014568
Maximum,0.96269
Zeros (%),0.0%

0,1
Minimum,0.014568
5-th percentile,0.15802
Q1,0.33401
Median,0.506
Q3,0.67505
95-th percentile,0.83226
Maximum,0.96269
Range,0.94812
Interquartile range,0.34105

0,1
Standard deviation,0.21106
Coef of variation,0.42033
Kurtosis,-0.96516
Mean,0.50213
MAD,0.17916
Skewness,-0.068755
Sum,67352
Variance,0.044547
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.5464264086050881,5,0.0%,
0.5984686928074158,5,0.0%,
0.4990017461254777,5,0.0%,
0.605151661169131,5,0.0%,
0.4439821179601821,5,0.0%,
0.528197430013715,5,0.0%,
0.6227066347478732,5,0.0%,
0.7657236984386736,5,0.0%,
0.5810147955776347,5,0.0%,
0.6677395635616753,5,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.0145681324124455,1,0.0%,
0.0146914824034173,1,0.0%,
0.0150529213041636,1,0.0%,
0.0156000805809039,1,0.0%,
0.0170946577910388,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.946075521513652,1,0.0%,
0.946097614386872,1,0.0%,
0.9476493853501726,1,0.0%,
0.9516239622079844,1,0.0%,
0.962692770561306,1,0.0%,

0,1
Distinct count,119832
Unique (%),39.0%
Missing (%),0.2%
Missing (n),660
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.51439
Minimum,8.1736e-08
Maximum,0.855
Zeros (%),0.0%

0,1
Minimum,8.1736e-08
5-th percentile,0.1333
Q1,0.39246
Median,0.56596
Q3,0.66362
95-th percentile,0.74773
Maximum,0.855
Range,0.855
Interquartile range,0.27116

0,1
Standard deviation,0.19106
Coef of variation,0.37143
Kurtosis,-0.26913
Mean,0.51439
MAD,0.15717
Skewness,-0.79358
Sum,157840
Variance,0.036504
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.2858978721410488,721,0.2%,
0.2622583692422573,417,0.1%,
0.26525634018619443,343,0.1%,
0.15967923350263774,322,0.1%,
0.2653117484731741,306,0.1%,
0.26651977539251576,244,0.1%,
0.2631435910213423,243,0.1%,
0.16214456766623808,238,0.1%,
0.16219210595922867,234,0.1%,
0.16318703546427088,184,0.1%,

Value,Count,Frequency (%),Unnamed: 3
8.173616518884397e-08,1,0.0%,
1.3159555812626235e-06,1,0.0%,
5.002108762101576e-06,1,0.0%,
5.600337749107766e-06,1,0.0%,
5.9396509293128426e-06,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.8206095060949257,1,0.0%,
0.8206159442383357,1,0.0%,
0.8213936273692694,1,0.0%,
0.8217142127828599,1,0.0%,
0.8549996664047012,26,0.0%,

0,1
Distinct count,815
Unique (%),0.3%
Missing (%),19.8%
Missing (n),60965
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.51085
Minimum,0.00052727
Maximum,0.89601
Zeros (%),0.0%

0,1
Minimum,0.00052727
5-th percentile,0.15474
Q1,0.37065
Median,0.53528
Q3,0.66906
95-th percentile,0.78627
Maximum,0.89601
Range,0.89548
Interquartile range,0.29841

0,1
Standard deviation,0.19484
Coef of variation,0.38141
Kurtosis,-0.66346
Mean,0.51085
MAD,0.16264
Skewness,-0.40939
Sum,125950
Variance,0.037964
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.746300213050371,1460,0.5%,
0.7136313997323308,1315,0.4%,
0.6940926425266661,1276,0.4%,
0.6706517530862718,1191,0.4%,
0.6528965519806539,1154,0.4%,
0.5814837058057234,1141,0.4%,
0.6894791426446275,1138,0.4%,
0.5954562029091491,1136,0.4%,
0.5549467685334323,1132,0.4%,
0.6212263380626669,1109,0.4%,

Value,Count,Frequency (%),Unnamed: 3
0.0005272652387098,886,0.3%,
0.0113457194348374,1,0.0%,
0.0127159238587686,1,0.0%,
0.01394846558484,1,0.0%,
0.0141482655182073,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.8825303127941461,26,0.0%,
0.8854883941521002,3,0.0%,
0.8876642018413868,1,0.0%,
0.8939760746042866,2,0.0%,
0.8960095494948396,1,0.0%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.99813

0,1
1,306937
0,574

Value,Count,Frequency (%),Unnamed: 3
1,306937,99.8%,
0,574,0.2%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,2.2763e-05

0,1
0,307504
1,7

Value,Count,Frequency (%),Unnamed: 3
0,307504,100.0%,
1,7,0.0%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.0039121

0,1
0,306308
1,1203

Value,Count,Frequency (%),Unnamed: 3
0,306308,99.6%,
1,1203,0.4%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,6.5038e-06

0,1
0,307509
1,2

Value,Count,Frequency (%),Unnamed: 3
0,307509,100.0%,
1,2,0.0%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.0035251

0,1
0,306427
1,1084

Value,Count,Frequency (%),Unnamed: 3
0,306427,99.6%,
1,1084,0.4%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.0029365

0,1
0,306608
1,903

Value,Count,Frequency (%),Unnamed: 3
0,306608,99.7%,
1,903,0.3%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.0012097

0,1
0,307139
1,372

Value,Count,Frequency (%),Unnamed: 3
0,307139,99.9%,
1,372,0.1%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.0099281

0,1
0,304458
1,3053

Value,Count,Frequency (%),Unnamed: 3
0,304458,99.0%,
1,3053,1.0%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.00026666

0,1
0,307429
1,82

Value,Count,Frequency (%),Unnamed: 3
0,307429,100.0%,
1,82,0.0%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.0081298

0,1
0,305011
1,2500

Value,Count,Frequency (%),Unnamed: 3
0,305011,99.2%,
1,2500,0.8%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.0005951

0,1
0,307328
1,183

Value,Count,Frequency (%),Unnamed: 3
0,307328,99.9%,
1,183,0.1%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,4.2275e-05

0,1
0,307498
1,13

Value,Count,Frequency (%),Unnamed: 3
0,307498,100.0%,
1,13,0.0%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.0005073

0,1
0,307355
1,156

Value,Count,Frequency (%),Unnamed: 3
0,307355,99.9%,
1,156,0.1%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.00033495

0,1
0,307408
1,103

Value,Count,Frequency (%),Unnamed: 3
0,307408,100.0%,
1,103,0.0%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.71002

0,1
1,218340
0,89171

Value,Count,Frequency (%),Unnamed: 3
1,218340,71.0%,
0,89171,29.0%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,8.1298e-05

0,1
0,307486
1,25

Value,Count,Frequency (%),Unnamed: 3
0,307486,100.0%,
1,25,0.0%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.015115

0,1
0,302863
1,4648

Value,Count,Frequency (%),Unnamed: 3
0,302863,98.5%,
1,4648,1.5%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.088055

0,1
0,280433
1,27078

Value,Count,Frequency (%),Unnamed: 3
0,280433,91.2%,
1,27078,8.8%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.00019186

0,1
0,307452
1,59

Value,Count,Frequency (%),Unnamed: 3
0,307452,100.0%,
1,59,0.0%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.081376

0,1
0,282487
1,25024

Value,Count,Frequency (%),Unnamed: 3
0,282487,91.9%,
1,25024,8.1%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.0038958

0,1
0,306313
1,1198

Value,Count,Frequency (%),Unnamed: 3
0,306313,99.6%,
1,1198,0.4%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.05672

0,1
0,290069
1,17442

Value,Count,Frequency (%),Unnamed: 3
0,290069,94.3%,
1,17442,5.7%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.81989

0,1
1,252125
0,55386

Value,Count,Frequency (%),Unnamed: 3
1,252125,82.0%,
0,55386,18.0%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,1

0,1
1,307510
0,1

Value,Count,Frequency (%),Unnamed: 3
1,307510,100.0%,
0,1,0.0%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
N,202924
Y,104587

Value,Count,Frequency (%),Unnamed: 3
N,202924,66.0%,
Y,104587,34.0%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Y,213312
N,94199

Value,Count,Frequency (%),Unnamed: 3
Y,213312,69.4%,
N,94199,30.6%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.28107

0,1
0,221080
1,86431

Value,Count,Frequency (%),Unnamed: 3
0,221080,71.9%,
1,86431,28.1%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.19937

0,1
0,246203
1,61308

Value,Count,Frequency (%),Unnamed: 3
0,246203,80.1%,
1,61308,19.9%,

0,1
Distinct count,404
Unique (%),0.1%
Missing (%),49.8%
Missing (n),153020
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.22628
Minimum,0
Maximum,1
Zeros (%),1.0%

0,1
Minimum,0.0
5-th percentile,0.0417
Q1,0.1667
Median,0.1667
Q3,0.3333
95-th percentile,0.4792
Maximum,1.0
Range,1.0
Interquartile range,0.1666

0,1
Standard deviation,0.14464
Coef of variation,0.63921
Kurtosis,2.4325
Mean,0.22628
MAD,0.11612
Skewness,1.2265
Sum,34959
Variance,0.020921
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.1667,61875,20.1%,
0.3333,31909,10.4%,
0.0417,14600,4.7%,
0.375,7926,2.6%,
0.125,6974,2.3%,
0.0833,6586,2.1%,
0.0,2938,1.0%,
0.4583,2828,0.9%,
0.625,1915,0.6%,
0.5417,1685,0.5%,

Value,Count,Frequency (%),Unnamed: 3
0.0,2938,1.0%,
0.0067,1,0.0%,
0.0083,3,0.0%,
0.01,4,0.0%,
0.0104,5,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.9375,4,0.0%,
0.9479,2,0.0%,
0.9583,83,0.0%,
0.9792,1,0.0%,
1.0,167,0.1%,

0,1
Correlation,0.98824

0,1
Correlation,0.98569

0,1
Distinct count,306
Unique (%),0.1%
Missing (%),67.8%
Missing (n),208642
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.23189
Minimum,0
Maximum,1
Zeros (%),0.8%

0,1
Minimum,0.0
5-th percentile,0.0417
Q1,0.0833
Median,0.2083
Q3,0.375
95-th percentile,0.5
Maximum,1.0
Range,1.0
Interquartile range,0.2917

0,1
Standard deviation,0.16138
Coef of variation,0.69592
Kurtosis,1.3383
Mean,0.23189
MAD,0.1246
Skewness,0.9542
Sum,22927
Variance,0.026044
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.2083,32875,10.7%,
0.375,17845,5.8%,
0.0417,17776,5.8%,
0.0833,5086,1.7%,
0.4167,3961,1.3%,
0.1667,3537,1.2%,
0.125,3336,1.1%,
0.0,2320,0.8%,
0.5,1688,0.5%,
0.6667,1194,0.4%,

Value,Count,Frequency (%),Unnamed: 3
0.0,2320,0.8%,
0.0067,3,0.0%,
0.0104,3,0.0%,
0.0138,1,0.0%,
0.0158,4,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.9304,1,0.0%,
0.9408,2,0.0%,
0.9583,10,0.0%,
0.9792,5,0.0%,
1.0,141,0.0%,

0,1
Correlation,0.98841

0,1
Correlation,0.98588

0,1
Distinct count,5
Unique (%),0.0%
Missing (%),68.4%
Missing (n),210295

0,1
reg oper account,73830
reg oper spec account,12080
not specified,5687
(Missing),210295

Value,Count,Frequency (%),Unnamed: 3
reg oper account,73830,24.0%,
reg oper spec account,12080,3.9%,
not specified,5687,1.8%,
org spec account,5619,1.8%,
(Missing),210295,68.4%,

0,1
Distinct count,24
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,12.063
Minimum,0
Maximum,23
Zeros (%),0.0%

0,1
Minimum,0
5-th percentile,7
Q1,10
Median,12
Q3,14
95-th percentile,17
Maximum,23
Range,23
Interquartile range,4

0,1
Standard deviation,3.2658
Coef of variation,0.27072
Kurtosis,-0.19417
Mean,12.063
MAD,2.6328
Skewness,-0.028024
Sum,3709634
Variance,10.666
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
10,37722,12.3%,
11,37229,12.1%,
12,34233,11.1%,
13,30959,10.1%,
14,27682,9.0%,
9,27384,8.9%,
15,24839,8.1%,
16,20385,6.6%,
8,15127,4.9%,
17,14900,4.8%,

Value,Count,Frequency (%),Unnamed: 3
0,40,0.0%,
1,86,0.0%,
2,305,0.1%,
3,1230,0.4%,
4,2090,0.7%,

Value,Count,Frequency (%),Unnamed: 3
19,3848,1.3%,
20,1196,0.4%,
21,405,0.1%,
22,150,0.0%,
23,41,0.0%,

0,1
Distinct count,4
Unique (%),0.0%
Missing (%),50.2%
Missing (n),154297

0,1
block of flats,150503
specific housing,1499
terraced house,1212
(Missing),154297

Value,Count,Frequency (%),Unnamed: 3
block of flats,150503,48.9%,
specific housing,1499,0.5%,
terraced house,1212,0.4%,
(Missing),154297,50.2%,

0,1
Distinct count,3528
Unique (%),1.1%
Missing (%),59.4%
Missing (n),182590
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.066333
Minimum,0
Maximum,1
Zeros (%),5.1%

0,1
Minimum,0.0
5-th percentile,0.0
Q1,0.0187
Median,0.0481
Q3,0.0856
95-th percentile,0.1947
Maximum,1.0
Range,1.0
Interquartile range,0.0669

0,1
Standard deviation,0.081184
Coef of variation,1.2239
Kurtosis,34.745
Mean,0.066333
MAD,0.049532
Skewness,4.4587
Sum,8286.4
Variance,0.0065908
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,15600,5.1%,
0.0631,189,0.1%,
0.0316,187,0.1%,
0.0473,186,0.1%,
0.0174,180,0.1%,
0.0237,175,0.1%,
0.0552,173,0.1%,
0.0158,170,0.1%,
0.0331,170,0.1%,
0.015,165,0.1%,

Value,Count,Frequency (%),Unnamed: 3
0.0,15600,5.1%,
0.0001,13,0.0%,
0.0002,13,0.0%,
0.0003,9,0.0%,
0.0004,11,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.9497,1,0.0%,
0.969,1,0.0%,
0.9777,3,0.0%,
0.9829,10,0.0%,
1.0,135,0.0%,

0,1
Correlation,0.98084

0,1
Correlation,0.9737

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.17955

0,1
0,252296
1,55215

Value,Count,Frequency (%),Unnamed: 3
0,252296,82.0%,
1,55215,18.0%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.040659

0,1
0,295008
1,12503

Value,Count,Frequency (%),Unnamed: 3
0,295008,95.9%,
1,12503,4.1%,

0,1
Correlation,0.94395

0,1
Correlation,0.94249

0,1
Correlation,0.93776

0,1
Correlation,0.91362

0,1
Correlation,0.91595

0,1
Correlation,0.91038

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Cash loans,278232
Revolving loans,29279

Value,Count,Frequency (%),Unnamed: 3
Cash loans,278232,90.5%,
Revolving loans,29279,9.5%,

0,1
Distinct count,5
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Secondary / secondary special,218391
Higher education,74863
Incomplete higher,10277
Other values (2),3980

Value,Count,Frequency (%),Unnamed: 3
Secondary / secondary special,218391,71.0%,
Higher education,74863,24.3%,
Incomplete higher,10277,3.3%,
Lower secondary,3816,1.2%,
Academic degree,164,0.1%,

0,1
Distinct count,6
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Married,196432
Single / not married,45444
Civil marriage,29775
Other values (3),35860

Value,Count,Frequency (%),Unnamed: 3
Married,196432,63.9%,
Single / not married,45444,14.8%,
Civil marriage,29775,9.7%,
Separated,19770,6.4%,
Widow,16088,5.2%,
Unknown,2,0.0%,

0,1
Distinct count,6
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
House / apartment,272868
With parents,14840
Municipal apartment,11183
Other values (3),8620

Value,Count,Frequency (%),Unnamed: 3
House / apartment,272868,88.7%,
With parents,14840,4.8%,
Municipal apartment,11183,3.6%,
Rented apartment,4881,1.6%,
Office apartment,2617,0.9%,
Co-op apartment,1122,0.4%,

0,1
Distinct count,8
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Working,158774
Commercial associate,71617
Pensioner,55362
Other values (5),21758

Value,Count,Frequency (%),Unnamed: 3
Working,158774,51.6%,
Commercial associate,71617,23.3%,
Pensioner,55362,18.0%,
State servant,21703,7.1%,
Unemployed,22,0.0%,
Student,18,0.0%,
Businessman,10,0.0%,
Maternity leave,5,0.0%,

0,1
Distinct count,8
Unique (%),0.0%
Missing (%),0.4%
Missing (n),1292

0,1
Unaccompanied,248526
Family,40149
"Spouse, partner",11370
Other values (4),6174

Value,Count,Frequency (%),Unnamed: 3
Unaccompanied,248526,80.8%,
Family,40149,13.1%,
"Spouse, partner",11370,3.7%,
Children,3267,1.1%,
Other_B,1770,0.6%,
Other_A,866,0.3%,
Group of people,271,0.1%,
(Missing),1292,0.4%,

0,1
Distinct count,387
Unique (%),0.1%
Missing (%),69.4%
Missing (n),213514
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.0088087
Minimum,0
Maximum,1
Zeros (%),17.7%

0,1
Minimum,0.0
5-th percentile,0.0
Q1,0.0
Median,0.0
Q3,0.0039
95-th percentile,0.0309
Maximum,1.0
Range,1.0
Interquartile range,0.0039

0,1
Standard deviation,0.047732
Coef of variation,5.4187
Kurtosis,284.73
Mean,0.0088087
MAD,0.012235
Skewness,15.541
Sum,827.99
Variance,0.0022783
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,54549,17.7%,
0.0039,13606,4.4%,
0.0077,6351,2.1%,
0.0116,3714,1.2%,
0.0154,2533,0.8%,
0.0193,1673,0.5%,
0.0019,1250,0.4%,
0.0232,1195,0.4%,
0.027,865,0.3%,
0.0309,717,0.2%,

Value,Count,Frequency (%),Unnamed: 3
0.0,54549,17.7%,
0.0002,1,0.0%,
0.0003,5,0.0%,
0.0004,25,0.0%,
0.0005,6,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.973,3,0.0%,
0.9884,1,0.0%,
0.9923,4,0.0%,
0.9961,2,0.0%,
1.0,97,0.0%,

0,1
Correlation,0.97857

0,1
Correlation,0.96937

0,1
Distinct count,3291
Unique (%),1.1%
Missing (%),55.2%
Missing (n),169682
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.028358
Minimum,0
Maximum,1
Zeros (%),19.1%

0,1
Minimum,0.0
5-th percentile,0.0
Q1,0.0
Median,0.0036
Q3,0.0277
95-th percentile,0.1279
Maximum,1.0
Range,1.0
Interquartile range,0.0277

0,1
Standard deviation,0.069523
Coef of variation,2.4516
Kurtosis,64.912
Mean,0.028358
MAD,0.036058
Skewness,6.559
Sum,3908.5
Variance,0.0048335
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,58735,19.1%,
0.0012,546,0.2%,
0.0044,454,0.1%,
0.0022,440,0.1%,
0.0031,415,0.1%,
0.0011,405,0.1%,
0.001,405,0.1%,
0.0036,399,0.1%,
0.003,397,0.1%,
0.0024,395,0.1%,

Value,Count,Frequency (%),Unnamed: 3
0.0,58735,19.1%,
0.0001,163,0.1%,
0.0002,107,0.0%,
0.0003,95,0.0%,
0.0004,162,0.1%,

Value,Count,Frequency (%),Unnamed: 3
0.9591,2,0.0%,
0.9764,1,0.0%,
0.9823,1,0.0%,
0.9956,1,0.0%,
1.0,136,0.0%,

0,1
Correlation,0.97584

0,1
Correlation,0.96609

0,1
Distinct count,34
Unique (%),0.0%
Missing (%),0.3%
Missing (n),1021
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,1.4222
Minimum,0
Maximum,348
Zeros (%),53.3%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,0
Q3,2
95-th percentile,6
Maximum,348
Range,348
Interquartile range,2

0,1
Standard deviation,2.401
Coef of variation,1.6882
Kurtosis,1424.8
Mean,1.4222
MAD,1.6556
Skewness,12.14
Sum,435900
Variance,5.7647
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,163910,53.3%,
1.0,48783,15.9%,
2.0,29808,9.7%,
3.0,20322,6.6%,
4.0,14143,4.6%,
5.0,9553,3.1%,
6.0,6453,2.1%,
7.0,4390,1.4%,
8.0,2967,1.0%,
9.0,2003,0.7%,

Value,Count,Frequency (%),Unnamed: 3
0.0,163910,53.3%,
1.0,48783,15.9%,
2.0,29808,9.7%,
3.0,20322,6.6%,
4.0,14143,4.6%,

Value,Count,Frequency (%),Unnamed: 3
28.0,1,0.0%,
29.0,1,0.0%,
30.0,2,0.0%,
47.0,1,0.0%,
348.0,1,0.0%,

0,1
Correlation,0.99849

0,1
Distinct count,19
Unique (%),0.0%
Missing (%),31.3%
Missing (n),96391

0,1
Laborers,55186
Sales staff,32102
Core staff,27570
Other values (15),96262
(Missing),96391

Value,Count,Frequency (%),Unnamed: 3
Laborers,55186,17.9%,
Sales staff,32102,10.4%,
Core staff,27570,9.0%,
Managers,21371,6.9%,
Drivers,18603,6.0%,
High skill tech staff,11380,3.7%,
Accountants,9813,3.2%,
Medicine staff,8537,2.8%,
Security staff,6721,2.2%,
Cooking staff,5946,1.9%,

0,1
Distinct count,58
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Business Entity Type 3,67992
XNA,55374
Self-employed,38412
Other values (55),145733

Value,Count,Frequency (%),Unnamed: 3
Business Entity Type 3,67992,22.1%,
XNA,55374,18.0%,
Self-employed,38412,12.5%,
Other,16683,5.4%,
Medicine,11193,3.6%,
Business Entity Type 2,10553,3.4%,
Government,10404,3.4%,
School,8893,2.9%,
Trade: type 7,7831,2.5%,
Kindergarten,6880,2.2%,

0,1
Distinct count,63
Unique (%),0.0%
Missing (%),66.0%
Missing (n),202929
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,12.061
Minimum,0
Maximum,91
Zeros (%),0.7%

0,1
Minimum,0
5-th percentile,1
Q1,5
Median,9
Q3,15
95-th percentile,30
Maximum,91
Range,91
Interquartile range,10

0,1
Standard deviation,11.945
Coef of variation,0.99036
Kurtosis,9.2149
Mean,12.061
MAD,7.6692
Skewness,2.7454
Sum,1261400
Variance,142.68
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
7.0,7424,2.4%,
6.0,6382,2.1%,
3.0,6370,2.1%,
8.0,5887,1.9%,
2.0,5852,1.9%,
4.0,5557,1.8%,
1.0,5280,1.7%,
9.0,5020,1.6%,
10.0,4806,1.6%,
14.0,4594,1.5%,

Value,Count,Frequency (%),Unnamed: 3
0.0,2134,0.7%,
1.0,5280,1.7%,
2.0,5852,1.9%,
3.0,6370,2.1%,
4.0,5557,1.8%,

Value,Count,Frequency (%),Unnamed: 3
63.0,2,0.0%,
64.0,2443,0.8%,
65.0,891,0.3%,
69.0,1,0.0%,
91.0,2,0.0%,

0,1
Distinct count,81
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.020868
Minimum,0.00029
Maximum,0.072508
Zeros (%),0.0%

0,1
Minimum,0.00029
5-th percentile,0.00496
Q1,0.010006
Median,0.01885
Q3,0.028663
95-th percentile,0.04622
Maximum,0.072508
Range,0.072218
Interquartile range,0.018657

0,1
Standard deviation,0.013831
Coef of variation,0.66279
Kurtosis,3.2601
Mean,0.020868
MAD,0.010291
Skewness,1.488
Sum,6417.2
Variance,0.0001913
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.035792,16408,5.3%,
0.04622,13442,4.4%,
0.030755,12163,4.0%,
0.025164,11950,3.9%,
0.026392,11601,3.8%,
0.031329,11321,3.7%,
0.028663,11157,3.6%,
0.019101,8694,2.8%,
0.072508,8412,2.7%,
0.020713,8066,2.6%,

Value,Count,Frequency (%),Unnamed: 3
0.00029,2,0.0%,
0.000533,39,0.0%,
0.000938,28,0.0%,
0.001276,558,0.2%,
0.001333,235,0.1%,

Value,Count,Frequency (%),Unnamed: 3
0.031329,11321,3.7%,
0.032561,6636,2.2%,
0.035792,16408,5.3%,
0.04622,13442,4.4%,
0.072508,8412,2.7%,

0,1
Distinct count,3
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,2.0525
Minimum,1
Maximum,3
Zeros (%),0.0%

0,1
Minimum,1
5-th percentile,1
Q1,2
Median,2
Q3,2
95-th percentile,3
Maximum,3
Range,2
Interquartile range,0

0,1
Standard deviation,0.50903
Coef of variation,0.24801
Kurtosis,0.80042
Mean,2.0525
MAD,0.29784
Skewness,0.087468
Sum,631155
Variance,0.25912
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
2,226984,73.8%,
3,48330,15.7%,
1,32197,10.5%,

Value,Count,Frequency (%),Unnamed: 3
1,32197,10.5%,
2,226984,73.8%,
3,48330,15.7%,

Value,Count,Frequency (%),Unnamed: 3
1,32197,10.5%,
2,226984,73.8%,
3,48330,15.7%,

0,1
Correlation,0.95084

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.078173

0,1
0,283472
1,24039

Value,Count,Frequency (%),Unnamed: 3
0,283472,92.2%,
1,24039,7.8%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.23045

0,1
0,236644
1,70867

Value,Count,Frequency (%),Unnamed: 3
0,236644,77.0%,
1,70867,23.0%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.015144

0,1
0,302854
1,4657

Value,Count,Frequency (%),Unnamed: 3
0,302854,98.5%,
1,4657,1.5%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.050769

0,1
0,291899
1,15612

Value,Count,Frequency (%),Unnamed: 3
0,291899,94.9%,
1,15612,5.1%,

0,1
Distinct count,307511
Unique (%),100.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,278180
Minimum,100002
Maximum,456255
Zeros (%),0.0%

0,1
Minimum,100002
5-th percentile,117950
Q1,189150
Median,278200
Q3,367140
95-th percentile,438430
Maximum,456255
Range,356253
Interquartile range,178000

0,1
Standard deviation,102790
Coef of variation,0.36951
Kurtosis,-1.199
Mean,278180
MAD,89010
Skewness,-0.0012002
Sum,85543569448
Variance,10566000000
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
100303,1,0.0%,
131861,1,0.0%,
158488,1,0.0%,
156441,1,0.0%,
160539,1,0.0%,
150300,1,0.0%,
148253,1,0.0%,
154398,1,0.0%,
152351,1,0.0%,
238369,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
100002,1,0.0%,
100003,1,0.0%,
100004,1,0.0%,
100006,1,0.0%,
100007,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
456251,1,0.0%,
456252,1,0.0%,
456253,1,0.0%,
456254,1,0.0%,
456255,1,0.0%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.080729

0,1
0,282686
1,24825

Value,Count,Frequency (%),Unnamed: 3
0,282686,91.9%,
1,24825,8.1%,

0,1
Correlation,0.91936

0,1
Distinct count,8
Unique (%),0.0%
Missing (%),50.8%
Missing (n),156341

0,1
Panel,66040
"Stone, brick",64815
Block,9253
Other values (4),11062
(Missing),156341

Value,Count,Frequency (%),Unnamed: 3
Panel,66040,21.5%,
"Stone, brick",64815,21.1%,
Block,9253,3.0%,
Wooden,5362,1.7%,
Mixed,2296,0.7%,
Monolithic,1779,0.6%,
Others,1625,0.5%,
(Missing),156341,50.8%,

0,1
Distinct count,7
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
TUESDAY,53901
WEDNESDAY,51934
MONDAY,50714
Other values (4),150962

Value,Count,Frequency (%),Unnamed: 3
TUESDAY,53901,17.5%,
WEDNESDAY,51934,16.9%,
MONDAY,50714,16.5%,
THURSDAY,50591,16.5%,
FRIDAY,50338,16.4%,
SATURDAY,33852,11.0%,
SUNDAY,16181,5.3%,

0,1
Distinct count,286
Unique (%),0.1%
Missing (%),48.8%
Missing (n),150007
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.97773
Minimum,0
Maximum,1
Zeros (%),0.2%

0,1
Minimum,0.0
5-th percentile,0.9687
Q1,0.9767
Median,0.9816
Q3,0.9866
95-th percentile,0.996
Maximum,1.0
Range,1.0
Interquartile range,0.0099

0,1
Standard deviation,0.059223
Coef of variation,0.060572
Kurtosis,248.18
Mean,0.97773
MAD,0.010933
Skewness,-15.515
Sum,154000
Variance,0.0035074
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.9871,4311,1.4%,
0.9856,4189,1.4%,
0.9861,4171,1.4%,
0.9801,4123,1.3%,
0.9866,4114,1.3%,
0.9851,4096,1.3%,
0.9806,4096,1.3%,
0.9811,3986,1.3%,
0.9816,3982,1.3%,
0.9831,3970,1.3%,

Value,Count,Frequency (%),Unnamed: 3
0.0,514,0.2%,
0.0179,1,0.0%,
0.0447,1,0.0%,
0.0969,1,0.0%,
0.0974,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.998,1096,0.4%,
0.9985,1062,0.3%,
0.999,906,0.3%,
0.9995,691,0.2%,
1.0,186,0.1%,

0,1
Correlation,0.96354

0,1
Correlation,0.97189

0,1
Distinct count,150
Unique (%),0.0%
Missing (%),66.5%
Missing (n),204488
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.75247
Minimum,0
Maximum,1
Zeros (%),0.0%

0,1
Minimum,0.0
5-th percentile,0.592
Q1,0.6872
Median,0.7552
Q3,0.8232
95-th percentile,0.9524
Maximum,1.0
Range,1.0
Interquartile range,0.136

0,1
Standard deviation,0.11328
Coef of variation,0.15054
Kurtosis,4.3998
Mean,0.75247
MAD,0.08391
Skewness,-0.96249
Sum,77522
Variance,0.012832
Memory size,2.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.8232,2999,1.0%,
0.8164,2864,0.9%,
0.8028,2848,0.9%,
0.728,2802,0.9%,
0.7348,2761,0.9%,
0.8096,2755,0.9%,
0.83,2738,0.9%,
0.796,2734,0.9%,
0.7484,2731,0.9%,
0.7688,2712,0.9%,

Value,Count,Frequency (%),Unnamed: 3
0.0,102,0.0%,
0.0004,2,0.0%,
0.0072,4,0.0%,
0.014,3,0.0%,
0.0208,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.9728,813,0.3%,
0.9796,786,0.3%,
0.9864,661,0.2%,
0.9932,478,0.2%,
1.0,173,0.1%,

0,1
Correlation,0.98946

0,1
Correlation,0.98944

Unnamed: 0,SK_ID_CURR,TARGET,NAME_CONTRACT_TYPE,CODE_GENDER,FLAG_OWN_CAR,FLAG_OWN_REALTY,CNT_CHILDREN,AMT_INCOME_TOTAL,AMT_CREDIT,AMT_ANNUITY,AMT_GOODS_PRICE,NAME_TYPE_SUITE,NAME_INCOME_TYPE,NAME_EDUCATION_TYPE,NAME_FAMILY_STATUS,NAME_HOUSING_TYPE,REGION_POPULATION_RELATIVE,DAYS_BIRTH,DAYS_EMPLOYED,DAYS_REGISTRATION,DAYS_ID_PUBLISH,OWN_CAR_AGE,FLAG_MOBIL,FLAG_EMP_PHONE,FLAG_WORK_PHONE,FLAG_CONT_MOBILE,FLAG_PHONE,FLAG_EMAIL,OCCUPATION_TYPE,CNT_FAM_MEMBERS,REGION_RATING_CLIENT,REGION_RATING_CLIENT_W_CITY,WEEKDAY_APPR_PROCESS_START,HOUR_APPR_PROCESS_START,REG_REGION_NOT_LIVE_REGION,REG_REGION_NOT_WORK_REGION,LIVE_REGION_NOT_WORK_REGION,REG_CITY_NOT_LIVE_CITY,REG_CITY_NOT_WORK_CITY,LIVE_CITY_NOT_WORK_CITY,ORGANIZATION_TYPE,EXT_SOURCE_1,EXT_SOURCE_2,EXT_SOURCE_3,APARTMENTS_AVG,BASEMENTAREA_AVG,YEARS_BEGINEXPLUATATION_AVG,YEARS_BUILD_AVG,COMMONAREA_AVG,ELEVATORS_AVG,ENTRANCES_AVG,FLOORSMAX_AVG,FLOORSMIN_AVG,LANDAREA_AVG,LIVINGAPARTMENTS_AVG,LIVINGAREA_AVG,NONLIVINGAPARTMENTS_AVG,NONLIVINGAREA_AVG,APARTMENTS_MODE,BASEMENTAREA_MODE,YEARS_BEGINEXPLUATATION_MODE,YEARS_BUILD_MODE,COMMONAREA_MODE,ELEVATORS_MODE,ENTRANCES_MODE,FLOORSMAX_MODE,FLOORSMIN_MODE,LANDAREA_MODE,LIVINGAPARTMENTS_MODE,LIVINGAREA_MODE,NONLIVINGAPARTMENTS_MODE,NONLIVINGAREA_MODE,APARTMENTS_MEDI,BASEMENTAREA_MEDI,YEARS_BEGINEXPLUATATION_MEDI,YEARS_BUILD_MEDI,COMMONAREA_MEDI,ELEVATORS_MEDI,ENTRANCES_MEDI,FLOORSMAX_MEDI,FLOORSMIN_MEDI,LANDAREA_MEDI,LIVINGAPARTMENTS_MEDI,LIVINGAREA_MEDI,NONLIVINGAPARTMENTS_MEDI,NONLIVINGAREA_MEDI,FONDKAPREMONT_MODE,HOUSETYPE_MODE,TOTALAREA_MODE,WALLSMATERIAL_MODE,EMERGENCYSTATE_MODE,OBS_30_CNT_SOCIAL_CIRCLE,DEF_30_CNT_SOCIAL_CIRCLE,OBS_60_CNT_SOCIAL_CIRCLE,DEF_60_CNT_SOCIAL_CIRCLE,DAYS_LAST_PHONE_CHANGE,FLAG_DOCUMENT_2,FLAG_DOCUMENT_3,FLAG_DOCUMENT_4,FLAG_DOCUMENT_5,FLAG_DOCUMENT_6,FLAG_DOCUMENT_7,FLAG_DOCUMENT_8,FLAG_DOCUMENT_9,FLAG_DOCUMENT_10,FLAG_DOCUMENT_11,FLAG_DOCUMENT_12,FLAG_DOCUMENT_13,FLAG_DOCUMENT_14,FLAG_DOCUMENT_15,FLAG_DOCUMENT_16,FLAG_DOCUMENT_17,FLAG_DOCUMENT_18,FLAG_DOCUMENT_19,FLAG_DOCUMENT_20,FLAG_DOCUMENT_21,AMT_REQ_CREDIT_BUREAU_HOUR,AMT_REQ_CREDIT_BUREAU_DAY,AMT_REQ_CREDIT_BUREAU_WEEK,AMT_REQ_CREDIT_BUREAU_MON,AMT_REQ_CREDIT_BUREAU_QRT,AMT_REQ_CREDIT_BUREAU_YEAR
0,100002,1,Cash loans,M,N,Y,0,202500.0,406597.5,24700.5,351000.0,Unaccompanied,Working,Secondary / secondary special,Single / not married,House / apartment,0.018801,-9461,-637,-3648.0,-2120,,1,1,0,1,1,0,Laborers,1.0,2,2,WEDNESDAY,10,0,0,0,0,0,0,Business Entity Type 3,0.083037,0.262949,0.139376,0.0247,0.0369,0.9722,0.6192,0.0143,0.0,0.069,0.0833,0.125,0.0369,0.0202,0.019,0.0,0.0,0.0252,0.0383,0.9722,0.6341,0.0144,0.0,0.069,0.0833,0.125,0.0377,0.022,0.0198,0.0,0.0,0.025,0.0369,0.9722,0.6243,0.0144,0.0,0.069,0.0833,0.125,0.0375,0.0205,0.0193,0.0,0.0,reg oper account,block of flats,0.0149,"Stone, brick",No,2.0,2.0,2.0,2.0,-1134.0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0,0.0,0.0,0.0,0.0,1.0
1,100003,0,Cash loans,F,N,N,0,270000.0,1293502.5,35698.5,1129500.0,Family,State servant,Higher education,Married,House / apartment,0.003541,-16765,-1188,-1186.0,-291,,1,1,0,1,1,0,Core staff,2.0,1,1,MONDAY,11,0,0,0,0,0,0,School,0.311267,0.622246,,0.0959,0.0529,0.9851,0.796,0.0605,0.08,0.0345,0.2917,0.3333,0.013,0.0773,0.0549,0.0039,0.0098,0.0924,0.0538,0.9851,0.804,0.0497,0.0806,0.0345,0.2917,0.3333,0.0128,0.079,0.0554,0.0,0.0,0.0968,0.0529,0.9851,0.7987,0.0608,0.08,0.0345,0.2917,0.3333,0.0132,0.0787,0.0558,0.0039,0.01,reg oper account,block of flats,0.0714,Block,No,1.0,0.0,1.0,0.0,-828.0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0,0.0,0.0,0.0,0.0,0.0
2,100004,0,Revolving loans,M,Y,Y,0,67500.0,135000.0,6750.0,135000.0,Unaccompanied,Working,Secondary / secondary special,Single / not married,House / apartment,0.010032,-19046,-225,-4260.0,-2531,26.0,1,1,1,1,1,0,Laborers,1.0,2,2,MONDAY,9,0,0,0,0,0,0,Government,,0.555912,0.729567,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,0.0,0.0,0.0,-815.0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0,0.0,0.0,0.0,0.0,0.0
3,100006,0,Cash loans,F,N,Y,0,135000.0,312682.5,29686.5,297000.0,Unaccompanied,Working,Secondary / secondary special,Civil marriage,House / apartment,0.008019,-19005,-3039,-9833.0,-2437,,1,1,0,1,0,0,Laborers,2.0,2,2,WEDNESDAY,17,0,0,0,0,0,0,Business Entity Type 3,,0.650442,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2.0,0.0,2.0,0.0,-617.0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,,,,,,
4,100007,0,Cash loans,M,N,Y,0,121500.0,513000.0,21865.5,513000.0,Unaccompanied,Working,Secondary / secondary special,Single / not married,House / apartment,0.028663,-19932,-3038,-4311.0,-3458,,1,1,0,1,0,0,Core staff,1.0,2,2,THURSDAY,11,0,0,0,0,1,1,Religion,,0.322738,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,0.0,0.0,0.0,-1106.0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0,0.0,0.0,0.0,0.0,0.0
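
The per-column figures printed by the profiling report (distinct count, missing share) can also be checked directly with pandas. The sketch below is an illustration and not part of the original notebook: `column_overview` and the toy frame are hypothetical names. It passes `dropna=False` on the assumption that the report counts a missing value as one more distinct value, which would match three housing types plus missing being reported as 4 distinct values.

```python
import numpy as np
import pandas as pd


def column_overview(df: pd.DataFrame) -> pd.DataFrame:
    """Per-column distinct count and missing-value share (hypothetical helper)."""
    n_rows = len(df)
    overview = pd.DataFrame({
        # dropna=False counts NaN as its own distinct value (assumption, see text)
        "distinct_count": df.nunique(dropna=False),
        "missing_n": df.isna().sum(),
    })
    overview["missing_pct"] = 100.0 * overview["missing_n"] / n_rows
    # Columns with the most missing data come first
    return overview.sort_values("missing_pct", ascending=False)


# Toy frame standing in for the real application table
toy = pd.DataFrame({
    "SK_ID_CURR": [100002, 100003, 100004, 100006],
    "OWN_CAR_AGE": [np.nan, np.nan, 26.0, np.nan],
    "CODE_GENDER": ["M", "F", "M", "F"],
})

overview = column_overview(toy)
print(overview)
```

On the real data the same helper could be called as `column_overview(train)`, assuming `train` is the DataFrame loaded earlier in the notebook.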


In [14]:
pandas_profiling.ProfileReport(test)

0,1
Number of variables,121
Number of observations,48744
Total Missing (%),9.3%
Total size in memory,45.0 MiB
Average record size in memory,968.0 B

0,1
Numeric,39
Categorical,16
Boolean,21
Date,0
Text (Unique),0
Rejected,45
Unsupported,0
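
The "Rejected" count above, together with the one-row "Correlation" tables elsewhere in the report, suggests that the profiler sets aside columns that are almost perfectly correlated with another column. A minimal way to list such pairs with plain pandas is sketched below. The function name `highly_correlated_pairs`, the toy column names, and the 0.9 threshold are assumptions made for illustration; they are not taken from the profiler's configuration.

```python
import numpy as np
import pandas as pd


def highly_correlated_pairs(df: pd.DataFrame, threshold: float = 0.9) -> pd.Series:
    """Return pairs of numeric columns whose absolute Pearson correlation exceeds threshold."""
    corr = df.select_dtypes(include="number").corr().abs()
    # Keep only the strict upper triangle so each pair is listed once
    upper = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(upper).stack()
    return pairs[pairs > threshold].sort_values(ascending=False)


# Toy data: two nearly identical columns and one unrelated column
rng = np.random.default_rng(0)
base = rng.normal(size=200)
toy = pd.DataFrame({
    "area_avg": base,
    "area_medi": base + rng.normal(scale=0.05, size=200),
    "unrelated": rng.normal(size=200),
})

pairs = highly_correlated_pairs(toy)
print(pairs)
```

Dropping one column from each listed pair is a common way to reduce redundancy before modelling, although which column to keep remains a judgment call.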

0,1
Distinct count,7492
Unique (%),15.4%
Missing (%),0.0%
Missing (n),24
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,29426
Minimum,2295
Maximum,180580
Zeros (%),0.0%

0,1
Minimum,2295.0
5-th percentile,9409.5
Q1,17973.0
Median,26199.0
Q3,37390.0
95-th percentile,58864.0
Maximum,180580.0
Range,178280.0
Interquartile range,19418.0

0,1
Standard deviation,16016
Coef of variation,0.54429
Kurtosis,5.0096
Mean,29426
MAD,12153
Skewness,1.4744
Sum,1433600000
Variance,256520000
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
27652.5,269,0.6%,
30838.5,232,0.5%,
23107.5,221,0.5%,
22977.0,220,0.5%,
23539.5,186,0.4%,
43659.0,176,0.4%,
52452.0,174,0.4%,
35685.0,172,0.4%,
30951.0,170,0.3%,
25578.0,159,0.3%,

Value,Count,Frequency (%),Unnamed: 3
2295.0,1,0.0%,
2425.5,1,0.0%,
2439.0,2,0.0%,
2596.5,1,0.0%,
2965.5,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
173704.5,7,0.0%,
176062.5,2,0.0%,
177696.0,1,0.0%,
177826.5,3,0.0%,
180576.0,2,0.0%,

0,1
Distinct count,2937
Unique (%),6.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,516740
Minimum,45000
Maximum,2245500
Zeros (%),0.0%

0,1
Minimum,45000
5-th percentile,118510
Q1,260640
Median,450000
Q3,675000
95-th percentile,1258600
Maximum,2245500
Range,2200500
Interquartile range,414360

0,1
Standard deviation,365400
Coef of variation,0.70712
Kurtosis,3.3425
Mean,516740
MAD,268800
Skewness,1.6488
Sum,25188000000
Variance,133510000000
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
450000.0,2118,4.3%,
225000.0,1836,3.8%,
675000.0,1426,2.9%,
360000.0,839,1.7%,
900000.0,835,1.7%,
296280.0,641,1.3%,
135000.0,629,1.3%,
260640.0,608,1.2%,
539100.0,565,1.2%,
270000.0,492,1.0%,

Value,Count,Frequency (%),Unnamed: 3
45000.0,111,0.2%,
47970.0,2,0.0%,
49500.0,16,0.0%,
49752.0,73,0.1%,
52128.0,161,0.3%,

Value,Count,Frequency (%),Unnamed: 3
2129445.0,1,0.0%,
2140227.0,1,0.0%,
2156400.0,62,0.1%,
2160000.0,1,0.0%,
2245500.0,1,0.0%,

0,1
Correlation,0.98806

0,1
Distinct count,606
Unique (%),1.2%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,178430
Minimum,26942
Maximum,4410000
Zeros (%),0.0%

0,1
Minimum,26942
5-th percentile,69368
Q1,112500
Median,157500
Q3,225000
95-th percentile,360000
Maximum,4410000
Range,4383100
Interquartile range,112500

0,1
Standard deviation,101520
Coef of variation,0.56897
Kurtosis,111.23
Mean,178430
MAD,67462
Skewness,5.3012
Sum,8697500000
Variance,10307000000
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
135000.0,5562,11.4%,
112500.0,4851,10.0%,
157500.0,4435,9.1%,
180000.0,4205,8.6%,
225000.0,3764,7.7%,
202500.0,3058,6.3%,
90000.0,2944,6.0%,
270000.0,1929,4.0%,
67500.0,1369,2.8%,
315000.0,1085,2.2%,

Value,Count,Frequency (%),Unnamed: 3
26941.5,1,0.0%,
27000.0,4,0.0%,
28350.0,1,0.0%,
28575.0,1,0.0%,
28800.0,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
2250000.0,1,0.0%,
2340000.0,1,0.0%,
2700000.0,1,0.0%,
3150000.0,1,0.0%,
4410000.0,1,0.0%,

0,1
Distinct count,4
Unique (%),0.0%
Missing (%),12.4%
Missing (n),6049
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.0018035
Minimum,0
Maximum,2
Zeros (%),87.4%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,0
Q3,0
95-th percentile,0
Maximum,2
Range,2
Interquartile range,0

0,1
Standard deviation,0.046132
Coef of variation,25.579
Kurtosis,897.7
Mean,0.0018035
MAD,0.0036011
Skewness,28.274
Sum,77
Variance,0.0021282
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
0.0,42625,87.4%,
1.0,63,0.1%,
2.0,7,0.0%,
(Missing),6049,12.4%,

Value,Count,Frequency (%),Unnamed: 3
0.0,42625,87.4%,
1.0,63,0.1%,
2.0,7,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.0,42625,87.4%,
1.0,63,0.1%,
2.0,7,0.0%,

0,1
Distinct count,4
Unique (%),0.0%
Missing (%),12.4%
Missing (n),6049
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.002108
Minimum,0
Maximum,2
Zeros (%),87.4%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,0
Q3,0
95-th percentile,0
Maximum,2
Range,2
Interquartile range,0

0,1
Standard deviation,0.046373
Coef of variation,21.999
Kurtosis,519.75
Mean,0.002108
MAD,0.0042072
Skewness,22.413
Sum,90
Variance,0.0021504
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
0.0,42606,87.4%,
1.0,88,0.2%,
2.0,1,0.0%,
(Missing),6049,12.4%,

Value,Count,Frequency (%),Unnamed: 3
0.0,42606,87.4%,
1.0,88,0.2%,
2.0,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.0,42606,87.4%,
1.0,88,0.2%,
2.0,1,0.0%,

0,1
Distinct count,8
Unique (%),0.0%
Missing (%),12.4%
Missing (n),6049
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.0092985
Minimum,0
Maximum,6
Zeros (%),86.9%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,0
Q3,0
95-th percentile,0
Maximum,6
Range,6
Interquartile range,0

0,1
Standard deviation,0.11092
Coef of variation,11.929
Kurtosis,485.25
Mean,0.0092985
MAD,0.018443
Skewness,17.271
Sum,397
Variance,0.012304
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
0.0,42341,86.9%,
1.0,324,0.7%,
2.0,23,0.0%,
3.0,4,0.0%,
6.0,1,0.0%,
5.0,1,0.0%,
4.0,1,0.0%,
(Missing),6049,12.4%,

Value,Count,Frequency (%),Unnamed: 3
0.0,42341,86.9%,
1.0,324,0.7%,
2.0,23,0.0%,
3.0,4,0.0%,
4.0,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
2.0,23,0.0%,
3.0,4,0.0%,
4.0,1,0.0%,
5.0,1,0.0%,
6.0,1,0.0%,

0,1
Distinct count,9
Unique (%),0.0%
Missing (%),12.4%
Missing (n),6049
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.5469
Minimum,0
Maximum,7
Zeros (%),48.3%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,0
Q3,1
95-th percentile,2
Maximum,7
Range,7
Interquartile range,1

0,1
Standard deviation,0.69331
Coef of variation,1.2677
Kurtosis,1.9784
Mean,0.5469
MAD,0.60356
Skewness,1.2546
Sum,23350
Variance,0.48067
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
0.0,23559,48.3%,
1.0,15573,31.9%,
2.0,2998,6.2%,
3.0,495,1.0%,
4.0,57,0.1%,
5.0,11,0.0%,
7.0,1,0.0%,
6.0,1,0.0%,
(Missing),6049,12.4%,

Value,Count,Frequency (%),Unnamed: 3
0.0,23559,48.3%,
1.0,15573,31.9%,
2.0,2998,6.2%,
3.0,495,1.0%,
4.0,57,0.1%,

Value,Count,Frequency (%),Unnamed: 3
3.0,495,1.0%,
4.0,57,0.1%,
5.0,11,0.0%,
6.0,1,0.0%,
7.0,1,0.0%,

0,1
Distinct count,4
Unique (%),0.0%
Missing (%),12.4%
Missing (n),6049
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.0027872
Minimum,0
Maximum,2
Zeros (%),87.4%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,0
Q3,0
95-th percentile,0
Maximum,2
Range,2
Interquartile range,0

0,1
Standard deviation,0.054037
Coef of variation,19.388
Kurtosis,435.14
Mean,0.0027872
MAD,0.0055593
Skewness,20.182
Sum,119
Variance,0.00292
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
0.0,42579,87.4%,
1.0,113,0.2%,
2.0,3,0.0%,
(Missing),6049,12.4%,

Value,Count,Frequency (%),Unnamed: 3
0.0,42579,87.4%,
1.0,113,0.2%,
2.0,3,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.0,42579,87.4%,
1.0,113,0.2%,
2.0,3,0.0%,

0,1
Distinct count,17
Unique (%),0.0%
Missing (%),12.4%
Missing (n),6049
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,1.9838
Minimum,0
Maximum,17
Zeros (%),22.2%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,2
Q3,3
95-th percentile,6
Maximum,17
Range,17
Interquartile range,3

0,1
Standard deviation,1.8389
Coef of variation,0.92696
Kurtosis,1.2251
Mean,1.9838
MAD,1.4317
Skewness,1.0654
Sum,84697
Variance,3.3815
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
0.0,10839,22.2%,
1.0,9211,18.9%,
2.0,8489,17.4%,
3.0,6194,12.7%,
4.0,3745,7.7%,
5.0,2076,4.3%,
6.0,1127,2.3%,
7.0,553,1.1%,
8.0,297,0.6%,
9.0,122,0.3%,

Value,Count,Frequency (%),Unnamed: 3
0.0,10839,22.2%,
1.0,9211,18.9%,
2.0,8489,17.4%,
3.0,6194,12.7%,
4.0,3745,7.7%,

Value,Count,Frequency (%),Unnamed: 3
11.0,12,0.0%,
12.0,5,0.0%,
13.0,2,0.0%,
14.0,1,0.0%,
17.0,1,0.0%,

0,1
Distinct count,1544
Unique (%),3.2%
Missing (%),49.0%
Missing (n),23887
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.12239
Minimum,0
Maximum,1
Zeros (%),0.2%

0,1
Minimum,0.0
5-th percentile,0.0113
Q1,0.0619
Median,0.0928
Q3,0.1485
95-th percentile,0.334
Maximum,1.0
Range,1.0
Interquartile range,0.0866

0,1
Standard deviation,0.11311
Coef of variation,0.92421
Kurtosis,11.884
Mean,0.12239
MAD,0.075925
Skewness,2.7314
Sum,3042.2
Variance,0.012794
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
0.0825,1110,2.3%,
0.0619,1013,2.1%,
0.0928,768,1.6%,
0.0722,633,1.3%,
0.1485,487,1.0%,
0.0082,479,1.0%,
0.1031,438,0.9%,
0.0124,427,0.9%,
0.0165,424,0.9%,
0.1237,373,0.8%,

Value,Count,Frequency (%),Unnamed: 3
0.0,121,0.2%,
0.001,28,0.1%,
0.0013,1,0.0%,
0.0015,1,0.0%,
0.0021,103,0.2%,

Value,Count,Frequency (%),Unnamed: 3
0.9562,1,0.0%,
0.9577,2,0.0%,
0.966,1,0.0%,
0.9701,1,0.0%,
1.0,35,0.1%,

0,1
Correlation,0.93553

0,1
Correlation,0.91153

0,1
Distinct count,2817
Unique (%),5.8%
Missing (%),56.7%
Missing (n),27641
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.090065
Minimum,0
Maximum,1
Zeros (%),4.6%

0,1
Minimum,0.0
5-th percentile,0.0
Q1,0.0467
Median,0.0781
Q3,0.1134
95-th percentile,0.22619
Maximum,1.0
Range,1.0
Interquartile range,0.0667

0,1
Standard deviation,0.081536
Coef of variation,0.9053
Kurtosis,24.872
Mean,0.090065
MAD,0.052155
Skewness,3.4501
Sum,1900.7
Variance,0.0066482
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
0.0,2241,4.6%,
0.1091,59,0.1%,
0.0727,47,0.1%,
0.0545,45,0.1%,
0.0804,45,0.1%,
0.0803,42,0.1%,
0.0764,42,0.1%,
0.0785,41,0.1%,
0.0795,41,0.1%,
0.0818,41,0.1%,

Value,Count,Frequency (%),Unnamed: 3
0.0,2241,4.6%,
0.0001,15,0.0%,
0.0002,3,0.0%,
0.0003,3,0.0%,
0.0004,6,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.8246,2,0.0%,
0.9169,1,0.0%,
0.9397,2,0.0%,
0.9636,1,0.0%,
1.0,21,0.0%,

0,1
Correlation,0.97014

0,1
Correlation,0.96364

0,1
Distinct count,11
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.39705
Minimum,0
Maximum,20
Zeros (%),71.2%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,0
Q3,1
95-th percentile,2
Maximum,20
Range,20
Interquartile range,1

0,1
Standard deviation,0.70905
Coef of variation,1.7858
Kurtosis,17.637
Mean,0.39705
MAD,0.56507
Skewness,2.3669
Sum,19354
Variance,0.50275
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
0,34685,71.2%,
1,9504,19.5%,
2,3949,8.1%,
3,535,1.1%,
4,49,0.1%,
5,12,0.0%,
8,3,0.0%,
6,3,0.0%,
11,2,0.0%,
20,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0,34685,71.2%,
1,9504,19.5%,
2,3949,8.1%,
3,535,1.1%,
4,49,0.1%,

Value,Count,Frequency (%),Unnamed: 3
6,3,0.0%,
7,1,0.0%,
8,3,0.0%,
11,2,0.0%,
20,1,0.0%,

0,1
Distinct count,12
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,2.1468
Minimum,1
Maximum,21
Zeros (%),0.0%

0,1
Minimum,1
5-th percentile,1
Q1,2
Median,2
Q3,3
95-th percentile,4
Maximum,21
Range,20
Interquartile range,1

0,1
Standard deviation,0.89042
Coef of variation,0.41477
Kurtosis,6.3259
Mean,2.1468
MAD,0.63923
Skewness,1.1677
Sum,104640
Variance,0.79285
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
2.0,26054,53.5%,
1.0,10251,21.0%,
3.0,8173,16.8%,
4.0,3690,7.6%,
5.0,512,1.1%,
6.0,43,0.1%,
7.0,12,0.0%,
10.0,3,0.0%,
13.0,2,0.0%,
8.0,2,0.0%,

Value,Count,Frequency (%),Unnamed: 3
1.0,10251,21.0%,
2.0,26054,53.5%,
3.0,8173,16.8%,
4.0,3690,7.6%,
5.0,512,1.1%,

Value,Count,Frequency (%),Unnamed: 3
8.0,2,0.0%,
9.0,1,0.0%,
10.0,3,0.0%,
13.0,2,0.0%,
21.0,1,0.0%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
F,32678
M,16066

Value,Count,Frequency (%),Unnamed: 3
F,32678,67.0%,
M,16066,33.0%,

0,1
Distinct count,2043
Unique (%),4.2%
Missing (%),68.7%
Missing (n),33495
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.047624
Minimum,0
Maximum,1
Zeros (%),2.8%

0,1
Minimum,0.0
5-th percentile,0.0
Q1,0.0081
Median,0.0227
Q3,0.0539
95-th percentile,0.1737
Maximum,1.0
Range,1.0
Interquartile range,0.0458

0,1
Standard deviation,0.082868
Coef of variation,1.7401
Kurtosis,44.847
Mean,0.047624
MAD,0.045013
Skewness,5.5263
Sum,726.21
Variance,0.0068672
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
0.0,1373,2.8%,
0.0079,89,0.2%,
0.0078,84,0.2%,
0.0077,73,0.1%,
0.008,69,0.1%,
0.0086,64,0.1%,
0.0125,56,0.1%,
0.007,56,0.1%,
0.0014,55,0.1%,
0.0088,52,0.1%,

Value,Count,Frequency (%),Unnamed: 3
0.0,1373,2.8%,
0.0001,13,0.0%,
0.0002,10,0.0%,
0.0003,7,0.0%,
0.0004,8,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.9277,1,0.0%,
0.9339,1,0.0%,
0.9552,1,0.0%,
0.9613,1,0.0%,
1.0,20,0.0%,

0,1
Correlation,0.9814

0,1
Correlation,0.9768

0,1
Distinct count,15477
Unique (%),31.8%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,-16068
Minimum,-25195
Maximum,-7338
Zeros (%),0.0%

0,1
Minimum,-25195
5-th percentile,-23202
Q1,-19637
Median,-15785
Q3,-12496
95-th percentile,-9485
Maximum,-7338
Range,17857
Interquartile range,7141

0,1
Standard deviation,4325.9
Coef of variation,-0.26922
Kurtosis,-1.0345
Mean,-16068
MAD,3687.5
Skewness,-0.11783
Sum,-783222716
Variance,18713000
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
-11590,13,0.0%,
-15812,12,0.0%,
-11667,12,0.0%,
-11603,12,0.0%,
-11997,12,0.0%,
-12038,12,0.0%,
-16219,11,0.0%,
-11443,11,0.0%,
-15609,11,0.0%,
-17334,11,0.0%,

Value,Count,Frequency (%),Unnamed: 3
-25195,1,0.0%,
-25175,1,0.0%,
-25154,1,0.0%,
-25126,1,0.0%,
-25121,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
-7684,1,0.0%,
-7676,1,0.0%,
-7650,1,0.0%,
-7372,1,0.0%,
-7338,1,0.0%,

0,1
Distinct count,7863
Unique (%),16.1%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,67485
Minimum,-17463
Maximum,365243
Zeros (%),0.0%

0,1
Minimum,-17463
5-th percentile,-6728
Q1,-2910
Median,-1293
Q3,-296
95-th percentile,365240
Maximum,365243
Range,382706
Interquartile range,2614

0,1
Standard deviation,144350
Coef of variation,2.139
Kurtosis,0.48998
Mean,67485
MAD,113300
Skewness,1.5775
Sum,3289506696
Variance,20836000000
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
365243,9274,19.0%,
-1119,32,0.1%,
-389,31,0.1%,
-1240,30,0.1%,
-148,28,0.1%,
-277,27,0.1%,
-655,27,0.1%,
-1020,26,0.1%,
-1032,26,0.1%,
-1342,26,0.1%,

Value,Count,Frequency (%),Unnamed: 3
-17463,1,0.0%,
-17124,1,0.0%,
-17077,1,0.0%,
-16774,1,0.0%,
-16547,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
-16,1,0.0%,
-14,1,0.0%,
-5,1,0.0%,
-1,1,0.0%,
365243,9274,19.0%,

0,1
Distinct count,5880
Unique (%),12.1%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,-3051.7
Minimum,-6348
Maximum,0
Zeros (%),0.0%

0,1
Minimum,-6348
5-th percentile,-5124
Q1,-4448
Median,-3234
Q3,-1706
95-th percentile,-381
Maximum,0
Range,6348
Interquartile range,2742

0,1
Standard deviation,1569.3
Coef of variation,-0.51423
Kurtosis,-1.1644
Mean,-3051.7
MAD,1371.9
Skewness,0.28366
Sum,-148752696
Variance,2462600
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
-4557,40,0.1%,
-4255,39,0.1%,
-4291,35,0.1%,
-4277,32,0.1%,
-4592,32,0.1%,
-4543,30,0.1%,
-4263,30,0.1%,
-4270,30,0.1%,
-4424,29,0.1%,
-4319,29,0.1%,

Value,Count,Frequency (%),Unnamed: 3
-6348,1,0.0%,
-6346,1,0.0%,
-6331,1,0.0%,
-6328,1,0.0%,
-6317,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
-4,8,0.0%,
-3,7,0.0%,
-2,7,0.0%,
-1,4,0.0%,
0,5,0.0%,

0,1
Distinct count,3579
Unique (%),7.3%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,-1077.8
Minimum,-4361
Maximum,0
Zeros (%),11.9%

0,1
Minimum,-4361.0
5-th percentile,-2729.8
Q1,-1766.2
Median,-863.0
Q3,-363.0
95-th percentile,0.0
Maximum,0.0
Range,4361.0
Interquartile range,1403.2

0,1
Standard deviation,878.92
Coef of variation,-0.8155
Kurtosis,-0.39542
Mean,-1077.8
MAD,740.73
Skewness,-0.67382
Sum,-52535000
Variance,772500
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
0.0,5801,11.9%,
-1.0,174,0.4%,
-2.0,71,0.1%,
-3.0,45,0.1%,
-1799.0,44,0.1%,
-4.0,43,0.1%,
-565.0,42,0.1%,
-1783.0,42,0.1%,
-1800.0,41,0.1%,
-608.0,41,0.1%,

Value,Count,Frequency (%),Unnamed: 3
-4361.0,1,0.0%,
-4285.0,1,0.0%,
-4257.0,1,0.0%,
-4250.0,1,0.0%,
-4187.0,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
-4.0,43,0.1%,
-3.0,45,0.1%,
-2.0,71,0.1%,
-1.0,174,0.4%,
0.0,5801,11.9%,

0,1
Distinct count,12618
Unique (%),25.9%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,-4967.7
Minimum,-23722
Maximum,0
Zeros (%),0.0%

0,1
Minimum,-23722.0
5-th percentile,-11440.0
Q1,-7459.2
Median,-4490.0
Q3,-1901.0
95-th percentile,-345.0
Maximum,0.0
Range,23722.0
Interquartile range,5558.2

0,1
Standard deviation,3552.6
Coef of variation,-0.71515
Kurtosis,-0.30051
Mean,-4967.7
MAD,2939.9
Skewness,-0.61327
Sum,-242140000
Variance,12621000
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
-818.0,20,0.0%,
-427.0,19,0.0%,
-991.0,18,0.0%,
-3758.0,16,0.0%,
-223.0,15,0.0%,
-1435.0,15,0.0%,
-602.0,15,0.0%,
-819.0,15,0.0%,
-832.0,15,0.0%,
-650.0,15,0.0%,

Value,Count,Frequency (%),Unnamed: 3
-23722.0,1,0.0%,
-22842.0,1,0.0%,
-21099.0,1,0.0%,
-19842.0,1,0.0%,
-19263.0,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
-4.0,7,0.0%,
-3.0,10,0.0%,
-2.0,7,0.0%,
-1.0,10,0.0%,
0.0,13,0.0%,

0,1
Distinct count,9
Unique (%),0.0%
Missing (%),0.1%
Missing (n),29
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.14365
Minimum,0
Maximum,34
Zeros (%),88.6%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,0
Q3,0
95-th percentile,1
Maximum,34
Range,34
Interquartile range,0

0,1
Standard deviation,0.51441
Coef of variation,3.581
Kurtosis,1164.9
Mean,0.14365
MAD,0.25475
Skewness,20.003
Sum,6998
Variance,0.26462
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
0.0,43195,88.6%,
1.0,4443,9.1%,
2.0,834,1.7%,
3.0,189,0.4%,
4.0,40,0.1%,
5.0,8,0.0%,
34.0,3,0.0%,
6.0,3,0.0%,
(Missing),29,0.1%,

Value,Count,Frequency (%),Unnamed: 3
0.0,43195,88.6%,
1.0,4443,9.1%,
2.0,834,1.7%,
3.0,189,0.4%,
4.0,40,0.1%,

Value,Count,Frequency (%),Unnamed: 3
3.0,189,0.4%,
4.0,40,0.1%,
5.0,8,0.0%,
6.0,3,0.0%,
34.0,3,0.0%,

0,1
Distinct count,8
Unique (%),0.0%
Missing (%),0.1%
Missing (n),29
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.10114
Minimum,0
Maximum,24
Zeros (%),91.5%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,0
Q3,0
95-th percentile,1
Maximum,24
Range,24
Interquartile range,0

0,1
Standard deviation,0.40379
Coef of variation,3.9924
Kurtosis,769.63
Mean,0.10114
MAD,0.18525
Skewness,15.8
Sum,4927
Variance,0.16305
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
0.0,44614,91.5%,
1.0,3484,7.1%,
2.0,496,1.0%,
3.0,97,0.2%,
4.0,17,0.0%,
5.0,4,0.0%,
24.0,3,0.0%,
(Missing),29,0.1%,

Value,Count,Frequency (%),Unnamed: 3
0.0,44614,91.5%,
1.0,3484,7.1%,
2.0,496,1.0%,
3.0,97,0.2%,
4.0,17,0.0%,

Value,Count,Frequency (%),Unnamed: 3
2.0,496,1.0%,
3.0,97,0.2%,
4.0,17,0.0%,
5.0,4,0.0%,
24.0,3,0.0%,

0,1
Distinct count,182
Unique (%),0.4%
Missing (%),51.7%
Missing (n),25189
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.085168
Minimum,0
Maximum,1
Zeros (%),27.6%

0,1
Minimum,0.0
5-th percentile,0.0
Q1,0.0
Median,0.0
Q3,0.16
95-th percentile,0.36
Maximum,1.0
Range,1.0
Interquartile range,0.16

0,1
Standard deviation,0.13916
Coef of variation,1.634
Kurtosis,7.0423
Mean,0.085168
MAD,0.10258
Skewness,2.3203
Sum,2006.1
Variance,0.019367
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
0.0,13467,27.6%,
0.08,1682,3.5%,
0.16,1586,3.3%,
0.24,1067,2.2%,
0.12,922,1.9%,
0.04,759,1.6%,
0.2,698,1.4%,
0.32,475,1.0%,
0.28,398,0.8%,
0.4,314,0.6%,

Value,Count,Frequency (%),Unnamed: 3
0.0,13467,27.6%,
0.0044,1,0.0%,
0.0048,2,0.0%,
0.0064,7,0.0%,
0.008,3,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.84,11,0.0%,
0.88,16,0.0%,
0.92,2,0.0%,
0.96,17,0.0%,
1.0,24,0.0%,

0,1
Correlation,0.98618

0,1
Correlation,0.9814

0,1
Distinct count,3
Unique (%),0.0%
Missing (%),45.6%
Missing (n),22209

0,1
No,26179
Yes,356
(Missing),22209

Value,Count,Frequency (%),Unnamed: 3
No,26179,53.7%,
Yes,356,0.7%,
(Missing),22209,45.6%,

0,1
Distinct count,201
Unique (%),0.4%
Missing (%),48.4%
Missing (n),23579
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.15178
Minimum,0
Maximum,1
Zeros (%),0.1%

0,1
Minimum,0.0
5-th percentile,0.0345
Q1,0.0745
Median,0.1379
Q3,0.2069
95-th percentile,0.3103
Maximum,1.0
Range,1.0
Interquartile range,0.1324

0,1
Standard deviation,0.10067
Coef of variation,0.66327
Kurtosis,12.503
Mean,0.15178
MAD,0.070153
Skewness,2.4898
Sum,3819.5
Variance,0.010134
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
0.1379,5683,11.7%,
0.069,3659,7.5%,
0.2069,3243,6.7%,
0.1034,3132,6.4%,
0.0345,2398,4.9%,
0.1724,1541,3.2%,
0.2759,1318,2.7%,
0.2414,705,1.4%,
0.3448,381,0.8%,
0.3103,356,0.7%,

Value,Count,Frequency (%),Unnamed: 3
0.0,38,0.1%,
0.0345,2398,4.9%,
0.0407,1,0.0%,
0.0414,3,0.0%,
0.0431,6,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.8621,2,0.0%,
0.8966,7,0.0%,
0.931,6,0.0%,
0.9655,6,0.0%,
1.0,29,0.1%,

0,1
Correlation,0.98229

0,1
Correlation,0.97882

0,1
Distinct count,27208
Unique (%),55.8%
Missing (%),42.1%
Missing (n),20532
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.50118
Minimum,0.013458
Maximum,0.93914
Zeros (%),0.0%

0,1
Minimum,0.013458
5-th percentile,0.15714
Q1,0.3437
Median,0.50677
Q3,0.66596
95-th percentile,0.82126
Maximum,0.93914
Range,0.92569
Interquartile range,0.32226

0,1
Standard deviation,0.20514
Coef of variation,0.40932
Kurtosis,-0.87481
Mean,0.50118
MAD,0.17256
Skewness,-0.12179
Sum,14139
Variance,0.042083
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
0.4152812269709412,4,0.0%,
0.3776684051869344,4,0.0%,
0.6390957284454417,3,0.0%,
0.6179579067171371,3,0.0%,
0.58411453572162,3,0.0%,
0.4374936492679527,3,0.0%,
0.5673372445332411,3,0.0%,
0.3539137186352733,3,0.0%,
0.7066083433716649,3,0.0%,
0.3755951414984175,3,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.0134579104986917,1,0.0%,
0.0150029392241375,1,0.0%,
0.0157250495617777,1,0.0%,
0.015832811772233,1,0.0%,
0.0160191403660058,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.929133160594686,1,0.0%,
0.93147085800672,1,0.0%,
0.9320799837837532,1,0.0%,
0.9341448661797344,1,0.0%,
0.9391445326561508,1,0.0%,

0,1
Distinct count,38886
Unique (%),79.8%
Missing (%),0.0%
Missing (n),8
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.51802
Minimum,8.0979e-06
Maximum,0.855
Zeros (%),0.0%

0,1
Minimum,8.0979e-06
5-th percentile,0.15451
Q1,0.40807
Median,0.55876
Q3,0.6585
95-th percentile,0.74777
Maximum,0.855
Range,0.85499
Interquartile range,0.25043

0,1
Standard deviation,0.18128
Coef of variation,0.34994
Kurtosis,-0.1132
Mean,0.51802
MAD,0.14733
Skewness,-0.78106
Sum,25246
Variance,0.032862
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
0.2858978721410488,100,0.2%,
0.2518770396779223,68,0.1%,
0.5716197625863495,55,0.1%,
0.5637630973619067,51,0.1%,
0.25084006448472995,50,0.1%,
0.15124295062050858,48,0.1%,
0.34246594940694985,35,0.1%,
0.4724002764874421,35,0.1%,
0.15041112624731465,33,0.1%,
0.4821327460011443,30,0.1%,

Value,Count,Frequency (%),Unnamed: 3
8.09785587553435e-06,1,0.0%,
1.7886937824749202e-05,1,0.0%,
2.546852218320021e-05,1,0.0%,
4.299709118718917e-05,1,0.0%,
0.000104938789094,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.8136666961587994,1,0.0%,
0.8168285575720258,1,0.0%,
0.8179351006825469,1,0.0%,
0.8251289217383433,1,0.0%,
0.8549996664047012,5,0.0%,

0,1
Distinct count,703
Unique (%),1.4%
Missing (%),17.8%
Missing (n),8668
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.50011
Minimum,0.00052727
Maximum,0.88253
Zeros (%),0.0%

0,1
Minimum,0.00052727
5-th percentile,0.15952
Q1,0.36395
Median,0.5191
Q3,0.6529
95-th percentile,0.77516
Maximum,0.88253
Range,0.882
Interquartile range,0.28895

0,1
Standard deviation,0.1895
Coef of variation,0.37892
Kurtosis,-0.73302
Mean,0.50011
MAD,0.15839
Skewness,-0.33621
Sum,20042
Variance,0.035909
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
0.7062051096536562,219,0.4%,
0.746300213050371,200,0.4%,
0.5954562029091491,197,0.4%,
0.6706517530862718,197,0.4%,
0.7136313997323308,195,0.4%,
0.5814837058057234,194,0.4%,
0.6263042766749393,193,0.4%,
0.5549467685334323,187,0.4%,
0.6832688314232291,185,0.4%,
0.5585066276769286,183,0.4%,

Value,Count,Frequency (%),Unnamed: 3
0.0005272652387098,64,0.1%,
0.0214915157350364,1,0.0%,
0.0224206800096286,1,0.0%,
0.0240571964235513,1,0.0%,
0.0247439400696317,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.8771942581973663,3,0.0%,
0.878739766981187,1,0.0%,
0.8802684803872619,3,0.0%,
0.881026574983999,2,0.0%,
0.8825303127941461,2,0.0%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.9984

0,1
1,48666
0,78

Value,Count,Frequency (%),Unnamed: 3
1,48666,99.8%,
0,78,0.2%,

0,1
Constant value,0

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.0011694

0,1
0,48687
1,57

Value,Count,Frequency (%),Unnamed: 3
0,48687,99.9%,
1,57,0.1%,

0,1
Constant value,0

0,1
Constant value,0

0,1
Constant value,0

0,1
Constant value,0

0,1
Constant value,0

0,1
Constant value,0

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.0015592

0,1
0,48668
1,76

Value,Count,Frequency (%),Unnamed: 3
0,48668,99.8%,
1,76,0.2%,

0,1
Constant value,0

0,1
Constant value,0

0,1
Constant value,0

0,1
Constant value,0

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.78662

0,1
1,38343
0,10401

Value,Count,Frequency (%),Unnamed: 3
1,38343,78.7%,
0,10401,21.3%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.00010258

0,1
0,48739
1,5

Value,Count,Frequency (%),Unnamed: 3
0,48739,100.0%,
1,5,0.0%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.014751

0,1
0,48025
1,719

Value,Count,Frequency (%),Unnamed: 3
0,48025,98.5%,
1,719,1.5%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.087477

0,1
0,44480
1,4264

Value,Count,Frequency (%),Unnamed: 3
0,44480,91.3%,
1,4264,8.7%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,4.1031e-05

0,1
0,48742
1,2

Value,Count,Frequency (%),Unnamed: 3
0,48742,100.0%,
1,2,0.0%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.088462

0,1
0,44432
1,4312

Value,Count,Frequency (%),Unnamed: 3
0,44432,91.2%,
1,4312,8.8%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.0044929

0,1
0,48525
1,219

Value,Count,Frequency (%),Unnamed: 3
0,48525,99.6%,
1,219,0.4%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.16265

0,1
0,40816
1,7928

Value,Count,Frequency (%),Unnamed: 3
0,40816,83.7%,
1,7928,16.3%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.80972

0,1
1,39469
0,9275

Value,Count,Frequency (%),Unnamed: 3
1,39469,81.0%,
0,9275,19.0%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.99998

0,1
1,48743
0,1

Value,Count,Frequency (%),Unnamed: 3
1,48743,100.0%,
0,1,0.0%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
N,32311
Y,16433

Value,Count,Frequency (%),Unnamed: 3
N,32311,66.3%,
Y,16433,33.7%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Y,33658
N,15086

Value,Count,Frequency (%),Unnamed: 3
Y,33658,69.1%,
N,15086,30.9%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.26313

0,1
0,35918
1,12826

Value,Count,Frequency (%),Unnamed: 3
0,35918,73.7%,
1,12826,26.3%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.2047

0,1
0,38766
1,9978

Value,Count,Frequency (%),Unnamed: 3
0,38766,79.5%,
1,9978,20.5%,

0,1
Distinct count,253
Unique (%),0.5%
Missing (%),47.8%
Missing (n),23321
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.23371
Minimum,0
Maximum,1
Zeros (%),0.9%

0,1
Minimum,0.0
5-th percentile,0.0417
Q1,0.1667
Median,0.1667
Q3,0.3333
95-th percentile,0.5417
Maximum,1.0
Range,1.0
Interquartile range,0.1666

0,1
Standard deviation,0.14736
Coef of variation,0.63054
Kurtosis,2.5639
Mean,0.23371
MAD,0.11857
Skewness,1.2559
Sum,5941.5
Variance,0.021715
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
0.1667,10118,20.8%,
0.3333,5480,11.2%,
0.0417,2165,4.4%,
0.375,1421,2.9%,
0.125,1052,2.2%,
0.0833,997,2.0%,
0.4583,468,1.0%,
0.0,417,0.9%,
0.625,320,0.7%,
0.6667,292,0.6%,

Value,Count,Frequency (%),Unnamed: 3
0.0,417,0.9%,
0.0083,2,0.0%,
0.01,1,0.0%,
0.0104,2,0.0%,
0.0138,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.9167,12,0.0%,
0.9304,3,0.0%,
0.9375,1,0.0%,
0.9583,15,0.0%,
1.0,34,0.1%,

0,1
Correlation,0.9881

0,1
Correlation,0.98515

0,1
Distinct count,199
Unique (%),0.4%
Missing (%),66.6%
Missing (n),32466
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.23842
Minimum,0
Maximum,1
Zeros (%),0.8%

0,1
Minimum,0.0
5-th percentile,0.0417
Q1,0.1042
Median,0.2083
Q3,0.375
95-th percentile,0.5417
Maximum,1.0
Range,1.0
Interquartile range,0.2708

0,1
Standard deviation,0.16498
Coef of variation,0.69195
Kurtosis,1.266
Mean,0.23842
MAD,0.12893
Skewness,0.93205
Sum,3881.1
Variance,0.027217
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
0.2083,5337,10.9%,
0.375,3068,6.3%,
0.0417,2852,5.9%,
0.0833,750,1.5%,
0.4167,699,1.4%,
0.125,513,1.1%,
0.1667,498,1.0%,
0.0,391,0.8%,
0.5,269,0.6%,
0.7083,198,0.4%,

Value,Count,Frequency (%),Unnamed: 3
0.0,391,0.8%,
0.0067,1,0.0%,
0.0104,1,0.0%,
0.0158,2,0.0%,
0.0208,6,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.875,5,0.0%,
0.8888,1,0.0%,
0.9167,24,0.0%,
0.9583,5,0.0%,
1.0,25,0.1%,

0,1
Correlation,0.98826

0,1
Correlation,0.98547

0,1
Distinct count,5
Unique (%),0.0%
Missing (%),67.3%
Missing (n),32797

0,1
reg oper account,12124
reg oper spec account,1990
org spec account,920
(Missing),32797

Value,Count,Frequency (%),Unnamed: 3
reg oper account,12124,24.9%,
reg oper spec account,1990,4.1%,
org spec account,920,1.9%,
not specified,913,1.9%,
(Missing),32797,67.3%,

0,1
Distinct count,24
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,12.007
Minimum,0
Maximum,23
Zeros (%),0.0%

0,1
Minimum,0
5-th percentile,7
Q1,10
Median,12
Q3,14
95-th percentile,17
Maximum,23
Range,23
Interquartile range,4

0,1
Standard deviation,3.2782
Coef of variation,0.27301
Kurtosis,-0.16983
Mean,12.007
MAD,2.6328
Skewness,0.013816
Sum,585287
Variance,10.746
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
10,6474,13.3%,
11,5934,12.2%,
12,5418,11.1%,
13,4715,9.7%,
9,4341,8.9%,
14,4197,8.6%,
15,3761,7.7%,
16,3106,6.4%,
8,2574,5.3%,
17,2356,4.8%,

Value,Count,Frequency (%),Unnamed: 3
0,8,0.0%,
1,19,0.0%,
2,39,0.1%,
3,240,0.5%,
4,302,0.6%,

Value,Count,Frequency (%),Unnamed: 3
19,629,1.3%,
20,214,0.4%,
21,69,0.1%,
22,25,0.1%,
23,7,0.0%,

0,1
Distinct count,4
Unique (%),0.0%
Missing (%),48.5%
Missing (n),23619

0,1
block of flats,24659
specific housing,262
terraced house,204
(Missing),23619

Value,Count,Frequency (%),Unnamed: 3
block of flats,24659,50.6%,
specific housing,262,0.5%,
terraced house,204,0.4%,
(Missing),23619,48.5%,

0,1
Distinct count,2541
Unique (%),5.2%
Missing (%),58.0%
Missing (n),28254
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.067192
Minimum,0
Maximum,1
Zeros (%),5.4%

0,1
Minimum,0.0
5-th percentile,0.0
Q1,0.019
Median,0.0483
Q3,0.0868
95-th percentile,0.1997
Maximum,1.0
Range,1.0
Interquartile range,0.0678

0,1
Standard deviation,0.081909
Coef of variation,1.219
Kurtosis,32.476
Mean,0.067192
MAD,0.050376
Skewness,4.3076
Sum,1376.8
Variance,0.0067091
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
0.0,2656,5.4%,
0.0141,37,0.1%,
0.0158,33,0.1%,
0.0114,33,0.1%,
0.0568,32,0.1%,
0.0394,32,0.1%,
0.0221,32,0.1%,
0.0316,32,0.1%,
0.0947,31,0.1%,
0.0458,31,0.1%,

Value,Count,Frequency (%),Unnamed: 3
0.0,2656,5.4%,
0.0001,2,0.0%,
0.0002,2,0.0%,
0.0004,3,0.0%,
0.0006,2,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.894,3,0.0%,
0.8947,1,0.0%,
0.913,1,0.0%,
0.9777,1,0.0%,
1.0,20,0.0%,

0,1
Correlation,0.98778

0,1
Correlation,0.9795

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.17422

0,1
0,40252
1,8492

Value,Count,Frequency (%),Unnamed: 3
0,40252,82.6%,
1,8492,17.4%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.042036

0,1
0,46695
1,2049

Value,Count,Frequency (%),Unnamed: 3
0,46695,95.8%,
1,2049,4.2%,

0,1
Correlation,0.94334

0,1
Correlation,0.94174

0,1
Correlation,0.93636

0,1
Correlation,0.91006

0,1
Correlation,0.91048

0,1
Correlation,0.90556

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Cash loans,48305
Revolving loans,439

Value,Count,Frequency (%),Unnamed: 3
Cash loans,48305,99.1%,
Revolving loans,439,0.9%,

0,1
Distinct count,5
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Secondary / secondary special,33988
Higher education,12516
Incomplete higher,1724
Other values (2),516

Value,Count,Frequency (%),Unnamed: 3
Secondary / secondary special,33988,69.7%,
Higher education,12516,25.7%,
Incomplete higher,1724,3.5%,
Lower secondary,475,1.0%,
Academic degree,41,0.1%,

0,1
Distinct count,5
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Married,32283
Single / not married,7036
Civil marriage,4261
Other values (2),5164

Value,Count,Frequency (%),Unnamed: 3
Married,32283,66.2%,
Single / not married,7036,14.4%,
Civil marriage,4261,8.7%,
Separated,2955,6.1%,
Widow,2209,4.5%,

0,1
Distinct count,6
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
House / apartment,43645
With parents,2234
Municipal apartment,1617
Other values (3),1248

Value,Count,Frequency (%),Unnamed: 3
House / apartment,43645,89.5%,
With parents,2234,4.6%,
Municipal apartment,1617,3.3%,
Rented apartment,718,1.5%,
Office apartment,407,0.8%,
Co-op apartment,123,0.3%,

0,1
Distinct count,7
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Working,24533
Commercial associate,11402
Pensioner,9273
Other values (4),3536

Value,Count,Frequency (%),Unnamed: 3
Working,24533,50.3%,
Commercial associate,11402,23.4%,
Pensioner,9273,19.0%,
State servant,3532,7.2%,
Student,2,0.0%,
Unemployed,1,0.0%,
Businessman,1,0.0%,

0,1
Distinct count,8
Unique (%),0.0%
Missing (%),1.9%
Missing (n),911

0,1
Unaccompanied,39727
Family,5881
"Spouse, partner",1448
Other values (4),777
(Missing),911

Value,Count,Frequency (%),Unnamed: 3
Unaccompanied,39727,81.5%,
Family,5881,12.1%,
"Spouse, partner",1448,3.0%,
Children,408,0.8%,
Other_B,211,0.4%,
Other_A,109,0.2%,
Group of people,49,0.1%,
(Missing),911,1.9%,

0,1
Distinct count,242
Unique (%),0.5%
Missing (%),68.4%
Missing (n),33347
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.0092315
Minimum,0
Maximum,1
Zeros (%),18.1%

0,1
Minimum,0.0
5-th percentile,0.0
Q1,0.0
Median,0.0
Q3,0.0051
95-th percentile,0.0309
Maximum,1.0
Range,1.0
Interquartile range,0.0051

0,1
Standard deviation,0.048749
Coef of variation,5.2807
Kurtosis,241.31
Mean,0.0092315
MAD,0.012839
Skewness,14.303
Sum,142.14
Variance,0.0023765
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
0.0,8815,18.1%,
0.0039,2234,4.6%,
0.0077,1020,2.1%,
0.0116,582,1.2%,
0.0154,478,1.0%,
0.0193,261,0.5%,
0.0232,215,0.4%,
0.0019,196,0.4%,
0.027,140,0.3%,
0.0309,134,0.3%,

Value,Count,Frequency (%),Unnamed: 3
0.0,8815,18.1%,
0.0004,3,0.0%,
0.0005,3,0.0%,
0.0006,5,0.0%,
0.0007,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.8571,1,0.0%,
0.8687,1,0.0%,
0.8803,1,0.0%,
0.9344,1,0.0%,
1.0,14,0.0%,

0,1
Correlation,0.97868

0,1
Correlation,0.96234

0,1
Distinct count,2027
Unique (%),4.2%
Missing (%),53.5%
Missing (n),26084
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.029387
Minimum,0
Maximum,1
Zeros (%),19.4%

0,1
Minimum,0.0
5-th percentile,0.0
Q1,0.0
Median,0.0038
Q3,0.029
95-th percentile,0.1323
Maximum,1.0
Range,1.0
Interquartile range,0.029

0,1
Standard deviation,0.072007
Coef of variation,2.4503
Kurtosis,63.033
Mean,0.029387
MAD,0.037246
Skewness,6.5177
Sum,665.92
Variance,0.0051851
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
0.0,9469,19.4%,
0.0012,102,0.2%,
0.0024,75,0.2%,
0.0052,75,0.2%,
0.001,72,0.1%,
0.0044,71,0.1%,
0.0031,71,0.1%,
0.0022,70,0.1%,
0.0036,69,0.1%,
0.0013,69,0.1%,

Value,Count,Frequency (%),Unnamed: 3
0.0,9469,19.4%,
0.0001,25,0.1%,
0.0002,20,0.0%,
0.0003,22,0.0%,
0.0004,24,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.9043,2,0.0%,
0.9099,1,0.0%,
0.9295,1,0.0%,
0.9813,1,0.0%,
1.0,26,0.1%,

0,1
Correlation,0.9823

0,1
Correlation,0.9772

0,1
Distinct count,29
Unique (%),0.1%
Missing (%),0.1%
Missing (n),29
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,1.4476
Minimum,0
Maximum,354
Zeros (%),53.4%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,0
Q3,2
95-th percentile,6
Maximum,354
Range,354
Interquartile range,2

0,1
Standard deviation,3.6081
Coef of variation,2.4924
Kurtosis,5550.4
Mean,1.4476
MAD,1.6895
Skewness,57.638
Sum,70522
Variance,13.018
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
0.0,26025,53.4%,
1.0,7765,15.9%,
2.0,4751,9.7%,
3.0,3269,6.7%,
4.0,2202,4.5%,
5.0,1467,3.0%,
6.0,1034,2.1%,
7.0,695,1.4%,
8.0,486,1.0%,
9.0,344,0.7%,

Value,Count,Frequency (%),Unnamed: 3
0.0,26025,53.4%,
1.0,7765,15.9%,
2.0,4751,9.7%,
3.0,3269,6.7%,
4.0,2202,4.5%,

Value,Count,Frequency (%),Unnamed: 3
23.0,3,0.0%,
29.0,1,0.0%,
352.0,1,0.0%,
353.0,1,0.0%,
354.0,1,0.0%,

0,1
Correlation,0.99954

0,1
Distinct count,19
Unique (%),0.0%
Missing (%),32.0%
Missing (n),15605

0,1
Laborers,8655
Sales staff,5072
Core staff,4361
Other values (15),15051
(Missing),15605

Value,Count,Frequency (%),Unnamed: 3
Laborers,8655,17.8%,
Sales staff,5072,10.4%,
Core staff,4361,8.9%,
Managers,3574,7.3%,
Drivers,2773,5.7%,
High skill tech staff,1854,3.8%,
Accountants,1628,3.3%,
Medicine staff,1316,2.7%,
Security staff,915,1.9%,
Cooking staff,894,1.8%,

0,1
Distinct count,58
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Business Entity Type 3,10840
XNA,9274
Self-employed,5920
Other values (55),22710

Value,Count,Frequency (%),Unnamed: 3
Business Entity Type 3,10840,22.2%,
XNA,9274,19.0%,
Self-employed,5920,12.1%,
Other,2707,5.6%,
Medicine,1716,3.5%,
Government,1508,3.1%,
Business Entity Type 2,1479,3.0%,
Trade: type 7,1303,2.7%,
School,1287,2.6%,
Construction,1039,2.1%,

0,1
Distinct count,53
Unique (%),0.1%
Missing (%),66.3%
Missing (n),32312
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,11.786
Minimum,0
Maximum,74
Zeros (%),0.2%

0,1
Minimum,0
5-th percentile,2
Q1,4
Median,9
Q3,15
95-th percentile,28
Maximum,74
Range,74
Interquartile range,11

0,1
Standard deviation,11.463
Coef of variation,0.97258
Kurtosis,10.597
Mean,11.786
MAD,7.3318
Skewness,2.9059
Sum,193670
Variance,131.4
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
4.0,1352,2.8%,
7.0,1201,2.5%,
8.0,1097,2.3%,
3.0,1027,2.1%,
2.0,980,2.0%,
9.0,849,1.7%,
5.0,829,1.7%,
10.0,764,1.6%,
11.0,710,1.5%,
14.0,706,1.4%,

Value,Count,Frequency (%),Unnamed: 3
0.0,118,0.2%,
1.0,650,1.3%,
2.0,980,2.0%,
3.0,1027,2.1%,
4.0,1352,2.8%,

Value,Count,Frequency (%),Unnamed: 3
52.0,1,0.0%,
55.0,1,0.0%,
56.0,1,0.0%,
65.0,459,0.9%,
74.0,1,0.0%,

0,1
Distinct count,81
Unique (%),0.2%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.021226
Minimum,0.000253
Maximum,0.072508
Zeros (%),0.0%

0,1
Minimum,0.000253
5-th percentile,0.00496
Q1,0.010006
Median,0.01885
Q3,0.028663
95-th percentile,0.04622
Maximum,0.072508
Range,0.072255
Interquartile range,0.018657

0,1
Standard deviation,0.014428
Coef of variation,0.67975
Kurtosis,2.9862
Mean,0.021226
MAD,0.010734
Skewness,1.4934
Sum,1034.6
Variance,0.00020817
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
0.035792,2582,5.3%,
0.04622,2497,5.1%,
0.030755,1991,4.1%,
0.026392,1805,3.7%,
0.028663,1740,3.6%,
0.025164,1599,3.3%,
0.031329,1595,3.3%,
0.072508,1565,3.2%,
0.019101,1466,3.0%,
0.020713,1327,2.7%,

Value,Count,Frequency (%),Unnamed: 3
0.000253,1,0.0%,
0.000533,1,0.0%,
0.000938,5,0.0%,
0.001276,69,0.1%,
0.001333,28,0.1%,

Value,Count,Frequency (%),Unnamed: 3
0.031329,1595,3.3%,
0.032561,1088,2.2%,
0.035792,2582,5.3%,
0.04622,2497,5.1%,
0.072508,1565,3.2%,

0,1
Distinct count,3
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,2.0382
Minimum,1
Maximum,3
Zeros (%),0.0%

0,1
Minimum,1
5-th percentile,1
Q1,2
Median,2
Q3,2
95-th percentile,3
Maximum,3
Range,2
Interquartile range,0

0,1
Standard deviation,0.52269
Coef of variation,0.25645
Kurtosis,0.634
Mean,2.0382
MAD,0.30088
Skewness,0.047815
Sum,99348
Variance,0.27321
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
2,35356,72.5%,
3,7624,15.6%,
1,5764,11.8%,

Value,Count,Frequency (%),Unnamed: 3
1,5764,11.8%,
2,35356,72.5%,
3,7624,15.6%,

Value,Count,Frequency (%),Unnamed: 3
1,5764,11.8%,
2,35356,72.5%,
3,7624,15.6%,

0,1
Correlation,0.94248

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.077466

0,1
0,44968
1,3776

Value,Count,Frequency (%),Unnamed: 3
0,44968,92.3%,
1,3776,7.7%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.22466

0,1
0,37793
1,10951

Value,Count,Frequency (%),Unnamed: 3
0,37793,77.5%,
1,10951,22.5%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.018833

0,1
0,47826
1,918

Value,Count,Frequency (%),Unnamed: 3
0,47826,98.1%,
1,918,1.9%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.055166

0,1
0,46055
1,2689

Value,Count,Frequency (%),Unnamed: 3
0,46055,94.5%,
1,2689,5.5%,

0,1
Distinct count,48744
Unique (%),100.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,277800
Minimum,100001
Maximum,456250
Zeros (%),0.0%

0,1
Minimum,100001
5-th percentile,117040
Q1,188560
Median,277550
Q3,367560
95-th percentile,438550
Maximum,456250
Range,356249
Interquartile range,179000

0,1
Standard deviation,103170
Coef of variation,0.37139
Kurtosis,-1.2063
Mean,277800
MAD,89401
Skewness,0.0075597
Sum,13540921192
Variance,10644000000
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
198655,1,0.0%,
415104,1,0.0%,
200572,1,0.0%,
132490,1,0.0%,
333192,1,0.0%,
290183,1,0.0%,
386308,1,0.0%,
369105,1,0.0%,
281987,1,0.0%,
229357,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
100001,1,0.0%,
100005,1,0.0%,
100013,1,0.0%,
100028,1,0.0%,
100038,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
456221,1,0.0%,
456222,1,0.0%,
456223,1,0.0%,
456224,1,0.0%,
456250,1,0.0%,

0,1
Correlation,0.91829

0,1
Distinct count,8
Unique (%),0.0%
Missing (%),49.0%
Missing (n),23893

0,1
Panel,11269
"Stone, brick",10434
Block,1428
Other values (4),1720
(Missing),23893

Value,Count,Frequency (%),Unnamed: 3
Panel,11269,23.1%,
"Stone, brick",10434,21.4%,
Block,1428,2.9%,
Wooden,794,1.6%,
Mixed,353,0.7%,
Monolithic,289,0.6%,
Others,284,0.6%,
(Missing),23893,49.0%,

0,1
Distinct count,7
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
TUESDAY,9751
WEDNESDAY,8457
THURSDAY,8418
Other values (4),22118

Value,Count,Frequency (%),Unnamed: 3
TUESDAY,9751,20.0%,
WEDNESDAY,8457,17.3%,
THURSDAY,8418,17.3%,
MONDAY,8406,17.2%,
FRIDAY,7250,14.9%,
SATURDAY,4603,9.4%,
SUNDAY,1859,3.8%,

0,1
Distinct count,176
Unique (%),0.4%
Missing (%),46.9%
Missing (n),22856
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.97883
Minimum,0
Maximum,1
Zeros (%),0.1%

0,1
Minimum,0.0
5-th percentile,0.9692
Q1,0.9767
Median,0.9816
Q3,0.9866
95-th percentile,0.996
Maximum,1.0
Range,1.0
Interquartile range,0.0099

0,1
Standard deviation,0.049318
Coef of variation,0.050385
Kurtosis,357.39
Mean,0.97883
MAD,0.0092508
Skewness,-18.523
Sum,25340
Variance,0.0024323
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
0.9866,747,1.5%,
0.9871,701,1.4%,
0.9846,694,1.4%,
0.9806,681,1.4%,
0.9856,677,1.4%,
0.9861,670,1.4%,
0.9851,667,1.4%,
0.9776,666,1.4%,
0.9801,665,1.4%,
0.9841,664,1.4%,

Value,Count,Frequency (%),Unnamed: 3
0.0,58,0.1%,
0.1937,1,0.0%,
0.1982,1,0.0%,
0.1987,1,0.0%,
0.4704,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.998,194,0.4%,
0.9985,179,0.4%,
0.999,173,0.4%,
0.9995,93,0.2%,
1.0,10,0.0%,

0,1
Correlation,0.97165

0,1
Correlation,0.97451

0,1
Distinct count,131
Unique (%),0.3%
Missing (%),65.3%
Missing (n),31818
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.75114
Minimum,0
Maximum,1
Zeros (%),0.0%

0,1
Minimum,0.0
5-th percentile,0.5988
Q1,0.6872
Median,0.7552
Q3,0.8164
95-th percentile,0.9524
Maximum,1.0
Range,1.0
Interquartile range,0.1292

0,1
Standard deviation,0.11319
Coef of variation,0.15069
Kurtosis,4.2524
Mean,0.75114
MAD,0.084109
Skewness,-0.8842
Sum,12714
Variance,0.012812
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
0.8232,507,1.0%,
0.8164,497,1.0%,
0.8096,457,0.9%,
0.7348,452,0.9%,
0.796,452,0.9%,
0.7416,448,0.9%,
0.7892,448,0.9%,
0.7212,448,0.9%,
0.7824,446,0.9%,
0.8028,445,0.9%,

Value,Count,Frequency (%),Unnamed: 3
0.0,18,0.0%,
0.0004,1,0.0%,
0.0072,1,0.0%,
0.0548,1,0.0%,
0.0684,2,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.9728,136,0.3%,
0.9796,132,0.3%,
0.9864,115,0.2%,
0.9932,73,0.1%,
1.0,8,0.0%,

0,1
Correlation,0.99065

0,1
Correlation,0.9905

Unnamed: 0,SK_ID_CURR,NAME_CONTRACT_TYPE,CODE_GENDER,FLAG_OWN_CAR,FLAG_OWN_REALTY,CNT_CHILDREN,AMT_INCOME_TOTAL,AMT_CREDIT,AMT_ANNUITY,AMT_GOODS_PRICE,NAME_TYPE_SUITE,NAME_INCOME_TYPE,NAME_EDUCATION_TYPE,NAME_FAMILY_STATUS,NAME_HOUSING_TYPE,REGION_POPULATION_RELATIVE,DAYS_BIRTH,DAYS_EMPLOYED,DAYS_REGISTRATION,DAYS_ID_PUBLISH,OWN_CAR_AGE,FLAG_MOBIL,FLAG_EMP_PHONE,FLAG_WORK_PHONE,FLAG_CONT_MOBILE,FLAG_PHONE,FLAG_EMAIL,OCCUPATION_TYPE,CNT_FAM_MEMBERS,REGION_RATING_CLIENT,REGION_RATING_CLIENT_W_CITY,WEEKDAY_APPR_PROCESS_START,HOUR_APPR_PROCESS_START,REG_REGION_NOT_LIVE_REGION,REG_REGION_NOT_WORK_REGION,LIVE_REGION_NOT_WORK_REGION,REG_CITY_NOT_LIVE_CITY,REG_CITY_NOT_WORK_CITY,LIVE_CITY_NOT_WORK_CITY,ORGANIZATION_TYPE,EXT_SOURCE_1,EXT_SOURCE_2,EXT_SOURCE_3,APARTMENTS_AVG,BASEMENTAREA_AVG,YEARS_BEGINEXPLUATATION_AVG,YEARS_BUILD_AVG,COMMONAREA_AVG,ELEVATORS_AVG,ENTRANCES_AVG,FLOORSMAX_AVG,FLOORSMIN_AVG,LANDAREA_AVG,LIVINGAPARTMENTS_AVG,LIVINGAREA_AVG,NONLIVINGAPARTMENTS_AVG,NONLIVINGAREA_AVG,APARTMENTS_MODE,BASEMENTAREA_MODE,YEARS_BEGINEXPLUATATION_MODE,YEARS_BUILD_MODE,COMMONAREA_MODE,ELEVATORS_MODE,ENTRANCES_MODE,FLOORSMAX_MODE,FLOORSMIN_MODE,LANDAREA_MODE,LIVINGAPARTMENTS_MODE,LIVINGAREA_MODE,NONLIVINGAPARTMENTS_MODE,NONLIVINGAREA_MODE,APARTMENTS_MEDI,BASEMENTAREA_MEDI,YEARS_BEGINEXPLUATATION_MEDI,YEARS_BUILD_MEDI,COMMONAREA_MEDI,ELEVATORS_MEDI,ENTRANCES_MEDI,FLOORSMAX_MEDI,FLOORSMIN_MEDI,LANDAREA_MEDI,LIVINGAPARTMENTS_MEDI,LIVINGAREA_MEDI,NONLIVINGAPARTMENTS_MEDI,NONLIVINGAREA_MEDI,FONDKAPREMONT_MODE,HOUSETYPE_MODE,TOTALAREA_MODE,WALLSMATERIAL_MODE,EMERGENCYSTATE_MODE,OBS_30_CNT_SOCIAL_CIRCLE,DEF_30_CNT_SOCIAL_CIRCLE,OBS_60_CNT_SOCIAL_CIRCLE,DEF_60_CNT_SOCIAL_CIRCLE,DAYS_LAST_PHONE_CHANGE,FLAG_DOCUMENT_2,FLAG_DOCUMENT_3,FLAG_DOCUMENT_4,FLAG_DOCUMENT_5,FLAG_DOCUMENT_6,FLAG_DOCUMENT_7,FLAG_DOCUMENT_8,FLAG_DOCUMENT_9,FLAG_DOCUMENT_10,FLAG_DOCUMENT_11,FLAG_DOCUMENT_12,FLAG_DOCUMENT_13,FLAG_DOCUMENT_14,FLAG_DOCUMENT_15,FLAG_DOCUMENT_16,FLAG_DOCUMENT_17,FLAG_DOCUMENT_18,FLAG_DOCUMENT_19,FLAG_DOCUMENT_20,FLAG_DOCUMENT_21,AMT_REQ_CREDIT_BUREAU_HOUR,AMT_REQ_CREDIT_BUREAU_DAY,AMT_REQ_CREDIT_BUREAU_WEEK,AMT_REQ_CREDIT_BUREAU_MON,AMT_REQ_CREDIT_BUREAU_QRT,AMT_REQ_CREDIT_BUREAU_YEAR
0,100001,Cash loans,F,N,Y,0,135000.0,568800.0,20560.5,450000.0,Unaccompanied,Working,Higher education,Married,House / apartment,0.01885,-19241,-2329,-5170.0,-812,,1,1,0,1,0,1,,2.0,2,2,TUESDAY,18,0,0,0,0,0,0,Kindergarten,0.752614,0.789654,0.15952,0.066,0.059,0.9732,,,,0.1379,0.125,,,,0.0505,,,0.0672,0.0612,0.9732,,,,0.1379,0.125,,,,0.0526,,,0.0666,0.059,0.9732,,,,0.1379,0.125,,,,0.0514,,,,block of flats,0.0392,"Stone, brick",No,0.0,0.0,0.0,0.0,-1740.0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0,0.0,0.0,0.0,0.0,0.0
1,100005,Cash loans,M,N,Y,0,99000.0,222768.0,17370.0,180000.0,Unaccompanied,Working,Secondary / secondary special,Married,House / apartment,0.035792,-18064,-4469,-9118.0,-1623,,1,1,0,1,0,0,Low-skill Laborers,2.0,2,2,FRIDAY,9,0,0,0,0,0,0,Self-employed,0.56499,0.291656,0.432962,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,0.0,0.0,0.0,0.0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0,0.0,0.0,0.0,0.0,3.0
2,100013,Cash loans,M,Y,Y,0,202500.0,663264.0,69777.0,630000.0,,Working,Higher education,Married,House / apartment,0.019101,-20038,-4458,-2175.0,-3503,5.0,1,1,0,1,0,0,Drivers,2.0,2,2,MONDAY,14,0,0,0,0,0,0,Transport: type 3,,0.699787,0.610991,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,0.0,0.0,0.0,-856.0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0,0.0,0.0,0.0,1.0,4.0
3,100028,Cash loans,F,N,Y,2,315000.0,1575000.0,49018.5,1575000.0,Unaccompanied,Working,Secondary / secondary special,Married,House / apartment,0.026392,-13976,-1866,-2000.0,-4208,,1,1,0,1,1,0,Sales staff,4.0,2,2,WEDNESDAY,11,0,0,0,0,0,0,Business Entity Type 3,0.525734,0.509677,0.612704,0.3052,0.1974,0.997,0.9592,0.1165,0.32,0.2759,0.375,0.0417,0.2042,0.2404,0.3673,0.0386,0.08,0.3109,0.2049,0.997,0.9608,0.1176,0.3222,0.2759,0.375,0.0417,0.2089,0.2626,0.3827,0.0389,0.0847,0.3081,0.1974,0.997,0.9597,0.1173,0.32,0.2759,0.375,0.0417,0.2078,0.2446,0.3739,0.0388,0.0817,reg oper account,block of flats,0.37,Panel,No,0.0,0.0,0.0,0.0,-1805.0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0,0.0,0.0,0.0,0.0,3.0
4,100038,Cash loans,M,Y,N,1,180000.0,625500.0,32067.0,625500.0,Unaccompanied,Working,Secondary / secondary special,Married,House / apartment,0.010032,-13040,-2191,-4000.0,-4262,16.0,1,1,1,1,0,0,,3.0,2,2,FRIDAY,5,0,0,0,0,1,1,Business Entity Type 3,0.202145,0.425687,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,0.0,0.0,0.0,-821.0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,,,,,,
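
Several variables in the report above show only a `Correlation` value (for example 0.94248 or 0.99065). This appears to be how this version of pandas_profiling marks a column it rejects for being highly correlated with another one. The sketch below is one possible way to list such pairs by hand. The helper name `highly_correlated_pairs`, the 0.9 threshold, and the tiny synthetic frame are assumptions made for illustration; they are not taken from the report.

```python
import numpy as np
import pandas as pd


def highly_correlated_pairs(df, threshold=0.9):
    """Return (col_a, col_b, corr) for numeric column pairs with |corr| > threshold."""
    corr = df.select_dtypes(include="number").corr()
    # Keep only the strict upper triangle so each pair is listed once.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(mask).stack()
    pairs = pairs[pairs.abs() > threshold]
    return [(a, b, float(v)) for (a, b), v in pairs.items()]


# Synthetic values (NOT the real data): two nearly identical columns
# and one unrelated column, named after columns from the table above.
demo = pd.DataFrame({
    "APARTMENTS_AVG": [0.1, 0.2, 0.3, 0.4, 0.5],
    "APARTMENTS_MEDI": [0.11, 0.19, 0.31, 0.41, 0.5],
    "CNT_CHILDREN": [0, 2, 1, 0, 3],
})
result = highly_correlated_pairs(demo)
print(result)
```

On the real table the same helper could be called as `highly_correlated_pairs(application_test)`, assuming that is the name of the loaded DataFrame.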


In [16]:
pandas_profiling.ProfileReport(bureau)

0,1
Number of variables,17
Number of observations,1716428
Total Missing (%),13.5%
Total size in memory,222.6 MiB
Average record size in memory,136.0 B

0,1
Numeric,14
Categorical,3
Boolean,0
Date,0
Text (Unique),0
Rejected,0
Unsupported,0

0,1
Distinct count,40322
Unique (%),2.3%
Missing (%),71.5%
Missing (n),1226791
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,15713
Minimum,0
Maximum,118450000
Zeros (%),15.0%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,0
Q3,13500
95-th percentile,46571
Maximum,118450000
Range,118450000
Interquartile range,13500

0,1
Standard deviation,325830
Coef of variation,20.736
Kurtosis,58561
Mean,15713
MAD,20395
Skewness,212.54
Sum,7693500000
Variance,106160000000
Memory size,13.1 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,256915,15.0%,
4500.0,5182,0.3%,
13500.0,3147,0.2%,
22500.0,2502,0.1%,
9000.0,1725,0.1%,
18000.0,1605,0.1%,
45000.0,1593,0.1%,
27000.0,1252,0.1%,
2700.0,1208,0.1%,
6750.0,1164,0.1%,

Value,Count,Frequency (%),Unnamed: 3
0.0,256915,15.0%,
0.045,62,0.0%,
0.315,1,0.0%,
0.45,75,0.0%,
1.395,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
56844981.0,1,0.0%,
57476227.5,1,0.0%,
59586682.5,1,0.0%,
90632371.5,1,0.0%,
118453423.5,1,0.0%,

0,1
Distinct count,68252
Unique (%),4.0%
Missing (%),65.5%
Missing (n),1124488
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,3825.4
Minimum,0
Maximum,115990000
Zeros (%),27.4%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,0
Q3,0
95-th percentile,14220
Maximum,115990000
Range,115990000
Interquartile range,0

0,1
Standard deviation,206030
Coef of variation,53.859
Kurtosis,245700
Mean,3825.4
MAD,6325.2
Skewness,470.91
Sum,2264400000
Variance,42449000000
Memory size,13.1 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,470650,27.4%,
1440.0,688,0.0%,
225.0,405,0.0%,
45.0,377,0.0%,
4.5,315,0.0%,
90.0,222,0.0%,
4500.0,220,0.0%,
2700.0,192,0.0%,
9000.0,192,0.0%,
5400.0,189,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.0,470650,27.4%,
0.045,17,0.0%,
0.09,4,0.0%,
0.135,12,0.0%,
0.18,5,0.0%,

Value,Count,Frequency (%),Unnamed: 3
13975258.5,1,0.0%,
14111390.7,1,0.0%,
16950010.5,1,0.0%,
94812246.0,1,0.0%,
115987185.0,1,0.0%,

0,1
Distinct count,236709
Unique (%),13.8%
Missing (%),0.0%
Missing (n),13
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,354990
Minimum,0
Maximum,585000000
Zeros (%),3.9%

0,1
Minimum,0
5-th percentile,11250
Q1,51300
Median,125520
Q3,315000
95-th percentile,1350000
Maximum,585000000
Range,585000000
Interquartile range,263700

0,1
Standard deviation,1149800
Coef of variation,3.239
Kurtosis,49316
Mean,354990
MAD,381210
Skewness,124.59
Sum,609320000000
Variance,1322100000000
Memory size,13.1 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,66582,3.9%,
225000.0,57608,3.4%,
135000.0,50195,2.9%,
450000.0,37156,2.2%,
90000.0,36940,2.2%,
180000.0,28840,1.7%,
45000.0,26570,1.5%,
67500.0,25444,1.5%,
270000.0,22467,1.3%,
675000.0,20581,1.2%,

Value,Count,Frequency (%),Unnamed: 3
0.0,66582,3.9%,
0.45,80,0.0%,
2.565,1,0.0%,
4.5,546,0.0%,
9.0,10,0.0%,

Value,Count,Frequency (%),Unnamed: 3
146958507.0,1,0.0%,
164032200.0,1,0.0%,
170100000.0,1,0.0%,
396000000.0,1,0.0%,
585000000.0,1,0.0%,

0,1
Distinct count,226538
Unique (%),13.2%
Missing (%),15.0%
Missing (n),257669
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,137090
Minimum,-4705600
Maximum,170100000
Zeros (%),59.2%

0,1
Minimum,-4705600
5-th percentile,0
Q1,0
Median,0
Q3,40154
95-th percentile,628900
Maximum,170100000
Range,174810000
Interquartile range,40154

0,1
Standard deviation,677400
Coef of variation,4.9415
Kurtosis,5673.4
Mean,137090
MAD,212830
Skewness,36.415
Sum,199970000000
Variance,458870000000
Memory size,13.1 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,1016434,59.2%,
4.5,653,0.0%,
-450.0,543,0.0%,
135000.0,344,0.0%,
90000.0,320,0.0%,
45000.0,316,0.0%,
22500.0,307,0.0%,
67500.0,238,0.0%,
225000.0,237,0.0%,
13500.0,205,0.0%,

Value,Count,Frequency (%),Unnamed: 3
-4705600.32,1,0.0%,
-3109510.98,1,0.0%,
-2796723.72,1,0.0%,
-2273021.73,1,0.0%,
-2167229.34,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
62218953.0,1,0.0%,
64570243.5,1,0.0%,
65441403.0,1,0.0%,
164032200.0,1,0.0%,
170100000.0,1,0.0%,

0,1
Distinct count,51727
Unique (%),3.0%
Missing (%),34.5%
Missing (n),591780
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,6229.5
Minimum,-586410
Maximum,4705600
Zeros (%),61.2%

0,1
Minimum,-586410.0
5-th percentile,0.0
Q1,0.0
Median,0.0
Q3,0.0
95-th percentile,5736.1
Maximum,4705600.0
Range,5292000.0
Interquartile range,0.0

0,1
Standard deviation,45032
Coef of variation,7.2288
Kurtosis,796.1
Mean,6229.5
MAD,11788
Skewness,18.027
Sum,7006000000
Variance,2027900000
Memory size,13.1 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,1050142,61.2%,
135000.0,2178,0.1%,
4500.0,1474,0.1%,
45000.0,1335,0.1%,
90000.0,974,0.1%,
13500.0,833,0.0%,
22500.0,766,0.0%,
225000.0,757,0.0%,
67500.0,678,0.0%,
450000.0,558,0.0%,

Value,Count,Frequency (%),Unnamed: 3
-586406.115,1,0.0%,
-401346.945,1,0.0%,
-399166.875,1,0.0%,
-372598.245,1,0.0%,
-316391.895,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
3375000.0,2,0.0%,
3555065.655,1,0.0%,
4443255.0,1,0.0%,
4500000.0,2,0.0%,
4705600.32,1,0.0%,

0,1
Distinct count,1616
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,37.913
Minimum,0
Maximum,3756700
Zeros (%),99.8%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,0
Q3,0
95-th percentile,0
Maximum,3756700
Range,3756700
Interquartile range,0

0,1
Standard deviation,5937.7
Coef of variation,156.61
Kurtosis,211840
Mean,37.913
MAD,75.667
Skewness,403.24
Sum,65075000
Variance,35256000
Memory size,13.1 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,1712270,99.8%,
4.5,301,0.0%,
9.0,107,0.0%,
13.5,81,0.0%,
18.0,72,0.0%,
22.5,60,0.0%,
45.0,56,0.0%,
27.0,52,0.0%,
36.0,50,0.0%,
31.5,48,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.0,1712270,99.8%,
0.045,3,0.0%,
0.27,3,0.0%,
0.315,1,0.0%,
0.36,3,0.0%,

Value,Count,Frequency (%),Unnamed: 3
1617403.5,1,0.0%,
1851210.0,1,0.0%,
2387232.0,1,0.0%,
3681063.0,1,0.0%,
3756681.0,1,0.0%,

0,1
Distinct count,10
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.0064104
Minimum,0
Maximum,9
Zeros (%),99.5%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,0
Q3,0
95-th percentile,0
Maximum,9
Range,9
Interquartile range,0

0,1
Standard deviation,0.096224
Coef of variation,15.011
Kurtosis,615.44
Mean,0.0064104
MAD,0.012753
Skewness,20.319
Sum,11003
Variance,0.009259
Memory size,13.1 MiB

Value,Count,Frequency (%),Unnamed: 3
0,1707314,99.5%,
1,7620,0.4%,
2,1222,0.1%,
3,191,0.0%,
4,54,0.0%,
5,21,0.0%,
9,2,0.0%,
6,2,0.0%,
8,1,0.0%,
7,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0,1707314,99.5%,
1,7620,0.4%,
2,1222,0.1%,
3,191,0.0%,
4,54,0.0%,

Value,Count,Frequency (%),Unnamed: 3
5,21,0.0%,
6,2,0.0%,
7,1,0.0%,
8,1,0.0%,
9,2,0.0%,

0,1
Distinct count,4
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Closed,1079273
Active,630607
Sold,6527

Value,Count,Frequency (%),Unnamed: 3
Closed,1079273,62.9%,
Active,630607,36.7%,
Sold,6527,0.4%,
Bad debt,21,0.0%,

0,1
Distinct count,4
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
currency 1,1715020
currency 2,1224
currency 3,174

Value,Count,Frequency (%),Unnamed: 3
currency 1,1715020,99.9%,
currency 2,1224,0.1%,
currency 3,174,0.0%,
currency 4,10,0.0%,

0,1
Distinct count,942
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.81817
Minimum,0
Maximum,2792
Zeros (%),99.8%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,0
Q3,0
95-th percentile,0
Maximum,2792
Range,2792
Interquartile range,0

0,1
Standard deviation,36.544
Coef of variation,44.666
Kurtosis,3374.5
Mean,0.81817
MAD,1.6323
Skewness,55.931
Sum,1404324
Variance,1335.5
Memory size,13.1 MiB

Value,Count,Frequency (%),Unnamed: 3
0,1712211,99.8%,
30,311,0.0%,
60,126,0.0%,
13,103,0.0%,
8,103,0.0%,
9,93,0.0%,
7,92,0.0%,
14,91,0.0%,
17,77,0.0%,
11,75,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0,1712211,99.8%,
1,5,0.0%,
2,18,0.0%,
3,29,0.0%,
4,46,0.0%,

Value,Count,Frequency (%),Unnamed: 3
2766,1,0.0%,
2770,1,0.0%,
2776,1,0.0%,
2781,1,0.0%,
2792,1,0.0%,

0,1
Distinct count,15
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Consumer credit,1251615
Credit card,402195
Car loan,27690
Other values (12),34928

Value,Count,Frequency (%),Unnamed: 3
Consumer credit,1251615,72.9%,
Credit card,402195,23.4%,
Car loan,27690,1.6%,
Mortgage,18391,1.1%,
Microloan,12413,0.7%,
Loan for business development,1975,0.1%,
Another type of loan,1017,0.1%,
Unknown type of loan,555,0.0%,
Loan for working capital replenishment,469,0.0%,
Cash loan (non-earmarked),56,0.0%,

0,1
Distinct count,2923
Unique (%),0.2%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,-1142.1
Minimum,-2922
Maximum,0
Zeros (%),0.0%

0,1
Minimum,-2922
5-th percentile,-2665
Q1,-1666
Median,-987
Q3,-474
95-th percentile,-125
Maximum,0
Range,2922
Interquartile range,1192

0,1
Standard deviation,795.16
Coef of variation,-0.69623
Kurtosis,-0.73545
Mean,-1142.1
MAD,665.49
Skewness,-0.58235
Sum,-1960345609
Variance,632290
Memory size,13.1 MiB

Value,Count,Frequency (%),Unnamed: 3
-364,1330,0.1%,
-336,1248,0.1%,
-273,1238,0.1%,
-357,1218,0.1%,
-343,1203,0.1%,
-315,1202,0.1%,
-371,1196,0.1%,
-365,1194,0.1%,
-210,1193,0.1%,
-245,1192,0.1%,

Value,Count,Frequency (%),Unnamed: 3
-2922,278,0.0%,
-2921,283,0.0%,
-2920,317,0.0%,
-2919,344,0.0%,
-2918,329,0.0%,

Value,Count,Frequency (%),Unnamed: 3
-4,113,0.0%,
-3,74,0.0%,
-2,42,0.0%,
-1,17,0.0%,
0,25,0.0%,

0,1
Distinct count,14097
Unique (%),0.8%
Missing (%),6.1%
Missing (n),105553
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,510.52
Minimum,-42060
Maximum,31199
Zeros (%),0.1%

0,1
Minimum,-42060
5-th percentile,-2262
Q1,-1138
Median,-330
Q3,474
95-th percentile,2623
Maximum,31199
Range,73259
Interquartile range,1612

0,1
Standard deviation,4994.2
Coef of variation,9.7827
Kurtosis,28.18
Mean,510.52
MAD,2019.9
Skewness,5.1271
Sum,822380000
Variance,24942000
Memory size,13.1 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,883,0.1%,
3.0,845,0.0%,
-7.0,837,0.0%,
1.0,830,0.0%,
-14.0,787,0.0%,
-10.0,782,0.0%,
4.0,777,0.0%,
-2.0,772,0.0%,
-1.0,771,0.0%,
-42.0,768,0.0%,

Value,Count,Frequency (%),Unnamed: 3
-42060.0,1,0.0%,
-42056.0,1,0.0%,
-42042.0,3,0.0%,
-42041.0,1,0.0%,
-42013.0,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
31195.0,103,0.0%,
31196.0,50,0.0%,
31197.0,63,0.0%,
31198.0,89,0.0%,
31199.0,1,0.0%,

0,1
Distinct count,2982
Unique (%),0.2%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,-593.75
Minimum,-41947
Maximum,372
Zeros (%),0.0%

0,1
Minimum,-41947
5-th percentile,-2079
Q1,-908
Median,-395
Q3,-33
95-th percentile,-8
Maximum,372
Range,42319
Interquartile range,875

0,1
Standard deviation,720.75
Coef of variation,-1.2139
Kurtosis,596.37
Mean,-593.75
MAD,521.38
Skewness,-11.335
Sum,-1019126241
Variance,519480
Memory size,13.1 MiB

Value,Count,Frequency (%),Unnamed: 3
-7,18503,1.1%,
-8,18462,1.1%,
-11,16975,1.0%,
-15,16870,1.0%,
-12,16827,1.0%,
-10,16651,1.0%,
-9,16546,1.0%,
-13,16387,1.0%,
-6,16281,0.9%,
-14,16210,0.9%,

Value,Count,Frequency (%),Unnamed: 3
-41947,1,0.0%,
-41946,2,0.0%,
-41945,1,0.0%,
-41943,2,0.0%,
-41940,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
19,1,0.0%,
20,2,0.0%,
22,1,0.0%,
23,2,0.0%,
372,1,0.0%,

0,1
Distinct count,2918
Unique (%),0.2%
Missing (%),36.9%
Missing (n),633653
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,-1017.4
Minimum,-42023
Maximum,0
Zeros (%),0.0%

0,1
Minimum,-42023
5-th percentile,-2393
Q1,-1489
Median,-897
Q3,-425
95-th percentile,-94
Maximum,0
Range,42023
Interquartile range,1064

0,1
Standard deviation,714.01
Coef of variation,-0.70177
Kurtosis,9.4092
Mean,-1017.4
MAD,590.62
Skewness,-0.77475
Sum,-1101700000
Variance,509810
Memory size,13.1 MiB

Value,Count,Frequency (%),Unnamed: 3
-329.0,811,0.0%,
-273.0,794,0.0%,
-301.0,791,0.0%,
-91.0,785,0.0%,
-84.0,783,0.0%,
-154.0,783,0.0%,
-182.0,782,0.0%,
-210.0,778,0.0%,
-238.0,778,0.0%,
-350.0,773,0.0%,

Value,Count,Frequency (%),Unnamed: 3
-42023.0,1,0.0%,
-3042.0,1,0.0%,
-2922.0,1,0.0%,
-2919.0,1,0.0%,
-2917.0,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
-4.0,265,0.0%,
-3.0,223,0.0%,
-2.0,162,0.0%,
-1.0,217,0.0%,
0.0,64,0.0%,

0,1
Distinct count,1716428
Unique (%),100.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,5924400
Minimum,5000000
Maximum,6843457
Zeros (%),0.0%

0,1
Minimum,5000000
5-th percentile,5092400
Q1,5464000
Median,5926300
Q3,6385700
95-th percentile,6752000
Maximum,6843457
Range,1843457
Interquartile range,921730

0,1
Standard deviation,532270
Coef of variation,0.089842
Kurtosis,-1.199
Mean,5924400
MAD,460810
Skewness,-0.0074978
Sum,10168865241140
Variance,283310000000
Memory size,13.1 MiB

Value,Count,Frequency (%),Unnamed: 3
5000822,1,0.0%,
6547158,1,0.0%,
6487797,1,0.0%,
6481654,1,0.0%,
6483703,1,0.0%,
6461176,1,0.0%,
6463225,1,0.0%,
6457082,1,0.0%,
6459131,1,0.0%,
6469372,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
5000000,1,0.0%,
5000001,1,0.0%,
5000002,1,0.0%,
5000003,1,0.0%,
5000004,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
6843453,1,0.0%,
6843454,1,0.0%,
6843455,1,0.0%,
6843456,1,0.0%,
6843457,1,0.0%,

0,1
Distinct count,305811
Unique (%),17.8%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,278210
Minimum,100001
Maximum,456255
Zeros (%),0.0%

0,1
Minimum,100001
5-th percentile,117920
Q1,188870
Median,278060
Q3,367430
95-th percentile,438630
Maximum,456255
Range,356254
Interquartile range,178560

0,1
Standard deviation,102940
Coef of variation,0.37
Kurtosis,-1.2028
Mean,278210
MAD,89172
Skewness,0.0010629
Sum,477535902126
Variance,10596000000
Memory size,13.1 MiB

Value,Count,Frequency (%),Unnamed: 3
120860,116,0.0%,
169704,94,0.0%,
318065,78,0.0%,
251643,61,0.0%,
425396,60,0.0%,
295809,59,0.0%,
129843,58,0.0%,
385133,57,0.0%,
177014,56,0.0%,
325354,55,0.0%,

Value,Count,Frequency (%),Unnamed: 3
100001,7,0.0%,
100002,8,0.0%,
100003,4,0.0%,
100004,2,0.0%,
100005,3,0.0%,

Value,Count,Frequency (%),Unnamed: 3
456249,13,0.0%,
456250,3,0.0%,
456253,4,0.0%,
456254,1,0.0%,
456255,11,0.0%,

Unnamed: 0,SK_ID_CURR,SK_ID_BUREAU,CREDIT_ACTIVE,CREDIT_CURRENCY,DAYS_CREDIT,CREDIT_DAY_OVERDUE,DAYS_CREDIT_ENDDATE,DAYS_ENDDATE_FACT,AMT_CREDIT_MAX_OVERDUE,CNT_CREDIT_PROLONG,AMT_CREDIT_SUM,AMT_CREDIT_SUM_DEBT,AMT_CREDIT_SUM_LIMIT,AMT_CREDIT_SUM_OVERDUE,CREDIT_TYPE,DAYS_CREDIT_UPDATE,AMT_ANNUITY
0,215354,5714462,Closed,currency 1,-497,0,-153.0,-153.0,,0,91323.0,0.0,,0.0,Consumer credit,-131,
1,215354,5714463,Active,currency 1,-208,0,1075.0,,,0,225000.0,171342.0,,0.0,Credit card,-20,
2,215354,5714464,Active,currency 1,-203,0,528.0,,,0,464323.5,,,0.0,Consumer credit,-16,
3,215354,5714465,Active,currency 1,-203,0,,,,0,90000.0,,,0.0,Credit card,-16,
4,215354,5714466,Active,currency 1,-629,0,1197.0,,77674.5,0,2700000.0,,,0.0,Consumer credit,-21,
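
The report lists 1716428 rows but only 305811 distinct `SK_ID_CURR` values, so `bureau` holds several rows per applicant and would need to be aggregated before it can be joined to the application table. Below is a minimal sketch of such an aggregation. The function name `aggregate_bureau` and the output column names are hypothetical. The demo frame reuses the five sample rows shown above (a subset of columns) plus one made-up row for a second client.

```python
import numpy as np
import pandas as pd

# Five rows copied from the sample above, plus one invented row
# (SK_ID_CURR 999999) so that the result has more than one group.
bureau_demo = pd.DataFrame({
    "SK_ID_CURR": [215354, 215354, 215354, 215354, 215354, 999999],
    "SK_ID_BUREAU": [5714462, 5714463, 5714464, 5714465, 5714466, 1],
    "CREDIT_ACTIVE": ["Closed", "Active", "Active", "Active", "Active", "Closed"],
    "AMT_CREDIT_SUM": [91323.0, 225000.0, 464323.5, 90000.0, 2700000.0, 50000.0],
    "AMT_CREDIT_SUM_DEBT": [0.0, 171342.0, np.nan, np.nan, np.nan, 0.0],
})


def aggregate_bureau(bureau):
    """Collapse bureau rows to one row per SK_ID_CURR."""
    is_active = bureau["CREDIT_ACTIVE"].eq("Active")
    grouped = bureau.assign(IS_ACTIVE=is_active).groupby("SK_ID_CURR")
    return grouped.agg(
        BUREAU_CREDIT_COUNT=("SK_ID_BUREAU", "count"),
        BUREAU_ACTIVE_COUNT=("IS_ACTIVE", "sum"),
        BUREAU_CREDIT_SUM=("AMT_CREDIT_SUM", "sum"),
        # Missing debt values are skipped by sum(), not treated as errors.
        BUREAU_DEBT_SUM=("AMT_CREDIT_SUM_DEBT", "sum"),
    ).reset_index()


features = aggregate_bureau(bureau_demo)
print(features)
```

Which aggregates are actually useful is an open modelling question. The heavy missingness reported above for several amount columns (up to 71.5%) suggests that counts of non-missing values may be worth keeping alongside the sums.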


In [17]:
pandas_profiling.ProfileReport(bureau_balance)

0,1
Number of variables,3
Number of observations,27299925
Total Missing (%),0.0%
Total size in memory,624.8 MiB
Average record size in memory,24.0 B

0,1
Numeric,2
Categorical,1
Boolean,0
Date,0
Text (Unique),0
Rejected,0
Unsupported,0

0,1
Distinct count,97
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,-30.742
Minimum,-96
Maximum,0
Zeros (%),2.2%

0,1
Minimum,-96
5-th percentile,-79
Q1,-46
Median,-25
Q3,-11
95-th percentile,-2
Maximum,0
Range,96
Interquartile range,35

0,1
Standard deviation,23.865
Coef of variation,-0.77629
Kurtosis,-0.31614
Mean,-30.742
MAD,19.71
Skewness,-0.76069
Sum,-839245740
Variance,569.51
Memory size,208.3 MiB

Value,Count,Frequency (%),Unnamed: 3
-1,622601,2.3%,
-2,619243,2.3%,
-3,615080,2.3%,
0,610965,2.2%,
-4,609138,2.2%,
-5,602663,2.2%,
-6,594277,2.2%,
-7,583794,2.1%,
-8,573566,2.1%,
-9,563804,2.1%,

Value,Count,Frequency (%),Unnamed: 3
-96,43147,0.2%,
-95,46542,0.2%,
-94,49965,0.2%,
-93,53535,0.2%,
-92,57300,0.2%,

Value,Count,Frequency (%),Unnamed: 3
-4,609138,2.2%,
-3,615080,2.3%,
-2,619243,2.3%,
-1,622601,2.3%,
0,610965,2.2%,

0,1
Distinct count,817395
Unique (%),3.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,6036300
Minimum,5001709
Maximum,6842888
Zeros (%),0.0%

0,1
Minimum,5001709
5-th percentile,5113200
Q1,5730900
Median,6070800
Q3,6432000
95-th percentile,6759800
Maximum,6842888
Range,1841179
Interquartile range,701020

0,1
Standard deviation,492350
Coef of variation,0.081565
Kurtosis,-0.73797
Mean,6036300
MAD,403800
Skewness,-0.37219
Sum,164790464467878
Variance,242410000000
Memory size,208.3 MiB

Value,Count,Frequency (%),Unnamed: 3
6104966,97,0.0%,
5760763,97,0.0%,
5568354,97,0.0%,
6365230,97,0.0%,
6635521,97,0.0%,
6430797,97,0.0%,
5408505,97,0.0%,
5941084,97,0.0%,
5646028,97,0.0%,
5564052,97,0.0%,

Value,Count,Frequency (%),Unnamed: 3
5001709,97,0.0%,
5001710,83,0.0%,
5001711,4,0.0%,
5001712,19,0.0%,
5001713,22,0.0%,

Value,Count,Frequency (%),Unnamed: 3
6842884,48,0.0%,
6842885,24,0.0%,
6842886,33,0.0%,
6842887,37,0.0%,
6842888,62,0.0%,

0,1
Distinct count,8
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
C,13646993
0,7499507
X,5810482
Other values (5),342943

Value,Count,Frequency (%),Unnamed: 3
C,13646993,50.0%,
0,7499507,27.5%,
X,5810482,21.3%,
1,242347,0.9%,
5,62406,0.2%,
2,23419,0.1%,
3,8924,0.0%,
4,5847,0.0%,

Unnamed: 0,SK_ID_BUREAU,MONTHS_BALANCE,STATUS
0,5715448,0,C
1,5715448,-1,C
2,5715448,-2,C
3,5715448,-3,C
4,5715448,-4,C
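
`bureau_balance` has 27299925 monthly rows for 817395 distinct `SK_ID_BUREAU` values, and its `STATUS` column mixes letters (`C`, `X`) with digits (`0` to `5`). The sketch below shows one possible way to reduce it to one row per credit. It assumes, following the competition's column description, that the digits are days-past-due buckets and that `C` and `X` mean closed and unknown. The function and output column names are hypothetical, and the second credit in the demo frame is invented.

```python
import pandas as pd

# First five rows from the sample above, plus three invented rows
# for a made-up credit (SK_ID_BUREAU 1) with digit statuses.
balance_demo = pd.DataFrame({
    "SK_ID_BUREAU": [5715448] * 5 + [1, 1, 1],
    "MONTHS_BALANCE": [0, -1, -2, -3, -4, 0, -1, -2],
    "STATUS": ["C"] * 5 + ["0", "2", "X"],
})


def summarize_balance(bb):
    """One row per SK_ID_BUREAU: months of history and worst DPD bucket."""
    # Digits "0".."5" become numbers; "C" and "X" become NaN.
    dpd = pd.to_numeric(bb["STATUS"], errors="coerce")
    out = bb.assign(DPD_BUCKET=dpd).groupby("SK_ID_BUREAU").agg(
        MONTHS_OBSERVED=("MONTHS_BALANCE", "count"),
        WORST_DPD_BUCKET=("DPD_BUCKET", "max"),
    )
    return out.reset_index()


summary = summarize_balance(balance_demo)
print(summary)
```

A credit whose history contains only `C` or `X` ends up with a missing `WORST_DPD_BUCKET`. Whether to fill that with 0 or keep it missing is a design choice left open here.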


In [18]:
pandas_profiling.ProfileReport(credit_card_balance)

0,1
Number of variables,23
Number of observations,3840312
Total Missing (%),5.8%
Total size in memory,673.9 MiB
Average record size in memory,184.0 B

0,1
Numeric,17
Categorical,1
Boolean,0
Date,0
Text (Unique),0
Rejected,5
Unsupported,0

0,1
Distinct count,1347904
Unique (%),35.1%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,58300
Minimum,-420250
Maximum,1505900
Zeros (%),56.2%

0,1
Minimum,-420250
5-th percentile,0
Q1,0
Median,0
Q3,89047
95-th percentile,257180
Maximum,1505900
Range,1926200
Interquartile range,89047

0,1
Standard deviation,106310
Coef of variation,1.8234
Kurtosis,11.779
Mean,58300
MAD,74044
Skewness,2.9202
Sum,223890000000
Variance,11301000000
Memory size,29.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,2156420,56.2%,
67.5,16049,0.4%,
130.5,3662,0.1%,
270.0,2313,0.1%,
135.0,921,0.0%,
202.5,742,0.0%,
450.0,536,0.0%,
1345.5,312,0.0%,
92043.0,252,0.0%,
46570.5,235,0.0%,

Value,Count,Frequency (%),Unnamed: 3
-420250.185,1,0.0%,
-261471.015,1,0.0%,
-259848.945,1,0.0%,
-240305.985,1,0.0%,
-223224.21,2,0.0%,

Value,Count,Frequency (%),Unnamed: 3
1329264.63,1,0.0%,
1347979.5,1,0.0%,
1354613.265,1,0.0%,
1354829.265,1,0.0%,
1505902.185,1,0.0%,

0,1
Distinct count,181
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,153810
Minimum,0
Maximum,1350000
Zeros (%),19.6%

0,1
Minimum,0
5-th percentile,0
Q1,45000
Median,112500
Q3,180000
95-th percentile,450000
Maximum,1350000
Range,1350000
Interquartile range,135000

0,1
Standard deviation,165150
Coef of variation,1.0737
Kurtosis,5.184
Mean,153810
MAD,113620
Skewness,2.0597
Sum,590670544500
Variance,27273000000
Memory size,29.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0,753823,19.6%,
180000,529292,13.8%,
135000,430142,11.2%,
45000,329932,8.6%,
90000,319996,8.3%,
67500,308866,8.0%,
450000,238197,6.2%,
112500,182429,4.8%,
225000,162053,4.2%,
270000,125935,3.3%,

Value,Count,Frequency (%),Unnamed: 3
0,753823,19.6%,
4500,1111,0.0%,
9000,2215,0.1%,
13500,2413,0.1%,
18000,735,0.0%,

Value,Count,Frequency (%),Unnamed: 3
900000,33583,0.9%,
1012500,7,0.0%,
1125000,22,0.0%,
1237500,46,0.0%,
1350000,145,0.0%,

0,1
Distinct count,2268
Unique (%),0.1%
Missing (%),19.5%
Missing (n),749816
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,5961.3
Minimum,-6827.3
Maximum,2115000
Zeros (%),69.4%

0,1
Minimum,-6827.3
5-th percentile,0.0
Q1,0.0
Median,0.0
Q3,0.0
95-th percentile,33750.0
Maximum,2115000.0
Range,2121800.0
Interquartile range,0.0

0,1
Standard deviation,28226
Coef of variation,4.7348
Kurtosis,164.93
Mean,5961.3
MAD,10471
Skewness,9.6648
Sum,18423000000
Variance,796690000
Memory size,29.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,2665718,69.4%,
4500.0,35851,0.9%,
9000.0,27726,0.7%,
45000.0,22946,0.6%,
2250.0,22854,0.6%,
22500.0,22676,0.6%,
13500.0,21198,0.6%,
6750.0,14712,0.4%,
18000.0,13318,0.3%,
90000.0,12034,0.3%,

Value,Count,Frequency (%),Unnamed: 3
-6827.31,1,0.0%,
0.0,2665718,69.4%,
45.0,17,0.0%,
90.0,9,0.0%,
135.0,9,0.0%,

Value,Count,Frequency (%),Unnamed: 3
1332000.0,1,0.0%,
1350000.0,8,0.0%,
1354500.0,1,0.0%,
1676250.0,1,0.0%,
2115000.0,1,0.0%,

0,1
Distinct count,187005
Unique (%),4.9%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,7433.4
Minimum,-6211.6
Maximum,2287100
Zeros (%),83.9%

0,1
Minimum,-6211.6
5-th percentile,0.0
Q1,0.0
Median,0.0
Q3,0.0
95-th percentile,45000.0
Maximum,2287100.0
Range,2293300.0
Interquartile range,0.0

0,1
Standard deviation,33846
Coef of variation,4.5533
Kurtosis,184.27
Mean,7433.4
MAD,12841
Skewness,10.066
Sum,28547000000
Variance,1145600000
Memory size,29.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,3223443,83.9%,
4500.0,30257,0.8%,
9000.0,22968,0.6%,
2250.0,20212,0.5%,
45000.0,18947,0.5%,
22500.0,18670,0.5%,
13500.0,16964,0.4%,
6750.0,12658,0.3%,
18000.0,10729,0.3%,
90000.0,10169,0.3%,

Value,Count,Frequency (%),Unnamed: 3
-6211.62,1,0.0%,
-1687.5,1,0.0%,
-519.57,1,0.0%,
0.0,3223443,83.9%,
0.045,15,0.0%,

Value,Count,Frequency (%),Unnamed: 3
1992949.2,1,0.0%,
2060030.16,1,0.0%,
2115000.0,1,0.0%,
2239274.16,1,0.0%,
2287098.315,1,0.0%,

0,1
Distinct count,1833
Unique (%),0.0%
Missing (%),19.5%
Missing (n),749816
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,288.17
Minimum,0
Maximum,1529800
Zeros (%),80.2%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,0
Q3,0
95-th percentile,0
Maximum,1529800
Range,1529800
Interquartile range,0

0,1
Standard deviation,8202
Coef of variation,28.462
Kurtosis,3628
Mean,288.17
MAD,574.04
Skewness,50.57
Sum,890590000
Variance,67273000
Memory size,29.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,3078163,80.2%,
3343.5,782,0.0%,
6682.5,460,0.0%,
4455.0,441,0.0%,
8910.0,319,0.0%,
46800.0,242,0.0%,
4680.0,235,0.0%,
9360.0,216,0.0%,
23400.0,209,0.0%,
5571.0,195,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.0,3078163,80.2%,
234.0,1,0.0%,
280.8,1,0.0%,
459.0,2,0.0%,
461.25,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
899437.5,1,0.0%,
899910.0,2,0.0%,
900000.0,2,0.0%,
1302750.0,1,0.0%,
1529847.0,1,0.0%,

0,1
Distinct count,168749
Unique (%),4.4%
Missing (%),19.5%
Missing (n),749816
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,2968.8
Minimum,0
Maximum,2239300
Zeros (%),73.6%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,0
Q3,0
95-th percentile,10116
Maximum,2239300
Range,2239300
Interquartile range,0

0,1
Standard deviation,20797
Coef of variation,7.0051
Kurtosis,713.99
Mean,2968.8
MAD,5470.3
Skewness,19.421
Sum,9175100000
Variance,432510000
Memory size,29.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,2825595,73.6%,
450.0,1287,0.0%,
900.0,976,0.0%,
2250.0,910,0.0%,
4500.0,801,0.0%,
1350.0,784,0.0%,
225.0,609,0.0%,
45000.0,462,0.0%,
1800.0,407,0.0%,
2700.0,379,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.0,2825595,73.6%,
0.045,1,0.0%,
0.225,1,0.0%,
0.45,1,0.0%,
0.495,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
1840173.57,1,0.0%,
1896559.2,1,0.0%,
1933462.125,1,0.0%,
2060030.16,1,0.0%,
2239274.16,1,0.0%,

0,1
Distinct count,312267
Unique (%),8.1%
Missing (%),7.9%
Missing (n),305236
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,3540.2
Minimum,0
Maximum,202880
Zeros (%),50.2%

0,1
Minimum,0.0
5-th percentile,0.0
Q1,0.0
Median,0.0
Q3,6633.9
95-th percentile,13500.0
Maximum,202880.0
Range,202880.0
Interquartile range,6633.9

0,1
Standard deviation,5600.2
Coef of variation,1.5819
Kurtosis,10.183
Mean,3540.2
MAD,4141.9
Skewness,2.4944
Sum,12515000000
Variance,31362000
Memory size,29.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,1928864,50.2%,
9000.0,225429,5.9%,
6750.0,147469,3.8%,
3375.0,127613,3.3%,
4500.0,124979,3.3%,
2250.0,108350,2.8%,
5625.0,82303,2.1%,
7875.0,53931,1.4%,
8100.0,34436,0.9%,
10800.0,28194,0.7%,

Value,Count,Frequency (%),Unnamed: 3
0.0,1928864,50.2%,
0.045,149,0.0%,
0.09,168,0.0%,
0.135,56,0.0%,
0.18,151,0.0%,

Value,Count,Frequency (%),Unnamed: 3
126728.145,1,0.0%,
136760.625,1,0.0%,
188976.6,1,0.0%,
194198.31,1,0.0%,
202882.005,1,0.0%,

0,1
Distinct count,163210
Unique (%),4.2%
Missing (%),20.0%
Missing (n),767988
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,10281
Minimum,0
Maximum,4289200
Zeros (%),10.2%

0,1
Minimum,0.0
5-th percentile,0.0
Q1,152.37
Median,2702.7
Q3,9000.0
95-th percentile,31500.0
Maximum,4289200.0
Range,4289200.0
Interquartile range,8847.6

0,1
Standard deviation,36078
Coef of variation,3.5094
Kurtosis,315.76
Mean,10281
MAD,12071
Skewness,12.991
Sum,31585000000
Variance,1301600000
Memory size,29.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,390507,10.2%,
9000.0,257297,6.7%,
4500.0,143572,3.7%,
6750.0,128296,3.3%,
13500.0,100454,2.6%,
22500.0,56016,1.5%,
11250.0,45497,1.2%,
3375.0,44600,1.2%,
18000.0,37897,1.0%,
5625.0,35591,0.9%,

Value,Count,Frequency (%),Unnamed: 3
0.0,390507,10.2%,
0.045,1585,0.0%,
0.09,1094,0.0%,
0.135,1364,0.0%,
0.18,1260,0.0%,

Value,Count,Frequency (%),Unnamed: 3
2115000.0,1,0.0%,
2287098.315,1,0.0%,
2436495.255,1,0.0%,
2497500.0,1,0.0%,
4289207.445,1,0.0%,

0,1
Correlation,0.99476

0,1
Correlation,0.99972

0,1
Correlation,0.99973

0,1
Correlation,0.99999

0,1
Distinct count,45
Unique (%),0.0%
Missing (%),19.5%
Missing (n),749816
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.30945
Minimum,0
Maximum,51
Zeros (%),69.4%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,0
Q3,0
95-th percentile,2
Maximum,51
Range,51
Interquartile range,0

0,1
Standard deviation,1.1004
Coef of variation,3.556
Kurtosis,81.549
Mean,0.30945
MAD,0.53383
Skewness,6.9067
Sum,956350
Variance,1.2109
Memory size,29.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,2665718,69.4%,
1.0,213460,5.6%,
2.0,95647,2.5%,
3.0,46730,1.2%,
4.0,26335,0.7%,
5.0,14910,0.4%,
6.0,9179,0.2%,
7.0,5718,0.1%,
8.0,3992,0.1%,
9.0,2427,0.1%,

Value,Count,Frequency (%),Unnamed: 3
0.0,2665718,69.4%,
1.0,213460,5.6%,
2.0,95647,2.5%,
3.0,46730,1.2%,
4.0,26335,0.7%,

Value,Count,Frequency (%),Unnamed: 3
39.0,2,0.0%,
41.0,2,0.0%,
43.0,1,0.0%,
44.0,1,0.0%,
51.0,1,0.0%,

0,1
Distinct count,129
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.70314
Minimum,0
Maximum,165
Zeros (%),84.1%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,0
Q3,0
95-th percentile,4
Maximum,165
Range,165
Interquartile range,0

0,1
Standard deviation,3.1903
Coef of variation,4.5373
Kurtosis,177.93
Mean,0.70314
MAD,1.1828
Skewness,10.635
Sum,2700292
Variance,10.178
Memory size,29.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0,3229952,84.1%,
1,231319,6.0%,
2,116762,3.0%,
3,65318,1.7%,
4,43290,1.1%,
5,28987,0.8%,
6,20768,0.5%,
7,15872,0.4%,
8,12345,0.3%,
9,9534,0.2%,

Value,Count,Frequency (%),Unnamed: 3
0,3229952,84.1%,
1,231319,6.0%,
2,116762,3.0%,
3,65318,1.7%,
4,43290,1.1%,

Value,Count,Frequency (%),Unnamed: 3
142,2,0.0%,
143,1,0.0%,
151,1,0.0%,
162,2,0.0%,
165,1,0.0%,

0,1
Distinct count,12
Unique (%),0.0%
Missing (%),19.5%
Missing (n),749816
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.0048125
Minimum,0
Maximum,12
Zeros (%),80.1%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,0
Q3,0
95-th percentile,0
Maximum,12
Range,12
Interquartile range,0

0,1
Standard deviation,0.082639
Coef of variation,17.172
Kurtosis,1253.3
Mean,0.0048125
MAD,0.0095851
Skewness,26.324
Sum,14873
Variance,0.0068291
Memory size,29.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,3077688,80.1%,
1.0,11354,0.3%,
2.0,1076,0.0%,
3.0,259,0.0%,
4.0,65,0.0%,
5.0,27,0.0%,
6.0,11,0.0%,
7.0,9,0.0%,
10.0,3,0.0%,
8.0,3,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.0,3077688,80.1%,
1.0,11354,0.3%,
2.0,1076,0.0%,
3.0,259,0.0%,
4.0,65,0.0%,

Value,Count,Frequency (%),Unnamed: 3
6.0,11,0.0%,
7.0,9,0.0%,
8.0,3,0.0%,
10.0,3,0.0%,
12.0,1,0.0%,

0,1
Correlation,0.95055

0,1
Distinct count,122
Unique (%),0.0%
Missing (%),7.9%
Missing (n),305236
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,20.825
Minimum,0
Maximum,120
Zeros (%),14.4%

0,1
Minimum,0
5-th percentile,0
Q1,4
Median,15
Q3,32
95-th percentile,61
Maximum,120
Range,120
Interquartile range,28

0,1
Standard deviation,20.051
Coef of variation,0.96285
Kurtosis,0.64033
Mean,20.825
MAD,16.22
Skewness,1.0756
Sum,73618000
Variance,402.06
Memory size,29.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,551467,14.4%,
5.0,91691,2.4%,
2.0,91035,2.4%,
4.0,89941,2.3%,
6.0,89829,2.3%,
3.0,89492,2.3%,
1.0,87306,2.3%,
7.0,87177,2.3%,
8.0,86882,2.3%,
9.0,81423,2.1%,

Value,Count,Frequency (%),Unnamed: 3
0.0,551467,14.4%,
1.0,87306,2.3%,
2.0,91035,2.4%,
3.0,89492,2.3%,
4.0,89941,2.3%,

Value,Count,Frequency (%),Unnamed: 3
116.0,6,0.0%,
117.0,4,0.0%,
118.0,3,0.0%,
119.0,2,0.0%,
120.0,1,0.0%,

0,1
Distinct count,96
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,-34.522
Minimum,-96
Maximum,-1
Zeros (%),0.0%

0,1
Minimum,-96
5-th percentile,-84
Q1,-55
Median,-28
Q3,-11
95-th percentile,-3
Maximum,-1
Range,95
Interquartile range,44

0,1
Standard deviation,26.668
Coef of variation,-0.77249
Kurtosis,-0.85615
Mean,-34.522
MAD,22.885
Skewness,-0.59804
Sum,-132574946
Variance,711.17
Memory size,29.3 MiB

Value,Count,Frequency (%),Unnamed: 3
-4,102115,2.7%,
-5,100546,2.6%,
-3,100355,2.6%,
-6,98577,2.6%,
-7,95332,2.5%,
-2,94643,2.5%,
-8,91419,2.4%,
-9,86842,2.3%,
-10,82525,2.1%,
-11,78441,2.0%,

Value,Count,Frequency (%),Unnamed: 3
-96,11722,0.3%,
-95,12521,0.3%,
-94,13397,0.3%,
-93,14197,0.4%,
-92,14911,0.4%,

Value,Count,Frequency (%),Unnamed: 3
-5,100546,2.6%,
-4,102115,2.7%,
-3,100355,2.6%,
-2,94643,2.5%,
-1,62356,1.6%,

0,1
Distinct count,7
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Active,3698436
Completed,128918
Signed,11058
Other values (4),1900

Value,Count,Frequency (%),Unnamed: 3
Active,3698436,96.3%,
Completed,128918,3.4%,
Signed,11058,0.3%,
Demand,1365,0.0%,
Sent proposal,513,0.0%,
Refused,17,0.0%,
Approved,5,0.0%,

0,1
Distinct count,917
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,9.2837
Minimum,0
Maximum,3260
Zeros (%),96.0%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,0
Q3,0
95-th percentile,0
Maximum,3260
Range,3260
Interquartile range,0

0,1
Standard deviation,97.516
Coef of variation,10.504
Kurtosis,190.37
Mean,9.2837
MAD,18.221
Skewness,12.947
Sum,35652179
Variance,9509.3
Memory size,29.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0,3686957,96.0%,
1,90369,2.4%,
8,2772,0.1%,
32,2340,0.1%,
7,1797,0.0%,
62,1406,0.0%,
31,1290,0.0%,
366,959,0.0%,
93,899,0.0%,
18,837,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0,3686957,96.0%,
1,90369,2.4%,
5,532,0.0%,
6,99,0.0%,
7,1797,0.0%,

Value,Count,Frequency (%),Unnamed: 3
3137,1,0.0%,
3168,1,0.0%,
3198,1,0.0%,
3229,1,0.0%,
3260,1,0.0%,

0,1
Distinct count,378
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.33162
Minimum,0
Maximum,3260
Zeros (%),97.7%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,0
Q3,0
95-th percentile,0
Maximum,3260
Range,3260
Interquartile range,0

0,1
Standard deviation,21.479
Coef of variation,64.77
Kurtosis,9007.7
Mean,0.33162
MAD,0.64781
Skewness,89.83
Sum,1273532
Variance,461.36
Memory size,29.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0,3750972,97.7%,
1,83080,2.2%,
8,1729,0.0%,
7,1176,0.0%,
32,492,0.0%,
5,406,0.0%,
31,291,0.0%,
18,253,0.0%,
12,125,0.0%,
62,114,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0,3750972,97.7%,
1,83080,2.2%,
5,406,0.0%,
6,83,0.0%,
7,1176,0.0%,

Value,Count,Frequency (%),Unnamed: 3
3137,1,0.0%,
3168,1,0.0%,
3198,1,0.0%,
3229,1,0.0%,
3260,1,0.0%,

0,1
Distinct count,103558
Unique (%),2.7%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,278320
Minimum,100006
Maximum,456250
Zeros (%),0.0%

0,1
Minimum,100006
5-th percentile,118200
Q1,189520
Median,278400
Q3,367580
95-th percentile,438160
Maximum,456250
Range,356244
Interquartile range,178060

0,1
Standard deviation,102700
Coef of variation,0.36901
Kurtosis,-1.1995
Mean,278320
MAD,88987
Skewness,-0.0018338
Sum,1068851793142
Variance,10548000000
Memory size,29.3 MiB

Value,Count,Frequency (%),Unnamed: 3
186401,192,0.0%,
311118,178,0.0%,
120076,140,0.0%,
128827,129,0.0%,
246089,128,0.0%,
191826,128,0.0%,
432607,128,0.0%,
173773,127,0.0%,
264667,127,0.0%,
116448,127,0.0%,

Value,Count,Frequency (%),Unnamed: 3
100006,6,0.0%,
100011,74,0.0%,
100013,96,0.0%,
100021,17,0.0%,
100023,8,0.0%,

Value,Count,Frequency (%),Unnamed: 3
456244,41,0.0%,
456246,8,0.0%,
456247,95,0.0%,
456248,23,0.0%,
456250,12,0.0%,

0,1
Distinct count,104307
Unique (%),2.7%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,1904500
Minimum,1000018
Maximum,2843496
Zeros (%),0.0%

0,1
Minimum,1000018
5-th percentile,1082600
Q1,1434400
Median,1897100
Q3,2369300
95-th percentile,2748800
Maximum,2843496
Range,1843478
Interquartile range,934940

0,1
Standard deviation,536470
Coef of variation,0.28168
Kurtosis,-1.2196
Mean,1904500
MAD,466510
Skewness,0.038385
Sum,7313887990335
Variance,287800000000
Memory size,29.3 MiB

Value,Count,Frequency (%),Unnamed: 3
1009171,96,0.0%,
1348858,96,0.0%,
1745395,96,0.0%,
2526035,96,0.0%,
1567893,96,0.0%,
2044225,96,0.0%,
1226992,96,0.0%,
2360398,96,0.0%,
1198318,96,0.0%,
2280596,96,0.0%,

Value,Count,Frequency (%),Unnamed: 3
1000018,5,0.0%,
1000030,8,0.0%,
1000031,16,0.0%,
1000035,5,0.0%,
1000077,11,0.0%,

Value,Count,Frequency (%),Unnamed: 3
2843476,95,0.0%,
2843477,85,0.0%,
2843478,90,0.0%,
2843493,15,0.0%,
2843496,15,0.0%,

Unnamed: 0,SK_ID_PREV,SK_ID_CURR,MONTHS_BALANCE,AMT_BALANCE,AMT_CREDIT_LIMIT_ACTUAL,AMT_DRAWINGS_ATM_CURRENT,AMT_DRAWINGS_CURRENT,AMT_DRAWINGS_OTHER_CURRENT,AMT_DRAWINGS_POS_CURRENT,AMT_INST_MIN_REGULARITY,AMT_PAYMENT_CURRENT,AMT_PAYMENT_TOTAL_CURRENT,AMT_RECEIVABLE_PRINCIPAL,AMT_RECIVABLE,AMT_TOTAL_RECEIVABLE,CNT_DRAWINGS_ATM_CURRENT,CNT_DRAWINGS_CURRENT,CNT_DRAWINGS_OTHER_CURRENT,CNT_DRAWINGS_POS_CURRENT,CNT_INSTALMENT_MATURE_CUM,NAME_CONTRACT_STATUS,SK_DPD,SK_DPD_DEF
0,2562384,378907,-6,56.97,135000,0.0,877.5,0.0,877.5,1700.325,1800.0,1800.0,0.0,0.0,0.0,0.0,1,0.0,1.0,35.0,Active,0,0
1,2582071,363914,-1,63975.555,45000,2250.0,2250.0,0.0,0.0,2250.0,2250.0,2250.0,60175.08,64875.555,64875.555,1.0,1,0.0,0.0,69.0,Active,0,0
2,1740877,371185,-7,31815.225,450000,0.0,0.0,0.0,0.0,2250.0,2250.0,2250.0,26926.425,31460.085,31460.085,0.0,0,0.0,0.0,30.0,Active,0,0
3,1389973,337855,-4,236572.11,225000,2250.0,2250.0,0.0,0.0,11795.76,11925.0,11925.0,224949.285,233048.97,233048.97,1.0,1,0.0,0.0,10.0,Active,0,0
4,1891521,126868,-1,453919.455,450000,0.0,11547.0,0.0,11547.0,22924.89,27000.0,27000.0,443044.395,453919.455,453919.455,0.0,1,0.0,1.0,101.0,Active,0,0
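
Several variables in the report above appear only as a `Correlation` value (for example 0.99999), meaning the profiler set them aside as near-duplicates of another column. A minimal sketch of how such pairs could be listed with plain pandas is shown below. The helper name `highly_correlated_pairs` and the 0.9 cutoff are assumptions of this sketch, and the toy frame reuses a few columns from the five sample rows above rather than the full table.

```python
import numpy as np
import pandas as pd


def highly_correlated_pairs(df, threshold=0.9):
    """Return (col_a, col_b, corr) for numeric column pairs with |corr| > threshold."""
    corr = df.select_dtypes(include="number").corr()
    # Keep only the strict upper triangle so each pair is reported once.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(mask).stack()
    pairs = pairs[pairs.abs() > threshold]
    return [(a, b, float(c)) for (a, b), c in pairs.items()]


# Toy data: values copied from the sample rows of credit_card_balance.
sample = pd.DataFrame({
    "AMT_BALANCE": [56.97, 63975.555, 31815.225, 236572.11, 453919.455],
    "AMT_RECIVABLE": [0.0, 64875.555, 31460.085, 233048.97, 453919.455],
    "AMT_TOTAL_RECEIVABLE": [0.0, 64875.555, 31460.085, 233048.97, 453919.455],
    "MONTHS_BALANCE": [-6, -1, -7, -4, -1],
})

pairs = highly_correlated_pairs(sample)
for col_a, col_b, value in pairs:
    print(f"{col_a} ~ {col_b}: {value:.5f}")
```

On the full table the same call would be `highly_correlated_pairs(credit_card_balance)`. With only five rows the numbers are illustrative and will not match the report.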


In [48]:
pandas_profiling.ProfileReport(columns_description)

0,1
Number of variables,5
Number of observations,219
Total Missing (%),12.1%
Total size in memory,8.6 KiB
Average record size in memory,40.4 B

0,1
Numeric,1
Categorical,4
Boolean,0
Date,0
Text (Unique),0
Rejected,0
Unsupported,0

0,1
Distinct count,163
Unique (%),74.4%
Missing (%),0.0%
Missing (n),0

0,1
"Normalized information about building where the client lives, What is average (_AVG suffix), modus (_MODE suffix), median (_MEDI suffix) apartment size, common area, living area, age of building, number of elevators, number of entrances, state of the building, number of floor",47
ID of loan in our sample,5
Normalized score from external data source,3
Other values (160),164

Value,Count,Frequency (%),Unnamed: 3
"Normalized information about building where the client lives, What is average (_AVG suffix), modus (_MODE suffix), median (_MEDI suffix) apartment size, common area, living area, age of building, number of elevators, number of entrances, state of the building, number of floor",47,21.5%,
ID of loan in our sample,5,2.3%,
Normalized score from external data source,3,1.4%,
"Did client provide home phone (1=YES, 0=NO)",2,0.9%,
"ID of previous credit in Home credit related to loan in our sample. (One loan in our sample can have 0,1,2 or more previous loans in Home Credit)",2,0.9%,
Month of balance relative to application date (-1 means the freshest balance date),2,0.9%,
Interest rate normalized on previous credit,2,0.9%,
"ID of loan in our sample - one loan in our sample can have 0,1,2 or more related previous credits in credit bureau",1,0.5%,
Annuity of the Credit Bureau credit,1,0.5%,
For how much credit did client ask on the previous application,1,0.5%,

0,1
Distinct count,196
Unique (%),89.5%
Missing (%),0.0%
Missing (n),0

0,1
SK_ID_CURR,6
SK_ID_PREV,4
NAME_CONTRACT_STATUS,3
Other values (193),206

Value,Count,Frequency (%),Unnamed: 3
SK_ID_CURR,6,2.7%,
SK_ID_PREV,4,1.8%,
NAME_CONTRACT_STATUS,3,1.4%,
MONTHS_BALANCE,3,1.4%,
AMT_ANNUITY,3,1.4%,
AMT_CREDIT,2,0.9%,
NAME_TYPE_SUITE,2,0.9%,
SK_DPD,2,0.9%,
SK_BUREAU_ID,2,0.9%,
SK_DPD_DEF,2,0.9%,

0,1
Distinct count,8
Unique (%),3.7%
Missing (%),60.7%
Missing (n),133

0,1
normalized,53
time only relative to the application,19
hashed,9
Other values (4),5
(Missing),133

Value,Count,Frequency (%),Unnamed: 3
normalized,53,24.2%,
time only relative to the application,19,8.7%,
hashed,9,4.1%,
rounded,2,0.9%,
normalized,1,0.5%,
recoded,1,0.5%,
grouped,1,0.5%,
(Missing),133,60.7%,

0,1
Distinct count,7
Unique (%),3.2%
Missing (%),0.0%
Missing (n),0

0,1
application_{train|test}.csv,122
previous_application.csv,38
credit_card_balance.csv,23
Other values (4),36

Value,Count,Frequency (%),Unnamed: 3
application_{train|test}.csv,122,55.7%,
previous_application.csv,38,17.4%,
credit_card_balance.csv,23,10.5%,
bureau.csv,17,7.8%,
installments_payments.csv,8,3.7%,
POS_CASH_balance.csv,8,3.7%,
bureau_balance.csv,3,1.4%,

0,1
Distinct count,219
Unique (%),100.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,111.98
Minimum,1
Maximum,221
Zeros (%),0.0%

0,1
Minimum,1.0
5-th percentile,13.9
Q1,57.5
Median,112.0
Q3,166.5
95-th percentile,210.1
Maximum,221.0
Range,220.0
Interquartile range,109.0

0,1
Standard deviation,63.396
Coef of variation,0.56613
Kurtosis,-1.1976
Mean,111.98
MAD,54.767
Skewness,-0.0017421
Sum,24524
Variance,4019
Memory size,1.8 KiB

Value,Count,Frequency (%),Unnamed: 3
221,1,0.5%,
71,1,0.5%,
82,1,0.5%,
81,1,0.5%,
80,1,0.5%,
79,1,0.5%,
78,1,0.5%,
77,1,0.5%,
76,1,0.5%,
75,1,0.5%,

Value,Count,Frequency (%),Unnamed: 3
1,1,0.5%,
2,1,0.5%,
5,1,0.5%,
6,1,0.5%,
7,1,0.5%,

Value,Count,Frequency (%),Unnamed: 3
217,1,0.5%,
218,1,0.5%,
219,1,0.5%,
220,1,0.5%,
221,1,0.5%,

Unnamed: 0,Table,Row,Description,Special
1,application_{train|test}.csv,SK_ID_CURR,ID of loan in our sample,
2,application_{train|test}.csv,TARGET,"Target variable (1 - client with payment difficulties: he/she had late payment more than X days on at least one of the first Y installments of the loan in our sample, 0 - all other cases)",
5,application_{train|test}.csv,NAME_CONTRACT_TYPE,Identification if loan is cash or revolving,
6,application_{train|test}.csv,CODE_GENDER,Gender of the client,
7,application_{train|test}.csv,FLAG_OWN_CAR,Flag if the client owns a car,
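
The description table is most useful as a lookup: given a column name, return what it means and which file it belongs to. The sketch below assumes the frame has the `Table`, `Row`, `Description` and `Special` columns shown in the sample rows above. The helper name `describe_column` is hypothetical, and the toy frame holds only three of the 219 rows.

```python
import pandas as pd


def describe_column(columns_description, column, table=None):
    """Return the Table/Description rows for a column name, optionally for one table."""
    match = columns_description[columns_description["Row"] == column]
    if table is not None:
        match = match[match["Table"] == table]
    return match[["Table", "Description"]].reset_index(drop=True)


# Toy subset copied from the sample rows of the description table.
columns_description = pd.DataFrame({
    "Table": ["application_{train|test}.csv"] * 3,
    "Row": ["SK_ID_CURR", "NAME_CONTRACT_TYPE", "CODE_GENDER"],
    "Description": [
        "ID of loan in our sample",
        "Identification if loan is cash or revolving",
        "Gender of the client",
    ],
    "Special": [None, None, None],
})

result = describe_column(columns_description, "CODE_GENDER")
print(result)
```

Names such as `SK_ID_CURR` occur in several tables (six times according to the report above), so the optional `table` argument narrows the result to one file.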


In [20]:
pandas_profiling.ProfileReport(installments_payments)

0,1
Number of variables,8
Number of observations,13605401
Total Missing (%),0.0%
Total size in memory,830.4 MiB
Average record size in memory,64.0 B

0,1
Numeric,6
Categorical,0
Boolean,0
Date,0
Text (Unique),0
Rejected,2
Unsupported,0

0,1
Distinct count,902539
Unique (%),6.6%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,17051
Minimum,0
Maximum,3771500
Zeros (%),0.0%

0,1
Minimum,0.0
5-th percentile,188.15
Q1,4226.1
Median,8884.1
Q3,16710.0
95-th percentile,47041.0
Maximum,3771500.0
Range,3771500.0
Interquartile range,12484.0

0,1
Standard deviation,50570
Coef of variation,2.9658
Kurtosis,388.84
Mean,17051
MAD,15518
Skewness,16.236
Sum,231980000000
Variance,2557400000
Memory size,103.8 MiB

Value,Count,Frequency (%),Unnamed: 3
9000.0,254062,1.9%,
2250.0,179120,1.3%,
4500.0,174143,1.3%,
6750.0,173659,1.3%,
3375.0,149941,1.1%,
5625.0,96362,0.7%,
7875.0,60248,0.4%,
1125.0,60224,0.4%,
13500.0,42926,0.3%,
8100.0,37295,0.3%,

Value,Count,Frequency (%),Unnamed: 3
0.0,290,0.0%,
0.045,2059,0.0%,
0.09,1508,0.0%,
0.135,1680,0.0%,
0.18,1644,0.0%,

Value,Count,Frequency (%),Unnamed: 3
3202061.805,1,0.0%,
3371884.155,1,0.0%,
3436835.13,1,0.0%,
3473582.895,1,0.0%,
3771487.845,1,0.0%,

0,1
Correlation,0.93719

0,1
Correlation,0.99949

0,1
Distinct count,2922
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,-1042.3
Minimum,-2922
Maximum,-1
Zeros (%),0.0%

0,1
Minimum,-2922
5-th percentile,-2553
Q1,-1654
Median,-818
Q3,-361
95-th percentile,-81
Maximum,-1
Range,2921
Interquartile range,1293

0,1
Standard deviation,800.95
Coef of variation,-0.76846
Kurtosis,-0.79874
Mean,-1042.3
MAD,684.08
Skewness,-0.6287
Sum,-14181000000
Variance,641510
Memory size,103.8 MiB

Value,Count,Frequency (%),Unnamed: 3
-120.0,11512,0.1%,
-180.0,11212,0.1%,
-150.0,11194,0.1%,
-119.0,11183,0.1%,
-149.0,11144,0.1%,
-210.0,11140,0.1%,
-90.0,11135,0.1%,
-148.0,10922,0.1%,
-179.0,10838,0.1%,
-59.0,10828,0.1%,

Value,Count,Frequency (%),Unnamed: 3
-2922.0,1327,0.0%,
-2921.0,1436,0.0%,
-2920.0,1458,0.0%,
-2919.0,1485,0.0%,
-2918.0,1454,0.0%,

Value,Count,Frequency (%),Unnamed: 3
-5.0,6122,0.0%,
-4.0,6157,0.0%,
-3.0,5031,0.0%,
-2.0,661,0.0%,
-1.0,2,0.0%,

0,1
Distinct count,277
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,18.871
Minimum,1
Maximum,277
Zeros (%),0.0%

0,1
Minimum,1
5-th percentile,1
Q1,4
Median,8
Q3,19
95-th percentile,82
Maximum,277
Range,276
Interquartile range,15

0,1
Standard deviation,26.664
Coef of variation,1.413
Kurtosis,6.7051
Mean,18.871
MAD,18.074
Skewness,2.4976
Sum,256746105
Variance,710.97
Memory size,103.8 MiB

Value,Count,Frequency (%),Unnamed: 3
1,1004160,7.4%,
2,985716,7.2%,
3,968279,7.1%,
4,943502,6.9%,
5,880007,6.5%,
6,827973,6.1%,
7,679739,5.0%,
8,644708,4.7%,
9,592473,4.4%,
10,549140,4.0%,

Value,Count,Frequency (%),Unnamed: 3
1,1004160,7.4%,
2,985716,7.2%,
3,968279,7.1%,
4,943502,6.9%,
5,880007,6.5%,

Value,Count,Frequency (%),Unnamed: 3
273,2,0.0%,
274,1,0.0%,
275,2,0.0%,
276,1,0.0%,
277,1,0.0%,

0,1
Distinct count,65
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.85664
Minimum,0
Maximum,178
Zeros (%),30.0%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,1
Q3,1
95-th percentile,2
Maximum,178
Range,178
Interquartile range,1

0,1
Standard deviation,1.0352
Coef of variation,1.2085
Kurtosis,259.61
Mean,0.85664
MAD,0.51409
Skewness,9.5934
Sum,11655000
Variance,1.0717
Memory size,103.8 MiB

Value,Count,Frequency (%),Unnamed: 3
1.0,8485004,62.4%,
0.0,4082498,30.0%,
2.0,620283,4.6%,
3.0,237063,1.7%,
4.0,55274,0.4%,
5.0,48404,0.4%,
6.0,17092,0.1%,
7.0,16771,0.1%,
9.0,8359,0.1%,
8.0,7814,0.1%,

Value,Count,Frequency (%),Unnamed: 3
0.0,4082498,30.0%,
1.0,8485004,62.4%,
2.0,620283,4.6%,
3.0,237063,1.7%,
4.0,55274,0.4%,

Value,Count,Frequency (%),Unnamed: 3
61.0,8,0.0%,
68.0,1,0.0%,
72.0,7,0.0%,
73.0,1,0.0%,
178.0,1,0.0%,

0,1
Distinct count,339587
Unique (%),2.5%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,278440
Minimum,100001
Maximum,456255
Zeros (%),0.0%

0,1
Minimum,100001
5-th percentile,118150
Q1,189640
Median,278680
Q3,367530
95-th percentile,438470
Maximum,456255
Range,356254
Interquartile range,177890

0,1
Standard deviation,102720
Coef of variation,0.3689
Kurtosis,-1.197
Mean,278440
MAD,88949
Skewness,-0.0033541
Sum,3788354272446
Variance,10551000000
Memory size,103.8 MiB

Value,Count,Frequency (%),Unnamed: 3
145728,372,0.0%,
296205,350,0.0%,
453103,347,0.0%,
189699,344,0.0%,
186851,337,0.0%,
172690,336,0.0%,
418081,332,0.0%,
192083,324,0.0%,
434807,323,0.0%,
217360,318,0.0%,

Value,Count,Frequency (%),Unnamed: 3
100001,7,0.0%,
100002,19,0.0%,
100003,25,0.0%,
100004,3,0.0%,
100005,9,0.0%,

Value,Count,Frequency (%),Unnamed: 3
456251,7,0.0%,
456252,6,0.0%,
456253,14,0.0%,
456254,19,0.0%,
456255,74,0.0%,

0,1
Distinct count,997752
Unique (%),7.3%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,1903400
Minimum,1000001
Maximum,2843499
Zeros (%),0.0%

0,1
Minimum,1000001
5-th percentile,1082900
Q1,1434200
Median,1896500
Q3,2369100
95-th percentile,2749700
Maximum,2843499
Range,1843498
Interquartile range,934900

0,1
Standard deviation,536200
Coef of variation,0.28171
Kurtosis,-1.2171
Mean,1903400
MAD,465830
Skewness,0.04251
Sum,25896043660071
Variance,287510000000
Memory size,103.8 MiB

Value,Count,Frequency (%),Unnamed: 3
2360056,293,0.0%,
2592574,279,0.0%,
1017477,248,0.0%,
1449382,243,0.0%,
1746731,236,0.0%,
1690678,223,0.0%,
2709164,222,0.0%,
1383111,220,0.0%,
1152155,219,0.0%,
2543266,216,0.0%,

Value,Count,Frequency (%),Unnamed: 3
1000001,2,0.0%,
1000002,4,0.0%,
1000003,3,0.0%,
1000004,7,0.0%,
1000005,11,0.0%,

Value,Count,Frequency (%),Unnamed: 3
2843495,7,0.0%,
2843496,34,0.0%,
2843497,20,0.0%,
2843498,6,0.0%,
2843499,10,0.0%,

Unnamed: 0,SK_ID_PREV,SK_ID_CURR,NUM_INSTALMENT_VERSION,NUM_INSTALMENT_NUMBER,DAYS_INSTALMENT,DAYS_ENTRY_PAYMENT,AMT_INSTALMENT,AMT_PAYMENT
0,1054186,161674,1.0,6,-1180.0,-1187.0,6948.36,6948.36
1,1330831,151639,0.0,34,-2156.0,-2156.0,1716.525,1716.525
2,2085231,193053,2.0,1,-63.0,-63.0,25425.0,25425.0
3,2452527,199697,1.0,3,-2418.0,-2426.0,24350.13,24350.13
4,2714724,167756,1.0,2,-1383.0,-1366.0,2165.04,2160.585
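
The sample rows pair a scheduled value with an actual one: `DAYS_INSTALMENT` with `DAYS_ENTRY_PAYMENT`, and `AMT_INSTALMENT` with `AMT_PAYMENT`. The sketch below shows one way the gaps between them could be computed, assuming the first column of each pair is the due date or amount and the second is what actually happened. The derived column names `DAYS_LATE` and `AMT_SHORTFALL` and the helper `add_payment_gaps` are hypothetical and do not come from the dataset.

```python
import pandas as pd

# Values copied from the five sample rows of installments_payments.
installments = pd.DataFrame({
    "SK_ID_PREV": [1054186, 1330831, 2085231, 2452527, 2714724],
    "DAYS_INSTALMENT": [-1180.0, -2156.0, -63.0, -2418.0, -1383.0],
    "DAYS_ENTRY_PAYMENT": [-1187.0, -2156.0, -63.0, -2426.0, -1366.0],
    "AMT_INSTALMENT": [6948.36, 1716.525, 25425.0, 24350.13, 2165.04],
    "AMT_PAYMENT": [6948.36, 1716.525, 25425.0, 24350.13, 2160.585],
})


def add_payment_gaps(df):
    """Add hypothetical DAYS_LATE and AMT_SHORTFALL columns to a copy of df."""
    out = df.copy()
    # Positive means the payment was entered after the scheduled day.
    out["DAYS_LATE"] = out["DAYS_ENTRY_PAYMENT"] - out["DAYS_INSTALMENT"]
    # Positive means less was paid than the prescribed instalment.
    out["AMT_SHORTFALL"] = out["AMT_INSTALMENT"] - out["AMT_PAYMENT"]
    return out


gaps = add_payment_gaps(installments)
print(gaps[["SK_ID_PREV", "DAYS_LATE", "AMT_SHORTFALL"]])
```

In the sample, the last row is the only one with a positive gap in either column: the payment is recorded 17 days after the scheduled day and falls about 4.455 short of the instalment.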


In [21]:
pandas_profiling.ProfileReport(POS_CASH_balance)

0,1
Number of variables,8
Number of observations,10001358
Total Missing (%),0.1%
Total size in memory,610.4 MiB
Average record size in memory,64.0 B

0,1
Numeric,7
Categorical,1
Boolean,0
Date,0
Text (Unique),0
Rejected,0
Unsupported,0

0,1
Distinct count,74
Unique (%),0.0%
Missing (%),0.3%
Missing (n),26071
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,17.09
Minimum,1
Maximum,92
Zeros (%),0.0%

0,1
Minimum,1
5-th percentile,6
Q1,10
Median,12
Q3,24
95-th percentile,45
Maximum,92
Range,91
Interquartile range,14

0,1
Standard deviation,11.995
Coef of variation,0.70189
Kurtosis,2.4469
Mean,17.09
MAD,9.2211
Skewness,1.6017
Sum,170470000
Variance,143.88
Memory size,76.3 MiB

Value,Count,Frequency (%),Unnamed: 3
12.0,2496845,25.0%,
24.0,1517472,15.2%,
10.0,1243449,12.4%,
6.0,1065500,10.7%,
18.0,727394,7.3%,
36.0,584574,5.8%,
8.0,303751,3.0%,
48.0,278513,2.8%,
4.0,238223,2.4%,
30.0,211920,2.1%,

Value,Count,Frequency (%),Unnamed: 3
1.0,24544,0.2%,
2.0,26826,0.3%,
3.0,43081,0.4%,
4.0,238223,2.4%,
5.0,136840,1.4%,

Value,Count,Frequency (%),Unnamed: 3
72.0,1519,0.0%,
77.0,4,0.0%,
81.0,1,0.0%,
84.0,5,0.0%,
92.0,1,0.0%,

0,1
Distinct count,80
Unique (%),0.0%
Missing (%),0.3%
Missing (n),26087
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,10.484
Minimum,0
Maximum,85
Zeros (%),11.9%

0,1
Minimum,0
5-th percentile,0
Q1,3
Median,7
Q3,14
95-th percentile,35
Maximum,85
Range,85
Interquartile range,11

0,1
Standard deviation,11.109
Coef of variation,1.0596
Kurtosis,3.7133
Mean,10.484
MAD,8.0358
Skewness,1.8467
Sum,104580000
Variance,123.41
Memory size,76.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,1185960,11.9%,
6.0,614058,6.1%,
4.0,613632,6.1%,
5.0,600295,6.0%,
3.0,582007,5.8%,
2.0,547199,5.5%,
1.0,512279,5.1%,
10.0,481390,4.8%,
8.0,480167,4.8%,
7.0,472665,4.7%,

Value,Count,Frequency (%),Unnamed: 3
0.0,1185960,11.9%,
1.0,512279,5.1%,
2.0,547199,5.5%,
3.0,582007,5.8%,
4.0,613632,6.1%,

Value,Count,Frequency (%),Unnamed: 3
81.0,1,0.0%,
82.0,1,0.0%,
83.0,1,0.0%,
84.0,1,0.0%,
85.0,1,0.0%,

0,1
Distinct count,96
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,-35.013
Minimum,-96
Maximum,-1
Zeros (%),0.0%

0,1
Minimum,-96
5-th percentile,-85
Q1,-54
Median,-28
Q3,-13
95-th percentile,-4
Maximum,-1
Range,95
Interquartile range,41

0,1
Standard deviation,26.067
Coef of variation,-0.74449
Kurtosis,-0.71068
Mean,-35.013
MAD,22.095
Skewness,-0.67278
Sum,-350173427
Variance,679.47
Memory size,76.3 MiB

Value,Count,Frequency (%),Unnamed: 3
-10,216441,2.2%,
-11,216023,2.2%,
-9,215558,2.2%,
-12,214716,2.1%,
-8,214149,2.1%,
-13,210950,2.1%,
-7,210229,2.1%,
-14,208352,2.1%,
-6,206849,2.1%,
-15,204935,2.0%,

Value,Count,Frequency (%),Unnamed: 3
-96,36448,0.4%,
-95,38514,0.4%,
-94,39900,0.4%,
-93,41025,0.4%,
-92,42283,0.4%,

Value,Count,Frequency (%),Unnamed: 3
-5,200726,2.0%,
-4,193147,1.9%,
-3,183589,1.8%,
-2,169529,1.7%,
-1,94908,0.9%,

0,1
Distinct count,9
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Active,9151119
Completed,744883
Signed,87260
Other values (6),18096

Value,Count,Frequency (%),Unnamed: 3
Active,9151119,91.5%,
Completed,744883,7.4%,
Signed,87260,0.9%,
Demand,7065,0.1%,
Returned to the store,5461,0.1%,
Approved,4917,0.0%,
Amortized debt,636,0.0%,
Canceled,15,0.0%,
XNA,2,0.0%,

0,1
Distinct count,3400
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,11.607
Minimum,0
Maximum,4231
Zeros (%),97.0%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,0
Q3,0
95-th percentile,0
Maximum,4231
Range,4231
Interquartile range,0

0,1
Standard deviation,132.71
Coef of variation,11.434
Kurtosis,255.32
Mean,11.607
MAD,22.696
Skewness,14.899
Sum,116085045
Variance,17613
Memory size,76.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0,9706131,97.0%,
1,21872,0.2%,
2,17358,0.2%,
3,14403,0.1%,
4,12350,0.1%,
5,11046,0.1%,
6,9615,0.1%,
7,8332,0.1%,
8,7360,0.1%,
9,6668,0.1%,

Value,Count,Frequency (%),Unnamed: 3
0,9706131,97.0%,
1,21872,0.2%,
2,17358,0.2%,
3,14403,0.1%,
4,12350,0.1%,

Value,Count,Frequency (%),Unnamed: 3
4110,1,0.0%,
4141,1,0.0%,
4172,1,0.0%,
4200,1,0.0%,
4231,1,0.0%,

0,1
Distinct count,2307
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.65447
Minimum,0
Maximum,3595
Zeros (%),98.9%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,0
Q3,0
95-th percentile,0
Maximum,3595
Range,3595
Interquartile range,0

0,1
Standard deviation,32.762
Coef of variation,50.06
Kurtosis,4836.5
Mean,0.65447
MAD,1.294
Skewness,66.34
Sum,6545573
Variance,1073.4
Memory size,76.3 MiB

Value,Count,Frequency (%),Unnamed: 3
0,9887389,98.9%,
1,22134,0.2%,
2,14690,0.1%,
3,11652,0.1%,
4,9528,0.1%,
5,8031,0.1%,
6,6629,0.1%,
7,5425,0.1%,
8,4538,0.0%,
9,3935,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0,9887389,98.9%,
1,22134,0.2%,
2,14690,0.1%,
3,11652,0.1%,
4,9528,0.1%,

Value,Count,Frequency (%),Unnamed: 3
3475,1,0.0%,
3506,1,0.0%,
3534,1,0.0%,
3565,1,0.0%,
3595,1,0.0%,

0,1
Distinct count,337252
Unique (%),3.4%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,278400
Minimum,100001
Maximum,456255
Zeros (%),0.0%

0,1
Minimum,100001
5-th percentile,117990
Q1,189550
Median,278650
Q3,367430
95-th percentile,438530
Maximum,456255
Range,356254
Interquartile range,177880

0,1
Standard deviation,102760
Coef of variation,0.36912
Kurtosis,-1.1968
Mean,278400
MAD,88972
Skewness,-0.0031283
Sum,2784416705502
Variance,10560000000
Memory size,76.3 MiB

Value,Count,Frequency (%),Unnamed: 3
265042,295,0.0%,
172612,247,0.0%,
309133,246,0.0%,
197583,245,0.0%,
127659,245,0.0%,
185185,245,0.0%,
203046,244,0.0%,
362661,239,0.0%,
398407,237,0.0%,
228307,235,0.0%,

Value,Count,Frequency (%),Unnamed: 3
100001,9,0.0%,
100002,19,0.0%,
100003,28,0.0%,
100004,4,0.0%,
100005,11,0.0%,

Value,Count,Frequency (%),Unnamed: 3
456251,9,0.0%,
456252,7,0.0%,
456253,17,0.0%,
456254,20,0.0%,
456255,71,0.0%,

0,1
Distinct count,936325
Unique (%),9.4%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,1903200
Minimum,1000001
Maximum,2843499
Zeros (%),0.0%

0,1
Minimum,1000001
5-th percentile,1083700
Q1,1434400
Median,1896600
Q3,2369000
95-th percentile,2749800
Maximum,2843499
Range,1843498
Interquartile range,934560

0,1
Standard deviation,535850
Coef of variation,0.28155
Kurtosis,-1.2162
Mean,1903200
MAD,465330
Skewness,0.044229
Sum,19034750557710
Variance,287130000000
Memory size,76.3 MiB

Value,Count,Frequency (%),Unnamed: 3
1624618,96,0.0%,
2746611,96,0.0%,
1889497,96,0.0%,
1235285,96,0.0%,
2263451,96,0.0%,
1835828,96,0.0%,
1000256,96,0.0%,
1856103,96,0.0%,
2687350,96,0.0%,
1012861,96,0.0%,

Value,Count,Frequency (%),Unnamed: 3
1000001,3,0.0%,
1000002,5,0.0%,
1000003,4,0.0%,
1000004,8,0.0%,
1000005,11,0.0%,

Value,Count,Frequency (%),Unnamed: 3
2843494,3,0.0%,
2843495,8,0.0%,
2843497,21,0.0%,
2843498,7,0.0%,
2843499,11,0.0%,

Unnamed: 0,SK_ID_PREV,SK_ID_CURR,MONTHS_BALANCE,CNT_INSTALMENT,CNT_INSTALMENT_FUTURE,NAME_CONTRACT_STATUS,SK_DPD,SK_DPD_DEF
0,1803195,182943,-31,48.0,45.0,Active,0,0
1,1715348,367990,-33,36.0,35.0,Active,0,0
2,1784872,397406,-32,12.0,9.0,Active,0,0
3,1903291,269225,-35,48.0,42.0,Active,0,0
4,2341044,334279,-35,36.0,35.0,Active,0,0
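
Two figures recur in every block of the report: `Missing (%)` and `Zeros (%)`. When a full profile of a ten-million-row table is too slow, both can be reproduced with a few lines of pandas. The sketch below is one possible way to do that; the helper name `missing_and_zeros` is hypothetical, and the toy frame is invented for illustration (including its missing value) rather than taken from the data.

```python
import pandas as pd


def missing_and_zeros(df):
    """Return per-column share of missing values and of zeros, in percent."""
    numeric = df.select_dtypes(include="number")
    return pd.DataFrame({
        "missing_pct": df.isna().mean() * 100,
        # Zeros are only counted for numeric columns; others stay NaN.
        "zeros_pct": (numeric == 0).mean() * 100,
    })


# Invented toy data using POS_CASH_balance column names.
toy = pd.DataFrame({
    "CNT_INSTALMENT": [48.0, None, 12.0, 36.0],
    "NAME_CONTRACT_STATUS": ["Active", "Active", "Completed", "Active"],
    "SK_DPD": [0, 0, 0, 5],
})

summary = missing_and_zeros(toy)
print(summary)
```

On the real table the call would be `missing_and_zeros(POS_CASH_balance)`. The percentages are taken over all rows, so they should be comparable to the figures the report prints, up to rounding.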


In [22]:
pandas_profiling.ProfileReport(previous_application)

0,1
Number of variables,37
Number of observations,1670214
Total Missing (%),16.3%
Total size in memory,471.5 MiB
Average record size in memory,296.0 B

0,1
Numeric,17
Categorical,16
Boolean,1
Date,0
Text (Unique),0
Rejected,3
Unsupported,0

0,1
Distinct count,357960
Unique (%),21.4%
Missing (%),22.3%
Missing (n),372235
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,15955
Minimum,0
Maximum,418060
Zeros (%),0.1%

0,1
Minimum,0.0
5-th percentile,2726.6
Q1,6321.8
Median,11250.0
Q3,20658.0
95-th percentile,45337.0
Maximum,418060.0
Range,418060.0
Interquartile range,14337.0

0,1
Standard deviation,14782
Coef of variation,0.92648
Kurtosis,15.07
Mean,15955
MAD,10389
Skewness,2.6926
Sum,20709000000
Variance,218510000
Memory size,12.7 MiB

Value,Count,Frequency (%),Unnamed: 3
2250.0,31865,1.9%,
11250.0,13974,0.8%,
6750.0,13442,0.8%,
9000.0,12496,0.7%,
22500.0,11903,0.7%,
4500.0,10597,0.6%,
13500.0,7171,0.4%,
3375.0,4806,0.3%,
7875.0,4674,0.3%,
38250.0,4129,0.2%,

Value,Count,Frequency (%),Unnamed: 3
0.0,1637,0.1%,
579.78,1,0.0%,
585.855,1,0.0%,
635.04,1,0.0%,
637.65,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
309942.0,1,0.0%,
357733.26,1,0.0%,
393868.665,1,0.0%,
417927.645,2,0.0%,
418058.145,2,0.0%,

0,1
Distinct count,93885
Unique (%),5.6%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,175230
Minimum,0
Maximum,6905200
Zeros (%),23.5%

0,1
Minimum,0
5-th percentile,0
Q1,18720
Median,71046
Q3,180360
95-th percentile,787500
Maximum,6905200
Range,6905200
Interquartile range,161640

0,1
Standard deviation,292780
Coef of variation,1.6708
Kurtosis,15.762
Mean,175230
MAD,181690
Skewness,3.3914
Sum,292680000000
Variance,85720000000
Memory size,12.7 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,392402,23.5%,
45000.0,47831,2.9%,
225000.0,43543,2.6%,
135000.0,40678,2.4%,
450000.0,38905,2.3%,
90000.0,29367,1.8%,
180000.0,24738,1.5%,
270000.0,20573,1.2%,
675000.0,20227,1.2%,
67500.0,16861,1.0%,

Value,Count,Frequency (%),Unnamed: 3
0.0,392402,23.5%,
3456.0,1,0.0%,
4225.5,1,0.0%,
4500.0,4,0.0%,
5400.0,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
4237875.0,3,0.0%,
4455000.0,1,0.0%,
5085000.0,1,0.0%,
5850000.0,2,0.0%,
6905160.0,1,0.0%,

0,1
Correlation,0.97582

0,1
Distinct count,29279
Unique (%),1.8%
Missing (%),53.6%
Missing (n),895844
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,6697.4
Minimum,-0.9
Maximum,3060000
Zeros (%),22.1%

0,1
Minimum,-0.9
5-th percentile,0.0
Q1,0.0
Median,1638.0
Q3,7740.0
95-th percentile,26184.0
Maximum,3060000.0
Range,3060000.0
Interquartile range,7740.0

0,1
Standard deviation,20921
Coef of variation,3.1238
Kurtosis,2901.8
Mean,6697.4
MAD,7864.8
Skewness,36.477
Sum,5186300000
Variance,437710000
Memory size,12.7 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,369854,22.1%,
4500.0,21241,1.3%,
9000.0,14747,0.9%,
13500.0,9655,0.6%,
22500.0,8165,0.5%,
6750.0,7709,0.5%,
2250.0,6241,0.4%,
18000.0,4526,0.3%,
45000.0,4059,0.2%,
2700.0,3362,0.2%,

Value,Count,Frequency (%),Unnamed: 3
-0.9,1,0.0%,
-0.45,1,0.0%,
0.0,369854,22.1%,
0.045,37,0.0%,
0.09,40,0.0%,

Value,Count,Frequency (%),Unnamed: 3
2118937.5,3,0.0%,
2135700.0,1,0.0%,
2150100.0,1,0.0%,
2475000.0,1,0.0%,
3060045.0,1,0.0%,

0,1
Correlation,0.99309

0,1
Distinct count,8
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Credit and cash offices,719968
Country-wide,494690
Stone,212083
Other values (5),243473

Value,Count,Frequency (%),Unnamed: 3
Credit and cash offices,719968,43.1%,
Country-wide,494690,29.6%,
Stone,212083,12.7%,
Regional / Local,108528,6.5%,
Contact center,71297,4.3%,
AP+ (Cash loan),57046,3.4%,
Channel of corporate sales,6150,0.4%,
Car dealer,452,0.0%,

0,1
Distinct count,50
Unique (%),0.0%
Missing (%),22.3%
Missing (n),372230
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,16.054
Minimum,0
Maximum,84
Zeros (%),8.7%

0,1
Minimum,0
5-th percentile,0
Q1,6
Median,12
Q3,24
95-th percentile,48
Maximum,84
Range,84
Interquartile range,18

0,1
Standard deviation,14.567
Coef of variation,0.90739
Kurtosis,1.868
Mean,16.054
MAD,10.912
Skewness,1.5314
Sum,20838000
Variance,212.21
Memory size,12.7 MiB

Value,Count,Frequency (%),Unnamed: 3
12.0,323049,19.3%,
6.0,190461,11.4%,
0.0,144985,8.7%,
10.0,141851,8.5%,
24.0,137764,8.2%,
18.0,77430,4.6%,
36.0,72583,4.3%,
60.0,53600,3.2%,
48.0,47316,2.8%,
8.0,30349,1.8%,

Value,Count,Frequency (%),Unnamed: 3
0.0,144985,8.7%,
3.0,1100,0.1%,
4.0,26924,1.6%,
5.0,3957,0.2%,
6.0,190461,11.4%,

Value,Count,Frequency (%),Unnamed: 3
59.0,4,0.0%,
60.0,53600,3.2%,
66.0,10,0.0%,
72.0,139,0.0%,
84.0,45,0.0%,

0,1
Distinct count,9
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
XAP,1353093
HC,175231
LIMIT,55680
Other values (6),86210

Value,Count,Frequency (%),Unnamed: 3
XAP,1353093,81.0%,
HC,175231,10.5%,
LIMIT,55680,3.3%,
SCO,37467,2.2%,
CLIENT,26436,1.6%,
SCOFR,12811,0.8%,
XNA,5244,0.3%,
VERIF,3535,0.2%,
SYSTEM,717,0.0%,

0,1
Distinct count,2922
Unique (%),0.2%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,-880.68
Minimum,-2922
Maximum,-1
Zeros (%),0.0%

0,1
Minimum,-2922
5-th percentile,-2559
Q1,-1300
Median,-581
Q3,-280
95-th percentile,-85
Maximum,-1
Range,2921
Interquartile range,1020

0,1
Standard deviation,779.1
Coef of variation,-0.88466
Kurtosis,-0.037846
Mean,-880.68
MAD,637.23
Skewness,-1.0531
Sum,-1470923511
Variance,607000
Memory size,12.7 MiB

Value,Count,Frequency (%),Unnamed: 3
-245,2444,0.1%,
-238,2390,0.1%,
-210,2375,0.1%,
-273,2350,0.1%,
-196,2315,0.1%,
-224,2305,0.1%,
-252,2300,0.1%,
-182,2283,0.1%,
-240,2279,0.1%,
-231,2270,0.1%,

Value,Count,Frequency (%),Unnamed: 3
-2922,162,0.0%,
-2921,158,0.0%,
-2920,168,0.0%,
-2919,171,0.0%,
-2918,185,0.0%,

Value,Count,Frequency (%),Unnamed: 3
-5,1324,0.1%,
-4,1507,0.1%,
-3,1516,0.1%,
-2,1172,0.1%,
-1,2,0.0%,

0,1
Distinct count,2839
Unique (%),0.2%
Missing (%),40.3%
Missing (n),673065
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,342210
Minimum,-2922
Maximum,365240
Zeros (%),0.0%

0,1
Minimum,-2922
5-th percentile,-269
Q1,365240
Median,365240
Q3,365240
95-th percentile,365240
Maximum,365240
Range,368160
Interquartile range,0

0,1
Standard deviation,88916
Coef of variation,0.25983
Kurtosis,10.97
Mean,342210
MAD,43169
Skewness,-3.6013
Sum,341230000000
Variance,7906100000
Memory size,12.7 MiB

Value,Count,Frequency (%),Unnamed: 3
365243.0,934444,55.9%,
-228.0,123,0.0%,
-224.0,121,0.0%,
-212.0,121,0.0%,
-223.0,119,0.0%,
-220.0,118,0.0%,
-210.0,117,0.0%,
-235.0,117,0.0%,
-240.0,116,0.0%,
-226.0,115,0.0%,

Value,Count,Frequency (%),Unnamed: 3
-2922.0,1,0.0%,
-2921.0,2,0.0%,
-2920.0,5,0.0%,
-2919.0,12,0.0%,
-2918.0,7,0.0%,

Value,Count,Frequency (%),Unnamed: 3
-5.0,14,0.0%,
-4.0,10,0.0%,
-3.0,14,0.0%,
-2.0,20,0.0%,
365243.0,934444,55.9%,

0,1
Distinct count,2893
Unique (%),0.2%
Missing (%),40.3%
Missing (n),673065
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,13826
Minimum,-2892
Maximum,365240
Zeros (%),0.0%

0,1
Minimum,-2892
5-th percentile,-2608
Q1,-1628
Median,-831
Q3,-411
95-th percentile,-48
Maximum,365240
Range,368140
Interquartile range,1217

0,1
Standard deviation,72445
Coef of variation,5.2397
Kurtosis,19.571
Mean,13826
MAD,28648
Skewness,4.6441
Sum,13787000000
Variance,5248300000
Memory size,12.7 MiB

Value,Count,Frequency (%),Unnamed: 3
365243.0,40645,2.4%,
-334.0,772,0.0%,
-509.0,760,0.0%,
-208.0,751,0.0%,
-330.0,750,0.0%,
-292.0,746,0.0%,
-691.0,745,0.0%,
-299.0,744,0.0%,
-270.0,744,0.0%,
-327.0,743,0.0%,

Value,Count,Frequency (%),Unnamed: 3
-2892.0,9,0.0%,
-2891.0,55,0.0%,
-2890.0,73,0.0%,
-2889.0,86,0.0%,
-2888.0,96,0.0%,

Value,Count,Frequency (%),Unnamed: 3
-5.0,182,0.0%,
-4.0,132,0.0%,
-3.0,136,0.0%,
-2.0,14,0.0%,
365243.0,40645,2.4%,

0,1
Distinct count,2874
Unique (%),0.2%
Missing (%),40.3%
Missing (n),673065
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,76582
Minimum,-2889
Maximum,365240
Zeros (%),0.0%

0,1
Minimum,-2889
5-th percentile,-2349
Q1,-1314
Median,-537
Q3,-74
95-th percentile,365240
Maximum,365240
Range,368130
Interquartile range,1240

0,1
Standard deviation,149650
Coef of variation,1.9541
Kurtosis,-0.010447
Mean,76582
MAD,122290
Skewness,1.4105
Sum,76364000000
Variance,22394000000
Memory size,12.7 MiB

Value,Count,Frequency (%),Unnamed: 3
365243.0,211221,12.6%,
-245.0,658,0.0%,
-188.0,650,0.0%,
-239.0,642,0.0%,
-167.0,638,0.0%,
-247.0,629,0.0%,
-305.0,627,0.0%,
-268.0,624,0.0%,
-236.0,623,0.0%,
-160.0,623,0.0%,

Value,Count,Frequency (%),Unnamed: 3
-2889.0,1,0.0%,
-2888.0,1,0.0%,
-2885.0,1,0.0%,
-2884.0,2,0.0%,
-2883.0,3,0.0%,

Value,Count,Frequency (%),Unnamed: 3
-5.0,444,0.0%,
-4.0,501,0.0%,
-3.0,402,0.0%,
-2.0,30,0.0%,
365243.0,211221,12.6%,

0,1
Distinct count,4606
Unique (%),0.3%
Missing (%),40.3%
Missing (n),673065
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,33768
Minimum,-2801
Maximum,365240
Zeros (%),0.0%

0,1
Minimum,-2801
5-th percentile,-2327
Q1,-1242
Median,-361
Q3,129
95-th percentile,365240
Maximum,365240
Range,368040
Interquartile range,1371

0,1
Standard deviation,106860
Coef of variation,3.1645
Kurtosis,5.7261
Mean,33768
MAD,62405
Skewness,2.7794
Sum,33672000000
Variance,11418000000
Memory size,12.7 MiB

Value,Count,Frequency (%),Unnamed: 3
365243.0,93864,5.6%,
9.0,720,0.0%,
8.0,706,0.0%,
0.0,705,0.0%,
5.0,702,0.0%,
10.0,698,0.0%,
2.0,688,0.0%,
1.0,685,0.0%,
6.0,685,0.0%,
-1.0,675,0.0%,

Value,Count,Frequency (%),Unnamed: 3
-2801.0,9,0.0%,
-2800.0,9,0.0%,
-2799.0,6,0.0%,
-2798.0,18,0.0%,
-2797.0,11,0.0%,

Value,Count,Frequency (%),Unnamed: 3
2032.0,1,0.0%,
2090.0,1,0.0%,
2098.0,1,0.0%,
2389.0,1,0.0%,
365243.0,93864,5.6%,

0,1
Correlation,0.92799

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Y,1661739
N,8475

Value,Count,Frequency (%),Unnamed: 3
Y,1661739,99.5%,
N,8475,0.5%,

0,1
Distinct count,24
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,12.484
Minimum,0
Maximum,23
Zeros (%),0.0%

0,1
Minimum,0
5-th percentile,7
Q1,10
Median,12
Q3,15
95-th percentile,18
Maximum,23
Range,23
Interquartile range,5

0,1
Standard deviation,3.334
Coef of variation,0.26706
Kurtosis,-0.27777
Mean,12.484
MAD,2.7212
Skewness,-0.025629
Sum,20851255
Variance,11.116
Memory size,12.7 MiB

Value,Count,Frequency (%),Unnamed: 3
11,192728,11.5%,
12,185980,11.1%,
10,181690,10.9%,
13,172256,10.3%,
14,157711,9.4%,
15,142965,8.6%,
9,127002,7.6%,
16,121361,7.3%,
17,95064,5.7%,
8,73085,4.4%,

Value,Count,Frequency (%),Unnamed: 3
0,109,0.0%,
1,212,0.0%,
2,1116,0.1%,
3,5035,0.3%,
4,9319,0.6%,

Value,Count,Frequency (%),Unnamed: 3
19,34089,2.0%,
20,14535,0.9%,
21,4082,0.2%,
22,720,0.0%,
23,202,0.0%,

0,1
Distinct count,25
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
XAP,922661
XNA,677918
Repairs,23765
Other values (22),45870

Value,Count,Frequency (%),Unnamed: 3
XAP,922661,55.2%,
XNA,677918,40.6%,
Repairs,23765,1.4%,
Other,15608,0.9%,
Urgent needs,8412,0.5%,
Buying a used car,2888,0.2%,
Building a house or an annex,2693,0.2%,
Everyday expenses,2416,0.1%,
Medicine,2174,0.1%,
Payments on other loans,1931,0.1%,

0,1
Distinct count,4
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Repeater,1231261
New,301363
Refreshed,135649

Value,Count,Frequency (%),Unnamed: 3
Repeater,1231261,73.7%,
New,301363,18.0%,
Refreshed,135649,8.1%,
XNA,1941,0.1%,

0,1
Distinct count,4
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Approved,1036781
Canceled,316319
Refused,290678

Value,Count,Frequency (%),Unnamed: 3
Approved,1036781,62.1%,
Canceled,316319,18.9%,
Refused,290678,17.4%,
Unused offer,26436,1.6%,

0,1
Distinct count,4
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Cash loans,747553
Consumer loans,729151
Revolving loans,193164

Value,Count,Frequency (%),Unnamed: 3
Cash loans,747553,44.8%,
Consumer loans,729151,43.7%,
Revolving loans,193164,11.6%,
XNA,346,0.0%,

0,1
Distinct count,28
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
XNA,950809
Mobile,224708
Consumer Electronics,121576
Other values (25),373121

Value,Count,Frequency (%),Unnamed: 3
XNA,950809,56.9%,
Mobile,224708,13.5%,
Consumer Electronics,121576,7.3%,
Computers,105769,6.3%,
Audio/Video,99441,6.0%,
Furniture,53656,3.2%,
Photo / Cinema Equipment,25021,1.5%,
Construction Materials,24995,1.5%,
Clothing and Accessories,23554,1.4%,
Auto Accessories,7381,0.4%,

0,1
Distinct count,4
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Cash through the bank,1033552
XNA,627384
Non-cash from your account,8193

Value,Count,Frequency (%),Unnamed: 3
Cash through the bank,1033552,61.9%,
XNA,627384,37.6%,
Non-cash from your account,8193,0.5%,
Cashless from the account of the employer,1085,0.1%,

0,1
Distinct count,5
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
POS,691011
Cash,461563
XNA,372230
Other values (2),145410

Value,Count,Frequency (%),Unnamed: 3
POS,691011,41.4%,
Cash,461563,27.6%,
XNA,372230,22.3%,
Cards,144985,8.7%,
Cars,425,0.0%,

0,1
Distinct count,3
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
XNA,1063666
x-sell,456287
walk-in,150261

Value,Count,Frequency (%),Unnamed: 3
XNA,1063666,63.7%,
x-sell,456287,27.3%,
walk-in,150261,9.0%,

0,1
Distinct count,11
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
XNA,855720
Consumer electronics,398265
Connectivity,276029
Other values (8),140200

Value,Count,Frequency (%),Unnamed: 3
XNA,855720,51.2%,
Consumer electronics,398265,23.8%,
Connectivity,276029,16.5%,
Furniture,57849,3.5%,
Construction,29781,1.8%,
Clothing,23949,1.4%,
Industry,19194,1.1%,
Auto technology,4990,0.3%,
Jewelry,2709,0.2%,
MLM partners,1215,0.1%,

0,1
Distinct count,8
Unique (%),0.0%
Missing (%),49.1%
Missing (n),820405

0,1
Unaccompanied,508970
Family,213263
"Spouse, partner",67069
Other values (4),60507
(Missing),820405

Value,Count,Frequency (%),Unnamed: 3
Unaccompanied,508970,30.5%,
Family,213263,12.8%,
"Spouse, partner",67069,4.0%,
Children,31566,1.9%,
Other_B,17624,1.1%,
Other_A,9077,0.5%,
Group of people,2240,0.1%,
(Missing),820405,49.1%,

0,1
Distinct count,5
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
XNA,517215
middle,385532
high,353331
Other values (2),414136

Value,Count,Frequency (%),Unnamed: 3
XNA,517215,31.0%,
middle,385532,23.1%,
high,353331,21.2%,
low_normal,322095,19.3%,
low_action,92041,5.5%,

0,1
Distinct count,3
Unique (%),0.0%
Missing (%),40.3%
Missing (n),673065
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.33257
Minimum,0
Maximum,1
Zeros (%),39.8%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,0
Q3,1
95-th percentile,1
Maximum,1
Range,1
Interquartile range,1

0,1
Standard deviation,0.47113
Coef of variation,1.4166
Kurtosis,-1.4948
Mean,0.33257
MAD,0.44393
Skewness,0.71075
Sum,331620
Variance,0.22197
Memory size,12.7 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,665527,39.8%,
1.0,331622,19.9%,
(Missing),673065,40.3%,

Value,Count,Frequency (%),Unnamed: 3
0.0,665527,39.8%,
1.0,331622,19.9%,

Value,Count,Frequency (%),Unnamed: 3
0.0,665527,39.8%,
1.0,331622,19.9%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.99647

0,1
1,1664314
0,5900

Value,Count,Frequency (%),Unnamed: 3
1,1664314,99.6%,
0,5900,0.4%,

0,1
Distinct count,18
Unique (%),0.0%
Missing (%),0.0%
Missing (n),346

0,1
Cash,285990
POS household with interest,263622
POS mobile with interest,220670
Other values (14),899586

Value,Count,Frequency (%),Unnamed: 3
Cash,285990,17.1%,
POS household with interest,263622,15.8%,
POS mobile with interest,220670,13.2%,
Cash X-Sell: middle,143883,8.6%,
Cash X-Sell: low,130248,7.8%,
Card Street,112582,6.7%,
POS industry with interest,98833,5.9%,
POS household without interest,82908,5.0%,
Card X-Sell,80582,4.8%,
Cash Street: high,59639,3.6%,

0,1
Distinct count,207034
Unique (%),12.4%
Missing (%),53.6%
Missing (n),895844
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.079637
Minimum,-1.4979e-05
Maximum,1
Zeros (%),22.1%

0,1
Minimum,-1.4979e-05
5-th percentile,0.0
Q1,0.0
Median,0.051605
Q3,0.10891
95-th percentile,0.29413
Maximum,1.0
Range,1.0
Interquartile range,0.10891

0,1
Standard deviation,0.10782
Coef of variation,1.3539
Kurtosis,6.2045
Mean,0.079637
MAD,0.079204
Skewness,2.1077
Sum,61668
Variance,0.011626
Memory size,12.7 MiB

Value,Count,Frequency (%),Unnamed: 3
0.0,369854,22.1%,
0.1089090909090909,36341,2.2%,
0.2178181818181818,6482,0.4%,
0.3267272727272727,1081,0.1%,
0.5445454545454544,746,0.0%,
0.4356363636363636,449,0.0%,
0.10427466150870406,304,0.0%,
0.10137814313346223,258,0.0%,
0.09946035699460357,252,0.0%,
0.10000834794223223,243,0.0%,

Value,Count,Frequency (%),Unnamed: 3
-1.4978763414307848e-05,1,0.0%,
-1.3693400421089204e-05,1,0.0%,
0.0,369854,22.1%,
1.5544449333405877e-07,1,0.0%,
2.102938266421599e-07,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.9725503050348784,1,0.0%,
0.980181818181818,1,0.0%,
0.9807152645111248,1,0.0%,
0.9897398775709312,1,0.0%,
1.0,1,0.0%,

0,1
Distinct count,149
Unique (%),0.0%
Missing (%),99.6%
Missing (n),1664263
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.18836
Minimum,0.034781
Maximum,1
Zeros (%),0.0%

0,1
Minimum,0.034781
5-th percentile,0.14244
Q1,0.16072
Median,0.18912
Q3,0.19333
95-th percentile,0.19691
Maximum,1.0
Range,0.96522
Interquartile range,0.032614

0,1
Standard deviation,0.087671
Coef of variation,0.46545
Kurtosis,28.205
Mean,0.18836
MAD,0.031486
Skewness,5.1982
Sum,1120.9
Variance,0.0076862
Memory size,12.7 MiB

Value,Count,Frequency (%),Unnamed: 3
0.1891363481808909,1218,0.1%,
0.14244021307945146,951,0.1%,
0.1607163096452454,821,0.0%,
0.19332993312932112,681,0.0%,
0.19690014734217387,573,0.0%,
0.17600306018361106,241,0.0%,
0.1891221806641732,210,0.0%,
0.1607021421285277,204,0.0%,
0.1828176357248102,187,0.0%,
0.19691431485889155,139,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.0347812535418791,1,0.0%,
0.0591210472628357,2,0.0%,
0.0591352147795534,61,0.0%,
0.0591493822962711,2,0.0%,
0.0957724130114473,2,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.8067692394877027,1,0.0%,
0.8155105973025051,1,0.0%,
0.8613000113340133,1,0.0%,
0.9029241754505272,1,0.0%,
1.0,1,0.0%,

0,1
Distinct count,26
Unique (%),0.0%
Missing (%),99.6%
Missing (n),1664263
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.7735
Minimum,0.37315
Maximum,1
Zeros (%),0.0%

0,1
Minimum,0.37315
5-th percentile,0.63795
Q1,0.71564
Median,0.8351
Q3,0.85254
95-th percentile,0.86734
Maximum,1.0
Range,0.62685
Interquartile range,0.13689

0,1
Standard deviation,0.10088
Coef of variation,0.13042
Kurtosis,0.25558
Mean,0.7735
MAD,0.089626
Skewness,-1.0077
Sum,4603.1
Variance,0.010176
Memory size,12.7 MiB

Value,Count,Frequency (%),Unnamed: 3
0.8350951374207188,1717,0.1%,
0.715644820295983,1046,0.1%,
0.6379492600422833,1039,0.1%,
0.8673361522198731,931,0.1%,
0.852536997885835,876,0.1%,
0.5687103594080338,127,0.0%,
0.4244186046511628,66,0.0%,
0.5137420718816067,45,0.0%,
0.8324524312896405,40,0.0%,
0.8451374207188159,19,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.3731501057082452,2,0.0%,
0.4244186046511628,66,0.0%,
0.4365750528541226,2,0.0%,
0.4841437632135306,1,0.0%,
0.5021141649048626,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.8451374207188159,19,0.0%,
0.852536997885835,876,0.1%,
0.8546511627906977,1,0.0%,
0.8673361522198731,931,0.1%,
1.0,1,0.0%,

0,1
Distinct count,2097
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,313.95
Minimum,-1
Maximum,4000000
Zeros (%),3.6%

0,1
Minimum,-1
5-th percentile,-1
Q1,-1
Median,3
Q3,82
95-th percentile,1820
Maximum,4000000
Range,4000001
Interquartile range,83

0,1
Standard deviation,7127.4
Coef of variation,22.702
Kurtosis,296880
Mean,313.95
MAD,489.13
Skewness,529.62
Sum,524365548
Variance,50800000
Memory size,12.7 MiB

Value,Count,Frequency (%),Unnamed: 3
-1,762675,45.7%,
0,60523,3.6%,
50,37401,2.2%,
30,34423,2.1%,
20,33840,2.0%,
100,31409,1.9%,
40,24429,1.5%,
25,18142,1.1%,
15,17175,1.0%,
150,16652,1.0%,

Value,Count,Frequency (%),Unnamed: 3
-1,762675,45.7%,
0,60523,3.6%,
1,5275,0.3%,
2,4374,0.3%,
3,5472,0.3%,

Value,Count,Frequency (%),Unnamed: 3
112000,4,0.0%,
120000,3,0.0%,
250000,9,0.0%,
256099,1,0.0%,
4000000,5,0.0%,

0,1
Distinct count,338857
Unique (%),20.3%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,278360
Minimum,100001
Maximum,456255
Zeros (%),0.0%

0,1
Minimum,100001
5-th percentile,117930
Q1,189330
Median,278710
Q3,367510
95-th percentile,438440
Maximum,456255
Range,356254
Interquartile range,178180

0,1
Standard deviation,102810
Coef of variation,0.36936
Kurtosis,-1.1993
Mean,278360
MAD,89049
Skewness,-0.0033025
Sum,464916049180
Variance,10571000000
Memory size,12.7 MiB

Value,Count,Frequency (%),Unnamed: 3
187868,77,0.0%,
265681,73,0.0%,
173680,72,0.0%,
242412,68,0.0%,
206783,67,0.0%,
156367,66,0.0%,
389950,64,0.0%,
382179,64,0.0%,
198355,63,0.0%,
345161,62,0.0%,

Value,Count,Frequency (%),Unnamed: 3
100001,1,0.0%,
100002,1,0.0%,
100003,3,0.0%,
100004,1,0.0%,
100005,2,0.0%,

Value,Count,Frequency (%),Unnamed: 3
456251,1,0.0%,
456252,1,0.0%,
456253,2,0.0%,
456254,2,0.0%,
456255,8,0.0%,

0,1
Distinct count,1670214
Unique (%),100.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,1923100
Minimum,1000001
Maximum,2845382
Zeros (%),0.0%

0,1
Minimum,1000001
5-th percentile,1092600
Q1,1461900
Median,1923100
Q3,2384300
95-th percentile,2753200
Maximum,2845382
Range,1845381
Interquartile range,922420

0,1
Standard deviation,532600
Coef of variation,0.27695
Kurtosis,-1.1998
Mean,1923100
MAD,461230
Skewness,-0.00057313
Sum,3211970397077
Variance,283660000000
Memory size,12.7 MiB

Value,Count,Frequency (%),Unnamed: 3
1000983,1,0.0%,
2428426,1,0.0%,
1026910,1,0.0%,
1024863,1,0.0%,
2448896,1,0.0%,
2446849,1,0.0%,
2444802,1,0.0%,
2442755,1,0.0%,
2457092,1,0.0%,
2455045,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
1000001,1,0.0%,
1000002,1,0.0%,
1000003,1,0.0%,
1000004,1,0.0%,
1000005,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
2845377,1,0.0%,
2845378,1,0.0%,
2845379,1,0.0%,
2845381,1,0.0%,
2845382,1,0.0%,

0,1
Distinct count,7
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
TUESDAY,255118
WEDNESDAY,255010
MONDAY,253557
Other values (4),906529

Value,Count,Frequency (%),Unnamed: 3
TUESDAY,255118,15.3%,
WEDNESDAY,255010,15.3%,
MONDAY,253557,15.2%,
FRIDAY,252048,15.1%,
THURSDAY,249099,14.9%,
SATURDAY,240631,14.4%,
SUNDAY,164751,9.9%,

Unnamed: 0,SK_ID_PREV,SK_ID_CURR,NAME_CONTRACT_TYPE,AMT_ANNUITY,AMT_APPLICATION,AMT_CREDIT,AMT_DOWN_PAYMENT,AMT_GOODS_PRICE,WEEKDAY_APPR_PROCESS_START,HOUR_APPR_PROCESS_START,FLAG_LAST_APPL_PER_CONTRACT,NFLAG_LAST_APPL_IN_DAY,RATE_DOWN_PAYMENT,RATE_INTEREST_PRIMARY,RATE_INTEREST_PRIVILEGED,NAME_CASH_LOAN_PURPOSE,NAME_CONTRACT_STATUS,DAYS_DECISION,NAME_PAYMENT_TYPE,CODE_REJECT_REASON,NAME_TYPE_SUITE,NAME_CLIENT_TYPE,NAME_GOODS_CATEGORY,NAME_PORTFOLIO,NAME_PRODUCT_TYPE,CHANNEL_TYPE,SELLERPLACE_AREA,NAME_SELLER_INDUSTRY,CNT_PAYMENT,NAME_YIELD_GROUP,PRODUCT_COMBINATION,DAYS_FIRST_DRAWING,DAYS_FIRST_DUE,DAYS_LAST_DUE_1ST_VERSION,DAYS_LAST_DUE,DAYS_TERMINATION,NFLAG_INSURED_ON_APPROVAL
0,2030495,271877,Consumer loans,1730.43,17145.0,17145.0,0.0,17145.0,SATURDAY,15,Y,1,0.0,0.182832,0.867336,XAP,Approved,-73,Cash through the bank,XAP,,Repeater,Mobile,POS,XNA,Country-wide,35,Connectivity,12.0,middle,POS mobile with interest,365243.0,-42.0,300.0,-42.0,-37.0,0.0
1,2802425,108129,Cash loans,25188.615,607500.0,679671.0,,607500.0,THURSDAY,11,Y,1,,,,XNA,Approved,-164,XNA,XAP,Unaccompanied,Repeater,XNA,Cash,x-sell,Contact center,-1,XNA,36.0,low_action,Cash X-Sell: low,365243.0,-134.0,916.0,365243.0,365243.0,1.0
2,2523466,122040,Cash loans,15060.735,112500.0,136444.5,,112500.0,TUESDAY,11,Y,1,,,,XNA,Approved,-301,Cash through the bank,XAP,"Spouse, partner",Repeater,XNA,Cash,x-sell,Credit and cash offices,-1,XNA,12.0,high,Cash X-Sell: high,365243.0,-271.0,59.0,365243.0,365243.0,1.0
3,2819243,176158,Cash loans,47041.335,450000.0,470790.0,,450000.0,MONDAY,7,Y,1,,,,XNA,Approved,-512,Cash through the bank,XAP,,Repeater,XNA,Cash,x-sell,Credit and cash offices,-1,XNA,12.0,middle,Cash X-Sell: middle,365243.0,-482.0,-152.0,-182.0,-177.0,1.0
4,1784265,202054,Cash loans,31924.395,337500.0,404055.0,,337500.0,THURSDAY,9,Y,1,,,,Repairs,Refused,-781,Cash through the bank,HC,,Repeater,XNA,Cash,walk-in,Credit and cash offices,-1,XNA,24.0,high,Cash Street: high,,,,,,
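
The value 365243.0 appears in the `DAYS_FIRST_DRAWING`, `DAYS_LAST_DUE` and `DAYS_TERMINATION` columns of the sample rows above and leads the frequency tables of several `DAYS_*` variables, while every other value in those columns is a small day offset. One plausible reading, which is an assumption and not something the data description confirms here, is that 365243 is a placeholder for "no date". A minimal sketch of masking it, using a toy frame and a hypothetical helper `mask_sentinel_days`:

```python
import numpy as np
import pandas as pd

# Assumption: 365243 is a placeholder, not a genuine day offset.
SENTINEL_DAYS = 365243


def mask_sentinel_days(df, sentinel=SENTINEL_DAYS):
    """Return a copy of df with `sentinel` replaced by NaN in all DAYS_* columns."""
    out = df.copy()
    day_cols = [c for c in out.columns if c.startswith("DAYS_")]
    out[day_cols] = out[day_cols].replace(sentinel, np.nan)
    return out


# Toy frame; values copied from the sample rows of previous_application above.
toy = pd.DataFrame({
    "SK_ID_PREV": [2030495, 2802425, 2819243],
    "DAYS_DECISION": [-73, -164, -512],
    "DAYS_FIRST_DRAWING": [365243.0, 365243.0, 365243.0],
    "DAYS_LAST_DUE": [-42.0, 365243.0, -182.0],
})
cleaned = mask_sentinel_days(toy)
print(cleaned)
```

Whether masking is the right choice depends on the model: the fact that a date is absent may itself be informative, so a separate indicator column could be kept alongside the masked one.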


In [23]:
pandas_profiling.ProfileReport(sample_submission)

0,1
Number of variables,2
Number of observations,48744
Total Missing (%),0.0%
Total size in memory,761.7 KiB
Average record size in memory,16.0 B

0,1
Numeric,1
Categorical,0
Boolean,0
Date,0
Text (Unique),0
Rejected,1
Unsupported,0

0,1
Distinct count,48744
Unique (%),100.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,277800
Minimum,100001
Maximum,456250
Zeros (%),0.0%

0,1
Minimum,100001
5-th percentile,117040
Q1,188560
Median,277550
Q3,367560
95-th percentile,438550
Maximum,456250
Range,356249
Interquartile range,179000

0,1
Standard deviation,103170
Coef of variation,0.37139
Kurtosis,-1.2063
Mean,277800
MAD,89401
Skewness,0.0075597
Sum,13540921192
Variance,10644000000
Memory size,380.9 KiB

Value,Count,Frequency (%),Unnamed: 3
198655,1,0.0%,
415104,1,0.0%,
200572,1,0.0%,
132490,1,0.0%,
333192,1,0.0%,
290183,1,0.0%,
386308,1,0.0%,
369105,1,0.0%,
281987,1,0.0%,
229357,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
100001,1,0.0%,
100005,1,0.0%,
100013,1,0.0%,
100028,1,0.0%,
100038,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
456221,1,0.0%,
456222,1,0.0%,
456223,1,0.0%,
456224,1,0.0%,
456250,1,0.0%,

0,1
Constant value,0.5

Unnamed: 0,SK_ID_CURR,TARGET
0,100001,0.5
1,100005,0.5
2,100013,0.5
3,100028,0.5
4,100038,0.5


# Variable descriptions

Now let's look at the variable descriptions; there may be something useful in them.

In [31]:
columns_description

Unnamed: 0,Table,Row,Description,Special
1,application_{train|test}.csv,SK_ID_CURR,ID of loan in our sample,
2,application_{train|test}.csv,TARGET,"Target variable (1 - client with payment difficulties: he/she had late payment more than X days on at least one of the first Y installments of the loan in our sample, 0 - all other cases)",
5,application_{train|test}.csv,NAME_CONTRACT_TYPE,Identification if loan is cash or revolving,
6,application_{train|test}.csv,CODE_GENDER,Gender of the client,
7,application_{train|test}.csv,FLAG_OWN_CAR,Flag if the client owns a car,
8,application_{train|test}.csv,FLAG_OWN_REALTY,Flag if client owns a house or flat,
9,application_{train|test}.csv,CNT_CHILDREN,Number of children the client has,
10,application_{train|test}.csv,AMT_INCOME_TOTAL,Income of the client,
11,application_{train|test}.csv,AMT_CREDIT,Credit amount of the loan,
12,application_{train|test}.csv,AMT_ANNUITY,Loan annuity,
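
Some variable names, such as `AMT_CREDIT` and `AMT_ANNUITY`, occur in more than one table, so it can be handy to look a description up by name instead of scrolling through the whole frame. The sketch below assumes the layout shown above (columns `Table`, `Row`, `Description`, `Special`, with the variable name stored in `Row`); `describe_column` is a hypothetical helper and `toy_description` is a small stand-in for `columns_description`.

```python
import pandas as pd

# Toy stand-in for `columns_description`; rows copied from the table above.
toy_description = pd.DataFrame({
    "Table": ["application_{train|test}.csv"] * 3,
    "Row": ["SK_ID_CURR", "AMT_CREDIT", "AMT_ANNUITY"],
    "Description": [
        "ID of loan in our sample",
        "Credit amount of the loan",
        "Loan annuity",
    ],
    "Special": [None, None, None],
})


def describe_column(description, column, table=None):
    """Return the description rows for `column`.

    If `table` is given, keep only rows whose table name contains that
    substring (plain substring match, no regular expressions).
    """
    mask = description["Row"].str.strip() == column
    if table is not None:
        mask &= description["Table"].str.contains(table, regex=False)
    return description.loc[mask, ["Table", "Description", "Special"]]


print(describe_column(toy_description, "AMT_CREDIT", table="application"))
```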


![Image of data scheme](https://storage.googleapis.com/kaggle-media/competitions/home-credit/home_credit.png)

Now let's look at how records link across the tables: how many rows match in each join.

In [34]:
len(pd.merge(train, bureau, on='SK_ID_CURR', how='inner'))

1465325

In [35]:
len(pd.merge(test, bureau, on='SK_ID_CURR', how='inner'))

251103

In [36]:
len(pd.merge(train, previous_application, on='SK_ID_CURR', how='inner'))

1413701

In [37]:
len(pd.merge(test, previous_application, on='SK_ID_CURR', how='inner'))

256513
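
The length of an inner merge says how many child rows attach to the sample, but not how many clients have any history at all. A possible complement, sketched on toy data with a hypothetical helper `coverage`, counts both. It assumes the key is unique in the parent table, as `SK_ID_CURR` should be in the application table (one row per loan).

```python
import pandas as pd


def coverage(parent, child, key="SK_ID_CURR"):
    """Summarize how rows of `child` attach to the keys of `parent`.

    Assumes `key` is unique in `parent`; under that assumption
    `child_rows_matched` equals the length of the inner merge.
    """
    matched_rows = child[key].isin(parent[key])
    parents_with_child = parent[key].isin(child[key])
    return {
        "child_rows_matched": int(matched_rows.sum()),
        "parents_with_match": int(parents_with_child.sum()),
        "parent_share_with_match": float(parents_with_child.mean()),
    }


# Toy data, not the competition files.
toy_train = pd.DataFrame({"SK_ID_CURR": [100002, 100003, 100004, 100006]})
toy_prev = pd.DataFrame({
    "SK_ID_PREV": [1, 2, 3, 4, 5],
    "SK_ID_CURR": [100003, 100003, 100003, 100004, 100001],
})
summary = coverage(toy_train, toy_prev)
print(summary)
```

On the real tables the call would look like `coverage(train, previous_application)`; the share of clients without any match indicates how many rows will have empty aggregated features after a left join.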

In [43]:
len(pd.merge(previous_application, POS_CASH_balance, on='SK_ID_PREV', how='inner'))

9660797

In [44]:
len(pd.merge(previous_application, installments_payments, on='SK_ID_PREV', how='inner'))

12354575

In [45]:
len(pd.merge(previous_application, credit_card_balance, on='SK_ID_PREV', how='inner'))

2757496

In [38]:
len(pd.merge(bureau, bureau_balance, on='SK_ID_BUREAU', how='inner'))

24179741
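
Joins of this size are expensive to materialize just to read off their length. The row count of an inner join can be derived from per-key counts alone: each key present on both sides contributes (count on the left) × (count on the right) rows. The sketch below uses a hypothetical helper `inner_join_size` on toy frames and assumes the key has no missing values (`value_counts` skips NaN, whereas `pd.merge` matches NaN keys with each other).

```python
import pandas as pd


def inner_join_size(left, right, key):
    """Row count of an inner join on `key`, without building the joined frame.

    Assumes `key` contains no missing values on either side.
    """
    left_counts = left[key].value_counts()
    right_counts = right[key].value_counts()
    # Keys present on both sides; each contributes count_left * count_right rows.
    common = left_counts.index.intersection(right_counts.index)
    return int((left_counts.loc[common] * right_counts.loc[common]).sum())


# Toy data, not the competition files.
toy_bureau = pd.DataFrame({"SK_ID_BUREAU": [10, 11, 12]})
toy_balance = pd.DataFrame({"SK_ID_BUREAU": [10, 10, 10, 11, 13]})

n_rows = inner_join_size(toy_bureau, toy_balance, "SK_ID_BUREAU")
print(n_rows)
```

On the real tables this would be called as `inner_join_size(bureau, bureau_balance, 'SK_ID_BUREAU')` and should agree with `len(pd.merge(...))` while holding only two count Series in memory.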

In [40]:
len(pd.merge(train, POS_CASH_balance, on='SK_ID_CURR', how='inner'))

8543375

In [41]:
len(pd.merge(test, POS_CASH_balance, on='SK_ID_CURR', how='inner'))

1457983