# Examining Racial Discrimination in the US Job Market

### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés to black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes when presented to the employer.

### Exercises
You will perform a statistical analysis to establish whether race has a significant impact on the rate of callbacks for resumes.

Answer the following questions **in this notebook below and submit to your GitHub account**.

   1. What test is appropriate for this problem? Does the Central Limit Theorem (CLT) apply?
   2. What are the null and alternative hypotheses?
   3. Compute the margin of error, confidence interval, and p-value. Try using both the bootstrapping and the frequentist statistical approaches.
   4. Write a story describing the statistical significance in the context of the original problem.
   5. Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?
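
One way to frame questions 1 to 3 is as a comparison of two Bernoulli proportions. The block below is a sketch of a standard two-proportion z-test under that framing; the symbols $p_w$, $p_b$ (true callback rates), $x_w$, $x_b$ (callback counts) and $n_w$, $n_b$ (résumés per group) are notation introduced here for illustration, not part of the original exercise.

$$
H_0: p_w = p_b \qquad \text{versus} \qquad H_A: p_w \neq p_b
$$

$$
\hat p_w = \frac{x_w}{n_w}, \qquad \hat p_b = \frac{x_b}{n_b}, \qquad \hat p = \frac{x_w + x_b}{n_w + n_b}
$$

$$
z = \frac{\hat p_w - \hat p_b}{\sqrt{\hat p\,(1 - \hat p)\left(\frac{1}{n_w} + \frac{1}{n_b}\right)}}
$$

$$
\text{ME} = z_{0.975}\sqrt{\frac{\hat p_w (1 - \hat p_w)}{n_w} + \frac{\hat p_b (1 - \hat p_b)}{n_b}}, \qquad \text{CI} = (\hat p_w - \hat p_b) \pm \text{ME}
$$

The pooled estimate $\hat p$ appears in the test statistic because the two rates are equal under $H_0$; the interval uses the unpooled standard error because it does not assume $H_0$.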

You can include written notes in notebook cells using Markdown: 
   - In the control panel at the top, choose Cell > Cell Type > Markdown
   - Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet

#### Resources
+ Experiment information and data source: http://www.povertyactionlab.org/evaluation/discrimination-job-market-united-states
+ Scipy statistical methods: http://docs.scipy.org/doc/scipy/reference/stats.html 
+ Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet
+ Formulas for the Bernoulli distribution: https://en.wikipedia.org/wiki/Bernoulli_distribution
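
As a rough illustration of the frequentist approach, the sketch below runs a two-proportion z-test from summary counts using only the Python standard library. The function names `clt_applies` and `two_proportion_ztest` are hypothetical helpers, not part of the exercise or of scipy. The counts are partly assumed: 235 callbacks for white-sounding names and 2435 résumés per group come from cells in this notebook, while 157 callbacks for black-sounding names is an assumption obtained by subtracting 235 from the 392 positive values that the profile report shows for what appears to be the `call` column.

```python
import math
from statistics import NormalDist


def clt_applies(x, n, threshold=10):
    """Rule-of-thumb check: at least `threshold` successes and failures."""
    return x >= threshold and (n - x) >= threshold


def two_proportion_ztest(x_w, n_w, x_b, n_b, confidence=0.95):
    """Two-sided z-test and confidence interval for p_w - p_b."""
    p_w, p_b = x_w / n_w, x_b / n_b
    diff = p_w - p_b

    # Under H0 both groups share one rate, so the test uses the pooled SE.
    p_pool = (x_w + x_b) / (n_w + n_b)
    se_null = math.sqrt(p_pool * (1 - p_pool) * (1 / n_w + 1 / n_b))
    z = diff / se_null
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # The interval does not assume H0, so it uses the unpooled SE.
    se = math.sqrt(p_w * (1 - p_w) / n_w + p_b * (1 - p_b) / n_b)
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z_crit * se
    return {
        "diff": diff,
        "z": z,
        "p_value": p_value,
        "margin_of_error": margin,
        "ci": (diff - margin, diff + margin),
    }


# 235 and 2435 are taken from the notebook; 157 is an ASSUMPTION (392 - 235).
x_w, n_w, x_b, n_b = 235, 2435, 157, 2435
print("CLT rule of thumb holds:", clt_applies(x_w, n_w) and clt_applies(x_b, n_b))
result = two_proportion_ztest(x_w, n_w, x_b, n_b)
print(result)
```

In the notebook itself the counts would normally be computed from `data` rather than typed in, which also removes the need for the assumed value.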

In [11]:
import pandas as pd
import numpy as np
from scipy import stats
%matplotlib inline
import pandas_profiling
import matplotlib.pyplot as plt
import bokeh.plotting as bkp
from scipy.stats import ttest_ind
from mpl_toolkits.axes_grid1 import make_axes_locatable

In [4]:
# load the Stata file through the public pandas entry point
data = pd.read_stata('data/us_job_market_discrimination.dta')

In [5]:
# number of callbacks for white-sounding names
sum(data[data.race=='w'].call)

235.0
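
The count above (235 callbacks for `race == 'w'`) can also feed a resampling analysis. The following is a minimal sketch of a percentile bootstrap confidence interval and a permutation test, written with the standard library only so it runs without the data file. The helper names (`make_group`, `bootstrap_diff_ci`, `permutation_p_value`) are hypothetical, and the 157 callbacks used for black-sounding names is an assumption: it is the 392 positive values that the profile report shows for what appears to be the `call` column, minus 235.

```python
import random


def make_group(successes, n):
    """Rebuild a 0/1 outcome list from summary counts."""
    return [1] * successes + [0] * (n - successes)


def bootstrap_diff_ci(white, black, n_reps=1000, confidence=0.95, seed=0):
    """Percentile bootstrap CI for the difference in callback rates."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_reps):
        # Resample each group with replacement, keeping its original size.
        w = rng.choices(white, k=len(white))
        b = rng.choices(black, k=len(black))
        diffs.append(sum(w) / len(w) - sum(b) / len(b))
    diffs.sort()
    lo = diffs[int((1 - confidence) / 2 * n_reps)]
    hi = diffs[int((1 + confidence) / 2 * n_reps) - 1]
    return lo, hi


def permutation_p_value(white, black, n_reps=1000, seed=0):
    """Two-sided permutation p-value for H0: both groups share one rate."""
    rng = random.Random(seed)
    n_w, n_b = len(white), len(black)
    observed = abs(sum(white) / n_w - sum(black) / n_b)
    pooled = white + black
    total = sum(pooled)
    hits = 0
    for _ in range(n_reps):
        # Randomly relabel: draw n_w outcomes as "white", the rest as "black".
        w_calls = sum(rng.sample(pooled, n_w))
        diff = abs(w_calls / n_w - (total - w_calls) / n_b)
        hits += diff >= observed
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_reps + 1)


white = make_group(235, 2435)  # count from the cell above
black = make_group(157, 2435)  # ASSUMPTION: 392 total callbacks minus 235
ci = bootstrap_diff_ci(white, black)
p_perm = permutation_p_value(white, black)
print("bootstrap 95% CI:", ci)
print("permutation p-value:", p_perm)
```

With 1000 replicates the permutation p-value cannot fall below 1/1001, so a very small true p-value shows up as that floor; more replicates would tighten the estimate at the cost of run time.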

In [6]:
data.head()

Unnamed: 0,id,ad,education,ofjobs,yearsexp,honors,volunteer,military,empholes,occupspecific,...,compreq,orgreq,manuf,transcom,bankreal,trade,busservice,othservice,missind,ownership
0,b,1,4,2,6,0,0,0,1,17,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
1,b,1,3,3,6,0,1,1,0,316,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
2,b,1,4,1,6,0,0,0,0,19,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
3,b,1,3,4,6,0,1,0,1,313,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
4,b,1,3,3,22,0,0,0,0,313,...,1.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,Nonprofit


In [7]:
pandas_profiling.ProfileReport(data)

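| Dataset info | Value |
|---|---|
| Number of variables | 65 |
| Number of observations | 4870 |
| Total Missing (%) | 11.0% |
| Total size in memory | 1.3 MiB |
| Average record size in memory | 270.0 B |

| Variable type | Count |
|---|---|
| Numeric | 22 |
| Categorical | 10 |
| Boolean | 32 |
| Date | 0 |
| Text (Unique) | 0 |
| Rejected | 1 |
| Unsupported | 0 |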

0,1
Distinct count,303
Unique (%),6.2%
Missing (%),0.0%
Missing (n),0

0,1
3,220
4,204
5,204
Other values (300),4242

Value,Count,Frequency (%),Unnamed: 3
3,220,4.5%,
4,204,4.2%,
5,204,4.2%,
1,200,4.1%,
6,200,4.1%,
2,196,4.0%,
7,186,3.8%,
8,186,3.8%,
9,184,3.8%,
10,178,3.7%,

0,1
Distinct count,1323
Unique (%),27.2%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,651.78
Minimum,1.0
Maximum,1344.0
Zeros (%),0.0%

0,1
Minimum,1.0
5-th percentile,61.45
Q1,306.25
Median,647.0
Q3,979.75
95-th percentile,1273.6
Maximum,1344.0
Range,1343.0
Interquartile range,673.5

0,1
Standard deviation,388.69
Coef of variation,0.59635
Kurtosis,-1.19236
Mean,651.78
MAD,335.84
Skewness,0.0567797
Sum,3.17416e+06
Variance,151080
Memory size,57.1 KiB

Value,Count,Frequency (%),Unnamed: 3
1309.0,4,0.1%,
671.0,4,0.1%,
795.0,4,0.1%,
901.0,4,0.1%,
1015.0,4,0.1%,
1154.0,4,0.1%,
1254.0,4,0.1%,
591.0,4,0.1%,
449.0,4,0.1%,
1337.0,4,0.1%,

Value,Count,Frequency (%),Unnamed: 3
1.0,4,0.1%,
2.0,4,0.1%,
3.0,4,0.1%,
4.0,4,0.1%,
5.0,4,0.1%,

Value,Count,Frequency (%),Unnamed: 3
1340.0,4,0.1%,
1341.0,4,0.1%,
1342.0,4,0.1%,
1343.0,4,0.1%,
1344.0,4,0.1%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.08501

0,1
0.0,4456
1.0,414

Value,Count,Frequency (%),Unnamed: 3
0.0,4456,91.5%,
1.0,414,8.5%,

0,1
Distinct count,119
Unique (%),2.4%
Missing (%),86.5%
Missing (n),4212
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,755.42
Minimum,0.0
Maximum,12208.0
Zeros (%),0.3%

0,1
Minimum,0.0
5-th percentile,30.0
Q1,97.0
Median,200.0
Q3,500.0
95-th percentile,4528.0
Maximum,12208.0
Range,12208.0
Interquartile range,403.0

0,1
Standard deviation,1665.2
Coef of variation,2.2043
Kurtosis,15.1594
Mean,755.42
MAD,897.9
Skewness,3.75676
Sum,497064.0
Variance,2772800
Memory size,57.1 KiB

Value,Count,Frequency (%),Unnamed: 3
100.0,32,0.7%,
500.0,30,0.6%,
60.0,20,0.4%,
150.0,20,0.4%,
40.0,20,0.4%,
120.0,18,0.4%,
0.0,16,0.3%,
250.0,16,0.3%,
65.0,14,0.3%,
175.0,12,0.2%,

Value,Count,Frequency (%),Unnamed: 3
0.0,16,0.3%,
14.0,4,0.1%,
20.0,2,0.0%,
28.0,4,0.1%,
30.0,12,0.2%,

Value,Count,Frequency (%),Unnamed: 3
6829.0,4,0.1%,
7200.0,4,0.1%,
8504.0,6,0.1%,
8577.0,4,0.1%,
12208.0,2,0.0%,

0,1
Distinct count,132
Unique (%),2.7%
Missing (%),87.5%
Missing (n),4262
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,196.05
Minimum,0.0
Maximum,10500.0
Zeros (%),0.1%

0,1
Minimum,0.0
5-th percentile,5.0
Q1,13.0
Median,34.9
Q3,86.7
95-th percentile,637.6
Maximum,10500.0
Range,10500.0
Interquartile range,73.7

0,1
Standard deviation,896.51
Coef of variation,4.5729
Kurtosis,113.977
Mean,196.05
MAD,271.93
Skewness,10.2854
Sum,119199.0
Variance,803730
Memory size,57.1 KiB

Value,Count,Frequency (%),Unnamed: 3
17.5,20,0.4%,
300.0,14,0.3%,
8.69999980927,14,0.3%,
14.3000001907,12,0.2%,
75.0,12,0.2%,
37.5,12,0.2%,
5.0,12,0.2%,
40.0,10,0.2%,
18.0,8,0.2%,
133.399993896,8,0.2%,

Value,Count,Frequency (%),Unnamed: 3
0.0,4,0.1%,
2.40000009537,2,0.0%,
2.5,6,0.1%,
4.0,4,0.1%,
4.90000009537,4,0.1%,

Value,Count,Frequency (%),Unnamed: 3
1000.0,4,0.1%,
1214.59997559,4,0.1%,
1432.0,6,0.1%,
3724.80004883,2,0.0%,
10500.0,4,0.1%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.26776

0,1
0.0,3566
1.0,1304

Value,Count,Frequency (%),Unnamed: 3
0.0,3566,73.2%,
1.0,1304,26.8%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.080493

0,1
0.0,4478
1.0,392

Value,Count,Frequency (%),Unnamed: 3
0.0,4478,92.0%,
1.0,392,8.0%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
c,2704
b,2166

Value,Count,Frequency (%),Unnamed: 3
c,2704,55.5%,
b,2166,44.5%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.71951

0,1
1.0,3504
0.0,1366

Value,Count,Frequency (%),Unnamed: 3
1.0,3504,72.0%,
0.0,1366,28.0%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.43717

0,1
0.0,2741
1.0,2129

Value,Count,Frequency (%),Unnamed: 3
0.0,2741,56.3%,
1.0,2129,43.7%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.82053

0,1
1,3996
0,874

Value,Count,Frequency (%),Unnamed: 3
1,3996,82.1%,
0,874,17.9%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.12485

0,1
0.0,4262
1.0,608

Value,Count,Frequency (%),Unnamed: 3
0.0,4262,87.5%,
1.0,608,12.5%,

0,1
Distinct count,5
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,3.6185
Minimum,0
Maximum,4
Zeros (%),0.9%

0,1
Minimum,0
5-th percentile,2
Q1,3
Median,4
Q3,4
95-th percentile,4
Maximum,4
Range,4
Interquartile range,1

0,1
Standard deviation,0.715
Coef of variation,0.1976
Kurtosis,6.3362
Mean,3.6185
MAD,0.54901
Skewness,-2.3061
Sum,17622
Variance,0.51122
Memory size,42.8 KiB

Value,Count,Frequency (%),Unnamed: 3
4,3504,72.0%,
3,1006,20.7%,
2,274,5.6%,
0,46,0.9%,
1,40,0.8%,

Value,Count,Frequency (%),Unnamed: 3
0,46,0.9%,
1,40,0.8%,
2,274,5.6%,
3,1006,20.7%,
4,3504,72.0%,

Value,Count,Frequency (%),Unnamed: 3
0,46,0.9%,
1,40,0.8%,
2,274,5.6%,
3,1006,20.7%,
4,3504,72.0%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.10678

0,1
0.0,4350
1.0,520

Value,Count,Frequency (%),Unnamed: 3
0.0,4350,89.3%,
1.0,520,10.7%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.47926

0,1
0,2536
1,2334

Value,Count,Frequency (%),Unnamed: 3
0,2536,52.1%,
1,2334,47.9%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.44805

0,1
0,2688
1,2182

Value,Count,Frequency (%),Unnamed: 3
0,2688,55.2%,
1,2182,44.8%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.29117

0,1
0.0,3452
1.0,1418

Value,Count,Frequency (%),Unnamed: 3
0.0,3452,70.9%,
1.0,1418,29.1%,

0,1
Distinct count,13
Unique (%),0.3%
Missing (%),0.0%
Missing (n),0

0,1
,2746
some,1064
2,356
Other values (10),704

Value,Count,Frequency (%),Unnamed: 3
,2746,56.4%,
some,1064,21.8%,
2,356,7.3%,
3,331,6.8%,
5,163,3.3%,
1,142,2.9%,
10,18,0.4%,
7,12,0.2%,
8,10,0.2%,
4,8,0.2%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.43532

0,1
0.0,2750
1.0,2120

Value,Count,Frequency (%),Unnamed: 3
0.0,2750,56.5%,
1.0,2120,43.5%,

0,1
Distinct count,3
Unique (%),0.1%
Missing (%),36.3%
Missing (n),1768
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.11476
Minimum,0.0
Maximum,1.0
Zeros (%),56.4%

0,1
Minimum,0.0
5-th percentile,0.0
Q1,0.0
Median,0.0
Q3,0.0
95-th percentile,1.0
Maximum,1.0
Range,1.0
Interquartile range,0.0

0,1
Standard deviation,0.31879
Coef of variation,2.7778
Kurtosis,3.85126
Mean,0.11476
MAD,0.20319
Skewness,2.41843
Sum,356.0
Variance,0.10163
Memory size,57.1 KiB

Value,Count,Frequency (%),Unnamed: 3
0.0,2746,56.4%,
1.0,356,7.3%,
(Missing),1768,36.3%,

Value,Count,Frequency (%),Unnamed: 3
0.0,2746,56.4%,
1.0,356,7.3%,

Value,Count,Frequency (%),Unnamed: 3
0.0,2746,56.4%,
1.0,356,7.3%,

0,1
Distinct count,36
Unique (%),0.7%
Missing (%),0.0%
Missing (n),0

0,1
Tamika,256
Anne,242
Allison,232
Other values (33),4140

Value,Count,Frequency (%),Unnamed: 3
Tamika,256,5.3%,
Anne,242,5.0%,
Allison,232,4.8%,
Latonya,230,4.7%,
Emily,227,4.7%,
Latoya,226,4.6%,
Kristen,213,4.4%,
Ebony,208,4.3%,
Tanisha,207,4.3%,
Jill,203,4.2%,

0,1
Distinct count,63
Unique (%),1.3%
Missing (%),1.8%
Missing (n),86
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.31083
Minimum,0.0
Maximum,0.992043
Zeros (%),0.6%

0,1
Minimum,0.0
5-th percentile,0.0033359
Q1,0.045275
Median,0.15995
Q3,0.51685
95-th percentile,0.97708
Maximum,0.992043
Range,0.992043
Interquartile range,0.47158

0,1
Standard deviation,0.33247
Coef of variation,1.0696
Kurtosis,-0.597835
Mean,0.31083
MAD,0.28282
Skewness,0.905349
Sum,1487.0
Variance,0.11054
Memory size,57.1 KiB

Value,Count,Frequency (%),Unnamed: 3
0.00922433286905,148,3.0%,
0.189535841346,145,3.0%,
0.586862146854,144,3.0%,
0.0186697430909,144,3.0%,
0.00333586055785,142,2.9%,
0.0223919916898,139,2.9%,
0.144843429327,137,2.8%,
0.303463429213,130,2.7%,
0.436394661665,127,2.6%,
0.0498238056898,125,2.6%,

Value,Count,Frequency (%),Unnamed: 3
0.0,29,0.6%,
0.000904412940145,34,0.7%,
0.00217499886639,57,1.2%,
0.00279568345286,80,1.6%,
0.00333586055785,142,2.9%,

Value,Count,Frequency (%),Unnamed: 3
0.977077245712,61,1.3%,
0.988519906998,69,1.4%,
0.989359557629,23,0.5%,
0.989495635033,71,1.5%,
0.992042720318,65,1.3%,

0,1
Distinct count,224
Unique (%),4.6%
Missing (%),60.6%
Missing (n),2952
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.079096
Minimum,0.0
Maximum,0.98936
Zeros (%),1.3%

0,1
Minimum,0.0
5-th percentile,0.00096988
Q1,0.007125
Median,0.017404
Q3,0.089956
95-th percentile,0.33754
Maximum,0.98936
Range,0.98936
Interquartile range,0.082831

0,1
Standard deviation,0.14974
Coef of variation,1.8932
Kurtosis,14.6298
Mean,0.079096
MAD,0.089157
Skewness,3.57253
Sum,151.705
Variance,0.022423
Memory size,57.1 KiB

Value,Count,Frequency (%),Unnamed: 3
0.125,74,1.5%,
0.0452748835087,72,1.5%,
0.0,62,1.3%,
0.0899563282728,46,0.9%,
0.00403025187552,32,0.7%,
0.336165100336,32,0.7%,
0.00375719554722,32,0.7%,
0.011153427884,32,0.7%,
0.0114176971838,32,0.7%,
0.0121448710561,32,0.7%,

Value,Count,Frequency (%),Unnamed: 3
0.0,62,1.3%,
0.00039625930367,16,0.3%,
0.000575746642426,4,0.1%,
0.000646621396299,4,0.1%,
0.000795498664957,2,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.832138895988,4,0.1%,
0.9452906847,4,0.1%,
0.954597353935,4,0.1%,
0.971468806267,4,0.1%,
0.989359557629,4,0.1%,

0,1
Distinct count,63
Unique (%),1.3%
Missing (%),1.8%
Missing (n),86
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.21382
Minimum,0.0308469
Maximum,0.780124
Zeros (%),0.0%

0,1
Minimum,0.0308469
5-th percentile,0.036977
Q1,0.092559
Median,0.14505
Q3,0.28431
95-th percentile,0.59001
Maximum,0.780124
Range,0.749277
Interquartile range,0.19176

0,1
Standard deviation,0.1693
Coef of variation,0.79182
Kurtosis,0.808599
Mean,0.21382
MAD,0.13487
Skewness,1.29267
Sum,1022.91
Variance,0.028664
Memory size,57.1 KiB

Value,Count,Frequency (%),Unnamed: 3
0.145394697785,148,3.0%,
0.136053428054,145,3.0%,
0.0925594270229,144,3.0%,
0.132418602705,144,3.0%,
0.328430622816,142,2.9%,
0.187565863132,139,2.9%,
0.263202995062,137,2.8%,
0.119755521417,130,2.7%,
0.144759297371,127,2.6%,
0.550802648067,125,2.6%,

Value,Count,Frequency (%),Unnamed: 3
0.0308468621224,96,2.0%,
0.0357846468687,105,2.2%,
0.0369772836566,54,1.1%,
0.0376615077257,23,0.5%,
0.0481537804008,71,1.5%,

Value,Count,Frequency (%),Unnamed: 3
0.550802648067,125,2.6%,
0.590010762215,77,1.6%,
0.591695487499,25,0.5%,
0.670837879181,117,2.4%,
0.780124247074,29,0.6%,

0,1
Distinct count,232
Unique (%),4.8%
Missing (%),60.6%
Missing (n),2952
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.33387
Minimum,0.0308469
Maximum,0.892857
Zeros (%),0.0%

0,1
Minimum,0.0308469
5-th percentile,0.094114
Q1,0.20197
Median,0.28841
Q3,0.41235
95-th percentile,0.74648
Maximum,0.892857
Range,0.86201
Interquartile range,0.21038

0,1
Standard deviation,0.19201
Coef of variation,0.57511
Kurtosis,1.04001
Mean,0.33387
MAD,0.14803
Skewness,1.1106
Sum,640.368
Variance,0.036869
Memory size,57.1 KiB

Value,Count,Frequency (%),Unnamed: 3
0.892857134342,74,1.5%,
0.590010762215,72,1.5%,
0.606768548489,46,0.9%,
0.394765645266,32,0.7%,
0.406575918198,32,0.7%,
0.146427214146,32,0.7%,
0.406993329525,32,0.7%,
0.286532193422,32,0.7%,
0.670837879181,32,0.7%,
0.522528707981,28,0.6%,

Value,Count,Frequency (%),Unnamed: 3
0.0308468621224,4,0.1%,
0.0357846468687,2,0.0%,
0.0369772836566,8,0.2%,
0.0376615077257,4,0.1%,
0.0559815950692,4,0.1%,

Value,Count,Frequency (%),Unnamed: 3
0.606768548489,46,0.9%,
0.670837879181,32,0.7%,
0.74647885561,10,0.2%,
0.780124247074,20,0.4%,
0.892857134342,74,1.5%,

0,1
Distinct count,63
Unique (%),1.3%
Missing (%),1.8%
Missing (n),86
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.18567
Minimum,0.0
Maximum,0.356164
Zeros (%),0.6%

0,1
Minimum,0.0
5-th percentile,0.029652
Q1,0.13971
Median,0.19075
Q3,0.2382
95-th percentile,0.3048
Maximum,0.356164
Range,0.356164
Interquartile range,0.098484

0,1
Standard deviation,0.081747
Coef of variation,0.44027
Kurtosis,-0.388231
Mean,0.18567
MAD,0.063922
Skewness,-0.285851
Sum,888.254
Variance,0.0066826
Memory size,57.1 KiB

Value,Count,Frequency (%),Unnamed: 3
0.17773129046,148,3.0%,
0.173128113151,145,3.0%,
0.180472582579,144,3.0%,
0.274885416031,144,3.0%,
0.190750569105,142,2.9%,
0.356164395809,139,2.9%,
0.0688580796123,137,2.8%,
0.189628824592,130,2.7%,
0.180402874947,127,2.6%,
0.0583398602903,125,2.6%,

Value,Count,Frequency (%),Unnamed: 3
0.0,29,0.6%,
0.0156555771828,121,2.5%,
0.0296517945826,117,2.4%,
0.030812073499,77,1.6%,
0.0583398602903,125,2.6%,

Value,Count,Frequency (%),Unnamed: 3
0.299338251352,84,1.7%,
0.299457043409,57,1.2%,
0.30479863286,54,1.1%,
0.312872767448,96,2.0%,
0.356164395809,139,2.9%,

0,1
Distinct count,231
Unique (%),4.7%
Missing (%),60.6%
Missing (n),2952
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.10169
Minimum,0.0
Maximum,0.356164
Zeros (%),1.9%

0,1
Minimum,0.0
5-th percentile,0.01454
Q1,0.047958
Median,0.087009
Q3,0.14264
95-th percentile,0.2384
Maximum,0.356164
Range,0.356164
Interquartile range,0.094678

0,1
Standard deviation,0.071293
Coef of variation,0.70106
Kurtosis,0.831893
Mean,0.10169
MAD,0.055437
Skewness,0.996135
Sum,195.045
Variance,0.0050826
Memory size,57.1 KiB

Value,Count,Frequency (%),Unnamed: 3
0.0,94,1.9%,
0.030812073499,72,1.5%,
0.0279475990683,46,0.9%,
0.049830827862,32,0.7%,
0.0870093256235,32,0.7%,
0.157457515597,32,0.7%,
0.108847863972,32,0.7%,
0.0296517945826,32,0.7%,
0.168411031365,32,0.7%,
0.110030688345,28,0.6%,

Value,Count,Frequency (%),Unnamed: 3
0.0,94,1.9%,
0.0145399849862,4,0.1%,
0.0147456638515,4,0.1%,
0.0156555771828,24,0.5%,
0.0160771701485,16,0.3%,

Value,Count,Frequency (%),Unnamed: 3
0.299338251352,12,0.2%,
0.300435423851,4,0.1%,
0.30479863286,8,0.2%,
0.312872767448,4,0.1%,
0.356164395809,12,0.2%,

0,1
Distinct count,63
Unique (%),1.3%
Missing (%),1.8%
Missing (n),86
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.54277
Minimum,0.00481429
Maximum,0.981653
Zeros (%),0.0%

0,1
Minimum,0.00481429
5-th percentile,0.0141
Q1,0.25216
Median,0.57183
Q3,0.8738
95-th percentile,0.97552
Maximum,0.981653
Range,0.976839
Interquartile range,0.62164

0,1
Standard deviation,0.32947
Coef of variation,0.60701
Kurtosis,-1.36606
Mean,0.54277
MAD,0.29471
Skewness,-0.234899
Sum,2596.6
Variance,0.10855
Memory size,57.1 KiB

Value,Count,Frequency (%),Unnamed: 3
0.965914547443,148,3.0%,
0.665601849556,145,3.0%,
0.337866216898,144,3.0%,
0.873804688454,144,3.0%,
0.981652796268,142,2.9%,
0.221549004316,139,2.9%,
0.716077268124,137,2.8%,
0.471891522408,130,2.7%,
0.333424389362,127,2.6%,
0.883613944054,125,2.6%,

Value,Count,Frequency (%),Unnamed: 3
0.00481428811327,65,1.3%,
0.0048631252721,71,1.5%,
0.00550022022799,23,0.5%,
0.00814846530557,69,1.4%,
0.0140997087583,61,1.3%,

Value,Count,Frequency (%),Unnamed: 3
0.96273291111,29,0.6%,
0.965914547443,148,3.0%,
0.973675429821,60,1.2%,
0.975516855717,117,2.4%,
0.981652796268,142,2.9%,

0,1
Distinct count,232
Unique (%),4.8%
Missing (%),60.6%
Missing (n),2952
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.84376
Minimum,0.00550022
Maximum,1.0
Zeros (%),0.0%

0,1
Minimum,0.00550022
5-th percentile,0.46474
Q1,0.82414
Median,0.90073
Q3,0.95636
95-th percentile,0.97978
Maximum,1.0
Range,0.9945
Interquartile range,0.13222

0,1
Standard deviation,0.18299
Coef of variation,0.21687
Kurtosis,5.72233
Mean,0.84376
MAD,0.12442
Skewness,-2.35613
Sum,1618.34
Variance,0.033486
Memory size,57.1 KiB

Value,Count,Frequency (%),Unnamed: 3
0.875,74,1.5%,
0.898176431656,72,1.5%,
0.862882077694,46,0.9%,
0.937150299549,32,0.7%,
0.900727331638,32,0.7%,
0.637369632721,32,0.7%,
0.926485240459,32,0.7%,
0.975516855717,32,0.7%,
0.955994307995,32,0.7%,
0.791502475739,28,0.6%,

Value,Count,Frequency (%),Unnamed: 3
0.00550022022799,4,0.1%,
0.024071501568,4,0.1%,
0.0428035445511,4,0.1%,
0.0483303107321,4,0.1%,
0.0964076370001,4,0.1%,

Value,Count,Frequency (%),Unnamed: 3
0.99280667305,4,0.1%,
0.993617892265,4,0.1%,
0.996473073959,4,0.1%,
0.997173130512,4,0.1%,
1.0,10,0.2%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.50226

0,1
1.0,2446
0.0,2424

Value,Count,Frequency (%),Unnamed: 3
1.0,2446,50.2%,
0.0,2424,49.8%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.052772

0,1
0,4613
1,257

Value,Count,Frequency (%),Unnamed: 3
0,4613,94.7%,
1,257,5.3%,

0,1
Distinct count,289
Unique (%),5.9%
Missing (%),0.0%
Missing (n),0

0,1
b,3364
a,358
147,4
Other values (286),1144

Value,Count,Frequency (%),Unnamed: 3
b,3364,69.1%,
a,358,7.4%,
147,4,0.1%,
64,4,0.1%,
86,4,0.1%,
175,4,0.1%,
373,4,0.1%,
301,4,0.1%,
19,4,0.1%,
213,4,0.1%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
a,2792
s,2078

Value,Count,Frequency (%),Unnamed: 3
a,2792,57.3%,
s,2078,42.7%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.49774

0,1
0.0,2446
1.0,2424

Value,Count,Frequency (%),Unnamed: 3
0.0,2446,50.2%,
1.0,2424,49.8%,

0,1
Correlation,0.93815

0,1
Distinct count,231
Unique (%),4.7%
Missing (%),60.6%
Missing (n),2952
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,10.032
Minimum,8.66251
Maximum,11.3624
Zeros (%),0.0%

0,1
Minimum,8.66251
5-th percentile,9.2644
Q1,9.6915
Median,9.9144
Q3,10.387
95-th percentile,11.079
Maximum,11.3624
Range,2.69993
Interquartile range,0.6954

0,1
Standard deviation,0.56782
Coef of variation,0.056603
Kurtosis,0.110862
Mean,10.032
MAD,0.44115
Skewness,0.498825
Sum,19240.4
Variance,0.32242
Memory size,57.1 KiB

Value,Count,Frequency (%),Unnamed: 3
11.3624401093,74,1.5%,
11.0786600113,72,1.5%,
10.9789848328,46,0.9%,
10.5187005997,32,0.7%,
10.8619174957,32,0.7%,
10.4121408463,32,0.7%,
10.3869314194,32,0.7%,
9.68707084656,32,0.7%,
9.92000007629,32,0.7%,
9.70436573029,28,0.6%,

Value,Count,Frequency (%),Unnamed: 3
8.66250514984,10,0.2%,
8.69567394257,10,0.2%,
8.70632457733,4,0.1%,
8.72826385498,4,0.1%,
8.73327159882,2,0.0%,

Value,Count,Frequency (%),Unnamed: 3
10.9789848328,46,0.9%,
10.9970035553,2,0.0%,
11.0397491455,20,0.4%,
11.0786600113,72,1.5%,
11.3624401093,74,1.5%,

0,1
Distinct count,63
Unique (%),1.3%
Missing (%),1.8%
Missing (n),86
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,10.147
Minimum,8.84174
Maximum,11.1193
Zeros (%),0.0%

0,1
Minimum,8.84174
5-th percentile,9.6143
Q1,9.9651
Median,10.144
Q3,10.343
95-th percentile,10.753
Maximum,11.1193
Range,2.27755
Interquartile range,0.37782

0,1
Standard deviation,0.34578
Coef of variation,0.034076
Kurtosis,1.75209
Mean,10.147
MAD,0.26801
Skewness,-0.597916
Sum,48544.1
Variance,0.11956
Memory size,57.1 KiB

Value,Count,Frequency (%),Unnamed: 3
10.1440782547,148,3.0%,
10.3428707123,145,3.0%,
10.2910604477,144,3.0%,
10.0399837494,144,3.0%,
10.2996768951,142,2.9%,
9.61433792114,139,2.9%,
9.96777629852,137,2.8%,
10.2307024002,130,2.7%,
9.96744823456,127,2.6%,
10.4836902618,125,2.6%,

Value,Count,Frequency (%),Unnamed: 3
8.84173774719,65,1.3%,
9.21463108063,20,0.4%,
9.47431850433,71,1.5%,
9.52748394012,23,0.5%,
9.61433792114,139,2.9%,

Value,Count,Frequency (%),Unnamed: 3
10.6310606003,65,1.3%,
10.7526979446,117,2.4%,
10.7584133148,34,0.7%,
10.7951784134,77,1.6%,
11.1192903519,29,0.6%,

0,1
Distinct count,231
Unique (%),4.7%
Missing (%),60.8%
Missing (n),2962
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,10.656
Minimum,9.17025
Maximum,11.8143
Zeros (%),0.0%

0,1
Minimum,9.17025
5-th percentile,9.842
Q1,10.449
Median,10.666
Q3,10.867
95-th percentile,11.45
Maximum,11.8143
Range,2.64406
Interquartile range,0.41847

0,1
Standard deviation,0.44193
Coef of variation,0.041474
Kurtosis,1.39787
Mean,10.656
MAD,0.31527
Skewness,0.00443649
Sum,20331.0
Variance,0.1953
Memory size,57.1 KiB

Value,Count,Frequency (%),Unnamed: 3
11.8143110275,74,1.5%,
10.7951784134,72,1.5%,
10.9872741699,46,0.9%,
10.9300489426,32,0.7%,
10.5110492706,32,0.7%,
11.1297597885,32,0.7%,
10.7526979446,32,0.7%,
10.5610589981,32,0.7%,
10.4319076538,32,0.7%,
10.5978088379,28,0.6%,

Value,Count,Frequency (%),Unnamed: 3
9.17024707794,4,0.1%,
9.21463108063,10,0.2%,
9.29743480682,4,0.1%,
9.52748394012,4,0.1%,
9.61433792114,12,0.2%,

Value,Count,Frequency (%),Unnamed: 3
11.4502515793,4,0.1%,
11.457444191,8,0.2%,
11.463924408,10,0.2%,
11.6264238358,2,0.0%,
11.8143110275,74,1.5%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.15216

0,1
0.0,4129
1.0,741

Value,Count,Frequency (%),Unnamed: 3
0.0,4129,84.8%,
1.0,741,15.2%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.082957

0,1
0.0,4466
1.0,404

Value,Count,Frequency (%),Unnamed: 3
0.0,4466,91.7%,
1.0,404,8.3%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.097125

0,1
0,4397
1,473

Value,Count,Frequency (%),Unnamed: 3
0,4397,90.3%,
1,473,9.7%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.16509

0,1
0.0,4066
1.0,804

Value,Count,Frequency (%),Unnamed: 3
0.0,4066,83.5%,
1.0,804,16.5%,

0,1
Distinct count,5
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,3.4815
Minimum,1
Maximum,6
Zeros (%),0.0%

0,1
Minimum,1
5-th percentile,1
Q1,1
Median,4
Q3,6
95-th percentile,6
Maximum,6
Range,5
Interquartile range,5

0,1
Standard deviation,2.038
Coef of variation,0.58539
Kurtosis,-1.5737
Mean,3.4815
MAD,1.8357
Skewness,-0.13061
Sum,16955
Variance,4.1536
Memory size,42.8 KiB

Value,Count,Frequency (%),Unnamed: 3
1,1770,36.3%,
6,1248,25.6%,
4,1241,25.5%,
5,450,9.2%,
3,161,3.3%,

Value,Count,Frequency (%),Unnamed: 3
1,1770,36.3%,
3,161,3.3%,
4,1241,25.5%,
5,450,9.2%,
6,1248,25.6%,

Value,Count,Frequency (%),Unnamed: 3
1,1770,36.3%,
3,161,3.3%,
4,1241,25.5%,
5,450,9.2%,
6,1248,25.6%,

0,1
Distinct count,61
Unique (%),1.3%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,215.64
Minimum,7
Maximum,903
Zeros (%),0.0%

0,1
Minimum,7
5-th percentile,13
Q1,27
Median,267
Q3,313
95-th percentile,387
Maximum,903
Range,896
Interquartile range,286

0,1
Standard deviation,148.13
Coef of variation,0.68693
Kurtosis,0.25317
Mean,215.64
MAD,127.53
Skewness,0.049522
Sum,1050156
Variance,21942
Memory size,47.6 KiB

Value,Count,Frequency (%),Unnamed: 3
313,527,10.8%,
13,504,10.3%,
21,353,7.2%,
285,348,7.1%,
267,342,7.0%,
274,243,5.0%,
379,238,4.9%,
316,222,4.6%,
34,204,4.2%,
27,162,3.3%,

Value,Count,Frequency (%),Unnamed: 3
7,32,0.7%,
8,9,0.2%,
9,1,0.0%,
13,504,10.3%,
17,137,2.8%,

Value,Count,Frequency (%),Unnamed: 3
448,6,0.1%,
461,10,0.2%,
785,33,0.7%,
804,4,0.1%,
903,3,0.1%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.11869

0,1
0.0,4292
1.0,578

Value,Count,Frequency (%),Unnamed: 3
0.0,4292,88.1%,
1.0,578,11.9%,

0,1
Distinct count,7
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,3.6614
Minimum,1
Maximum,7
Zeros (%),0.0%

0,1
Minimum,1
5-th percentile,2
Q1,3
Median,4
Q3,4
95-th percentile,6
Maximum,7
Range,6
Interquartile range,1

0,1
Standard deviation,1.2191
Coef of variation,0.33297
Kurtosis,-0.29039
Mean,3.6614
MAD,0.98871
Skewness,0.25708
Sum,17831
Variance,1.4863
Memory size,42.8 KiB

Value,Count,Frequency (%),Unnamed: 3
4,1611,33.1%,
3,1429,29.3%,
2,704,14.5%,
5,533,10.9%,
6,464,9.5%,
1,110,2.3%,
7,19,0.4%,

Value,Count,Frequency (%),Unnamed: 3
1,110,2.3%,
2,704,14.5%,
3,1429,29.3%,
4,1611,33.1%,
5,533,10.9%,

Value,Count,Frequency (%),Unnamed: 3
3,1429,29.3%,
4,1611,33.1%,
5,533,10.9%,
6,464,9.5%,
7,19,0.4%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.07269

0,1
0.0,4516
1.0,354

Value,Count,Frequency (%),Unnamed: 3
0.0,4516,92.7%,
1.0,354,7.3%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.15483

0,1
0.0,4116
1.0,754

Value,Count,Frequency (%),Unnamed: 3
0.0,4116,84.5%,
1.0,754,15.5%,

0,1
Distinct count,4
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Private,2134
,1992
Public,426

Value,Count,Frequency (%),Unnamed: 3
Private,2134,43.8%,
,1992,40.9%,
Public,426,8.7%,
Nonprofit,318,6.5%,

0,1
Distinct count,251
Unique (%),5.2%
Missing (%),64.6%
Missing (n),3148
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,2287.1
Minimum,0.0
Maximum,124500.0
Zeros (%),0.7%

0,1
Minimum,0.0
5-th percentile,20.0
Q1,98.0
Median,220.0
Q3,700.0
95-th percentile,10000.0
Maximum,124500.0
Range,124500.0
Interquartile range,602.0

0,1
Standard deviation,8902.8
Coef of variation,3.8927
Kurtosis,91.3115
Mean,2287.1
MAD,3421.4
Skewness,8.31184
Sum,3.9383e+06
Variance,79261000
Memory size,57.1 KiB

Value,Count,Frequency (%),Unnamed: 3
150.0,54,1.1%,
100.0,48,1.0%,
500.0,40,0.8%,
60.0,32,0.7%,
0.0,32,0.7%,
120.0,30,0.6%,
125.0,30,0.6%,
40.0,28,0.6%,
400.0,28,0.6%,
200.0,28,0.6%,

Value,Count,Frequency (%),Unnamed: 3
0.0,32,0.7%,
1.0,4,0.1%,
3.0,4,0.1%,
9.0,4,0.1%,
10.0,4,0.1%,

Value,Count,Frequency (%),Unnamed: 3
43800.0,4,0.1%,
48000.0,2,0.0%,
48900.0,4,0.1%,
60657.0,4,0.1%,
124500.0,4,0.1%,

0,1
Distinct count,322
Unique (%),6.6%
Missing (%),65.7%
Missing (n),3198
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,587.69
Minimum,0.0
Maximum,47947.6
Zeros (%),0.1%

0,1
Minimum,0.0
5-th percentile,5.2
Q1,12.975
Median,33.35
Q3,133.1
95-th percentile,2589.7
Maximum,47947.6
Range,47947.6
Interquartile range,120.12

0,1
Standard deviation,2907.6
Coef of variation,4.9476
Kurtosis,175.72
Mean,587.69
MAD,909.14
Skewness,11.9704
Sum,982612.0
Variance,8454300
Memory size,57.1 KiB

Value,Count,Frequency (%),Unnamed: 3
37.5,36,0.7%,
17.5,34,0.7%,
25.0,26,0.5%,
75.0,22,0.5%,
5.0,20,0.4%,
8.69999980927,18,0.4%,
10.0,16,0.3%,
9.80000019073,16,0.3%,
16.8999996185,16,0.3%,
5.19999980927,16,0.3%,

Value,Count,Frequency (%),Unnamed: 3
0.0,4,0.1%,
0.10000000149,4,0.1%,
0.300000011921,8,0.2%,
0.40000000596,4,0.1%,
1.0,4,0.1%,

Value,Count,Frequency (%),Unnamed: 3
10500.0,4,0.1%,
14268.0,2,0.0%,
15479.5996094,4,0.1%,
21784.5,4,0.1%,
47947.6015625,4,0.1%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
b,2435
w,2435

Value,Count,Frequency (%),Unnamed: 3
b,2435,50.0%,
w,2435,50.0%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.78727

0,1
1.0,3834
0.0,1036

Value,Count,Frequency (%),Unnamed: 3
1.0,3834,78.7%,
0.0,1036,21.3%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.16797

0,1
0.0,4052
1.0,818

Value,Count,Frequency (%),Unnamed: 3
0.0,4052,83.2%,
1.0,818,16.8%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.15113

0,1
0.0,4134
1.0,736

Value,Count,Frequency (%),Unnamed: 3
0.0,4134,84.9%,
1.0,736,15.1%,

0,1
Distinct count,4
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
,4350
somcol,252
colp,222

Value,Count,Frequency (%),Unnamed: 3
,4350,89.3%,
somcol,252,5.2%,
colp,222,4.6%,
hsg,46,0.9%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.33285

0,1
0.0,3249
1.0,1621

Value,Count,Frequency (%),Unnamed: 3
0.0,3249,66.7%,
1.0,1621,33.3%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
f,3746
m,1124

Value,Count,Frequency (%),Unnamed: 3
f,3746,76.9%,
m,1124,23.1%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.32875

0,1
0,3269
1,1601

Value,Count,Frequency (%),Unnamed: 3
0,3269,67.1%,
1,1601,32.9%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.077207

0,1
0.0,4494
1.0,376

Value,Count,Frequency (%),Unnamed: 3
0.0,4494,92.3%,
1.0,376,7.7%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.21396

0,1
0.0,3828
1.0,1042

Value,Count,Frequency (%),Unnamed: 3
0.0,3828,78.6%,
1.0,1042,21.4%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.03039

0,1
0.0,4722
1.0,148

Value,Count,Frequency (%),Unnamed: 3
0.0,4722,97.0%,
1.0,148,3.0%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.4115

0,1
0,2866
1,2004

Value,Count,Frequency (%),Unnamed: 3
0,2866,58.9%,
1,2004,41.1%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.55955

0,1
1,2725
0,2145

Value,Count,Frequency (%),Unnamed: 3
1,2725,56.0%,
0,2145,44.0%,

0,1
Distinct count,26
Unique (%),0.5%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,7.8429
Minimum,1
Maximum,44
Zeros (%),0.0%

0,1
Minimum,1
5-th percentile,2
Q1,5
Median,6
Q3,9
95-th percentile,19
Maximum,44
Range,43
Interquartile range,4

0,1
Standard deviation,5.0446
Coef of variation,0.64321
Kurtosis,3.2811
Mean,7.8429
MAD,3.602
Skewness,1.6855
Sum,38195
Variance,25.448
Memory size,42.8 KiB

Value,Count,Frequency (%),Unnamed: 3
6,817,16.8%,
8,578,11.9%,
7,541,11.1%,
4,537,11.0%,
5,507,10.4%,
2,352,7.2%,
3,194,4.0%,
11,173,3.6%,
9,159,3.3%,
13,154,3.2%,

Value,Count,Frequency (%),Unnamed: 3
1,45,0.9%,
2,352,7.2%,
3,194,4.0%,
4,537,11.0%,
5,507,10.4%,

Value,Count,Frequency (%),Unnamed: 3
22,8,0.2%,
23,9,0.2%,
25,7,0.1%,
26,104,2.1%,
44,1,0.0%,

Unnamed: 0,id,ad,education,ofjobs,yearsexp,honors,volunteer,military,empholes,occupspecific,occupbroad,workinschool,email,computerskills,specialskills,firstname,sex,race,h,l,call,city,kind,adid,fracblack,fracwhite,lmedhhinc,fracdropout,fraccolp,linc,col,expminreq,schoolreq,eoe,parent_sales,parent_emp,branch_sales,branch_emp,fed,fracblack_empzip,fracwhite_empzip,lmedhhinc_empzip,fracdropout_empzip,fraccolp_empzip,linc_empzip,manager,supervisor,secretary,offsupport,salesrep,retailsales,req,expreq,comreq,educreq,compreq,orgreq,manuf,transcom,bankreal,trade,busservice,othservice,missind,ownership
0,b,1,4,2,6,0,0,0,1,17,1,0,0,1,0,Allison,f,w,0.0,1.0,0.0,c,a,384.0,0.98936,0.0055,9.527484,0.274151,0.037662,8.706325,1.0,5,,1.0,,,,,,,,,,,,0.0,1.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
1,b,1,3,3,6,0,1,1,0,316,6,1,1,1,0,Kristen,f,w,1.0,0.0,0.0,c,a,384.0,0.080736,0.888374,10.408828,0.233687,0.087285,9.532859,0.0,5,,1.0,,,,,,,,,,,,0.0,1.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
2,b,1,4,1,6,0,0,0,0,19,1,1,0,1,0,Lakisha,f,b,0.0,1.0,0.0,c,a,384.0,0.104301,0.83737,10.466754,0.101335,0.591695,10.540329,1.0,5,,1.0,,,,,,,,,,,,0.0,1.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
3,b,1,3,4,6,0,1,0,1,313,5,0,1,1,1,Latonya,f,b,1.0,0.0,0.0,c,a,384.0,0.336165,0.63737,10.431908,0.108848,0.406576,10.412141,0.0,5,,1.0,,,,,,,,,,,,0.0,1.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
4,b,1,3,3,22,0,0,0,0,313,5,1,1,1,0,Carrie,f,w,1.0,0.0,0.0,c,a,385.0,0.397595,0.180196,9.876219,0.312873,0.030847,8.728264,0.0,some,,1.0,9.4,143.0,9.4,143.0,0.0,0.204764,0.727046,10.619399,0.070493,0.369903,10.007352,0.0,0.0,1.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,Nonprofit


<div class="span5 alert alert-success">
<p>Your answers to Q1 and Q2 here</p>
</div>

The conditions for the central limit theorem do not appear to be met for this dataset, and the sample does not seem to accurately represent the population as a whole.
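
One common way to examine whether the CLT-based normal approximation is reasonable for a proportion is the success/failure condition: both the number of callbacks and the number of non-callbacks in a group should be reasonably large. The sketch below is only an illustration. The helper name `normal_approx_ok` is hypothetical, the counts are made up rather than taken from the dataset, and the threshold of 10 is a common rule of thumb, not a fixed requirement.

```python
def normal_approx_ok(successes, n, threshold=10):
    """Return True if both successes and failures reach the threshold."""
    failures = n - successes
    return successes >= threshold and failures >= threshold


# Made-up counts for illustration only (not taken from the dataset).
print(normal_approx_ok(30, 200))  # 30 callbacks, 170 non-callbacks
print(normal_approx_ok(3, 200))   # only 3 callbacks
```

In the notebook, the same check could be applied to the callback counts of each `race` group.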

A two-sample t-test would be appropriate for this dataset if the data had been collected more accurately.

Null hypothesis: There is no difference in the callback rate between resumes with black-sounding names and resumes with white-sounding names.

Alternate hypothesis: There is a difference in the callback rate between resumes with black-sounding names and resumes with white-sounding names.
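
Because `call` is a 0/1 outcome, a two-proportion z-test is a commonly used alternative to the two-sample t-test for comparing callback rates between the two groups. The following is a minimal sketch of that swapped-in technique, not the analysis performed in this notebook. The function name `two_proportion_ztest` is hypothetical and the counts are made up for illustration.

```python
import math


def two_proportion_ztest(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for the difference between two proportions."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled proportion under the null hypothesis of equal rates.
    p_pool = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts for illustration only (not taken from the dataset).
z, p = two_proportion_ztest(30, 200, 20, 200)
print(f"z = {z:.3f}, p-value = {p:.3f}")
```

With the real data, the inputs would be the number of callbacks and the number of resumes in each `race` group. A small p-value would count as evidence against the null hypothesis.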

In [8]:
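# Split the data by the race implied by the name on the resume
w = data[data.race == 'w']  # white-sounding names
b = data[data.race == 'b']  # black-sounding names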

In [9]:
pandas_profiling.ProfileReport(w)

0,1
Number of variables,66
Number of observations,2435
Total Missing (%),10.8%
Total size in memory,642.1 KiB
Average record size in memory,270.0 B

0,1
Numeric,23
Categorical,9
Boolean,32
Date,0
Text (Unique),0
Rejected,2
Unsupported,0

0,1
Distinct count,303
Unique (%),12.4%
Missing (%),0.0%
Missing (n),0

0,1
3,110
4,102
5,102
Other values (300),2121

Value,Count,Frequency (%),Unnamed: 3
3,110,4.5%,
4,102,4.2%,
5,102,4.2%,
6,100,4.1%,
1,100,4.1%,
2,98,4.0%,
8,93,3.8%,
7,93,3.8%,
9,92,3.8%,
10,89,3.7%,

0,1
Distinct count,1323
Unique (%),54.3%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,651.78
Minimum,1.0
Maximum,1344.0
Zeros (%),0.0%

0,1
Minimum,1.0
5-th percentile,61.7
Q1,306.5
Median,647.0
Q3,979.5
95-th percentile,1273.3
Maximum,1344.0
Range,1343.0
Interquartile range,673.0

0,1
Standard deviation,388.73
Coef of variation,0.59642
Kurtosis,-1.19236
Mean,651.78
MAD,335.84
Skewness,0.0567972
Sum,1.58708e+06
Variance,151110
Memory size,9.6 KiB

Value,Count,Frequency (%),Unnamed: 3
1275.0,2,0.1%,
1158.0,2,0.1%,
1113.0,2,0.1%,
1247.0,2,0.1%,
1277.0,2,0.1%,
1315.0,2,0.1%,
1207.0,2,0.1%,
643.0,2,0.1%,
485.0,2,0.1%,
733.0,2,0.1%,

Value,Count,Frequency (%),Unnamed: 3
1.0,2,0.1%,
2.0,2,0.1%,
3.0,2,0.1%,
4.0,2,0.1%,
5.0,2,0.1%,

Value,Count,Frequency (%),Unnamed: 3
1340.0,2,0.1%,
1341.0,2,0.1%,
1342.0,2,0.1%,
1343.0,2,0.1%,
1344.0,2,0.1%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.08501

0,1
0.0,2228
1.0,207

Value,Count,Frequency (%),Unnamed: 3
0.0,2228,91.5%,
1.0,207,8.5%,

0,1
Distinct count,119
Unique (%),4.9%
Missing (%),86.5%
Missing (n),2106
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,755.42
Minimum,0.0
Maximum,12208.0
Zeros (%),0.3%

0,1
Minimum,0.0
5-th percentile,30.0
Q1,97.0
Median,200.0
Q3,500.0
95-th percentile,4528.0
Maximum,12208.0
Range,12208.0
Interquartile range,403.0

0,1
Standard deviation,1666.4
Coef of variation,2.206
Kurtosis,15.2849
Mean,755.42
MAD,897.9
Skewness,3.76538
Sum,248532.0
Variance,2777000
Memory size,9.6 KiB

Value,Count,Frequency (%),Unnamed: 3
100.0,16,0.7%,
500.0,15,0.6%,
150.0,10,0.4%,
60.0,10,0.4%,
40.0,10,0.4%,
120.0,9,0.4%,
0.0,8,0.3%,
250.0,8,0.3%,
65.0,7,0.3%,
30.0,6,0.2%,

Value,Count,Frequency (%),Unnamed: 3
0.0,8,0.3%,
14.0,2,0.1%,
20.0,1,0.0%,
28.0,2,0.1%,
30.0,6,0.2%,

Value,Count,Frequency (%),Unnamed: 3
6829.0,2,0.1%,
7200.0,2,0.1%,
8504.0,3,0.1%,
8577.0,2,0.1%,
12208.0,1,0.0%,

0,1
Distinct count,132
Unique (%),5.4%
Missing (%),87.5%
Missing (n),2131
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,196.05
Minimum,0.0
Maximum,10500.0
Zeros (%),0.1%

0,1
Minimum,0.0
5-th percentile,5.075
Q1,13.0
Median,34.9
Q3,86.7
95-th percentile,637.6
Maximum,10500.0
Range,10500.0
Interquartile range,73.7

0,1
Standard deviation,897.25
Coef of variation,4.5766
Kurtosis,114.933
Mean,196.05
MAD,271.93
Skewness,10.3109
Sum,59599.4
Variance,805060
Memory size,9.6 KiB

Value,Count,Frequency (%),Unnamed: 3
17.5,10,0.4%,
8.69999980927,7,0.3%,
300.0,7,0.3%,
75.0,6,0.2%,
5.0,6,0.2%,
37.5,6,0.2%,
14.3000001907,6,0.2%,
40.0,5,0.2%,
47.2000007629,4,0.2%,
133.399993896,4,0.2%,

Value,Count,Frequency (%),Unnamed: 3
0.0,2,0.1%,
2.40000009537,1,0.0%,
2.5,3,0.1%,
4.0,2,0.1%,
4.90000009537,2,0.1%,

Value,Count,Frequency (%),Unnamed: 3
1000.0,2,0.1%,
1214.59997559,2,0.1%,
1432.0,3,0.1%,
3724.80004883,1,0.0%,
10500.0,2,0.1%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.26776

0,1
0.0,1783
1.0,652

Value,Count,Frequency (%),Unnamed: 3
0.0,1783,73.2%,
1.0,652,26.8%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.096509

0,1
0.0,2200
1.0,235

Value,Count,Frequency (%),Unnamed: 3
0.0,2200,90.3%,
1.0,235,9.7%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
c,1352
b,1083

Value,Count,Frequency (%),Unnamed: 3
c,1352,55.5%,
b,1083,44.5%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.71622

0,1
1.0,1744
0.0,691

Value,Count,Frequency (%),Unnamed: 3
1.0,1744,71.6%,
0.0,691,28.4%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.43696

0,1
0.0,1371
1.0,1064

Value,Count,Frequency (%),Unnamed: 3
0.0,1371,56.3%,
1.0,1064,43.7%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.80862

0,1
1,1969
0,466

Value,Count,Frequency (%),Unnamed: 3
1,1969,80.9%,
0,466,19.1%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.12485

0,1
0.0,2131
1.0,304

Value,Count,Frequency (%),Unnamed: 3
0.0,2131,87.5%,
1.0,304,12.5%,

0,1
Distinct count,5
Unique (%),0.2%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,3.6209
Minimum,0
Maximum,4
Zeros (%),0.7%

0,1
Minimum,0
5-th percentile,2
Q1,3
Median,4
Q3,4
95-th percentile,4
Maximum,4
Range,4
Interquartile range,1

0,1
Standard deviation,0.69661
Coef of variation,0.19238
Kurtosis,5.8048
Mean,3.6209
MAD,0.54298
Skewness,-2.2032
Sum,8817
Variance,0.48526
Memory size,2.5 KiB

Value,Count,Frequency (%),Unnamed: 3
4,1744,71.6%,
3,513,21.1%,
2,142,5.8%,
1,18,0.7%,
0,18,0.7%,

Value,Count,Frequency (%),Unnamed: 3
0,18,0.7%,
1,18,0.7%,
2,142,5.8%,
3,513,21.1%,
4,1744,71.6%,

Value,Count,Frequency (%),Unnamed: 3
0,18,0.7%,
1,18,0.7%,
2,142,5.8%,
3,513,21.1%,
4,1744,71.6%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.10678

0,1
0.0,2175
1.0,260

Value,Count,Frequency (%),Unnamed: 3
0.0,2175,89.3%,
1.0,260,10.7%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.47885

0,1
0,1269
1,1166

Value,Count,Frequency (%),Unnamed: 3
0,1269,52.1%,
1,1166,47.9%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.4501

0,1
0,1339
1,1096

Value,Count,Frequency (%),Unnamed: 3
0,1339,55.0%,
1,1096,45.0%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.29117

0,1
0.0,1726
1.0,709

Value,Count,Frequency (%),Unnamed: 3
0.0,1726,70.9%,
1.0,709,29.1%,

0,1
Distinct count,13
Unique (%),0.5%
Missing (%),0.0%
Missing (n),0

0,1
,1373
some,532
2,178
Other values (10),352

Value,Count,Frequency (%),Unnamed: 3
,1373,56.4%,
some,532,21.8%,
2,178,7.3%,
3,166,6.8%,
5,81,3.3%,
1,71,2.9%,
10,9,0.4%,
7,6,0.2%,
8,5,0.2%,
4,4,0.2%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.43532

0,1
0.0,1375
1.0,1060

Value,Count,Frequency (%),Unnamed: 3
0.0,1375,56.5%,
1.0,1060,43.5%,

0,1
Distinct count,3
Unique (%),0.1%
Missing (%),36.3%
Missing (n),884
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.11476
Minimum,0.0
Maximum,1.0
Zeros (%),56.4%

0,1
Minimum,0.0
5-th percentile,0.0
Q1,0.0
Median,0.0
Q3,0.0
95-th percentile,1.0
Maximum,1.0
Range,1.0
Interquartile range,0.0

0,1
Standard deviation,0.31884
Coef of variation,2.7782
Kurtosis,3.85942
Mean,0.11476
MAD,0.20319
Skewness,2.4196
Sum,178.0
Variance,0.10166
Memory size,9.6 KiB

Value,Count,Frequency (%),Unnamed: 3
0.0,1373,56.4%,
1.0,178,7.3%,
(Missing),884,36.3%,

Value,Count,Frequency (%),Unnamed: 3
0.0,1373,56.4%,
1.0,178,7.3%,

Value,Count,Frequency (%),Unnamed: 3
0.0,1373,56.4%,
1.0,178,7.3%,

0,1
Distinct count,18
Unique (%),0.7%
Missing (%),0.0%
Missing (n),0

0,1
Anne,242
Allison,232
Emily,227
Other values (15),1734

Value,Count,Frequency (%),Unnamed: 3
Anne,242,9.9%,
Allison,232,9.5%,
Emily,227,9.3%,
Kristen,213,8.7%,
Jill,203,8.3%,
Laurie,195,8.0%,
Sarah,193,7.9%,
Meredith,187,7.7%,
Carrie,168,6.9%,
Neil,76,3.1%,

0,1
Distinct count,63
Unique (%),2.6%
Missing (%),1.8%
Missing (n),45
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.30844
Minimum,0.0
Maximum,0.992043
Zeros (%),0.7%

0,1
Minimum,0.0
5-th percentile,0.0033359
Q1,0.045275
Median,0.15995
Q3,0.51218
95-th percentile,0.97708
Maximum,0.992043
Range,0.992043
Interquartile range,0.46691

0,1
Standard deviation,0.33115
Coef of variation,1.0736
Kurtosis,-0.543923
Mean,0.30844
MAD,0.28057
Skewness,0.924463
Sum,737.17
Variance,0.10966
Memory size,9.6 KiB

Value,Count,Frequency (%),Unnamed: 3
0.189535841346,82,3.4%,
0.0186697430909,79,3.2%,
0.0498238056898,77,3.2%,
0.586862146854,75,3.1%,
0.00333586055785,72,3.0%,
0.0223919916898,71,2.9%,
0.00922433286905,65,2.7%,
0.34248149395,65,2.7%,
0.436394661665,64,2.6%,
0.512182354927,64,2.6%,

Value,Count,Frequency (%),Unnamed: 3
0.0,17,0.7%,
0.000904412940145,20,0.8%,
0.00217499886639,32,1.3%,
0.00279568345286,43,1.8%,
0.00333586055785,72,3.0%,

Value,Count,Frequency (%),Unnamed: 3
0.977077245712,30,1.2%,
0.988519906998,28,1.1%,
0.989359557629,12,0.5%,
0.989495635033,41,1.7%,
0.992042720318,34,1.4%,

0,1
Distinct count,224
Unique (%),9.2%
Missing (%),60.6%
Missing (n),1476
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.079096
Minimum,0.0
Maximum,0.98936
Zeros (%),1.3%

0,1
Minimum,0.0
5-th percentile,0.00097373
Q1,0.0071762
Median,0.017404
Q3,0.089956
95-th percentile,0.33708
Maximum,0.98936
Range,0.98936
Interquartile range,0.08278

0,1
Standard deviation,0.14978
Coef of variation,1.8937
Kurtosis,14.6712
Mean,0.079096
MAD,0.089157
Skewness,3.57533
Sum,75.8526
Variance,0.022434
Memory size,9.6 KiB

Value,Count,Frequency (%),Unnamed: 3
0.125,37,1.5%,
0.0452748835087,36,1.5%,
0.0,31,1.3%,
0.0899563282728,23,0.9%,
0.0114176971838,16,0.7%,
0.00375719554722,16,0.7%,
0.0121448710561,16,0.7%,
0.336165100336,16,0.7%,
0.00403025187552,16,0.7%,
0.011153427884,16,0.7%,

Value,Count,Frequency (%),Unnamed: 3
0.0,31,1.3%,
0.00039625930367,8,0.3%,
0.000575746642426,2,0.1%,
0.000646621396299,2,0.1%,
0.000795498664957,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.832138895988,2,0.1%,
0.9452906847,2,0.1%,
0.954597353935,2,0.1%,
0.971468806267,2,0.1%,
0.989359557629,2,0.1%,

0,1
Distinct count,63
Unique (%),2.6%
Missing (%),1.8%
Missing (n),45
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.215
Minimum,0.0308469
Maximum,0.780124
Zeros (%),0.0%

0,1
Minimum,0.0308469
5-th percentile,0.036977
Q1,0.092559
Median,0.14476
Q3,0.28801
95-th percentile,0.5508
Maximum,0.780124
Range,0.749277
Interquartile range,0.19545

0,1
Standard deviation,0.17067
Coef of variation,0.79384
Kurtosis,0.686762
Mean,0.215
MAD,0.13667
Skewness,1.25868
Sum,513.846
Variance,0.02913
Memory size,9.6 KiB

Value,Count,Frequency (%),Unnamed: 3
0.136053428054,82,3.4%,
0.0925594270229,79,3.2%,
0.550802648067,77,3.2%,
0.132418602705,75,3.1%,
0.328430622816,72,3.0%,
0.187565863132,71,2.9%,
0.145394697785,65,2.7%,
0.273801326752,65,2.7%,
0.0357846468687,64,2.6%,
0.144759297371,64,2.6%,

Value,Count,Frequency (%),Unnamed: 3
0.0308468621224,44,1.8%,
0.0357846468687,64,2.6%,
0.0369772836566,29,1.2%,
0.0376615077257,12,0.5%,
0.0481537804008,41,1.7%,

Value,Count,Frequency (%),Unnamed: 3
0.550802648067,77,3.2%,
0.590010762215,42,1.7%,
0.591695487499,13,0.5%,
0.670837879181,46,1.9%,
0.780124247074,17,0.7%,

0,1
Distinct count,232
Unique (%),9.5%
Missing (%),60.6%
Missing (n),1476
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.33387
Minimum,0.0308469
Maximum,0.892857
Zeros (%),0.0%

0,1
Minimum,0.0308469
5-th percentile,0.094114
Q1,0.20197
Median,0.28841
Q3,0.41235
95-th percentile,0.74648
Maximum,0.892857
Range,0.86201
Interquartile range,0.21038

0,1
Standard deviation,0.19206
Coef of variation,0.57526
Kurtosis,1.04586
Mean,0.33387
MAD,0.14803
Skewness,1.11147
Sum,320.184
Variance,0.036888
Memory size,9.6 KiB

Value,Count,Frequency (%),Unnamed: 3
0.892857134342,37,1.5%,
0.590010762215,36,1.5%,
0.606768548489,23,0.9%,
0.394765645266,16,0.7%,
0.286532193422,16,0.7%,
0.406993329525,16,0.7%,
0.670837879181,16,0.7%,
0.146427214146,16,0.7%,
0.406575918198,16,0.7%,
0.320884615183,14,0.6%,

Value,Count,Frequency (%),Unnamed: 3
0.0308468621224,2,0.1%,
0.0357846468687,1,0.0%,
0.0369772836566,4,0.2%,
0.0376615077257,2,0.1%,
0.0559815950692,2,0.1%,

Value,Count,Frequency (%),Unnamed: 3
0.606768548489,23,0.9%,
0.670837879181,16,0.7%,
0.74647885561,5,0.2%,
0.780124247074,10,0.4%,
0.892857134342,37,1.5%,

0,1
Distinct count,63
Unique (%),2.6%
Missing (%),1.8%
Missing (n),45
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.18603
Minimum,0.0
Maximum,0.356164
Zeros (%),0.7%

0,1
Minimum,0.0
5-th percentile,0.030174
Q1,0.13984
Median,0.19075
Q3,0.2382
95-th percentile,0.3048
Maximum,0.356164
Range,0.356164
Interquartile range,0.098353

0,1
Standard deviation,0.081844
Coef of variation,0.43996
Kurtosis,-0.400806
Mean,0.18603
MAD,0.064139
Skewness,-0.278218
Sum,444.601
Variance,0.0066984
Memory size,9.6 KiB

Value,Count,Frequency (%),Unnamed: 3
0.173128113151,82,3.4%,
0.274885416031,79,3.2%,
0.0583398602903,77,3.2%,
0.180472582579,75,3.1%,
0.190750569105,72,3.0%,
0.356164395809,71,2.9%,
0.194318026304,65,2.7%,
0.17773129046,65,2.7%,
0.180402874947,64,2.6%,
0.257437974215,64,2.6%,

Value,Count,Frequency (%),Unnamed: 3
0.0,17,0.7%,
0.0156555771828,57,2.3%,
0.0296517945826,46,1.9%,
0.030812073499,42,1.7%,
0.0583398602903,77,3.2%,

Value,Count,Frequency (%),Unnamed: 3
0.299338251352,42,1.7%,
0.299457043409,32,1.3%,
0.30479863286,29,1.2%,
0.312872767448,44,1.8%,
0.356164395809,71,2.9%,

0,1
Distinct count,231
Unique (%),9.5%
Missing (%),60.6%
Missing (n),1476
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.10169
Minimum,0.0
Maximum,0.356164
Zeros (%),1.9%

0,1
Minimum,0.0
5-th percentile,0.01454
Q1,0.047958
Median,0.087009
Q3,0.14203
95-th percentile,0.23833
Maximum,0.356164
Range,0.356164
Interquartile range,0.094073

0,1
Standard deviation,0.071311
Coef of variation,0.70125
Kurtosis,0.837208
Mean,0.10169
MAD,0.055437
Skewness,0.996916
Sum,97.5226
Variance,0.0050853
Memory size,9.6 KiB

Value,Count,Frequency (%),Unnamed: 3
0.0,47,1.9%,
0.030812073499,36,1.5%,
0.0279475990683,23,0.9%,
0.0296517945826,16,0.7%,
0.049830827862,16,0.7%,
0.0870093256235,16,0.7%,
0.157457515597,16,0.7%,
0.108847863972,16,0.7%,
0.168411031365,16,0.7%,
0.0804637074471,14,0.6%,

Value,Count,Frequency (%),Unnamed: 3
0.0,47,1.9%,
0.0145399849862,2,0.1%,
0.0147456638515,2,0.1%,
0.0156555771828,12,0.5%,
0.0160771701485,8,0.3%,

Value,Count,Frequency (%),Unnamed: 3
0.299338251352,6,0.2%,
0.300435423851,2,0.1%,
0.30479863286,4,0.2%,
0.312872767448,2,0.1%,
0.356164395809,6,0.2%,

0,1
Distinct count,63
Unique (%),2.6%
Missing (%),1.8%
Missing (n),45
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.54521
Minimum,0.00481429
Maximum,0.981653
Zeros (%),0.0%

0,1
Minimum,0.00481429
5-th percentile,0.0141
Q1,0.27145
Median,0.57183
Q3,0.8738
95-th percentile,0.97368
Maximum,0.981653
Range,0.976839
Interquartile range,0.60236

0,1
Standard deviation,0.32777
Coef of variation,0.60118
Kurtosis,-1.34851
Mean,0.54521
MAD,0.29296
Skewness,-0.255874
Sum,1303.06
Variance,0.10744
Memory size,9.6 KiB

Value,Count,Frequency (%),Unnamed: 3
0.665601849556,82,3.4%,
0.873804688454,79,3.2%,
0.883613944054,77,3.2%,
0.337866216898,75,3.1%,
0.981652796268,72,3.0%,
0.221549004316,71,2.9%,
0.422792255878,65,2.7%,
0.965914547443,65,2.7%,
0.298899203539,64,2.6%,
0.333424389362,64,2.6%,

Value,Count,Frequency (%),Unnamed: 3
0.00481428811327,34,1.4%,
0.0048631252721,41,1.7%,
0.00550022022799,12,0.5%,
0.00814846530557,28,1.1%,
0.0140997087583,30,1.2%,

Value,Count,Frequency (%),Unnamed: 3
0.96273291111,17,0.7%,
0.965914547443,65,2.7%,
0.973675429821,25,1.0%,
0.975516855717,46,1.9%,
0.981652796268,72,3.0%,

0,1
Distinct count,232
Unique (%),9.5%
Missing (%),60.6%
Missing (n),1476
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.84376
Minimum,0.00550022
Maximum,1.0
Zeros (%),0.0%

0,1
Minimum,0.00550022
5-th percentile,0.46474
Q1,0.82414
Median,0.90073
Q3,0.95636
95-th percentile,0.97974
Maximum,1.0
Range,0.9945
Interquartile range,0.13222

0,1
Standard deviation,0.18304
Coef of variation,0.21693
Kurtosis,5.74044
Mean,0.84376
MAD,0.12442
Skewness,-2.35798
Sum,809.168
Variance,0.033503
Memory size,9.6 KiB

Value,Count,Frequency (%),Unnamed: 3
0.875,37,1.5%,
0.898176431656,36,1.5%,
0.862882077694,23,0.9%,
0.900727331638,16,0.7%,
0.975516855717,16,0.7%,
0.937150299549,16,0.7%,
0.637369632721,16,0.7%,
0.955994307995,16,0.7%,
0.926485240459,16,0.7%,
0.791502475739,14,0.6%,

Value,Count,Frequency (%),Unnamed: 3
0.00550022022799,2,0.1%,
0.024071501568,2,0.1%,
0.0428035445511,2,0.1%,
0.0483303107321,2,0.1%,
0.0964076370001,2,0.1%,

Value,Count,Frequency (%),Unnamed: 3
0.99280667305,2,0.1%,
0.993617892265,2,0.1%,
0.996473073959,2,0.1%,
0.997173130512,2,0.1%,
1.0,5,0.2%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.50226

0,1
1.0,1223
0.0,1212

Value,Count,Frequency (%),Unnamed: 3
1.0,1223,50.2%,
0.0,1212,49.8%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.054209

0,1
0,2303
1,132

Value,Count,Frequency (%),Unnamed: 3
0,2303,94.6%,
1,132,5.4%,

0,1
Distinct count,289
Unique (%),11.9%
Missing (%),0.0%
Missing (n),0

0,1
b,1682
a,179
196,2
Other values (286),572

Value,Count,Frequency (%),Unnamed: 3
b,1682,69.1%,
a,179,7.4%,
196,2,0.1%,
360,2,0.1%,
369,2,0.1%,
403,2,0.1%,
172,2,0.1%,
293,2,0.1%,
290,2,0.1%,
147,2,0.1%,

0,1
Distinct count,2435
Unique (%),100.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,2433.9
Minimum,0
Maximum,4869
Zeros (%),0.0%

0,1
Minimum,0.0
5-th percentile,244.4
Q1,1217.5
Median,2433.0
Q3,3651.0
95-th percentile,4624.6
Maximum,4869.0
Range,4869.0
Interquartile range,2433.5

0,1
Standard deviation,1406.2
Coef of variation,0.57776
Kurtosis,-1.2
Mean,2433.9
MAD,1217.5
Skewness,6.1975e-05
Sum,5926439
Variance,1977400
Memory size,19.1 KiB

Value,Count,Frequency (%),Unnamed: 3
4094,1,0.0%,
1286,1,0.0%,
1300,1,0.0%,
4731,1,0.0%,
1296,1,0.0%,
3343,1,0.0%,
1294,1,0.0%,
1288,1,0.0%,
3335,1,0.0%,
1284,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0,1,0.0%,
1,1,0.0%,
4,1,0.0%,
5,1,0.0%,
6,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
4861,1,0.0%,
4862,1,0.0%,
4863,1,0.0%,
4867,1,0.0%,
4869,1,0.0%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
a,1396
s,1039

Value,Count,Frequency (%),Unnamed: 3
a,1396,57.3%,
s,1039,42.7%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.49774

0,1
0.0,1223
1.0,1212

Value,Count,Frequency (%),Unnamed: 3
0.0,1223,50.2%,
1.0,1212,49.8%,

0,1
Correlation,0.93724

0,1
Distinct count,231
Unique (%),9.5%
Missing (%),60.6%
Missing (n),1476
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,10.031
Minimum,8.66251
Maximum,11.3624
Zeros (%),0.0%

0,1
Minimum,8.66251
5-th percentile,9.2652
Q1,9.6915
Median,9.9144
Q3,10.387
95-th percentile,11.079
Maximum,11.3624
Range,2.69993
Interquartile range,0.6954

0,1
Standard deviation,0.56796
Coef of variation,0.056618
Kurtosis,0.114292
Mean,10.031
MAD,0.44115
Skewness,0.499216
Sum,9620.21
Variance,0.32258
Memory size,9.6 KiB

Value,Count,Frequency (%),Unnamed: 3
11.3624401093,37,1.5%,
11.0786600113,36,1.5%,
10.9789848328,23,0.9%,
10.5187005997,16,0.7%,
10.8619174957,16,0.7%,
9.68707084656,16,0.7%,
10.3869314194,16,0.7%,
10.4121408463,16,0.7%,
9.92000007629,16,0.7%,
10.5671033859,14,0.6%,

Value,Count,Frequency (%),Unnamed: 3
8.66250514984,5,0.2%,
8.69567394257,5,0.2%,
8.70632457733,2,0.1%,
8.72826385498,2,0.1%,
8.73327159882,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
10.9789848328,23,0.9%,
10.9970035553,1,0.0%,
11.0397491455,10,0.4%,
11.0786600113,36,1.5%,
11.3624401093,37,1.5%,

0,1
Distinct count,63
Unique (%),2.6%
Missing (%),1.8%
Missing (n),45
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,10.151
Minimum,8.84174
Maximum,11.1193
Zeros (%),0.0%

0,1
Minimum,8.84174
5-th percentile,9.6143
Q1,9.9674
Median,10.218
Q3,10.343
95-th percentile,10.753
Maximum,11.1193
Range,2.27755
Interquartile range,0.37542

0,1
Standard deviation,0.34833
Coef of variation,0.034314
Kurtosis,1.78318
Mean,10.151
MAD,0.27026
Skewness,-0.617759
Sum,24261.7
Variance,0.12134
Memory size,9.6 KiB

Value,Count,Frequency (%),Unnamed: 3
10.3428707123,82,3.4%,
10.0399837494,79,3.2%,
10.4836902618,77,3.2%,
10.2910604477,75,3.1%,
10.2996768951,72,3.0%,
9.61433792114,71,2.9%,
10.0995473862,65,2.7%,
10.1440782547,65,2.7%,
9.66287994385,64,2.6%,
9.96744823456,64,2.6%,

Value,Count,Frequency (%),Unnamed: 3
8.84173774719,34,1.4%,
9.21463108063,6,0.2%,
9.47431850433,41,1.7%,
9.52748394012,12,0.5%,
9.61433792114,71,2.9%,

Value,Count,Frequency (%),Unnamed: 3
10.6310606003,38,1.6%,
10.7526979446,46,1.9%,
10.7584133148,20,0.8%,
10.7951784134,42,1.7%,
11.1192903519,17,0.7%,

0,1
Distinct count,231
Unique (%),9.5%
Missing (%),60.8%
Missing (n),1481
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,10.656
Minimum,9.17025
Maximum,11.8143
Zeros (%),0.0%

0,1
Minimum,9.17025
5-th percentile,9.842
Q1,10.45
Median,10.666
Q3,10.867
95-th percentile,11.45
Maximum,11.8143
Range,2.64406
Interquartile range,0.41752

0,1
Standard deviation,0.44205
Coef of variation,0.041485
Kurtosis,1.4047
Mean,10.656
MAD,0.31527
Skewness,0.00443998
Sum,10165.5
Variance,0.19541
Memory size,9.6 KiB

Value,Count,Frequency (%),Unnamed: 3
11.8143110275,37,1.5%,
10.7951784134,36,1.5%,
10.9872741699,23,0.9%,
10.4319076538,16,0.7%,
10.9300489426,16,0.7%,
11.1297597885,16,0.7%,
10.5110492706,16,0.7%,
10.7526979446,16,0.7%,
10.5610589981,16,0.7%,
10.3736782074,14,0.6%,

Value,Count,Frequency (%),Unnamed: 3
9.17024707794,2,0.1%,
9.21463108063,5,0.2%,
9.29743480682,2,0.1%,
9.52748394012,2,0.1%,
9.61433792114,6,0.2%,

Value,Count,Frequency (%),Unnamed: 3
11.4502515793,2,0.1%,
11.457444191,4,0.2%,
11.463924408,5,0.2%,
11.6264238358,1,0.0%,
11.8143110275,37,1.5%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.15236

0,1
0.0,2064
1.0,371

Value,Count,Frequency (%),Unnamed: 3
0.0,2064,84.8%,
1.0,371,15.2%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.082957

0,1
0.0,2233
1.0,202

Value,Count,Frequency (%),Unnamed: 3
0.0,2233,91.7%,
1.0,202,8.3%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.092402

0,1
0,2210
1,225

Value,Count,Frequency (%),Unnamed: 3
0,2210,90.8%,
1,225,9.2%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.16509

0,1
0.0,2033
1.0,402

Value,Count,Frequency (%),Unnamed: 3
0.0,2033,83.5%,
1.0,402,16.5%,

0,1
Distinct count,5
Unique (%),0.2%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,3.4752
Minimum,1
Maximum,6
Zeros (%),0.0%

0,1
Minimum,1
5-th percentile,1
Q1,1
Median,4
Q3,6
95-th percentile,6
Maximum,6
Range,5
Interquartile range,5

0,1
Standard deviation,2.0333
Coef of variation,0.58511
Kurtosis,-1.5704
Mean,3.4752
MAD,1.831
Skewness,-0.13162
Sum,8462
Variance,4.1344
Memory size,2.5 KiB

Value,Count,Frequency (%),Unnamed: 3
1,887,36.4%,
4,636,26.1%,
6,613,25.2%,
5,228,9.4%,
3,71,2.9%,

Value,Count,Frequency (%),Unnamed: 3
1,887,36.4%,
3,71,2.9%,
4,636,26.1%,
5,228,9.4%,
6,613,25.2%,

Value,Count,Frequency (%),Unnamed: 3
1,887,36.4%,
3,71,2.9%,
4,636,26.1%,
5,228,9.4%,
6,613,25.2%,

0,1
Distinct count,57
Unique (%),2.3%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,214.53
Minimum,7
Maximum,903
Zeros (%),0.0%

0,1
Minimum,7
5-th percentile,13
Q1,27
Median,267
Q3,313
95-th percentile,387
Maximum,903
Range,896
Interquartile range,286

0,1
Standard deviation,148.26
Coef of variation,0.69107
Kurtosis,0.10929
Mean,214.53
MAD,128.52
Skewness,0.026758
Sum,522382
Variance,21980
Memory size,4.8 KiB

Value,Count,Frequency (%),Unnamed: 3
13,263,10.8%,
313,263,10.8%,
267,183,7.5%,
21,178,7.3%,
285,168,6.9%,
379,130,5.3%,
274,126,5.2%,
316,102,4.2%,
34,99,4.1%,
27,80,3.3%,

Value,Count,Frequency (%),Unnamed: 3
7,11,0.5%,
8,4,0.2%,
13,263,10.8%,
17,67,2.8%,
19,21,0.9%,

Value,Count,Frequency (%),Unnamed: 3
448,2,0.1%,
461,5,0.2%,
785,17,0.7%,
804,1,0.0%,
903,1,0.0%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.11869

0,1
0.0,2146
1.0,289

Value,Count,Frequency (%),Unnamed: 3
0.0,2146,88.1%,
1.0,289,11.9%,

0,1
Distinct count,7
Unique (%),0.3%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,3.6645
Minimum,1
Maximum,7
Zeros (%),0.0%

0,1
Minimum,1
5-th percentile,2
Q1,3
Median,4
Q3,4
95-th percentile,6
Maximum,7
Range,6
Interquartile range,1

0,1
Standard deviation,1.2193
Coef of variation,0.33275
Kurtosis,-0.31071
Mean,3.6645
MAD,0.9888
Skewness,0.26437
Sum,8923
Variance,1.4868
Memory size,2.5 KiB

Value,Count,Frequency (%),Unnamed: 3
4,800,32.9%,
3,726,29.8%,
2,347,14.3%,
5,258,10.6%,
6,243,10.0%,
1,54,2.2%,
7,7,0.3%,

Value,Count,Frequency (%),Unnamed: 3
1,54,2.2%,
2,347,14.3%,
3,726,29.8%,
4,800,32.9%,
5,258,10.6%,

Value,Count,Frequency (%),Unnamed: 3
3,726,29.8%,
4,800,32.9%,
5,258,10.6%,
6,243,10.0%,
7,7,0.3%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.07269

0,1
0.0,2258
1.0,177

Value,Count,Frequency (%),Unnamed: 3
0.0,2258,92.7%,
1.0,177,7.3%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.15483

0,1
0.0,2058
1.0,377

Value,Count,Frequency (%),Unnamed: 3
0.0,2058,84.5%,
1.0,377,15.5%,

0,1
Distinct count,4
Unique (%),0.2%
Missing (%),0.0%
Missing (n),0

0,1
Private,1067
,996
Public,213

Value,Count,Frequency (%),Unnamed: 3
Private,1067,43.8%,
,996,40.9%,
Public,213,8.7%,
Nonprofit,159,6.5%,

0,1
Distinct count,251
Unique (%),10.3%
Missing (%),64.6%
Missing (n),1574
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,2287.1
Minimum,0.0
Maximum,124500.0
Zeros (%),0.7%

0,1
Minimum,0.0
5-th percentile,20.0
Q1,98.0
Median,220.0
Q3,700.0
95-th percentile,10000.0
Maximum,124500.0
Range,124500.0
Interquartile range,602.0

0,1
Standard deviation,8905.4
Coef of variation,3.8939
Kurtosis,91.581
Mean,2287.1
MAD,3421.4
Skewness,8.3191
Sum,1.96915e+06
Variance,79307000
Memory size,9.6 KiB

Value,Count,Frequency (%),Unnamed: 3
150.0,27,1.1%,
100.0,24,1.0%,
500.0,20,0.8%,
0.0,16,0.7%,
60.0,16,0.7%,
120.0,15,0.6%,
125.0,15,0.6%,
400.0,14,0.6%,
40.0,14,0.6%,
200.0,14,0.6%,

Value,Count,Frequency (%),Unnamed: 3
0.0,16,0.7%,
1.0,2,0.1%,
3.0,2,0.1%,
9.0,2,0.1%,
10.0,2,0.1%,

Value,Count,Frequency (%),Unnamed: 3
43800.0,2,0.1%,
48000.0,1,0.0%,
48900.0,2,0.1%,
60657.0,2,0.1%,
124500.0,2,0.1%,

0,1
Distinct count,322
Unique (%),13.2%
Missing (%),65.7%
Missing (n),1599
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,587.69
Minimum,0.0
Maximum,47947.6
Zeros (%),0.1%

0,1
Minimum,0.0
5-th percentile,5.2
Q1,12.975
Median,33.35
Q3,133.1
95-th percentile,2585.9
Maximum,47947.6
Range,47947.6
Interquartile range,120.12

0,1
Standard deviation,2908.5
Coef of variation,4.9491
Kurtosis,176.251
Mean,587.69
MAD,909.14
Skewness,11.9811
Sum,491306.0
Variance,8459400
Memory size,9.6 KiB

Value,Count,Frequency (%),Unnamed: 3
37.5,18,0.7%,
17.5,17,0.7%,
25.0,13,0.5%,
75.0,11,0.5%,
5.0,10,0.4%,
8.69999980927,9,0.4%,
9.80000019073,8,0.3%,
16.8999996185,8,0.3%,
10.0,8,0.3%,
5.19999980927,8,0.3%,

Value,Count,Frequency (%),Unnamed: 3
0.0,2,0.1%,
0.10000000149,2,0.1%,
0.300000011921,4,0.2%,
0.40000000596,2,0.1%,
1.0,2,0.1%,

Value,Count,Frequency (%),Unnamed: 3
10500.0,2,0.1%,
14268.0,1,0.0%,
15479.5996094,2,0.1%,
21784.5,2,0.1%,
47947.6015625,2,0.1%,

0,1
Constant value,w

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.78727

0,1
1.0,1917
0.0,518

Value,Count,Frequency (%),Unnamed: 3
1.0,1917,78.7%,
0.0,518,21.3%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.16797

0,1
0.0,2026
1.0,409

Value,Count,Frequency (%),Unnamed: 3
0.0,2026,83.2%,
1.0,409,16.8%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.15113

0,1
0.0,2067
1.0,368

Value,Count,Frequency (%),Unnamed: 3
0.0,2067,84.9%,
1.0,368,15.1%,

0,1
Distinct count,4
Unique (%),0.2%
Missing (%),0.0%
Missing (n),0

0,1
,2175
somcol,126
colp,111

Value,Count,Frequency (%),Unnamed: 3
,2175,89.3%,
somcol,126,5.2%,
colp,111,4.6%,
hsg,23,0.9%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.33265

0,1
0.0,1625
1.0,810

Value,Count,Frequency (%),Unnamed: 3
0.0,1625,66.7%,
1.0,810,33.3%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
f,1860
m,575

Value,Count,Frequency (%),Unnamed: 3
f,1860,76.4%,
m,575,23.6%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.33018

0,1
0,1631
1,804

Value,Count,Frequency (%),Unnamed: 3
0,1631,67.0%,
1,804,33.0%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.077207

0,1
0.0,2247
1.0,188

Value,Count,Frequency (%),Unnamed: 3
0.0,2247,92.3%,
1.0,188,7.7%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.21396

0,1
0.0,1914
1.0,521

Value,Count,Frequency (%),Unnamed: 3
0.0,1914,78.6%,
1.0,521,21.4%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.03039

0,1
0.0,2361
1.0,74

Value,Count,Frequency (%),Unnamed: 3
0.0,2361,97.0%,
1.0,74,3.0%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.40862

0,1
0,1440
1,995

Value,Count,Frequency (%),Unnamed: 3
0,1440,59.1%,
1,995,40.9%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.55811

0,1
1,1359
0,1076

Value,Count,Frequency (%),Unnamed: 3
1,1359,55.8%,
0,1076,44.2%,

0,1
Distinct count,25
Unique (%),1.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,7.8563
Minimum,1
Maximum,26
Zeros (%),0.0%

0,1
Minimum,1
5-th percentile,2
Q1,5
Median,6
Q3,9
95-th percentile,19
Maximum,26
Range,25
Interquartile range,4

0,1
Standard deviation,5.0792
Coef of variation,0.64652
Kurtosis,2.6933
Mean,7.8563
MAD,3.6435
Skewness,1.616
Sum,19130
Variance,25.799
Memory size,2.5 KiB

Value,Count,Frequency (%),Unnamed: 3
6,408,16.8%,
8,290,11.9%,
4,278,11.4%,
7,267,11.0%,
5,243,10.0%,
2,175,7.2%,
3,99,4.1%,
11,87,3.6%,
9,81,3.3%,
13,74,3.0%,

Value,Count,Frequency (%),Unnamed: 3
1,26,1.1%,
2,175,7.2%,
3,99,4.1%,
4,278,11.4%,
5,243,10.0%,

Value,Count,Frequency (%),Unnamed: 3
21,26,1.1%,
22,4,0.2%,
23,5,0.2%,
25,4,0.2%,
26,51,2.1%,

Unnamed: 0,id,ad,education,ofjobs,yearsexp,honors,volunteer,military,empholes,occupspecific,occupbroad,workinschool,email,computerskills,specialskills,firstname,sex,race,h,l,call,city,kind,adid,fracblack,fracwhite,lmedhhinc,fracdropout,fraccolp,linc,col,expminreq,schoolreq,eoe,parent_sales,parent_emp,branch_sales,branch_emp,fed,fracblack_empzip,fracwhite_empzip,lmedhhinc_empzip,fracdropout_empzip,fraccolp_empzip,linc_empzip,manager,supervisor,secretary,offsupport,salesrep,retailsales,req,expreq,comreq,educreq,compreq,orgreq,manuf,transcom,bankreal,trade,busservice,othservice,missind,ownership
0,b,1,4,2,6,0,0,0,1,17,1,0,0,1,0,Allison,f,w,0.0,1.0,0.0,c,a,384.0,0.98936,0.0055,9.527484,0.274151,0.037662,8.706325,1.0,5,,1.0,,,,,,,,,,,,0.0,1.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
1,b,1,3,3,6,0,1,1,0,316,6,1,1,1,0,Kristen,f,w,1.0,0.0,0.0,c,a,384.0,0.080736,0.888374,10.408828,0.233687,0.087285,9.532859,0.0,5,,1.0,,,,,,,,,,,,0.0,1.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
4,b,1,3,3,22,0,0,0,0,313,5,1,1,1,0,Carrie,f,w,1.0,0.0,0.0,c,a,385.0,0.397595,0.180196,9.876219,0.312873,0.030847,8.728264,0.0,some,,1.0,9.4,143.0,9.4,143.0,0.0,0.204764,0.727046,10.619399,0.070493,0.369903,10.007352,0.0,0.0,1.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,Nonprofit
5,b,1,4,2,6,1,0,0,0,266,4,0,0,0,1,Jay,m,w,0.0,1.0,0.0,c,s,386.0,0.336165,0.63737,10.431908,0.108848,0.406576,10.412141,1.0,,,1.0,40.0,135.0,40.0,135.0,0.0,0.008141,0.973413,11.137956,0.047958,0.413306,10.393723,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,Private
6,b,1,4,2,5,0,1,0,0,13,1,1,1,1,1,Jill,f,w,1.0,0.0,0.0,c,s,386.0,0.002796,0.95214,10.453601,0.236445,0.12498,9.621058,1.0,,,1.0,40.0,135.0,40.0,135.0,0.0,0.008141,0.973413,11.137956,0.047958,0.413306,10.393723,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,Private
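
Profiling each race subset separately produces a very long report, while the exercise only needs the callback counts per group. The sketch below shows a compact alternative, assuming a DataFrame with the `race` and `call` columns listed in the header row above. The helper name `callback_summary` and the toy rows are invented for illustration and are not the study's data.

```python
import pandas as pd


def callback_summary(df, group_col="race", outcome_col="call"):
    """Return the number of resumes, callbacks, and the callback rate per group.

    Hypothetical helper: column names default to the ones in this dataset.
    """
    return df.groupby(group_col)[outcome_col].agg(
        resumes="count",    # rows with a non-missing outcome
        callbacks="sum",    # call is coded 0/1, so the sum counts callbacks
        rate="mean",        # mean of a 0/1 column is the callback proportion
    )


# Toy rows, invented for illustration only (NOT the study's data).
toy = pd.DataFrame({
    "race": ["b", "b", "b", "b", "w", "w", "w", "w"],
    "call": [0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0],
})

summary = callback_summary(toy)
print(summary)
```

On the real data the same call would be `callback_summary(data)`, where `data` stands for whatever name the full DataFrame has in this notebook.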


In [10]:
pandas_profiling.ProfileReport(b)

0,1
Number of variables,66
Number of observations,2435
Total Missing (%),10.8%
Total size in memory,642.1 KiB
Average record size in memory,270.0 B

0,1
Numeric,23
Categorical,9
Boolean,32
Date,0
Text (Unique),0
Rejected,2
Unsupported,0

0,1
Distinct count,303
Unique (%),12.4%
Missing (%),0.0%
Missing (n),0

0,1
3,110
4,102
5,102
Other values (300),2121

Value,Count,Frequency (%),Unnamed: 3
3,110,4.5%,
4,102,4.2%,
5,102,4.2%,
6,100,4.1%,
1,100,4.1%,
2,98,4.0%,
8,93,3.8%,
7,93,3.8%,
9,92,3.8%,
10,89,3.7%,

0,1
Distinct count,1323
Unique (%),54.3%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,651.78
Minimum,1.0
Maximum,1344.0
Zeros (%),0.0%

0,1
Minimum,1.0
5-th percentile,61.7
Q1,306.5
Median,647.0
Q3,979.5
95-th percentile,1273.3
Maximum,1344.0
Range,1343.0
Interquartile range,673.0

0,1
Standard deviation,388.73
Coef of variation,0.59642
Kurtosis,-1.19236
Mean,651.78
MAD,335.84
Skewness,0.0567972
Sum,1.58708e+06
Variance,151110
Memory size,9.6 KiB

Value,Count,Frequency (%),Unnamed: 3
1275.0,2,0.1%,
1158.0,2,0.1%,
1113.0,2,0.1%,
1247.0,2,0.1%,
1277.0,2,0.1%,
1315.0,2,0.1%,
1207.0,2,0.1%,
643.0,2,0.1%,
485.0,2,0.1%,
733.0,2,0.1%,

Value,Count,Frequency (%),Unnamed: 3
1.0,2,0.1%,
2.0,2,0.1%,
3.0,2,0.1%,
4.0,2,0.1%,
5.0,2,0.1%,

Value,Count,Frequency (%),Unnamed: 3
1340.0,2,0.1%,
1341.0,2,0.1%,
1342.0,2,0.1%,
1343.0,2,0.1%,
1344.0,2,0.1%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.08501

0,1
0.0,2228
1.0,207

Value,Count,Frequency (%),Unnamed: 3
0.0,2228,91.5%,
1.0,207,8.5%,

0,1
Distinct count,119
Unique (%),4.9%
Missing (%),86.5%
Missing (n),2106
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,755.42
Minimum,0.0
Maximum,12208.0
Zeros (%),0.3%

0,1
Minimum,0.0
5-th percentile,30.0
Q1,97.0
Median,200.0
Q3,500.0
95-th percentile,4528.0
Maximum,12208.0
Range,12208.0
Interquartile range,403.0

0,1
Standard deviation,1666.4
Coef of variation,2.206
Kurtosis,15.2849
Mean,755.42
MAD,897.9
Skewness,3.76538
Sum,248532.0
Variance,2777000
Memory size,9.6 KiB

Value,Count,Frequency (%),Unnamed: 3
100.0,16,0.7%,
500.0,15,0.6%,
150.0,10,0.4%,
60.0,10,0.4%,
40.0,10,0.4%,
120.0,9,0.4%,
0.0,8,0.3%,
250.0,8,0.3%,
65.0,7,0.3%,
30.0,6,0.2%,

Value,Count,Frequency (%),Unnamed: 3
0.0,8,0.3%,
14.0,2,0.1%,
20.0,1,0.0%,
28.0,2,0.1%,
30.0,6,0.2%,

Value,Count,Frequency (%),Unnamed: 3
6829.0,2,0.1%,
7200.0,2,0.1%,
8504.0,3,0.1%,
8577.0,2,0.1%,
12208.0,1,0.0%,

0,1
Distinct count,132
Unique (%),5.4%
Missing (%),87.5%
Missing (n),2131
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,196.05
Minimum,0.0
Maximum,10500.0
Zeros (%),0.1%

0,1
Minimum,0.0
5-th percentile,5.075
Q1,13.0
Median,34.9
Q3,86.7
95-th percentile,637.6
Maximum,10500.0
Range,10500.0
Interquartile range,73.7

0,1
Standard deviation,897.25
Coef of variation,4.5766
Kurtosis,114.933
Mean,196.05
MAD,271.93
Skewness,10.3109
Sum,59599.4
Variance,805060
Memory size,9.6 KiB

Value,Count,Frequency (%),Unnamed: 3
17.5,10,0.4%,
8.69999980927,7,0.3%,
300.0,7,0.3%,
75.0,6,0.2%,
5.0,6,0.2%,
37.5,6,0.2%,
14.3000001907,6,0.2%,
40.0,5,0.2%,
47.2000007629,4,0.2%,
133.399993896,4,0.2%,

Value,Count,Frequency (%),Unnamed: 3
0.0,2,0.1%,
2.40000009537,1,0.0%,
2.5,3,0.1%,
4.0,2,0.1%,
4.90000009537,2,0.1%,

Value,Count,Frequency (%),Unnamed: 3
1000.0,2,0.1%,
1214.59997559,2,0.1%,
1432.0,3,0.1%,
3724.80004883,1,0.0%,
10500.0,2,0.1%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.26776

0,1
0.0,1783
1.0,652

Value,Count,Frequency (%),Unnamed: 3
0.0,1783,73.2%,
1.0,652,26.8%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.064476

0,1
0.0,2278
1.0,157

Value,Count,Frequency (%),Unnamed: 3
0.0,2278,93.6%,
1.0,157,6.4%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
c,1352
b,1083

Value,Count,Frequency (%),Unnamed: 3
c,1352,55.5%,
b,1083,44.5%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.72279

0,1
1.0,1760
0.0,675

Value,Count,Frequency (%),Unnamed: 3
1.0,1760,72.3%,
0.0,675,27.7%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.43737

0,1
0.0,1370
1.0,1065

Value,Count,Frequency (%),Unnamed: 3
0.0,1370,56.3%,
1.0,1065,43.7%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.83244

0,1
1,2027
0,408

Value,Count,Frequency (%),Unnamed: 3
1,2027,83.2%,
0,408,16.8%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.12485

0,1
0.0,2131
1.0,304

Value,Count,Frequency (%),Unnamed: 3
0.0,2131,87.5%,
1.0,304,12.5%,

0,1
Distinct count,5
Unique (%),0.2%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,3.616
Minimum,0
Maximum,4
Zeros (%),1.1%

0,1
Minimum,0
5-th percentile,2
Q1,3
Median,4
Q3,4
95-th percentile,4
Maximum,4
Range,4
Interquartile range,1

0,1
Standard deviation,0.73306
Coef of variation,0.20273
Kurtosis,6.7333
Mean,3.616
MAD,0.55508
Skewness,-2.3907
Sum,8805
Variance,0.53738
Memory size,2.5 KiB

Value,Count,Frequency (%),Unnamed: 3
4,1760,72.3%,
3,493,20.2%,
2,132,5.4%,
0,28,1.1%,
1,22,0.9%,

Value,Count,Frequency (%),Unnamed: 3
0,28,1.1%,
1,22,0.9%,
2,132,5.4%,
3,493,20.2%,
4,1760,72.3%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.10678

0,1
0.0,2175
1.0,260

Value,Count,Frequency (%),Unnamed: 3
0.0,2175,89.3%,
1.0,260,10.7%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.47967

0,1
0,1267
1,1168

Value,Count,Frequency (%),Unnamed: 3
0,1267,52.0%,
1,1168,48.0%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.446

0,1
0,1349
1,1086

Value,Count,Frequency (%),Unnamed: 3
0,1349,55.4%,
1,1086,44.6%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.29117

0,1
0.0,1726
1.0,709

Value,Count,Frequency (%),Unnamed: 3
0.0,1726,70.9%,
1.0,709,29.1%,

0,1
Distinct count,13
Unique (%),0.5%
Missing (%),0.0%
Missing (n),0

0,1
,1373
some,532
2,178
Other values (10),352

Value,Count,Frequency (%),Unnamed: 3
,1373,56.4%,
some,532,21.8%,
2,178,7.3%,
3,165,6.8%,
5,82,3.4%,
1,71,2.9%,
10,9,0.4%,
7,6,0.2%,
8,5,0.2%,
4,4,0.2%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.43532

0,1
0.0,1375
1.0,1060

Value,Count,Frequency (%),Unnamed: 3
0.0,1375,56.5%,
1.0,1060,43.5%,

0,1
Distinct count,3
Unique (%),0.1%
Missing (%),36.3%
Missing (n),884
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.11476
Minimum,0.0
Maximum,1.0
Zeros (%),56.4%

0,1
Minimum,0.0
5-th percentile,0.0
Q1,0.0
Median,0.0
Q3,0.0
95-th percentile,1.0
Maximum,1.0
Range,1.0
Interquartile range,0.0

0,1
Standard deviation,0.31884
Coef of variation,2.7782
Kurtosis,3.85942
Mean,0.11476
MAD,0.20319
Skewness,2.4196
Sum,178.0
Variance,0.10166
Memory size,9.6 KiB

Value,Count,Frequency (%),Unnamed: 3
0.0,1373,56.4%,
1.0,178,7.3%,
(Missing),884,36.3%,

Value,Count,Frequency (%),Unnamed: 3
0.0,1373,56.4%,
1.0,178,7.3%,

0,1
Distinct count,18
Unique (%),0.7%
Missing (%),0.0%
Missing (n),0

0,1
Tamika,256
Latonya,230
Latoya,226
Other values (15),1723

Value,Count,Frequency (%),Unnamed: 3
Tamika,256,10.5%,
Latonya,230,9.4%,
Latoya,226,9.3%,
Ebony,208,8.5%,
Tanisha,207,8.5%,
Lakisha,200,8.2%,
Kenya,196,8.0%,
Keisha,183,7.5%,
Aisha,180,7.4%,
Tyrone,75,3.1%,

0,1
Distinct count,63
Unique (%),2.6%
Missing (%),1.7%
Missing (n),41
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.31321
Minimum,0.0
Maximum,0.992043
Zeros (%),0.5%

0,1
Minimum,0.0
5-th percentile,0.0033359
Q1,0.045275
Median,0.15995
Q3,0.53087
95-th percentile,0.97708
Maximum,0.992043
Range,0.992043
Interquartile range,0.4856

0,1
Standard deviation,0.33384
Coef of variation,1.0659
Kurtosis,-0.648698
Mean,0.31321
MAD,0.28505
Skewness,0.887036
Sum,749.834
Variance,0.11145
Memory size,9.6 KiB

Value,Count,Frequency (%),Unnamed: 3
0.00922433286905,83,3.4%,
0.144843429327,78,3.2%,
0.011153427884,71,2.9%,
0.00333586055785,70,2.9%,
0.586862146854,69,2.8%,
0.0223919916898,68,2.8%,
0.303463429213,67,2.8%,
0.0186697430909,65,2.7%,
0.0547352209687,64,2.6%,
0.189535841346,63,2.6%,

Value,Count,Frequency (%),Unnamed: 3
0.0,12,0.5%,
0.000904412940145,14,0.6%,
0.00217499886639,25,1.0%,
0.00279568345286,37,1.5%,
0.00333586055785,70,2.9%,

Value,Count,Frequency (%),Unnamed: 3
0.977077245712,31,1.3%,
0.988519906998,41,1.7%,
0.989359557629,11,0.5%,
0.989495635033,30,1.2%,
0.992042720318,31,1.3%,

0,1
Distinct count,224
Unique (%),9.2%
Missing (%),60.6%
Missing (n),1476
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.079096
Minimum,0.0
Maximum,0.98936
Zeros (%),1.3%

0,1
Minimum,0.0
5-th percentile,0.00097373
Q1,0.0071762
Median,0.017404
Q3,0.089956
95-th percentile,0.33708
Maximum,0.98936
Range,0.98936
Interquartile range,0.08278

0,1
Standard deviation,0.14978
Coef of variation,1.8937
Kurtosis,14.6712
Mean,0.079096
MAD,0.089157
Skewness,3.57533
Sum,75.8526
Variance,0.022434
Memory size,9.6 KiB

Value,Count,Frequency (%),Unnamed: 3
0.125,37,1.5%,
0.0452748835087,36,1.5%,
0.0,31,1.3%,
0.0899563282728,23,0.9%,
0.0114176971838,16,0.7%,
0.00375719554722,16,0.7%,
0.0121448710561,16,0.7%,
0.336165100336,16,0.7%,
0.00403025187552,16,0.7%,
0.011153427884,16,0.7%,

Value,Count,Frequency (%),Unnamed: 3
0.0,31,1.3%,
0.00039625930367,8,0.3%,
0.000575746642426,2,0.1%,
0.000646621396299,2,0.1%,
0.000795498664957,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.832138895988,2,0.1%,
0.9452906847,2,0.1%,
0.954597353935,2,0.1%,
0.971468806267,2,0.1%,
0.989359557629,2,0.1%,

0,1
Distinct count,63
Unique (%),2.6%
Missing (%),1.7%
Missing (n),41
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.21264
Minimum,0.0308469
Maximum,0.780124
Zeros (%),0.0%

0,1
Minimum,0.0308469
5-th percentile,0.037662
Q1,0.092559
Median,0.14505
Q3,0.28431
95-th percentile,0.59001
Maximum,0.780124
Range,0.749277
Interquartile range,0.19176

0,1
Standard deviation,0.16795
Coef of variation,0.78985
Kurtosis,0.940988
Mean,0.21264
MAD,0.13306
Skewness,1.32836
Sum,509.06
Variance,0.028208
Memory size,9.6 KiB

Value,Count,Frequency (%),Unnamed: 3
0.145394697785,83,3.4%,
0.263202995062,78,3.2%,
0.670837879181,71,2.9%,
0.328430622816,70,2.9%,
0.132418602705,69,2.8%,
0.187565863132,68,2.8%,
0.119755521417,67,2.8%,
0.0925594270229,65,2.7%,
0.224218696356,64,2.6%,
0.136053428054,63,2.6%,

Value,Count,Frequency (%),Unnamed: 3
0.0308468621224,52,2.1%,
0.0357846468687,41,1.7%,
0.0369772836566,25,1.0%,
0.0376615077257,11,0.5%,
0.0481537804008,30,1.2%,

Value,Count,Frequency (%),Unnamed: 3
0.550802648067,48,2.0%,
0.590010762215,35,1.4%,
0.591695487499,12,0.5%,
0.670837879181,71,2.9%,
0.780124247074,12,0.5%,

0,1
Distinct count,232
Unique (%),9.5%
Missing (%),60.6%
Missing (n),1476
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.33387
Minimum,0.0308469
Maximum,0.892857
Zeros (%),0.0%

0,1
Minimum,0.0308469
5-th percentile,0.094114
Q1,0.20197
Median,0.28841
Q3,0.41235
95-th percentile,0.74648
Maximum,0.892857
Range,0.86201
Interquartile range,0.21038

0,1
Standard deviation,0.19206
Coef of variation,0.57526
Kurtosis,1.04586
Mean,0.33387
MAD,0.14803
Skewness,1.11147
Sum,320.184
Variance,0.036888
Memory size,9.6 KiB

Value,Count,Frequency (%),Unnamed: 3
0.892857134342,37,1.5%,
0.590010762215,36,1.5%,
0.606768548489,23,0.9%,
0.394765645266,16,0.7%,
0.286532193422,16,0.7%,
0.406993329525,16,0.7%,
0.670837879181,16,0.7%,
0.146427214146,16,0.7%,
0.406575918198,16,0.7%,
0.320884615183,14,0.6%,

Value,Count,Frequency (%),Unnamed: 3
0.0308468621224,2,0.1%,
0.0357846468687,1,0.0%,
0.0369772836566,4,0.2%,
0.0376615077257,2,0.1%,
0.0559815950692,2,0.1%,

Value,Count,Frequency (%),Unnamed: 3
0.606768548489,23,0.9%,
0.670837879181,16,0.7%,
0.74647885561,5,0.2%,
0.780124247074,10,0.4%,
0.892857134342,37,1.5%,

0,1
Distinct count,63
Unique (%),2.6%
Missing (%),1.7%
Missing (n),41
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.18532
Minimum,0.0
Maximum,0.356164
Zeros (%),0.5%

0,1
Minimum,0.0
5-th percentile,0.029652
Q1,0.13971
Median,0.19075
Q3,0.2382
95-th percentile,0.30762
Maximum,0.356164
Range,0.356164
Interquartile range,0.098484

0,1
Standard deviation,0.081666
Coef of variation,0.44067
Kurtosis,-0.374155
Mean,0.18532
MAD,0.063711
Skewness,-0.293773
Sum,443.653
Variance,0.0066693
Memory size,9.6 KiB

Value,Count,Frequency (%),Unnamed: 3
0.17773129046,83,3.4%,
0.0688580796123,78,3.2%,
0.0296517945826,71,2.9%,
0.190750569105,70,2.9%,
0.180472582579,69,2.8%,
0.356164395809,68,2.8%,
0.189628824592,67,2.8%,
0.274885416031,65,2.7%,
0.0156555771828,64,2.6%,
0.173128113151,63,2.6%,

Value,Count,Frequency (%),Unnamed: 3
0.0,12,0.5%,
0.0156555771828,64,2.6%,
0.0296517945826,71,2.9%,
0.030812073499,35,1.4%,
0.0583398602903,48,2.0%,

Value,Count,Frequency (%),Unnamed: 3
0.299338251352,42,1.7%,
0.299457043409,25,1.0%,
0.30479863286,25,1.0%,
0.312872767448,52,2.1%,
0.356164395809,68,2.8%,

0,1
Distinct count,231
Unique (%),9.5%
Missing (%),60.6%
Missing (n),1476
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.10169
Minimum,0.0
Maximum,0.356164
Zeros (%),1.9%

0,1
Minimum,0.0
5-th percentile,0.01454
Q1,0.047958
Median,0.087009
Q3,0.14203
95-th percentile,0.23833
Maximum,0.356164
Range,0.356164
Interquartile range,0.094073

0,1
Standard deviation,0.071311
Coef of variation,0.70125
Kurtosis,0.837208
Mean,0.10169
MAD,0.055437
Skewness,0.996916
Sum,97.5226
Variance,0.0050853
Memory size,9.6 KiB

Value,Count,Frequency (%),Unnamed: 3
0.0,47,1.9%,
0.030812073499,36,1.5%,
0.0279475990683,23,0.9%,
0.0296517945826,16,0.7%,
0.049830827862,16,0.7%,
0.0870093256235,16,0.7%,
0.157457515597,16,0.7%,
0.108847863972,16,0.7%,
0.168411031365,16,0.7%,
0.0804637074471,14,0.6%,

Value,Count,Frequency (%),Unnamed: 3
0.0,47,1.9%,
0.0145399849862,2,0.1%,
0.0147456638515,2,0.1%,
0.0156555771828,12,0.5%,
0.0160771701485,8,0.3%,

Value,Count,Frequency (%),Unnamed: 3
0.299338251352,6,0.2%,
0.300435423851,2,0.1%,
0.30479863286,4,0.2%,
0.312872767448,2,0.1%,
0.356164395809,6,0.2%,

0,1
Distinct count,63
Unique (%),2.6%
Missing (%),1.7%
Missing (n),41
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.54033
Minimum,0.00481429
Maximum,0.981653
Zeros (%),0.0%

0,1
Minimum,0.00481429
5-th percentile,0.0141
Q1,0.2414
Median,0.57183
Q3,0.8738
95-th percentile,0.97552
Maximum,0.981653
Range,0.976839
Interquartile range,0.63241

0,1
Standard deviation,0.3312
Coef of variation,0.61296
Kurtosis,-1.38286
Mean,0.54033
MAD,0.29641
Skewness,-0.214312
Sum,1293.55
Variance,0.10969
Memory size,9.6 KiB

Value,Count,Frequency (%),Unnamed: 3
0.965914547443,83,3.4%,
0.716077268124,78,3.2%,
0.975516855717,71,2.9%,
0.981652796268,70,2.9%,
0.337866216898,69,2.8%,
0.221549004316,68,2.8%,
0.471891522408,67,2.8%,
0.873804688454,65,2.7%,
0.810472607613,64,2.6%,
0.333424389362,63,2.6%,

Value,Count,Frequency (%),Unnamed: 3
0.00481428811327,31,1.3%,
0.0048631252721,30,1.2%,
0.00550022022799,11,0.5%,
0.00814846530557,41,1.7%,
0.0140997087583,31,1.3%,

Value,Count,Frequency (%),Unnamed: 3
0.96273291111,12,0.5%,
0.965914547443,83,3.4%,
0.973675429821,35,1.4%,
0.975516855717,71,2.9%,
0.981652796268,70,2.9%,

0,1
Distinct count,232
Unique (%),9.5%
Missing (%),60.6%
Missing (n),1476
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,0.84376
Minimum,0.00550022
Maximum,1.0
Zeros (%),0.0%

0,1
Minimum,0.00550022
5-th percentile,0.46474
Q1,0.82414
Median,0.90073
Q3,0.95636
95-th percentile,0.97974
Maximum,1.0
Range,0.9945
Interquartile range,0.13222

0,1
Standard deviation,0.18304
Coef of variation,0.21693
Kurtosis,5.74044
Mean,0.84376
MAD,0.12442
Skewness,-2.35798
Sum,809.168
Variance,0.033503
Memory size,9.6 KiB

Value,Count,Frequency (%),Unnamed: 3
0.875,37,1.5%,
0.898176431656,36,1.5%,
0.862882077694,23,0.9%,
0.900727331638,16,0.7%,
0.975516855717,16,0.7%,
0.937150299549,16,0.7%,
0.637369632721,16,0.7%,
0.955994307995,16,0.7%,
0.926485240459,16,0.7%,
0.791502475739,14,0.6%,

Value,Count,Frequency (%),Unnamed: 3
0.00550022022799,2,0.1%,
0.024071501568,2,0.1%,
0.0428035445511,2,0.1%,
0.0483303107321,2,0.1%,
0.0964076370001,2,0.1%,

Value,Count,Frequency (%),Unnamed: 3
0.99280667305,2,0.1%,
0.993617892265,2,0.1%,
0.996473073959,2,0.1%,
0.997173130512,2,0.1%,
1.0,5,0.2%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.50226

0,1
1.0,1223
0.0,1212

Value,Count,Frequency (%),Unnamed: 3
1.0,1223,50.2%,
0.0,1212,49.8%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.051335

0,1
0,2310
1,125

Value,Count,Frequency (%),Unnamed: 3
0,2310,94.9%,
1,125,5.1%,

0,1
Distinct count,289
Unique (%),11.9%
Missing (%),0.0%
Missing (n),0

0,1
b,1682
a,179
196,2
Other values (286),572

Value,Count,Frequency (%),Unnamed: 3
b,1682,69.1%,
a,179,7.4%,
196,2,0.1%,
360,2,0.1%,
369,2,0.1%,
403,2,0.1%,
172,2,0.1%,
293,2,0.1%,
290,2,0.1%,
147,2,0.1%,

0,1
Distinct count,2435
Unique (%),100.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,2435.1
Minimum,2
Maximum,4868
Zeros (%),0.0%

0,1
Minimum,2.0
5-th percentile,243.4
Q1,1217.5
Median,2439.0
Q3,3652.0
95-th percentile,4625.6
Maximum,4868.0
Range,4866.0
Interquartile range,2434.5

0,1
Standard deviation,1406.1
Coef of variation,0.57741
Kurtosis,-1.2
Mean,2435.1
MAD,1217.5
Skewness,-6.1795e-05
Sum,5929576
Variance,1977100
Memory size,19.1 KiB

Value,Count,Frequency (%),Unnamed: 3
2047,1,0.0%,
1344,1,0.0%,
1356,1,0.0%,
3403,1,0.0%,
3401,1,0.0%,
1352,1,0.0%,
1348,1,0.0%,
3395,1,0.0%,
3393,1,0.0%,
3391,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
2,1,0.0%,
3,1,0.0%,
7,1,0.0%,
8,1,0.0%,
9,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
4859,1,0.0%,
4864,1,0.0%,
4865,1,0.0%,
4866,1,0.0%,
4868,1,0.0%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
a,1396
s,1039

Value,Count,Frequency (%),Unnamed: 3
a,1396,57.3%,
s,1039,42.7%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.49774

0,1
0.0,1223
1.0,1212

Value,Count,Frequency (%),Unnamed: 3
0.0,1223,50.2%,
1.0,1212,49.8%,

0,1
Correlation,0.93909

0,1
Distinct count,231
Unique (%),9.5%
Missing (%),60.6%
Missing (n),1476
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,10.031
Minimum,8.66251
Maximum,11.3624
Zeros (%),0.0%

0,1
Minimum,8.66251
5-th percentile,9.2652
Q1,9.6915
Median,9.9144
Q3,10.387
95-th percentile,11.079
Maximum,11.3624
Range,2.69993
Interquartile range,0.6954

0,1
Standard deviation,0.56796
Coef of variation,0.056618
Kurtosis,0.114292
Mean,10.031
MAD,0.44115
Skewness,0.499216
Sum,9620.21
Variance,0.32258
Memory size,9.6 KiB

Value,Count,Frequency (%),Unnamed: 3
11.3624401093,37,1.5%,
11.0786600113,36,1.5%,
10.9789848328,23,0.9%,
10.5187005997,16,0.7%,
10.8619174957,16,0.7%,
9.68707084656,16,0.7%,
10.3869314194,16,0.7%,
10.4121408463,16,0.7%,
9.92000007629,16,0.7%,
10.5671033859,14,0.6%,

Value,Count,Frequency (%),Unnamed: 3
8.66250514984,5,0.2%,
8.69567394257,5,0.2%,
8.70632457733,2,0.1%,
8.72826385498,2,0.1%,
8.73327159882,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
10.9789848328,23,0.9%,
10.9970035553,1,0.0%,
11.0397491455,10,0.4%,
11.0786600113,36,1.5%,
11.3624401093,37,1.5%,

0,1
Distinct count,63
Unique (%),2.6%
Missing (%),1.7%
Missing (n),41
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,10.143
Minimum,8.84174
Maximum,11.1193
Zeros (%),0.0%

0,1
Minimum,8.84174
5-th percentile,9.6143
Q1,9.9651
Median,10.144
Q3,10.343
95-th percentile,10.753
Maximum,11.1193
Range,2.27755
Interquartile range,0.37782

0,1
Standard deviation,0.34323
Coef of variation,0.033839
Kurtosis,1.72769
Mean,10.143
MAD,0.2657
Skewness,-0.578847
Sum,24282.4
Variance,0.11781
Memory size,9.6 KiB

Value,Count,Frequency (%),Unnamed: 3
10.1440782547,83,3.4%,
9.96777629852,78,3.2%,
10.7526979446,71,2.9%,
10.2996768951,70,2.9%,
10.2910604477,69,2.8%,
9.61433792114,68,2.8%,
10.2307024002,67,2.8%,
10.0399837494,65,2.7%,
9.84198474884,64,2.6%,
9.96744823456,63,2.6%,

Value,Count,Frequency (%),Unnamed: 3
8.84173774719,31,1.3%,
9.21463108063,14,0.6%,
9.47431850433,30,1.2%,
9.52748394012,11,0.5%,
9.61433792114,68,2.8%,

Value,Count,Frequency (%),Unnamed: 3
10.6310606003,27,1.1%,
10.7526979446,71,2.9%,
10.7584133148,14,0.6%,
10.7951784134,35,1.4%,
11.1192903519,12,0.5%,

0,1
Distinct count,231
Unique (%),9.5%
Missing (%),60.8%
Missing (n),1481
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,10.656
Minimum,9.17025
Maximum,11.8143
Zeros (%),0.0%

0,1
Minimum,9.17025
5-th percentile,9.842
Q1,10.45
Median,10.666
Q3,10.867
95-th percentile,11.45
Maximum,11.8143
Range,2.64406
Interquartile range,0.41752

0,1
Standard deviation,0.44205
Coef of variation,0.041485
Kurtosis,1.4047
Mean,10.656
MAD,0.31527
Skewness,0.00443998
Sum,10165.5
Variance,0.19541
Memory size,9.6 KiB

Value,Count,Frequency (%),Unnamed: 3
11.8143110275,37,1.5%,
10.7951784134,36,1.5%,
10.9872741699,23,0.9%,
10.4319076538,16,0.7%,
10.9300489426,16,0.7%,
11.1297597885,16,0.7%,
10.5110492706,16,0.7%,
10.7526979446,16,0.7%,
10.5610589981,16,0.7%,
10.3736782074,14,0.6%,

Value,Count,Frequency (%),Unnamed: 3
9.17024707794,2,0.1%,
9.21463108063,5,0.2%,
9.29743480682,2,0.1%,
9.52748394012,2,0.1%,
9.61433792114,6,0.2%,

Value,Count,Frequency (%),Unnamed: 3
11.4502515793,2,0.1%,
11.457444191,4,0.2%,
11.463924408,5,0.2%,
11.6264238358,1,0.0%,
11.8143110275,37,1.5%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.15195

0,1
0.0,2065
1.0,370

Value,Count,Frequency (%),Unnamed: 3
0.0,2065,84.8%,
1.0,370,15.2%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.082957

0,1
0.0,2233
1.0,202

Value,Count,Frequency (%),Unnamed: 3
0.0,2233,91.7%,
1.0,202,8.3%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.10185

0,1
0,2187
1,248

Value,Count,Frequency (%),Unnamed: 3
0,2187,89.8%,
1,248,10.2%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.16509

0,1
0.0,2033
1.0,402

Value,Count,Frequency (%),Unnamed: 3
0.0,2033,83.5%,
1.0,402,16.5%,

0,1
Distinct count,5
Unique (%),0.2%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,3.4879
Minimum,1
Maximum,6
Zeros (%),0.0%

0,1
Minimum,1
5-th percentile,1
Q1,1
Median,4
Q3,6
95-th percentile,6
Maximum,6
Range,5
Interquartile range,5

0,1
Standard deviation,2.0431
Coef of variation,0.58578
Kurtosis,-1.5779
Mean,3.4879
MAD,1.8404
Skewness,-0.12979
Sum,8493
Variance,4.1744
Memory size,2.5 KiB

Value,Count,Frequency (%),Unnamed: 3
1,883,36.3%,
6,635,26.1%,
4,605,24.8%,
5,222,9.1%,
3,90,3.7%,

Value,Count,Frequency (%),Unnamed: 3
1,883,36.3%,
3,90,3.7%,
4,605,24.8%,
5,222,9.1%,
6,635,26.1%,

0,1
Distinct count,59
Unique (%),2.4%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,216.74
Minimum,7
Maximum,903
Zeros (%),0.0%

0,1
Minimum,7
5-th percentile,13
Q1,27
Median,267
Q3,313
95-th percentile,387
Maximum,903
Range,896
Interquartile range,286

0,1
Standard deviation,148.02
Coef of variation,0.68293
Kurtosis,0.39963
Mean,216.74
MAD,126.53
Skewness,0.072505
Sum,527774
Variance,21910
Memory size,4.8 KiB

Value,Count,Frequency (%),Unnamed: 3
313,264,10.8%,
13,241,9.9%,
285,180,7.4%,
21,175,7.2%,
267,159,6.5%,
316,120,4.9%,
274,117,4.8%,
379,108,4.4%,
34,105,4.3%,
27,82,3.4%,

Value,Count,Frequency (%),Unnamed: 3
7,21,0.9%,
8,5,0.2%,
9,1,0.0%,
13,241,9.9%,
17,70,2.9%,

Value,Count,Frequency (%),Unnamed: 3
448,4,0.2%,
461,5,0.2%,
785,16,0.7%,
804,3,0.1%,
903,2,0.1%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.11869

0,1
0.0,2146
1.0,289

Value,Count,Frequency (%),Unnamed: 3
0.0,2146,88.1%,
1.0,289,11.9%,

0,1
Distinct count,7
Unique (%),0.3%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,3.6583
Minimum,1
Maximum,7
Zeros (%),0.0%

0,1
Minimum,1
5-th percentile,2
Q1,3
Median,4
Q3,4
95-th percentile,6
Maximum,7
Range,6
Interquartile range,1

0,1
Standard deviation,1.2191
Coef of variation,0.33325
Kurtosis,-0.26835
Mean,3.6583
MAD,0.98865
Skewness,0.24994
Sum,8908
Variance,1.4863
Memory size,2.5 KiB

Value,Count,Frequency (%),Unnamed: 3
4,811,33.3%,
3,703,28.9%,
2,357,14.7%,
5,275,11.3%,
6,221,9.1%,
1,56,2.3%,
7,12,0.5%,

Value,Count,Frequency (%),Unnamed: 3
1,56,2.3%,
2,357,14.7%,
3,703,28.9%,
4,811,33.3%,
5,275,11.3%,

Value,Count,Frequency (%),Unnamed: 3
3,703,28.9%,
4,811,33.3%,
5,275,11.3%,
6,221,9.1%,
7,12,0.5%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.07269

0,1
0.0,2258
1.0,177

Value,Count,Frequency (%),Unnamed: 3
0.0,2258,92.7%,
1.0,177,7.3%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.15483

0,1
0.0,2058
1.0,377

Value,Count,Frequency (%),Unnamed: 3
0.0,2058,84.5%,
1.0,377,15.5%,

0,1
Distinct count,4
Unique (%),0.2%
Missing (%),0.0%
Missing (n),0

0,1
Private,1067
,996
Public,213

Value,Count,Frequency (%),Unnamed: 3
Private,1067,43.8%,
,996,40.9%,
Public,213,8.7%,
Nonprofit,159,6.5%,

0,1
Distinct count,251
Unique (%),10.3%
Missing (%),64.6%
Missing (n),1574
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,2287.1
Minimum,0.0
Maximum,124500.0
Zeros (%),0.7%

0,1
Minimum,0.0
5-th percentile,20.0
Q1,98.0
Median,220.0
Q3,700.0
95-th percentile,10000.0
Maximum,124500.0
Range,124500.0
Interquartile range,602.0

0,1
Standard deviation,8905.4
Coef of variation,3.8939
Kurtosis,91.581
Mean,2287.1
MAD,3421.4
Skewness,8.3191
Sum,1.96915e+06
Variance,79307000
Memory size,9.6 KiB

Value,Count,Frequency (%),Unnamed: 3
150.0,27,1.1%,
100.0,24,1.0%,
500.0,20,0.8%,
0.0,16,0.7%,
60.0,16,0.7%,
120.0,15,0.6%,
125.0,15,0.6%,
400.0,14,0.6%,
40.0,14,0.6%,
200.0,14,0.6%,

Value,Count,Frequency (%),Unnamed: 3
0.0,16,0.7%,
1.0,2,0.1%,
3.0,2,0.1%,
9.0,2,0.1%,
10.0,2,0.1%,

Value,Count,Frequency (%),Unnamed: 3
43800.0,2,0.1%,
48000.0,1,0.0%,
48900.0,2,0.1%,
60657.0,2,0.1%,
124500.0,2,0.1%,

0,1
Distinct count,322
Unique (%),13.2%
Missing (%),65.7%
Missing (n),1599
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,587.69
Minimum,0.0
Maximum,47947.6
Zeros (%),0.1%

0,1
Minimum,0.0
5-th percentile,5.2
Q1,12.975
Median,33.35
Q3,133.1
95-th percentile,2585.9
Maximum,47947.6
Range,47947.6
Interquartile range,120.12

0,1
Standard deviation,2908.5
Coef of variation,4.9491
Kurtosis,176.251
Mean,587.69
MAD,909.14
Skewness,11.9811
Sum,491306.0
Variance,8459400
Memory size,9.6 KiB

Value,Count,Frequency (%),Unnamed: 3
37.5,18,0.7%,
17.5,17,0.7%,
25.0,13,0.5%,
75.0,11,0.5%,
5.0,10,0.4%,
8.69999980927,9,0.4%,
9.80000019073,8,0.3%,
16.8999996185,8,0.3%,
10.0,8,0.3%,
5.19999980927,8,0.3%,

Value,Count,Frequency (%),Unnamed: 3
0.0,2,0.1%,
0.10000000149,2,0.1%,
0.300000011921,4,0.2%,
0.40000000596,2,0.1%,
1.0,2,0.1%,

Value,Count,Frequency (%),Unnamed: 3
10500.0,2,0.1%,
14268.0,1,0.0%,
15479.5996094,2,0.1%,
21784.5,2,0.1%,
47947.6015625,2,0.1%,

0,1
Constant value,b

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.78727

0,1
1.0,1917
0.0,518

Value,Count,Frequency (%),Unnamed: 3
1.0,1917,78.7%,
0.0,518,21.3%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.16797

0,1
0.0,2026
1.0,409

Value,Count,Frequency (%),Unnamed: 3
0.0,2026,83.2%,
1.0,409,16.8%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.15113

0,1
0.0,2067
1.0,368

Value,Count,Frequency (%),Unnamed: 3
0.0,2067,84.9%,
1.0,368,15.1%,

0,1
Distinct count,4
Unique (%),0.2%
Missing (%),0.0%
Missing (n),0

0,1
,2175
somcol,126
colp,111

Value,Count,Frequency (%),Unnamed: 3
,2175,89.3%,
somcol,126,5.2%,
colp,111,4.6%,
hsg,23,0.9%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.33306

0,1
0.0,1624
1.0,811

Value,Count,Frequency (%),Unnamed: 3
0.0,1624,66.7%,
1.0,811,33.3%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
f,1886
m,549

Value,Count,Frequency (%),Unnamed: 3
f,1886,77.5%,
m,549,22.5%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.32731

0,1
0,1638
1,797

Value,Count,Frequency (%),Unnamed: 3
0,1638,67.3%,
1,797,32.7%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.077207

0,1
0.0,2247
1.0,188

Value,Count,Frequency (%),Unnamed: 3
0.0,2247,92.3%,
1.0,188,7.7%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.21396

0,1
0.0,1914
1.0,521

Value,Count,Frequency (%),Unnamed: 3
0.0,1914,78.6%,
1.0,521,21.4%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.03039

0,1
0.0,2361
1.0,74

Value,Count,Frequency (%),Unnamed: 3
0.0,2361,97.0%,
1.0,74,3.0%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.41437

0,1
0,1426
1,1009

Value,Count,Frequency (%),Unnamed: 3
0,1426,58.6%,
1,1009,41.4%,

0,1
Distinct count,2
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.56099

0,1
1,1366
0,1069

Value,Count,Frequency (%),Unnamed: 3
1,1366,56.1%,
0,1069,43.9%,

0,1
Distinct count,26
Unique (%),1.1%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,7.8296
Minimum,1
Maximum,44
Zeros (%),0.0%

0,1
Minimum,1
5-th percentile,2
Q1,5
Median,6
Q3,9
95-th percentile,18
Maximum,44
Range,43
Interquartile range,4

0,1
Standard deviation,5.0108
Coef of variation,0.63998
Kurtosis,3.9097
Mean,7.8296
MAD,3.5605
Skewness,1.7585
Sum,19065
Variance,25.108
Memory size,2.5 KiB

Value,Count,Frequency (%),Unnamed: 3
6,409,16.8%,
8,288,11.8%,
7,274,11.3%,
5,264,10.8%,
4,259,10.6%,
2,177,7.3%,
3,95,3.9%,
11,86,3.5%,
13,80,3.3%,
9,78,3.2%,

Value,Count,Frequency (%),Unnamed: 3
1,19,0.8%,
2,177,7.3%,
3,95,3.9%,
4,259,10.6%,
5,264,10.8%,

Value,Count,Frequency (%),Unnamed: 3
22,4,0.2%,
23,4,0.2%,
25,3,0.1%,
26,53,2.2%,
44,1,0.0%,

Unnamed: 0,id,ad,education,ofjobs,yearsexp,honors,volunteer,military,empholes,occupspecific,occupbroad,workinschool,email,computerskills,specialskills,firstname,sex,race,h,l,call,city,kind,adid,fracblack,fracwhite,lmedhhinc,fracdropout,fraccolp,linc,col,expminreq,schoolreq,eoe,parent_sales,parent_emp,branch_sales,branch_emp,fed,fracblack_empzip,fracwhite_empzip,lmedhhinc_empzip,fracdropout_empzip,fraccolp_empzip,linc_empzip,manager,supervisor,secretary,offsupport,salesrep,retailsales,req,expreq,comreq,educreq,compreq,orgreq,manuf,transcom,bankreal,trade,busservice,othservice,missind,ownership
2,b,1,4,1,6,0,0,0,0,19,1,1,0,1,0,Lakisha,f,b,0.0,1.0,0.0,c,a,384.0,0.104301,0.83737,10.466754,0.101335,0.591695,10.540329,1.0,5,,1.0,,,,,,,,,,,,0.0,1.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
3,b,1,3,4,6,0,1,0,1,313,5,0,1,1,1,Latonya,f,b,1.0,0.0,0.0,c,a,384.0,0.336165,0.63737,10.431908,0.108848,0.406576,10.412141,0.0,5,,1.0,,,,,,,,,,,,0.0,1.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
7,b,1,3,4,21,0,1,0,1,313,5,0,1,1,1,Kenya,f,b,1.0,0.0,0.0,c,a,385.0,0.116624,0.728339,10.287047,0.139843,0.365636,9.933725,0.0,some,,1.0,9.4,143.0,9.4,143.0,0.0,0.204764,0.727046,10.619399,0.070493,0.369903,10.007352,0.0,0.0,1.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,Nonprofit
8,b,1,4,3,3,0,0,0,0,316,6,0,0,1,1,Latonya,f,b,0.0,1.0,0.0,c,s,386.0,0.080736,0.888374,10.408828,0.233687,0.087285,9.532859,1.0,,,1.0,40.0,135.0,40.0,135.0,0.0,0.008141,0.973413,11.137956,0.047958,0.413306,10.393723,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,Private
9,b,1,4,2,6,0,1,0,0,263,4,1,1,0,1,Tyrone,m,b,1.0,0.0,0.0,c,s,386.0,0.992043,0.004814,8.841738,0.295093,0.053182,8.507345,1.0,,,1.0,40.0,135.0,40.0,135.0,0.0,0.008141,0.973413,11.137956,0.047958,0.413306,10.393723,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,Private


In [None]:
# Your solution to Q3 here


In [13]:
# Compute margin of error, confidence interval, and p-value. Try using both the bootstrapping and the frequentist statistical approaches.
ttest_ind(w.call, b.call)

Ttest_indResult(statistic=4.1147052908617514, pvalue=3.9408021031288859e-05)
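
The t-test above gives a p-value, but Q3 also asks for a margin of error and a confidence interval. Because `call` is a 0/1 outcome, one common frequentist option is a two-proportion z-test. The sketch below is a minimal illustration under that assumption, not the notebook's original solution. The helper name `two_proportion_summary` and the synthetic arrays are hypothetical and stand in for `w.call` and `b.call`.

```python
import numpy as np
from scipy import stats


def two_proportion_summary(calls_a, calls_b, confidence=0.95):
    """Summarize the difference in callback rates between two 0/1 samples.

    Hypothetical helper: returns the difference in proportions, its margin
    of error, a confidence interval, the z statistic, and a two-sided p-value.
    """
    a = np.asarray(calls_a, dtype=float)
    b = np.asarray(calls_b, dtype=float)
    n_a, n_b = a.size, b.size
    p_a, p_b = a.mean(), b.mean()
    diff = p_a - p_b

    # Unpooled standard error: used for the confidence interval.
    se = np.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = stats.norm.ppf(1 - (1 - confidence) / 2)
    margin = z_crit * se

    # Pooled standard error: used for the test, since the null hypothesis
    # says both groups share one callback rate.
    p_pool = (a.sum() + b.sum()) / (n_a + n_b)
    se_pool = np.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = diff / se_pool
    p_value = 2 * stats.norm.sf(abs(z))

    return {
        "diff": diff,
        "margin_of_error": margin,
        "ci": (diff - margin, diff + margin),
        "z": z,
        "p_value": p_value,
    }


# Synthetic 0/1 data for illustration only (NOT the study's data):
# 100 callbacks out of 1000 versus 50 callbacks out of 1000.
group_a = np.r_[np.ones(100), np.zeros(900)]
group_b = np.r_[np.ones(50), np.zeros(950)]
summary = two_proportion_summary(group_a, group_b)
print(summary)
```

With the notebook's own frames, the call would presumably be `two_proportion_summary(w.call, b.call)`. The interval uses the unpooled standard error while the test uses the pooled one, which is the usual convention for comparing two proportions.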

In [19]:
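# confidence interval
# Caution: b.call[b.call == 1] contains only 1s and b.call[b.call == 0] only 0s,
# so the two "means" below are exactly 1.0 and 0.0 and their difference is
# always -1.0. This does not yet yield a confidence interval.

mean_bc = b.call[b.call == 1].mean()   # mean of the called rows: always 1.0
mean_bnc = b.call[b.call == 0].mean()  # mean of the not-called rows: always 0.0
diff_mean = mean_bnc - mean_bc
print(diff_mean)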

-1.0


In [20]:
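# Same construction for the white-sounding names: filtering on the value of
# `call` before averaging forces the means to 1.0 and 0.0.
mean_wc = w.call[w.call == 1].mean()   # always 1.0
mean_wnc = w.call[w.call == 0].mean()  # always 0.0
diff_mean = mean_wnc - mean_wc
print(diff_mean)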

-1.0
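
Q3 also asks for a bootstrapping approach. The sketch below shows one way it might be done: a percentile bootstrap for the confidence interval of the difference in callback rates, and a permutation test for the p-value. The function names, the number of replicates, and the synthetic arrays are all assumptions made for illustration. They are not part of the original notebook.

```python
import numpy as np


def bootstrap_diff_ci(calls_a, calls_b, n_boot=10_000, confidence=0.95, seed=0):
    """Percentile bootstrap CI for mean(calls_a) - mean(calls_b)."""
    rng = np.random.default_rng(seed)
    a = np.asarray(calls_a, dtype=float)
    b = np.asarray(calls_b, dtype=float)
    diffs = np.empty(n_boot)
    for i in range(n_boot):
        # Resample each group with replacement, keeping its original size.
        diffs[i] = rng.choice(a, a.size).mean() - rng.choice(b, b.size).mean()
    alpha = 1 - confidence
    low, high = np.percentile(diffs, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return low, high


def permutation_p_value(calls_a, calls_b, n_perm=10_000, seed=0):
    """Two-sided permutation p-value for the difference in means."""
    rng = np.random.default_rng(seed)
    a = np.asarray(calls_a, dtype=float)
    b = np.asarray(calls_b, dtype=float)
    observed = a.mean() - b.mean()
    pooled = np.concatenate([a, b])
    extreme = 0
    for _ in range(n_perm):
        # Under the null hypothesis the group labels are exchangeable,
        # so shuffle the pooled outcomes and split them again.
        shuffled = rng.permutation(pooled)
        perm_diff = shuffled[:a.size].mean() - shuffled[a.size:].mean()
        if abs(perm_diff) >= abs(observed):
            extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (extreme + 1) / (n_perm + 1)


# Synthetic 0/1 data for illustration only (NOT the study's data).
group_a = np.r_[np.ones(100), np.zeros(900)]
group_b = np.r_[np.ones(50), np.zeros(950)]
ci_low, ci_high = bootstrap_diff_ci(group_a, group_b, n_boot=2000)
p_val = permutation_p_value(group_a, group_b, n_perm=2000)
print(ci_low, ci_high, p_val)
```

With the notebook's frames, `bootstrap_diff_ci(w.call, b.call)` and `permutation_p_value(w.call, b.call)` would be the corresponding calls. Half the width of the bootstrap interval can serve as an approximate margin of error.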


<div class="span5 alert alert-success">
<p> Your answers to Q4 and Q5 here </p>
</div>

Write a story describing the statistical significance in the context of the original problem.

Because this dataset does not accurately describe the population, it is biased; any result drawn from it is meaningless, and analyzing it is a waste of time.

Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

Because this dataset does not accurately describe the population, it is biased; any result drawn from it is meaningless, and analyzing it is a waste of time.

Performing such an analysis would be unethical and would give the data science community as a whole a bad reputation.