# Data Exploration
- This notebook performs exploratory data analysis on the dataset.
- To expand on the analysis, attach this notebook to a cluster with runtime version **16.3.x-cpu-ml-scala2.12**,
edit [the options of ydata-profiling (formerly pandas-profiling)](https://pandas-profiling.ydata.ai/docs/master/rtd/pages/advanced_usage.html), and rerun it.
- Explore completed trials in the [MLflow experiment](#mlflow/experiments/4297320214106197).

In [0]:
%pip install --no-deps ydata-profiling==4.8.3 pandas==2.2.3 visions==0.7.6 tzdata==2024.2

Collecting ydata-profiling==4.8.3
  Using cached ydata_profiling-4.8.3-py2.py3-none-any.whl.metadata (20 kB)


Collecting pandas==2.2.3
  Using cached pandas-2.2.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (89 kB)
Collecting visions==0.7.6
  Using cached visions-0.7.6-py3-none-any.whl.metadata (11 kB)


Collecting tzdata==2024.2
  Using cached tzdata-2024.2-py2.py3-none-any.whl.metadata (1.4 kB)
Using cached ydata_profiling-4.8.3-py2.py3-none-any.whl (359 kB)
Using cached pandas-2.2.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (12.7 MB)


Using cached visions-0.7.6-py3-none-any.whl (104 kB)
Using cached tzdata-2024.2-py2.py3-none-any.whl (346 kB)


Installing collected packages: ydata-profiling, visions, tzdata, pandas
  Attempting uninstall: ydata-profiling
    Found existing installation: ydata-profiling 4.9.0
    Not uninstalling ydata-profiling at /databricks/python3/lib/python3.12/site-packages, outside environment /local_disk0/.ephemeral_nfs/envs/pythonEnv-daa344d1-3273-48e8-8b47-ede6c7ed87ec
    Can't uninstall 'ydata-profiling'. No files were found to uninstall.


  Attempting uninstall: visions
    Found existing installation: visions 0.7.5
    Not uninstalling visions at /databricks/python3/lib/python3.12/site-packages, outside environment /local_disk0/.ephemeral_nfs/envs/pythonEnv-daa344d1-3273-48e8-8b47-ede6c7ed87ec
    Can't uninstall 'visions'. No files were found to uninstall.


  Attempting uninstall: pandas
    Found existing installation: pandas 1.5.3
    Not uninstalling pandas at /databricks/python3/lib/python3.12/site-packages, outside environment /local_disk0/.ephemeral_nfs/envs/pythonEnv-daa344d1-3273-48e8-8b47-ede6c7ed87ec
    Can't uninstall 'pandas'. No files were found to uninstall.


Successfully installed pandas-2.2.3 tzdata-2024.2 visions-0.7.6 ydata-profiling-4.8.3



[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m24.0[0m[39;49m -> [0m[32;49m25.2[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


Note: you may need to restart the kernel to use updated packages.


In [0]:
import mlflow
import os
import uuid
import shutil
import pandas as pd
import databricks.automl_runtime

# Download input data from mlflow into a pandas DataFrame
# Create temporary directory to download data
temp_dir = os.path.join(os.environ["SPARK_LOCAL_DIRS"], "tmp", str(uuid.uuid4())[:8])
os.makedirs(temp_dir)

# Download the artifact and read it
training_data_path = mlflow.artifacts.download_artifacts(run_id="e342157b8ede49668a32741350c5c4bb", artifact_path="data", dst_path=temp_dir)
df = pd.read_parquet(os.path.join(training_data_path, "training_data"))

# Delete the temporary data
shutil.rmtree(temp_dir)

target_col = "Personal_Loan"

# Drop columns created by AutoML and user-specified sample weight column (if applicable) before pandas-profiling
df = df.drop(['_automl_split_col_0000'], axis=1)

# Convert columns detected to be of semantic type numeric
numeric_columns = ["CCAvg", "Mortgage"]
df[numeric_columns] = df[numeric_columns].apply(pd.to_numeric, errors="coerce")

Thu Aug  7 05:05:06 2025 Connection to spark from PID  90438
Thu Aug  7 05:05:06 2025 Initialized gateway on port 42599


Thu Aug  7 05:05:07 2025 Connected to spark.



## Semantic Type Detection Alerts

For details about the definition of the semantic types and how to override the detection, see
[Databricks documentation on semantic type detection](https://docs.databricks.com/applications/machine-learning/automl.html#semantic-type-detection).

- Semantic type `numeric` detected for columns `CCAvg`, `Mortgage`. Training notebooks will convert each column to a numeric type and encode features based on numerical transformations.
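
As a rough illustration of what a `numeric` semantic type means for a string-typed column, the sketch below checks whether every value in a column parses as a number. It is a simplified stand-in, not the detection logic that Databricks AutoML uses; the function name `looks_numeric` and the sample values are hypothetical.

```python
def looks_numeric(values):
    """Return True if every non-blank value in `values` parses as a float."""
    for value in values:
        text = str(value).strip()
        if not text:
            continue  # treat blanks as missing rather than as non-numeric
        try:
            float(text)
        except ValueError:
            return False
    return True


# Hypothetical string-typed samples: one resembling CCAvg, one free-text column.
ccavg_sample = ["1.6", "1.0", "1.5", "0.3", "0.1"]
notes_sample = ["1.6", "n/a", "high"]

print(looks_numeric(ccavg_sample))
print(looks_numeric(notes_sample))
```

A column that passes a check like this can be converted safely with `pd.to_numeric`, which is what the data-loading cell does for `CCAvg` and `Mortgage`.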

## Profiling Results

In [0]:
from ydata_profiling import ProfileReport
df_profile = ProfileReport(df,
                           correlations={
                               "auto": {"calculate": True},
                               "pearson": {"calculate": True},
                               "spearman": {"calculate": True},
                               "kendall": {"calculate": True},
                               "phi_k": {"calculate": True},
                               "cramers": {"calculate": True},
                           }, title="Profiling Report", progress_bar=False, infer_dtypes=False)
profile_html = df_profile.to_html()

displayHTML(profile_html)



### Overview

Dataset statistic,Value
Number of variables,12
Number of observations,5000
Missing cells,0
Missing cells (%),0.0%
Duplicate rows,13
Duplicate rows (%),0.3%
Total size in memory,468.9 KiB
Average record size in memory,96.0 B

Variable type,Count
Text,10
Numeric,2

Alert,Type
Dataset has 13 (0.3%) duplicate rows,Duplicates
Age is highly overall correlated with Experience,High correlation
CCAvg is highly overall correlated with Personal_Loan,High correlation
Experience is highly overall correlated with Age,High correlation
Personal_Loan is highly overall correlated with CCAvg,High correlation
CCAvg has 106 (2.1%) zeros,Zeros
Mortgage has 3462 (69.2%) zeros,Zeros

Reproduction detail,Value
Analysis started,2025-08-07 05:05:09.638594
Analysis finished,2025-08-07 05:05:13.483120
Duration,3.84 seconds
Software version,ydata-profiling v4.8.3
Download configuration,config.json
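
The derived figures in the dataset statistics can be re-checked from the raw counts. The short sketch below does that arithmetic; the variable names are illustrative and the inputs are copied from the report.

```python
# Raw figures copied from the dataset statistics.
n_rows = 5000
n_duplicates = 13
total_kib = 468.9

# Share of duplicate rows, as a percentage of all observations.
duplicate_pct = 100 * n_duplicates / n_rows

# Average record size: total memory in bytes divided by the row count.
avg_record_bytes = total_kib * 1024 / n_rows

print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

The exact duplicate share is 0.26%, which the report rounds to 0.3%.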

### Age

Statistic,Value
Distinct,45
Distinct (%),0.9%
Missing,0
Missing (%),0.0%
Memory size,39.2 KiB
Max length,2
Median length,2
Mean length,2
Min length,2
Total characters,10000
Distinct characters,10
Distinct categories,1
Distinct scripts,1
Distinct blocks,1
Unique,0
Unique (%),0.0%

Sample row,Value
1st row,25
2nd row,35
3rd row,53
4th row,50
5th row,29

Most common value,Count,Frequency (%)
35,151,3.0%
43,149,3.0%
52,145,2.9%
58,143,2.9%
54,143,2.9%
50,138,2.8%
41,136,2.7%
30,136,2.7%
56,135,2.7%
34,134,2.7%

Character,Count,Frequency (%)
5,1870,18.7%
4,1761,17.6%
3,1748,17.5%
6,1145,11.5%
2,1002,10.0%
0,526,5.3%
1,512,5.1%
9,503,5.0%
8,479,4.8%
7,454,4.5%

Unicode category,Count,Frequency (%)
Decimal Number,10000,100.0%

Unicode script,Count,Frequency (%)
Common,10000,100.0%

Unicode block,Count,Frequency (%)
ASCII,10000,100.0%

### Family

Statistic,Value
Distinct,4
Distinct (%),0.1%
Missing,0
Missing (%),0.0%
Memory size,39.2 KiB
Max length,1
Median length,1
Mean length,1
Min length,1
Total characters,5000
Distinct characters,4
Distinct categories,1
Distinct scripts,1
Distinct blocks,1
Unique,0
Unique (%),0.0%

Sample row,Value
1st row,4
2nd row,4
3rd row,2
4th row,1
5th row,3

Most common value,Count,Frequency (%)
1,1472,29.4%
2,1296,25.9%
4,1222,24.4%
3,1010,20.2%

Character,Count,Frequency (%)
1,1472,29.4%
2,1296,25.9%
4,1222,24.4%
3,1010,20.2%

Unicode category,Count,Frequency (%)
Decimal Number,5000,100.0%

Unicode script,Count,Frequency (%)
Common,5000,100.0%

Unicode block,Count,Frequency (%)
ASCII,5000,100.0%

### CCAvg

Statistic,Value
Distinct,108
Distinct (%),2.2%
Missing,0
Missing (%),0.0%
Infinite,0
Infinite (%),0.0%
Mean,1.937938
Minimum,0
Maximum,10
Zeros,106
Zeros (%),2.1%
Negative,0
Negative (%),0.0%
Memory size,39.2 KiB

Quantile statistic,Value
Minimum,0.0
5-th percentile,0.1
Q1,0.7
median,1.5
Q3,2.5
95-th percentile,6.0
Maximum,10.0
Range,10.0
Interquartile range (IQR),1.8

Descriptive statistic,Value
Standard deviation,1.747659
Coefficient of variation (CV),0.90181367
Kurtosis,2.6467064
Mean,1.937938
Median Absolute Deviation (MAD),0.9
Skewness,1.5984433
Sum,9689.69
Variance,3.0543119
Monotonicity,Not monotonic

Most common value,Count,Frequency (%)
0.3,241,4.8%
1,231,4.6%
0.2,204,4.1%
2,188,3.8%
0.8,187,3.7%
0.1,183,3.7%
0.4,179,3.6%
1.5,178,3.6%
0.7,169,3.4%
0.5,163,3.3%

Smallest value,Count,Frequency (%)
0.0,106,2.1%
0.1,183,3.7%
0.2,204,4.1%
0.3,241,4.8%
0.4,179,3.6%
0.5,163,3.3%
0.6,118,2.4%
0.67,18,0.4%
0.7,169,3.4%
0.75,9,0.2%

Largest value,Count,Frequency (%)
10.0,3,0.1%
9.3,1,< 0.1%
9.0,2,< 0.1%
8.9,1,< 0.1%
8.8,9,0.2%
8.6,8,0.2%
8.5,2,< 0.1%
8.3,2,< 0.1%
8.2,1,< 0.1%
8.1,10,0.2%
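
Several of the CCAvg statistics are simple functions of the others. The sketch below re-derives the coefficient of variation, variance, interquartile range, and range from the reported mean, standard deviation, quartiles, and extremes. The inputs are the rounded values from the report, so the results agree only up to rounding.

```python
# Rounded inputs copied from the CCAvg statistics.
mean = 1.937938
std = 1.747659
q1, q3 = 0.7, 2.5
minimum, maximum = 0.0, 10.0

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"CV       = {cv:.4f}")
print(f"Variance = {variance:.4f}")
print(f"IQR      = {iqr:.1f}")
print(f"Range    = {value_range:.1f}")
```

A CV close to 0.9 together with a skewness of about 1.6 indicates a right-skewed distribution, with most values well below the maximum of 10.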

### Education

Statistic,Value
Distinct,3
Distinct (%),0.1%
Missing,0
Missing (%),0.0%
Memory size,39.2 KiB
Max length,1
Median length,1
Mean length,1
Min length,1
Total characters,5000
Distinct characters,3
Distinct categories,1
Distinct scripts,1
Distinct blocks,1
Unique,0
Unique (%),0.0%

Sample row,Value
1st row,1
2nd row,2
3rd row,2
4th row,3
5th row,2

Most common value,Count,Frequency (%)
1,2096,41.9%
3,1501,30.0%
2,1403,28.1%

Character,Count,Frequency (%)
1,2096,41.9%
3,1501,30.0%
2,1403,28.1%

Unicode category,Count,Frequency (%)
Decimal Number,5000,100.0%

Unicode script,Count,Frequency (%)
Common,5000,100.0%

Unicode block,Count,Frequency (%)
ASCII,5000,100.0%

### Mortgage

Statistic,Value
Distinct,347
Distinct (%),6.9%
Missing,0
Missing (%),0.0%
Infinite,0
Infinite (%),0.0%
Mean,56.4988
Minimum,0
Maximum,635
Zeros,3462
Zeros (%),69.2%
Negative,0
Negative (%),0.0%
Memory size,39.2 KiB

Quantile statistic,Value
Minimum,0
5-th percentile,0
Q1,0
median,0
Q3,101
95-th percentile,272
Maximum,635
Range,635
Interquartile range (IQR),101

Descriptive statistic,Value
Standard deviation,101.7138
Coefficient of variation (CV),1.8002825
Kurtosis,4.7567967
Mean,56.4988
Median Absolute Deviation (MAD),0
Skewness,2.1040023
Sum,282494
Variance,10345.698
Monotonicity,Not monotonic

Most common value,Count,Frequency (%)
0,3462,69.2%
98,17,0.3%
103,16,0.3%
91,16,0.3%
83,16,0.3%
119,16,0.3%
89,16,0.3%
78,15,0.3%
102,15,0.3%
90,15,0.3%

Smallest value,Count,Frequency (%)
0,3462,69.2%
75,8,0.2%
76,12,0.2%
77,4,0.1%
78,15,0.3%
79,11,0.2%
80,7,0.1%
81,13,0.3%
82,10,0.2%
83,16,0.3%

Largest value,Count,Frequency (%)
635,1,< 0.1%
617,1,< 0.1%
612,1,< 0.1%
601,1,< 0.1%
590,1,< 0.1%
589,1,< 0.1%
587,1,< 0.1%
582,1,< 0.1%
581,1,< 0.1%
577,1,< 0.1%
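
The Mortgage median and Median Absolute Deviation are both 0 because more than half of the rows (3462 of 5000) are zero. The sketch below reproduces that effect with placeholder data: the zero count matches the report, but the nonzero values are a stand-in (a constant 100), not the real mortgages.

```python
from statistics import median

n_zero, n_total = 3462, 5000

# Placeholder data: real zero count, but a constant stand-in for nonzero values.
values = [0] * n_zero + [100] * (n_total - n_zero)

med = median(values)
# MAD is the median of absolute deviations from the median.
mad = median(abs(v - med) for v in values)

print(f"median = {med}, MAD = {mad}")
```

Whenever one value accounts for more than 50% of a column, the median equals that value and the MAD collapses to 0, so neither statistic describes the spread of the nonzero part. The Q3 and 95th-percentile values are more informative for such a column.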

### Securities_Account

Statistic,Value
Distinct,2
Distinct (%),< 0.1%
Missing,0
Missing (%),0.0%
Memory size,39.2 KiB
Max length,1
Median length,1
Mean length,1
Min length,1
Total characters,5000
Distinct characters,2
Distinct categories,1
Distinct scripts,1
Distinct blocks,1
Unique,0
Unique (%),0.0%

Sample row,Value
1st row,1
2nd row,0
3rd row,0
4th row,0
5th row,0

Most common value,Count,Frequency (%)
0,4478,89.6%
1,522,10.4%

Character,Count,Frequency (%)
0,4478,89.6%
1,522,10.4%

Unicode category,Count,Frequency (%)
Decimal Number,5000,100.0%

Unicode script,Count,Frequency (%)
Common,5000,100.0%

Unicode block,Count,Frequency (%)
ASCII,5000,100.0%

### CD_Account

Statistic,Value
Distinct,2
Distinct (%),< 0.1%
Missing,0
Missing (%),0.0%
Memory size,39.2 KiB
Max length,1
Median length,1
Mean length,1
Min length,1
Total characters,5000
Distinct characters,2
Distinct categories,1
Distinct scripts,1
Distinct blocks,1
Unique,0
Unique (%),0.0%

Sample row,Value
1st row,0
2nd row,0
3rd row,0
4th row,0
5th row,0

Most common value,Count,Frequency (%)
0,4698,94.0%
1,302,6.0%

Character,Count,Frequency (%)
0,4698,94.0%
1,302,6.0%

Unicode category,Count,Frequency (%)
Decimal Number,5000,100.0%

Unicode script,Count,Frequency (%)
Common,5000,100.0%

Unicode block,Count,Frequency (%)
ASCII,5000,100.0%

### Online

Statistic,Value
Distinct,2
Distinct (%),< 0.1%
Missing,0
Missing (%),0.0%
Memory size,39.2 KiB
Max length,1
Median length,1
Mean length,1
Min length,1
Total characters,5000
Distinct characters,2
Distinct categories,1
Distinct scripts,1
Distinct blocks,1
Unique,0
Unique (%),0.0%

Sample row,Value
1st row,0
2nd row,0
3rd row,1
4th row,0
5th row,1

Most common value,Count,Frequency (%)
1,2984,59.7%
0,2016,40.3%

Character,Count,Frequency (%)
1,2984,59.7%
0,2016,40.3%

Unicode category,Count,Frequency (%)
Decimal Number,5000,100.0%

Unicode script,Count,Frequency (%)
Common,5000,100.0%

Unicode block,Count,Frequency (%)
ASCII,5000,100.0%

### CreditCard

Statistic,Value
Distinct,2
Distinct (%),< 0.1%
Missing,0
Missing (%),0.0%
Memory size,39.2 KiB
Max length,1
Median length,1
Mean length,1
Min length,1
Total characters,5000
Distinct characters,2
Distinct categories,1
Distinct scripts,1
Distinct blocks,1
Unique,0
Unique (%),0.0%

Sample row,Value
1st row,0
2nd row,1
3rd row,0
4th row,1
5th row,0

Most common value,Count,Frequency (%)
0,3530,70.6%
1,1470,29.4%

Character,Count,Frequency (%)
0,3530,70.6%
1,1470,29.4%

Unicode category,Count,Frequency (%)
Decimal Number,5000,100.0%

Unicode script,Count,Frequency (%)
Common,5000,100.0%

Unicode block,Count,Frequency (%)
ASCII,5000,100.0%

### Experience

Statistic,Value
Distinct,47
Distinct (%),0.9%
Missing,0
Missing (%),0.0%
Memory size,39.2 KiB
Max length,2.0
Median length,2.0
Mean length,1.7762
Min length,1.0
Total characters,8881
Distinct characters,11
Distinct categories,2
Distinct scripts,1
Distinct blocks,1
Unique,0
Unique (%),0.0%

Sample row,Value
1st row,1
2nd row,8
3rd row,27
4th row,24
5th row,5

Most common value,Count,Frequency (%)
32,154,3.1%
20,148,3.0%
9,147,2.9%
5,146,2.9%
23,144,2.9%
35,143,2.9%
25,142,2.8%
28,138,2.8%
18,137,2.7%
19,135,2.7%

Character,Count,Frequency (%)
2,1811,20.4%
1,1706,19.2%
3,1686,19.0%
4,607,6.8%
5,550,6.2%
0,515,5.8%
6,494,5.6%
9,491,5.5%
7,487,5.5%
8,482,5.4%

Unicode category,Count,Frequency (%)
Decimal Number,8829,99.4%
Dash Punctuation,52,0.6%

Character (Decimal Number),Count,Frequency (%)
2,1811,20.5%
1,1706,19.3%
3,1686,19.1%
4,607,6.9%
5,550,6.2%
0,515,5.8%
6,494,5.6%
9,491,5.6%
7,487,5.5%
8,482,5.5%

Character (Dash Punctuation),Count,Frequency (%)
-,52,100.0%

Unicode script,Count,Frequency (%)
Common,8881,100.0%

Unicode block,Count,Frequency (%)
ASCII,8881,100.0%
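
The 52 Dash Punctuation characters suggest that this column contains negative values, presumably one minus sign per value; the last rows of the sample include one such value, `-1` at index 4993. Because the column is profiled as text, a check like the following sketch can count them before deciding how to handle them. The function name and the sample list are hypothetical.

```python
def count_negative_strings(values):
    """Count values that parse as numbers strictly below zero."""
    count = 0
    for value in values:
        try:
            if float(value) < 0:
                count += 1
        except (TypeError, ValueError):
            continue  # skip values that are not numeric at all
    return count


# Hypothetical string-typed sample resembling the column's contents.
sample = ["1", "8", "27", "-1", "24", "-2", "5"]
print(count_negative_strings(sample))  # 2
```

Negative years of experience are not meaningful, so these rows are candidates for cleaning (for example, clipping to zero or treating them as missing) before training.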

### Income

Statistic,Value
Distinct,162
Distinct (%),3.2%
Missing,0
Missing (%),0.0%
Memory size,39.2 KiB
Max length,3.0
Median length,2.0
Mean length,2.2346
Min length,1.0
Total characters,11173
Distinct characters,10
Distinct categories,1
Distinct scripts,1
Distinct blocks,1
Unique,2
Unique (%),< 0.1%

Sample row,Value
1st row,49
2nd row,45
3rd row,72
4th row,22
5th row,45

Most common value,Count,Frequency (%)
44,85,1.7%
38,84,1.7%
81,83,1.7%
41,82,1.6%
39,81,1.6%
40,78,1.6%
42,77,1.5%
83,74,1.5%
43,70,1.4%
45,69,1.4%

Character,Count,Frequency (%)
1,2334,20.9%
4,1303,11.7%
2,1272,11.4%
3,1265,11.3%
5,1185,10.6%
8,1156,10.3%
9,928,8.3%
0,759,6.8%
6,497,4.4%
7,474,4.2%

Unicode category,Count,Frequency (%)
Decimal Number,11173,100.0%

Unicode script,Count,Frequency (%)
Common,11173,100.0%

Unicode block,Count,Frequency (%)
ASCII,11173,100.0%

### Personal_Loan

Statistic,Value
Distinct,2
Distinct (%),< 0.1%
Missing,0
Missing (%),0.0%
Memory size,39.2 KiB
Max length,1
Median length,1
Mean length,1
Min length,1
Total characters,5000
Distinct characters,2
Distinct categories,1
Distinct scripts,1
Distinct blocks,1
Unique,0
Unique (%),0.0%

Sample row,Value
1st row,0
2nd row,0
3rd row,0
4th row,0
5th row,0

Most common value,Count,Frequency (%)
0,4520,90.4%
1,480,9.6%

Character,Count,Frequency (%)
0,4520,90.4%
1,480,9.6%

Unicode category,Count,Frequency (%)
Decimal Number,5000,100.0%

Unicode script,Count,Frequency (%)
Common,5000,100.0%

Unicode block,Count,Frequency (%)
ASCII,5000,100.0%
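
`Personal_Loan` is the target column, and its value counts show a clear class imbalance. The sketch below quantifies it from the reported counts; the variable names are illustrative.

```python
# Target value counts copied from the report.
counts = {"0": 4520, "1": 480}
total = sum(counts.values())

positive_rate = counts["1"] / total      # share of rows with a loan
imbalance_ratio = counts["0"] / counts["1"]  # negatives per positive
# Accuracy of a trivial model that always predicts the majority class.
majority_baseline_accuracy = max(counts.values()) / total

print(f"Positive rate: {positive_rate:.1%}")
print(f"Imbalance ratio: {imbalance_ratio:.2f} : 1")
print(f"Majority-class baseline accuracy: {majority_baseline_accuracy:.1%}")
```

Since always predicting `0` already reaches about 90% accuracy, metrics such as precision, recall, or F1 on the positive class are likely to be more informative than accuracy when comparing trials.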

### Correlations

Variable,CCAvg,Mortgage
CCAvg,1.0,0.024
Mortgage,0.024,1.0

Variable,CCAvg,Mortgage
CCAvg,1.0,0.11
Mortgage,0.11,1.0

Variable,CCAvg,Mortgage
CCAvg,1.0,0.024
Mortgage,0.024,1.0

Variable,CCAvg,Mortgage
CCAvg,1.0,0.019
Mortgage,0.019,1.0

Variable,Age,Family,CCAvg,Education,Mortgage,Securities_Account,CD_Account,Online,CreditCard,Experience,Personal_Loan
Age,1.0,0.213,0.182,0.165,0.0,0.052,0.076,0.0,0.022,0.969,0.057
Family,0.213,1.0,0.207,0.135,0.065,0.0,0.057,0.0,0.02,0.234,0.11
CCAvg,0.182,0.207,1.0,0.227,0.245,0.021,0.227,0.025,0.0,0.192,0.58
Education,0.165,0.135,0.227,1.0,0.092,0.0,0.0,0.011,0.0,0.118,0.089
Mortgage,0.0,0.065,0.245,0.092,1.0,0.0,0.139,0.0,0.0,0.0,0.284
Securities_Account,0.052,0.0,0.021,0.0,0.0,1.0,0.475,0.0,0.003,0.0,0.024
CD_Account,0.076,0.057,0.227,0.0,0.139,0.475,1.0,0.271,0.422,0.063,0.474
Online,0.0,0.0,0.025,0.011,0.0,0.0,0.271,1.0,0.0,0.0,0.0
CreditCard,0.022,0.02,0.0,0.0,0.0,0.003,0.422,0.0,1.0,0.05,0.0
Experience,0.969,0.234,0.192,0.118,0.0,0.0,0.063,0.0,0.05,1.0,0.0
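
To see which pairs in the full correlation matrix would trigger a "High correlation" alert, the sketch below filters a handful of entries copied from that matrix. The 0.5 threshold is an assumption: it is consistent with the alerts listed in this report (0.969 and 0.58 are flagged, 0.475 is not), but the value in effect depends on the ydata-profiling configuration. The function name is hypothetical.

```python
# Selected off-diagonal entries copied from the correlation matrix.
pairs = {
    ("Age", "Experience"): 0.969,
    ("CCAvg", "Personal_Loan"): 0.58,
    ("CD_Account", "Securities_Account"): 0.475,
    ("CD_Account", "Personal_Loan"): 0.474,
    ("CD_Account", "CreditCard"): 0.422,
    ("Mortgage", "Personal_Loan"): 0.284,
}


def high_correlation_pairs(pairs, threshold=0.5):
    """Return (a, b, value) tuples at or above the threshold, strongest first."""
    flagged = [(a, b, v) for (a, b), v in pairs.items() if abs(v) >= threshold]
    return sorted(flagged, key=lambda item: abs(item[2]), reverse=True)


for a, b, value in high_correlation_pairs(pairs):
    print(f"{a} ~ {b}: {value}")
```

With this threshold only `Age`/`Experience` and `CCAvg`/`Personal_Loan` are returned, matching the alerts. The near-perfect `Age`/`Experience` relationship suggests the two features carry largely redundant information.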

### First rows

Index,Age,Family,CCAvg,Education,Mortgage,Securities_Account,CD_Account,Online,CreditCard,Experience,Income,Personal_Loan
0,25,4,1.6,1,0,1,0,0,0,1,49,0
1,35,4,1.0,2,0,0,0,0,1,8,45,0
2,53,2,1.5,2,0,0,0,1,0,27,72,0
3,50,1,0.3,3,0,0,0,0,1,24,22,0
4,29,3,0.1,2,0,0,0,1,0,5,45,0
5,48,2,3.8,3,0,1,0,0,0,23,114,0
6,67,1,2.0,1,0,1,0,0,0,41,112,0
7,38,4,4.7,3,134,0,0,0,0,14,130,1
8,55,1,0.5,2,0,1,0,0,1,28,21,0
9,56,4,0.9,2,111,0,0,1,0,31,25,0

### Last rows

Index,Age,Family,CCAvg,Education,Mortgage,Securities_Account,CD_Account,Online,CreditCard,Experience,Income,Personal_Loan
4990,47,3,1.5,1,75,0,0,1,0,21,32,0
4991,45,3,1.5,1,0,0,0,1,1,19,22,0
4992,63,2,0.7,3,0,0,0,1,1,37,39,0
4993,29,2,1.75,3,0,0,0,0,1,-1,50,0
4994,41,1,0.7,1,143,0,0,0,0,17,34,0
4995,50,1,2.6,2,213,0,0,0,1,26,92,0
4996,34,2,3.0,1,122,0,0,1,0,9,195,0
4997,32,1,2.9,3,0,0,0,0,0,6,78,0
4998,30,4,0.5,3,0,0,0,0,0,5,13,0
4999,63,2,0.3,3,0,0,0,0,0,39,24,0

### Duplicate rows

Index,Age,Family,CCAvg,Education,Mortgage,Securities_Account,CD_Account,Online,CreditCard,Experience,Income,Personal_Loan,# duplicates
0,28,3,0.1,2,0,0,0,1,0,4,43,0,2
1,29,4,0.3,2,0,0,0,1,0,3,31,0,2
2,29,4,2.1,3,0,0,0,1,0,3,39,0,2
3,31,1,0.4,3,0,0,0,1,0,7,18,0,2
4,36,4,2.2,2,0,0,0,1,0,10,80,0,2
5,38,1,0.67,3,0,0,0,1,0,8,21,0,2
6,39,1,1.5,3,0,0,0,0,0,15,65,0,2
7,40,2,0.8,3,0,0,0,0,0,14,28,0,2
8,44,3,0.3,3,0,0,0,1,0,20,72,0,2
9,50,1,1.3,2,0,0,0,1,0,25,58,0,2
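
A duplicate-row table like this one can be approximated by counting identical rows. The sketch below does so with the standard library on a few hypothetical, truncated rows; it is not how ydata-profiling computes its table, and on the real DataFrame `df.duplicated()` would be the more natural tool.

```python
from collections import Counter


def duplicate_groups(rows):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}


# Hypothetical rows, truncated to four columns for brevity.
rows = [
    (28, 3, 0.1, 2),
    (29, 4, 0.3, 2),
    (28, 3, 0.1, 2),
    (50, 1, 1.3, 2),
]

print(duplicate_groups(rows))  # {(28, 3, 0.1, 2): 2}
```

Whether to drop such rows depends on whether they represent repeated records or distinct customers who happen to share the same attribute values.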
