# Data Exploration
This notebook performs exploratory data analysis on the dataset.
To expand on the analysis, attach this notebook to the **autoML** cluster,
edit [the options of pandas-profiling](https://pandas-profiling.github.io/pandas-profiling/docs/master/rtd/pages/advanced_usage.html), and rerun it.
- Explore completed trials in the [MLflow experiment](#mlflow/experiments/503296075867861/s?orderByKey=metrics.%60val_f1_score%60&orderByAsc=false)
- Navigate to the parent notebook [here](#notebook/503296075867851) (If you launched the AutoML experiment using the Experiments UI, this link isn't very useful.)

Runtime Version: _10.3.x-cpu-ml-scala2.12_

In [0]:
import os
import uuid
import shutil
import pandas as pd
import databricks.automl_runtime

from mlflow.tracking import MlflowClient

# Download input data from mlflow into a pandas DataFrame
# Create temporary directory to download data
temp_dir = os.path.join(os.environ["SPARK_LOCAL_DIRS"], "tmp", str(uuid.uuid4())[:8])
os.makedirs(temp_dir)

# Download the artifact and read it
client = MlflowClient()
training_data_path = client.download_artifacts("e9a782a368624f4c9a32fa566d5033f2", "data", temp_dir)
df = pd.read_parquet(os.path.join(training_data_path, "training_data"))

# Delete the temporary data
shutil.rmtree(temp_dir)

target_col = "HeartDisease"
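
The download cell above creates a temporary directory and removes it by hand, so the directory is left behind if the download or the Parquet read fails. A minimal sketch of the same pattern using `tempfile.TemporaryDirectory` follows. The helper `load_with_temp_dir` and the stand-in callables `fake_download` and `fake_read` are hypothetical names introduced only for illustration; in the notebook they would correspond to the MLflow download and `pd.read_parquet` calls.

```python
import os
import tempfile


def load_with_temp_dir(download, read, base_dir=None):
    """Call download(temp_dir) -> path, then read(path), and always clean up."""
    # TemporaryDirectory removes the directory when the block exits,
    # including when download() or read() raises.
    with tempfile.TemporaryDirectory(dir=base_dir) as temp_dir:
        path = download(temp_dir)
        return read(path)


# Stand-in callables (assumptions) so the sketch runs without MLflow or Spark.
seen_dirs = []


def fake_download(temp_dir):
    seen_dirs.append(temp_dir)
    path = os.path.join(temp_dir, "training_data.txt")
    with open(path, "w") as handle:
        handle.write("BMI,HeartDisease\n16.6,No\n")
    return path


def fake_read(path):
    with open(path) as handle:
        return handle.read().splitlines()


rows = load_with_temp_dir(fake_download, fake_read)
print(rows)
print(os.path.exists(seen_dirs[0]))
```

The data must be fully read into memory inside the `with` block, as it is here, because the files no longer exist once the block exits.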

## Profiling Results

In [0]:
from pandas_profiling import ProfileReport
df_profile = ProfileReport(df, title="Profiling Report", progress_bar=False, infer_dtypes=False)
profile_html = df_profile.to_html()

displayHTML(profile_html)

Dataset statistic,Value
Number of variables,18
Number of observations,319795
Missing cells,0
Missing cells (%),0.0%
Duplicate rows,11852
Duplicate rows (%),3.7%
Total size in memory,43.9 MiB
Average record size in memory,144.0 B
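
As a quick consistency check, the derived figures in the dataset statistics can be recomputed from the raw counts. This is a sketch using only the numbers printed above; it assumes the report uses binary units (1 MiB = 1024 × 1024 bytes).

```python
n_rows = 319795           # Number of observations
duplicate_rows = 11852    # Duplicate rows
avg_record_bytes = 144.0  # Average record size in memory

# Share of rows that are exact duplicates of another row.
duplicate_pct = 100 * duplicate_rows / n_rows

# Total in-memory size implied by the average record size.
total_mib = avg_record_bytes * n_rows / (1024 * 1024)

print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Total size in memory: {total_mib:.1f} MiB")
```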

Variable type,Count
Numeric,4
Categorical,14

Alert,Type
Dataset has 11852 (3.7%) duplicate rows,Duplicates
PhysicalHealth is highly correlated with DiffWalking and 1 other fields,High correlation
DiffWalking is highly correlated with PhysicalHealth,High correlation
GenHealth is highly correlated with PhysicalHealth,High correlation
PhysicalHealth has 226589 (70.9%) zeros,Zeros
MentalHealth has 205401 (64.2%) zeros,Zeros

Reproduction detail,Value
Analysis started,2022-03-23 18:37:00.464008
Analysis finished,2022-03-23 18:37:27.136222
Duration,26.67 seconds
Software version,pandas-profiling v3.1.0

### BMI

Statistic,Value
Distinct,3604
Distinct (%),1.1%
Missing,0
Missing (%),0.0%
Infinite,0
Infinite (%),0.0%
Mean,28.32539852

Statistic,Value
Minimum,12.02
Maximum,94.85
Zeros,0
Zeros (%),0.0%
Negative,0
Negative (%),0.0%
Memory size,2.4 MiB

Quantile statistic,Value
Minimum,12.02
5-th percentile,20.12
Q1,24.03
median,27.34
Q3,31.42
95-th percentile,40.18
Maximum,94.85
Range,82.83
Interquartile range (IQR),7.39

Descriptive statistic,Value
Standard deviation,6.3561002
Coefficient of variation (CV),0.2243957908
Kurtosis,3.890043353
Mean,28.32539852
Median Absolute Deviation (MAD),3.66
Skewness,1.332430643
Sum,9058320.82
Variance,40.40000976
Monotonicity,Not monotonic
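
Several of the statistics above are derived from others, which gives a simple way to sanity-check the report. The sketch below recomputes them from the printed values only; the tolerances are assumptions that allow for rounding in the report.

```python
n_rows = 319795
mean = 28.32539852
std = 6.3561002
q1, q3 = 24.03, 31.42
minimum, maximum = 12.02, 94.85

cv = std / mean              # Coefficient of variation
variance = std ** 2          # Variance
iqr = q3 - q1                # Interquartile range
value_range = maximum - minimum
total = mean * n_rows        # Sum

print(f"CV={cv:.6f} variance={variance:.4f}")
print(f"IQR={iqr:.2f} range={value_range:.2f} sum={total:.2f}")
```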

Value,Count,Frequency (%)
26.63,3762,1.2%
27.46,2767,0.9%
27.44,2723,0.9%
24.41,2696,0.8%
27.12,2525,0.8%
25.1,2262,0.7%
28.7,1968,0.6%
29.53,1894,0.6%
32.28,1878,0.6%
29.29,1869,0.6%

Value,Count,Frequency (%)
12.02,2,< 0.1%
12.08,1,< 0.1%
12.13,1,< 0.1%
12.16,1,< 0.1%
12.2,1,< 0.1%
12.21,1,< 0.1%
12.26,1,< 0.1%
12.27,1,< 0.1%
12.4,3,< 0.1%
12.44,1,< 0.1%

Value,Count,Frequency (%)
94.85,1,< 0.1%
94.66,1,< 0.1%
93.97,1,< 0.1%
93.86,1,< 0.1%
92.53,1,< 0.1%
91.82,1,< 0.1%
91.55,2,< 0.1%
88.6,1,< 0.1%
88.19,1,< 0.1%
87.05,1,< 0.1%

### Smoking

Statistic,Value
Distinct,2
Distinct (%),< 0.1%
Missing,0
Missing (%),0.0%
Memory size,2.4 MiB

Category,Count
No,187887
Yes,131908

Length statistic,Value
Max length,3.0
Median length,2.0
Mean length,2.412476743
Min length,2.0

Character statistic,Value
Total characters,0
Distinct characters,0
Distinct categories,0
Distinct scripts,0
Distinct blocks,0

Uniqueness statistic,Value
Unique,0
Unique (%),0.0%

Sample row,Value
1st row,Yes
2nd row,No
3rd row,Yes
4th row,No
5th row,No

Value,Count,Frequency (%)
No,187887,58.8%
Yes,131908,41.2%

Word,Count,Frequency (%)
no,187887,58.8%
yes,131908,41.2%

### AlcoholDrinking

Statistic,Value
Distinct,2
Distinct (%),< 0.1%
Missing,0
Missing (%),0.0%
Memory size,2.4 MiB

Category,Count
No,298018
Yes,21777

Length statistic,Value
Max length,3.0
Median length,2.0
Mean length,2.068096749
Min length,2.0

Character statistic,Value
Total characters,0
Distinct characters,0
Distinct categories,0
Distinct scripts,0
Distinct blocks,0

Uniqueness statistic,Value
Unique,0
Unique (%),0.0%

Sample row,Value
1st row,No
2nd row,No
3rd row,No
4th row,No
5th row,No

Value,Count,Frequency (%)
No,298018,93.2%
Yes,21777,6.8%

Word,Count,Frequency (%)
no,298018,93.2%
yes,21777,6.8%

### Stroke

Statistic,Value
Distinct,2
Distinct (%),< 0.1%
Missing,0
Missing (%),0.0%
Memory size,2.4 MiB

Category,Count
No,307726
Yes,12069

Length statistic,Value
Max length,3.0
Median length,2.0
Mean length,2.037739802
Min length,2.0

Character statistic,Value
Total characters,0
Distinct characters,0
Distinct categories,0
Distinct scripts,0
Distinct blocks,0

Uniqueness statistic,Value
Unique,0
Unique (%),0.0%

Sample row,Value
1st row,No
2nd row,Yes
3rd row,No
4th row,No
5th row,No

Value,Count,Frequency (%)
No,307726,96.2%
Yes,12069,3.8%

Word,Count,Frequency (%)
no,307726,96.2%
yes,12069,3.8%

### PhysicalHealth

Statistic,Value
Distinct,31
Distinct (%),< 0.1%
Missing,0
Missing (%),0.0%
Infinite,0
Infinite (%),0.0%
Mean,3.371710002

Statistic,Value
Minimum,0
Maximum,30
Zeros,226589
Zeros (%),70.9%
Negative,0
Negative (%),0.0%
Memory size,2.4 MiB

Quantile statistic,Value
Minimum,0
5-th percentile,0
Q1,0
median,0
Q3,2
95-th percentile,30
Maximum,30
Range,30
Interquartile range (IQR),2

Descriptive statistic,Value
Standard deviation,7.950850183
Coefficient of variation (CV),2.358106177
Kurtosis,5.528449638
Mean,3.371710002
Median Absolute Deviation (MAD),0
Skewness,2.603973262
Sum,1078256
Variance,63.21601863
Monotonicity,Not monotonic

Value,Count,Frequency (%)
0,226589,70.9%
30,19509,6.1%
2,14880,4.7%
1,10489,3.3%
3,8617,2.7%
5,7606,2.4%
10,5453,1.7%
15,5012,1.6%
7,4629,1.4%
4,4468,1.4%

Value,Count,Frequency (%)
0,226589,70.9%
1,10489,3.3%
2,14880,4.7%
3,8617,2.7%
4,4468,1.4%
5,7606,2.4%
6,1270,0.4%
7,4629,1.4%
8,924,0.3%
9,180,0.1%

Value,Count,Frequency (%)
30,19509,6.1%
29,204,0.1%
28,446,0.1%
27,124,< 0.1%
26,66,< 0.1%
25,1164,0.4%
24,67,< 0.1%
23,46,< 0.1%
22,89,< 0.1%
21,626,0.2%

### MentalHealth

Statistic,Value
Distinct,31
Distinct (%),< 0.1%
Missing,0
Missing (%),0.0%
Infinite,0
Infinite (%),0.0%
Mean,3.898366141

Statistic,Value
Minimum,0
Maximum,30
Zeros,205401
Zeros (%),64.2%
Negative,0
Negative (%),0.0%
Memory size,2.4 MiB

Quantile statistic,Value
Minimum,0
5-th percentile,0
Q1,0
median,0
Q3,3
95-th percentile,30
Maximum,30
Range,30
Interquartile range (IQR),3

Descriptive statistic,Value
Standard deviation,7.955235219
Coefficient of variation (CV),2.040658812
Kurtosis,4.40393662
Mean,3.898366141
Median Absolute Deviation (MAD),0
Skewness,2.331111549
Sum,1246678
Variance,63.28576739
Monotonicity,Not monotonic

Value,Count,Frequency (%)
0,205401,64.2%
30,17373,5.4%
2,16495,5.2%
5,14149,4.4%
10,10513,3.3%
3,10466,3.3%
15,9896,3.1%
1,9291,2.9%
7,5528,1.7%
20,5431,1.7%

Value,Count,Frequency (%)
0,205401,64.2%
1,9291,2.9%
2,16495,5.2%
3,10466,3.3%
4,5379,1.7%
5,14149,4.4%
6,1510,0.5%
7,5528,1.7%
8,1094,0.3%
9,203,0.1%

Value,Count,Frequency (%)
30,17373,5.4%
29,317,0.1%
28,515,0.2%
27,126,< 0.1%
26,59,< 0.1%
25,1954,0.6%
24,67,< 0.1%
23,68,< 0.1%
22,98,< 0.1%
21,352,0.1%

### DiffWalking

Statistic,Value
Distinct,2
Distinct (%),< 0.1%
Missing,0
Missing (%),0.0%
Memory size,2.4 MiB

Category,Count
No,275385
Yes,44410

Length statistic,Value
Max length,3.0
Median length,2.0
Mean length,2.138870214
Min length,2.0

Character statistic,Value
Total characters,0
Distinct characters,0
Distinct categories,0
Distinct scripts,0
Distinct blocks,0

Uniqueness statistic,Value
Unique,0
Unique (%),0.0%

Sample row,Value
1st row,No
2nd row,No
3rd row,No
4th row,No
5th row,Yes

Value,Count,Frequency (%)
No,275385,86.1%
Yes,44410,13.9%

Word,Count,Frequency (%)
no,275385,86.1%
yes,44410,13.9%

### Sex

Statistic,Value
Distinct,2
Distinct (%),< 0.1%
Missing,0
Missing (%),0.0%
Memory size,2.4 MiB

Category,Count
Female,167805
Male,151990

Length statistic,Value
Max length,6.0
Median length,6.0
Mean length,5.049453556
Min length,4.0

Character statistic,Value
Total characters,0
Distinct characters,0
Distinct categories,0
Distinct scripts,0
Distinct blocks,0

Uniqueness statistic,Value
Unique,0
Unique (%),0.0%

Sample row,Value
1st row,Female
2nd row,Female
3rd row,Male
4th row,Female
5th row,Female

Value,Count,Frequency (%)
Female,167805,52.5%
Male,151990,47.5%

Word,Count,Frequency (%)
female,167805,52.5%
male,151990,47.5%

### AgeCategory

Statistic,Value
Distinct,13
Distinct (%),< 0.1%
Missing,0
Missing (%),0.0%
Memory size,2.4 MiB

Category,Count
65-69,34151
60-64,33686
70-74,31065
55-59,29757
50-54,25382
Other values (8),165754

Length statistic,Value
Max length,11.0
Median length,5.0
Mean length,5.453159055
Min length,5.0

Character statistic,Value
Total characters,0
Distinct characters,0
Distinct categories,0
Distinct scripts,0
Distinct blocks,0

Uniqueness statistic,Value
Unique,0
Unique (%),0.0%

Sample row,Value
1st row,55-59
2nd row,80 or older
3rd row,65-69
4th row,75-79
5th row,40-44

Value,Count,Frequency (%)
65-69,34151,10.7%
60-64,33686,10.5%
70-74,31065,9.7%
55-59,29757,9.3%
50-54,25382,7.9%
80 or older,24153,7.6%
45-49,21791,6.8%
75-79,21482,6.7%
18-24,21064,6.6%
40-44,21006,6.6%

Word,Count,Frequency (%)
65-69,34151,9.3%
60-64,33686,9.2%
70-74,31065,8.4%
55-59,29757,8.1%
50-54,25382,6.9%
80,24153,6.6%
or,24153,6.6%
older,24153,6.6%
45-49,21791,5.9%
75-79,21482,5.8%

### Race

Statistic,Value
Distinct,6
Distinct (%),< 0.1%
Missing,0
Missing (%),0.0%
Memory size,2.4 MiB

Category,Count
White,245212
Hispanic,27446
Black,22939
Other,10928
Asian,8068

Length statistic,Value
Max length,30.0
Median length,5.0
Mean length,5.664137963
Min length,5.0

Character statistic,Value
Total characters,0
Distinct characters,0
Distinct categories,0
Distinct scripts,0
Distinct blocks,0

Uniqueness statistic,Value
Unique,0
Unique (%),0.0%

Sample row,Value
1st row,White
2nd row,White
3rd row,White
4th row,White
5th row,White

Value,Count,Frequency (%)
White,245212,76.7%
Hispanic,27446,8.6%
Black,22939,7.2%
Other,10928,3.4%
Asian,8068,2.5%
American Indian/Alaskan Native,5202,1.6%

Word,Count,Frequency (%)
white,245212,74.3%
hispanic,27446,8.3%
black,22939,6.9%
other,10928,3.3%
asian,8068,2.4%
american,5202,1.6%
indian/alaskan,5202,1.6%
native,5202,1.6%

### Diabetic

Statistic,Value
Distinct,4
Distinct (%),< 0.1%
Missing,0
Missing (%),0.0%
Memory size,2.4 MiB

Category,Count
No,269653
Yes,40802
"No, borderline diabetes",6781
Yes (during pregnancy),2559

Length statistic,Value
Max length,23.0
Median length,2.0
Mean length,2.7329164
Min length,2.0

Character statistic,Value
Total characters,0
Distinct characters,0
Distinct categories,0
Distinct scripts,0
Distinct blocks,0

Uniqueness statistic,Value
Unique,0
Unique (%),0.0%

Sample row,Value
1st row,Yes
2nd row,No
3rd row,Yes
4th row,No
5th row,No

Value,Count,Frequency (%)
No,269653,84.3%
Yes,40802,12.8%
"No, borderline diabetes",6781,2.1%
Yes (during pregnancy),2559,0.8%

Word,Count,Frequency (%)
no,276434,81.7%
yes,43361,12.8%
borderline,6781,2.0%
diabetes,6781,2.0%
during,2559,0.8%
pregnancy,2559,0.8%
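
The second frequency table for this variable counts words rather than whole category labels, which is why `no` (276434) exceeds the count of the `No` category (269653): it also includes the 6781 rows labelled `No, borderline diabetes`. The sketch below reproduces those word counts from the category counts. It assumes labels are lowercased, split on whitespace, and stripped of surrounding punctuation; that tokenization is an assumption that happens to match the table, not a statement of how pandas-profiling is implemented.

```python
import string
from collections import Counter

# Category counts copied from the first frequency table.
category_counts = {
    "No": 269653,
    "Yes": 40802,
    "No, borderline diabetes": 6781,
    "Yes (during pregnancy)": 2559,
}

word_counts = Counter()
for label, count in category_counts.items():
    for token in label.lower().split():
        # Drop punctuation such as the trailing comma or the parentheses.
        word = token.strip(string.punctuation)
        if word:
            word_counts[word] += count

total_words = sum(word_counts.values())
for word, count in word_counts.most_common():
    print(f"{word},{count},{100 * count / total_words:.1f}%")
```

The same effect explains why percentages in the word tables for multi-word categories (for example `Very good` in GenHealth) do not match the category percentages: the denominator is the number of words, not the number of rows.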

### PhysicalActivity

Statistic,Value
Distinct,2
Distinct (%),< 0.1%
Missing,0
Missing (%),0.0%
Memory size,2.4 MiB

Category,Count
Yes,247957
No,71838

Length statistic,Value
Max length,3.0
Median length,3.0
Mean length,2.775362342
Min length,2.0

Character statistic,Value
Total characters,0
Distinct characters,0
Distinct categories,0
Distinct scripts,0
Distinct blocks,0

Uniqueness statistic,Value
Unique,0
Unique (%),0.0%

Sample row,Value
1st row,Yes
2nd row,Yes
3rd row,Yes
4th row,No
5th row,Yes

Value,Count,Frequency (%)
Yes,247957,77.5%
No,71838,22.5%

Word,Count,Frequency (%)
yes,247957,77.5%
no,71838,22.5%

### GenHealth

Statistic,Value
Distinct,5
Distinct (%),< 0.1%
Missing,0
Missing (%),0.0%
Memory size,2.4 MiB

Category,Count
Very good,113858
Good,93129
Excellent,66842
Fair,34677
Poor,11289

Length statistic,Value
Max length,9.0
Median length,9.0
Mean length,6.825247424
Min length,4.0

Character statistic,Value
Total characters,0
Distinct characters,0
Distinct categories,0
Distinct scripts,0
Distinct blocks,0

Uniqueness statistic,Value
Unique,0
Unique (%),0.0%

Sample row,Value
1st row,Very good
2nd row,Very good
3rd row,Fair
4th row,Good
5th row,Very good

Value,Count,Frequency (%)
Very good,113858,35.6%
Good,93129,29.1%
Excellent,66842,20.9%
Fair,34677,10.8%
Poor,11289,3.5%

Word,Count,Frequency (%)
good,206987,47.7%
very,113858,26.3%
excellent,66842,15.4%
fair,34677,8.0%
poor,11289,2.6%

### SleepTime

Statistic,Value
Distinct,24
Distinct (%),< 0.1%
Missing,0
Missing (%),0.0%
Infinite,0
Infinite (%),0.0%
Mean,7.097074688

Statistic,Value
Minimum,1
Maximum,24
Zeros,0
Zeros (%),0.0%
Negative,0
Negative (%),0.0%
Memory size,2.4 MiB

Quantile statistic,Value
Minimum,1
5-th percentile,5
Q1,6
median,7
Q3,8
95-th percentile,9
Maximum,24
Range,23
Interquartile range (IQR),2

Descriptive statistic,Value
Standard deviation,1.436007061
Coefficient of variation (CV),0.202337882
Kurtosis,7.854868567
Mean,7.097074688
Median Absolute Deviation (MAD),1
Skewness,0.6790346208
Sum,2269609
Variance,2.062116279
Monotonicity,Not monotonic

Value,Count,Frequency (%)
7,97751,30.6%
8,97602,30.5%
6,66721,20.9%
5,19184,6.0%
9,16041,5.0%
10,7796,2.4%
4,7750,2.4%
12,2205,0.7%
3,1992,0.6%
2,788,0.2%

Value,Count,Frequency (%)
1,551,0.2%
2,788,0.2%
3,1992,0.6%
4,7750,2.4%
5,19184,6.0%
6,66721,20.9%
7,97751,30.6%
8,97602,30.5%
9,16041,5.0%
10,7796,2.4%

Value,Count,Frequency (%)
24,30,< 0.1%
23,3,< 0.1%
22,9,< 0.1%
21,2,< 0.1%
20,64,< 0.1%
19,3,< 0.1%
18,102,< 0.1%
17,21,< 0.1%
16,236,0.1%
15,189,0.1%

### Asthma

Statistic,Value
Distinct,2
Distinct (%),< 0.1%
Missing,0
Missing (%),0.0%
Memory size,2.4 MiB

Category,Count
No,276923
Yes,42872

Length statistic,Value
Max length,3.0
Median length,2.0
Mean length,2.134060883
Min length,2.0

Character statistic,Value
Total characters,0
Distinct characters,0
Distinct categories,0
Distinct scripts,0
Distinct blocks,0

Uniqueness statistic,Value
Unique,0
Unique (%),0.0%

Sample row,Value
1st row,Yes
2nd row,No
3rd row,Yes
4th row,No
5th row,No

Value,Count,Frequency (%)
No,276923,86.6%
Yes,42872,13.4%

Word,Count,Frequency (%)
no,276923,86.6%
yes,42872,13.4%

### KidneyDisease

Statistic,Value
Distinct,2
Distinct (%),< 0.1%
Missing,0
Missing (%),0.0%
Memory size,2.4 MiB

Category,Count
No,308016
Yes,11779

Length statistic,Value
Max length,3.0
Median length,2.0
Mean length,2.036832971
Min length,2.0

Character statistic,Value
Total characters,0
Distinct characters,0
Distinct categories,0
Distinct scripts,0
Distinct blocks,0

Uniqueness statistic,Value
Unique,0
Unique (%),0.0%

Sample row,Value
1st row,No
2nd row,No
3rd row,No
4th row,No
5th row,No

Value,Count,Frequency (%)
No,308016,96.3%
Yes,11779,3.7%

Word,Count,Frequency (%)
no,308016,96.3%
yes,11779,3.7%

### SkinCancer

Statistic,Value
Distinct,2
Distinct (%),< 0.1%
Missing,0
Missing (%),0.0%
Memory size,2.4 MiB

Category,Count
No,289976
Yes,29819

Length statistic,Value
Max length,3.0
Median length,2.0
Mean length,2.09324411
Min length,2.0

Character statistic,Value
Total characters,0
Distinct characters,0
Distinct categories,0
Distinct scripts,0
Distinct blocks,0

Uniqueness statistic,Value
Unique,0
Unique (%),0.0%

Sample row,Value
1st row,Yes
2nd row,No
3rd row,No
4th row,Yes
5th row,No

Value,Count,Frequency (%)
No,289976,90.7%
Yes,29819,9.3%

Word,Count,Frequency (%)
no,289976,90.7%
yes,29819,9.3%

### HeartDisease

Statistic,Value
Distinct,2
Distinct (%),< 0.1%
Missing,0
Missing (%),0.0%
Memory size,2.4 MiB

Category,Count
No,292422
Yes,27373

Length statistic,Value
Max length,3.0
Median length,2.0
Mean length,2.08559546
Min length,2.0

Character statistic,Value
Total characters,0
Distinct characters,0
Distinct categories,0
Distinct scripts,0
Distinct blocks,0

Uniqueness statistic,Value
Unique,0
Unique (%),0.0%

Sample row,Value
1st row,No
2nd row,No
3rd row,No
4th row,No
5th row,No

Value,Count,Frequency (%)
No,292422,91.4%
Yes,27373,8.6%

Word,Count,Frequency (%)
no,292422,91.4%
yes,27373,8.6%
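
The target column `HeartDisease` is heavily imbalanced: only 8.6% of rows are `Yes`. This is presumably why the MLflow experiment linked at the top of the notebook is sorted by `val_f1_score` rather than accuracy. The sketch below uses only the counts printed above to show how two trivial classifiers would score; it is an illustration of the imbalance, not part of the AutoML run.

```python
negatives = 292422  # HeartDisease == "No"
positives = 27373   # HeartDisease == "Yes"
total = negatives + positives

# Always predicting "No" is right for every negative row, yet it finds
# no positive cases, so its recall and F1 for "Yes" are both zero.
majority_accuracy = negatives / total

# Always predicting "Yes" finds every positive (recall 1.0), but only
# the true positives among all rows are correct (low precision).
precision = positives / total
recall = 1.0
f1_always_yes = 2 * precision * recall / (precision + recall)

print(f"Accuracy of always predicting No: {majority_accuracy:.3f}")
print(f"F1 of always predicting Yes: {f1_always_yes:.3f}")
```

A model therefore needs an accuracy well above the majority-class rate, or a clearly higher F1 than the always-`Yes` baseline, before it can be said to have learned anything about the target.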

### First rows

Index,BMI,Smoking,AlcoholDrinking,Stroke,PhysicalHealth,MentalHealth,DiffWalking,Sex,AgeCategory,Race,Diabetic,PhysicalActivity,GenHealth,SleepTime,Asthma,KidneyDisease,SkinCancer,HeartDisease
0,16.6,Yes,No,No,3.0,30.0,No,Female,55-59,White,Yes,Yes,Very good,5.0,Yes,No,Yes,No
1,20.34,No,No,Yes,0.0,0.0,No,Female,80 or older,White,No,Yes,Very good,7.0,No,No,No,No
2,26.58,Yes,No,No,20.0,30.0,No,Male,65-69,White,Yes,Yes,Fair,8.0,Yes,No,No,No
3,24.21,No,No,No,0.0,0.0,No,Female,75-79,White,No,No,Good,6.0,No,No,Yes,No
4,23.71,No,No,No,28.0,0.0,Yes,Female,40-44,White,No,Yes,Very good,8.0,No,No,No,No
5,28.87,Yes,No,No,6.0,0.0,Yes,Female,75-79,Black,No,No,Fair,12.0,No,No,No,Yes
6,21.63,No,No,No,15.0,0.0,No,Female,70-74,White,No,Yes,Fair,4.0,Yes,No,Yes,No
7,31.64,Yes,No,No,5.0,0.0,Yes,Female,80 or older,White,Yes,No,Good,9.0,Yes,No,No,No
8,26.45,No,No,No,0.0,0.0,No,Female,80 or older,White,"No, borderline diabetes",No,Fair,5.0,No,Yes,No,No
9,40.69,No,No,No,0.0,0.0,Yes,Male,65-69,White,No,Yes,Good,10.0,No,No,No,No

### Last rows

Index,BMI,Smoking,AlcoholDrinking,Stroke,PhysicalHealth,MentalHealth,DiffWalking,Sex,AgeCategory,Race,Diabetic,PhysicalActivity,GenHealth,SleepTime,Asthma,KidneyDisease,SkinCancer,HeartDisease
319785,31.93,No,Yes,No,0.0,0.0,No,Male,65-69,Hispanic,No,Yes,Good,7.0,No,No,No,No
319786,33.2,Yes,No,No,0.0,0.0,No,Female,60-64,Hispanic,Yes,Yes,Very good,8.0,Yes,No,No,Yes
319787,36.54,No,No,No,7.0,0.0,No,Male,30-34,Hispanic,No,No,Good,9.0,No,No,No,No
319788,23.38,No,No,No,0.0,0.0,No,Female,60-64,Hispanic,No,Yes,Excellent,6.0,No,No,No,No
319789,22.22,No,No,No,0.0,0.0,No,Female,18-24,Hispanic,No,Yes,Excellent,8.0,No,No,No,No
319790,27.41,Yes,No,No,7.0,0.0,Yes,Male,60-64,Hispanic,Yes,No,Fair,6.0,Yes,No,No,Yes
319791,29.84,Yes,No,No,0.0,0.0,No,Male,35-39,Hispanic,No,Yes,Very good,5.0,Yes,No,No,No
319792,24.24,No,No,No,0.0,0.0,No,Female,45-49,Hispanic,No,Yes,Good,6.0,No,No,No,No
319793,32.81,No,No,No,0.0,0.0,No,Female,25-29,Hispanic,No,No,Good,12.0,No,No,No,No
319794,46.56,No,No,No,0.0,0.0,No,Female,80 or older,Hispanic,No,Yes,Good,8.0,No,No,No,No

### Most frequently occurring duplicate rows

Index,BMI,Smoking,AlcoholDrinking,Stroke,PhysicalHealth,MentalHealth,DiffWalking,Sex,AgeCategory,Race,Diabetic,PhysicalActivity,GenHealth,SleepTime,Asthma,KidneyDisease,SkinCancer,HeartDisease,# duplicates
7545,27.44,No,No,No,0.0,0.0,No,Female,65-69,White,No,Yes,Very good,7.0,No,No,No,No,16
6654,26.63,No,No,No,0.0,0.0,No,Female,65-69,White,No,Yes,Very good,8.0,No,No,No,No,15
3808,24.41,No,No,No,0.0,0.0,No,Male,55-59,White,No,Yes,Excellent,7.0,No,No,No,No,14
3755,24.41,No,No,No,0.0,0.0,No,Male,18-24,White,No,Yes,Excellent,7.0,No,No,No,No,13
7705,27.46,No,No,No,0.0,0.0,No,Female,65-69,White,No,Yes,Very good,7.0,No,No,No,No,13
3793,24.41,No,No,No,0.0,0.0,No,Male,45-49,White,No,Yes,Excellent,7.0,No,No,No,No,12
3823,24.41,No,No,No,0.0,0.0,No,Male,60-64,White,No,Yes,Excellent,8.0,No,No,No,No,12
6645,26.63,No,No,No,0.0,0.0,No,Female,60-64,White,No,Yes,Very good,7.0,No,No,No,No,12
6931,27.12,No,No,No,0.0,0.0,No,Male,18-24,White,No,Yes,Excellent,8.0,No,No,No,No,12
6995,27.12,No,No,No,0.0,0.0,No,Male,60-64,White,No,Yes,Very good,8.0,No,No,No,No,12
