# Data Exploration
- This notebook performs exploratory data analysis on the dataset.
- To expand on the analysis, attach this notebook to a cluster with runtime version **12.2.x-cpu-ml-scala2.12**,
edit [the options of pandas-profiling](https://pandas-profiling.ydata.ai/docs/master/rtd/pages/advanced_usage.html), and rerun it.
- Explore completed trials in the [MLflow experiment](#mlflow/experiments/535147687791316).
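
The second bullet suggests editing the pandas-profiling options and rerunning. A minimal sketch of what that could look like, assuming pandas-profiling 3.x as used later in this notebook: `minimal`, `explorative`, `title`, `progress_bar`, and `infer_dtypes` are existing `ProfileReport` arguments, while the helper names `profile_options` and `build_report` are hypothetical and introduced only for illustration.

```python
def profile_options(minimal=False, explorative=True):
    """Return keyword arguments for ProfileReport (hypothetical helper)."""
    return {
        "minimal": minimal,          # False re-enables expensive steps such as correlations
        "explorative": explorative,  # True requests deeper analysis where supported
        "title": "Profiling Report",
        "progress_bar": False,
        "infer_dtypes": False,       # use the DataFrame's existing dtypes
    }


def build_report(df, **overrides):
    """Build a ProfileReport for df; pandas-profiling is imported lazily."""
    from pandas_profiling import ProfileReport

    options = {**profile_options(), **overrides}
    return ProfileReport(df, **options)
```

Calling `build_report(df)` in place of the `ProfileReport(...)` call in the profiling cell would produce a fuller (and slower) report; pass `minimal=True` to get back to the lighter configuration.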

In [0]:
import mlflow
import os
import uuid
import shutil
import pandas as pd
import databricks.automl_runtime

# Download input data from MLflow into a pandas DataFrame
# Create a temporary directory to hold the downloaded artifact
temp_dir = os.path.join(os.environ["SPARK_LOCAL_DIRS"], "tmp", str(uuid.uuid4())[:8])
os.makedirs(temp_dir)

try:
    # Download the artifact and read it
    training_data_path = mlflow.artifacts.download_artifacts(run_id="c2110352faa644588216f476a07d8891", artifact_path="data", dst_path=temp_dir)
    df = pd.read_parquet(os.path.join(training_data_path, "training_data"))
finally:
    # Delete the temporary data, even if the download or read fails
    shutil.rmtree(temp_dir)

target_col = "price"

# Drop columns created by AutoML before pandas-profiling
df = df.drop(['_automl_split_col_faf9'], axis=1)

## Semantic Type Detection Alerts

For details about the definition of the semantic types and how to override the detection, see
[Databricks documentation on semantic type detection](https://docs.gcp.databricks.com/applications/machine-learning/automl.html#semantic-type-detection).

- Semantic type `categorical` detected for columns `bathrooms_na`, `bedrooms`, `bedrooms_na`, `beds_na`, `review_scores_accuracy`, `review_scores_accuracy_na`, `review_scores_checkin`, `review_scores_checkin_na`, `review_scores_cleanliness`, `review_scores_cleanliness_na`, `review_scores_communication`, `review_scores_communication_na`, `review_scores_location`, `review_scores_location_na`, `review_scores_rating_na`, `review_scores_value`, `review_scores_value_na`. Training notebooks will encode features based on categorical transformations.
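
This notebook does not say which categorical transformation the training notebooks apply. One-hot encoding is a common choice, so the sketch below shows that idea in plain Python. The function name `one_hot_encode` and the sample values are assumptions made for illustration, not the AutoML implementation or data from this dataset.

```python
def one_hot_encode(values):
    """One-hot encode a list of category labels.

    Returns the sorted category list and one 0/1 row per input value.
    """
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        rows.append(row)
    return categories, rows


# Illustrative values only; not taken from the dataset.
categories, rows = one_hot_encode([10, 9, 10, 8])
print(categories)  # [8, 9, 10]
print(rows)        # [[0, 0, 1], [0, 1, 0], [0, 0, 1], [1, 0, 0]]
```

Treating a low-cardinality numeric column this way gives each distinct value its own indicator feature instead of imposing a linear ordering on it.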

## Profiling Results

In [0]:
from pandas_profiling import ProfileReport
df_profile = ProfileReport(df, minimal=True, title="Profiling Report", progress_bar=False, infer_dtypes=False)
profile_html = df_profile.to_html()

displayHTML(profile_html)

Dataset statistic,Value
Number of variables,34
Number of observations,5786
Missing cells,0
Missing cells (%),0.0%
Total size in memory,1.5 MiB
Average record size in memory,272.0 B
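
The memory figures in the table above are consistent with a simple back-of-the-envelope calculation. The sketch below assumes every column stores one 8-byte value per row (true for 64-bit numeric columns and for object pointers under shallow memory accounting); that assumption is not stated in the report.

```python
N_VARIABLES = 34
N_OBSERVATIONS = 5786
BYTES_PER_VALUE = 8  # assumption: each column stores one 8-byte value per row

record_bytes = N_VARIABLES * BYTES_PER_VALUE
total_mib = record_bytes * N_OBSERVATIONS / 2**20

print(record_bytes)         # 272
print(round(total_mib, 1))  # 1.5
```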

Variable type,Count
Categorical,7
Numeric,27

Alert,Type
bed_type is highly imbalanced (95.6%),Imbalance
bedrooms_na is highly skewed (γ1 = 53.77266482),Skewed
beds_na is highly skewed (γ1 = 28.70539763),Skewed
bedrooms has 671 (11.6%) zeros,Zeros
beds has 75 (1.3%) zeros,Zeros
number_of_reviews has 1140 (19.7%) zeros,Zeros
bedrooms_na has 5784 (> 99.9%) zeros,Zeros
bathrooms_na has 5768 (99.7%) zeros,Zeros
beds_na has 5779 (99.9%) zeros,Zeros
review_scores_rating_na has 4618 (79.8%) zeros,Zeros
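
The `Skewed` and `Zeros` alerts above can be reproduced from the counts alone. The sketch below rebuilds `bedrooms_na` as 5784 zeros and 2 ones and computes a sample-size-adjusted Fisher-Pearson skewness. That the report uses this particular estimator is an assumption, although the result agrees with the reported γ1; the function names are hypothetical.

```python
import math


def adjusted_skewness(values):
    """Adjusted Fisher-Pearson skewness (sample-size corrected)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    g1 = m3 / m2 ** 1.5
    return g1 * math.sqrt(n * (n - 1)) / (n - 2)


def zero_fraction(values):
    """Share of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


# Rebuilt from the alert counts: 5784 zeros out of 5786 rows.
bedrooms_na = [0] * 5784 + [1] * 2
print(round(adjusted_skewness(bedrooms_na), 2))  # 53.77
print(f"{zero_fraction(bedrooms_na):.2%}")       # 99.97%
```

The extreme skewness here is a direct consequence of the indicator being almost always zero, so the two alerts on `bedrooms_na` describe the same underlying fact.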

Reproduction detail,Value
Analysis started,2024-02-21 17:03:41.663613
Analysis finished,2024-02-21 17:03:42.083252
Duration,0.42 seconds
Software version,pandas-profiling v3.6.2

0,1
Distinct,2
Distinct (%),< 0.1%
Missing,0
Missing (%),0.0%
Memory size,45.3 KiB

0,1
f,3415
t,2371

0,1
Unique,0
Unique (%),0.0%

0,1
1st row,f
2nd row,f
3rd row,f
4th row,f
5th row,f

Value,Count,Frequency (%)
f,3415,59.0%
t,2371,41.0%

0,1
Distinct,6
Distinct (%),0.1%
Missing,0
Missing (%),0.0%
Memory size,45.3 KiB

0,1
strict_14_with_grace_period,2519
moderate,2016
flexible,1153
super_strict_30,54
strict,36

0,1
Unique,0
Unique (%),0.0%

0,1
1st row,flexible
2nd row,flexible
3rd row,flexible
4th row,flexible
5th row,flexible

Value,Count,Frequency (%)
strict_14_with_grace_period,2519,43.5%
moderate,2016,34.8%
flexible,1153,19.9%
super_strict_30,54,0.9%
strict,36,0.6%
super_strict_60,8,0.1%

0,1
Distinct,2
Distinct (%),< 0.1%
Missing,0
Missing (%),0.0%
Memory size,45.3 KiB

0,1
f,3594
t,2192

0,1
Unique,0
Unique (%),0.0%

0,1
1st row,f
2nd row,f
3rd row,f
4th row,f
5th row,f

Value,Count,Frequency (%)
f,3594,62.1%
t,2192,37.9%

0,1
Distinct,64
Distinct (%),1.1%
Missing,0
Missing (%),0.0%
Infinite,0
Infinite (%),0.0%
Mean,53.78275147

0,1
Minimum,0
Maximum,1199
Zeros,3
Zeros (%),0.1%
Negative,0
Negative (%),0.0%
Memory size,45.3 KiB

0,1
Minimum,0
5-th percentile,1
Q1,1
median,2
Q3,8
95-th percentile,439
Maximum,1199
Range,1199
Interquartile range (IQR),7

0,1
Standard deviation,180.8794897
Coefficient of variation (CV),3.363150541
Kurtosis,18.32382194
Mean,53.78275147
Median Absolute Deviation (MAD),1
Skewness,4.314256346
Sum,311187
Variance,32717.38979
Monotonicity,Not monotonic
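
Several rows in the descriptive statistics above are derived from others: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, the IQR is Q3 minus Q1, and the range is maximum minus minimum. A short check using the reported values (variable names here are chosen for illustration):

```python
# Values copied from the statistics tables above.
mean = 53.78275147
std = 180.8794897
q1, q3 = 1, 8
minimum, maximum = 0, 1199

cv = std / mean                  # coefficient of variation
variance = std ** 2
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum

print(round(cv, 4))        # 3.3632
print(round(variance, 1))  # 32717.4
print(iqr, value_range)    # 7 1199
```

A CV above 3 together with a median of 2 and a maximum of 1199 indicates a long right tail dominated by a small number of very large values.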

Value,Count,Frequency (%)
1,2115,36.6%
2,970,16.8%
3,475,8.2%
4,350,6.0%
5,190,3.3%
852,172,3.0%
6,132,2.3%
165,111,1.9%
8,76,1.3%
439,70,1.2%

Value,Count,Frequency (%)
0,3,0.1%
1,2115,36.6%
2,970,16.8%
3,475,8.2%
4,350,6.0%

Value,Count,Frequency (%)
1199,31,0.5%
852,172,3.0%
799,22,0.4%
483,11,0.2%
439,70,1.2%

0,1
Distinct,36
Distinct (%),0.6%
Missing,0
Missing (%),0.0%
Memory size,45.3 KiB

0,1
Mission,572
South of Market,501
Western Addition,478
Downtown/Civic Center,445
Castro/Upper Market,330
Other values (31),3460

0,1
Unique,1
Unique (%),< 0.1%

0,1
1st row,Bayview
2nd row,Bernal Heights
3rd row,Glen Park
4th row,Haight Ashbury
5th row,Inner Sunset

Value,Count,Frequency (%)
Mission,572,9.9%
South of Market,501,8.7%
Western Addition,478,8.3%
Downtown/Civic Center,445,7.7%
Castro/Upper Market,330,5.7%
Bernal Heights,291,5.0%
Haight Ashbury,289,5.0%
Noe Valley,256,4.4%
Outer Sunset,222,3.8%
Potrero Hill,176,3.0%

0,1
Distinct,4062
Distinct (%),70.2%
Missing,0
Missing (%),0.0%
Infinite,0
Infinite (%),0.0%
Mean,37.76590488

0,1
Minimum,37.70743
Maximum,37.81031
Zeros,0
Zeros (%),0.0%
Negative,0
Negative (%),0.0%
Memory size,45.3 KiB

0,1
Minimum,37.70743
5-th percentile,37.7247625
Q1,37.75138
median,37.767835
Q3,37.7845225
95-th percentile,37.798995
Maximum,37.81031
Range,0.10288
Interquartile range (IQR),0.0331425

0,1
Standard deviation,0.02239802352
Coefficient of variation (CV),0.0005930752511
Kurtosis,-0.5046270157
Mean,37.76590488
Median Absolute Deviation (MAD),0.01657
Skewness,-0.404260205
Sum,218513.5256
Variance,0.0005016714575
Monotonicity,Not monotonic

Value,Count,Frequency (%)
37.78827,6,0.1%
37.78724,6,0.1%
37.78716,6,0.1%
37.77981,6,0.1%
37.7876,5,0.1%
37.76186,5,0.1%
37.78811,5,0.1%
37.78846,5,0.1%
37.7791,5,0.1%
37.77782,5,0.1%

Value,Count,Frequency (%)
37.70743,1,< 0.1%
37.70746,1,< 0.1%
37.70753,1,< 0.1%
37.70765,1,< 0.1%
37.70815,1,< 0.1%

Value,Count,Frequency (%)
37.81031,1,< 0.1%
37.8086,1,< 0.1%
37.80757,1,< 0.1%
37.80738,1,< 0.1%
37.80728,1,< 0.1%

0,1
Distinct,4150
Distinct (%),71.7%
Missing,0
Missing (%),0.0%
Infinite,0
Infinite (%),0.0%
Mean,-122.4304006

0,1
Minimum,-122.51306
Maximum,-122.36979
Zeros,0
Zeros (%),0.0%
Negative,5786
Negative (%),100.0%
Memory size,45.3 KiB

0,1
Minimum,-122.51306
5-th percentile,-122.4889075
Q1,-122.44286
median,-122.42541
Q3,-122.4109725
95-th percentile,-122.3946025
Maximum,-122.36979
Range,0.14327
Interquartile range (IQR),0.0318875

0,1
Standard deviation,0.02674405505
Coefficient of variation (CV),-0.0002184429269
Kurtosis,0.6195103158
Mean,-122.4304006
Median Absolute Deviation (MAD),0.015495
Skewness,-0.9445769704
Sum,-708382.2978
Variance,0.0007152444803
Monotonicity,Not monotonic

Value,Count,Frequency (%)
-122.43386,6,0.1%
-122.41153,6,0.1%
-122.40958,6,0.1%
-122.42263,5,0.1%
-122.40962,5,0.1%
-122.44029,5,0.1%
-122.44126,5,0.1%
-122.40863,5,0.1%
-122.42506,5,0.1%
-122.40678,5,0.1%

Value,Count,Frequency (%)
-122.51306,1,< 0.1%
-122.51163,1,< 0.1%
-122.51117,1,< 0.1%
-122.51015,1,< 0.1%
-122.50968,1,< 0.1%

Value,Count,Frequency (%)
-122.36979,1,< 0.1%
-122.37043,1,< 0.1%
-122.37094,1,< 0.1%
-122.37104,1,< 0.1%
-122.37127,1,< 0.1%

0,1
Distinct,26
Distinct (%),0.4%
Missing,0
Missing (%),0.0%
Memory size,45.3 KiB

0,1
Apartment,2445
House,1606
Condominium,623
Guest suite,395
Boutique hotel,146
Other values (21),571

0,1
Unique,6
Unique (%),0.1%

0,1
1st row,House
2nd row,Apartment
3rd row,Apartment
4th row,House
5th row,Apartment

Value,Count,Frequency (%)
Apartment,2445,42.3%
House,1606,27.8%
Condominium,623,10.8%
Guest suite,395,6.8%
Boutique hotel,146,2.5%
Townhouse,113,2.0%
Serviced apartment,91,1.6%
Hotel,91,1.6%
Loft,74,1.3%
Hostel,71,1.2%

0,1
Distinct,3
Distinct (%),0.1%
Missing,0
Missing (%),0.0%
Memory size,45.3 KiB

0,1
Entire home/apt,3520
Private room,2121
Shared room,145

0,1
Unique,0
Unique (%),0.0%

0,1
1st row,Entire home/apt
2nd row,Entire home/apt
3rd row,Entire home/apt
4th row,Private room
5th row,Private room

Value,Count,Frequency (%)
Entire home/apt,3520,60.8%
Private room,2121,36.7%
Shared room,145,2.5%

0,1
Distinct,16
Distinct (%),0.3%
Missing,0
Missing (%),0.0%
Infinite,0
Infinite (%),0.0%
Mean,3.199965434

0,1
Minimum,1
Maximum,16
Zeros,0
Zeros (%),0.0%
Negative,0
Negative (%),0.0%
Memory size,45.3 KiB

0,1
Minimum,1
5-th percentile,1
Q1,2
median,2
Q3,4
95-th percentile,7
Maximum,16
Range,15
Interquartile range (IQR),2

0,1
Standard deviation,1.912865959
Coefficient of variation (CV),0.5977770695
Kurtosis,4.35480723
Mean,3.199965434
Median Absolute Deviation (MAD),1
Skewness,1.695001637
Sum,18515
Variance,3.659056179
Monotonicity,Not monotonic

Value,Count,Frequency (%)
2,2593,44.8%
4,1178,20.4%
1,514,8.9%
3,468,8.1%
6,465,8.0%
5,271,4.7%
8,136,2.4%
7,74,1.3%
10,35,0.6%
9,18,0.3%

Value,Count,Frequency (%)
1,514,8.9%
2,2593,44.8%
3,468,8.1%
4,1178,20.4%
5,271,4.7%

Value,Count,Frequency (%)
16,2,< 0.1%
15,5,0.1%
14,4,0.1%
13,1,< 0.1%
12,18,0.3%

0,1
Distinct,16
Distinct (%),0.3%
Missing,0
Missing (%),0.0%
Infinite,0
Infinite (%),0.0%
Mean,1.327082613

0,1
Minimum,0
Maximum,14
Zeros,32
Zeros (%),0.6%
Negative,0
Negative (%),0.0%
Memory size,45.3 KiB

0,1
Minimum,0.0
5-th percentile,1.0
Q1,1.0
median,1.0
Q3,1.5
95-th percentile,2.5
Maximum,14.0
Range,14.0
Interquartile range (IQR),0.5

0,1
Standard deviation,0.7928206545
Coefficient of variation (CV),0.597416202
Kurtosis,48.36228607
Mean,1.327082613
Median Absolute Deviation (MAD),0
Skewness,5.38512102
Sum,7678.5
Variance,0.6285645902
Monotonicity,Not monotonic

Value,Count,Frequency (%)
1,4149,71.7%
2,816,14.1%
1.5,409,7.1%
2.5,138,2.4%
3,108,1.9%
3.5,42,0.7%
0,32,0.6%
4,28,0.5%
5,17,0.3%
0.5,14,0.2%

Value,Count,Frequency (%)
0.0,32,0.6%
0.5,14,0.2%
1.0,4149,71.7%
1.5,409,7.1%
2.0,816,14.1%

Value,Count,Frequency (%)
14.0,1,< 0.1%
10.0,10,0.2%
8.0,13,0.2%
6.0,3,0.1%
5.5,1,< 0.1%

0,1
Distinct,9
Distinct (%),0.2%
Missing,0
Missing (%),0.0%
Infinite,0
Infinite (%),0.0%
Mean,1.337020394

0,1
Minimum,0
Maximum,14
Zeros,671
Zeros (%),11.6%
Negative,0
Negative (%),0.0%
Memory size,45.3 KiB

0,1
Minimum,0
5-th percentile,0
Q1,1
median,1
Q3,2
95-th percentile,3
Maximum,14
Range,14
Interquartile range (IQR),1

0,1
Standard deviation,0.9336511383
Coefficient of variation (CV),0.6983073276
Kurtosis,7.812498961
Mean,1.337020394
Median Absolute Deviation (MAD),0
Skewness,1.592641623
Sum,7736
Variance,0.871704448
Monotonicity,Not monotonic

Value,Count,Frequency (%)
1,3376,58.3%
2,1073,18.5%
0,671,11.6%
3,499,8.6%
4,137,2.4%
5,21,0.4%
6,6,0.1%
7,2,< 0.1%
14,1,< 0.1%

Value,Count,Frequency (%)
0,671,11.6%
1,3376,58.3%
2,1073,18.5%
3,499,8.6%
4,137,2.4%

Value,Count,Frequency (%)
14,1,< 0.1%
7,2,< 0.1%
6,6,0.1%
5,21,0.4%
4,137,2.4%

0,1
Distinct,13
Distinct (%),0.2%
Missing,0
Missing (%),0.0%
Infinite,0
Infinite (%),0.0%
Mean,1.761320429

0,1
Minimum,0
Maximum,14
Zeros,75
Zeros (%),1.3%
Negative,0
Negative (%),0.0%
Memory size,45.3 KiB

0,1
Minimum,0
5-th percentile,1
Q1,1
median,1
Q3,2
95-th percentile,4
Maximum,14
Range,14
Interquartile range (IQR),1

0,1
Standard deviation,1.16987419
Coefficient of variation (CV),0.6642029303
Kurtosis,8.335638044
Mean,1.761320429
Median Absolute Deviation (MAD),0
Skewness,2.201157307
Sum,10191
Variance,1.36860562
Monotonicity,Not monotonic

Value,Count,Frequency (%)
1,3137,54.2%
2,1486,25.7%
3,590,10.2%
4,330,5.7%
5,93,1.6%
0,75,1.3%
6,36,0.6%
7,22,0.4%
8,9,0.2%
10,5,0.1%

Value,Count,Frequency (%)
0,75,1.3%
1,3137,54.2%
2,1486,25.7%
3,590,10.2%
4,330,5.7%

Value,Count,Frequency (%)
14,1,< 0.1%
12,1,< 0.1%
10,5,0.1%
9,1,< 0.1%
8,9,0.2%

0,1
Distinct,5
Distinct (%),0.1%
Missing,0
Missing (%),0.0%
Memory size,45.3 KiB

0,1
Real Bed,5726
Futon,29
Pull-out Sofa,17
Airbed,9
Couch,5

0,1
Unique,0
Unique (%),0.0%

0,1
1st row,Real Bed
2nd row,Real Bed
3rd row,Real Bed
4th row,Real Bed
5th row,Real Bed

Value,Count,Frequency (%)
Real Bed,5726,99.0%
Futon,29,0.5%
Pull-out Sofa,17,0.3%
Airbed,9,0.2%
Couch,5,0.1%

0,1
Distinct,40
Distinct (%),0.7%
Missing,0
Missing (%),0.0%
Infinite,0
Infinite (%),0.0%
Mean,15.64638783

0,1
Minimum,1
Maximum,365
Zeros,0
Zeros (%),0.0%
Negative,0
Negative (%),0.0%
Memory size,45.3 KiB

0,1
Minimum,1
5-th percentile,1
Q1,2
median,4
Q3,30
95-th percentile,30
Maximum,365
Range,364
Interquartile range (IQR),28

0,1
Standard deviation,21.66960403
Coefficient of variation (CV),1.384958897
Kurtosis,81.47853465
Mean,15.64638783
Median Absolute Deviation (MAD),3
Skewness,6.410941419
Sum,90530
Variance,469.5717389
Monotonicity,Not monotonic

Value,Count,Frequency (%)
30,2230,38.5%
2,1158,20.0%
1,1021,17.6%
3,677,11.7%
4,222,3.8%
5,145,2.5%
31,107,1.8%
7,57,1.0%
32,30,0.5%
60,28,0.5%

Value,Count,Frequency (%)
1,1021,17.6%
2,1158,20.0%
3,677,11.7%
4,222,3.8%
5,145,2.5%

Value,Count,Frequency (%)
365,5,0.1%
360,1,< 0.1%
188,1,< 0.1%
183,1,< 0.1%
180,21,0.4%

0,1
Distinct,353
Distinct (%),6.1%
Missing,0
Missing (%),0.0%
Infinite,0
Infinite (%),0.0%
Mean,43.4173868

0,1
Minimum,0
Maximum,677
Zeros,1140
Zeros (%),19.7%
Negative,0
Negative (%),0.0%
Memory size,45.3 KiB

0,1
Minimum,0
5-th percentile,0
Q1,1
median,11
Q3,53
95-th percentile,192
Maximum,677
Range,677
Interquartile range (IQR),52

0,1
Standard deviation,72.63299608
Coefficient of variation (CV),1.672901145
Kurtosis,11.22751783
Mean,43.4173868
Median Absolute Deviation (MAD),11
Skewness,2.909538897
Sum,251213
Variance,5275.552119
Monotonicity,Not monotonic

Value,Count,Frequency (%)
0,1140,19.7%
1,425,7.3%
2,288,5.0%
3,193,3.3%
4,145,2.5%
5,142,2.5%
6,142,2.5%
7,104,1.8%
9,82,1.4%
10,81,1.4%

Value,Count,Frequency (%)
0,1140,19.7%
1,425,7.3%
2,288,5.0%
3,193,3.3%
4,145,2.5%

Value,Count,Frequency (%)
677,1,< 0.1%
647,1,< 0.1%
608,1,< 0.1%
602,1,< 0.1%
576,1,< 0.1%

0,1
Distinct,38
Distinct (%),0.7%
Missing,0
Missing (%),0.0%
Infinite,0
Infinite (%),0.0%
Mean,96.04666436

0,1
Minimum,20
Maximum,100
Zeros,0
Zeros (%),0.0%
Negative,0
Negative (%),0.0%
Memory size,45.3 KiB

0,1
Minimum,20
5-th percentile,86
Q1,95
median,98
Q3,99
95-th percentile,100
Maximum,100
Range,80
Interquartile range (IQR),4

0,1
Standard deviation,6.342451405
Coefficient of variation (CV),0.06603510332
Kurtosis,41.86714303
Mean,96.04666436
Median Absolute Deviation (MAD),2
Skewness,-5.139835077
Sum,555726
Variance,40.22668982
Monotonicity,Not monotonic

Value,Count,Frequency (%)
98,1714,29.6%
100,1304,22.5%
99,505,8.7%
97,471,8.1%
96,334,5.8%
95,276,4.8%
93,204,3.5%
94,185,3.2%
90,133,2.3%
92,118,2.0%

Value,Count,Frequency (%)
20,8,0.1%
30,1,< 0.1%
40,4,0.1%
50,2,< 0.1%
56,1,< 0.1%

Value,Count,Frequency (%)
100,1304,22.5%
99,505,8.7%
98,1714,29.6%
97,471,8.1%
96,334,5.8%

0,1
Distinct,9
Distinct (%),0.2%
Missing,0
Missing (%),0.0%
Infinite,0
Infinite (%),0.0%
Mean,9.82025579

0,1
Minimum,2
Maximum,10
Zeros,0
Zeros (%),0.0%
Negative,0
Negative (%),0.0%
Memory size,45.3 KiB

0,1
Minimum,2
5-th percentile,9
Q1,10
median,10
Q3,10
95-th percentile,10
Maximum,10
Range,8
Interquartile range (IQR),0

0,1
Standard deviation,0.6029409372
Coefficient of variation (CV),0.0613976815
Kurtosis,52.65298706
Mean,9.82025579
Median Absolute Deviation (MAD),0
Skewness,-6.019596406
Sum,56820
Variance,0.3635377738
Monotonicity,Not monotonic

Value,Count,Frequency (%)
10,5045,87.2%
9,589,10.2%
8,92,1.6%
6,22,0.4%
7,19,0.3%
4,7,0.1%
2,6,0.1%
5,5,0.1%
3,1,< 0.1%

Value,Count,Frequency (%)
2,6,0.1%
3,1,< 0.1%
4,7,0.1%
5,5,0.1%
6,22,0.4%

Value,Count,Frequency (%)
10,5045,87.2%
9,589,10.2%
8,92,1.6%
7,19,0.3%
6,22,0.4%

0,1
Distinct,9
Distinct (%),0.2%
Missing,0
Missing (%),0.0%
Infinite,0
Infinite (%),0.0%
Mean,9.699965434

0,1
Minimum,2
Maximum,10
Zeros,0
Zeros (%),0.0%
Negative,0
Negative (%),0.0%
Memory size,45.3 KiB

0,1
Minimum,2
5-th percentile,8
Q1,10
median,10
Q3,10
95-th percentile,10
Maximum,10
Range,8
Interquartile range (IQR),0

0,1
Standard deviation,0.7062065866
Coefficient of variation (CV),0.07280506219
Kurtosis,21.60354923
Mean,9.699965434
Median Absolute Deviation (MAD),0
Skewness,-3.72486089
Sum,56124
Variance,0.498727743
Monotonicity,Not monotonic

Value,Count,Frequency (%)
10,4556,78.7%
9,903,15.6%
8,229,4.0%
7,52,0.9%
6,32,0.6%
4,7,0.1%
2,4,0.1%
5,2,< 0.1%
3,1,< 0.1%

Value,Count,Frequency (%)
2,4,0.1%
3,1,< 0.1%
4,7,0.1%
5,2,< 0.1%
6,32,0.6%

Value,Count,Frequency (%)
10,4556,78.7%
9,903,15.6%
8,229,4.0%
7,52,0.9%
6,32,0.6%

0,1
Distinct,7
Distinct (%),0.1%
Missing,0
Missing (%),0.0%
Infinite,0
Infinite (%),0.0%
Mean,9.893881784

0,1
Minimum,2
Maximum,10
Zeros,0
Zeros (%),0.0%
Negative,0
Negative (%),0.0%
Memory size,45.3 KiB

0,1
Minimum,2
5-th percentile,9
Q1,10
median,10
Q3,10
95-th percentile,10
Maximum,10
Range,8
Interquartile range (IQR),0

0,1
Standard deviation,0.4593865767
Coefficient of variation (CV),0.04643137918
Kurtosis,98.76153412
Mean,9.893881784
Median Absolute Deviation (MAD),0
Skewness,-7.998685071
Sum,57246
Variance,0.2110360268
Monotonicity,Not monotonic

Value,Count,Frequency (%)
10,5338,92.3%
9,355,6.1%
8,61,1.1%
7,15,0.3%
6,10,0.2%
2,5,0.1%
4,2,< 0.1%

Value,Count,Frequency (%)
2,5,0.1%
4,2,< 0.1%
6,10,0.2%
7,15,0.3%
8,61,1.1%

Value,Count,Frequency (%)
10,5338,92.3%
9,355,6.1%
8,61,1.1%
7,15,0.3%
6,10,0.2%

0,1
Distinct,8
Distinct (%),0.1%
Missing,0
Missing (%),0.0%
Infinite,0
Infinite (%),0.0%
Mean,9.871586588

0,1
Minimum,2
Maximum,10
Zeros,0
Zeros (%),0.0%
Negative,0
Negative (%),0.0%
Memory size,45.3 KiB

0,1
Minimum,2
5-th percentile,9
Q1,10
median,10
Q3,10
95-th percentile,10
Maximum,10
Range,8
Interquartile range (IQR),0

0,1
Standard deviation,0.5313977321
Coefficient of variation (CV),0.05383103591
Kurtosis,86.67837336
Mean,9.871586588
Median Absolute Deviation (MAD),0
Skewness,-7.743186575
Sum,57117
Variance,0.2823835497
Monotonicity,Not monotonic

Value,Count,Frequency (%)
10,5264,91.0%
9,412,7.1%
8,67,1.2%
7,16,0.3%
6,14,0.2%
2,8,0.1%
4,4,0.1%
5,1,< 0.1%

Value,Count,Frequency (%)
2,8,0.1%
4,4,0.1%
5,1,< 0.1%
6,14,0.2%
7,16,0.3%

Value,Count,Frequency (%)
10,5264,91.0%
9,412,7.1%
8,67,1.2%
7,16,0.3%
6,14,0.2%

0,1
Distinct,9
Distinct (%),0.2%
Missing,0
Missing (%),0.0%
Infinite,0
Infinite (%),0.0%
Mean,9.720877981

0,1
Minimum,2
Maximum,10
Zeros,0
Zeros (%),0.0%
Negative,0
Negative (%),0.0%
Memory size,45.3 KiB

0,1
Minimum,2
5-th percentile,9
Q1,10
median,10
Q3,10
95-th percentile,10
Maximum,10
Range,8
Interquartile range (IQR),0

0,1
Standard deviation,0.662649398
Coefficient of variation (CV),0.06816764898
Kurtosis,33.48954651
Mean,9.720877981
Median Absolute Deviation (MAD),0
Skewness,-4.477399502
Sum,56245
Variance,0.4391042247
Monotonicity,Not monotonic

Value,Count,Frequency (%)
10,4530,78.3%
9,1048,18.1%
8,141,2.4%
7,29,0.5%
6,21,0.4%
4,9,0.2%
2,6,0.1%
5,1,< 0.1%
3,1,< 0.1%

Value,Count,Frequency (%)
2,6,0.1%
3,1,< 0.1%
4,9,0.2%
5,1,< 0.1%
6,21,0.4%

Value,Count,Frequency (%)
10,4530,78.3%
9,1048,18.1%
8,141,2.4%
7,29,0.5%
6,21,0.4%

0,1
Distinct,8
Distinct (%),0.1%
Missing,0
Missing (%),0.0%
Infinite,0
Infinite (%),0.0%
Mean,9.527134462

0,1
Minimum,2
Maximum,10
Zeros,0
Zeros (%),0.0%
Negative,0
Negative (%),0.0%
Memory size,45.3 KiB

0,1
Minimum,2
5-th percentile,8
Q1,9
median,10
Q3,10
95-th percentile,10
Maximum,10
Range,8
Interquartile range (IQR),1

0,1
Standard deviation,0.7516320347
Coefficient of variation (CV),0.07889382034
Kurtosis,18.06671016
Mean,9.527134462
Median Absolute Deviation (MAD),0
Skewness,-3.015847999
Sum,55124
Variance,0.5649507155
Monotonicity,Not monotonic

Value,Count,Frequency (%)
10,3584,61.9%
9,1855,32.1%
8,259,4.5%
6,36,0.6%
7,34,0.6%
2,7,0.1%
4,6,0.1%
5,5,0.1%

Value,Count,Frequency (%)
2,7,0.1%
4,6,0.1%
5,5,0.1%
6,36,0.6%
7,34,0.6%

Value,Count,Frequency (%)
10,3584,61.9%
9,1855,32.1%
8,259,4.5%
7,34,0.6%
6,36,0.6%

0,1
Distinct,2
Distinct (%),< 0.1%
Missing,0
Missing (%),0.0%
Infinite,0
Infinite (%),0.0%
Mean,0.0003456619426

0,1
Minimum,0
Maximum,1
Zeros,5784
Zeros (%),> 99.9%
Negative,0
Negative (%),0.0%
Memory size,45.3 KiB

0,1
Minimum,0
5-th percentile,0
Q1,0
median,0
Q3,0
95-th percentile,0
Maximum,1
Range,1
Interquartile range (IQR),0

0,1
Standard deviation,0.018590379
Coefficient of variation (CV),53.78196643
Kurtosis,2890.498617
Mean,0.0003456619426
Median Absolute Deviation (MAD),0
Skewness,53.77266482
Sum,2
Variance,0.0003456021912
Monotonicity,Not monotonic

Value,Count,Frequency (%)
0,5784,> 99.9%
1,2,< 0.1%

Value,Count,Frequency (%)
0,5784,> 99.9%
1,2,< 0.1%

Value,Count,Frequency (%)
1,2,< 0.1%
0,5784,> 99.9%

0,1
Distinct,2
Distinct (%),< 0.1%
Missing,0
Missing (%),0.0%
Infinite,0
Infinite (%),0.0%
Mean,0.003110957484

0,1
Minimum,0
Maximum,1
Zeros,5768
Zeros (%),99.7%
Negative,0
Negative (%),0.0%
Memory size,45.3 KiB

0,1
Minimum,0
5-th percentile,0
Q1,0
median,0
Q3,0
95-th percentile,0
Maximum,1
Range,1
Interquartile range (IQR),0

0,1
Standard deviation,0.05569394507
Coefficient of variation (CV),17.90250923
Kurtosis,316.7222328
Mean,0.003110957484
Median Absolute Deviation (MAD),0
Skewness,17.849727
Sum,18
Variance,0.003101815517
Monotonicity,Not monotonic

Value,Count,Frequency (%)
0,5768,99.7%
1,18,0.3%

Value,Count,Frequency (%)
0,5768,99.7%
1,18,0.3%

Value,Count,Frequency (%)
1,18,0.3%
0,5768,99.7%

0,1
Distinct,2
Distinct (%),< 0.1%
Missing,0
Missing (%),0.0%
Infinite,0
Infinite (%),0.0%
Mean,0.001209816799

0,1
Minimum,0
Maximum,1
Zeros,5779
Zeros (%),99.9%
Negative,0
Negative (%),0.0%
Memory size,45.3 KiB

0,1
Minimum,0
5-th percentile,0
Q1,0
median,0
Q3,0
95-th percentile,0
Maximum,1
Range,1
Interquartile range (IQR),0

0,1
Standard deviation,0.0347643786
Coefficient of variation (CV),28.73524208
Kurtosis,822.2840855
Mean,0.001209816799
Median Absolute Deviation (MAD),0
Skewness,28.70539763
Sum,7
Variance,0.001208562019
Monotonicity,Not monotonic

Value,Count,Frequency (%)
0,5779,99.9%
1,7,0.1%

Value,Count,Frequency (%)
0,5779,99.9%
1,7,0.1%

Value,Count,Frequency (%)
1,7,0.1%
0,5779,99.9%

0,1
Distinct,2
Distinct (%),< 0.1%
Missing,0
Missing (%),0.0%
Infinite,0
Infinite (%),0.0%
Mean,0.2018665745

0,1
Minimum,0
Maximum,1
Zeros,4618
Zeros (%),79.8%
Negative,0
Negative (%),0.0%
Memory size,45.3 KiB

0,1
Minimum,0
5-th percentile,0
Q1,0
median,0
Q3,0
95-th percentile,1
Maximum,1
Range,1
Interquartile range (IQR),0

0,1
Standard deviation,0.4014278407
Coefficient of variation (CV),1.98858004
Kurtosis,0.2079068936
Mean,0.2018665745
Median Absolute Deviation (MAD),0
Skewness,1.485878578
Sum,1168
Variance,0.1611443113
Monotonicity,Not monotonic

Value,Count,Frequency (%)
0,4618,79.8%
1,1168,20.2%

Value,Count,Frequency (%)
0,4618,79.8%
1,1168,20.2%

Value,Count,Frequency (%)
1,1168,20.2%
0,4618,79.8%

0,1
Distinct,2
Distinct (%),< 0.1%
Missing,0
Missing (%),0.0%
Infinite,0
Infinite (%),0.0%
Mean,0.2023850674

0,1
Minimum,0
Maximum,1
Zeros,4615
Zeros (%),79.8%
Negative,0
Negative (%),0.0%
Memory size,45.3 KiB

0,1
Minimum,0
5-th percentile,0
Q1,0
median,0
Q3,0
95-th percentile,1
Maximum,1
Range,1
Interquartile range (IQR),0

0,1
Standard deviation,0.4018124637
Coefficient of variation (CV),1.985385922
Kurtosis,0.1960199722
Mean,0.2023850674
Median Absolute Deviation (MAD),0
Skewness,1.481874602
Sum,1171
Variance,0.161453256
Monotonicity,Not monotonic

Value,Count,Frequency (%)
0,4615,79.8%
1,1171,20.2%

Value,Count,Frequency (%)
0,4615,79.8%
1,1171,20.2%

Value,Count,Frequency (%)
1,1171,20.2%
0,4615,79.8%

0,1
Distinct,2
Distinct (%),< 0.1%
Missing,0
Missing (%),0.0%
Infinite,0
Infinite (%),0.0%
Mean,0.2022122364

0,1
Minimum,0
Maximum,1
Zeros,4616
Zeros (%),79.8%
Negative,0
Negative (%),0.0%
Memory size,45.3 KiB

0,1
Minimum,0
5-th percentile,0
Q1,0
median,0
Q3,0
95-th percentile,1
Maximum,1
Range,1
Interquartile range (IQR),0

0,1
Standard deviation,0.4016843714
Coefficient of variation (CV),1.986449378
Kurtosis,0.1999749239
Mean,0.2022122364
Median Absolute Deviation (MAD),0
Skewness,1.483207983
Sum,1170
Variance,0.1613503342
Monotonicity,Not monotonic

Value,Count,Frequency (%)
0,4616,79.8%
1,1170,20.2%

Value,Count,Frequency (%)
0,4616,79.8%
1,1170,20.2%

Value,Count,Frequency (%)
1,1170,20.2%
0,4616,79.8%

0,1
Distinct,2
Distinct (%),< 0.1%
Missing,0
Missing (%),0.0%
Infinite,0
Infinite (%),0.0%
Mean,0.2027307293

0,1
Minimum,0
Maximum,1
Zeros,4613
Zeros (%),79.7%
Negative,0
Negative (%),0.0%
Memory size,45.3 KiB

0,1
Minimum,0
5-th percentile,0
Q1,0
median,0
Q3,0
95-th percentile,1
Maximum,1
Range,1
Interquartile range (IQR),0

0,1
Standard deviation,0.4020683031
Coefficient of variation (CV),1.983262747
Kurtosis,0.1881320426
Mean,0.2027307293
Median Absolute Deviation (MAD),0
Skewness,1.479211659
Sum,1173
Variance,0.1616589204
Monotonicity,Not monotonic

Value,Count,Frequency (%)
0,4613,79.7%
1,1173,20.3%

Value,Count,Frequency (%)
0,4613,79.7%
1,1173,20.3%

Value,Count,Frequency (%)
1,1173,20.3%
0,4613,79.7%

0,1
Distinct,2
Distinct (%),< 0.1%
Missing,0
Missing (%),0.0%
Infinite,0
Infinite (%),0.0%
Mean,0.2020394055

0,1
Minimum,0
Maximum,1
Zeros,4617
Zeros (%),79.8%
Negative,0
Negative (%),0.0%
Memory size,45.3 KiB

0,1
Minimum,0
5-th percentile,0
Q1,0
median,0
Q3,0
95-th percentile,1
Maximum,1
Range,1
Interquartile range (IQR),0

0,1
Standard deviation,0.4015561637
Coefficient of variation (CV),1.987514083
Kurtosis,0.2039372249
Mean,0.2020394055
Median Absolute Deviation (MAD),0
Skewness,1.48454264
Sum,1169
Variance,0.1612473526
Monotonicity,Not monotonic

Value,Count,Frequency (%)
0,4617,79.8%
1,1169,20.2%

Value,Count,Frequency (%)
0,4617,79.8%
1,1169,20.2%

Value,Count,Frequency (%)
1,1169,20.2%
0,4617,79.8%

0,1
Distinct,2
Distinct (%),< 0.1%
Missing,0
Missing (%),0.0%
Infinite,0
Infinite (%),0.0%
Mean,0.2027307293

0,1
Minimum,0
Maximum,1
Zeros,4613
Zeros (%),79.7%
Negative,0
Negative (%),0.0%
Memory size,45.3 KiB

0,1
Minimum,0
5-th percentile,0
Q1,0
median,0
Q3,0
95-th percentile,1
Maximum,1
Range,1
Interquartile range (IQR),0

0,1
Standard deviation,0.4020683031
Coefficient of variation (CV),1.983262747
Kurtosis,0.1881320426
Mean,0.2027307293
Median Absolute Deviation (MAD),0
Skewness,1.479211659
Sum,1173
Variance,0.1616589204
Monotonicity,Not monotonic

Value,Count,Frequency (%)
0,4613,79.7%
1,1173,20.3%

Value,Count,Frequency (%)
0,4613,79.7%
1,1173,20.3%

Value,Count,Frequency (%)
1,1173,20.3%
0,4613,79.7%

0,1
Distinct,2
Distinct (%),< 0.1%
Missing,0
Missing (%),0.0%
Infinite,0
Infinite (%),0.0%
Mean,0.2029035603

0,1
Minimum,0
Maximum,1
Zeros,4612
Zeros (%),79.7%
Negative,0
Negative (%),0.0%
Memory size,45.3 KiB

0,1
Minimum,0
5-th percentile,0
Q1,0
median,0
Q3,0
95-th percentile,1
Maximum,1
Range,1
Interquartile range (IQR),0

0,1
Standard deviation,0.4021960504
Coefficient of variation (CV),1.982203022
Kurtosis,0.1841990282
Mean,0.2029035603
Median Absolute Deviation (MAD),0
Skewness,1.477882092
Sum,1174
Variance,0.161761663
Monotonicity,Not monotonic

Value,Count,Frequency (%)
0,4612,79.7%
1,1174,20.3%

Value,Count,Frequency (%)
0,4612,79.7%
1,1174,20.3%

Value,Count,Frequency (%)
1,1174,20.3%
0,4612,79.7%

0,1
Distinct,450
Distinct (%),7.8%
Missing,0
Missing (%),0.0%
Infinite,0
Infinite (%),0.0%
Mean,215.2701348

0,1
Minimum,10
Maximum,10000
Zeros,0
Zeros (%),0.0%
Negative,0
Negative (%),0.0%
Memory size,45.3 KiB

0,1
Minimum,10
5-th percentile,56
Q1,100
median,150
Q3,235
95-th percentile,525
Maximum,10000
Range,9990
Interquartile range (IQR),135

0,1
Standard deviation,335.004952
Coefficient of variation (CV),1.556207285
Kurtosis,374.8703448
Mean,215.2701348
Median Absolute Deviation (MAD),60
Skewness,16.07068315
Sum,1245553
Variance,112228.3179
Monotonicity,Not monotonic
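
The statistics above describe a heavily right-skewed variable (median 150, 95th percentile 525, maximum 10000, skewness about 16.07). It is the last variable profiled and plausibly the `price` target, though the export does not label it. One common way to flag extreme values is Tukey's 1.5 × IQR rule; this rule is not part of the profiling report and is shown here only as an assumed follow-up step, with a hypothetical helper name.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) outlier fences for the given quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles taken from the quantile table above (Q1 = 100, Q3 = 235).
lower, upper = tukey_fences(q1=100, q3=235)
print(lower, upper)  # -102.5 437.5
```

Under this rule, values above 437.5 would count as outliers, which already excludes the reported 95th percentile of 525; whether to trim, cap, or transform such values is a modeling decision outside the scope of this report.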

Value,Count,Frequency (%)
150,242,4.2%
100,158,2.7%
200,147,2.5%
250,143,2.5%
125,127,2.2%
140,103,1.8%
120,103,1.8%
300,99,1.7%
99,92,1.6%
80,89,1.5%

Value,Count,Frequency (%)
10,1,< 0.1%
19,3,0.1%
27,1,< 0.1%
28,2,< 0.1%
29,2,< 0.1%

Value,Count,Frequency (%)
10000,1,< 0.1%
9000,1,< 0.1%
8000,3,0.1%
4500,2,< 0.1%
3800,1,< 0.1%
