In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt 
import seaborn as sns

Note: Data is from the UCI Machine Learning Repository:

Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.

In [2]:
# data: https://archive.ics.uci.edu/ml/datasets/heart+disease
heart = pd.read_csv('processed.cleveland.data.csv')

In [3]:
heart.head()

Unnamed: 0,age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,heart_disease
0,63.0,1.0,1.0,145.0,233.0,1.0,2.0,150.0,0.0,2.3,3.0,0.0,6.0,0
1,67.0,1.0,4.0,160.0,286.0,0.0,2.0,108.0,1.0,1.5,2.0,3.0,3.0,2
2,67.0,1.0,4.0,120.0,229.0,0.0,2.0,129.0,1.0,2.6,2.0,2.0,7.0,1
3,37.0,1.0,3.0,130.0,250.0,0.0,0.0,187.0,0.0,3.5,3.0,0.0,3.0,0
4,41.0,0.0,2.0,130.0,204.0,0.0,2.0,172.0,0.0,1.4,1.0,0.0,3.0,0


- age: age in years
- sex: 1=male, 0=female
- cp: chest pain type
 - Value 1: typical angina
 - Value 2: atypical angina
 - Value 3: non-anginal pain
 - Value 4: asymptomatic
- trestbps: resting blood pressure (in mm Hg on admission to the hospital)
- chol: serum cholesterol in mg/dl
- fbs: (fasting blood sugar > 120 mg/dl) (1 = true; 0 = false)
- restecg: resting electrocardiographic results
 - Value 0: normal
 - Value 1: having ST-T wave abnormality (T wave inversions and/or ST elevation or depression of > 0.05 mV) 
 - Value 2: showing probable or definite left ventricular hypertrophy by Estes' criteria
- thalach: maximum heart rate achieved in an exercise test
- exang: exercise induced angina (1 = yes; 0 = no)
- oldpeak: ST depression induced by exercise relative to rest
- slope: the slope of the peak exercise ST segment
 - Value 1: upsloping
 - Value 2: flat
 - Value 3: downsloping
- ca: number of major vessels (0-3) colored by fluoroscopy
- thal: 
 - Value 3: normal
 - Value 6: fixed defect
 - Value 7: reversible defect
- heart_disease: diagnosis of heart disease (angiographic disease status)
 - Value 0: < 50% diameter narrowing
 - Value 1: > 50% diameter narrowing
"\[This field\] refers to the presence of heart disease in the patient. It is integer valued from 0 (no presence) to 4. Experiments with the Cleveland database have concentrated on simply attempting to distinguish presence (values 1,2,3,4) from absence (value 0)."


In [4]:
heart.describe(include='all')

Unnamed: 0,age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,heart_disease
count,303.0,303.0,303.0,303.0,303.0,303.0,303.0,303.0,303.0,303.0,303.0,303.0,303.0,303.0
unique,,,,,,,,,,,,5.0,4.0,
top,,,,,,,,,,,,0.0,3.0,
freq,,,,,,,,,,,,176.0,166.0,
mean,54.438944,0.679868,3.158416,131.689769,246.693069,0.148515,0.990099,149.607261,0.326733,1.039604,1.60066,,,0.937294
std,9.038662,0.467299,0.960126,17.599748,51.776918,0.356198,0.994971,22.875003,0.469794,1.161075,0.616226,,,1.228536
min,29.0,0.0,1.0,94.0,126.0,0.0,0.0,71.0,0.0,0.0,1.0,,,0.0
25%,48.0,0.0,3.0,120.0,211.0,0.0,0.0,133.5,0.0,0.0,1.0,,,0.0
50%,56.0,1.0,3.0,130.0,241.0,0.0,1.0,153.0,0.0,0.8,2.0,,,0.0
75%,61.0,1.0,4.0,140.0,275.0,0.0,2.0,166.0,1.0,1.6,2.0,,,2.0


The mean of the sex column is 0.679868.
Males are coded 1 and females are coded 0,
so the mean is the proportion of males: about 68% of the cases are male.
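
A minimal sketch of why the mean of a 0/1 column is a proportion. The five values are made up for illustration and are not taken from the heart dataset.

```python
import pandas as pd

# Toy data, same coding as the sex column: 1 = male, 0 = female
sex = pd.Series([1, 1, 0, 1, 0])

# Summing the 1s counts the males; dividing by the length gives their share
share_male = sex.mean()
print(share_male)  # 0.6

# value_counts(normalize=True) reports the same share for each code
shares = sex.value_counts(normalize=True)
print(shares)
```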

unique: the number of distinct values in that column.<br>
top: the most frequently occurring <b>unique</b> value in that column.<br>
freq: the number of occurrences of the <b>top</b> value.<br>

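.describe does not work well with this dataset yet because it decides what to report from how each column is stored. The ca and thal columns are stored as strings, so they get unique/top/freq instead of numeric statistics, while coded categories such as sex and cp are averaged as if they were measurements. To get the built-in functions to work the way we want, we have to adjust how the data is stored.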
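
A small sketch of that behaviour on a made-up two-column frame (the column names are borrowed from the dataset, the values are illustrative). The numeric column gets mean/std, the string column gets unique/top/freq.

```python
import pandas as pd

toy = pd.DataFrame({
    "age": [63.0, 67.0, 37.0, 41.0],     # stored as numbers
    "ca": ["0.0", "3.0", "?", "0.0"],    # stored as strings because of the '?'
})

summary = toy.describe(include="all")

# Numeric statistics are NaN for the string column, and vice versa
print(summary.loc[["count", "unique", "top", "freq", "mean"]])
```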

# .describe
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.describe.html

pandas.DataFrame.describe
DataFrame.describe(percentiles=None, include=None, exclude=None, datetime_is_numeric=False)
Generate descriptive statistics.

Descriptive statistics include those that summarize the central tendency, dispersion and shape of a dataset’s distribution, excluding NaN values.

Analyzes both numeric and object series, as well as DataFrame column sets of mixed data types. The output will vary depending on what is provided. Refer to the notes below for more detail.

Parameters
percentiles : list-like of numbers, optional
The percentiles to include in the output. All should fall between 0 and 1. The default is [.25, .5, .75], which returns the 25th, 50th, and 75th percentiles.

include : ‘all’, list-like of dtypes or None (default), optional
A white list of data types to include in the result. Ignored for Series. Here are the options:

‘all’ : All columns of the input will be included in the output.

A list-like of dtypes : Limits the results to the provided data types. To limit the result to numeric types submit numpy.number. To limit it instead to object columns submit the numpy.object data type. Strings can also be used in the style of select_dtypes (e.g. df.describe(include=['O'])). To select pandas categorical columns, use 'category'

None (default) : The result will include all numeric columns.

exclude : list-like of dtypes or None (default), optional
A black list of data types to omit from the result. Ignored for Series. Here are the options:

A list-like of dtypes : Excludes the provided data types from the result. To exclude numeric types submit numpy.number. To exclude object columns submit the data type numpy.object. Strings can also be used in the style of select_dtypes (e.g. df.describe(exclude=['O'])). To exclude pandas categorical columns, use 'category'

None (default) : The result will exclude nothing.

datetime_is_numeric : bool, default False
Whether to treat datetime dtypes as numeric. This affects statistics calculated for the column. For DataFrame input, this also controls whether datetime columns are included by default.

In [5]:
heart.info() # or heart.dtypes

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 303 entries, 0 to 302
Data columns (total 14 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   age            303 non-null    float64
 1   sex            303 non-null    float64
 2   cp             303 non-null    float64
 3   trestbps       303 non-null    float64
 4   chol           303 non-null    float64
 5   fbs            303 non-null    float64
 6   restecg        303 non-null    float64
 7   thalach        303 non-null    float64
 8   exang          303 non-null    float64
 9   oldpeak        303 non-null    float64
 10  slope          303 non-null    float64
 11  ca             303 non-null    object 
 12  thal           303 non-null    object 
 13  heart_disease  303 non-null    int64  
dtypes: float64(11), int64(1), object(2)
memory usage: 33.3+ KB


int64 / float64 = number
object = usually a string (pandas' generic dtype for non-numeric or mixed values)

# .info
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.info.html
pandas.DataFrame.info
DataFrame.info(verbose=None, buf=None, max_cols=None, memory_usage=None, show_counts=None, null_counts=None)
Print a concise summary of a DataFrame.

This method prints information about a DataFrame including the index dtype and columns, non-null values and memory usage.

Parameters
data : DataFrame
DataFrame to print information about.

verbose : bool, optional
Whether to print the full summary. By default, the setting in pandas.options.display.max_info_columns is followed.

buf : writable buffer, defaults to sys.stdout
Where to send the output. By default, the output is printed to sys.stdout. Pass a writable buffer if you need to further process the output.

max_cols : int, optional
When to switch from the verbose to the truncated output. If the DataFrame has more than max_cols columns, the truncated output is used. By default, the setting in pandas.options.display.max_info_columns is used.

memory_usage : bool, str, optional
Specifies whether total memory usage of the DataFrame elements (including the index) should be displayed. By default, this follows the pandas.options.display.memory_usage setting.

True always show memory usage. False never shows memory usage. A value of ‘deep’ is equivalent to “True with deep introspection”. Memory usage is shown in human-readable units (base-2 representation). Without deep introspection a memory estimation is made based in column dtype and number of rows assuming values consume the same memory amount for corresponding dtypes. With deep memory introspection, a real memory usage calculation is performed at the cost of computational resources.

show_counts : bool, optional
Whether to show the non-null counts. By default, this is shown only if the DataFrame is smaller than pandas.options.display.max_info_rows and pandas.options.display.max_info_columns. A value of True always shows the counts, and False never shows the counts.

null_counts : bool, optional
Deprecated since version 1.2.0: Use show_counts instead.

Returns
None
This method prints a summary of a DataFrame and returns None.

In [6]:
heart.ca.unique()

array(['0.0', '3.0', '2.0', '1.0', '?'], dtype=object)

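Calling .unique() on a column (a pandas Series) shows its distinct values in order of first appearance, not sorted, together with the <b>dtype</b>. The numpy.unique function documented below is a separate function that returns the unique values sorted.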
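
A short sketch contrasting the two functions on a made-up Series that mimics the ca column. `Series.unique` keeps the order of first appearance, while `numpy.unique` sorts.

```python
import numpy as np
import pandas as pd

# Toy values in the same style as heart.ca (strings, with a '?' placeholder)
ca = pd.Series(["0.0", "3.0", "2.0", "0.0", "1.0", "?"])

in_order = ca.unique()                  # order of first appearance
in_sorted = np.unique(ca.to_numpy())    # sorted

print(list(in_order))   # ['0.0', '3.0', '2.0', '1.0', '?']
print(list(in_sorted))  # ['0.0', '1.0', '2.0', '3.0', '?']
```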

# .Unique
numpy.unique
numpy.unique(ar, return_index=False, return_inverse=False, return_counts=False, axis=None)
Find the unique elements of an array.

Returns the sorted unique elements of an array. There are three optional outputs in addition to the unique elements:

the indices of the input array that give the unique values

the indices of the unique array that reconstruct the input array

the number of times each unique value comes up in the input array

Parameters
ar : array_like
Input array. Unless axis is specified, this will be flattened if it is not already 1-D.

return_index : bool, optional
If True, also return the indices of ar (along the specified axis, if provided, or in the flattened array) that result in the unique array.

return_inverse : bool, optional
If True, also return the indices of the unique array (for the specified axis, if provided) that can be used to reconstruct ar.

return_counts : bool, optional
If True, also return the number of times each unique item appears in ar.

New in version 1.9.0.

axis : int or None, optional
The axis to operate on. If None, ar will be flattened. If an integer, the subarrays indexed by the given axis will be flattened and treated as the elements of a 1-D array with the dimension of the given axis, see the notes for more details. Object arrays or structured arrays that contain objects are not supported if the axis kwarg is used. The default is None.

New in version 1.13.0.

Returns
unique : ndarray
The sorted unique values.

unique_indices : ndarray, optional
The indices of the first occurrences of the unique values in the original array. Only provided if return_index is True.

unique_inverse : ndarray, optional
The indices to reconstruct the original array from the unique array. Only provided if return_inverse is True.

unique_counts : ndarray, optional
The number of times each of the unique values comes up in the original array. Only provided if return_counts is True.

New in version 1.9.0.

In [7]:
heart[heart.ca=='?']

Unnamed: 0,age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,heart_disease
166,52.0,1.0,3.0,138.0,223.0,0.0,0.0,169.0,0.0,0.0,1.0,?,3.0,0
192,43.0,1.0,4.0,132.0,247.0,1.0,2.0,143.0,1.0,0.1,2.0,?,7.0,1
287,58.0,1.0,2.0,125.0,220.0,0.0,0.0,144.0,0.0,0.4,2.0,?,7.0,0
302,38.0,1.0,3.0,138.0,175.0,0.0,0.0,173.0,0.0,0.0,1.0,?,3.0,0


In [8]:
heart.thal.unique()

array(['6.0', '3.0', '7.0', '?'], dtype=object)

We are changing every question mark to NaN, which stands for <b>Not a Number</b> and is how pandas marks a missing value.
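
As an alternative sketch, the placeholder can also be turned into NaN while the file is read, using the `na_values` parameter of `pd.read_csv`. The two rows below are made up and only imitate the layout of the real file.

```python
import io
import pandas as pd

# Two illustrative rows; '?' marks a missing value
csv_text = "age,ca,thal\n52.0,?,3.0\n53.0,0.0,?\n"

# Read as-is: ca and thal are not numeric because of the '?'
as_read = pd.read_csv(io.StringIO(csv_text))
print(as_read.dtypes)

# Treat '?' as missing at read time: every column parses as float64
with_nan = pd.read_csv(io.StringIO(csv_text), na_values="?")
print(with_nan.dtypes)
print(with_nan.isna().sum().tolist())  # [0, 1, 1]
```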

In [9]:
heart[heart.thal=='?']

Unnamed: 0,age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,heart_disease
87,53.0,0.0,3.0,128.0,216.0,0.0,2.0,115.0,0.0,0.0,1.0,0.0,?,0
266,52.0,1.0,4.0,128.0,204.0,1.0,0.0,156.0,1.0,1.0,2.0,0.0,?,2


In [10]:
heart.replace('?', np.nan)

Unnamed: 0,age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,heart_disease
0,63.0,1.0,1.0,145.0,233.0,1.0,2.0,150.0,0.0,2.3,3.0,0.0,6.0,0
1,67.0,1.0,4.0,160.0,286.0,0.0,2.0,108.0,1.0,1.5,2.0,3.0,3.0,2
2,67.0,1.0,4.0,120.0,229.0,0.0,2.0,129.0,1.0,2.6,2.0,2.0,7.0,1
3,37.0,1.0,3.0,130.0,250.0,0.0,0.0,187.0,0.0,3.5,3.0,0.0,3.0,0
4,41.0,0.0,2.0,130.0,204.0,0.0,2.0,172.0,0.0,1.4,1.0,0.0,3.0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
298,45.0,1.0,1.0,110.0,264.0,0.0,0.0,132.0,0.0,1.2,2.0,0.0,7.0,1
299,68.0,1.0,4.0,144.0,193.0,1.0,0.0,141.0,0.0,3.4,2.0,2.0,7.0,2
300,57.0,1.0,4.0,130.0,131.0,0.0,0.0,115.0,1.0,1.2,2.0,1.0,7.0,3
301,57.0,0.0,2.0,130.0,236.0,0.0,2.0,174.0,0.0,0.0,2.0,1.0,3.0,1


Calling replace on its own only displays a modified copy. By assigning the result back with 'heart =' we save the desired changes to the variable <b>heart</b>.

In [11]:
heart = heart.replace('?', np.nan)
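
A minimal sketch of the difference between calling `replace` and assigning its result, using a made-up three-row frame rather than the real data.

```python
import io
import numpy as np
import pandas as pd

csv_text = "age,ca,thal\n52.0,?,3.0\n53.0,0.0,?\n58.0,2.0,7.0\n"
df = pd.read_csv(io.StringIO(csv_text))

# Without assignment the result is discarded and df still holds the '?'
df.replace("?", np.nan)
still_there = int(df.isin(["?"]).sum().sum())
print(still_there)  # 2

# Assigning the result back keeps the change
df = df.replace("?", np.nan)
now_missing = int(df.isna().sum().sum())
print(now_missing)  # 2
```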

In [12]:
heart[heart.thal=='?']

Unnamed: 0,age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,heart_disease


In [13]:
print(heart.info)  # missing parentheses: this prints the bound method (as below); heart.info() gives the summary, or use heart.dtypes

<bound method DataFrame.info of       age  sex   cp  trestbps   chol  fbs  restecg  thalach  exang  oldpeak  \
0    63.0  1.0  1.0     145.0  233.0  1.0      2.0    150.0    0.0      2.3   
1    67.0  1.0  4.0     160.0  286.0  0.0      2.0    108.0    1.0      1.5   
2    67.0  1.0  4.0     120.0  229.0  0.0      2.0    129.0    1.0      2.6   
3    37.0  1.0  3.0     130.0  250.0  0.0      0.0    187.0    0.0      3.5   
4    41.0  0.0  2.0     130.0  204.0  0.0      2.0    172.0    0.0      1.4   
..    ...  ...  ...       ...    ...  ...      ...      ...    ...      ...   
298  45.0  1.0  1.0     110.0  264.0  0.0      0.0    132.0    0.0      1.2   
299  68.0  1.0  4.0     144.0  193.0  1.0      0.0    141.0    0.0      3.4   
300  57.0  1.0  4.0     130.0  131.0  0.0      0.0    115.0    1.0      1.2   
301  57.0  0.0  2.0     130.0  236.0  0.0      2.0    174.0    0.0      0.0   
302  38.0  1.0  3.0     138.0  175.0  0.0      0.0    173.0    0.0      0.0   

     slope   ca tha

In [14]:
heart.dtypes

age              float64
sex              float64
cp               float64
trestbps         float64
chol             float64
fbs              float64
restecg          float64
thalach          float64
exang            float64
oldpeak          float64
slope            float64
ca                object
thal              object
heart_disease      int64
dtype: object

In [15]:
heart.ca = heart.ca.astype('float')
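
A short sketch of why the cast works only after the '?' values are gone, on made-up Series. `pd.to_numeric` with `errors="coerce"` is shown as one possible alternative, not as what this notebook does.

```python
import numpy as np
import pandas as pd

# Strings plus NaN: astype('float') parses the strings and keeps NaN as missing
ca = pd.Series(["0.0", "3.0", np.nan, "1.0"], dtype=object)
ca_float = ca.astype("float")
print(ca_float.dtype)          # float64
print(ca_float.isna().sum())   # 1

# With '?' still present astype('float') would raise a ValueError;
# to_numeric can coerce anything unparseable to NaN instead
raw = pd.Series(["0.0", "?", "1.0"])
coerced = pd.to_numeric(raw, errors="coerce")
print(coerced.tolist())        # [0.0, nan, 1.0]
```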


# .astype
pandas.DataFrame.astype
DataFrame.astype(dtype, copy=True, errors='raise')
Cast a pandas object to a specified dtype dtype.

Parameters
dtype : data type, or dict of column name -> data type
Use a numpy.dtype or Python type to cast entire pandas object to the same type. Alternatively, use {col: dtype, …}, where col is a column label and dtype is a numpy.dtype or Python type to cast one or more of the DataFrame’s columns to column-specific types.

copy : bool, default True
Return a copy when copy=True (be very careful setting copy=False as changes to values then may propagate to other pandas objects).

errors : {‘raise’, ‘ignore’}, default ‘raise’
Control raising of exceptions on invalid data for provided dtype.

raise : allow exceptions to be raised

ignore : suppress exceptions. On error return original object.

Returns
casted : same type as caller

In [16]:
#sex 1=male, 0=female
# assign the result back instead of calling replace with inplace=True on a column
heart['sex'] = heart['sex'].replace({0.0: 'female', 1.0: 'male'})

# inplace = true

When inplace=True is passed, the data is modified in place and the method returns None, so you'd use:

df.an_operation(inplace=True)

When inplace=False is passed (this is the default value, so it isn't necessary), the method performs the operation and returns a copy of the object, so you'd use:

df = df.an_operation(inplace=False)

In pandas, is inplace = True considered harmful, or not?

TL;DR: Yes, it is.
- inplace, contrary to what the name implies, often does not prevent copies from being created, and (almost) never offers any performance benefits
- inplace does not work with method chaining
- inplace can lead to SettingWithCopyWarning if used on a DataFrame column, and may prevent the operation from going through, leading to hard-to-debug errors in code

The pain points above are common pitfalls for beginners, so removing this option will simplify the API.


I don't advise setting this parameter as it serves little purpose. See this GitHub issue which proposes the inplace argument be deprecated api-wide.


It is a common misconception that using inplace=True will lead to more efficient or optimized code. In reality, there are absolutely no performance benefits to using inplace=True. Both the in-place and out-of-place versions create a copy of the data anyway, with the in-place version automatically assigning the copy back.


inplace=True is a common pitfall for beginners. For example, it can trigger the SettingWithCopyWarning:


df = pd.DataFrame({'a': [3, 2, 1], 'b': ['x', 'y', 'z']})


df2 = df[df['a'] > 1]
df2['b'].replace({'x': 'abc'}, inplace=True)

SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame


Calling a function on a DataFrame column with inplace=True may or may not work. This is especially true when chained indexing is involved.


As if the problems described above aren't enough, inplace=True also hinders method chaining. Contrast the working of


result = df.some_function1().reset_index().some_function2()


As opposed to


temp = df.some_function1()
temp.reset_index(inplace=True)
result = temp.some_function2()

The former lends itself to better code organization and readability.
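
A runnable sketch of the assignment style on a made-up frame. The column names mirror the dataset, the three rows are illustrative. Assigning the result back avoids `inplace=True` and chains naturally.

```python
import pandas as pd

df = pd.DataFrame({"sex": [1.0, 0.0, 1.0], "cp": [1.0, 4.0, 3.0]})

# Assign the result back to the column instead of using inplace=True
df["sex"] = df["sex"].replace({0.0: "female", 1.0: "male"})

# The same style lets several steps be chained into one expression
cp_labels = {1.0: "typical angina", 3.0: "non-anginal pain", 4.0: "asymptomatic"}
result = (
    df.assign(cp=df["cp"].replace(cp_labels))
      .sort_values("sex")
      .reset_index(drop=True)
)
print(result)
```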

In [17]:
heart.info() # or heart.dtypes

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 303 entries, 0 to 302
Data columns (total 14 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   age            303 non-null    float64
 1   sex            303 non-null    object 
 2   cp             303 non-null    float64
 3   trestbps       303 non-null    float64
 4   chol           303 non-null    float64
 5   fbs            303 non-null    float64
 6   restecg        303 non-null    float64
 7   thalach        303 non-null    float64
 8   exang          303 non-null    float64
 9   oldpeak        303 non-null    float64
 10  slope          303 non-null    float64
 11  ca             299 non-null    float64
 12  thal           301 non-null    object 
 13  heart_disease  303 non-null    int64  
dtypes: float64(11), int64(1), object(2)
memory usage: 33.3+ KB


In [21]:
# - cp: chest pain type
#  - Value 1: typical angina
#  - Value 2: atypical angina
#  - Value 3: non-anginal pain
#  - Value 4: asymptomatic

heart['cp'] = heart['cp'].replace({1.0: 'typical angina', 2.0: 'atypical angina', 3.0: 'non-anginal pain', 4.0: 'asymptomatic'})

In [22]:
heart.info() # or heart.dtypes

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 303 entries, 0 to 302
Data columns (total 14 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   age            303 non-null    float64
 1   sex            303 non-null    object 
 2   cp             303 non-null    object 
 3   trestbps       303 non-null    float64
 4   chol           303 non-null    float64
 5   fbs            303 non-null    float64
 6   restecg        303 non-null    float64
 7   thalach        303 non-null    float64
 8   exang          303 non-null    float64
 9   oldpeak        303 non-null    float64
 10  slope          303 non-null    float64
 11  ca             299 non-null    float64
 12  thal           301 non-null    object 
 13  heart_disease  303 non-null    int64  
dtypes: float64(10), int64(1), object(3)
memory usage: 33.3+ KB


In [26]:
heart.describe(include='all')

Unnamed: 0,age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,heart_disease
count,303.0,303,303,303.0,303.0,303.0,303.0,303.0,303.0,303.0,303,299.0,301.0,303.0
unique,,2,4,,,,,,,,3,,3.0,
top,,male,asymptomatic,,,,,,,,upsloping,,3.0,
freq,,206,144,,,,,,,,142,,166.0,
mean,54.438944,,,131.689769,246.693069,0.148515,0.990099,149.607261,0.326733,1.039604,,0.672241,,0.937294
std,9.038662,,,17.599748,51.776918,0.356198,0.994971,22.875003,0.469794,1.161075,,0.937438,,1.228536
min,29.0,,,94.0,126.0,0.0,0.0,71.0,0.0,0.0,,0.0,,0.0
25%,48.0,,,120.0,211.0,0.0,0.0,133.5,0.0,0.0,,0.0,,0.0
50%,56.0,,,130.0,241.0,0.0,1.0,153.0,0.0,0.8,,0.0,,0.0
75%,61.0,,,140.0,275.0,0.0,2.0,166.0,1.0,1.6,,1.0,,2.0


# pandas.Categorical
class pandas.Categorical(values, categories=None, ordered=None, dtype=None, fastpath=False, copy=True)
Represent a categorical variable in classic R / S-plus fashion.

Categoricals can take on only a limited, and usually fixed, number of possible values (categories). In contrast to statistical categorical variables, a Categorical might have an order, but numerical operations (additions, divisions, …) are not possible.

All values of the Categorical are either in categories or np.nan. Assigning values outside of categories will raise a ValueError. Order is defined by the order of the categories, not lexical order of the values.

Parameters
values : list-like
The values of the categorical. If categories are given, values not in categories will be replaced with NaN.

categories : Index-like (unique), optional
The unique categories for this categorical. If not given, the categories are assumed to be the unique values of values (sorted, if possible, otherwise in the order in which they appear).

ordered : bool, default False
Whether or not this categorical is treated as a ordered categorical. If True, the resulting categorical will be ordered. An ordered categorical respects, when sorted, the order of its categories attribute (which in turn is the categories argument, if provided).

dtype : CategoricalDtype
An instance of CategoricalDtype to use for this categorical.

Raises
ValueError
If the categories do not validate.

TypeError
If an explicit ordered=True is given but no categories and the values are not sortable.

If you have an ordered category, you want it represented in your data as both a number and a string.

You want to know what the order of the categories is, but you also want to know what each of the categories means.

In Alex's example, a survey asks you to rate your agreement: strongly disagree, disagree, neutral, agree, strongly agree.

If you're running an analysis, you might want to plot those levels of agreement in order, because you want to see how they differ by some factor or other attribute.
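
A minimal sketch of the survey idea with an ordered `pd.Categorical`. The level names and the four answers are made up for illustration. The labels keep the meaning, and the integer codes keep the order.

```python
import pandas as pd

# Hypothetical agreement scale, listed from lowest to highest
levels = ["strongly disagree", "disagree", "neutral", "agree", "strongly agree"]
answers = pd.Series(["agree", "neutral", "strongly disagree", "agree"])

rating = pd.Series(pd.Categorical(answers, categories=levels, ordered=True))

# Each label is backed by its position in `levels`
codes = rating.cat.codes.tolist()
print(codes)                           # [3, 2, 0, 3]

# Sorting and comparisons follow the category order, not the alphabet
ordered_answers = rating.sort_values().tolist()
print(ordered_answers)                 # ['strongly disagree', 'neutral', 'agree', 'agree']
at_least_agree = int((rating >= "agree").sum())
print(at_least_agree)                  # 2
```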

In [30]:
heart['slope'] = heart['slope'].replace({1.0: 'upsloping', 2.0: 'flat', 3.0: 'downsloping'})

In [31]:
heart.slope = pd.Categorical(heart.slope, ['upsloping', 'flat', 'downsloping'], ordered = True)

In [32]:
heart.head()

Unnamed: 0,age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,heart_disease
0,63.0,male,typical angina,145.0,233.0,1.0,2.0,150.0,0.0,2.3,downsloping,0.0,6.0,0
1,67.0,male,asymptomatic,160.0,286.0,0.0,2.0,108.0,1.0,1.5,flat,3.0,3.0,2
2,67.0,male,asymptomatic,120.0,229.0,0.0,2.0,129.0,1.0,2.6,flat,2.0,7.0,1
3,37.0,male,non-anginal pain,130.0,250.0,0.0,0.0,187.0,0.0,3.5,downsloping,0.0,3.0,0
4,41.0,female,atypical angina,130.0,204.0,0.0,2.0,172.0,0.0,1.4,upsloping,0.0,3.0,0


In [33]:
heart.slope.cat.codes

0      2
1      1
2      1
3      2
4      0
      ..
298    1
299    1
300    1
301    1
302    0
Length: 303, dtype: int8

In [34]:
heart.describe(include='all')

Unnamed: 0,age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,heart_disease
count,303.0,303,303,303.0,303.0,303.0,303.0,303.0,303.0,303.0,303,299.0,301.0,303.0
unique,,2,4,,,,,,,,3,,3.0,
top,,male,asymptomatic,,,,,,,,upsloping,,3.0,
freq,,206,144,,,,,,,,142,,166.0,
mean,54.438944,,,131.689769,246.693069,0.148515,0.990099,149.607261,0.326733,1.039604,,0.672241,,0.937294
std,9.038662,,,17.599748,51.776918,0.356198,0.994971,22.875003,0.469794,1.161075,,0.937438,,1.228536
min,29.0,,,94.0,126.0,0.0,0.0,71.0,0.0,0.0,,0.0,,0.0
25%,48.0,,,120.0,211.0,0.0,0.0,133.5,0.0,0.0,,0.0,,0.0
50%,56.0,,,130.0,241.0,0.0,1.0,153.0,0.0,0.8,,0.0,,0.0
75%,61.0,,,140.0,275.0,0.0,2.0,166.0,1.0,1.6,,1.0,,2.0
