The dataset consists of four databases: Cleveland, Hungary, Switzerland, and Long Beach V. It contains 76 attributes, including the predicted attribute, but all published experiments use a subset of 14 of them. The "target" field indicates the presence of heart disease in the patient. It is integer-valued: 0 = no disease and 1 = disease.
Heart disease is the number one cause of death globally. Hypertension, diabetes, excess weight, and unhealthy lifestyles all contribute to it. This project covers manual exploratory data analysis using pandas in a Jupyter Notebook.

Questions:
- Import The Libraries And Dataset
- Display Top 5 Rows of The Dataset
- Check The Last 5 Rows of The Dataset
- Find Shape of Our Dataset (Number of Rows And Number of Columns)
- Get Information About Our Dataset Like Total Number of Rows, Total Number of Columns, Datatypes of Each Column And Memory Requirement
- Check Null Values In The Dataset
- Check For Duplicate Data and Drop Them
- Get Overall Statistics About The Dataset
- Draw Correlation Matrix
- How Many People Have Heart Disease, And How Many Don't Have Heart Disease In This Dataset?
- Find Count of Male & Female in this Dataset
- Find Gender Distribution According to The Target Variable
- Check Age Distribution In The Dataset
- Check Chest Pain Type
- Show The Chest Pain Distribution As Per Target Variable
- Show Fasting Blood Sugar Distribution According To Target Variable
- Check Resting Blood Pressure Distribution
- Compare Resting Blood Pressure As Per Sex Column
- Show Distribution of Serum Cholesterol
- Plot Continuous Variables
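
Several of the questions above can be sketched with plain pandas calls. The example below is a minimal illustration, not the project's actual notebook. It runs on a tiny made-up DataFrame so that it is self-contained; the column names (`age`, `sex`, `cp`, `trestbps`, `chol`, `fbs`), the file name `heart.csv`, and the encoding `sex` 1 = male, 0 = female are assumptions about how the 14-attribute subset is usually distributed. Only `target` is named in the description above.

```python
import pandas as pd

# Made-up stand-in rows for illustration only (NOT real patient data).
# In the notebook you would load the file instead, for example:
#   df = pd.read_csv("heart.csv")   # file name is an assumption
df = pd.DataFrame({
    "age":      [52, 53, 70, 61, 62, 52],
    "sex":      [1, 1, 1, 1, 0, 1],        # assumed: 1 = male, 0 = female
    "cp":       [0, 0, 1, 2, 0, 0],        # assumed: chest pain type code
    "trestbps": [125, 140, 145, 148, 138, 125],
    "chol":     [212, 203, 174, 203, 294, 212],
    "fbs":      [0, 1, 0, 0, 1, 0],
    "target":   [0, 0, 1, 1, 0, 0],        # 0 = no disease, 1 = disease
})

# Top and bottom rows, shape, and column info
print(df.head())
print(df.tail())
shape_before = df.shape                    # (rows, columns)
df.info()                                  # dtypes, non-null counts, memory

# Null values and duplicates
n_missing = int(df.isnull().sum().sum())   # total missing cells
has_duplicates = bool(df.duplicated().any())
df = df.drop_duplicates()                  # the last row repeats the first

# Overall statistics and correlation matrix
print(df.describe())
corr = df.corr()

# Counts by target and by sex, and sex distribution per target
target_counts = df["target"].value_counts()
sex_counts = df["sex"].value_counts()
sex_by_target = pd.crosstab(df["sex"], df["target"])

# Resting blood pressure compared across the sex column
bp_by_sex = df.groupby("sex")["trestbps"].mean()
print(target_counts, sex_counts, sex_by_target, bp_by_sex, sep="\n\n")
```

The plotting questions (age, blood pressure, cholesterol distributions, and the correlation heatmap) are left out here to keep the sketch dependency-free; in a notebook they could be built on the same `df` with a plotting library such as matplotlib or seaborn.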