Project Description:

This project presents a prediction of heart disease in young patients using data mining techniques.

Files Description:

  • Main.ipynb - the main Jupyter notebook, which contains the main prediction work.
  • outliers - folder containing the Outliers notebook, the Stat_outliers notebook, and the outliers dataset file.
  • stats - folder containing three Jupyter notebooks that visualize the dependencies between the target values and the other attributes.
  • data.csv - data collected in different countries, combined into one dataset
  • heart.csv - processed data

Data description:

The data reviewed in this study was taken from the UCI Archive. The main dataset has about 900 rows, and the processed dataset has about 300 rows. The data was collected in different countries: the USA (Cleveland and VA Long Beach), Hungary, and Switzerland.
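
The row counts can be checked with a few lines of pandas. The following is a minimal sketch, assuming both CSV files sit in the project root and have a header row; the helper name `describe_csv` is hypothetical and not part of the project.

```python
import io

import pandas as pd


def describe_csv(source):
    """Return (row_count, column_count) for a CSV path or file-like object."""
    frame = pd.read_csv(source)
    return frame.shape


if __name__ == "__main__":
    # Tiny in-memory stand-in so the sketch runs without the real files.
    sample = io.StringIO("age,sex,chol\n29,1,204\n34,0,182\n")
    print(describe_csv(sample))
    # With the project files (paths assumed):
    # print(describe_csv("data.csv"))
    # print(describe_csv("heart.csv"))
```

Passing a path or an open file both work, because `pandas.read_csv` accepts either.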

Attribute description:

  • AGE - age
  • SEX - gender
  • CP - chest pain type
  • TRESTBPS - resting blood pressure (in mm Hg on admission to the hospital)
  • CHOL - serum cholesterol in mg/dl
  • FBS - fasting blood sugar > 120 mg/dl
  • RESTECG - resting electrocardiographic results
  • THALACH - maximum heart rate achieved
  • EXANG - exercise induced angina
  • OLDPEAK - ST depression induced by exercise relative to rest
  • CA - number of major vessels colored by fluoroscopy
  • THAL - “thal”, thallium stress test result
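
As a quick sanity check, the attribute list can be compared against the header row of a CSV file. This is only a sketch: it assumes the columns follow the lowercase UCI-style names (for example `trestbps`, `oldpeak`), which this README does not confirm, and `missing_attributes` is a hypothetical helper.

```python
import csv
import io

# Assumed column names: lowercase UCI-style names for the attributes listed above.
EXPECTED_ATTRIBUTES = [
    "age", "sex", "cp", "trestbps", "chol", "fbs",
    "restecg", "thalach", "exang", "oldpeak", "ca", "thal",
]


def missing_attributes(csv_file, expected=EXPECTED_ATTRIBUTES):
    """Return the expected attribute names absent from the CSV header row."""
    header = next(csv.reader(csv_file), [])
    # Compare case-insensitively and ignore stray spaces around names.
    present = {name.strip().lower() for name in header}
    return [name for name in expected if name not in present]


if __name__ == "__main__":
    # In-memory stand-in; with real data, pass open("heart.csv", newline="").
    sample = io.StringIO("age,sex,cp,chol\n52,1,0,212\n")
    print(missing_attributes(sample))
```

An empty result means every listed attribute is present; extra columns such as a label column are ignored.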

Getting Started:

Clone project to computer:

To clone the project to your computer, run the following from the console:

$ git clone

Install packages:

All the libraries needed to run the project are listed in requirements.txt. To install them, open a terminal, go to the folder containing the downloaded project, and run:

$ pip3 install -r requirements.txt

Run Jupyter Notebooks:

Jupyter Notebooks are used throughout the project. To open them, you can use JupyterLab or launch Jupyter Notebook from the console with the following command:

$ jupyter notebook

Built With:

  • pandas - an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language
  • sklearn - simple and efficient tools for data mining and data analysis
  • matplotlib - a Python 2D plotting library which produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms
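
To show how these libraries typically fit together, here is a minimal sketch of a train-and-evaluate loop. It is not the method used in Main.ipynb: the data is synthetic, the column subset and the `target` label column are assumptions, and logistic regression with feature scaling is chosen only for illustration. NumPy is used to generate the random values.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def make_synthetic_patients(n_rows=200, seed=0):
    """Build a synthetic stand-in table; values are random, not real patient data."""
    rng = np.random.default_rng(seed)
    frame = pd.DataFrame({
        "age": rng.integers(20, 45, size=n_rows),
        "trestbps": rng.integers(90, 180, size=n_rows),
        "chol": rng.integers(120, 400, size=n_rows),
        "thalach": rng.integers(90, 200, size=n_rows),
    })
    # Hypothetical label rule so the model has a signal to learn.
    frame["target"] = (frame["chol"] + frame["trestbps"] > 400).astype(int)
    return frame


def train_and_score(frame, label="target", seed=0):
    """Fit a scaled logistic regression on a 75/25 split and return test accuracy."""
    features = frame.drop(columns=[label])
    labels = frame[label]
    x_train, x_test, y_train, y_test = train_test_split(
        features, labels, test_size=0.25, random_state=seed, stratify=labels
    )
    model = make_pipeline(StandardScaler(), LogisticRegression())
    model.fit(x_train, y_train)
    return accuracy_score(y_test, model.predict(x_test))


if __name__ == "__main__":
    accuracy = train_and_score(make_synthetic_patients())
    print(f"test accuracy on synthetic data: {accuracy:.2f}")
```

To try the same loop on the project data, replace `make_synthetic_patients()` with `pd.read_csv("heart.csv")` and set `label` to the name of the label column in that file.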

Authors & License:

  • Traverse Anastasia - UCU student, Faculty of Applied Science, Computer Science

