<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Overview" data-toc-modified-id="Overview-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Overview</a></span><ul class="toc-item"><li><span><a href="#Purpose" data-toc-modified-id="Purpose-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>Purpose</a></span></li><li><span><a href="#What-is-Heart-Disease?" data-toc-modified-id="What-is-Heart-Disease?-1.2"><span class="toc-item-num">1.2&nbsp;&nbsp;</span>What is Heart Disease?</a></span></li><li><span><a href="#Data-Set-Description" data-toc-modified-id="Data-Set-Description-1.3"><span class="toc-item-num">1.3&nbsp;&nbsp;</span>Data Set Description</a></span></li><li><span><a href="#Why-Probabilistic-Machine-Learning-for-Heart-Disease-Diagnosis?" data-toc-modified-id="Why-Probabilistic-Machine-Learning-for-Heart-Disease-Diagnosis?-1.4"><span class="toc-item-num">1.4&nbsp;&nbsp;</span>Why Probabilistic Machine Learning for Heart Disease Diagnosis?</a></span></li><li><span><a href="#Objectives" data-toc-modified-id="Objectives-1.5"><span class="toc-item-num">1.5&nbsp;&nbsp;</span>Objectives</a></span></li></ul></li></ul></div>

# Overview 
## Purpose
The main purpose of this project is to develop a web-based machine learning solution that helps health practitioners diagnose heart disease. The system classifies each patient as either sick (heart disease) or healthy.

## What is Heart Disease?

> Heart disease is a collection of diseases and conditions that cause cardiovascular problems. Typically, the heart fails to supply enough blood to other parts of the body for them to function normally. It covers conditions such as diseased vessels, structural problems and blood clots. Early and timely diagnosis is essential for preventing further damage and saving patients' lives.
> There are many different types of heart disease. Some types can be grouped together according to how they affect the structure or function of the heart. `The solution provided will focus only on diagnosing whether or not a patient has heart disease, using probabilistic machine learning.`
>
> ![](https://images.onhealth.com/images/slideshow/heart_disease_s1_heart.jpg)
> 
> *Cardiovascular system (heart) anatomy*

Source: [OnHealth](https://www.onhealth.com/content/1/heart_disease_coronary_artery)

More information about heart disease is available from the following sources: [healthline](https://www.healthline.com/health/heart-disease) & [Heart&Stroke](https://www.heartandstroke.ca/heart-disease/what-is-heart-disease/types-of-heart-disease)

## Data Set Description 
The data used for this problem is the [Multivariate Heart Disease Data Set from the UCI Machine Learning Repository](https://archive.ics.uci.edu/ml/datasets/Heart+Disease/). The database contains 76 attributes, but all published experiments use a subset of 14 of them. In particular, the Cleveland database is the only one that has been used by ML researchers to date. `The "goal" field refers to the presence of heart disease in the patient`; it is the `target` attribute in the table below.

**Attribute Information** 


| Attribute | Description | Type |
| --- | --- | --- |
| `age` | Patient's age in years | numerical |
| `sex` | sex (1 = male; 0 = female) | nominal/binomial |
| `cp` | chest pain type (1 = typical angina, 2 = atypical angina, 3 = non-anginal pain, 4 = asymptomatic) | nominal |
| `trestbps` | resting blood pressure (in mm Hg on admission to the hospital) | numerical |
| `chol` | serum cholesterol in mg/dl | numerical |
| `fbs` | (fasting blood sugar > 120 mg/dl) (1 = true; 0 = false) | binomial/nominal |
| `restecg` | resting electrocardiographic results (normal; abnormal; ventricular hypertrophy) | nominal |
| `thalach` | maximum heart rate achieved | numerical |
| `exang` | exercise induced angina (1 = yes; 0 = no) | binomial/nominal |
| `oldpeak` | ST depression induced by exercise relative to rest | numerical |
| `slope` | the slope of the peak exercise ST segment (1 = upsloping; 2 = flat; 3 = downsloping) | nominal |
| `ca`  | number of major vessels colored by fluoroscopy (0 = mild; 1 = moderate; 3 = severe) | nominal |
| `thal`  | Status of the heart (1 = normal; 2 = fixed defect; 3 = reversible defect) | nominal |
| `target` | (1 = heart disease; 0 = healthy) | binomial/nominal |
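
The attribute table can be mirrored in code as a small schema, which is useful for checking a patient record before it reaches a model. The following is a minimal sketch, not the project's actual implementation: the names `NOMINAL_CODES`, `NOMINAL_UNCHECKED`, `NUMERICAL` and `validate_record` are hypothetical, the allowed codes are copied from the table only where it lists them explicitly, attribute names use the `trestbps` spelling from the feature list, and the example record is made up for illustration.

```python
# Hypothetical schema mirroring the attribute table above.
# Only attributes whose codes the table states explicitly are code-checked.
NOMINAL_CODES = {
    "sex": {0, 1},
    "cp": {1, 2, 3, 4},
    "fbs": {0, 1},
    "exang": {0, 1},
    "slope": {1, 2, 3},
    "thal": {1, 2, 3},
    "target": {0, 1},
}
# Nominal attributes whose full code set is not spelled out in the table.
NOMINAL_UNCHECKED = {"restecg", "ca"}
NUMERICAL = {"age", "trestbps", "chol", "thalach", "oldpeak"}
ALL_ATTRIBUTES = set(NOMINAL_CODES) | NOMINAL_UNCHECKED | NUMERICAL


def validate_record(record):
    """Return a list of problems found in one patient record (empty if none)."""
    problems = []
    for name in sorted(ALL_ATTRIBUTES - set(record)):
        problems.append(f"missing attribute: {name}")
    for name, value in record.items():
        if name not in ALL_ATTRIBUTES:
            problems.append(f"unknown attribute: {name}")
        elif name in NOMINAL_CODES and value not in NOMINAL_CODES[name]:
            problems.append(f"invalid code for {name}: {value!r}")
        elif name in NUMERICAL and not isinstance(value, (int, float)):
            problems.append(f"non-numeric value for {name}: {value!r}")
    return problems


# Made-up record for illustration only; not taken from the data set.
example = {
    "age": 54, "sex": 1, "cp": 4, "trestbps": 130, "chol": 250, "fbs": 0,
    "restecg": 0, "thalach": 150, "exang": 0, "oldpeak": 1.2, "slope": 2,
    "ca": 0, "thal": 3, "target": 0,
}
print(validate_record(example))
```

A check like this would typically sit in front of the web form or the data-loading step, so that malformed input is rejected before any probabilistic inference is attempted.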

## Why Probabilistic Machine Learning for Heart Disease Diagnosis?

Medical uncertainty is considered a natural feature of medicine and medical practice. Understanding it and acquiring proper coping strategies is regarded as a core clinical competency for medical specialists. Even specialists face uncertainty, especially with complex comorbidities that may hinder the application of existing medical evidence.

Diagnostic uncertainty is a subjective perception of an inability to provide an accurate explanation of the patient's health problem. Dealing with uncertainty in diagnosis involves shared decision making, establishing a relationship of trust with patients, and meticulous evaluation. Using machine learning to help practitioners diagnose heart disease involves a decision: whether a patient is sick (heart disease) or healthy. Uncertainty is a key ingredient in this problem, because the decision depends on how much uncertainty there is.

Probabilistic machine learning provides tools for modeling uncertainty, performing probabilistic inference, combining prior knowledge with empirical evidence, and making predictions or decisions in uncertain environments such as heart disease diagnosis. In this problem, uncertainty is expressed with probabilities. This lets the model signal when it is uncertain or does not know, which can reassure patients who are hesitant to trust artificial intelligence systems. The solution will adopt a probabilistic framework for representing and manipulating uncertainty about the models and predictions.
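
One way to make "signal when it is uncertain" concrete is to turn posterior samples of the disease probability into a three-way decision. The sketch below is an illustration under stated assumptions, not the project's method: the function names `credible_interval` and `diagnose`, the 0.5 decision threshold, the 94% interval mass and the sample lists are all hypothetical choices.

```python
import statistics


def credible_interval(samples, mass=0.94):
    """Equal-tailed interval containing roughly `mass` of the samples."""
    ordered = sorted(samples)
    tail = (1.0 - mass) / 2.0
    lo_idx = int(tail * (len(ordered) - 1))
    hi_idx = int(round((1.0 - tail) * (len(ordered) - 1)))
    return ordered[lo_idx], ordered[hi_idx]


def diagnose(prob_samples, threshold=0.5, mass=0.94):
    """Map posterior samples of P(heart disease) to a decision.

    If the credible interval straddles the threshold, the model is not
    confident either way, so the case is referred to a specialist.
    """
    low, high = credible_interval(prob_samples, mass)
    mean = statistics.mean(prob_samples)
    if low > threshold:
        decision = "heart disease"
    elif high < threshold:
        decision = "healthy"
    else:
        decision = "refer to specialist"
    return decision, mean, (low, high)


# Made-up posterior samples for three hypothetical patients.
confident_sick = [0.80 + 0.001 * i for i in range(100)]   # 0.80 .. 0.899
confident_well = [0.05 + 0.001 * i for i in range(100)]   # 0.05 .. 0.149
uncertain = [0.30 + 0.004 * i for i in range(100)]        # 0.30 .. 0.696

for samples in (confident_sick, confident_well, uncertain):
    decision, mean, interval = diagnose(samples)
    print(decision, round(mean, 3), interval)
```

In a real system the samples would come from the fitted model's posterior, and the threshold and interval mass would need to be chosen with clinical input rather than fixed as they are here.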

More about uncertainty in medical diagnosis: [PubMed: National Library of Medicine](https://pubmed.ncbi.nlm.nih.gov/28936618/) and [National Center for Biotechnology Information](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6127608/)

**A generative model will be developed for this problem such that:** <br>
- `P( θ | data ) ∝ P( data | θ )P(θ)` *whereby:* <br>
- `P( data | θ )` is the likelihood of the observed data given the model parameters, that is, how probable the response feature is under the model and the predictor features. It is data-driven, and as the number of samples increases the likelihood overwhelms the prior distribution <br>
- `P(θ)` is the prior probability of the model parameters: an initial guess at the parameters over the predictor features, based on domain knowledge <br>
- `P( θ | data )` is the posterior of the model parameters: their conditional distribution given the response and predictor features <br>

The probabilistic machine learning model will therefore be formulated from probability distributions and a prior specified before seeing the data, rather than from the training data alone. <br> <br>
`Features:` age, sex, cp, trestbps, chol, fbs, restecg, thalach, exang, oldpeak, slope, ca, thal <br>
`Response Feature:` target (heart disease)
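
The statement that the likelihood overwhelms the prior as data accumulate can be checked on the simplest possible case. The sketch below is a toy stand-in, not the project's model: it uses a single parameter θ (the probability that a patient has heart disease) with a Beta prior and Bernoulli observations, ignoring the predictor features entirely. The prior `Beta(2, 8)`, the counts, and the function names `beta_posterior` and `beta_mean` are hypothetical.

```python
def beta_posterior(prior_a, prior_b, n_sick, n_healthy):
    """Conjugate update: Beta(a, b) prior + Bernoulli data -> Beta posterior.

    The posterior is proportional to likelihood times prior:
      theta^n_sick * (1 - theta)^n_healthy  *  theta^(a-1) * (1 - theta)^(b-1)
    which is again a Beta density with updated parameters.
    """
    return prior_a + n_sick, prior_b + n_healthy


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical prior belief: about 20% of patients are sick -> Beta(2, 8).
prior_a, prior_b = 2.0, 8.0

# Hypothetical data in which half of the observed patients are sick.
for n in (0, 10, 100, 1000):
    a, b = beta_posterior(prior_a, prior_b, n_sick=n // 2, n_healthy=n - n // 2)
    print(f"n={n:4d}  posterior mean of theta = {beta_mean(a, b):.3f}")
```

With no data the posterior mean equals the prior mean of 0.2; as `n` grows it moves toward the observed frequency of 0.5, which is the sense in which the data come to dominate the prior.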

## Objectives 
**In line with the project's problem, the following objectives are derived:**
- Build an intelligent web-based application system to diagnose heart disease in patients (Main objective)
- Incorporate a probabilistic framework into the system by building a probabilistic machine learning model
- Use PyMC3 as a probabilistic programming framework to build a probabilistic machine learning model
- Implement the intelligent web-based application with the functionality to quantify uncertainty when diagnosing heart disease
- Implement the intelligent web-based application with the functionality to refer the patient to a medical specialist when the diagnosis is uncertain
- Build the system with tools that enable decision making under uncertainty and web application hosting

**Project Tools**
- Use HTML, CSS and Bootstrap as the front-end tech stack to build the intelligent system
- Use Python, PyMC3, ArviZ, GitHub and Django as the back-end tech stack to build the intelligent system
