### Problem: Heart Disease
<img src ='https://img.freepik.com/free-vector/human-heart-disease-symbol_1308-107392.jpg?w=2000'/>
     
### Problem Introduction and Motivation
>Heart disease is the leading cause of death globally. An estimated 17.9 million people died from heart disease
>in 2019, representing 32% of all global deaths.
### Q & A
>**What is the prediction we are trying to make?**<br>
>We want to predict whether a person has heart disease from several input variables.<br><br>
>**Why is it important?  Who cares?**<br>
>Since heart disease is the leading cause of death worldwide, using this prediction to inform
>people that they may have heart disease could save many lives. People who care about their health want to know the condition of their hearts. Hospitals can also use these predictions as a reference to help doctors assess a patient's heart condition.<br><br>
>**What are the possible actions that could be taken as a result of this work?**<br>
>If a person is predicted to have heart disease, he or she can see a doctor for further examination.<br><br>
>**How do we define success?**<br>
>We hope to achieve an accuracy above 80%. To get there, we plan to collaborate with hospitals and use their patients' data to make our predictions more accurate.

### Data Preparation

<img src ='https://ourworldindata.org/grapher/exports/causes-of-death-in-15-49-year-olds.svg'/>

**Besides cancer, people should know that heart disease can also have severe consequences.**
<img src ='https://www.templehealth.org/sites/default/files/inline-images/heart-attack-symptoms-men-vs-women.png'/>

**Men are more likely to develop heart disease than women, and heart attack symptoms also differ between men and women.**

### Modeling

In [1]:
import pandas as pd
# pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with
# "relational" or "labeled" data both easy and intuitive.

from sklearn.linear_model import LogisticRegression
# sklearn.linear_model is the scikit-learn module that implements linear models. scikit-learn itself offers a set of
# fast tools for machine learning and statistical modeling, such as classification, regression, clustering, and
# dimensionality reduction; here we import the LogisticRegression classifier from sklearn.linear_model.

from sklearn.metrics import accuracy_score
# sklearn.metrics is a module that implements several loss, score, and utility functions to measure classification
# performance, and we import accuracy_score from it.

import pickle
# pickle is a module used for serializing and deserializing a Python object structure.

df = pd.read_csv('heart_disease.csv')
# df stands for data frame. The read_csv() function reads data from a CSV file; it belongs to the pandas
# package, hence the pd. prefix.

X = df.iloc[:,1:len(df.columns)]
# The iloc indexer selects rows and columns of the data frame by integer position. Its syntax is
# iloc[<row selection>, <column selection>], so df.iloc[:, 1:len(df.columns)] selects all the rows and
# the columns from index 1 through the last column of the data frame.

y = df.iloc[:,0]
# df.iloc[:,0] selects all the rows and the first column of the data frame, which is used as the target label.

model = LogisticRegression(max_iter=800)
# max_iter is the maximum number of iterations the solver may take to converge. It is an integer with a default
# value of 100; here we raise it to 800.

model.fit(X,y)
# model.fit(X,y) means fit the model according to the given training data.

predictions = model.predict(X)
# model.predict(X) means predict class labels for samples in X.

print(accuracy_score(y,predictions))
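# The accuracy_score function from sklearn.metrics calculates the accuracy of a set of predicted labels
# against the true labels; y holds the true labels and predictions holds the predicted labels. Because the
# predictions are made on the same rows the model was fitted on, the printed value is the training accuracy.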

pickle_out = open('classifier', mode='wb')
# The open() function returns a file object which can be used to read, write, and modify the file.
# open('classifier', mode='wb') opens the file named 'classifier' for writing in binary mode.
# 'wb' stands for write-binary.

pickle.dump(model, pickle_out)
# pickle.dump(model, pickle_out) serializes the model and writes it to the pickle_out file.

pickle_out.close()
# close the pickle_out object.

0.7542087542087542
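
The score printed above is computed on the same rows the model was fitted on, so it may overstate how well the model handles new patients. Below is a minimal sketch of how a held-out estimate could be obtained, assuming a stratified 75/25 split. Synthetic data from `make_classification` stands in for `heart_disease.csv` (an assumption made only so the sketch runs on its own), so the numbers it prints say nothing about the real dataset.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for heart_disease.csv (assumption: 5 numeric features, binary target).
X, y = make_classification(
    n_samples=300, n_features=5, n_informative=3, n_redundant=0, random_state=0
)

# Hold out 25% of the rows; stratify keeps the class balance similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = LogisticRegression(max_iter=800)
model.fit(X_train, y_train)

# Accuracy on the rows used for fitting versus rows the model has never seen.
train_accuracy = accuracy_score(y_train, model.predict(X_train))
test_accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"train accuracy: {train_accuracy:.3f}")
print(f"test accuracy:  {test_accuracy:.3f}")
```

With the real data, `X` and `y` would come from the `df.iloc` selections in the cell above, and the test accuracy would be the figure to compare against the 80% target.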


### Deployment

In [1]:
%%writefile app.py

import pickle
import streamlit as st

pickle_in = open('classifier', 'rb')
classifier = pickle.load(pickle_in)

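# Define the function which will make the prediction using data
# inputs from users. Note: st.cache is deprecated in newer Streamlit
# releases in favor of st.cache_data and st.cache_resource.
@st.cache()
def prediction(age, sex, non_anginal_pain, max_heart_rate, exercise_included_angina):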
    
    # Make predictions
    prediction = classifier.predict(
        [[age, sex, non_anginal_pain, max_heart_rate, exercise_included_angina]])
    
    if prediction[0] == 0:
        pred = 'The model does not predict heart disease.'
    else:
        pred = 'PLEASE SEE A DOCTOR!  The model predicts that you may have heart disease.'
    return pred

# This is the main function in which we define our webpage
def main():
    
    # Create input fields
    age = st.number_input("Age(Patient Age in Years)",
                                  min_value=1,
                                  # min_value is the minimum permitted value. If None, there will be no minimum.
                                  max_value=120,
                                  # max_value is the maximum permitted value. If None, there will be no maximum.
                                  value=60,
                                  # It is the value that is displayed on this widget on its first render.
                                  step=1,
                                  # step is the stepping interval. Defaults to 1 if the value is an int.
                                 )
    sex = st.number_input("Sex(0 for Female and 1 for Male)",
                              min_value=0,
                              max_value=1,
                              value=0,
                              step=1
                             )

    non_anginal_pain = st.number_input("Non-anginal pain(1 if this type of pain is diagnosed; 0 otherwise)",
                              min_value=0,
                              max_value=1,
                              value=0,
                              step=1
                             )
    max_heart_rate = st.number_input("Max heart rate(beats per minute)",
                          min_value=0,
                          max_value=200,
                          value=150,
                          step=1
                         )
    exercise_included_angina = st.number_input("Exercise-induced angina(1 if this type of pain arises under the stress of exercise; 0 otherwise)",
                          min_value=0,
                          max_value=1,
                          value=0,
                          step=1
                    )

    result = ""
    
    # When 'Predict' is clicked, make the prediction and store it
    if st.button("Predict"):
        result = prediction(age, sex, non_anginal_pain, max_heart_rate, exercise_included_angina)
        st.success(result)
        
if __name__=='__main__':
    main()
    

Overwriting app.py
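
`app.py` opens the `classifier` file for reading and never closes it. A minimal sketch of the same save-and-load round trip using `with` blocks, which close the file automatically, follows. A plain dict stands in for the fitted model and a temporary directory stands in for the working directory; both are assumptions made so the sketch runs on its own.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for the fitted LogisticRegression object.
model_stand_in = {"kind": "stand-in for the fitted model", "max_iter": 800}

# Write to a temporary directory so the real 'classifier' file is not touched.
path = os.path.join(tempfile.mkdtemp(), "classifier")

# Serialize: the file is closed as soon as the with block ends.
with open(path, "wb") as pickle_out:
    pickle.dump(model_stand_in, pickle_out)

# Deserialize: only unpickle files that come from a trusted source.
with open(path, "rb") as pickle_in:
    restored = pickle.load(pickle_in)

print(restored == model_stand_in)
print(pickle_in.closed)
```

In the app, the second `with` block could replace the `open` and `pickle.load` lines at the top of `app.py`.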


In [None]:
!streamlit run app.py
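
`classifier.predict` in `app.py` receives the five inputs as a bare positional list, so the values must arrive in the same column order the model was trained on. Here is a small sketch of one way to make that order explicit. `FEATURE_ORDER` and `build_feature_row` are hypothetical names, and the order shown is an assumption that should be checked against `df.columns[1:]` from the modeling cell.

```python
# Assumed training column order (hypothetical; verify against df.columns[1:]).
FEATURE_ORDER = [
    "age",
    "sex",
    "non_anginal_pain",
    "max_heart_rate",
    "exercise_included_angina",
]


def build_feature_row(inputs):
    """Return a single-row 2D list with values ordered as in FEATURE_ORDER."""
    missing = [name for name in FEATURE_ORDER if name not in inputs]
    if missing:
        raise KeyError(f"missing inputs: {missing}")
    return [[inputs[name] for name in FEATURE_ORDER]]


# The dict can list the inputs in any order; the row comes out in training order.
row = build_feature_row(
    {
        "sex": 1,
        "age": 60,
        "max_heart_rate": 150,
        "exercise_included_angina": 0,
        "non_anginal_pain": 0,
    }
)
print(row)
```

The resulting `row` has the shape `classifier.predict` expects for a single sample.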