# Proposal

## Introduction

Heart disease is a general term for a variety of conditions that affect the heart and blood vessels. These conditions can damage the heart, restrict blood flow, and lead to serious complications, including heart attack, stroke, and heart failure. Heart disease is the leading cause of death in the United States, and about 20.5 million U.S. adults have coronary artery disease (U.S. Department of Health and Human Services, 2023).

Age is a major risk factor for heart disease for several reasons. With age, arteries stiffen and weaken, reducing blood flow and putting strain on the heart. Plaque buildup in the arteries also worsens with age, further restricting blood flow and increasing the risk of heart attack and stroke (Rodgers et al., 2019).

Chronically high blood pressure significantly increases the risk of heart disease in several ways. It forces the heart to work harder, potentially leading to heart failure (World Health Organization, 2020). It can also damage arteries, making them more prone to plaque buildup and narrowing (Centers for Disease Control and Prevention, 2021).

High levels of LDL cholesterol, often called "bad" cholesterol, can contribute to heart disease by accumulating in arteries and forming plaque. This plaque narrows the arteries, reducing blood flow and increasing the risk of complications like chest pain, blood clots, heart attack, and stroke (Centers for Disease Control and Prevention, 2017).

Based on our background research, we will use the UC Irvine Heart Disease dataset to answer the following research question.

**Research Question:** Can we classify whether a patient has heart disease based on age, blood pressure, and cholesterol?

## Exploratory Data Analysis

library(tidyverse)
library(repr)
library(tidymodels)
library(stringr)

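# Load the data and tidy the column names:
# spaces become underscores and commas are dropped.
heart <- read_csv("data/Heart_Disease_Prediction.csv")
names(heart) <- str_replace_all(names(heart), c(" " = "_", "," = ""))
head(heart)

# Keep only the three predictors of interest and the class label.
heart_filtered <- heart |>
    select(Age, BP, Cholesterol, Heart_Disease)
head(heart_filtered)

# Mean of each predictor across all patients.
heart_mean <- heart_filtered |>
    select(-Heart_Disease) |>
    map_df(mean)

# Number of patients in each class.
heart_amount <- heart_filtered |>
    group_by(Heart_Disease) |>
    summarize(amount = n())

heart_mean
heart_amount

# Combine into a single row: the one-row means are repeated for each class
# row, and pivot_wider() then spreads the class counts into columns.
heart_summary <- bind_cols(heart_mean, heart_amount) |>
    pivot_wider(names_from = Heart_Disease, values_from = amount) |>
    rename(age_avg = Age, bp_avg = BP, chol_avg = Cholesterol,
           hd_absence = Absence, hd_presense = Presence)
heart_summary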




Rows: 270 Columns: 14
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (1): Heart Disease
dbl (13): Age, Sex, Chest pain type, BP, Cholesterol, FBS over 120, EKG resu...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.


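| Age | Sex | Chest_pain_type | BP | Cholesterol | FBS_over_120 | EKG_results | Max_HR | Exercise_angina | ST_depression | Slope_of_ST | Number_of_vessels_fluro | Thallium | Heart_Disease |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<chr>` |
| 70 | 1 | 4 | 130 | 322 | 0 | 2 | 109 | 0 | 2.4 | 2 | 3 | 3 | Presence |
| 67 | 0 | 3 | 115 | 564 | 0 | 2 | 160 | 0 | 1.6 | 2 | 0 | 7 | Absence |
| 57 | 1 | 2 | 124 | 261 | 0 | 0 | 141 | 0 | 0.3 | 1 | 0 | 7 | Presence |
| 64 | 1 | 4 | 128 | 263 | 0 | 0 | 105 | 1 | 0.2 | 2 | 1 | 7 | Absence |
| 74 | 0 | 2 | 120 | 269 | 0 | 2 | 121 | 1 | 0.2 | 1 | 1 | 3 | Absence |
| 65 | 1 | 4 | 120 | 177 | 0 | 0 | 140 | 0 | 0.4 | 1 | 0 | 7 | Absence |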


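| Age | BP | Cholesterol | Heart_Disease |
|---|---|---|---|
| `<dbl>` | `<dbl>` | `<dbl>` | `<chr>` |
| 70 | 130 | 322 | Presence |
| 67 | 115 | 564 | Absence |
| 57 | 124 | 261 | Presence |
| 64 | 128 | 263 | Absence |
| 74 | 120 | 269 | Absence |
| 65 | 120 | 177 | Absence |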


| Age | BP | Cholesterol |
|---|---|---|
| `<dbl>` | `<dbl>` | `<dbl>` |
| 54.43333 | 131.3444 | 249.6593 |


| Heart_Disease | amount |
|---|---|
| `<chr>` | `<int>` |
| Absence | 150 |
| Presence | 120 |


| age_avg | bp_avg | chol_avg | hd_absence | hd_presense |
|---|---|---|---|---|
| `<dbl>` | `<dbl>` | `<dbl>` | `<int>` | `<int>` |
| 54.43333 | 131.3444 | 249.6593 | 150 | 120 |
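
The means above are taken over all patients, so they do not show whether the predictors differ between the two classes. A per-class summary may be more useful for the research question. The following is a minimal sketch, not part of the original analysis: `summarize_by_class()` is a hypothetical helper, and `heart_sample` is a stand-in built from the six rows printed by `head(heart_filtered)`; in the notebook, the full `heart_filtered` data frame would be passed instead.

```r
library(dplyr)
library(tibble)

# Stand-in data (assumption): only the six rows shown by head(heart_filtered).
# These rows are far too few to draw conclusions from.
heart_sample <- tribble(
  ~Age, ~BP, ~Cholesterol, ~Heart_Disease,
  70, 130, 322, "Presence",
  67, 115, 564, "Absence",
  57, 124, 261, "Presence",
  64, 128, 263, "Absence",
  74, 120, 269, "Absence",
  65, 120, 177, "Absence"
)

# Hypothetical helper: count the patients in each class and compute the
# mean of each predictor within that class.
summarize_by_class <- function(data) {
  data |>
    group_by(Heart_Disease) |>
    summarize(
      amount = n(),
      across(c(Age, BP, Cholesterol), mean, .names = "{.col}_avg")
    )
}

class_summary <- summarize_by_class(heart_sample)
class_summary
```

The result has one row per class with the columns `amount`, `Age_avg`, `BP_avg`, and `Cholesterol_avg`. Calling `summarize_by_class(heart_filtered)` in the notebook would produce the same layout for all 270 patients.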


## Bibliography

Centers for Disease Control and Prevention. (2017). LDL & HDL: Good & Bad Cholesterol. Centers for Disease Control and Prevention. https://www.cdc.gov/cholesterol/ldl_hdl.htm 

Centers for Disease Control and Prevention. (2021, May 18). About high blood pressure (hypertension). Centers for Disease Control and Prevention. https://www.cdc.gov/bloodpressure/about.htm

Rodgers, J. L., Jones, J., Bolleddu, S. I., Vanthenapalli, S., Rodgers, L. E., Shah, K., Karia, K., & Panguluri, S. K. (2019). Cardiovascular Risks Associated with Gender and Aging. Journal of Cardiovascular Development and Disease, 6(2), 19. https://doi.org/10.3390/jcdd6020019

U.S. Department of Health and Human Services. (2023, December 20). What is coronary heart disease? National Heart, Lung, and Blood Institute. https://www.nhlbi.nih.gov/health/coronary-heart-disease

World Health Organization. (2020). Hypertension. World Health Organization. https://www.who.int/health-topics/hypertension#tab=tab_1