# Project 3: Heart Attack Predictor

## Overview

In the United States, someone has a heart attack about every 40 seconds, which adds up to more than 800,000 heart attacks each year. This underscores the need for awareness and prevention. Regular risk assessment matters because it allows for timely interventions and lifestyle changes that can significantly reduce the likelihood of life-threatening cardiovascular events.

This project implements a web app that lets doctors assess a patient's risk of having a heart attack. The app is powered by a machine learning model trained on a real-world dataset of patient information.

## Dataset

In [None]:
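import boto3
import pandas as pd

# S3 location of the dataset
bucket_name = 'prj-03-bucket-jd'
heart_attack_csv = 'heart_attack.csv'

s3 = boto3.client('s3')

# Download the object; response['Body'] is a file-like stream
response = s3.get_object(Bucket=bucket_name, Key=heart_attack_csv)

# Parse the CSV stream directly into a DataFrame
data = pd.read_csv(response['Body'])

# Preview the first five rows
data.head()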

## Training

In [None]:
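from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Remove stray whitespace from column names so lookups by name work
data.columns = data.columns.str.strip()

# Separate the features from the target column
X = data.drop(columns=["heart attack"])
y = data["heart attack"]

# Hold out 20% of the rows for testing; the fixed seed makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit the scaler on the training set only, then apply the same transform to the test set
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Train a logistic regression classifier on the standardized features
model = LogisticRegression(max_iter=1000)
model.fit(X_train_scaled, y_train)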

## Model Evaluation

In [None]:
from sklearn.metrics import accuracy_score, classification_report

# Predict labels for the held-out test set
y_pred = model.predict(X_test_scaled)

# Per-class precision, recall, and F1 score
print("Classification Report:")
print(classification_report(y_test, y_pred))

# Overall fraction of correct predictions
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy Score: {accuracy:.4f}")

## Model Serialization

In [None]:
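import joblib

# Save the trained classifier
joblib.dump(model, "logistic_model.pkl")

# Save the fitted scaler too, so new inputs can be standardized the same way as the training data
joblib.dump(scaler, "scaler.pkl")

print("Saved successfully.")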
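
The web app itself is not shown in this notebook. As a rough sketch of how it might use the two saved files, the code below loads them and returns a risk estimate for a single patient. The helper names `load_artifacts` and `predict_risk` are hypothetical, and the sketch assumes that the positive class is encoded as `1` in the `heart attack` column, which the notebook does not confirm.

```python
import joblib
import pandas as pd


def load_artifacts(model_path="logistic_model.pkl", scaler_path="scaler.pkl"):
    """Load the model and scaler written by the serialization cell."""
    return joblib.load(model_path), joblib.load(scaler_path)


def predict_risk(model, scaler, patient, positive_label=1):
    """Return the predicted probability of the positive class for one patient.

    `patient` is a dict mapping feature names to values.
    `positive_label=1` is an assumption about how the target is encoded.
    """
    row = pd.DataFrame([patient])

    # A scaler fitted on a DataFrame records the training column order;
    # reorder the input to match it when that information is available.
    names = getattr(scaler, "feature_names_in_", None)
    if names is not None:
        row = row[list(names)]

    # Apply the same standardization that was used during training
    scaled = scaler.transform(row)

    # Find which predict_proba column corresponds to the positive class
    positive_index = list(model.classes_).index(positive_label)
    return float(model.predict_proba(scaled)[0, positive_index])
```

A request handler could call `load_artifacts()` once at startup and then pass each submitted form to `predict_risk` as a dict keyed by the training column names. Returning a probability rather than a hard label lets the app present a graded risk instead of a yes/no answer.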