# Practical Exercise: Predicting Heart Disease with Logistic Regression
In this exercise, we will build a Logistic Regression model to predict whether a person is at risk of heart disease based on their medical information.

We will use a dataset that records each patient's age, resting blood pressure, cholesterol level, and maximum heart rate. The target is a binary variable: **1 = high risk**, **0 = low risk**.

This exercise will help you understand how logistic regression can be applied to real-world health data.

## Step 1: Import Required Libraries
We begin by importing the necessary Python libraries. These include:
- `pandas` for data handling
- `numpy` for numerical operations
- `train_test_split` for splitting data into training and testing sets
- `LogisticRegression` to train our model
- `accuracy_score` and `classification_report` to evaluate our model's performance

In [None]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
print("✅ Libraries imported correctly.")

## Step 2: Load and Explore the Dataset
Next, we load the dataset from a CSV file. Each row holds the medical measurements for one patient, together with the heart disease label.

We will first preview the data using `.head()` and then get a statistical summary using `.describe()`.

In [None]:
df = pd.read_csv('heart_disease_data.csv')
print("📄 First few rows of the dataset:")
print(df.head())

print("\n📊 Statistical summary:")
print(df.describe())
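
Beyond `.head()` and `.describe()`, it can help to check for missing values and for class balance before modeling. The following is a minimal sketch on a small made-up DataFrame that stands in for `heart_disease_data.csv`; the values are invented for illustration, and only the column names are taken from this exercise.

```python
import pandas as pd

# Tiny made-up sample standing in for heart_disease_data.csv (illustrative values only).
sample = pd.DataFrame({
    "age": [52, 61, 45, 38, 70, 57],
    "resting_blood_pressure": [130, 145, None, 118, 160, 138],
    "cholesterol": [240, 289, 210, 190, 310, 265],
    "max_heart_rate": [150, 128, 165, 172, 110, 140],
    "heart_disease": [0, 1, 0, 0, 1, 1],
})

# Count missing values per column; non-zero counts need handling before training.
missing_per_column = sample.isna().sum()
print(missing_per_column)

# Share of each class in the target; a strong imbalance makes accuracy misleading.
class_share = sample["heart_disease"].value_counts(normalize=True)
print(class_share)
```

The same two checks can be run on `df` once the real file is loaded.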

## Step 3: Prepare the Data
Now we prepare the dataset for modeling. We will separate the independent variables (`X`) from the target variable (`y`), and then split the dataset into training and testing sets.

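Holding out 25% of the rows (`test_size=0.25`) lets us evaluate how well the model generalizes to data it has not seen during training, and `random_state=42` makes the split reproducible.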

In [None]:
X = df[['age', 'resting_blood_pressure', 'cholesterol', 'max_heart_rate']]
y = df['heart_disease']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)
print("✅ Data prepared and split into training and testing sets.")
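
If the two classes are not equally common, a purely random split can leave the test set with very few high-risk patients. One possible variation, not part of the original exercise, is to pass `stratify` to `train_test_split` so both subsets keep the same class ratio. The sketch below uses made-up data with hypothetical names (`features`, `target`).

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Made-up target: 12 low-risk (0) and 8 high-risk (1) patients.
features = pd.DataFrame({"age": range(40, 60)})
target = pd.Series([0] * 12 + [1] * 8)

# stratify=target keeps the 60/40 class ratio in both subsets.
X_tr, X_te, y_tr, y_te = train_test_split(
    features, target, test_size=0.25, random_state=42, stratify=target
)

print("Train class counts:", y_tr.value_counts().to_dict())
print("Test class counts:", y_te.value_counts().to_dict())
```

In the exercise itself, this would correspond to adding `stratify=y` to the existing call.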

## Step 4: Train the Logistic Regression Model
We now create and train a Logistic Regression model using the training data. 

The `.fit()` method adjusts the model parameters based on the training data so it can learn the relationship between the inputs and the target.

In [None]:
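# Raise the iteration cap: with unscaled features the default (100) may stop before converging.
model = LogisticRegression(max_iter=1000)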
model.fit(X_train, y_train)
print("✅ Model trained successfully.")
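
To make the learned relationship more concrete: a logistic regression model computes a weighted sum of the inputs plus an intercept, then passes that score through the sigmoid function to get a probability between 0 and 1. The sketch below shows this calculation in plain Python. The weights and intercept are hypothetical numbers chosen for illustration, not values learned from the exercise dataset.

```python
import math

def sigmoid(z: float) -> float:
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_probability(features, weights, intercept):
    """Probability of class 1 for one patient under a logistic model."""
    score = intercept + sum(w * x for w, x in zip(weights, features))
    return sigmoid(score)

# Hypothetical parameters, NOT values learned from the exercise dataset.
example_weights = [0.04, 0.01, 0.005, -0.03]
example_intercept = -1.0
# Order: age, resting blood pressure, cholesterol, max heart rate.
example_patient = [60, 140, 250, 120]

p = predict_probability(example_patient, example_weights, example_intercept)
print(f"Probability of high risk: {p:.2f}")
print("Predicted class:", 1 if p >= 0.5 else 0)
```

After training, the fitted values corresponding to these parameters are available as `model.coef_` and `model.intercept_`.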

## Step 5: Evaluate the Model
We now generate predictions for the test set with `.predict()` and evaluate them with `accuracy_score` and `classification_report`.

Accuracy is the fraction of test predictions that are correct. The classification report adds precision, recall, and F1-score for each class, which shows whether the model does better on low-risk or high-risk patients.

In [None]:
y_pred = model.predict(X_test)
print(f"Accuracy: {accuracy_score(y_test, y_pred):.2f}")
print("\nClassification Report:")
print(classification_report(y_test, y_pred))
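
To see what these metrics measure, it may help to compute them by hand once. The sketch below does this for the high-risk class using made-up labels for ten patients; the lists `actual` and `predicted` are illustrative and not output from the model above.

```python
# Made-up labels for ten patients (1 = high risk, 0 = low risk).
actual    = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
predicted = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]

pairs = list(zip(actual, predicted))
tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # high risk, correctly flagged
tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # low risk, correctly cleared
fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # low risk, wrongly flagged
fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # high risk, missed

accuracy = (tp + tn) / len(pairs)   # share of all predictions that are correct
precision = tp / (tp + fp)          # share of flagged patients who are truly high risk
recall = tp / (tp + fn)             # share of high-risk patients the model caught

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

In a medical setting, recall for the high-risk class often matters most, because a missed high-risk patient (a false negative) is usually costlier than a false alarm.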

## Step 6: Try Your Own Example
Finally, you can test the model with your own data. Enter values for a new patient, and the model will predict whether they are at high risk of heart disease and report the estimated probability of the high-risk class.

In [None]:
print("🔍 Try entering your own patient's values:")
age = int(input("Enter age: "))
bp = int(input("Enter resting blood pressure: "))
chol = int(input("Enter cholesterol level: "))
hr = int(input("Enter max heart rate: "))

new_patient = pd.DataFrame([[age, bp, chol, hr]], columns=X.columns)
prediction = model.predict(new_patient)[0]
probability = model.predict_proba(new_patient)[0][1]

result = '🔴 High risk of heart disease' if prediction == 1 else '🟢 Low risk'
print(f"Prediction: {result} (probability: {probability:.2f})")
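
The cell above stops with an error if the user types something that is not an integer, and it accepts implausible values such as a negative age. One way to guard against this is a small validation helper. The function `parse_measurement` and the ranges below are hypothetical, chosen only for illustration; they are not clinical limits.

```python
def parse_measurement(text, name, low, high):
    """Convert user input to a float and check that it lies in [low, high]."""
    try:
        value = float(text)
    except ValueError:
        raise ValueError(f"{name} must be a number, got {text!r}") from None
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")
    return value

# Fixed strings stand in for input(...) so the sketch runs without a prompt.
checked_age = parse_measurement("54", "age", 1, 120)
checked_bp = parse_measurement("130", "resting blood pressure", 50, 250)
print(checked_age, checked_bp)

# Non-numeric text is rejected with a readable message instead of a raw traceback.
try:
    parse_measurement("abc", "cholesterol", 50, 700)
except ValueError as err:
    print("Rejected:", err)
```

In the interactive cell, each `int(input(...))` call could be wrapped as `parse_measurement(input("Enter age: "), "age", 1, 120)`.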