# Final Project: Binary Classification using ML & Neural Networks

**Dataset:** Heart Disease https://www.kaggle.com/datasets/aayushjha0311/uci-heart-disease-dataset

Hints are provided only to help you recall concepts already taught.

## 1. Imports and Setup

Import libraries for:
- Data handling (`numpy`, `pandas`)
- Visualization (`matplotlib`, optionally `seaborn`)
- Machine Learning (`scikit-learn`)
- Deep Learning (`torch`)

**Hint:** Look at imports you used in previous ML assignments.

## 2. Load the Dataset

1. Load `heart.csv` using pandas
2. Display the first 5 rows
3. Check shape and column names

**Hint:** `read_csv`, `head`, `shape`, `columns`
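
As a reminder of how these four calls fit together, here is a minimal sketch. The inline CSV is made-up stand-in data so the snippet runs without the real file, and the column names `age` and `chol` are illustrative assumptions, not a description of the actual dataset. In your notebook, pass the path to `heart.csv` instead of `csv_source`.

```python
import io

import pandas as pd

# Made-up stand-in rows (assumption) so the snippet runs without the real file.
# In the notebook, use pd.read_csv("heart.csv") instead.
csv_source = io.StringIO(
    "age,chol,target\n"
    "63,233,1\n"
    "45,210,0\n"
    "58,280,1\n"
    "39,199,0\n"
    "67,250,1\n"
    "50,220,0\n"
)

df = pd.read_csv(csv_source)

print(df.head())         # first 5 rows
print(df.shape)          # (number of rows, number of columns)
print(list(df.columns))  # column names
```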

## 3. Exploratory Data Analysis (EDA)

Perform minimal EDA:
- Check missing values
- Check class balance of `target`
- Understand feature ranges

**Hint:**
- Missing values → `.isnull().sum()`
- Class balance → `value_counts()`
- Feature ranges → `.describe()`
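
The three checks above can be sketched on a small made-up DataFrame. The values, the deliberately missing `chol` entry, and the column names other than `target` are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Made-up stand-in data (assumption); one value is deliberately missing.
df = pd.DataFrame({
    "age": [63, 45, 58, 39, 67, 50],
    "chol": [233.0, 210.0, np.nan, 199.0, 250.0, 220.0],
    "target": [1, 0, 1, 0, 1, 0],
})

# Number of missing values in each column.
missing_per_column = df.isnull().sum()

# Class balance as raw counts and as fractions of the dataset.
class_counts = df["target"].value_counts()
class_fractions = df["target"].value_counts(normalize=True)

# Count, mean, std, min, quartiles, and max for each numeric column.
summary = df.describe()

print(missing_per_column)
print(class_counts)
print(summary)
```

Passing `normalize=True` to `value_counts` makes an imbalance easier to spot than raw counts do.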

## 4. Data Preprocessing

Steps to follow:
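1. Separate features (`X`) and target (`y`)
2. Split into training and validation sets
3. Apply feature scaling (fit the scaler on the training set only, then transform both sets, so that validation statistics do not leak into training)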

**Hint:**
- Neural networks are sensitive to feature scale
- Use the same split for ML and NN models
- `StandardScaler` + `train_test_split`
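
One leakage-free way to combine these pieces is sketched below: split first, fit the scaler on the training portion only, then reuse the same arrays for every model. The synthetic data from `make_classification`, the 80/20 split, `stratify=y`, and `random_state=42` are assumptions for illustration, not requirements of the assignment.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real features and target (assumption).
X, y = make_classification(n_samples=300, n_features=10, random_state=42)

# Split once; a fixed random_state lets every model see the same split.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Fit the scaler on the training data only, then apply it to both sets.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_val_scaled = scaler.transform(X_val)

print(X_train_scaled.shape, X_val_scaled.shape)
```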

## 5. Traditional Machine Learning Models

You will train three models using scikit-learn.

### 5.1 Logistic Regression

Tasks:
- Train the model
- Predict on validation data
- Calculate accuracy and the confusion matrix

**Hint:** This is your baseline model.
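
A minimal sketch of the three tasks, run on synthetic data rather than the heart dataset. The data, the split settings, and `max_iter=1000` are assumptions chosen so the snippet runs on its own.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data and split (assumption).
X, y = make_classification(n_samples=300, n_features=10, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_val = scaler.transform(X_val)

# Train, predict on the validation set, then evaluate.
log_reg = LogisticRegression(max_iter=1000)
log_reg.fit(X_train, y_train)
y_pred = log_reg.predict(X_val)

acc = accuracy_score(y_val, y_pred)
cm = confusion_matrix(y_val, y_pred)  # rows: true class, columns: predicted class

print(f"validation accuracy: {acc:.3f}")
print(cm)
```

In scikit-learn's layout, `cm[0, 1]` counts false positives and `cm[1, 0]` counts false negatives for labels `0` and `1`.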

### 5.2 Decision Tree

Tasks:
- Train a Decision Tree classifier
- Experiment with `max_depth`
- Evaluate performance

**Hint:** Very deep trees usually overfit.
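
One way to experiment with `max_depth` is to compare training and validation accuracy at several depths, as in this sketch on synthetic data. The depth values and the data are assumptions. Scaling is skipped here because tree splits do not depend on feature scale.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data and split (assumption).
X, y = make_classification(n_samples=300, n_features=10, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# max_depth=None lets the tree grow until every leaf is pure.
depth_scores = {}
for depth in [2, 4, 8, None]:
    tree = DecisionTreeClassifier(max_depth=depth, random_state=42)
    tree.fit(X_train, y_train)
    depth_scores[depth] = (tree.score(X_train, y_train), tree.score(X_val, y_val))

for depth, (train_acc, val_acc) in depth_scores.items():
    print(f"max_depth={depth}: train={train_acc:.3f}, val={val_acc:.3f}")
```

A training accuracy that keeps rising while validation accuracy stalls or drops is the usual sign of overfitting.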

### 5.3 Random Forest

Tasks:
- Train a Random Forest classifier
- Tune at least one hyperparameter
- Evaluate performance

**Hint:** `n_estimators` and `max_depth` are good starting points.
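
A small manual sweep over the two suggested hyperparameters might look like the sketch below. The grid values and the synthetic data are assumptions; `GridSearchCV` would be an alternative if you prefer cross-validation.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in data and split (assumption).
X, y = make_classification(n_samples=300, n_features=10, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Try each combination and record the validation accuracy.
results = {}
for n_estimators in [50, 200]:
    for max_depth in [3, None]:
        forest = RandomForestClassifier(
            n_estimators=n_estimators, max_depth=max_depth, random_state=42
        )
        forest.fit(X_train, y_train)
        results[(n_estimators, max_depth)] = forest.score(X_val, y_val)

best_params = max(results, key=results.get)
print("best (n_estimators, max_depth):", best_params)
print("validation accuracy:", round(results[best_params], 3))
```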

## 6. Traditional ML Model Comparison

Compare all traditional ML models.

Answer:
1. Which model performed best?
2. Why do you think this happened?

**Hint:** Look at bias vs variance.
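
To make the bias vs variance question concrete, one option is to tabulate training and validation accuracy side by side, as in this sketch. The synthetic data and the hyperparameter values are assumptions. A large gap between the two scores suggests high variance (overfitting), while low scores on both suggest high bias (underfitting).

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data and split (assumption).
X, y = make_classification(n_samples=300, n_features=10, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_val = scaler.transform(X_val)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(max_depth=4, random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

rows = []
for name, model in models.items():
    model.fit(X_train, y_train)
    rows.append({
        "model": name,
        "train_acc": model.score(X_train, y_train),
        "val_acc": model.score(X_val, y_val),
    })

comparison = pd.DataFrame(rows).set_index("model")
comparison["gap"] = comparison["train_acc"] - comparison["val_acc"]
print(comparison.round(3))
```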

## 7. Neural Network using PyTorch

Now you will implement a **simple feed-forward neural network**.

### 7.1 Data Preparation for PyTorch

Tasks:
- Convert NumPy arrays to PyTorch tensors
- Ensure correct data types

**Hint:**
- Inputs → `float32`
- Targets → `float32`, reshaped to `(N, 1)` so they match the shape of the model output
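
A minimal sketch of the conversion, using small random arrays as stand-ins for the scaled features and labels (the array sizes are assumptions):

```python
import numpy as np
import torch

# Stand-in NumPy arrays (assumption): float64 features and int64 labels.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(8, 4))
y_train = rng.integers(0, 2, size=8)

# Features become float32 tensors.
X_train_t = torch.tensor(X_train, dtype=torch.float32)

# BCE loss needs float targets with the same shape as the model output, (N, 1).
y_train_t = torch.tensor(y_train, dtype=torch.float32).reshape(-1, 1)

print(X_train_t.dtype, X_train_t.shape)
print(y_train_t.dtype, y_train_t.shape)
```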

### 7.2 Neural Network Architecture

Define a neural network with:
- Input layer
- 1 or 2 hidden layers (ReLU)
- Output layer (Sigmoid)

**Hint:** Keep the model small and simple.
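
One possible shape for such a network is sketched below. The class name `HeartNet`, the hidden sizes, and the input width of 10 are hypothetical choices for illustration; set `n_features` to the number of columns in your own `X`.

```python
import torch
from torch import nn


class HeartNet(nn.Module):  # hypothetical name
    def __init__(self, n_features, hidden=16):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(n_features, hidden),   # input -> hidden layer 1
            nn.ReLU(),
            nn.Linear(hidden, hidden // 2),  # hidden layer 1 -> hidden layer 2
            nn.ReLU(),
            nn.Linear(hidden // 2, 1),       # hidden layer 2 -> single output
            nn.Sigmoid(),                    # squash to a probability in [0, 1]
        )

    def forward(self, x):
        return self.layers(x)


model = HeartNet(n_features=10)
probs = model(torch.randn(5, 10))  # 5 random rows as a shape check
print(probs.shape)
```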

### 7.3 Loss Function and Optimizers

Tasks:
- Use Binary Cross Entropy loss (`nn.BCELoss`, which expects the probabilities produced by the Sigmoid output layer)
- Train once using SGD and once using Adam

**Hint:** Learning rate matters more than depth here.
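
The setup for the two runs could be sketched as follows. The helper `make_model`, the layer sizes, and both learning rates are assumptions to tune, not recommended values. Each optimizer gets its own freshly created model so the two runs do not share weights.

```python
import torch
from torch import nn


def make_model(n_features):
    # Hypothetical helper: a small network ending in Sigmoid, as BCELoss expects.
    return nn.Sequential(
        nn.Linear(n_features, 8), nn.ReLU(), nn.Linear(8, 1), nn.Sigmoid()
    )


criterion = nn.BCELoss()

model_sgd = make_model(10)
optimizer_sgd = torch.optim.SGD(model_sgd.parameters(), lr=0.1)

model_adam = make_model(10)
optimizer_adam = torch.optim.Adam(model_adam.parameters(), lr=0.001)

# Sanity check of the loss on hand-written probabilities and targets.
probs = torch.tensor([[0.9], [0.2]])
targets = torch.tensor([[1.0], [0.0]])
loss = criterion(probs, targets)
print(loss.item())  # mean of -ln(0.9) and -ln(0.8), about 0.164
```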

### 7.4 Training Loop

Tasks:
- Write the training loop
- Track training and validation loss

**Hint:** Order matters: `zero_grad` → forward → loss → `backward` → `step`
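
The order in the hint can be sketched as a full-batch training loop. The synthetic tensors, the model size, the learning rate, and the epoch count are all assumptions; a `DataLoader` with mini-batches would be an equally valid design.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Synthetic stand-in data (assumption): the label depends only on feature 0.
X_train = torch.randn(200, 10)
y_train = (X_train[:, :1] > 0).float()  # shape (200, 1)
X_val = torch.randn(50, 10)
y_val = (X_val[:, :1] > 0).float()      # shape (50, 1)

model = nn.Sequential(nn.Linear(10, 8), nn.ReLU(), nn.Linear(8, 1), nn.Sigmoid())
criterion = nn.BCELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

train_losses, val_losses = [], []
for epoch in range(100):
    model.train()
    optimizer.zero_grad()              # 1. clear gradients from the last step
    outputs = model(X_train)           # 2. forward pass
    loss = criterion(outputs, y_train) # 3. compute the loss
    loss.backward()                    # 4. backpropagate
    optimizer.step()                   # 5. update the weights
    train_losses.append(loss.item())

    # Validation loss: no gradient tracking and no weight updates.
    model.eval()
    with torch.no_grad():
        val_losses.append(criterion(model(X_val), y_val).item())

print(f"first/last training loss: {train_losses[0]:.3f} / {train_losses[-1]:.3f}")
```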

## 8. Neural Network Evaluation

Tasks:
- Compute accuracy on the validation set
- Plot training vs validation loss

**Hint:** Loss curves help identify overfitting.
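
A sketch of both tasks follows, again on synthetic data with assumed hyperparameters. Accuracy is computed by thresholding the Sigmoid output at 0.5, which is a common default rather than a requirement.

```python
import matplotlib.pyplot as plt
import torch
from torch import nn

torch.manual_seed(0)

# Synthetic stand-in data (assumption): the label depends only on feature 0.
X_train = torch.randn(200, 10)
y_train = (X_train[:, :1] > 0).float()
X_val = torch.randn(50, 10)
y_val = (X_val[:, :1] > 0).float()

model = nn.Sequential(nn.Linear(10, 8), nn.ReLU(), nn.Linear(8, 1), nn.Sigmoid())
criterion = nn.BCELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

train_losses, val_losses = [], []
for epoch in range(100):
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(X_train), y_train)
    loss.backward()
    optimizer.step()
    train_losses.append(loss.item())
    model.eval()
    with torch.no_grad():
        val_losses.append(criterion(model(X_val), y_val).item())

# Accuracy: turn probabilities into 0/1 predictions with a 0.5 threshold.
model.eval()
with torch.no_grad():
    val_preds = (model(X_val) >= 0.5).float()
val_accuracy = (val_preds == y_val).float().mean().item()
print(f"validation accuracy: {val_accuracy:.3f}")

# Loss curves: one line per split, epochs on the x-axis.
fig, ax = plt.subplots()
ax.plot(train_losses, label="training loss")
ax.plot(val_losses, label="validation loss")
ax.set_xlabel("epoch")
ax.set_ylabel("BCE loss")
ax.legend()
```

If the validation curve starts rising while the training curve keeps falling, the model has likely begun to overfit.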

## 9. Optimizer Comparison

Compare SGD vs Adam based on:
- Convergence speed
- Final validation loss

**Hint:** Adam usually converges in fewer epochs than plain SGD, but it does not always reach a lower final validation loss.
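
For a fair comparison, both optimizers should start from identical weights and see identical data. The sketch below does this with `copy.deepcopy`. The helpers `train` and `epochs_to_reach`, the threshold of 0.5, the shared learning rate, and the synthetic data are all assumptions for illustration.

```python
import copy

import torch
from torch import nn


def train(model, optimizer, X_train, y_train, X_val, y_val, epochs=100):
    """Hypothetical helper: full-batch training, returns validation loss per epoch."""
    criterion = nn.BCELoss()
    val_history = []
    for _ in range(epochs):
        model.train()
        optimizer.zero_grad()
        loss = criterion(model(X_train), y_train)
        loss.backward()
        optimizer.step()
        model.eval()
        with torch.no_grad():
            val_history.append(criterion(model(X_val), y_val).item())
    return val_history


def epochs_to_reach(losses, threshold):
    """Hypothetical helper: first epoch (1-based) with loss <= threshold, else None."""
    for epoch, value in enumerate(losses, start=1):
        if value <= threshold:
            return epoch
    return None


torch.manual_seed(0)

# Synthetic stand-in data (assumption): the label depends only on feature 0.
X_train = torch.randn(200, 10)
y_train = (X_train[:, :1] > 0).float()
X_val = torch.randn(50, 10)
y_val = (X_val[:, :1] > 0).float()

# Both runs start from copies of the same initial weights.
base = nn.Sequential(nn.Linear(10, 8), nn.ReLU(), nn.Linear(8, 1), nn.Sigmoid())
model_sgd = copy.deepcopy(base)
model_adam = copy.deepcopy(base)

history = {
    "SGD": train(model_sgd, torch.optim.SGD(model_sgd.parameters(), lr=0.01),
                 X_train, y_train, X_val, y_val),
    "Adam": train(model_adam, torch.optim.Adam(model_adam.parameters(), lr=0.01),
                  X_train, y_train, X_val, y_val),
}

for name, losses in history.items():
    print(name, "final val loss:", round(losses[-1], 4),
          "| epochs to reach 0.5:", epochs_to_reach(losses, 0.5))
```

Using the same learning rate for both is only one choice; tuning each optimizer's learning rate separately is also a defensible comparison.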

## 10. Final Conclusions

Answer:
1. When did the Neural Network perform better?
2. When did traditional ML perform better?
3. What is your key takeaway from this project?