# Building a Dense Neural Network for Heart Disease Prediction

Time estimate: **30** minutes

## Objectives

After completing this lab, you will be able to:

- Preprocess medical data by handling missing values, normalizing features, and encoding categorical variables for neural network training
- Build a fully connected (dense) neural network capable of learning from patient data to predict heart disease presence
- Train and evaluate the model using appropriate metrics, and visualize learning curves to assess model performance and improvement

## What you will do in this lab

In this hands-on lab, you will build your first neural network to predict heart disease using patient medical records. This practical exercise will introduce you to the complete machine learning workflow, from data loading through model evaluation.

You will:

- Explore and analyze the Heart Disease UCI dataset containing clinical parameters
- Preprocess medical data using standardization techniques
- Design and implement a multi-layer neural network architecture
- Train the model using backpropagation and evaluate its performance
- Interpret results using classification reports, confusion matrices, and learning curves

## Overview

Heart disease remains one of the leading causes of death worldwide, making early detection critical for effective treatment. Machine learning, particularly neural networks, has shown remarkable success in medical diagnosis by identifying complex patterns in patient data that might not be immediately apparent to human observers.

In this lab, you will build a dense (fully connected) neural network to predict the presence of heart disease based on 13 clinical features such as age, blood pressure, cholesterol levels, and electrocardiogram results. Dense neural networks are particularly well-suited for this task because they can learn non-linear relationships between multiple features simultaneously.

By the end of this lab, you will understand how to apply neural networks to real-world medical data and interpret the results in a clinically meaningful way. This foundational knowledge will prepare you for more advanced deep learning applications in healthcare and other domains.

## About the dataset

This dataset contains medical records for heart disease diagnosis and prediction. It is widely used in cardiovascular research and machine learning applications to predict the presence of heart disease in patients based on various clinical parameters.

### Dataset overview

The Cleveland Heart Disease Database is one of the most widely used datasets for heart disease prediction research. Originally collected at the Cleveland Clinic Foundation, it contains medical information from patients undergoing cardiovascular evaluations, and it is commonly used as a benchmark for developing and testing machine learning models in medical diagnostics.

The dataset includes 303 patient records with 14 attributes (13 features and 1 target variable). These features represent a combination of demographic information, symptoms, clinical test results, and diagnostic measurements that physicians commonly use to assess cardiovascular health. The target variable indicates whether heart disease is present, making this a binary classification problem.

### Column descriptions

1. **age** - Age of the patient in years (range: 29-77 years)
2. **sex** - Sex of the patient (0 = Female, 1 = Male)
3. **cp** - Chest pain type: 0 = typical angina, 1 = atypical angina, 2 = non-anginal pain, 3 = asymptomatic
4. **trestbps** - Resting blood pressure measured in mm Hg on admission to the hospital (normal range: 90-140 mm Hg)
5. **chol** - Serum cholesterol level in mg/dl (desirable: < 200 mg/dl)
6. **fbs** - Fasting blood sugar > 120 mg/dl (0 = No, 1 = Yes); indicates potential diabetes
7. **restecg** - Resting electrocardiographic results: 0 = normal, 1 = ST-T wave abnormality, 2 = left ventricular hypertrophy
8. **thalach** - Maximum heart rate achieved during the exercise stress test, in bpm (for comparison, a normal resting heart rate is 60-100 bpm)
9. **exang** - Exercise-induced angina (chest pain triggered by physical activity): 0 = No, 1 = Yes
10. **oldpeak** - ST depression induced by exercise relative to rest (measures heart stress during exercise)
11. **slope** - Slope of the peak exercise ST segment: 0 = upsloping (better), 1 = flat, 2 = downsloping (worse)
12. **ca** - Number of major blood vessels (0-3) colored by fluoroscopy (fewer vessels visible indicates potential blockage)
13. **thal** - Thalassemia/Thallium stress test result: 1 = normal blood flow, 2 = fixed defect, 3 = reversible defect
14. **target** - Heart disease diagnosis: 0 = No disease (< 50% diameter narrowing), 1 = Disease present (> 50% diameter narrowing)

## Setup

### Installing required libraries

The following libraries are required to run this lab. If you are running this notebook in a local environment, you may need to install these libraries using pip.

In [None]:
!pip install pandas matplotlib scikit-learn tensorflow

### Importing required libraries

In [None]:
# Import numerical computing and data manipulation libraries
import numpy as np
import pandas as pd

# Import visualization library
import matplotlib.pyplot as plt

# Import scikit-learn utilities for data preparation and evaluation
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import classification_report, confusion_matrix

# Import TensorFlow and Keras for building neural networks
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

print("All libraries imported successfully!")

## Step 1: Load the patient dataset

In [None]:
# Load the Heart Disease Dataset from CSV file
data = pd.read_csv("https://advanced-machine-learning-for-medical-data-8e1579.gitlab.io/labs/lab7/heart.csv")

print("Dataset loaded successfully!")
print(f"Dataset shape: {data.shape}")
print(f"This means {data.shape[0]} patients and {data.shape[1]} columns (features + target)")

In [None]:
# Display the first few rows to understand the data structure
print("First 5 rows of the dataset:")
data.head()

## Step 2: Explore the data

In [None]:
# Check the distribution of the target variable (heart disease presence)
print("Target variable distribution:")
print(data["target"].value_counts())

# Display class proportions as percentages
print("\nClass proportions:")
print(data["target"].value_counts(normalize=True))
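
The class proportions matter because they set the accuracy that a trivial model would reach by always predicting the most frequent class. The following is a minimal sketch of that baseline; the `labels` array is hypothetical and is not taken from the lab's dataset.

```python
import numpy as np

# Hypothetical labels for illustration only: 6 patients with disease (1), 4 without (0)
labels = np.array([1, 1, 1, 1, 1, 1, 0, 0, 0, 0])

# Count the samples in each class (index 0 -> class 0, index 1 -> class 1)
class_counts = np.bincount(labels)

# A model that always predicts the most frequent class reaches this accuracy
baseline_accuracy = class_counts.max() / class_counts.sum()

print(f"Class counts: {class_counts}")
print(f"Majority-class baseline accuracy: {baseline_accuracy:.2f}")
```

A trained network is only useful if its test accuracy clearly exceeds this baseline.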

## Step 3: Identify input features (X) and target variable (y)

In [None]:
# Separate features (X) and target (y)
# X contains all columns except 'target' - these are the input variables
# y contains only the 'target' column - this is what you want to predict
X = data.drop(columns=["target"])
y = data["target"]

print(f"Features shape: {X.shape}")
print(f"Target shape: {y.shape}")

## Step 4: Split the data into Train and Test sets

In [None]:
# Split the dataset into training and testing sets
# test_size=0.2 means 20% of data goes to testing, 80% to training
# random_state=42 ensures reproducibility (same split every time you run this code)
# stratify=y maintains the same class proportion in both train and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, 
    test_size=0.2,        # 20% for testing
    random_state=42,      # for reproducibility
    stratify=y            # maintain class balance in splits
)

print("Data split completed!")
print(f"Training set: {X_train.shape[0]} samples")
print(f"Testing set: {X_test.shape[0]} samples")
print(f"\nTraining proportion: {X_train.shape[0] / len(data) * 100:.1f}%")
print(f"Testing proportion: {X_test.shape[0] / len(data) * 100:.1f}%")
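
To see what stratification does, the sketch below splits each class separately so that the test set keeps the same class proportions as the full data. It is a simplified illustration of the idea written with NumPy, not scikit-learn's implementation, and `y_demo` is a hypothetical label array.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical labels for illustration: 60 positive and 40 negative samples
y_demo = np.array([1] * 60 + [0] * 40)

test_indices = []
for cls in np.unique(y_demo):
    # Shuffle the indices of this class and reserve 20% of them for testing
    cls_indices = rng.permutation(np.where(y_demo == cls)[0])
    n_test = int(round(0.2 * len(cls_indices)))
    test_indices.extend(cls_indices[:n_test])

# Boolean mask: True for test samples, False for training samples
test_mask = np.zeros(len(y_demo), dtype=bool)
test_mask[test_indices] = True

train_pos_rate = y_demo[~test_mask].mean()
test_pos_rate = y_demo[test_mask].mean()

print(f"Positive rate in training set: {train_pos_rate:.2f}")
print(f"Positive rate in test set: {test_pos_rate:.2f}")
```

Both rates match the 60% positive rate of the full hypothetical data, which is the property `stratify=y` is meant to preserve.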

## Step 5: Standardize the features

**Feature standardization** is a crucial preprocessing step for neural networks. Features in the dataset have different scales (for example, age ranges from 29 to 77, while sex is only 0 or 1), and neural networks generally train better when all features are on a similar scale.

**StandardScaler** transforms each feature to have:
- A mean of 0 (centered around zero)
- A standard deviation of 1 (unit variance)

This process is called **Z-score normalization** and is calculated as: `z = (x - mean) / standard_deviation`

By standardizing features, you ensure that no single feature dominates the learning process simply because it has larger values. This is especially important for algorithms like neural networks that use gradient descent optimization.

**Important:** You fit the scaler only on the training data, then apply the same transformation to both training and testing data. This prevents data leakage from the test set into the training process.
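
The Z-score formula and the fit-on-training-only rule can be worked through by hand. The sketch below uses NumPy and hypothetical values for two features (age and cholesterol); it mirrors what `StandardScaler` computes but is not a replacement for it.

```python
import numpy as np

# Hypothetical values for two features: age (years) and cholesterol (mg/dl)
train_demo = np.array([[40.0, 180.0],
                       [50.0, 220.0],
                       [60.0, 260.0]])
test_demo = np.array([[55.0, 240.0]])

# "Fit": compute the statistics from the training rows only
mean = train_demo.mean(axis=0)
std = train_demo.std(axis=0)  # population standard deviation (ddof=0)

# "Transform": apply the training statistics to both sets
train_scaled = (train_demo - mean) / std
test_scaled = (test_demo - mean) / std

print("Training mean per feature:", mean)
print("Scaled training data:\n", train_scaled)
print("Scaled test row:", test_scaled)
```

The scaled training columns have mean 0 and standard deviation 1. The test row is scaled with the training statistics, so its values are not forced to those targets.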

In [None]:
# Create a StandardScaler object to standardize features
scaler = StandardScaler()

# Fit the scaler on training data and transform it
# fit_transform() calculates mean and std from training data, then standardizes it
X_train = scaler.fit_transform(X_train)

# Transform test data using the same scaler parameters (mean and std from training data)
# This ensures consistent scaling between training and testing
X_test = scaler.transform(X_test)

print("Feature standardization completed!")
print(f"Training data shape: {X_train.shape}")
print(f"Testing data shape: {X_test.shape}")

In [None]:
# Display the first row of standardized training data to see the transformation
# Notice how all values are now centered around 0 with similar scales
print("First patient's standardized features:")
X_train[0]

## Step 6: Define the neural network

In [None]:
# Build the dense neural network using Keras Sequential API
model = keras.Sequential([
    # Input layer: accepts 13 features (patient's clinical measurements)
    layers.Input(shape=(13,)),   
    
    # First hidden layer: 16 neurons with ReLU activation
    # ReLU outputs max(0, x), adding the non-linearity the network needs to learn complex patterns
    layers.Dense(16, activation="relu"),       
    
    # Second hidden layer: 8 neurons with ReLU activation
    # Further refines the learned patterns
    layers.Dense(8, activation="relu"),        
    
    # Output layer: 1 neuron with sigmoid activation
    # Sigmoid produces a probability between 0 and 1 for binary classification
    layers.Dense(1, activation="sigmoid")      
])

print("Neural network architecture created!")
print("Architecture: Input (13 features) → 16 neurons → 8 neurons → 1 output (probability)")
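
To make the architecture concrete, here is a rough NumPy sketch of the forward pass that these three Dense layers perform. The weights are random placeholders (a trained model would have learned values), and `x_demo` is a hypothetical batch of four patients with 13 standardized features each.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # ReLU: keep positive values, replace negatives with 0
    return np.maximum(0.0, x)

def sigmoid(x):
    # Sigmoid: squash any real number into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

# Random placeholder weights; shapes follow the 13 -> 16 -> 8 -> 1 architecture
W1, b1 = rng.normal(scale=0.1, size=(13, 16)), np.zeros(16)
W2, b2 = rng.normal(scale=0.1, size=(16, 8)), np.zeros(8)
W3, b3 = rng.normal(scale=0.1, size=(8, 1)), np.zeros(1)

# Hypothetical batch: 4 patients, 13 standardized features each
x_demo = rng.normal(size=(4, 13))

h1 = relu(x_demo @ W1 + b1)     # first hidden layer
h2 = relu(h1 @ W2 + b2)         # second hidden layer
prob = sigmoid(h2 @ W3 + b3)    # output layer: one probability per patient

print(h1.shape, h2.shape, prob.shape)
```

Each Dense layer is a matrix multiplication plus a bias, followed by its activation function.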

## Step 7: Compile the model

In [None]:
# Compile the model with loss function, optimizer, and metrics
model.compile(
    # Loss function: measures prediction error for binary classification
    loss="binary_crossentropy",
    
    # Optimizer: algorithm that updates model weights to minimize loss
    optimizer="adam",
    
    # Metrics: tracks accuracy (percentage of correct predictions)
    metrics=["accuracy"]
)

print("Model compiled successfully!")
print("\nConfiguration:")
print("- Loss function: binary_crossentropy (measures prediction errors)")
print("- Optimizer: adam (adapts learning rate for efficient training)")
print("- Metrics: accuracy (tracks percentage of correct predictions)")
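
Binary cross-entropy can be computed by hand to see why it suits this task. The sketch below is a NumPy illustration with hypothetical labels and probabilities; the clipping step is a common numerical safeguard and the function name is chosen for this example only.

```python
import numpy as np

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    # Clip probabilities so that log(0) never occurs
    y_prob = np.clip(y_prob, eps, 1 - eps)
    # Average negative log-likelihood over all samples
    losses = -(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))
    return float(np.mean(losses))

# Hypothetical true labels and two sets of predicted probabilities
y_true_demo = np.array([1, 0, 1, 0])
confident_right = np.array([0.9, 0.1, 0.8, 0.2])
confident_wrong = np.array([0.1, 0.9, 0.2, 0.8])

loss_right = binary_crossentropy(y_true_demo, confident_right)
loss_wrong = binary_crossentropy(y_true_demo, confident_wrong)

print(f"Loss when predictions agree with labels: {loss_right:.4f}")
print(f"Loss when predictions contradict labels: {loss_wrong:.4f}")
```

Confident wrong predictions are penalized far more heavily than confident correct ones, which is the signal the optimizer uses to adjust the weights.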

In [None]:
# Display detailed model architecture summary
print("Model Architecture Summary:")
print("="*60)
model.summary()

print("\nThe summary shows:")
print("- Layer types and output shapes")
print("- Number of trainable parameters (weights) in each layer")
print("- Total parameters that will be learned during training")
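
The parameter counts in the summary can be checked with simple arithmetic: each Dense layer has one weight per input-output pair plus one bias per neuron. The following sketch does the calculation for the Step 6 layer sizes.

```python
# Layer widths for the Step 6 network: 13 inputs -> 16 -> 8 -> 1
layer_sizes = [13, 16, 8, 1]

# For each Dense layer: (inputs * outputs) weights plus one bias per output neuron
params_per_layer = [
    n_in * n_out + n_out
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
total_params = sum(params_per_layer)

print(f"Parameters per layer: {params_per_layer}")  # [224, 136, 9]
print(f"Total parameters: {total_params}")          # 369
```

These values should match the "Param #" column that `model.summary()` prints for the three Dense layers.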

## Step 8: Train the model

In [None]:
# Train the neural network on the training data
print("Starting model training...")
print("This may take a minute or two...")

# Train the model and store training history
history = model.fit(
    # Training features and labels
    X_train, 
    y_train,
    
    # Validation data: the test set is reused here to monitor held-out performance each epoch
    # (it is never used to update the weights)
    validation_data=(X_test, y_test),
    
    # Number of complete passes through the training data
    epochs=50,
    
    # Number of samples processed before updating weights
    batch_size=32,
    
    # Show progress bar during training
    verbose=1
)

print("\nTraining completed!")
print(f"Final training accuracy: {history.history['accuracy'][-1]:.4f}")
print(f"Final validation accuracy: {history.history['val_accuracy'][-1]:.4f}")
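
The `epochs` and `batch_size` settings together determine how many weight updates the optimizer performs. The sketch below works this out, assuming the training split contains 242 of the 303 records; the split sizes printed in Step 4 are the authoritative numbers for your run.

```python
import math

n_train = 242     # assumption: about 80% of 303 records after the Step 4 split
batch_size = 32
epochs = 50

# The last batch of each epoch may be smaller than batch_size
steps_per_epoch = math.ceil(n_train / batch_size)
last_batch_size = n_train - (steps_per_epoch - 1) * batch_size
total_updates = steps_per_epoch * epochs

print(f"Weight updates per epoch: {steps_per_epoch}")
print(f"Samples in the final batch of each epoch: {last_batch_size}")
print(f"Total weight updates over training: {total_updates}")
```

Under this assumption, the progress bar in the training output shows 8 steps per epoch.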

## Step 9: Make predictions on test data

After training, you use the model to make predictions on the test set. The neural network outputs probabilities between 0 and 1 for each patient. To convert these probabilities into binary predictions (0 = no disease, 1 = disease present), you use a threshold of 0.5:
- If probability > 0.5, predict 1 (disease present)
- If probability ≤ 0.5, predict 0 (no disease)

This allows you to compare the model's predictions with the actual diagnoses and evaluate its performance.
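
The thresholding rule can be tried on a few values before running it on the model's output. In the sketch below, `probs_demo` holds hypothetical probabilities with the same `(n_samples, 1)` shape that this network's predictions have, and the second threshold of 0.3 is included only to illustrate the effect of changing the cutoff.

```python
import numpy as np

# Hypothetical sigmoid outputs, one row per patient
probs_demo = np.array([[0.92], [0.50], [0.31], [0.67]])

# Default rule: strictly greater than 0.5 means "disease present"
labels_default = (probs_demo > 0.5).astype(int).flatten()

# A lower cutoff flags more patients as positive
labels_lower = (probs_demo > 0.3).astype(int).flatten()

print(f"Threshold 0.5: {labels_default}")
print(f"Threshold 0.3: {labels_lower}")
```

A probability of exactly 0.5 maps to class 0 under the default rule. Lowering the threshold trades more false alarms for fewer missed diagnoses.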

In [None]:
# Generate predictions on the test set
# The model outputs probabilities (values between 0 and 1) for each patient
y_pred_prob = model.predict(X_test)

# Convert probabilities to binary predictions using 0.5 threshold
# If probability > 0.5, predict 1 (disease present); otherwise predict 0 (no disease)
y_pred = (y_pred_prob > 0.5).astype(int)

print("Predictions generated!")
print(f"Number of test samples: {y_pred.shape[0]}")
print(f"\nFirst 10 predictions: {y_pred.flatten()[:10]}")
print(f"First 10 actual values: {y_test.values[:10]}")
print("\nCompare to see how well the model predicted!")

## Step 10: Print classification report

In [None]:
# Generate and display the classification report
print("Classification Report:")
print("="*60)
print(classification_report(y_test, y_pred))

print("\nMetric Definitions:")
print("- Precision: Percentage of positive predictions that were actually correct")
print("- Recall: Percentage of actual positive cases that were correctly identified")
print("- F1-score: Balanced measure combining precision and recall")
print("- Support: Number of actual samples in each class")
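
The metrics in the report can be reproduced by hand for the positive class. The sketch below uses NumPy with hypothetical true and predicted labels, not the lab's results.

```python
import numpy as np

# Hypothetical labels for 10 patients (1 = disease, 0 = no disease)
y_true_demo = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
y_pred_demo = np.array([1, 1, 1, 0, 1, 1, 0, 0, 0, 0])

tp = int(np.sum((y_true_demo == 1) & (y_pred_demo == 1)))  # correctly flagged
fp = int(np.sum((y_true_demo == 0) & (y_pred_demo == 1)))  # false alarms
fn = int(np.sum((y_true_demo == 1) & (y_pred_demo == 0)))  # missed cases

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"Precision: {precision:.2f}")
print(f"Recall:    {recall:.2f}")
print(f"F1-score:  {f1:.2f}")
```

Here precision is 0.60 and recall is 0.75, so the F1-score (their harmonic mean) falls between the two at about 0.67.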

## Step 11: Print confusion matrix

In [None]:
# Generate and display the confusion matrix
cm = confusion_matrix(y_test, y_pred)

print("Confusion Matrix:")
print("="*60)
print(cm)
print("\nMatrix Layout:")
print("[[TN  FP]")
print(" [FN  TP]]")

print("\nDetailed Breakdown:")
print(f"True Negatives (TN): {cm[0, 0]} - Correctly predicted no disease")
print(f"False Positives (FP): {cm[0, 1]} - Incorrectly predicted disease (false alarm)")
print(f"False Negatives (FN): {cm[1, 0]} - Incorrectly predicted no disease (missed diagnosis)")
print(f"True Positives (TP): {cm[1, 1]} - Correctly predicted disease")

# Calculate overall accuracy from confusion matrix
accuracy = (cm[0, 0] + cm[1, 1]) / cm.sum()
print(f"\nOverall Accuracy: {accuracy:.4f} ({accuracy*100:.2f}%)")
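
To see how the matrix layout arises, the sketch below builds a confusion matrix by hand with NumPy, using rows for actual classes and columns for predicted classes. The labels are hypothetical, and sensitivity and specificity are shown as two clinically common summaries of the same four counts.

```python
import numpy as np

# Hypothetical labels for 8 patients (1 = disease, 0 = no disease)
y_true_demo = np.array([0, 0, 0, 1, 1, 1, 1, 0])
y_pred_demo = np.array([0, 1, 0, 1, 0, 1, 1, 0])

# Rows index the actual class, columns index the predicted class
cm_demo = np.zeros((2, 2), dtype=int)
for actual, predicted in zip(y_true_demo, y_pred_demo):
    cm_demo[actual, predicted] += 1

# Flattening row by row gives TN, FP, FN, TP
tn, fp, fn, tp = cm_demo.ravel()

sensitivity = tp / (tp + fn)  # share of diseased patients correctly identified
specificity = tn / (tn + fp)  # share of healthy patients correctly identified

print(cm_demo)
print(f"Sensitivity: {sensitivity:.2f}")
print(f"Specificity: {specificity:.2f}")
```

Sensitivity is the same quantity as recall for the positive class; specificity is the corresponding rate for the negative class.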

## Step 12: Plot accuracy over epochs

In [None]:
# Plot training and validation accuracy over epochs
plt.figure(figsize=(10, 5))

# Plot training accuracy
plt.plot(history.history["accuracy"], label="Training Accuracy", linewidth=2)

# Plot validation accuracy
plt.plot(history.history["val_accuracy"], label="Validation Accuracy", linewidth=2)

plt.title("Model Accuracy Over Epochs", fontsize=14, fontweight='bold')
plt.xlabel("Epoch", fontsize=12)
plt.ylabel("Accuracy", fontsize=12)
plt.legend(loc='lower right', fontsize=10)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

print("The accuracy plot shows how the model's performance improved during training.")
print("Both training and validation accuracy should increase over time.")

## Step 13: Plot training and validation loss over epochs

In [None]:
# Plot training and validation loss over epochs
plt.figure(figsize=(10, 5))

# Plot training loss
plt.plot(history.history["loss"], label="Training Loss", linewidth=2)

# Plot validation loss
plt.plot(history.history["val_loss"], label="Validation Loss", linewidth=2)

plt.title("Model Loss Over Epochs", fontsize=14, fontweight='bold')
plt.xlabel("Epoch", fontsize=12)
plt.ylabel("Loss", fontsize=12)
plt.legend(loc='upper right', fontsize=10)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

print("The loss plot shows how prediction error decreased during training.")
print("Both training and validation loss should decrease and stabilize.")

In [None]:
# Create side-by-side plots for accuracy and loss
fig, axes = plt.subplots(1, 2, figsize=(15, 5))

# Left plot: Accuracy
axes[0].plot(history.history["accuracy"], label="Training Accuracy", linewidth=2)
axes[0].plot(history.history["val_accuracy"], label="Validation Accuracy", linewidth=2)
axes[0].set_title("Model Accuracy", fontsize=14, fontweight='bold')
axes[0].set_xlabel("Epoch", fontsize=12)
axes[0].set_ylabel("Accuracy", fontsize=12)
axes[0].legend(loc='lower right')
axes[0].grid(True, alpha=0.3)

# Right plot: Loss
axes[1].plot(history.history["loss"], label="Training Loss", linewidth=2)
axes[1].plot(history.history["val_loss"], label="Validation Loss", linewidth=2)
axes[1].set_title("Model Loss", fontsize=14, fontweight='bold')
axes[1].set_xlabel("Epoch", fontsize=12)
axes[1].set_ylabel("Loss", fontsize=12)
axes[1].legend(loc='upper right')
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print("Combined visualization shows both accuracy improvement and loss reduction over training.")
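
Learning curves can also be inspected numerically. The sketch below finds the epoch with the lowest validation loss in a hypothetical list of values; in your own run, you could pass `history.history["val_loss"]` instead.

```python
import numpy as np

# Hypothetical validation losses, one per epoch (not results from this lab)
val_loss_demo = [0.69, 0.55, 0.46, 0.41, 0.39, 0.40, 0.43, 0.47]

# np.argmin returns a zero-based index, so add 1 for a human-readable epoch number
best_epoch = int(np.argmin(val_loss_demo))
best_loss = val_loss_demo[best_epoch]

print(f"Lowest validation loss {best_loss:.2f} at epoch {best_epoch + 1}")
```

If validation loss rises after its minimum while training loss keeps falling, the model is likely starting to overfit the training data.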

## Step 14: Evaluate the model on test data

In [None]:
# Evaluate the final model performance on the test set
test_loss, test_accuracy = model.evaluate(X_test, y_test, verbose=0)

print("Final Model Performance on Test Set:")
print("="*60)
print(f"Test Loss: {test_loss:.4f}")
print(f"Test Accuracy: {test_accuracy:.4f} ({test_accuracy*100:.2f}%)")
print("\nThis accuracy is measured on data that was never used to update the model's weights.")
print("Because the test set is small, treat it as a rough estimate of real-world performance.")

# Exercise

In Step 6, you built a neural network. Below is an alternative, deeper architecture with more layers and dropout regularization. Replace the Step 6 network with this one, rerun Steps 7 through 14 (compile, train, and evaluate), and observe the change in accuracy.

## Use the deeper network architecture variation

In [None]:
# Build a deeper neural network with dropout regularization
model = keras.Sequential([
    # Input layer: accepts 13 features
    layers.Input(shape=(13,)),
    
    # First hidden layer: 32 neurons with ReLU activation
    layers.Dense(32, activation="relu"),
    layers.Dropout(0.3),  # Drop 30% of neurons during training to prevent overfitting
    
    # Second hidden layer: 16 neurons with ReLU activation
    layers.Dense(16, activation="relu"),
    layers.Dropout(0.3),  # Additional dropout for regularization
    
    # Third hidden layer: 8 neurons with ReLU activation
    layers.Dense(8, activation="relu"),
    
    # Output layer: 1 neuron with sigmoid activation
    layers.Dense(1, activation="sigmoid")
])
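
To build intuition for what a dropout layer does during training, here is a minimal NumPy sketch of inverted dropout with a rate of 0.3. It illustrates the technique under simplified assumptions (all activations set to 1) and is not Keras's internal code.

```python
import numpy as np

rng = np.random.default_rng(0)
rate = 0.3

# Hypothetical activations: 1000 samples from a 32-neuron layer, all equal to 1
activations = np.ones((1000, 32))

# Training: zero each unit with probability `rate`,
# then scale the survivors by 1 / (1 - rate) to keep the expected value unchanged
keep_mask = rng.random(activations.shape) >= rate
dropped = activations * keep_mask / (1.0 - rate)

fraction_zeroed = 1.0 - keep_mask.mean()
mean_after = dropped.mean()

print(f"Fraction of units zeroed: {fraction_zeroed:.3f}")
print(f"Mean activation after dropout: {mean_after:.3f}")
```

About 30% of the units are zeroed, while the average activation stays close to 1. Dropout applies only during training; during prediction and evaluation, all neurons are used.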


# Congratulations!

You have successfully completed this lab on building dense neural networks for heart disease prediction! Throughout this hands-on exercise, you gained practical experience with the complete machine learning workflow, from data preprocessing through model evaluation. You now understand how to design neural network architectures, train them using backpropagation, and interpret their performance using various evaluation metrics.

The skills you developed in this lab form the foundation for more advanced deep learning applications in healthcare, finance, and many other domains. You've learned not just how to build neural networks, but also how to think critically about model performance, interpret results, and make data-driven decisions.


## Authors

Ramesh Sannareddy

Copyright © 2025 SkillUp. All rights reserved.