# 💓 Heart Disease Insights via Association Rule Mining

This notebook uses association rule mining, specifically the Apriori algorithm, to find combinations of attributes that frequently co-occur in a heart disease dataset.

In [None]:
# 📦 Step 1: Load Required Libraries and Dataset
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from mlxtend.frequent_patterns import apriori, association_rules

# Mount Google Drive to access dataset
from google.colab import drive
drive.mount('/content/drive')

# Load the CSV file containing heart disease records
data_path = '/content/drive/MyDrive/Data Mining/project/heart_disease.csv'
heart_data = pd.read_csv(data_path)
heart_data.head()

In [None]:
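# 🧾 Step 2: Transform Categorical Data into Binary Features
# Apriori accepts only boolean (or 0/1) columns. By default get_dummies leaves
# numeric columns unchanged, so every column is listed explicitly to one-hot
# encode each distinct value. Continuous columns are best binned beforehand.
encoded_heart_data = pd.get_dummies(heart_data, columns=list(heart_data.columns)).astype(bool)
encoded_heart_data.head()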
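
One-hot encoding a continuous column value by value produces many rare items that seldom reach the support threshold. A minimal sketch of binning before encoding follows. The toy records, the column names `age` and `sex`, and the cut points are all assumptions made for illustration, not the real dataset schema.

```python
import pandas as pd

# Hypothetical toy records; column names are assumptions, not the real schema.
toy = pd.DataFrame({
    "age": [34, 51, 67, 45, 72],
    "sex": ["M", "F", "M", "F", "M"],
})

# Bin the continuous column into labelled ranges (illustrative cut points only).
toy["age"] = pd.cut(
    toy["age"],
    bins=[0, 40, 60, 120],
    labels=["under_40", "40_to_60", "over_60"],
)

# One-hot encode every column and cast to bool so each column is a valid item.
toy_items = pd.get_dummies(toy, columns=list(toy.columns)).astype(bool)
print(toy_items.columns.tolist())
```

Each resulting column, such as `age_over_60`, is then a single item that Apriori can count across records.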

In [None]:
# 🔍 Step 3: Identify Frequent Attribute Sets
# Keep itemsets that appear in at least 30% of the records (min_support=0.3)
frequent_sets = apriori(encoded_heart_data, min_support=0.3, use_colnames=True)
frequent_sets.head()
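
To make the support threshold concrete, here is a small pure-Python sketch that computes support by brute force. The transactions and item names are hypothetical, and the exhaustive enumeration only illustrates the definition; it is not how mlxtend implements Apriori.

```python
from itertools import combinations

# Hypothetical transactions: each set holds the items that are true for one record.
transactions = [
    {"chest_pain", "high_bp"},
    {"chest_pain", "high_bp", "smoker"},
    {"high_bp"},
    {"chest_pain", "smoker"},
]


def support(itemset, records):
    """Fraction of records that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= record for record in records) / len(records)


min_support = 0.5
items = sorted(set().union(*transactions))

# Brute-force enumeration of 1- and 2-item sets. Apriori prunes this search by
# only extending itemsets that already meet the threshold.
frequent = {
    candidate: support(candidate, transactions)
    for size in (1, 2)
    for candidate in combinations(items, size)
    if support(candidate, transactions) >= min_support
}
print(frequent)
```

An itemset is kept only when the share of records containing all of its items reaches `min_support`, which is the same test the 0.3 threshold applies above.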

In [None]:
# 🔗 Step 4: Extract Strong Association Rules
# Keep only rules whose confidence is at least 0.7 (70%)
discovered_rules = association_rules(frequent_sets, metric="confidence", min_threshold=0.7)
discovered_rules.head()
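
The three metrics reported for each rule can be derived from supports alone. The sketch below works through one rule on hypothetical transactions; the helper names `support` and `rule_metrics` are illustrative and not part of mlxtend.

```python
# Hypothetical transactions: each set holds the items that are true for one record.
transactions = [
    {"chest_pain", "high_bp"},
    {"chest_pain", "high_bp", "smoker"},
    {"high_bp"},
    {"chest_pain", "smoker"},
]


def support(itemset, records):
    """Fraction of records that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= record for record in records) / len(records)


def rule_metrics(antecedent, consequent, records):
    """Support, confidence and lift for the rule antecedent -> consequent.

    Assumes the antecedent and the consequent each occur in at least one record.
    """
    s_a = support(antecedent, records)
    s_c = support(consequent, records)
    s_ac = support(set(antecedent) | set(consequent), records)
    confidence = s_ac / s_a   # P(consequent | antecedent)
    lift = confidence / s_c   # ratio to the consequent's baseline frequency
    return {"support": s_ac, "confidence": confidence, "lift": lift}


metrics = rule_metrics({"smoker"}, {"chest_pain"}, transactions)
print(metrics)
```

A lift above 1 means the consequent appears more often alongside the antecedent than it does across all records. A lift near 1 means the two are close to independent, even when confidence is high.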

In [None]:
# 🏆 Step 5: Present the Top 10 Rules, Ranked by Confidence and Then Lift
top_10 = discovered_rules.sort_values(by=["confidence", "lift"], ascending=False).head(10)
top_10[['antecedents', 'consequents', 'support', 'confidence', 'lift']]
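
When the goal is to study rules that point at one outcome, the rules table can be filtered on its `consequents` column. This is a sketch on a hand-built table that mimics the columns `association_rules` returns. The item name `disease_yes` and all the numbers are hypothetical.

```python
import pandas as pd

# Hypothetical rules table with the same column layout as association_rules output.
rules = pd.DataFrame({
    "antecedents": [frozenset({"smoker"}), frozenset({"high_bp"}), frozenset({"smoker"})],
    "consequents": [frozenset({"disease_yes"}), frozenset({"disease_yes"}), frozenset({"high_bp"})],
    "confidence": [0.90, 0.75, 0.80],
    "lift": [1.40, 0.95, 1.10],
})

target_item = "disease_yes"  # assumed name of the one-hot outcome column

# Keep rules that predict the target item and beat its baseline frequency (lift > 1).
predicts_target = rules["consequents"].apply(lambda items: target_item in items)
target_rules = rules[predicts_target & (rules["lift"] > 1)]
print(target_rules)
```

Filtering on lift as well as the consequent drops rules that look confident only because the outcome is common in the data.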

In [None]:
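# 📌 Step 6: Interpret a Key Rule from the Result
# Take the top-ranked rule; antecedents and consequents are frozensets,
# so join their items into readable strings before printing
highlighted_rule = top_10.iloc[0]
antecedent_items = ", ".join(sorted(highlighted_rule['antecedents']))
consequent_items = ", ".join(sorted(highlighted_rule['consequents']))
print("Rule:", antecedent_items, "→", consequent_items)
print("Support:", round(highlighted_rule['support'], 2))
print("Confidence:", round(highlighted_rule['confidence'], 2))
print("Lift:", round(highlighted_rule['lift'], 2))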

### 🧠 Summary:
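Association rule mining with Apriori surfaces combinations of health attributes that frequently co-occur in the heart disease records. Rules with high confidence and a lift above 1 are candidate patterns worth closer examination. They describe co-occurrence in this dataset, not causation, so they can inform medical decisions and proactive diagnosis planning but should not drive them on their own.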