To check whether the **Logistic Regression model** is **overfitting or underfitting**, we can compare its **performance on training and test data** using key metrics like:

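1. **Accuracy Score** – If training accuracy is much higher than test accuracy, the model is likely overfitting.
2. **Precision, Recall, F1-score** – Show how errors are distributed across classes, which accuracy alone can hide when the classes are imbalanced.
3. **ROC-AUC Score** – Measures how well the model ranks positive cases above negative ones, independent of the decision threshold.
4. **Learning Curves** – Plot training and cross-validation scores against training-set size to visualize overfitting or underfitting.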

---

### **🔹 Python Code to Detect Overfitting/Underfitting**
```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import accuracy_score, classification_report, roc_auc_score
from sklearn.model_selection import learning_curve

# Assumes `model` is already fitted and that X_train, X_test, y_train, y_test,
# X_selected, and y were defined in earlier cells.

# 1. Training vs. test accuracy
y_train_pred = model.predict(X_train)
y_pred = model.predict(X_test)
train_acc = accuracy_score(y_train, y_train_pred)
test_acc = accuracy_score(y_test, y_pred)

print(f"Training Accuracy: {train_acc:.2f}")
print(f"Testing Accuracy: {test_acc:.2f}")

# 2. Classification report (precision, recall, F1-score) on the test set
print("Classification Report (Test Set):")
print(classification_report(y_test, y_pred))

# 3. ROC-AUC from predicted probabilities of the positive class (binary target)
roc_auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"ROC-AUC Score: {roc_auc:.2f}")

# 4. Learning curve: refits clones of `model` on growing subsets with 5-fold CV
train_sizes, train_scores, test_scores = learning_curve(model, X_selected, y, cv=5, scoring="accuracy")

# Calculate mean and std deviation of training and test scores
train_mean = np.mean(train_scores, axis=1)
train_std = np.std(train_scores, axis=1)
test_mean = np.mean(test_scores, axis=1)
test_std = np.std(test_scores, axis=1)

# Plot Learning Curve
plt.figure(figsize=(8, 5))
plt.plot(train_sizes, train_mean, 'o-', label="Training Score", color="blue")
plt.fill_between(train_sizes, train_mean - train_std, train_mean + train_std, alpha=0.1, color="blue")
plt.plot(train_sizes, test_mean, 'o-', label="Cross-validation Score", color="red")
plt.fill_between(train_sizes, test_mean - test_std, test_mean + test_std, alpha=0.1, color="red")
plt.xlabel("Training Examples")
plt.ylabel("Score")
plt.title("Learning Curve")
plt.legend()
plt.show()
```

---

### **🔹 How to Interpret the Results?**
✅ **If Training Accuracy is much higher than Test Accuracy** → **Overfitting**  
✅ **If Training and Test Accuracy are both low** → **Underfitting**  
✅ **If Training and Test Accuracy are close and high for the problem (e.g., around 80% or more here)** → **Good Fit**  

- **Learning Curve:**
  - If **training score is high, but validation score is low** → **Overfitting**.
  - If **both training and validation scores are low** → **Underfitting**.
  - If **both curves converge at a high value** → **Good Fit**.
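The rules above can be sketched as a small helper. This is only an illustration: the function name `diagnose_fit` and the thresholds `max_gap=0.05` and `min_score=0.70` are assumptions chosen for the example, not standard values, and they should be adapted to the dataset and metric.

```python
def diagnose_fit(train_score, test_score, max_gap=0.05, min_score=0.70):
    """Return a rough label from a pair of scores in [0, 1].

    max_gap and min_score are illustrative thresholds (assumptions),
    not universal cut-offs.
    """
    gap = train_score - test_score
    if gap > max_gap:
        # Training score clearly above test score
        return "overfitting"
    if train_score < min_score and test_score < min_score:
        # Both scores low: the model is too simple for the data
        return "underfitting"
    return "good fit"


for train, test in [(0.98, 0.75), (0.62, 0.60), (0.86, 0.84)]:
    print(f"train={train:.2f}, test={test:.2f} -> {diagnose_fit(train, test)}")
```

In the notebook it could be called as `diagnose_fit(train_acc, test_acc)` once both accuracies have been computed.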

---

### **🔹 How to Fix Overfitting or Underfitting?**
**🔸 If Overfitting:**  
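- **Reduce model complexity** (e.g., stronger regularization, which means a smaller `C` in scikit-learn's `LogisticRegression`, or feature selection)  
- **Get more training data**  
- **Use dropout** (deep learning models only; it does not apply to logistic regression)  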
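As a rough sketch of the regularization option, the snippet below fits `LogisticRegression` with three values of `C` and compares training and test accuracy. The synthetic data from `make_classification` and the names `X_demo`, `y_demo`, and `results` are stand-ins assumed for the example; in the notebook, the existing `X_train`, `X_test`, `y_train`, and `y_test` would take their place.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (an assumption; replace with your own features and labels)
X_demo, y_demo = make_classification(
    n_samples=300, n_features=20, n_informative=5, random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=0, stratify=y_demo
)

results = {}
for C in (100.0, 1.0, 0.01):  # smaller C means stronger L2 regularization
    clf = LogisticRegression(C=C, max_iter=1000)
    clf.fit(X_tr, y_tr)
    results[C] = {
        "train": clf.score(X_tr, y_tr),
        "test": clf.score(X_te, y_te),
        "coef_norm": float(np.linalg.norm(clf.coef_)),
    }
    print(
        f"C={C:<6} train={results[C]['train']:.2f} "
        f"test={results[C]['test']:.2f} |coef|={results[C]['coef_norm']:.2f}"
    )
```

A shrinking gap between the training and test scores as `C` decreases suggests that regularization is helping; if both scores drop together, the model is probably being pushed toward underfitting.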

**🔸 If Underfitting:**  
- **Use a more complex model**  
- **Feature Engineering** (e.g., polynomial or interaction features)  
- **Train longer or reduce regularization** (raise `max_iter` if the solver has not converged, or use a larger `C`)  
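To illustrate the feature-engineering option, here is a minimal sketch that compares a plain logistic regression with one that uses degree-2 polynomial features. The `make_circles` toy data and the names `X_toy`, `y_toy`, `linear_acc`, and `poly_acc` are assumptions picked to make the effect visible; results on real data such as the heart disease features may differ.

```python
from sklearn.datasets import make_circles
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Toy data with a circular class boundary that a linear model cannot capture
X_toy, y_toy = make_circles(n_samples=400, noise=0.1, factor=0.4, random_state=0)

# Baseline: logistic regression on the raw features
linear_model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Same classifier, but with squared and interaction terms added first
poly_model = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    StandardScaler(),
    LogisticRegression(max_iter=1000),
)

linear_acc = cross_val_score(linear_model, X_toy, y_toy, cv=5, scoring="accuracy").mean()
poly_acc = cross_val_score(poly_model, X_toy, y_toy, cv=5, scoring="accuracy").mean()

print(f"Linear features:     {linear_acc:.2f}")
print(f"Polynomial features: {poly_acc:.2f}")
```

Adding features increases model capacity, so it is worth re-checking the training and test gap afterwards to make sure the fix for underfitting has not introduced overfitting.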

---

### **✅ Final Thoughts**
This method helps determine if your **Heart Disease Prediction Model** is overfitting or underfitting. Let me know if you need improvements!
