### Key Design Choices:
1. **Output Layer Activation**: Used sigmoid because:
   - It's a binary classification problem (has_disease: 0 or 1)
   - Sigmoid outputs probabilities between 0 and 1

2. **Handling Class Imbalance**:
   - Computed class weights so that errors on the minority class contribute more to the loss
   - Used AUC (Area Under the ROC Curve) as a metric, which is more informative than accuracy on imbalanced data
   - Considered oversampling/undersampling techniques (not shown in the code)

3. **Model Architecture**:
   - Two hidden layers with ReLU activation
   - Dropout to reduce overfitting
   - Batch normalization could also be added to stabilize training
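
To make the first design choice concrete, here is a minimal pure-Python sketch of how a sigmoid output turns a raw score (logit) into a probability, and how binary cross-entropy scores that probability. The helper names `sigmoid` and `binary_cross_entropy` and the example logits are illustrative assumptions, not part of the notebook or of Keras.

```python
import math

def sigmoid(z: float) -> float:
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(y_true: int, p: float, eps: float = 1e-7) -> float:
    """Loss for one sample; p is the predicted probability of class 1."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))

# Hypothetical logits, only for illustration
for logit in (-2.0, 0.0, 2.0):
    p = sigmoid(logit)
    label = int(p > 0.5)  # same 0.5 threshold the notebook uses at evaluation
    loss = binary_cross_entropy(1, p)
    print(f"logit={logit:+.1f}  p={p:.3f}  predicted has_disease={label}  loss if true label is 1: {loss:.3f}")
```

The loss shrinks as the predicted probability moves toward the true label, which is why a sigmoid output pairs naturally with `binary_crossentropy`.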



In [32]:
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.utils import class_weight
from sklearn.metrics import classification_report, confusion_matrix, roc_auc_score
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, BatchNormalization
from tensorflow.keras.callbacks import EarlyStopping
import numpy as np  # needed for np.unique when computing class weights
import pandas as pd



In [33]:
# Load your dataset
df = pd.read_csv("heart_disease.csv")

# Split features and target
X = df.drop("has_disease", axis=1)
y = df["has_disease"]



In [34]:
# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)



In [35]:
# Feature scaling
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
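
The scaler is fitted on the training split only and then reused on the test split, so no test-set statistics leak into training. A rough pure-Python sketch of that idea for a single feature follows. The values and the helper names `fit_standardizer` and `standardize` are made up for illustration; `StandardScaler` applies the same idea to each column.

```python
import statistics

def fit_standardizer(values):
    """Learn mean and standard deviation from training values only."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population std (ddof=0)
    if std == 0:
        std = 1.0  # avoid dividing by zero for constant features
    return mean, std

def standardize(values, mean, std):
    """Apply previously learned statistics to any split."""
    return [(v - mean) / std for v in values]

train_age = [40.0, 50.0, 60.0]  # hypothetical training values
test_age = [70.0]               # hypothetical test value

mean, std = fit_standardizer(train_age)
train_scaled = standardize(train_age, mean, std)
test_scaled = standardize(test_age, mean, std)  # reuses the training statistics
print(train_scaled)
print(test_scaled)
```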



In [36]:
# Handle class imbalance
weights = class_weight.compute_class_weight(class_weight='balanced', classes=np.unique(y_train), y=y_train)
class_weights = {0: weights[0], 1: weights[1]}

# Build improved ANN model
model = Sequential([
    Dense(64, activation='relu', input_shape=(X_train.shape[1],)),
    BatchNormalization(),
    Dropout(0.3),
    Dense(32, activation='relu'),
    BatchNormalization(),
    Dropout(0.3),
    Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])


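
For reference, the `'balanced'` heuristic assigns each class the weight `n_samples / (n_classes * count_of_that_class)`, so the rarer class gets the larger weight. The sketch below reproduces that formula in plain Python on made-up labels. `balanced_class_weights` is an illustrative helper, not a scikit-learn function.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in sorted(counts.items())}

# Hypothetical 80/20 imbalance, only for illustration
y_example = [0] * 80 + [1] * 20
weights_example = balanced_class_weights(y_example)
print(weights_example)
```

With an 80/20 split the minority class is weighted four times as heavily as the majority class, which is the effect `class_weight=class_weights` has on the loss during training.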


In [37]:

# Train with early stopping
early_stop = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)

history = model.fit(
    X_train, y_train,
    validation_split=0.1,
    epochs=100,
    batch_size=16,
    class_weight=class_weights,
    callbacks=[early_stop],
    verbose=1
)

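# Evaluate model
y_prob = model.predict(X_test).ravel()  # predicted probability of has_disease = 1
y_pred = (y_prob > 0.5).astype(int)     # hard labels at a 0.5 threshold
print("Classification Report:\n", classification_report(y_test, y_pred))
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred))
# ROC AUC is a ranking metric: score it on probabilities, not thresholded labels
print("ROC AUC Score:", roc_auc_score(y_test, y_prob))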
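
ROC AUC measures how well the scores rank positives above negatives, so it is normally computed from predicted probabilities. Thresholded 0/1 labels discard most of that ranking. The pure-Python sketch below illustrates the difference on made-up data. `roc_auc` is an illustrative helper based on pairwise comparisons, not the scikit-learn implementation.

```python
def roc_auc(y_true, scores):
    """AUC as the chance that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

y_true = [0, 0, 0, 1, 1]
y_prob = [0.10, 0.30, 0.45, 0.40, 0.90]  # hypothetical predicted probabilities
y_hard = [int(p > 0.5) for p in y_prob]  # thresholded labels

auc_from_probs = roc_auc(y_true, y_prob)
auc_from_labels = roc_auc(y_true, y_hard)
print("AUC from probabilities:", auc_from_probs)
print("AUC from hard labels:  ", auc_from_labels)
```

On this toy data the two values differ (roughly 0.83 versus 0.75), because thresholding turns many distinct scores into ties.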

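
The `EarlyStopping` callback used for training stops once `val_loss` has failed to improve for `patience` consecutive epochs, and `restore_best_weights=True` rolls the model back to the best epoch. A simplified sketch of that patience logic, assuming no minimum improvement threshold, is shown below. The function name `early_stopping_epoch` and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience):
    """Return (stop_epoch, best_epoch), both 1-indexed, for a list of validation losses."""
    best_loss = float("inf")
    best_epoch = 0
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            # New best: remember it and reset the patience counter
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch
    # Never ran out of patience: training ends at the last epoch
    return len(val_losses), best_epoch

val_losses = [0.34, 0.21, 0.14, 0.15, 0.16, 0.17]  # made-up values
stop_epoch, best_epoch = early_stopping_epoch(val_losses, patience=3)
print(f"stopped after epoch {stop_epoch}, best weights from epoch {best_epoch}")
```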


Epoch 1/100
45/45 ━━━━━━━━━━━━━━━━━━━━ 2s 8ms/step - accuracy: 0.7230 - loss: 0.5469 - val_accuracy: 0.9375 - val_loss: 0.3369
Epoch 2/100
45/45 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9174 - loss: 0.2204 - val_accuracy: 0.9750 - val_loss: 0.2095
Epoch 3/100
45/45 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.9492 - loss: 0.1606 - val_accuracy: 0.9875 - val_loss: 0.1426
Epoch 4/100
45/45 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9693 - loss: 0.1074 - val_accuracy: 0.9875 - val_loss: 0.1057
Epoch 5/100
45/45 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.9572 - loss: 0.1190 - val_accuracy: 0.9875 - val_loss: 0.0867
Epoch 6/100
45/45 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.9734 - loss: 0.0852 - val_accuracy: 0.9875 - val_loss: 0.0750
Epoch 7/100
45/45 ━━━