#### Predicting Heart Disease Using Neural Networks

**About the Dataset**

The dataset contains the following features:

- age: age in years
- sex: (1 = male; 0 = female)
- cp: chest pain type
- trestbps: resting blood pressure (in mm Hg on admission to the hospital)
- chol: serum cholesterol in mg/dl
- fbs: (fasting blood sugar > 120 mg/dl) (1 = true; 0 = false)
- restecg: resting electrocardiographic results
- thalach: maximum heart rate achieved
- exang: exercise induced angina (1 = yes; 0 = no)
- oldpeak: ST depression induced by exercise relative to rest
- slope: the slope of the peak exercise ST segment
- ca: number of major vessels (0-3) colored by fluoroscopy
- thal: 3 = normal; 6 = fixed defect; 7 = reversible defect
- target: 1 or 0


**Expected Outcomes of the Project**

1. Statistical analysis of the data 
2. Create Training and Testing Datasets
3. Building and Training the Neural Network
4. Improving Results - A Binary Classification Problem
5. Results and Metrics


#### To download the dataset, <a href="https://drive.google.com/file/d/1R5SjStkUsgTgyoAjC_14v13siYh8AAF3/view?usp=sharing" title="Google Drive">click here</a>.

In [3]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, classification_report
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Step 1: Load and preprocess the data
data_path = "C:\\Users\\manoj\\Downloads\\heart.csv"
heart_data = pd.read_csv(data_path)

# Statistical analysis
print(heart_data.head())
print(heart_data.info())
print(heart_data.describe())

# Check for missing values
print(heart_data.isnull().sum())


   age  sex  cp  trestbps  chol  fbs  restecg  thalach  exang  oldpeak  slope  \
0   63    1   3       145   233    1        0      150      0      2.3      0   
1   37    1   2       130   250    0        1      187      0      3.5      0   
2   41    0   1       130   204    0        0      172      0      1.4      2   
3   56    1   1       120   236    0        1      178      0      0.8      2   
4   57    0   0       120   354    0        1      163      1      0.6      2   

   ca  thal  target  
0   0     1       1  
1   0     2       1  
2   0     2       1  
3   0     2       1  
4   0     2       1  
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 303 entries, 0 to 302
Data columns (total 14 columns):
 #   Column    Non-Null Count  Dtype  
---  ------    --------------  -----  
 0   age       303 non-null    int64  
 1   sex       303 non-null    int64  
 2   cp        303 non-null    int64  
 3   trestbps  303 non-null    int64  
 4   chol      303 non-null    int64  
 5 
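
Beyond `head()`, `info()` and `describe()`, two quick checks are often useful before modelling: the class balance of `target` and how strongly each feature correlates with it. The sketch below runs on a small hand-made DataFrame standing in for `heart_data`; the values are illustrative and not taken from heart.csv, and the helper name `summarize_target` is hypothetical.

```python
import pandas as pd

# Stand-in for heart_data: made-up values, not rows from heart.csv.
demo = pd.DataFrame({
    "age":     [45, 52, 60, 38, 66, 49],
    "thalach": [170, 155, 120, 180, 110, 160],
    "target":  [1, 1, 0, 1, 0, 0],
})

def summarize_target(df, target_col="target"):
    """Return class proportions and each feature's correlation with the target."""
    # Share of rows in each class, ordered by class label.
    balance = df[target_col].value_counts(normalize=True).sort_index()
    # Pearson correlation of every other column with the target.
    corr = df.corr()[target_col].drop(target_col).sort_values()
    return balance, corr

balance, corr = summarize_target(demo)
print(balance)
print(corr)
```

On the real data the same call would be `summarize_target(heart_data)`. A strongly skewed class balance would suggest looking at precision and recall rather than accuracy alone.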

In [5]:
# Step 2: Create training and testing datasets
X = heart_data.drop('target', axis=1)
y = heart_data['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
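
`StandardScaler` is imported above but never applied, so the network receives features on very different scales (for example `chol` in the hundreds and `oldpeak` in single digits). A common pattern, shown here as a sketch and not as something this notebook ran, is to fit the scaler on the training split only and reuse it on the test split. The arrays are synthetic stand-ins for `X_train` and `X_test`, and the names `X_train_demo` and `X_test_demo` are hypothetical.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
# Synthetic stand-ins: two features on very different scales.
X_train_demo = np.column_stack([rng.normal(250, 50, 80), rng.normal(1.0, 1.0, 80)])
X_test_demo = np.column_stack([rng.normal(250, 50, 20), rng.normal(1.0, 1.0, 20)])

scaler = StandardScaler()
# Fit on the training split only, so no test-set statistics leak into training.
X_train_scaled = scaler.fit_transform(X_train_demo)
# Reuse the training mean and standard deviation on the test split.
X_test_scaled = scaler.transform(X_test_demo)

print(X_train_scaled.mean(axis=0).round(6))  # per-feature mean of the scaled training data
print(X_train_scaled.std(axis=0).round(6))   # per-feature standard deviation
```

The accuracy and loss figures reported in this notebook were produced without this step, so they would likely change if scaling were added.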



In [6]:
# Step 3: Building and training the neural network
model = Sequential([
    Dense(64, activation='relu', input_shape=(X_train.shape[1],)),
    Dense(32, activation='relu'),
    Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

history = model.fit(X_train, y_train, epochs=50, batch_size=32, validation_split=0.2, verbose=2)



Epoch 1/50


7/7 - 1s - 152ms/step - accuracy: 0.4456 - loss: 3.6646 - val_accuracy: 0.5714 - val_loss: 2.3045
Epoch 2/50
7/7 - 0s - 8ms/step - accuracy: 0.5440 - loss: 2.4313 - val_accuracy: 0.4082 - val_loss: 0.8576
Epoch 3/50
7/7 - 0s - 8ms/step - accuracy: 0.4404 - loss: 1.5188 - val_accuracy: 0.4286 - val_loss: 1.2545
Epoch 4/50
7/7 - 0s - 8ms/step - accuracy: 0.5440 - loss: 1.2695 - val_accuracy: 0.6122 - val_loss: 0.7761
Epoch 5/50
7/7 - 0s - 8ms/step - accuracy: 0.5907 - loss: 0.9281 - val_accuracy: 0.5918 - val_loss: 0.7093
Epoch 6/50
7/7 - 0s - 9ms/step - accuracy: 0.6010 - loss: 0.6394 - val_accuracy: 0.6122 - val_loss: 0.7427
Epoch 7/50
7/7 - 0s - 8ms/step - accuracy: 0.6166 - loss: 0.9014 - val_accuracy: 0.7143 - val_loss: 0.5758
Epoch 8/50
7/7 - 0s - 8ms/step - accuracy: 0.6114 - loss: 0.8204 - val_accuracy: 0.7143 - val_loss: 0.5687
Epoch 9/50
7/7 - 0s - 7ms/step - accuracy: 0.5648 - loss: 0.7205 - val_accuracy: 0.7347 - val_loss: 0.5807
Epoch 10/50
7/7 - 0s - 8ms/step - accuracy: 0.
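
The `loss` values in this log are binary cross-entropy, the loss selected in `model.compile`. As a rough illustration of what that number measures, the NumPy sketch below computes the mean of `-(y*log(p) + (1-y)*log(1-p))` over a few made-up labels and predicted probabilities. The function name `binary_cross_entropy` and the clipping constant `eps` are assumptions for this example; the Keras implementation may differ in details.

```python
import numpy as np

def binary_cross_entropy(y_true, p_pred, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip probabilities away from 0 and 1 so that log() stays finite.
    p = np.clip(np.asarray(p_pred, dtype=float), eps, 1.0 - eps)
    return float(np.mean(-(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))))

y_demo = [1, 0, 1, 0]                  # made-up labels
confident = [0.9, 0.1, 0.8, 0.2]       # mostly correct, fairly confident
uninformed = [0.5, 0.5, 0.5, 0.5]      # always predicts 0.5

print(binary_cross_entropy(y_demo, confident))
print(binary_cross_entropy(y_demo, uninformed))  # equals ln(2), about 0.6931
```

A loss near 0.69 is what a model gets by predicting 0.5 for every sample, so the early-epoch losses well above that value indicate confidently wrong predictions, while values below it indicate the model has started to separate the classes.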

In [7]:
# Step 4: Evaluating the model
train_loss, train_acc = model.evaluate(X_train, y_train, verbose=0)
test_loss, test_acc = model.evaluate(X_test, y_test, verbose=0)
print(f'Training Accuracy: {train_acc:.4f}, Training Loss: {train_loss:.4f}')
print(f'Test Accuracy: {test_acc:.4f}, Test Loss: {test_loss:.4f}')



Training Accuracy: 0.7562, Training Loss: 0.4952
Test Accuracy: 0.8033, Test Loss: 0.4689


In [9]:
# Step 5: Results and Metrics
y_pred_prob = model.predict(X_test)
# Convert predicted probabilities to class labels
y_pred = (y_pred_prob > 0.5).astype(int)
print("Classification Report:")
print(classification_report(y_test, y_pred))



2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 28ms/step
Classification Report:
              precision    recall  f1-score   support

           0       0.81      0.76      0.79        29
           1       0.79      0.84      0.82        32

    accuracy                           0.80        61
   macro avg       0.80      0.80      0.80        61
weighted avg       0.80      0.80      0.80        61
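
Each figure in the report follows from the confusion matrix. The counts used below (22 of 29 class-0 samples and 27 of 32 class-1 samples predicted correctly) are inferred from the rounded report; they are consistent with it but were not printed by the notebook, so treat them as an assumption. The helper `metrics_from_counts` is hypothetical.

```python
def metrics_from_counts(tp, fp, fn):
    """Precision, recall and F1 for one class treated as the positive class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Assumed counts (rows: true class, columns: predicted class).
#            pred 0  pred 1
# true 0       22       7
# true 1        5      27
p0, r0, f1_0 = metrics_from_counts(tp=22, fp=5, fn=7)   # class 0 as positive
p1, r1, f1_1 = metrics_from_counts(tp=27, fp=7, fn=5)   # class 1 as positive
accuracy = (22 + 27) / 61

print(f"class 0: precision={p0:.2f} recall={r0:.2f} f1={f1_0:.2f}")
print(f"class 1: precision={p1:.2f} recall={r1:.2f} f1={f1_1:.2f}")
print(f"accuracy={accuracy:.4f}")
```

Under these assumed counts, 49 of 61 test samples are correct, which matches the reported test accuracy of 0.8033. Moving the 0.5 threshold used for `y_pred` would trade recall on one class for recall on the other.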

