
Predicting Diabetes using Artificial Neural Networks
1. Dataset:
Source: Pima Indians Diabetes Database (Kaggle)
Size: 768 patient records
Features: Medical test results (e.g., glucose level, blood pressure) and patient statistics (e.g., age, BMI)
Target: 0 (No Diabetes) / 1 (Diabetes)



2. Model 1: Simple ANN with 1 Neuron
Architecture:
Input layer: 8 features
Single Neuron Output Layer with Sigmoid activation
Loss Function: Binary Cross-Entropy
Optimizer: Adam or SGD (the implementation uses Adam)
Performance Metrics: Accuracy, Precision, Recall, F1-score
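
A single sigmoid neuron trained with binary cross-entropy is mathematically the same model as logistic regression. The sketch below spells out that forward pass and loss for one patient record using only the standard library. The feature values, weights, and helper names (`neuron_forward`, `binary_cross_entropy`) are hypothetical and chosen for illustration; they are not taken from the trained model.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_forward(x, w, b):
    """Weighted sum of the inputs plus a bias, passed through the sigmoid."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)

def binary_cross_entropy(y_true, p, eps=1e-7):
    """Loss for one example; eps keeps log() away from zero."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))

# Hypothetical standardized values for the 8 input features
x = [0.5, 1.2, -0.3, 0.0, -0.7, 0.4, 0.1, 0.9]
# Assumption: all-zero weights and bias, i.e. a neuron that has learned nothing yet
w = [0.0] * 8
b = 0.0

p = neuron_forward(x, w, b)        # predicted probability of diabetes
loss = binary_cross_entropy(1, p)  # loss if the true label is 1
print(f"p = {p:.4f}, loss = {loss:.4f}")
```

With zero weights the neuron outputs a probability of 0.5 for any input, and the loss equals ln 2; training moves the weights away from this starting point to reduce the loss.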

3. Model 2: ANN with 2 Hidden Layers and 25 Hidden Neurons (12 + 13)
Architecture:
Input layer: 8 features
Hidden Layer 1: 12 neurons, ReLU activation
Hidden Layer 2: 13 neurons, ReLU activation
Output Layer: 1 neuron, Sigmoid activation
Loss Function: Binary Cross-Entropy
Optimizer: Adam
Performance Metrics: Accuracy, Precision, Recall, F1-score
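
To make the difference in complexity between the two architectures concrete, the sketch below counts trainable parameters for a stack of fully connected layers: each layer contributes one weight per input-output pair plus one bias per output unit. The helper name `dense_param_count` is hypothetical.

```python
def dense_param_count(layer_sizes):
    """Count trainable parameters in a fully connected network.

    layer_sizes lists the input width followed by the number of
    units in each Dense layer, e.g. [8, 12, 13, 1].
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # weights + biases
    return total

model1_params = dense_param_count([8, 1])          # 8 weights + 1 bias
model2_params = dense_param_count([8, 12, 13, 1])  # 108 + 169 + 14
print(model1_params, model2_params)
```

Under these layer sizes Model 1 has 9 trainable parameters and Model 2 has 291, which should match the totals reported by `model.summary()` in Keras.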

4. Implementation Steps
Data Preprocessing

Handle missing values (if any)
Normalize input features (MinMaxScaler or StandardScaler)
Split dataset into training and test sets (80-20 or 70-30)
Model Training

Train both ANN models separately
Compare performance on the test set
Evaluation & Comparison

Evaluate accuracy, precision, recall, and F1-score for both models
Compare model efficiency and complexity
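
The normalization step listed under Data Preprocessing can be sketched by hand to show what the two scalers compute and why their statistics come from the training set only. This is a minimal standard-library illustration, not scikit-learn's implementation; the glucose readings and helper names are hypothetical.

```python
import statistics

def fit_standard_scaler(column):
    """Return the mean and population standard deviation (ddof=0),
    the same statistics StandardScaler estimates."""
    return statistics.fmean(column), statistics.pstdev(column)

def standardize(column, mean, std):
    """z = (x - mean) / std; assumes std > 0."""
    return [(v - mean) / std for v in column]

def fit_min_max(column):
    """Return the minimum and maximum used by min-max scaling."""
    return min(column), max(column)

def min_max_scale(column, lo, hi):
    """x' = (x - min) / (max - min); assumes max > min."""
    return [(v - lo) / (hi - lo) for v in column]

# Hypothetical glucose readings, for illustration only
train_glucose = [90.0, 110.0, 130.0, 150.0]
test_glucose = [120.0, 170.0]

# Fit on the training split only, then reuse those statistics on the test split
mean, std = fit_standard_scaler(train_glucose)
test_standardized = standardize(test_glucose, mean, std)

lo, hi = fit_min_max(train_glucose)
test_min_max = min_max_scale(test_glucose, lo, hi)

print(test_standardized, test_min_max)
```

Fitting on the training split alone keeps information about the test set out of preprocessing. One consequence is that min-max scaled test values can fall outside [0, 1], as the second hypothetical reading does here.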

5. Tools & Libraries
Python
TensorFlow/Keras
NumPy, Pandas, Matplotlib, Seaborn (for data visualization & preprocessing)
Scikit-learn (for splitting dataset & evaluation metrics)

The Python script below implements both models with TensorFlow/Keras. It loads the dataset, standardizes the features, trains the two ANN architectures, and evaluates and compares them on the test set.

import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Load the dataset (this CSV has no header row, so column names are supplied explicitly)
data = pd.read_csv("https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv",
                   names=["Pregnancies", "Glucose", "BloodPressure", "SkinThickness", "Insulin", "BMI", "DiabetesPedigreeFunction", "Age", "Outcome"])

# Splitting features and target variable
X = data.drop(columns=["Outcome"])
y = data["Outcome"]

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Standardize the features
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Model 1: Simple ANN with 1 Neuron
model1 = Sequential([
    keras.layers.Input(shape=(X_train.shape[1],)),
    Dense(1, activation='sigmoid')
])
model1.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model1.fit(X_train, y_train, epochs=100, verbose=0, batch_size=16)

# Model 2: ANN with 2 Hidden Layers & 25 Neurons
def build_complex_model():
    model = Sequential([
        keras.layers.Input(shape=(X_train.shape[1],)),
        Dense(12, activation='relu'),
        Dense(13, activation='relu'),
        Dense(1, activation='sigmoid')
    ])
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model

model2 = build_complex_model()
model2.fit(X_train, y_train, epochs=100, verbose=0, batch_size=16)

# Evaluate both models
def evaluate_model(model, X_test, y_test):
    # Threshold the predicted probabilities at 0.5 and flatten (n, 1) -> (n,)
    y_pred = (model.predict(X_test) > 0.5).astype(int).ravel()
    accuracy = accuracy_score(y_test, y_pred)
    precision = precision_score(y_test, y_pred)
    recall = recall_score(y_test, y_pred)
    f1 = f1_score(y_test, y_pred)
    return accuracy, precision, recall, f1

accuracy1, precision1, recall1, f1_1 = evaluate_model(model1, X_test, y_test)
accuracy2, precision2, recall2, f1_2 = evaluate_model(model2, X_test, y_test)

# Print results
print("Model 1 - Simple ANN with 1 Neuron:")
print(f"Accuracy: {accuracy1:.4f}, Precision: {precision1:.4f}, Recall: {recall1:.4f}, F1 Score: {f1_1:.4f}\n")

print("Model 2 - ANN with 2 Hidden Layers (25 Neurons Total):")
print(f"Accuracy: {accuracy2:.4f}, Precision: {precision2:.4f}, Recall: {recall2:.4f}, F1 Score: {f1_2:.4f}")

5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 10ms/step
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 14ms/step
Model 1 - Simple ANN with 1 Neuron:
Accuracy: 0.7597, Precision: 0.6607, Recall: 0.6727, F1 Score: 0.6667

Model 2 - ANN with 2 Hidden Layers (25 Neurons Total):
Accuracy: 0.7532, Precision: 0.6545, Recall: 0.6545, F1 Score: 0.6545
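
As a quick consistency check on the printed results, F1 is the harmonic mean of precision and recall, so it can be recomputed from the other two reported numbers. The sketch below does this with the values from the output above; the helper name `f1_from_precision_recall` is hypothetical. Because the inputs are already rounded to four decimals, the recomputed value can differ from the reported one in the last digit.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (precision, recall, reported F1) copied from the output above
reported = {
    "Model 1": (0.6607, 0.6727, 0.6667),
    "Model 2": (0.6545, 0.6545, 0.6545),
}

recomputed = {}
for name, (p, r, f1_reported) in reported.items():
    recomputed[name] = f1_from_precision_recall(p, r)
    print(f"{name}: recomputed F1 = {recomputed[name]:.4f}, reported = {f1_reported:.4f}")
```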
