
This notebook demonstrates how to build, train, and evaluate a simple Multilayer Perceptron (MLP) for binary classification using Keras. Specifically, it uses the Pima Indians Diabetes Dataset to predict whether a patient has diabetes based on several health indicators. The process involves:

**Introduction to Keras Concepts**: Explaining fundamental Keras components like Layers, Sequential models, and Dense layers.

**Data Loading and Preparation**: Loading the dataset and splitting it into input features (X) and target variable (Y).

**Model Definition**: Creating a Sequential Keras model with two hidden Dense layers and an output Dense layer.

**Model Compilation and Training**: Configuring the model with a loss function, optimizer, and metrics, then training it on the dataset.

**Model Evaluation**: Assessing the trained model's accuracy.

**Prediction**: Using the trained model to predict outcomes for a few sample patient records.

**Keras**: Keras is a high-level API (Application Programming Interface) for building and training neural networks. It runs on top of a lower-level deep learning framework (such as TensorFlow) and is designed for ease of use, modularity, and extensibility, so you can design, prototype, and experiment with networks without handling the low-level details yourself.

**Layer**: In a neural network, a layer is like a step in processing the data. Data goes into one layer, the layer performs some calculations on it, and then the output of that layer becomes the input for the next layer. Think of it like a series of filters or transformations applied to the data.

**Sequential**: A Sequential model in Keras is the simplest type of model. It's a linear stack of layers, where you just add layers one after another. Data flows straight through from the first layer to the last layer without any branches or complex connections. It's like building a chain of processing steps.

**Dense Layer**: A Dense layer is a type of layer where every neuron in the layer is connected to every neuron in the previous layer. It's also known as a fully connected layer. This is a very common type of layer used in many neural network architectures. It essentially learns a weighted sum of the inputs from the previous layer and applies an activation function to the result.
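
The "weighted sum plus activation" behaviour of a Dense layer can be sketched in plain NumPy. This is only an illustration: the input, weights, and bias below are made-up values, not parameters taken from the model trained in this notebook, and `dense_forward` is a hypothetical helper name.

```python
import numpy as np

def relu(z):
    """Element-wise ReLU: negative values become 0."""
    return np.maximum(0.0, z)

def dense_forward(x, weights, bias, activation):
    """Forward pass of one fully connected layer: activation(x @ W + b)."""
    return activation(x @ weights + bias)

# Hypothetical values: one sample with 3 features feeding a layer of 2 neurons.
x = np.array([[1.0, 2.0, 3.0]])
weights = np.array([[0.5, -1.0],
                    [0.25, 0.0],
                    [-0.5, 1.0]])   # shape (3 inputs, 2 neurons)
bias = np.array([0.1, -0.5])

layer_output = dense_forward(x, weights, bias, relu)
print(layer_output)  # [[0.  1.5]]
```

Every input feature contributes to every neuron through the weight matrix, which is what "fully connected" means.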

**Keras (for neural networks; an MLP in this case):**

**What it's good for**: Neural networks, especially deep learning models built with frameworks like Keras, are powerful for complex tasks where the relationships between the input features and the output are not simple or linear. They can automatically learn intricate patterns and hierarchies in data. This is particularly useful for things like image recognition, natural language processing, and in this case, potentially capturing non-linear relationships in the diabetes dataset.

**How it works**: Neural networks consist of interconnected layers of "neurons" that process data in a hierarchical way. They learn by adjusting the "weights" of these connections during training to minimize errors.

**When to use it**: When you have a large amount of data and suspect there are complex, non-linear relationships that simpler models might miss.

**KNN (K-Nearest Neighbors):**

**What it's good for**: KNN is a simple, non-parametric algorithm used for both classification and regression. It works by finding the 'k' nearest data points in the training set to a new data point and making a prediction based on the majority class (for classification) or the average value (for regression) of those neighbors.

**How it works**: It's based on the principle that similar data points are likely to have similar outcomes.

**When to use it**: KNN is easy to understand and implement. It can be effective for smaller datasets and when the decision boundary is not linear. However, it can be computationally expensive with large datasets.
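
The majority-vote idea behind KNN classification can be sketched in a few lines of standard-library Python. The points, labels, and the `knn_predict` helper are hypothetical and chosen only to illustrate the technique; a real project would more likely use a library implementation.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Made-up 2-D training data: three points of class 0, two of class 1.
train_points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 5.5)]
train_labels = [0, 0, 0, 1, 1]

knn_label = knn_predict(train_points, train_labels, (1.2, 1.4), k=3)
print(knn_label)  # 0
```

Note that every prediction scans the whole training set, which is why the method becomes expensive as the dataset grows.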

**Linear Regression:**

**What it's good for**: Linear Regression is a supervised learning algorithm used for predicting a continuous output variable based on a linear relationship with the input features.

**How it works**: It finds the best-fitting straight line (or hyperplane in higher dimensions) that minimizes the difference between the predicted and actual output values.

**When to use it**: When you believe there is a linear relationship between your input features and the output variable. It's simple, interpretable, and fast to train.
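
As a small worked example of "best-fitting straight line", the sketch below fits a line to one input feature using the ordinary least-squares formulas. The data is made up so that it lies exactly on y = 2x + 1, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up data lying exactly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)  # 2.0 1.0
```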

**Logistic Regression:**

**What it's good for**: Logistic Regression is a supervised learning algorithm used for binary classification tasks (predicting one of two outcomes). It's similar to linear regression but uses a logistic function to output a probability between 0 and 1.

**How it works**: It models the probability of a data point belonging to a particular class based on a linear combination of the input features.

**When to use it**: When you need to predict a binary outcome and assume a linear relationship between the features and the log-odds of the outcome.

This code cell imports the necessary libraries for building the neural network:
*   `Sequential` and `Dense` from `keras.models` and `keras.layers` respectively, which are fundamental components for defining the neural network architecture.
*   `numpy` for numerical operations, particularly for handling the dataset.

In [1]:
# Create your first MLP in Keras
from keras.models import Sequential
from keras.layers import Dense
import numpy


This code cell performs the following steps:
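*   `numpy.random.seed(7)`: Sets NumPy's random seed so that NumPy-based randomness is repeatable between runs. Keras weight initialization is drawn by the backend (for example TensorFlow), so this call alone may not make training fully reproducible; `keras.utils.set_random_seed(7)` seeds Python, NumPy, and the backend together.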
*   `dataset = numpy.loadtxt(...)`: Loads the Pima Indians Diabetes dataset from a CSV file. The dataset contains various medical predictors and a binary outcome (diabetes or no diabetes).
*   `X = dataset[:,0:8]`: Splits the dataset into input features (`X`), taking the first 8 columns.
*   `Y = dataset[:,8]`: Splits the dataset into the output variable (`Y`), which is the 9th column (the target variable indicating diabetes).

In [2]:
# fix random seed for reproducibility
numpy.random.seed(7)
# load pima indians dataset
dataset = numpy.loadtxt("https://raw.githubusercontent.com/YashNigam65/gitfolder/refs/heads/master/dataset/pima-indians-diabetes.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:,0:8]
Y = dataset[:,8]
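
To see what the two slices select, the sketch below applies the same indexing to a tiny stand-in array. The array is synthetic (`numpy.arange` values), not real patient data; only the slicing pattern mirrors the cell above.

```python
import numpy as np

# Synthetic stand-in for the dataset: 3 rows, 8 feature columns + 1 label column.
toy_dataset = np.arange(27).reshape(3, 9)

X_toy = toy_dataset[:, 0:8]   # all rows, columns 0..7 (the 8 features)
Y_toy = toy_dataset[:, 8]     # all rows, column 8 (the target)

print(X_toy.shape, Y_toy.shape)  # (3, 8) (3,)
print(Y_toy)                     # [ 8 17 26]
```

The slice `0:8` excludes index 8, so the feature matrix and the target column never overlap.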

This code cell defines the neural network architecture:
*   `model = Sequential()`: Initializes a Sequential model, which is a linear stack of layers.
*   `model.add(Dense(12, input_dim=8, activation='relu'))`: Adds the first hidden Dense layer with 12 neurons, 8 input dimensions, and a ReLU activation function.
*   `model.add(Dense(8, activation='relu'))`: Adds a second hidden Dense layer with 8 neurons and a ReLU activation function.
*   `model.add(Dense(1, activation='sigmoid'))`: Adds the output Dense layer with 1 neuron (for binary classification) and a sigmoid activation function, which outputs a probability between 0 and 1.
*   `model.summary()`: Prints a summary of the model architecture, including the number of layers, output shapes, and the total number of trainable parameters.

A common rule of thumb for sizing a hidden layer is to pick a number of neurons between the input and output layer sizes, or a number tied to the input dimension (such as half or twice the number of input features). Here the input dimension is 8, and the first hidden layer's 12 neurons fall between 8 and twice that (16). These are starting points rather than rules; the best size is usually found by experiment.
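
The trainable-parameter total that `model.summary()` reports can be checked by hand: a Dense layer has one weight per (input, neuron) pair plus one bias per neuron. The sketch below applies that arithmetic to the three layers defined in this notebook; `dense_param_count` is a hypothetical helper name.

```python
def dense_param_count(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# (inputs, units) for each Dense layer in the model: 8 -> 12 -> 8 -> 1
layer_sizes = [(8, 12), (12, 8), (8, 1)]

per_layer = [dense_param_count(n_in, n_out) for n_in, n_out in layer_sizes]
total_params = sum(per_layer)
print(per_layer, total_params)  # [108, 104, 9] 221
```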

In [3]:
# create model
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.summary()



**ReLU (Rectified Linear Unit) Function**:

What it is: A very common activation function in neural networks. It returns 0 if it receives any negative input, but for any positive value, it returns that value back.

Formula: f(x) = max(0, x)

Where it's used: Primarily in the hidden layers of neural networks. It helps introduce non-linearity into the model, allowing it to learn complex patterns.

Why it's popular: It's computationally efficient (simple calculation) and helps mitigate the vanishing gradient problem, which can occur with other activation functions like sigmoid in deep networks.

**Sigmoid Function**:

What it is: An S-shaped activation function that maps any real-valued number into a value between 0 and 1.

Formula: f(x) = 1 / (1 + e^(-x))

Where it's used: Most commonly in the output layer of binary classification models (like the one in this notebook). The output, being a probability between 0 and 1, can directly represent the likelihood of a positive class.

Why it's used for binary classification: Its output naturally lends itself to probability interpretation, where values close to 1 indicate a high probability of the positive class and values close to 0 indicate a high probability of the negative class.
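
Both activation functions described above are short enough to write out directly. The sketch below is a scalar, standard-library version for illustration; Keras applies the same formulas element-wise to whole tensors.

```python
import math

def relu(x):
    """f(x) = max(0, x)"""
    return max(0.0, x)

def sigmoid(x):
    """f(x) = 1 / (1 + e^(-x)), always strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

for value in (-2.0, 0.0, 3.0):
    print(value, relu(value), round(sigmoid(value), 4))
```

Negative inputs are zeroed by ReLU but still map to a small positive probability under sigmoid, and an input of 0 maps to exactly 0.5.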


This cell handles the training and evaluation of the defined Keras model:
*   `model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])`: Compiles the model, specifying:
    *   `loss='binary_crossentropy'`: The loss function for binary classification.
    *   `optimizer='adam'`: The Adam optimizer, a popular choice for neural networks.
    *   `metrics=['accuracy']`: The metric to monitor during training and evaluation.
*   `model.fit(X, Y, epochs=150, batch_size=10)`: Trains the model using the input data `X` and target `Y` for 150 epochs with a batch size of 10.
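*   `scores = model.evaluate(X, Y)`: Evaluates the trained model on the same data it was trained on and returns the final loss and accuracy. Because no data is held out, this accuracy measures how well the model fits the training set, not how well it generalizes to unseen patients.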
*   `print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))`: Prints the final accuracy achieved by the model, which is approximately 76.30% in this run.
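
Since the evaluation above reuses the training data, one possible way to estimate performance on unseen records is to hold out part of the dataset before training. The sketch below only builds the index split with NumPy; the 20% test fraction, the seed, and the `train_test_split_indices` helper are assumptions for illustration and are not part of the original notebook.

```python
import numpy as np

def train_test_split_indices(n_samples, test_fraction=0.2, seed=7):
    """Shuffle row indices and split them into (train, test) index arrays."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# 768 is the number of rows mentioned in this notebook.
train_idx, test_idx = train_test_split_indices(768)
print(len(train_idx), len(test_idx))  # 614 154

# Assumed usage with the arrays from this notebook:
#   model.fit(X[train_idx], Y[train_idx], epochs=150, batch_size=10)
#   scores = model.evaluate(X[test_idx], Y[test_idx])
```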

**loss='binary_crossentropy'**: This specifies the loss function (also known as the objective function) that the model tries to minimize during training. For binary classification problems (where the output is either 0 or 1, like predicting diabetes or not), binary_crossentropy is the standard choice. It measures how closely the model's predicted probabilities match the true labels, penalizing confident wrong predictions most heavily. A lower loss value means the predictions fit the labels better.
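
A small worked example makes the "lower is better" behaviour concrete. The sketch below computes the mean binary cross-entropy by hand; the labels and probabilities are made up, and the `eps` clipping is an assumption added here to avoid `log(0)`, not a statement of how Keras implements it internally.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    total = 0.0
    for label, prob in zip(y_true, y_pred):
        prob = min(max(prob, eps), 1.0 - eps)  # keep log() finite
        total += -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))
    return total / len(y_true)

labels = [1, 0, 1]
good_loss = binary_crossentropy(labels, [0.9, 0.1, 0.8])  # mostly right, confident
poor_loss = binary_crossentropy(labels, [0.4, 0.6, 0.3])  # mostly wrong

print(round(good_loss, 4), round(poor_loss, 4))
```

Predictions that lean toward the correct label give a much smaller loss than predictions that lean the wrong way.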

**optimizer='adam'**: This defines the optimizer, the algorithm that updates the weights of the neural network during training so that the loss decreases. 'Adam' (Adaptive Moment Estimation) is a popular default because it combines momentum (a running average of past gradients) with a per-weight adaptive learning rate (scaled by a running average of squared gradients). In practice this often converges faster than plain stochastic gradient descent and needs little tuning.
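
The update rule can be sketched for a single weight. This is a simplified, standard-library illustration of the published Adam formulas, not Keras's actual implementation; the toy objective f(w) = (w - 3)^2, the learning rate of 0.05, and the `adam_step` helper are assumptions chosen for the example.

```python
import math

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter at step t (t starts at 1)."""
    m = beta1 * m + (1 - beta1) * grad          # running average of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2     # running average of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction for early steps
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

# Toy problem: minimise f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w, m, v = 0.0, 0.0, 0.0
for t in range(1, 2001):
    grad = 2 * (w - 3.0)
    w, m, v = adam_step(w, grad, m, v, t, lr=0.05)

print(round(w, 2))
```

Dividing by the square root of `v_hat` is what gives each weight its own effective step size.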

**epochs=150**: An epoch is one complete pass through the entire training dataset. During an epoch the model sees every training example once, updating its weights after each batch and reporting the loss and metrics for that pass. Setting epochs=150 means the model iterates over the full X and Y dataset 150 times.

**batch_size=10**: The batch size determines the number of samples that are propagated through the network before the model's weights are updated. Instead of updating weights after every single sample (which can be noisy and slow), or after seeing all samples (which can be memory-intensive for large datasets), training is usually done in mini-batches. With batch_size=10, the training data is divided into chunks of 10 samples, and the model updates its weights after processing each chunk. This balances training stability against computational efficiency. For example, with 768 samples, 768 / 10 = 76.8, which rounds up to 77 batches per epoch (the last batch holds the remaining 8 samples); this matches the `77/77` counter in the training log.
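
The batch arithmetic can be verified in a couple of lines. The sketch assumes the 768 samples, batch size of 10, and 150 epochs used in this notebook; `batches_per_epoch` is a hypothetical helper name.

```python
import math

def batches_per_epoch(n_samples, batch_size):
    """Number of weight updates in one epoch; the last batch may be smaller."""
    return math.ceil(n_samples / batch_size)

n_batches = batches_per_epoch(768, 10)
last_batch_size = 768 - (n_batches - 1) * 10
total_updates = n_batches * 150   # weight updates over all 150 epochs

print(n_batches, last_batch_size, total_updates)  # 77 8 11550
```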

In [4]:
# Compile model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# Fit the model
model.fit(X, Y, epochs=150, batch_size=10)
# evaluate the model
scores = model.evaluate(X, Y)
print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))

Epoch 1/150
[1m77/77[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 2ms/step - accuracy: 0.6384 - loss: 12.7461
Epoch 2/150
[1m77/77[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 2ms/step - accuracy: 0.5753 - loss: 3.0101
Epoch 3/150
[1m77/77[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 2ms/step - accuracy: 0.6007 - loss: 1.3240
Epoch 4/150
[1m77/77[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 2ms/step - accuracy: 0.6244 - loss: 1.2019
Epoch 5/150
[1m77/77[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 2ms/step - accuracy: 0.6195 - loss: 1.0687
Epoch 6/150
[1m77/77[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 2ms/step - accuracy: 0.6501 - loss: 1.0419
Epoch 7/150
[1m77/77[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 2ms/step - accuracy: 0.6556 - loss: 0.8762
Epoch 8/150
[1m77/77[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 2ms/step - accuracy: 0.6935 - loss: 0.7907
Epoch 9/150
[1m77/77[0m [32m━━━━━━━━━━━━━━━━

In [5]:
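import numpy as np
# Predict diabetes for three sample patients.
# Note: these values appear to match the first three rows of the standard Pima
# dataset, so the model has likely already seen them during training.
samples = np.array([[6, 148, 72, 35, 0, 33.6, 0.627, 50],   # Sample 1
                    [1, 85, 66, 29, 0, 26.6, 0.351, 31],    # Sample 2
                    [8, 183, 64, 0, 0, 23.3, 0.672, 32]])   # Sample 3

# No scaling is applied here: the model above was trained on raw (unscaled)
# features, and no scaler was fitted in this notebook.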

This code cell uses the trained model to predict outcomes for the new samples:
*   `predictions = model.predict(samples)`: Uses the `model` to predict the probability of diabetes for each of the three `samples`. The output will be probabilities between 0 and 1.
*   `predicted_classes = (predictions > 0.5).astype(int)`: Converts these probabilities into binary class labels (0 or 1). If the probability is greater than 0.5, it's classified as 1 (diabetes); otherwise, it's 0 (no diabetes).
*   `print(predicted_classes.flatten())`: Prints the final predicted classes for the samples in a flattened array. Based on the output, the predictions are `[1 0 1]`, meaning the first and third samples are predicted to have diabetes, and the second sample is predicted not to.
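
The thresholding step can be tried in isolation. The probabilities below are made up (they are not outputs of the trained model) and include a value of exactly 0.5 to show which side of the threshold it falls on.

```python
import numpy as np

# Hypothetical model outputs with shape (n_samples, 1), like model.predict returns.
example_probabilities = np.array([[0.81], [0.12], [0.50]])

# Strictly greater than 0.5 becomes class 1; everything else becomes class 0.
example_classes = (example_probabilities > 0.5).astype(int).flatten()
print(example_classes)  # [1 0 0]
```

Because the comparison is strict, a probability of exactly 0.5 is classified as 0.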

In [None]:
# Predict diabetes (returns probabilities)
predictions = model.predict(samples)

# Convert probabilities to class labels (0 or 1)
predicted_classes = (predictions > 0.5).astype(int)

# Output predictions
print("Predictions for the samples (0 = No Diabetes, 1 = Diabetes):")
print(predicted_classes.flatten())

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 79ms/step
Predictions for the samples (0 = No Diabetes, 1 = Diabetes):
[1 0 1]
