
Project: Predicting Diabetes Risk

Project Goal: To build a deep learning model that can predict whether a patient has diabetes based on a set of diagnostic measurements. This is a classic **binary classification problem.**

Dataset: We'll use the **Pima Indians Diabetes Dataset**, a widely used public dataset for machine learning research. You can download this dataset from Kaggle or the UCI Machine Learning Repository. It contains data for 768 female patients, 21 years or older, of Pima Indian heritage.

The dataset has 8 clinical features (like BMI, glucose level, and blood pressure) and a single output variable (Outcome) where 1 indicates a positive diabetes diagnosis and 0 indicates a negative diagnosis.



Step 1: Import TensorFlow and Keras

In [None]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

In [None]:
df=pd.read_csv("/content/drive/MyDrive/SUBJECTS/AY 25-26/Tensorflow and Other tools in Healthcare/DATASETS/diabetes.csv")

In [None]:
df.head()

   Pregnancies  Glucose  BloodPressure  SkinThickness  Insulin   BMI  DiabetesPedigreeFunction  Age  Outcome
0            6      148             72             35        0  33.6                     0.627   50        1
1            1       85             66             29        0  26.6                     0.351   31        0
2            8      183             64              0        0  23.3                     0.672   32        1
3            1       89             66             23       94  28.1                     0.167   21        0
4            0      137             40             35      168  43.1                     2.288   33        1


In [None]:
X=df.drop('Outcome',axis=1)
y=df['Outcome']

Step 2: Preparing for Learning - Splitting and Scaling

In [None]:
X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.2,random_state=42)
#random_state=42 ensures that every time we run the code, the split is exactly the same, which is crucial for reproducibility in research.
scaler=StandardScaler()
#StandardScaler transforms the data so that each feature has a mean of 0 and a standard deviation of 1.
X_train_scaled=scaler.fit_transform(X_train)
#We fit the scaler to the training data. This step calculates the mean and standard deviation for each feature. We then immediately transform the training data using these values.
X_test_scaled=scaler.transform(X_test)
#We only transform the testing data. It's a common mistake to fit the scaler on the test data. Doing so would "leak" information from the test set into the training process, leading to an artificially high and misleading accuracy score.
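To make the fit/transform distinction concrete, here is a minimal plain-Python sketch of standardization for a single feature. The glucose readings and the helper names `fit_stats` and `standardize` are made up for illustration; `StandardScaler` does the equivalent for every column, using the population standard deviation.

```python
from statistics import fmean, pstdev

def fit_stats(values):
    """Learn the mean and population standard deviation from training values."""
    return fmean(values), pstdev(values)

def standardize(values, mean, std):
    """Apply previously learned statistics: z = (x - mean) / std."""
    return [(v - mean) / std for v in values]

# Hypothetical glucose readings, not taken from the dataset.
train_glucose = [85.0, 148.0, 183.0, 89.0, 137.0]
test_glucose = [120.0, 95.0]

mean, std = fit_stats(train_glucose)                 # fit on training data only
train_scaled = standardize(train_glucose, mean, std)
test_scaled = standardize(test_glucose, mean, std)   # reuse the training statistics

print("train mean/std after scaling:", round(fmean(train_scaled), 6), round(pstdev(train_scaled), 6))
print("test values after scaling:", [round(z, 3) for z in test_scaled])
```

The test values are scaled with the training mean and standard deviation, so they do not end up with a mean of exactly 0. That is expected and is what keeps test information out of the training process.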

Check:

In [None]:
X_train_scaled.shape

(614, 8)

Step 3: Building the Brain - The Neural Network Architecture

In [None]:
model=Sequential()
#This initializes a Sequential model, which is the simplest type of neural network in Keras. It lets you stack layers on top of each other.
model.add(Dense(64,activation='relu',input_shape=(X_train_scaled.shape[1],)))
#The first layer has 64 neurons. input_shape is only required for the first layer and tells the model to expect 8 input features.
#(X_train_scaled.shape[1],) is a tuple with one element. The trailing comma is what makes Python treat it as a tuple rather than a number in parentheses.
model.add(Dense(32,activation='relu'))
#The second layer has 32 neurons. The number of neurons in hidden layers is a hyperparameter you can tune for a project.
model.add(Dense(1,activation='sigmoid'))
#This is the output layer. We use 1 neuron because we're predicting one of two outcomes (diabetes or not). The sigmoid activation function is perfect for binary classification, as it squashes the output into a probability between 0 and 1. A value of 0.75 would mean a 75% chance of diabetes.
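As a rough illustration of what a single Dense unit computes, the sketch below takes a weighted sum of its inputs, adds a bias, and applies an activation. The weights, bias, and inputs are hypothetical numbers chosen for the example, not values learned by this model, and the helper name `neuron` is not part of Keras.

```python
import math

def relu(z):
    """ReLU passes positive values through and clips negatives to 0."""
    return max(0.0, z)

def sigmoid(z):
    """Sigmoid squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias, activation):
    """One Dense unit: weighted sum of inputs plus bias, then activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z)

# Hypothetical hidden unit: the weighted sum is negative, so ReLU outputs 0.0.
hidden_example = neuron([1.0, -2.0], [0.5, 0.5], 0.0, relu)

# Hypothetical output unit fed by three hidden activations.
hidden_outputs = [0.5, 0.0, 1.2]
output_weights = [0.8, -0.4, 0.6]
output_bias = -0.02

probability = neuron(hidden_outputs, output_weights, output_bias, sigmoid)
predicted_class = int(probability >= 0.5)   # 0.5 is a common default threshold

print(hidden_example, round(probability, 3), predicted_class)
```

A full Dense layer repeats this computation once per neuron, each with its own weights and bias.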



Step 4: The Game Plan - Compiling the Model

In [None]:
#model.compile(...): This is a crucial step that configures the model for training.
model.compile(optimizer='adam',
              #optimizer='adam': The optimizer is the algorithm that adjusts the weights and biases of the network during training to minimize the error. Adam is a highly efficient and widely used optimizer.
              loss='binary_crossentropy',
              #loss='binary_crossentropy': The loss function measures how wrong the model's predictions are. For binary classification, binary crossentropy is the standard choice. The goal of training is to reduce this value.
              metrics=['accuracy'])
              #metrics=['accuracy']: This tells the model to track and report accuracy (the percentage of correct predictions) during training, which gives us a clear understanding of its performance.

model.summary()
#This command prints a table summarizing the network's layers, output shapes, and the total number of parameters. This is useful for debugging and understanding your model's complexity.
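To show what the loss value measures, here is a small plain-Python sketch of binary cross-entropy. The labels and predicted probabilities are invented for the example, and the `eps` clipping value is an assumption of this sketch rather than a statement about Keras internals.

```python
import math

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)   # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

labels = [1, 0, 1, 0]
confident = [0.9, 0.1, 0.8, 0.2]   # predictions close to the true labels
unsure = [0.6, 0.4, 0.5, 0.5]      # predictions hovering around 0.5

loss_confident = binary_crossentropy(labels, confident)
loss_unsure = binary_crossentropy(labels, unsure)
print(round(loss_confident, 4), round(loss_unsure, 4))
```

Both sets of predictions would score 100% or close to it on accuracy with a 0.5 threshold, yet the confident set has a much lower loss. This is why loss is the quantity being minimized while accuracy is only reported.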


Model: "sequential_2":

This is the name of your model. Since you used tf.keras.models.Sequential(), Keras gives it a default name like "sequential_2".

Layer (type):

This column lists the layers in your neural network in the order they are added.

dense_2 (Dense): This is the first layer you added. It's a Dense layer, which means it's a fully connected layer where every neuron in this layer is connected to every neuron in the previous layer (which are your input features in this case).

dense_3 (Dense): This is the second Dense layer, a hidden layer in your network.

dense_4 (Dense): This is the third Dense layer, which is your output layer.

Output Shape:

 This column shows the shape of the tensor that each layer outputs.

(None, 64): The None represents the batch size. When you train or predict with your model, you'll typically feed data in batches. The None means the model can handle any batch size. The 64 indicates that this layer has 64 neurons, and each neuron will output a value.

(None, 32): This layer outputs a tensor with an arbitrary batch size and 32 features (from the 32 neurons in this layer).

(None, 1): The output layer has 1 neuron, so it outputs a tensor with an arbitrary batch size and 1 value. In your case, this value represents the probability of the patient having diabetes.

Param #:

This column shows the number of parameters (weights and biases) in each layer. These are the values that the model learns during the training process to make predictions.

576: For the first Dense layer, the number of parameters is calculated as (number of inputs * number of neurons) + number of neurons (for biases). You have 8 input features (X_train_scaled.shape[1]) and 64 neurons, so the calculation is (8 * 64) + 64 = 512 + 64 = 576.

2,080: For the second Dense layer, the number of parameters is calculated as (number of inputs from the previous layer * number of neurons) + number of neurons. The previous layer (dense_2) has 64 outputs, and this layer has 32 neurons, so the calculation is (64 * 32) + 32 = 2048 + 32 = 2080.

33: For the output layer, the number of parameters is calculated as (number of inputs from the previous layer * number of neurons) + number of neurons. The previous layer (dense_3) has 32 outputs, and this layer has 1 neuron, so the calculation is (32 * 1) + 1 = 32 + 1 = 33.

Total params: This is the sum of the parameters in all layers (576 + 2080 + 33 = 2689).

Trainable params: These are the parameters that the optimizer will adjust during training to minimize the loss function. In this case, all parameters are trainable.

Non-trainable params: These are parameters that are not updated during training. This is common when using pre-trained models or freezing certain layers. In your case, there are none.

Understanding the model summary helps you visualize the structure of your network and see where most of the parameters sit (here, in the 64-to-32 hidden layer). This can be useful for debugging and optimizing your model.
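The parameter counts in the summary can be checked with a few lines of plain Python. The helper name `dense_params` is made up for this sketch; the layer sizes are the ones used in this notebook.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# 8 input features, then Dense layers with 64, 32, and 1 units.
layer_sizes = [8, 64, 32, 1]

per_layer = [dense_params(n_in, n_out)
             for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)

print(per_layer, total)   # [576, 2080, 33] 2689
```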

Step 5: The Learning Process - Training the Model

In [None]:
print("Training the model...")
history=model.fit(X_train_scaled,y_train,#This command starts the training process.
                  epochs=2,#An epoch is one complete pass through the entire training dataset. We're telling the model to do this 2 times, as it takes multiple passes to learn the patterns in the data.
                  batch_size=32,#The training data is processed in smaller chunks (or batches) of 32 samples. This makes the training process more memory-efficient and helps the model converge faster.
                  validation_split=0.1,#During training, 10% of the training data is set aside to serve as a validation set. This helps us monitor for overfitting—when a model performs perfectly on the training data but poorly on new, unseen data.
                  verbose=1)#Progress bar. A progress bar is displayed for each epoch, showing the progress of training and validation (if applicable).

Training the model...
Epoch 1/2
18/18 ━━━━━━━━━━━━━━━━━━━━ 2s 24ms/step - accuracy: 0.6580 - loss: 0.6519 - val_accuracy: 0.7258 - val_loss: 0.6141
Epoch 2/2
18/18 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - accuracy: 0.7512 - loss: 0.5502 - val_accuracy: 0.7419 - val_loss: 0.5673


Epoch 1/2 and Epoch 2/2: This indicates which training epoch is currently running out of the total number of epochs you specified (in this case, 2).

18/18: This means that each epoch consists of 18 batches, and all 18 batches have been processed.

accuracy: 0.6580 (Epoch 1) and 0.7512 (Epoch 2): This is the accuracy of your model on the training data for each epoch. It represents the proportion of correctly classified instances in the training set.

loss: 0.6519 (Epoch 1) and 0.5502 (Epoch 2): This is the loss of your model on the training data for each epoch. The loss function (binary crossentropy in your case) measures how well the model is performing; lower values indicate better performance.

val_accuracy: 0.7258 (Epoch 1) and 0.7419 (Epoch 2): This is the accuracy of your model on the validation data after each epoch. The validation data is a subset of the training data that is held out and not used to update the weights. This metric helps you monitor for overfitting.

val_loss: 0.6141 (Epoch 1) and 0.5673 (Epoch 2): This is the loss of your model on the validation data after each epoch. Similar to the training loss, lower values indicate better performance on the validation set.

In your output, verbose=1 is why you see the progress bar (━━━━━━━━━━━━━━━━━━━━) and the details about batches (18/18) and the metrics for each epoch.

In this output, training and validation accuracy finish close together (0.7512 vs. 0.7419) and both losses are still falling, so there is no clear sign of overfitting, although with only two epochs it's hard to say definitively.

Increasing the number of epochs would allow the model to learn more, and monitoring the validation accuracy and loss over more epochs will give you a better indication of how well your model generalizes to unseen data.
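The 18/18 batch count in the output follows from the split sizes. The arithmetic below is a sketch that assumes Keras floors the split point when applying validation_split; the variable names are chosen for this example.

```python
import math

n_train = 614            # rows in X_train_scaled
validation_split = 0.1
batch_size = 32

# Samples kept for weight updates vs. held out for validation.
n_fit = int(n_train * (1 - validation_split))
n_val = n_train - n_fit

# The last batch may be smaller than batch_size, so round up.
steps_per_epoch = math.ceil(n_fit / batch_size)

print(n_fit, n_val, steps_per_epoch)
```

Under that assumption, 552 samples are used for fitting and 62 for validation. The reported val_accuracy of 0.7258 is consistent with 45 of 62 validation samples being classified correctly.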



**verbose** controls how much information is displayed during the training process:

verbose=0: Silent mode. No output is printed to the console during training.
verbose=1: Progress bar. A progress bar is displayed for each epoch, showing the progress of training and validation (if applicable). This is the mode used in the training run above.
verbose=2: One line per epoch. Only the training and validation metrics for each epoch are printed on a new line.

In [None]:
print("Training the model...")
history=model.fit(X_train_scaled,y_train,#This command starts the training process.
                  epochs=2,#An epoch is one complete pass through the entire training dataset. We're telling the model to do this 2 times, as it takes multiple passes to learn the patterns in the data.
                  batch_size=32,#The training data is processed in smaller chunks (or batches) of 32 samples. This makes the training process more memory-efficient and helps the model converge faster.
                  validation_split=0.1,#During training, 10% of the training data is set aside to serve as a validation set. This helps us monitor for overfitting—when a model performs perfectly on the training data but poorly on new, unseen data.
                  verbose=2)#One line per epoch. Only the training and validation metrics for each epoch are printed, without a progress bar.

Training the model...
Epoch 1/2
18/18 - 0s - 8ms/step - accuracy: 0.7681 - loss: 0.5051 - val_accuracy: 0.7742 - val_loss: 0.5300
Epoch 2/2
18/18 - 0s - 6ms/step - accuracy: 0.7754 - loss: 0.4769 - val_accuracy: 0.7742 - val_loss: 0.5151


Step 6: The Final Grade - Evaluating Performance

This is the most critical step for any research paper. We use this to test the trained model on the unseen test data. The resulting accuracy score tells us how well the model is likely to perform in a real-world scenario.

In [None]:
loss,accuracy=model.evaluate(X_test_scaled,y_test,verbose=0)
print(f"Test Loss: {loss}, \nTest Accuracy: {accuracy}")

Test Loss: 0.5125934481620789, 
Test Accuracy: 0.7727272510528564


Step 7: Real-World Application - Making a Prediction
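A minimal sketch of how a single prediction might be made, assuming the trained `model` and the fitted `scaler` from the earlier steps. The helper name `predict_diabetes` and the 0.5 threshold are assumptions of this sketch. The function only relies on a Keras-style `predict()` and a scikit-learn-style `transform()`.

```python
import numpy as np

def predict_diabetes(model, scaler, patient_features, threshold=0.5):
    """Return (probability, predicted_class) for one patient.

    `model` needs a Keras-style predict(); `scaler` needs a transform()
    that was already fitted on the training data.
    """
    # One row, with the features in the same column order used for training.
    x = np.asarray(patient_features, dtype="float32").reshape(1, -1)
    x_scaled = scaler.transform(x)               # same scaling as the training data
    probability = float(model.predict(x_scaled, verbose=0)[0][0])
    return probability, int(probability >= threshold)
```

In this notebook it could be called as predict_diabetes(model, scaler, [6, 148, 72, 35, 0, 33.6, 0.627, 50]), with the eight values in the same order as the columns of X. Because the scaler was fitted on a DataFrame, scikit-learn may emit a feature-name warning when given a plain array.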

______________________________


PROGRAM ELEMENTS IN TENSORFLOW

Constants

Parameters whose value does not change.

In [None]:
import tensorflow as tf

In [None]:
a=tf.constant(5.0,tf.float32)
a

<tf.Tensor: shape=(), dtype=float32, numpy=5.0>

In [None]:
b=tf.constant(3.0)
b

<tf.Tensor: shape=(), dtype=float32, numpy=3.0>

In [None]:
print(a,b)

tf.Tensor(5.0, shape=(), dtype=float32) tf.Tensor(3.0, shape=(), dtype=float32)


PLACEHOLDER

Used in TensorFlow 1.x to feed data to a TensorFlow model from outside the model. It permits a value to be assigned later.

In [None]:
# In TensorFlow 2.x, tf.placeholder is not used
# (it survives only as tf.compat.v1.placeholder for graph-mode code).
# Input data is typically handled by passing data directly to model layers or functions.

# If you need to define an input shape for a Keras model, use tf.keras.Input:
# input_tensor = tf.keras.Input(shape=(input_dimension,))

# If you want to define a variable whose value can change, you might use tf.Variable:
# c = tf.Variable(0.0, dtype=tf.float32)
# You can then assign a new value to it later:
# c.assign(some_new_value)

# Original TensorFlow 1.x line, kept as a comment for reference only:
# c = tf.placeholder(tf.float32)
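One common TensorFlow 2.x pattern that fills the role placeholders used to play is to pass the data as arguments to a function decorated with `tf.function`. The sketch below is illustrative only; the function name `scale_and_shift` and the values are made up for the example.

```python
import tensorflow as tf

@tf.function
def scale_and_shift(x, weight, bias):
    """Inputs arrive as arguments instead of being fed through a placeholder."""
    return weight * x + bias

weight = tf.Variable(0.5)
bias = tf.Variable(-1.0)

# Different values can be supplied on every call; no session or feed_dict is needed.
scalar_result = scale_and_shift(tf.constant(2.0), weight, bias)
vector_result = scale_and_shift(tf.constant([1.0, 2.0, 3.0]), weight, bias)

print(scalar_result.numpy())
print(vector_result.numpy())
```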

VARIABLES

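Allow us to add new trainable parameters to the graph.

In TensorFlow 1.x, variables had to be initialized before running the graph in a session. In TensorFlow 2.x, a tf.Variable is initialized when it is created.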

In [None]:
c=tf.Variable([.5],dtype=tf.float32)
c

<tf.Variable 'Variable:0' shape=(1,) dtype=float32, numpy=array([0.5], dtype=float32)>

In [None]:
d=tf.Variable([-1.0],dtype=tf.float32)
d

<tf.Variable 'Variable:0' shape=(1,) dtype=float32, numpy=array([-1.], dtype=float32)>

INTRODUCTION

Hello World - Very simple example to learn how to print "hello world" using TensorFlow.

In [None]:
import tensorflow as tf

In [None]:
# Simple hello world using TensorFlow

# Create a Constant op
# The op is added as a node to the default graph.
#
# The value returned by the constructor represents the output
# of the Constant op.

In [None]:
op=tf.constant("Hello,Tensorflow")
print("Tensor_constant:",op)
print("Value_constant:",op.numpy())

Tensor_constant: tf.Tensor(b'Hello,Tensorflow', shape=(), dtype=string)
Value_constant: b'Hello,Tensorflow'


Variable -  Learn to use variable in tensorflow.

In [None]:
# Variables are manipulated via the tf.Variable class.
# A tf.Variable represents a tensor whose value can be changed by running ops on it.
# Specific ops allow you to read and modify the values of this tensor.

## Creating a Variable

In [None]:
variable=tf.Variable(1)
print("Tensor_var:",variable)
print("Value_var:",variable.numpy())

Tensor_var: <tf.Variable 'Variable:0' shape=() dtype=int32, numpy=1>
Value_var: 1


In [None]:
## Using Variables

# To use the value of a tf.Variable in a TensorFlow graph, simply treat it like a normal tf.Tensor.

In [None]:
variable_add=variable+1
print("variable_add:",variable_add.numpy())

variable_add: 2


In [None]:
variable = tf.Variable(2)
variable.assign_add(1)
print("value:", variable.numpy())

value: 3


Basic operations - A simple example that covers TensorFlow basic operations.

In [None]:
#assign values
a=tf.ones([2,3])
print(a)

tf.Tensor(
[[1. 1. 1.]
 [1. 1. 1.]], shape=(2, 3), dtype=float32)


In [None]:
a=tf.Variable(a)
a

<tf.Variable 'Variable:0' shape=(2, 3) dtype=float32, numpy=
array([[1., 1., 1.],
       [1., 1., 1.]], dtype=float32)>

In [None]:
a[0,0]

<tf.Tensor: shape=(), dtype=float32, numpy=1.0>

In [None]:
a[0,0].assign(10)

<tf.Variable 'UnreadVariable' shape=(2, 3) dtype=float32, numpy=
array([[10.,  1.,  1.],
       [ 1.,  1.,  1.]], dtype=float32)>

In [None]:
b=a.read_value()
print(b)

tf.Tensor(
[[10.  1.  1.]
 [ 1.  1.  1.]], shape=(2, 3), dtype=float32)


In [None]:
print(a)
print(b)

<tf.Variable 'Variable:0' shape=(2, 3) dtype=float32, numpy=
array([[10.,  1.,  1.],
       [ 1.,  1.,  1.]], dtype=float32)>
tf.Tensor(
[[10.  1.  1.]
 [ 1.  1.  1.]], shape=(2, 3), dtype=float32)


In [None]:
#Add,Multiply,div etc.,
a=tf.constant(2)
b=tf.constant(3)

#add
print("Addition with constants:",a+b)
print("Addition with numpy:",a.numpy()+b.numpy())
print("Addition with tf function:",tf.add(a,b))

Addition with constants: tf.Tensor(5, shape=(), dtype=int32)
Addition with numpy: 5
Addition with tf function: tf.Tensor(5, shape=(), dtype=int32)


In [None]:
#Multiplication
print("Multiplication with constants:",a*b)
print("Multiplication with numpy:",a.numpy()*b.numpy())
print("Multiplication with tf function:",tf.multiply(a,b))

Multiplication with constants: tf.Tensor(6, shape=(), dtype=int32)
Multiplication with numpy: 6
Multiplication with tf function: tf.Tensor(6, shape=(), dtype=int32)


In [None]:
#Matrix Multiplication
matrix1=tf.constant([[3,3]])
print(matrix1)

tf.Tensor([[3 3]], shape=(1, 2), dtype=int32)


In [None]:
matrix2=tf.constant([[2],[2]])
print(matrix2)

tf.Tensor(
[[2]
 [2]], shape=(2, 1), dtype=int32)


In [None]:
matrix_mul=tf.matmul(matrix1,matrix2)
print("Multiplication with matrixes:", matrix_mul)

Multiplication with matrixes: tf.Tensor([[12]], shape=(1, 1), dtype=int32)


In [None]:
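# The * operator multiplies element-wise with broadcasting: shapes (1, 2) and (2, 1) broadcast to (2, 2).
# This is not matrix multiplication; use tf.matmul for that.
matrix_mult=matrix1*matrix2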
print(matrix_mult)

tf.Tensor(
[[6 6]
 [6 6]], shape=(2, 2), dtype=int32)


In [None]:
#Cast operations - float to int
print(a)
a1=tf.convert_to_tensor(3.0)
print(a1)

tf.Tensor(2, shape=(), dtype=int32)
tf.Tensor(3.0, shape=(), dtype=float32)


In [None]:
b1=tf.cast(a1,tf.int32)
print(a1,b1)

tf.Tensor(3.0, shape=(), dtype=float32) tf.Tensor(3, shape=(), dtype=int32)


In [None]:
#shape operations

x=tf.ones([3,2])
print(x)

tf.Tensor(
[[1. 1.]
 [1. 1.]
 [1. 1.]], shape=(3, 2), dtype=float32)


In [None]:
print(x.shape)

(3, 2)


In [None]:
print(x.shape[0],x.shape[1])

3 2


In [None]:
shape_x=tf.shape(x)
print(shape_x)
print(shape_x[0],shape_x[1])

tf.Tensor([3 2], shape=(2,), dtype=int32)
tf.Tensor(3, shape=(), dtype=int32) tf.Tensor(2, shape=(), dtype=int32)


**Session**

In TensorFlow 1.x, a session was used to evaluate the nodes of a graph; it is also called the TensorFlow runtime.

In [None]:
x1=tf.constant(5.0)
y1=tf.constant(3.0)
z=x1*y1
print(z)

tf.Tensor(15.0, shape=(), dtype=float32)


In [None]:
# In TensorFlow 2.x and later, eager execution is enabled by default,
# so you don't need to explicitly create and run a session to evaluate tensors.

# The value of a tensor can be accessed directly using .numpy()

print(z.numpy())

# There is no need to close a session as it's not being used.
# session.close()

15.0


Activation - Get to know some activation functions in TensorFlow.
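As a small illustration, the sketch below applies three built-in activation functions from `tf.nn` to the same tensor. The input values are arbitrary and chosen only to show how negative, zero, and positive inputs are treated.

```python
import tensorflow as tf

x = tf.constant([-2.0, 0.0, 2.0])

relu_out = tf.nn.relu(x)         # negatives become 0, positives pass through
sigmoid_out = tf.nn.sigmoid(x)   # squashed into (0, 1); sigmoid(0) is 0.5
tanh_out = tf.nn.tanh(x)         # squashed into (-1, 1); tanh(0) is 0

print("relu:   ", relu_out.numpy())
print("sigmoid:", sigmoid_out.numpy())
print("tanh:   ", tanh_out.numpy())
```

ReLU is the activation used in the hidden layers of the diabetes model above, and sigmoid is the one used in its output layer.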

GradientTape - Introduces a key technique for automatic differentiation.
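A minimal sketch of automatic differentiation with `tf.GradientTape`, using the toy function y = x² so the result can be checked by hand (dy/dx = 2x). The variable names are chosen for this example.

```python
import tensorflow as tf

x = tf.Variable(3.0)

# Operations executed inside the context are recorded on the tape.
with tf.GradientTape() as tape:
    y = x * x

# Ask the tape for dy/dx, evaluated at the current value of x.
grad = tape.gradient(y, x)

print("y =", y.numpy())
print("dy/dx =", grad.numpy())
```

This is the same mechanism the optimizer relies on during `model.fit`: the loss is computed inside a tape, and the gradients of the loss with respect to the trainable variables are used to update them.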

In [None]:
print(tf.__version__)

2.19.0
