<a href="https://colab.research.google.com/github/drpetros11111/The-Complete-Neural-Networks-Bootcamp-Theory-Applications/blob/master/FFNN_Diabetes.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
import numpy as np
import torch
import torch.nn as nn
import pandas as pd
from sklearn.preprocessing import StandardScaler
from torch.utils.data import Dataset

In [None]:
# Load the dataset using Pandas
data = pd.read_csv('/content/diabetes.csv')

In [None]:
# For x: take all rows (all samples) and all columns except the last (all features), as a NumPy array via .values
# For y: take the last column (the label) as a Python list of strings
x = data.iloc[:,0:-1].values
y_string = list(data.iloc[:,-1])

In [None]:
# Let's have a look at some samples from our data
print(x[:3])
print(y_string[:3])

[[  6.  148.   72.   35.    0.   33.6  50. ]
 [  1.   85.   66.   29.    0.   26.6  31. ]
 [  8.  183.   64.    0.    0.   23.3  32. ]]
['positive', 'negative', 'positive']


In [None]:
# Our neural network only understands numbers, so convert the string labels to integers
y_int = []
for string in y_string:
    if string == 'positive':
        y_int.append(1)
    else:
        y_int.append(0)

In [None]:
data.head()

Unnamed: 0,Number of times pregnant,Plasma glucose concentration,Diastolic blood pressure,Triceps skin fold thickness,2-Hour serum insulin,Body mass index,Age,Class
0,6,148,72,35,0,33.6,50,positive
1,1,85,66,29,0,26.6,31,negative
2,8,183,64,0,0,23.3,32,positive
3,1,89,66,23,94,28.1,21,negative
4,0,137,40,35,168,43.1,33,positive


In [None]:
# Now convert to an array
y = np.array(y_int, dtype = 'float64')

### $x^{\prime}=\frac{x-\mu}{\sigma}$

In [None]:
# Feature standardization. Each feature is rescaled to zero mean and unit variance
# (values are comparable across features, but not confined to a fixed range such as (-1,1))
sc = StandardScaler()
x = sc.fit_transform(x)
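As a quick check of the formula $x^{\prime}=\frac{x-\mu}{\sigma}$, the sketch below standardizes a small made-up feature matrix by hand and compares the result with `StandardScaler`. The array `features` and its values are illustrative assumptions, not part of the diabetes data.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy feature matrix (made-up values): 4 samples, 2 features
features = np.array([[1.0, 10.0],
                     [2.0, 20.0],
                     [3.0, 30.0],
                     [4.0, 40.0]])

# Manual standardization: subtract the column mean, divide by the column std
mu = features.mean(axis=0)
sigma = features.std(axis=0)  # population std (ddof=0), as StandardScaler uses
manual = (features - mu) / sigma

scaled = StandardScaler().fit_transform(features)

print(np.allclose(manual, scaled))               # both approaches agree
print(scaled.mean(axis=0), scaled.std(axis=0))   # ~0 mean, unit std per column
print(scaled.min(), scaled.max())                # values can fall outside (-1, 1)
```

Note that the standardized values are centered and scaled, but they are not clipped to any interval.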

In [None]:
# Now we convert the arrays to PyTorch tensors
x = torch.tensor(x)
# We add an extra dimension to convert this array to 2D
y = torch.tensor(y).unsqueeze(1)

# Create Pytorch Tensors
First, torch.tensor(y) converts y to a PyTorch tensor.

The unsqueeze(1) method adds an extra dimension to the tensor at the specified position (in this case, position 1).


---


# Why Add an Extra Dimension?
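Adding an extra dimension is often necessary for compatibility with neural network layers or operations. In this notebook the model outputs one value per sample, with shape (batch_size, 1), and the binary cross-entropy loss expects the labels to have that same shape.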


---


# Original Shape of y:
If y is a 1D tensor (e.g., [1, 2, 3]),
its shape would be (3,).

---
# Shape After unsqueeze(1):
Adding an extra dimension at position 1 converts y to a 2D tensor. For example:

Original tensor: [1, 2, 3] with shape (3,)

After unsqueeze(1): [[1], [2], [3]] with shape (3, 1)

This reshaping is often required for:

##Batch Processing

Neural network operations expect inputs to have a certain number of dimensions. For instance, many operations expect a batch of inputs, where each input is a row in a 2D tensor.

##Compatibility with Layers

Certain layers (like fully connected layers) expect inputs in a 2D format where one dimension is the batch size and the other is the feature size.
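A minimal sketch of the reshaping described above, using made-up labels and predictions (`y_demo` and `preds_demo` are illustrative names). It shows the shape before and after `unsqueeze(1)`, and that the reshaped labels line up with a column of predictions when passed to `nn.BCELoss`.

```python
import torch
import torch.nn as nn

# Made-up labels for three samples
y_demo = torch.tensor([1.0, 0.0, 1.0])
y_col = y_demo.unsqueeze(1)          # add a dimension at position 1

print(y_demo.shape)                  # 1D: one entry per sample
print(y_col.shape)                   # 2D: one row per sample, one column

# Made-up model outputs with the (batch_size, 1) shape a final Linear(…, 1) layer produces
preds_demo = torch.tensor([[0.9], [0.2], [0.7]])

# Shapes match, so the loss can compare predictions and labels element by element
loss_demo = nn.BCELoss()(preds_demo, y_col)
print(loss_demo.item())

# squeeze(1) undoes the reshaping
print(torch.equal(y_col.squeeze(1), y_demo))
```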

In [None]:
print(x.shape)
print(y.shape)

torch.Size([768, 7])
torch.Size([768, 1])


In [None]:
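# Note: this class reuses the name Dataset, which shadows the imported base class from here on
class Dataset(Dataset):

    def __init__(self,x,y):
        self.x = x
        self.y = y

    def __getitem__(self,index):
        # Get one item from the dataset
        return self.x[index], self.y[index]

    def __len__(self):
        return len(self.x)

# Instantiate the dataset (used by len(dataset) and the DataLoader in the following cells)
dataset = Dataset(x, y)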

# Creating a custom dataset class in PyTorch

A custom dataset class is useful for several reasons, particularly when working with datasets that don't fit neatly into the pre-defined formats provided by PyTorch's built-in datasets. Here's why you might create such a class:



---


# 1. Custom Data Handling
##Different Data Formats

If your data isn't already in a format that PyTorch can consume directly (for example, it sits in a CSV file, a JSON file, or a non-standard data structure), you can use a custom dataset class to handle loading, parsing, and transforming the data as needed.

##Complex Data Loading

For more complex data sources, like images stored in directories or data split across multiple files, a custom dataset class allows you to define how data is accessed and processed.

-----

# 2. Integration with PyTorch Utilities
##DataLoader Compatibility

By subclassing torch.utils.data.Dataset, your dataset can be used with PyTorch’s DataLoader, which handles batching, shuffling, and parallel data loading.

This integration simplifies the process of preparing data for training and evaluation.

-----
##Transformation and Augmentation

You can easily add data transformations and augmentations within your custom dataset class.

For example, you might include data normalization, resizing, or other preprocessing steps directly in your dataset class.

-----

#3. Flexibility and Reusability
##Custom Operations

If your dataset requires specific operations or preprocessing steps, such as scaling values, splitting sequences, or applying domain-specific transformations, these can be incorporated directly into your dataset class.

##Reuse

Once created, a custom dataset class can be reused across different projects or experiments, ensuring consistency and reducing the need for repetitive code.

-------

#4. Simplified Code Management
##Encapsulation

The dataset class encapsulates all the data-related functionality in one place, making your code cleaner and easier to maintain.

##Clarity

Having a well-defined dataset class helps in clearly defining the data pipeline, which is beneficial for both debugging and collaborative work.
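To make the points about transformations and encapsulation concrete, here is a small sketch of a dataset class that applies an optional per-sample transform. The class name `TransformedDataset`, its `transform` argument, and the toy tensors are hypothetical and are not used elsewhere in this notebook.

```python
import torch
from torch.utils.data import Dataset as BaseDataset


class TransformedDataset(BaseDataset):
    """Hypothetical dataset that applies an optional transform to each sample."""

    def __init__(self, features, labels, transform=None):
        self.features = features
        self.labels = labels
        self.transform = transform  # any callable taking and returning a tensor

    def __getitem__(self, index):
        sample = self.features[index]
        if self.transform is not None:
            sample = self.transform(sample)  # preprocessing happens on access
        return sample, self.labels[index]

    def __len__(self):
        return len(self.features)


# Toy data (made-up values)
feats = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
labs = torch.tensor([[0.0], [1.0]])

# Example transform: double every feature value
doubled = TransformedDataset(feats, labs, transform=lambda t: t * 2)
sample, label = doubled[1]
print(sample, label)
```

Keeping the transform inside `__getitem__` means every consumer of the dataset sees the same preprocessing.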

-----
----

#Custom CSV Loading

##1. Class Definition

    class Dataset(Dataset):

The Dataset class inherits from torch.utils.data.Dataset, which is an abstract class that PyTorch provides for creating custom datasets.

The custom class Dataset here overrides the necessary methods to define how to access and manage the dataset.

-----
##2. Initialization Method

    def __init__(self, x, y):
       self.x = x
       self.y = y

The __init__ method is the constructor for the Dataset class.

It initializes the dataset object with x and y, which are typically the features and labels of the dataset, respectively.
self.x and self.y store the data and labels for the dataset.

----
##3. Getting an Item

    def __getitem__(self, index):
       # Get one item from the dataset
       return self.x[index], self.y[index]

The __getitem__ method retrieves a data point and its corresponding label from the dataset given an index.

This method allows you to use indexing on an instance of this dataset class, e.g., dataset[index], to get the item at that position in x and its corresponding label in y.

----
##4. Dataset Length

    def __len__(self):
       return len(self.x)

The __len__ method returns the total number of items in the dataset.

It uses the length of self.x because x and y are expected to have the same length. This ensures that the dataset provides a correct size when queried.

----
##How to Use This Dataset Class
Here’s how you can use this custom Dataset class with PyTorch’s DataLoader:

    import torch
    from torch.utils.data import DataLoader

    # Example data
    x = torch.tensor([[1.0], [2.0], [3.0], [4.0]])  # Example features
    y = torch.tensor([0, 1, 0, 1])  # Example labels

    # Instantiate the custom dataset
    dataset = Dataset(x, y)

    # Create a DataLoader for batching, shuffling, etc.
    dataloader = DataLoader(dataset, batch_size=2, shuffle=True)

    # Iterate over the DataLoader
    for batch_x, batch_y in dataloader:
        print('Batch X:', batch_x)
        print('Batch Y:', batch_y)

----------
#Summary
__init__: Initializes the dataset with features x and labels y.

__getitem__: Returns the feature and label at a specified index.

__len__: Returns the number of items in the dataset.

This custom dataset class allows you to integrate your data with PyTorch’s data loading utilities seamlessly, enabling you to leverage batching, shuffling, and other data management features provided by DataLoader.
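When the features and labels are already in-memory tensors, as they are here, PyTorch's built-in `TensorDataset` offers the same indexing and length behaviour without a hand-written class. The sketch below is an alternative for comparison, not what this notebook uses; the names `demo_x`, `demo_y`, and `tensor_ds` are illustrative.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Toy tensors (made-up values): 4 samples, 2 features, 1 label column
demo_x = torch.arange(8, dtype=torch.float32).reshape(4, 2)
demo_y = torch.tensor([[0.0], [1.0], [0.0], [1.0]])

# TensorDataset pairs up the tensors along their first dimension
tensor_ds = TensorDataset(demo_x, demo_y)
first_x, first_y = tensor_ds[0]
print(len(tensor_ds), first_x, first_y)

# It plugs into DataLoader exactly like a custom Dataset subclass
loader = DataLoader(tensor_ds, batch_size=2, shuffle=False)
batch_shapes = [(tuple(bx.shape), tuple(by.shape)) for bx, by in loader]
print(batch_shapes)
```

A custom class remains the better fit once loading or per-sample processing goes beyond simple indexing.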








In [None]:
len(dataset)

768

In [None]:
# Load the data to your dataloader for batch processing and shuffling
train_loader = torch.utils.data.DataLoader(dataset=dataset,
                                           batch_size=32,
                                           shuffle=True)

# Load the data to your dataloader for batch processing and shuffling


---


##torch.utils.data.DataLoader

This is a utility provided by PyTorch to create an iterable over your dataset. It allows you to load data in batches, shuffle the data, and use multiprocessing for faster data loading.


---


##dataset

This is the dataset object you have created (usually an instance of a class derived from torch.utils.data.Dataset).

It contains the data samples and their corresponding labels.

----
##batch_size=32

This parameter specifies the number of samples to load in each batch.

In this case, the DataLoader will load 32 samples at a time.

-----

##shuffle=True

This parameter ensures that the data is shuffled at the beginning of each epoch.

 Shuffling the data is important to prevent the model from learning the order of the samples and to improve generalization.

 ----
Summary

 The DataLoader is an essential tool in PyTorch for managing datasets: it loads data in batches, shuffles it, and can fetch samples in parallel, while any per-sample preprocessing defined in the dataset's __getitem__ runs on the fly as samples are fetched.

 It abstracts much of the complexity involved in handling data, allowing you to focus on building and training your models.
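The number of batches a DataLoader yields follows from the dataset size and `batch_size`. The sketch below uses a made-up dataset of 770 random samples (an assumption chosen so that the last batch is incomplete) to show the effect of the `drop_last` argument. With the notebook's 768 samples and a batch size of 32, the division is exact and gives 24 full batches.

```python
import math
import torch
from torch.utils.data import TensorDataset, DataLoader

# Made-up dataset: 770 samples, 7 features, 1 label column
n_samples, n_features, batch_size = 770, 7, 32
fake_ds = TensorDataset(torch.randn(n_samples, n_features),
                        torch.zeros(n_samples, 1))

# Default behaviour: the final, smaller batch is kept
keep_last = DataLoader(fake_ds, batch_size=batch_size, shuffle=True)
# drop_last=True discards the incomplete final batch
drop_last = DataLoader(fake_ds, batch_size=batch_size, shuffle=True, drop_last=True)

print(len(keep_last), math.ceil(n_samples / batch_size))
print(len(drop_last), n_samples // batch_size)

# Size of each batch when the last one is kept
sizes = [bx.shape[0] for bx, _ in keep_last]
print(sizes[-1])  # the leftover samples
```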

In [None]:
# Let's have a look at the data loader
# (batch_x / batch_y are used so the full x and y tensors are not overwritten)
print("There are {} batches in the dataset".format(len(train_loader)))
for (batch_x, batch_y) in train_loader:
    print("For one iteration (batch), there is:")
    print("Data:    {}".format(batch_x.shape))
    print("Labels:  {}".format(batch_y.shape))
    break

There are 24 batches in the dataset
For one iteration (batch), there is:
Data:    torch.Size([32, 7])
Labels:  torch.Size([32, 1])


![demo](https://user-images.githubusercontent.com/30661597/60379583-246e5e80-9a68-11e9-8b7f-a4294234c201.png)

In [None]:
# Now let's build the above network
class Model(nn.Module):
    def __init__(self, input_features):
        super(Model, self).__init__()
        self.fc1 = nn.Linear(input_features, 5)
        self.fc2 = nn.Linear(5, 4)
        self.fc3 = nn.Linear(4, 3)
        self.fc4 = nn.Linear(3, 1)
        self.sigmoid = nn.Sigmoid()
        self.tanh = nn.Tanh()

    def forward(self, x):
        out = self.fc1(x)
        out = self.tanh(out)
        out = self.fc2(out)
        out = self.tanh(out)
        out = self.fc3(out)
        out = self.tanh(out)
        out = self.fc4(out)
        out = self.sigmoid(out)
        return out

# Importing Necessary Libraries

    import torch
    import torch.nn as nn

These imports bring in PyTorch's core tensor library (torch) and neural network module (torch.nn), which provides a variety of classes and functions to create neural networks.


---


# Defining the Model Class

    class Model(nn.Module):

This line defines a new class named Model that inherits from nn.Module, which is the base class for all neural network modules in PyTorch.

Inheriting from nn.Module provides a lot of built-in functionality for handling model parameters, gradients, and more.

----
#Initializing the Model

    def __init__(self, input_features):
       super(Model, self).__init__()

The __init__ method initializes the model. super(Model, self).__init__() ensures that the base class (nn.Module) is properly initialized.

self.fc1 to self.fc4 are fully connected (linear) layers. Each layer transforms the input features to the output features.

These lines define the layers and activation functions of the neural network:

#Defining the Layers

    self.fc1 = nn.Linear(input_features, 5)

self.fc1 takes input_features and maps them to 5 features.

    self.fc2 = nn.Linear(5, 4)

self.fc2 takes the 5 features from self.fc1 and maps them to 4 features.

    self.fc3 = nn.Linear(4, 3)

self.fc3 takes the 4 features from self.fc2 and maps them to 3 features.

    self.fc4 = nn.Linear(3, 1)

self.fc4 takes the 3 features from self.fc3 and maps them to 1 feature.

    self.sigmoid = nn.Sigmoid()

self.sigmoid is a sigmoid activation function.

    self.tanh = nn.Tanh()

self.tanh is a hyperbolic tangent activation function.

-----
##Defining the Forward Pass

    def forward(self, x):
       out = self.fc1(x)
       out = self.tanh(out)
       out = self.fc2(out)
       out = self.tanh(out)
       out = self.fc3(out)
       out = self.tanh(out)
       out = self.fc4(out)
       out = self.sigmoid(out)
       return out

The forward method defines how the input tensor x passes through the network layers and activation functions.

This method is automatically called when you perform a forward pass on the network, e.g., model(x).

-----
##First Layer

    out = self.fc1(x)
    
The input x is passed through the first linear layer.

    out = self.tanh(out)
    
The output of the first linear layer is passed through the tanh activation function.

---
##Second Layer

    out = self.fc2(out)

The result from the previous step is passed through the second linear layer.

    out = self.tanh(out)

The output of the second linear layer is passed through the tanh activation function.

-----
##Third Layer

    out = self.fc3(out)

The result from the previous step is passed through the third linear layer.

    out = self.tanh(out)

The output of the third linear layer is passed through the tanh activation function.

-----
##Fourth Layer

    out = self.fc4(out)

The result from the previous step is passed through the fourth linear layer.

    out = self.sigmoid(out)

The output of the fourth linear layer is passed through the sigmoid activation function.

----
##Return Output

The final output out is returned.

----
#Summary
This neural network model performs a sequence of linear transformations followed by non-linear activation functions (tanh and sigmoid).

It takes an input tensor, passes it through four linear layers, applies tanh activations after the first three layers, and a sigmoid activation after the last layer.

This kind of architecture is commonly used for binary classification tasks, where the final sigmoid activation ensures the output is between 0 and 1, representing a probability.
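As a cross-check of the architecture summarized above, the sketch below rebuilds the same layer sequence with `nn.Sequential`, counts its trainable parameters, and confirms the output shape and range for a random batch. The variable names and the random input are illustrative; the notebook itself uses the `Model` class.

```python
import torch
import torch.nn as nn

input_features = 7  # the diabetes data has 7 feature columns

# Same layer sizes and activations as the Model class, written as a Sequential
sketch = nn.Sequential(
    nn.Linear(input_features, 5), nn.Tanh(),
    nn.Linear(5, 4), nn.Tanh(),
    nn.Linear(4, 3), nn.Tanh(),
    nn.Linear(3, 1), nn.Sigmoid(),
)

# Each Linear(in, out) layer holds in*out weights plus out biases:
# (7*5+5) + (5*4+4) + (4*3+3) + (3*1+1) = 40 + 24 + 15 + 4 = 83
n_params = sum(p.numel() for p in sketch.parameters())
print(n_params)

# A random batch of 32 samples produces one probability per sample
batch = torch.randn(32, input_features)
probs = sketch(batch)
print(probs.shape, probs.min().item(), probs.max().item())
```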

$H_{p}(q)=-\frac{1}{N} \sum_{i=1}^{N}\left[ y_{i} \cdot \log \left(p\left(y_{i}\right)\right)+\left(1-y_{i}\right) \cdot \log \left(1-p\left(y_{i}\right)\right)\right]$


cost = -(Y * torch.log(hypothesis) + (1 - Y) * torch.log(1 - hypothesis)).mean()
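The one-line cost expression above can be checked against PyTorch's built-in loss. The sketch below evaluates both on a few made-up predictions (`hypothesis`) and labels (`Y`) and compares the results; the values are illustrative only.

```python
import torch
import torch.nn as nn

# Made-up predicted probabilities and true labels, shape (3, 1)
hypothesis = torch.tensor([[0.9], [0.2], [0.6]])
Y = torch.tensor([[1.0], [0.0], [1.0]])

# Binary cross-entropy written out by hand, averaged over the samples
manual_cost = -(Y * torch.log(hypothesis) + (1 - Y) * torch.log(1 - hypothesis)).mean()

# The same quantity from the built-in loss
builtin_cost = nn.BCELoss(reduction='mean')(hypothesis, Y)

print(manual_cost.item(), builtin_cost.item())
print(torch.allclose(manual_cost, builtin_cost))
```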

In [None]:
# Create the network (an object of the Model class)
net = Model(x.shape[1])

# In Binary Cross Entropy: the input (prediction) and the target should have the same shape
# size_average = True is deprecated, use reduction='mean' instead
criterion = torch.nn.BCELoss(reduction='mean')

# We will use SGD with momentum with a learning rate of 0.1
optimizer = torch.optim.SGD(net.parameters(), lr=0.1, momentum=0.9)


# Create the network (an object of the Model class)
    net = Model(x.shape[1])

##Model(x.shape[1])

This initializes an instance of the Model class, which is defined in the cell above and inherits from torch.nn.Module.

##x.shape[1]

This gets the number of input features. If x is a 2D tensor representing your dataset, x.shape[1] gives the number of features (columns) in your data.

When we initialize our model, we need to tell it how many input features it should expect.

This is crucial because the input layer of the model must match the number of features in the dataset.

###x.shape[1]

This extracts the number of features (columns) from x. In this notebook each sample has 7 features, so x.shape[1] is 7.

###Model(x.shape[1])

This initializes the Model class with 7 as the number of input features. The Model class uses this information to create the first layer of the neural network with the correct number of input nodes.

##net

This variable holds the created model, which will be used for training and predictions.

-----
##Defining the Loss Function

###In Binary Cross Entropy: the input (prediction) and the target should have the same shape
### size_average = True is deprecated, use reduction='mean' instead

    criterion = torch.nn.BCELoss(reduction='mean')

##torch.nn.BCELoss

This is a built-in PyTorch loss function used for binary classification problems.

BCELoss stands for Binary Cross Entropy Loss.

    reduction='mean'

This specifies that the loss will be averaged over the batch.

The terms "reduction" and "deprecation" appear in the comments above and mean different things. Here's a detailed explanation of each:

----
----
#1. Reduction
In the context of loss functions in machine learning and deep learning, reduction refers to how the output of the loss function is aggregated or summarized.

Specifically, it determines how to combine the individual loss values from a batch of examples into a single scalar value.

###Common Reduction Methods:

####'none'

No reduction is applied. The loss values for each element in the batch are returned as-is.

This is useful when you need to perform custom operations on the individual losses.

####'mean'

The sum of the output is divided by the number of elements in the output. This method calculates the average loss per element, which is useful for getting a normalized measure of loss across the batch.

####'sum'

The output is summed. This aggregates all the loss values into a single value by summing them up.

This method provides the total loss across all elements in the batch.

###Example in PyTorch

    import torch
    import torch.nn as nn

    # Create a loss function
    criterion = nn.MSELoss(reduction='mean')

    # Example predictions and target labels
    outputs = torch.tensor([2.0, 3.0], dtype=torch.float32)
    targets = torch.tensor([2.5, 2.5], dtype=torch.float32)

    # Compute the loss
    loss = criterion(outputs, targets)
    print("Mean Loss:", loss.item())

In this example, using reduction='mean' will average the loss values across all elements in the batch, while reduction='sum' would sum them.
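To see the three reduction modes side by side, the sketch below applies each one to the same predictions and targets as the example above. The squared errors are 0.25 for each element, so the arithmetic is easy to follow by hand.

```python
import torch
import torch.nn as nn

outputs = torch.tensor([2.0, 3.0])
targets = torch.tensor([2.5, 2.5])

# 'none' keeps one loss value per element: (2.0-2.5)^2 and (3.0-2.5)^2
per_element = nn.MSELoss(reduction='none')(outputs, targets)
# 'mean' averages those values, 'sum' adds them up
mean_loss = nn.MSELoss(reduction='mean')(outputs, targets)
sum_loss = nn.MSELoss(reduction='sum')(outputs, targets)

print(per_element)       # tensor([0.2500, 0.2500])
print(mean_loss.item())  # 0.25
print(sum_loss.item())   # 0.5
```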

----
#2. Deprecation
Deprecation is a term used in software development to indicate that a feature, function, or method is considered obsolete and may be removed in future versions.

Deprecated features are still available but are discouraged from being used because there are better or more efficient alternatives.

##Why Features Get Deprecated:
Improved Alternatives: Newer methods or functions are introduced that provide better performance or usability.

Maintenance: Maintaining older features can be burdensome, and removing them can simplify the codebase and reduce potential bugs.

Standardization: Aligning with best practices or industry standards.

When a feature is deprecated, developers are typically encouraged to use the recommended alternatives. This helps in migrating to newer versions of a library or framework without relying on outdated features.

##Example of Deprecation in PyTorch:

In PyTorch, size_average=True was deprecated in favor of reduction='mean':

###Deprecated:

size_average=True was used to average the loss across all elements.

###Updated:

reduction='mean' is the new standard to achieve the same effect and provides more flexibility.

Using deprecated features can lead to warnings or errors, and the feature might be removed entirely in future releases, so it is essential to update code to use the recommended practices.

----
##Summary
Reduction specifies how to aggregate loss values (e.g., mean, sum, or none) and is crucial for understanding how to interpret the loss values produced by a loss function.

Deprecation indicates that a feature is outdated and should be replaced with newer alternatives to ensure compatibility with future versions and better practices.

-----
-----
The argument

    size_average=True

that was previously used to average the loss is now deprecated, and reduction='mean' is the recommended usage.

-----------
##Setting Up the Optimizer

# We will use SGD with momentum with a learning rate of 0.1
    optimizer = torch.optim.SGD(net.parameters(), lr=0.1, momentum=0.9)

##torch.optim.SGD

This is a stochastic gradient descent optimizer provided by PyTorch.

##net.parameters()

This passes the parameters of the net model to the optimizer so that they can be updated during training.

##lr=0.1

This sets the learning rate for the optimizer to 0.1. The learning rate determines the step size at each iteration while moving toward a minimum of the loss function.

##momentum=0.9

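This adds momentum to the gradient updates: a fraction (here 0.9) of the accumulated previous updates is carried into the current one, which accelerates convergence along directions where successive gradients agree and dampens oscillations.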
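The effect of `lr` and `momentum` can be traced on a one-parameter toy problem. The sketch below minimizes $p^2$ with `torch.optim.SGD` and repeats the same update by hand, assuming the default settings (no dampening, no Nesterov, no weight decay), where the velocity is `momentum * velocity + grad` and the step is `lr * velocity`. The toy loss and variable names are illustrative.

```python
import torch

lr, momentum = 0.1, 0.9

# Parameter updated by PyTorch's optimizer
p = torch.tensor([1.0], requires_grad=True)
opt = torch.optim.SGD([p], lr=lr, momentum=momentum)

# Hand-rolled copy of the same update rule
p_manual = 1.0
velocity = 0.0

for step in range(3):
    opt.zero_grad()
    loss = (p ** 2).sum()       # toy loss; its gradient is 2 * p
    loss.backward()
    opt.step()

    grad = 2.0 * p_manual
    velocity = momentum * velocity + grad   # accumulate past gradients
    p_manual = p_manual - lr * velocity     # step along the velocity
    print(step, p.item(), p_manual)
```

The two columns printed at each step should agree up to floating-point rounding.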

-----
#Summary
Here’s a step-by-step summary of what each part does:

##Model Initialization:

An instance of the Model class is created with the number of input features passed as an argument.

##Loss Function Definition:

The loss function used for training is defined. In this case, it's Binary Cross Entropy Loss, appropriate for binary classification tasks.

##Optimizer Setup:

The optimizer is set up to adjust the parameters of the model during training using Stochastic Gradient Descent with momentum.

----
----
#Example Usage
This setup is typically part of a larger workflow that includes data loading, forward and backward passes, and parameter updates. Here’s how it might fit into a training loop:

    import torch
    import torch.nn as nn
    import torch.optim as optim

    # Assuming the Model class and data (x, y) are already defined

    # Create the network
    net = Model(x.shape[1])

    # Define the loss criterion
    criterion = torch.nn.BCELoss(reduction='mean')

    # Set up the optimizer
    optimizer = torch.optim.SGD(net.parameters(), lr=0.1, momentum=0.9)

    # Example training loop
    num_epochs = 10
    for epoch in range(num_epochs):
        for inputs, labels in train_loader:  # Assuming train_loader is defined
            # Forward pass
            outputs = net(inputs.float())
            loss = criterion(outputs, labels.float())

            # Backward pass and optimization
            optimizer.zero_grad()  # Clear gradients
            loss.backward()        # Compute gradients
            optimizer.step()       # Update parameters

        # Loss of the last batch in this epoch
        print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')

    print('Finished Training')


In this example, train_loader would be a DataLoader object that provides batches of inputs and labels. The model's parameters are updated in each iteration of the loop to minimize the loss, improving the model's performance on the task.

In [None]:
# Train the network
num_epochs = 200
for epoch in range(num_epochs):
    for inputs,labels in train_loader:
        inputs = inputs.float()
        labels = labels.float()
        # Feed Forward
        output = net(inputs)
        # Loss Calculation
        loss = criterion(output, labels)
        # Clear the gradient buffer (we don't want to accumulate gradients)
        optimizer.zero_grad()
        # Backpropagation
        loss.backward()
        # Weight Update: w <-- w - lr * gradient
        optimizer.step()

    # Accuracy (computed on the last batch of the epoch only)
    # Since we are using a sigmoid, we will need to perform some thresholding
    output = (output>0.5).float()

    # Accuracy: (output == labels).float().sum() / output.shape[0]
    accuracy = (output == labels).float().mean()

    # Print statistics (loss and accuracy of that last batch)
    print("Epoch {}/{}, Loss: {:.3f}, Accuracy: {:.3f}".format(epoch+1,num_epochs, loss, accuracy))

# Calculate and print the loss and accuracy
## Initialization:


---


### num_epochs = 200

The number of times the entire training dataset will pass through the network.

-----
### train_loader

A DataLoader object that provides batches of inputs and labels from the training dataset.

-----
###Training Loop

###for epoch in range(num_epochs):

Loop over the number of epochs.

###for inputs, labels in train_loader:

Loop over each batch of inputs and labels provided by the DataLoader.

##Data Preparation

###inputs = inputs.float()

Convert inputs to floating point numbers (if they are not already).

###labels = labels.float()

Convert labels to floating point numbers (if they are not already).

-----
##Feed Forward

###output = net(inputs)

Pass the inputs through the network to get the predicted outputs.

##Loss Calculation

###loss = criterion(output, labels)

Calculate the loss using the criterion (loss function), which compares the predicted outputs with the actual labels.

-----
##Clear Gradient Buffer

###optimizer.zero_grad()

Clear the gradients of all optimized tensors.

This is important because gradients by default add up; we don't want to accumulate gradients from multiple forward passes.
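The accumulation behaviour is easy to observe on a single tensor. In the sketch below (a toy example with an illustrative variable `w`), calling `backward()` twice without clearing doubles the stored gradient, and resetting it by hand plays the role that `optimizer.zero_grad()` plays in the training loop.

```python
import torch

w = torch.tensor(2.0, requires_grad=True)

# d(3*w)/dw = 3, stored in w.grad
(3 * w).backward()
first = w.grad.item()

# A second backward pass ADDS to the existing gradient: 3 + 3 = 6
(3 * w).backward()
second = w.grad.item()

# Clearing the buffer before the next pass avoids this accumulation
w.grad.zero_()
(3 * w).backward()
third = w.grad.item()

print(first, second, third)  # 3.0 6.0 3.0
```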

-----
##Backpropagation

###loss.backward()

Compute the gradient of the loss with respect to the network's parameters (i.e., perform backpropagation).

##Weight Update

###optimizer.step()

Update the network's parameters based on the gradients computed during backpropagation.

-----
##Accuracy Calculation (at the end of each epoch):

###output = (output > 0.5).float()

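Apply thresholding to the output to convert probabilities to binary predictions (this is a binary classification problem, so outputs above 0.5 are treated as the positive class).

###accuracy = (output == labels).float().mean()

Calculate the accuracy by comparing the thresholded output with the actual labels. Because output and labels still hold the last batch of the epoch at this point, this is the accuracy on that batch only, not on the whole dataset.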
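If an accuracy figure over every sample is wanted, the correct predictions can be counted across all batches. The sketch below is one way to do that under stated assumptions: the helper `dataset_accuracy`, the stand-in `fake_model`, and the toy data are all hypothetical and not part of the notebook's training code.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader


def dataset_accuracy(model, loader, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    correct, total = 0, 0
    with torch.no_grad():  # no gradients are needed for evaluation
        for inputs, labels in loader:
            preds = (model(inputs.float()) > threshold).float()
            correct += (preds == labels.float()).sum().item()
            total += labels.shape[0]
    return correct / total


# Stand-in "model": predicts positive when the single feature is > 0
fake_model = lambda inputs: torch.sigmoid(inputs[:, :1])

# Toy data (made-up): 5 samples, so the last batch of size 2 is incomplete
toy_x = torch.tensor([[1.0], [-1.0], [2.0], [-2.0], [3.0]])
toy_y = torch.tensor([[1.0], [0.0], [0.0], [0.0], [1.0]])
toy_loader = DataLoader(TensorDataset(toy_x, toy_y), batch_size=2, shuffle=False)

acc = dataset_accuracy(fake_model, toy_loader)
print(acc)  # 4 of 5 predictions match the labels
```

Counting correct predictions and dividing once at the end keeps the result right even when the last batch is smaller than the others.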

------
##Print Statistics

    print("Epoch {}/{}, Loss: {:.3f}, Accuracy: {:.3f}".format(epoch+1, num_epochs, loss, accuracy))

Print the current epoch number, loss, and accuracy.

------
#Summary
The code snippet runs a training loop for a specified number of epochs.
In each epoch, it processes batches of data, performs a forward pass, computes the loss, performs backpropagation, and updates the model parameters.

At the end of each epoch it prints the loss and accuracy of the last batch processed.
This process helps to iteratively train the model and improve its performance on the training data.
