# Building the First Neural Network

In [1]:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, Dataset
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

import torch: This imports the main PyTorch library, which provides a wide range of functionalities for building and training neural networks.

import torch.nn as nn: This imports the submodule nn from PyTorch, which includes classes and functions for defining and working with neural networks.

import torch.optim as optim: This imports the submodule optim from PyTorch, which contains various optimization algorithms for updating the parameters of neural networks during training.

from torch.utils.data import DataLoader, Dataset: This imports the classes DataLoader and Dataset from the torch.utils.data module. These classes are used for efficiently loading and handling datasets during the training process.

from sklearn.datasets import load_iris: This imports the function load_iris from scikit-learn, a popular machine learning library. load_iris is used to load the Iris dataset, a commonly used dataset for classification tasks.

from sklearn.model_selection import train_test_split: This imports the function train_test_split from scikit-learn. It is used to split the dataset into training and testing subsets.

from sklearn.preprocessing import StandardScaler: This imports the class StandardScaler from scikit-learn. StandardScaler is used for standardizing the features of the dataset, ensuring that they have zero mean and unit variance.

In [2]:
# Load the Iris dataset
iris = load_iris()
X, y = iris.data, iris.target

# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)


load_iris() from scikit-learn is used to load the Iris dataset, which is a classic machine learning dataset containing measurements of flower samples from three different species of Iris flowers.

iris.data represents the input features of the dataset, which includes measurements such as sepal length, sepal width, petal length, and petal width.

iris.target represents the corresponding target labels, which indicate the species of each sample.

train_test_split() from scikit-learn is used to split the data into training and test sets. It takes the input features (X) and the corresponding target labels (y) as input.

The test_size parameter is set to 0.2, indicating that 20% of the data will be allocated for testing, while the remaining 80% will be used for training.

The random_state parameter is set to 42, which ensures reproducibility by fixing the random seed. This means that the same split will be obtained every time the code is executed.

The function returns four sets of data: X_train (training features), X_test (test features), y_train (training labels), and y_test (test labels).
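As a quick check on the split described above, the following sketch prints the shapes of the resulting arrays. It also tries the optional stratify=y argument of train_test_split, which the original cell does not use; it appears here only as an illustration of how to keep the class proportions equal in both subsets.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
X, y = iris.data, iris.target

# Same call as in the notebook: 20% of the 150 samples go to the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(X_train.shape, X_test.shape)  # (120, 4) (30, 4)

# Optional variant (not in the original cell): stratify on the labels so
# each of the three species is equally represented in the test set.
_, _, _, y_test_strat = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
test_class_counts = np.bincount(y_test_strat)
print(test_class_counts)  # [10 10 10]
```

Without stratify, the class counts in the test set depend on the random shuffle and are usually close to, but not exactly, balanced.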


In [3]:
# Scale the features
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)


StandardScaler() from scikit-learn is an object used for standardizing features by removing the mean and scaling to unit variance.

scaler.fit_transform(X_train) fits the scaler on the training set (X_train) and then transforms it. This means the scaler calculates the mean and standard deviation from the training set and applies the transformation to standardize the features.

scaler.transform(X_test) applies the same transformation to the test set (X_test) using the mean and standard deviation calculated from the training set. This ensures that the test set is scaled consistently with the training set.
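To make the fit/transform distinction concrete, here is a minimal sketch that reproduces the scaler's test-set transformation by hand. The small arrays X_train_demo and X_test_demo are made-up stand-ins for the real data, not values from the Iris dataset.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up arrays standing in for X_train and X_test
X_train_demo = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])
X_test_demo = np.array([[2.5, 25.0], [5.0, 50.0]])

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train_demo)
X_test_scaled = scaler.transform(X_test_demo)

# Manual version: the statistics come from the training data only
mean = X_train_demo.mean(axis=0)
std = X_train_demo.std(axis=0)  # population std (ddof=0), as StandardScaler uses
manual_test_scaled = (X_test_demo - mean) / std

print(np.allclose(X_test_scaled, manual_test_scaled))  # True
```

Note that the scaled test set will generally not have exactly zero mean and unit variance, because the statistics were estimated on the training set.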

In [4]:
# Convert numpy arrays to PyTorch tensors
X_train = torch.from_numpy(X_train).float()
X_test = torch.from_numpy(X_test).float()
y_train = torch.from_numpy(y_train).long()
y_test = torch.from_numpy(y_test).long()

torch.from_numpy() is a function in PyTorch that creates a tensor from a NumPy array.

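.float() converts the tensor to 32-bit floating point (torch.float32). NumPy arrays are float64 by default, while the weights of nn.Linear layers are float32 by default, so the input features need to match.

.long() converts the tensor to 64-bit integers (torch.int64). This is the dtype nn.CrossEntropyLoss expects when the targets are class indices.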

By converting the NumPy arrays to PyTorch tensors, we enable seamless integration with PyTorch operations and take advantage of PyTorch's automatic differentiation capabilities during the training process. It allows us to perform computations on the GPU if available and utilize the vast ecosystem of PyTorch libraries and functionalities for deep learning tasks.
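The effect of the conversions can be checked by printing the dtypes. The sketch below uses tiny made-up arrays; the label array is deliberately created as int32 so that the effect of .long() is visible.

```python
import numpy as np
import torch

features_np = np.array([[5.1, 3.5, 1.4, 0.2]])   # NumPy defaults to float64
labels_np = np.array([0], dtype=np.int32)        # int32 chosen for the demo

features_raw = torch.from_numpy(features_np)     # keeps the NumPy dtype
features = features_raw.float()                  # float64 -> float32
labels = torch.from_numpy(labels_np).long()      # int32 -> int64

print(features_raw.dtype)  # torch.float64
print(features.dtype)      # torch.float32
print(labels.dtype)        # torch.int64
```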

In [5]:
# Create a custom dataset
class IrisDataset(Dataset):
    def __init__(self, features, labels):
        self.features = features
        self.labels = labels

    def __getitem__(self, index):
        x = self.features[index]
        y = self.labels[index]
        return x, y

    def __len__(self):
        return len(self.features)

The __init__ method is the initializer of the class. It takes two arguments: features and labels. These arguments represent the input features and target labels for the Iris dataset.

In the __init__ method, the input features and labels are assigned to instance variables self.features and self.labels, respectively.

The __getitem__ method is implemented to retrieve an item from the dataset given an index. It takes an index as input and returns the corresponding features (x) and labels (y) at that index.

The __len__ method returns the total number of samples in the dataset, which is determined by the length of the features array.

By creating this custom dataset class, we can leverage the functionality provided by PyTorch's DataLoader class to efficiently load and process the Iris dataset during training and testing. The custom dataset encapsulates the features and labels, allowing us to easily retrieve individual samples when iterating over the dataset.
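As a small usage sketch, the dataset can be indexed and measured like a Python sequence. The class is repeated so the snippet runs on its own, and the tensors demo_features and demo_labels are made-up values for illustration.

```python
import torch
from torch.utils.data import Dataset


class IrisDataset(Dataset):
    """Same class as in the cell above, repeated so this snippet runs alone."""

    def __init__(self, features, labels):
        self.features = features
        self.labels = labels

    def __getitem__(self, index):
        return self.features[index], self.labels[index]

    def __len__(self):
        return len(self.features)


# Made-up tensors: 3 samples with 4 features each
demo_features = torch.tensor([[5.1, 3.5, 1.4, 0.2],
                              [6.2, 2.9, 4.3, 1.3],
                              [7.7, 3.0, 6.1, 2.3]])
demo_labels = torch.tensor([0, 1, 2])

dataset = IrisDataset(demo_features, demo_labels)
print(len(dataset))        # calls __len__ -> 3
x, y = dataset[1]          # calls __getitem__(1)
print(x.shape, y.item())   # torch.Size([4]) 1
```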

In [6]:
# Define the neural network
class NeuralNetwork(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes):
        super(NeuralNetwork, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        out = self.fc1(x)
        out = self.relu(out)
        out = self.fc2(out)
        return out

The NeuralNetwork class is defined as a subclass of nn.Module, which is the base class for all neural network modules in PyTorch.

The __init__ method is the initialization method of the class. It is called when an instance of the class is created. Here, it takes three arguments: input_size, hidden_size, and num_classes.

Inside the __init__ method, the structure of the neural network is defined by instantiating different layers.

self.fc1 = nn.Linear(input_size, hidden_size) defines the first fully connected layer with input_size input features and hidden_size output features. This layer performs a linear transformation on the input data.

self.relu = nn.ReLU() defines the activation function to be applied element-wise after the first fully connected layer. ReLU (Rectified Linear Unit) is a popular choice for introducing non-linearity in neural networks.

self.fc2 = nn.Linear(hidden_size, num_classes) defines the second fully connected layer with hidden_size input features and num_classes output features. This layer transforms the data into the desired output size.

The forward method defines the forward pass computation of the neural network. It takes an input tensor x as an argument and defines the sequence of operations to be applied to the input.

Inside the forward method, the input tensor x is passed through the layers of the neural network in the defined order.
The output tensor out is returned at the end of the forward method.

By defining the neural network model in this way, we can create an instance of the NeuralNetwork class, which represents our customized neural network architecture. This allows us to easily access and manipulate the model's parameters, perform forward pass computations, and train the model on data.
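A quick way to sanity-check an architecture like this is to push a random batch through it and count the parameters. The sketch below repeats the class (in slightly condensed form) so it is self-contained; the sizes 4, 64, and 3 match the values used later in the notebook, and the input batch is random placeholder data.

```python
import torch
import torch.nn as nn


class NeuralNetwork(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes):
        super().__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        return self.fc2(self.relu(self.fc1(x)))


model = NeuralNetwork(input_size=4, hidden_size=64, num_classes=3)

batch = torch.randn(5, 4)    # 5 random samples, 4 features each
logits = model(batch)        # calling the model runs forward()
print(logits.shape)          # torch.Size([5, 3])

# fc1: 4*64 weights + 64 biases; fc2: 64*3 weights + 3 biases
num_params = sum(p.numel() for p in model.parameters())
print(num_params)            # 515
```

The output has one row per sample and one column per class; these are raw scores, not probabilities.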

In [7]:
# Set hyperparameters
input_size = X_train.shape[1]
hidden_size = 64
num_classes = len(iris.target_names)
learning_rate = 0.001
batch_size = 16
num_epochs = 100

In the provided code, hyperparameters for training the neural network on the Iris dataset are set. These hyperparameters determine the configuration and behavior of the neural network during the training process. Let's break down each hyperparameter:

input_size represents the number of input features in the dataset. In this case, it is set to the number of columns in the training set (X_train.shape[1]), indicating the size of the input layer of the neural network.

hidden_size represents the number of units/neurons in the hidden layer of the neural network. It is set to 64 in this example, but it can be adjusted based on the complexity of the problem.

num_classes corresponds to the number of classes in the classification problem. It is determined by the length of iris.target_names, which holds the names of the three species in the Iris dataset.

learning_rate determines the step size at each iteration during gradient descent optimization. It controls how much the weights are updated based on the calculated gradients.

batch_size defines the number of samples in each mini-batch during training. It indicates how many samples are processed before updating the weights and biases of the neural network.

num_epochs represents the number of times the entire training dataset is passed through the neural network during training. It defines the number of iterations over the entire dataset.

By setting these hyperparameters, we can configure the network's architecture, control the learning process, and define the number of iterations required for training the neural network on the Iris dataset. These hyperparameters can be adjusted based on the specific requirements of the problem and the characteristics of the dataset.
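The interaction between batch_size and num_epochs can be worked out with a little arithmetic. The sketch below assumes 120 training samples (80% of the 150 Iris samples) and that the last, smaller mini-batch is kept, which is the DataLoader default.

```python
import math

num_train_samples = 120   # 80% of the 150 Iris samples
batch_size = 16
num_epochs = 100

# Number of mini-batches (and therefore weight updates) per epoch
steps_per_epoch = math.ceil(num_train_samples / batch_size)
# Size of the final, partially filled mini-batch
last_batch_size = num_train_samples - (steps_per_epoch - 1) * batch_size
# Total number of optimizer steps over the whole training run
total_updates = steps_per_epoch * num_epochs

print(steps_per_epoch)   # 8
print(last_batch_size)   # 8
print(total_updates)     # 800
```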










In [8]:
# Create the dataloaders
train_dataset = IrisDataset(X_train, y_train)
test_dataset = IrisDataset(X_test, y_test)
train_dataloader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
test_dataloader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

train_dataset is an instance of the IrisDataset class, created using the training features (X_train) and labels (y_train).

test_dataset is an instance of the IrisDataset class, created using the test features (X_test) and labels (y_test).

train_dataloader is a DataLoader object that wraps the train_dataset and provides an efficient way to iterate over the training samples in mini-batches. It takes the train_dataset as input and specifies the batch_size for each mini-batch.

Additionally, setting shuffle=True reshuffles the samples at the beginning of each epoch, so the mini-batches differ from one epoch to the next and the model never sees the data in a fixed order.

test_dataloader is a DataLoader object that wraps the test_dataset and is used for evaluating the performance of the trained model on the test set. Similar to train_dataloader, it specifies the batch_size for processing the test samples.

Setting shuffle=False ensures that the samples are processed in the order they appear in the test dataset, maintaining consistency for evaluation purposes.
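To see what a DataLoader actually yields, the sketch below iterates over one epoch and records the batch sizes. It uses random placeholder data of the same size as the Iris training set, and TensorDataset as a stand-in for the custom IrisDataset class so that the snippet stays short.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder data with the same size as the Iris training set (120 x 4)
features = torch.randn(120, 4)
labels = torch.randint(0, 3, (120,))
dataset = TensorDataset(features, labels)   # stand-in for IrisDataset

loader = DataLoader(dataset, batch_size=16, shuffle=True)

# One pass over the loader is one epoch
batch_sizes = [inputs.shape[0] for inputs, _ in loader]
print(len(loader))    # 8
print(batch_sizes)    # [16, 16, 16, 16, 16, 16, 16, 8]
```

The last mini-batch is smaller because 120 is not a multiple of 16.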

In [9]:
# Initialize the neural network
model = NeuralNetwork(input_size, hidden_size, num_classes)

NeuralNetwork is the custom class defined in cell [6] that represents the architecture of the neural network. It takes three arguments: input_size, hidden_size, and num_classes.

input_size is the number of input features in the dataset. It determines the size of the input layer of the neural network.

hidden_size is the number of units/neurons in the hidden layer of the neural network. It specifies the size and complexity of the hidden layer.

num_classes is the number of classes in the classification problem. It defines the size of the output layer of the neural network, which is typically equal to the number of classes.

By initializing the neural network model using the NeuralNetwork class, we create the layers and their randomly initialized parameters. The model can then be trained on the training dataset and used for making predictions on new data.

In [10]:
# Define the loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=learning_rate)

criterion is an instance of the nn.CrossEntropyLoss class from PyTorch. This loss function is commonly used for multi-class classification problems. It combines the log-softmax function and the negative log likelihood loss, making it suitable for training neural networks to classify multiple classes. Because the softmax is applied inside the loss, the model's forward method returns raw scores (logits) rather than probabilities.

optimizer is an instance of the optim.Adam class from PyTorch. It is used to optimize the model's parameters during the training process. The Adam optimizer is a popular optimization algorithm that utilizes adaptive learning rates and momentum. It updates the model's parameters based on the gradients computed during backpropagation.

The loss function (criterion) measures the discrepancy between the predicted class scores and the true class labels. The optimizer (optimizer) adjusts the model's parameters in the direction that minimizes the loss function. By using these components together, the neural network can be trained to minimize the loss and improve its predictive performance.

The choice of loss function and optimizer can vary depending on the problem and the characteristics of the dataset. nn.CrossEntropyLoss and optim.Adam are commonly used defaults, but other options may be suitable in different scenarios.
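The statement that the cross-entropy loss combines a softmax step with the negative log likelihood can be verified numerically. The sketch below uses made-up logits and targets and compares nn.CrossEntropyLoss against the explicit two-step computation with F.log_softmax and F.nll_loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])   # made-up raw model outputs
targets = torch.tensor([0, 2])             # true class indices

# One-step version, as used in the notebook
loss_combined = nn.CrossEntropyLoss()(logits, targets)

# Same computation in two explicit steps
log_probs = F.log_softmax(logits, dim=1)
loss_two_step = F.nll_loss(log_probs, targets)

print(torch.allclose(loss_combined, loss_two_step))  # True
```

This is why the network defined earlier has no softmax layer at its output: adding one would apply the softmax twice during training.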








In [11]:
# Training loop
for epoch in range(num_epochs):
    for i, (inputs, labels) in enumerate(train_dataloader):
        # Forward pass
        outputs = model(inputs)
        loss = criterion(outputs, labels)

        # Backward and optimize
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if (i + 1) % 5 == 0:
            print(f"Epoch [{epoch+1}/{num_epochs}], Step [{i+1}/{len(train_dataloader)}], Loss: {loss.item():.4f}")

Epoch [1/100], Step [5/8], Loss: 1.1945
Epoch [2/100], Step [5/8], Loss: 1.1145
Epoch [3/100], Step [5/8], Loss: 0.9424
Epoch [4/100], Step [5/8], Loss: 0.8364
Epoch [5/100], Step [5/8], Loss: 0.7992
Epoch [6/100], Step [5/8], Loss: 0.7936
Epoch [7/100], Step [5/8], Loss: 0.6044
Epoch [8/100], Step [5/8], Loss: 0.6036
Epoch [9/100], Step [5/8], Loss: 0.5390
Epoch [10/100], Step [5/8], Loss: 0.5171
Epoch [11/100], Step [5/8], Loss: 0.4455
Epoch [12/100], Step [5/8], Loss: 0.5367
Epoch [13/100], Step [5/8], Loss: 0.4614
Epoch [14/100], Step [5/8], Loss: 0.5400
Epoch [15/100], Step [5/8], Loss: 0.3863
Epoch [16/100], Step [5/8], Loss: 0.4385
Epoch [17/100], Step [5/8], Loss: 0.3612
Epoch [18/100], Step [5/8], Loss: 0.3908
Epoch [19/100], Step [5/8], Loss: 0.3229
Epoch [20/100], Step [5/8], Loss: 0.3117
Epoch [21/100], Step [5/8], Loss: 0.4300
Epoch [22/100], Step [5/8], Loss: 0.1681
Epoch [23/100], Step [5/8], Loss: 0.2430
Epoch [24/100], Step [5/8], Loss: 0.2964
Epoch [25/100], Step [5/8

The outer loop iterates over the specified number of epochs (num_epochs), representing the number of times the entire training dataset is passed through the neural network.

The inner loop iterates over the mini-batches of the training dataset provided by the train_dataloader.

inputs and labels represent a mini-batch of input features and corresponding target labels.

The forward pass is performed by passing the inputs through the model, which computes the predicted outputs.

The loss is calculated using the specified loss function (criterion) by comparing the predicted outputs with the true labels.

optimizer.zero_grad() clears the gradients stored on the model's parameters. PyTorch accumulates gradients across backward() calls by default, so without this step the gradients from the previous iteration would be added to the new ones.

The gradients are computed by calling loss.backward(), which calculates the gradients of the loss with respect to the model's parameters using backpropagation.

The optimizer's step() function is called to update the model's parameters based on the computed gradients.

The if condition checks whether the current step (i + 1) is a multiple of 5. If so, it prints the current epoch, step, and loss value. With 8 mini-batches per epoch, this happens once per epoch, at step 5.

This printing statement is optional and is commonly used to monitor the training progress and observe the decreasing loss value.

By executing this training loop, the neural network iteratively learns from the training dataset, adjusting its parameters to minimize the loss. The process involves forwarding the inputs through the model, calculating the loss, backpropagating the gradients, and updating the model's parameters using the optimizer. This loop is repeated for the specified number of epochs, allowing the network to converge and improve its performance on the task.
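The reason for clearing gradients on every iteration can be seen on a single made-up parameter w. This sketch calls backward() repeatedly on the same simple loss and inspects w.grad directly; it clears the gradient with w.grad.zero_(), which has the same effect on this one tensor as optimizer.zero_grad() has on all of a model's parameters.

```python
import torch

w = torch.tensor([1.0], requires_grad=True)   # a single made-up parameter

# The loss is 2*w, so its gradient with respect to w is 2
(2 * w).sum().backward()
grad_first = w.grad.clone()
print(grad_first)          # tensor([2.])

# Without clearing, a second backward pass adds to the stored gradient
(2 * w).sum().backward()
grad_accumulated = w.grad.clone()
print(grad_accumulated)    # tensor([4.])

# Clearing first gives the gradient of the current loss only
w.grad.zero_()
(2 * w).sum().backward()
grad_after_reset = w.grad.clone()
print(grad_after_reset)    # tensor([2.])
```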

In [12]:
# Evaluation
model.eval()
with torch.no_grad():
    correct = 0
    total = 0
    for inputs, labels in test_dataloader:
        outputs = model(inputs)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

    accuracy = 100 * correct / total
    print(f"Test Accuracy: {accuracy:.2f}%")

Test Accuracy: 100.00%


model.eval() sets the model to evaluation mode. This matters for layers such as dropout and batch normalization, which behave differently during training and evaluation. The network in this notebook uses neither, so the call has no effect here, but it is a good habit to include it.

with torch.no_grad(): is a context manager that disables gradient calculation. It is used during evaluation to save memory and speed up computations since gradients are not needed for evaluation.

correct and total variables are initialized to keep track of the number of correctly classified samples and the total number of samples in the test dataset, respectively.

The code then iterates over the test dataset using the test_dataloader.

For each mini-batch of inputs and labels, the model performs a forward pass (outputs = model(inputs)).

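The predicted class labels are obtained with torch.max(outputs.data, 1), which takes the maximum along the second dimension (1) of the outputs tensor, that is, across the class scores of each sample. It returns both the maximum values and their indices. The values are assigned to the placeholder _ and ignored, while the indices, stored in predicted, are the predicted class labels. The .data attribute is not required here, since gradient tracking is already disabled by torch.no_grad().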

labels.size(0) gives the number of labels in the current mini-batch, which is added to the total count.

(predicted == labels).sum().item() calculates the number of correctly classified samples in the mini-batch and adds it to the correct count.

After iterating over all mini-batches in the test dataset, the accuracy is calculated by dividing the correct count by the total count and multiplying by 100. Finally, the test accuracy is printed.

By evaluating the model on the test dataset, we can assess how well it generalizes to unseen data and obtain a measure of its performance in terms of accuracy.
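The prediction and accuracy steps can be traced on a handful of made-up logits. The sketch below also shows that torch.argmax gives the same predictions as the index returned by torch.max; the values of outputs and labels are invented for illustration.

```python
import torch

outputs = torch.tensor([[2.0, 0.5, -1.0],
                        [0.1, 0.2, 3.0],
                        [0.3, 1.5, 0.2],
                        [1.2, 0.4, 0.1]])   # made-up logits, 4 samples x 3 classes
labels = torch.tensor([0, 2, 1, 2])

# torch.max along dim 1 returns (max values, indices of the max)
max_values, predicted = torch.max(outputs, 1)
print(predicted)   # tensor([0, 2, 1, 0])

# torch.argmax returns only the indices
same = torch.equal(predicted, torch.argmax(outputs, dim=1))

correct = (predicted == labels).sum().item()
accuracy = 100 * correct / labels.size(0)
print(same, correct, accuracy)   # True 3 75.0
```

With only 30 test samples, as in this notebook, each misclassification changes the accuracy by more than 3 percentage points, so a single run gives a fairly coarse estimate.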