# Neural Network

## 3.1 Dataloader
### What is Dataloader
Dataloader is a class that helps with shuffling and organizing the data in minibatches. We can import this class from `torch.utils.data`.

The job of a data loader is to sample minibatches from a dataset, giving us the flexibility to choose the size of our minibatch to be use for training in each iteration. The constructor takes a `Dataset` object as input, along with `batch_size` and a `shuffle` boolean variable that indicates whether the data needs to be shuffled at the beginning of each epoch.

In [1]:
# importing the required library
import torch
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import Dataset, DataLoader

In [2]:
# Loading/Downloading the FashionMNIST dataset, download might takes some time 
train_set = torchvision.datasets.FashionMNIST(
    root = '../data',
    train = True,
    download = True,
    transform = transforms.ToTensor()
    )
test_set = torchvision.datasets.FashionMNIST(
    root = '../data',
    train = False,
    download = True,
    transform = transforms.ToTensor()
    )

Loading the dataset into the DataLoader and input your desired batch size for training

In [3]:
train_loader = DataLoader(train_set, batch_size = 32, shuffle = True)
test_loader = DataLoader(test_set, batch_size = 32, shuffle = False)

In [4]:
# A view of the DataLoader

batch = next(iter(train_loader))
images, labels = batch

# Output the size of each batch
print(images.shape, labels.shape)

torch.Size([32, 1, 28, 28]) torch.Size([32])


## 3.2 Build your first Neural Network (Subclassing nn.Module)

### 3.2.1 Model Training
We had loaded our dataset into training and testing set, now let us build a simple Feedfoward Neural Network to perform classification on this dataset.

PyTorch has a whole submodule dedicated to neural networks, called `torch.nn`. It contains the building blocks needed to create all sorts of neural network architectures.

To build a Neural Network, it could be done in two ways :
- Subclassing `nn.Module` to have more flexibility on designing the network, eg: writing the your own `foward()` method
- Calling the `nn.Sequential()` for fast implementation of the network

Now let us start building the Neural Network

In [5]:
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F

We would like to build a 4 layers neural network with ReLU activation function. Apply dropout with 20% probability to reduce the effect of overfitting

In [6]:
class Classifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc_1 = nn.Linear(784, 256)
        self.act_1 = nn.ReLU()
        self.fc_2 = nn.Linear(256, 128)
        self.act_2 = nn.ReLU()
        self.fc_3 = nn.Linear(128, 64)
        self.act_3 = nn.ReLU()
        self.fc_4 = nn.Linear(64, 10)
        self.dropout = nn.Dropout(0.2)
        
    def forward(self, x):
        out = self.dropout(self.act_1(self.fc_1(x)))
        out = self.dropout(self.act_2(self.fc_2(out)))
        out = self.dropout(self.act_3(self.fc_3(out)))
        out = self.fc_4(out)
        return out

# Or you can use the Pytorch provided functional API when defining the forward method. Both of these are the same.

class Classifier_F(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc_1 = nn.Linear(784, 256)
        self.fc_2 = nn.Linear(256, 128)
        self.fc_3 = nn.Linear(128, 64)
        self.fc_4 = nn.Linear(64, 10)
        
    def forward(self, x):
        out = F.dropout(F.relu(self.fc_1(x)), p = 0.2)
        out = F.dropout(F.relu(self.fc_2(out)), p = 0.2)
        out = F.dropout(F.relu(self.fc_3(out)), p = 0.2)
        out = self.fc_4(out)
        return out

We will build a wrapper function for our training called `training`. This wrapper function will take on parameters:
- n_epochs
- optimizer
- model
- loss_fn
- train_loader
- writer (Instance of Summary Writer to use TensorBoard for visualization)

Pytorch does support TensorBoard which provides the visualization and tooling needed for machine learning experimentation. It is a useful tool that we can use during our training. Now let's define our training loop and implement some of the TensorBoard methods. 

If you wish to know more on TensorBoard, you can access it at [here](https://pytorch.org/docs/stable/tensorboard.html)

In [7]:
from torch.utils.tensorboard import SummaryWriter

def training(n_epochs, optimizer, model, loss_fn, train_loader, writer):
    for epoch in range(1, n_epochs + 1):
        loss_train = 0.0
        total = 0
        correct = 0
        for imgs, labels in train_loader:
            # Clearing gradient from previous mini-batch gradient computation  
            optimizer.zero_grad()
            
            # Reshape the tensor so that it fits the dimension of our input layer
            # Get predictions output from the model
            outputs = model(imgs.view(-1, 784))
            
            # Calculate the loss for curernt batch
            loss = loss_fn(outputs, labels)
            
            # Calculating the gradient
            loss.backward()
            
            # Updating the weights and biases using optimizer.step
            optimizer.step()
            
            # Summing up the loss over each epoch
            loss_train += loss.item()
            
            # Calculating the accuracy
            predictions = torch.max(outputs, 1)[1]
            correct += (predictions == labels).sum().item()
            total += len(labels)

        accuracy = correct * 100 / total
        writer.add_scalar('Loss ', loss_train / len(train_loader), epoch)
        writer.add_scalar('Accuracy ', accuracy, epoch)
        print('Epoch {}, Training loss {} , Accuracy {:.2f} %'.format(epoch, loss_train / len(train_loader), accuracy))
    writer.close()

We can open our TensorBoard in the terminal with the command of "tensorboard --logdir=runs". Do remember change to the same directory as this notebook.

Now we are ready for training. Let's use SGD as our optimizer and CrossEntropy as loss function. 

In [8]:
torch.manual_seed(0)
model_SGD = Classifier() 
optimizer = optim.SGD(model_SGD.parameters(), lr = 1e-3) 
loss_fn = nn.CrossEntropyLoss()
writer = SummaryWriter(comment = 'SGD')
training(
    n_epochs = 10,
    optimizer = optimizer,
    model = model_SGD,
    loss_fn = loss_fn,
    train_loader = train_loader,
    writer = writer
)

Epoch 1, Training loss 2.2894890218098958 , Accuracy 18.39 %
Epoch 2, Training loss 2.2399076170603434 , Accuracy 28.73 %
Epoch 3, Training loss 2.068951116498311 , Accuracy 29.45 %
Epoch 4, Training loss 1.695164651552836 , Accuracy 36.16 %
Epoch 5, Training loss 1.4096814838409424 , Accuracy 46.69 %
Epoch 6, Training loss 1.2168791191418966 , Accuracy 52.60 %
Epoch 7, Training loss 1.1041425074577331 , Accuracy 56.25 %
Epoch 8, Training loss 1.0339518047332763 , Accuracy 59.37 %
Epoch 9, Training loss 0.9758181870142619 , Accuracy 62.02 %
Epoch 10, Training loss 0.9312916868527731 , Accuracy 63.92 %


Let us build another model which we set log softmax as the activation function at the output layer and uses Negative log-likelihood loss function. Compare the results for both of these setting

In [9]:
torch.manual_seed(0)
class Classifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc_1 = nn.Linear(784, 256)
        self.act_1 = nn.ReLU()
        self.fc_2 = nn.Linear(256, 128)
        self.act_2 = nn.ReLU()
        self.fc_3 = nn.Linear(128, 64)
        self.act_3 = nn.ReLU()
        self.fc_4 = nn.Linear(64, 10)
        self.dropout = nn.Dropout(0.2)
        
    def forward(self, x):
        out = self.dropout(self.act_1(self.fc_1(x)))
        out = self.dropout(self.act_2(self.fc_2(out)))
        out = self.dropout(self.act_3(self.fc_3(out)))
        # adding in softmax
        out = F.log_softmax(self.fc_4(out),dim =1 )
        return out

model_SGD = Classifier() 
optimizer = optim.SGD(model_SGD.parameters(), lr = 1e-3) 
loss_fn = nn.NLLLoss()
writer = SummaryWriter(comment = 'SGD')
training(
    n_epochs = 10,
    optimizer = optimizer,
    model = model_SGD,
    loss_fn = loss_fn,
    train_loader = train_loader,
    writer = writer
)

Epoch 1, Training loss 2.2894890218098958 , Accuracy 18.39 %
Epoch 2, Training loss 2.2399076170603434 , Accuracy 28.73 %
Epoch 3, Training loss 2.068951116498311 , Accuracy 29.45 %
Epoch 4, Training loss 1.695164651552836 , Accuracy 36.16 %
Epoch 5, Training loss 1.4096814838409424 , Accuracy 46.69 %
Epoch 6, Training loss 1.2168791191418966 , Accuracy 52.60 %
Epoch 7, Training loss 1.1041425074577331 , Accuracy 56.25 %
Epoch 8, Training loss 1.0339518047332763 , Accuracy 59.37 %
Epoch 9, Training loss 0.9758181870142619 , Accuracy 62.02 %
Epoch 10, Training loss 0.9312916868527731 , Accuracy 63.92 %


CrossEntropy Loss is actually performing log softmax and negative log likelihood at the same time. Therefore during the construction of our model we could neglect the declaration of activation function at the output layer and save some memory during the backpropagation.

Let us try using other optimizer (Adam) to do our training. Optimizer is one of the hyperparameters that we can tune on.

In [10]:
model_Adam = Classifier() 
optimizer = optim.Adam(model_Adam.parameters(), lr = 1e-3) 
loss_fn = nn.CrossEntropyLoss()
writer = SummaryWriter(comment = 'Adam')
training(
    n_epochs = 10,
    optimizer = optimizer,
    model = model_Adam,
    loss_fn = loss_fn,
    train_loader = train_loader,
    writer = writer
)

Epoch 1, Training loss 0.5945944479823112 , Accuracy 78.32 %
Epoch 2, Training loss 0.423241344755888 , Accuracy 84.78 %
Epoch 3, Training loss 0.38519719421068827 , Accuracy 86.15 %
Epoch 4, Training loss 0.36408053546349206 , Accuracy 86.94 %
Epoch 5, Training loss 0.35000673046310743 , Accuracy 87.39 %
Epoch 6, Training loss 0.3385574172397455 , Accuracy 87.74 %
Epoch 7, Training loss 0.32801985016465185 , Accuracy 88.09 %
Epoch 8, Training loss 0.3184917394856612 , Accuracy 88.41 %
Epoch 9, Training loss 0.31102090905706087 , Accuracy 88.59 %
Epoch 10, Training loss 0.3041634604026874 , Accuracy 88.89 %


In this case, we can see that Adam is performing better than the SGD with the same setting. Hyperparameter tuning is very important in order to obtain desired result

### 3.2.2 Model Saving
After training the model, we would like to save it for future usages. There are some pretty useful functions you might need to familar with:

- `torch.save`: It serialize the object to save to your machine. Models, tensors, and dictionaries of all kinds of objects can be saved using this function.
- `torch.load`: This function uses pickle’s unpickling facilities to deserialize pickled object files to memory.
- `torch.nn.Module.load_state_dict`: Loads a model’s parameter dictionary using a deserialized state_dict.

If you wish to know more on model saving, you can access it at [here](https://pytorch.org/tutorials/beginner/saving_loading_models.html)

#### Saving only the weights

In [11]:
import os
if not os.path.exists('../generated_model'):
    os.mkdir('../generated_model')

In [12]:
# Saving the weights only of the model
torch.save(model_Adam.state_dict(),  '../generated_model/mnist_state_dict.pt')

In [13]:
# To load the state_dict, you must have an instance of the model
modelLoad = Classifier()
modelLoad.load_state_dict(torch.load('../generated_model/mnist_state_dict.pt'))

<All keys matched successfully>

#### Saving the entire model

In [14]:
# Saving the entire model
torch.save(model_Adam, '../generated_model/mnist_model.pt')

In [15]:
# Loading model
modelLoad = torch.load('../generated_model/mnist_model.pt')

### Add-ons: Saving Model in ONNX format
Pytorch also support saving model as ONNX (Open Neural Network Exchange) file type, which is a open format built to represent machine learning models. Let's see how to do it.

In [16]:
import torch.onnx 
dummy_input = torch.randn(32, 784, requires_grad = True)
torch.onnx.export(model_Adam, dummy_input, '../generated_model/model.onnx', verbose = True, input_names = ['input'], output_names = ['output'])

graph(%input : Float(32:784, 784:1),
      %fc_1.weight : Float(256:784, 784:1),
      %fc_1.bias : Float(256:1),
      %fc_2.weight : Float(128:256, 256:1),
      %fc_2.bias : Float(128:1),
      %fc_3.weight : Float(64:128, 128:1),
      %fc_3.bias : Float(64:1),
      %fc_4.weight : Float(10:64, 64:1),
      %fc_4.bias : Float(10:1)):
  %9 : Float(32:256, 256:1) = onnx::Gemm[alpha=1., beta=1., transB=1](%input, %fc_1.weight, %fc_1.bias) # C:\Users\GuanSheng.Wong\anaconda3\envs\Intro_to_Pytorch\lib\site-packages\torch\nn\functional.py:1674:0
  %10 : Float(32:256, 256:1) = onnx::Relu(%9) # C:\Users\GuanSheng.Wong\anaconda3\envs\Intro_to_Pytorch\lib\site-packages\torch\nn\functional.py:973:0
  %11 : Float(32:128, 128:1) = onnx::Gemm[alpha=1., beta=1., transB=1](%10, %fc_2.weight, %fc_2.bias) # C:\Users\GuanSheng.Wong\anaconda3\envs\Intro_to_Pytorch\lib\site-packages\torch\nn\functional.py:1674:0
  %12 : Float(32:128, 128:1) = onnx::Relu(%11) # C:\Users\GuanSheng.Wong\anaconda3\envs\Int

In [17]:
import onnx
#loading the onnx format model
model = onnx.load('../generated_model/model.onnx')

### 3.2.3 Inference
Sometimes, we would like to inference on the trained model to evaluate the performance. `model.eval()` will set the model to evaluation(inference) mode to set dropout, batch normalization layers, etc.. to evaluation mode

In [18]:
# Using previous loaded model
modelLoad.eval()           

Classifier(
  (fc_1): Linear(in_features=784, out_features=256, bias=True)
  (act_1): ReLU()
  (fc_2): Linear(in_features=256, out_features=128, bias=True)
  (act_2): ReLU()
  (fc_3): Linear(in_features=128, out_features=64, bias=True)
  (act_3): ReLU()
  (fc_4): Linear(in_features=64, out_features=10, bias=True)
  (dropout): Dropout(p=0.2, inplace=False)
)

After setting it to inference mode, we could pass in test data with the setting of 
```python 
with torch.no_grad():
``` 
as we do not have to calculate the gradient during the inference, this can help us save some memory.

In [19]:
with torch.no_grad():
    correct = 0
    total = 0
    for images, labels in test_loader:
        outputs = modelLoad(images.view(-1, 784))
        predictions = torch.max(outputs, 1)[1]
        correct += (predictions == labels).sum()
        total += len(labels)
    accuracy_test = correct.item() * 100 / total
print("Test Accuracy : {:.2f} %".format(accuracy_test))

Test Accuracy : 88.09 %


## 3.3 Build your first Neural Network (Sequential Model)
### 3.3.1 Model Training

Altough there are many other machine learning techniques to tackle multi-variate linear regression, it would be interesting for us to tackle it using deep learning for learning purposes.
<br>In this sub-section, we will try to perform said regression using PyTorch `SequentialModel` 

We will use the Real Estate dataset from the `realEstate.csv` for our linear regression example. 

Description of data:
- House Age
- Distance from the unit to MRT station
- The number of Convenience Stores around the unit
- House Unit Price per 1000 USD

In [20]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

First we use pandas to load in the csv.<br>
Note that in this dataset there are a total of $3$ features and $1$ label.<br>
Thus from the data we will use `.iloc[]` to distinguish the features and labels.

In [21]:
data = pd.read_csv("../data/Regression/realEstate.csv", header = 0)
n_features = 3
X = data.iloc[:, 0:3].values
y = data.iloc[:, 3].values

Following that, we split our dataset into 70/30 train/test ratio.

In [22]:
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size = 0.7, shuffle = True, random_state = 1022)

Next, we perform feature scaling onto `X_train` and `X_test` using `StandardScaler` from `scikit-learn`.<br>
*Note: only fit the train_set but transform both train and test sets*

In [23]:
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

In section 3.1, we've touch on how Dataloaders are initialized and used in model training. It was simple, which is to pass in whatever `Dataset` we need into the Dataloader initializer. <br>

Here, we are using a custom dataset from a csv file as compared to the previous one which was prepared readily from torchvision. Thus in this case, we will have to build our own by subclassing from `torch.utils.data.Dataset`.

Whilst subclassing `Dataset`, PyTorch [documentation](https://pytorch.org/docs/stable/data.html#torch.utils.data.Dataset) notes that we have to override the `__getitem__()` method and optionally the `__len__()` method.<br>
We will mainly have three methods in this `Dataset` class:
- `__init__(self, data, label)`: helps us pass in the feature and labels into the dataset
- `__len__(self)`:allows the dataset to know how many instances of data there is 
- `__getitem__(self, idx)`:allows the dataset to get items from the data and labels by indexing

In [24]:
class Custom_Dataset(Dataset):
    def __init__(self, features, labels):
        self.features = torch.tensor(features, dtype = torch.float32)
        self.labels = torch.tensor(labels, dtype  = torch.float32)

    def __len__(self):
        return self.features.shape[0]
    
    def __getitem__(self, idx):
        return self.features[idx], self.labels[idx]

After feature scaling, we initialize our custom datasets and put them into `Dataloader` constructor and our data is prepared. The next step will be modeling.

In [25]:
train_dataset = Custom_Dataset(X_train, y_train)
test_dataset = Custom_Dataset(X_test, y_test)
train_loader = DataLoader(train_dataset, batch_size = 32)
test_loader = DataLoader(test_dataset, batch_size = 128 )

Like we previously stated, there are two approaches of modeling.
- Subclassing `nn.Module` 
- Calling the `nn.Sequential()` 

`torch.nn.Sequential` is a simple function that accepts a list of `nn.Modules` and returns a model with all the sequential layers. We will be implementing these few layers:
1. nn.Linear(3,50)
2. nn.ReLU()
3. nn.Linear(50,25)
4. nn.ReLU()
5. nn.Linear(25,10)
6. nn.ReLU()
7. nn.Linear(10,1)

In [26]:
torch.manual_seed(123)
model_sequential = nn.Sequential(nn.Linear(n_features, 50),
                                 nn.ReLU(),
                                 nn.Linear(50, 25),
                                 nn.ReLU(),
                                 nn.Linear(25, 10),
                                 nn.ReLU(),
                                 nn.Linear(10, 1)
                                 )

For this regression probelm, the loss/criterion we will use is Mean-Squared-Error loss, which in PyTorch is `nn.MSELoss()`<br>
We will also choose to use `Adam` as our optimizer.<br> Remember, `torch.optim.*any_optimizer*` accepts `model.parameters()` to keep track of the model's parameters, hence we should always initialize our model first before our optimizer.

In [27]:
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model_sequential.parameters(), lr = 0.01)

Now that our modeling is done, let's commence our training with using the training loop that defined previously

We will build a wrapper function for our training called `train_model`. This wrapper function will take on parameters:
- model
- loader
- loss_function/criterion
- optimizer
- number_of_epochs (optional)a
- iteration_check (optional): *if False is passed in, losses of each iteration per epoch will not be printed>*

Below will be an overall workings an explaination of our train_model function:
1. In each epoch, each minibatch starts with `optimizer.zero_grad()`. This is to clear previously computed gradients from previous minibatches.
2. We get the features and labels by indexing our minibatch.
3. Compute forward propagation by calling `model(features)` and assigning it to a variable `prediction`
4. Compute the loss by calling `criterion(prediction, torch.unsqueeze(labels, dim=1))`
    - the reason we unsqueeze is to make sure the shape of the labels are the same as the predictions, which is (batch_size,1) 
5. Compute backward_propagation by calling `loss.backward()`
6. Update the parameters(learning rate etc.) of the model by calling `optimizer.step()`
7. Increment our running_loss with the loss of our current batch
8. At the end of each epoch, compute the accuracy by dividing the accumulated loss and the amount of data samples, and finally zero the running_loss for the next epoch.


In [28]:
def train_model(model, loader, criterion, optimizer,epochs=5000):
#   this running_loss will keep track of the losses of every epoch from each respective iteration
    running_loss = 0.0
    for epoch in range(1, epochs + 1):
        for i, data in enumerate(loader):
#           zero the parameter gradients
            optimizer.zero_grad()
            features, labels = data[0],data[1]
            prediction = model(features)
            loss = criterion(prediction, torch.unsqueeze(labels,dim=1))
            loss.backward()
            optimizer.step()
            running_loss += loss.item()
        if (epoch % 100 == 0 or epoch == 1):
            print(f"Epoch {epoch} Loss: {running_loss / len(loader)}")     
        running_loss = 0.0

In [29]:
torch.manual_seed(0)
train_model(model_sequential, train_loader, criterion, optimizer)

Epoch 1 Loss: 1559.314471435547
Epoch 100 Loss: 61.67083594799042
Epoch 200 Loss: 57.53920102566481
Epoch 300 Loss: 54.614624582976106
Epoch 400 Loss: 51.69376365095377
Epoch 500 Loss: 49.110941734910014
Epoch 600 Loss: 44.46782956123352
Epoch 700 Loss: 45.49254035949707
Epoch 800 Loss: 45.39475156664848
Epoch 900 Loss: 43.348855590820314
Epoch 1000 Loss: 42.04828781485558
Epoch 1100 Loss: 39.37081394195557
Epoch 1200 Loss: 42.60350239276886
Epoch 1300 Loss: 38.945985350012776
Epoch 1400 Loss: 39.63016664907336
Epoch 1500 Loss: 36.81087758541107
Epoch 1600 Loss: 34.936926842236424
Epoch 1700 Loss: 35.42953658103943
Epoch 1800 Loss: 32.789571383502334
Epoch 1900 Loss: 34.93219475212682
Epoch 2000 Loss: 33.54853103160858
Epoch 2100 Loss: 28.336665666103364
Epoch 2200 Loss: 25.664763996377587
Epoch 2300 Loss: 24.103572607040405
Epoch 2400 Loss: 17.353846311569214
Epoch 2500 Loss: 15.863344663381577
Epoch 2600 Loss: 13.111431193351745
Epoch 2700 Loss: 12.318226540088654
Epoch 2800 Loss: 19

### 3.3.2 Inference

Now let's evaluate our model. Use `model.eval()` to set the model to inference mode

In [30]:
model_sequential.eval()

Sequential(
  (0): Linear(in_features=3, out_features=50, bias=True)
  (1): ReLU()
  (2): Linear(in_features=50, out_features=25, bias=True)
  (3): ReLU()
  (4): Linear(in_features=25, out_features=10, bias=True)
  (5): ReLU()
  (6): Linear(in_features=10, out_features=1, bias=True)
)

Let's say your house age is 10, distance to MRT is 100 meters, and there are 6 convenience stores around the unit, could you predict your house price? Let's use our trained model to find out

In [31]:
with torch.no_grad():
    inference = torch.tensor([[10, 100, 6]])
    inference = torch.from_numpy(scaler.transform(inference))
    predict = model_sequential.forward(inference.float())
        
print("The prediction for your house price is :", predict.item() * 1000)

The prediction for your house price is : 54032.859802246094


# Exercise

In this exercise we will try to build a classifier for our MNIST Handwriting dataset.

Construct transform with the following transforms:
- coverting to tensor
- normalize the tensor with mean=0.15 and std=0.3081

In [32]:
transform = transforms.Compose(
    [transforms.ToTensor(),
    transforms.Normalize((0.15,), (0.3081,))]
)

Obtain the MNIST dataset from `torchvision.datasets`. Load them into respective `Dataloaders`

In [33]:
from torchvision.datasets import MNIST

train = MNIST("../data", download = True, transform = transform, train = True)
test = MNIST("../data", download = True, transform = transform, train = False)

In [34]:
train_loader = DataLoader(train, 100, shuffle = True, num_workers = 0)
test_loader = DataLoader(test, 100, shuffle = False, num_workers = 0)

Declare `SummaryWriter` for TensorBoard

In [35]:
writer = SummaryWriter()

Create a Model with the following layers:
- 4 linear/dense layers
- First 3 with ReLU activation functions

*Note: Remember to resize the incoming tensor first*

In [36]:
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.fc1 = nn.Linear(in_features = 28 * 28, out_features = 1000)
        self.fc2 = nn.Linear(in_features = 1000, out_features = 500)
        self.fc3 = nn.Linear(in_features = 500, out_features = 100)
        self.fc4 = nn.Linear(in_features = 100, out_features = 10)
        
    def forward(self, x):
        x = x.view(-1, 28 * 28)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = F.relu(self.fc3(x))
        x = self.fc4(x)
        return x

Initialize the model and load it to our **GPU**.

In [37]:
model = Model()
if torch.cuda.is_available():
    device = torch.device("cuda:0")
    model.to(device)

Initialize criterion: CrossEntropyLoss and optimizer Adam.

In [38]:
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr = 0.001)

Build a wrapper function `train_model` to train the model using `CUDA`. Add_scalar which shows a loss against epoch graph on TensorBoard.<br>
Here is a checklist for you to keep check what to do:
1. For each iteration in each epoch, zero the gradients of the parameters
2. Forward propagate
3. Calculate loss
4. Write the loss and train to TensorBoard
5. Back propagate
6. Update the parameters
7. For each epoch, calculate the accuracy on our test set

In [39]:
def train_model(model, train_loader, test_loader, criterion, optimizer, epochs = 5):
    accuraccy_list = []
    for epoch in range(epochs):
        total = 0
        correct = 0
        for i, data in enumerate(train_loader):
            inputs, labels = data[0].to(device), data[1].to(device)
            optimizer.zero_grad()
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            writer.add_scalar("Loss/train", loss, epoch)
            loss.backward()
            optimizer.step()
            print(f'Epoch:{epoch + 1} \nIteration:{i + 1} \nLoss:{loss}')
            with torch.no_grad():
                total += labels.size(0)
                _,prediction = torch.max(outputs, 1)
                correct += (prediction == labels).sum().item()
        print(f'\nAccuracy of network in epoch {epoch + 1}: {100 * correct / total}')
    writer.flush()

train_model(model, train_loader, test_loader, criterion, optimizer)
writer.close()

Epoch:1 
Iteration:1 
Loss:2.3120129108428955
Epoch:1 
Iteration:2 
Loss:2.243009090423584
Epoch:1 
Iteration:3 
Loss:2.103304862976074
Epoch:1 
Iteration:4 
Loss:1.938184380531311
Epoch:1 
Iteration:5 
Loss:1.7360073328018188
Epoch:1 
Iteration:6 
Loss:1.4185400009155273
Epoch:1 
Iteration:7 
Loss:1.3077017068862915
Epoch:1 
Iteration:8 
Loss:0.979144811630249
Epoch:1 
Iteration:9 
Loss:0.8673616051673889
Epoch:1 
Iteration:10 
Loss:0.7848854660987854
Epoch:1 
Iteration:11 
Loss:0.9053862690925598
Epoch:1 
Iteration:12 
Loss:0.7195755243301392
Epoch:1 
Iteration:13 
Loss:0.8481418490409851
Epoch:1 
Iteration:14 
Loss:0.7561290860176086
Epoch:1 
Iteration:15 
Loss:0.6766089797019958
Epoch:1 
Iteration:16 
Loss:0.5629891157150269
Epoch:1 
Iteration:17 
Loss:0.6399906277656555
Epoch:1 
Iteration:18 
Loss:0.6289631128311157
Epoch:1 
Iteration:19 
Loss:0.7176056504249573
Epoch:1 
Iteration:20 
Loss:0.6771294474601746
Epoch:1 
Iteration:21 
Loss:0.43189147114753723
Epoch:1 
Iteration:22 
Lo

Epoch:1 
Iteration:174 
Loss:0.23364071547985077
Epoch:1 
Iteration:175 
Loss:0.1508840024471283
Epoch:1 
Iteration:176 
Loss:0.23261654376983643
Epoch:1 
Iteration:177 
Loss:0.23869295418262482
Epoch:1 
Iteration:178 
Loss:0.18907514214515686
Epoch:1 
Iteration:179 
Loss:0.16858088970184326
Epoch:1 
Iteration:180 
Loss:0.3811591863632202
Epoch:1 
Iteration:181 
Loss:0.124379463493824
Epoch:1 
Iteration:182 
Loss:0.31516680121421814
Epoch:1 
Iteration:183 
Loss:0.31521129608154297
Epoch:1 
Iteration:184 
Loss:0.1377878487110138
Epoch:1 
Iteration:185 
Loss:0.18656201660633087
Epoch:1 
Iteration:186 
Loss:0.1332310140132904
Epoch:1 
Iteration:187 
Loss:0.19603176414966583
Epoch:1 
Iteration:188 
Loss:0.19425024092197418
Epoch:1 
Iteration:189 
Loss:0.13017690181732178
Epoch:1 
Iteration:190 
Loss:0.13372930884361267
Epoch:1 
Iteration:191 
Loss:0.1804359406232834
Epoch:1 
Iteration:192 
Loss:0.2930707633495331
Epoch:1 
Iteration:193 
Loss:0.14517799019813538
Epoch:1 
Iteration:194 
Loss

Epoch:1 
Iteration:347 
Loss:0.31348198652267456
Epoch:1 
Iteration:348 
Loss:0.12617792189121246
Epoch:1 
Iteration:349 
Loss:0.12596957385540009
Epoch:1 
Iteration:350 
Loss:0.23239940404891968
Epoch:1 
Iteration:351 
Loss:0.22754289209842682
Epoch:1 
Iteration:352 
Loss:0.10897237807512283
Epoch:1 
Iteration:353 
Loss:0.12211742252111435
Epoch:1 
Iteration:354 
Loss:0.17609861493110657
Epoch:1 
Iteration:355 
Loss:0.16741515696048737
Epoch:1 
Iteration:356 
Loss:0.1579798460006714
Epoch:1 
Iteration:357 
Loss:0.1322966068983078
Epoch:1 
Iteration:358 
Loss:0.17488032579421997
Epoch:1 
Iteration:359 
Loss:0.16777494549751282
Epoch:1 
Iteration:360 
Loss:0.14855535328388214
Epoch:1 
Iteration:361 
Loss:0.22715777158737183
Epoch:1 
Iteration:362 
Loss:0.15137185156345367
Epoch:1 
Iteration:363 
Loss:0.3162267804145813
Epoch:1 
Iteration:364 
Loss:0.1435621976852417
Epoch:1 
Iteration:365 
Loss:0.11869576573371887
Epoch:1 
Iteration:366 
Loss:0.06551741808652878
Epoch:1 
Iteration:367 


Epoch:1 
Iteration:521 
Loss:0.05353476479649544
Epoch:1 
Iteration:522 
Loss:0.23325391113758087
Epoch:1 
Iteration:523 
Loss:0.18582992255687714
Epoch:1 
Iteration:524 
Loss:0.09449417889118195
Epoch:1 
Iteration:525 
Loss:0.09688922017812729
Epoch:1 
Iteration:526 
Loss:0.027983758598566055
Epoch:1 
Iteration:527 
Loss:0.11870747804641724
Epoch:1 
Iteration:528 
Loss:0.1011987179517746
Epoch:1 
Iteration:529 
Loss:0.1776442676782608
Epoch:1 
Iteration:530 
Loss:0.14917439222335815
Epoch:1 
Iteration:531 
Loss:0.14289936423301697
Epoch:1 
Iteration:532 
Loss:0.05317782983183861
Epoch:1 
Iteration:533 
Loss:0.14169421792030334
Epoch:1 
Iteration:534 
Loss:0.19296100735664368
Epoch:1 
Iteration:535 
Loss:0.0548328198492527
Epoch:1 
Iteration:536 
Loss:0.11793999373912811
Epoch:1 
Iteration:537 
Loss:0.1974993497133255
Epoch:1 
Iteration:538 
Loss:0.14726309478282928
Epoch:1 
Iteration:539 
Loss:0.05452985689043999
Epoch:1 
Iteration:540 
Loss:0.08887743204832077
Epoch:1 
Iteration:541 

Epoch:2 
Iteration:96 
Loss:0.050671275705099106
Epoch:2 
Iteration:97 
Loss:0.11057719588279724
Epoch:2 
Iteration:98 
Loss:0.06792283803224564
Epoch:2 
Iteration:99 
Loss:0.1934918463230133
Epoch:2 
Iteration:100 
Loss:0.09845835715532303
Epoch:2 
Iteration:101 
Loss:0.1792984902858734
Epoch:2 
Iteration:102 
Loss:0.13294540345668793
Epoch:2 
Iteration:103 
Loss:0.10784327238798141
Epoch:2 
Iteration:104 
Loss:0.07323049753904343
Epoch:2 
Iteration:105 
Loss:0.04443055018782616
Epoch:2 
Iteration:106 
Loss:0.07564422488212585
Epoch:2 
Iteration:107 
Loss:0.16559630632400513
Epoch:2 
Iteration:108 
Loss:0.11632976680994034
Epoch:2 
Iteration:109 
Loss:0.0820557102560997
Epoch:2 
Iteration:110 
Loss:0.058863043785095215
Epoch:2 
Iteration:111 
Loss:0.0629238709807396
Epoch:2 
Iteration:112 
Loss:0.036725856363773346
Epoch:2 
Iteration:113 
Loss:0.08275307714939117
Epoch:2 
Iteration:114 
Loss:0.12446576356887817
Epoch:2 
Iteration:115 
Loss:0.0786079689860344
Epoch:2 
Iteration:116 
Lo

Epoch:2 
Iteration:268 
Loss:0.052086878567934036
Epoch:2 
Iteration:269 
Loss:0.09080106765031815
Epoch:2 
Iteration:270 
Loss:0.0629081279039383
Epoch:2 
Iteration:271 
Loss:0.0771857425570488
Epoch:2 
Iteration:272 
Loss:0.018934419378638268
Epoch:2 
Iteration:273 
Loss:0.11793094873428345
Epoch:2 
Iteration:274 
Loss:0.04348869249224663
Epoch:2 
Iteration:275 
Loss:0.09788016974925995
Epoch:2 
Iteration:276 
Loss:0.01922612264752388
Epoch:2 
Iteration:277 
Loss:0.13758008182048798
Epoch:2 
Iteration:278 
Loss:0.022951893508434296
Epoch:2 
Iteration:279 
Loss:0.04002572223544121
Epoch:2 
Iteration:280 
Loss:0.058139171451330185
Epoch:2 
Iteration:281 
Loss:0.05044596642255783
Epoch:2 
Iteration:282 
Loss:0.1323990821838379
Epoch:2 
Iteration:283 
Loss:0.12433477491140366
Epoch:2 
Iteration:284 
Loss:0.07417905330657959
Epoch:2 
Iteration:285 
Loss:0.0989016517996788
Epoch:2 
Iteration:286 
Loss:0.029146023094654083
Epoch:2 
Iteration:287 
Loss:0.07612933963537216
Epoch:2 
Iteration:

Epoch:2 
Iteration:439 
Loss:0.10966259986162186
Epoch:2 
Iteration:440 
Loss:0.14692892134189606
Epoch:2 
Iteration:441 
Loss:0.07716568559408188
Epoch:2 
Iteration:442 
Loss:0.10208268463611603
Epoch:2 
Iteration:443 
Loss:0.1128358393907547
Epoch:2 
Iteration:444 
Loss:0.062188971787691116
Epoch:2 
Iteration:445 
Loss:0.08713003993034363
Epoch:2 
Iteration:446 
Loss:0.09090324491262436
Epoch:2 
Iteration:447 
Loss:0.11777570843696594
Epoch:2 
Iteration:448 
Loss:0.061372045427560806
Epoch:2 
Iteration:449 
Loss:0.07631354033946991
Epoch:2 
Iteration:450 
Loss:0.05208823084831238
Epoch:2 
Iteration:451 
Loss:0.08311394602060318
Epoch:2 
Iteration:452 
Loss:0.0697915107011795
Epoch:2 
Iteration:453 
Loss:0.04550196975469589
Epoch:2 
Iteration:454 
Loss:0.01881393976509571
Epoch:2 
Iteration:455 
Loss:0.09792184829711914
Epoch:2 
Iteration:456 
Loss:0.0425495021045208
Epoch:2 
Iteration:457 
Loss:0.048559416085481644
Epoch:2 
Iteration:458 
Loss:0.20353494584560394
Epoch:2 
Iteration:4

Epoch:3 
Iteration:7 
Loss:0.009984851814806461
Epoch:3 
Iteration:8 
Loss:0.07752212882041931
Epoch:3 
Iteration:9 
Loss:0.055770035833120346
Epoch:3 
Iteration:10 
Loss:0.011395781300961971
Epoch:3 
Iteration:11 
Loss:0.12790292501449585
Epoch:3 
Iteration:12 
Loss:0.026682408526539803
Epoch:3 
Iteration:13 
Loss:0.061499666422605515
Epoch:3 
Iteration:14 
Loss:0.061734963208436966
Epoch:3 
Iteration:15 
Loss:0.11611274629831314
Epoch:3 
Iteration:16 
Loss:0.14121262729167938
Epoch:3 
Iteration:17 
Loss:0.027529515326023102
Epoch:3 
Iteration:18 
Loss:0.0688527449965477
Epoch:3 
Iteration:19 
Loss:0.05044511705636978
Epoch:3 
Iteration:20 
Loss:0.05832422897219658
Epoch:3 
Iteration:21 
Loss:0.06288693100214005
Epoch:3 
Iteration:22 
Loss:0.1361725628376007
Epoch:3 
Iteration:23 
Loss:0.08357694000005722
Epoch:3 
Iteration:24 
Loss:0.008071056567132473
Epoch:3 
Iteration:25 
Loss:0.019149335101246834
Epoch:3 
Iteration:26 
Loss:0.004059730097651482
Epoch:3 
Iteration:27 
Loss:0.04642

Loss:0.04570407047867775
Epoch:3 
Iteration:181 
Loss:0.11269282549619675
Epoch:3 
Iteration:182 
Loss:0.046963516622781754
Epoch:3 
Iteration:183 
Loss:0.053563155233860016
Epoch:3 
Iteration:184 
Loss:0.14870327711105347
Epoch:3 
Iteration:185 
Loss:0.022914163768291473
Epoch:3 
Iteration:186 
Loss:0.049962759017944336
Epoch:3 
Iteration:187 
Loss:0.0712253674864769
Epoch:3 
Iteration:188 
Loss:0.1353616863489151
Epoch:3 
Iteration:189 
Loss:0.043921440839767456
Epoch:3 
Iteration:190 
Loss:0.09061047434806824
Epoch:3 
Iteration:191 
Loss:0.22199466824531555
Epoch:3 
Iteration:192 
Loss:0.07586213201284409
Epoch:3 
Iteration:193 
Loss:0.0725313276052475
Epoch:3 
Iteration:194 
Loss:0.047340258955955505
Epoch:3 
Iteration:195 
Loss:0.10653276741504669
Epoch:3 
Iteration:196 
Loss:0.09254342317581177
Epoch:3 
Iteration:197 
Loss:0.0783831924200058
Epoch:3 
Iteration:198 
Loss:0.0283550713211298
Epoch:3 
Iteration:199 
Loss:0.029116563498973846
Epoch:3 
Iteration:200 
Loss:0.08985125273

Epoch:3 
Iteration:351 
Loss:0.021516937762498856
Epoch:3 
Iteration:352 
Loss:0.07417622953653336
Epoch:3 
Iteration:353 
Loss:0.06155497953295708
Epoch:3 
Iteration:354 
Loss:0.023866722360253334
Epoch:3 
Iteration:355 
Loss:0.1153397485613823
Epoch:3 
Iteration:356 
Loss:0.06120302155613899
Epoch:3 
Iteration:357 
Loss:0.02457522414624691
Epoch:3 
Iteration:358 
Loss:0.05458023026585579
Epoch:3 
Iteration:359 
Loss:0.08095678687095642
Epoch:3 
Iteration:360 
Loss:0.045584581792354584
Epoch:3 
Iteration:361 
Loss:0.026411592960357666
Epoch:3 
Iteration:362 
Loss:0.026539266109466553
Epoch:3 
Iteration:363 
Loss:0.050297483801841736
Epoch:3 
Iteration:364 
Loss:0.016372786834836006
Epoch:3 
Iteration:365 
Loss:0.027314424514770508
Epoch:3 
Iteration:366 
Loss:0.037081822752952576
Epoch:3 
Iteration:367 
Loss:0.15916821360588074
Epoch:3 
Iteration:368 
Loss:0.060564830899238586
Epoch:3 
Iteration:369 
Loss:0.1259833574295044
Epoch:3 
Iteration:370 
Loss:0.10481490939855576
Epoch:3 
Ite

Epoch:3 
Iteration:521 
Loss:0.09876592457294464
Epoch:3 
Iteration:522 
Loss:0.071944959461689
Epoch:3 
Iteration:523 
Loss:0.06504212319850922
Epoch:3 
Iteration:524 
Loss:0.08795938640832901
Epoch:3 
Iteration:525 
Loss:0.03814290463924408
Epoch:3 
Iteration:526 
Loss:0.01877966709434986
Epoch:3 
Iteration:527 
Loss:0.05336418002843857
Epoch:3 
Iteration:528 
Loss:0.040779076516628265
Epoch:3 
Iteration:529 
Loss:0.029231682419776917
Epoch:3 
Iteration:530 
Loss:0.022919287905097008
Epoch:3 
Iteration:531 
Loss:0.023215139284729958
Epoch:3 
Iteration:532 
Loss:0.02789212204515934
Epoch:3 
Iteration:533 
Loss:0.08830823749303818
Epoch:3 
Iteration:534 
Loss:0.06086815893650055
Epoch:3 
Iteration:535 
Loss:0.06807231158018112
Epoch:3 
Iteration:536 
Loss:0.045101579278707504
Epoch:3 
Iteration:537 
Loss:0.01850729063153267
Epoch:3 
Iteration:538 
Loss:0.015373787842690945
Epoch:3 
Iteration:539 
Loss:0.07212677597999573
Epoch:3 
Iteration:540 
Loss:0.11486479640007019
Epoch:3 
Iterati

Epoch:4 
Iteration:91 
Loss:0.08991432934999466
Epoch:4 
Iteration:92 
Loss:0.08320696651935577
Epoch:4 
Iteration:93 
Loss:0.010825909674167633
Epoch:4 
Iteration:94 
Loss:0.05765775218605995
Epoch:4 
Iteration:95 
Loss:0.08058343082666397
Epoch:4 
Iteration:96 
Loss:0.006746064405888319
Epoch:4 
Iteration:97 
Loss:0.022668354213237762
Epoch:4 
Iteration:98 
Loss:0.003434357000514865
Epoch:4 
Iteration:99 
Loss:0.044832728803157806
Epoch:4 
Iteration:100 
Loss:0.009273026138544083
Epoch:4 
Iteration:101 
Loss:0.011010735295712948
Epoch:4 
Iteration:102 
Loss:0.030676711350679398
Epoch:4 
Iteration:103 
Loss:0.013816446997225285
Epoch:4 
Iteration:104 
Loss:0.08443519473075867
Epoch:4 
Iteration:105 
Loss:0.027454061433672905
Epoch:4 
Iteration:106 
Loss:0.03269563615322113
Epoch:4 
Iteration:107 
Loss:0.038587652146816254
Epoch:4 
Iteration:108 
Loss:0.036297816783189774
Epoch:4 
Iteration:109 
Loss:0.031072886660695076
Epoch:4 
Iteration:110 
Loss:0.03862094134092331
Epoch:4 
Iterati

Epoch:4 
Iteration:262 
Loss:0.05857823044061661
Epoch:4 
Iteration:263 
Loss:0.14568962156772614
Epoch:4 
Iteration:264 
Loss:0.010872261598706245
Epoch:4 
Iteration:265 
Loss:0.030020492151379585
Epoch:4 
Iteration:266 
Loss:0.03108099475502968
Epoch:4 
Iteration:267 
Loss:0.060746122151613235
Epoch:4 
Iteration:268 
Loss:0.010861274786293507
Epoch:4 
Iteration:269 
Loss:0.041624534875154495
Epoch:4 
Iteration:270 
Loss:0.04964950680732727
Epoch:4 
Iteration:271 
Loss:0.02079024724662304
Epoch:4 
Iteration:272 
Loss:0.039441172033548355
Epoch:4 
Iteration:273 
Loss:0.03411968797445297
Epoch:4 
Iteration:274 
Loss:0.026641465723514557
Epoch:4 
Iteration:275 
Loss:0.02539953961968422
Epoch:4 
Iteration:276 
Loss:0.047852907329797745
Epoch:4 
Iteration:277 
Loss:0.04497009143233299
Epoch:4 
Iteration:278 
Loss:0.01229097880423069
Epoch:4 
Iteration:279 
Loss:0.0556337833404541
Epoch:4 
Iteration:280 
Loss:0.058529749512672424
Epoch:4 
Iteration:281 
Loss:0.09918298572301865
Epoch:4 
Ite

Epoch:4 
Iteration:431 
Loss:0.14199618995189667
Epoch:4 
Iteration:432 
Loss:0.06995493173599243
Epoch:4 
Iteration:433 
Loss:0.011810476891696453
Epoch:4 
Iteration:434 
Loss:0.009954454377293587
Epoch:4 
Iteration:435 
Loss:0.08230546861886978
Epoch:4 
Iteration:436 
Loss:0.07516516745090485
Epoch:4 
Iteration:437 
Loss:0.04371054098010063
Epoch:4 
Iteration:438 
Loss:0.053136616945266724
Epoch:4 
Iteration:439 
Loss:0.0999719649553299
Epoch:4 
Iteration:440 
Loss:0.039290495216846466
Epoch:4 
Iteration:441 
Loss:0.08188962936401367
Epoch:4 
Iteration:442 
Loss:0.02714819833636284
Epoch:4 
Iteration:443 
Loss:0.07733628898859024
Epoch:4 
Iteration:444 
Loss:0.1293228417634964
Epoch:4 
Iteration:445 
Loss:0.03363217040896416
Epoch:4 
Iteration:446 
Loss:0.0377047173678875
Epoch:4 
Iteration:447 
Loss:0.039709605276584625
Epoch:4 
Iteration:448 
Loss:0.01902863010764122
Epoch:4 
Iteration:449 
Loss:0.09580697864294052
Epoch:4 
Iteration:450 
Loss:0.15124060213565826
Epoch:4 
Iteration

Epoch:4 
Iteration:599 
Loss:0.028972206637263298
Epoch:4 
Iteration:600 
Loss:0.17726197838783264

Accuracy of network in epoch 4: 98.48833333333333
Epoch:5 
Iteration:1 
Loss:0.008354908786714077
Epoch:5 
Iteration:2 
Loss:0.0021204808726906776
Epoch:5 
Iteration:3 
Loss:0.009838227182626724
Epoch:5 
Iteration:4 
Loss:0.04806382209062576
Epoch:5 
Iteration:5 
Loss:0.016117582097649574
Epoch:5 
Iteration:6 
Loss:0.014580371789634228
Epoch:5 
Iteration:7 
Loss:0.014269322156906128
Epoch:5 
Iteration:8 
Loss:0.019681314006447792
Epoch:5 
Iteration:9 
Loss:0.027688927948474884
Epoch:5 
Iteration:10 
Loss:0.060438722372055054
Epoch:5 
Iteration:11 
Loss:0.0070455027744174
Epoch:5 
Iteration:12 
Loss:0.003211061004549265
Epoch:5 
Iteration:13 
Loss:0.009114285930991173
Epoch:5 
Iteration:14 
Loss:0.17354173958301544
Epoch:5 
Iteration:15 
Loss:0.016876043751835823
Epoch:5 
Iteration:16 
Loss:0.013226073235273361
Epoch:5 
Iteration:17 
Loss:0.0037711127661168575
Epoch:5 
Iteration:18 
Loss:

Epoch:5 
Iteration:168 
Loss:0.003371194237843156
Epoch:5 
Iteration:169 
Loss:0.1367059350013733
Epoch:5 
Iteration:170 
Loss:0.0333084836602211
Epoch:5 
Iteration:171 
Loss:0.0029286888893693686
Epoch:5 
Iteration:172 
Loss:0.043764401227235794
Epoch:5 
Iteration:173 
Loss:0.022074062377214432
Epoch:5 
Iteration:174 
Loss:0.03626113757491112
Epoch:5 
Iteration:175 
Loss:0.027319753542542458
Epoch:5 
Iteration:176 
Loss:0.10259804874658585
Epoch:5 
Iteration:177 
Loss:0.07937386631965637
Epoch:5 
Iteration:178 
Loss:0.0366901233792305
Epoch:5 
Iteration:179 
Loss:0.01942426711320877
Epoch:5 
Iteration:180 
Loss:0.021640989929437637
Epoch:5 
Iteration:181 
Loss:0.021282339468598366
Epoch:5 
Iteration:182 
Loss:0.12226741760969162
Epoch:5 
Iteration:183 
Loss:0.0194413885474205
Epoch:5 
Iteration:184 
Loss:0.022255633026361465
Epoch:5 
Iteration:185 
Loss:0.04461973160505295
Epoch:5 
Iteration:186 
Loss:0.05878574401140213
Epoch:5 
Iteration:187 
Loss:0.10168582946062088
Epoch:5 
Iterat

Epoch:5 
Iteration:334 
Loss:0.010469253174960613
Epoch:5 
Iteration:335 
Loss:0.00333763868547976
Epoch:5 
Iteration:336 
Loss:0.05751198157668114
Epoch:5 
Iteration:337 
Loss:0.0816745012998581
Epoch:5 
Iteration:338 
Loss:0.0029237940907478333
Epoch:5 
Iteration:339 
Loss:0.05111284554004669
Epoch:5 
Iteration:340 
Loss:0.09387700259685516
Epoch:5 
Iteration:341 
Loss:0.03929232060909271
Epoch:5 
Iteration:342 
Loss:0.00708002271130681
Epoch:5 
Iteration:343 
Loss:0.062146686017513275
Epoch:5 
Iteration:344 
Loss:0.013858905993402004
Epoch:5 
Iteration:345 
Loss:0.09932363778352737
Epoch:5 
Iteration:346 
Loss:0.03540554642677307
Epoch:5 
Iteration:347 
Loss:0.013138684444129467
Epoch:5 
Iteration:348 
Loss:0.12957780063152313
Epoch:5 
Iteration:349 
Loss:0.015373778529465199
Epoch:5 
Iteration:350 
Loss:0.05264896899461746
Epoch:5 
Iteration:351 
Loss:0.08456496149301529
Epoch:5 
Iteration:352 
Loss:0.006332049146294594
Epoch:5 
Iteration:353 
Loss:0.046433161944150925
Epoch:5 
Ite

Epoch:5 
Iteration:501 
Loss:0.07025323808193207
Epoch:5 
Iteration:502 
Loss:0.0074350880458951
Epoch:5 
Iteration:503 
Loss:0.031213246285915375
Epoch:5 
Iteration:504 
Loss:0.009925741702318192
Epoch:5 
Iteration:505 
Loss:0.1001056581735611
Epoch:5 
Iteration:506 
Loss:0.16858603060245514
Epoch:5 
Iteration:507 
Loss:0.011763614602386951
Epoch:5 
Iteration:508 
Loss:0.07400573045015335
Epoch:5 
Iteration:509 
Loss:0.0965205579996109
Epoch:5 
Iteration:510 
Loss:0.05325038731098175
Epoch:5 
Iteration:511 
Loss:0.10820859670639038
Epoch:5 
Iteration:512 
Loss:0.013834051787853241
Epoch:5 
Iteration:513 
Loss:0.00935860350728035
Epoch:5 
Iteration:514 
Loss:0.03320690989494324
Epoch:5 
Iteration:515 
Loss:0.05758867785334587
Epoch:5 
Iteration:516 
Loss:0.0857243537902832
Epoch:5 
Iteration:517 
Loss:0.007965043187141418
Epoch:5 
Iteration:518 
Loss:0.09572593867778778
Epoch:5 
Iteration:519 
Loss:0.04329109564423561
Epoch:5 
Iteration:520 
Loss:0.012259460985660553
Epoch:5 
Iteration

In [40]:
total = 0
correct = 0
for data, labels in test_loader:
    data = data.to(torch.device("cuda:0"))
    with torch.no_grad():
        validation = model(data)
        _,prediction = torch.max(validation, 1)
        total += labels.size(0)
        correct += (prediction.cpu() == labels).sum().item()
    
print(f'Accuracy of the network:{100 * correct / total}')

Accuracy of the network:98.02
