## **PyTorch Installation on a Local Machine**

To install PyTorch, go to the PyTorch website (pytorch.org) and scroll down to the installation selector. On Windows and Linux, we can choose a build with CUDA support and install CUDA on the local machine to run PyTorch on an NVIDIA GPU.
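After installing, a quick check like the sketch below can confirm that the package imports and whether a CUDA device is visible. It only relies on `torch.__version__` and `torch.cuda.is_available()`; the variable names are chosen here for illustration.

```python
import torch

# Report the installed PyTorch version
installed_version = torch.__version__
print(f'PyTorch version: {installed_version}')

# True only if this build has CUDA support and a GPU is visible
cuda_ready = torch.cuda.is_available()
print(f'CUDA available: {cuda_ready}')
```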

## **Defining tensor values**

In [None]:
import torch
# Create uninitialized tensors with various shapes
a = torch.empty(2) # 1-D tensor with 2 elements (values are uninitialized)
b = torch.empty(2,1) # 2-D tensor with shape (2, 1)

# Create a tensor filled with ones
c = torch.ones(2,2, dtype=torch.float16) # shape (2, 2), every element is 1, data type float16

# Create a tensor from our own values
d = torch.tensor([2.5, 0.1]) # 1-D tensor holding the values 2.5 and 0.1

# Create a tensor with random values
e = torch.rand(5,3) # shape (5, 3), values drawn uniformly from [0, 1)

# Reshape a tensor
c = torch.rand(2,2)
y = c.view(4) # 1-D view of c; the size must be 4 because c holds 2*2 = 4 values

# Check tensor size
c.size() # torch.Size([2, 2])

## **Tensor calculation**

In [None]:
# Element-wise sum of two tensors with the same shape
x = torch.rand(2,2)
y = torch.rand(2,2)
z = x + y # new tensor with the same shape as x and y

# In-place addition
x.add_(y) # adds y into x; a trailing _ marks an in-place operation

# Tensor subtraction
z = torch.sub(x,y) # same as z = x - y

# Tensor multiplication (element-wise)
z = torch.mul(x,y) # same as z = x * y

# Tensor division (element-wise)
z = torch.div(x,y) # same as z = x / y
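Adding tensors of different shapes relies on broadcasting. As a minimal sketch, using tensors shaped like `a` (2 elements) and `b` (shape 2 by 1) from the tensor definition cell, the out-of-place sum has shape (2, 2), while the in-place version is rejected because it cannot resize `a`. The flag name `inplace_failed` is only illustrative.

```python
import torch

a = torch.zeros(2)     # shape (2,)
b = torch.ones(2, 1)   # shape (2, 1)

# Out-of-place addition broadcasts both operands to shape (2, 2)
z = a + b
print(z.shape)

# In-place addition cannot change the shape of a, so this broadcast is rejected
try:
  a.add_(b)
  inplace_failed = False
except RuntimeError:
  inplace_failed = True
print(inplace_failed)
```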

## **IMPORTANT: TENSOR OPERATIONS WITH NUMPY AND CUDA**

Converting between NumPy arrays and tensors needs some care, because on the CPU some conversions share memory instead of copying it, and a tensor on the GPU cannot be converted to NumPy directly.

For example, when we define a NumPy array like a = np.array([1,2,3,4]) and turn it into a tensor with b = torch.from_numpy(a), the tensor and the array share the same memory. If we then change b in place with an operation like b.add_(b), b becomes tensor([2, 4, 6, 8]), but a also becomes array([2, 4, 6, 8]). The same sharing applies to b.numpy() on a CPU tensor. If we don't want this, b = torch.tensor(a) copies the data, so a stays unchanged.
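A small sketch to check which conversion shares memory with the NumPy array: `torch.tensor` copies the data, while `torch.from_numpy` returns a tensor backed by the same buffer. The names `source`, `copied`, and `shared` are illustrative.

```python
import numpy as np
import torch

source = np.array([1, 2, 3, 4])

copied = torch.tensor(source)      # copies the data
shared = torch.from_numpy(source)  # shares memory with the array

copied.add_(copied)                # only the copy changes
after_copy_update = source.copy()

shared.add_(shared)                # the array changes too
after_shared_update = source.copy()

print(after_copy_update)
print(after_shared_update)
```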


In [None]:
# Move tensors between CPU and GPU
if torch.cuda.is_available(): # check whether CUDA is available
  device = torch.device('cuda') # define the CUDA device
  a = torch.ones(5, device=device) # create tensor a directly on the GPU
  b = torch.ones(5) # create tensor b on the CPU
  b = b.to(device) # move tensor b from CPU to GPU

  # operation on the GPU
  z = a + b

  # NumPy only works with data on the CPU, so move the tensor back to the CPU before converting it to NumPy
  z = z.to('cpu')

## **Autograd operation**

In [None]:
a = torch.tensor([1,2], dtype=torch.float64, requires_grad=True) # requires_grad=True tells autograd to track operations on this tensor
b = torch.tensor([6,7], dtype=torch.float64, requires_grad=True)
z = a*b

# backpropagation
z.backward(torch.ones_like(z)) # z is not a scalar, so backward needs a vector of the same shape for the vector-Jacobian product

# gradient with respect to a
a.grad # dz/da = b, so the result is [6., 7.]

tensor([6., 7.], dtype=torch.float64)

For a further explanation of the Jacobian matrix, see this link:
https://drive.google.com/file/d/16Np-DIMIukqAMvzxqRzldVhWtyPW4XWW/view?usp=share_link
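To make the vector-Jacobian product concrete, the sketch below builds the full Jacobian of z = a*b with respect to a using `torch.autograd.functional.jacobian` and compares v^T J with what `backward(v)` stores in `a.grad`. For simplicity it assumes b does not require gradients; the vector `v` of ones plays the role of the argument passed to `backward`.

```python
import torch
from torch.autograd.functional import jacobian

a = torch.tensor([1., 2.], dtype=torch.float64, requires_grad=True)
b = torch.tensor([6., 7.], dtype=torch.float64)

# Full Jacobian of z = a*b with respect to a: a diagonal matrix with b on the diagonal
J = jacobian(lambda t: t * b, a.detach())

# backward(v) computes the vector-Jacobian product v^T J and stores it in a.grad
v = torch.ones(2, dtype=torch.float64)
z = a * b
z.backward(v)

print(J)
print(a.grad)
```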


In [None]:
# Ways to stop gradient tracking on a tensor
a = torch.tensor([1.,2.,3.], requires_grad=True)
print(a)

tensor([1., 2., 3.], requires_grad=True)


In [None]:
# Way 1: turn off requires_grad in place
a.requires_grad_(False)

tensor([1., 2., 3.])

In [None]:
# Way 2: detach() returns a new tensor that shares data with a but is not tracked
a.detach()

tensor([1., 2., 3.])

In [None]:
# Way 3: operations inside torch.no_grad() are not tracked
with torch.no_grad():
  print(a)

tensor([1., 2., 3.])


## **Training example**

In [None]:
import torch

In [None]:
a = torch.tensor([1.,2.,3.,4.], requires_grad=True)

for i in range(3):
  output = a**2
  output.backward(torch.ones(4))
  print(a.grad)

  a.grad.zero_() # reset the gradient to zero; otherwise the next backward() adds to the old gradient

tensor([2., 4., 6., 8.])
tensor([2., 4., 6., 8.])
tensor([2., 4., 6., 8.])
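The reason for zeroing the gradient can be seen by leaving the reset out. In this small sketch (same tensor values as the cell above, variable names `first` and `second` are illustrative), the second backward pass adds its gradient to the one already stored.

```python
import torch

a = torch.tensor([1., 2., 3., 4.], requires_grad=True)

# First backward pass: the gradient of a**2 is 2*a
(a**2).backward(torch.ones(4))
first = a.grad.clone()

# Second backward pass without zeroing: the new gradient is added to the old one
(a**2).backward(torch.ones(4))
second = a.grad.clone()

print(first)
print(second)
```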


## **Example: updating a weight with gradient descent**

In this example, we show how a weight is updated by gradient descent. We then compare the final weight with the actual function.

In [None]:
# Define actual function
import numpy as np
x = np.array([1,2,3,4], dtype=np.float32)
y = np.array([2,4,6,8], dtype=np.float32)

From the data above, we can see that our baseline (actual) function is y = 2*x.

In [None]:
# Define the model function with weight w
def forward(x):
  y_prediction = w*x
  return y_prediction

# Mean squared error
def loss_function(y_prediction,y):
  loss = ((y_prediction-y)**2).mean()
  return loss

# Gradient of (w*x - y)**2 with respect to w, computed per sample (not averaged)
def gradient_descent(w,x,y):
  gradient = 2*w*(x**2) - 2*x*y
  return gradient

w = 0.0
learning_rate = 0.01

for i in range(400):
  y_prediction = forward(x)

  loss = loss_function(y_prediction, y)

  gradient = gradient_descent(w, x, y)

  print(f'w = {w} and loss = {loss}')

  # the gradient has one entry per sample, so after the first update w is an array too
  w = w - (learning_rate*gradient)

w = 0.0 and loss = 30.0
w = [0.04       0.16       0.35999998 0.64      ] and loss = 17.796001434326172
w = [0.0792 0.3072 0.6552 1.0752] and loss = 11.278056144714355
w = [0.117616   0.44262403 0.897264   1.371136  ] and loss = 7.629202365875244
w = [0.15526368 0.56721413 1.0957565  1.5723724 ] and loss = 5.474826335906982
w = [0.1921584 0.681837  1.2585204 1.7092133] and loss = 4.129886150360107
w = [0.22831523 0.78729004 1.3919867  1.802265  ] and loss = 3.243558883666992
w = [0.26374894 0.88430685 1.5014291  1.8655403 ] and loss = 2.6300199031829834
w = [0.29847395 0.9735623  1.5911719  1.9085674 ] and loss = 2.186877727508545
w = [0.33250448 1.0556773  1.664761   1.9378258 ] and loss = 1.8552099466323853
w = [0.36585438 1.1312231  1.725104   1.9577216 ] and loss = 1.599558711051941
w = [0.39853728 1.2007252  1.7745852  1.9712507 ] and loss = 1.3976436853408813
w = [0.43056652 1.2646672  1.8151599  1.9804504 ] and loss = 1.2348966598510742
w = [0.4619552 1.3234937 1.8484311 1.98670

After the 400 iterations above, we can see that the weight for every value in y = w*x is about 1.99 (very close to 2). This shows that our gradient descent update works well.
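The gradient in the cell above is computed per sample, which is why w turns into an array. As a sketch of the alternative, averaging the gradient matches the mean in the loss, since d/dw mean((w*x - y)^2) = mean(2*x*(w*x - y)), and keeps w a single scalar. The function name `mse_gradient` is introduced here only for illustration.

```python
import numpy as np

x = np.array([1, 2, 3, 4], dtype=np.float64)
y = np.array([2, 4, 6, 8], dtype=np.float64)

def mse_gradient(w, x, y):
  # d/dw mean((w*x - y)**2) = mean(2*x*(w*x - y))
  return (2 * x * (w * x - y)).mean()

w = 0.0
learning_rate = 0.01

for _ in range(100):
  w = w - learning_rate * mse_gradient(w, x, y)

print(w)
```

With this version the first update moves w from 0.0 to 0.3, because the averaged gradient at w = 0 is -30.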

## **Update weight with torch gradient descent (autograd)**

In [None]:
import torch
import numpy as np

In [None]:
x = torch.tensor([1,2,3,4], dtype=torch.float64)
y = torch.tensor([2,4,6,8], dtype=torch.float64)

In [None]:
# Define raw function with weight
def forward(x):
  y_prediction = w*x
  return y_prediction

def loss_function(y_prediction,y):
  loss = ((y_prediction-y)**2).mean()
  return loss

w = torch.tensor(0.0, dtype=torch.float64, requires_grad=True)
learning_rate = 0.01

for i in range(40):
  y_prediction = forward(x)

  loss = loss_function(y_prediction, y)

  loss.backward()

  print(f'w = {w.item()} and loss = {loss.item()}')

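  with torch.no_grad():  # the weight update itself must not be recorded by autograd
    w -= learning_rate*(w.grad) # update in place so w stays the same leaf tensor; w = w - (learning_rate*(w.grad)) would rebind w to a new tensor whose .grad is None

  w.grad.zero_() # reset the gradient for the next iteration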

w = 0.0 and loss = 30.0
w = 0.3 and loss = 21.674999999999997
w = 0.5549999999999999 and loss = 15.6601875
w = 0.7717499999999999 and loss = 11.31448546875
w = 0.9559875 and loss = 8.174715751171876
w = 1.112589375 and loss = 5.90623213022168
w = 1.24570096875 and loss = 4.2672527140851635
w = 1.3588458234375 and loss = 3.08309008592653
w = 1.455018949921875 and loss = 2.2275325870819183
w = 1.5367661074335939 and loss = 1.609392294166685
w = 1.6062511913185549 and loss = 1.1627859325354297
w = 1.6653135126207717 and loss = 0.840112836256848
w = 1.715516485727656 and loss = 0.6069815241955727
w = 1.7581890128685076 and loss = 0.43854415123130097
w = 1.7944606609382314 and loss = 0.3168481492646149
w = 1.8252915617974967 and loss = 0.2289227878436842
w = 1.8514978275278722 and loss = 0.16539671421706198
w = 1.8737731533986914 and loss = 0.11949912602182704
w = 1.8927071803888877 and loss = 0.08633811855077002
w = 1.9088011033305545 and loss = 0.06237929065293141
w = 1.9224809378309713 a
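The difference between the in-place update and rebinding `w` can be checked directly. This is a minimal sketch with a made-up scalar loss; `rebound` is an illustrative name for the tensor that `w = w - ...` would produce.

```python
import torch

w = torch.tensor(1.0, requires_grad=True)
loss = (w * 3.0) ** 2
loss.backward()  # d(loss)/dw = 18 at w = 1

# Rebinding creates a new tensor produced by an operation: it is not a leaf
rebound = w - 0.01 * w.grad
print(rebound.is_leaf)

# The in-place update under no_grad keeps the same leaf tensor
with torch.no_grad():
  w -= 0.01 * w.grad
print(w.is_leaf, w.requires_grad)
```

Autograd only fills `.grad` for leaf tensors, so after rebinding there would be no `w.grad` to read or reset in the next iteration.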

## **Training a neural network with PyTorch tools**

In [None]:
import torch.nn as nn

In [None]:
# Define the input and output tensors with their values
x = torch.tensor([[1], [2], [3], [4]], dtype=torch.float32)
y = torch.tensor([[2], [4], [6], [8]], dtype=torch.float32)

x_test = torch.tensor([5], dtype=torch.float32)

In [None]:
n_samples, n_features = x.shape

input_size = n_features
output_size = n_features

model = nn.Linear(input_size, output_size)

learning_rate = 0.01
iterations = 100

loss = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

for i in range(iterations):
  y_prediction = model(x) # forward pass (prediction)

  l = loss(y_prediction, y) # calculate the loss; nn.MSELoss expects (input, target)

  l.backward() # backpropagation

  optimizer.step() # update the weights

  optimizer.zero_grad() # reset the gradients

  [w,b] = model.parameters()

  print(f'w = {w[0][0].item()} and loss = {l.item()}')

w = 0.5010972619056702 and loss = 22.154808044433594
w = 0.7159448862075806 and loss = 15.39315128326416
w = 0.8950178623199463 and loss = 10.701257705688477
w = 1.0442904233932495 and loss = 7.445530414581299
w = 1.1687391996383667 and loss = 5.186330795288086
w = 1.2725095748901367 and loss = 3.6185998916625977
w = 1.3590545654296875 and loss = 2.5306644439697266
w = 1.4312506914138794 and loss = 1.775651216506958
w = 1.4914939403533936 and loss = 1.2516456842422485
w = 1.5417802333831787 and loss = 0.8879327774047852
w = 1.58377206325531 and loss = 0.6354436278343201
w = 1.6188544034957886 and loss = 0.4601314067840576
w = 1.6481808423995972 and loss = 0.33837103843688965
w = 1.6727123260498047 and loss = 0.2537701725959778
w = 1.6932493448257446 and loss = 0.19495369493961334
w = 1.710458755493164 and loss = 0.15402951836585999
w = 1.72489595413208 and loss = 0.1255209892988205
w = 1.7370235919952393 and loss = 0.10562790930271149
w = 1.7472270727157593 and loss = 0.091713771224021
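The cell above defines `x_test` but never uses it. As a sketch of how a prediction could be made after training, the block below repeats the same setup, trains longer so the bias also settles, and evaluates the model on x = 5. The fixed seed and the iteration count of 3000 are assumptions made for this example, not values from the notebook.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # fixed seed only so the sketch is repeatable

x = torch.tensor([[1.], [2.], [3.], [4.]])
y = torch.tensor([[2.], [4.], [6.], [8.]])
x_test = torch.tensor([5.])

model = nn.Linear(1, 1)
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for _ in range(3000):
  loss = loss_fn(model(x), y)
  optimizer.zero_grad()
  loss.backward()
  optimizer.step()

# Inference does not need gradients
with torch.no_grad():
  prediction = model(x_test).item()

print(f'prediction for x = 5: {prediction:.3f}')
```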

## **Dataset and dataloader**

In [None]:
import torch
import torchvision
from torch.utils.data import Dataset, DataLoader
import numpy as np
import math

class WindDataset(Dataset):
  def __init__(self):
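    # load the CSV file, skipping the header row
    xy = np.loadtxt('./data/wind/wind.csv', delimiter=',', dtype=np.float32, skiprows=1)
    self.x = torch.from_numpy(xy[:, 1:]) # features: every column except the first
    self.y = torch.from_numpy(xy[:, [0]]) # target: first column, shape (n_samples, 1)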
    self.n_samples = xy.shape[0]

  def __getitem__(self, index):
    return self.x[index], self.y[index]

  def __len__(self):
    return self.n_samples

dataset = WindDataset()
first_data = dataset[0]
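A dataset like this is usually wrapped in a `DataLoader` to get shuffled mini-batches. Since the CSV file is not available here, the sketch below uses a hypothetical in-memory `ToyDataset` with the same three methods; the sizes (10 samples, 3 features, batch size 4) are arbitrary.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class ToyDataset(Dataset):  # hypothetical stand-in for WindDataset
  def __init__(self, n_samples=10, n_features=3):
    self.x = torch.arange(n_samples * n_features, dtype=torch.float32).reshape(n_samples, n_features)
    self.y = torch.arange(n_samples, dtype=torch.float32).reshape(n_samples, 1)
    self.n_samples = n_samples

  def __getitem__(self, index):
    return self.x[index], self.y[index]

  def __len__(self):
    return self.n_samples

dataset = ToyDataset()
loader = DataLoader(dataset=dataset, batch_size=4, shuffle=True)

# Each iteration yields one batch of features and targets
batch_shapes = []
for features, targets in loader:
  batch_shapes.append((tuple(features.shape), tuple(targets.shape)))

print(batch_shapes)
```

With 10 samples and a batch size of 4, the last batch holds the remaining 2 samples.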

**Converting a value to a tensor**

In [None]:
import torch
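class tes:
  def __init__(self, angka_awal):
    # angka_awal: the initial number
    self.angka_awal = angka_awal

  def perkalian_1(self, koefisien_1):
    # multiply the initial number by the coefficient koefisien_1
    hasil_perkalian = self.angka_awal * koefisien_1

    return hasil_perkalian

  def to_tensor(self, fungsi):
    # convert a value (here, the result of a method call) into a tensor
    hasil_tensor = torch.tensor(fungsi)

    return hasil_tensor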

In [None]:
test = tes(20)
test.perkalian_1(4)

80

In [None]:
test.to_tensor(test.perkalian_1(4))

tensor(80)

## **Softmax activation function and cross entropy**

In [None]:
import torch
import numpy as np

In [None]:
# define the softmax function with NumPy (without tensors)
def softmax(x):
  return np.exp(x) / np.sum(np.exp(x), axis=0)

x = np.array([2.0, 1.0, 0.1])
output = softmax(x)

print(f'x values : {str(x)}')
print(f'softmax values : {str(output)}')

x values : [2.  1.  0.1]
softmax values : [0.65900114 0.24243297 0.09856589]


In [None]:
# define cross entropy function
def cross_entropy(actual, predicted):
  loss = -np.sum(actual * np.log(predicted))
  return loss

y_actual = np.array([1,0,0])

y_prediction_good = np.array([0.7, 0.2, 0.1])
y_prediction_bad = np.array([0.1, 0.3, 0.6])

loss_1 = cross_entropy(y_actual, y_prediction_good)
loss_2 = cross_entropy(y_actual, y_prediction_bad)

print(f'loss good : {loss_1}')
print(f'loss bad : {loss_2}')

loss good : 0.35667494393873245
loss bad : 2.3025850929940455
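The same quantities are available in PyTorch. The sketch below uses `torch.softmax` on the same scores and `nn.CrossEntropyLoss`, which applies log-softmax internally and therefore takes raw scores plus a class index rather than probabilities and a one-hot vector. The tensor names are illustrative.

```python
import torch
import torch.nn as nn

logits = torch.tensor([[2.0, 1.0, 0.1]])  # one sample, three classes (raw scores)
target = torch.tensor([0])                # class index, not a one-hot vector

probabilities = torch.softmax(logits, dim=1)

# nn.CrossEntropyLoss applies log-softmax internally, so it takes the raw scores
loss = nn.CrossEntropyLoss()(logits, target)

print(probabilities)
print(loss.item())
```

The loss equals the negative log of the softmax probability assigned to the target class.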


**Neural network with sigmoid**

In [None]:
# Binary classification
class NeuralNet1(nn.Module):
  def __init__(self, input_size, hidden_size):
    super(NeuralNet1, self).__init__()
    self.linear1 = nn.Linear(input_size, hidden_size)
    self.relu = nn.ReLU()
    self.linear2 = nn.Linear(hidden_size, 1)

  def forward(self, x):
    out = self.linear1(x)
    out = self.relu(out)
    out = self.linear2(out)
    y_prediction = torch.sigmoid(out) # squash the output into (0, 1) as a probability
    return y_prediction

model = NeuralNet1(input_size=28*28, hidden_size=5)
criterion = nn.BCELoss()  # binary cross entropy loss
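`nn.BCELoss` expects probabilities, which is why the model ends with a sigmoid. A related option is `nn.BCEWithLogitsLoss`, which takes the raw scores and applies the sigmoid internally. The sketch below, on made-up scores and labels, checks that both routes give the same loss.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
raw_scores = torch.randn(4, 1)                   # outputs before the sigmoid
labels = torch.tensor([[1.], [0.], [1.], [0.]])  # binary targets as floats

# Route 1: sigmoid in the model, BCELoss on probabilities
loss_from_probs = nn.BCELoss()(torch.sigmoid(raw_scores), labels)

# Route 2: no sigmoid in the model, BCEWithLogitsLoss on raw scores
loss_from_scores = nn.BCEWithLogitsLoss()(raw_scores, labels)

print(loss_from_probs.item(), loss_from_scores.item())
```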

## **Full example of feed forward neural network**

In [None]:
# define libraries
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
import torch.nn.functional as F

# device config
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# hyperparameters
input_size = 784
hidden_size = 100
num_classes = 10
num_epochs = 2
batch_size = 100
learning_rate = 0.001

# loading MNIST data
train_dataset = torchvision.datasets.MNIST(root='./data', train=True, transform=transforms.ToTensor(), download=True)
test_dataset = torchvision.datasets.MNIST(root='./data', train=False, transform=transforms.ToTensor())

train_loader = torch.utils.data.DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset=test_dataset, batch_size=batch_size, shuffle=False)

examples = iter(train_loader)
samples, labels = next(examples)
print(samples.shape, labels.shape)

# Show image data
for i in range(6):
  plt.subplot(2, 3, i+1)
  plt.imshow(samples[i][0], cmap='gray')

# Neural network
class NeuralNet(nn.Module):
  def __init__(self, input_size, hidden_size, num_classes):
    super(NeuralNet, self).__init__()
    self.l1 = nn.Linear(input_size, hidden_size)
    self.relu = nn.ReLU()
    self.l2 = nn.Linear(hidden_size, num_classes)

  def forward(self, x):
    out = self.l1(x)
    out = self.relu(out)
    out = self.l2(out)
    return out

# modeling
model = NeuralNet(input_size, hidden_size, num_classes).to(device) # move the model to the same device as the data

# loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

# training loop
n_total_step = len(train_loader)
for epoch in range(num_epochs):
  for i, (images, labels) in enumerate(train_loader):
    # 100, 1, 28, 28
    # 100, 784
    images = images.reshape(-1, 28*28).to(device)
    labels = labels.to(device)

    # forward
    output = model(images)
    loss = criterion(output, labels)

    # backward
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if (i+1) % 100 == 0:
      print(f'epoch {epoch+1} / {num_epochs}, step {i+1}/{n_total_step}, loss = {loss.item():.4f}')


# test
with torch.no_grad(): # gradients are not needed for evaluation
  n_correct = 0
  n_samples = 0
  for images, labels in test_loader:
    images = images.reshape(-1, 28*28).to(device)
    labels = labels.to(device)
    outputs = model(images)

    # value, index
    _, predictions = torch.max(outputs, 1)
    n_samples += labels.shape[0]
    n_correct += (predictions == labels).sum().item()

  acc = 100.0 * n_correct / n_samples

  print(f'accuracy = {acc}')

## **Convolutional neural network**

In [None]:
class ConvNet(nn.Module):
  def __init__(self):
    super(ConvNet, self).__init__()
    self.conv1 = nn.Conv2d(3, 6, 5)  # 3 input (color) channels, 6 output channels, 5x5 kernel
    self.pool = nn.MaxPool2d(2,2)  # 2x2 kernel, stride 2
    self.conv2 = nn.Conv2d(6, 16, 5)
    self.fc1 = nn.Linear(16*5*5, 120)  # 16 channels of 5x5 feature maps after the second pooling (e.g. for 32x32 inputs)
    self.fc2 = nn.Linear(120, 84)
    self.fc3 = nn.Linear(84, 10)  # 10 is the total number of classes

  def forward(self, x):
    x = self.pool(F.relu(self.conv1(x)))
    x = self.pool(F.relu(self.conv2(x)))
    x = x.view(-1, 16*5*5)  #for tensor flatten
    x = F.relu(self.fc1(x))
    x = F.relu(self.fc2(x))
    x = self.fc3(x)
    return x
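The value 16*5*5 in the first fully connected layer follows from the layer sizes. The sketch below traces the shapes through the same convolution and pooling layers, assuming a 32x32 RGB input (the notebook does not state the input size; 32x32 is one size that yields 5x5 feature maps).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

conv1 = nn.Conv2d(3, 6, 5)
pool = nn.MaxPool2d(2, 2)
conv2 = nn.Conv2d(6, 16, 5)

x = torch.zeros(1, 3, 32, 32)  # assumed input: one 32x32 RGB image

shapes = [tuple(x.shape)]
x = pool(F.relu(conv1(x)))     # 32 -> 28 (conv) -> 14 (pool)
shapes.append(tuple(x.shape))
x = pool(F.relu(conv2(x)))     # 14 -> 10 (conv) -> 5 (pool)
shapes.append(tuple(x.shape))

flattened = x.view(-1, 16*5*5) # flatten to (batch, 400) for the linear layers
shapes.append(tuple(flattened.shape))
print(shapes)
```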

## **Transfer learning**

In [None]:
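import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
from torchvision import models

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = models.resnet18(pretrained=True) # newer torchvision versions prefer the weights= argument
number_features = model.fc.in_features

model.fc = nn.Linear(number_features, 2) # replace the last layer with a 2-class output
model = model.to(device)

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.001)

# scheduler: multiply the learning rate by 0.1 every 7 epochs
step_lr_scheduler = lr_scheduler.StepLR(optimizer, step_size=7, gamma=0.1)

# train_model is a user-defined training function that is not shown in this notebook
model = train_model(model, criterion, optimizer, step_lr_scheduler, num_epochs=20)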
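To see what the `StepLR` settings above do, the sketch below records the learning rate over 20 epochs with the same `step_size=7` and `gamma=0.1`. It uses a dummy parameter instead of a real model, so no training happens; only the schedule is shown.

```python
import torch
from torch.optim import SGD, lr_scheduler

param = torch.nn.Parameter(torch.zeros(1))  # dummy parameter so the optimizer has something to hold
optimizer = SGD([param], lr=0.001)
scheduler = lr_scheduler.StepLR(optimizer, step_size=7, gamma=0.1)

learning_rates = []
for epoch in range(20):
  learning_rates.append(scheduler.get_last_lr()[0])
  optimizer.step()   # the optimizer steps first (a no-op here, since no gradient was computed)
  scheduler.step()   # then the scheduler advances by one epoch

print(learning_rates)
```

The learning rate stays at 0.001 for epochs 0 to 6 and is multiplied by 0.1 at epoch 7 and again at epoch 14.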

**Modifying the last layer of a pretrained model**

This is an example of a ResNet model for a computer vision task. I will modify the last layer of the network. The last layer of resnet50 has in_features=2048 and out_features=1000.

In [None]:
from torchvision import models
resnet_model = models.resnet50(pretrained=True)
resnet_model

Downloading: "https://download.pytorch.org/models/resnet50-0676ba61.pth" to /root/.cache/torch/hub/checkpoints/resnet50-0676ba61.pth
100%|██████████| 97.8M/97.8M [00:01<00:00, 72.6MB/s]


ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): Bottleneck(
      (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (downsample): Sequential(
        (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 

From the (fc) layer of the ResNet above, we can see in_features=2048 and out_features=1000. out_features is 1000 because ResNet is pretrained on the ImageNet dataset, which contains 1000 classes.

In [None]:
# Modify last layer and change out_features from 1000 to 4 classes
import torch.nn as nn
resnet_model.fc = nn.Linear(2048, 4, bias=True)

In [None]:
resnet_model

ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): Bottleneck(
      (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (downsample): Sequential(
        (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 

From the network report above, we can see in (fc) that out_features has been changed to 4.

**Pretrained model with torch hub**

In [1]:
import torch
import torch.nn as nn

In [2]:
model_vgg19 = torch.hub.load('pytorch/vision:v0.10.0', 'vgg19', pretrained=True)
model_vgg19

Downloading: "https://github.com/pytorch/vision/zipball/v0.10.0" to /root/.cache/torch/hub/v0.10.0.zip
Downloading: "https://download.pytorch.org/models/vgg19-dcbb9e9d.pth" to /root/.cache/torch/hub/checkpoints/vgg19-dcbb9e9d.pth
100%|██████████| 548M/548M [00:06<00:00, 82.6MB/s]


VGG(
  (features): Sequential(
    (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU(inplace=True)
    (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (3): ReLU(inplace=True)
    (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (6): ReLU(inplace=True)
    (7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (8): ReLU(inplace=True)
    (9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (11): ReLU(inplace=True)
    (12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (13): ReLU(inplace=True)
    (14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (15): ReLU(inplace=True)
    (16): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padd

In [None]:
# select a specific layer in the VGG features
model_vgg19.features[0] # select layer 0

Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))

In [None]:
# Replace the last classifier layer so that it outputs 6 classes
model_vgg19.classifier[6] = nn.Linear(in_features=4096, out_features=6, bias=True)

In [None]:
model_vgg19

VGG(
  (features): Sequential(
    (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU(inplace=True)
    (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (3): ReLU(inplace=True)
    (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (6): ReLU(inplace=True)
    (7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (8): ReLU(inplace=True)
    (9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (11): ReLU(inplace=True)
    (12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (13): ReLU(inplace=True)
    (14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (15): ReLU(inplace=True)
    (16): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padd

The output features of the last layer were successfully changed to 6.

**Freeze layer parameters**

In [None]:
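# Freeze every parameter except those of the last classifier layer (index 6)
for name,module in model_vgg19.named_children():
  if name != 'classifier':
    # freeze all modules outside the classifier
    for a in module.parameters():
      a.requires_grad = False

  else:
    # inside the classifier, freeze every layer except the last one
    for b in range(len(model_vgg19.classifier)):
      if b != 6:
        for c in model_vgg19.classifier[b].parameters():
          c.requires_grad = False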

In [None]:
from torchsummary import summary

model_vgg19 = model_vgg19.to(device='cpu')

summary(model_vgg19, (3,224,224))

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1         [-1, 64, 224, 224]           1,792
              ReLU-2         [-1, 64, 224, 224]               0
            Conv2d-3         [-1, 64, 224, 224]          36,928
              ReLU-4         [-1, 64, 224, 224]               0
         MaxPool2d-5         [-1, 64, 112, 112]               0
            Conv2d-6        [-1, 128, 112, 112]          73,856
              ReLU-7        [-1, 128, 112, 112]               0
            Conv2d-8        [-1, 128, 112, 112]         147,584
              ReLU-9        [-1, 128, 112, 112]               0
        MaxPool2d-10          [-1, 128, 56, 56]               0
           Conv2d-11          [-1, 256, 56, 56]         295,168
             ReLU-12          [-1, 256, 56, 56]               0
           Conv2d-13          [-1, 256, 56, 56]         590,080
             ReLU-14          [-1, 256,

From the model summary above, we can see that there are only 24,582 trainable parameters (the last layer only).
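The trainable parameter count can also be checked without `torchsummary`. The sketch below uses a hypothetical helper `count_trainable` and a small stand-in model (not VGG19) whose only trainable layer is a 4096 to 6 linear head, which gives 4096*6 weights plus 6 biases.

```python
import torch.nn as nn

def count_trainable(model):
  # Sum the element counts of all parameters that still receive gradients
  return sum(p.numel() for p in model.parameters() if p.requires_grad)

# Hypothetical stand-in for the network: a frozen body plus a trainable 4096 -> 6 head
toy_model = nn.Sequential(
  nn.Linear(8, 4096),
  nn.ReLU(),
  nn.Linear(4096, 6),
)
for p in toy_model[0].parameters():
  p.requires_grad = False

trainable = count_trainable(toy_model)
print(trainable)  # 4096*6 weights + 6 biases
```

Applied to the model in the cell above, the same helper could be called as `count_trainable(model_vgg19)`.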

## **Save and load model**

In [None]:
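# save the model parameters (state_dict)
torch.save(model.state_dict(), 'model.pth')

# load the model: Model is the user-defined model class and must match the saved architecture
loaded_model = Model(n_input_features=6)
loaded_model.load_state_dict(torch.load('model.pth'))
loaded_model.eval() # switch to evaluation mode before inference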
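Because the `Model` class is not shown in this notebook, the sketch below uses `nn.Linear(6, 1)` as a stand-in to demonstrate the full round trip: save the `state_dict`, create a fresh model with the same architecture, load the parameters, and confirm they match. The temporary directory is used only to keep the example self-contained.

```python
import os
import tempfile

import torch
import torch.nn as nn

model = nn.Linear(6, 1)  # stand-in model; the notebook's Model class is not shown

with tempfile.TemporaryDirectory() as tmp_dir:
  path = os.path.join(tmp_dir, 'model.pth')
  torch.save(model.state_dict(), path)

  loaded_model = nn.Linear(6, 1)  # must have the same architecture as the saved model
  loaded_model.load_state_dict(torch.load(path))

loaded_model.eval()  # evaluation mode for inference

# Compare every parameter of the original and the loaded model
same_weights = all(
  torch.equal(p, q) for p, q in zip(model.parameters(), loaded_model.parameters())
)
print(same_weights)
```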