# Lab 8: Pretrained Networks and Autoencoders
## 1. Model Downloading

Download one of the [availiable in pytorch pretrained neural nets](https://pytorch.org/vision/stable/models.html). And print its structure. Help:

```python
import torch, torchvision, os
import numpy as np
net = torchvision.models.NAME_OF_THE_MODEL(pretrained=True)
net
```

In [26]:
# place for code
import torch, torchvision, os
import numpy as np
net = torchvision.models.resnet18(pretrained=True)
net

ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
    (1): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
  

## 2. Change Net Structure

Replace the last fully connected layer by the linear layer same number of inputs (net.fc.in_features) and with 2 outputs

In [27]:
# place for code
net.fc = torch.nn.Linear(in_features= net.fc.in_features, out_features = 2)

## 3. Define normalization and augmentation transformations for data

Help:
```python
data_transforms = torchvision.transforms.Compose([
        torchvision.transforms.RandomResizedCrop(224),
        torchvision.transforms.RandomHorizontalFlip(),
        torchvision.transforms.ToTensor(),
        torchvision.transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ])
image_dataset = torchvision.datasets.ImageFolder('data/hymenoptera/train', data_transforms)
dataloader = torch.utils.data.DataLoader(image_dataset, batch_size=4, shuffle=True, num_workers=4)
class_names = image_dataset.classes
```

In [29]:
# place for code
data_transforms = torchvision.transforms.Compose([
        torchvision.transforms.RandomResizedCrop(224),
        torchvision.transforms.RandomHorizontalFlip(),
        torchvision.transforms.ToTensor(),
        torchvision.transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ])
image_dataset = torchvision.datasets.ImageFolder('data/hymenoptera/train', data_transforms)
dataloader = torch.utils.data.DataLoader(image_dataset, batch_size=4, shuffle=True, num_workers=4)
class_names = image_dataset.classes

## 4. Define optimization options: criterion and optimizer

In [30]:
# place for code
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(net.parameters(), lr=0.001, momentum = 0.9)

## 5. Training

Copy training cycle from previous laboratory work. Along with crossentropy loss calculate and print every **10** cycles mean accuracy. Help:

```python
predictions = np.argmax(outputs.data, axis=1)
batch_accuracy = torch.sum(predictions == labels.data)/len(predictions)
mean_accuracy = 0.99*mean_accuracy + 0.01*batch_accuracy
```

In [32]:
# place for code
mean_accuracy = 0.0
for epoch in range(1000):
    for cycle, data_batch in enumerate(dataloader):
        x_batch, label_batch = data_batch
        # forward step
        output = net(x_batch)
        
        # calculate loss
        loss = criterion(output, label_batch)
        
        # calculate gradients
        optimizer.zero_grad()
        loss.backward()
        
        # change weights
        optimizer.step()

        predictions = np.argmax(output.data, axis=1)
        batch_accuracy = torch.sum(predictions == label_batch.data)/len(predictions)
        mean_accuracy = 0.99*mean_accuracy + 0.01*batch_accuracy
        if cycle % 10 ==0: 
            print(f'epoch = {epoch},cycle={cycle}, mean_accuracy={mean_accuracy}')
    if mean_accuracy >= 1 - 0.098:
        break
print('finishid')

epoch = 0,cycle=0, mean_accuracy=0.009999999776482582
epoch = 0,cycle=10, mean_accuracy=0.08553960919380188
epoch = 0,cycle=20, mean_accuracy=0.1706009805202484
epoch = 0,cycle=30, mean_accuracy=0.23549115657806396
epoch = 0,cycle=40, mean_accuracy=0.3038100004196167
epoch = 0,cycle=50, mean_accuracy=0.36326950788497925
epoch = 0,cycle=60, mean_accuracy=0.41017061471939087
epoch = 1,cycle=0, mean_accuracy=0.4160689115524292
epoch = 1,cycle=10, mean_accuracy=0.4576133191585541
epoch = 1,cycle=20, mean_accuracy=0.4976305663585663
epoch = 1,cycle=30, mean_accuracy=0.5290653109550476
epoch = 1,cycle=40, mean_accuracy=0.5508134365081787
epoch = 1,cycle=50, mean_accuracy=0.5796404480934143
epoch = 1,cycle=60, mean_accuracy=0.6077276468276978
epoch = 2,cycle=0, mean_accuracy=0.6116503477096558
epoch = 2,cycle=10, mean_accuracy=0.6463085412979126
epoch = 2,cycle=20, mean_accuracy=0.6658352613449097
epoch = 2,cycle=30, mean_accuracy=0.6763833165168762
epoch = 2,cycle=40, mean_accuracy=0.7049731

## 6. Finetuning

Try several different configurations and write here training statistics after 10 epochs for

* Pretrained network (pretrained=True)
* Not pretrained network (pretrained=False)
* Another learning rate value
* Using scheduler (call scheduler.step() each epoch)
* Freeze weights of several layers of pretrained network

Help: 
```python
# Construct a scheduler to decay LR by a factor of 0.1 every 5 epochs
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.1)

# Freeze first 7 layers:
for i, child in enumerate(net.children()):
    if i < 7:
        for param in child.parameters():
            param.requires_grad = False
```

In [33]:
# Pretrained network (pretrained=True)
pretrained_net = torchvision.models.resnet18(pretrained=True)
pretrained_net.fc = torch.nn.Linear(in_features= pretrained_net.fc.in_features, out_features = 2)
optimizer1 = torch.optim.SGD(net.parameters(), lr=0.001, momentum = 0.9)

mean_accuracy = 0.0
for epoch in range(10):
    for cycle, data_batch in enumerate(dataloader):
        x_batch, label_batch = data_batch
        output = pretrained_net(x_batch)
        loss = criterion(output, label_batch)
        optimizer1.zero_grad()
        loss.backward()
        optimizer1.step()
        predictions = np.argmax(output.data, axis=1)
        batch_accuracy = torch.sum(predictions == label_batch.data)/len(predictions)
        mean_accuracy = 0.99*mean_accuracy + 0.01*batch_accuracy
        if cycle % 10 ==0: 
            print(f'epoch = {epoch},cycle={cycle}, mean_accuracy={mean_accuracy}')

epoch = 0,cycle=0, mean_accuracy=0.004999999888241291
epoch = 0,cycle=10, mean_accuracy=0.09552611410617828
epoch = 0,cycle=20, mean_accuracy=0.172207772731781
epoch = 0,cycle=30, mean_accuracy=0.24162867665290833
epoch = 0,cycle=40, mean_accuracy=0.3118587136268616
epoch = 0,cycle=50, mean_accuracy=0.3728286325931549
epoch = 0,cycle=60, mean_accuracy=0.4234229624271393
epoch = 1,cycle=0, mean_accuracy=0.42918872833251953
epoch = 1,cycle=10, mean_accuracy=0.4692353308200836
epoch = 1,cycle=20, mean_accuracy=0.5008914470672607
epoch = 1,cycle=30, mean_accuracy=0.5390811562538147
epoch = 1,cycle=40, mean_accuracy=0.5784927010536194
epoch = 1,cycle=50, mean_accuracy=0.6116379499435425
epoch = 1,cycle=60, mean_accuracy=0.6440406441688538
epoch = 2,cycle=0, mean_accuracy=0.6451002359390259
epoch = 2,cycle=10, mean_accuracy=0.6577988862991333
epoch = 2,cycle=20, mean_accuracy=0.6810281872749329
epoch = 2,cycle=30, mean_accuracy=0.6994459629058838
epoch = 2,cycle=40, mean_accuracy=0.721075117

In [35]:
# Not pretrained network (pretrained=False)
not_pretrained_net = torchvision.models.resnet18(pretrained=False)
not_pretrained_net.fc = torch.nn.Linear(in_features= not_pretrained_net.fc.in_features, out_features = 2)
optimizer2 = torch.optim.SGD(net.parameters(), lr=0.001, momentum = 0.9)

mean_accuracy = 0.0
for epoch in range(10):
    for cycle, data_batch in enumerate(dataloader):
        x_batch, label_batch = data_batch
        output = not_pretrained_net(x_batch)
        loss = criterion(output, label_batch)
        optimizer2.zero_grad()
        loss.backward()
        optimizer2.step()
        predictions = np.argmax(output.data, axis=1)
        batch_accuracy = torch.sum(predictions == label_batch.data)/len(predictions)
        mean_accuracy = 0.99*mean_accuracy + 0.01*batch_accuracy
        if cycle % 10 ==0: 
            print(f'epoch = {epoch},cycle={cycle}, mean_accuracy={mean_accuracy}')

epoch = 0,cycle=0, mean_accuracy=0.0024999999441206455
epoch = 0,cycle=10, mean_accuracy=0.04552849382162094
epoch = 0,cycle=20, mean_accuracy=0.08908043056726456
epoch = 0,cycle=30, mean_accuracy=0.13038983941078186
epoch = 0,cycle=40, mean_accuracy=0.17063170671463013
epoch = 0,cycle=50, mean_accuracy=0.19744037091732025
epoch = 0,cycle=60, mean_accuracy=0.23346015810966492
epoch = 1,cycle=0, mean_accuracy=0.23112556338310242
epoch = 1,cycle=10, mean_accuracy=0.26663628220558167
epoch = 1,cycle=20, mean_accuracy=0.28897398710250854
epoch = 1,cycle=30, mean_accuracy=0.302473247051239
epoch = 1,cycle=40, mean_accuracy=0.3284672796726227
epoch = 1,cycle=50, mean_accuracy=0.3399876058101654
epoch = 1,cycle=60, mean_accuracy=0.3576405644416809
epoch = 2,cycle=0, mean_accuracy=0.35906416177749634
epoch = 2,cycle=10, mean_accuracy=0.36540719866752625
epoch = 2,cycle=20, mean_accuracy=0.3805555999279022
epoch = 2,cycle=30, mean_accuracy=0.40653741359710693
epoch = 2,cycle=40, mean_accuracy=0

In [36]:
# Another learning rate value
pretrained_net = torchvision.models.resnet18(pretrained=True)
pretrained_net.fc = torch.nn.Linear(in_features= pretrained_net.fc.in_features, out_features = 2)
optimizer3 = torch.optim.SGD(net.parameters(), lr=0.009, momentum = 0.9)

mean_accuracy = 0.0
for epoch in range(10):
    for cycle, data_batch in enumerate(dataloader):
        x_batch, label_batch = data_batch
        output = pretrained_net(x_batch)
        loss = criterion(output, label_batch)
        optimizer3.zero_grad()
        loss.backward()
        optimizer3.step()
        predictions = np.argmax(output.data, axis=1)
        batch_accuracy = torch.sum(predictions == label_batch.data)/len(predictions)
        mean_accuracy = 0.99*mean_accuracy + 0.01*batch_accuracy
        if cycle % 10 ==0: 
            print(f'epoch = {epoch},cycle={cycle}, mean_accuracy={mean_accuracy}')

epoch = 0,cycle=0, mean_accuracy=0.0024999999441206455
epoch = 0,cycle=10, mean_accuracy=0.042575251311063766
epoch = 0,cycle=20, mean_accuracy=0.07209239155054092
epoch = 0,cycle=30, mean_accuracy=0.11070146411657333
epoch = 0,cycle=40, mean_accuracy=0.11916866898536682
epoch = 0,cycle=50, mean_accuracy=0.14087967574596405
epoch = 0,cycle=60, mean_accuracy=0.17315351963043213
epoch = 1,cycle=0, mean_accuracy=0.17392198741436005
epoch = 1,cycle=10, mean_accuracy=0.19585300981998444
epoch = 1,cycle=20, mean_accuracy=0.22238659858703613
epoch = 1,cycle=30, mean_accuracy=0.22257913649082184
epoch = 1,cycle=40, mean_accuracy=0.22755424678325653
epoch = 1,cycle=50, mean_accuracy=0.23952741920948029
epoch = 1,cycle=60, mean_accuracy=0.25042712688446045
epoch = 2,cycle=0, mean_accuracy=0.250422865152359
epoch = 2,cycle=10, mean_accuracy=0.25011733174324036
epoch = 2,cycle=20, mean_accuracy=0.26207059621810913
epoch = 2,cycle=30, mean_accuracy=0.26555222272872925
epoch = 2,cycle=40, mean_accur

In [38]:
# Using scheduler (call scheduler.step() each epoch)
pretrained_net = torchvision.models.resnet18(pretrained=True)
pretrained_net.fc = torch.nn.Linear(in_features= pretrained_net.fc.in_features, out_features = 2)
optimizer4 = torch.optim.SGD(net.parameters(), lr=0.001, momentum = 0.9)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer4, step_size=5, gamma=0.1)

mean_accuracy = 0.0
for epoch in range(10):
    for cycle, data_batch in enumerate(dataloader):
        x_batch, label_batch = data_batch
        output = pretrained_net(x_batch)
        loss = criterion(output, label_batch)
        loss.backward()
        optimizer4.step()
        scheduler.step()
        predictions = np.argmax(output.data, axis=1)
        batch_accuracy = torch.sum(predictions == label_batch.data)/len(predictions)
        mean_accuracy = 0.99*mean_accuracy + 0.01*batch_accuracy
        if cycle % 10 ==0: 
            print(f'epoch = {epoch},cycle={cycle}, mean_accuracy={mean_accuracy}')

epoch = 0,cycle=0, mean_accuracy=0.007499999832361937
epoch = 0,cycle=10, mean_accuracy=0.04947558417916298
epoch = 0,cycle=20, mean_accuracy=0.10917787998914719
epoch = 0,cycle=30, mean_accuracy=0.14166711270809174
epoch = 0,cycle=40, mean_accuracy=0.18080973625183105
epoch = 0,cycle=50, mean_accuracy=0.21352018415927887
epoch = 0,cycle=60, mean_accuracy=0.23606225848197937
epoch = 1,cycle=0, mean_accuracy=0.23870162665843964
epoch = 1,cycle=10, mean_accuracy=0.27353763580322266
epoch = 1,cycle=20, mean_accuracy=0.3023015558719635
epoch = 1,cycle=30, mean_accuracy=0.30684152245521545
epoch = 1,cycle=40, mean_accuracy=0.33009493350982666
epoch = 1,cycle=50, mean_accuracy=0.34175071120262146
epoch = 1,cycle=60, mean_accuracy=0.35488685965538025
epoch = 2,cycle=0, mean_accuracy=0.3563379943370819
epoch = 2,cycle=10, mean_accuracy=0.3797531723976135
epoch = 2,cycle=20, mean_accuracy=0.39125433564186096
epoch = 2,cycle=30, mean_accuracy=0.3944697678089142
epoch = 2,cycle=40, mean_accuracy=

In [39]:
# Freeze first 7 layers:
pretrained_net = torchvision.models.resnet18(pretrained=True)
pretrained_net.fc = torch.nn.Linear(in_features= pretrained_net.fc.in_features, out_features = 2)
optimizer5 = torch.optim.SGD(net.parameters(), lr=0.001, momentum = 0.9)

for i, child in enumerate(pretrained_net.children()):
    if i < 7:
        for param in child.parameters():
            param.requires_grad = False

mean_accuracy = 0.0
for epoch in range(10):
    for cycle, data_batch in enumerate(dataloader):
        x_batch, label_batch = data_batch
        output = pretrained_net(x_batch)
        loss = criterion(output, label_batch)
        optimizer5.zero_grad()
        loss.backward()
        optimizer5.step()
        predictions = np.argmax(output.data, axis=1)
        batch_accuracy = torch.sum(predictions == label_batch.data)/len(predictions)
        mean_accuracy = 0.99*mean_accuracy + 0.01*batch_accuracy
        if cycle % 10 ==0: 
            print(f'epoch = {epoch},cycle={cycle}, mean_accuracy={mean_accuracy}')

epoch = 0,cycle=0, mean_accuracy=0.0
epoch = 0,cycle=10, mean_accuracy=0.054895684123039246
epoch = 0,cycle=20, mean_accuracy=0.11412703990936279
epoch = 0,cycle=30, mean_accuracy=0.16793307662010193
epoch = 0,cycle=40, mean_accuracy=0.21393339335918427
epoch = 0,cycle=50, mean_accuracy=0.2629522681236267
epoch = 0,cycle=60, mean_accuracy=0.2907351851463318
epoch = 1,cycle=0, mean_accuracy=0.2978278398513794
epoch = 1,cycle=10, mean_accuracy=0.32667624950408936
epoch = 1,cycle=20, mean_accuracy=0.3550409972667694
epoch = 1,cycle=30, mean_accuracy=0.37586838006973267
epoch = 1,cycle=40, mean_accuracy=0.3951149880886078
epoch = 1,cycle=50, mean_accuracy=0.40951424837112427
epoch = 1,cycle=60, mean_accuracy=0.4249648153781891
epoch = 2,cycle=0, mean_accuracy=0.4257151782512665
epoch = 2,cycle=10, mean_accuracy=0.45189782977104187
epoch = 2,cycle=20, mean_accuracy=0.4705292582511902
epoch = 2,cycle=30, mean_accuracy=0.48309776186943054
epoch = 2,cycle=40, mean_accuracy=0.4917750060558319
e

In [None]:
# place for comments

# Pretrained network (pretrained=True) ------------- Mean Accuracy after 10 epoch: 0.8882108926773071
# Not Pretrained network (pretrained=False)  ------- Mean Accuracy after 10 epoch: 0.5055774450302124
# Another learning rate value (0.009)  ------------- Mean Accuracy after 10 epoch: 0.3534396290779114
# Using scheduler  --------------------------------- Mean Accuracy after 10 epoch: 0.48538312315940857
# Freeze first 7 layers  --------------------------- Mean Accuracy after 10 epoch: 0.570824146270752