# Lab 4

In this lab we work with the basic building blocks needed to construct more complex networks. The blocks of interest are:


*   Residual Blocks
*   Inception Blocks

In addition, we apply general-purpose data augmentations, whose role is to make the model robust to small variations in the input.
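As a rough illustration of what such an augmentation looks like, the sketch below mirrors an image left-to-right with a given probability. It is a dependency-free toy that assumes the image is a nested list of rows; `random_hflip` and its parameters are hypothetical names, not part of the lab code. In practice the same idea is available ready-made, for example as `torchvision.transforms.RandomHorizontalFlip`.

```python
import random

def random_hflip(image, p=0.5, rng=random):
    """Mirror `image` (a list of rows) left-to-right with probability `p`."""
    if rng.random() < p:
        return [row[::-1] for row in image]
    # Return a copy so callers can modify the result without touching the input
    return [list(row) for row in image]

image = [[1, 2, 3],
         [4, 5, 6]]

print(random_hflip(image, p=1.0))  # always flipped
print(random_hflip(image, p=0.0))  # never flipped
```

A cat mirrored horizontally is still a cat, so the label stays unchanged while the network sees a slightly different input.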



## New Operations

The following operations are used throughout the lab.

Tensor shape: (batch, channels, dim1, dim2)


*  **torch.cat(tensors, dim=0)**. The tensors must have the same size on every axis except the one being concatenated. When concatenating along the channel axis (`dim=1`), batch, dim1 and dim2 must match, while channels may differ.
*  **torch.add(input, other)**. The tensors must have the same size on all axes (or shapes that broadcast to a common shape).
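The shape rule for concatenation can be sketched without any tensors at all. The helper below is a hypothetical, pure-Python illustration (`cat_shape` is not a PyTorch function); it only mimics the shape check described above.

```python
def cat_shape(shapes, dim):
    """Return the shape produced by concatenating tensors with `shapes` along `dim`."""
    first = shapes[0]
    for shape in shapes[1:]:
        if len(shape) != len(first):
            raise ValueError("all tensors must have the same number of dimensions")
        for axis, (a, b) in enumerate(zip(first, shape)):
            # Every axis except `dim` has to agree exactly
            if axis != dim and a != b:
                raise ValueError(f"size mismatch on axis {axis}: {a} vs {b}")
    out = list(first)
    # Sizes along the concatenation axis add up
    out[dim] = sum(shape[dim] for shape in shapes)
    return tuple(out)

print(cat_shape([(1, 3, 100, 100), (1, 5, 100, 100)], dim=1))  # (1, 8, 100, 100)
```

Concatenating the same two shapes along `dim=0` raises an error in this sketch, because the channel counts 3 and 5 differ.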





In [12]:
import numpy as np
import torch.nn as nn
import torch

dummy_input_tensor1 = torch.rand((1,3,100,100))  # Random 100x100 input with 3 channels
dummy_input_tensor2 = torch.rand((1,5,100,100))  # Random 100x100 input with 5 channels

# Feature maps are normally concatenated along the channel dimension (dim=1).
x = torch.cat([dummy_input_tensor1,dummy_input_tensor2],dim=1)
print(x.shape) # output channels = channels of input 1 + channels of input 2

dummy_input_tensor1 = torch.rand((1,3,100,100))  # Random 100x100 input with 3 channels
dummy_input_tensor2 = torch.rand((1,3,100,100))  # Random 100x100 input with 3 channels

x = torch.add(dummy_input_tensor1,dummy_input_tensor2)
print(x.shape)

torch.Size([1, 8, 100, 100])
torch.Size([1, 3, 100, 100])


## Residual Block

ResNet uses residual connections (skip connections). Together with a regular path, like the ones implemented so far, a skip connection forms a residual block.

<img src="https://user-images.githubusercontent.com/6086781/28494249-97e81166-6ef6-11e7-88b8-fa4aa184bc0b.png" alt="Drawing" style="height: 400px;"/>


### Task 1 - **(3p)**

Implement ResidualBlock. It maps an input tensor of shape ($c_{input}$, width, height) to ($c_{out}$, width, height) or ($c_{out}$, width/2, height/2), depending on the stride. (You may implement either of the variants shown in the image.)
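Whether the block keeps or halves the spatial size follows from the standard convolution output-size formula, $\lfloor (n + 2p - k) / s \rfloor + 1$ (dilation ignored). The helper below is a small sketch of that arithmetic; `conv_out_size` is a hypothetical name, and the kernel/padding values are assumptions matching a 3x3 main path and a 1x1 shortcut.

```python
def conv_out_size(n, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution (or pooling) layer along one axis."""
    return (n + 2 * padding - kernel_size) // stride + 1

# Main path, 3x3 convolution with padding 1
print(conv_out_size(100, kernel_size=3, stride=1, padding=1))  # 100
print(conv_out_size(100, kernel_size=3, stride=2, padding=1))  # 50
# Shortcut path, 1x1 convolution without padding
print(conv_out_size(100, kernel_size=1, stride=2))  # 50
```

Both paths must produce the same spatial size and channel count, otherwise the element-wise addition at the end of the block fails.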




In [13]:
import numpy as np
import torch.nn as nn
import torch
import torch.nn.functional as F

class ResidualBlock(nn.Module):
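  def __init__(self,input_channels=32,hidden_channels=64,output_channels=64,kernel_size=3,stride=1,activation=nn.ReLU()):
    super(ResidualBlock,self).__init__()
    # "Same"-style padding for odd kernel sizes, so only the stride changes the spatial size
    padding = kernel_size // 2
    layers = [
            nn.Conv2d(input_channels, hidden_channels, kernel_size, stride=stride, padding=padding),
            nn.BatchNorm2d(hidden_channels),
            activation,
            nn.Conv2d(hidden_channels, output_channels, kernel_size, stride=1, padding=padding),
            nn.BatchNorm2d(output_channels)
    ]

    # Main path: conv -> BN -> activation -> conv -> BN
    self.net_normal = nn.Sequential(*layers)

    # Shortcut path: a 1x1 convolution that matches the channels and stride of the main path
    self.net_residual = nn.Conv2d(input_channels,output_channels,1,stride)

  def forward(self,x):
    # Sum the two paths element-wise, then apply the final activation
    x = torch.add(self.net_normal(x),self.net_residual(x))
    x = F.relu(x)
    return x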

# block = ResidualBlock(3,64,128,3,1,nn.ReLU())
# x = torch.rand(size=(1,3,100,100))

# stride=2 halves the spatial size of the output
block = ResidualBlock(3,64,128,3,2,nn.ReLU())
x = torch.rand(size=(1,3,100,100))

# Should output torch.Size([1, 128, 50, 50])
print(block(x).shape)

torch.Size([1, 128, 50, 50])


## Inception Block

GoogLeNet/InceptionNet uses the Inception Block, which consists of several slightly different mini-networks (paths) whose outputs are merged at the end of the block.

### Task 2 - **(3p)**

Implement the Inception Block. It must map a tensor of shape ($ch_{input}$, w, h) to ($ch_{out}$, w/2, h/2).

Grading:
- **0.5p** for each correctly implemented path **(4 * 0.5p = 2p)**
- the code runs and the final result is correct (**1p**)



<img src="https://drive.google.com/uc?export=view&id=11eiInGoQoytm_N0989v5oJnKpFrspUzk" alt="Drawing" style="height: 200px; width:200px;"/>

In [14]:
import numpy as np
import torch.nn as nn
import torch
import torch.nn.functional as F


class InceptionBlock(nn.Module):
  def __init__(self,input_channels=32,kernel_size=3,stride=1,activation=nn.ReLU()):
    super(InceptionBlock,self).__init__()

    output_size = 64

    layers = [
        nn.Conv2d(input_channels, 64, kernel_size=1, stride=stride),
        nn.BatchNorm2d(64),
        activation,
        nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1),
        nn.BatchNorm2d(128),
        activation,
        nn.Conv2d(128, output_size, kernel_size=3, stride=1, padding=1),
        nn.BatchNorm2d(output_size),
        activation
    ]

    self.path1 = nn.Sequential(*layers)

    layers = [
        nn.Conv2d(input_channels, 128, kernel_size=1, stride=stride),
        nn.BatchNorm2d(128),
        activation,
        nn.Conv2d(128, output_size, kernel_size=3, stride=1, padding=1),
        nn.BatchNorm2d(output_size),
        activation
    ]

    self.path2 = nn.Sequential(*layers)

    layers = [
        nn.MaxPool2d(kernel_size, stride=stride, padding=1),
        nn.Conv2d(input_channels, output_size, kernel_size=1),
        nn.BatchNorm2d(output_size),
        activation
    ]

    self.path3 = nn.Sequential(*layers)

    layers = [
        nn.Conv2d(input_channels, output_size, kernel_size=1, stride=stride),
        nn.BatchNorm2d(output_size),
        activation
    ]

    self.path4 = nn.Sequential(*layers)

  def forward(self,x):
    # Every path receives the same input and produces 64 channels at the same spatial size
    x1 = self.path1(x)
    x2 = self.path2(x)
    x3 = self.path3(x)
    x4 = self.path4(x)
    # Merge the paths along the channel dimension: 4 * 64 = 256 channels
    x = torch.cat([x1,x2,x3,x4],1)
    return x

block = InceptionBlock(64,3,2,nn.ReLU())
x = torch.rand(size=(1,64,100,100))

# Should output torch.Size([1, 256, 50, 50])
print(block(x).shape)

torch.Size([1, 256, 50, 50])


## Instantiating the datasets

In this lab we work with a new dataset: the one used in [this Kaggle competition](https://www.kaggle.com/c/dogs-vs-cats/overview), Dogs vs. Cats.

***In this competition, you'll write an algorithm to classify whether images contain either a dog or a cat.  This is easy for humans, dogs, and cats. Your computer will find it a bit more difficult.***


## Downloading the dataset

### Authenticating with Kaggle using kaggle.json

Navigate to https://www.kaggle.com. Then go to the [Account tab of your user profile](https://www.kaggle.com/me/account) and select Create API Token. This will trigger the download of kaggle.json, a file containing your API credentials.

Then run the cell below to upload kaggle.json to your Colab runtime.


In [None]:
from google.colab import files

uploaded = files.upload()

for fn in uploaded.keys():
  print('User uploaded file "{name}" with length {length} bytes'.format(
      name=fn, length=len(uploaded[fn])))

# Then move kaggle.json into the folder where the API expects to find it.
!mkdir -p ~/.kaggle/ && mv kaggle.json ~/.kaggle/ && chmod 600 ~/.kaggle/kaggle.json

In [None]:
!pip install kaggle
!kaggle competitions download -c dogs-vs-cats
!for z in *.zip; do unzip "$z"; done
!ls

In [None]:
!unzip train.zip
!unzip test1.zip

## Creating the DataLoader

Next, to load the data we use a more complex object, a `torch.utils.data.Dataset`. It has 3 important methods:


*   `__init__()`
*   `__len__()`
*   `__getitem__()`
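The role of each method can be sketched with a dependency-free stand-in. The class below is hypothetical (`ToyCatsDogs` is not part of the lab) and deliberately does not subclass `torch.utils.data.Dataset`, so that it runs without PyTorch; it returns the file path instead of an image tensor. A real dataset would implement the same three methods.

```python
import os

class ToyCatsDogs:
    """Dependency-free stand-in for a map-style dataset (hypothetical name)."""

    def __init__(self, file_list):
        # Store everything needed to produce a sample later
        self.file_list = file_list

    def __len__(self):
        # Number of samples in the dataset
        return len(self.file_list)

    def __getitem__(self, idx):
        # Build and return sample number `idx` as a (data, label) pair
        path = self.file_list[idx]
        name = os.path.basename(path).split('.')[0]  # "dog.12.jpg" -> "dog"
        return path, 1 if name == 'dog' else 0

ds = ToyCatsDogs(['./train/dog.12.jpg', './train/cat.7.jpg'])
print(len(ds))  # 2
print(ds[0])    # ('./train/dog.12.jpg', 1)
print(ds[1])    # ('./train/cat.7.jpg', 0)
```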



In [None]:
import torch as t
from PIL import Image
import numpy as np
from torch.utils.data import Dataset, DataLoader
from torchvision.transforms.functional import to_tensor, normalize
import random, os
random.seed(42)

import matplotlib.pyplot as plt
import matplotlib.patches as patches

train_dir = '../data/train'
test_dir = '../data/test'

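class CatsDogsDataset(Dataset):
    def __init__(self, file_list, width=128, height=128, transform=None):
        self.file_list = file_list
        # Optional callable applied to the resized PIL image; expected to return a PIL image
        self.transform = transform
        self.img_size = (width, height)

    def __len__(self):
        return len(self.file_list)

    def __getitem__(self,idx):
        img_path = self.file_list[idx]
        # Force 3 channels so grayscale or RGBA files do not break batching
        img = Image.open(img_path).convert('RGB')

        img = img.resize(self.img_size)
        if self.transform is not None:
            img = self.transform(img)

        # File names look like "dog.123.jpg" / "cat.456.jpg"; the prefix is the class
        name = os.path.basename(img_path).split('.')[0]
        if name == 'dog':
            label = 1
        elif name == 'cat':
            label = 0
        else:
            raise ValueError("Unexpected file name: {}".format(img_path))

        return to_tensor(img), label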

Building the Dataset and visualizing the data.

In [None]:
from IPython.display import clear_output
import time

train_test_proportion = .85

import glob

samples = glob.glob(os.path.join('./train','*.jpg'))
random.shuffle(samples)

train_samples = samples[:int(train_test_proportion*len(samples))]
test_samples = samples[int(train_test_proportion*len(samples)):]

cats_dogs_train = CatsDogsDataset(train_samples,128,128)
cats_dogs_test = CatsDogsDataset(test_samples,128,128)

train_loader = DataLoader(cats_dogs_train, batch_size=16, shuffle=True, num_workers=4)
test_loader = DataLoader(cats_dogs_test, batch_size=16, shuffle=False, num_workers=4)

see_examples = 10
for i, (imgs, label) in enumerate(train_loader):
    clear_output(wait=True)
    image = imgs[0].numpy().transpose(1, 2, 0)
    plt.imshow(image)
    plt.text(5, 30, "DOG" if label[0] else "CAT", fontsize ='xx-large', color='red', fontweight='bold')
    plt.show()

    if i >= see_examples - 1:
      break
    time.sleep(1)


### Task 3 - **(4p)**

  1. Train a convolutional network (an architecture of your choice from Lab 3) on this dataset, on the GPU (https://pytorch.org/docs/stable/notes/cuda.html) **(1p)**
  2. Train a ResNet-style network (built from Residual blocks) **(1p)**
  3. Train an Inception-style network (built from Inception blocks)  **(1p)**
  4. Experiment with different hyperparameters (number of layers, number of filters/neurons, etc.) **(1p)**
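For the last item, one simple way to organize the experiments is to enumerate every combination of a small search space and train once per combination. The sketch below only generates the configurations; the keys and values in `search_space` are placeholder assumptions, not recommended settings, and `iter_configs` is a hypothetical helper.

```python
from itertools import product

# Hypothetical search space; the values are placeholders, not recommendations
search_space = {
    "lr": [1e-2, 2e-3],
    "epochs": [10, 15],
    "batch_size": [16, 32],
}

def iter_configs(space):
    """Yield one dict per combination of hyperparameter values."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))

configs = list(iter_configs(search_space))
print(len(configs))  # 8
print(configs[0])    # {'lr': 0.01, 'epochs': 10, 'batch_size': 16}
```

Each resulting dict can then be used to build the optimizer and DataLoader for one training run, and the final accuracy recorded next to it.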


In [None]:
import torch as t
from PIL import Image
import numpy as np
from torch.utils.data import Dataset, DataLoader
from torchvision.transforms.functional import to_tensor, normalize
import random, os
random.seed(37)


from IPython.display import clear_output
import time

train_test_proportion = .85

import glob

samples = glob.glob(os.path.join('./train','*.jpg'))
random.shuffle(samples)

train_samples = samples[:int(train_test_proportion*len(samples))]
test_samples = samples[int(train_test_proportion*len(samples)):]

cats_dogs_train = CatsDogsDataset(train_samples,32,32)
cats_dogs_test = CatsDogsDataset(test_samples,32,32)

train_loader = DataLoader(cats_dogs_train, batch_size=16, shuffle=True, num_workers=4)
test_loader = DataLoader(cats_dogs_test, batch_size=16, shuffle=False, num_workers=4)


In [None]:
import torch.nn as nn
import torch
import torch.optim as optim
import numpy as np
from torch.utils.data import DataLoader
from torchvision.transforms.functional import to_tensor, normalize
import torchvision.transforms as transforms

# Use the GPU when one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

class LeNet(nn.Module):
  def __init__(self):
    super().__init__()

    # Convolutions
    self.conv1 = nn.Conv2d(in_channels=3, out_channels=6, kernel_size=(5,5), stride=(1,1), padding=0)
    self.conv2 = nn.Conv2d(in_channels=6, out_channels=16, kernel_size=(5,5), stride=(1,1), padding=0)
    self.conv3 = nn.Conv2d(in_channels=16, out_channels=16, kernel_size=(5,5), stride=(2,2), padding=0)

    # Activation
    self.tanh = nn.Tanh()

    # Pooling
    self.pooling = nn.AvgPool2d(kernel_size=(2,2), stride=(2,2))

    # Fully connected layers; in_features=16 is 16 channels x 1 x 1 for 32x32 inputs
    self.linear1 = nn.Linear(16, 84)
    self.linear2 = nn.Linear(84, 2)

  def forward(self,x):

    x = self.tanh(self.conv1(x))
    x = self.tanh(self.pooling(x))
    x = self.tanh(self.conv2(x))
    x = self.tanh(self.pooling(x))
    x = self.tanh(self.conv3(x))

    x = x.view(x.size(0), -1)
    x = self.tanh(self.linear1(x))
    # Return raw logits: nn.CrossEntropyLoss applies log-softmax internally
    x = self.linear2(x)
    return x
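The input size of the first fully connected layer depends on the image resolution. The sketch below traces the spatial size through the convolution and pooling layers of the LeNet above for a 32x32 input, using the usual output-size formula with zero padding. `out_size` and `trace` are hypothetical helpers written for this illustration.

```python
def out_size(n, kernel_size, stride=1, padding=0):
    return (n + 2 * padding - kernel_size) // stride + 1

# (name, kernel_size, stride) for each spatial layer in LeNet.forward; padding is 0 everywhere
layers = [
    ("conv1", 5, 1),
    ("pool", 2, 2),
    ("conv2", 5, 1),
    ("pool", 2, 2),
    ("conv3", 5, 2),
]

def trace(n, layers):
    """Return the spatial size after each layer for a square input of side n."""
    sizes = []
    for name, k, s in layers:
        n = out_size(n, k, s)
        sizes.append((name, n))
    return sizes

print(trace(32, layers))
# [('conv1', 28), ('pool', 14), ('conv2', 10), ('pool', 5), ('conv3', 1)]
```

With 16 output channels and a 1x1 map, flattening gives 16 features, which matches `nn.Linear(16, 84)`. For 128x128 inputs the same trace ends at 13x13, so the first linear layer would need 16 * 13 * 13 = 2704 input features instead.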








In [None]:
def test_acc(net: nn.Module, test_loader: DataLoader):
  net.eval()

  total = 0
  correct = 0

  # Gradients are not needed for evaluation
  with torch.no_grad():
    for test_images, test_labels in test_loader:
      test_images, test_labels = test_images.to(device), test_labels.to(device)
      total += len(test_images)
      # One predicted class per sample: argmax over the class dimension
      out_class = torch.argmax(net(test_images), dim=1)
      correct += torch.sum(out_class == test_labels).item()

  return correct / total * 100


def train_fn(epochs: int, train_loader: DataLoader, test_loader: DataLoader,
             net: nn.Module, loss_fn: nn.Module, optimizer: optim.Optimizer):
  # Iterate over the epochs
  for e in range(epochs):
    net.train()

    # Iterate over every batch in the dataloader
    for images, labels in train_loader:
      # Move the batch to the selected device
      images, labels = images.to(device), labels.to(device)
      # Run the network on the images of the current batch
      out = net(images)
      # Apply the loss function to the network output and the image labels
      loss = loss_fn(out, labels)
      # Run back-propagation
      loss.backward()
      # Take an optimization step to update the network parameters
      optimizer.step()
      # Call zero_grad() to clear the gradients of the current iteration
      optimizer.zero_grad()

    print("Loss at the end of epoch {} is {}".format(e + 1, loss.item()))

    # Compute the accuracy
    acc = test_acc(net, test_loader)
    print("Accuracy at the end of epoch {} is {:.2f}%".format(e + 1, acc))
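Accuracy needs one predicted class per sample, which means taking the argmax along the class axis. `torch.argmax` called without a `dim` argument returns a single index into the flattened tensor instead. The pure-Python sketch below illustrates the difference on made-up numbers; the `argmax` helper and the values are assumptions for illustration only.

```python
def argmax(values):
    """Index of the largest element in a flat list."""
    return max(range(len(values)), key=lambda i: values[i])

# Hypothetical network outputs for a batch of 3 samples and 2 classes (cat=0, dog=1)
outputs = [[2.0, 0.5],
           [0.1, 1.2],
           [0.3, 0.9]]
labels = [0, 1, 0]

# One prediction per sample: argmax along the class axis
predictions = [argmax(row) for row in outputs]
correct = sum(int(p == y) for p, y in zip(predictions, labels))
accuracy = correct / len(labels) * 100

print(predictions)         # [0, 1, 1]
print(f"{accuracy:.2f}%")  # 66.67%

# Argmax over the flattened batch gives a single index, not per-sample classes
flat = [v for row in outputs for v in row]
print(argmax(flat))  # 0
```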

In [None]:
epochs = 10

network = LeNet().to(device)

optimizer = optim.SGD(network.parameters(), lr=1e-2)
optimizer.zero_grad()

loss_fn = nn.CrossEntropyLoss()


train_fn(epochs, train_loader, test_loader, network, loss_fn, optimizer)
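`nn.CrossEntropyLoss` applies log-softmax internally, so it expects raw logits from the network. The pure-Python sketch below, with made-up logits, shows what happens when probabilities are passed instead and softmax is effectively applied twice. The `softmax` and `cross_entropy` helpers are written for this illustration and only mirror the single-sample definition of the loss.

```python
import math

def softmax(logits):
    # Subtract the maximum for numerical stability
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    """Cross-entropy for one sample: -log(softmax(logits)[target])."""
    return -math.log(softmax(logits)[target])

logits = [4.0, -2.0]  # hypothetical confident, correct prediction for class 0
loss_from_logits = cross_entropy(logits, 0)
# Feeding probabilities instead of logits applies softmax twice
loss_from_probs = cross_entropy(softmax(logits), 0)

print(loss_from_logits < loss_from_probs)  # True
```

With two classes, probabilities differ by at most 1, so the loss computed from them cannot fall below log(1 + e^-1), roughly 0.31, no matter how confident the network is. That floor weakens the training signal.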

In [None]:
import torch.nn as nn
import torch
import torch.optim as optim
import numpy as np
from torch.utils.data import DataLoader
from torchvision.transforms.functional import to_tensor, normalize
import torchvision.transforms as transforms

class LeNet2(nn.Module):
  def __init__(self):
    super().__init__()

    # Convolutions
    self.conv1 = nn.Conv2d(in_channels=3, out_channels=6, kernel_size=(5,5), stride=(1,1), padding=0)
    self.conv2 = nn.Conv2d(in_channels=6, out_channels=16, kernel_size=(5,5), stride=(1,1), padding=0)
    self.conv3 = nn.Conv2d(in_channels=16, out_channels=120, kernel_size=(5,5), stride=(2,2), padding=0)
    self.conv4 = nn.Conv2d(in_channels=120, out_channels=64, kernel_size=(5,5), stride=(1,1), padding=0)

    # Activation
    self.relu = nn.ReLU()

    # Pooling
    self.pooling = nn.AvgPool2d(kernel_size=(2,2), stride=(2,2))

    # Fully connected layers. in_features=64 assumes conv4 outputs 64 x 1 x 1,
    # which holds for square inputs of 104 to 119 px, not for the 32x32 images loaded above
    self.linear1 = nn.Linear(64, 32)
    self.linear2 = nn.Linear(32, 2)

  def forward(self,x):

    x = self.relu(self.conv1(x))
    x = self.relu(self.pooling(x))
    x = self.relu(self.conv2(x))
    x = self.relu(self.pooling(x))
    x = self.relu(self.conv3(x))
    x = self.relu(self.pooling(x))
    x = self.relu(self.conv4(x))

    x = x.view(x.size(0), -1)
    # Hidden layer uses ReLU; softmax belongs only at the output, inside the loss
    x = self.relu(self.linear1(x))
    # Return raw logits: nn.CrossEntropyLoss applies log-softmax internally
    x = self.linear2(x)
    return x


In [None]:

epochs = 15

network = LeNet().to(device)

optimizer = optim.SGD(network.parameters(), lr=(1e-3)*2)
optimizer.zero_grad()

loss_fn = nn.CrossEntropyLoss()


train_fn(epochs, train_loader, test_loader, network, loss_fn, optimizer)

In [None]:
import torch.nn as nn
import torch
import torch.optim as optim
import numpy as np
from torch.utils.data import DataLoader
from torchvision.transforms.functional import to_tensor, normalize
import torchvision.transforms as transforms



def test_acc(net: nn.Module, test_loader: DataLoader):
  net.eval()
  # Run on whichever device the network parameters live on
  device = next(net.parameters()).device

  total = 0
  correct = 0

  # Gradients are not needed for evaluation
  with torch.no_grad():
    for test_images, test_labels in test_loader:
      test_images, test_labels = test_images.to(device), test_labels.to(device)
      total += len(test_images)
      # One predicted class per sample: argmax over the class dimension
      out_class = torch.argmax(net(test_images), dim=1)
      correct += torch.sum(out_class == test_labels).item()

  return correct / total * 100


def train_fn(epochs: int, train_loader: DataLoader, test_loader: DataLoader,
             net: nn.Module, loss_fn: nn.Module, optimizer: optim.Optimizer):
  # Run on whichever device the network parameters live on
  device = next(net.parameters()).device
  # Iterate over the epochs
  for e in range(epochs):
    net.train()

    # Iterate over every batch in the dataloader
    for images, labels in train_loader:
      # Move the batch to the same device as the network
      images, labels = images.to(device), labels.to(device)
      # Run the network on the images of the current batch
      out = net(images)
      # Apply the loss function to the network output and the image labels
      loss = loss_fn(out, labels)
      # Run back-propagation
      loss.backward()
      # Take an optimization step to update the network parameters
      optimizer.step()
      # Call zero_grad() to clear the gradients of the current iteration
      optimizer.zero_grad()

    print("Loss at the end of epoch {} is {}".format(e + 1, loss.item()))

    # Compute the accuracy
    acc = test_acc(net, test_loader)
    print("Accuracy at the end of epoch {} is {:.2f}%".format(e + 1, acc))

class LeNetWithInception(nn.Module):
    def __init__(self):
        super(LeNetWithInception, self).__init__()

        # First Inception Block
        self.inception1 = InceptionBlock(input_channels=3, kernel_size=3, stride=2, activation=nn.ReLU())

        # Convolutional layers

        self.conv1 = nn.Conv2d(in_channels=256, out_channels=120, kernel_size=(5,5), stride=(1,1), padding=0)
        self.conv2 = nn.Conv2d(in_channels=120, out_channels=16, kernel_size=(5,5), stride=(1,1), padding=0)


        # Activation function
        self.relu = nn.ReLU()



        # Full connection
        self.linear1 = nn.Linear(1024, 16)
        self.linear2 = nn.Linear(16, 2)

    def forward(self, x):
        x = self.inception1(x)
        x = self.relu(self.conv1(x))
        x = self.relu(self.conv2(x))

        x = x.view(x.size(0), -1)
        x = self.relu(self.linear1(x))
        # No activation on the last layer:
        # nn.CrossEntropyLoss() expects raw logits
        x = self.linear2(x)
        return x

# Initialize the model

# device = torch.device("cuda")
# network = LeNetWithInception().to(device)
network = LeNetWithInception()

# Define the optimizer and loss function
optimizer = optim.SGD(network.parameters(), lr=0.001)
loss_fn = nn.CrossEntropyLoss()

# Train the model
epochs = 5
train_fn(epochs, train_loader, test_loader, network, loss_fn, optimizer)

In [None]:
class LeNetWithResidual(nn.Module):
    def __init__(self):
        super(LeNetWithResidual, self).__init__()

        self.residual = ResidualBlock(input_channels=3, kernel_size=3, stride=2, activation=nn.ReLU())

        # Convolutional layers
        self.conv1 = nn.Conv2d(in_channels=64, out_channels=120, kernel_size=(5,5), stride=(1,1), padding=0)
        self.conv2 = nn.Conv2d(in_channels=120, out_channels=8, kernel_size=(5,5), stride=(1,1), padding=0)

        # Activation function
        self.relu = nn.ReLU()

        # Full connection
        self.linear1 = nn.Linear(512, 1024)
        self.linear2 = nn.Linear(1024, 2)

    def forward(self, x):
        x = self.residual(x)
        x = self.conv1(x)
        x = self.relu(x)
        x = self.relu(self.conv2(x))

        x = x.view(x.size(0), -1)
        x = self.relu(self.linear1(x))
        # Return raw logits: nn.CrossEntropyLoss applies log-softmax internally
        x = self.linear2(x)
        return x

# Initialize the model

# device = torch.device("cuda")
# network = LeNetWithResidual().to(device)
network = LeNetWithResidual()

# Define the optimizer and loss function
optimizer = optim.SGD(network.parameters(), lr=0.001)
loss_fn = nn.CrossEntropyLoss()

# Train the model
epochs = 5
train_fn(epochs, train_loader, test_loader, network, loss_fn, optimizer)