# Pràctica 2

L'objectiu d'aquesta segona pràctica es demostrar que heu assolit els conceptes que s'han explicar a l'assignatura i s'han practicat a les sessions presencials, relacionats amb el disseny i l'ús de xarxes neurals. 

Aquesta pràctica consta de 3 enunciats dels quals només heu de realitzar el que us hagi tocat per sorteig.

**Condicions**
1. El model que solucioni el problema estarà basat en xarxes neurals, aquestes s'han d'entrenar i avaluar emprant la llibreria _Pytorch_.
2. Es demana que com a mínim s'avaluin 2 models diferents: un que ha d'estar creat per vosaltres i un altre que es basi en una xarxa ja existent. Evidentment es permeten modificacions de la ja existen per adaptar-ho al problema que es vol resoldre.
3. El resultat del treball serà un informe on s'expliqui el procés que s'ha dut a terme per arribar a la que considereu que és millor solució. El document serà en format `pdf`. Podreu adjuntar una carpeta amb el codi i recursos que trobeu necessaris per comprovar la veracitat del que explicau al document.
4. Aquest document ha de tenir un llenguatge formal i tècnic i ha d'estar correctament estructurat:
    - Introducció al problema
    - Solucions considerades (dades, característiques, models, mètriques)
    - Experiments realitzats
    - Resultats dels experiments
    - Conclusions
5. A més del document explicatiu s'ha d'adjuntar un fitxer amb els pesos del millor entrenament de cada una de les xarxes que heu emprat (la que heu dissenyat vosaltres i la que ja existia), de tal manera que el professor pugui validar els resultats sense haver de repetir l'entrenament. Sense l'adjunció d'aquests fitxers la pràctica no es podrà aprovar.
6. Les dades depenen de cada un dels tres enunciats i les trobareu en el seu apartat.


**Avaluació**

- El treball es durà a terme en parelles.
- El professor es reserva la possibilitat de convocar als grups a una revisió de la pràctica de forma presencial.
- Només està permés emprar tècniques de disseny i entrenament vistes a classe.
- Tot el que no està fet pels alumnes ha d'estar referenciat, en cas contrari es considerarà com una còpia.


**Data d'entrega**

- Aquest treball s'entrega dia 15 de gener.
- Es realitzarà una tutoria dilluns 9 de gener a les 15:30.
---

## Enunciat 1: Classificació

El problema que heu de resoldre en aquest cas és un problema de classificació amb el conjunt de dades _Horses or Human_ dataset [enllaç](https://laurencemoroney.com/datasets.html). És un conjunt de dades generat per ordinador en el que trobareu dues classes diferents: persones i cavalls (500 imatges de cavalls i 547 imatges de persones). També dos subconjunts de dades ja definits: entrenament  i validació. Les imatges tenen una mida de 300x300 pixels i es troben en RGB.

A més de la feina de classificació i presentació dels resultats amb el conjunt de dades que es proporciona, també es demana que construiu un petit conjunt d'imatges (entre 10 i 20) de persones i cavalls reals com a conjunt de test i obtingueu les mesures rendiment adients per aquestes dades.


**Exemples del dataset**
<div style="display:flex">
     <div style="flex:1;padding-right:10px;">
          <img src="img/human01-16.png" width="200"/>
     </div>
     <div style="flex:1;padding-left:10px;">
          <img src="img/horse03-3.png" width="200"/>
     </div>
</div>

In [15]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import Dataset, DataLoader
from torch.utils.tensorboard import SummaryWriter  # to print to tensorboard


import matplotlib.pyplot as plt
import numpy as np
import os
from PIL import Image

In [16]:
class CustomDataset(Dataset):

    def __init__(self, path, transform=None):
        self.path = path
        self.transform = transform
        self.all_files = [f for f in os.listdir(self.path) if f.endswith('.png')]
        self.labels = [f.split('-')[0] for f in self.all_files]
        self.labels = [0 if x == 'horse' else 1 for x in self.labels]
       
    def __len__(self):
        return len(self.all_files)
    
    def __getitem__(self, index):
        image_path = os.path.join(self.path, self.all_files[index])
        image = Image.open(image_path).convert('RGB')
        label = self.labels[index]

        if self.transform:
            image = self.transform(image)
        # image = transforms.ToTensor()(image)
        # image = image.permute(1, 2, 0)

        return image, label

In [17]:

def get_normalization_values(path):
    """
    Compute the mean and standard deviation of the pixel values for each channel
    in the images stored in the specified folder.
    """
    red_values = []
    green_values = []
    blue_values = []

    for file in os.listdir(path):
        if file.endswith('.png'):
            image = Image.open(os.path.join(path, file))
            image_np = np.array(image)

            red, green, blue = image_np[:,:,0], image_np[:,:,1], image_np[:,:,2]

            red_values.append(red)
            green_values.append(green)
            blue_values.append(blue)

    red_mean = np.mean(red_values)/255
    green_mean = np.mean(green_values)/255
    blue_mean = np.mean(blue_values)/255

    red_std = np.std(red_values)/255
    green_std = np.std(green_values)/255
    blue_std = np.std(blue_values)/255

    return (red_mean, green_mean, blue_mean), (red_std, green_std, blue_std)


In [18]:
# mean, std = get_normalization_values("data/train/")

# mean = torch.tensor(mean)
# std = torch.tensor(std)


In [19]:
transform = transforms.Compose([
    transforms.ToTensor(),
    # transforms.Normalize(mean, std)
])

train_dataset = CustomDataset(path = "data/train/",  transform=transform)
test_dataset = CustomDataset(path = "data/validation/", transform=transform)

train_batch_size = 64
test_batch_size = 64

train_loader = DataLoader(train_dataset, batch_size=train_batch_size, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=test_batch_size, shuffle=True)

In [20]:
# # Get the first batch of data from the data loader
# batch = next(iter(train_loader))
# data, labels = batch

# # Make a figure with subplots
# fig, axs = plt.subplots(8, 8, figsize=(8, 8))

# # Iterate over the data and labels and plot the images
# for i, (data, label) in enumerate(zip(data, labels)):
#     # Calculate the row and column indices for the subplot
#     data = np.squeeze(data)

#     row = i // 8
#     col = i % 8
#     # Plot the image on the corresponding subplot
#     ax = axs[row, col]
#     ax.imshow(data)
#     ax.axis('off')

# # Show the plot
# plt.show()

In [21]:
# custom weights initialization called on netG and netD
def weights_init(m):
    classname = m.__class__.__name__
    if classname.find('Conv') != -1:
        nn.init.normal_(m.weight.data, 0.0, 0.02)
    elif classname.find('BatchNorm') != -1:
        nn.init.normal_(m.weight.data, 1.0, 0.02)
        nn.init.constant_(m.bias.data, 0)

In [22]:
class Net_CNN(nn.Module):
    def __init__(self, channels, feature_size, num_classes):
        super().__init__()
        self.main = nn.Sequential(
            self.block(channels, feature_size, 3, 2, 1),
            self.block(feature_size, feature_size*2, 3, 2, 1),
            self.block(feature_size*2, feature_size*4, 3, 2, 1),
            self.block(feature_size*4, feature_size*8, 3, 2, 1),
            nn.Flatten(),
            nn.Linear(16, 8),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(8, num_classes),
            nn.Softmax(dim=1)
        )
    
    def forward(self, x):
        return self.main(x)

    def block(self, in_channels, out_channels, kernel_size, stride, padding):
        return nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(),
            nn.MaxPool2d(2, 2)
        )

In [23]:
class Net_FCN(nn.Module):
    def __init__(self, channels, feature_size):
        super().__init__()
        self.main = nn.Sequential(
            self.block(channels, feature_size, 3, 2, 1), # 300x300 --> 150x150
            self.block(feature_size, feature_size*2, 3, 2, 1), # 150x150 --> 75x75
            self.block(feature_size*2, feature_size*4, 3, 2, 1), # 75x75 --> 38x38
            self.block(feature_size*4, feature_size*8, 3, 2, 1), # 38x38 --> 19x19
            self.block(feature_size*8, feature_size*4, 3, 2, 1), # 19x19 --> 10x10
            self.block(feature_size*4, feature_size*2, 3, 2, 1), # 10x10 --> 5x5
            self.block(feature_size*2, feature_size, 3, 2, 1), # 5x5 --> 3x3
            self.block(feature_size, 1, 3, 2, 0), # 3x3 --> 2x2
            nn.Flatten(),
            nn.Softmax(dim=1)
            
        )
    
    def forward(self, x):
        return self.main(x)

    def block(self, in_channels, out_channels, kernel_size, stride, padding):
        return nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding),
            nn.BatchNorm2d(out_channels),
            nn.ReLU()
        )

In [24]:
device = "cuda" if torch.cuda.is_available() else "cpu"

lr_CNN = 0.001
lr_FCN = 0.01

channels = 3

feature_size_CNN = 2
feature_size_FCN = 2

num_classes = 2

num_epochs_CNN = 50
num_epochs_FCN = 50

net_CNN = Net_CNN(channels, feature_size_CNN, num_classes).to(device)
net_FCN = Net_FCN(channels, feature_size_FCN).to(device)

net_CNN.apply(weights_init)
net_FCN.apply(weights_init)

criterion_CNN = nn.CrossEntropyLoss()
criterion_FCN = nn.CrossEntropyLoss()

optimizer_CNN = torch.optim.SGD(net_CNN.parameters(), lr=lr_CNN, momentum=0.9)
optimizer_FCN = torch.optim.SGD(net_FCN.parameters(), lr=lr_FCN, momentum=0.9)

In [25]:
print(net_FCN)
pytorch_total_params_CNN = sum(p.numel() for p in net_CNN.parameters())
pytorch_total_params_FCN = sum(p.numel() for p in net_FCN.parameters())
print("Total number of parameters CNN: ", pytorch_total_params_CNN)
print("Total number of parameters FCN: ", pytorch_total_params_FCN)

Net_FCN(
  (main): Sequential(
    (0): Sequential(
      (0): Conv2d(3, 2, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
      (1): BatchNorm2d(2, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): ReLU()
    )
    (1): Sequential(
      (0): Conv2d(2, 4, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
      (1): BatchNorm2d(4, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): ReLU()
    )
    (2): Sequential(
      (0): Conv2d(4, 8, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
      (1): BatchNorm2d(8, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): ReLU()
    )
    (3): Sequential(
      (0): Conv2d(8, 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
      (1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): ReLU()
    )
    (4): Sequential(
      (0): Conv2d(16, 8, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
      (1): BatchNorm2d(8, eps=

In [None]:
writer_train = SummaryWriter(f"logs/train")
writer_test = SummaryWriter(f"logs/test")

step_train = 0
step_test = 0

tr_loss = np.zeros((num_epochs_CNN))
te_loss = np.zeros((num_epochs_CNN))

tr_acc = np.zeros((num_epochs_CNN))
te_acc = np.zeros((num_epochs_CNN))

for epoch in range(num_epochs_CNN):

    train_loss = 0
    test_loss = 0
    train_acc = 0
    test_acc = 0

    net_CNN.train()
    for i, (data, labels) in enumerate(train_loader):
        data = data.to(device)
        labels = torch.tensor(labels)
        labels = labels.to(device)

        optimizer_CNN.zero_grad()
        outputs = net_CNN(data)
        loss = criterion_CNN(outputs, labels)
        loss.backward()
        optimizer_CNN.step()

        train_loss += loss.item()

        # compute the accuracy for this batch
        _, predicted = torch.max(outputs, 1)
        correct = (predicted == labels).sum().item()
        accuracy = correct / len(labels)
        train_acc += accuracy


        writer_train.add_scalar("Training loss", loss, global_step=step_train)
        step_train += 1
    
    net_CNN.eval()
    with torch.no_grad():
        for j, (data, labels) in enumerate(test_loader):
            data = data.to(device)
            labels = torch.tensor(labels)
            labels = labels.to(device)

            outputs = net_CNN(data)
            loss = criterion_CNN(outputs, labels)

            test_loss += loss.item()
            
            # compute the accuracy for this batch
            _, predicted = torch.max(outputs, 1)
            correct = (predicted == labels).sum().item()
            accuracy = correct / len(labels)
            test_acc += accuracy

            writer_test.add_scalar("Test loss", loss, global_step=step_test)
            step_test += 1

    # compute the average loss and accuracy for each epoch
    train_loss /= len(train_loader)
    test_loss /= len(test_loader)
    train_acc /= len(train_loader)
    test_acc /= len(test_loader)

    tr_loss[epoch] = train_loss
    te_loss[epoch] = test_loss
    tr_acc[epoch] = train_acc
    te_acc[epoch] = test_acc

    print(f"Epoch {epoch+1}/{num_epochs_CNN}, Train Loss: {train_loss:.4f}, Test Loss: {test_loss:.4f}, Train Acc: {train_acc:.4f}, Test Acc: {test_acc:.4f}")

In [None]:
# plot the loss
plt.plot(tr_acc, label='Train accuracy')
plt.plot(te_acc, label='Test accuracy')
plt.legend()
plt.show()

In [27]:
step_test = 0

tr_loss = np.zeros((num_epochs_FCN))
te_loss = np.zeros((num_epochs_FCN))

tr_acc = np.zeros((num_epochs_FCN))
te_acc = np.zeros((num_epochs_FCN))

for epoch in range(num_epochs_FCN):

    train_loss = 0
    test_loss = 0
    train_acc = 0
    test_acc = 0

    net_FCN.train()
    for i, (data, labels) in enumerate(train_loader):
        data, labels = data.to(device), torch.tensor(labels).to(device)

        optimizer_FCN.zero_grad()
        outputs = net_FCN(data)
    
        labels = labels.unsqueeze(1)
        print(outputs, labels)

        loss = F.binary_cross_entropy(outputs,  labels.type(torch.float32))
        loss.backward()
        optimizer_FCN.step()

        train_loss += loss.item()

        # compute the accuracy for this batch
        _, predicted = torch.max(outputs, 1)
        correct = (predicted == labels).sum().item()
        accuracy = correct / len(labels)
        train_acc += accuracy

    net_FCN.eval()
    with torch.no_grad():
        for j, (data, labels) in enumerate(test_loader):
            data, labels = data.to(device), torch.tensor(labels).to(device)

            optimizer_FCN.zero_grad()
            outputs = net_FCN(data)
        
            labels = labels.unsqueeze(1)
            loss = F.binary_cross_entropy(outputs,  labels.type(torch.float32))

            test_loss += loss.item()

            # compute the accuracy for this batch
            _, predicted = torch.max(outputs, 1)
            correct = (predicted == labels).sum().item()
            accuracy = correct / len(labels)
            test_acc += accuracy

    # compute the average loss and accuracy for each epoch
    train_loss /= len(train_loader)
    test_loss /= len(test_loader)
    train_acc /= len(train_loader)
    test_acc /= len(test_loader)

    tr_loss[epoch] = train_loss
    te_loss[epoch] = test_loss
    tr_acc[epoch] = train_acc
    te_acc[epoch] = test_acc

    print(f"Epoch {epoch+1}/{num_epochs_FCN}, Train Loss: {train_loss:.4f}, Test Loss: {test_loss:.4f}, Train Acc: {train_acc:.4f}, Test Acc: {test_acc:.4f}")

  data, labels = data.to(device), torch.tensor(labels).to(device)


tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], device='cuda:0')
tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], device='cuda:0')
tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], device='cuda:0')
tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], device='cuda:0')
tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

  data, labels = data.to(device), torch.tensor(labels).to(device)


Epoch 1/50, Train Loss: 51.5625, Test Loss: 50.0000, Train Acc: 29.4118, Test Acc: 32.0000


KeyboardInterrupt: 

In [None]:
# plot the loss
plt.plot(tr_acc, label='Train accuracy')
plt.plot(te_acc, label='Test accuracy')
plt.legend()
plt.show()