<a href="https://colab.research.google.com/github/IrinaSirbu2002/Astrology.Web.App/blob/main/laborator/CV%203%20-%20Lab%20%234.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#Clasificarea imaginilor
Acest proiect implementeaza arhitectura AlexNet pentru clasificarea imaginilor reale. Baza de date aleasa pentru demonstratie este CIFAR10.

In [20]:
import numpy as np
import torch
import torch.nn as nn
from torchvision import datasets
from torchvision import transforms
from torch.utils.data.sampler import SubsetRandomSampler
from tqdm import tqdm


# Alegem configuratia sistemului (cpu/gpu)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Setam media si deviatia standard pentru normalizarea bazei de date - acestea sunt calculate la nivel de canal si doar pe baza de date de antrenare!
normalize = transforms.Normalize(mean=[0.4914, 0.4822, 0.4465], std=[0.2023, 0.1994, 0.2010])

##1. Pre-procesarea datelor

In [21]:
def get_train_valid_loader(data_dir,
                           batch_size,
                           augment,
                           random_seed,
                           normalize,
                           valid_size=0.1,
                           shuffle=True):

    # Definim setul de transformari necesare bazei de date
    valid_transform = transforms.Compose([
            transforms.Resize((227,227)), # baza de date CIFAR10 contine imagini de dimensiunea 32x32, iar AlexNet are intrari de dimensiune 227x227
            transforms.ToTensor(),        # transformarea intrarilor in tensori
            normalize,                    # aplicarea normalizarii
    ])
    if augment:
        train_transform = transforms.Compose([
            transforms.RandomCrop(32, padding=4), # decuparea unor regiuni aleatoare de dimensiune 32x32 din imaginea originala la care s-a adaugat padding=4
            transforms.RandomHorizontalFlip(0.4), # oglindirea imaginilor cu probabilitate de 40%
            transforms.Resize((227,227)),         # redimensionarea imaginilor augmentate la dimensiunea de 227x227 pixeli
            transforms.ToTensor(),                # transformarea intrarilor in tensori
            normalize,                            # aplicarea normalizarii
        ])
    else:
        train_transform = valid_transform

    # Fiind o baza de date foarte populara, CIFAR10 poate fi descarcata cu ajutorul modulului torchvision
    train_dataset = datasets.CIFAR10(root=data_dir,
                                     train=True,
                                     download=True,
                                     transform=train_transform,
                                     )

    valid_dataset = datasets.CIFAR10(root=data_dir,
                                     train=True,
                                     download=True,
                                     transform=valid_transform,
                                     )

    # Alegem numarul de esantioane pentru train/val
    num_train = len(train_dataset)
    indices = list(range(num_train))
    split = int(np.floor(valid_size * num_train))

    # Amestecam indecsii
    if shuffle:
        np.random.seed(random_seed)
        np.random.shuffle(indices)

    # Separam indecsii de train in train+val
    train_idx, valid_idx = indices[split:], indices[:split]
    train_sampler = SubsetRandomSampler(train_idx)
    valid_sampler = SubsetRandomSampler(valid_idx)

    # Cream dataloaders pentru train si val
    train_loader = torch.utils.data.DataLoader(
        train_dataset, batch_size=batch_size, sampler=train_sampler)

    valid_loader = torch.utils.data.DataLoader(
        valid_dataset, batch_size=batch_size, sampler=valid_sampler)

    return (train_loader, valid_loader)


def get_test_loader(data_dir,
                    batch_size,
                    normalize,
                    shuffle=True):

    # Transformari asemanatoare cu cele pentru train/val. Normalizarea se face cu aceleasi valori ca in cazul train!
    transform = transforms.Compose([
        transforms.Resize((227,227)),
        transforms.ToTensor(),
        normalize,
    ])

    # Descarcarea bazei de test
    dataset = datasets.CIFAR10(
        root=data_dir, train=False,
        download=True, transform=transform,
    )

    # Crearea dataloader pentru test
    data_loader = torch.utils.data.DataLoader(
        dataset, batch_size=batch_size, shuffle=shuffle
    )

    return data_loader


# Crearea efectiva a dataloaders
train_loader, valid_loader = get_train_valid_loader(
    data_dir = './data',
    batch_size = 64,
    augment = True,
    random_seed = 1,
    normalize = normalize
)

test_loader = get_test_loader(
    data_dir = './data',
    batch_size = 64,
    normalize = normalize
)

##2. Definirea modelului

Resurse utile (documentatii):
- strat convolutional: https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html
- strat complet conectat: https://pytorch.org/docs/stable/generated/torch.nn.Linear.html?highlight=linear#torch.nn.Linear
- strat max pooling: https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html?highlight=maxpool#torch.nn.MaxPool2d
- activare ReLU: https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html?highlight=relu#torch.nn.ReLU
- regularizare dropout: https://pytorch.org/docs/stable/generated/torch.nn.Dropout.html?highlight=dropout#torch.nn.Dropout
- mod secvential de compunere a straturilor: https://pytorch.org/docs/stable/generated/torch.nn.Sequential.html?highlight=sequential#torch.nn.Sequential

In [22]:
class AlexNet(nn.Module):
    def __init__(self, num_classes=10):
        super(AlexNet, self).__init__()
        # TODO: definiti componentele de baza
        self.features = nn.Sequential(
            nn.Conv2d(3, 96, kernel_size=11, stride=4, padding=2), # padding added to adjust output size
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(96, 256, kernel_size=5, padding=2), # padding added to adjust output size
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(256, 384, kernel_size=3, padding=1), # padding added to adjust output size
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 384, kernel_size=3, padding=1), # padding added to adjust output size
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1), # padding added to adjust output size
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(9216, 4096), # adjusted input size to match the actual output size after flattening
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Linear(4096, num_classes)
        )

    def forward(self, x):
        x = self.features(x)
        x = self.classifier(x)
        return x # removed softmax as CrossEntropyLoss already applies it

##3. Antrenarea retelei

In [23]:
# Alegerea hiperparametrilor
num_classes = 10
num_epochs = 20
batch_size = 64
learning_rate = 0.005

# Trecerea modelului pe gpu
model = AlexNet(num_classes).to(device)

# Alegerea functiei de pierdere. Clasificare de imagini => cross-entropy
criterion = nn.CrossEntropyLoss()
# Alegerea optimizatorului
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate, weight_decay = 0.005, momentum = 0.9)

In [24]:
# Antrenarea modelului
total_step = len(train_loader)

for epoch in tqdm(range(num_epochs)):
    for i, (images, labels) in enumerate(train_loader):
        # Incarcam tensorii pe gpu/cpu
        images = images.to(device)
        labels = labels.to(device)

        # Forward propagation
        outputs = model(images)
        loss = criterion(outputs, labels)

        # Backprop si rularea unui pas de optimizare a ponderilor
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    print ('Epoch [{}/{}], Step [{}/{}], Loss: {:.4f}'
                   .format(epoch+1, num_epochs, i+1, total_step, loss.item()))

    # Rularea algoritmului pe baza de validare
    with torch.no_grad():
        correct = 0
        total = 0
        for images, labels in valid_loader:
            images = images.to(device)
            labels = labels.to(device)
            outputs = model(images)
            _, predicted = torch.max(outputs.data, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
            del images, labels, outputs

        print('Accuracy of the network on the {} validation images: {} %'.format(5000, 100 * correct / total))


  0%|          | 0/20 [00:00<?, ?it/s]

Epoch [1/20], Step [704/704], Loss: 1.7385


  5%|▌         | 1/20 [01:41<32:02, 101.18s/it]

Accuracy of the network on the 5000 validation images: 28.2 %
Epoch [2/20], Step [704/704], Loss: 1.6713


 10%|█         | 2/20 [03:22<30:24, 101.36s/it]

Accuracy of the network on the 5000 validation images: 45.5 %
Epoch [3/20], Step [704/704], Loss: 1.2199


 15%|█▌        | 3/20 [05:03<28:42, 101.31s/it]

Accuracy of the network on the 5000 validation images: 47.82 %
Epoch [4/20], Step [704/704], Loss: 1.0892


 20%|██        | 4/20 [06:45<27:03, 101.46s/it]

Accuracy of the network on the 5000 validation images: 60.98 %
Epoch [5/20], Step [704/704], Loss: 0.7142


 25%|██▌       | 5/20 [08:27<25:22, 101.48s/it]

Accuracy of the network on the 5000 validation images: 65.74 %
Epoch [6/20], Step [704/704], Loss: 0.7961


 30%|███       | 6/20 [10:08<23:40, 101.48s/it]

Accuracy of the network on the 5000 validation images: 63.98 %
Epoch [7/20], Step [704/704], Loss: 0.9501


 35%|███▌      | 7/20 [11:50<21:59, 101.47s/it]

Accuracy of the network on the 5000 validation images: 70.42 %
Epoch [8/20], Step [704/704], Loss: 0.3632


 40%|████      | 8/20 [13:31<20:15, 101.30s/it]

Accuracy of the network on the 5000 validation images: 73.12 %
Epoch [9/20], Step [704/704], Loss: 0.8312


 45%|████▌     | 9/20 [15:12<18:33, 101.21s/it]

Accuracy of the network on the 5000 validation images: 68.36 %
Epoch [10/20], Step [704/704], Loss: 2.1797


 50%|█████     | 10/20 [16:53<16:54, 101.40s/it]

Accuracy of the network on the 5000 validation images: 72.94 %
Epoch [11/20], Step [704/704], Loss: 0.9893


 55%|█████▌    | 11/20 [18:34<15:11, 101.30s/it]

Accuracy of the network on the 5000 validation images: 75.8 %
Epoch [12/20], Step [704/704], Loss: 1.4245


 60%|██████    | 12/20 [20:16<13:31, 101.41s/it]

Accuracy of the network on the 5000 validation images: 74.18 %
Epoch [13/20], Step [704/704], Loss: 0.3878


 65%|██████▌   | 13/20 [21:57<11:48, 101.17s/it]

Accuracy of the network on the 5000 validation images: 76.48 %
Epoch [14/20], Step [704/704], Loss: 0.6329


 70%|███████   | 14/20 [23:37<10:06, 101.01s/it]

Accuracy of the network on the 5000 validation images: 76.92 %
Epoch [15/20], Step [704/704], Loss: 0.9037


 75%|███████▌  | 15/20 [25:19<08:25, 101.08s/it]

Accuracy of the network on the 5000 validation images: 76.66 %
Epoch [16/20], Step [704/704], Loss: 0.7478


 80%|████████  | 16/20 [27:00<06:44, 101.10s/it]

Accuracy of the network on the 5000 validation images: 77.54 %
Epoch [17/20], Step [704/704], Loss: 0.9269


 85%|████████▌ | 17/20 [28:41<05:03, 101.22s/it]

Accuracy of the network on the 5000 validation images: 79.3 %
Epoch [18/20], Step [704/704], Loss: 1.3858


 90%|█████████ | 18/20 [30:23<03:22, 101.49s/it]

Accuracy of the network on the 5000 validation images: 79.2 %
Epoch [19/20], Step [704/704], Loss: 1.4099


 95%|█████████▌| 19/20 [32:05<01:41, 101.65s/it]

Accuracy of the network on the 5000 validation images: 79.02 %
Epoch [20/20], Step [704/704], Loss: 0.2953


100%|██████████| 20/20 [33:48<00:00, 101.43s/it]

Accuracy of the network on the 5000 validation images: 79.8 %





##4. Testarea retelei

In [25]:
# Rularea algoritmului pe baza de test
with torch.no_grad():
    correct = 0
    total = 0
    for images, labels in test_loader:
        images = images.to(device)
        labels = labels.to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
        del images, labels, outputs

    print('Accuracy of the network on the {} test images: {} %'.format(10000, 100 * correct / total))

Accuracy of the network on the 10000 test images: 80.1 %


#\#TODO:
1. Implementati structura retelei AlexNet. Clasa trebuie sa defineasca straturile in functia \_\_init\_\_() si sa le lege in functia foward().
2. Comparati rezultatele antrenarii cu, respectiv fara, augmentari.
3. Reduceti setul de date de antrenare la 10% din valoarea sa si analizati impactul asupra acuratetii finale.
4. Incercati diferiti optimizatori si hiperparametri.
5. Rulati inferenta (testarea) pe o imagine reala si afisati rezultatul.
6. Modificati modelul retelei prin adaugarea unor straturi intermediare.