# Machine Learning Assignment (First Part)
##### Name: Matvey Makhnov<br> 
Task 1: Detecting inconsistencies in flower descriptions on online floristry and delivery platforms is essential for customer retention and satisfaction. Many companies that provide online floristry services increasingly use deep learning to ensure that a flower image displayed on their platform matches its description or category. <br> <br>The goal of this part is to implement a flower classification convolutional neural network (CNN) trained on the Flowers102 dataset.

In [10]:
import os 
import torch 
import torch.nn as nn 
import torch.optim as optim 
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms


device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# set a size for our batches 
train_batch = 32
val_batch = 32 
test_batch = 32

train_transform = transforms.Compose([
    transforms.Resize((224,224)),
    transforms.RandomHorizontalFlip(), 
    transforms.RandomRotation(15),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                          std=[0.229, 0.224, 0.225])
])

validation_transform = transforms.Compose([
    transforms.Resize((224,224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                          std=[0.229, 0.224, 0.225])
]) 


data_path = os.path.join('.', 'Flowers_102_Dataset')

#----------------------------------------------------------------------------------------------------------

#dataset = datasets.Flowers102(root = data_path, transform=None, download=True)
# set a size for our sets (training, validation, test) 
#train_size = int(0.8 * len(dataset))
#val_size = int(0.1 * len(dataset))
#test_size = len(dataset) - (train_size + val_size)
#train_dataset, val_dataset, test_dataset =  random_split(dataset, [train_size,val_size,test_size])

#----------------------------------------------------------------------------------------------------------

# The training split needs augmentation while the validation and test splits must not be augmented,
# so each predefined split is loaded with its own transform composition
train_dataset = datasets.Flowers102(root = data_path, split="train", transform=train_transform, download=True)
val_dataset = datasets.Flowers102(root = data_path, split="val", transform=validation_transform, download=True) 
test_dataset = datasets.Flowers102(root = data_path, split="test", transform=validation_transform, download=True)



train_loader = DataLoader(dataset=train_dataset, batch_size=train_batch,shuffle=True)
val_loader = DataLoader(dataset=val_dataset, batch_size=val_batch,shuffle=False)
test_loader = DataLoader(dataset=test_dataset, batch_size=test_batch, shuffle=False)

print(f"Using device: {device}")

Using device: cpu


The next step is to build the CNN architecture described in Table 1 of the `F24.ML.Assignment.2.pdf` file. <br> In this architecture I use the ReLU activation function for all hidden layers and apply log-softmax to the output of the last layer to get the final class log-probabilities. In total there are 102 output classes because the dataset contains 102 types of flowers.
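As a quick sanity check of the layer sizes, the sketch below traces the spatial size of a 224x224 input through three conv + max-pool stages using the standard output-size formula. The helper names `conv_out` and `pool_out` are illustrative and not part of the assignment code; the hyperparameters (3x3 kernels, stride 1, padding 1, 2x2 pooling) are assumed to match the model defined in the next cell.

```python
def conv_out(size, kernel=3, stride=1, padding=1):
    # standard convolution output-size formula
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, kernel=2, stride=2):
    # max pooling without padding
    return (size - kernel) // stride + 1

size = 224
sizes = []
for channels in (32, 64, 128):
    # each stage is conv -> ReLU -> max-pool; ReLU does not change the shape
    size = pool_out(conv_out(size))
    sizes.append((size, channels))

flat_features = sizes[-1][0] ** 2 * sizes[-1][1]
print(sizes)          # spatial size and channel count after each stage
print(flat_features)  # input size of the first fully connected layer
```

The convolutions keep the spatial size because of `padding=1`, so only the pooling layers halve it.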

In [None]:
import torch.nn.functional as F 

class CNN_1(nn.Module):
    def __init__(self):
        super(CNN_1,self).__init__()

        self.conv1 = nn.Conv2d(in_channels=3, out_channels=32, kernel_size=3, stride=1, padding=1)
        self.conv2 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, stride=1, padding=1)
        self.conv3 = nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3, stride=1,padding=1)

        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)

        self.fc1 = nn.Linear(in_features=128 * 28 * 28, out_features=512)
        self.fc2 = nn.Linear(in_features=512,out_features=102)



    def forward(self,x):

        # 1st conv block: input 224x224x3, output 112x112x32 after pooling
        x = self.pool(F.relu(self.conv1(x)))
        # 2nd conv block: input 112x112x32, output 56x56x64 after pooling
        x = self.pool(F.relu(self.conv2(x)))
        # 3rd conv block: input 56x56x64, output 28x28x128 after pooling
        x = self.pool(F.relu(self.conv3(x)))

        # here we have 28*28*128 values of feature map 

        # flattening 
        x = x.view(-1, 128 * 28 * 28)   


        # fully connected layers map the flattened features to 102 class scores
        x = F.relu(self.fc1(x))
        x=self.fc2(x)

        return F.log_softmax(x, dim = 1)
    
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model_1 = CNN_1().to(device)

In [3]:
def counter_params(model):
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

In [16]:
print(f"Number of parameters in our 1st CNN model: {counter_params(model_1)}")

Number of parameters in our 1st CNN model: 51526310
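To see where these 51,526,310 parameters come from, here is a small arithmetic sketch that recomputes the count layer by layer. The helpers `conv_params` and `linear_params` are illustrative names; they just apply the usual "weights plus biases" formulas to the layer sizes of `CNN_1`.

```python
def conv_params(c_in, c_out, k=3):
    # k x k kernel per (input channel, output channel) pair, plus one bias per output channel
    return c_in * c_out * k * k + c_out

def linear_params(n_in, n_out):
    # weight matrix plus one bias per output unit
    return n_in * n_out + n_out

layers = {
    "conv1": conv_params(3, 32),
    "conv2": conv_params(32, 64),
    "conv3": conv_params(64, 128),
    "fc1": linear_params(128 * 28 * 28, 512),
    "fc2": linear_params(512, 102),
}
total = sum(layers.values())
for name, count in layers.items():
    print(f"{name}: {count:,} ({count / total:.2%})")
print(f"total: {total:,}")
```

The first fully connected layer holds more than 99% of the parameters, which is worth keeping in mind given how small the training set is.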


In [17]:
print(len(train_dataset))
print(len(val_dataset))
print(len(test_dataset))

1020
1020
6149
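These split sizes determine how many steps each loader performs per pass. The short sketch below derives the number of batches from the sizes printed above and the batch size of 32; the dictionary names are only for illustration.

```python
import math

sizes = {"train": 1020, "val": 1020, "test": 6149}
batch_size = 32
num_classes = 102

# DataLoader keeps the last, smaller batch by default, so the count is rounded up
batches = {split: math.ceil(n / batch_size) for split, n in sizes.items()}
print(batches)
print(sizes["train"] / num_classes)  # average number of training images per class
```

With only about ten training images per class on average, the test split is roughly six times larger than the training split.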


Now I'll build the training, validation and test functions. <br> The model has to be evaluated on the training, validation and test sets using accuracy, loss and F1-score. For the training set I calculate the average loss and accuracy. For the validation and test sets I report all three (accuracy, average loss and the weighted F1-score, which reflects per-class precision and recall and so helps show where the model can be improved). <br> As the loss function I use NLL loss, because the model already outputs log-probabilities through `log_softmax`; the two together are equivalent to cross-entropy loss.
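The pairing of log-softmax with NLL loss can be checked by hand. The following is a minimal pure-Python sketch (not the PyTorch implementation) with made-up logits; `log_softmax`, `nll` and `cross_entropy` are illustrative helpers written only for this check.

```python
import math

def log_softmax(logits):
    # subtract the max for numerical stability
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum for z in logits]

def nll(log_probs, target):
    # negative log-likelihood of the target class
    return -log_probs[target]

def cross_entropy(logits, target):
    # -log(softmax(logits)[target]) computed directly
    exps = [math.exp(z) for z in logits]
    return -math.log(exps[target] / sum(exps))

logits = [2.0, -1.0, 0.5]
loss_a = nll(log_softmax(logits), target=0)
loss_b = cross_entropy(logits, target=0)
print(loss_a, loss_b)

# loss of a model that assigns equal probability to all 102 classes
uniform_loss = nll(log_softmax([0.0] * 102), target=0)
print(uniform_loss, math.log(102))
```

The uniform case gives ln(102), about 4.62, which is a useful reference value for the loss of an untrained 102-class model.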

In [6]:
from tqdm import tqdm
from sklearn.metrics import f1_score
from torch.utils.tensorboard import SummaryWriter

def train_1(model, device, loader, dataset, optimizer, epoch, writer):
    model.train()
    train_loss = 0 
    train_correct = 0 
    total = 0 
    
    for batch_idx, (images, labels) in enumerate(tqdm(loader, desc=f"Training epoch: {epoch}")):
        images, labels = images.to(device), labels.to(device)

        optimizer.zero_grad()
        output = model(images)
        loss = F.nll_loss(output,labels)
        loss.backward()
        optimizer.step()

        train_loss +=loss.item()
        prediction = output.argmax(dim = 1, keepdim = True)
        #train_correct += (prediction == labels).sum().item()
        train_correct += prediction.eq(labels.view_as(prediction)).sum().item()
        total += labels.size(0)

    average_loss = train_loss/len(loader)
    accuracy = train_correct/len(loader.dataset) * 100.0 

    writer.add_scalar('Loss/Train', average_loss, epoch)
    writer.add_scalar('Accuracy/Train', accuracy, epoch)
    writer.flush()
    

    print(f"==> Epoch {epoch} Completed: Average loss: {average_loss:.6f}\tAccuracy: {accuracy:.3f}% ")


def validation_1(model, device, loader, dataset, epoch, writer):
    model.eval()
    validation_loss = 0
    val_correct = 0 
    v_labels_list = []
    v_prediction_list = []
    total = 0 

    with torch.no_grad():
        for images, labels in tqdm(loader, desc="Valodation"):
            images, labels = images.to(device), labels.to(device)

            output = model(images)
            loss = F.nll_loss(output, labels)
            validation_loss +=loss.item()

            prediction = output.argmax(dim = 1, keepdim = True)
            #val_correct += (prediction == labels).sum().item()
            val_correct += prediction.eq(labels.view_as(prediction)).sum().item()

            v_labels_list.extend(labels.cpu().numpy())
            v_prediction_list.extend(prediction.cpu().numpy())
            total += labels.size(0)
            

    average_loss = validation_loss/len(loader)
    accuracy = val_correct/len(loader.dataset) * 100.0
    f1 = f1_score(v_labels_list, v_prediction_list, average="weighted")

    writer.add_scalar('Loss/Validation', average_loss, epoch)
    writer.add_scalar('Accuracy/Validation', accuracy, epoch)
    writer.add_scalar('F1-score/Validation', f1, epoch)
    writer.flush()

    print(f"==> Validation Completed: Average Loss: {average_loss:.6f}\tAccuracy: {accuracy:.2f}%\tF-1 Score: {f1:.4f}")

    # return accuracy and loss for tracking 
    return average_loss, accuracy



def test_1(model, device, loader,dataset):
    model.eval()
    test_loss = 0 
    test_correct = 0
    t_label_list = []
    t_prediction_list = []
    total = 0

    with torch.no_grad():
        for images, labels in tqdm(loader, desc="Test"):
            images, labels = images.to(device), labels.to(device)

            output = model(images)
            loss = F.nll_loss(output, labels)
            test_loss +=loss.item()

            prediction = output.argmax(dim = 1, keepdim = True)
            #test_correct += (prediction == labels).sum().item()
            test_correct += prediction.eq(labels.view_as(prediction)).sum().item()

            t_label_list.extend(labels.cpu().numpy())
            t_prediction_list.extend(prediction.cpu().numpy())
            total += labels.size(0)
            
    average_loss = test_loss/len(loader)
    accuracy = test_correct/len(loader.dataset) * 100
    f1 = f1_score(t_label_list,t_prediction_list,average="weighted")

    print(f"==>Test Completed: Avverage loss: {average_loss:.6f}\tAccuracy: {accuracy:.2f}%\tF-1 Score: {f1:.4f}")

    # return for tracking 
    return average_loss, accuracy
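The functions above report `f1_score(..., average="weighted")`. To make it concrete what that number means, here is a minimal pure-Python sketch of the weighted F1 computation. `weighted_f1` is an illustrative helper, not scikit-learn's implementation, and the toy labels are made up.

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    # per-class F1, averaged with weights equal to each class's support in y_true
    support = Counter(y_true)
    total = 0.0
    for cls, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += n * f1
    return total / len(y_true)

y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1]
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy, weighted_f1(y_true, y_pred))
```

In this toy case class 2 is never predicted, so its F1 is zero and the weighted F1 (0.4) falls below the accuracy (0.5). The same effect can make the F1-score lower than the accuracy when a model ignores many classes.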

Now I'll train the model using SGD with learning rate = 0.001 and momentum = 0.5. I'll also use TensorBoard. <br> TensorBoard helps visualize the average loss, accuracy and F1-score during the whole training process.<br> [Link to the TensorBoard documentation](https://pytorch.org/tutorials/recipes/recipes/tensorboard_with_pytorch.html)
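For intuition about what the optimizer does on each step, here is a minimal sketch of the SGD-with-momentum update on a single scalar parameter. It assumes the PyTorch-style formulation with zero dampening and no weight decay; `sgd_momentum_step` is an illustrative helper, and the quadratic objective is made up for the demonstration.

```python
def sgd_momentum_step(param, grad, velocity, lr=0.001, momentum=0.5):
    # PyTorch-style update: v <- momentum * v + grad ; p <- p - lr * v
    velocity = momentum * velocity + grad
    param = param - lr * velocity
    return param, velocity

# minimize f(x) = x^2, whose gradient is 2x
x, v = 1.0, 0.0
for step in range(1000):
    x, v = sgd_momentum_step(x, 2 * x, v)
print(x)
```

With a learning rate this small, each step moves the parameter only slightly, which fits the slow loss decrease one can expect in the first epochs of training.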

In [None]:
from torch.utils.tensorboard import SummaryWriter
 
# before starting our training we have to 
# Initialize tensorboard writer
writer = SummaryWriter(log_dir="First_CNN_Model")

# our hyperparameters 
epochs = 10 
learning_rate = 0.001
momentum = 0.5

# Model and optimizer 
model = model_1.to(device)
optimizer = optim.SGD(model.parameters(), lr=learning_rate, momentum=momentum)

best_accuracy = 0 

for epoch in range(1, epochs+1):
    print(f"\nEpoch {epoch}/{epochs}")
    
    # start our training
    train_1(model, device, train_loader,train_dataset, optimizer, epoch, writer)

    val_loss, val_accuracy = validation_1(model, device, val_loader, val_dataset, epoch, writer)

    if val_accuracy > best_accuracy:
        best_accuracy = val_accuracy
        torch.save(model.state_dict(), "Best_in_the_1st_CNN.pt")
        print(f"The best model was saved with accuracy: {best_accuracy:.2f}%")

# evaluate on the test set (uses the final-epoch weights currently in memory)
test_loss, test_accuracy = test_1(model, device, test_loader, test_dataset)
torch.save(model.state_dict(), "Test_result_for_1st_CNN.pt")
print(f"The best result was saved with Accuracy: {test_accuracy:.2f}% and Average Loss: {test_loss:.6f}")

# close the TensorBoard writer once training is finished
writer.close()



Epoch 1/10



Training epoch: 1: 100%|██████████| 32/32 [01:33<00:00,  2.92s/it]


==> Epoch 1 Completed: Average loss: 4.616038	Accuracy: 1.569% 


Valodation: 100%|██████████| 32/32 [00:35<00:00,  1.12s/it]


==> Validation Completed: Average Loss: 4.582602	Accuracy: 4.12%	F-1 Score: 0.0156
The best model was saved with accuracy: 4.12%

Epoch 2/10


Training epoch: 2: 100%|██████████| 32/32 [01:13<00:00,  2.29s/it]


==> Epoch 2 Completed: Average loss: 4.554125	Accuracy: 5.196% 


Valodation: 100%|██████████| 32/32 [00:31<00:00,  1.03it/s]


==> Validation Completed: Average Loss: 4.530618	Accuracy: 5.78%	F-1 Score: 0.0257
The best model was saved with accuracy: 5.78%

Epoch 3/10


Training epoch: 3: 100%|██████████| 32/32 [01:09<00:00,  2.17s/it]


==> Epoch 3 Completed: Average loss: 4.481274	Accuracy: 6.765% 


Valodation: 100%|██████████| 32/32 [00:30<00:00,  1.06it/s]


==> Validation Completed: Average Loss: 4.462608	Accuracy: 5.98%	F-1 Score: 0.0267
The best model was saved with accuracy: 5.98%

Epoch 4/10


Training epoch: 4: 100%|██████████| 32/32 [01:09<00:00,  2.16s/it]


==> Epoch 4 Completed: Average loss: 4.383176	Accuracy: 7.157% 


Valodation: 100%|██████████| 32/32 [00:27<00:00,  1.15it/s]


==> Validation Completed: Average Loss: 4.372005	Accuracy: 6.37%	F-1 Score: 0.0315
The best model was saved with accuracy: 6.37%

Epoch 5/10


Training epoch: 5: 100%|██████████| 32/32 [01:08<00:00,  2.14s/it]


==> Epoch 5 Completed: Average loss: 4.250586	Accuracy: 8.725% 


Valodation: 100%|██████████| 32/32 [00:27<00:00,  1.14it/s]


==> Validation Completed: Average Loss: 4.259804	Accuracy: 7.35%	F-1 Score: 0.0403
The best model was saved with accuracy: 7.35%

Epoch 6/10


Training epoch: 6: 100%|██████████| 32/32 [01:11<00:00,  2.24s/it]


==> Epoch 6 Completed: Average loss: 4.090029	Accuracy: 11.373% 


Valodation: 100%|██████████| 32/32 [00:31<00:00,  1.02it/s]


==> Validation Completed: Average Loss: 4.127021	Accuracy: 9.31%	F-1 Score: 0.0532
The best model was saved with accuracy: 9.31%

Epoch 7/10


Training epoch: 7: 100%|██████████| 32/32 [01:10<00:00,  2.19s/it]


==> Epoch 7 Completed: Average loss: 3.885299	Accuracy: 13.627% 


Valodation: 100%|██████████| 32/32 [00:31<00:00,  1.01it/s]


==> Validation Completed: Average Loss: 3.997244	Accuracy: 10.98%	F-1 Score: 0.0706
The best model was saved with accuracy: 10.98%

Epoch 8/10


Training epoch: 8: 100%|██████████| 32/32 [01:11<00:00,  2.24s/it]


==> Epoch 8 Completed: Average loss: 3.668203	Accuracy: 18.627% 


Valodation: 100%|██████████| 32/32 [00:32<00:00,  1.00s/it]


==> Validation Completed: Average Loss: 3.883033	Accuracy: 10.20%	F-1 Score: 0.0682

Epoch 9/10


Training epoch: 9: 100%|██████████| 32/32 [01:12<00:00,  2.26s/it]


==> Epoch 9 Completed: Average loss: 3.435644	Accuracy: 20.098% 


Valodation: 100%|██████████| 32/32 [00:34<00:00,  1.08s/it]


==> Validation Completed: Average Loss: 3.767782	Accuracy: 13.63%	F-1 Score: 0.0919
The best model was saved with accuracy: 13.63%

Epoch 10/10


Training epoch: 10: 100%|██████████| 32/32 [01:10<00:00,  2.19s/it]


==> Epoch 10 Completed: Average loss: 3.189563	Accuracy: 25.588% 


Valodation: 100%|██████████| 32/32 [00:25<00:00,  1.24it/s]


==> Validation Completed: Average Loss: 3.690553	Accuracy: 14.51%	F-1 Score: 0.1132
The best model was saved with accuracy: 14.51%


Test: 100%|██████████| 193/193 [03:01<00:00,  1.06it/s]


==>Test Completed: Avverage loss: 3.807991	Accuracy: 13.79%	F-1 Score: 0.1087
The best result was saved with Accuracy: 13.79% and Average Loss: 3.807991


**Conclusion:**
We can see that the final accuracy on the test set is too low: ` 13.79% `. This tells us that the model has to be improved.

**Now I will create the 2nd model (improving my 1st model with the following techniques):**<br> 1. Batch normalization (nn.BatchNorm2d for convolutional layers and nn.BatchNorm1d for FC layers) <br> 2. Early stopping <br> 3. Dropout (p = 0.5 for the FC layer) <br> 4. Learning rate scheduler (implemented in the training loop at the end)
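As a rough illustration of the first technique, the sketch below applies the batch normalization formula to one feature across a small batch, using training-mode statistics. It is a simplified pure-Python version, not `nn.BatchNorm2d` itself: `batch_norm` is an illustrative helper, the activations are made up, and running statistics are left out.

```python
import math

def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    # normalize one feature over the batch, then apply the learnable scale and shift
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)  # biased variance
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]

activations = [10.0, 12.0, 20.0, 30.0]
normalized = batch_norm(activations)
out_mean = sum(normalized) / len(normalized)
out_var = sum((v - out_mean) ** 2 for v in normalized) / len(normalized)
print(normalized, out_mean, out_var)
```

After normalization the feature has approximately zero mean and unit variance regardless of its original scale, which is what usually allows a larger learning rate to be used.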

In [2]:
import torch.nn.functional as F 

class CNN_2(nn.Module):
    def __init__(self):
        super(CNN_2, self).__init__()

        self.conv1 = nn.Conv2d(in_channels=3, out_channels=32,kernel_size=3, stride=1, padding=1)
        self.batch1 = nn.BatchNorm2d(32)
        self.conv2 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, stride=1, padding=1)
        self.batch2 = nn.BatchNorm2d(64)
        self.conv3 = nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3, stride=1,padding=1)
        self.batch3 = nn.BatchNorm2d(128)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)

        #self.conv_dropout = nn.Dropout2d()

        self.fc1 = nn.Linear(in_features=128 * 28 * 28, out_features=512)
        self.fc_batch1 = nn.BatchNorm1d(512)
        self.fc2 = nn.Linear(in_features=512,out_features=102)

        self.fc_dropout = nn.Dropout(p=0.5)



    def forward(self,x):

        x = self.pool(F.relu(self.batch1(self.conv1(x))))
        x = self.pool(F.relu(self.batch2(self.conv2(x))))
        x = self.pool(F.relu(self.batch3(self.conv3(x))))

        x = x.view(-1,128*28*28)

        x = self.fc_dropout(F.relu(self.fc_batch1(self.fc1(x))))
        x = self.fc2(x)

        return F.log_softmax(x, dim=1)
    
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model_2 = CNN_2().to(device)
        

In [4]:
print(f"Number of parameters in our 2nd CNN model: {counter_params(model_2)}")

Number of parameters in our 2nd CNN model: 51527782
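The count printed above for `model_2` is slightly larger than the 51,526,310 parameters of the first model. The short check below attributes the difference to the batch normalization layers, each of which learns one scale and one shift per feature. Dropout adds no parameters, and the running mean and variance are buffers rather than trainable parameters.

```python
bn_features = [32, 64, 128, 512]             # BatchNorm2d x3 and BatchNorm1d x1
bn_params = sum(2 * n for n in bn_features)  # one weight and one bias per feature

params_model_1 = 51526310
params_model_2 = 51527782
print(bn_params, params_model_2 - params_model_1)
```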


In [None]:
# just to see our architecture
print(model_2)

CNN_2(
  (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (batch1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (conv2): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (batch2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (conv3): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (batch3): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (fc1): Linear(in_features=100352, out_features=512, bias=True)
  (fc_batch1): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (fc2): Linear(in_features=512, out_features=102, bias=True)
  (fc_dropout): Dropout(p=0.5, inplace=False)
)
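The `Dropout(p=0.5)` layer listed above behaves differently in training and evaluation mode, which is why the training and validation functions call `model.train()` and `model.eval()`. Below is a minimal sketch of inverted dropout in pure Python. `dropout` is an illustrative helper and the fixed seed is only there to make the run repeatable.

```python
import random

def dropout(values, p=0.5, training=True, rng=None):
    # inverted dropout: zero each value with probability p, scale the rest by 1/(1-p)
    if not training:
        return list(values)
    rng = rng or random.Random()
    scale = 1.0 / (1.0 - p)
    return [0.0 if rng.random() < p else v * scale for v in values]

x = [1.0] * 8
train_out = dropout(x, rng=random.Random(0))
eval_out = dropout(x, training=False)
print(train_out)  # a mix of 0.0 and 2.0
print(eval_out)   # unchanged input
```

Scaling the kept activations during training keeps their expected value the same, so nothing has to be rescaled at evaluation time.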


I will reuse the previous functions: ` train_1 `, ` validation_1 `, ` test_1 `
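The training cell that follows uses a step learning rate schedule with `step_size=5` and `gamma=0.1`, starting from 0.01. The sketch below reproduces the resulting learning rate per epoch with the closed-form rule. `step_lr` is an illustrative helper, and it assumes the scheduler is stepped once at the end of every epoch.

```python
def step_lr(initial_lr, completed_steps, step_size=5, gamma=0.1):
    # completed_steps is the number of scheduler.step() calls made so far
    return initial_lr * gamma ** (completed_steps // step_size)

# epoch e is trained after e - 1 scheduler steps
schedule = {epoch: step_lr(0.01, epoch - 1) for epoch in range(1, 11)}
for epoch, lr in schedule.items():
    print(f"epoch {epoch}: lr = {lr:.4f}")
```

Under these assumptions epochs 1 to 5 train with 0.01 and epochs 6 to 10 with 0.001.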

In [None]:
from torch.utils.tensorboard import SummaryWriter
from torch.optim.lr_scheduler import StepLR

writer_2 = SummaryWriter(log_dir="Second_CNN_Model")

epochs = 10
learning_rate = 0.01
momentum = 0.5
# For early stopping
patience = 4 

model = model_2.to(device)
optimizer_2 = optim.SGD(model.parameters(), lr=learning_rate, momentum=momentum)
scheduler = StepLR(optimizer_2, step_size=5, gamma=0.1)

best_accuracy_2 = 0 
early_stop_counter = 0

for epoch in range(1, epochs+1):
    print(f"Epoch {epoch}/{epochs}")
    train_1(model,device, train_loader,train_dataset,optimizer_2, epoch, writer_2)

    val_loss, val_acc = validation_1(model, device, val_loader, val_dataset, epoch, writer_2)

    # control condition for early stopping 
    if val_acc>best_accuracy_2:
        best_accuracy_2 = val_acc
        torch.save(model.state_dict(), "Best_in_the_2nd_CNN.pt")
        print(f"The best model was saved with accuracy: {best_accuracy_2:.2f}%")
        early_stop_counter = 0
    else: 
        early_stop_counter +=1 

    # early stopping condition
    if early_stop_counter>=patience:
        print(f"Early stopping at epoch {epoch} due to no improvement in validation accuracy.")
        break
    
    # step the scheduler
    scheduler.step()

test_loss, test_acc = test_1(model, device, test_loader, test_dataset)

torch.save(model.state_dict(), "Test_result_for_2nd_CNN.pt")
print(f'Test model was saved with Accuracy: {test_acc:.2f}% and Avverage Loss: {test_loss:.6f}')

Epoch 1/10


Training epoch: 1: 100%|██████████| 32/32 [01:23<00:00,  2.60s/it]


==> Epoch 1 Completed: Average loss: 4.369323	Accuracy: 5.588% 


Valodation: 100%|██████████| 32/32 [00:29<00:00,  1.08it/s]


==> Validation Completed: Average Loss: 4.037020	Accuracy: 13.14%	F-1 Score: 0.0860
The best model was saved with accuracy: 13.14%
Epoch 2/10


Training epoch: 2: 100%|██████████| 32/32 [01:19<00:00,  2.48s/it]


==> Epoch 2 Completed: Average loss: 3.605604	Accuracy: 21.471% 


Valodation: 100%|██████████| 32/32 [00:30<00:00,  1.06it/s]


==> Validation Completed: Average Loss: 3.607573	Accuracy: 20.98%	F-1 Score: 0.1576
The best model was saved with accuracy: 20.98%
Epoch 3/10


Training epoch: 3: 100%|██████████| 32/32 [01:19<00:00,  2.48s/it]


==> Epoch 3 Completed: Average loss: 3.152220	Accuracy: 34.118% 


Valodation: 100%|██████████| 32/32 [00:28<00:00,  1.11it/s]


==> Validation Completed: Average Loss: 3.344013	Accuracy: 27.75%	F-1 Score: 0.2262
The best model was saved with accuracy: 27.75%
Epoch 4/10


Training epoch: 4: 100%|██████████| 32/32 [01:19<00:00,  2.47s/it]


==> Epoch 4 Completed: Average loss: 2.800133	Accuracy: 43.333% 


Valodation: 100%|██████████| 32/32 [00:29<00:00,  1.10it/s]


==> Validation Completed: Average Loss: 3.228951	Accuracy: 31.57%	F-1 Score: 0.2702
The best model was saved with accuracy: 31.57%
Epoch 5/10


Training epoch: 5: 100%|██████████| 32/32 [01:22<00:00,  2.57s/it]


==> Epoch 5 Completed: Average loss: 2.454779	Accuracy: 55.784% 


Valodation: 100%|██████████| 32/32 [00:32<00:00,  1.01s/it]


==> Validation Completed: Average Loss: 3.063599	Accuracy: 34.41%	F-1 Score: 0.2998
The best model was saved with accuracy: 34.41%
Epoch 6/10


Training epoch: 6: 100%|██████████| 32/32 [01:18<00:00,  2.44s/it]


==> Epoch 6 Completed: Average loss: 2.219969	Accuracy: 61.471% 


Valodation: 100%|██████████| 32/32 [00:28<00:00,  1.11it/s]


==> Validation Completed: Average Loss: 2.934191	Accuracy: 35.59%	F-1 Score: 0.3165
The best model was saved with accuracy: 35.59%
Epoch 7/10


Training epoch: 7: 100%|██████████| 32/32 [01:19<00:00,  2.48s/it]


==> Epoch 7 Completed: Average loss: 1.937748	Accuracy: 70.294% 


Valodation: 100%|██████████| 32/32 [00:29<00:00,  1.10it/s]


==> Validation Completed: Average Loss: 2.887398	Accuracy: 35.88%	F-1 Score: 0.3186
The best model was saved with accuracy: 35.88%
Epoch 8/10


Training epoch: 8: 100%|██████████| 32/32 [01:18<00:00,  2.46s/it]


==> Epoch 8 Completed: Average loss: 1.720086	Accuracy: 76.373% 


Valodation: 100%|██████████| 32/32 [00:28<00:00,  1.11it/s]


==> Validation Completed: Average Loss: 2.791682	Accuracy: 38.53%	F-1 Score: 0.3520
The best model was saved with accuracy: 38.53%
Epoch 9/10


Training epoch: 9: 100%|██████████| 32/32 [01:18<00:00,  2.45s/it]


==> Epoch 9 Completed: Average loss: 1.506288	Accuracy: 80.980% 


Valodation: 100%|██████████| 32/32 [00:28<00:00,  1.11it/s]


==> Validation Completed: Average Loss: 2.776209	Accuracy: 38.14%	F-1 Score: 0.3496
Epoch 10/10


Training epoch: 10: 100%|██████████| 32/32 [01:17<00:00,  2.42s/it]


==> Epoch 10 Completed: Average loss: 1.296220	Accuracy: 85.686% 


Valodation: 100%|██████████| 32/32 [00:29<00:00,  1.09it/s]


==> Validation Completed: Average Loss: 2.687117	Accuracy: 39.61%	F-1 Score: 0.3722
The best model was saved with accuracy: 39.61%


Test: 100%|██████████| 193/193 [02:58<00:00,  1.08it/s]


==>Test Completed: Avverage loss: 2.874494	Accuracy: 34.64%	F-1 Score: 0.3367
Test model was saved with Accuracy: 34.64% and Avverage Loss: 2.874494


In [9]:
writer_2.close()
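Early stopping never triggered in the run above. The sketch below replays the same patience rule on the validation accuracies from the log to show why, and then on a made-up plateau to show the rule firing. `early_stopping_epoch` is an illustrative helper that mirrors the counter logic of the training loop.

```python
def early_stopping_epoch(val_accuracies, patience):
    # returns the 1-based epoch at which training would stop, or None
    best, counter = 0.0, 0
    for epoch, acc in enumerate(val_accuracies, start=1):
        if acc > best:
            best, counter = acc, 0
        else:
            counter += 1
        if counter >= patience:
            return epoch
    return None

# validation accuracies (%) logged above for the 2nd model
logged = [13.14, 20.98, 27.75, 31.57, 34.41, 35.59, 35.88, 38.53, 38.14, 39.61]
print(early_stopping_epoch(logged, patience=4))

# a made-up plateau to show the stop being triggered
plateau = [10.0, 12.0, 11.0, 11.5, 11.9, 12.0, 13.0]
print(early_stopping_epoch(plateau, patience=4))
```

In the logged run the only epoch without improvement is epoch 9, so the counter never reaches the patience of 4.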

**Conclusion**

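We can see that performance became better thanks to these changes. Before them, accuracy on the test data was ` 13.79% ` and afterwards it is ` 34.64% `. <br>Batch normalization is a likely main contributor, but the learning rate was also raised from 0.001 to 0.01 and dropout and a scheduler were added, so this run cannot isolate the effect of each change. The gap between the final training accuracy (85.69%) and validation accuracy (39.61%) shows that the model still overfits.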

**Next Task:** Use transfer learning to achieve better performance than the improved baseline model above. I'll use the pretrained `resnet50` model. <br> I learned how to implement transfer learning from [this tutorial](https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html).

In [14]:
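# Let's download our pretrained model 
import torch 
from torchvision import models

# `pretrained=True` is deprecated since torchvision 0.13; the weights enum below
# selects the same ImageNet checkpoint (resnet50-0676ba61.pth)
model_3 = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)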

Downloading: "https://download.pytorch.org/models/resnet50-0676ba61.pth" to C:\Users\user/.cache\torch\hub\checkpoints\resnet50-0676ba61.pth
100%|██████████| 97.8M/97.8M [00:03<00:00, 30.6MB/s]


In [15]:
print(f"Number of parameters in our Pretrained resnet50 model: {counter_params(model_3)}")

Number of parameters in our Pretrained resnet50 model: 25557032


In [None]:
# just to see our architecture of FC Layers 
print(model_3)

ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): Bottleneck(
      (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (downsample): Sequential(
        (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 

In [None]:
# Here I'm freezing the model's pretrained weights
for params in model_3.parameters():
    params.requires_grad = False 

# reset final fully connected layer
#fc_input = model_3.fc.in_features

model_3.fc = nn.Sequential(
    nn.Linear(2048,102),
    nn.LogSoftmax(dim=1) # the training functions use NLL loss, which expects log-probabilities
)
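After freezing the backbone and replacing the head, only the new linear layer is trained. The arithmetic sketch below estimates how small that trainable part is. It relies on the parameter count printed earlier for the unmodified model and on the original ResNet-50 head being a 2048-to-1000 linear layer with bias; the variable names are illustrative.

```python
total_resnet50 = 25557032      # count printed above for the unmodified model
old_fc = 2048 * 1000 + 1000    # original ImageNet head (1000 classes)
new_fc = 2048 * 102 + 102      # replacement head (102 flower classes)

frozen_backbone = total_resnet50 - old_fc
print(new_fc, frozen_backbone)
print(f"trainable share: {new_fc / (frozen_backbone + new_fc):.2%}")
```

Fewer than 1% of the weights are updated, so the frozen network acts as a fixed feature extractor for the new classifier.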

In [38]:
# just to make sure that we successfully changed
# number of output features 
print(model_3.fc)

Sequential(
  (0): Linear(in_features=2048, out_features=102, bias=True)
  (1): LogSoftmax(dim=1)
)


In [39]:
from torch.utils.tensorboard import SummaryWriter
from torch.optim.lr_scheduler import StepLR


writer_3 = SummaryWriter(log_dir="Transfer_learning_Model")

epochs = 10 
learning_rate = 0.01
momentum = 0.5
patience = 3

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = model_3.to(device)
optimizer_3 = optim.SGD(model.parameters(), lr = learning_rate, momentum=momentum)
scheduler_3 = StepLR(optimizer_3, step_size=5,gamma=0.1)

best_accuracy_3 = 0
early_stop_counter_3 = 0 

for epoch in range(1, epochs+1):
    print(f"Epoch {epoch}/{epochs}")
    train_1(model, device, train_loader, train_dataset, optimizer_3, epoch, writer_3)

    val_loss_3, val_acc_3 = validation_1(model,device, val_loader, val_dataset, epoch, writer_3)

    if val_acc_3 > best_accuracy_3:
        best_accuracy_3 = val_acc_3
        torch.save(model.state_dict(), "Best_in_transfer_learning_CNN.pt")
        print(f"The best model was saved with accuracy: {best_accuracy_3:.2f}%")
        early_stop_counter_3 = 0 
    else:
        early_stop_counter_3 +=1 

    if early_stop_counter_3 >= patience:
        print(f"Early stopping at epoch {epoch} due to no improvement in validation accuracy.")
        break

    scheduler_3.step()

# reload the checkpoint with the best validation accuracy before testing
model.load_state_dict(torch.load("Best_in_transfer_learning_CNN.pt", map_location=device))

test_loss_3, test_acc_3 = test_1(model,device, test_loader, test_dataset)
torch.save(model.state_dict(), "Test_result_for_transfer_learning.pt")

print(f"Test model was saved with Accuracy: {test_acc_3:.2f}% and Avverage Loss: {test_loss_3:.6f} ")

writer_3.close()


Epoch 1/10


Training epoch: 1: 100%|██████████| 32/32 [01:50<00:00,  3.45s/it]


==> Epoch 1 Completed: Average loss: 4.602927	Accuracy: 2.843% 


Valodation: 100%|██████████| 32/32 [01:40<00:00,  3.14s/it]


==> Validation Completed: Average Loss: 4.307562	Accuracy: 15.59%	F-1 Score: 0.1206
The best model was saved with accuracy: 15.59%
Epoch 2/10


Training epoch: 2: 100%|██████████| 32/32 [01:48<00:00,  3.40s/it]


==> Epoch 2 Completed: Average loss: 4.205982	Accuracy: 15.784% 


Valodation: 100%|██████████| 32/32 [01:43<00:00,  3.24s/it]


==> Validation Completed: Average Loss: 3.974060	Accuracy: 33.33%	F-1 Score: 0.2914
The best model was saved with accuracy: 33.33%
Epoch 3/10


Training epoch: 3: 100%|██████████| 32/32 [01:55<00:00,  3.60s/it]


==> Epoch 3 Completed: Average loss: 3.844655	Accuracy: 33.627% 


Valodation: 100%|██████████| 32/32 [01:39<00:00,  3.10s/it]


==> Validation Completed: Average Loss: 3.645155	Accuracy: 47.06%	F-1 Score: 0.4340
The best model was saved with accuracy: 47.06%
Epoch 4/10


Training epoch: 4: 100%|██████████| 32/32 [01:48<00:00,  3.40s/it]


==> Epoch 4 Completed: Average loss: 3.469561	Accuracy: 51.373% 


Valodation: 100%|██████████| 32/32 [01:38<00:00,  3.07s/it]


==> Validation Completed: Average Loss: 3.365293	Accuracy: 52.45%	F-1 Score: 0.4857
The best model was saved with accuracy: 52.45%
Epoch 5/10


Training epoch: 5: 100%|██████████| 32/32 [01:48<00:00,  3.41s/it]


==> Epoch 5 Completed: Average loss: 3.164243	Accuracy: 62.157% 


Valodation: 100%|██████████| 32/32 [01:37<00:00,  3.05s/it]


==> Validation Completed: Average Loss: 3.112697	Accuracy: 58.82%	F-1 Score: 0.5542
The best model was saved with accuracy: 58.82%
Epoch 6/10


Training epoch: 6: 100%|██████████| 32/32 [01:48<00:00,  3.40s/it]


==> Epoch 6 Completed: Average loss: 2.904836	Accuracy: 75.392% 


Valodation: 100%|██████████| 32/32 [01:54<00:00,  3.59s/it]


==> Validation Completed: Average Loss: 3.075557	Accuracy: 61.08%	F-1 Score: 0.5798
The best model was saved with accuracy: 61.08%
Epoch 7/10


Training epoch: 7: 100%|██████████| 32/32 [01:54<00:00,  3.58s/it]


==> Epoch 7 Completed: Average loss: 2.863517	Accuracy: 79.608% 


Valodation: 100%|██████████| 32/32 [01:36<00:00,  3.02s/it]


==> Validation Completed: Average Loss: 3.039001	Accuracy: 64.22%	F-1 Score: 0.6149
The best model was saved with accuracy: 64.22%
Epoch 8/10


Training epoch: 8: 100%|██████████| 32/32 [01:47<00:00,  3.35s/it]


==> Epoch 8 Completed: Average loss: 2.825759	Accuracy: 82.255% 


Valodation: 100%|██████████| 32/32 [01:36<00:00,  3.03s/it]


==> Validation Completed: Average Loss: 3.014183	Accuracy: 64.12%	F-1 Score: 0.6119
Epoch 9/10


Training epoch: 9: 100%|██████████| 32/32 [01:46<00:00,  3.34s/it]


==> Epoch 9 Completed: Average loss: 2.796082	Accuracy: 83.039% 


Valodation: 100%|██████████| 32/32 [01:36<00:00,  3.00s/it]


==> Validation Completed: Average Loss: 2.988901	Accuracy: 66.27%	F-1 Score: 0.6351
The best model was saved with accuracy: 66.27%
Epoch 10/10


Training epoch: 10: 100%|██████████| 32/32 [01:45<00:00,  3.29s/it]


==> Epoch 10 Completed: Average loss: 2.763920	Accuracy: 83.725% 


Valodation: 100%|██████████| 32/32 [01:34<00:00,  2.94s/it]


==> Validation Completed: Average Loss: 2.976717	Accuracy: 64.90%	F-1 Score: 0.6196


Test: 100%|██████████| 193/193 [30:44<00:00,  9.56s/it]   

==>Test Completed: Avverage loss: 3.104836	Accuracy: 62.86%	F-1 Score: 0.6136
Test model was saved with Accuracy: 62.86% and Avverage Loss: 3.104836 





**Conclusion:**
We can see that thanks to the pretrained `resnet50` model we got `Accuracy = 62.86%` on the test set. For comparison, the baseline model at the beginning reached less than `20%` accuracy and the improved model reached `34.64%`. 
Why did this happen? Because we used a pretrained model: it was trained on a very large dataset (ImageNet), and we reused the same architecture with the same frozen weights as a feature extractor on our data. We only replaced the final fully connected layer, because our task has 102 classes, and trained that layer alone.