# Notebook 5.1: Recommendations - How dropout affect models

Dropout is a common regularization method in machine learning, especially for deep learning models. It prevents overfitting, which will result in poor performance on unseen data. Overfitting reduces a model's generalizability and effectiveness with new sample introduced.

During training, dropout randomly "deactivates" a portion of connections in a neural network. This occurs at every training iteration, with the probability of a neuron being dropped determined by the dropout rate. Consequently, the model learns multiple representations. It then will not depend too much on certain neurons or connections, preventing overfitting.

In this notebook, our goal is to remove the dropout layers from the model in Section 3.3 and compare the performance differences to better understand dropout's significance.

## How dropout affect models
* [Remove dropout layers, and retrain model](#rem)
    * [Remove dropout layers](#rem)
    * [Retrain the model](#train)
* [The result](#vis)
    * [Visualization](#vis)
    * [Conclusion](#conc)

In [1]:
from torch import nn, utils
from transformers import BertTokenizer, BertModel
from torch.optim import Adam
from tqdm import tqdm
import numpy as np
import torch
import pandas as pd
import seaborn as sns
from torch.utils.tensorboard import SummaryWriter
from sklearn.metrics import accuracy_score

  import pandas.util.testing as tm


### Load the Bert Tokenizer

In [2]:
tokenizer = BertTokenizer.from_pretrained('./distilbert-base-uncased', local_files_only=True)


The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. 
The tokenizer class you load from this checkpoint is 'DistilBertTokenizer'. 
The class this function is called from is 'BertTokenizer'.


As the definition from huggingface's website https://huggingface.co/distilbert-base-uncased, DistilBERT is a small, fast, cheap and light Transformer model which was **pretrained** on the same corpus in a **self-supervised** fashion, using the BERT base model as a teacher. This means it was pretrained on the raw texts only, with no humans labelling them in any way.

This makes it suitable for our model development to focus on oue lyrics as we are trying to investigate its relationship with the popularity of the song.

## Load the dataset

In [3]:
df = pd.read_csv("./positive_and_negative_one_hot.csv")
df = df.dropna()
df


Unnamed: 0,artist,year,views,features,lyrics,id,url,acousticness,danceability,duration_ms,...,key_8,key_9,key_10,key_11,tag_country,tag_misc,tag_pop,tag_rap,tag_rb,tag_rock
0,AKING,2015,4.432273e-05,{},Glorious mistakes are anxiously waiting to be ...,985583,https://open.spotify.com/track/30sr35axWFPOvmi...,0.760040,0.806517,0.144170,...,0,0,0,0,0,0,1,0,0,0
1,Filip Winther,2020,1.251733e-06,{},[Intro]\nDe-de-deluxe\n\n[Refräng]\nJag fuckar...,5097257,https://open.spotify.com/track/4mznGf6tTvHp74y...,0.020681,0.894094,0.141797,...,0,0,0,1,0,0,0,1,0,0
2,Dan Reeder,2018,1.513459e-05,{},The guy who bathes in the pond at the park\nTh...,3407076,https://open.spotify.com/track/1UbSSyqIVEkooKe...,0.993976,0.554990,0.044422,...,0,0,0,0,0,0,1,0,0,0
3,Noa Azazel,2021,1.251733e-06,{},[Pre-Chorus]\nWhen the moon is taking over i'm...,7061926,https://open.spotify.com/track/51F8whLH1Qou7iV...,0.214858,0.419552,0.169140,...,0,0,0,0,0,0,1,0,0,0
4,070 Phi,2019,2.031221e-05,{},[Chorus]\nAin't no way that you ain't eatin' w...,4241387,https://open.spotify.com/track/0mvzUwvyLT1Dm1y...,0.367469,0.695519,0.146753,...,0,0,0,0,0,0,0,1,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
9529,mounika yadav,2021,1.257423e-05,"{""Allu Arjun"",""Rashmika Mandanna""}",నువ్ అమ్మీ అమ్మీ అంటాంటే నీ పెళ్ళాన్నైపోయినట్ట...,7552375,https://open.spotify.com/track/4ZUxhQNRCzlh6al...,0.360441,0.821792,0.161581,...,0,0,0,0,0,0,1,0,0,0
9530,d-metal stars,2016,1.706909e-07,{},[Verse 1]\nThe seaweed is always greener\nIn s...,7558599,https://open.spotify.com/track/0F8nLktPi0SgOAm...,0.000092,0.542770,0.154411,...,1,0,0,0,0,0,0,0,0,1
9531,grupo firme,2021,2.048290e-06,{Maluma},"Dejen de meterse ya, en donde no les importa\n...",7728445,https://open.spotify.com/track/5BE9B2FiFWBbBdo...,0.137549,0.719959,0.142190,...,0,0,0,0,0,0,1,0,0,0
9532,hensonn,2021,7.567295e-06,{},[Instrumental],7814578,https://open.spotify.com/track/6nqdgUTiWt4JbAB...,0.146585,0.626273,0.122640,...,0,0,0,0,0,0,0,1,0,0


In [4]:
np.random.seed(6666)
df_train, df_val, df_test = np.split(df.sample(frac=1, random_state=88), 
                                     [int(.6*len(df)), int(.8*len(df))])

In [5]:
class Dataset(utils.data.Dataset):

    def __init__(self, df):

        self.ys = df['if_popular'].to_numpy()
        self.texts = [tokenizer(text, 
                               padding='max_length', max_length = 512, truncation=True,
                                return_tensors="pt") for text in df['lyrics']]
        self.features = df[['key_0','key_1','key_2','key_3','key_4','key_5','key_6','key_7','key_8','key_9','key_10','key_11','tag_country','tag_misc','tag_pop','tag_rap','tag_rb','tag_rock','year', 'views','acousticness','danceability','duration_ms','energy','instrumentalness','liveness','loudness','speechiness','tempo','valence','popularity']].to_numpy()
        self.df = df

    def linear(self):
        return self.ys

    def __len__(self):
        return len(self.ys)

    def get_batch_labels(self, idx):
        # Fetch a batch of labels
        #print("Total len:", len(self.ys), " getting:", idx)
        return self.ys[idx]

    def get_batch_texts(self, idx):
        # Fetch a batch of inputs
        return self.texts[idx]
    
    def get_batch_freatures(self, idx):
        # Fetch a batch of inputs
        return self.features[idx]

    def __getitem__(self, idx):

        batch_texts = self.get_batch_texts(idx)
        batch_y = self.get_batch_labels(idx)
        batch_feature = self.get_batch_freatures(idx)

        return batch_texts, batch_y, batch_feature

<br>

# Remove dropout layers <a name = "rem"> </a>

The model below is identical to the model in 3.3, however, all dropout layers are removed.

In [6]:
class BertClassifier(nn.Module):

    def __init__(self, dropout=0.5):

        super(BertClassifier, self).__init__()

        self.bert = BertModel.from_pretrained('./distilbert-base-uncased', local_files_only=True)
        self.dropout = nn.Dropout(dropout)
        self.linear1 = nn.Linear(768, 64)
        self.linear2 = nn.Linear(64, 32)
        self.linear3 = nn.Linear(63, 16)
        self.layer_out = nn.Linear(16, 1) 
        self.batchnorm1 = nn.BatchNorm1d(64)
        self.batchnorm2 = nn.BatchNorm1d(32)
        self.batchnorm3 = nn.BatchNorm1d(16)
        self.relu = nn.ReLU()

    def forward(self, input_id, mask, features):

        _, pooled_output = self.bert(input_ids= input_id, attention_mask=mask,return_dict=False)
        # dropout_output = self.dropout(pooled_output)
        linear_output1 = self.relu(self.linear1(pooled_output))
        linear_output1 = self.batchnorm1(linear_output1)
        linear_output2 = self.relu(self.linear2(linear_output1))
        linear_output2 = self.batchnorm2(linear_output2)
        # linear_output2 = self.dropout(linear_output2)
        linear_output3 = self.relu(self.linear3(torch.cat((linear_output2, features), dim=1)))
        linear_output3 = self.batchnorm3(linear_output3)
        # linear_output3 = self.dropout(linear_output3)
        final_layer = self.layer_out(linear_output3)

        return final_layer

In [7]:
def get_accuracy(y_true, y_prob):
    accuracy = accuracy_score(y_true, y_prob > 0.5)
    return accuracy

## Retrain the model <a name = "train"> </a>

Dropout is a regularization technique used in neural networks to reduce overfitting. In this model, we removed dropout and tested its effect.

The model below is mostly identical to the training process in 3.3 with no dropout layers.

In [8]:
def train(model, train_data, val_data, learning_rate, epochs):

    train, val = Dataset(train_data), Dataset(val_data)
    batch_size = 32

    train_dataloader = torch.utils.data.DataLoader(train, batch_size=batch_size, shuffle=True)
    val_dataloader = torch.utils.data.DataLoader(val, batch_size=batch_size, shuffle=True)

    use_cuda = torch.cuda.is_available()
    device = torch.device("cuda")
    use_cuda = True

    criterion = nn.BCEWithLogitsLoss()
    optimizer = Adam(model.parameters(), lr= learning_rate)

    if use_cuda:

            model = model.cuda()
            criterion = criterion.cuda()

    for epoch_num in range(epochs):

            total_cnt_train = 0
            total_loss_train = 0
            train_acc = 0
            train_acc_cnt = 0

            for train_input, train_label, train_features in tqdm(train_dataloader):

                train_label = train_label.to(device)
                mask = train_input['attention_mask'].to(device)
                input_id = train_input['input_ids'].squeeze(1).to(device)
                train_features = train_features.to(torch.float32).to(device)

                output = model(input_id, mask, train_features)
                
                # print("Output1: ", output, " Output2: ", train_label.float().unsqueeze(1), " loss: " , criterion(output, train_label.float()))
                
                batch_loss = criterion(output, train_label.float().unsqueeze(1))
                total_loss_train += batch_loss.item()
                # if total_cnt_train == 5:
                #     print("Train LOSS:", batch_loss.item())
                total_cnt_train += 1
                
                train_acc += get_accuracy(train_label.float().unsqueeze(1).cpu(), output.cpu())
                train_acc_cnt += 1
 
    
                model.zero_grad()
                batch_loss.backward()
                optimizer.step()
            
            total_cnt_val = 0
            total_loss_val = 0
            acc = 0
            acc_cnt = 0

            with torch.no_grad():

                for val_input, val_label, val_features in val_dataloader:

                    val_label = val_label.to(device)
                    mask = val_input['attention_mask'].to(device)
                    input_id = val_input['input_ids'].squeeze(1).to(device)
                    val_features = val_features.to(torch.float32).to(device)

                    output = model(input_id, mask, val_features)

                    batch_loss = criterion(output, val_label.float().unsqueeze(1))
                    total_loss_val += batch_loss.item()
                    # if total_cnt_val == 5:
                    #     print("Val LOSS:", batch_loss.item())
                    total_cnt_val += 1
                    # print("Loss: ", batch_loss, " Calc: ", sum((output - val_label.float().unsqueeze(1))**2) / batch_size)
                    acc += get_accuracy(val_label.float().unsqueeze(1).cpu(), output.cpu())
                    acc_cnt += 1

            
            print(
                f'Epochs: {epoch_num + 1} | \nTrain BCELoss: {total_loss_train / total_cnt_train: .3f} \
                | Val BCELoss: {total_loss_val / total_cnt_val: .3f}' +
                f'\nTrain Acc: {train_acc / train_acc_cnt: .3f} \
                | Val Acc: {acc / acc_cnt: .3f}'
            )
                  
            writer.add_scalar('Loss/train', total_loss_train / total_cnt_train, epoch_num)
            writer.add_scalar('Loss/val', total_loss_val / total_cnt_val, epoch_num)
            writer.add_scalar('Acc/train', train_acc / train_acc_cnt, epoch_num)
            writer.add_scalar('Acc/val', acc / acc_cnt, epoch_num)
            torch.save(model.state_dict(), "./Bert_classificationnoo_drop/BERT-CLASSIFICATION_it" + str(epoch_num) + ".pt")


EPOCHS = 50
model = BertClassifier()
LR = 3e-5
     
writer = SummaryWriter()
train(model, df_train, df_val, LR, EPOCHS)
writer.flush()


You are using a model of type distilbert to instantiate a model of type bert. This is not supported for all configurations of models and can yield errors.
Some weights of the model checkpoint at ./distilbert-base-uncased were not used when initializing BertModel: ['distilbert.transformer.layer.1.ffn.lin2.bias', 'distilbert.transformer.layer.0.attention.out_lin.bias', 'distilbert.transformer.layer.2.attention.k_lin.bias', 'distilbert.transformer.layer.5.output_layer_norm.bias', 'distilbert.transformer.layer.3.attention.q_lin.bias', 'distilbert.transformer.layer.0.attention.q_lin.bias', 'distilbert.transformer.layer.5.sa_layer_norm.bias', 'distilbert.transformer.layer.0.attention.k_lin.weight', 'distilbert.transformer.layer.5.attention.v_lin.bias', 'vocab_transform.bias', 'distilbert.transformer.layer.1.sa_layer_norm.bias', 'distilbert.transformer.layer.1.attention.out_lin.bias', 'distilbert.embeddings.LayerNorm.weight', 'distilbert.transformer.layer.4.sa_layer_norm.bias', 'distilbert.tr

Epochs: 1 | 
Train BCELoss:  0.560                 | Val BCELoss:  0.539
Train Acc:  0.686                 | Val Acc:  0.741


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 2 | 
Train BCELoss:  0.495                 | Val BCELoss:  0.522
Train Acc:  0.777                 | Val Acc:  0.753


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:26<00:00,  2.08it/s]


Epochs: 3 | 
Train BCELoss:  0.451                 | Val BCELoss:  0.547
Train Acc:  0.816                 | Val Acc:  0.730


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 4 | 
Train BCELoss:  0.406                 | Val BCELoss:  0.498
Train Acc:  0.863                 | Val Acc:  0.768


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 5 | 
Train BCELoss:  0.362                 | Val BCELoss:  0.487
Train Acc:  0.903                 | Val Acc:  0.783


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 6 | 
Train BCELoss:  0.329                 | Val BCELoss:  0.480
Train Acc:  0.939                 | Val Acc:  0.786


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 7 | 
Train BCELoss:  0.304                 | Val BCELoss:  0.491
Train Acc:  0.959                 | Val Acc:  0.779


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 8 | 
Train BCELoss:  0.274                 | Val BCELoss:  0.482
Train Acc:  0.980                 | Val Acc:  0.788


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 9 | 
Train BCELoss:  0.256                 | Val BCELoss:  0.478
Train Acc:  0.985                 | Val Acc:  0.798


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 10 | 
Train BCELoss:  0.242                 | Val BCELoss:  0.479
Train Acc:  0.989                 | Val Acc:  0.786


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 11 | 
Train BCELoss:  0.231                 | Val BCELoss:  0.476
Train Acc:  0.990                 | Val Acc:  0.797


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 12 | 
Train BCELoss:  0.219                 | Val BCELoss:  0.475
Train Acc:  0.993                 | Val Acc:  0.791


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 13 | 
Train BCELoss:  0.207                 | Val BCELoss:  0.491
Train Acc:  0.993                 | Val Acc:  0.778


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 14 | 
Train BCELoss:  0.200                 | Val BCELoss:  0.460
Train Acc:  0.992                 | Val Acc:  0.806


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 15 | 
Train BCELoss:  0.196                 | Val BCELoss:  0.494
Train Acc:  0.990                 | Val Acc:  0.783


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 16 | 
Train BCELoss:  0.185                 | Val BCELoss:  0.477
Train Acc:  0.992                 | Val Acc:  0.808


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 17 | 
Train BCELoss:  0.182                 | Val BCELoss:  0.471
Train Acc:  0.991                 | Val Acc:  0.787


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 18 | 
Train BCELoss:  0.192                 | Val BCELoss:  0.479
Train Acc:  0.982                 | Val Acc:  0.794


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 19 | 
Train BCELoss:  0.160                 | Val BCELoss:  0.500
Train Acc:  0.994                 | Val Acc:  0.773


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 20 | 
Train BCELoss:  0.154                 | Val BCELoss:  0.484
Train Acc:  0.995                 | Val Acc:  0.795


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 21 | 
Train BCELoss:  0.148                 | Val BCELoss:  0.512
Train Acc:  0.995                 | Val Acc:  0.786


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 22 | 
Train BCELoss:  0.141                 | Val BCELoss:  0.490
Train Acc:  0.995                 | Val Acc:  0.806


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 23 | 
Train BCELoss:  0.137                 | Val BCELoss:  0.499
Train Acc:  0.995                 | Val Acc:  0.801


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 24 | 
Train BCELoss:  0.136                 | Val BCELoss:  0.517
Train Acc:  0.994                 | Val Acc:  0.786


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 25 | 
Train BCELoss:  0.126                 | Val BCELoss:  0.519
Train Acc:  0.995                 | Val Acc:  0.790


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 26 | 
Train BCELoss:  0.124                 | Val BCELoss:  0.513
Train Acc:  0.995                 | Val Acc:  0.793


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 27 | 
Train BCELoss:  0.132                 | Val BCELoss:  0.508
Train Acc:  0.989                 | Val Acc:  0.786


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 28 | 
Train BCELoss:  0.125                 | Val BCELoss:  0.490
Train Acc:  0.992                 | Val Acc:  0.802


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 29 | 
Train BCELoss:  0.114                 | Val BCELoss:  0.509
Train Acc:  0.994                 | Val Acc:  0.804


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 30 | 
Train BCELoss:  0.106                 | Val BCELoss:  0.515
Train Acc:  0.995                 | Val Acc:  0.802


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 31 | 
Train BCELoss:  0.103                 | Val BCELoss:  0.497
Train Acc:  0.994                 | Val Acc:  0.805


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 32 | 
Train BCELoss:  0.131                 | Val BCELoss:  0.496
Train Acc:  0.985                 | Val Acc:  0.790


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 33 | 
Train BCELoss:  0.103                 | Val BCELoss:  0.488
Train Acc:  0.994                 | Val Acc:  0.803


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 34 | 
Train BCELoss:  0.093                 | Val BCELoss:  0.507
Train Acc:  0.995                 | Val Acc:  0.807


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 35 | 
Train BCELoss:  0.090                 | Val BCELoss:  0.525
Train Acc:  0.995                 | Val Acc:  0.802


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 36 | 
Train BCELoss:  0.101                 | Val BCELoss:  0.518
Train Acc:  0.989                 | Val Acc:  0.785


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 37 | 
Train BCELoss:  0.090                 | Val BCELoss:  0.547
Train Acc:  0.993                 | Val Acc:  0.785


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 38 | 
Train BCELoss:  0.079                 | Val BCELoss:  0.553
Train Acc:  0.995                 | Val Acc:  0.803


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 39 | 
Train BCELoss:  0.076                 | Val BCELoss:  0.568
Train Acc:  0.995                 | Val Acc:  0.801


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 40 | 
Train BCELoss:  0.076                 | Val BCELoss:  0.576
Train Acc:  0.995                 | Val Acc:  0.793


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 41 | 
Train BCELoss:  0.072                 | Val BCELoss:  0.587
Train Acc:  0.995                 | Val Acc:  0.794


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.09it/s]


Epochs: 42 | 
Train BCELoss:  0.069                 | Val BCELoss:  0.601
Train Acc:  0.995                 | Val Acc:  0.796


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 43 | 
Train BCELoss:  0.066                 | Val BCELoss:  0.577
Train Acc:  0.995                 | Val Acc:  0.802


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 44 | 
Train BCELoss:  0.066                 | Val BCELoss:  0.593
Train Acc:  0.995                 | Val Acc:  0.800


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 45 | 
Train BCELoss:  0.063                 | Val BCELoss:  0.598
Train Acc:  0.995                 | Val Acc:  0.804


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 46 | 
Train BCELoss:  0.059                 | Val BCELoss:  0.608
Train Acc:  0.995                 | Val Acc:  0.797


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 47 | 
Train BCELoss:  0.057                 | Val BCELoss:  0.606
Train Acc:  0.995                 | Val Acc:  0.801


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 48 | 
Train BCELoss:  0.065                 | Val BCELoss:  0.603
Train Acc:  0.992                 | Val Acc:  0.803


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 49 | 
Train BCELoss:  0.074                 | Val BCELoss:  0.637
Train Acc:  0.987                 | Val Acc:  0.798


100%|████████████████████████████████████████████████████████████████████████████████| 179/179 [01:25<00:00,  2.08it/s]


Epochs: 50 | 
Train BCELoss:  0.071                 | Val BCELoss:  0.609
Train Acc:  0.988                 | Val Acc:  0.806


<br>

# Visualization of Result <a name = "vis"> </a>

## Purple: With Dropout; Yellow: No Dropout


![Results](./5.1Result.png "Results")



## Conclusion <a name = "conc"> </a>

We trained the model without a Dropout Layer for 50 epochs.

When comparing the results to the model that includes dropout layers, we observed that although dropout may slightly slow down the training process, it leads to nearly a **5% increase** in **test set accuracy** and a substantially lower loss compared to models without dropout.

In conclusion, **integrating dropout layers greatly increased the performance of our model**. Dropout strengthens the **generalization abilities of deep learning models**, making it a crucial component in model development.
