# PetFinder PoolFormer PyTorch Trainer + PoolFormer + fp16 + Mixup + KFolds + W&B Tracking ✨

<!-- ![](https://d3tfjnq35srlo8.cloudfront.net/uploads/2020/11/0061269_PETFINDER-FOUNDATION-SUB-PAGE-02.jpg)
 -->
![](https://user-images.githubusercontent.com/15921929/142746124-1ab7635d-2536-4a0e-ad43-b4fe2c5a525d.png)

This notebook features a modular PyTorch Trainer with Support for CutMix augmentation along with Apex. The model I'm training here is PoolFormer, which was announced recently in [this](https://arxiv.org/abs/2111.11418) paper.

I've tried my best to make this PyTorch training script as modular and fault-tolerant as possible and it doesn't generally throw errors or breaks, should you forget to pass an argument or two.

Then again, if there's some bug or improvements that you notice, please do tell me in the comments and I'll have them fixed in the next commit.

### Please leave an upvote if you found this kernel helpful!

In [1]:
%%sh
pip install -q wandb
git clone --quiet https://github.com/sail-sg/poolformer.git
pip install -q git+https://github.com/rwightman/pytorch-image-models.git@9d6aad44f8fd32e89e5cca503efe3ada5071cc2a



In [2]:
import sys
sys.path.append('./poolformer')

In [3]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

import os
import cv2
import timm
from timm.models import load_checkpoint

import models
from PIL import Image
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
import torchvision.transforms as T
from torch.cuda.amp import autocast, GradScaler
from torch.utils.data import Dataset, DataLoader

import gc
import wandb
import warnings
from tqdm.notebook import tqdm

from sklearn.metrics import mean_squared_error
from sklearn.model_selection import StratifiedKFold

warnings.simplefilter('ignore')

If for semantic segmentation, please install mmsegmentation first
If for detection, please install mmdetection first


In [4]:
Config = {
    'CSV_PATH': "../input/petfinder-pawpularity-score/train.csv",
    'IMG_PATH': "../input/petfinder-pawpularity-score/train",
    'N_ACCUM': 2,
    'N_SPLITS': 5,
    'TRAIN_BS': 64,
    'VALID_BS': 64,
    'N_EPOCHS': 5,
    'NUM_WORKERS': 4,
    'LR': 1e-5,
    'OPTIM': "AdamW",
    'LOSS': "BCELogits",
    'ARCH': "../input/poolformerweights/poolformer_m36.pth.tar",
    'IMG_SIZE': 224,
    'DEVICE': "cuda",
    "T_0": 20,
    "η_min": 1e-4,
    'infra': "Kaggle",
    'competition': 'petfinder',
    '_wandb_kernel': 'tanaym',
    "wandb": True,
}

## About W&B:
<center><img src="https://i.imgur.com/gb6B4ig.png" width="400" alt="Weights & Biases"/></center><br>
<p style="text-align:center">WandB is a developer tool for companies turn deep learning research projects into deployed software by helping teams track their models, visualize model performance and easily automate training and improving models.
We will use their tools to log hyperparameters and output metrics from your runs, then visualize and compare results and quickly share findings with your colleagues.<br><br></p>

To login to W&B, you can use below snippet.

```python
from kaggle_secrets import UserSecretsClient
user_secrets = UserSecretsClient()
wb_key = user_secrets.get_secret("WANDB_API_KEY")

wandb.login(key=wb_key)
```
Make sure you have your W&B key stored as `WANDB_API_KEY` under Add-ons -> Secrets

You can view [this](https://www.kaggle.com/ayuraj/experiment-tracking-with-weights-and-biases) notebook to learn more about W&B tracking.

If you don't want to login to W&B, the kernel will still work and log everything to W&B in anonymous mode.

In [5]:
# Start W&B logging
if Config['wandb']:
    run = wandb.init(
        project='pytorch',
        config=Config,
        group='vision',
        job_type='train',
        anonymous='must'
    )

[34m[1mwandb[0m: Appending key for api.wandb.ai to your netrc file: /root/.netrc
[34m[1mwandb[0m: wandb version 0.12.7 is available!  To upgrade, please run:
[34m[1mwandb[0m:  $ pip install wandb --upgrade

CondaEnvException: Unable to determine environment

Please re-run this command with one of the following options:

* Provide an environment name via --name or -n
* Re-run this command inside an activated conda environment.



In [6]:
# Some utility functions
def wandb_log(**kwargs):
    """
    Logs a key-value pair to W&B
    """
    for k, v in kwargs.items():
        wandb.log({k: v})

def rmse(output, target):
    """
    Returns root mean squared error loss
    """
    return mean_squared_error(output, target, squared=False)

<h1 align='center' style='color: #8532a8; font-size: 1.5em; font-weight: 300; font-size: 32px'>1. Dataset Class</h1>

In [7]:
class PetfinderData(Dataset):
    def __init__(self, df, config=Config, augments=None, is_test=False):
        self.df = df
        self.augments = augments
        self.is_test = is_test
        self.config = config
        
        self.img_paths = self._get_img_paths(self.df, self.config)
        self.meta_feats = self._get_meta_feats(self.df, self.is_test)

    def __getitem__(self, idx):
        img = cv2.imread(self.img_paths[idx])
        img = cv2.resize(img, (Config['IMG_SIZE'], Config['IMG_SIZE']))
        meta_feats = torch.tensor(self.meta_feats.iloc[idx].values).float()

        if self.augments:
            img = Image.fromarray(img)
            img = self.augments(img)
        
        if self.is_test:
            return (img, meta_feats)
        else:
            target = torch.tensor(self.df['Pawpularity'].iloc[idx]).float()
            return (img, meta_feats, target)
    
    def __len__(self):
        return len(self.df)

    def _get_img_paths(self, df, config):
        """
        Returns the image paths in a list
        """
        imgs = df['Id'].apply(lambda x: os.path.join(config['IMG_PATH'], x + ".jpg")).tolist()
        return imgs
    
    def _get_meta_feats(self, df, is_test):
        """
        Returns the meta features in a df
        """
        if self.is_test:
            meta = self.df.drop(['Id'], axis=1)
            return meta
        else:
            meta = self.df.drop(['Id', 'Pawpularity'], axis=1)
            return meta

<h1 align='center' style='color: #8532a8; font-size: 1.5em; font-weight: 300; font-size: 32px'>2. Model Class</h1>

## About PoolFormer (MetaFormer)

Recent works have shown that the Attention-based module in transformers can be replaced by spatial MLPs and the resulted models still perform quite well. Based on this observation, it is hypothesized that the general architecture of the transformers, instead of the specific token mixer module, is more essential to the model’s performance. 

To verify this, the attention module in transformers is replaced with a simple spatial pooling operator to conduct only the most basic token mixing. The derived model, termed as **PoolFormer**, achieves competitive performance on multiple computer vision tasks.

Link to the [original paper](https://arxiv.org/pdf/2111.11418.pdf) and their GitHub [repo](https://github.com/sail-sg/poolformer).

In [8]:
class RegressionHeadModel(nn.Module):
    def __init__(self, backbone_arch, pretrained=True, in_chans=3):
        super(RegressionHeadModel, self).__init__()
        self.backbone = models.poolformer_m36(pretrained=pretrained)
        load_checkpoint(model=self.backbone, checkpoint_path=backbone_arch)
        self.backbone.head = nn.Linear(self.backbone.head.in_features, 128)
        self.drop = nn.Dropout(0.3)
        self.fc1 = nn.Linear(140, 64)
        self.fc2 = nn.Linear(64, 32)
        self.fc3 = nn.Linear(32, 1)
    
    def forward(self, img, meta):
        emb = self.backbone(img)
        x = self.drop(emb)
        x = torch.cat([x, meta], dim=1)
        x = self.fc1(x)
        x = self.fc2(x)
        x = self.fc3(x)

        return x

<h1 align='center' style='color: #8532a8; font-size: 1.5em; font-weight: 300; font-size: 32px'>3. Mixup and Augmentations</h1>

In [9]:
def mixup_augmentation(x:torch.Tensor, y:torch.Tensor, alpha:float = 1.0):
    """
    Function which performs Mixup augmentation
    """
    assert alpha > 0, "Alpha must be greater than 0"
    assert x.shape[0] > 1, "Need more than 1 sample to apply mixup"

    lam = np.random.beta(alpha, alpha)
    rand_idx = torch.randperm(x.shape[0])
    mixed_x = lam * x + (1 - lam) * x[rand_idx, :]

    target_a, target_b = y, y[rand_idx]

    return mixed_x, target_a, target_b, lam

In [10]:
class Augments:
    IMAGENET_MEAN = [0.485, 0.456, 0.406]  # RGB
    IMAGENET_STD = [0.229, 0.224, 0.225]  # RGB
    train_augments = T.Compose(
            [
                T.RandomHorizontalFlip(),
                T.RandomVerticalFlip(),
                T.RandomAffine(15, translate=(0.1, 0.1), scale=(0.9, 1.1)),
                T.ColorJitter(brightness=0.1, contrast=0.1, saturation=0.1),
                T.ToTensor(),
                T.Normalize(mean=IMAGENET_MEAN, std=IMAGENET_STD),
            ]
        )
    valid_augments = T.Compose(
            [
                T.ToTensor(),
                T.Normalize(mean=IMAGENET_MEAN, std=IMAGENET_STD),
            ]
        )

<h1 align='center' style='color: #8532a8; font-size: 1.5em; font-weight: 300; font-size: 32px'>4. Trainer Class</h1>

<center>I've decided to take my old signature trainer class and add more utilities and functions to it in order to make it much better, efficient and flexible.</center>

In [11]:
class Trainer:
    def __init__(self, config, dataloaders, optimizer, model, loss_fns, scheduler, device="cuda:0", apex=False):
        self.train_loader, self.valid_loader = dataloaders
        self.train_loss_fn, self.valid_loss_fn = loss_fns
        self.scheduler = scheduler
        self.optimizer = optimizer
        self.model = model
        self.device = torch.device(device)
        self.apex = apex
        self.config = config

    def train_one_epoch(self):
        """
        Trains the model for 1 epoch
        """
        if self.apex:
            scaler = GradScaler()

        self.model.train()
        train_pbar = tqdm(enumerate(self.train_loader), total=len(self.train_loader))
        train_preds, train_labels = [], []
        running_loss = 0

        for bnum, data_cache in train_pbar:
            img = self._convert_if_not_tensor(data_cache[0], dtype=torch.float32)
            meta = self._convert_if_not_tensor(data_cache[1], dtype=torch.float32)
            target = self._convert_if_not_tensor(data_cache[2], dtype=torch.float32)
            target = target / 100.0

            bs = img.shape[0]

            # Support of Apex with Mixup 🛠️
            if self.apex:
                # Mixup - allowed
                if torch.randn(1)[0] < 0.5:
                    mix_img, tar_a, tar_b, lam = mixup_augmentation(img, target, alpha=0.5)
                    
                    with autocast(enabled=True):
                        output = self.model(mix_img, meta).squeeze()
                        
                        # Mixup loss calculation
                        loss_a = self.train_loss_fn(output, tar_a)
                        loss_b = self.train_loss_fn(output, tar_b)
                        loss = loss_a * lam + (1 - lam) * loss_b

                        loss = loss / self.config['N_ACCUM']
                    scaler.scale(loss).backward()

                    if (bnum + 1) % self.config['N_ACCUM'] == 0:
                        scaler.step(self.optimizer)
                        scaler.update()
                        optimizer.zero_grad()

                        if self.scheduler:
                            self.scheduler.step()
                    running_loss += (loss.item() * bs)
                
                # Mixup - not allowed
                else:
                    with autocast(enabled=True):
                        output = self.model(img, meta).squeeze()
                        loss = self.train_loss_fn(output, target)
                        loss = loss / self.config['N_ACCUM']
                    scaler.scale(loss).backward()

                    if (bnum + 1) % self.config['N_ACCUM'] == 0:
                        scaler.step(self.optimizer)
                        scaler.update()
                        optimizer.zero_grad()

                        if self.scheduler:
                            self.scheduler.step()
                    running_loss += (loss.item() * bs)
            # No Apex
            else:
                # Mixup - allowed
                if torch.randn(1)[0] < 0.5:
                    mix_img, tar_a, tar_b, lam = mixup_augmentation(img, target, alpha=0.5)
                    output = self.model(mix_img, meta).squeeze()
                    
                    # Mixup loss calculation
                    loss_a = self.train_loss_fn(output, tar_a)
                    loss_b = self.train_loss_fn(output, tar_b)
                    loss = loss_a * lam + (1 - lam) * loss_b
                    
                    loss = loss / self.config['N_ACCUM']
                    loss.backward()

                    if (bnum + 1) % self.config['N_ACCUM'] == 0:
                        self.optimizer.step()
                        optimizer.zero_grad()

                        if self.scheduler:
                            self.scheduler.step()
                    running_loss += (loss.item() * bs)
                
                # Mixup - not allowed
                else:
                    output = self.model(img, meta).squeeze()
                    loss = self.train_loss_fn(output, target)
                    loss = loss / self.config['N_ACCUM']
                    loss.backward()

                    if (bnum + 1) % self.config['N_ACCUM'] == 0:
                        self.optimizer.step()
                        optimizer.zero_grad()

                        if self.scheduler:
                            self.scheduler.step()
                    running_loss += (loss.item() * bs)

            train_pbar.set_description(desc=f"loss: {loss.item():.4f}")
            running_loss /= len(self.train_loader)

            # Rescale the targets and output before chugging in a matrix
            output = output.sigmoid().detach() * 100.0
            target = target.detach() * 100.0
            train_preds += [output.cpu().numpy()]
            train_labels += [target.cpu().numpy()]
        
        all_train_preds = np.concatenate(train_preds)
        all_train_labels = np.concatenate(train_labels)
        
        # Tidy
        del output, target, train_preds, train_labels, loss, img, meta, all_train_preds, all_train_labels
        gc.collect()
        torch.cuda.empty_cache()
        
        return running_loss

    @torch.no_grad()
    def valid_one_epoch(self):
        """
        Validates the model for 1 epoch
        """
        self.model.eval()
        valid_pbar = tqdm(enumerate(self.valid_loader), total=len(self.valid_loader))
        valid_preds, valid_targets = [], []

        for idx, cache in valid_pbar:
            img = self._convert_if_not_tensor(cache[0], dtype=torch.float32)
            meta = self._convert_if_not_tensor(cache[1], dtype=torch.float32)
            target = self._convert_if_not_tensor(cache[2], dtype=torch.float32)
            target = target / 100.0

            output = self.model(img, meta).squeeze()
            valid_loss = torch.sqrt(self.valid_loss_fn(output, target))

            valid_pbar.set_description(desc=f"val_loss: {valid_loss.item():.4f}")

            output = output.sigmoid().detach() * 100.0
            target = target.detach() * 100.0

            valid_preds += [output.cpu().numpy()]
            valid_targets += [target.cpu().numpy()]

        all_valid_preds = np.concatenate(valid_preds)
        all_valid_targets = np.concatenate(valid_targets)

        total_valid_loss = rmse(all_valid_targets, all_valid_preds)
        
        # Tidy
        del img, meta, target, valid_preds, valid_targets, all_valid_targets, output, valid_loss
        gc.collect()
        torch.cuda.empty_cache()
        
        return total_valid_loss, all_valid_preds

    def fit(self, fold: str, epochs: int = 10, output_dir: str = "/kaggle/working/", custom_name: str = 'model.pth'):
        """
        Low-effort alternative for doing the complete training and validation process
        """
        best_loss = int(1e+7)
        best_preds = None
        for epx in range(epochs):
            print(f"{'='*20} Epoch: {epx+1} / {epochs} {'='*20}")

            train_running_loss = self.train_one_epoch()
            print(f"Training loss: {train_running_loss:.4f}")

            valid_loss, preds = self.valid_one_epoch()
            print(f"Validation loss: {valid_loss:.4f}")

            if valid_loss < best_loss:
                best_loss = valid_loss
                self.save_model(output_dir, custom_name)
                print(f"Saved model with val_loss: {best_loss:.4f}")
                best_preds = preds
            
            # Log
            if Config['wandb']:
                wandb_log(
                    train_loss=train_running_loss,
                    val_loss=valid_loss
                )
        return best_preds
            
    def save_model(self, path, name, verbose=False):
        """
        Saves the model at the provided destination
        """
        try:
            if not os.path.exists(path):
                os.makedirs(path)
        except:
            print("Errors encountered while making the output directory")

        torch.save(self.model.state_dict(), os.path.join(path, name))
        if verbose:
            print(f"Model Saved at: {os.path.join(path, name)}")

    def _convert_if_not_tensor(self, x, dtype):
        if self._tensor_check(x):
            return x.to(self.device, dtype=dtype)
        else:
            return torch.tensor(x, dtype=dtype, device=self.device)

    def _tensor_check(self, x):
        return isinstance(x, torch.Tensor)

<h1 align='center' style='color: #8532a8; font-size: 1.5em; font-weight: 300; font-size: 32px'>5. Training Cell</h1>
<center>
    The below cell is where the main training happens.
</center>

In [12]:
if __name__ == '__main__':
    kf = StratifiedKFold(n_splits=Config['N_SPLITS'])
    train_file = pd.read_csv(Config['CSV_PATH'])
    
    for fold_, (train_idx, valid_idx) in enumerate(kf.split(X=train_file, y=train_file['Pawpularity'])):
        print(f"{'='*40} Fold: {fold_+1} / {Config['N_SPLITS']} {'='*40}")
        
        train_ = train_file.loc[train_idx]
        valid_ = train_file.loc[valid_idx]
        
        train_set = PetfinderData(
            df = train_,
            config = Config,
            augments = Augments.train_augments
        )
        valid_set = PetfinderData(
            df = valid_,
            config = Config,
            augments = Augments.valid_augments
        )
        
        train_loader = DataLoader(
            train_set,
            batch_size = Config['TRAIN_BS'],
            shuffle = True,
            num_workers = Config['NUM_WORKERS'],
            pin_memory = True
        )
        
        valid_loader = DataLoader(
            valid_set,
            batch_size = Config['VALID_BS'],
            shuffle = False,
            num_workers = Config['NUM_WORKERS'],
        )
        
        model = RegressionHeadModel(backbone_arch=Config['ARCH'])
        model = model.to(torch.device(Config['DEVICE']))
        if Config['wandb']:
            wandb.watch(model)
            
        optimizer = torch.optim.AdamW(model.parameters(), lr=Config['LR'])
        scheduler = optim.lr_scheduler.CosineAnnealingWarmRestarts(
            optimizer, 
            T_0=Config['T_0'], 
            eta_min=Config['η_min']
        )
        train_lfn, valid_lfn = nn.BCEWithLogitsLoss(), nn.BCEWithLogitsLoss()
        
        trainer = Trainer(
            config = Config,
            dataloaders=(train_loader, valid_loader),
            loss_fns=(train_lfn, valid_lfn),
            optimizer=optimizer,
            model = model,
            scheduler=scheduler,
            apex=True
        )
        
        best_pred = trainer.fit(
            fold = fold_,
            epochs = Config['N_EPOCHS'],
            custom_name = f"poolformer_s36_fold_{fold_}_model.bin"
        )
        
        valid_['preds'] = best_pred
        valid_.to_csv(f"fold_{fold_}_oof_df.csv", index=None)



  0%|          | 0/124 [00:00<?, ?it/s]

Training loss: 0.1499


  0%|          | 0/31 [00:00<?, ?it/s]

Validation loss: 19.2163
Saved model with val_loss: 19.2163


  0%|          | 0/124 [00:00<?, ?it/s]

Training loss: 0.1528


  0%|          | 0/31 [00:00<?, ?it/s]

Validation loss: 18.7811
Saved model with val_loss: 18.7811


  0%|          | 0/124 [00:00<?, ?it/s]

Training loss: 0.1531


  0%|          | 0/31 [00:00<?, ?it/s]

Validation loss: 18.5028
Saved model with val_loss: 18.5028


  0%|          | 0/124 [00:00<?, ?it/s]

Training loss: 0.1475


  0%|          | 0/31 [00:00<?, ?it/s]

Validation loss: 18.5157


  0%|          | 0/124 [00:00<?, ?it/s]

Training loss: 0.1513


  0%|          | 0/31 [00:00<?, ?it/s]

Validation loss: 18.1453
Saved model with val_loss: 18.1453


  0%|          | 0/124 [00:00<?, ?it/s]

Training loss: 0.1471


  0%|          | 0/31 [00:00<?, ?it/s]

Validation loss: 19.5356
Saved model with val_loss: 19.5356


  0%|          | 0/124 [00:00<?, ?it/s]

Training loss: 0.1485


  0%|          | 0/31 [00:00<?, ?it/s]

Validation loss: 18.8079
Saved model with val_loss: 18.8079


  0%|          | 0/124 [00:00<?, ?it/s]

Training loss: 0.1504


  0%|          | 0/31 [00:00<?, ?it/s]

Validation loss: 18.4868
Saved model with val_loss: 18.4868


  0%|          | 0/124 [00:00<?, ?it/s]

Training loss: 0.1509


  0%|          | 0/31 [00:00<?, ?it/s]

Validation loss: 18.2097
Saved model with val_loss: 18.2097


  0%|          | 0/124 [00:00<?, ?it/s]

Training loss: 0.1488


  0%|          | 0/31 [00:00<?, ?it/s]

Validation loss: 18.0794
Saved model with val_loss: 18.0794


  0%|          | 0/124 [00:00<?, ?it/s]

Training loss: 0.1558


  0%|          | 0/31 [00:00<?, ?it/s]

Validation loss: 19.6753
Saved model with val_loss: 19.6753


  0%|          | 0/124 [00:00<?, ?it/s]

Training loss: 0.1531


  0%|          | 0/31 [00:00<?, ?it/s]

Validation loss: 18.9793
Saved model with val_loss: 18.9793


  0%|          | 0/124 [00:00<?, ?it/s]

Training loss: 0.1551


  0%|          | 0/31 [00:00<?, ?it/s]

Validation loss: 18.4080
Saved model with val_loss: 18.4080


  0%|          | 0/124 [00:00<?, ?it/s]

Training loss: 0.1494


  0%|          | 0/31 [00:00<?, ?it/s]

Validation loss: 18.1810
Saved model with val_loss: 18.1810


  0%|          | 0/124 [00:00<?, ?it/s]

Training loss: 0.1476


  0%|          | 0/31 [00:00<?, ?it/s]

Validation loss: 18.3744


  0%|          | 0/124 [00:00<?, ?it/s]

Training loss: 0.1545


  0%|          | 0/31 [00:00<?, ?it/s]

Validation loss: 19.5940
Saved model with val_loss: 19.5940


  0%|          | 0/124 [00:00<?, ?it/s]

Training loss: 0.1561


  0%|          | 0/31 [00:00<?, ?it/s]

Validation loss: 19.1728
Saved model with val_loss: 19.1728


  0%|          | 0/124 [00:00<?, ?it/s]

Training loss: 0.1565


  0%|          | 0/31 [00:00<?, ?it/s]

Validation loss: 18.7390
Saved model with val_loss: 18.7390


  0%|          | 0/124 [00:00<?, ?it/s]

Training loss: 0.1528


  0%|          | 0/31 [00:00<?, ?it/s]

Validation loss: 18.6719
Saved model with val_loss: 18.6719


  0%|          | 0/124 [00:00<?, ?it/s]

Training loss: 0.1538


  0%|          | 0/31 [00:00<?, ?it/s]

Validation loss: 18.3868
Saved model with val_loss: 18.3868


  0%|          | 0/124 [00:00<?, ?it/s]

Training loss: 0.1599


  0%|          | 0/31 [00:00<?, ?it/s]

Validation loss: 19.5721
Saved model with val_loss: 19.5721


  0%|          | 0/124 [00:00<?, ?it/s]

Training loss: 0.1485


  0%|          | 0/31 [00:00<?, ?it/s]

Validation loss: 19.0555
Saved model with val_loss: 19.0555


  0%|          | 0/124 [00:00<?, ?it/s]

Training loss: 0.1563


  0%|          | 0/31 [00:00<?, ?it/s]

Validation loss: 18.5755
Saved model with val_loss: 18.5755


  0%|          | 0/124 [00:00<?, ?it/s]

Training loss: 0.1586


  0%|          | 0/31 [00:00<?, ?it/s]

Validation loss: 18.4000
Saved model with val_loss: 18.4000


  0%|          | 0/124 [00:00<?, ?it/s]

Training loss: 0.1509


  0%|          | 0/31 [00:00<?, ?it/s]

Validation loss: 18.3452
Saved model with val_loss: 18.3452


In [13]:
# Code taken from https://www.kaggle.com/ayuraj/interactive-eda-using-w-b-tables

# Finish the logging run
if Config['wandb']:
    run.finish()

VBox(children=(Label(value=' 0.00MB of 0.00MB uploaded (0.00MB deduped)\r'), FloatProgress(value=1.0, max=1.0)…

0,1
train_loss,▃▄▄▁▃▁▂▃▃▂▆▄▅▂▁▅▆▆▄▅█▂▆▇▃
val_loss,▆▄▃▃▁▇▄▃▂▁█▅▂▁▂█▆▄▄▂█▅▃▂▂

0,1
train_loss,0.15093
val_loss,18.34522


<center>
<img src="https://img.shields.io/badge/Upvote-If%20you%20like%20my%20work-07b3c8?style=for-the-badge&logo=kaggle">
</center>