# EfficientNet - Repurposing/Finetuning
## Introduction

This notebook repurposes and finetunes a pretrained EfficientNet model for American Sign Language (ASL) detection as part of the DSPRO2 project at HSLU.

## Setup
This section installs the required packages and imports all necessary libraries.

In [1]:
%pip install -r requirements.txt

Note: you may need to restart the kernel to use updated packages.


In [2]:
import wandb
import torch
import torch.nn as nn
import torchvision.models as visionmodels
import torchvision.transforms as transforms
import lightning as L

from lightning.pytorch.loggers import WandbLogger

import nbformat

import os

# Our own modules
from datapipeline.asl_image_data_module import ASLImageDataModule
from models.asl_model import ASLModel
from models.training import sweep, train

In [3]:
os.environ["WANDB_NOTEBOOK_NAME"] = "./dspro2/efficientnet.ipynb"

## Preprocessing
No general data preprocessing is necessary. However, random transforms (horizontal flips, color jitter, small rotations, perspective and affine distortions) are applied to the images during training. The images are resized to 224x224 pixels, the native input resolution of EfficientNet-B0, and normalized with the channel-wise mean and standard deviation of the ImageNet dataset, on which the EfficientNet models were pretrained.

The following cells load the dataset and prepare the transforms described above.
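
To make the normalization step concrete, the following minimal sketch applies the same per-channel arithmetic that `transforms.Normalize` performs, `(value - mean) / std`, to single RGB pixels whose values are already scaled to [0, 1] (as `ToTensor` does). The helper `normalize_pixel` is a hypothetical name used only for illustration; it is not part of torchvision or of this project.

```python
# ImageNet channel statistics, as used in the transform pipeline of this notebook
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb):
    """Apply (value - mean) / std to each channel of one RGB pixel in [0, 1]."""
    return tuple((v - m) / s for v, m, s in zip(rgb, IMAGENET_MEAN, IMAGENET_STD))


# A pixel equal to the ImageNet mean maps to zero in every channel
mean_pixel = normalize_pixel(IMAGENET_MEAN)

# A pure white pixel ends up a bit more than two standard deviations above the mean
white_pixel = normalize_pixel((1.0, 1.0, 1.0))

print(mean_pixel)
print(tuple(round(c, 3) for c in white_pixel))
```

After normalization the inputs are roughly zero-centered with unit scale, which is the value range the pretrained weights were trained on.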

In [4]:
PATH = "/exchange/dspro2/silent-speech/ASL_Pictures_Dataset"

In [5]:
img_size = 224

# See https://pytorch.org/vision/master/auto_examples/transforms/plot_transforms_illustrations.html#sphx-glr-auto-examples-transforms-plot-transforms-illustrations-py
# for more examples of transforms

# Open Idea: Grayscale for anti bias


data_transforms = transforms.Compose([
    transforms.Resize((img_size, img_size)),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.1), # Idea: ColorJitter for anti bias
    transforms.RandomRotation(degrees=5),
    transforms.RandomPerspective(distortion_scale=0.5, p=0.5),
    transforms.RandomAffine(0, shear=10, scale=(0.8, 1.2)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]) # ImageNet stats
])

In [6]:
datamodule = ASLImageDataModule(path=PATH, transforms=data_transforms, val_split_folder="Validation", batch_size=32, num_workers=128)

## Models

In [7]:
NUM_CLASSES = 28
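
As a reference point for reading the training logs further down, it can help to know what an uninformed classifier would score. The sketch below assumes perfectly balanced classes and a model that predicts a uniform distribution over all classes; both are simplifying assumptions, and `chance_accuracy` and `chance_loss` are illustrative names.

```python
import math

NUM_CLASSES = 28

# Expected accuracy when guessing uniformly among balanced classes
chance_accuracy = 1 / NUM_CLASSES

# Cross-entropy of a uniform prediction: -log(1 / NUM_CLASSES) = log(NUM_CLASSES)
chance_loss = math.log(NUM_CLASSES)

print(f"{chance_accuracy:.4f}  {chance_loss:.4f}")  # prints "0.0357  3.3322"
```

Accuracies near 0.036 or cross-entropy losses near 3.33 therefore indicate performance close to guessing.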

In [8]:
class ASLEfficientNetRepurpose(nn.Module):
    """Pretrained EfficientNet with a frozen backbone and a new, trainable classification head."""

    def __init__(self, efficientnet_model: visionmodels.efficientnet.EfficientNet, dropout: float = 0.2, num_classes: int = NUM_CLASSES):
        super().__init__()
        self.model = efficientnet_model
        # Freeze all pretrained weights
        self.model.requires_grad_(False)
        # Replace the ImageNet head; the newly created layers are trainable by default
        self.model.classifier = nn.Sequential(
            nn.Dropout(dropout),
            nn.Linear(self.model.classifier[1].in_features, num_classes)
        )

    def forward(self, x):
        return self.model(x)

In [9]:
class ASLEfficientNetFinetune(ASLEfficientNetRepurpose):
    def __init__(self, efficientnet_model: visionmodels.efficientnet.EfficientNet, dropout: float = 0.2, unfreeze_features: int = 1, num_classes: int = NUM_CLASSES):
        super().__init__(efficientnet_model, dropout, num_classes)

        assert unfreeze_features > 0, "unfreeze_features must be greater than 0"
        assert unfreeze_features <= len(self.model.features), "unfreeze_features must be less than or equal to the number of features in the model"

        self.model.features[-unfreeze_features:].requires_grad_(True)
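
The slice `features[-unfreeze_features:]` selects the last blocks of the feature extractor, i.e. the ones closest to the classifier. The following sketch mirrors only that indexing logic in plain Python. The block count of 9 is an assumption about the layout of the torchvision EfficientNet `features` container, and `unfrozen_block_indices` is a hypothetical helper, not part of the project code.

```python
def unfrozen_block_indices(num_blocks: int, unfreeze_features: int) -> list[int]:
    """Return the indices of the feature blocks that would be made trainable."""
    if not 0 < unfreeze_features <= num_blocks:
        raise ValueError("unfreeze_features must be between 1 and num_blocks")
    # Same negative slice as in ASLEfficientNetFinetune
    return list(range(num_blocks))[-unfreeze_features:]


# Assumed: 9 blocks in `features`
last_two = unfrozen_block_indices(num_blocks=9, unfreeze_features=2)
print(last_two)  # [7, 8]
```

Earlier blocks stay frozen, so only the most task-specific layers and the new classifier are updated.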

## Training

In [10]:
TUNE_TYPE = "tune_type"
EFFICIENTNET_MODEL = "efficientnet_model"

DROPOUT = "dropout"

NAME = "name"

In [11]:
def get_pretrained_efficientnet_model(model_type: str) -> visionmodels.efficientnet.EfficientNet:
    # Map each variant name to its torchvision builder and default pretrained weights
    variants = {
        "b0": (visionmodels.efficientnet_b0, visionmodels.EfficientNet_B0_Weights.DEFAULT),
        "b1": (visionmodels.efficientnet_b1, visionmodels.EfficientNet_B1_Weights.DEFAULT),
        "b2": (visionmodels.efficientnet_b2, visionmodels.EfficientNet_B2_Weights.DEFAULT),
        "b3": (visionmodels.efficientnet_b3, visionmodels.EfficientNet_B3_Weights.DEFAULT),
        "b4": (visionmodels.efficientnet_b4, visionmodels.EfficientNet_B4_Weights.DEFAULT),
        "b5": (visionmodels.efficientnet_b5, visionmodels.EfficientNet_B5_Weights.DEFAULT),
        "b6": (visionmodels.efficientnet_b6, visionmodels.EfficientNet_B6_Weights.DEFAULT),
        "b7": (visionmodels.efficientnet_b7, visionmodels.EfficientNet_B7_Weights.DEFAULT),
    }

    # Fail early on unknown variant names
    if model_type not in variants:
        raise ValueError(f"Invalid EfficientNet model type: {model_type}")

    builder, weights = variants[model_type]
    return builder(weights=weights)

In [12]:
def get_asl_efficientnet_model(tune_type: str, efficientnet_model: visionmodels.efficientnet.EfficientNet, dropout: float, unfreeze_features: int = 1) -> nn.Module:
    if tune_type == "repurpose":
        model = ASLEfficientNetRepurpose(efficientnet_model, dropout=dropout)
    elif tune_type == "finetune":
        model = ASLEfficientNetFinetune(efficientnet_model, dropout=dropout, unfreeze_features=unfreeze_features)
    else:
        raise ValueError(f"Invalid model type: {tune_type}")

    return model

In [13]:
OPTIMIZER = "optimizer"
LEARNING_RATE = "learning_rate"
WEIGHT_DECAY = "weight_decay"
MOMENTUM = "momentum"


def get_optimizer(optimizer_params: dict, model: nn.Module) -> torch.optim.Optimizer:
    optimizer = optimizer_params[NAME]
    learning_rate = optimizer_params[LEARNING_RATE]
    weight_decay = optimizer_params[WEIGHT_DECAY]

    if optimizer == "adam":
        return torch.optim.Adam(model.parameters(), lr=learning_rate, weight_decay=weight_decay)
    elif optimizer == "adamw":
        return torch.optim.AdamW(model.parameters(), lr=learning_rate, weight_decay=weight_decay)
    elif optimizer == "rmsprop":
        # The momentum parameter is only used by RMSprop
        momentum = optimizer_params[MOMENTUM]
        return torch.optim.RMSprop(model.parameters(), lr=learning_rate, weight_decay=weight_decay, momentum=momentum)

    # Fail early instead of silently returning None
    raise ValueError(f"Invalid optimizer: {optimizer}")
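
The `weight_decay` value means something different for Adam and AdamW: in PyTorch, Adam adds the decay term to the gradient (L2 regularization), whereas AdamW shrinks the weights directly and leaves the gradient untouched. The scalar sketch below reproduces only the very first update step of each optimizer, where the bias-corrected moment estimates reduce to the gradient itself. The function names and the example numbers are illustrative assumptions, not values from the sweep.

```python
import math


def first_adam_step(p, grad, lr, weight_decay, eps=1e-8):
    """First Adam update with the decay term folded into the gradient (L2 style)."""
    g = grad + weight_decay * p
    # On step 1 the bias-corrected moments are m_hat = g and v_hat = g**2
    return p - lr * g / (math.sqrt(g * g) + eps)


def first_adamw_step(p, grad, lr, weight_decay, eps=1e-8):
    """First AdamW update with decoupled weight decay."""
    p = p * (1.0 - lr * weight_decay)  # decay acts on the weight directly
    return p - lr * grad / (math.sqrt(grad * grad) + eps)


adam_result = first_adam_step(p=1.0, grad=0.5, lr=0.1, weight_decay=0.01)
adamw_result = first_adamw_step(p=1.0, grad=0.5, lr=0.1, weight_decay=0.01)
print(round(adam_result, 4), round(adamw_result, 4))
```

With Adam the decay is absorbed by the gradient normalization, while with AdamW it shows up as an extra shrinkage of the weight, so the same sampled `weight_decay` can have a noticeably different effect depending on the optimizer drawn by the sweep.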

In [14]:
LEARNING_RATE_SCHEDULER = "learning_rate_scheduler"
GAMMA = "gamma"
STEP_SIZE = "step_size"
FACTOR = "factor"


def get_learning_rate_scheduler(learning_rate_scheduler_params: dict, optimizer: torch.optim.Optimizer):
    learning_rate_scheduler = learning_rate_scheduler_params[NAME]
    # The string "None" (not the Python None) disables scheduling
    if learning_rate_scheduler == "None":
        return None

    if learning_rate_scheduler == "step":
        step_size = learning_rate_scheduler_params[STEP_SIZE]
        gamma = learning_rate_scheduler_params[GAMMA]
        return torch.optim.lr_scheduler.StepLR(optimizer, step_size=step_size, gamma=gamma)
    elif learning_rate_scheduler == "exponential":
        gamma = learning_rate_scheduler_params[GAMMA]
        return torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=gamma)
    elif learning_rate_scheduler == "constant":
        factor = learning_rate_scheduler_params[FACTOR]
        return torch.optim.lr_scheduler.ConstantLR(optimizer, factor=factor)

    # Fail early instead of silently training without a scheduler
    raise ValueError(f"Invalid learning rate scheduler: {learning_rate_scheduler}")
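
The three schedulers follow simple closed-form rules, sketched below in plain Python. The sketch assumes that the scheduler is stepped once per epoch (how `ASLModel` actually configures this is not shown in this notebook) and that `ConstantLR` runs with PyTorch's default `total_iters=5`, since `get_learning_rate_scheduler` does not pass that argument. The function names are illustrative.

```python
def step_lr(base_lr: float, gamma: float, step_size: int, epoch: int) -> float:
    """StepLR: multiply the learning rate by gamma every step_size epochs."""
    return base_lr * gamma ** (epoch // step_size)


def exponential_lr(base_lr: float, gamma: float, epoch: int) -> float:
    """ExponentialLR: multiply the learning rate by gamma every epoch."""
    return base_lr * gamma ** epoch


def constant_lr(base_lr: float, factor: float, epoch: int, total_iters: int = 5) -> float:
    """ConstantLR: scale by factor for the first total_iters epochs, then restore base_lr."""
    return base_lr * factor if epoch < total_iters else base_lr


# Configuration of sweep run by8rmfx6 (step scheduler), evaluated at epoch 5
lr_after_five_epochs = step_lr(0.001288325392959786, 0.6146870780047069, step_size=2, epoch=5)
print(round(lr_after_five_epochs, 5))
```

Under these assumptions the result rounds to 0.00049, which agrees with the `lr-RMSprop` value that run by8rmfx6 reports in its summary.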

In [15]:
run_id = 0
SEED = 42


def train_efficient_net():
    global run_id
    run_id += 1

    L.seed_everything(SEED)

    wandb.init(name=f"efficientnet-{run_id}")

    wandb_logger = WandbLogger(log_model=True)

    # TODO: A lot of this could become a library

    config = wandb.config
    efficientnet_model = get_pretrained_efficientnet_model(config[EFFICIENTNET_MODEL])

    run_type = config[TUNE_TYPE]
    model = get_asl_efficientnet_model(run_type, efficientnet_model, dropout=config[DROPOUT], unfreeze_features=2)

    optimizer_params = config[OPTIMIZER]
    optimizer = get_optimizer(optimizer_params, model)

    learning_rate_scheduler_params = config[LEARNING_RATE_SCHEDULER]
    scheduler = get_learning_rate_scheduler(learning_rate_scheduler_params, optimizer)

    asl_model = ASLModel(model=model, criterion=nn.CrossEntropyLoss(), optimizer=optimizer, lr_scheduler=scheduler)

    train(
        model=asl_model,
        datamodule=datamodule,
        logger=wandb_logger,
        seed=SEED
    )

In [16]:
sweep_config = {
    "method": "random",
    "metric": {
        "name": f"{ASLModel.VALID_ACCURACY}",
        "goal": "maximize"
    },
    "early_terminate": {
        "type": "hyperband",
        "min_iter": 5
    },
    "parameters": {
        TUNE_TYPE: {
            "values": ["repurpose"] #, "finetune"]
        },
        EFFICIENTNET_MODEL: {
            "values": ["b0", "b1"] #, "b2", "b3"]
        },
        DROPOUT: {
            "min": 0.1,
            "max": 0.5
        },
        OPTIMIZER: {
            "parameters": {
                NAME: {
                    "values": ["adam", "adamw", "rmsprop"]
                },
                LEARNING_RATE: {
                    "min": 1e-5,
                    "max": 1e-2,
                    "distribution": "log_uniform_values"
                },
                WEIGHT_DECAY: {
                    "min": 0,
                    "max": 1e-3,
                },
                MOMENTUM: {
                    "min": 0.8,
                    "max": 0.99
                }
            }
        },
        LEARNING_RATE_SCHEDULER: {
            "parameters": {
                NAME: {
                    "values": ["None", "step", "exponential", "constant"]
                },
                STEP_SIZE: {
                    "min": 1,
                    "max": 10
                },
                GAMMA: {
                    "min": 0.1,
                    "max": 0.9
                },
                FACTOR: {
                    "min": 0.1,
                    "max": 0.5,
                }
            }
        }
    }
}
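
The learning rate is the only parameter in this sweep that uses `log_uniform_values`, which draws values whose logarithm is uniformly distributed between `log(min)` and `log(max)`. The sketch below imitates that rule with the standard library to show its effect on the range `1e-5` to `1e-2`. It is not the W&B implementation; `sample_log_uniform` and the fixed seed are illustrative assumptions.

```python
import math
import random


def sample_log_uniform(low: float, high: float, rng: random.Random) -> float:
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(42)
samples = [sample_log_uniform(1e-5, 1e-2, rng) for _ in range(1000)]

# Each decade (1e-5..1e-4, 1e-4..1e-3, 1e-3..1e-2) should receive about a third of the draws
share_below_1e_3 = sum(s < 1e-3 for s in samples) / len(samples)
print(f"share of samples below 1e-3: {share_below_1e_3:.2f}")
```

A plain uniform distribution over the same range would place about 90% of the draws above `1e-3`, so small learning rates would hardly be explored.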

In [None]:
sweep(sweep_config, 10, train_efficient_net)

wandb: Using wandb-core as the SDK backend.  Please refer to https://wandb.me/wandb-core for more information.


Create sweep with ID: yualbs7m
Sweep URL: https://wandb.ai/dspro2-silent-speech/silent-speech/sweeps/yualbs7m


wandb: Agent Starting Run: v0dxzle1 with config:
wandb: 	dropout: 0.4080004129219553
wandb: 	efficientnet_model: b0
wandb: 	learning_rate_scheduler: {'factor': 0.3813333361356862, 'gamma': 0.7275501907434817, 'name': 'constant', 'step_size': 6}
wandb: 	optimizer: {'learning_rate': 1.1579640831711996e-05, 'momentum': 0.9782010003214552, 'name': 'adamw', 'weight_decay': 3.534708758912542e-05}
wandb: 	tune_type: repurpose
Seed set to 42
wandb: Currently logged in as: v8-luky (dspro2-silent-speech) to https://api.wandb.ai. Use `wandb login --relogin` to force relogin


Seed set to 42
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
You are using a CUDA device ('NVIDIA A16') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
/opt/conda/lib/python3.12/site-packages/lightning/pytorch/loggers/wandb.py:397: There is a wandb run already in progress and newly created instances of `WandbLogger` will reuse this run. If this is not desired, call `wandb.finish()` before instantiating `WandbLogger`.
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name           | Type                     | Params | Mode 
--------------------------------------------------------------------
0 | model          | ASLEfficientNetRepurpose | 4.0 M  | train
1 | criteri

Metric valid_accuracy improved. New best score: 0.035
Metric train_accuracy improved. New best score: 0.014


Metric valid_accuracy improved by 0.003 >= min_delta = 0.0. New best score: 0.038
Metric train_accuracy improved by 0.019 >= min_delta = 0.0. New best score: 0.033


Metric valid_accuracy improved by 0.003 >= min_delta = 0.0. New best score: 0.041


Metric valid_accuracy improved by 0.004 >= min_delta = 0.0. New best score: 0.045


Metric valid_accuracy improved by 0.008 >= min_delta = 0.0. New best score: 0.053


Metric valid_accuracy improved by 0.006 >= min_delta = 0.0. New best score: 0.058
Metric train_accuracy improved by 0.007 >= min_delta = 0.0. New best score: 0.040


Metric valid_accuracy improved by 0.012 >= min_delta = 0.0. New best score: 0.070


Metric valid_accuracy improved by 0.010 >= min_delta = 0.0. New best score: 0.080


Metric valid_accuracy improved by 0.015 >= min_delta = 0.0. New best score: 0.095


Metric valid_accuracy improved by 0.015 >= min_delta = 0.0. New best score: 0.110


Metric valid_accuracy improved by 0.013 >= min_delta = 0.0. New best score: 0.123
Metric train_accuracy improved by 0.001 >= min_delta = 0.0. New best score: 0.041


Metric valid_accuracy improved by 0.011 >= min_delta = 0.0. New best score: 0.135
Metric train_accuracy improved by 0.002 >= min_delta = 0.0. New best score: 0.044


Metric valid_accuracy improved by 0.014 >= min_delta = 0.0. New best score: 0.149
Metric train_accuracy improved by 0.001 >= min_delta = 0.0. New best score: 0.045


Metric valid_accuracy improved by 0.014 >= min_delta = 0.0. New best score: 0.163
Metric train_accuracy improved by 0.003 >= min_delta = 0.0. New best score: 0.048


Metric valid_accuracy improved by 0.017 >= min_delta = 0.0. New best score: 0.181
Metric train_accuracy improved by 0.001 >= min_delta = 0.0. New best score: 0.050


Metric valid_accuracy improved by 0.002 >= min_delta = 0.0. New best score: 0.183
Metric train_accuracy improved by 0.003 >= min_delta = 0.0. New best score: 0.052


Metric valid_accuracy improved by 0.004 >= min_delta = 0.0. New best score: 0.187
Metric train_accuracy improved by 0.003 >= min_delta = 0.0. New best score: 0.055


Metric valid_accuracy improved by 0.017 >= min_delta = 0.0. New best score: 0.204
Metric train_accuracy improved by 0.000 >= min_delta = 0.0. New best score: 0.055


Metric valid_accuracy improved by 0.007 >= min_delta = 0.0. New best score: 0.211
Metric train_accuracy improved by 0.004 >= min_delta = 0.0. New best score: 0.059


Metric valid_accuracy improved by 0.002 >= min_delta = 0.0. New best score: 0.212
Metric train_accuracy improved by 0.002 >= min_delta = 0.0. New best score: 0.061


Metric valid_accuracy improved by 0.015 >= min_delta = 0.0. New best score: 0.227
Metric train_accuracy improved by 0.003 >= min_delta = 0.0. New best score: 0.064


Metric train_accuracy improved by 0.000 >= min_delta = 0.0. New best score: 0.064


Metric valid_accuracy improved by 0.012 >= min_delta = 0.0. New best score: 0.239
Metric train_accuracy improved by 0.002 >= min_delta = 0.0. New best score: 0.066


Metric train_accuracy improved by 0.002 >= min_delta = 0.0. New best score: 0.067


Metric valid_accuracy improved by 0.008 >= min_delta = 0.0. New best score: 0.247
Metric train_accuracy improved by 0.001 >= min_delta = 0.0. New best score: 0.069


Metric valid_accuracy improved by 0.000 >= min_delta = 0.0. New best score: 0.248
Metric train_accuracy improved by 0.002 >= min_delta = 0.0. New best score: 0.071


Metric valid_accuracy improved by 0.007 >= min_delta = 0.0. New best score: 0.254
Metric train_accuracy improved by 0.001 >= min_delta = 0.0. New best score: 0.072


Metric valid_accuracy improved by 0.002 >= min_delta = 0.0. New best score: 0.257
Metric train_accuracy improved by 0.000 >= min_delta = 0.0. New best score: 0.073


Metric train_accuracy improved by 0.002 >= min_delta = 0.0. New best score: 0.075


Metric valid_accuracy improved by 0.014 >= min_delta = 0.0. New best score: 0.271
Metric train_accuracy improved by 0.003 >= min_delta = 0.0. New best score: 0.077


Metric valid_accuracy improved by 0.003 >= min_delta = 0.0. New best score: 0.273
Metric train_accuracy improved by 0.002 >= min_delta = 0.0. New best score: 0.079


Metric valid_accuracy improved by 0.005 >= min_delta = 0.0. New best score: 0.278
Metric train_accuracy improved by 0.000 >= min_delta = 0.0. New best score: 0.079


Metric train_accuracy improved by 0.001 >= min_delta = 0.0. New best score: 0.081


Metric valid_accuracy improved by 0.009 >= min_delta = 0.0. New best score: 0.287
Metric train_accuracy improved by 0.002 >= min_delta = 0.0. New best score: 0.082


Metric train_accuracy improved by 0.000 >= min_delta = 0.0. New best score: 0.082


Metric train_accuracy improved by 0.001 >= min_delta = 0.0. New best score: 0.084


Metric valid_accuracy improved by 0.011 >= min_delta = 0.0. New best score: 0.298
Metric train_accuracy improved by 0.002 >= min_delta = 0.0. New best score: 0.086


Metric train_accuracy improved by 0.000 >= min_delta = 0.0. New best score: 0.086


Metric valid_accuracy improved by 0.004 >= min_delta = 0.0. New best score: 0.302
Metric train_accuracy improved by 0.000 >= min_delta = 0.0. New best score: 0.087


Metric train_accuracy improved by 0.002 >= min_delta = 0.0. New best score: 0.088


Metric train_accuracy improved by 0.000 >= min_delta = 0.0. New best score: 0.089


Metric valid_accuracy improved by 0.008 >= min_delta = 0.0. New best score: 0.310
Metric train_accuracy improved by 0.003 >= min_delta = 0.0. New best score: 0.091


Metric train_accuracy improved by 0.001 >= min_delta = 0.0. New best score: 0.092


Metric train_accuracy improved by 0.000 >= min_delta = 0.0. New best score: 0.092


Metric train_accuracy improved by 0.001 >= min_delta = 0.0. New best score: 0.093


Monitored metric valid_accuracy did not improve in the last 5 records. Best score: 0.310. Signaling Trainer to stop.
Metric train_accuracy improved by 0.001 >= min_delta = 0.0. New best score: 0.094


Run history:
metric,sparkline
epoch,▁▁▁▁▂▂▂▃▃▃▃▃▃▃▄▄▄▄▄▄▄▅▅▅▅▅▆▆▆▆▆▆▆▆▆▆▇▇▇█
lr-AdamW,▁▁▁▁▁███████████████████████████████████
lr-AdamW-momentum,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
lr-AdamW-weight_decay,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
train_accuracy_epoch,▁▃▂▃▃▃▃▃▃▃▄▄▄▄▅▅▅▅▅▆▆▆▆▆▆▇▇▇▇▇▇▇▇▇██████
train_accuracy_step,▂▅▁▂▁▁▇▁▂▁▃▂▁▆▂▁▁▃▁▅▁▂▁▁▁▅▃▁█▅▁▁▁▂▁▄▇▁▁▅
train_loss_epoch,█▄▄▄▄▃▄▄▄▄▃▃▃▃▃▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁
train_loss_step,█▅▅▄▄▄▆▄▆▄▄▁▆▆▇▆▄▇▄▅▆▅▃▄▃▆▇▃▅▅▄▂▆▄▃▂▃▄▆▆
trainer/global_step,▂▁▁▃▁▂▂▂▅▂▂▂▂▂▂▂▂▂▃▃▃▃▃█▃▃▃▄▄▄▄▄▄▅▅▅▅▅▅▅
valid_accuracy_epoch,▁▁▁▁▁▂▂▃▃▃▄▄▅▅▅▅▅▆▆▆▆▆▆▇▇▇▇▇▇▇▇█████████

Run summary:
metric,value
epoch,46.0
lr-AdamW,1e-05
lr-AdamW-momentum,0.9
lr-AdamW-weight_decay,4e-05
train_accuracy_epoch,0.09421
train_accuracy_step,0.09375
train_loss_epoch,3.24302
train_loss_step,3.05685
trainer/global_step,146733.0
valid_accuracy_epoch,0.30902


wandb: Agent Starting Run: hmgpqr7e with config:
wandb: 	dropout: 0.17138116088622649
wandb: 	efficientnet_model: b0
wandb: 	learning_rate_scheduler: {'factor': 0.21845290803683823, 'gamma': 0.40059128257938825, 'name': 'step', 'step_size': 1}
wandb: 	optimizer: {'learning_rate': 0.0005751594692431429, 'momentum': 0.9215841094632776, 'name': 'adamw', 'weight_decay': 0.0002486823612655453}
wandb: 	tune_type: repurpose
Seed set to 42


Seed set to 42
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
/opt/conda/lib/python3.12/site-packages/lightning/pytorch/loggers/wandb.py:397: There is a wandb run already in progress and newly created instances of `WandbLogger` will reuse this run. If this is not desired, call `wandb.finish()` before instantiating `WandbLogger`.
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name           | Type                     | Params | Mode 
--------------------------------------------------------------------
0 | model          | ASLEfficientNetRepurpose | 4.0 M  | train
1 | criterion      | CrossEntropyLoss         | 0      | train
2 | train_accuracy | MulticlassAccuracy       | 0      | train
3 | valid_accuracy | MulticlassAccuracy       | 0      | train
4 | test_accuracy  | MulticlassAccuracy       | 0      | train
--------------------------------------------------------------------
35.9 K    Trainable params
4.0 M     Non-

Metric valid_accuracy improved. New best score: 0.044
Metric train_accuracy improved. New best score: 0.820


Metric valid_accuracy improved by 0.004 >= min_delta = 0.0. New best score: 0.049


Metric valid_accuracy improved by 0.009 >= min_delta = 0.0. New best score: 0.058


Metric valid_accuracy improved by 0.043 >= min_delta = 0.0. New best score: 0.101


Metric valid_accuracy improved by 0.083 >= min_delta = 0.0. New best score: 0.184


Metric valid_accuracy improved by 0.042 >= min_delta = 0.0. New best score: 0.226
Monitored metric train_accuracy did not improve in the last 5 records. Best score: 0.820. Signaling Trainer to stop.


Run history:
metric,sparkline
epoch,▁▁▁▁▁▂▂▂▂▂▂▂▂▂▄▄▄▄▄▄▅▅▅▅▇▇▇▇▇▇▇▇████████
lr-AdamW,███████▄▄▄▄▄▄▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
lr-AdamW-momentum,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
lr-AdamW-weight_decay,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
train_accuracy_epoch,█▇▅▂▁▁
train_accuracy_step,█▅▁██████▁█▁█▁███▁▁▁▁▁█▁███▁▁▁▁▄▁▁▁▁▁▁▁▁
train_loss_epoch,▁▁▅▇██
train_loss_step,▁▁▁▁▁▅█▁▁▁▁█▁▁▅▁▅▂▁▃▁▅▃▃▃▄▃▃▂▄▃▃▃▃▃▄▃▃▃▃
trainer/global_step,▁▁▁▁▁▁▃▁▁▁▂▂▂▂▂▂▂▅▂▂▂▂▂▂▂▃▃▃▃▃▃█▃▃▃▃▃▃▃▃
valid_accuracy_epoch,▁▁▂▃▆█

Run summary:
metric,value
epoch,5.0
lr-AdamW,1e-05
lr-AdamW-momentum,0.9
lr-AdamW-weight_decay,0.00025
train_accuracy_epoch,0.05475
train_accuracy_step,0.09375
train_loss_epoch,3.38633
train_loss_step,3.02582
trainer/global_step,18731.0
valid_accuracy_epoch,0.2259


wandb: Agent Starting Run: by8rmfx6 with config:
wandb: 	dropout: 0.3957734738050327
wandb: 	efficientnet_model: b0
wandb: 	learning_rate_scheduler: {'factor': 0.15105690330700564, 'gamma': 0.6146870780047069, 'name': 'step', 'step_size': 2}
wandb: 	optimizer: {'learning_rate': 0.001288325392959786, 'momentum': 0.919629549540921, 'name': 'rmsprop', 'weight_decay': 9.04223471227994e-05}
wandb: 	tune_type: repurpose
Seed set to 42


Seed set to 42
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
/opt/conda/lib/python3.12/site-packages/lightning/pytorch/loggers/wandb.py:397: There is a wandb run already in progress and newly created instances of `WandbLogger` will reuse this run. If this is not desired, call `wandb.finish()` before instantiating `WandbLogger`.
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name           | Type                     | Params | Mode 
--------------------------------------------------------------------
0 | model          | ASLEfficientNetRepurpose | 4.0 M  | train
1 | criterion      | CrossEntropyLoss         | 0      | train
2 | train_accuracy | MulticlassAccuracy       | 0      | train
3 | valid_accuracy | MulticlassAccuracy       | 0      | train
4 | test_accuracy  | MulticlassAccuracy       | 0      | train
--------------------------------------------------------------------
35.9 K    Trainable params
4.0 M     Non-

Metric valid_accuracy improved. New best score: 0.037
Metric train_accuracy improved. New best score: 0.824


Metric valid_accuracy improved by 0.001 >= min_delta = 0.0. New best score: 0.038


Metric valid_accuracy improved by 0.000 >= min_delta = 0.0. New best score: 0.038


Monitored metric train_accuracy did not improve in the last 5 records. Best score: 0.824. Signaling Trainer to stop.


Run history:
metric,sparkline
epoch,▁▁▁▁▁▁▁▁▁▂▂▂▂▂▄▄▄▄▄▄▅▅▅▅▅▅▇▇▇▇▇▇▇███████
lr-RMSprop,███████████████▄▄▄▄▄▄▄▄▄▄▄▄▄▄▁▁▁▁▁▁▁▁▁▁▁
lr-RMSprop-momentum,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
lr-RMSprop-weight_decay,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
train_accuracy_epoch,█▃▁▃▁▃
train_accuracy_step,███▁███▁█▁▁███████▁█████▁██▁████████████
train_loss_epoch,▆█▄▄▁▁
train_loss_step,▁▁▁▁██▁▁▁▁▁▁▁▁▁▁▁▁▁▅▁▁▅▁▁▁▁▁▁▁▁▂▁▁▄▂▁▁▁▁
trainer/global_step,▁▁▁▁▁▁▁▁▃▁▂▂▂▄▂▂▂▂▂▂▆▂▂▂▂▂▃▃▃▆▃▃▃▃█▃▃▃▃▃
valid_accuracy_epoch,▃▇▃█▁▂

Run summary:
metric,value
epoch,5.0
lr-RMSprop,0.00049
lr-RMSprop-momentum,0.91963
lr-RMSprop-weight_decay,9e-05
train_accuracy_epoch,0.80161
train_accuracy_step,1.0
train_loss_epoch,10.68452
train_loss_step,0.0
trainer/global_step,18731.0
valid_accuracy_epoch,0.0368


wandb: Agent Starting Run: jgnxahgo with config:
wandb: 	dropout: 0.40418130331998714
wandb: 	efficientnet_model: b1
wandb: 	learning_rate_scheduler: {'factor': 0.4205647724754341, 'gamma': 0.822455561906846, 'name': 'exponential', 'step_size': 2}
wandb: 	optimizer: {'learning_rate': 4.190582618087689e-05, 'momentum': 0.9453285146767646, 'name': 'adam', 'weight_decay': 0.0006505026665179994}
wandb: 	tune_type: repurpose
Seed set to 42


Seed set to 42
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
/opt/conda/lib/python3.12/site-packages/lightning/pytorch/loggers/wandb.py:397: There is a wandb run already in progress and newly created instances of `WandbLogger` will reuse this run. If this is not desired, call `wandb.finish()` before instantiating `WandbLogger`.
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name           | Type                     | Params | Mode 
--------------------------------------------------------------------
0 | model          | ASLEfficientNetRepurpose | 6.5 M  | train
1 | criterion      | CrossEntropyLoss         | 0      | train
2 | train_accuracy | MulticlassAccuracy       | 0      | train
3 | valid_accuracy | MulticlassAccuracy       | 0      | train
4 | test_accuracy  | MulticlassAccuracy       | 0      | train
--------------------------------------------------------------------
35.9 K    Trainable params
6.5 M     Non-

Metric valid_accuracy improved. New best score: 0.077
Metric train_accuracy improved. New best score: 0.025


Metric valid_accuracy improved by 0.062 >= min_delta = 0.0. New best score: 0.139
Metric train_accuracy improved by 0.040 >= min_delta = 0.0. New best score: 0.065


Metric valid_accuracy improved by 0.064 >= min_delta = 0.0. New best score: 0.203


Metric valid_accuracy improved by 0.055 >= min_delta = 0.0. New best score: 0.258


Metric valid_accuracy improved by 0.043 >= min_delta = 0.0. New best score: 0.301


Metric valid_accuracy improved by 0.022 >= min_delta = 0.0. New best score: 0.324
Metric train_accuracy improved by 0.001 >= min_delta = 0.0. New best score: 0.066


Metric valid_accuracy improved by 0.029 >= min_delta = 0.0. New best score: 0.353
Metric train_accuracy improved by 0.005 >= min_delta = 0.0. New best score: 0.071


Metric train_accuracy improved by 0.004 >= min_delta = 0.0. New best score: 0.074


Metric valid_accuracy improved by 0.021 >= min_delta = 0.0. New best score: 0.373
Metric train_accuracy improved by 0.006 >= min_delta = 0.0. New best score: 0.080


Metric valid_accuracy improved by 0.009 >= min_delta = 0.0. New best score: 0.383
Metric train_accuracy improved by 0.005 >= min_delta = 0.0. New best score: 0.085


Metric valid_accuracy improved by 0.009 >= min_delta = 0.0. New best score: 0.391
Metric train_accuracy improved by 0.006 >= min_delta = 0.0. New best score: 0.091


Metric valid_accuracy improved by 0.000 >= min_delta = 0.0. New best score: 0.391
Metric train_accuracy improved by 0.002 >= min_delta = 0.0. New best score: 0.093


Metric valid_accuracy improved by 0.001 >= min_delta = 0.0. New best score: 0.393
Metric train_accuracy improved by 0.003 >= min_delta = 0.0. New best score: 0.096


Metric valid_accuracy improved by 0.006 >= min_delta = 0.0. New best score: 0.398
Metric train_accuracy improved by 0.006 >= min_delta = 0.0. New best score: 0.102


Metric train_accuracy improved by 0.005 >= min_delta = 0.0. New best score: 0.107


Metric valid_accuracy improved by 0.003 >= min_delta = 0.0. New best score: 0.401
Metric train_accuracy improved by 0.001 >= min_delta = 0.0. New best score: 0.109


Metric valid_accuracy improved by 0.000 >= min_delta = 0.0. New best score: 0.401
Metric train_accuracy improved by 0.002 >= min_delta = 0.0. New best score: 0.111


Validation: |          | 0/? [00:00<?, ?it/s]

Metric valid_accuracy improved by 0.006 >= min_delta = 0.0. New best score: 0.408
Metric train_accuracy improved by 0.002 >= min_delta = 0.0. New best score: 0.113


Validation: |          | 0/? [00:00<?, ?it/s]

Metric train_accuracy improved by 0.003 >= min_delta = 0.0. New best score: 0.116


Validation: |          | 0/? [00:00<?, ?it/s]

Metric train_accuracy improved by 0.001 >= min_delta = 0.0. New best score: 0.116


Validation: |          | 0/? [00:00<?, ?it/s]

Metric valid_accuracy improved by 0.006 >= min_delta = 0.0. New best score: 0.414


Validation: |          | 0/? [00:00<?, ?it/s]

Metric train_accuracy improved by 0.002 >= min_delta = 0.0. New best score: 0.118


Validation: |          | 0/? [00:00<?, ?it/s]

Metric train_accuracy improved by 0.002 >= min_delta = 0.0. New best score: 0.120


Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Metric train_accuracy improved by 0.001 >= min_delta = 0.0. New best score: 0.121


Validation: |          | 0/? [00:00<?, ?it/s]

Monitored metric valid_accuracy did not improve in the last 5 records. Best score: 0.414. Signaling Trainer to stop.


0,1
epoch,▁▁▁▁▁▁▂▂▂▂▂▃▃▃▃▄▄▄▄▄▄▄▅▅▅▅▅▅▆▆▆▆▇▇▇▇▇▇██
lr-Adam,█▇▇▇▇▆▆▆▅▄▃▃▃▃▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
lr-Adam-momentum,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
lr-Adam-weight_decay,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
train_accuracy_epoch,▁▄▃▃▃▄▄▅▅▅▆▆▆▇▇▇▇▇████████
train_accuracy_step,▂▁▁▃▄▁▂▆▁▂▂▁▁▃▂▁▆▂▂▁▁▄▂▃▁▅▃▄▃▁▃█▆▄▂▆▃▃▃▃
train_loss_epoch,█▂▂▂▂▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁
train_loss_step,▁▃█▃▃▃▂▂▃▃▃▂▂▂▁▂▂▂▂▂▁▃▂▂▂▂▂▂▂▂▂▃▃▂▂▁▃▂▂▂
trainer/global_step,▁▁▁▁▁▁▁▁▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▅▃▃▃▃▃▃▃▃█▃▃▄▄
valid_accuracy_epoch,▁▂▄▅▆▆▇▇▇▇████████████████

0,1
epoch,25.0
lr-Adam,0.0
lr-Adam-momentum,0.9
lr-Adam-weight_decay,0.00065
train_accuracy_epoch,0.12063
train_accuracy_step,0.0625
train_loss_epoch,3.24564
train_loss_step,3.30182
trainer/global_step,81171.0
valid_accuracy_epoch,0.40165


[34m[1mwandb[0m: Agent Starting Run: 24ke0805 with config:
[34m[1mwandb[0m: 	dropout: 0.23816209954497733
[34m[1mwandb[0m: 	efficientnet_model: b0
[34m[1mwandb[0m: 	learning_rate_scheduler: {'factor': 0.24043020306948895, 'gamma': 0.4291725038510038, 'name': 'exponential', 'step_size': 8}
[34m[1mwandb[0m: 	optimizer: {'learning_rate': 0.00020590397173720729, 'momentum': 0.9753472422131564, 'name': 'rmsprop', 'weight_decay': 0.0005953443109458515}
[34m[1mwandb[0m: 	tune_type: repurpose
Seed set to 42


Seed set to 42
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
/opt/conda/lib/python3.12/site-packages/lightning/pytorch/loggers/wandb.py:397: There is a wandb run already in progress and newly created instances of `WandbLogger` will reuse this run. If this is not desired, call `wandb.finish()` before instantiating `WandbLogger`.
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name           | Type                     | Params | Mode 
--------------------------------------------------------------------
0 | model          | ASLEfficientNetRepurpose | 4.0 M  | train
1 | criterion      | CrossEntropyLoss         | 0      | train
2 | train_accuracy | MulticlassAccuracy       | 0      | train
3 | valid_accuracy | MulticlassAccuracy       | 0      | train
4 | test_accuracy  | MulticlassAccuracy       | 0      | train
--------------------------------------------------------------------
35.9 K    Trainable params
4.0 M     Non-

Sanity Checking: |          | 0/? [00:00<?, ?it/s]

Training: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Metric valid_accuracy improved. New best score: 0.044
Metric train_accuracy improved. New best score: 0.402


Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Metric valid_accuracy improved by 0.001 >= min_delta = 0.0. New best score: 0.045
Monitored metric train_accuracy did not improve in the last 5 records. Best score: 0.402. Signaling Trainer to stop.


0,1
epoch,▁▁▁▁▁▁▁▂▂▂▂▂▂▂▂▄▄▄▄▅▅▅▅▅▅▇▇▇▇▇▇▇▇▇██████
lr-RMSprop,██████████▄▄▄▄▄▄▄▄▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
lr-RMSprop-momentum,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
lr-RMSprop-weight_decay,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
train_accuracy_epoch,█▆▆▅▂▁
train_accuracy_step,██▁█▁▁▅█▁█▁▁▁▁▁▁█▁█▁▁▁███▁▁▁▁█▁▁▁▁▁▁▁▁▁▁
train_loss_epoch,█▅▃▁▁▁
train_loss_step,█▁▁█▄▅▁▆▇▄▄▂▂▂▂▃▁▂▂▁▂▁▁▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
trainer/global_step,▁▂▁▁▁▁▁▁▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▆▂▂▃▃▃▃▃▃▃▃▇█▃▃▃
valid_accuracy_epoch,▇▁▁▁▇█

0,1
epoch,5.0
lr-RMSprop,0.0
lr-RMSprop-momentum,0.97535
lr-RMSprop-weight_decay,0.0006
train_accuracy_epoch,6e-05
train_accuracy_step,0.0
train_loss_epoch,5.44769
train_loss_step,4.00414
trainer/global_step,18731.0
valid_accuracy_epoch,0.04479


[34m[1mwandb[0m: Agent Starting Run: 34qctx9g with config:
[34m[1mwandb[0m: 	dropout: 0.39192856465128056
[34m[1mwandb[0m: 	efficientnet_model: b1
[34m[1mwandb[0m: 	learning_rate_scheduler: {'factor': 0.39803760604193605, 'gamma': 0.5245179379083522, 'name': 'None', 'step_size': 10}
[34m[1mwandb[0m: 	optimizer: {'learning_rate': 0.00025727726907342184, 'momentum': 0.9098549546937296, 'name': 'rmsprop', 'weight_decay': 0.0009300368193308384}
[34m[1mwandb[0m: 	tune_type: repurpose
Seed set to 42


Seed set to 42
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
/opt/conda/lib/python3.12/site-packages/lightning/pytorch/loggers/wandb.py:397: There is a wandb run already in progress and newly created instances of `WandbLogger` will reuse this run. If this is not desired, call `wandb.finish()` before instantiating `WandbLogger`.
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name           | Type                     | Params | Mode 
--------------------------------------------------------------------
0 | model          | ASLEfficientNetRepurpose | 6.5 M  | train
1 | criterion      | CrossEntropyLoss         | 0      | train
2 | train_accuracy | MulticlassAccuracy       | 0      | train
3 | valid_accuracy | MulticlassAccuracy       | 0      | train
4 | test_accuracy  | MulticlassAccuracy       | 0      | train
--------------------------------------------------------------------
35.9 K    Trainable params
6.5 M     Non-

Sanity Checking: |          | 0/? [00:00<?, ?it/s]

Training: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Metric valid_accuracy improved. New best score: 0.049
Metric train_accuracy improved. New best score: 0.714


Validation: |          | 0/? [00:00<?, ?it/s]

Metric valid_accuracy improved by 0.037 >= min_delta = 0.0. New best score: 0.086


Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]
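
The stopping messages in the sweep output ("Monitored metric ... did not improve in the last 5 records") look like patience-based early stopping with `patience=5` and `min_delta=0.0`, applied to both `valid_accuracy` and `train_accuracy`. The actual callback setup lives in `models.training` and is not shown here, so the following is only a minimal pure-Python sketch of that rule, not the project's implementation. The function name `first_stop_epoch` is hypothetical, and apart from the logged first-epoch best of 0.402, the score values are made up for illustration.

```python
def first_stop_epoch(scores, patience=5, min_delta=0.0):
    """Return the index of the epoch at which patience-based early stopping
    (higher is better) would signal a stop, or None if it never triggers."""
    best = float("-inf")
    wait = 0  # consecutive epochs without improvement
    for epoch, score in enumerate(scores):
        if score - min_delta > best:
            # Strict improvement over the best score so far: reset the counter.
            best = score
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Illustrative values only: a high first-epoch score followed by five epochs
# that never beat it, similar in shape to the train_accuracy curve of run 24ke0805.
illustrative_scores = [0.402, 0.3, 0.3, 0.2, 0.1, 0.0]
stop_epoch = first_stop_epoch(illustrative_scores)
print(stop_epoch)
```

Under this rule a run whose monitored metric peaks in the first epoch stops after exactly `patience` further epochs, which would explain why run 24ke0805 ended at epoch 5 even though `valid_accuracy` had just improved. Whether `train_accuracy` should be allowed to trigger a stop at all is a design choice worth revisiting.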

## Evaluation