# Hyperparameter search for the TinyViT model on the CIFAR100 dataset in its original and augmented form

This notebook searches for the best hyperparameters for the TinyViT model on the CIFAR100 dataset. The search is run on both the original and the augmented dataset, for standard training as well as for distillation.

The search uses the Optuna library with the Hyperband algorithm (configured below as the pruner). The best configuration is selected by F1 score, and 150 hyperparameter combinations are tried for each variant.
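
The notebook does not show how `base.compute_metrics` computes the F1 score. In the logs below the Recall column equals Accuracy in every row, which is consistent with support-weighted averaging, so the following is a minimal sketch under that assumption. The function name `weighted_f1` and the toy labels are hypothetical and are not taken from the `base` module.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (assumed averaging mode)."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for cls, n_true in support.items():
        # True positives and predicted count for this class.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        n_pred = sum(1 for p in y_pred if p == cls)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        # Each class contributes in proportion to its number of true samples.
        score += f1 * n_true / total
    return score


# Toy example with three classes, two samples each.
labels = [0, 0, 1, 1, 2, 2]
preds = [0, 1, 1, 1, 2, 0]
print(round(weighted_f1(labels, preds), 4))
```

With balanced classes, as in this toy example, the weighted score is simply the mean of the per-class F1 values.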

## Library imports and method definitions

In [None]:
from transformers import Trainer, AutoModelForImageClassification
from torch.utils.data import ConcatDataset
import optuna
import torch
import math
import base
import os


In [None]:
dataset_part = base.get_dataset_part()

Reset the random seed so that results are reproducible.

In [None]:
base.reset_seed()

Check GPU availability.

In [4]:
if torch.cuda.is_available():
    device = torch.device("cuda")
    print("GPU is available and will be used:", torch.cuda.get_device_name(0))
else:
    device = torch.device("cpu")
    print("GPU is not available, using CPU.")

GPU is available and will be used: NVIDIA H100 PCIe


Load the dataset and apply the base and augmentation transforms.

In [5]:
DATASET = "cifar100"

In [None]:
transform = base.base_transforms()

test = base.CustomCIFAR100L(root=f"{os.path.expanduser('~')}/data/100-logits", dataset_part=dataset_part.TEST, transform=transform)
train = base.CustomCIFAR100L(root=f"{os.path.expanduser('~')}/data/100-logits", dataset_part=dataset_part.TRAIN, transform=transform)
eval = base.CustomCIFAR100L(root=f"{os.path.expanduser('~')}/data/100-logits", dataset_part=dataset_part.EVAL, transform=transform)

In [7]:
augment_transform = base.aug_transforms()
train_aug = base.CustomCIFAR100L(root=f"{os.path.expanduser('~')}/data/100-logits", dataset_part=dataset_part.TRAIN, transform=augment_transform)

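Filter the augmented dataset using the described mechanism: augmented samples whose predicted class, taken from the saved logits, differs from that of the original sample are removed.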

In [8]:
train_aug = base.remove_diff_pred_class(train, train_aug, pytorch_dataset=True)
train_combo = ConcatDataset([train, train_aug])

Removing entries from augmented dataset that are different from the base one - based on saved logits:   0%|   …

Base training configuration used during the search. The Optuna pruner works with steps rather than epochs, so the epoch counts are converted to steps below.

The minimum training length is two epochs and the maximum is seven. The upper bound on the number of warm-up steps is set to 10% of the first epoch.

In [10]:
num_epochs = 7
batch_size = 128

In [None]:
data_length = len(train)
min_r = math.ceil(data_length/batch_size)*2
max_r = math.ceil(data_length/batch_size)*num_epochs
warm_up = math.ceil(data_length/batch_size/10)
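
The conversion above can be checked with a short worked example. The sketch below assumes 40,000 training samples; this number is not stated in the notebook and was chosen only because it is consistent with the largest `warmup_steps` value (32) that appears in the search logs. The name `steps_per_epoch` is introduced here for readability.

```python
import math

# Assumed value for illustration; the notebook uses len(train) instead.
data_length = 40_000
batch_size = 128
num_epochs = 7

# One epoch is the number of batches needed to see every sample once.
steps_per_epoch = math.ceil(data_length / batch_size)

min_r = steps_per_epoch * 2           # shortest run: two epochs
max_r = steps_per_epoch * num_epochs  # longest run: seven epochs
warm_up = math.ceil(data_length / batch_size / 10)  # 10% of one epoch

print(steps_per_epoch, min_r, max_r, warm_up)
```

Under this assumption an epoch is 313 steps, so the pruner sees a resource range of 626 to 2191 steps and warm-up is capped at 32 steps.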

## Search with standard training on the original dataset
Definition of the searched hyperparameters and their ranges.

In [12]:
def hp_space(trial):
    params = {
        "learning_rate": trial.suggest_float("learning_rate", 5e-5, 5e-3, log=True),
        "weight_decay": trial.suggest_float("weight_decay", 0, 1e-2, step=1e-3),
        "warmup_steps": trial.suggest_int("warmup_steps", 0, warm_up),
    }
    print(f"Trial {trial.number} with params: {params}")
    return params
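
To make the shape of this search space concrete, the sketch below draws values with the standard library only. It illustrates what `log=True` and `step=1e-3` mean for the ranges above; it is not the TPE sampler used in the search, and the helper names `sample_log_uniform` and `sample_stepped` are hypothetical.

```python
import math
import random

rng = random.Random(0)  # fixed seed only so this sketch is repeatable


def sample_log_uniform(low, high):
    """Draw uniformly in log space, so every decade is equally likely."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def sample_stepped(low, high, step):
    """Draw one value from the grid low, low + step, ..., high."""
    n_points = round((high - low) / step) + 1
    return low + step * rng.randrange(n_points)


# Learning rate: log-uniform between 5e-5 and 5e-3.
learning_rates = [sample_log_uniform(5e-5, 5e-3) for _ in range(1000)]
# Weight decay: one of the 11 grid values 0.000, 0.001, ..., 0.010.
weight_decays = {round(sample_stepped(0.0, 1e-2, 1e-3), 3) for _ in range(1000)}

# 5e-4 is the geometric midpoint of the range, so about half of the
# log-uniform draws should fall below it.
share_below_5e4 = sum(lr < 5e-4 for lr in learning_rates) / len(learning_rates)
print(sorted(weight_decays), share_below_5e4)
```

Sampling the learning rate on a log scale spends as many draws between 5e-5 and 5e-4 as between 5e-4 and 5e-3, which a linear scale would not do.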

Optuna configuration.

In [13]:
pruner = optuna.pruners.HyperbandPruner(min_resource=min_r, max_resource=max_r, reduction_factor=2, bootstrap_count=2)
sampler = optuna.samplers.TPESampler(seed=42, multivariate=True)
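
The effect of these pruner settings can be estimated by hand. The sketch below reimplements, in plain Python, the bracket count and rung steps as described in the Optuna documentation for `HyperbandPruner` and `SuccessiveHalvingPruner`; it does not call Optuna and should be read as an approximation of its behaviour. The values 626 and 2191 for `min_r` and `max_r` are assumptions that follow from a training set of 40,000 samples, and `hyperband_schedule` is a hypothetical helper name.

```python
import math


def hyperband_schedule(min_resource, max_resource, reduction_factor):
    """Return, per bracket, the steps at which a trial can be pruned."""
    # Number of brackets as given in the Optuna documentation.
    n_brackets = (
        math.floor(math.log(max_resource / min_resource, reduction_factor)) + 1
    )
    schedule = []
    for bracket in range(n_brackets):
        rungs = []
        rung = 0
        while True:
            # Bracket k starts k rungs later than bracket 0.
            step = min_resource * reduction_factor ** (bracket + rung)
            if step > max_resource:
                break
            rungs.append(step)
            rung += 1
        schedule.append(rungs)
    return schedule


# Assumed resources: 2 epochs and 7 epochs of 313 steps each.
schedule = hyperband_schedule(626, 2191, 2)
print(schedule)
```

Under these assumptions there are two brackets, with pruning decisions at 626 and 1252 steps (after epochs 2 and 4), which matches the epochs at which trials are pruned in the logs below.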



In [14]:
base.reset_seed()

Configuration of the individual training runs.

In [15]:
training_args = base.get_training_args(output_dir=f"~/results/{DATASET}/_hp-search", logging_dir=f"~/logs/{DATASET}/_hp-search", epochs=num_epochs, batch_size=batch_size)

Definition of how the student model is obtained.

In [16]:
def get_model():
    return AutoModelForImageClassification.from_pretrained("timm/tiny_vit_5m_224.in1k", num_labels=100, ignore_mismatched_sizes=True)

Trainer configuration for the individual training runs.

In [17]:
trainer = Trainer(
    args=training_args,
    train_dataset=train,
    eval_dataset=eval,
    compute_metrics=base.compute_metrics,
    model_init=get_model,
)
  

Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Search setup.

In [18]:
best_base = trainer.hyperparameter_search(
    direction="maximize",
    backend="optuna",
    hp_space=hp_space,
    compute_objective=lambda metrics: metrics["eval_f1"],
    pruner=pruner,
    sampler=sampler,
    study_name="Base",
    n_trials=150
)

[I 2025-03-30 23:22:16,562] A new study created in memory with name: Base


Trial 0 with params: {'learning_rate': 0.0002805758207667253, 'weight_decay': 0.01, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.779,0.893963,0.7406,0.762904,0.7406,0.740171
2,0.5847,0.686867,0.7948,0.806792,0.7948,0.794618
3,0.3363,0.631952,0.8154,0.82496,0.8154,0.81535
4,0.1848,0.601714,0.8327,0.840475,0.8327,0.832641


[I 2025-03-30 23:25:51,435] Trial 0 pruned. 


Trial 1 with params: {'learning_rate': 0.0007875660249889869, 'weight_decay': 0.001, 'warmup_steps': 5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8377,1.323822,0.627,0.676967,0.627,0.624342
2,0.9142,1.028047,0.6979,0.728427,0.6979,0.693776


[I 2025-03-30 23:27:34,014] Trial 1 pruned. 


Trial 2 with params: {'learning_rate': 6.533369619026643e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 19}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.8476,1.418506,0.6928,0.702749,0.6928,0.684777
2,0.9645,0.8046,0.7875,0.790334,0.7875,0.785589
3,0.5353,0.64245,0.8194,0.821925,0.8194,0.818424
4,0.3389,0.594126,0.8289,0.831575,0.8289,0.828155


[I 2025-03-30 23:31:43,543] Trial 2 pruned. 


Trial 3 with params: {'learning_rate': 0.0013035123791853842, 'weight_decay': 0.0, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.3015,1.986411,0.4694,0.565807,0.4694,0.464106
2,1.3352,1.384514,0.6093,0.653202,0.6093,0.607543


[I 2025-03-30 23:33:28,176] Trial 3 pruned. 


Trial 4 with params: {'learning_rate': 0.002311294500510415, 'weight_decay': 0.002, 'warmup_steps': 6}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,4.2631,4.468088,0.0302,0.019052,0.0302,0.01275
2,4.4665,4.476647,0.0237,0.011655,0.0237,0.008009
3,4.4678,4.472987,0.028,0.010977,0.028,0.01164
4,4.4227,4.46489,0.0279,0.015813,0.0279,0.012699


[I 2025-03-30 23:37:08,297] Trial 4 pruned. 


Trial 5 with params: {'learning_rate': 0.00011635338541918901, 'weight_decay': 0.003, 'warmup_steps': 17}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.1902,0.954118,0.7507,0.76473,0.7507,0.749
2,0.6362,0.659894,0.8071,0.814535,0.8071,0.806063
3,0.3518,0.573815,0.832,0.8381,0.832,0.831791
4,0.1951,0.558544,0.8373,0.842324,0.8373,0.837618


[I 2025-03-30 23:40:44,631] Trial 5 pruned. 


Trial 6 with params: {'learning_rate': 0.0003654769917956456, 'weight_decay': 0.003, 'warmup_steps': 20}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7216,0.927901,0.7293,0.757559,0.7293,0.731286
2,0.6271,0.716611,0.788,0.801179,0.788,0.787577
3,0.3754,0.660361,0.8061,0.816713,0.8061,0.806066
4,0.2094,0.619377,0.8282,0.837144,0.8282,0.828721
5,0.1032,0.608632,0.8384,0.843153,0.8384,0.838414
6,0.045,0.614578,0.8453,0.84854,0.8453,0.845463
7,0.0144,0.601354,0.8528,0.854332,0.8528,0.852923


[I 2025-03-30 23:47:00,065] Trial 6 finished with value: 0.8529231852358233 and parameters: {'learning_rate': 0.0003654769917956456, 'weight_decay': 0.003, 'warmup_steps': 20}. Best is trial 6 with value: 0.8529231852358233.


Trial 7 with params: {'learning_rate': 9.505122659935192e-05, 'weight_decay': 0.003, 'warmup_steps': 12}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.3759,1.094031,0.7306,0.741568,0.7306,0.726646
2,0.7344,0.69384,0.8054,0.809823,0.8054,0.803854
3,0.4098,0.612366,0.8236,0.830343,0.8236,0.823176
4,0.2406,0.578185,0.8347,0.838824,0.8347,0.834828


[I 2025-03-30 23:50:27,192] Trial 7 pruned. 


Trial 8 with params: {'learning_rate': 0.00040842279473800845, 'weight_decay': 0.008, 'warmup_steps': 6}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6486,0.987115,0.7152,0.751005,0.7152,0.7143
2,0.6367,0.751581,0.7788,0.793991,0.7788,0.777392


[I 2025-03-30 23:52:05,919] Trial 8 pruned. 


Trial 9 with params: {'learning_rate': 0.0005338741354740678, 'weight_decay': 0.006, 'warmup_steps': 1}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.659,1.04385,0.6955,0.73199,0.6955,0.693292
2,0.7299,0.870157,0.7459,0.768809,0.7459,0.745866
3,0.4491,0.723497,0.7881,0.801598,0.7881,0.787816
4,0.2578,0.694385,0.8056,0.81383,0.8056,0.80494


[I 2025-03-30 23:55:45,243] Trial 9 pruned. 


Trial 10 with params: {'learning_rate': 0.002185432916630353, 'weight_decay': 0.005, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,3.2489,2.90452,0.257,0.315996,0.257,0.229047
2,2.2864,2.185203,0.4043,0.457858,0.4043,0.392626


[I 2025-03-30 23:57:24,074] Trial 10 pruned. 


Trial 11 with params: {'learning_rate': 9.356743672326617e-05, 'weight_decay': 0.002, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.4626,1.110271,0.7273,0.737938,0.7273,0.723377
2,0.7494,0.708608,0.8026,0.808498,0.8026,0.801408
3,0.4155,0.614339,0.8237,0.828558,0.8237,0.823016
4,0.2474,0.576006,0.8332,0.837615,0.8332,0.833296
5,0.1461,0.562529,0.8419,0.844863,0.8419,0.841813
6,0.087,0.570106,0.8438,0.846328,0.8438,0.843974
7,0.0546,0.573907,0.8443,0.846421,0.8443,0.844314


[I 2025-03-31 00:03:32,306] Trial 11 finished with value: 0.8443144872246067 and parameters: {'learning_rate': 9.356743672326617e-05, 'weight_decay': 0.002, 'warmup_steps': 24}. Best is trial 6 with value: 0.8529231852358233.


Trial 12 with params: {'learning_rate': 6.396143199073769e-05, 'weight_decay': 0.0, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.8064,1.321046,0.7021,0.713952,0.7021,0.693367
2,0.8993,0.785251,0.7935,0.795455,0.7935,0.791946
3,0.5172,0.65567,0.8181,0.821487,0.8181,0.817311
4,0.3352,0.6065,0.8246,0.82783,0.8246,0.824446


[I 2025-03-31 00:07:04,618] Trial 12 pruned. 


Trial 13 with params: {'learning_rate': 0.00026209822023218914, 'weight_decay': 0.004, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7983,0.883772,0.7473,0.771828,0.7473,0.745917
2,0.5763,0.668878,0.8026,0.811887,0.8026,0.801244
3,0.3218,0.619267,0.8196,0.828914,0.8196,0.81949
4,0.1752,0.59254,0.8349,0.842279,0.8349,0.835891
5,0.0901,0.599627,0.8417,0.845324,0.8417,0.841351
6,0.0384,0.59888,0.8506,0.852854,0.8506,0.850707
7,0.0139,0.592991,0.8558,0.857275,0.8558,0.855751


[I 2025-03-31 00:13:05,678] Trial 13 finished with value: 0.8557511376950572 and parameters: {'learning_rate': 0.00026209822023218914, 'weight_decay': 0.004, 'warmup_steps': 25}. Best is trial 13 with value: 0.8557511376950572.


Trial 14 with params: {'learning_rate': 0.0008326079610628555, 'weight_decay': 0.003, 'warmup_steps': 19}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.925,1.457179,0.5915,0.651198,0.5915,0.584535
2,0.967,1.057602,0.6931,0.726042,0.6931,0.693134


[I 2025-03-31 00:14:50,072] Trial 14 pruned. 


Trial 15 with params: {'learning_rate': 0.00026678106258599537, 'weight_decay': 0.005, 'warmup_steps': 28}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.79,0.884841,0.7473,0.771935,0.7473,0.744441
2,0.5775,0.676669,0.8013,0.81457,0.8013,0.800958
3,0.3319,0.60575,0.8204,0.829156,0.8204,0.819838
4,0.1774,0.59531,0.8372,0.843342,0.8372,0.836955
5,0.0886,0.570395,0.8455,0.850042,0.8455,0.845706
6,0.0374,0.578429,0.8555,0.858021,0.8555,0.855625
7,0.0135,0.583886,0.857,0.858896,0.857,0.857244


[I 2025-03-31 00:20:51,612] Trial 15 finished with value: 0.8572441447836451 and parameters: {'learning_rate': 0.00026678106258599537, 'weight_decay': 0.005, 'warmup_steps': 28}. Best is trial 15 with value: 0.8572441447836451.


Trial 16 with params: {'learning_rate': 0.0001285943212265033, 'weight_decay': 0.005, 'warmup_steps': 28}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.1866,0.956036,0.7488,0.761681,0.7488,0.746287
2,0.6323,0.651816,0.811,0.817542,0.811,0.809342
3,0.3441,0.57687,0.83,0.836101,0.83,0.83009
4,0.1888,0.578252,0.8311,0.83913,0.8311,0.831645


[I 2025-03-31 00:24:22,612] Trial 16 pruned. 


Trial 17 with params: {'learning_rate': 0.0003758393496244357, 'weight_decay': 0.006, 'warmup_steps': 30}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.747,0.989587,0.7167,0.743565,0.7167,0.715164
2,0.6389,0.774202,0.7721,0.790411,0.7721,0.77155
3,0.3759,0.671027,0.8039,0.814958,0.8039,0.803452
4,0.211,0.622382,0.8293,0.837216,0.8293,0.829731


[I 2025-03-31 00:27:48,686] Trial 17 pruned. 


Trial 18 with params: {'learning_rate': 0.0026868566033176914, 'weight_decay': 0.01, 'warmup_steps': 11}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,4.5228,4.675963,0.0103,0.001135,0.0103,0.0019
2,4.6175,4.610219,0.0103,0.002718,0.0103,0.002407
3,4.5889,4.570062,0.0147,0.002616,0.0147,0.003183
4,4.5216,4.499813,0.0247,0.010189,0.0247,0.007838


[I 2025-03-31 00:31:14,303] Trial 18 pruned. 


Trial 19 with params: {'learning_rate': 0.00015377020257642402, 'weight_decay': 0.006, 'warmup_steps': 19}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.0208,0.920757,0.7489,0.761374,0.7489,0.746058
2,0.5982,0.639108,0.8125,0.819391,0.8125,0.811934
3,0.3302,0.575404,0.8321,0.837549,0.8321,0.831476
4,0.1761,0.568036,0.836,0.842942,0.836,0.836667
5,0.0925,0.575776,0.8432,0.848732,0.8432,0.8436
6,0.046,0.587447,0.8466,0.849147,0.8466,0.846654
7,0.0226,0.588112,0.849,0.851151,0.849,0.849134


[I 2025-03-31 00:37:27,241] Trial 19 finished with value: 0.849134291610294 and parameters: {'learning_rate': 0.00015377020257642402, 'weight_decay': 0.006, 'warmup_steps': 19}. Best is trial 15 with value: 0.8572441447836451.


Trial 20 with params: {'learning_rate': 0.0005023510496184647, 'weight_decay': 0.003, 'warmup_steps': 31}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7562,1.054078,0.6988,0.730489,0.6988,0.698372
2,0.7151,0.830339,0.7547,0.777883,0.7547,0.755243
3,0.4423,0.720194,0.7915,0.804733,0.7915,0.790635
4,0.2622,0.649722,0.815,0.822989,0.815,0.81525


[I 2025-03-31 00:41:43,539] Trial 20 pruned. 


Trial 21 with params: {'learning_rate': 0.0005258456047758103, 'weight_decay': 0.004, 'warmup_steps': 21}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7293,1.045874,0.6995,0.728722,0.6995,0.69774
2,0.7273,0.823702,0.7606,0.777072,0.7606,0.760283


[I 2025-03-31 00:43:27,021] Trial 21 pruned. 


Trial 22 with params: {'learning_rate': 0.00033383970206117997, 'weight_decay': 0.003, 'warmup_steps': 22}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7194,0.90484,0.7412,0.759516,0.7412,0.739231
2,0.6098,0.7102,0.789,0.799256,0.789,0.786869
3,0.3521,0.679782,0.8035,0.817256,0.8035,0.803036
4,0.1917,0.607076,0.8311,0.838117,0.8311,0.830863
5,0.0963,0.597737,0.8425,0.847944,0.8425,0.842992
6,0.0421,0.586529,0.8499,0.854745,0.8499,0.850829
7,0.0143,0.582241,0.8568,0.858995,0.8568,0.85705


[I 2025-03-31 00:49:34,142] Trial 22 finished with value: 0.8570503317529787 and parameters: {'learning_rate': 0.00033383970206117997, 'weight_decay': 0.003, 'warmup_steps': 22}. Best is trial 15 with value: 0.8572441447836451.


Trial 23 with params: {'learning_rate': 0.00022018968597801466, 'weight_decay': 0.002, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8491,0.847524,0.7625,0.776644,0.7625,0.761073
2,0.5628,0.663794,0.8015,0.811812,0.8015,0.800027
3,0.3169,0.599278,0.8265,0.83373,0.8265,0.826596
4,0.1682,0.600142,0.831,0.84021,0.831,0.8322


[I 2025-03-31 00:53:09,151] Trial 23 pruned. 


Trial 24 with params: {'learning_rate': 0.0002018911464194001, 'weight_decay': 0.0, 'warmup_steps': 15}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8492,0.857236,0.7543,0.771805,0.7543,0.753097
2,0.562,0.648987,0.8079,0.81742,0.8079,0.806225
3,0.3139,0.590577,0.8292,0.83819,0.8292,0.828669
4,0.1653,0.57586,0.8365,0.842627,0.8365,0.836816
5,0.0846,0.576637,0.8444,0.848705,0.8444,0.844641
6,0.037,0.580746,0.8538,0.856085,0.8538,0.853747
7,0.015,0.58875,0.8537,0.855373,0.8537,0.853661


[I 2025-03-31 00:59:29,318] Trial 24 finished with value: 0.8536612858701628 and parameters: {'learning_rate': 0.0002018911464194001, 'weight_decay': 0.0, 'warmup_steps': 15}. Best is trial 15 with value: 0.8572441447836451.


Trial 25 with params: {'learning_rate': 0.0027693395374376512, 'weight_decay': 0.0, 'warmup_steps': 20}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,4.3396,4.890111,0.0136,0.007254,0.0136,0.00373
2,4.4829,4.52518,0.0235,0.014,0.0235,0.006684


[I 2025-03-31 01:01:14,938] Trial 25 pruned. 


Trial 26 with params: {'learning_rate': 0.000733540863652704, 'weight_decay': 0.006, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8771,1.338746,0.6199,0.676315,0.6199,0.616843
2,0.8903,1.062644,0.6923,0.730959,0.6923,0.693558
3,0.5784,0.868441,0.7505,0.766891,0.7505,0.749966
4,0.3515,0.764605,0.7827,0.795845,0.7827,0.783631


[I 2025-03-31 01:04:36,654] Trial 26 pruned. 


Trial 27 with params: {'learning_rate': 8.730306074532542e-05, 'weight_decay': 0.006, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.5567,1.178177,0.7211,0.728073,0.7211,0.716107
2,0.7975,0.724502,0.8021,0.805397,0.8021,0.800564
3,0.446,0.622065,0.8216,0.82642,0.8216,0.82074
4,0.2688,0.578625,0.8319,0.83534,0.8319,0.831667


[I 2025-03-31 01:08:09,193] Trial 27 pruned. 


Trial 28 with params: {'learning_rate': 0.00020983616674931198, 'weight_decay': 0.004, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8805,0.871391,0.7562,0.77299,0.7562,0.753854
2,0.5706,0.664568,0.8036,0.814088,0.8036,0.802543
3,0.3219,0.606281,0.8244,0.831149,0.8244,0.823825
4,0.1699,0.58347,0.8362,0.843831,0.8362,0.836667
5,0.086,0.575506,0.8439,0.846675,0.8439,0.843707
6,0.0382,0.582511,0.8487,0.852527,0.8487,0.849205
7,0.0158,0.578121,0.8526,0.854236,0.8526,0.852597


[I 2025-03-31 01:14:14,935] Trial 28 finished with value: 0.8525966141299571 and parameters: {'learning_rate': 0.00020983616674931198, 'weight_decay': 0.004, 'warmup_steps': 26}. Best is trial 15 with value: 0.8572441447836451.


Trial 29 with params: {'learning_rate': 0.0019443311871992159, 'weight_decay': 0.01, 'warmup_steps': 29}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.9372,2.624712,0.3182,0.405686,0.3182,0.298906
2,1.926,1.811625,0.5024,0.555274,0.5024,0.496254
3,1.3792,1.539832,0.5643,0.614586,0.5643,0.564397
4,0.9757,1.190155,0.6624,0.682496,0.6624,0.661587


[I 2025-03-31 01:17:52,956] Trial 29 pruned. 


Trial 30 with params: {'learning_rate': 0.0003208499425884695, 'weight_decay': 0.007, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7497,0.931704,0.7377,0.76102,0.7377,0.735911
2,0.6059,0.706424,0.7923,0.805757,0.7923,0.791161
3,0.3415,0.629104,0.8166,0.82419,0.8166,0.816169
4,0.1934,0.615104,0.8298,0.838407,0.8298,0.830522
5,0.0982,0.595203,0.8382,0.842856,0.8382,0.838499
6,0.0417,0.595081,0.8501,0.85339,0.8501,0.850467
7,0.0147,0.589977,0.8556,0.857117,0.8556,0.855588


[I 2025-03-31 01:24:31,882] Trial 30 finished with value: 0.8555882581270547 and parameters: {'learning_rate': 0.0003208499425884695, 'weight_decay': 0.007, 'warmup_steps': 23}. Best is trial 15 with value: 0.8572441447836451.


Trial 31 with params: {'learning_rate': 0.0004344570838032361, 'weight_decay': 0.009000000000000001, 'warmup_steps': 22}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.712,0.9881,0.7117,0.737968,0.7117,0.709113
2,0.6707,0.750465,0.7756,0.791787,0.7756,0.775466


[I 2025-03-31 01:26:18,469] Trial 31 pruned. 


Trial 32 with params: {'learning_rate': 0.00044016339994501963, 'weight_decay': 0.007, 'warmup_steps': 22}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6991,0.973026,0.7227,0.748276,0.7227,0.720146
2,0.6754,0.782989,0.7708,0.787161,0.7708,0.77022
3,0.4113,0.706575,0.7953,0.808153,0.7953,0.795469
4,0.231,0.639635,0.823,0.830155,0.823,0.823071


[I 2025-03-31 01:29:47,040] Trial 32 pruned. 


Trial 33 with params: {'learning_rate': 0.00011250423208136925, 'weight_decay': 0.01, 'warmup_steps': 29}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.2436,0.967355,0.7459,0.759803,0.7459,0.743497
2,0.6428,0.654924,0.811,0.816647,0.811,0.809974
3,0.3517,0.575835,0.8323,0.836894,0.8323,0.831596
4,0.1948,0.554932,0.8407,0.844341,0.8407,0.840652
5,0.1069,0.566539,0.8411,0.844554,0.8411,0.841189
6,0.0566,0.582168,0.8431,0.845789,0.8431,0.843241
7,0.0319,0.578312,0.8479,0.850396,0.8479,0.848186


[I 2025-03-31 01:36:10,140] Trial 33 finished with value: 0.8481858621928312 and parameters: {'learning_rate': 0.00011250423208136925, 'weight_decay': 0.01, 'warmup_steps': 29}. Best is trial 15 with value: 0.8572441447836451.


Trial 34 with params: {'learning_rate': 0.00045068624759451683, 'weight_decay': 0.005, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7468,1.031952,0.7078,0.739916,0.7078,0.706895
2,0.6841,0.769946,0.7704,0.785428,0.7704,0.77029
3,0.4157,0.706205,0.7938,0.804315,0.7938,0.79284
4,0.2434,0.651654,0.8189,0.825395,0.8189,0.818281


[I 2025-03-31 01:39:50,778] Trial 34 pruned. 


Trial 35 with params: {'learning_rate': 0.0002236983724855693, 'weight_decay': 0.005, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8373,0.894268,0.7481,0.770459,0.7481,0.748241
2,0.5744,0.659235,0.8066,0.814033,0.8066,0.805809
3,0.3221,0.602524,0.8258,0.835862,0.8258,0.826193
4,0.1746,0.578393,0.8348,0.84224,0.8348,0.835178
5,0.0868,0.576536,0.8449,0.848637,0.8449,0.844973
6,0.0391,0.588844,0.8495,0.852131,0.8495,0.849809
7,0.0163,0.588883,0.8519,0.853479,0.8519,0.851873


[I 2025-03-31 01:45:55,154] Trial 35 finished with value: 0.8518729419402246 and parameters: {'learning_rate': 0.0002236983724855693, 'weight_decay': 0.005, 'warmup_steps': 23}. Best is trial 15 with value: 0.8572441447836451.


Trial 36 with params: {'learning_rate': 0.004049761177508626, 'weight_decay': 0.006, 'warmup_steps': 3}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,4.6459,4.722727,0.01,0.0001,0.01,0.000198
2,4.6355,4.611166,0.01,0.0001,0.01,0.000198
3,4.6078,4.636775,0.0101,0.0011,0.0101,0.00038
4,4.6202,4.594497,0.0136,0.000314,0.0136,0.000609


[I 2025-03-31 01:49:28,283] Trial 36 pruned. 


Trial 37 with params: {'learning_rate': 0.00022340338954769765, 'weight_decay': 0.008, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8366,0.879601,0.7544,0.772812,0.7544,0.753218
2,0.5678,0.677194,0.7988,0.812177,0.7988,0.79835


[I 2025-03-31 01:51:10,889] Trial 37 pruned. 


Trial 38 with params: {'learning_rate': 0.0006141452852731064, 'weight_decay': 0.005, 'warmup_steps': 13}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7339,1.221983,0.6512,0.703744,0.6512,0.648904
2,0.7884,0.882217,0.7411,0.756254,0.7411,0.738582


[I 2025-03-31 01:52:56,818] Trial 38 pruned. 


Trial 39 with params: {'learning_rate': 0.00010957402645904347, 'weight_decay': 0.003, 'warmup_steps': 1}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.228,1.028039,0.7403,0.750439,0.7403,0.735831
2,0.6866,0.671881,0.8126,0.817361,0.8126,0.811678
3,0.3798,0.596275,0.8273,0.832451,0.8273,0.826493
4,0.2162,0.57431,0.8343,0.839362,0.8343,0.83438
5,0.1217,0.570776,0.8413,0.846046,0.8413,0.841767
6,0.0669,0.580394,0.843,0.845454,0.843,0.843224
7,0.0392,0.586247,0.843,0.844753,0.843,0.842984


[I 2025-03-31 01:59:04,087] Trial 39 finished with value: 0.8429838645666554 and parameters: {'learning_rate': 0.00010957402645904347, 'weight_decay': 0.003, 'warmup_steps': 1}. Best is trial 15 with value: 0.8572441447836451.


Trial 40 with params: {'learning_rate': 0.00022959529796816636, 'weight_decay': 0.005, 'warmup_steps': 22}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8549,0.904737,0.744,0.767246,0.744,0.743104
2,0.5815,0.655117,0.8013,0.811436,0.8013,0.800588
3,0.3231,0.620002,0.8214,0.830123,0.8214,0.820861
4,0.1723,0.596584,0.8329,0.838704,0.8329,0.833073


[I 2025-03-31 02:02:39,642] Trial 40 pruned. 


Trial 41 with params: {'learning_rate': 0.00032347436008366533, 'weight_decay': 0.001, 'warmup_steps': 10}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6953,0.935937,0.7325,0.757371,0.7325,0.731029
2,0.5986,0.709076,0.7913,0.802158,0.7913,0.790445
3,0.3485,0.618156,0.8213,0.828989,0.8213,0.820751
4,0.194,0.614961,0.8303,0.839718,0.8303,0.830953
5,0.0991,0.625111,0.8367,0.842188,0.8367,0.837123
6,0.0416,0.607324,0.847,0.849916,0.847,0.847314
7,0.0146,0.604753,0.8497,0.851771,0.8497,0.850012


[I 2025-03-31 02:09:00,217] Trial 41 finished with value: 0.8500117677335177 and parameters: {'learning_rate': 0.00032347436008366533, 'weight_decay': 0.001, 'warmup_steps': 10}. Best is trial 15 with value: 0.8572441447836451.


Trial 42 with params: {'learning_rate': 0.0003107276408941606, 'weight_decay': 0.001, 'warmup_steps': 18}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7359,0.89992,0.7355,0.757668,0.7355,0.732421
2,0.6009,0.683773,0.7991,0.8095,0.7991,0.798733
3,0.3467,0.636536,0.816,0.827455,0.816,0.81661
4,0.1896,0.613644,0.8292,0.836302,0.8292,0.829183


[I 2025-03-31 02:13:03,991] Trial 42 pruned. 


Trial 43 with params: {'learning_rate': 8.860683146133816e-05, 'weight_decay': 0.0, 'warmup_steps': 18}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.4646,1.1326,0.7254,0.736504,0.7254,0.720945
2,0.7638,0.714086,0.8023,0.80642,0.8023,0.800875
3,0.4286,0.616813,0.8234,0.828755,0.8234,0.822689
4,0.2571,0.580672,0.8294,0.833104,0.8294,0.829182
5,0.1577,0.581182,0.8349,0.838144,0.8349,0.834939
6,0.0958,0.590289,0.8332,0.835932,0.8332,0.833243
7,0.063,0.589951,0.8367,0.838898,0.8367,0.836883


[I 2025-03-31 02:19:40,246] Trial 43 finished with value: 0.8368828579829788 and parameters: {'learning_rate': 8.860683146133816e-05, 'weight_decay': 0.0, 'warmup_steps': 18}. Best is trial 15 with value: 0.8572441447836451.


Trial 44 with params: {'learning_rate': 0.0003735932960369613, 'weight_decay': 0.002, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7407,0.975898,0.7189,0.74785,0.7189,0.717686
2,0.6341,0.756926,0.7751,0.790745,0.7751,0.774822


[I 2025-03-31 02:21:36,677] Trial 44 pruned. 


Trial 45 with params: {'learning_rate': 0.00028213122514460546, 'weight_decay': 0.001, 'warmup_steps': 13}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7323,0.89892,0.736,0.760885,0.736,0.736077
2,0.5813,0.723925,0.7842,0.802374,0.7842,0.785236
3,0.3315,0.635398,0.8183,0.827591,0.8183,0.817695
4,0.1828,0.616483,0.8281,0.83747,0.8281,0.828811


[I 2025-03-31 02:25:22,150] Trial 45 pruned. 


Trial 46 with params: {'learning_rate': 7.176219448961996e-05, 'weight_decay': 0.0, 'warmup_steps': 7}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.6882,1.326548,0.7021,0.71399,0.7021,0.693912
2,0.9106,0.783128,0.7951,0.798556,0.7951,0.793308


[I 2025-03-31 02:27:11,429] Trial 46 pruned. 


Trial 47 with params: {'learning_rate': 0.00020330968988314062, 'weight_decay': 0.0, 'warmup_steps': 16}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8724,0.89205,0.7478,0.767201,0.7478,0.746237
2,0.5721,0.653766,0.8062,0.816508,0.8062,0.80521
3,0.3162,0.588164,0.8274,0.833592,0.8274,0.827226
4,0.1692,0.578084,0.8376,0.843881,0.8376,0.838292
5,0.0856,0.564145,0.8479,0.851749,0.8479,0.847845
6,0.0387,0.565783,0.8563,0.858964,0.8563,0.856374
7,0.0157,0.577615,0.8581,0.859895,0.8581,0.858263


[I 2025-03-31 02:33:58,401] Trial 47 finished with value: 0.858262931189481 and parameters: {'learning_rate': 0.00020330968988314062, 'weight_decay': 0.0, 'warmup_steps': 16}. Best is trial 47 with value: 0.858262931189481.


Trial 48 with params: {'learning_rate': 0.00015433736178353414, 'weight_decay': 0.01, 'warmup_steps': 9}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9857,0.909485,0.7533,0.769383,0.7533,0.751553
2,0.5948,0.641669,0.8126,0.819611,0.8126,0.811614
3,0.3277,0.577583,0.8305,0.835636,0.8305,0.829706
4,0.1781,0.566478,0.8405,0.84645,0.8405,0.840912
5,0.092,0.556667,0.8471,0.851775,0.8471,0.847363
6,0.0454,0.5842,0.845,0.84877,0.845,0.845513
7,0.0213,0.578567,0.8512,0.853737,0.8512,0.851617


[I 2025-03-31 02:40:31,647] Trial 48 finished with value: 0.8516165094708837 and parameters: {'learning_rate': 0.00015433736178353414, 'weight_decay': 0.01, 'warmup_steps': 9}. Best is trial 47 with value: 0.858262931189481.


Trial 49 with params: {'learning_rate': 0.00023505769967892257, 'weight_decay': 0.0, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8246,0.901603,0.7447,0.772043,0.7447,0.74522
2,0.5776,0.676053,0.8003,0.812096,0.8003,0.799405
3,0.3248,0.600511,0.8271,0.835429,0.8271,0.826842
4,0.1722,0.593783,0.8301,0.839621,0.8301,0.831025


[I 2025-03-31 02:44:16,025] Trial 49 pruned. 


Trial 50 with params: {'learning_rate': 0.00033674966527091953, 'weight_decay': 0.004, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.729,0.980409,0.7203,0.754428,0.7203,0.721391
2,0.6136,0.721444,0.7861,0.801899,0.7861,0.785364


[I 2025-03-31 02:45:58,476] Trial 50 pruned. 


Trial 51 with params: {'learning_rate': 9.888859487004774e-05, 'weight_decay': 0.002, 'warmup_steps': 18}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.3561,1.055246,0.737,0.750179,0.737,0.732616
2,0.6999,0.676264,0.8093,0.815305,0.8093,0.808525
3,0.3848,0.604055,0.8252,0.830913,0.8252,0.824822
4,0.2215,0.575955,0.8351,0.840648,0.8351,0.835562
5,0.1297,0.582236,0.8375,0.84229,0.8375,0.838024
6,0.0738,0.59129,0.8368,0.839827,0.8368,0.83714
7,0.0447,0.596333,0.8372,0.839719,0.8372,0.837442


[I 2025-03-31 02:52:25,743] Trial 51 finished with value: 0.8374418864277717 and parameters: {'learning_rate': 9.888859487004774e-05, 'weight_decay': 0.002, 'warmup_steps': 18}. Best is trial 47 with value: 0.858262931189481.


Trial 52 with params: {'learning_rate': 0.0001372692966509229, 'weight_decay': 0.001, 'warmup_steps': 13}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.0573,0.898093,0.7593,0.773159,0.7593,0.757143
2,0.5987,0.632722,0.8172,0.823344,0.8172,0.816573
3,0.3233,0.574,0.8344,0.839697,0.8344,0.833717
4,0.1767,0.564343,0.8392,0.845008,0.8392,0.839279
5,0.0913,0.565464,0.8436,0.847508,0.8436,0.84386
6,0.0453,0.570954,0.849,0.851116,0.849,0.848917
7,0.0225,0.576252,0.8516,0.853034,0.8516,0.851438


[I 2025-03-31 02:59:19,144] Trial 52 finished with value: 0.8514384722359205 and parameters: {'learning_rate': 0.0001372692966509229, 'weight_decay': 0.001, 'warmup_steps': 13}. Best is trial 47 with value: 0.858262931189481.


Trial 53 with params: {'learning_rate': 0.00021967416393079315, 'weight_decay': 0.0, 'warmup_steps': 16}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8476,0.867528,0.7503,0.765891,0.7503,0.749308
2,0.5695,0.658889,0.8035,0.813972,0.8035,0.803334
3,0.3168,0.599247,0.8246,0.831748,0.8246,0.824548
4,0.1705,0.587923,0.8375,0.845747,0.8375,0.838155
5,0.0867,0.596043,0.8452,0.848701,0.8452,0.845383
6,0.0379,0.589571,0.8506,0.854377,0.8506,0.85142
7,0.0146,0.595594,0.8539,0.855681,0.8539,0.854055


[I 2025-03-31 03:05:41,096] Trial 53 finished with value: 0.8540553623229513 and parameters: {'learning_rate': 0.00021967416393079315, 'weight_decay': 0.0, 'warmup_steps': 16}. Best is trial 47 with value: 0.858262931189481.


Trial 54 with params: {'learning_rate': 0.0008389468279763976, 'weight_decay': 0.0, 'warmup_steps': 16}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8863,1.354182,0.616,0.670946,0.616,0.609442
2,0.9653,1.070793,0.6874,0.72935,0.6874,0.686678


[I 2025-03-31 03:07:33,708] Trial 54 pruned. 


Trial 55 with params: {'learning_rate': 0.0002804257836269697, 'weight_decay': 0.001, 'warmup_steps': 19}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7781,0.890348,0.7422,0.761634,0.7422,0.742006
2,0.5921,0.681588,0.8,0.8095,0.8,0.799642
3,0.3363,0.630438,0.8196,0.827629,0.8196,0.819046
4,0.1824,0.594396,0.837,0.844946,0.837,0.837489
5,0.0937,0.611631,0.8413,0.846593,0.8413,0.841388
6,0.041,0.595975,0.85,0.853155,0.85,0.850297
7,0.0149,0.594706,0.854,0.856268,0.854,0.854207


[I 2025-03-31 03:13:49,080] Trial 55 finished with value: 0.85420716088112 and parameters: {'learning_rate': 0.0002804257836269697, 'weight_decay': 0.001, 'warmup_steps': 19}. Best is trial 47 with value: 0.858262931189481.


Trial 56 with params: {'learning_rate': 0.00028125829195773934, 'weight_decay': 0.003, 'warmup_steps': 18}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7568,0.940664,0.7312,0.760221,0.7312,0.730635
2,0.5889,0.691558,0.7929,0.804307,0.7929,0.792273


[I 2025-03-31 03:15:40,102] Trial 56 pruned. 


Trial 57 with params: {'learning_rate': 0.0008174510003799652, 'weight_decay': 0.001, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9011,1.399106,0.6039,0.650827,0.6039,0.595838
2,0.9517,1.073867,0.6942,0.722321,0.6942,0.692908
3,0.6256,0.857796,0.7562,0.77148,0.7562,0.755215
4,0.3792,0.800808,0.7755,0.786203,0.7755,0.774354


[I 2025-03-31 03:19:12,339] Trial 57 pruned. 


Trial 58 with params: {'learning_rate': 0.0001697780291860656, 'weight_decay': 0.003, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.98,0.886821,0.762,0.776333,0.762,0.761126
2,0.5834,0.642711,0.8124,0.819738,0.8124,0.811256
3,0.3183,0.56954,0.8338,0.839625,0.8338,0.833558
4,0.1657,0.571678,0.8434,0.849542,0.8434,0.844185
5,0.0855,0.558175,0.8484,0.852049,0.8484,0.848775
6,0.0413,0.583707,0.8513,0.854972,0.8513,0.851684
7,0.0184,0.587043,0.857,0.859521,0.857,0.857309


[I 2025-03-31 03:25:31,506] Trial 58 finished with value: 0.8573092963739418 and parameters: {'learning_rate': 0.0001697780291860656, 'weight_decay': 0.003, 'warmup_steps': 24}. Best is trial 47 with value: 0.858262931189481.


Trial 59 with params: {'learning_rate': 0.0001357757161019523, 'weight_decay': 0.003, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.1425,0.960177,0.7472,0.760323,0.7472,0.744395
2,0.6242,0.665396,0.8079,0.815707,0.8079,0.807539
3,0.3426,0.583265,0.8284,0.834663,0.8284,0.827904
4,0.1851,0.583443,0.8362,0.843317,0.8362,0.836664
5,0.0991,0.578332,0.8426,0.847115,0.8426,0.843157
6,0.0494,0.591278,0.8473,0.850137,0.8473,0.847547
7,0.0252,0.591735,0.8494,0.851694,0.8494,0.849555


[I 2025-03-31 03:32:39,497] Trial 59 finished with value: 0.8495547334932742 and parameters: {'learning_rate': 0.0001357757161019523, 'weight_decay': 0.003, 'warmup_steps': 32}. Best is trial 47 with value: 0.858262931189481.


Trial 60 with params: {'learning_rate': 0.000292475991904573, 'weight_decay': 0.006, 'warmup_steps': 31}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7861,0.907537,0.7406,0.768358,0.7406,0.743225
2,0.5931,0.688029,0.7958,0.808241,0.7958,0.795713
3,0.3444,0.633677,0.8142,0.825244,0.8142,0.813415
4,0.1874,0.604357,0.8298,0.836729,0.8298,0.829704


[I 2025-03-31 03:36:19,834] Trial 60 pruned. 


Trial 61 with params: {'learning_rate': 0.0002485718364560262, 'weight_decay': 0.003, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8262,0.911451,0.7434,0.766225,0.7434,0.743338
2,0.5806,0.661771,0.8006,0.813792,0.8006,0.800059
3,0.3221,0.594812,0.8307,0.837491,0.8307,0.829778
4,0.1753,0.582568,0.8326,0.840894,0.8326,0.833301


[I 2025-03-31 03:39:55,240] Trial 61 pruned. 


Trial 62 with params: {'learning_rate': 0.0001250487857337693, 'weight_decay': 0.003, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.1822,0.926144,0.7559,0.769218,0.7559,0.753853
2,0.6168,0.644469,0.8136,0.819676,0.8136,0.812219
3,0.3323,0.58537,0.8286,0.834543,0.8286,0.828611
4,0.1825,0.562634,0.8382,0.84406,0.8382,0.838424
5,0.0962,0.564399,0.8443,0.847028,0.8443,0.844275
6,0.0493,0.574166,0.8488,0.851005,0.8488,0.849032
7,0.026,0.579919,0.8516,0.853137,0.8516,0.851583


[I 2025-03-31 03:46:18,586] Trial 62 finished with value: 0.8515827245328798 and parameters: {'learning_rate': 0.0001250487857337693, 'weight_decay': 0.003, 'warmup_steps': 23}. Best is trial 47 with value: 0.858262931189481.


Trial 63 with params: {'learning_rate': 0.0003713174190019163, 'weight_decay': 0.003, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7263,0.969862,0.7224,0.74925,0.7224,0.720941
2,0.6284,0.753834,0.7745,0.791121,0.7745,0.773838


[I 2025-03-31 03:48:08,005] Trial 63 pruned. 


Trial 64 with params: {'learning_rate': 0.00028073010253966434, 'weight_decay': 0.002, 'warmup_steps': 21}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7698,0.899726,0.745,0.768286,0.745,0.743414
2,0.5818,0.705148,0.7878,0.802799,0.7878,0.787828
3,0.3425,0.605365,0.8213,0.831061,0.8213,0.821702
4,0.1861,0.624626,0.8254,0.83748,0.8254,0.826362


[I 2025-03-31 03:51:41,210] Trial 64 pruned. 


Trial 65 with params: {'learning_rate': 0.0002475529634226411, 'weight_decay': 0.006, 'warmup_steps': 15}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7735,0.911932,0.7411,0.769289,0.7411,0.74008
2,0.5678,0.674243,0.8017,0.8106,0.8017,0.800144
3,0.3249,0.591878,0.8244,0.834434,0.8244,0.823927
4,0.1776,0.599794,0.8297,0.838212,0.8297,0.83008


[I 2025-03-31 03:55:21,738] Trial 65 pruned. 


Trial 66 with params: {'learning_rate': 0.00016804744190486608, 'weight_decay': 0.004, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9793,0.847357,0.7668,0.777374,0.7668,0.764397
2,0.5786,0.643517,0.8129,0.819126,0.8129,0.81157
3,0.3181,0.572486,0.831,0.83702,0.831,0.830623
4,0.1696,0.574604,0.8372,0.842476,0.8372,0.837231
5,0.0865,0.570906,0.8404,0.843745,0.8404,0.840201
6,0.0411,0.579057,0.8506,0.853119,0.8506,0.850763
7,0.0187,0.585845,0.8512,0.853112,0.8512,0.851373


[I 2025-03-31 04:01:36,594] Trial 66 finished with value: 0.851373002269203 and parameters: {'learning_rate': 0.00016804744190486608, 'weight_decay': 0.004, 'warmup_steps': 24}. Best is trial 47 with value: 0.858262931189481.


Trial 67 with params: {'learning_rate': 0.00012677734325509485, 'weight_decay': 0.006, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.1833,0.954274,0.7515,0.765693,0.7515,0.748828
2,0.6383,0.648706,0.8114,0.817719,0.8114,0.81064
3,0.3474,0.583286,0.8315,0.836943,0.8315,0.831149
4,0.1962,0.5706,0.8385,0.843412,0.8385,0.838351
5,0.104,0.575812,0.8411,0.846312,0.8411,0.841705
6,0.0528,0.588348,0.8434,0.845694,0.8434,0.843522
7,0.0281,0.591474,0.8466,0.848551,0.8466,0.84659


[I 2025-03-31 04:07:45,607] Trial 67 finished with value: 0.8465895650744375 and parameters: {'learning_rate': 0.00012677734325509485, 'weight_decay': 0.006, 'warmup_steps': 23}. Best is trial 47 with value: 0.858262931189481.


Trial 68 with params: {'learning_rate': 0.0002204153721176256, 'weight_decay': 0.006, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8511,0.840683,0.7596,0.778902,0.7596,0.759929
2,0.5622,0.667449,0.8022,0.814467,0.8022,0.802101
3,0.3171,0.589534,0.8244,0.831008,0.8244,0.824231
4,0.1671,0.568411,0.8377,0.842965,0.8377,0.838132
5,0.0848,0.57689,0.8435,0.849067,0.8435,0.844127
6,0.0391,0.578398,0.8508,0.853913,0.8508,0.851313
7,0.0149,0.579169,0.8534,0.855332,0.8534,0.853718


[I 2025-03-31 04:14:01,390] Trial 68 finished with value: 0.853717636085485 and parameters: {'learning_rate': 0.0002204153721176256, 'weight_decay': 0.006, 'warmup_steps': 27}. Best is trial 47 with value: 0.858262931189481.


Trial 69 with params: {'learning_rate': 0.004618563219406311, 'weight_decay': 0.007, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,4.6028,4.573802,0.0149,0.001869,0.0149,0.002864
2,4.5977,4.894011,0.01,0.0001,0.01,0.000198
3,4.598,7.054641,0.01,0.0001,0.01,0.000198
4,4.6233,4.683136,0.01,0.0001,0.01,0.000198


[I 2025-03-31 04:17:44,705] Trial 69 pruned. 


Trial 70 with params: {'learning_rate': 0.0032109758631513803, 'weight_decay': 0.004, 'warmup_steps': 13}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,4.6094,4.614607,0.0117,0.001003,0.0117,0.001743
2,4.6092,4.608589,0.0118,0.000813,0.0118,0.001305
3,4.6089,4.58256,0.0135,0.000684,0.0135,0.001277
4,4.6003,4.613069,0.013,0.001385,0.013,0.002279


[I 2025-03-31 04:21:17,378] Trial 70 pruned. 


Trial 71 with params: {'learning_rate': 0.00016746504590446637, 'weight_decay': 0.0, 'warmup_steps': 18}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9797,0.883342,0.7586,0.772765,0.7586,0.756138
2,0.5877,0.658132,0.8067,0.81616,0.8067,0.805669
3,0.3282,0.582489,0.8297,0.835397,0.8297,0.828843
4,0.1774,0.578622,0.8387,0.845372,0.8387,0.839532
5,0.0907,0.565256,0.8447,0.849129,0.8447,0.84512
6,0.0438,0.589616,0.8465,0.848912,0.8465,0.846734
7,0.0201,0.589544,0.8512,0.853667,0.8512,0.851521


[I 2025-03-31 04:27:30,834] Trial 71 finished with value: 0.8515208545669823 and parameters: {'learning_rate': 0.00016746504590446637, 'weight_decay': 0.0, 'warmup_steps': 18}. Best is trial 47 with value: 0.858262931189481.


Trial 72 with params: {'learning_rate': 0.0002532263812853275, 'weight_decay': 0.0, 'warmup_steps': 17}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7635,0.909576,0.739,0.76489,0.739,0.736874
2,0.5697,0.661012,0.8025,0.812846,0.8025,0.80197
3,0.321,0.603811,0.8243,0.832624,0.8243,0.824114
4,0.1751,0.580642,0.8367,0.843092,0.8367,0.836949
5,0.0882,0.583275,0.8443,0.849786,0.8443,0.844517
6,0.0393,0.583682,0.8541,0.856674,0.8541,0.854438
7,0.0147,0.589782,0.8568,0.858717,0.8568,0.856861


[I 2025-03-31 04:33:55,945] Trial 72 finished with value: 0.8568613539626442 and parameters: {'learning_rate': 0.0002532263812853275, 'weight_decay': 0.0, 'warmup_steps': 17}. Best is trial 47 with value: 0.858262931189481.


Trial 73 with params: {'learning_rate': 0.00020651989253965477, 'weight_decay': 0.0, 'warmup_steps': 17}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8771,0.884357,0.7509,0.771259,0.7509,0.749038
2,0.576,0.673714,0.8015,0.814337,0.8015,0.801314
3,0.3233,0.587132,0.8285,0.837248,0.8285,0.828663
4,0.1709,0.582674,0.8381,0.845606,0.8381,0.838372
5,0.0871,0.568096,0.8459,0.849505,0.8459,0.846005
6,0.0405,0.573681,0.8545,0.857292,0.8545,0.854566
7,0.0161,0.577248,0.857,0.859397,0.857,0.857221


[I 2025-03-31 04:40:27,294] Trial 73 finished with value: 0.8572207297251221 and parameters: {'learning_rate': 0.00020651989253965477, 'weight_decay': 0.0, 'warmup_steps': 17}. Best is trial 47 with value: 0.858262931189481.


Trial 74 with params: {'learning_rate': 0.0002567211253623803, 'weight_decay': 0.0, 'warmup_steps': 11}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7444,0.891278,0.7484,0.774253,0.7484,0.748784
2,0.5732,0.673959,0.7989,0.810225,0.7989,0.798227
3,0.3272,0.614712,0.8223,0.831962,0.8223,0.822355
4,0.1756,0.593331,0.8333,0.841702,0.8333,0.834346
5,0.0862,0.587664,0.8427,0.847644,0.8427,0.843026
6,0.0401,0.591049,0.8516,0.854747,0.8516,0.851953
7,0.0147,0.589135,0.8504,0.852055,0.8504,0.850578


[I 2025-03-31 04:47:02,578] Trial 74 finished with value: 0.8505781622289461 and parameters: {'learning_rate': 0.0002567211253623803, 'weight_decay': 0.0, 'warmup_steps': 11}. Best is trial 47 with value: 0.858262931189481.


Trial 75 with params: {'learning_rate': 0.00033990488093307077, 'weight_decay': 0.0, 'warmup_steps': 15}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6957,0.917076,0.7378,0.763707,0.7378,0.737385
2,0.6154,0.712186,0.7897,0.803121,0.7897,0.789081


[I 2025-03-31 04:48:55,639] Trial 75 pruned. 


Trial 76 with params: {'learning_rate': 5.7423270605816206e-05, 'weight_decay': 0.007, 'warmup_steps': 11}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.9162,1.504374,0.6783,0.689282,0.6783,0.667774
2,1.0539,0.884806,0.7773,0.779527,0.7773,0.775507
3,0.6184,0.712247,0.8072,0.810909,0.8072,0.806441
4,0.4161,0.645699,0.8184,0.821565,0.8184,0.818065


[I 2025-03-31 04:52:24,380] Trial 76 pruned. 


Trial 77 with params: {'learning_rate': 0.0002558103533892667, 'weight_decay': 0.0, 'warmup_steps': 18}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7732,0.888117,0.7484,0.772703,0.7484,0.749092
2,0.5783,0.699936,0.7939,0.805863,0.7939,0.7937


[I 2025-03-31 04:54:11,238] Trial 77 pruned. 


Trial 78 with params: {'learning_rate': 0.0012905812025664177, 'weight_decay': 0.008, 'warmup_steps': 18}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.3416,2.012436,0.4531,0.546272,0.4531,0.448063
2,1.3359,1.374768,0.609,0.654367,0.609,0.604582


[I 2025-03-31 04:56:02,635] Trial 78 pruned. 


Trial 79 with params: {'learning_rate': 0.0005587400967057204, 'weight_decay': 0.004, 'warmup_steps': 20}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7664,1.162762,0.6722,0.718739,0.6722,0.671989
2,0.7578,0.824084,0.754,0.77025,0.754,0.752753
3,0.477,0.761869,0.7844,0.796221,0.7844,0.783667
4,0.2754,0.682548,0.8059,0.814618,0.8059,0.805312


[I 2025-03-31 04:59:38,254] Trial 79 pruned. 


Trial 80 with params: {'learning_rate': 0.0006579333675352459, 'weight_decay': 0.006, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8343,1.187446,0.6613,0.698624,0.6613,0.660204
2,0.8361,0.965273,0.7204,0.742017,0.7204,0.717291


[I 2025-03-31 05:01:23,072] Trial 80 pruned. 


Trial 81 with params: {'learning_rate': 0.0003505183998710469, 'weight_decay': 0.001, 'warmup_steps': 22}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7212,0.948777,0.7316,0.760166,0.7316,0.730609
2,0.6213,0.736949,0.7812,0.79517,0.7812,0.780361
3,0.3637,0.636005,0.8147,0.822284,0.8147,0.81493
4,0.1935,0.637104,0.8233,0.831202,0.8233,0.82321


[I 2025-03-31 05:04:52,415] Trial 81 pruned. 


Trial 82 with params: {'learning_rate': 0.0026245310374742674, 'weight_decay': 0.0, 'warmup_steps': 4}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,4.5527,4.594481,0.0149,0.004775,0.0149,0.003058
2,4.5356,5.061008,0.0088,0.002062,0.0088,0.002103


[I 2025-03-31 05:06:36,316] Trial 82 pruned. 


Trial 83 with params: {'learning_rate': 0.000567070571638144, 'weight_decay': 0.002, 'warmup_steps': 17}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7248,1.174344,0.6691,0.718281,0.6691,0.670883
2,0.7608,0.846547,0.7526,0.772913,0.7526,0.751697
3,0.464,0.79431,0.7726,0.787987,0.7726,0.772227
4,0.2828,0.686611,0.8116,0.819756,0.8116,0.811252


[I 2025-03-31 05:10:06,605] Trial 83 pruned. 


Trial 84 with params: {'learning_rate': 0.0001263783777020524, 'weight_decay': 0.001, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.1552,0.93935,0.7542,0.764709,0.7542,0.751756
2,0.6296,0.666131,0.8062,0.814177,0.8062,0.804558
3,0.344,0.591668,0.8263,0.831588,0.8263,0.825499
4,0.1879,0.58065,0.8333,0.839783,0.8333,0.833649


[I 2025-03-31 05:13:42,023] Trial 84 pruned. 


Trial 85 with params: {'learning_rate': 0.00015711013995218934, 'weight_decay': 0.0, 'warmup_steps': 19}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.014,0.896251,0.7547,0.766091,0.7547,0.751324
2,0.5886,0.634849,0.8117,0.819517,0.8117,0.81094
3,0.3205,0.577652,0.8267,0.834334,0.8267,0.826088
4,0.1735,0.572483,0.8381,0.843866,0.8381,0.838395
5,0.0875,0.571002,0.8459,0.849665,0.8459,0.846158
6,0.0412,0.580766,0.8484,0.852058,0.8484,0.84886
7,0.019,0.586512,0.8509,0.852758,0.8509,0.850932


[I 2025-03-31 05:19:45,809] Trial 85 finished with value: 0.850932369990647 and parameters: {'learning_rate': 0.00015711013995218934, 'weight_decay': 0.0, 'warmup_steps': 19}. Best is trial 47 with value: 0.858262931189481.


Trial 86 with params: {'learning_rate': 0.0003122147326697958, 'weight_decay': 0.008, 'warmup_steps': 22}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7326,0.892954,0.7408,0.766912,0.7408,0.741413
2,0.6033,0.69152,0.7965,0.808824,0.7965,0.796931


[I 2025-03-31 05:21:32,757] Trial 86 pruned. 


Trial 87 with params: {'learning_rate': 0.00014058953407551907, 'weight_decay': 0.003, 'warmup_steps': 18}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.0511,0.933819,0.7484,0.767775,0.7484,0.746041
2,0.6084,0.637131,0.8123,0.81751,0.8123,0.811186
3,0.3309,0.569449,0.832,0.837841,0.832,0.831191
4,0.1795,0.559938,0.8408,0.845343,0.8408,0.84109
5,0.0951,0.553849,0.8456,0.848526,0.8456,0.845535
6,0.047,0.566119,0.8517,0.854455,0.8517,0.851711
7,0.0227,0.571664,0.8533,0.855025,0.8533,0.85341


[I 2025-03-31 05:27:38,304] Trial 87 finished with value: 0.8534104978313405 and parameters: {'learning_rate': 0.00014058953407551907, 'weight_decay': 0.003, 'warmup_steps': 18}. Best is trial 47 with value: 0.858262931189481.


Trial 88 with params: {'learning_rate': 0.0004069088607169672, 'weight_decay': 0.0, 'warmup_steps': 17}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6949,0.979112,0.7172,0.750586,0.7172,0.716468
2,0.6465,0.746855,0.7803,0.793877,0.7803,0.779931
3,0.3858,0.646085,0.8114,0.817662,0.8114,0.810748
4,0.2169,0.639163,0.8256,0.831744,0.8256,0.825294


[I 2025-03-31 05:31:11,883] Trial 88 pruned. 


Trial 89 with params: {'learning_rate': 0.0002631534070067115, 'weight_decay': 0.003, 'warmup_steps': 29}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7874,0.898595,0.7422,0.766603,0.7422,0.7414
2,0.5714,0.670969,0.8001,0.810436,0.8001,0.799682
3,0.3241,0.606193,0.8238,0.83179,0.8238,0.823787
4,0.1756,0.610654,0.8296,0.838595,0.8296,0.831041


[I 2025-03-31 05:34:34,085] Trial 89 pruned. 


Trial 90 with params: {'learning_rate': 0.00028069581556103485, 'weight_decay': 0.003, 'warmup_steps': 10}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7315,0.874183,0.7498,0.770285,0.7498,0.747777
2,0.5882,0.708739,0.7889,0.804885,0.7889,0.78916


[I 2025-03-31 05:36:16,815] Trial 90 pruned. 


Trial 91 with params: {'learning_rate': 0.00016230216839537633, 'weight_decay': 0.001, 'warmup_steps': 14}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9656,0.890596,0.7536,0.772423,0.7536,0.751939
2,0.5848,0.673484,0.8021,0.814604,0.8021,0.801439
3,0.3155,0.579659,0.83,0.835665,0.83,0.829098
4,0.168,0.584966,0.8382,0.8441,0.8382,0.838669
5,0.0883,0.562879,0.8451,0.849107,0.8451,0.845171
6,0.0418,0.580442,0.8504,0.852664,0.8504,0.850608
7,0.0195,0.584438,0.852,0.853753,0.852,0.851991


[I 2025-03-31 05:42:25,066] Trial 91 finished with value: 0.8519913567256789 and parameters: {'learning_rate': 0.00016230216839537633, 'weight_decay': 0.001, 'warmup_steps': 14}. Best is trial 47 with value: 0.858262931189481.


Trial 92 with params: {'learning_rate': 0.0001258120252199228, 'weight_decay': 0.0, 'warmup_steps': 13}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.1298,0.933853,0.7546,0.766804,0.7546,0.751517
2,0.6269,0.649291,0.8106,0.816079,0.8106,0.80965
3,0.3432,0.572031,0.8293,0.834636,0.8293,0.828808
4,0.1929,0.567347,0.8362,0.841858,0.8362,0.8365
5,0.1036,0.558179,0.8431,0.847423,0.8431,0.843716
6,0.0538,0.565787,0.8478,0.850479,0.8478,0.848246
7,0.0284,0.573395,0.8495,0.851519,0.8495,0.849681


[I 2025-03-31 05:48:41,639] Trial 92 finished with value: 0.8496813753599316 and parameters: {'learning_rate': 0.0001258120252199228, 'weight_decay': 0.0, 'warmup_steps': 13}. Best is trial 47 with value: 0.858262931189481.


Trial 93 with params: {'learning_rate': 0.000351454113215484, 'weight_decay': 0.0, 'warmup_steps': 16}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6977,0.983573,0.7213,0.747442,0.7213,0.719181
2,0.6087,0.748977,0.7772,0.792648,0.7772,0.77659
3,0.3501,0.642479,0.8144,0.824352,0.8144,0.814261
4,0.2055,0.619207,0.827,0.83436,0.827,0.826912


[I 2025-03-31 05:52:12,579] Trial 93 pruned. 


Trial 94 with params: {'learning_rate': 0.00023761007639090846, 'weight_decay': 0.0, 'warmup_steps': 19}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7684,0.863444,0.757,0.775484,0.757,0.75667
2,0.5616,0.672062,0.8042,0.815118,0.8042,0.80397
3,0.3237,0.609401,0.8224,0.830636,0.8224,0.82245
4,0.1741,0.575429,0.8389,0.845562,0.8389,0.839101
5,0.0858,0.570941,0.8491,0.853927,0.8491,0.84955
6,0.0387,0.585468,0.8541,0.856953,0.8541,0.854597
7,0.0147,0.586386,0.8559,0.858391,0.8559,0.856214


[I 2025-03-31 05:58:15,156] Trial 94 finished with value: 0.8562139034655079 and parameters: {'learning_rate': 0.00023761007639090846, 'weight_decay': 0.0, 'warmup_steps': 19}. Best is trial 47 with value: 0.858262931189481.


Trial 95 with params: {'learning_rate': 0.0003483848551425693, 'weight_decay': 0.0, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7481,0.974823,0.7203,0.754531,0.7203,0.722324
2,0.6267,0.704839,0.7899,0.803294,0.7899,0.790075
3,0.3604,0.66713,0.8076,0.817583,0.8076,0.806501
4,0.2048,0.629081,0.8235,0.830759,0.8235,0.823513


[I 2025-03-31 06:01:44,031] Trial 95 pruned. 


Trial 96 with params: {'learning_rate': 0.0001901157180682291, 'weight_decay': 0.002, 'warmup_steps': 17}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8955,0.890103,0.7497,0.770838,0.7497,0.747828
2,0.573,0.670951,0.8015,0.813664,0.8015,0.8008
3,0.3195,0.58462,0.8306,0.837432,0.8306,0.830256
4,0.1677,0.57713,0.8397,0.845126,0.8397,0.839918
5,0.0868,0.575036,0.8457,0.848478,0.8457,0.845532
6,0.0398,0.59991,0.8487,0.850927,0.8487,0.848832
7,0.0171,0.599465,0.8521,0.85347,0.8521,0.851968


[I 2025-03-31 06:07:55,629] Trial 96 finished with value: 0.8519682823447389 and parameters: {'learning_rate': 0.0001901157180682291, 'weight_decay': 0.002, 'warmup_steps': 17}. Best is trial 47 with value: 0.858262931189481.


Trial 97 with params: {'learning_rate': 0.00032957362919749494, 'weight_decay': 0.006, 'warmup_steps': 22}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7527,0.918927,0.7395,0.760387,0.7395,0.738849
2,0.6062,0.724981,0.7843,0.799782,0.7843,0.784383
3,0.3525,0.653388,0.8097,0.819664,0.8097,0.809584
4,0.1963,0.624352,0.8271,0.836396,0.8271,0.827783


[I 2025-03-31 06:11:30,030] Trial 97 pruned. 


Trial 98 with params: {'learning_rate': 0.0035054904723296637, 'weight_decay': 0.009000000000000001, 'warmup_steps': 0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,4.6529,4.914945,0.0099,0.000233,0.0099,0.000411
2,4.6268,4.61494,0.01,0.0001,0.01,0.000198
3,4.6144,4.613803,0.01,0.0001,0.01,0.000198
4,4.612,4.614803,0.01,0.0001,0.01,0.000198


[I 2025-03-31 06:14:55,720] Trial 98 pruned. 


Trial 99 with params: {'learning_rate': 0.00015910941293599255, 'weight_decay': 0.0, 'warmup_steps': 21}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.0243,0.892277,0.7548,0.770756,0.7548,0.753621
2,0.5888,0.663658,0.8071,0.816057,0.8071,0.806221
3,0.3239,0.566296,0.8344,0.83902,0.8344,0.833763
4,0.1709,0.569428,0.8382,0.845699,0.8382,0.839345
5,0.0882,0.551341,0.853,0.855416,0.853,0.852888
6,0.0432,0.580336,0.8512,0.854656,0.8512,0.851618
7,0.0198,0.57991,0.8554,0.857256,0.8554,0.855504


[I 2025-03-31 06:21:19,604] Trial 99 finished with value: 0.8555041706901888 and parameters: {'learning_rate': 0.00015910941293599255, 'weight_decay': 0.0, 'warmup_steps': 21}. Best is trial 47 with value: 0.858262931189481.


Trial 100 with params: {'learning_rate': 0.00020851288059485612, 'weight_decay': 0.0, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9025,0.864432,0.752,0.769469,0.752,0.751012
2,0.5786,0.649393,0.8081,0.817094,0.8081,0.807196
3,0.3184,0.613824,0.8204,0.82814,0.8204,0.819752
4,0.1698,0.580155,0.8367,0.842776,0.8367,0.836762
5,0.0857,0.573797,0.8453,0.848297,0.8453,0.845433
6,0.0389,0.582741,0.8511,0.852754,0.8511,0.85101
7,0.016,0.582971,0.8579,0.859899,0.8579,0.857987


[I 2025-03-31 06:27:43,201] Trial 100 finished with value: 0.8579872667335532 and parameters: {'learning_rate': 0.00020851288059485612, 'weight_decay': 0.0, 'warmup_steps': 25}. Best is trial 47 with value: 0.858262931189481.


Trial 101 with params: {'learning_rate': 0.0001616568054524524, 'weight_decay': 0.0, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.033,0.858777,0.7641,0.775041,0.7641,0.76201
2,0.5775,0.658846,0.8104,0.821256,0.8104,0.80994
3,0.3148,0.567037,0.8356,0.840039,0.8356,0.835427
4,0.1684,0.569547,0.8363,0.842526,0.8363,0.836484


[I 2025-03-31 06:31:21,372] Trial 101 pruned. 


Trial 102 with params: {'learning_rate': 5.722823462641673e-05, 'weight_decay': 0.001, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.9491,1.515754,0.679,0.691099,0.679,0.669061
2,1.0567,0.883052,0.7766,0.778627,0.7766,0.774363
3,0.6154,0.712471,0.8058,0.808067,0.8058,0.804439
4,0.4119,0.648793,0.8181,0.821371,0.8181,0.817481


[I 2025-03-31 06:34:55,164] Trial 102 pruned. 


Trial 103 with params: {'learning_rate': 0.00024517833288059644, 'weight_decay': 0.0, 'warmup_steps': 22}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8175,0.908932,0.7387,0.7645,0.7387,0.737386
2,0.5845,0.658962,0.8105,0.822091,0.8105,0.810002
3,0.3265,0.606692,0.825,0.83302,0.825,0.825091
4,0.1784,0.607228,0.8302,0.83831,0.8302,0.830303


[I 2025-03-31 06:38:35,681] Trial 103 pruned. 


Trial 104 with params: {'learning_rate': 0.00010403782710255664, 'weight_decay': 0.0, 'warmup_steps': 21}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.3146,1.059716,0.7323,0.747333,0.7323,0.728078
2,0.7073,0.680816,0.8083,0.812792,0.8083,0.80714
3,0.3894,0.589585,0.8298,0.834782,0.8298,0.829
4,0.2238,0.568029,0.8356,0.841128,0.8356,0.835848
5,0.1266,0.563345,0.8412,0.844414,0.8412,0.841162
6,0.0701,0.569068,0.843,0.846299,0.843,0.843396
7,0.0416,0.573966,0.8433,0.845774,0.8433,0.843581


[I 2025-03-31 06:44:54,001] Trial 104 finished with value: 0.8435814146232896 and parameters: {'learning_rate': 0.00010403782710255664, 'weight_decay': 0.0, 'warmup_steps': 21}. Best is trial 47 with value: 0.858262931189481.


Trial 105 with params: {'learning_rate': 0.0006078662726350267, 'weight_decay': 0.01, 'warmup_steps': 2}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7228,1.125392,0.673,0.719524,0.673,0.675134
2,0.7797,0.907297,0.7332,0.755279,0.7332,0.731462


[I 2025-03-31 06:46:40,254] Trial 105 pruned. 


Trial 106 with params: {'learning_rate': 0.0002630580583460472, 'weight_decay': 0.0, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8067,0.888993,0.7438,0.765762,0.7438,0.742959
2,0.5863,0.68749,0.7993,0.808087,0.7993,0.797668
3,0.3334,0.609616,0.8228,0.830571,0.8228,0.82233
4,0.1777,0.613028,0.8305,0.838921,0.8305,0.830771
5,0.0903,0.600961,0.8417,0.845607,0.8417,0.84152
6,0.0391,0.599515,0.8496,0.851832,0.8496,0.84956
7,0.0153,0.596559,0.8535,0.854995,0.8535,0.853468


[I 2025-03-31 06:52:48,987] Trial 106 finished with value: 0.8534678573042683 and parameters: {'learning_rate': 0.0002630580583460472, 'weight_decay': 0.0, 'warmup_steps': 23}. Best is trial 47 with value: 0.858262931189481.


Trial 107 with params: {'learning_rate': 0.00012080655767541391, 'weight_decay': 0.002, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.2309,0.986135,0.7398,0.755463,0.7398,0.737534
2,0.6444,0.666639,0.8071,0.812892,0.8071,0.805092
3,0.3498,0.602182,0.8248,0.833948,0.8248,0.824494
4,0.1936,0.568791,0.8369,0.841734,0.8369,0.836869
5,0.1055,0.571272,0.8398,0.843968,0.8398,0.840237
6,0.0546,0.584526,0.8429,0.845432,0.8429,0.84306
7,0.0299,0.59045,0.8455,0.848031,0.8455,0.845788


[I 2025-03-31 06:59:02,087] Trial 107 finished with value: 0.8457880016241975 and parameters: {'learning_rate': 0.00012080655767541391, 'weight_decay': 0.002, 'warmup_steps': 26}. Best is trial 47 with value: 0.858262931189481.


Trial 108 with params: {'learning_rate': 0.00017812635045734408, 'weight_decay': 0.0, 'warmup_steps': 22}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9268,0.911693,0.7476,0.766172,0.7476,0.746312
2,0.5789,0.645027,0.8118,0.820951,0.8118,0.811768
3,0.3219,0.585717,0.833,0.838904,0.833,0.832303
4,0.1739,0.571133,0.8402,0.847348,0.8402,0.840798
5,0.0899,0.566807,0.8468,0.850112,0.8468,0.846794
6,0.0424,0.579271,0.8538,0.857205,0.8538,0.854195
7,0.0191,0.582632,0.8563,0.858626,0.8563,0.856477


[I 2025-03-31 07:05:26,921] Trial 108 finished with value: 0.8564767576210022 and parameters: {'learning_rate': 0.00017812635045734408, 'weight_decay': 0.0, 'warmup_steps': 22}. Best is trial 47 with value: 0.858262931189481.


Trial 109 with params: {'learning_rate': 0.0004997479077135488, 'weight_decay': 0.004, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7547,1.076018,0.6921,0.732171,0.6921,0.691965
2,0.7205,0.824145,0.763,0.783683,0.763,0.763473
3,0.4413,0.745573,0.7838,0.800105,0.7838,0.783637
4,0.252,0.655966,0.8125,0.821556,0.8125,0.81284


[I 2025-03-31 07:09:06,747] Trial 109 pruned. 


Trial 110 with params: {'learning_rate': 0.00019272128592044227, 'weight_decay': 0.004, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9382,0.880319,0.7546,0.773918,0.7546,0.754193
2,0.5756,0.634656,0.8102,0.819077,0.8102,0.809448
3,0.3154,0.605749,0.8235,0.831079,0.8235,0.823476
4,0.1695,0.590777,0.8343,0.845023,0.8343,0.835698
5,0.0871,0.562585,0.8461,0.849955,0.8461,0.846312
6,0.039,0.575966,0.8522,0.855972,0.8522,0.852493
7,0.0163,0.579754,0.8549,0.857009,0.8549,0.855136


[I 2025-03-31 07:15:13,663] Trial 110 finished with value: 0.8551359469466883 and parameters: {'learning_rate': 0.00019272128592044227, 'weight_decay': 0.004, 'warmup_steps': 26}. Best is trial 47 with value: 0.858262931189481.


Trial 111 with params: {'learning_rate': 0.00015453131306122868, 'weight_decay': 0.0, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.0219,0.902906,0.7538,0.769874,0.7538,0.753153
2,0.5939,0.642107,0.812,0.818711,0.812,0.811589
3,0.3216,0.575989,0.8348,0.840506,0.8348,0.834453
4,0.1741,0.581549,0.8357,0.84294,0.8357,0.835908


[I 2025-03-31 07:18:42,313] Trial 111 pruned. 


Trial 112 with params: {'learning_rate': 0.00016443200065398682, 'weight_decay': 0.0, 'warmup_steps': 20}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.988,0.887596,0.7564,0.769961,0.7564,0.755016
2,0.5846,0.659413,0.806,0.815696,0.806,0.804392
3,0.3175,0.582374,0.829,0.836263,0.829,0.829182
4,0.1738,0.579287,0.8377,0.844236,0.8377,0.838455
5,0.0891,0.580764,0.8413,0.846401,0.8413,0.841664
6,0.0429,0.573039,0.852,0.855088,0.852,0.852481
7,0.0199,0.578957,0.8511,0.853234,0.8511,0.851354


[I 2025-03-31 07:24:39,415] Trial 112 finished with value: 0.8513536704356837 and parameters: {'learning_rate': 0.00016443200065398682, 'weight_decay': 0.0, 'warmup_steps': 20}. Best is trial 47 with value: 0.858262931189481.


Trial 113 with params: {'learning_rate': 0.00015193795729227338, 'weight_decay': 0.001, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.0455,0.900527,0.7557,0.768739,0.7557,0.754722
2,0.5852,0.635555,0.8139,0.817826,0.8139,0.812379
3,0.3163,0.585354,0.8299,0.835206,0.8299,0.829214
4,0.1677,0.572051,0.8394,0.845675,0.8394,0.839983
5,0.0876,0.573012,0.8427,0.845822,0.8427,0.842947
6,0.0417,0.594012,0.8468,0.849973,0.8468,0.846926
7,0.0196,0.597345,0.8504,0.852938,0.8504,0.850557


[I 2025-03-31 07:30:52,403] Trial 113 finished with value: 0.8505567286844333 and parameters: {'learning_rate': 0.00015193795729227338, 'weight_decay': 0.001, 'warmup_steps': 26}. Best is trial 47 with value: 0.858262931189481.


Trial 114 with params: {'learning_rate': 0.00016022439652706313, 'weight_decay': 0.0, 'warmup_steps': 22}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9982,0.877591,0.7638,0.777629,0.7638,0.762094
2,0.5799,0.645222,0.8111,0.819024,0.8111,0.809348
3,0.3163,0.575607,0.8315,0.838062,0.8315,0.831424
4,0.1686,0.571452,0.8391,0.843495,0.8391,0.839119
5,0.088,0.569006,0.845,0.848081,0.845,0.845079
6,0.0421,0.58806,0.8507,0.852602,0.8507,0.850646
7,0.0194,0.588447,0.8542,0.855751,0.8542,0.85419


[I 2025-03-31 07:37:09,544] Trial 114 finished with value: 0.8541895689642333 and parameters: {'learning_rate': 0.00016022439652706313, 'weight_decay': 0.0, 'warmup_steps': 22}. Best is trial 47 with value: 0.858262931189481.


Trial 115 with params: {'learning_rate': 0.00011415190388750719, 'weight_decay': 0.0, 'warmup_steps': 17}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.2346,0.994666,0.7433,0.754922,0.7433,0.740506
2,0.6545,0.668484,0.8064,0.812857,0.8064,0.805403
3,0.3554,0.591689,0.8277,0.831911,0.8277,0.826678
4,0.199,0.572365,0.8362,0.841009,0.8362,0.836109


[I 2025-03-31 07:40:43,200] Trial 115 pruned. 


Trial 116 with params: {'learning_rate': 0.00021943453655427867, 'weight_decay': 0.0, 'warmup_steps': 18}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8443,0.887975,0.7496,0.771251,0.7496,0.749232
2,0.5731,0.68911,0.7996,0.810507,0.7996,0.797826
3,0.316,0.601321,0.8253,0.833399,0.8253,0.82507
4,0.1712,0.590361,0.8315,0.836758,0.8315,0.831752
5,0.0844,0.584871,0.8443,0.847751,0.8443,0.844338
6,0.0392,0.595627,0.8487,0.852598,0.8487,0.849474
7,0.0145,0.599981,0.8537,0.856495,0.8537,0.854182


[I 2025-03-31 07:46:59,164] Trial 116 finished with value: 0.8541822407174573 and parameters: {'learning_rate': 0.00021943453655427867, 'weight_decay': 0.0, 'warmup_steps': 18}. Best is trial 47 with value: 0.858262931189481.


Trial 117 with params: {'learning_rate': 0.00019466099384651692, 'weight_decay': 0.005, 'warmup_steps': 29}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9453,0.93037,0.739,0.764001,0.739,0.737433
2,0.567,0.644594,0.8074,0.8178,0.8074,0.806938
3,0.3141,0.576219,0.8287,0.834188,0.8287,0.827903
4,0.165,0.56585,0.842,0.84694,0.842,0.841944
5,0.0849,0.566788,0.8496,0.853227,0.8496,0.849808
6,0.0376,0.578147,0.8521,0.855371,0.8521,0.852184
7,0.0153,0.579214,0.8566,0.859056,0.8566,0.856908


[I 2025-03-31 07:53:10,644] Trial 117 finished with value: 0.856907713325034 and parameters: {'learning_rate': 0.00019466099384651692, 'weight_decay': 0.005, 'warmup_steps': 29}. Best is trial 47 with value: 0.858262931189481.


Trial 118 with params: {'learning_rate': 0.00013905550653770551, 'weight_decay': 0.005, 'warmup_steps': 28}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.1277,0.929908,0.7491,0.764816,0.7491,0.746806
2,0.6137,0.652322,0.8091,0.814623,0.8091,0.807549
3,0.3353,0.589381,0.8301,0.835667,0.8301,0.829631
4,0.183,0.570954,0.8403,0.846209,0.8403,0.840592
5,0.0965,0.569434,0.8417,0.845327,0.8417,0.842061
6,0.0476,0.58212,0.8442,0.847124,0.8442,0.844506
7,0.0235,0.585332,0.8496,0.85179,0.8496,0.849889


[I 2025-03-31 07:59:28,792] Trial 118 finished with value: 0.849889211669569 and parameters: {'learning_rate': 0.00013905550653770551, 'weight_decay': 0.005, 'warmup_steps': 28}. Best is trial 47 with value: 0.858262931189481.


Trial 119 with params: {'learning_rate': 0.0003100751284266726, 'weight_decay': 0.005, 'warmup_steps': 29}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7832,0.914524,0.7377,0.762492,0.7377,0.735915
2,0.6029,0.690068,0.7967,0.806912,0.7967,0.79546
3,0.3501,0.645531,0.8124,0.820621,0.8124,0.811704
4,0.194,0.620931,0.8257,0.833243,0.8257,0.826588


[I 2025-03-31 08:03:01,010] Trial 119 pruned. 


Trial 120 with params: {'learning_rate': 0.00018943979662935218, 'weight_decay': 0.007, 'warmup_steps': 31}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9337,0.873636,0.7596,0.777239,0.7596,0.758351
2,0.5657,0.643019,0.8098,0.81542,0.8098,0.808644
3,0.3119,0.582202,0.8313,0.83624,0.8313,0.830612
4,0.1659,0.561085,0.8423,0.847868,0.8423,0.842676
5,0.0848,0.572809,0.8439,0.848951,0.8439,0.84442
6,0.038,0.573912,0.8516,0.853926,0.8516,0.851802
7,0.0159,0.577629,0.8543,0.855725,0.8543,0.85425


[I 2025-03-31 08:09:13,724] Trial 120 finished with value: 0.8542496750308672 and parameters: {'learning_rate': 0.00018943979662935218, 'weight_decay': 0.007, 'warmup_steps': 31}. Best is trial 47 with value: 0.858262931189481.


Trial 121 with params: {'learning_rate': 0.00017224841646921478, 'weight_decay': 0.005, 'warmup_steps': 30}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9847,0.869121,0.7618,0.779074,0.7618,0.762031
2,0.5687,0.64634,0.8067,0.81518,0.8067,0.80519
3,0.312,0.58791,0.8261,0.832327,0.8261,0.824745
4,0.1679,0.571671,0.838,0.845499,0.838,0.838715
5,0.0867,0.56185,0.8458,0.849442,0.8458,0.846002
6,0.0405,0.579212,0.8483,0.851289,0.8483,0.848606
7,0.0187,0.579482,0.8505,0.852665,0.8505,0.850753


[I 2025-03-31 08:15:35,707] Trial 121 finished with value: 0.8507533439094348 and parameters: {'learning_rate': 0.00017224841646921478, 'weight_decay': 0.005, 'warmup_steps': 30}. Best is trial 47 with value: 0.858262931189481.


Trial 122 with params: {'learning_rate': 0.0001855452987695449, 'weight_decay': 0.002, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9244,0.906626,0.749,0.769857,0.749,0.74914
2,0.5835,0.650946,0.8083,0.817175,0.8083,0.807469
3,0.3182,0.610747,0.8247,0.833827,0.8247,0.824547
4,0.1701,0.583295,0.8353,0.842828,0.8353,0.836173


[I 2025-03-31 08:19:10,793] Trial 122 pruned. 


Trial 123 with params: {'learning_rate': 0.00018212466178634033, 'weight_decay': 0.004, 'warmup_steps': 21}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9004,0.8802,0.7564,0.773904,0.7564,0.754476
2,0.5654,0.677639,0.8049,0.81386,0.8049,0.803358
3,0.311,0.574559,0.832,0.838264,0.832,0.83158
4,0.1635,0.586376,0.8314,0.83917,0.8314,0.832444
5,0.0832,0.586265,0.8422,0.846351,0.8422,0.84245
6,0.0385,0.594125,0.8487,0.85171,0.8487,0.849046
7,0.0167,0.597876,0.8505,0.852252,0.8505,0.850501


[I 2025-03-31 08:25:26,849] Trial 123 finished with value: 0.8505013820411119 and parameters: {'learning_rate': 0.00018212466178634033, 'weight_decay': 0.004, 'warmup_steps': 21}. Best is trial 47 with value: 0.858262931189481.


Trial 124 with params: {'learning_rate': 0.00040228303245632617, 'weight_decay': 0.005, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7822,0.99733,0.7126,0.743855,0.7126,0.710084
2,0.6617,0.792287,0.7704,0.79001,0.7704,0.769462


[I 2025-03-31 08:27:18,986] Trial 124 pruned. 


Trial 125 with params: {'learning_rate': 0.00030366880813551987, 'weight_decay': 0.0, 'warmup_steps': 22}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7333,0.92529,0.7346,0.760419,0.7346,0.734227
2,0.5918,0.705243,0.7905,0.804206,0.7905,0.790023
3,0.3507,0.666662,0.8036,0.816248,0.8036,0.80456
4,0.1917,0.638988,0.8257,0.837382,0.8257,0.826321


[I 2025-03-31 08:30:49,271] Trial 125 pruned. 


Trial 126 with params: {'learning_rate': 0.0013550745741247334, 'weight_decay': 0.003, 'warmup_steps': 29}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.3913,1.999229,0.4591,0.556044,0.4591,0.460215
2,1.3811,1.358467,0.6098,0.668157,0.6098,0.60676


[I 2025-03-31 08:32:37,623] Trial 126 pruned. 


Trial 127 with params: {'learning_rate': 0.0004359294741940502, 'weight_decay': 0.008, 'warmup_steps': 30}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.741,1.08182,0.6975,0.740297,0.6975,0.699882
2,0.6788,0.777582,0.7682,0.789753,0.7682,0.767702
3,0.4149,0.659604,0.8059,0.814054,0.8059,0.804274
4,0.2338,0.628039,0.8227,0.831477,0.8227,0.822902


[I 2025-03-31 08:36:18,476] Trial 127 pruned. 


Trial 128 with params: {'learning_rate': 0.0002981351266785797, 'weight_decay': 0.003, 'warmup_steps': 22}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7642,0.895527,0.742,0.761691,0.742,0.740177
2,0.5904,0.691944,0.794,0.810077,0.794,0.793575
3,0.3385,0.609682,0.8215,0.831086,0.8215,0.821779
4,0.1815,0.578786,0.8356,0.841061,0.8356,0.836029
5,0.0943,0.585289,0.8447,0.847762,0.8447,0.844553
6,0.0423,0.594698,0.8521,0.855358,0.8521,0.852618
7,0.0151,0.588158,0.8583,0.860568,0.8583,0.858425


[I 2025-03-31 08:42:44,306] Trial 128 finished with value: 0.8584254154160783 and parameters: {'learning_rate': 0.0002981351266785797, 'weight_decay': 0.003, 'warmup_steps': 22}. Best is trial 128 with value: 0.8584254154160783.


Trial 129 with params: {'learning_rate': 0.00034397429267564136, 'weight_decay': 0.003, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7402,0.9498,0.7249,0.750793,0.7249,0.723102
2,0.6206,0.717263,0.7855,0.800038,0.7855,0.785198


[I 2025-03-31 08:44:33,421] Trial 129 pruned. 


Trial 130 with params: {'learning_rate': 0.00028716626628042983, 'weight_decay': 0.006, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7473,0.924175,0.733,0.763334,0.733,0.732605
2,0.5789,0.693086,0.7959,0.809097,0.7959,0.795158


[I 2025-03-31 08:46:20,718] Trial 130 pruned. 


Trial 131 with params: {'learning_rate': 0.0002253768282010734, 'weight_decay': 0.003, 'warmup_steps': 20}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8285,0.873198,0.7519,0.771097,0.7519,0.749699
2,0.5782,0.691525,0.7933,0.808798,0.7933,0.793556
3,0.323,0.595817,0.8277,0.835656,0.8277,0.827789
4,0.177,0.597358,0.8301,0.837481,0.8301,0.830471
5,0.0876,0.586939,0.8437,0.847355,0.8437,0.843566
6,0.0391,0.598059,0.8493,0.851662,0.8493,0.849648
7,0.0154,0.598853,0.8507,0.852505,0.8507,0.850889


[I 2025-03-31 08:52:38,353] Trial 131 finished with value: 0.8508886246284956 and parameters: {'learning_rate': 0.0002253768282010734, 'weight_decay': 0.003, 'warmup_steps': 20}. Best is trial 128 with value: 0.8584254154160783.


Trial 132 with params: {'learning_rate': 0.00039885498427014534, 'weight_decay': 0.004, 'warmup_steps': 22}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7067,1.024889,0.7038,0.739844,0.7038,0.704973
2,0.6472,0.74507,0.7804,0.797976,0.7804,0.780096


[I 2025-03-31 08:54:27,531] Trial 132 pruned. 


Trial 133 with params: {'learning_rate': 0.0002811260379288266, 'weight_decay': 0.001, 'warmup_steps': 19}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7425,0.891754,0.7448,0.763579,0.7448,0.74316
2,0.5848,0.71147,0.7906,0.805948,0.7906,0.790495
3,0.3381,0.601746,0.8239,0.829437,0.8239,0.823434
4,0.1796,0.607334,0.8348,0.843167,0.8348,0.835839
5,0.0914,0.59663,0.8442,0.848799,0.8442,0.844316
6,0.0412,0.59242,0.849,0.851647,0.849,0.849408
7,0.0143,0.596666,0.8538,0.85565,0.8538,0.854012


[I 2025-03-31 09:00:47,055] Trial 133 finished with value: 0.854012101695182 and parameters: {'learning_rate': 0.0002811260379288266, 'weight_decay': 0.001, 'warmup_steps': 19}. Best is trial 128 with value: 0.8584254154160783.


Trial 134 with params: {'learning_rate': 0.0002613515451306125, 'weight_decay': 0.003, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7912,0.920437,0.7404,0.76391,0.7404,0.738106
2,0.5722,0.693587,0.7943,0.808711,0.7943,0.793149


[I 2025-03-31 09:02:36,334] Trial 134 pruned. 


Trial 135 with params: {'learning_rate': 0.00010210594446542636, 'weight_decay': 0.008, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.3601,1.055458,0.7298,0.745649,0.7298,0.726879
2,0.7012,0.682615,0.8091,0.81296,0.8091,0.807373
3,0.3873,0.605259,0.8258,0.832223,0.8258,0.825194
4,0.222,0.577856,0.8337,0.838312,0.8337,0.833793
5,0.1266,0.577638,0.8395,0.84279,0.8395,0.839685
6,0.0706,0.593909,0.8393,0.842029,0.8393,0.839632
7,0.0423,0.595052,0.8399,0.841765,0.8399,0.839984


[I 2025-03-31 09:08:54,253] Trial 135 finished with value: 0.8399842634117834 and parameters: {'learning_rate': 0.00010210594446542636, 'weight_decay': 0.008, 'warmup_steps': 23}. Best is trial 128 with value: 0.8584254154160783.


Trial 136 with params: {'learning_rate': 0.0002103982311107832, 'weight_decay': 0.0, 'warmup_steps': 13}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8079,0.858969,0.7574,0.771141,0.7574,0.755228
2,0.5619,0.661042,0.8056,0.815123,0.8056,0.80473
3,0.3151,0.592822,0.829,0.836423,0.829,0.829006
4,0.1659,0.583148,0.8376,0.843079,0.8376,0.837528
5,0.0829,0.585193,0.845,0.847989,0.845,0.844896
6,0.0392,0.588311,0.8522,0.85492,0.8522,0.852506
7,0.0163,0.588485,0.854,0.855683,0.854,0.85399


[I 2025-03-31 09:15:23,715] Trial 136 finished with value: 0.8539901738407394 and parameters: {'learning_rate': 0.0002103982311107832, 'weight_decay': 0.0, 'warmup_steps': 13}. Best is trial 128 with value: 0.8584254154160783.


Trial 137 with params: {'learning_rate': 8.338635007443792e-05, 'weight_decay': 0.005, 'warmup_steps': 22}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.5412,1.181185,0.7203,0.729134,0.7203,0.714648
2,0.7975,0.728677,0.8,0.804874,0.8,0.798808
3,0.451,0.628815,0.82,0.826297,0.82,0.819549
4,0.2743,0.577555,0.834,0.837442,0.834,0.833947
5,0.1701,0.572326,0.8384,0.841268,0.8384,0.838249
6,0.1064,0.579415,0.8372,0.839918,0.8372,0.837484
7,0.0721,0.582963,0.8386,0.840406,0.8386,0.838808


[I 2025-03-31 09:21:59,185] Trial 137 finished with value: 0.838808247505616 and parameters: {'learning_rate': 8.338635007443792e-05, 'weight_decay': 0.005, 'warmup_steps': 22}. Best is trial 128 with value: 0.8584254154160783.


Trial 138 with params: {'learning_rate': 0.0006172014997619855, 'weight_decay': 0.001, 'warmup_steps': 28}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8105,1.224294,0.6523,0.703942,0.6523,0.64802
2,0.8006,0.911332,0.7332,0.752959,0.7332,0.732985
3,0.4995,0.781106,0.7762,0.79263,0.7762,0.774873
4,0.3023,0.698822,0.8057,0.814769,0.8057,0.805863


[I 2025-03-31 09:25:48,981] Trial 138 pruned. 


Trial 139 with params: {'learning_rate': 0.0003354995018750834, 'weight_decay': 0.002, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7142,0.966196,0.7213,0.754357,0.7213,0.722818
2,0.6103,0.701799,0.7919,0.807568,0.7919,0.792303


[I 2025-03-31 09:27:32,369] Trial 139 pruned. 


Trial 140 with params: {'learning_rate': 0.0007926669366535624, 'weight_decay': 0.0, 'warmup_steps': 7}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8857,1.405962,0.6003,0.656804,0.6003,0.598057
2,0.9262,1.005445,0.7072,0.732339,0.7072,0.706321
3,0.6086,0.890864,0.7468,0.766163,0.7468,0.745305
4,0.3785,0.747575,0.7862,0.794511,0.7862,0.785672


[I 2025-03-31 09:31:09,936] Trial 140 pruned. 


Trial 141 with params: {'learning_rate': 0.00022426795288967637, 'weight_decay': 0.004, 'warmup_steps': 31}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8549,0.882734,0.7506,0.776795,0.7506,0.750665
2,0.562,0.668402,0.8039,0.814164,0.8039,0.803384
3,0.3211,0.600599,0.8236,0.831442,0.8236,0.823754
4,0.1703,0.582283,0.8364,0.84439,0.8364,0.836991
5,0.0858,0.583157,0.8461,0.849824,0.8461,0.845973
6,0.0395,0.582006,0.8494,0.851704,0.8494,0.849424
7,0.0152,0.585422,0.8539,0.855629,0.8539,0.853999


[I 2025-03-31 09:37:23,256] Trial 141 finished with value: 0.8539985344882018 and parameters: {'learning_rate': 0.00022426795288967637, 'weight_decay': 0.004, 'warmup_steps': 31}. Best is trial 128 with value: 0.8584254154160783.


Trial 142 with params: {'learning_rate': 9.791394727452942e-05, 'weight_decay': 0.002, 'warmup_steps': 21}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.394,1.094379,0.7286,0.737462,0.7286,0.723733
2,0.733,0.704374,0.8081,0.813638,0.8081,0.806988
3,0.403,0.599805,0.8261,0.832024,0.8261,0.825947
4,0.237,0.575231,0.8359,0.840522,0.8359,0.835941
5,0.1399,0.565367,0.841,0.844227,0.841,0.841114
6,0.0811,0.581106,0.8396,0.842185,0.8396,0.839663
7,0.0504,0.578197,0.8434,0.845281,0.8434,0.843536


[I 2025-03-31 09:43:48,752] Trial 142 finished with value: 0.8435355466365999 and parameters: {'learning_rate': 9.791394727452942e-05, 'weight_decay': 0.002, 'warmup_steps': 21}. Best is trial 128 with value: 0.8584254154160783.


Trial 143 with params: {'learning_rate': 0.00022688784783360752, 'weight_decay': 0.004, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8624,0.885702,0.7482,0.765926,0.7482,0.746869
2,0.5797,0.664933,0.8026,0.812714,0.8026,0.801647
3,0.3208,0.604745,0.8254,0.834732,0.8254,0.825406
4,0.1686,0.583049,0.8382,0.847214,0.8382,0.839184
5,0.0838,0.599431,0.8404,0.844981,0.8404,0.840838
6,0.0367,0.593038,0.8493,0.851997,0.8493,0.849408
7,0.0135,0.59801,0.8522,0.854609,0.8522,0.8524


[I 2025-03-31 09:50:30,324] Trial 143 finished with value: 0.8524003211236135 and parameters: {'learning_rate': 0.00022688784783360752, 'weight_decay': 0.004, 'warmup_steps': 25}. Best is trial 128 with value: 0.8584254154160783.


Trial 144 with params: {'learning_rate': 0.0002229977505996049, 'weight_decay': 0.002, 'warmup_steps': 28}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8641,0.889781,0.7463,0.765393,0.7463,0.745477
2,0.5712,0.658966,0.8011,0.810222,0.8011,0.800204
3,0.321,0.588186,0.8283,0.835045,0.8283,0.827755
4,0.1696,0.587377,0.832,0.83971,0.832,0.832382
5,0.0843,0.580028,0.8434,0.847254,0.8434,0.843713
6,0.0388,0.581329,0.8495,0.852848,0.8495,0.85009
7,0.0154,0.581387,0.8537,0.855835,0.8537,0.853889


[I 2025-03-31 09:57:03,119] Trial 144 finished with value: 0.8538886415807668 and parameters: {'learning_rate': 0.0002229977505996049, 'weight_decay': 0.002, 'warmup_steps': 28}. Best is trial 128 with value: 0.8584254154160783.


Trial 145 with params: {'learning_rate': 0.0002485928646598673, 'weight_decay': 0.004, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7908,0.874078,0.7522,0.766341,0.7522,0.750404
2,0.5723,0.649336,0.8066,0.818555,0.8066,0.806158
3,0.3253,0.591186,0.8258,0.8333,0.8258,0.82604
4,0.1745,0.587093,0.8358,0.843014,0.8358,0.836594


[I 2025-03-31 10:00:43,660] Trial 145 pruned. 


Trial 146 with params: {'learning_rate': 0.004283355770338839, 'weight_decay': 0.01, 'warmup_steps': 17}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,4.6026,4.618752,0.0105,0.000247,0.0105,0.000472
2,4.6043,4.607291,0.0101,0.000281,0.0101,0.000505
3,4.602,4.606727,0.0111,0.00155,0.0111,0.002219
4,4.5907,4.572188,0.0135,0.001823,0.0135,0.002552


[I 2025-03-31 10:04:12,015] Trial 146 pruned. 


Trial 147 with params: {'learning_rate': 9.504044653073815e-05, 'weight_decay': 0.004, 'warmup_steps': 31}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.5166,1.175426,0.7135,0.724211,0.7135,0.708496
2,0.7877,0.721145,0.7981,0.802486,0.7981,0.796367


[I 2025-03-31 10:06:00,185] Trial 147 pruned. 


Trial 148 with params: {'learning_rate': 0.0001090773954464358, 'weight_decay': 0.004, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.2929,1.00205,0.7426,0.756794,0.7426,0.739589
2,0.6687,0.665586,0.8119,0.816963,0.8119,0.811212
3,0.369,0.601632,0.8258,0.831648,0.8258,0.825296
4,0.2075,0.573381,0.833,0.837636,0.833,0.833273


[I 2025-03-31 10:09:27,527] Trial 148 pruned. 


Trial 149 with params: {'learning_rate': 0.0002810305566437371, 'weight_decay': 0.0, 'warmup_steps': 18}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7301,0.933086,0.7357,0.757049,0.7357,0.735251
2,0.5862,0.685765,0.7989,0.810688,0.7989,0.798517
3,0.3322,0.61459,0.8216,0.829606,0.8216,0.821084
4,0.1781,0.590959,0.8375,0.844183,0.8375,0.837803
5,0.0903,0.586032,0.8455,0.850747,0.8455,0.845742
6,0.0393,0.605587,0.8527,0.855476,0.8527,0.852896
7,0.0144,0.599988,0.8541,0.855938,0.8541,0.854195


[I 2025-03-31 10:15:57,234] Trial 149 finished with value: 0.8541947369693546 and parameters: {'learning_rate': 0.0002810305566437371, 'weight_decay': 0.0, 'warmup_steps': 18}. Best is trial 128 with value: 0.8584254154160783.


In [19]:
print(best_base)

BestRun(run_id='128', objective=0.8584254154160783, hyperparameters={'learning_rate': 0.0002981351266785797, 'weight_decay': 0.003, 'warmup_steps': 22}, run_summary=None)


In [20]:
base.reset_seed()

## Hyperparameter search with distillation on the original dataset
Configuration of the individual training runs.

In [21]:
training_args = base.get_training_args(
    output_dir=f"~/results/{DATASET}/-KD_hp-search",
    logging_dir=f"~/logs/{DATASET}/-KD_hp-search",
    remove_unused_columns=False,
    epochs=num_epochs,
    batch_size=batch_size,
)

Definition of the searched hyperparameters and their ranges, extended with the distillation hyperparameters.


In [22]:
def hp_space(trial):
    # Search space: the same hyperparameters as before plus the two distillation ones.
    params = {
        "learning_rate": trial.suggest_float("learning_rate", 5e-5, 5e-3, log=True),
        "weight_decay": trial.suggest_float("weight_decay", 0, 1e-2, step=1e-3),
        "warmup_steps": trial.suggest_int("warmup_steps", 0, warm_up),
        # Distillation-specific hyperparameters.
        "lambda_param": trial.suggest_float("lambda_param", 0, 1, step=0.1),
        "temperature": trial.suggest_float("temperature", 2, 7, step=0.5),
    }
    print(f"Trial {trial.number} with params: {params}")
    return params
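
The search space above adds `lambda_param` and `temperature`, but the notebook does not show how `base.DistilTrainer` combines them. The following is a minimal sketch of one common knowledge-distillation loss, written in plain Python so it runs without a GPU. It assumes that `lambda_param` weights the soft-target term against the hard-label cross-entropy and that the soft-target term is scaled by the squared temperature; the function names `softmax` and `distillation_loss` are hypothetical and not taken from `base`.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of a list of logits, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label, lambda_param, temperature):
    """Assumed form: (1 - lambda) * CE(student, label) + lambda * T^2 * KL(teacher || student)."""
    # Hard-label term: cross-entropy of the student at temperature 1.
    cross_entropy = -math.log(softmax(student_logits)[label])

    # Soft-target term: KL divergence between the softened teacher and student distributions.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        t * (math.log(t) - math.log(s))
        for t, s in zip(p_teacher, p_student)
        if t > 0.0
    )
    # Scaling by T^2 keeps gradient magnitudes comparable across temperatures.
    kl *= temperature ** 2

    return (1.0 - lambda_param) * cross_entropy + lambda_param * kl
```

Under this assumed formulation, `lambda_param=0` reduces to ordinary supervised training and `lambda_param=1` trains on the teacher's softened outputs only, which is why both endpoints are worth including in the search range.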

Optuna configuration.

In [23]:
pruner = optuna.pruners.HyperbandPruner(min_resource=min_r, max_resource=max_r, reduction_factor=2, bootstrap_count=2)
sampler = optuna.samplers.TPESampler(seed=42, multivariate=True)
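
To make the pruner settings easier to reason about, the sketch below lists the steps at which a trial can be pruned in each Hyperband bracket. It assumes the usual Hyperband layout, in which bracket `s` runs successive halving with rungs at `min_resource * reduction_factor ** (s + r)`. The helper `hyperband_rungs` and the example values 2 and 7 are hypothetical, because `min_r` and `max_r` are defined elsewhere in the notebook.

```python
def hyperband_rungs(min_resource, max_resource, reduction_factor):
    """Return, for each bracket, the steps at which a trial may be pruned."""
    # Number of brackets = count of k >= 0 with min_resource * eta**k <= max_resource,
    # which equals floor(log_eta(max_resource / min_resource)) + 1.
    n_brackets = 0
    step = min_resource
    while step <= max_resource:
        n_brackets += 1
        step *= reduction_factor

    brackets = []
    for bracket_id in range(n_brackets):
        # Later brackets start pruning later, so they are less aggressive.
        rungs = []
        step = min_resource * reduction_factor ** bracket_id
        while step <= max_resource:
            rungs.append(step)
            step *= reduction_factor
        brackets.append(rungs)
    return brackets


# Hypothetical values, for illustration only.
print(hyperband_rungs(min_resource=2, max_resource=7, reduction_factor=2))
# → [[2, 4], [4]]
```

A trial that survives every rung in its bracket runs to `max_resource` and is reported as finished rather than pruned.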



Configuration of the distillation trainer for the individual training runs.

In [24]:
trainer = base.DistilTrainer(
    args=training_args,
    train_dataset=train,
    eval_dataset=eval,
    compute_metrics=base.compute_metrics,
    model_init=lambda: get_model(),
)

Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Search setup.

In [25]:
best_distill = trainer.hyperparameter_search(
    direction="maximize",
    backend="optuna",
    hp_space=hp_space,
    compute_objective=lambda metrics: metrics["eval_f1"],
    pruner=pruner,
    sampler=sampler,
    study_name="Distill",
    n_trials=150
)

[I 2025-03-31 10:15:57,799] A new study created in memory with name: Distill


Trial 0 with params: {'learning_rate': 0.0002805758207667253, 'weight_decay': 0.01, 'warmup_steps': 24, 'lambda_param': 0.6000000000000001, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8092,0.960473,0.7498,0.765997,0.7498,0.747919
2,0.6756,0.721868,0.8043,0.814398,0.8043,0.804087


[I 2025-03-31 10:17:45,659] Trial 0 pruned. 


Trial 1 with params: {'learning_rate': 0.00010255552094216992, 'weight_decay': 0.0, 'warmup_steps': 28, 'lambda_param': 0.6000000000000001, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.3545,1.186908,0.7399,0.753541,0.7399,0.737003
2,0.8418,0.770264,0.8092,0.814288,0.8092,0.808363
3,0.5112,0.64917,0.8283,0.832567,0.8283,0.827248
4,0.3478,0.60426,0.8382,0.842427,0.8382,0.838038


[I 2025-03-31 10:21:26,216] Trial 1 pruned. 


Trial 2 with params: {'learning_rate': 5.497167787383099e-05, 'weight_decay': 0.01, 'warmup_steps': 27, 'lambda_param': 0.2, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.8958,1.726413,0.6608,0.683992,0.6608,0.649111
2,1.2769,1.056655,0.7753,0.778707,0.7753,0.772844


[I 2025-03-31 10:23:14,025] Trial 2 pruned. 


Trial 3 with params: {'learning_rate': 0.00011635338541918901, 'weight_decay': 0.003, 'warmup_steps': 17, 'lambda_param': 0.4, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.2048,1.130151,0.7497,0.764223,0.7497,0.746939
2,0.7969,0.751034,0.8089,0.815391,0.8089,0.807326
3,0.4845,0.652853,0.8254,0.832308,0.8254,0.824851
4,0.3334,0.591392,0.8412,0.845264,0.8412,0.841183


[I 2025-03-31 10:26:44,650] Trial 3 pruned. 


Trial 4 with params: {'learning_rate': 0.0008369042894376068, 'weight_decay': 0.001, 'warmup_steps': 9, 'lambda_param': 0.4, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8468,1.387837,0.6241,0.679246,0.6241,0.613946
2,0.9908,1.088096,0.7028,0.742671,0.7028,0.704157


[I 2025-03-31 10:28:25,421] Trial 4 pruned. 


Trial 5 with params: {'learning_rate': 0.0018591820902866042, 'weight_decay': 0.002, 'warmup_steps': 16, 'lambda_param': 0.6000000000000001, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.7422,2.530437,0.3432,0.423531,0.3432,0.327214
2,1.864,1.817576,0.5123,0.56296,0.5123,0.509678
3,1.3776,1.39079,0.6194,0.640806,0.6194,0.61332
4,1.0228,1.223638,0.6635,0.689385,0.6635,0.658074


[I 2025-03-31 10:31:53,316] Trial 5 pruned. 


Trial 6 with params: {'learning_rate': 0.0008204643365323959, 'weight_decay': 0.001, 'warmup_steps': 2, 'lambda_param': 1.0, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8724,1.344483,0.6331,0.678243,0.6331,0.629402
2,0.9885,0.989055,0.7229,0.741736,0.7229,0.721171


[I 2025-03-31 10:33:46,780] Trial 6 pruned. 


Trial 7 with params: {'learning_rate': 0.0020690200562805084, 'weight_decay': 0.003, 'warmup_steps': 3, 'lambda_param': 0.7000000000000001, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,4.0395,4.239418,0.0138,0.0077,0.0138,0.006043
2,4.1255,4.097263,0.0285,0.007946,0.0285,0.009153
3,4.1405,4.176663,0.0228,0.009959,0.0228,0.009091
4,4.1009,4.111713,0.0354,0.008933,0.0354,0.010628


[I 2025-03-31 10:37:07,620] Trial 7 pruned. 


Trial 8 with params: {'learning_rate': 8.770946743725407e-05, 'weight_decay': 0.005, 'warmup_steps': 1, 'lambda_param': 1.0, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.3969,1.305195,0.7281,0.741841,0.7281,0.722955
2,0.9441,0.839075,0.8014,0.806019,0.8014,0.800098
3,0.5816,0.683778,0.8239,0.828334,0.8239,0.823058
4,0.4049,0.625886,0.8327,0.836495,0.8327,0.832534


[I 2025-03-31 10:40:38,046] Trial 8 pruned. 


Trial 9 with params: {'learning_rate': 0.0010568529720322872, 'weight_decay': 0.003, 'warmup_steps': 17, 'lambda_param': 0.6000000000000001, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9978,1.582678,0.5731,0.622682,0.5731,0.564896
2,1.1336,1.148644,0.6812,0.714599,0.6812,0.680103


[I 2025-03-31 10:42:15,333] Trial 9 pruned. 


Trial 10 with params: {'learning_rate': 7.577669953489166e-05, 'weight_decay': 0.003, 'warmup_steps': 20, 'lambda_param': 0.0, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.5434,1.428176,0.7135,0.729459,0.7135,0.706879
2,1.0492,0.908367,0.794,0.79785,0.794,0.79235
3,0.66,0.734469,0.8209,0.824295,0.8209,0.819812
4,0.4666,0.657312,0.8292,0.832344,0.8292,0.828768


[I 2025-03-31 10:45:47,149] Trial 10 pruned. 


Trial 11 with params: {'learning_rate': 0.00021850539580458986, 'weight_decay': 0.002, 'warmup_steps': 27, 'lambda_param': 0.6000000000000001, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8867,0.957592,0.7587,0.776097,0.7587,0.758027
2,0.6641,0.714385,0.8068,0.817722,0.8068,0.806836
3,0.4131,0.626928,0.8301,0.837983,0.8301,0.830039
4,0.282,0.563395,0.8455,0.851469,0.8455,0.845636


[I 2025-03-31 10:49:21,475] Trial 11 pruned. 


Trial 12 with params: {'learning_rate': 0.0003527096826929539, 'weight_decay': 0.005, 'warmup_steps': 23, 'lambda_param': 0.7000000000000001, 'temperature': 5.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7328,0.98287,0.7345,0.76232,0.7345,0.734122
2,0.6896,0.776509,0.7874,0.801539,0.7874,0.786926
3,0.4489,0.651687,0.8186,0.832643,0.8186,0.819828
4,0.3007,0.594278,0.8412,0.847016,0.8412,0.8416
5,0.2101,0.564088,0.8441,0.849193,0.8441,0.844739
6,0.1537,0.525436,0.8546,0.858148,0.8546,0.854805
7,0.1224,0.509297,0.8571,0.860715,0.8571,0.857732


[I 2025-03-31 10:55:29,577] Trial 12 finished with value: 0.8577323533640814 and parameters: {'learning_rate': 0.0003527096826929539, 'weight_decay': 0.005, 'warmup_steps': 23, 'lambda_param': 0.7000000000000001, 'temperature': 5.0}. Best is trial 12 with value: 0.8577323533640814.


Trial 13 with params: {'learning_rate': 0.00015261840840248287, 'weight_decay': 0.005, 'warmup_steps': 25, 'lambda_param': 1.0, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.0619,1.024166,0.756,0.768695,0.756,0.753773
2,0.7051,0.722987,0.8092,0.817881,0.8092,0.808293
3,0.4286,0.61121,0.8336,0.839127,0.8336,0.832819
4,0.2903,0.575305,0.8428,0.848677,0.8428,0.843056


[I 2025-03-31 10:59:05,655] Trial 13 pruned. 


Trial 14 with params: {'learning_rate': 0.0006867392442148531, 'weight_decay': 0.004, 'warmup_steps': 32, 'lambda_param': 0.5, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8146,1.24915,0.6633,0.700081,0.6633,0.660944
2,0.885,0.942059,0.7388,0.763547,0.7388,0.739289
3,0.6045,0.820274,0.7723,0.78749,0.7723,0.771098
4,0.4121,0.728656,0.8006,0.811792,0.8006,0.801254


[I 2025-03-31 11:02:38,133] Trial 14 pruned. 


Trial 15 with params: {'learning_rate': 0.0004165711260616245, 'weight_decay': 0.008, 'warmup_steps': 21, 'lambda_param': 0.7000000000000001, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.719,0.98248,0.7344,0.75982,0.7344,0.735551
2,0.7178,0.793992,0.7849,0.798647,0.7849,0.785362
3,0.4747,0.691435,0.8085,0.819082,0.8085,0.808002
4,0.3159,0.613156,0.8343,0.840975,0.8343,0.834617


Using the latest cached version of the module from /home/jovyan/.cache/huggingface/modules/evaluate_modules/metrics/evaluate-metric--f1/34c46321f42186df33a6260966e34a368f14868d9cc2ba47d142112e2800d233 (last modified on Sat Mar 29 17:35:20 2025) since it couldn't be found locally at evaluate-metric--f1, or remotely on the Hugging Face Hub.
[I 2025-03-31 11:06:50,279] Trial 15 pruned. 


Trial 16 with params: {'learning_rate': 0.0006193972615071132, 'weight_decay': 0.0, 'warmup_steps': 31, 'lambda_param': 0.2, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8134,1.249323,0.6615,0.712228,0.6615,0.661723
2,0.8558,0.907615,0.7454,0.771652,0.7454,0.745899
3,0.5717,0.791913,0.7803,0.791846,0.7803,0.77874
4,0.3897,0.711071,0.8044,0.818099,0.8044,0.805233


[I 2025-03-31 11:10:26,072] Trial 16 pruned. 


Trial 17 with params: {'learning_rate': 0.00015564207439234265, 'weight_decay': 0.005, 'warmup_steps': 19, 'lambda_param': 0.30000000000000004, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.0343,1.052358,0.7591,0.776286,0.7591,0.757115
2,0.7289,0.733734,0.805,0.814064,0.805,0.803796
3,0.4434,0.623507,0.8292,0.836831,0.8292,0.829012
4,0.2986,0.580186,0.8397,0.847033,0.8397,0.840427


[I 2025-03-31 11:14:05,472] Trial 17 pruned. 


Trial 18 with params: {'learning_rate': 0.0021749509684030914, 'weight_decay': 0.005, 'warmup_steps': 24, 'lambda_param': 1.0, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,3.018,2.984347,0.2309,0.318322,0.2309,0.209761
2,2.1684,2.084951,0.4355,0.509248,0.4355,0.431275


[I 2025-03-31 11:15:49,168] Trial 18 pruned. 


Trial 19 with params: {'learning_rate': 0.0004614359545671237, 'weight_decay': 0.001, 'warmup_steps': 25, 'lambda_param': 0.9, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7106,1.063012,0.7116,0.741732,0.7116,0.711341
2,0.7365,0.806261,0.7793,0.798624,0.7793,0.780494
3,0.4909,0.705406,0.8033,0.816022,0.8033,0.80312
4,0.3263,0.635092,0.8239,0.835964,0.8239,0.825049


[I 2025-03-31 11:19:27,676] Trial 19 pruned. 


Trial 20 with params: {'learning_rate': 7.142849622195741e-05, 'weight_decay': 0.002, 'warmup_steps': 15, 'lambda_param': 0.8, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.6329,1.513485,0.6939,0.711569,0.6939,0.686139
2,1.1149,0.943361,0.7914,0.7947,0.7914,0.789685
3,0.6996,0.765777,0.8149,0.818648,0.8149,0.814338
4,0.4975,0.679993,0.8256,0.828902,0.8256,0.825446


[I 2025-03-31 11:22:59,812] Trial 20 pruned. 


Trial 21 with params: {'learning_rate': 8.153422093229849e-05, 'weight_decay': 0.006, 'warmup_steps': 29, 'lambda_param': 0.8, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.5077,1.341241,0.7277,0.739221,0.7277,0.722424
2,0.9806,0.858479,0.8005,0.803579,0.8005,0.799047
3,0.6074,0.703404,0.8202,0.825809,0.8202,0.819234
4,0.4241,0.634525,0.8338,0.837853,0.8338,0.833741
5,0.3187,0.601971,0.8393,0.842322,0.8393,0.839333
6,0.256,0.58976,0.8426,0.846119,0.8426,0.842917
7,0.2214,0.586138,0.8428,0.846017,0.8428,0.84312


[I 2025-03-31 11:29:17,657] Trial 21 finished with value: 0.8431195696020702 and parameters: {'learning_rate': 8.153422093229849e-05, 'weight_decay': 0.006, 'warmup_steps': 29, 'lambda_param': 0.8, 'temperature': 4.0}. Best is trial 12 with value: 0.8577323533640814.


Trial 22 with params: {'learning_rate': 0.00010421083655508715, 'weight_decay': 0.007, 'warmup_steps': 27, 'lambda_param': 0.6000000000000001, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.3358,1.190382,0.7339,0.750184,0.7339,0.729918
2,0.8408,0.775469,0.8067,0.811101,0.8067,0.805496
3,0.5144,0.660795,0.8279,0.835104,0.8279,0.826959
4,0.3505,0.606553,0.8359,0.840625,0.8359,0.836126


[I 2025-03-31 11:33:00,042] Trial 22 pruned. 


Trial 23 with params: {'learning_rate': 0.00011218282231670601, 'weight_decay': 0.005, 'warmup_steps': 32, 'lambda_param': 0.4, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.2871,1.173092,0.7371,0.752884,0.7371,0.734489
2,0.8202,0.773818,0.8036,0.810351,0.8036,0.802992
3,0.4959,0.638627,0.8305,0.8352,0.8305,0.829601
4,0.3367,0.602166,0.838,0.84333,0.838,0.838291
5,0.2477,0.581549,0.8431,0.847833,0.8431,0.843677
6,0.1933,0.5739,0.8429,0.84763,0.8429,0.843686
7,0.1643,0.565429,0.8449,0.848561,0.8449,0.845474


[I 2025-03-31 11:39:30,638] Trial 23 finished with value: 0.8454743149370607 and parameters: {'learning_rate': 0.00011218282231670601, 'weight_decay': 0.005, 'warmup_steps': 32, 'lambda_param': 0.4, 'temperature': 3.0}. Best is trial 12 with value: 0.8577323533640814.


Trial 24 with params: {'learning_rate': 0.0001016934910955101, 'weight_decay': 0.005, 'warmup_steps': 29, 'lambda_param': 0.5, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.3365,1.199255,0.737,0.750775,0.737,0.732699
2,0.8445,0.769485,0.8096,0.815452,0.8096,0.808455
3,0.5086,0.646913,0.83,0.834994,0.83,0.829321
4,0.3488,0.591908,0.8404,0.845258,0.8404,0.840547
5,0.2589,0.573252,0.846,0.849616,0.846,0.846137
6,0.2043,0.564108,0.8452,0.848794,0.8452,0.84547
7,0.1746,0.559633,0.8466,0.849729,0.8466,0.846867


[I 2025-03-31 11:46:01,961] Trial 24 finished with value: 0.8468666085642337 and parameters: {'learning_rate': 0.0001016934910955101, 'weight_decay': 0.005, 'warmup_steps': 29, 'lambda_param': 0.5, 'temperature': 3.0}. Best is trial 12 with value: 0.8577323533640814.


Trial 25 with params: {'learning_rate': 9.169598473831618e-05, 'weight_decay': 0.002, 'warmup_steps': 32, 'lambda_param': 0.5, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.4291,1.279046,0.7293,0.742228,0.7293,0.724296
2,0.9163,0.814234,0.8066,0.81023,0.8066,0.805117
3,0.5604,0.671753,0.8303,0.835017,0.8303,0.829594
4,0.3845,0.614357,0.8383,0.844384,0.8383,0.838635
5,0.2862,0.58542,0.8447,0.84875,0.8447,0.844898
6,0.2276,0.572799,0.8446,0.847585,0.8446,0.844877
7,0.1953,0.566413,0.8475,0.849794,0.8475,0.847687


[I 2025-03-31 11:52:45,234] Trial 25 finished with value: 0.8476867095396732 and parameters: {'learning_rate': 9.169598473831618e-05, 'weight_decay': 0.002, 'warmup_steps': 32, 'lambda_param': 0.5, 'temperature': 2.0}. Best is trial 12 with value: 0.8577323533640814.


Trial 26 with params: {'learning_rate': 0.00010773949296901288, 'weight_decay': 0.001, 'warmup_steps': 27, 'lambda_param': 0.4, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.3064,1.164224,0.738,0.750754,0.738,0.734186
2,0.8243,0.765707,0.8075,0.812255,0.8075,0.806366
3,0.4997,0.656197,0.8292,0.834581,0.8292,0.828561
4,0.3369,0.598186,0.839,0.843453,0.839,0.839274


[I 2025-03-31 11:56:28,276] Trial 26 pruned. 


Trial 27 with params: {'learning_rate': 0.0029678454905841976, 'weight_decay': 0.009000000000000001, 'warmup_steps': 10, 'lambda_param': 1.0, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,4.1581,4.169464,0.0197,0.003346,0.0197,0.004006
2,4.2072,4.23168,0.0164,0.000798,0.0164,0.001384
3,4.1984,4.183973,0.0174,0.003617,0.0174,0.004441
4,4.1946,4.216005,0.0168,0.001657,0.0168,0.001811


[I 2025-03-31 11:59:59,960] Trial 27 pruned. 


Trial 28 with params: {'learning_rate': 5.5692560683803484e-05, 'weight_decay': 0.001, 'warmup_steps': 32, 'lambda_param': 0.4, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.8831,1.741062,0.6623,0.675522,0.6623,0.650011
2,1.3333,1.122741,0.7705,0.772797,0.7705,0.768629
3,0.8838,0.897187,0.799,0.802021,0.799,0.797701
4,0.6599,0.784426,0.8117,0.814244,0.8117,0.810815


[I 2025-03-31 12:03:29,386] Trial 28 pruned. 


Trial 29 with params: {'learning_rate': 0.000171074624429386, 'weight_decay': 0.006, 'warmup_steps': 24, 'lambda_param': 0.7000000000000001, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9635,0.988353,0.7631,0.7794,0.7631,0.761585
2,0.6844,0.705126,0.8122,0.82043,0.8122,0.8116
3,0.42,0.610616,0.8358,0.842869,0.8358,0.835968
4,0.2827,0.574232,0.8442,0.851417,0.8442,0.844988
5,0.2037,0.547983,0.8507,0.854551,0.8507,0.851305
6,0.1586,0.532793,0.856,0.859162,0.856,0.856514
7,0.1321,0.524541,0.8564,0.859623,0.8564,0.856906


[I 2025-03-31 12:09:43,618] Trial 29 finished with value: 0.8569055419463162 and parameters: {'learning_rate': 0.000171074624429386, 'weight_decay': 0.006, 'warmup_steps': 24, 'lambda_param': 0.7000000000000001, 'temperature': 2.0}. Best is trial 12 with value: 0.8577323533640814.


Trial 30 with params: {'learning_rate': 0.00018356428697514882, 'weight_decay': 0.002, 'warmup_steps': 31, 'lambda_param': 0.9, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9738,0.974873,0.7599,0.774707,0.7599,0.758841
2,0.6715,0.689855,0.8165,0.822872,0.8165,0.815895
3,0.4138,0.611575,0.8341,0.8413,0.8341,0.834043
4,0.2813,0.580619,0.844,0.852552,0.844,0.844623
5,0.2026,0.549464,0.8509,0.857268,0.8509,0.851989
6,0.1561,0.525938,0.8577,0.861514,0.8577,0.858327
7,0.1299,0.517754,0.8587,0.86292,0.8587,0.859571


[I 2025-03-31 12:16:15,629] Trial 30 finished with value: 0.8595707754719351 and parameters: {'learning_rate': 0.00018356428697514882, 'weight_decay': 0.002, 'warmup_steps': 31, 'lambda_param': 0.9, 'temperature': 2.0}. Best is trial 30 with value: 0.8595707754719351.


Trial 31 with params: {'learning_rate': 0.00031237404065078214, 'weight_decay': 0.003, 'warmup_steps': 32, 'lambda_param': 0.9, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7814,0.952675,0.7479,0.767705,0.7479,0.747999
2,0.6737,0.766577,0.7893,0.802359,0.7893,0.789172
3,0.4293,0.649408,0.8225,0.832691,0.8225,0.822075
4,0.2873,0.580562,0.8371,0.845485,0.8371,0.837928
5,0.202,0.546443,0.8492,0.855137,0.8492,0.849971
6,0.1499,0.512749,0.8554,0.858706,0.8554,0.855919
7,0.1206,0.503739,0.862,0.866135,0.862,0.862754


[I 2025-03-31 12:22:55,182] Trial 31 finished with value: 0.8627538517003588 and parameters: {'learning_rate': 0.00031237404065078214, 'weight_decay': 0.003, 'warmup_steps': 32, 'lambda_param': 0.9, 'temperature': 2.0}. Best is trial 31 with value: 0.8627538517003588.


Trial 32 with params: {'learning_rate': 0.00023574800370245456, 'weight_decay': 0.005, 'warmup_steps': 31, 'lambda_param': 1.0, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8415,0.957749,0.747,0.766861,0.747,0.745861
2,0.6554,0.72105,0.8034,0.81545,0.8034,0.803706


[I 2025-03-31 12:24:46,681] Trial 32 pruned. 


Trial 33 with params: {'learning_rate': 0.0001588800749828699, 'weight_decay': 0.0, 'warmup_steps': 30, 'lambda_param': 0.9, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.0629,1.020029,0.7573,0.771914,0.7573,0.755509
2,0.7049,0.718915,0.807,0.814329,0.807,0.806008
3,0.4286,0.618983,0.8314,0.839807,0.8314,0.831222
4,0.2904,0.574298,0.8436,0.851081,0.8436,0.84468
5,0.21,0.548833,0.8509,0.854418,0.8509,0.85133
6,0.162,0.53525,0.8563,0.860334,0.8563,0.857036
7,0.136,0.524979,0.8575,0.861448,0.8575,0.858262


[I 2025-03-31 12:31:14,019] Trial 33 finished with value: 0.8582623680063252 and parameters: {'learning_rate': 0.0001588800749828699, 'weight_decay': 0.0, 'warmup_steps': 30, 'lambda_param': 0.9, 'temperature': 2.0}. Best is trial 31 with value: 0.8627538517003588.


Trial 34 with params: {'learning_rate': 0.0001611265741758045, 'weight_decay': 0.001, 'warmup_steps': 31, 'lambda_param': 1.0, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.0653,1.003138,0.7607,0.77494,0.7607,0.758867
2,0.7083,0.711981,0.8119,0.819961,0.8119,0.811771
3,0.4334,0.612887,0.8352,0.840942,0.8352,0.834763
4,0.2907,0.577666,0.8394,0.845451,0.8394,0.839645


[I 2025-03-31 12:34:58,861] Trial 34 pruned. 


Trial 35 with params: {'learning_rate': 0.0003609658312112723, 'weight_decay': 0.002, 'warmup_steps': 32, 'lambda_param': 0.8, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7714,1.023451,0.7265,0.759275,0.7265,0.726904
2,0.7005,0.75927,0.7929,0.808553,0.7929,0.793306
3,0.4542,0.652161,0.8212,0.831411,0.8212,0.821029
4,0.3039,0.598152,0.8358,0.844321,0.8358,0.836469
5,0.2108,0.559811,0.8478,0.853416,0.8478,0.848565
6,0.1551,0.529088,0.8537,0.858033,0.8537,0.854421
7,0.1232,0.509056,0.8596,0.863329,0.8596,0.860395


[I 2025-03-31 12:41:16,305] Trial 35 finished with value: 0.8603953751185697 and parameters: {'learning_rate': 0.0003609658312112723, 'weight_decay': 0.002, 'warmup_steps': 32, 'lambda_param': 0.8, 'temperature': 3.0}. Best is trial 31 with value: 0.8627538517003588.


Trial 36 with params: {'learning_rate': 0.0013761553672596986, 'weight_decay': 0.005, 'warmup_steps': 32, 'lambda_param': 0.8, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.261,1.939397,0.4828,0.556014,0.4828,0.477478
2,1.4009,1.423612,0.606,0.665705,0.606,0.604342


[I 2025-03-31 12:43:06,699] Trial 36 pruned. 


Trial 37 with params: {'learning_rate': 0.0004477059532096864, 'weight_decay': 0.0, 'warmup_steps': 31, 'lambda_param': 0.8, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7327,1.073124,0.7057,0.742322,0.7057,0.704245
2,0.7295,0.805579,0.7733,0.791175,0.7733,0.774073
3,0.4892,0.729631,0.7985,0.812352,0.7985,0.798873
4,0.3288,0.644492,0.8192,0.828079,0.8192,0.818766


[I 2025-03-31 12:46:34,161] Trial 37 pruned. 


Trial 38 with params: {'learning_rate': 0.00022490971189066515, 'weight_decay': 0.002, 'warmup_steps': 23, 'lambda_param': 0.8, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8618,0.988784,0.7512,0.771976,0.7512,0.750606
2,0.6685,0.719051,0.8045,0.812388,0.8045,0.803755
3,0.418,0.635595,0.8227,0.832423,0.8227,0.822765
4,0.2817,0.576327,0.8423,0.849772,0.8423,0.843113
5,0.2007,0.549548,0.8494,0.854671,0.8494,0.850226
6,0.1527,0.523494,0.859,0.863972,0.859,0.860053
7,0.1264,0.516286,0.8572,0.862185,0.8572,0.858345


[I 2025-03-31 12:52:49,272] Trial 38 finished with value: 0.8583454091040162 and parameters: {'learning_rate': 0.00022490971189066515, 'weight_decay': 0.002, 'warmup_steps': 23, 'lambda_param': 0.8, 'temperature': 2.0}. Best is trial 31 with value: 0.8627538517003588.


Trial 39 with params: {'learning_rate': 0.00034829643523797484, 'weight_decay': 0.003, 'warmup_steps': 16, 'lambda_param': 0.8, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.687,0.957994,0.7488,0.767258,0.7488,0.746665
2,0.6801,0.733461,0.7987,0.810888,0.7987,0.798947


[I 2025-03-31 12:54:36,483] Trial 39 pruned. 


Trial 40 with params: {'learning_rate': 0.00019491458444197872, 'weight_decay': 0.003, 'warmup_steps': 22, 'lambda_param': 0.9, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9094,0.948838,0.7621,0.775747,0.7621,0.759994
2,0.6625,0.700951,0.8109,0.819479,0.8109,0.809698
3,0.4099,0.607662,0.836,0.842076,0.836,0.835653
4,0.2807,0.573662,0.8434,0.849863,0.8434,0.843843
5,0.2028,0.548282,0.8532,0.857588,0.8532,0.853247
6,0.1557,0.525381,0.8575,0.862419,0.8575,0.858486
7,0.1297,0.514459,0.8597,0.864495,0.8597,0.860665


[I 2025-03-31 13:00:51,749] Trial 40 finished with value: 0.8606648859919577 and parameters: {'learning_rate': 0.00019491458444197872, 'weight_decay': 0.003, 'warmup_steps': 22, 'lambda_param': 0.9, 'temperature': 2.5}. Best is trial 31 with value: 0.8627538517003588.


Trial 41 with params: {'learning_rate': 0.0007247221650687639, 'weight_decay': 0.003, 'warmup_steps': 24, 'lambda_param': 0.8, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8232,1.282273,0.6462,0.699627,0.6462,0.644905
2,0.9187,0.992512,0.7332,0.760821,0.7332,0.732438


[I 2025-03-31 13:02:44,624] Trial 41 pruned. 


Trial 42 with params: {'learning_rate': 7.276644812525664e-05, 'weight_decay': 0.002, 'warmup_steps': 19, 'lambda_param': 0.8, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.5922,1.422755,0.7093,0.723065,0.7093,0.701286
2,1.0425,0.902501,0.7964,0.80093,0.7964,0.794698
3,0.6542,0.734083,0.8195,0.824014,0.8195,0.81899
4,0.4642,0.652979,0.8307,0.833143,0.8307,0.830337


[I 2025-03-31 13:06:17,813] Trial 42 pruned. 


Trial 43 with params: {'learning_rate': 0.0003820930433657385, 'weight_decay': 0.003, 'warmup_steps': 31, 'lambda_param': 0.6000000000000001, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7579,1.050579,0.7184,0.753654,0.7184,0.71997
2,0.7072,0.819214,0.7765,0.793198,0.7765,0.776911


[I 2025-03-31 13:08:10,598] Trial 43 pruned. 


Trial 44 with params: {'learning_rate': 0.00024213436218766237, 'weight_decay': 0.002, 'warmup_steps': 21, 'lambda_param': 0.9, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8336,0.926431,0.7612,0.775191,0.7612,0.759455
2,0.6689,0.705511,0.8104,0.819404,0.8104,0.81015
3,0.4272,0.622502,0.8322,0.837227,0.8322,0.831388
4,0.2863,0.589059,0.8386,0.844862,0.8386,0.838762
5,0.2025,0.55042,0.852,0.855547,0.852,0.852182
6,0.1524,0.524143,0.8558,0.858795,0.8558,0.855994
7,0.1252,0.515562,0.8581,0.862243,0.8581,0.859052


[I 2025-03-31 13:14:31,529] Trial 44 finished with value: 0.8590518702020482 and parameters: {'learning_rate': 0.00024213436218766237, 'weight_decay': 0.002, 'warmup_steps': 21, 'lambda_param': 0.9, 'temperature': 2.5}. Best is trial 31 with value: 0.8627538517003588.


Trial 45 with params: {'learning_rate': 0.00015758765894152805, 'weight_decay': 0.003, 'warmup_steps': 24, 'lambda_param': 1.0, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.0311,0.992774,0.764,0.774632,0.764,0.761607
2,0.6989,0.710439,0.811,0.817401,0.811,0.809538
3,0.4267,0.615332,0.8357,0.842488,0.8357,0.834868
4,0.2901,0.581318,0.8419,0.847557,0.8419,0.842284
5,0.2093,0.555565,0.8481,0.85273,0.8481,0.848773
6,0.1621,0.541685,0.8555,0.859666,0.8555,0.856121
7,0.1363,0.530389,0.8549,0.858926,0.8549,0.85576


[I 2025-03-31 13:20:57,823] Trial 45 finished with value: 0.8557596509492794 and parameters: {'learning_rate': 0.00015758765894152805, 'weight_decay': 0.003, 'warmup_steps': 24, 'lambda_param': 1.0, 'temperature': 3.0}. Best is trial 31 with value: 0.8627538517003588.


Trial 46 with params: {'learning_rate': 0.0011725126315020571, 'weight_decay': 0.009000000000000001, 'warmup_steps': 15, 'lambda_param': 0.30000000000000004, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.1382,1.856072,0.5097,0.605328,0.5097,0.508629
2,1.2611,1.379525,0.6242,0.685046,0.6242,0.627189
3,0.8927,1.010751,0.7218,0.745285,0.7218,0.717217
4,0.6258,0.913197,0.749,0.766008,0.749,0.748266


[I 2025-03-31 13:24:41,306] Trial 46 pruned. 


Trial 47 with params: {'learning_rate': 5.232252858049981e-05, 'weight_decay': 0.002, 'warmup_steps': 3, 'lambda_param': 0.5, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.8549,1.759225,0.6613,0.67375,0.6613,0.647466
2,1.3352,1.122025,0.7661,0.769217,0.7661,0.763801


[I 2025-03-31 13:26:26,565] Trial 47 pruned. 


Trial 48 with params: {'learning_rate': 0.0027511979602444763, 'weight_decay': 0.005, 'warmup_steps': 1, 'lambda_param': 0.7000000000000001, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,4.2237,4.20594,0.0146,0.000911,0.0146,0.001647
2,4.2263,4.266517,0.0105,0.000349,0.0105,0.00061


[I 2025-03-31 13:28:14,839] Trial 48 pruned. 


Trial 49 with params: {'learning_rate': 0.00057101176794445, 'weight_decay': 0.002, 'warmup_steps': 32, 'lambda_param': 1.0, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7916,1.159369,0.6886,0.721686,0.6886,0.689026
2,0.8203,0.907433,0.7525,0.780501,0.7525,0.751961
3,0.5417,0.819222,0.7766,0.792753,0.7766,0.775408
4,0.3723,0.677241,0.8169,0.825873,0.8169,0.817008


[I 2025-03-31 13:31:35,976] Trial 49 pruned. 


Trial 50 with params: {'learning_rate': 0.00024539413701107427, 'weight_decay': 0.003, 'warmup_steps': 17, 'lambda_param': 1.0, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8128,0.960198,0.7559,0.774413,0.7559,0.754251
2,0.6589,0.720392,0.8057,0.815789,0.8057,0.805936
3,0.4182,0.621804,0.8295,0.839613,0.8295,0.8296
4,0.2823,0.579795,0.8396,0.847895,0.8396,0.840131
5,0.2024,0.541226,0.8524,0.857306,0.8524,0.853034
6,0.1532,0.520903,0.8567,0.860787,0.8567,0.857529
7,0.1247,0.50945,0.858,0.862027,0.858,0.85895


[I 2025-03-31 13:37:47,525] Trial 50 finished with value: 0.8589501582843106 and parameters: {'learning_rate': 0.00024539413701107427, 'weight_decay': 0.003, 'warmup_steps': 17, 'lambda_param': 1.0, 'temperature': 2.0}. Best is trial 31 with value: 0.8627538517003588.


Trial 51 with params: {'learning_rate': 0.000306925673496036, 'weight_decay': 0.001, 'warmup_steps': 16, 'lambda_param': 1.0, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.728,0.968189,0.7421,0.770486,0.7421,0.742773
2,0.6645,0.725897,0.8001,0.813318,0.8001,0.800456
3,0.4307,0.62956,0.8256,0.833417,0.8256,0.82582
4,0.2886,0.591487,0.8378,0.844321,0.8378,0.838265


[I 2025-03-31 13:41:32,405] Trial 51 pruned. 


Trial 52 with params: {'learning_rate': 0.0027158955385139997, 'weight_decay': 0.006, 'warmup_steps': 16, 'lambda_param': 0.0, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,3.772,3.873099,0.067,0.05763,0.067,0.042491
2,3.5633,3.502888,0.1185,0.110463,0.1185,0.095325
3,3.3742,3.417935,0.1457,0.149619,0.1457,0.11747
4,3.1147,3.177587,0.1985,0.224605,0.1985,0.170567


[I 2025-03-31 13:45:08,678] Trial 52 pruned. 


Trial 53 with params: {'learning_rate': 0.00011992021403393908, 'weight_decay': 0.002, 'warmup_steps': 15, 'lambda_param': 1.0, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.2073,1.135645,0.7462,0.761047,0.7462,0.743321
2,0.7862,0.73528,0.8136,0.818734,0.8136,0.8121
3,0.4741,0.632602,0.8326,0.838418,0.8326,0.831912
4,0.3222,0.587757,0.8415,0.846822,0.8415,0.841808
5,0.2373,0.560336,0.8479,0.851392,0.8479,0.848188
6,0.1866,0.556318,0.8473,0.850815,0.8473,0.847522
7,0.1581,0.546773,0.8505,0.853913,0.8505,0.850923


[I 2025-03-31 13:51:36,202] Trial 53 finished with value: 0.8509226119489293 and parameters: {'learning_rate': 0.00011992021403393908, 'weight_decay': 0.002, 'warmup_steps': 15, 'lambda_param': 1.0, 'temperature': 3.5}. Best is trial 31 with value: 0.8627538517003588.


Trial 54 with params: {'learning_rate': 0.0027026130785766608, 'weight_decay': 0.01, 'warmup_steps': 32, 'lambda_param': 0.9, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,3.5215,4.220326,0.0763,0.114304,0.0763,0.051477
2,3.1074,3.005095,0.2213,0.273731,0.2213,0.199657
3,2.6781,2.679445,0.2954,0.349013,0.2954,0.280056
4,2.2815,2.307803,0.3947,0.40163,0.3947,0.373199


[I 2025-03-31 13:55:15,603] Trial 54 pruned. 


Trial 55 with params: {'learning_rate': 7.242888062473813e-05, 'weight_decay': 0.001, 'warmup_steps': 24, 'lambda_param': 0.0, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.6107,1.449041,0.7105,0.726802,0.7105,0.703344
2,1.0612,0.915346,0.794,0.797337,0.794,0.792281


[I 2025-03-31 13:56:55,877] Trial 55 pruned. 


Trial 56 with params: {'learning_rate': 0.0001413812546509425, 'weight_decay': 0.003, 'warmup_steps': 30, 'lambda_param': 0.8, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.1618,1.103663,0.7374,0.75603,0.7374,0.735196
2,0.7354,0.717893,0.8118,0.817056,0.8118,0.811003
3,0.4416,0.618693,0.8344,0.839769,0.8344,0.833773
4,0.2987,0.58466,0.8399,0.845705,0.8399,0.840503
5,0.216,0.563854,0.8493,0.853502,0.8493,0.849616
6,0.1689,0.545411,0.8481,0.852379,0.8481,0.848678
7,0.1421,0.541235,0.8495,0.8532,0.8495,0.850179


[I 2025-03-31 14:03:03,264] Trial 56 finished with value: 0.8501791507919098 and parameters: {'learning_rate': 0.0001413812546509425, 'weight_decay': 0.003, 'warmup_steps': 30, 'lambda_param': 0.8, 'temperature': 4.0}. Best is trial 31 with value: 0.8627538517003588.


Trial 57 with params: {'learning_rate': 0.0003268367829332534, 'weight_decay': 0.003, 'warmup_steps': 32, 'lambda_param': 0.8, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7709,0.955673,0.7421,0.763125,0.7421,0.741013
2,0.6711,0.735611,0.7979,0.80857,0.7979,0.796896


[I 2025-03-31 14:04:57,265] Trial 57 pruned. 


Trial 58 with params: {'learning_rate': 0.000260146677837053, 'weight_decay': 0.003, 'warmup_steps': 25, 'lambda_param': 1.0, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8054,0.935562,0.7512,0.767941,0.7512,0.74958
2,0.6544,0.73064,0.8005,0.814176,0.8005,0.800183
3,0.416,0.647702,0.8184,0.828874,0.8184,0.818601
4,0.2795,0.566114,0.8446,0.850426,0.8446,0.84484
5,0.1995,0.548932,0.8502,0.854647,0.8502,0.850658
6,0.1508,0.525808,0.8563,0.860751,0.8563,0.856911
7,0.1228,0.512842,0.8587,0.863138,0.8587,0.859432


[I 2025-03-31 14:11:11,526] Trial 58 finished with value: 0.8594317325113633 and parameters: {'learning_rate': 0.000260146677837053, 'weight_decay': 0.003, 'warmup_steps': 25, 'lambda_param': 1.0, 'temperature': 2.0}. Best is trial 31 with value: 0.8627538517003588.


Trial 59 with params: {'learning_rate': 0.0005396271019310598, 'weight_decay': 0.003, 'warmup_steps': 25, 'lambda_param': 1.0, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7301,1.068763,0.7124,0.741047,0.7124,0.712591
2,0.7813,0.908887,0.7457,0.769705,0.7457,0.746097
3,0.5257,0.714456,0.7978,0.807053,0.7978,0.795951
4,0.3552,0.633096,0.8286,0.834133,0.8286,0.828011


[I 2025-03-31 14:14:54,580] Trial 59 pruned. 


Trial 60 with params: {'learning_rate': 0.00012525369862126539, 'weight_decay': 0.007, 'warmup_steps': 2, 'lambda_param': 0.1, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.1206,1.1177,0.7496,0.766487,0.7496,0.747766
2,0.7846,0.740758,0.8115,0.818093,0.8115,0.810404
3,0.4785,0.641838,0.8237,0.83251,0.8237,0.823404
4,0.3239,0.590816,0.8439,0.848205,0.8439,0.843972
5,0.2366,0.575127,0.8434,0.847025,0.8434,0.843411
6,0.1848,0.556938,0.8483,0.851685,0.8483,0.848712
7,0.1566,0.551332,0.8512,0.854204,0.8512,0.851547


[I 2025-03-31 14:21:10,901] Trial 60 finished with value: 0.8515469729971188 and parameters: {'learning_rate': 0.00012525369862126539, 'weight_decay': 0.007, 'warmup_steps': 2, 'lambda_param': 0.1, 'temperature': 5.5}. Best is trial 31 with value: 0.8627538517003588.


Trial 61 with params: {'learning_rate': 0.00013273422808933347, 'weight_decay': 0.005, 'warmup_steps': 16, 'lambda_param': 0.9, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.1293,1.10553,0.7477,0.760829,0.7477,0.744486
2,0.7668,0.72797,0.8126,0.817842,0.8126,0.81166
3,0.4648,0.641162,0.8288,0.835834,0.8288,0.82763
4,0.316,0.591342,0.8408,0.847373,0.8408,0.840875
5,0.2315,0.568381,0.8484,0.85208,0.8484,0.848809
6,0.1801,0.556307,0.848,0.851861,0.848,0.848654
7,0.1512,0.550404,0.8499,0.853775,0.8499,0.850658


[I 2025-03-31 14:27:33,731] Trial 61 finished with value: 0.850658435252932 and parameters: {'learning_rate': 0.00013273422808933347, 'weight_decay': 0.005, 'warmup_steps': 16, 'lambda_param': 0.9, 'temperature': 2.0}. Best is trial 31 with value: 0.8627538517003588.


Trial 62 with params: {'learning_rate': 0.00027753658123962104, 'weight_decay': 0.003, 'warmup_steps': 20, 'lambda_param': 1.0, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7751,0.953059,0.7503,0.775311,0.7503,0.750241
2,0.6589,0.701769,0.8095,0.819389,0.8095,0.809238
3,0.419,0.629141,0.8269,0.836001,0.8269,0.826573
4,0.2851,0.595257,0.8368,0.845546,0.8368,0.837567


[I 2025-03-31 14:31:04,003] Trial 62 pruned. 


Trial 63 with params: {'learning_rate': 0.0002958499381119374, 'weight_decay': 0.004, 'warmup_steps': 12, 'lambda_param': 0.9, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7315,0.992467,0.7409,0.767548,0.7409,0.741897
2,0.6734,0.73576,0.7998,0.808793,0.7998,0.799147
3,0.4287,0.637946,0.8204,0.829179,0.8204,0.819809
4,0.2878,0.590438,0.8382,0.843298,0.8382,0.838735
5,0.2027,0.560629,0.848,0.853457,0.848,0.848612
6,0.1533,0.532369,0.8526,0.856859,0.8526,0.853455
7,0.1234,0.512923,0.8567,0.860901,0.8567,0.857593


[I 2025-03-31 14:37:12,078] Trial 63 finished with value: 0.8575934850225108 and parameters: {'learning_rate': 0.0002958499381119374, 'weight_decay': 0.004, 'warmup_steps': 12, 'lambda_param': 0.9, 'temperature': 3.5}. Best is trial 31 with value: 0.8627538517003588.


Trial 64 with params: {'learning_rate': 0.00041098534227771127, 'weight_decay': 0.008, 'warmup_steps': 29, 'lambda_param': 0.0, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.735,1.053243,0.7163,0.746324,0.7163,0.71428
2,0.7233,0.767054,0.7888,0.801817,0.7888,0.788013


[I 2025-03-31 14:38:55,588] Trial 64 pruned. 


Trial 65 with params: {'learning_rate': 0.0009935611886988007, 'weight_decay': 0.01, 'warmup_steps': 11, 'lambda_param': 0.30000000000000004, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9756,1.559278,0.5768,0.651089,0.5768,0.571538
2,1.1031,1.295696,0.6419,0.703631,0.6419,0.641725
3,0.7785,0.944616,0.7423,0.763727,0.7423,0.741004
4,0.5396,0.830282,0.7745,0.79067,0.7745,0.773902


[I 2025-03-31 14:42:17,736] Trial 65 pruned. 


Trial 66 with params: {'learning_rate': 0.004387816666803014, 'weight_decay': 0.003, 'warmup_steps': 31, 'lambda_param': 0.0, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,4.1873,4.236221,0.0118,0.000611,0.0118,0.000994
2,4.2268,4.222086,0.0119,0.000421,0.0119,0.000804


[I 2025-03-31 14:44:05,601] Trial 66 pruned. 


Trial 67 with params: {'learning_rate': 7.733680478165602e-05, 'weight_decay': 0.004, 'warmup_steps': 28, 'lambda_param': 0.9, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.5738,1.425046,0.7099,0.726142,0.7099,0.703089
2,1.0341,0.885968,0.7978,0.80145,0.7978,0.796018
3,0.6384,0.724933,0.8167,0.819894,0.8167,0.81537
4,0.4478,0.647393,0.8327,0.835531,0.8327,0.832381


[I 2025-03-31 14:47:35,913] Trial 67 pruned. 


Trial 68 with params: {'learning_rate': 0.00018601070851347612, 'weight_decay': 0.003, 'warmup_steps': 30, 'lambda_param': 0.9, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9796,0.978273,0.7552,0.772249,0.7552,0.753632
2,0.6754,0.687927,0.8147,0.823996,0.8147,0.814262
3,0.4161,0.604511,0.8342,0.842095,0.8342,0.833576
4,0.2778,0.568675,0.8443,0.851651,0.8443,0.84521
5,0.2006,0.544733,0.8546,0.858543,0.8546,0.854792
6,0.154,0.523204,0.8589,0.863002,0.8589,0.859507
7,0.128,0.515691,0.8583,0.861922,0.8583,0.858998


[I 2025-03-31 14:53:54,829] Trial 68 finished with value: 0.8589981451066765 and parameters: {'learning_rate': 0.00018601070851347612, 'weight_decay': 0.003, 'warmup_steps': 30, 'lambda_param': 0.9, 'temperature': 2.5}. Best is trial 31 with value: 0.8627538517003588.


Trial 69 with params: {'learning_rate': 0.0005451090332539104, 'weight_decay': 0.004, 'warmup_steps': 30, 'lambda_param': 0.9, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7633,1.147115,0.6901,0.724551,0.6901,0.688193
2,0.7981,0.862422,0.7616,0.775213,0.7616,0.761852


[I 2025-03-31 14:55:43,931] Trial 69 pruned. 


Trial 70 with params: {'learning_rate': 0.00015578735863339677, 'weight_decay': 0.001, 'warmup_steps': 22, 'lambda_param': 0.8, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.0323,1.016076,0.7568,0.768796,0.7568,0.75444
2,0.698,0.710676,0.8115,0.819456,0.8115,0.810877
3,0.4264,0.628513,0.8283,0.834456,0.8283,0.827646
4,0.291,0.581434,0.8413,0.847297,0.8413,0.841386
5,0.2131,0.55537,0.8492,0.853372,0.8492,0.8497
6,0.166,0.537178,0.8543,0.857842,0.8543,0.854826
7,0.1388,0.531248,0.8545,0.858893,0.8545,0.855398


[I 2025-03-31 15:01:46,052] Trial 70 finished with value: 0.8553977346640503 and parameters: {'learning_rate': 0.00015578735863339677, 'weight_decay': 0.001, 'warmup_steps': 22, 'lambda_param': 0.8, 'temperature': 3.0}. Best is trial 31 with value: 0.8627538517003588.


Trial 71 with params: {'learning_rate': 0.00019423212892076582, 'weight_decay': 0.003, 'warmup_steps': 26, 'lambda_param': 0.8, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9175,0.950183,0.7644,0.777748,0.7644,0.762794
2,0.6693,0.717284,0.8072,0.81738,0.8072,0.806002
3,0.4141,0.607578,0.8356,0.841655,0.8356,0.835301
4,0.2779,0.568359,0.8424,0.849157,0.8424,0.843247
5,0.2004,0.538023,0.8535,0.856535,0.8535,0.85366
6,0.1543,0.517893,0.8579,0.862049,0.8579,0.858636
7,0.128,0.50862,0.8598,0.863603,0.8598,0.860413


[I 2025-03-31 15:08:17,462] Trial 71 finished with value: 0.860413058166474 and parameters: {'learning_rate': 0.00019423212892076582, 'weight_decay': 0.003, 'warmup_steps': 26, 'lambda_param': 0.8, 'temperature': 2.5}. Best is trial 31 with value: 0.8627538517003588.


Trial 72 with params: {'learning_rate': 0.00014780425634842615, 'weight_decay': 0.004, 'warmup_steps': 28, 'lambda_param': 0.8, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.1154,1.063837,0.7492,0.764748,0.7492,0.745957
2,0.7362,0.716211,0.8149,0.819869,0.8149,0.813124
3,0.4476,0.636336,0.8254,0.833966,0.8254,0.824669
4,0.3054,0.585432,0.8408,0.845395,0.8408,0.840663


[I 2025-03-31 15:12:07,026] Trial 72 pruned. 


Trial 73 with params: {'learning_rate': 0.00018900200612498793, 'weight_decay': 0.003, 'warmup_steps': 27, 'lambda_param': 0.9, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9423,1.012994,0.7437,0.764786,0.7437,0.742171
2,0.6761,0.718515,0.8091,0.818926,0.8091,0.808262
3,0.4224,0.614818,0.8339,0.839977,0.8339,0.832891
4,0.2814,0.580779,0.8429,0.850768,0.8429,0.843795
5,0.203,0.549054,0.8519,0.856694,0.8519,0.852513
6,0.1564,0.532087,0.8566,0.861143,0.8566,0.857558
7,0.1302,0.521554,0.8582,0.862176,0.8582,0.858996


[I 2025-03-31 15:18:34,473] Trial 73 finished with value: 0.8589961489225898 and parameters: {'learning_rate': 0.00018900200612498793, 'weight_decay': 0.003, 'warmup_steps': 27, 'lambda_param': 0.9, 'temperature': 2.5}. Best is trial 31 with value: 0.8627538517003588.


Trial 74 with params: {'learning_rate': 6.24006692401181e-05, 'weight_decay': 0.01, 'warmup_steps': 12, 'lambda_param': 0.7000000000000001, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.7623,1.644094,0.6799,0.694309,0.6799,0.670142
2,1.2363,1.033673,0.7791,0.782138,0.7791,0.777076


[I 2025-03-31 15:20:23,744] Trial 74 pruned. 


Trial 75 with params: {'learning_rate': 0.00016861472216092986, 'weight_decay': 0.002, 'warmup_steps': 32, 'lambda_param': 0.7000000000000001, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.0275,0.994455,0.764,0.775812,0.764,0.762231
2,0.6915,0.709584,0.8125,0.820097,0.8125,0.811357
3,0.4285,0.617436,0.8319,0.840285,0.8319,0.832105
4,0.2904,0.564392,0.8468,0.851475,0.8468,0.847342
5,0.2093,0.549875,0.8508,0.855264,0.8508,0.851369
6,0.1618,0.53048,0.8541,0.857986,0.8541,0.854727
7,0.135,0.521945,0.8558,0.860543,0.8558,0.856799


[I 2025-03-31 15:26:48,632] Trial 75 finished with value: 0.8567986166466077 and parameters: {'learning_rate': 0.00016861472216092986, 'weight_decay': 0.002, 'warmup_steps': 32, 'lambda_param': 0.7000000000000001, 'temperature': 2.0}. Best is trial 31 with value: 0.8627538517003588.


Trial 76 with params: {'learning_rate': 0.0003595274129431474, 'weight_decay': 0.002, 'warmup_steps': 30, 'lambda_param': 0.7000000000000001, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.76,1.00749,0.7326,0.759033,0.7326,0.732843
2,0.6979,0.761698,0.7911,0.806408,0.7911,0.789735


[I 2025-03-31 15:28:38,310] Trial 76 pruned. 


Trial 77 with params: {'learning_rate': 0.00029025946106120713, 'weight_decay': 0.003, 'warmup_steps': 30, 'lambda_param': 0.9, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7919,1.005096,0.7345,0.763191,0.7345,0.734838
2,0.6665,0.713799,0.8022,0.812037,0.8022,0.801812
3,0.4269,0.638335,0.8246,0.831275,0.8246,0.824131
4,0.2857,0.597831,0.8352,0.842854,0.8352,0.83579


[I 2025-03-31 15:32:06,724] Trial 77 pruned. 


Trial 78 with params: {'learning_rate': 0.0002219161547294356, 'weight_decay': 0.002, 'warmup_steps': 27, 'lambda_param': 1.0, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8813,0.955644,0.7557,0.775276,0.7557,0.756303
2,0.6589,0.718029,0.8074,0.817101,0.8074,0.807541
3,0.4102,0.607239,0.8336,0.842335,0.8336,0.833658
4,0.2768,0.579799,0.8435,0.849981,0.8435,0.843999
5,0.1983,0.549651,0.8515,0.856107,0.8515,0.851824
6,0.1516,0.527653,0.8578,0.861647,0.8578,0.858407
7,0.1246,0.515843,0.8602,0.86479,0.8602,0.86114


[I 2025-03-31 15:38:16,123] Trial 78 finished with value: 0.8611399905050012 and parameters: {'learning_rate': 0.0002219161547294356, 'weight_decay': 0.002, 'warmup_steps': 27, 'lambda_param': 1.0, 'temperature': 2.0}. Best is trial 31 with value: 0.8627538517003588.


Trial 79 with params: {'learning_rate': 5.902380787515226e-05, 'weight_decay': 0.002, 'warmup_steps': 29, 'lambda_param': 0.5, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.8593,1.670971,0.6722,0.688675,0.6722,0.661471
2,1.2474,1.03488,0.78,0.782546,0.78,0.778018


[I 2025-03-31 15:40:03,631] Trial 79 pruned. 


Trial 80 with params: {'learning_rate': 0.00037045269276231844, 'weight_decay': 0.001, 'warmup_steps': 24, 'lambda_param': 0.8, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7359,0.975148,0.733,0.752962,0.733,0.731344
2,0.7024,0.752883,0.798,0.808884,0.798,0.797292
3,0.4548,0.661245,0.8157,0.824013,0.8157,0.815239
4,0.3031,0.619838,0.8276,0.838816,0.8276,0.828241


[I 2025-03-31 15:43:26,888] Trial 80 pruned. 


Trial 81 with params: {'learning_rate': 0.00014401343924960548, 'weight_decay': 0.003, 'warmup_steps': 31, 'lambda_param': 1.0, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.1165,1.051032,0.7513,0.767203,0.7513,0.749453
2,0.7178,0.722516,0.8109,0.81904,0.8109,0.810479
3,0.4367,0.623795,0.8307,0.838035,0.8307,0.83037
4,0.2936,0.575818,0.8395,0.84717,0.8395,0.840434


[I 2025-03-31 15:47:01,323] Trial 81 pruned. 


Trial 82 with params: {'learning_rate': 0.00014044454395603832, 'weight_decay': 0.002, 'warmup_steps': 28, 'lambda_param': 0.9, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.1207,1.03887,0.7578,0.769816,0.7578,0.756329
2,0.7273,0.719158,0.8142,0.821198,0.8142,0.813022
3,0.4365,0.620017,0.8327,0.837786,0.8327,0.832527
4,0.2952,0.583812,0.8404,0.847142,0.8404,0.840694
5,0.2174,0.56358,0.8476,0.851495,0.8476,0.8477
6,0.1686,0.553582,0.8506,0.853909,0.8506,0.850892
7,0.1418,0.545116,0.854,0.857371,0.854,0.854488


[I 2025-03-31 15:53:04,782] Trial 82 finished with value: 0.8544878987557394 and parameters: {'learning_rate': 0.00014044454395603832, 'weight_decay': 0.002, 'warmup_steps': 28, 'lambda_param': 0.9, 'temperature': 2.0}. Best is trial 31 with value: 0.8627538517003588.


Trial 83 with params: {'learning_rate': 0.00025006039174788215, 'weight_decay': 0.003, 'warmup_steps': 25, 'lambda_param': 0.8, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.843,0.980704,0.7469,0.765463,0.7469,0.745962
2,0.6588,0.715421,0.805,0.813565,0.805,0.804155
3,0.4243,0.626595,0.8267,0.833955,0.8267,0.82654
4,0.2833,0.574397,0.8407,0.846347,0.8407,0.840507


[I 2025-03-31 15:56:40,940] Trial 83 pruned. 


Trial 84 with params: {'learning_rate': 0.0002813362169214627, 'weight_decay': 0.002, 'warmup_steps': 26, 'lambda_param': 1.0, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7973,0.959538,0.7486,0.769765,0.7486,0.748924
2,0.666,0.742891,0.8004,0.814791,0.8004,0.800612
3,0.4245,0.653312,0.8209,0.829795,0.8209,0.82031
4,0.286,0.597972,0.8409,0.850136,0.8409,0.841527
5,0.1993,0.562272,0.8483,0.85401,0.8483,0.849047
6,0.1497,0.527652,0.856,0.8599,0.856,0.856821
7,0.1214,0.517339,0.8582,0.862695,0.8582,0.859134


[I 2025-03-31 16:02:54,932] Trial 84 finished with value: 0.8591342271071969 and parameters: {'learning_rate': 0.0002813362169214627, 'weight_decay': 0.002, 'warmup_steps': 26, 'lambda_param': 1.0, 'temperature': 3.0}. Best is trial 31 with value: 0.8627538517003588.


Trial 85 with params: {'learning_rate': 0.0001034510641717502, 'weight_decay': 0.0, 'warmup_steps': 32, 'lambda_param': 0.9, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.3577,1.217657,0.7369,0.747631,0.7369,0.732847
2,0.8761,0.790777,0.808,0.812341,0.808,0.80649
3,0.5324,0.65736,0.8295,0.835096,0.8295,0.828347
4,0.3585,0.608243,0.8347,0.840442,0.8347,0.835061


[I 2025-03-31 16:06:39,833] Trial 85 pruned. 


Trial 86 with params: {'learning_rate': 0.0002681159956916346, 'weight_decay': 0.003, 'warmup_steps': 23, 'lambda_param': 1.0, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8178,0.984126,0.7449,0.765544,0.7449,0.743538
2,0.6754,0.73454,0.8002,0.810564,0.8002,0.799978
3,0.4267,0.638845,0.823,0.8336,0.823,0.822596
4,0.2846,0.594024,0.8369,0.843834,0.8369,0.837544
5,0.1996,0.558662,0.8476,0.853258,0.8476,0.848264
6,0.1514,0.528548,0.8577,0.861501,0.8577,0.858194
7,0.1231,0.517516,0.8594,0.863875,0.8594,0.860233


[I 2025-03-31 16:13:12,683] Trial 86 finished with value: 0.8602327518035281 and parameters: {'learning_rate': 0.0002681159956916346, 'weight_decay': 0.003, 'warmup_steps': 23, 'lambda_param': 1.0, 'temperature': 7.0}. Best is trial 31 with value: 0.8627538517003588.


Trial 87 with params: {'learning_rate': 0.000576567303155847, 'weight_decay': 0.003, 'warmup_steps': 21, 'lambda_param': 1.0, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.739,1.171529,0.6799,0.721275,0.6799,0.679926
2,0.8232,0.835975,0.7709,0.78429,0.7709,0.770861


[I 2025-03-31 16:15:04,365] Trial 87 pruned. 


Trial 88 with params: {'learning_rate': 0.00023978315916488635, 'weight_decay': 0.004, 'warmup_steps': 18, 'lambda_param': 0.8, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8011,0.930897,0.7596,0.780901,0.7596,0.75978
2,0.6559,0.695769,0.8143,0.82287,0.8143,0.813918
3,0.4176,0.638582,0.8211,0.829056,0.8211,0.819789
4,0.2813,0.572732,0.8435,0.849424,0.8435,0.843802
5,0.2015,0.549084,0.8502,0.856553,0.8502,0.851156
6,0.1528,0.524943,0.8564,0.86046,0.8564,0.857014
7,0.1251,0.511111,0.8611,0.864333,0.8611,0.861664


[I 2025-03-31 16:21:24,444] Trial 88 finished with value: 0.861663780559244 and parameters: {'learning_rate': 0.00023978315916488635, 'weight_decay': 0.004, 'warmup_steps': 18, 'lambda_param': 0.8, 'temperature': 7.0}. Best is trial 31 with value: 0.8627538517003588.


Trial 89 with params: {'learning_rate': 0.00015793528987874706, 'weight_decay': 0.004, 'warmup_steps': 30, 'lambda_param': 1.0, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.0891,1.033808,0.7495,0.764528,0.7495,0.747368
2,0.7176,0.726679,0.8072,0.816024,0.8072,0.805746
3,0.4345,0.617932,0.8324,0.838284,0.8324,0.831436
4,0.2924,0.585921,0.8422,0.84874,0.8422,0.842982
5,0.2112,0.55355,0.8507,0.854609,0.8507,0.850822
6,0.164,0.536977,0.8561,0.86016,0.8561,0.856746
7,0.1377,0.532096,0.8559,0.859849,0.8559,0.856577


[I 2025-03-31 16:28:00,467] Trial 89 finished with value: 0.856576627657451 and parameters: {'learning_rate': 0.00015793528987874706, 'weight_decay': 0.004, 'warmup_steps': 30, 'lambda_param': 1.0, 'temperature': 6.0}. Best is trial 31 with value: 0.8627538517003588.


Trial 90 with params: {'learning_rate': 0.00011846739376105878, 'weight_decay': 0.004, 'warmup_steps': 9, 'lambda_param': 0.9, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.1864,1.145021,0.7422,0.759786,0.7422,0.739125
2,0.8011,0.753253,0.8126,0.820667,0.8126,0.811679
3,0.488,0.645285,0.8288,0.834436,0.8288,0.828342
4,0.3367,0.598395,0.8378,0.842815,0.8378,0.838029
5,0.2479,0.578805,0.8439,0.848094,0.8439,0.844274
6,0.1944,0.56443,0.8467,0.850489,0.8467,0.847249
7,0.1648,0.558589,0.8503,0.853152,0.8503,0.850683


[I 2025-03-31 16:34:38,824] Trial 90 finished with value: 0.8506826130687657 and parameters: {'learning_rate': 0.00011846739376105878, 'weight_decay': 0.004, 'warmup_steps': 9, 'lambda_param': 0.9, 'temperature': 7.0}. Best is trial 31 with value: 0.8627538517003588.


Trial 91 with params: {'learning_rate': 0.0003988903465624725, 'weight_decay': 0.0, 'warmup_steps': 26, 'lambda_param': 0.8, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7207,1.02897,0.725,0.752721,0.725,0.71965
2,0.708,0.792579,0.7821,0.796149,0.7821,0.781716


[I 2025-03-31 16:36:35,293] Trial 91 pruned. 


Trial 92 with params: {'learning_rate': 0.00010296168967388349, 'weight_decay': 0.001, 'warmup_steps': 29, 'lambda_param': 1.0, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.3386,1.170401,0.7417,0.756192,0.7417,0.739147
2,0.8289,0.765646,0.8091,0.815096,0.8091,0.808335
3,0.504,0.64383,0.8314,0.836052,0.8314,0.8306
4,0.3437,0.592656,0.8397,0.844964,0.8397,0.840156
5,0.2543,0.5768,0.844,0.847774,0.844,0.844469
6,0.2013,0.564264,0.8488,0.852363,0.8488,0.849333
7,0.1711,0.560436,0.8463,0.849588,0.8463,0.846857


[I 2025-03-31 16:43:02,295] Trial 92 finished with value: 0.846857420523886 and parameters: {'learning_rate': 0.00010296168967388349, 'weight_decay': 0.001, 'warmup_steps': 29, 'lambda_param': 1.0, 'temperature': 7.0}. Best is trial 31 with value: 0.8627538517003588.


Trial 93 with params: {'learning_rate': 0.00041273390738968437, 'weight_decay': 0.006, 'warmup_steps': 16, 'lambda_param': 0.7000000000000001, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6885,1.00741,0.7324,0.756806,0.7324,0.730744
2,0.7237,0.845782,0.7697,0.787627,0.7697,0.768757


[I 2025-03-31 16:44:54,517] Trial 93 pruned. 


Trial 94 with params: {'learning_rate': 0.0003057465612341374, 'weight_decay': 0.003, 'warmup_steps': 14, 'lambda_param': 0.8, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.736,0.996615,0.7342,0.759015,0.7342,0.732952
2,0.6703,0.758066,0.7942,0.806881,0.7942,0.794277
3,0.4385,0.646803,0.8233,0.832082,0.8233,0.823195
4,0.2913,0.600663,0.8368,0.846286,0.8368,0.837452


[I 2025-03-31 16:48:30,273] Trial 94 pruned. 


Trial 95 with params: {'learning_rate': 0.00020392000346463595, 'weight_decay': 0.005, 'warmup_steps': 25, 'lambda_param': 1.0, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9285,0.981302,0.7499,0.771568,0.7499,0.750194
2,0.6678,0.696553,0.8113,0.818978,0.8113,0.810099
3,0.4146,0.630655,0.8291,0.837433,0.8291,0.828289
4,0.2822,0.570436,0.8405,0.848322,0.8405,0.841745
5,0.201,0.549192,0.8506,0.854612,0.8506,0.850974
6,0.153,0.528613,0.8531,0.85682,0.8531,0.853254
7,0.1263,0.518882,0.8568,0.860842,0.8568,0.857481


[I 2025-03-31 16:54:59,517] Trial 95 finished with value: 0.8574811185715461 and parameters: {'learning_rate': 0.00020392000346463595, 'weight_decay': 0.005, 'warmup_steps': 25, 'lambda_param': 1.0, 'temperature': 7.0}. Best is trial 31 with value: 0.8627538517003588.


Trial 96 with params: {'learning_rate': 0.00029802625262800744, 'weight_decay': 0.004, 'warmup_steps': 22, 'lambda_param': 0.8, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.757,0.990596,0.7388,0.763624,0.7388,0.738603
2,0.6631,0.725227,0.804,0.817018,0.804,0.8045
3,0.4259,0.639542,0.8231,0.833596,0.8231,0.823462
4,0.2873,0.590888,0.8372,0.846516,0.8372,0.838485
5,0.2015,0.561513,0.8449,0.850388,0.8449,0.845289
6,0.1508,0.531159,0.8531,0.857938,0.8531,0.853938
7,0.1222,0.512279,0.857,0.860819,0.857,0.857746


[I 2025-03-31 17:01:33,395] Trial 96 finished with value: 0.8577458492110124 and parameters: {'learning_rate': 0.00029802625262800744, 'weight_decay': 0.004, 'warmup_steps': 22, 'lambda_param': 0.8, 'temperature': 6.0}. Best is trial 31 with value: 0.8627538517003588.


Trial 97 with params: {'learning_rate': 0.0002177589738395526, 'weight_decay': 0.002, 'warmup_steps': 17, 'lambda_param': 0.6000000000000001, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8661,0.97619,0.7595,0.778006,0.7595,0.757682
2,0.6696,0.71118,0.8109,0.818663,0.8109,0.809768
3,0.4187,0.646244,0.8216,0.831044,0.8216,0.822021
4,0.2824,0.585469,0.839,0.847043,0.839,0.839966


[I 2025-03-31 17:05:22,206] Trial 97 pruned. 


Trial 98 with params: {'learning_rate': 0.0002772825059203597, 'weight_decay': 0.0, 'warmup_steps': 23, 'lambda_param': 1.0, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7694,0.948438,0.7533,0.772676,0.7533,0.75311
2,0.655,0.721582,0.8029,0.81208,0.8029,0.801668
3,0.42,0.634521,0.825,0.836685,0.825,0.82578
4,0.2809,0.590798,0.8383,0.845262,0.8383,0.838728


[I 2025-03-31 17:08:52,056] Trial 98 pruned. 


Trial 99 with params: {'learning_rate': 8.710007471084877e-05, 'weight_decay': 0.01, 'warmup_steps': 17, 'lambda_param': 0.30000000000000004, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.4373,1.295545,0.7255,0.745423,0.7255,0.721579
2,0.9311,0.827866,0.805,0.808541,0.805,0.803384
3,0.5764,0.689208,0.8231,0.82814,0.8231,0.821841
4,0.3978,0.620327,0.8319,0.835788,0.8319,0.831713


[I 2025-03-31 17:12:21,402] Trial 99 pruned. 


Trial 100 with params: {'learning_rate': 0.0002186529179202505, 'weight_decay': 0.002, 'warmup_steps': 31, 'lambda_param': 1.0, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9029,0.928572,0.7668,0.779644,0.7668,0.765533
2,0.6638,0.740931,0.8008,0.815627,0.8008,0.800885
3,0.4169,0.631471,0.8265,0.83549,0.8265,0.826336
4,0.282,0.572777,0.8426,0.848439,0.8426,0.842993
5,0.1997,0.544616,0.8516,0.856215,0.8516,0.851857
6,0.152,0.52516,0.8562,0.859741,0.8562,0.85672
7,0.1256,0.515931,0.8572,0.860831,0.8572,0.857995


[I 2025-03-31 17:18:40,784] Trial 100 finished with value: 0.8579946442039805 and parameters: {'learning_rate': 0.0002186529179202505, 'weight_decay': 0.002, 'warmup_steps': 31, 'lambda_param': 1.0, 'temperature': 2.0}. Best is trial 31 with value: 0.8627538517003588.


Trial 101 with params: {'learning_rate': 8.667993190720802e-05, 'weight_decay': 0.005, 'warmup_steps': 23, 'lambda_param': 0.6000000000000001, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.4602,1.302805,0.7247,0.738967,0.7247,0.720197
2,0.9343,0.819488,0.8058,0.809107,0.8058,0.804415
3,0.5714,0.685779,0.8219,0.827098,0.8219,0.820976
4,0.3948,0.615348,0.8358,0.839181,0.8358,0.835574


[I 2025-03-31 17:22:26,658] Trial 101 pruned. 


Trial 102 with params: {'learning_rate': 0.0009150860935596703, 'weight_decay': 0.003, 'warmup_steps': 12, 'lambda_param': 0.2, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8848,1.398988,0.6163,0.687517,0.6163,0.609451
2,1.0307,1.087118,0.6993,0.7302,0.6993,0.698372
3,0.7188,0.906795,0.748,0.768398,0.748,0.747834
4,0.49,0.775206,0.7839,0.79427,0.7839,0.78394


[I 2025-03-31 17:26:06,710] Trial 102 pruned. 


Trial 103 with params: {'learning_rate': 0.00012330250764608776, 'weight_decay': 0.001, 'warmup_steps': 20, 'lambda_param': 1.0, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.1781,1.116969,0.7485,0.762321,0.7485,0.745663
2,0.7896,0.747246,0.8125,0.819864,0.8125,0.811589
3,0.4762,0.637301,0.8309,0.837826,0.8309,0.830411
4,0.3251,0.599721,0.8398,0.844923,0.8398,0.840033


[I 2025-03-31 17:29:49,882] Trial 103 pruned. 


Trial 104 with params: {'learning_rate': 0.0002115725831932042, 'weight_decay': 0.005, 'warmup_steps': 20, 'lambda_param': 0.8, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.89,0.973812,0.7524,0.768134,0.7524,0.750747
2,0.6671,0.710001,0.8102,0.81796,0.8102,0.809915
3,0.4169,0.619488,0.8297,0.837469,0.8297,0.829169
4,0.2819,0.567763,0.8459,0.851318,0.8459,0.846059
5,0.2019,0.543938,0.8493,0.853648,0.8493,0.849673
6,0.1542,0.521666,0.856,0.859315,0.856,0.85645
7,0.128,0.509864,0.857,0.86102,0.857,0.857758


[I 2025-03-31 17:36:20,185] Trial 104 finished with value: 0.8577580103909297 and parameters: {'learning_rate': 0.0002115725831932042, 'weight_decay': 0.005, 'warmup_steps': 20, 'lambda_param': 0.8, 'temperature': 7.0}. Best is trial 31 with value: 0.8627538517003588.


Trial 105 with params: {'learning_rate': 0.0008996522710141943, 'weight_decay': 0.004, 'warmup_steps': 32, 'lambda_param': 1.0, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9547,1.664539,0.5561,0.653802,0.5561,0.54982
2,1.0411,1.073415,0.7045,0.727921,0.7045,0.701354


[I 2025-03-31 17:38:14,604] Trial 105 pruned. 


Trial 106 with params: {'learning_rate': 0.00016644555832767357, 'weight_decay': 0.0, 'warmup_steps': 2, 'lambda_param': 0.30000000000000004, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9395,1.005616,0.7642,0.776845,0.7642,0.762421
2,0.7016,0.715289,0.8112,0.818003,0.8112,0.81084
3,0.4308,0.625234,0.8271,0.833971,0.8271,0.826422
4,0.2903,0.587224,0.842,0.849062,0.842,0.842609
5,0.2118,0.559641,0.8509,0.855499,0.8509,0.851572
6,0.1645,0.537065,0.8529,0.856945,0.8529,0.853609
7,0.1376,0.52724,0.8555,0.859449,0.8555,0.856299


[I 2025-03-31 17:44:43,731] Trial 106 finished with value: 0.8562991803760489 and parameters: {'learning_rate': 0.00016644555832767357, 'weight_decay': 0.0, 'warmup_steps': 2, 'lambda_param': 0.30000000000000004, 'temperature': 6.5}. Best is trial 31 with value: 0.8627538517003588.


Trial 107 with params: {'learning_rate': 0.0003716158362642867, 'weight_decay': 0.002, 'warmup_steps': 23, 'lambda_param': 0.9, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7121,0.979303,0.7388,0.758814,0.7388,0.737245
2,0.699,0.749047,0.7931,0.804352,0.7931,0.792401


[I 2025-03-31 17:46:38,047] Trial 107 pruned. 


Trial 108 with params: {'learning_rate': 0.0003809288647547307, 'weight_decay': 0.004, 'warmup_steps': 23, 'lambda_param': 1.0, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7282,0.993829,0.7341,0.758191,0.7341,0.733602
2,0.7009,0.759668,0.7948,0.806895,0.7948,0.794605


[I 2025-03-31 17:48:25,025] Trial 108 pruned. 


Trial 109 with params: {'learning_rate': 0.0017183437098516675, 'weight_decay': 0.008, 'warmup_steps': 21, 'lambda_param': 0.8, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.6136,2.198464,0.4142,0.513177,0.4142,0.405471
2,1.676,1.606833,0.5653,0.599211,0.5653,0.557581
3,1.2346,1.405959,0.6195,0.665651,0.6195,0.614974
4,0.8896,1.088028,0.6959,0.717476,0.6959,0.695899


[I 2025-03-31 17:52:04,668] Trial 109 pruned. 


Trial 110 with params: {'learning_rate': 0.0008327418394542927, 'weight_decay': 0.0, 'warmup_steps': 31, 'lambda_param': 1.0, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9201,1.424273,0.6072,0.667536,0.6072,0.604155
2,1.0089,1.070997,0.7035,0.732359,0.7035,0.702444
3,0.6916,0.888332,0.7522,0.772834,0.7522,0.75052
4,0.4759,0.741419,0.7938,0.804543,0.7938,0.79435


[I 2025-03-31 17:55:49,314] Trial 110 pruned. 


Trial 111 with params: {'learning_rate': 0.00028995514075483655, 'weight_decay': 0.002, 'warmup_steps': 27, 'lambda_param': 1.0, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7948,0.94959,0.7501,0.771453,0.7501,0.750302
2,0.6665,0.712858,0.8059,0.813677,0.8059,0.805158
3,0.4245,0.623531,0.8269,0.834435,0.8269,0.826381
4,0.2867,0.58798,0.8385,0.845268,0.8385,0.839034


[I 2025-03-31 17:59:32,415] Trial 111 pruned. 


Trial 112 with params: {'learning_rate': 0.0003181354256647286, 'weight_decay': 0.004, 'warmup_steps': 24, 'lambda_param': 0.7000000000000001, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7659,0.980248,0.7367,0.757477,0.7367,0.734884
2,0.675,0.724788,0.7997,0.811433,0.7997,0.800283
3,0.4351,0.658685,0.8154,0.828422,0.8154,0.816773
4,0.2938,0.585136,0.8375,0.843747,0.8375,0.837757


[I 2025-03-31 18:03:08,664] Trial 112 pruned. 


Trial 113 with params: {'learning_rate': 5.969448609920787e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 15, 'lambda_param': 1.0, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.8154,1.651874,0.6789,0.695332,0.6789,0.66943
2,1.2334,1.034086,0.7812,0.78279,0.7812,0.779267


[I 2025-03-31 18:05:01,537] Trial 113 pruned. 


Trial 114 with params: {'learning_rate': 0.00021155145356596233, 'weight_decay': 0.005, 'warmup_steps': 23, 'lambda_param': 0.9, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.898,0.978491,0.7567,0.772,0.7567,0.754757
2,0.6836,0.72596,0.8016,0.811278,0.8016,0.800413
3,0.4273,0.64816,0.8236,0.834045,0.8236,0.823626
4,0.286,0.579681,0.8411,0.849234,0.8411,0.841813
5,0.2032,0.558219,0.848,0.851927,0.848,0.848422
6,0.1561,0.536079,0.8506,0.854194,0.8506,0.851226
7,0.1275,0.522547,0.8554,0.859865,0.8554,0.856267


[I 2025-03-31 18:11:38,824] Trial 114 finished with value: 0.8562665216663035 and parameters: {'learning_rate': 0.00021155145356596233, 'weight_decay': 0.005, 'warmup_steps': 23, 'lambda_param': 0.9, 'temperature': 2.0}. Best is trial 31 with value: 0.8627538517003588.


Trial 115 with params: {'learning_rate': 0.0005780846178772666, 'weight_decay': 0.001, 'warmup_steps': 32, 'lambda_param': 1.0, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.774,1.185851,0.676,0.722005,0.676,0.680048
2,0.8154,0.869853,0.7597,0.774592,0.7597,0.758933


[I 2025-03-31 18:13:35,276] Trial 115 pruned. 


Trial 116 with params: {'learning_rate': 0.0002469362452940188, 'weight_decay': 0.003, 'warmup_steps': 26, 'lambda_param': 0.9, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8359,0.941169,0.7585,0.774801,0.7585,0.757419
2,0.6644,0.731402,0.8035,0.811803,0.8035,0.802928
3,0.4175,0.633153,0.8244,0.833658,0.8244,0.824206
4,0.2787,0.593274,0.8384,0.845758,0.8384,0.839379
5,0.1998,0.546104,0.8473,0.851382,0.8473,0.847154
6,0.1504,0.522823,0.858,0.861854,0.858,0.858828
7,0.1231,0.512196,0.8597,0.863553,0.8597,0.860584


[I 2025-03-31 18:20:06,421] Trial 116 finished with value: 0.8605844313411094 and parameters: {'learning_rate': 0.0002469362452940188, 'weight_decay': 0.003, 'warmup_steps': 26, 'lambda_param': 0.9, 'temperature': 2.5}. Best is trial 31 with value: 0.8627538517003588.


Trial 117 with params: {'learning_rate': 0.0008273468343264632, 'weight_decay': 0.008, 'warmup_steps': 0, 'lambda_param': 0.4, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9166,1.412998,0.6105,0.666279,0.6105,0.606769
2,1.0074,1.037865,0.7165,0.740368,0.7165,0.714796


[I 2025-03-31 18:21:57,648] Trial 117 pruned. 


Trial 118 with params: {'learning_rate': 0.0034337078522599486, 'weight_decay': 0.002, 'warmup_steps': 26, 'lambda_param': 0.7000000000000001, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,4.1337,4.155059,0.0218,0.005026,0.0218,0.005364
2,4.1712,4.248651,0.0123,0.002679,0.0123,0.002522


[I 2025-03-31 18:23:45,689] Trial 118 pruned. 


Trial 119 with params: {'learning_rate': 0.00030897570196718, 'weight_decay': 0.005, 'warmup_steps': 19, 'lambda_param': 1.0, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7639,1.025927,0.7284,0.759921,0.7284,0.728969
2,0.6754,0.754161,0.7942,0.804827,0.7942,0.792454
3,0.4359,0.630035,0.825,0.83303,0.825,0.824826
4,0.2929,0.600522,0.8327,0.84279,0.8327,0.833761


[I 2025-03-31 18:27:25,644] Trial 119 pruned. 


Trial 120 with params: {'learning_rate': 0.0036101090092247124, 'weight_decay': 0.008, 'warmup_steps': 5, 'lambda_param': 0.0, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,4.2373,4.239419,0.0103,0.000481,0.0103,0.000589
2,4.2195,4.260618,0.01,0.0001,0.01,0.000198


[I 2025-03-31 18:29:20,036] Trial 120 pruned. 


Trial 121 with params: {'learning_rate': 0.00034501296364812826, 'weight_decay': 0.002, 'warmup_steps': 22, 'lambda_param': 1.0, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7357,0.990841,0.7296,0.761422,0.7296,0.729688
2,0.6859,0.722807,0.8022,0.811298,0.8022,0.801771
3,0.442,0.655171,0.8169,0.826808,0.8169,0.816692
4,0.2985,0.594385,0.8348,0.84307,0.8348,0.835354


[I 2025-03-31 18:33:07,970] Trial 121 pruned. 


Trial 122 with params: {'learning_rate': 0.0002574364896354549, 'weight_decay': 0.003, 'warmup_steps': 23, 'lambda_param': 0.8, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8067,0.952286,0.7539,0.772591,0.7539,0.754373
2,0.6568,0.695294,0.8099,0.817158,0.8099,0.809453
3,0.4175,0.638329,0.8266,0.835645,0.8266,0.826742
4,0.2837,0.589178,0.8363,0.844304,0.8363,0.836563


[I 2025-03-31 18:36:50,873] Trial 122 pruned. 


Trial 123 with params: {'learning_rate': 0.0002946888246678824, 'weight_decay': 0.003, 'warmup_steps': 30, 'lambda_param': 0.8, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7981,0.970712,0.7391,0.75873,0.7391,0.737383
2,0.6759,0.717797,0.8045,0.814841,0.8045,0.804109
3,0.433,0.641011,0.8234,0.83338,0.8234,0.822972
4,0.2909,0.6009,0.8364,0.84575,0.8364,0.836749


[I 2025-03-31 18:40:31,190] Trial 123 pruned. 


Trial 124 with params: {'learning_rate': 0.00015170756208078966, 'weight_decay': 0.004, 'warmup_steps': 25, 'lambda_param': 1.0, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.0549,1.03018,0.7582,0.771448,0.7582,0.756139
2,0.7186,0.729341,0.8078,0.817224,0.8078,0.807335
3,0.4377,0.613607,0.836,0.842206,0.836,0.835525
4,0.2962,0.578854,0.8415,0.84853,0.8415,0.841831
5,0.2147,0.554547,0.8499,0.85442,0.8499,0.850396
6,0.1672,0.544399,0.8529,0.856332,0.8529,0.853288
7,0.1404,0.53507,0.8564,0.859954,0.8564,0.856927


[I 2025-03-31 18:47:12,686] Trial 124 finished with value: 0.8569266650334216 and parameters: {'learning_rate': 0.00015170756208078966, 'weight_decay': 0.004, 'warmup_steps': 25, 'lambda_param': 1.0, 'temperature': 2.0}. Best is trial 31 with value: 0.8627538517003588.


Trial 125 with params: {'learning_rate': 0.0011314950468693916, 'weight_decay': 0.001, 'warmup_steps': 32, 'lambda_param': 0.5, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.1238,1.706181,0.5408,0.619639,0.5408,0.541234
2,1.2414,1.220036,0.6643,0.702044,0.6643,0.661441
3,0.8671,1.04423,0.7107,0.742124,0.7107,0.708209
4,0.6092,0.899509,0.7538,0.769795,0.7538,0.753073


[I 2025-03-31 18:51:05,425] Trial 125 pruned. 


Trial 126 with params: {'learning_rate': 0.0009049791490282845, 'weight_decay': 0.0, 'warmup_steps': 25, 'lambda_param': 0.0, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9197,1.43663,0.6022,0.68142,0.6022,0.60122
2,1.0609,1.07104,0.6998,0.732534,0.6998,0.699481


[I 2025-03-31 18:52:56,395] Trial 126 pruned. 


Trial 127 with params: {'learning_rate': 0.00025435072226170165, 'weight_decay': 0.005, 'warmup_steps': 32, 'lambda_param': 0.9, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8482,0.933164,0.756,0.772906,0.756,0.754609
2,0.6558,0.723193,0.8035,0.814188,0.8035,0.802538
3,0.4145,0.629653,0.8259,0.834197,0.8259,0.825911
4,0.2786,0.587432,0.834,0.840405,0.834,0.833998


[I 2025-03-31 18:56:42,665] Trial 127 pruned. 


Trial 128 with params: {'learning_rate': 0.00010384166072833677, 'weight_decay': 0.002, 'warmup_steps': 23, 'lambda_param': 0.9, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.3538,1.232408,0.7292,0.742115,0.7292,0.7242
2,0.8702,0.800033,0.8011,0.807439,0.8011,0.80008
3,0.5251,0.647695,0.8324,0.838334,0.8324,0.83158
4,0.357,0.605887,0.8359,0.841028,0.8359,0.835998


[I 2025-03-31 19:00:21,272] Trial 128 pruned. 


Trial 129 with params: {'learning_rate': 0.0006712937288776745, 'weight_decay': 0.005, 'warmup_steps': 26, 'lambda_param': 0.4, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7911,1.230911,0.6632,0.70879,0.6632,0.663043
2,0.8757,0.965289,0.7348,0.76721,0.7348,0.737165


[I 2025-03-31 19:02:12,658] Trial 129 pruned. 


Trial 130 with params: {'learning_rate': 0.0004021879953728729, 'weight_decay': 0.001, 'warmup_steps': 31, 'lambda_param': 0.8, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7593,1.036409,0.7218,0.747757,0.7218,0.721824
2,0.7109,0.813595,0.7765,0.790713,0.7765,0.775591


[I 2025-03-31 19:04:06,733] Trial 130 pruned. 


Trial 131 with params: {'learning_rate': 0.0002738589399356526, 'weight_decay': 0.003, 'warmup_steps': 28, 'lambda_param': 1.0, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8296,0.958005,0.751,0.770453,0.751,0.749211
2,0.6655,0.72551,0.7985,0.813212,0.7985,0.798187
3,0.4219,0.62474,0.8263,0.836875,0.8263,0.826594
4,0.2801,0.579041,0.838,0.845341,0.838,0.838419
5,0.1982,0.556849,0.8439,0.84888,0.8439,0.844435
6,0.1488,0.524485,0.8552,0.858867,0.8552,0.855708
7,0.1217,0.510975,0.8567,0.859784,0.8567,0.857256


[I 2025-03-31 19:10:34,690] Trial 131 finished with value: 0.8572557366764572 and parameters: {'learning_rate': 0.0002738589399356526, 'weight_decay': 0.003, 'warmup_steps': 28, 'lambda_param': 1.0, 'temperature': 3.0}. Best is trial 31 with value: 0.8627538517003588.


Trial 132 with params: {'learning_rate': 0.00020851778467913934, 'weight_decay': 0.003, 'warmup_steps': 28, 'lambda_param': 0.7000000000000001, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8902,0.93429,0.7688,0.780746,0.7688,0.767339
2,0.6591,0.69749,0.8108,0.8217,0.8108,0.810507
3,0.4067,0.590641,0.8419,0.84848,0.8419,0.841694
4,0.2746,0.575048,0.8413,0.849965,0.8413,0.842072
5,0.1988,0.540025,0.8525,0.85749,0.8525,0.853097
6,0.1534,0.519667,0.8555,0.860421,0.8555,0.856298
7,0.1268,0.509374,0.8574,0.862755,0.8574,0.858321


[I 2025-03-31 19:17:09,915] Trial 132 finished with value: 0.8583208787907418 and parameters: {'learning_rate': 0.00020851778467913934, 'weight_decay': 0.003, 'warmup_steps': 28, 'lambda_param': 0.7000000000000001, 'temperature': 2.0}. Best is trial 31 with value: 0.8627538517003588.


Trial 133 with params: {'learning_rate': 0.00017520535718652816, 'weight_decay': 0.002, 'warmup_steps': 18, 'lambda_param': 0.6000000000000001, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9409,0.978992,0.7642,0.779912,0.7642,0.76284
2,0.6766,0.697695,0.8164,0.824201,0.8164,0.815802
3,0.418,0.61223,0.8326,0.839682,0.8326,0.832371
4,0.2842,0.572044,0.842,0.849626,0.842,0.842839
5,0.2056,0.542512,0.8516,0.856762,0.8516,0.852501
6,0.1594,0.527751,0.8574,0.862075,0.8574,0.858195
7,0.1328,0.514528,0.8603,0.863593,0.8603,0.860896


[I 2025-03-31 19:23:40,272] Trial 133 finished with value: 0.8608961482216836 and parameters: {'learning_rate': 0.00017520535718652816, 'weight_decay': 0.002, 'warmup_steps': 18, 'lambda_param': 0.6000000000000001, 'temperature': 2.5}. Best is trial 31 with value: 0.8627538517003588.


Trial 134 with params: {'learning_rate': 0.00015880376919095014, 'weight_decay': 0.004, 'warmup_steps': 14, 'lambda_param': 0.6000000000000001, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.0168,0.986801,0.7652,0.78374,0.7652,0.763958
2,0.6984,0.730069,0.8073,0.815272,0.8073,0.805739
3,0.4259,0.626773,0.8289,0.835383,0.8289,0.827449
4,0.2878,0.577219,0.8425,0.848609,0.8425,0.843365
5,0.21,0.561815,0.8475,0.852875,0.8475,0.848221
6,0.1619,0.54186,0.8533,0.857022,0.8533,0.853879
7,0.136,0.535091,0.8535,0.857686,0.8535,0.854294


[I 2025-03-31 19:30:25,149] Trial 134 finished with value: 0.8542941518990432 and parameters: {'learning_rate': 0.00015880376919095014, 'weight_decay': 0.004, 'warmup_steps': 14, 'lambda_param': 0.6000000000000001, 'temperature': 2.5}. Best is trial 31 with value: 0.8627538517003588.


Trial 135 with params: {'learning_rate': 0.0002569760948887693, 'weight_decay': 0.002, 'warmup_steps': 25, 'lambda_param': 0.9, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8025,0.947082,0.7536,0.775427,0.7536,0.751334
2,0.6516,0.71026,0.8061,0.819038,0.8061,0.805561
3,0.4211,0.631571,0.8262,0.837052,0.8262,0.826507
4,0.2782,0.589391,0.8383,0.845852,0.8383,0.838501
5,0.1993,0.542878,0.8524,0.856405,0.8524,0.852996
6,0.1496,0.518901,0.8572,0.860884,0.8572,0.857667
7,0.1222,0.507257,0.8596,0.863146,0.8596,0.860283


[I 2025-03-31 19:36:55,355] Trial 135 finished with value: 0.8602834975640807 and parameters: {'learning_rate': 0.0002569760948887693, 'weight_decay': 0.002, 'warmup_steps': 25, 'lambda_param': 0.9, 'temperature': 2.0}. Best is trial 31 with value: 0.8627538517003588.


Trial 136 with params: {'learning_rate': 0.00012390680078839525, 'weight_decay': 0.0, 'warmup_steps': 22, 'lambda_param': 0.4, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.2079,1.135342,0.7418,0.756756,0.7418,0.738382
2,0.7918,0.741147,0.8148,0.818841,0.8148,0.813798
3,0.4774,0.634162,0.8292,0.835941,0.8292,0.828948
4,0.3226,0.594008,0.8399,0.844493,0.8399,0.840071


[I 2025-03-31 19:40:37,602] Trial 136 pruned. 


Trial 137 with params: {'learning_rate': 0.00029504132830440765, 'weight_decay': 0.002, 'warmup_steps': 16, 'lambda_param': 0.6000000000000001, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7374,0.944585,0.753,0.771172,0.753,0.749992
2,0.6684,0.763164,0.7908,0.803219,0.7908,0.792021
3,0.4263,0.647229,0.8214,0.830864,0.8214,0.820536
4,0.2902,0.594719,0.8367,0.843822,0.8367,0.837308


[I 2025-03-31 19:44:11,128] Trial 137 pruned. 


Trial 138 with params: {'learning_rate': 0.002819055822915683, 'weight_decay': 0.001, 'warmup_steps': 9, 'lambda_param': 0.6000000000000001, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,4.1781,4.245885,0.0132,0.001074,0.0132,0.001659
2,4.1904,4.193885,0.0166,0.002718,0.0166,0.002561
3,4.2005,4.281849,0.012,0.000492,0.012,0.000875
4,4.1827,4.162157,0.0228,0.002349,0.0228,0.002468


[I 2025-03-31 19:47:44,468] Trial 138 pruned. 


Trial 139 with params: {'learning_rate': 0.00016811632073593987, 'weight_decay': 0.003, 'warmup_steps': 19, 'lambda_param': 1.0, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9578,1.034617,0.752,0.769129,0.752,0.75017
2,0.692,0.706665,0.8159,0.82371,0.8159,0.814961
3,0.4247,0.614372,0.8348,0.840434,0.8348,0.834319
4,0.2864,0.574425,0.8449,0.851161,0.8449,0.84548
5,0.2078,0.542963,0.8512,0.855214,0.8512,0.851318
6,0.1607,0.532323,0.8546,0.859155,0.8546,0.855485
7,0.1335,0.525728,0.8552,0.859773,0.8552,0.856216


[I 2025-03-31 19:54:01,999] Trial 139 finished with value: 0.8562164500207757 and parameters: {'learning_rate': 0.00016811632073593987, 'weight_decay': 0.003, 'warmup_steps': 19, 'lambda_param': 1.0, 'temperature': 6.5}. Best is trial 31 with value: 0.8627538517003588.


Trial 140 with params: {'learning_rate': 0.0010770498866468602, 'weight_decay': 0.001, 'warmup_steps': 18, 'lambda_param': 1.0, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.0098,1.796083,0.5274,0.633554,0.5274,0.527545
2,1.1528,1.257429,0.6563,0.70146,0.6563,0.656179
3,0.8172,1.034655,0.7227,0.747115,0.7227,0.722301
4,0.5737,0.887805,0.7582,0.77425,0.7582,0.758499


[I 2025-03-31 19:57:41,947] Trial 140 pruned. 


Trial 141 with params: {'learning_rate': 0.0003453764738062971, 'weight_decay': 0.001, 'warmup_steps': 24, 'lambda_param': 0.9, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7311,1.005128,0.7273,0.751536,0.7273,0.725936
2,0.6954,0.775238,0.7911,0.804943,0.7911,0.790333


[I 2025-03-31 19:59:27,954] Trial 141 pruned. 


Trial 142 with params: {'learning_rate': 0.00011746052092047507, 'weight_decay': 0.003, 'warmup_steps': 20, 'lambda_param': 0.7000000000000001, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.2312,1.152876,0.7437,0.75906,0.7437,0.740869
2,0.808,0.75816,0.8094,0.81568,0.8094,0.808566
3,0.4879,0.639776,0.8288,0.833598,0.8288,0.827925
4,0.3317,0.587737,0.8387,0.843912,0.8387,0.83884
5,0.2443,0.566603,0.8454,0.84958,0.8454,0.845709


[I 2025-03-31 20:12:35,804] Trial 143 finished with value: 0.8574002272130745 and parameters: {'learning_rate': 0.00015815932399401082, 'weight_decay': 0.004, 'warmup_steps': 32, 'lambda_param': 0.9, 'temperature': 2.0}. Best is trial 31 with value: 0.8627538517003588.


Trial 144 with params: {'learning_rate': 0.00029370471825383113, 'weight_decay': 0.0, 'warmup_steps': 12, 'lambda_param': 0.7000000000000001, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7251,0.925489,0.7524,0.770243,0.7524,0.751577
2,0.6597,0.720284,0.8023,0.815969,0.8023,0.802864
3,0.4181,0.629376,0.8289,0.838383,0.8289,0.828907
4,0.2826,0.577973,0.8427,0.848298,0.8427,0.843241
5,0.2005,0.548796,0.8488,0.855055,0.8488,0.849455
6,0.151,0.528219,0.8567,0.860423,0.8567,0.857423
7,0.1222,0.508785,0.8621,0.865753,0.8621,0.86281


[I 2025-03-31 20:19:12,296] Trial 144 finished with value: 0.8628102629696968 and parameters: {'learning_rate': 0.00029370471825383113, 'weight_decay': 0.0, 'warmup_steps': 12, 'lambda_param': 0.7000000000000001, 'temperature': 4.0}. Best is trial 144 with value: 0.8628102629696968.


Trial 145 with params: {'learning_rate': 0.00024986778733554016, 'weight_decay': 0.0, 'warmup_steps': 6, 'lambda_param': 0.5, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7552,0.925198,0.7641,0.778096,0.7641,0.763309
2,0.6616,0.726295,0.8028,0.814193,0.8028,0.803137
3,0.4286,0.624565,0.8302,0.837294,0.8302,0.829299
4,0.2903,0.582763,0.843,0.851438,0.843,0.843874
5,0.2026,0.54967,0.8496,0.856655,0.8496,0.850682
6,0.1539,0.526343,0.8547,0.858721,0.8547,0.855497
7,0.1258,0.511163,0.8586,0.862674,0.8586,0.859516


[I 2025-03-31 20:26:05,226] Trial 145 finished with value: 0.8595156564703188 and parameters: {'learning_rate': 0.00024986778733554016, 'weight_decay': 0.0, 'warmup_steps': 6, 'lambda_param': 0.5, 'temperature': 3.5}. Best is trial 144 with value: 0.8628102629696968.


Trial 146 with params: {'learning_rate': 0.00026717268466778317, 'weight_decay': 0.0, 'warmup_steps': 5, 'lambda_param': 0.5, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7349,0.98276,0.7412,0.762049,0.7412,0.74068
2,0.6539,0.721885,0.8078,0.818908,0.8078,0.808594
3,0.4198,0.651287,0.8209,0.828199,0.8209,0.819579
4,0.2848,0.584584,0.8407,0.847547,0.8407,0.840684
5,0.2021,0.552294,0.8461,0.851141,0.8461,0.84664
6,0.1519,0.524215,0.8551,0.858775,0.8551,0.855532
7,0.124,0.512093,0.8571,0.861174,0.8571,0.857961


[I 2025-03-31 20:32:55,313] Trial 146 finished with value: 0.8579613138291315 and parameters: {'learning_rate': 0.00026717268466778317, 'weight_decay': 0.0, 'warmup_steps': 5, 'lambda_param': 0.5, 'temperature': 2.0}. Best is trial 144 with value: 0.8628102629696968.


Trial 147 with params: {'learning_rate': 0.00030251822265868754, 'weight_decay': 0.0, 'warmup_steps': 10, 'lambda_param': 0.7000000000000001, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7207,0.954137,0.7467,0.764286,0.7467,0.745227
2,0.6696,0.736193,0.8044,0.816866,0.8044,0.804442
3,0.4331,0.630266,0.8254,0.836618,0.8254,0.82553
4,0.2907,0.595146,0.8375,0.845619,0.8375,0.838249


[I 2025-03-31 20:36:44,526] Trial 147 pruned. 


Trial 148 with params: {'learning_rate': 0.0005383737534101895, 'weight_decay': 0.0, 'warmup_steps': 18, 'lambda_param': 0.6000000000000001, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7238,1.207737,0.6766,0.726769,0.6766,0.675655
2,0.802,0.840469,0.7634,0.778324,0.7634,0.763689


[I 2025-03-31 20:38:36,605] Trial 148 pruned. 


Trial 149 with params: {'learning_rate': 0.0005176368140508914, 'weight_decay': 0.0, 'warmup_steps': 4, 'lambda_param': 0.8, 'temperature': 5.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6844,1.092137,0.6989,0.734969,0.6989,0.698778
2,0.7628,0.865202,0.7607,0.783013,0.7607,0.761051
3,0.5089,0.741111,0.7949,0.809526,0.7949,0.794556
4,0.3469,0.649155,0.822,0.830514,0.822,0.822635


[I 2025-03-31 20:42:25,219] Trial 149 pruned. 


In [26]:
print(best_distill)

BestRun(run_id='144', objective=0.8628102629696968, hyperparameters={'learning_rate': 0.00029370471825383113, 'weight_decay': 0.0, 'warmup_steps': 12, 'lambda_param': 0.7000000000000001, 'temperature': 4.0}, run_summary=None)
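
The reported best run can be cross-checked against the Optuna log lines above. The following is a minimal sketch that is not part of the original notebook: the helper name `best_finished_trial` and the regular expression are illustrative assumptions about the log format shown in this output, and the sample lines are abbreviated copies of log entries from the distillation search.

```python
import re

# Matches Optuna log entries such as
# "[I ...] Trial 144 finished with value: 0.8628... and parameters: {...}".
FINISHED = re.compile(r"Trial (\d+) finished with value: ([0-9.eE+-]+)")


def best_finished_trial(log_lines):
    """Return (trial_number, value) of the highest-scoring finished trial."""
    results = []
    for line in log_lines:
        match = FINISHED.search(line)
        if match:  # pruned trials carry no final value and are skipped
            results.append((int(match.group(1)), float(match.group(2))))
    return max(results, key=lambda item: item[1]) if results else None


# Abbreviated copies of log lines from the distillation search above.
sample_log = [
    "[I 2025-03-31 19:36:55,355] Trial 135 finished with value: 0.8602834975640807 and parameters: {...}",
    "[I 2025-03-31 19:40:37,602] Trial 136 pruned.",
    "[I 2025-03-31 20:19:12,296] Trial 144 finished with value: 0.8628102629696968 and parameters: {...}",
    "[I 2025-03-31 20:26:05,226] Trial 145 finished with value: 0.8595156564703188 and parameters: {...}",
]

print(best_finished_trial(sample_log))
```

For these sample lines the helper selects trial 144, which agrees with the `BestRun` printed above.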


Recompute the step counts to account for the changed dataset size.

In [None]:
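data_length = len(train_combo)
steps_per_epoch = math.ceil(data_length / batch_size)
min_r = steps_per_epoch * 2           # minimum Hyperband resource: two epochs
max_r = steps_per_epoch * num_epochs  # maximum resource: the full training run
warm_up = math.ceil(data_length / batch_size / 10)  # upper bound for warmup_steps, about 10 % of an epoch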
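
The recomputation can be sketched as a standalone function. This is only an illustration: the function name `search_budget` is hypothetical, and the dataset size, batch size, and epoch count passed to it are made-up values, not the ones used in this notebook.

```python
import math


def search_budget(data_length, batch_size, num_epochs):
    """Mirror the notebook's arithmetic for the pruner and warmup bounds."""
    steps_per_epoch = math.ceil(data_length / batch_size)
    min_r = steps_per_epoch * 2            # earliest pruning point: two epochs
    max_r = steps_per_epoch * num_epochs   # full training run
    warm_up = math.ceil(data_length / batch_size / 10)  # about 10 % of an epoch
    return min_r, max_r, warm_up


# Hypothetical sizes, chosen only to make the arithmetic easy to follow.
print(search_budget(data_length=100_000, batch_size=128, num_epochs=7))
```

Because all three values scale with the dataset length, they need to be recomputed whenever the training set changes, as it does here when switching to the augmented data.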

In [27]:
base.reset_seed()

## Search with standard training on the augmented dataset
Configuration of the individual training runs.

In [28]:
training_args = base.get_training_args(output_dir=f"~/results/{DATASET}/-aug_hp-search", logging_dir=f"~/logs/{DATASET}/-aug_hp-search", epochs=num_epochs, batch_size=batch_size)

Definition of the searched hyperparameters and their ranges.

In [29]:
def hp_space(trial):
    # Search space: log-uniform learning rate, weight decay on a 1e-3 grid,
    # and an integer warmup length of at most warm_up steps.
    params = {
        "learning_rate": trial.suggest_float("learning_rate", 5e-5, 5e-3, log=True),
        "weight_decay": trial.suggest_float("weight_decay", 0, 1e-2, step=1e-3),
        "warmup_steps": trial.suggest_int("warmup_steps", 0, warm_up),
    }
    print(f"Trial {trial.number} with params: {params}")
    return params
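
The `log=True` flag places the learning rate on a logarithmic scale. The sketch below illustrates only that scale using the standard library. It is not Optuna's sampling algorithm (the TPE sampler is model-based), and the helper `sample_log_uniform` is a hypothetical name introduced for this example.

```python
import math
import random


def sample_log_uniform(low, high, rng):
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(42)  # fixed seed so the sketch is repeatable
draws = [sample_log_uniform(5e-5, 5e-3, rng) for _ in range(10_000)]

# The range spans two decades, so about half of the draws fall below the
# geometric midpoint 5e-4; a linear-uniform draw would put only ~9 % there.
below_midpoint = sum(d < 5e-4 for d in draws) / len(draws)
print(f"share below 5e-4: {below_midpoint:.2f}")
```

Sampling on a log scale gives small learning rates the same coverage as large ones, which suits a range whose endpoints differ by a factor of 100.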

Optuna configuration.

In [30]:
pruner = optuna.pruners.HyperbandPruner(min_resource=min_r, max_resource=max_r, reduction_factor=2, bootstrap_count=2)
sampler = optuna.samplers.TPESampler(seed=42, multivariate=True)
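
To see what `min_resource`, `max_resource`, and `reduction_factor` imply, the pruning checkpoints can be listed by hand. This sketch assumes the bracket count `floor(log_eta(max_resource / min_resource)) + 1` given in Optuna's `HyperbandPruner` documentation and rung positions of `min_resource * eta ** (bracket + rung)`. The function name `hyperband_rungs` and the 100 steps per epoch are hypothetical.

```python
def hyperband_rungs(min_resource, max_resource, reduction_factor):
    """List the pruning checkpoints (in steps) for each Hyperband bracket.

    Assumes floor(log_eta(max/min)) + 1 brackets and rung positions
    min_resource * eta ** (bracket + rung); only rungs that fit within
    max_resource are listed.
    """
    # Integer loop instead of math.log to avoid floating-point rounding.
    n_brackets = 1
    resource = min_resource
    while resource * reduction_factor <= max_resource:
        resource *= reduction_factor
        n_brackets += 1

    brackets = []
    for bracket in range(n_brackets):
        rungs = []
        step = min_resource * reduction_factor ** bracket
        while step <= max_resource:
            rungs.append(step)
            step *= reduction_factor
        brackets.append(rungs)
    return brackets


# Hypothetical 100 steps per epoch and 7 epochs: min_r = 200, max_r = 700.
print(hyperband_rungs(min_resource=200, max_resource=700, reduction_factor=2))
```

Under these assumptions a budget of two to seven epochs gives checkpoints after two and four epochs, which appears consistent with the distillation trials above being pruned after epoch 2 or epoch 4.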



Trainer configuration for the individual training runs.

In [31]:
trainer = Trainer(
    args=training_args,
    train_dataset=train_combo,
    eval_dataset=eval,
    compute_metrics=base.compute_metrics,
    model_init=lambda: get_model(),  # build a fresh model for every trial
)

Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Search setup.

In [32]:
best_base_aug = trainer.hyperparameter_search(
    direction="maximize",
    backend="optuna",
    hp_space=hp_space,
    compute_objective=lambda metrics: metrics["eval_f1"],
    pruner=pruner,
    sampler=sampler,
    study_name="Base-head",
    n_trials=150
)

[I 2025-03-31 20:42:26,026] A new study created in memory with name: Base-head


Trial 0 with params: {'learning_rate': 0.0002805758207667253, 'weight_decay': 0.01, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4408,0.822459,0.7563,0.773909,0.7563,0.756202
2,0.4808,0.680551,0.8013,0.811327,0.8013,0.801277


[I 2025-03-31 20:45:23,470] Trial 0 pruned. 


Trial 1 with params: {'learning_rate': 0.0007875660249889869, 'weight_decay': 0.001, 'warmup_steps': 5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5865,1.239148,0.6471,0.683557,0.6471,0.644067
2,0.7998,0.940884,0.7294,0.756915,0.7294,0.728664
3,0.4997,0.838248,0.7648,0.779964,0.7648,0.76587


[I 2025-03-31 20:49:45,919] Trial 1 pruned. 


Trial 2 with params: {'learning_rate': 6.533369619026643e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 19}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.1658,0.966175,0.7603,0.764178,0.7603,0.758189
2,0.6663,0.664417,0.8127,0.816665,0.8127,0.811903


[I 2025-03-31 20:52:46,697] Trial 2 pruned. 


Trial 3 with params: {'learning_rate': 0.0013035123791853842, 'weight_decay': 0.0, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.089,1.662967,0.536,0.586524,0.536,0.527496
2,1.167,1.320862,0.6261,0.669939,0.6261,0.623307
3,0.763,1.061449,0.7008,0.723795,0.7008,0.699606


[I 2025-03-31 20:57:08,341] Trial 3 pruned. 


Trial 4 with params: {'learning_rate': 0.002311294500510415, 'weight_decay': 0.002, 'warmup_steps': 6}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,4.5323,4.558012,0.0186,0.003836,0.0186,0.004611
2,4.5599,4.541796,0.0189,0.002119,0.0189,0.003548


[I 2025-03-31 21:00:00,447] Trial 4 pruned. 


Trial 5 with params: {'learning_rate': 0.00011635338541918901, 'weight_decay': 0.003, 'warmup_steps': 17}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7343,0.782408,0.7827,0.792378,0.7827,0.782375
2,0.4828,0.586732,0.8266,0.831415,0.8266,0.826383
3,0.2459,0.573033,0.8347,0.838491,0.8347,0.834013
4,0.1274,0.573715,0.8426,0.84524,0.8426,0.842417
5,0.0629,0.591085,0.8431,0.847573,0.8431,0.843252


[I 2025-03-31 21:07:04,630] Trial 5 pruned. 


Trial 6 with params: {'learning_rate': 0.0003654769917956456, 'weight_decay': 0.003, 'warmup_steps': 20}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4128,0.859494,0.7466,0.761862,0.7466,0.744405
2,0.5261,0.679407,0.8047,0.812549,0.8047,0.80441
3,0.2926,0.713027,0.8064,0.818216,0.8064,0.806731


[I 2025-03-31 21:11:20,934] Trial 6 pruned. 


Trial 7 with params: {'learning_rate': 9.505122659935192e-05, 'weight_decay': 0.003, 'warmup_steps': 12}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.855,0.844302,0.7758,0.78068,0.7758,0.774168
2,0.5509,0.610774,0.8208,0.826291,0.8208,0.820028
3,0.2935,0.576896,0.8297,0.833511,0.8297,0.829217
4,0.1611,0.572023,0.8397,0.843737,0.8397,0.840049
5,0.0859,0.58795,0.8427,0.847012,0.8427,0.84323


[I 2025-03-31 21:18:36,137] Trial 7 pruned. 


Trial 8 with params: {'learning_rate': 0.00040842279473800845, 'weight_decay': 0.008, 'warmup_steps': 6}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.3942,0.894361,0.7357,0.754881,0.7357,0.734459
2,0.5544,0.740482,0.7878,0.801586,0.7878,0.787753
3,0.3094,0.693657,0.8043,0.815146,0.8043,0.804638


[I 2025-03-31 21:22:53,668] Trial 8 pruned. 


Trial 9 with params: {'learning_rate': 0.0005338741354740678, 'weight_decay': 0.006, 'warmup_steps': 1}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4345,0.990294,0.7097,0.730631,0.7097,0.70719
2,0.6289,0.769475,0.7785,0.789368,0.7785,0.778557


[I 2025-03-31 21:25:42,807] Trial 9 pruned. 


Trial 10 with params: {'learning_rate': 5.765419213017514e-05, 'weight_decay': 0.0, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.3301,1.014754,0.7497,0.754497,0.7497,0.747198
2,0.7027,0.671199,0.8101,0.814429,0.8101,0.809856
3,0.3989,0.598245,0.8259,0.829481,0.8259,0.825372
4,0.2503,0.568053,0.8327,0.835915,0.8327,0.832877
5,0.1611,0.565537,0.836,0.839189,0.836,0.836258


[I 2025-03-31 21:32:48,197] Trial 10 pruned. 


Trial 11 with params: {'learning_rate': 8.864358030226235e-05, 'weight_decay': 0.003, 'warmup_steps': 5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9081,0.862831,0.7774,0.783053,0.7774,0.776203
2,0.5734,0.624781,0.8204,0.82458,0.8204,0.819811
3,0.3091,0.595603,0.8272,0.832239,0.8272,0.826155


[I 2025-03-31 21:36:59,233] Trial 11 pruned. 


Trial 12 with params: {'learning_rate': 7.882328855146668e-05, 'weight_decay': 0.004, 'warmup_steps': 17}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.0283,0.897475,0.768,0.772996,0.768,0.766292
2,0.5973,0.635277,0.8177,0.822727,0.8177,0.817375
3,0.3222,0.586918,0.8276,0.830878,0.8276,0.826454
4,0.1829,0.576839,0.8354,0.838065,0.8354,0.835541
5,0.1023,0.58242,0.8411,0.845449,0.8411,0.841532


[I 2025-03-31 21:43:57,731] Trial 12 pruned. 


Trial 13 with params: {'learning_rate': 0.0001642985400515745, 'weight_decay': 0.0, 'warmup_steps': 13}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5579,0.771416,0.7807,0.794479,0.7807,0.780358
2,0.4608,0.616364,0.8177,0.826018,0.8177,0.817488
3,0.2366,0.596709,0.8297,0.835752,0.8297,0.829226
4,0.1202,0.603267,0.8396,0.844193,0.8396,0.840088
5,0.058,0.613827,0.8437,0.847167,0.8437,0.843833


[I 2025-03-31 21:50:54,051] Trial 13 pruned. 


Trial 14 with params: {'learning_rate': 0.00011670513687585162, 'weight_decay': 0.0, 'warmup_steps': 15}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7129,0.796311,0.7835,0.790534,0.7835,0.782642
2,0.4976,0.601245,0.8232,0.829083,0.8232,0.822783
3,0.2587,0.587073,0.8292,0.835483,0.8292,0.82836
4,0.1351,0.576687,0.8409,0.843865,0.8409,0.841096
5,0.0686,0.590135,0.8453,0.849348,0.8453,0.845773
6,0.0326,0.613591,0.8486,0.85036,0.8486,0.848592
7,0.0148,0.622789,0.8497,0.850792,0.8497,0.849556


[I 2025-03-31 22:00:55,491] Trial 14 finished with value: 0.849555953355696 and parameters: {'learning_rate': 0.00011670513687585162, 'weight_decay': 0.0, 'warmup_steps': 15}. Best is trial 14 with value: 0.849555953355696.


Trial 15 with params: {'learning_rate': 5.674481186879401e-05, 'weight_decay': 0.0, 'warmup_steps': 14}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.311,1.032336,0.7433,0.748925,0.7433,0.741177
2,0.7119,0.691929,0.809,0.814173,0.809,0.808417
3,0.4146,0.61195,0.8218,0.825572,0.8218,0.82121


[I 2025-03-31 22:05:16,124] Trial 15 pruned. 


Trial 16 with params: {'learning_rate': 0.0003152249444441531, 'weight_decay': 0.0, 'warmup_steps': 12}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.3914,0.851174,0.7478,0.767152,0.7478,0.745299
2,0.5009,0.663126,0.802,0.811855,0.802,0.802203
3,0.2741,0.641314,0.8218,0.829406,0.8218,0.821903


[I 2025-03-31 22:09:35,574] Trial 16 pruned. 


Trial 17 with params: {'learning_rate': 0.0020085822314002493, 'weight_decay': 0.008, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.7957,2.385157,0.3623,0.425926,0.3623,0.348693
2,1.7919,1.863603,0.4924,0.541572,0.4924,0.485705


[I 2025-03-31 22:12:33,322] Trial 17 pruned. 


Trial 18 with params: {'learning_rate': 0.0026868566033176914, 'weight_decay': 0.01, 'warmup_steps': 11}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,4.5551,4.563218,0.0144,0.001789,0.0144,0.002298
2,4.5625,4.548606,0.0221,0.00648,0.0221,0.004903


[I 2025-03-31 22:15:26,829] Trial 18 pruned. 


Trial 19 with params: {'learning_rate': 0.00017098269191031398, 'weight_decay': 0.005, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5427,0.745009,0.7848,0.791389,0.7848,0.782768
2,0.4521,0.600238,0.8263,0.833895,0.8263,0.826351
3,0.2299,0.582568,0.8338,0.837744,0.8338,0.833017
4,0.1158,0.594926,0.8403,0.844638,0.8403,0.840527
5,0.0549,0.604499,0.8489,0.852855,0.8489,0.849169
6,0.0231,0.625468,0.8531,0.854629,0.8531,0.853096
7,0.0076,0.630627,0.8551,0.856251,0.8551,0.855021


[I 2025-03-31 22:25:41,380] Trial 19 finished with value: 0.8550210810424264 and parameters: {'learning_rate': 0.00017098269191031398, 'weight_decay': 0.005, 'warmup_steps': 32}. Best is trial 19 with value: 0.8550210810424264.


Trial 20 with params: {'learning_rate': 0.00017843716893255613, 'weight_decay': 0.006, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5327,0.749371,0.7827,0.792057,0.7827,0.782135
2,0.4486,0.610554,0.8206,0.828978,0.8206,0.821068
3,0.229,0.605294,0.8336,0.839862,0.8336,0.832921
4,0.1187,0.605127,0.8386,0.844021,0.8386,0.839026
5,0.0547,0.620021,0.8448,0.84975,0.8448,0.845294


[I 2025-03-31 22:32:45,493] Trial 20 pruned. 


Trial 21 with params: {'learning_rate': 0.00013838706756599107, 'weight_decay': 0.006, 'warmup_steps': 31}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6524,0.750771,0.7894,0.795823,0.7894,0.788333
2,0.456,0.582339,0.8285,0.834051,0.8285,0.828418
3,0.2293,0.596644,0.8296,0.836416,0.8296,0.829461
4,0.1163,0.58215,0.8462,0.848756,0.8462,0.846178
5,0.0557,0.611288,0.8466,0.849517,0.8466,0.846684
6,0.0245,0.630725,0.8516,0.853776,0.8516,0.85192
7,0.0098,0.637584,0.8526,0.854103,0.8526,0.85273


[I 2025-03-31 22:42:37,418] Trial 21 finished with value: 0.8527297661452897 and parameters: {'learning_rate': 0.00013838706756599107, 'weight_decay': 0.006, 'warmup_steps': 31}. Best is trial 19 with value: 0.8550210810424264.


Trial 22 with params: {'learning_rate': 6.232465593002573e-05, 'weight_decay': 0.005, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.21,0.980993,0.756,0.760952,0.756,0.754061
2,0.6796,0.678715,0.8119,0.816088,0.8119,0.811311
3,0.39,0.607861,0.8252,0.829092,0.8252,0.824428


[I 2025-03-31 22:46:44,716] Trial 22 pruned. 


Trial 23 with params: {'learning_rate': 0.0002905816595893275, 'weight_decay': 0.005, 'warmup_steps': 30}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4333,0.832704,0.7582,0.772479,0.7582,0.757155
2,0.482,0.667189,0.8057,0.813692,0.8057,0.804794
3,0.2599,0.649698,0.822,0.833135,0.822,0.822717


[I 2025-03-31 22:50:56,317] Trial 23 pruned. 


Trial 24 with params: {'learning_rate': 9.331816530926763e-05, 'weight_decay': 0.008, 'warmup_steps': 31}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8775,0.822334,0.7765,0.782734,0.7765,0.775417
2,0.5238,0.608323,0.8251,0.830911,0.8251,0.824866
3,0.2738,0.580907,0.8337,0.838538,0.8337,0.833182
4,0.1482,0.571038,0.8404,0.843512,0.8404,0.8403
5,0.0786,0.598491,0.8417,0.845345,0.8417,0.841651


[I 2025-03-31 22:57:52,766] Trial 24 pruned. 


Trial 25 with params: {'learning_rate': 0.0002317985117029916, 'weight_decay': 0.003, 'warmup_steps': 31}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4566,0.789253,0.7698,0.785374,0.7698,0.768805
2,0.4552,0.648203,0.8121,0.822805,0.8121,0.812107
3,0.237,0.639435,0.8238,0.83328,0.8238,0.823174


[I 2025-03-31 23:02:03,094] Trial 25 pruned. 


Trial 26 with params: {'learning_rate': 0.000733540863652704, 'weight_decay': 0.006, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.612,1.14919,0.6708,0.701484,0.6708,0.669219
2,0.77,0.926166,0.7341,0.760842,0.7341,0.734767


[I 2025-03-31 23:04:50,707] Trial 26 pruned. 


Trial 27 with params: {'learning_rate': 0.00023602967275791542, 'weight_decay': 0.006, 'warmup_steps': 22}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4491,0.793858,0.7678,0.781048,0.7678,0.767079
2,0.4666,0.628257,0.8183,0.826944,0.8183,0.818509
3,0.245,0.640081,0.8252,0.831921,0.8252,0.824599


[I 2025-03-31 23:08:59,343] Trial 27 pruned. 


Trial 28 with params: {'learning_rate': 0.00033246076824024954, 'weight_decay': 0.0, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4347,0.847114,0.7521,0.767577,0.7521,0.749581
2,0.5182,0.708828,0.791,0.803672,0.791,0.790187
3,0.2834,0.659577,0.8198,0.826965,0.8198,0.81904


[I 2025-03-31 23:13:10,504] Trial 28 pruned. 


Trial 29 with params: {'learning_rate': 0.0007328609105663036, 'weight_decay': 0.004, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6329,1.117058,0.6758,0.699157,0.6758,0.671611
2,0.777,0.919477,0.7299,0.753615,0.7299,0.730382


[I 2025-03-31 23:15:56,091] Trial 29 pruned. 


Trial 30 with params: {'learning_rate': 0.00010311216656631768, 'weight_decay': 0.004, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8522,0.801374,0.7828,0.789485,0.7828,0.782276
2,0.5224,0.603055,0.8201,0.826309,0.8201,0.820036
3,0.2699,0.58113,0.8323,0.837366,0.8323,0.83218
4,0.1434,0.566185,0.8408,0.844457,0.8408,0.840859
5,0.0738,0.583957,0.8418,0.845278,0.8418,0.84197


[I 2025-03-31 23:23:00,676] Trial 30 pruned. 


Trial 31 with params: {'learning_rate': 0.000258339139521669, 'weight_decay': 0.007, 'warmup_steps': 31}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4696,0.798169,0.7675,0.782516,0.7675,0.765978
2,0.4725,0.682918,0.8051,0.816717,0.8051,0.805737
3,0.2476,0.629038,0.8261,0.833036,0.8261,0.82561
4,0.1303,0.626855,0.8324,0.837107,0.8324,0.832751
5,0.0626,0.639783,0.8435,0.846703,0.8435,0.843366
6,0.0252,0.642688,0.8516,0.853819,0.8516,0.851586
7,0.0076,0.63329,0.8571,0.858365,0.8571,0.857054


[I 2025-03-31 23:32:53,873] Trial 31 finished with value: 0.8570542008480629 and parameters: {'learning_rate': 0.000258339139521669, 'weight_decay': 0.007, 'warmup_steps': 31}. Best is trial 31 with value: 0.8570542008480629.
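
The study log lines can be summarised programmatically instead of being read by eye. The following is a minimal sketch that assumes the Optuna log line format exactly as it appears in this output; the function name `summarize` and the variable `sample_log` are hypothetical, and the sample lines are copied from the log above with the parameter dictionaries shortened.

```python
import re

# Patterns assume the log line wording seen in this notebook's output.
FINISHED = re.compile(r"Trial (\d+) finished with value: ([0-9.eE+-]+) and parameters")
PRUNED = re.compile(r"Trial (\d+) pruned")

def summarize(log_text):
    """Return ({trial: value}, [pruned trials], best trial) from a log string."""
    finished, pruned = {}, []
    for line in log_text.splitlines():
        match = FINISHED.search(line)
        if match:
            finished[int(match.group(1))] = float(match.group(2))
            continue
        match = PRUNED.search(line)
        if match:
            pruned.append(int(match.group(1)))
    # The objective (F1) is maximised, so the best trial has the largest value.
    best = max(finished, key=finished.get) if finished else None
    return finished, pruned, best

sample_log = """\
[I 2025-03-31 22:25:41,380] Trial 19 finished with value: 0.8550210810424264 and parameters: {...}. Best is trial 19 with value: 0.8550210810424264.
[I 2025-03-31 22:32:45,493] Trial 20 pruned.
[I 2025-03-31 22:42:37,418] Trial 21 finished with value: 0.8527297661452897 and parameters: {...}. Best is trial 19 with value: 0.8550210810424264.
[I 2025-03-31 23:32:53,873] Trial 31 finished with value: 0.8570542008480629 and parameters: {...}. Best is trial 31 with value: 0.8570542008480629.
"""

finished, pruned, best = summarize(sample_log)
print(best, finished[best], pruned)
```

When the study object is still in memory, querying it directly is preferable to parsing text; the parser is only useful for saved output such as this one.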


Trial 32 with params: {'learning_rate': 0.0005389429697782068, 'weight_decay': 0.009000000000000001, 'warmup_steps': 31}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5115,1.023345,0.7002,0.720385,0.7002,0.699179
2,0.6377,0.783004,0.7729,0.789121,0.7729,0.774164


[I 2025-03-31 23:35:40,258] Trial 32 pruned. 


Trial 33 with params: {'learning_rate': 0.0002486823675715325, 'weight_decay': 0.007, 'warmup_steps': 31}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4395,0.775452,0.7769,0.788466,0.7769,0.774592
2,0.4586,0.633096,0.8143,0.820524,0.8143,0.814537
3,0.245,0.677278,0.8185,0.829461,0.8185,0.818548


[I 2025-03-31 23:39:52,057] Trial 33 pruned. 


Trial 34 with params: {'learning_rate': 0.00020403497206333, 'weight_decay': 0.007, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4916,0.748407,0.7819,0.791661,0.7819,0.780617
2,0.4563,0.607255,0.8214,0.827115,0.8214,0.821247
3,0.2382,0.612566,0.8285,0.836051,0.8285,0.828523


[I 2025-03-31 23:44:03,392] Trial 34 pruned. 


Trial 35 with params: {'learning_rate': 8.626241033204354e-05, 'weight_decay': 0.006, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9498,0.85191,0.7766,0.782542,0.7766,0.775666
2,0.566,0.621528,0.8225,0.828425,0.8225,0.821969
3,0.3025,0.587564,0.8319,0.836017,0.8319,0.831083
4,0.1686,0.577079,0.8377,0.840888,0.8377,0.837903
5,0.092,0.595771,0.8388,0.843237,0.8388,0.839326


[I 2025-03-31 23:51:24,282] Trial 35 pruned. 


Trial 36 with params: {'learning_rate': 0.004049761177508626, 'weight_decay': 0.006, 'warmup_steps': 3}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,4.6426,4.616481,0.0106,0.000315,0.0106,0.000556
2,4.6193,4.614971,0.01,0.000868,0.01,0.000373


[I 2025-03-31 23:54:19,466] Trial 36 pruned. 
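
A reading aid for diverged trials such as this one: a validation loss near 4.6 together with an accuracy near 0.01 matches a classifier that predicts all 100 classes with equal probability. The sketch below works out that baseline, assuming the reported loss is the standard cross-entropy with natural logarithm (the log does not state this explicitly); the variable names are hypothetical.

```python
import math

NUM_CLASSES = 100  # CIFAR100

# Cross-entropy of a uniform prediction: -ln(1 / K) = ln(K).
chance_loss = math.log(NUM_CLASSES)

# Expected accuracy of uniform guessing on a class-balanced test set.
chance_accuracy = 1 / NUM_CLASSES

print(f"chance-level loss: {chance_loss:.4f}")
print(f"chance-level accuracy: {chance_accuracy:.4f}")
```

Trials whose metrics stay at these values after the first epochs have not learned anything useful at the sampled learning rate, which is consistent with them being pruned early.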


Trial 37 with params: {'learning_rate': 0.00016789423526605387, 'weight_decay': 0.0, 'warmup_steps': 19}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5513,0.758888,0.7841,0.794722,0.7841,0.783405
2,0.4585,0.610711,0.8223,0.829478,0.8223,0.821508
3,0.2362,0.598357,0.8313,0.838719,0.8313,0.831013
4,0.1213,0.589775,0.8416,0.845554,0.8416,0.84164
5,0.0584,0.615421,0.8465,0.850888,0.8465,0.846466
6,0.025,0.635299,0.8534,0.85521,0.8534,0.853327
7,0.0093,0.638881,0.8554,0.85702,0.8554,0.855194


[I 2025-04-01 00:04:33,766] Trial 37 finished with value: 0.8551944205214187 and parameters: {'learning_rate': 0.00016789423526605387, 'weight_decay': 0.0, 'warmup_steps': 19}. Best is trial 31 with value: 0.8570542008480629.


Trial 38 with params: {'learning_rate': 0.0002949188173299918, 'weight_decay': 0.005, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4306,0.825173,0.7618,0.775104,0.7618,0.759547
2,0.4842,0.688112,0.7965,0.808637,0.7965,0.796416
3,0.2553,0.665639,0.8185,0.824466,0.8185,0.817338


[I 2025-04-01 00:08:47,131] Trial 38 pruned. 


Trial 39 with params: {'learning_rate': 0.00019920107654773192, 'weight_decay': 0.008, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5093,0.74006,0.7858,0.794853,0.7858,0.785188
2,0.459,0.63797,0.8133,0.823376,0.8133,0.813777
3,0.2361,0.626501,0.8224,0.831778,0.8224,0.822034


[I 2025-04-01 00:13:10,194] Trial 39 pruned. 


Trial 40 with params: {'learning_rate': 0.004241076779716196, 'weight_decay': 0.003, 'warmup_steps': 19}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,4.6227,4.730128,0.0098,0.000682,0.0098,0.001032
2,4.6293,4.619731,0.01,0.000298,0.01,0.000578
3,4.5899,5.221293,0.0101,0.00013,0.0101,0.000255


[I 2025-04-01 00:17:35,002] Trial 40 pruned. 


Trial 41 with params: {'learning_rate': 0.00013083770591206187, 'weight_decay': 0.001, 'warmup_steps': 19}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6643,0.775658,0.783,0.791248,0.783,0.782115
2,0.4754,0.596844,0.8239,0.831156,0.8239,0.82381
3,0.2424,0.573573,0.8378,0.841924,0.8378,0.837606
4,0.1256,0.578133,0.8404,0.84405,0.8404,0.840547
5,0.061,0.591153,0.8484,0.852969,0.8484,0.848905
6,0.0273,0.617423,0.8493,0.850423,0.8493,0.849078
7,0.0114,0.632217,0.8518,0.853158,0.8518,0.851559


[I 2025-04-01 00:27:53,174] Trial 41 finished with value: 0.8515590748190037 and parameters: {'learning_rate': 0.00013083770591206187, 'weight_decay': 0.001, 'warmup_steps': 19}. Best is trial 31 with value: 0.8570542008480629.


Trial 42 with params: {'learning_rate': 0.00017338620514654378, 'weight_decay': 0.0, 'warmup_steps': 20}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.519,0.766355,0.7786,0.788525,0.7786,0.776899
2,0.4489,0.604085,0.8233,0.832137,0.8233,0.823726
3,0.2294,0.592386,0.8317,0.83812,0.8317,0.831986
4,0.1172,0.610501,0.8355,0.840193,0.8355,0.835713
5,0.0552,0.63329,0.8424,0.84646,0.8424,0.842618


[I 2025-04-01 00:35:26,463] Trial 42 pruned. 


Trial 43 with params: {'learning_rate': 0.0001358185582556629, 'weight_decay': 0.001, 'warmup_steps': 22}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6563,0.760994,0.7906,0.79686,0.7906,0.789621
2,0.4746,0.614245,0.8234,0.831734,0.8234,0.823151
3,0.2416,0.5813,0.8342,0.839607,0.8342,0.834276
4,0.1231,0.57204,0.8438,0.846459,0.8438,0.843913
5,0.06,0.599149,0.848,0.851263,0.848,0.848236
6,0.0259,0.631939,0.8485,0.851232,0.8485,0.848569
7,0.0108,0.630256,0.8555,0.857495,0.8555,0.855702


[I 2025-04-01 00:45:42,331] Trial 43 finished with value: 0.85570156280188 and parameters: {'learning_rate': 0.0001358185582556629, 'weight_decay': 0.001, 'warmup_steps': 22}. Best is trial 31 with value: 0.8570542008480629.


Trial 44 with params: {'learning_rate': 0.00019373507601239386, 'weight_decay': 0.002, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4989,0.770339,0.7739,0.783586,0.7739,0.772242
2,0.4553,0.63693,0.8138,0.825156,0.8138,0.814
3,0.235,0.615476,0.8243,0.834266,0.8243,0.824191


[I 2025-04-01 00:50:08,497] Trial 44 pruned. 


Trial 45 with params: {'learning_rate': 0.00010931050790575246, 'weight_decay': 0.002, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7446,0.778847,0.7846,0.790233,0.7846,0.783308
2,0.4959,0.599431,0.8264,0.832248,0.8264,0.826129
3,0.2561,0.585671,0.8319,0.836389,0.8319,0.831257
4,0.1351,0.566549,0.8433,0.846629,0.8433,0.843344
5,0.067,0.589141,0.8448,0.849136,0.8448,0.845363
6,0.032,0.608888,0.8478,0.849977,0.8478,0.847863
7,0.015,0.622038,0.8494,0.85093,0.8494,0.849272


[I 2025-04-01 01:00:00,306] Trial 45 finished with value: 0.849272215217933 and parameters: {'learning_rate': 0.00010931050790575246, 'weight_decay': 0.002, 'warmup_steps': 24}. Best is trial 31 with value: 0.8570542008480629.


Trial 46 with params: {'learning_rate': 0.00017679488702810524, 'weight_decay': 0.0, 'warmup_steps': 29}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5436,0.770267,0.778,0.788297,0.778,0.777044
2,0.4544,0.600685,0.8215,0.830143,0.8215,0.821982
3,0.2301,0.609743,0.8285,0.834919,0.8285,0.827995
4,0.1168,0.602211,0.8396,0.843793,0.8396,0.83985
5,0.0552,0.608179,0.8481,0.852169,0.8481,0.848626
6,0.0224,0.630569,0.8525,0.854726,0.8525,0.852861
7,0.0078,0.633235,0.8554,0.857416,0.8554,0.855539


[I 2025-04-01 01:09:55,214] Trial 46 finished with value: 0.8555389363869504 and parameters: {'learning_rate': 0.00017679488702810524, 'weight_decay': 0.0, 'warmup_steps': 29}. Best is trial 31 with value: 0.8570542008480629.


Trial 47 with params: {'learning_rate': 0.00013509134012626786, 'weight_decay': 0.0, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.698,0.774146,0.7818,0.790617,0.7818,0.780833
2,0.4804,0.585448,0.8299,0.835645,0.8299,0.82992
3,0.2451,0.590335,0.8312,0.837799,0.8312,0.830954
4,0.1264,0.572273,0.8433,0.847023,0.8433,0.843529
5,0.0615,0.601824,0.8464,0.850562,0.8464,0.8466
6,0.0277,0.620177,0.8474,0.849222,0.8474,0.847501
7,0.012,0.630865,0.8494,0.851534,0.8494,0.849525


[I 2025-04-01 01:19:47,754] Trial 47 finished with value: 0.8495252619729601 and parameters: {'learning_rate': 0.00013509134012626786, 'weight_decay': 0.0, 'warmup_steps': 32}. Best is trial 31 with value: 0.8570542008480629.


Trial 48 with params: {'learning_rate': 0.0002835116575606236, 'weight_decay': 0.0, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.431,0.83713,0.754,0.773166,0.754,0.754158
2,0.4793,0.640988,0.8154,0.821288,0.8154,0.815013
3,0.2547,0.64939,0.8201,0.827165,0.8201,0.819386


[I 2025-04-01 01:23:59,702] Trial 48 pruned. 


Trial 49 with params: {'learning_rate': 0.00012908874611713492, 'weight_decay': 0.0, 'warmup_steps': 28}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6986,0.769666,0.7828,0.7906,0.7828,0.78227
2,0.4769,0.603749,0.8262,0.83226,0.8262,0.826024
3,0.2428,0.587585,0.8324,0.837102,0.8324,0.831988
4,0.1217,0.5802,0.8418,0.845542,0.8418,0.841774
5,0.0599,0.607907,0.8425,0.846706,0.8425,0.842821


[I 2025-04-01 01:30:56,776] Trial 49 pruned. 


Trial 50 with params: {'learning_rate': 0.0027800474932883233, 'weight_decay': 0.0, 'warmup_steps': 12}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,4.582,4.655816,0.0089,0.000171,0.0089,0.000332
2,4.5694,4.575641,0.0162,0.002317,0.0162,0.002953
3,4.6005,4.608271,0.0108,0.000847,0.0108,0.001478


[I 2025-04-01 01:35:05,727] Trial 50 pruned. 


Trial 51 with params: {'learning_rate': 0.00015710788766322775, 'weight_decay': 0.0, 'warmup_steps': 21}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5706,0.776594,0.7808,0.791043,0.7808,0.778359
2,0.4663,0.599572,0.8239,0.831888,0.8239,0.82401
3,0.2381,0.590872,0.8335,0.84071,0.8335,0.832983
4,0.1225,0.593359,0.8425,0.845661,0.8425,0.842683
5,0.0586,0.618042,0.8451,0.848454,0.8451,0.845453
6,0.0257,0.631938,0.8485,0.850487,0.8485,0.84861
7,0.0097,0.638034,0.8508,0.852491,0.8508,0.850912


[I 2025-04-01 01:44:48,291] Trial 51 finished with value: 0.8509119600945421 and parameters: {'learning_rate': 0.00015710788766322775, 'weight_decay': 0.0, 'warmup_steps': 21}. Best is trial 31 with value: 0.8570542008480629.


Trial 52 with params: {'learning_rate': 6.1005881023266626e-05, 'weight_decay': 0.007, 'warmup_steps': 7}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.1878,1.003662,0.7585,0.762,0.7585,0.756082
2,0.6973,0.68117,0.8108,0.815634,0.8108,0.810279
3,0.4007,0.602088,0.826,0.828709,0.826,0.825165


[I 2025-04-01 01:49:12,423] Trial 52 pruned. 


Trial 53 with params: {'learning_rate': 0.0001989486888720143, 'weight_decay': 0.005, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4805,0.771186,0.7761,0.788682,0.7761,0.775225
2,0.4422,0.602114,0.8225,0.831877,0.8225,0.822929
3,0.228,0.610326,0.8282,0.834662,0.8282,0.827808


[I 2025-04-01 01:53:36,618] Trial 53 pruned. 


Trial 54 with params: {'learning_rate': 0.003728309996441757, 'weight_decay': 0.0, 'warmup_steps': 29}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,4.584,4.623127,0.0106,0.001011,0.0106,0.001371
2,4.6033,4.611508,0.0102,0.001552,0.0102,0.001775
3,4.5884,4.60574,0.0111,0.011174,0.0111,0.001639


[I 2025-04-01 01:58:04,032] Trial 54 pruned. 


Trial 55 with params: {'learning_rate': 0.0006137660286321804, 'weight_decay': 0.001, 'warmup_steps': 31}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5393,1.07101,0.6915,0.713639,0.6915,0.691376
2,0.6863,0.830558,0.7583,0.771121,0.7583,0.757544


[I 2025-04-01 02:00:55,313] Trial 55 pruned. 


Trial 56 with params: {'learning_rate': 0.004913837305728667, 'weight_decay': 0.002, 'warmup_steps': 0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,4.7041,5.142561,0.01,0.0001,0.01,0.000198
2,4.624,4.614953,0.0113,0.000305,0.0113,0.00058
3,4.6044,4.613128,0.01,0.0001,0.01,0.000198


[I 2025-04-01 02:05:26,110] Trial 56 pruned. 


Trial 57 with params: {'learning_rate': 7.867832312882212e-05, 'weight_decay': 0.0, 'warmup_steps': 21}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.0272,0.907437,0.7644,0.771015,0.7644,0.762984
2,0.6009,0.637668,0.8199,0.824746,0.8199,0.819339
3,0.3301,0.593761,0.8275,0.831622,0.8275,0.826902


[I 2025-04-01 02:09:49,641] Trial 57 pruned. 


Trial 58 with params: {'learning_rate': 0.00021771047684957567, 'weight_decay': 0.01, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4627,0.781936,0.7703,0.782937,0.7703,0.768222
2,0.4547,0.615618,0.8162,0.821411,0.8162,0.815624
3,0.2347,0.603345,0.8272,0.833676,0.8272,0.8261


[I 2025-04-01 02:14:17,303] Trial 58 pruned. 


Trial 59 with params: {'learning_rate': 0.0001332810682812896, 'weight_decay': 0.006, 'warmup_steps': 29}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6907,0.788821,0.7785,0.788232,0.7785,0.777544
2,0.4832,0.593138,0.8248,0.830776,0.8248,0.824819
3,0.2454,0.583595,0.8336,0.840138,0.8336,0.832942
4,0.126,0.583131,0.8403,0.844554,0.8403,0.840907
5,0.0609,0.600922,0.8438,0.848468,0.8438,0.84421
6,0.027,0.622298,0.8484,0.851164,0.8484,0.848593
7,0.0113,0.621383,0.8525,0.854052,0.8525,0.852361


[I 2025-04-01 02:24:40,026] Trial 59 finished with value: 0.8523608849540999 and parameters: {'learning_rate': 0.0001332810682812896, 'weight_decay': 0.006, 'warmup_steps': 29}. Best is trial 31 with value: 0.8570542008480629.


Trial 60 with params: {'learning_rate': 0.0002597323899851586, 'weight_decay': 0.003, 'warmup_steps': 11}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4455,0.789119,0.7707,0.781989,0.7707,0.769785
2,0.48,0.635474,0.8146,0.822409,0.8146,0.814755


[I 2025-04-01 02:27:37,740] Trial 60 pruned. 


Trial 61 with params: {'learning_rate': 0.0001557729158539645, 'weight_decay': 0.007, 'warmup_steps': 22}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5692,0.782707,0.7803,0.790831,0.7803,0.779488
2,0.459,0.598628,0.8243,0.834467,0.8243,0.824916
3,0.2326,0.598909,0.8311,0.838326,0.8311,0.83136
4,0.117,0.586578,0.8448,0.848924,0.8448,0.844803
5,0.0578,0.602059,0.8482,0.852673,0.8482,0.848924
6,0.025,0.627017,0.851,0.853395,0.851,0.851176
7,0.0094,0.627834,0.8537,0.855425,0.8537,0.85387


[I 2025-04-01 02:38:01,767] Trial 61 finished with value: 0.8538702850762562 and parameters: {'learning_rate': 0.0001557729158539645, 'weight_decay': 0.007, 'warmup_steps': 22}. Best is trial 31 with value: 0.8570542008480629.


Trial 62 with params: {'learning_rate': 0.00011655615959071843, 'weight_decay': 0.007, 'warmup_steps': 21}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7173,0.773948,0.7852,0.792258,0.7852,0.784248
2,0.4906,0.604347,0.8237,0.829626,0.8237,0.823538
3,0.2533,0.578281,0.8366,0.840806,0.8366,0.83617
4,0.1335,0.581766,0.8398,0.842552,0.8398,0.839711
5,0.066,0.607162,0.8422,0.846396,0.8422,0.842524


[I 2025-04-01 02:45:30,102] Trial 62 pruned. 


Trial 63 with params: {'learning_rate': 5.416101708697321e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.4116,1.125751,0.7357,0.738401,0.7357,0.732569
2,0.7937,0.720483,0.8047,0.807471,0.8047,0.803741
3,0.4614,0.621273,0.823,0.82675,0.823,0.822127


[I 2025-04-01 02:49:54,287] Trial 63 pruned. 


Trial 64 with params: {'learning_rate': 0.0014740970021661379, 'weight_decay': 0.005, 'warmup_steps': 13}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.2542,1.889066,0.4861,0.548887,0.4861,0.474148
2,1.3238,1.452634,0.5943,0.643924,0.5943,0.591149


[I 2025-04-01 02:52:46,846] Trial 64 pruned. 


Trial 65 with params: {'learning_rate': 0.00013654495396102535, 'weight_decay': 0.007, 'warmup_steps': 18}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.623,0.768296,0.7847,0.792917,0.7847,0.783627
2,0.4658,0.595848,0.8269,0.832541,0.8269,0.82687
3,0.233,0.588179,0.8323,0.83816,0.8323,0.831854
4,0.1201,0.58736,0.8416,0.846781,0.8416,0.842299
5,0.0565,0.597843,0.8487,0.852178,0.8487,0.8489
6,0.0249,0.628802,0.8498,0.851629,0.8498,0.849848
7,0.0103,0.631118,0.854,0.855782,0.854,0.854098


[I 2025-04-01 03:02:54,112] Trial 65 finished with value: 0.8540980732865708 and parameters: {'learning_rate': 0.00013654495396102535, 'weight_decay': 0.007, 'warmup_steps': 18}. Best is trial 31 with value: 0.8570542008480629.


Trial 66 with params: {'learning_rate': 0.00016055834735144862, 'weight_decay': 0.008, 'warmup_steps': 16}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5755,0.753338,0.7845,0.794737,0.7845,0.784107
2,0.4587,0.589028,0.8262,0.831014,0.8262,0.82533
3,0.2341,0.600582,0.8307,0.836249,0.8307,0.830645
4,0.12,0.58557,0.8411,0.846444,0.8411,0.841935
5,0.0571,0.606564,0.8455,0.851596,0.8455,0.84616
6,0.0248,0.622251,0.8498,0.852878,0.8498,0.849941
7,0.0094,0.620332,0.8523,0.855035,0.8523,0.85272


[I 2025-04-01 03:13:12,712] Trial 66 finished with value: 0.8527197523834285 and parameters: {'learning_rate': 0.00016055834735144862, 'weight_decay': 0.008, 'warmup_steps': 16}. Best is trial 31 with value: 0.8570542008480629.


Trial 67 with params: {'learning_rate': 0.00043944091000312606, 'weight_decay': 0.008, 'warmup_steps': 16}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4501,0.932676,0.727,0.746407,0.727,0.727082
2,0.5821,0.784749,0.7736,0.789552,0.7736,0.77407
3,0.3285,0.708123,0.8006,0.812192,0.8006,0.800906


[I 2025-04-01 03:17:38,072] Trial 67 pruned. 


Trial 68 with params: {'learning_rate': 0.00010934155064576332, 'weight_decay': 0.007, 'warmup_steps': 14}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7562,0.802945,0.7823,0.789552,0.7823,0.78167
2,0.5103,0.609198,0.822,0.828178,0.822,0.821903
3,0.2659,0.587783,0.8279,0.833844,0.8279,0.826863
4,0.1402,0.572125,0.8396,0.843259,0.8396,0.839945
5,0.0707,0.592178,0.8412,0.845324,0.8412,0.841669


[I 2025-04-01 03:25:01,459] Trial 68 pruned. 


Trial 69 with params: {'learning_rate': 0.0001494924194285667, 'weight_decay': 0.005, 'warmup_steps': 16}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5911,0.756805,0.7859,0.795002,0.7859,0.784771
2,0.4623,0.60034,0.8226,0.830498,0.8226,0.822889
3,0.2328,0.614103,0.8252,0.831843,0.8252,0.824969


[I 2025-04-01 03:29:26,551] Trial 69 pruned. 


Trial 70 with params: {'learning_rate': 9.777273002232183e-05, 'weight_decay': 0.007, 'warmup_steps': 19}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8855,0.832389,0.7727,0.778931,0.7727,0.771439
2,0.5452,0.596404,0.8292,0.834942,0.8292,0.828885
3,0.2867,0.577367,0.8326,0.837115,0.8326,0.831755
4,0.1548,0.563215,0.84,0.844762,0.84,0.840275
5,0.0804,0.577288,0.8429,0.847347,0.8429,0.843137


[I 2025-04-01 03:36:52,624] Trial 70 pruned. 


Trial 71 with params: {'learning_rate': 0.00028438316143275045, 'weight_decay': 0.008, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.427,0.810533,0.7612,0.774578,0.7612,0.75966
2,0.4871,0.665464,0.8079,0.816353,0.8079,0.807854
3,0.2588,0.637438,0.8292,0.835347,0.8292,0.828492
4,0.1372,0.646446,0.8331,0.839473,0.8331,0.833802
5,0.0673,0.647117,0.8405,0.843915,0.8405,0.840161


[I 2025-04-01 03:43:59,103] Trial 71 pruned. 


Trial 72 with params: {'learning_rate': 0.0002005036840473374, 'weight_decay': 0.004, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4691,0.759674,0.7746,0.7881,0.7746,0.77326
2,0.447,0.633033,0.8161,0.825849,0.8161,0.816191
3,0.2284,0.607205,0.8305,0.839107,0.8305,0.830766
4,0.115,0.617597,0.8382,0.842989,0.8382,0.838595
5,0.0549,0.621259,0.8449,0.848015,0.8449,0.845121


[I 2025-04-01 03:51:18,700] Trial 72 pruned. 


Trial 73 with params: {'learning_rate': 0.000482439313953908, 'weight_decay': 0.001, 'warmup_steps': 17}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.428,0.93716,0.7287,0.746959,0.7287,0.729153
2,0.6014,0.756211,0.782,0.797467,0.782,0.781476
3,0.3442,0.737892,0.7975,0.806663,0.7975,0.796385


[I 2025-04-01 03:55:43,385] Trial 73 pruned. 


Trial 74 with params: {'learning_rate': 0.00044835113849237766, 'weight_decay': 0.006, 'warmup_steps': 29}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4476,0.936509,0.729,0.746513,0.729,0.726519
2,0.5856,0.753178,0.7826,0.792803,0.7826,0.781313


[I 2025-04-01 03:58:40,830] Trial 74 pruned. 


Trial 75 with params: {'learning_rate': 0.00016223479576125968, 'weight_decay': 0.007, 'warmup_steps': 22}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.564,0.764284,0.7877,0.797346,0.7877,0.78787
2,0.4601,0.61138,0.82,0.827245,0.82,0.819651
3,0.2325,0.601671,0.8292,0.835442,0.8292,0.829291
4,0.1195,0.599623,0.8379,0.842742,0.8379,0.838367
5,0.0573,0.619154,0.8432,0.84773,0.8432,0.843514
6,0.0238,0.647834,0.8471,0.849406,0.8471,0.847252
7,0.0089,0.648851,0.8512,0.852677,0.8512,0.851081


[I 2025-04-01 04:08:59,926] Trial 75 finished with value: 0.8510814892939125 and parameters: {'learning_rate': 0.00016223479576125968, 'weight_decay': 0.007, 'warmup_steps': 22}. Best is trial 31 with value: 0.8570542008480629.


Trial 76 with params: {'learning_rate': 0.00024712623965777986, 'weight_decay': 0.006, 'warmup_steps': 31}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.469,0.777182,0.7706,0.78218,0.7706,0.769779
2,0.4723,0.640132,0.8104,0.820165,0.8104,0.810679


Using the latest cached version of the module from /home/jovyan/.cache/huggingface/modules/evaluate_modules/metrics/evaluate-metric--f1/34c46321f42186df33a6260966e34a368f14868d9cc2ba47d142112e2800d233 (last modified on Sat Mar 29 17:35:20 2025) since it couldn't be found locally at evaluate-metric--f1, or remotely on the Hugging Face Hub.
[I 2025-04-01 04:12:04,070] Trial 76 pruned. 


Trial 77 with params: {'learning_rate': 0.00023210198222455396, 'weight_decay': 0.002, 'warmup_steps': 18}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4655,0.79071,0.7712,0.783064,0.7712,0.769725
2,0.4686,0.632329,0.8176,0.826607,0.8176,0.818105
3,0.2474,0.621656,0.8271,0.832082,0.8271,0.826657
4,0.1307,0.633252,0.8357,0.84135,0.8357,0.835578
5,0.0606,0.641242,0.844,0.85032,0.844,0.844864
6,0.0248,0.647978,0.8473,0.850773,0.8473,0.847453
7,0.0077,0.638003,0.8539,0.85635,0.8539,0.854121


[I 2025-04-01 04:22:31,141] Trial 77 finished with value: 0.8541212466611162 and parameters: {'learning_rate': 0.00023210198222455396, 'weight_decay': 0.002, 'warmup_steps': 18}. Best is trial 31 with value: 0.8570542008480629.


Trial 78 with params: {'learning_rate': 0.00019175658864349472, 'weight_decay': 0.002, 'warmup_steps': 16}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4928,0.742568,0.784,0.792893,0.784,0.782907
2,0.4554,0.625741,0.8162,0.824979,0.8162,0.815831
3,0.2354,0.620975,0.8257,0.833317,0.8257,0.825864
4,0.1188,0.597188,0.8398,0.845501,0.8398,0.840532
5,0.0576,0.633523,0.8428,0.846762,0.8428,0.843144


[I 2025-04-01 04:29:44,806] Trial 78 pruned. 


Trial 79 with params: {'learning_rate': 0.000252073530901726, 'weight_decay': 0.002, 'warmup_steps': 18}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4421,0.79934,0.7676,0.778813,0.7676,0.765952
2,0.4627,0.642552,0.814,0.822172,0.814,0.813009


[I 2025-04-01 04:32:31,569] Trial 79 pruned. 


Trial 80 with params: {'learning_rate': 9.349140721589015e-05, 'weight_decay': 0.002, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.949,0.867435,0.7677,0.774778,0.7677,0.766674
2,0.564,0.608581,0.8235,0.828627,0.8235,0.823067
3,0.296,0.578212,0.8314,0.834996,0.8314,0.830635
4,0.1607,0.57456,0.8392,0.842641,0.8392,0.83908
5,0.0859,0.583473,0.8413,0.845879,0.8413,0.841587


[I 2025-04-01 04:40:03,518] Trial 80 pruned. 


Trial 81 with params: {'learning_rate': 9.295562520499347e-05, 'weight_decay': 0.005, 'warmup_steps': 31}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8803,0.831874,0.7788,0.786392,0.7788,0.777862
2,0.5385,0.61199,0.8226,0.828686,0.8226,0.822167
3,0.2818,0.578362,0.8315,0.835468,0.8315,0.830598


[I 2025-04-01 04:44:23,856] Trial 81 pruned. 


Trial 82 with params: {'learning_rate': 0.0003156208034765572, 'weight_decay': 0.002, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4038,0.867817,0.7465,0.766909,0.7465,0.7448
2,0.503,0.666729,0.8061,0.815236,0.8061,0.805904
3,0.2712,0.638692,0.8213,0.827603,0.8213,0.820427


[I 2025-04-01 04:48:44,967] Trial 82 pruned. 


Trial 83 with params: {'learning_rate': 0.00011959454113040998, 'weight_decay': 0.007, 'warmup_steps': 31}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7418,0.775704,0.7855,0.792615,0.7855,0.784829
2,0.496,0.598719,0.8243,0.830395,0.8243,0.823625
3,0.2546,0.583795,0.8298,0.835764,0.8298,0.829033


[I 2025-04-01 04:53:05,401] Trial 83 pruned. 


Trial 84 with params: {'learning_rate': 0.00024176339379894683, 'weight_decay': 0.001, 'warmup_steps': 29}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4676,0.784147,0.7722,0.786091,0.7722,0.771568
2,0.4623,0.625362,0.8158,0.823993,0.8158,0.815617
3,0.2417,0.607421,0.8306,0.835712,0.8306,0.83029
4,0.1256,0.624659,0.8303,0.83457,0.8303,0.829911
5,0.0599,0.630139,0.8473,0.850803,0.8473,0.847309
6,0.0246,0.638107,0.8526,0.854302,0.8526,0.852473
7,0.0075,0.636228,0.8577,0.85891,0.8577,0.857595


[I 2025-04-01 05:03:33,727] Trial 84 finished with value: 0.8575947234486994 and parameters: {'learning_rate': 0.00024176339379894683, 'weight_decay': 0.001, 'warmup_steps': 29}. Best is trial 84 with value: 0.8575947234486994.


Trial 85 with params: {'learning_rate': 0.0001650372628745251, 'weight_decay': 0.001, 'warmup_steps': 28}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5604,0.740943,0.7892,0.798158,0.7892,0.788575
2,0.4406,0.595052,0.8227,0.829815,0.8227,0.822618
3,0.2202,0.586743,0.8367,0.842072,0.8367,0.836446
4,0.109,0.597003,0.8413,0.845418,0.8413,0.841805
5,0.0518,0.610939,0.8479,0.850611,0.8479,0.847735
6,0.0212,0.638337,0.8542,0.856084,0.8542,0.85425
7,0.0079,0.638786,0.8559,0.857587,0.8559,0.855952


[I 2025-04-01 05:13:54,596] Trial 85 finished with value: 0.855952197529166 and parameters: {'learning_rate': 0.0001650372628745251, 'weight_decay': 0.001, 'warmup_steps': 28}. Best is trial 84 with value: 0.8575947234486994.


Trial 86 with params: {'learning_rate': 0.0002531376394893783, 'weight_decay': 0.001, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4336,0.777848,0.7708,0.781394,0.7708,0.770052
2,0.4714,0.640336,0.8146,0.821951,0.8146,0.813568
3,0.2459,0.648469,0.82,0.828279,0.82,0.819652


[I 2025-04-01 05:18:16,708] Trial 86 pruned. 


Trial 87 with params: {'learning_rate': 0.00023571487846759192, 'weight_decay': 0.001, 'warmup_steps': 29}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4618,0.794908,0.7676,0.780539,0.7676,0.76524
2,0.4584,0.611386,0.8227,0.829102,0.8227,0.822443
3,0.2403,0.619637,0.8292,0.835668,0.8292,0.828582


[I 2025-04-01 05:22:40,137] Trial 87 pruned. 


Trial 88 with params: {'learning_rate': 6.887549964417383e-05, 'weight_decay': 0.002, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.139,0.920385,0.7666,0.771131,0.7666,0.76492
2,0.6256,0.641484,0.8151,0.819087,0.8151,0.81447


[I 2025-04-01 05:25:40,747] Trial 88 pruned. 


Trial 89 with params: {'learning_rate': 0.00017883267792684685, 'weight_decay': 0.001, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5395,0.754027,0.7815,0.792063,0.7815,0.780501
2,0.4576,0.614218,0.8224,0.829182,0.8224,0.822307
3,0.236,0.608512,0.8288,0.836547,0.8288,0.827917
4,0.1206,0.596785,0.8419,0.845722,0.8419,0.841653
5,0.0569,0.615387,0.8499,0.853536,0.8499,0.850062
6,0.0246,0.632669,0.853,0.854482,0.853,0.852703
7,0.0088,0.632025,0.8561,0.857905,0.8561,0.856235


[I 2025-04-01 05:36:08,794] Trial 89 finished with value: 0.8562347696254196 and parameters: {'learning_rate': 0.00017883267792684685, 'weight_decay': 0.001, 'warmup_steps': 27}. Best is trial 84 with value: 0.8575947234486994.


Trial 90 with params: {'learning_rate': 0.00019824196107398465, 'weight_decay': 0.0, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5116,0.778068,0.7737,0.785318,0.7737,0.773094
2,0.4595,0.615579,0.8189,0.827359,0.8189,0.819327
3,0.2363,0.626865,0.8227,0.828942,0.8227,0.82206


[I 2025-04-01 05:40:34,082] Trial 90 pruned. 


Trial 91 with params: {'learning_rate': 0.0002129133294674653, 'weight_decay': 0.003, 'warmup_steps': 30}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4812,0.782635,0.775,0.786452,0.775,0.773755
2,0.4541,0.63156,0.8175,0.825781,0.8175,0.817059
3,0.2366,0.614134,0.8259,0.83408,0.8259,0.82608


[I 2025-04-01 05:44:55,821] Trial 91 pruned. 


Trial 92 with params: {'learning_rate': 0.00011146393064827147, 'weight_decay': 0.0, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7724,0.794533,0.7824,0.791692,0.7824,0.782026
2,0.5043,0.600091,0.8241,0.82958,0.8241,0.823399
3,0.2577,0.575612,0.8356,0.840688,0.8356,0.835193
4,0.1344,0.583219,0.8393,0.843832,0.8393,0.839908
5,0.068,0.591654,0.8438,0.847775,0.8438,0.844181
6,0.0319,0.621255,0.8485,0.850656,0.8485,0.848528
7,0.0149,0.623783,0.8498,0.851803,0.8498,0.850051


[I 2025-04-01 05:55:26,916] Trial 92 finished with value: 0.8500510425818866 and parameters: {'learning_rate': 0.00011146393064827147, 'weight_decay': 0.0, 'warmup_steps': 25}. Best is trial 84 with value: 0.8575947234486994.


Trial 93 with params: {'learning_rate': 0.0001641240339677858, 'weight_decay': 0.002, 'warmup_steps': 31}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6061,0.776637,0.7774,0.787571,0.7774,0.775892
2,0.4497,0.607014,0.8236,0.830148,0.8236,0.823534
3,0.2286,0.596106,0.8318,0.838599,0.8318,0.831825
4,0.1127,0.593232,0.8402,0.843797,0.8402,0.840238
5,0.0537,0.627095,0.844,0.848416,0.844,0.844306


[I 2025-04-01 06:02:44,649] Trial 93 pruned. 


Trial 94 with params: {'learning_rate': 0.00012331553756891753, 'weight_decay': 0.002, 'warmup_steps': 20}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.69,0.788502,0.7802,0.791414,0.7802,0.779499
2,0.4808,0.590984,0.8279,0.832385,0.8279,0.827287
3,0.2446,0.585115,0.8306,0.836248,0.8306,0.830131
4,0.126,0.571192,0.8408,0.844441,0.8408,0.840884
5,0.0627,0.592746,0.8497,0.853529,0.8497,0.850016
6,0.0284,0.621559,0.8474,0.849148,0.8474,0.847293
7,0.0124,0.63094,0.8511,0.852882,0.8511,0.851098


[I 2025-04-01 06:13:12,303] Trial 94 finished with value: 0.8510979097805386 and parameters: {'learning_rate': 0.00012331553756891753, 'weight_decay': 0.002, 'warmup_steps': 20}. Best is trial 84 with value: 0.8575947234486994.


Trial 95 with params: {'learning_rate': 0.0002767967782108509, 'weight_decay': 0.0, 'warmup_steps': 22}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4316,0.786517,0.7687,0.780129,0.7687,0.767182
2,0.4828,0.641882,0.8125,0.819024,0.8125,0.811441


[I 2025-04-01 06:16:05,927] Trial 95 pruned. 


Trial 96 with params: {'learning_rate': 0.000251198311411577, 'weight_decay': 0.002, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4324,0.801743,0.7638,0.777164,0.7638,0.762987
2,0.466,0.639932,0.8122,0.82011,0.8122,0.812053
3,0.2451,0.625218,0.8288,0.835763,0.8288,0.828862
4,0.13,0.622009,0.8333,0.838861,0.8333,0.833469
5,0.0615,0.644641,0.8395,0.844213,0.8395,0.839815


[I 2025-04-01 06:23:25,903] Trial 96 pruned. 


Trial 97 with params: {'learning_rate': 0.00013747370875648516, 'weight_decay': 0.0, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6571,0.758353,0.7874,0.796239,0.7874,0.786697
2,0.4556,0.588057,0.8287,0.834476,0.8287,0.828556
3,0.232,0.582839,0.8331,0.839577,0.8331,0.83312
4,0.116,0.581106,0.8442,0.848022,0.8442,0.844264
5,0.0564,0.60192,0.8478,0.851202,0.8478,0.848063
6,0.0245,0.624631,0.8506,0.852907,0.8506,0.850796
7,0.0097,0.631688,0.8526,0.854562,0.8526,0.85272


[I 2025-04-01 06:33:50,309] Trial 97 finished with value: 0.8527197732345759 and parameters: {'learning_rate': 0.00013747370875648516, 'weight_decay': 0.0, 'warmup_steps': 32}. Best is trial 84 with value: 0.8575947234486994.


Trial 98 with params: {'learning_rate': 0.0035054904723296637, 'weight_decay': 0.009000000000000001, 'warmup_steps': 0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,4.6136,4.671011,0.0121,0.000505,0.0121,0.000957
2,4.5806,,0.01,0.0001,0.01,0.000198


[I 2025-04-01 06:36:50,728] Trial 98 pruned. 


Trial 99 with params: {'learning_rate': 0.00032988803669291067, 'weight_decay': 0.0, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4364,0.829438,0.7551,0.770281,0.7551,0.753706
2,0.5026,0.691915,0.8017,0.809173,0.8017,0.801368
3,0.2744,0.633137,0.8226,0.827964,0.8226,0.821954


[I 2025-04-01 06:41:11,158] Trial 99 pruned. 


Trial 100 with params: {'learning_rate': 0.00024257879228753928, 'weight_decay': 0.001, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4615,0.7781,0.7738,0.785121,0.7738,0.77188
2,0.4578,0.659305,0.8056,0.816454,0.8056,0.806472


[I 2025-04-01 06:44:06,976] Trial 100 pruned. 


Trial 101 with params: {'learning_rate': 0.00010918237090316295, 'weight_decay': 0.0, 'warmup_steps': 16}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7515,0.789816,0.7848,0.79207,0.7848,0.783355
2,0.5003,0.595096,0.8234,0.828786,0.8234,0.822769
3,0.2557,0.580256,0.833,0.839052,0.833,0.83244
4,0.1354,0.581082,0.841,0.844466,0.841,0.841089
5,0.0666,0.597515,0.8421,0.845283,0.8421,0.841968


[I 2025-04-01 06:51:32,898] Trial 101 pruned. 


Trial 102 with params: {'learning_rate': 0.00042107547311878626, 'weight_decay': 0.005, 'warmup_steps': 20}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4246,0.910928,0.7346,0.752884,0.7346,0.732461
2,0.5597,0.705703,0.7932,0.805524,0.7932,0.792623


[I 2025-04-01 06:54:34,037] Trial 102 pruned. 


Trial 103 with params: {'learning_rate': 0.0002119064687482245, 'weight_decay': 0.001, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4805,0.792432,0.7728,0.787198,0.7728,0.771065
2,0.4531,0.622361,0.8168,0.825334,0.8168,0.816855
3,0.2346,0.609225,0.8303,0.836565,0.8303,0.830022
4,0.1208,0.610539,0.8398,0.84466,0.8398,0.840126
5,0.0547,0.636496,0.8431,0.846782,0.8431,0.84331


[I 2025-04-01 07:01:46,233] Trial 103 pruned. 


Trial 104 with params: {'learning_rate': 0.0001071668942083213, 'weight_decay': 0.001, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7899,0.802229,0.7809,0.788226,0.7809,0.780332
2,0.5064,0.584997,0.8284,0.833728,0.8284,0.828362
3,0.2617,0.572412,0.8347,0.838457,0.8347,0.833927
4,0.1391,0.576378,0.8389,0.842694,0.8389,0.839013
5,0.071,0.586411,0.8457,0.849581,0.8457,0.846095
6,0.0343,0.607884,0.849,0.850445,0.849,0.848725
7,0.0167,0.610752,0.8501,0.851669,0.8501,0.850078


[I 2025-04-01 07:12:00,323] Trial 104 finished with value: 0.8500780718992536 and parameters: {'learning_rate': 0.0001071668942083213, 'weight_decay': 0.001, 'warmup_steps': 27}. Best is trial 84 with value: 0.8575947234486994.


Trial 105 with params: {'learning_rate': 0.0006078662726350267, 'weight_decay': 0.01, 'warmup_steps': 2}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5245,1.020663,0.7038,0.726301,0.7038,0.701258
2,0.6832,0.86131,0.7529,0.771889,0.7529,0.753125
3,0.4097,0.781361,0.784,0.795213,0.784,0.783794


[I 2025-04-01 07:16:27,018] Trial 105 pruned. 


Trial 106 with params: {'learning_rate': 0.0002535523477632805, 'weight_decay': 0.0, 'warmup_steps': 19}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4453,0.793478,0.7674,0.780335,0.7674,0.766129
2,0.4682,0.644273,0.8114,0.820675,0.8114,0.811595


[I 2025-04-01 07:19:21,798] Trial 106 pruned. 


Trial 107 with params: {'learning_rate': 0.002493660865891546, 'weight_decay': 0.005, 'warmup_steps': 28}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,3.8283,4.201065,0.0822,0.096342,0.0822,0.054657
2,3.4316,3.420828,0.1707,0.19361,0.1707,0.140254
3,3.0349,2.929533,0.265,0.296424,0.265,0.248566


[I 2025-04-01 07:23:58,850] Trial 107 pruned. 


Trial 108 with params: {'learning_rate': 0.00016088500731856073, 'weight_decay': 0.007, 'warmup_steps': 20}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5971,0.778332,0.78,0.790753,0.78,0.778796
2,0.465,0.595557,0.8289,0.835976,0.8289,0.829251
3,0.2368,0.602815,0.8317,0.838722,0.8317,0.831175
4,0.1198,0.606409,0.841,0.845445,0.841,0.840819
5,0.0574,0.612602,0.8465,0.85121,0.8465,0.847043
6,0.025,0.644957,0.8467,0.848981,0.8467,0.846938
7,0.0091,0.644294,0.8505,0.852928,0.8505,0.850639


[I 2025-04-01 07:34:50,304] Trial 108 finished with value: 0.8506389443624147 and parameters: {'learning_rate': 0.00016088500731856073, 'weight_decay': 0.007, 'warmup_steps': 20}. Best is trial 84 with value: 0.8575947234486994.


Trial 109 with params: {'learning_rate': 0.00011050900312372248, 'weight_decay': 0.002, 'warmup_steps': 22}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8169,0.820936,0.7783,0.786704,0.7783,0.77736
2,0.5286,0.605657,0.8258,0.83056,0.8258,0.825448
3,0.2758,0.569777,0.8344,0.839118,0.8344,0.833987
4,0.1473,0.57912,0.8373,0.842104,0.8373,0.83764
5,0.0732,0.596691,0.8409,0.846093,0.8409,0.841615


[I 2025-04-01 07:42:18,162] Trial 109 pruned. 


Trial 110 with params: {'learning_rate': 0.0004889125990488732, 'weight_decay': 0.002, 'warmup_steps': 27}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4542,0.964093,0.7191,0.734869,0.7191,0.71539
2,0.6065,0.767861,0.7723,0.788905,0.7723,0.772491


[I 2025-04-01 07:45:14,816] Trial 110 pruned. 


Trial 111 with params: {'learning_rate': 0.0008421825948317979, 'weight_decay': 0.007, 'warmup_steps': 32}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6977,1.341056,0.622,0.669722,0.622,0.618644
2,0.8425,1.046944,0.7039,0.728262,0.7039,0.702332
3,0.533,0.88267,0.7514,0.769343,0.7514,0.751578


[I 2025-04-01 07:49:39,619] Trial 111 pruned. 


Trial 112 with params: {'learning_rate': 0.00020393357126083418, 'weight_decay': 0.006, 'warmup_steps': 32}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5078,0.75926,0.7765,0.786917,0.7765,0.774702
2,0.4477,0.621416,0.8192,0.826041,0.8192,0.818661
3,0.2272,0.62354,0.8269,0.835883,0.8269,0.826599


[I 2025-04-01 07:54:04,886] Trial 112 pruned. 


Trial 113 with params: {'learning_rate': 0.00021007654913268774, 'weight_decay': 0.008, 'warmup_steps': 27}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4857,0.744484,0.7841,0.792243,0.7841,0.782334
2,0.4573,0.63612,0.8165,0.824061,0.8165,0.816063
3,0.2379,0.61549,0.8247,0.833844,0.8247,0.824491


[I 2025-04-01 07:58:33,290] Trial 113 pruned. 


Trial 114 with params: {'learning_rate': 7.13472662987329e-05, 'weight_decay': 0.007, 'warmup_steps': 26}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.1633,0.958055,0.7621,0.766823,0.7621,0.760179
2,0.6575,0.65186,0.8141,0.818492,0.8141,0.812956


[I 2025-04-01 08:01:27,721] Trial 114 pruned. 


Trial 115 with params: {'learning_rate': 0.0001307644182668806, 'weight_decay': 0.004, 'warmup_steps': 30}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6891,0.763663,0.7836,0.791544,0.7836,0.78253
2,0.4714,0.595338,0.8238,0.831145,0.8238,0.823869
3,0.2474,0.585831,0.8355,0.841119,0.8355,0.834881
4,0.1289,0.581991,0.8416,0.845199,0.8416,0.841813
5,0.0613,0.607352,0.8434,0.84668,0.8434,0.843793
6,0.0276,0.627893,0.8486,0.850467,0.8486,0.848579
7,0.0116,0.635808,0.852,0.853742,0.852,0.852056


[I 2025-04-01 08:11:56,648] Trial 115 finished with value: 0.8520561652418506 and parameters: {'learning_rate': 0.0001307644182668806, 'weight_decay': 0.004, 'warmup_steps': 30}. Best is trial 84 with value: 0.8575947234486994.


Trial 116 with params: {'learning_rate': 0.00013880841023798597, 'weight_decay': 0.0, 'warmup_steps': 27}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6457,0.771836,0.7824,0.791613,0.7824,0.782122
2,0.468,0.602028,0.8209,0.82681,0.8209,0.820952
3,0.2402,0.586023,0.8346,0.83978,0.8346,0.833986
4,0.1214,0.584469,0.8402,0.843839,0.8402,0.840081
5,0.0584,0.611478,0.8462,0.849214,0.8462,0.846324
6,0.0258,0.634541,0.848,0.85009,0.848,0.848264
7,0.0105,0.64592,0.8519,0.853028,0.8519,0.851807


[I 2025-04-01 08:22:20,570] Trial 116 finished with value: 0.8518072593918218 and parameters: {'learning_rate': 0.00013880841023798597, 'weight_decay': 0.0, 'warmup_steps': 27}. Best is trial 84 with value: 0.8575947234486994.


Trial 117 with params: {'learning_rate': 0.0027121193476131807, 'weight_decay': 0.009000000000000001, 'warmup_steps': 18}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,4.5021,4.581251,0.0203,0.003241,0.0203,0.003573
2,4.54,4.626503,0.0104,0.000337,0.0104,0.000538
3,4.5876,4.586789,0.0116,0.001341,0.0116,0.002227


[I 2025-04-01 08:26:44,765] Trial 117 pruned. 


Trial 118 with params: {'learning_rate': 0.00013541939500363146, 'weight_decay': 0.006, 'warmup_steps': 29}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6449,0.761917,0.7847,0.793539,0.7847,0.784037
2,0.4647,0.589492,0.8264,0.832712,0.8264,0.825767
3,0.2388,0.586666,0.8348,0.839165,0.8348,0.833942
4,0.1226,0.577992,0.8439,0.84701,0.8439,0.843945
5,0.0585,0.591946,0.8471,0.85047,0.8471,0.847552
6,0.0259,0.616537,0.8504,0.852468,0.8504,0.850544
7,0.0106,0.623736,0.8538,0.855316,0.8538,0.853913


[I 2025-04-01 08:37:06,071] Trial 118 finished with value: 0.8539130391569877 and parameters: {'learning_rate': 0.00013541939500363146, 'weight_decay': 0.006, 'warmup_steps': 29}. Best is trial 84 with value: 0.8575947234486994.


Trial 119 with params: {'learning_rate': 0.00021314096180181686, 'weight_decay': 0.007, 'warmup_steps': 19}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4608,0.741754,0.7854,0.795187,0.7854,0.78411
2,0.4532,0.606466,0.8238,0.828977,0.8238,0.823326
3,0.2317,0.613102,0.8306,0.83923,0.8306,0.830655
4,0.1204,0.614996,0.8409,0.844623,0.8409,0.840548
5,0.0571,0.626929,0.8455,0.849823,0.8455,0.845665
6,0.0234,0.628423,0.8526,0.854149,0.8526,0.852328
7,0.007,0.629357,0.8567,0.858849,0.8567,0.856787


[I 2025-04-01 08:47:16,989] Trial 119 finished with value: 0.8567871481254612 and parameters: {'learning_rate': 0.00021314096180181686, 'weight_decay': 0.007, 'warmup_steps': 19}. Best is trial 84 with value: 0.8575947234486994.


Trial 120 with params: {'learning_rate': 0.00035059268088946086, 'weight_decay': 0.008, 'warmup_steps': 18}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4101,0.855901,0.7519,0.767058,0.7519,0.750941
2,0.5201,0.700927,0.7963,0.806299,0.7963,0.79636
3,0.2887,0.684316,0.8112,0.819845,0.8112,0.810699


[I 2025-04-01 08:51:35,092] Trial 120 pruned. 


Trial 121 with params: {'learning_rate': 0.00045732402794190655, 'weight_decay': 0.006, 'warmup_steps': 9}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4229,0.925827,0.7345,0.750678,0.7345,0.732482
2,0.591,0.768221,0.7801,0.795844,0.7801,0.780197


[I 2025-04-01 08:54:32,662] Trial 121 pruned. 


Trial 122 with params: {'learning_rate': 0.00020141815353268575, 'weight_decay': 0.006, 'warmup_steps': 23}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4662,0.74577,0.784,0.794914,0.784,0.783471
2,0.4467,0.59197,0.8246,0.832681,0.8246,0.824469
3,0.2286,0.603137,0.8325,0.838564,0.8325,0.831544
4,0.1181,0.58885,0.8402,0.844076,0.8402,0.840121
5,0.0558,0.613192,0.8483,0.852941,0.8483,0.848752
6,0.0222,0.626298,0.8519,0.853847,0.8519,0.851851
7,0.0078,0.623628,0.8585,0.860201,0.8585,0.858592


[I 2025-04-01 09:04:35,954] Trial 122 finished with value: 0.8585920779886291 and parameters: {'learning_rate': 0.00020141815353268575, 'weight_decay': 0.006, 'warmup_steps': 23}. Best is trial 122 with value: 0.8585920779886291.


Trial 123 with params: {'learning_rate': 0.00023059012379124415, 'weight_decay': 0.006, 'warmup_steps': 20}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4586,0.791012,0.7727,0.783989,0.7727,0.772135
2,0.4541,0.656333,0.8106,0.819989,0.8106,0.810116


[I 2025-04-01 09:07:37,903] Trial 123 pruned. 


Trial 124 with params: {'learning_rate': 0.00019454911403491722, 'weight_decay': 0.008, 'warmup_steps': 11}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.483,0.768126,0.775,0.788222,0.775,0.774445
2,0.4585,0.606478,0.8188,0.825133,0.8188,0.818168
3,0.2393,0.623404,0.8248,0.832141,0.8248,0.824804


[I 2025-04-01 09:12:13,634] Trial 124 pruned. 


Trial 125 with params: {'learning_rate': 0.00020227298894907243, 'weight_decay': 0.005, 'warmup_steps': 26}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.526,0.782981,0.7739,0.787237,0.7739,0.773757
2,0.4623,0.636481,0.8111,0.820339,0.8111,0.811094


[I 2025-04-01 09:15:10,789] Trial 125 pruned. 


Trial 126 with params: {'learning_rate': 8.903956616127396e-05, 'weight_decay': 0.005, 'warmup_steps': 24}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.931,0.852164,0.7759,0.783955,0.7759,0.775208
2,0.5575,0.607248,0.824,0.82724,0.824,0.823132
3,0.2958,0.58043,0.8315,0.836574,0.8315,0.831619
4,0.1626,0.562283,0.8424,0.844856,0.8424,0.842356
5,0.0878,0.585129,0.8416,0.845378,0.8416,0.841908


[I 2025-04-01 09:22:18,313] Trial 126 pruned. 


Trial 127 with params: {'learning_rate': 0.0001474942356763857, 'weight_decay': 0.004, 'warmup_steps': 19}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5926,0.759374,0.7829,0.791844,0.7829,0.782319
2,0.464,0.578943,0.8278,0.834421,0.8278,0.827404
3,0.2378,0.597172,0.8318,0.837631,0.8318,0.831821
4,0.1199,0.599645,0.8402,0.844979,0.8402,0.840671
5,0.0586,0.612872,0.8439,0.848033,0.8439,0.844479


[I 2025-04-01 09:29:25,706] Trial 127 pruned. 


Trial 128 with params: {'learning_rate': 0.00015379394474418276, 'weight_decay': 0.0, 'warmup_steps': 19}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5919,0.775462,0.7783,0.787716,0.7783,0.776524
2,0.4525,0.601338,0.8239,0.830505,0.8239,0.823368
3,0.2302,0.609124,0.8288,0.837451,0.8288,0.828941


[I 2025-04-01 09:33:43,528] Trial 128 pruned. 


Trial 129 with params: {'learning_rate': 0.0002800892902809688, 'weight_decay': 0.007, 'warmup_steps': 17}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4087,0.813171,0.7634,0.774689,0.7634,0.762585
2,0.4855,0.657569,0.8081,0.818048,0.8081,0.808188
3,0.2591,0.648037,0.8202,0.830719,0.8202,0.82071


[I 2025-04-01 09:38:02,656] Trial 129 pruned. 


Trial 130 with params: {'learning_rate': 0.00028773140142047265, 'weight_decay': 0.008, 'warmup_steps': 31}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4281,0.782691,0.7699,0.783087,0.7699,0.768047
2,0.4813,0.684351,0.802,0.813074,0.802,0.801851
3,0.2547,0.654843,0.8191,0.827845,0.8191,0.819197


[I 2025-04-01 09:42:21,378] Trial 130 pruned. 


Trial 131 with params: {'learning_rate': 0.00011675686624803736, 'weight_decay': 0.01, 'warmup_steps': 19}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7219,0.781637,0.7873,0.794898,0.7873,0.786588
2,0.4928,0.583907,0.8289,0.834003,0.8289,0.828551
3,0.2492,0.58052,0.8332,0.838288,0.8332,0.83315
4,0.1295,0.578341,0.8422,0.845119,0.8422,0.842034
5,0.0638,0.59454,0.8486,0.851507,0.8486,0.848583
6,0.0288,0.61686,0.8493,0.851058,0.8493,0.849314
7,0.0129,0.62246,0.8519,0.85333,0.8519,0.851796


[I 2025-04-01 09:52:15,110] Trial 131 finished with value: 0.8517958338454459 and parameters: {'learning_rate': 0.00011675686624803736, 'weight_decay': 0.01, 'warmup_steps': 19}. Best is trial 122 with value: 0.8585920779886291.


Trial 132 with params: {'learning_rate': 0.00038636180098425673, 'weight_decay': 0.006, 'warmup_steps': 25}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4214,0.898555,0.7382,0.760917,0.7382,0.737892
2,0.5458,0.71303,0.7882,0.799572,0.7882,0.788895
3,0.298,0.703193,0.8056,0.81487,0.8056,0.805183


[I 2025-04-01 09:56:39,083] Trial 132 pruned. 


Trial 133 with params: {'learning_rate': 0.00010448581381551576, 'weight_decay': 0.007, 'warmup_steps': 24}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8204,0.812754,0.7825,0.79277,0.7825,0.782688
2,0.5063,0.607745,0.8233,0.830394,0.8233,0.823134
3,0.2596,0.578639,0.8348,0.839443,0.8348,0.834308
4,0.1368,0.571041,0.842,0.845007,0.842,0.842112
5,0.0693,0.589806,0.8463,0.849761,0.8463,0.846612
6,0.0332,0.613836,0.849,0.851055,0.849,0.849174
7,0.016,0.618655,0.8506,0.851456,0.8506,0.850422


[I 2025-04-01 10:06:52,736] Trial 133 finished with value: 0.8504215752497062 and parameters: {'learning_rate': 0.00010448581381551576, 'weight_decay': 0.007, 'warmup_steps': 24}. Best is trial 122 with value: 0.8585920779886291.


Trial 134 with params: {'learning_rate': 0.00017063030577171216, 'weight_decay': 0.006, 'warmup_steps': 23}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5492,0.774499,0.7773,0.787831,0.7773,0.77607
2,0.461,0.608591,0.8218,0.829364,0.8218,0.821396
3,0.2393,0.596426,0.8288,0.836172,0.8288,0.828232


Exception in thread Thread-4373 (_pin_memory_loop):
Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.10/dist-packages/ipykernel/ipkernel.py", line 766, in run_closure
    _threading_Thread_run(self)
  File "/usr/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/pin_memory.py", line 59, in _pin_memory_loop
    do_one_step()
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/pin_memory.py", line 35, in do_one_step
    r = in_queue.get(timeout=MP_STATUS_CHECK_INTERVAL)
  File "/usr/lib/python3.10/multiprocessing/queues.py", line 122, in get
    return _ForkingPickler.loads(res)
  File "/usr/local/lib/python3.10/dist-packages/torch/multiprocessing/reductions.py", line 541, in rebuild_storage_fd
    fd = df.detach()
  File "/usr/lib/python3.10/multiproc

KeyboardInterrupt: 

In [None]:
print(best_base_aug)

BestRun(run_id='35', objective=0.7718702742260117, hyperparameters={'learning_rate': 0.0024870786738035154, 'weight_decay': 0.009000000000000001, 'warmup_steps': 20}, run_summary=None)


In [33]:
base.reset_seed()

## Search with distillation on the augmented dataset
Configuration of the individual training runs.

In [34]:
training_args = base.get_training_args(
    output_dir=f"~/results/{DATASET}/-aug-KD_hp-search",
    logging_dir=f"~/logs/{DATASET}/-aug-KD_hp-search",
    remove_unused_columns=False,
    epochs=num_epochs,
    batch_size=batch_size,
)

Definition of the searched hyperparameters and their ranges, extended with the distillation hyperparameters.

In [35]:
def hp_space(trial):
    params = {
        "learning_rate": trial.suggest_float("learning_rate", 5e-5, 5e-3, log=True),
        "weight_decay": trial.suggest_float("weight_decay", 0, 1e-2, step=1e-3),
        "warmup_steps": trial.suggest_int("warmup_steps", 0, warm_up),
        "lambda_param": trial.suggest_float("lambda_param", 0, 1, step=0.1),
        "temperature": trial.suggest_float("temperature", 2, 7, step=0.5),
    }
    print(f"Trial {trial.number} with params: {params}")
    return params
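
To make these ranges concrete, the following dependency-free sketch mimics how such a space can be drawn: a log-uniform learning rate and stepped values taken from a fixed grid. It only illustrates the sampling semantics and is not Optuna's TPE sampler, which also models past trials. The helper names are hypothetical, and `warm_up=32` is an assumed upper bound (the notebook defines `warm_up` in an earlier cell).

```python
import math
import random


def sample_log_uniform(rng, low, high):
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def sample_stepped(rng, low, high, step):
    """Draw uniformly from the grid low, low + step, ..., high."""
    n_steps = round((high - low) / step)
    return low + rng.randint(0, n_steps) * step


def sample_space(rng, warm_up=32):
    # warm_up=32 is an assumption for illustration only.
    return {
        "learning_rate": sample_log_uniform(rng, 5e-5, 5e-3),
        "weight_decay": sample_stepped(rng, 0.0, 1e-2, 1e-3),
        "warmup_steps": rng.randint(0, warm_up),
        "lambda_param": sample_stepped(rng, 0.0, 1.0, 0.1),
        "temperature": sample_stepped(rng, 2.0, 7.0, 0.5),
    }


rng = random.Random(42)
for _ in range(3):
    print(sample_space(rng))
```

Values such as `0.6000000000000001` or `0.009000000000000001` in the trial logs are consistent with this kind of `low + k * step` floating-point arithmetic.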

Optuna configuration.

In [36]:
pruner = optuna.pruners.HyperbandPruner(min_resource=min_r, max_resource=max_r, reduction_factor=2, bootstrap_count=2)
sampler = optuna.samplers.TPESampler(seed=42, multivariate=True)
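
As a rough guide to what this pruner does: according to the Optuna documentation, `HyperbandPruner` runs several successive-halving brackets, and their number is floor(log_eta(max_resource / min_resource)) + 1, where eta is `reduction_factor`. The sketch below computes that count and a rung schedule for each bracket. The values `min_r = 1` and `max_r = 7` are assumptions chosen for illustration (the notebook sets `min_r` and `max_r` in an earlier cell), and the rung positions follow the textbook successive-halving schedule rather than Optuna's internal code.

```python
def n_brackets(min_resource, max_resource, reduction_factor):
    """floor(log_eta(max_resource / min_resource)) + 1, computed with integers."""
    count = 0
    resource = min_resource
    while resource <= max_resource:
        count += 1
        resource *= reduction_factor
    return count


def rung_steps(min_resource, max_resource, reduction_factor, bracket):
    """Steps at which a successive-halving bracket may prune a trial.

    A higher bracket index starts pruning later: its first rung sits at
    min_resource * reduction_factor ** bracket.
    """
    steps = []
    step = min_resource * reduction_factor ** bracket
    while step < max_resource:
        steps.append(step)
        step *= reduction_factor
    return steps


# Assumed values for illustration only.
min_r, max_r, eta = 1, 7, 2
for b in range(n_brackets(min_r, max_r, eta)):
    print(f"bracket {b}: may prune at steps {rung_steps(min_r, max_r, eta, b)}")
```

A smaller `reduction_factor` keeps a larger share of trials alive at each rung, which costs more compute but is less likely to discard a slow starter.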



Configuration of the distillation trainer for the individual training runs.

In [37]:
trainer = base.DistilTrainer(
    args=training_args,
    train_dataset=train_combo,
    eval_dataset=eval,
    compute_metrics=base.compute_metrics,
    model_init=lambda: get_model(),
)

Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
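
The implementation of `base.DistilTrainer` is not shown in this notebook, so its exact loss is unknown. A common formulation, which the two extra hyperparameters suggest, blends hard-label cross-entropy with a temperature-softened KL term: loss = (1 - lambda) * CE(student, y) + lambda * T^2 * KL(teacher_T || student_T). The dependency-free sketch below evaluates this for a single example. All function names are hypothetical, and treating `lambda_param` as the weight of the distillation term is an assumption.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits divided by temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label, lambda_param, temperature):
    """(1 - lambda) * CE + lambda * T^2 * KL(teacher || student) for one example."""
    # Hard-label cross-entropy at temperature 1.
    ce = -math.log(softmax(student_logits)[label])
    # KL divergence between the temperature-softened distributions.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student) if t > 0)
    # The T^2 factor keeps the soft term's gradient scale comparable across temperatures.
    return (1 - lambda_param) * ce + lambda_param * temperature ** 2 * kl


student = [2.0, 0.5, -1.0]
teacher = [3.0, 1.0, -2.0]
print(distillation_loss(student, teacher, label=0, lambda_param=0.6, temperature=2.5))
```

Under this formulation, `lambda_param = 0` reduces to ordinary supervised training, while `lambda_param = 1` trains on the teacher's soft targets alone.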


Search setup.

In [38]:
best_distill_aug = trainer.hyperparameter_search(
    direction="maximize",
    backend="optuna",
    hp_space=hp_space,
    compute_objective=lambda metrics: metrics["eval_f1"],
    pruner=pruner,
    sampler=sampler,
    study_name="Distill",
    n_trials=150
)
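
The search maximizes the `eval_f1` metric. `base.compute_metrics` is not shown here, but in every logged row Recall equals Accuracy, which is what support-weighted averaging over classes produces. Assuming weighted averaging, the sketch below computes the metrics from scratch; the function name `weighted_prf` is hypothetical.

```python
from collections import Counter


def weighted_prf(y_true, y_pred):
    """Support-weighted precision, recall and F1 over the classes in y_true."""
    support = Counter(y_true)
    predicted = Counter(y_pred)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    n = len(y_true)
    precision = recall = f1 = 0.0
    for cls, count in support.items():
        p = correct[cls] / predicted[cls] if predicted[cls] else 0.0
        r = correct[cls] / count
        f = 2 * p * r / (p + r) if p + r else 0.0
        weight = count / n  # each class counts in proportion to its support
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return precision, recall, f1


y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 1, 1, 2, 2]
precision, recall, f1 = weighted_prf(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(precision, recall, f1, accuracy)
```

Weighted recall sums the per-class correct counts and divides by the total number of samples, so it always coincides with accuracy; weighted F1 does not, which is why it carries additional information as the objective.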

[I 2025-04-01 10:12:25,505] A new study created in memory with name: Distill


Trial 0 with params: {'learning_rate': 0.0002805758207667253, 'weight_decay': 0.01, 'warmup_steps': 24, 'lambda_param': 0.6000000000000001, 'temperature': 2.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.442,0.830699,0.7765,0.783738,0.7765,0.774274
2,0.5706,0.673918,0.8124,0.820475,0.8124,0.811902


[I 2025-04-01 10:15:20,335] Trial 0 pruned. 


Trial 1 with params: {'learning_rate': 0.00010255552094216992, 'weight_decay': 0.0, 'warmup_steps': 28, 'lambda_param': 0.6000000000000001, 'temperature': 5.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8296,0.944463,0.7806,0.785789,0.7806,0.779428
2,0.6493,0.671517,0.8257,0.829334,0.8257,0.824909
3,0.4064,0.621557,0.8314,0.836576,0.8314,0.830642


[I 2025-04-01 10:19:49,578] Trial 1 pruned. 


Trial 2 with params: {'learning_rate': 5.497167787383099e-05, 'weight_decay': 0.01, 'warmup_steps': 27, 'lambda_param': 0.2, 'temperature': 3.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.2824,1.246314,0.7422,0.74418,0.7422,0.738161
2,0.9097,0.837624,0.804,0.808529,0.804,0.802817


[I 2025-04-01 10:22:50,319] Trial 2 pruned. 


Trial 3 with params: {'learning_rate': 0.00011635338541918901, 'weight_decay': 0.003, 'warmup_steps': 17, 'lambda_param': 0.4, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7303,0.914107,0.7849,0.790449,0.7849,0.783453
2,0.6232,0.657464,0.8267,0.831352,0.8267,0.826094
3,0.3903,0.611531,0.8284,0.833487,0.8284,0.827957


[I 2025-04-01 10:27:18,980] Trial 3 pruned. 


Trial 4 with params: {'learning_rate': 0.0008369042894376068, 'weight_decay': 0.001, 'warmup_steps': 9, 'lambda_param': 0.4, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5923,1.235321,0.654,0.686634,0.654,0.649404
2,0.8585,0.978041,0.733,0.756104,0.733,0.730797


[I 2025-04-01 10:30:23,060] Trial 4 pruned. 


Trial 5 with params: {'learning_rate': 0.0018591820902866042, 'weight_decay': 0.002, 'warmup_steps': 16, 'lambda_param': 0.6000000000000001, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.512,2.202312,0.4192,0.468917,0.4192,0.401841
2,1.648,1.623557,0.5546,0.602417,0.5546,0.548884
3,1.1917,1.296716,0.6468,0.673898,0.6468,0.644449


[I 2025-04-01 10:34:56,426] Trial 5 pruned. 


Trial 6 with params: {'learning_rate': 0.0008204643365323959, 'weight_decay': 0.001, 'warmup_steps': 2, 'lambda_param': 1.0, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6525,1.284374,0.6538,0.687889,0.6538,0.654946
2,0.8789,0.993049,0.7257,0.752534,0.7257,0.724426


[I 2025-04-01 10:38:05,073] Trial 6 pruned. 


Trial 7 with params: {'learning_rate': 0.0020690200562805084, 'weight_decay': 0.003, 'warmup_steps': 3, 'lambda_param': 0.7000000000000001, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,4.0531,4.186987,0.0162,0.00199,0.0162,0.002225
2,4.1106,4.240138,0.0139,0.001437,0.0139,0.001926
3,4.1081,4.219965,0.0125,0.00164,0.0125,0.002338


[I 2025-04-01 10:42:31,823] Trial 7 pruned. 


Trial 8 with params: {'learning_rate': 8.770946743725407e-05, 'weight_decay': 0.005, 'warmup_steps': 1, 'lambda_param': 1.0, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8954,1.017693,0.7715,0.774537,0.7715,0.769589
2,0.7094,0.703505,0.8173,0.822054,0.8173,0.816462
3,0.4491,0.637664,0.8297,0.835144,0.8297,0.828964
4,0.3223,0.585901,0.8387,0.843114,0.8387,0.839196
5,0.2487,0.574397,0.8431,0.847447,0.8431,0.843405


[I 2025-04-01 10:50:03,669] Trial 8 pruned. 


Trial 9 with params: {'learning_rate': 0.0010568529720322872, 'weight_decay': 0.003, 'warmup_steps': 17, 'lambda_param': 0.6000000000000001, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7775,1.366403,0.6251,0.652539,0.6251,0.618378
2,1.0033,1.141408,0.6862,0.724629,0.6862,0.685155


[I 2025-04-01 10:53:04,719] Trial 9 pruned. 


Trial 10 with params: {'learning_rate': 5.622306732978549e-05, 'weight_decay': 0.004, 'warmup_steps': 6, 'lambda_param': 1.0, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.2364,1.244618,0.7441,0.747751,0.7441,0.74135
2,0.9002,0.828459,0.8047,0.809358,0.8047,0.803465
3,0.586,0.700628,0.8215,0.825061,0.8215,0.820827


[I 2025-04-01 10:57:52,587] Trial 10 pruned. 


Trial 11 with params: {'learning_rate': 0.00020808715310578245, 'weight_decay': 0.003, 'warmup_steps': 32, 'lambda_param': 0.6000000000000001, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5141,0.831358,0.7828,0.791868,0.7828,0.781216
2,0.5584,0.66434,0.8218,0.828637,0.8218,0.821592
3,0.3537,0.604921,0.8288,0.835388,0.8288,0.828641


[I 2025-04-01 11:02:31,234] Trial 11 pruned. 


Trial 12 with params: {'learning_rate': 0.00014318207047557446, 'weight_decay': 0.001, 'warmup_steps': 21, 'lambda_param': 0.8, 'temperature': 5.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6355,0.843175,0.7882,0.793896,0.7882,0.786918
2,0.5717,0.638198,0.8274,0.834182,0.8274,0.827412
3,0.3585,0.593833,0.8378,0.843879,0.8378,0.837846
4,0.2554,0.56237,0.8477,0.851278,0.8477,0.84814
5,0.1929,0.555205,0.8468,0.852904,0.8468,0.847557


[I 2025-04-01 11:10:06,578] Trial 12 pruned. 


Trial 13 with params: {'learning_rate': 0.00012331668578613732, 'weight_decay': 0.006, 'warmup_steps': 19, 'lambda_param': 1.0, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7352,0.9066,0.7788,0.786061,0.7788,0.777391
2,0.618,0.663153,0.8254,0.831521,0.8254,0.824778
3,0.3883,0.616934,0.8321,0.838407,0.8321,0.831891


[I 2025-04-01 11:14:34,834] Trial 13 pruned. 


Trial 14 with params: {'learning_rate': 0.00014946504427538972, 'weight_decay': 0.01, 'warmup_steps': 3, 'lambda_param': 0.9, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6121,0.873375,0.7843,0.791027,0.7843,0.78278
2,0.5823,0.651809,0.8259,0.83375,0.8259,0.825478
3,0.3647,0.599251,0.8366,0.841632,0.8366,0.836594
4,0.2569,0.579115,0.8442,0.850257,0.8442,0.844575
5,0.1947,0.549717,0.8493,0.854088,0.8493,0.849929
6,0.1593,0.533478,0.8534,0.856263,0.8534,0.853825
7,0.1386,0.530327,0.8502,0.853534,0.8502,0.850606


[I 2025-04-01 11:25:11,200] Trial 14 finished with value: 0.8506060588786533 and parameters: {'learning_rate': 0.00014946504427538972, 'weight_decay': 0.01, 'warmup_steps': 3, 'lambda_param': 0.9, 'temperature': 3.0}. Best is trial 14 with value: 0.8506060588786533.


Trial 15 with params: {'learning_rate': 0.00021258259817850112, 'weight_decay': 0.01, 'warmup_steps': 6, 'lambda_param': 0.8, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4584,0.815189,0.7804,0.790658,0.7804,0.777932
2,0.5465,0.63277,0.8296,0.835338,0.8296,0.828953
3,0.3448,0.606354,0.8328,0.84101,0.8328,0.833177
4,0.2401,0.570778,0.8453,0.850523,0.8453,0.845641
5,0.1822,0.537425,0.8532,0.858335,0.8532,0.853825


[I 2025-04-01 11:32:41,313] Trial 15 pruned. 


Trial 16 with params: {'learning_rate': 0.00011041300981406029, 'weight_decay': 0.01, 'warmup_steps': 1, 'lambda_param': 0.7000000000000001, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7377,0.909569,0.7838,0.78935,0.7838,0.782507
2,0.6199,0.663607,0.8246,0.830672,0.8246,0.823881
3,0.3857,0.611179,0.8346,0.838729,0.8346,0.833682
4,0.2739,0.570013,0.8443,0.848693,0.8443,0.844701
5,0.2082,0.553166,0.8509,0.855028,0.8509,0.851288
6,0.1721,0.549667,0.85,0.852794,0.85,0.850397
7,0.1516,0.544905,0.8528,0.856284,0.8528,0.853355


[I 2025-04-01 11:43:08,385] Trial 16 finished with value: 0.8533553539536822 and parameters: {'learning_rate': 0.00011041300981406029, 'weight_decay': 0.01, 'warmup_steps': 1, 'lambda_param': 0.7000000000000001, 'temperature': 4.0}. Best is trial 16 with value: 0.8533553539536822.


Trial 17 with params: {'learning_rate': 0.0001271138219055735, 'weight_decay': 0.007, 'warmup_steps': 2, 'lambda_param': 0.4, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6767,0.900991,0.7837,0.789562,0.7837,0.782144
2,0.6142,0.660892,0.8251,0.831589,0.8251,0.824896
3,0.3843,0.614124,0.8351,0.839754,0.8351,0.834878
4,0.2723,0.58461,0.8418,0.846846,0.8418,0.842149
5,0.2061,0.56178,0.8477,0.852536,0.8477,0.848219


[I 2025-04-01 11:50:44,656] Trial 17 pruned. 


Trial 18 with params: {'learning_rate': 0.0001860438527479082, 'weight_decay': 0.007, 'warmup_steps': 1, 'lambda_param': 1.0, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5128,0.83866,0.784,0.794963,0.784,0.783586
2,0.5577,0.657363,0.8208,0.829161,0.8208,0.820977
3,0.3531,0.599801,0.8355,0.840816,0.8355,0.835459
4,0.2495,0.567155,0.8458,0.849758,0.8458,0.845826
5,0.1875,0.532284,0.8554,0.860145,0.8554,0.856073
6,0.1523,0.522173,0.8542,0.857552,0.8542,0.854698
7,0.1315,0.511869,0.8592,0.863289,0.8592,0.859819


[I 2025-04-01 12:01:13,515] Trial 18 finished with value: 0.8598192450811298 and parameters: {'learning_rate': 0.0001860438527479082, 'weight_decay': 0.007, 'warmup_steps': 1, 'lambda_param': 1.0, 'temperature': 6.0}. Best is trial 18 with value: 0.8598192450811298.


Trial 19 with params: {'learning_rate': 0.00015104795396754156, 'weight_decay': 0.006, 'warmup_steps': 4, 'lambda_param': 1.0, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6009,0.862312,0.7848,0.790669,0.7848,0.783925
2,0.5796,0.647929,0.8276,0.834318,0.8276,0.826655
3,0.3627,0.599817,0.8346,0.840054,0.8346,0.834606
4,0.2545,0.572232,0.8445,0.850283,0.8445,0.845102
5,0.1928,0.55579,0.8469,0.853428,0.8469,0.847748


[I 2025-04-01 12:08:46,875] Trial 19 pruned. 


Trial 20 with params: {'learning_rate': 0.00024294067604751493, 'weight_decay': 0.007, 'warmup_steps': 2, 'lambda_param': 0.30000000000000004, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4382,0.842595,0.7731,0.7846,0.7731,0.772725
2,0.5604,0.656607,0.8155,0.823201,0.8155,0.815646


[I 2025-04-01 12:11:48,352] Trial 20 pruned. 


Trial 21 with params: {'learning_rate': 0.0007376036707263537, 'weight_decay': 0.008, 'warmup_steps': 6, 'lambda_param': 1.0, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5157,1.128502,0.6881,0.713831,0.6881,0.686903
2,0.783,0.905016,0.7466,0.770387,0.7466,0.747624
3,0.5356,0.742442,0.7923,0.805317,0.7923,0.793237


[I 2025-04-01 12:16:19,415] Trial 21 pruned. 


Trial 22 with params: {'learning_rate': 6.45219786851023e-05, 'weight_decay': 0.01, 'warmup_steps': 2, 'lambda_param': 1.0, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.1199,1.159607,0.7509,0.75565,0.7509,0.748902
2,0.8288,0.782377,0.8091,0.814107,0.8091,0.808467


[I 2025-04-01 12:19:14,720] Trial 22 pruned. 


Trial 23 with params: {'learning_rate': 0.00026445676532948267, 'weight_decay': 0.008, 'warmup_steps': 0, 'lambda_param': 0.8, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4134,0.844569,0.775,0.78427,0.775,0.773213
2,0.5599,0.662098,0.8179,0.826104,0.8179,0.818378
3,0.3573,0.620914,0.8266,0.832966,0.8266,0.826376


[I 2025-04-01 12:23:45,252] Trial 23 pruned. 


Trial 24 with params: {'learning_rate': 0.0011514770369736363, 'weight_decay': 0.008, 'warmup_steps': 7, 'lambda_param': 1.0, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8544,1.492744,0.5882,0.629125,0.5882,0.5819
2,1.0817,1.179606,0.6761,0.719541,0.6761,0.67801


[I 2025-04-01 12:26:46,911] Trial 24 pruned. 


Trial 25 with params: {'learning_rate': 5.16710029953126e-05, 'weight_decay': 0.008, 'warmup_steps': 4, 'lambda_param': 1.0, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.2668,1.288857,0.7387,0.741885,0.7387,0.735399
2,0.9491,0.871327,0.7998,0.803598,0.7998,0.798651
3,0.6283,0.723977,0.8205,0.823593,0.8205,0.819705


[I 2025-04-01 12:31:16,247] Trial 25 pruned. 


Trial 26 with params: {'learning_rate': 5.592172930187604e-05, 'weight_decay': 0.008, 'warmup_steps': 0, 'lambda_param': 0.4, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.2322,1.269221,0.7403,0.743715,0.7403,0.736947
2,0.9206,0.846661,0.7988,0.80279,0.7988,0.79787


[I 2025-04-01 12:34:10,480] Trial 26 pruned. 


Trial 27 with params: {'learning_rate': 0.00012350880810512403, 'weight_decay': 0.009000000000000001, 'warmup_steps': 14, 'lambda_param': 0.5, 'temperature': 5.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6961,0.881915,0.7896,0.794183,0.7896,0.787784
2,0.6021,0.650803,0.8287,0.833736,0.8287,0.827877
3,0.3784,0.616572,0.8305,0.836568,0.8305,0.830706
4,0.2674,0.573378,0.8468,0.849994,0.8468,0.846815
5,0.2037,0.555856,0.8486,0.853867,0.8486,0.849404
6,0.1683,0.544879,0.8531,0.856856,0.8531,0.853903
7,0.1468,0.539134,0.8532,0.856605,0.8532,0.853822


[I 2025-04-01 12:44:44,622] Trial 27 finished with value: 0.853822087045385 and parameters: {'learning_rate': 0.00012350880810512403, 'weight_decay': 0.009000000000000001, 'warmup_steps': 14, 'lambda_param': 0.5, 'temperature': 5.0}. Best is trial 18 with value: 0.8598192450811298.


Trial 28 with params: {'learning_rate': 0.00013092326614596477, 'weight_decay': 0.01, 'warmup_steps': 17, 'lambda_param': 0.4, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7,0.895973,0.7834,0.789626,0.7834,0.781984
2,0.6041,0.654521,0.8257,0.832912,0.8257,0.825523
3,0.3772,0.603585,0.8339,0.838938,0.8339,0.833383
4,0.2648,0.571067,0.8453,0.851145,0.8453,0.846344
5,0.2015,0.555156,0.847,0.85246,0.847,0.847825


[I 2025-04-01 12:52:22,314] Trial 28 pruned. 


Trial 29 with params: {'learning_rate': 0.00019194287081269768, 'weight_decay': 0.008, 'warmup_steps': 12, 'lambda_param': 0.5, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5018,0.824057,0.7858,0.793747,0.7858,0.784239
2,0.5555,0.655732,0.8232,0.832615,0.8232,0.823735
3,0.351,0.608989,0.8332,0.840131,0.8332,0.833396
4,0.247,0.579373,0.8404,0.844874,0.8404,0.840529
5,0.1858,0.552676,0.8494,0.855527,0.8494,0.85032


[I 2025-04-01 12:59:55,910] Trial 29 pruned. 


Trial 30 with params: {'learning_rate': 5.66291389001909e-05, 'weight_decay': 0.007, 'warmup_steps': 14, 'lambda_param': 0.6000000000000001, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.2445,1.224903,0.7463,0.748394,0.7463,0.743313
2,0.8886,0.824864,0.8049,0.808286,0.8049,0.80356


[I 2025-04-01 13:03:04,649] Trial 30 pruned. 


Trial 31 with params: {'learning_rate': 8.786086936277051e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 2, 'lambda_param': 0.6000000000000001, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8824,1.014865,0.7714,0.775863,0.7714,0.769891
2,0.703,0.691921,0.8215,0.826238,0.8215,0.820574
3,0.4393,0.621664,0.8312,0.835727,0.8312,0.830512
4,0.3141,0.584309,0.8405,0.844818,0.8405,0.840664
5,0.2421,0.568263,0.8463,0.850852,0.8463,0.846671


[I 2025-04-01 13:10:42,896] Trial 31 pruned. 


Trial 32 with params: {'learning_rate': 7.62208749019178e-05, 'weight_decay': 0.01, 'warmup_steps': 19, 'lambda_param': 0.8, 'temperature': 5.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.0017,1.077697,0.7668,0.769261,0.7668,0.764545
2,0.7624,0.74236,0.8127,0.818785,0.8127,0.811925
3,0.4829,0.648911,0.8293,0.833901,0.8293,0.828736


[I 2025-04-01 13:15:19,758] Trial 32 pruned. 


Trial 33 with params: {'learning_rate': 0.00023675231015297013, 'weight_decay': 0.009000000000000001, 'warmup_steps': 7, 'lambda_param': 0.9, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4446,0.855499,0.7723,0.783014,0.7723,0.77129
2,0.5594,0.687688,0.8103,0.819136,0.8103,0.809916
3,0.36,0.610415,0.8347,0.841996,0.8347,0.834942
4,0.2527,0.568092,0.8465,0.850615,0.8465,0.846289
5,0.1887,0.54524,0.8534,0.859867,0.8534,0.854322
6,0.1522,0.51793,0.8566,0.860371,0.8566,0.857378
7,0.1301,0.509577,0.8597,0.864082,0.8597,0.860667


[I 2025-04-01 13:25:59,561] Trial 33 finished with value: 0.8606671441794386 and parameters: {'learning_rate': 0.00023675231015297013, 'weight_decay': 0.009000000000000001, 'warmup_steps': 7, 'lambda_param': 0.9, 'temperature': 2.5}. Best is trial 33 with value: 0.8606671441794386.


Trial 34 with params: {'learning_rate': 0.00022849167557974757, 'weight_decay': 0.009000000000000001, 'warmup_steps': 14, 'lambda_param': 1.0, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4624,0.870063,0.768,0.783015,0.768,0.76568
2,0.554,0.660903,0.8169,0.826085,0.8169,0.816648
3,0.3529,0.610366,0.834,0.842124,0.834,0.833702
4,0.2513,0.5799,0.8395,0.84712,0.8395,0.84024
5,0.1844,0.539624,0.85,0.855019,0.85,0.850783


[I 2025-04-01 13:33:36,906] Trial 34 pruned. 


Trial 35 with params: {'learning_rate': 0.00016081556420152821, 'weight_decay': 0.005, 'warmup_steps': 2, 'lambda_param': 1.0, 'temperature': 5.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5734,0.847151,0.7826,0.790171,0.7826,0.78112
2,0.5729,0.65525,0.8239,0.832351,0.8239,0.823511
3,0.3614,0.609031,0.8361,0.842339,0.8361,0.835898
4,0.2529,0.561738,0.8498,0.854561,0.8498,0.85046
5,0.1924,0.555009,0.8481,0.854286,0.8481,0.848941
6,0.1566,0.527615,0.8579,0.861935,0.8579,0.858538
7,0.136,0.521199,0.857,0.860543,0.857,0.857559


[I 2025-04-01 13:44:12,255] Trial 35 finished with value: 0.8575593306595666 and parameters: {'learning_rate': 0.00016081556420152821, 'weight_decay': 0.005, 'warmup_steps': 2, 'lambda_param': 1.0, 'temperature': 5.5}. Best is trial 33 with value: 0.8606671441794386.


Trial 36 with params: {'learning_rate': 0.00023401460146566425, 'weight_decay': 0.005, 'warmup_steps': 4, 'lambda_param': 0.8, 'temperature': 5.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4255,0.840958,0.7811,0.791216,0.7811,0.779604
2,0.554,0.666879,0.8157,0.823711,0.8157,0.816233
3,0.3512,0.607582,0.8353,0.84293,0.8353,0.835704
4,0.2474,0.561075,0.8438,0.849117,0.8438,0.843853
5,0.1837,0.546799,0.8502,0.856335,0.8502,0.850829
6,0.1479,0.516172,0.8585,0.861923,0.8585,0.859068
7,0.1266,0.505179,0.8593,0.863528,0.8593,0.860062


[I 2025-04-01 13:54:55,348] Trial 36 finished with value: 0.8600619179433522 and parameters: {'learning_rate': 0.00023401460146566425, 'weight_decay': 0.005, 'warmup_steps': 4, 'lambda_param': 0.8, 'temperature': 5.5}. Best is trial 33 with value: 0.8606671441794386.


Trial 37 with params: {'learning_rate': 0.00014813481211787215, 'weight_decay': 0.004, 'warmup_steps': 4, 'lambda_param': 0.9, 'temperature': 5.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5886,0.857093,0.7836,0.792487,0.7836,0.782681
2,0.574,0.640124,0.829,0.836463,0.829,0.828909
3,0.3602,0.608153,0.8334,0.839084,0.8334,0.832862
4,0.2531,0.567553,0.8457,0.849512,0.8457,0.845743
5,0.1912,0.547647,0.8495,0.855141,0.8495,0.850094
6,0.157,0.532293,0.8559,0.85925,0.8559,0.856507
7,0.1368,0.526182,0.8548,0.858722,0.8548,0.855435


[I 2025-04-01 14:05:38,176] Trial 37 finished with value: 0.8554350297200556 and parameters: {'learning_rate': 0.00014813481211787215, 'weight_decay': 0.004, 'warmup_steps': 4, 'lambda_param': 0.9, 'temperature': 5.5}. Best is trial 33 with value: 0.8606671441794386.


Trial 38 with params: {'learning_rate': 0.0005275684994443943, 'weight_decay': 0.004, 'warmup_steps': 1, 'lambda_param': 1.0, 'temperature': 5.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4342,1.014817,0.7283,0.7517,0.7283,0.727372
2,0.6822,0.793523,0.779,0.792374,0.779,0.778156
3,0.4523,0.768142,0.7911,0.806284,0.7911,0.792288


[I 2025-04-01 14:10:13,610] Trial 38 pruned. 


Trial 39 with params: {'learning_rate': 0.00024230743564446759, 'weight_decay': 0.003, 'warmup_steps': 3, 'lambda_param': 0.6000000000000001, 'temperature': 6.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4243,0.829402,0.7797,0.787549,0.7797,0.777918
2,0.5511,0.657745,0.8188,0.826683,0.8188,0.818532
3,0.3533,0.610583,0.8331,0.839667,0.8331,0.832644


[I 2025-04-01 14:14:46,887] Trial 39 pruned. 


Trial 40 with params: {'learning_rate': 0.00015621785974900628, 'weight_decay': 0.006, 'warmup_steps': 2, 'lambda_param': 0.9, 'temperature': 5.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5765,0.859187,0.7827,0.788389,0.7827,0.780988
2,0.5766,0.638347,0.8259,0.830967,0.8259,0.824584
3,0.364,0.600896,0.8347,0.841289,0.8347,0.83432
4,0.2573,0.56869,0.8431,0.847459,0.8431,0.843084
5,0.1952,0.545995,0.8493,0.854441,0.8493,0.850049
6,0.1592,0.532936,0.8516,0.85538,0.8516,0.852341
7,0.1381,0.526815,0.856,0.860224,0.856,0.856923


[I 2025-04-01 14:25:36,145] Trial 40 finished with value: 0.8569229387400098 and parameters: {'learning_rate': 0.00015621785974900628, 'weight_decay': 0.006, 'warmup_steps': 2, 'lambda_param': 0.9, 'temperature': 5.0}. Best is trial 33 with value: 0.8606671441794386.


Trial 41 with params: {'learning_rate': 0.00021324235029436422, 'weight_decay': 0.006, 'warmup_steps': 9, 'lambda_param': 0.8, 'temperature': 4.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4688,0.832812,0.7836,0.790457,0.7836,0.782256
2,0.5516,0.656456,0.8217,0.828795,0.8217,0.821125
3,0.3522,0.597184,0.8338,0.841135,0.8338,0.834169
4,0.2465,0.567859,0.8453,0.850053,0.8453,0.845535
5,0.1846,0.535835,0.8543,0.861131,0.8543,0.855292
6,0.1502,0.520723,0.8556,0.859959,0.8556,0.856341
7,0.1284,0.509498,0.8597,0.864101,0.8597,0.860535


[I 2025-04-01 14:36:24,917] Trial 41 finished with value: 0.8605345030771987 and parameters: {'learning_rate': 0.00021324235029436422, 'weight_decay': 0.006, 'warmup_steps': 9, 'lambda_param': 0.8, 'temperature': 4.5}. Best is trial 33 with value: 0.8606671441794386.


Trial 42 with params: {'learning_rate': 0.002298170148918028, 'weight_decay': 0.006, 'warmup_steps': 10, 'lambda_param': 0.1, 'temperature': 3.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,3.3075,3.458879,0.1397,0.188298,0.1397,0.116548
2,2.72,2.691112,0.292,0.304655,0.292,0.267383
3,2.259,2.315181,0.3862,0.431677,0.3862,0.373188


[I 2025-04-01 14:41:00,507] Trial 42 pruned. 


Trial 43 with params: {'learning_rate': 0.00014560729164089404, 'weight_decay': 0.005, 'warmup_steps': 14, 'lambda_param': 0.9, 'temperature': 4.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6441,0.859412,0.7863,0.790407,0.7863,0.78427
2,0.5841,0.6353,0.8317,0.837919,0.8317,0.831612
3,0.3652,0.604848,0.8316,0.837548,0.8316,0.831802


[I 2025-04-01 14:45:33,602] Trial 43 pruned. 


Trial 44 with params: {'learning_rate': 0.00040230856338490886, 'weight_decay': 0.006, 'warmup_steps': 6, 'lambda_param': 0.8, 'temperature': 2.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.3802,0.931404,0.7465,0.761755,0.7465,0.744431
2,0.6133,0.747116,0.7904,0.801991,0.7904,0.789877
3,0.4026,0.680622,0.8134,0.821928,0.8134,0.813327


[I 2025-04-01 14:50:13,285] Trial 44 pruned. 


Trial 45 with params: {'learning_rate': 0.004229168606699789, 'weight_decay': 0.009000000000000001, 'warmup_steps': 24, 'lambda_param': 0.5, 'temperature': 2.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,4.0906,4.239159,0.0121,0.0006,0.0121,0.001108
2,4.1177,4.223066,0.012,0.000676,0.012,0.001242


[I 2025-04-01 14:53:14,471] Trial 45 pruned. 


Trial 46 with params: {'learning_rate': 0.0005023354056531005, 'weight_decay': 0.003, 'warmup_steps': 13, 'lambda_param': 0.8, 'temperature': 6.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4276,0.998175,0.7275,0.746568,0.7275,0.725524
2,0.6622,0.75373,0.7931,0.804157,0.7931,0.792337
3,0.4409,0.72299,0.8013,0.81443,0.8013,0.802687


[I 2025-04-01 14:57:44,790] Trial 46 pruned. 


Trial 47 with params: {'learning_rate': 5.232252858049981e-05, 'weight_decay': 0.002, 'warmup_steps': 3, 'lambda_param': 0.5, 'temperature': 4.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.3229,1.315865,0.7301,0.733014,0.7301,0.726465
2,0.9652,0.879638,0.7957,0.79968,0.7957,0.79476


[I 2025-04-01 15:00:47,675] Trial 47 pruned. 


Trial 48 with params: {'learning_rate': 0.0027511979602444763, 'weight_decay': 0.005, 'warmup_steps': 1, 'lambda_param': 0.7000000000000001, 'temperature': 7.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,4.124,4.254261,0.0107,0.001048,0.0107,0.001669
2,4.1332,4.230837,0.01,0.00121,0.01,0.00038


[I 2025-04-01 15:03:49,661] Trial 48 pruned. 


Trial 49 with params: {'learning_rate': 0.00033467475481528913, 'weight_decay': 0.005, 'warmup_steps': 5, 'lambda_param': 0.6000000000000001, 'temperature': 4.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.3717,0.864587,0.765,0.781469,0.765,0.763057
2,0.5828,0.669411,0.8154,0.822041,0.8154,0.814875
3,0.381,0.643206,0.8206,0.832748,0.8206,0.821313


[I 2025-04-01 15:08:23,021] Trial 49 pruned. 


Trial 50 with params: {'learning_rate': 0.0001498281325389243, 'weight_decay': 0.006, 'warmup_steps': 8, 'lambda_param': 0.7000000000000001, 'temperature': 6.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6036,0.852192,0.789,0.794056,0.789,0.787427
2,0.5751,0.648382,0.8245,0.829449,0.8245,0.824117
3,0.3618,0.601158,0.8354,0.84263,0.8354,0.835757
4,0.2547,0.567915,0.8478,0.852195,0.8478,0.847803
5,0.193,0.543072,0.8501,0.856257,0.8501,0.851042
6,0.1584,0.529325,0.8537,0.858204,0.8537,0.854624
7,0.1382,0.524218,0.8527,0.857191,0.8527,0.853481


[I 2025-04-01 15:19:03,318] Trial 50 finished with value: 0.8534811357687364 and parameters: {'learning_rate': 0.0001498281325389243, 'weight_decay': 0.006, 'warmup_steps': 8, 'lambda_param': 0.7000000000000001, 'temperature': 6.0}. Best is trial 33 with value: 0.8606671441794386.


Trial 51 with params: {'learning_rate': 0.00020535820992259815, 'weight_decay': 0.005, 'warmup_steps': 1, 'lambda_param': 1.0, 'temperature': 4.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4681,0.819867,0.7859,0.796284,0.7859,0.784531
2,0.5509,0.636211,0.8262,0.832083,0.8262,0.825952
3,0.3503,0.592103,0.8375,0.843078,0.8375,0.837692
4,0.2458,0.564285,0.8474,0.852449,0.8474,0.84743
5,0.1845,0.533041,0.8561,0.860875,0.8561,0.856498
6,0.1501,0.514087,0.8583,0.862049,0.8583,0.859023
7,0.1289,0.505024,0.8605,0.864208,0.8605,0.861124


[I 2025-04-01 15:29:38,756] Trial 51 finished with value: 0.8611237155660776 and parameters: {'learning_rate': 0.00020535820992259815, 'weight_decay': 0.005, 'warmup_steps': 1, 'lambda_param': 1.0, 'temperature': 4.0}. Best is trial 51 with value: 0.8611237155660776.


Trial 52 with params: {'learning_rate': 0.00020870672493436443, 'weight_decay': 0.003, 'warmup_steps': 1, 'lambda_param': 0.9, 'temperature': 3.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4639,0.835296,0.781,0.78934,0.781,0.779421
2,0.5502,0.649714,0.8225,0.830455,0.8225,0.822086
3,0.3502,0.600561,0.834,0.840162,0.834,0.833663
4,0.2442,0.552349,0.8468,0.850105,0.8468,0.846788
5,0.1842,0.534597,0.856,0.861949,0.856,0.856787
6,0.148,0.519439,0.8595,0.863361,0.8595,0.860171
7,0.1275,0.509428,0.8603,0.864188,0.8603,0.860791


[I 2025-04-01 15:40:14,339] Trial 52 finished with value: 0.8607910018237797 and parameters: {'learning_rate': 0.00020870672493436443, 'weight_decay': 0.003, 'warmup_steps': 1, 'lambda_param': 0.9, 'temperature': 3.5}. Best is trial 51 with value: 0.8611237155660776.


Trial 53 with params: {'learning_rate': 0.0002618813755780017, 'weight_decay': 0.003, 'warmup_steps': 2, 'lambda_param': 0.9, 'temperature': 4.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.3913,0.858022,0.77,0.786136,0.77,0.76953
2,0.5541,0.676307,0.8131,0.821073,0.8131,0.812611


[I 2025-04-01 15:43:22,426] Trial 53 pruned. 


Trial 54 with params: {'learning_rate': 0.0027026130785766608, 'weight_decay': 0.01, 'warmup_steps': 32, 'lambda_param': 0.9, 'temperature': 6.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,3.4064,3.483115,0.1258,0.15438,0.1258,0.091969
2,2.8987,2.915239,0.2513,0.316216,0.2513,0.229278
3,2.3989,2.277465,0.4036,0.418467,0.4036,0.388858


[I 2025-04-01 15:47:53,736] Trial 54 pruned. 


Trial 55 with params: {'learning_rate': 0.0005299625745065567, 'weight_decay': 0.002, 'warmup_steps': 3, 'lambda_param': 1.0, 'temperature': 3.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4211,0.988277,0.7265,0.746292,0.7265,0.725124
2,0.6754,0.811526,0.7743,0.794108,0.7743,0.774071


[I 2025-04-01 15:51:00,172] Trial 55 pruned. 


Trial 56 with params: {'learning_rate': 7.925837293949273e-05, 'weight_decay': 0.001, 'warmup_steps': 0, 'lambda_param': 0.9, 'temperature': 4.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9899,1.068276,0.7662,0.76853,0.7662,0.763645
2,0.7507,0.724704,0.8185,0.822713,0.8185,0.817367


[I 2025-04-01 16:03:13,874] Trial 57 pruned. 


Trial 58 with params: {'learning_rate': 0.0036268618324185363, 'weight_decay': 0.005, 'warmup_steps': 21, 'lambda_param': 0.7000000000000001, 'temperature': 4.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,4.0671,4.238917,0.0108,0.000645,0.0108,0.001068
2,4.1327,4.222339,0.0111,0.000867,0.0111,0.001259


[I 2025-04-01 16:06:13,150] Trial 58 pruned. 


Trial 59 with params: {'learning_rate': 0.000325302090969157, 'weight_decay': 0.006, 'warmup_steps': 2, 'lambda_param': 1.0, 'temperature': 4.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.3923,0.912212,0.7566,0.773482,0.7566,0.755412
2,0.578,0.694711,0.8083,0.815679,0.8083,0.807655
3,0.3764,0.645284,0.8234,0.831912,0.8234,0.822877


[I 2025-04-01 16:10:49,409] Trial 59 pruned. 


Trial 60 with params: {'learning_rate': 0.0003020939879565185, 'weight_decay': 0.005, 'warmup_steps': 32, 'lambda_param': 0.5, 'temperature': 2.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4445,0.848321,0.7698,0.783472,0.7698,0.768861
2,0.5671,0.697922,0.8055,0.817021,0.8055,0.805456


[I 2025-04-01 16:13:52,783] Trial 60 pruned. 


Trial 61 with params: {'learning_rate': 0.00017988707904248417, 'weight_decay': 0.003, 'warmup_steps': 3, 'lambda_param': 0.6000000000000001, 'temperature': 3.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5036,0.836941,0.7804,0.790112,0.7804,0.779334
2,0.5479,0.638017,0.8276,0.835532,0.8276,0.827709
3,0.3511,0.607196,0.8318,0.838525,0.8318,0.831463
4,0.2469,0.559106,0.8444,0.848375,0.8444,0.844394
5,0.1867,0.543482,0.8511,0.856186,0.8511,0.851622
6,0.153,0.519752,0.8558,0.85957,0.8558,0.856508
7,0.1324,0.511157,0.8582,0.862135,0.8582,0.858854


[I 2025-04-01 16:24:35,905] Trial 61 finished with value: 0.8588544598908805 and parameters: {'learning_rate': 0.00017988707904248417, 'weight_decay': 0.003, 'warmup_steps': 3, 'lambda_param': 0.6000000000000001, 'temperature': 3.0}. Best is trial 51 with value: 0.8611237155660776.


Trial 62 with params: {'learning_rate': 6.558317820187559e-05, 'weight_decay': 0.004, 'warmup_steps': 11, 'lambda_param': 0.6000000000000001, 'temperature': 2.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.1272,1.161462,0.7521,0.755695,0.7521,0.749912
2,0.829,0.77747,0.8134,0.817301,0.8134,0.812236
3,0.5314,0.669064,0.8247,0.827501,0.8247,0.823604


[I 2025-04-01 16:29:16,536] Trial 62 pruned. 


Trial 63 with params: {'learning_rate': 0.00024414893560250413, 'weight_decay': 0.002, 'warmup_steps': 1, 'lambda_param': 0.30000000000000004, 'temperature': 2.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4333,0.826473,0.7821,0.790377,0.7821,0.78095
2,0.5588,0.65929,0.8248,0.832077,0.8248,0.82502
3,0.3565,0.613939,0.8303,0.83815,0.8303,0.83052
4,0.2496,0.581193,0.8449,0.849992,0.8449,0.84526
5,0.1851,0.548617,0.8477,0.852885,0.8477,0.847885


[I 2025-04-01 16:36:58,054] Trial 63 pruned. 


Trial 64 with params: {'learning_rate': 0.00041098534227771127, 'weight_decay': 0.008, 'warmup_steps': 29, 'lambda_param': 0.0, 'temperature': 3.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4459,0.931702,0.7453,0.759672,0.7453,0.743936
2,0.6262,0.725409,0.798,0.810353,0.798,0.79762


[I 2025-04-01 16:40:02,973] Trial 64 pruned. 


Trial 65 with params: {'learning_rate': 0.00044050251030992095, 'weight_decay': 0.002, 'warmup_steps': 8, 'lambda_param': 0.6000000000000001, 'temperature': 3.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.3879,0.905176,0.7485,0.764434,0.7485,0.747079
2,0.6304,0.736753,0.796,0.805726,0.796,0.796215
3,0.418,0.669302,0.8147,0.82076,0.8147,0.813777


[I 2025-04-01 16:44:41,634] Trial 65 pruned. 


Trial 66 with params: {'learning_rate': 0.0002717249638274919, 'weight_decay': 0.002, 'warmup_steps': 1, 'lambda_param': 0.7000000000000001, 'temperature': 2.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4176,0.857122,0.773,0.785835,0.773,0.772491
2,0.5688,0.656879,0.818,0.824973,0.818,0.817784
3,0.3657,0.631246,0.8264,0.835012,0.8264,0.826752


[I 2025-04-01 16:49:14,448] Trial 66 pruned. 


Trial 67 with params: {'learning_rate': 0.0003640485467190753, 'weight_decay': 0.007, 'warmup_steps': 9, 'lambda_param': 0.8, 'temperature': 4.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4096,0.934198,0.747,0.767454,0.747,0.74658
2,0.6001,0.706059,0.8044,0.813497,0.8044,0.803803
3,0.3909,0.660175,0.8146,0.82328,0.8146,0.814168


[I 2025-04-01 16:53:47,901] Trial 67 pruned. 


Trial 68 with params: {'learning_rate': 0.00040467369472663053, 'weight_decay': 0.01, 'warmup_steps': 7, 'lambda_param': 1.0, 'temperature': 3.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.3825,0.953366,0.7394,0.760727,0.7394,0.737448
2,0.6052,0.724032,0.7982,0.809402,0.7982,0.79802
3,0.3977,0.655002,0.8207,0.830156,0.8207,0.821389


[I 2025-04-01 16:58:20,331] Trial 68 pruned. 


Trial 69 with params: {'learning_rate': 0.00046850307312499905, 'weight_decay': 0.008, 'warmup_steps': 32, 'lambda_param': 0.6000000000000001, 'temperature': 4.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4496,0.939609,0.7441,0.760473,0.7441,0.742815
2,0.6443,0.77363,0.7864,0.799873,0.7864,0.786292


[I 2025-04-01 17:01:24,384] Trial 69 pruned. 


Trial 70 with params: {'learning_rate': 0.00013125971998685846, 'weight_decay': 0.004, 'warmup_steps': 5, 'lambda_param': 0.6000000000000001, 'temperature': 4.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6822,0.897873,0.7802,0.786286,0.7802,0.779143
2,0.6038,0.657669,0.8243,0.82986,0.8243,0.823706
3,0.3779,0.603274,0.8351,0.839595,0.8351,0.835003
4,0.2683,0.577,0.8419,0.847144,0.8419,0.842446
5,0.2038,0.561938,0.8466,0.853115,0.8466,0.84782


[I 2025-04-01 17:09:04,413] Trial 70 pruned. 


Trial 71 with params: {'learning_rate': 0.0001933540519632629, 'weight_decay': 0.007, 'warmup_steps': 7, 'lambda_param': 1.0, 'temperature': 6.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.494,0.833071,0.7829,0.791105,0.7829,0.782055
2,0.5542,0.648797,0.8218,0.83048,0.8218,0.821447
3,0.3544,0.595969,0.838,0.843374,0.838,0.837753
4,0.2498,0.562611,0.8469,0.852161,0.8469,0.847481
5,0.1875,0.546189,0.8508,0.856225,0.8508,0.851389
6,0.1521,0.521403,0.854,0.857276,0.854,0.854425
7,0.1313,0.510865,0.8571,0.861033,0.8571,0.85781


[I 2025-04-01 17:19:36,516] Trial 71 finished with value: 0.8578096393301994 and parameters: {'learning_rate': 0.0001933540519632629, 'weight_decay': 0.007, 'warmup_steps': 7, 'lambda_param': 1.0, 'temperature': 6.0}. Best is trial 51 with value: 0.8611237155660776.


Trial 72 with params: {'learning_rate': 0.00012523809716738472, 'weight_decay': 0.008, 'warmup_steps': 11, 'lambda_param': 0.9, 'temperature': 5.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6848,0.893978,0.7867,0.792573,0.7867,0.78539
2,0.6043,0.659887,0.8285,0.835074,0.8285,0.827973
3,0.3799,0.60989,0.8353,0.841035,0.8353,0.835281
4,0.268,0.568941,0.8453,0.850558,0.8453,0.845534
5,0.2038,0.554439,0.8522,0.858042,0.8522,0.852829
6,0.1676,0.538073,0.8544,0.857382,0.8544,0.85477
7,0.1468,0.535368,0.8544,0.858087,0.8544,0.854954


[I 2025-04-01 17:30:09,390] Trial 72 finished with value: 0.8549543919542087 and parameters: {'learning_rate': 0.00012523809716738472, 'weight_decay': 0.008, 'warmup_steps': 11, 'lambda_param': 0.9, 'temperature': 5.5}. Best is trial 51 with value: 0.8611237155660776.


Trial 73 with params: {'learning_rate': 0.0003207416231506959, 'weight_decay': 0.006, 'warmup_steps': 8, 'lambda_param': 0.8, 'temperature': 6.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.3927,0.852067,0.7678,0.78026,0.7678,0.766269
2,0.577,0.724363,0.8036,0.813154,0.8036,0.804076
3,0.3743,0.613233,0.8326,0.837286,0.8326,0.832338
4,0.2588,0.590088,0.8394,0.844891,0.8394,0.839677
5,0.1912,0.565092,0.8427,0.848947,0.8427,0.843406


[I 2025-04-01 17:37:37,617] Trial 73 pruned. 


Trial 74 with params: {'learning_rate': 0.0010340419267627111, 'weight_decay': 0.0, 'warmup_steps': 7, 'lambda_param': 0.2, 'temperature': 6.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7903,1.339886,0.632,0.660489,0.632,0.628901
2,1.0094,1.057961,0.7063,0.733694,0.7063,0.706553


[I 2025-04-01 17:40:37,232] Trial 74 pruned. 


Trial 75 with params: {'learning_rate': 0.00022307896848376617, 'weight_decay': 0.008, 'warmup_steps': 3, 'lambda_param': 0.9, 'temperature': 7.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4546,0.820353,0.7881,0.794852,0.7881,0.787078
2,0.5534,0.656204,0.8179,0.822667,0.8179,0.817436
3,0.3554,0.603907,0.832,0.838475,0.832,0.831831
4,0.2482,0.574891,0.8434,0.847908,0.8434,0.843017
5,0.1858,0.545339,0.8497,0.854298,0.8497,0.849917


[I 2025-04-01 17:48:09,320] Trial 75 pruned. 


Trial 76 with params: {'learning_rate': 0.0001434884487562377, 'weight_decay': 0.007, 'warmup_steps': 6, 'lambda_param': 1.0, 'temperature': 5.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6233,0.861528,0.7888,0.794011,0.7888,0.787897
2,0.5828,0.65066,0.8258,0.834153,0.8258,0.825533
3,0.3678,0.593488,0.8397,0.845327,0.8397,0.839796
4,0.2598,0.566406,0.8493,0.853505,0.8493,0.849347
5,0.1966,0.549915,0.8513,0.856659,0.8513,0.851886


[I 2025-04-01 17:55:37,426] Trial 76 pruned. 


Trial 77 with params: {'learning_rate': 0.0001234171373726414, 'weight_decay': 0.01, 'warmup_steps': 11, 'lambda_param': 0.5, 'temperature': 2.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6949,0.903577,0.7779,0.785139,0.7779,0.776194
2,0.6081,0.655859,0.8265,0.831692,0.8265,0.826182
3,0.3796,0.607024,0.8357,0.842032,0.8357,0.835863
4,0.2687,0.571168,0.8441,0.848333,0.8441,0.844477
5,0.2041,0.554618,0.8506,0.856558,0.8506,0.85156
6,0.167,0.541634,0.8504,0.853382,0.8504,0.850795
7,0.1464,0.536439,0.8529,0.856038,0.8529,0.853409


[I 2025-04-01 18:06:08,277] Trial 77 finished with value: 0.8534086560450145 and parameters: {'learning_rate': 0.0001234171373726414, 'weight_decay': 0.01, 'warmup_steps': 11, 'lambda_param': 0.5, 'temperature': 2.0}. Best is trial 51 with value: 0.8611237155660776.


Trial 78 with params: {'learning_rate': 0.0001938428963922486, 'weight_decay': 0.0, 'warmup_steps': 0, 'lambda_param': 0.5, 'temperature': 3.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4991,0.850614,0.78,0.78964,0.78,0.778533
2,0.5591,0.641634,0.8247,0.831434,0.8247,0.824669
3,0.3521,0.597871,0.833,0.838176,0.833,0.832954
4,0.2456,0.565468,0.8469,0.850802,0.8469,0.847022
5,0.1847,0.547653,0.8498,0.85601,0.8498,0.85059
6,0.1507,0.520495,0.8543,0.857819,0.8543,0.85475
7,0.1301,0.509457,0.8574,0.861079,0.8574,0.857887


[I 2025-04-01 18:16:42,301] Trial 78 finished with value: 0.857887347511673 and parameters: {'learning_rate': 0.0001938428963922486, 'weight_decay': 0.0, 'warmup_steps': 0, 'lambda_param': 0.5, 'temperature': 3.5}. Best is trial 51 with value: 0.8611237155660776.


Trial 79 with params: {'learning_rate': 0.00014520152793544533, 'weight_decay': 0.001, 'warmup_steps': 5, 'lambda_param': 0.5, 'temperature': 4.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6218,0.878721,0.7853,0.792669,0.7853,0.784286
2,0.5912,0.642149,0.8274,0.833139,0.8274,0.827281
3,0.3712,0.610829,0.8333,0.83872,0.8333,0.833383


[I 2025-04-01 18:21:14,362] Trial 79 pruned. 


Trial 80 with params: {'learning_rate': 8.548596451496562e-05, 'weight_decay': 0.0, 'warmup_steps': 6, 'lambda_param': 0.7000000000000001, 'temperature': 3.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9213,1.032105,0.7694,0.773919,0.7694,0.768117
2,0.7133,0.710253,0.819,0.824623,0.819,0.818508
3,0.4532,0.631085,0.8319,0.837272,0.8319,0.831573
4,0.3265,0.594236,0.8398,0.842923,0.8398,0.83963
5,0.2506,0.576232,0.8446,0.84973,0.8446,0.845058


[I 2025-04-01 18:28:44,808] Trial 80 pruned. 


Trial 81 with params: {'learning_rate': 0.00016984865937691763, 'weight_decay': 0.0, 'warmup_steps': 3, 'lambda_param': 0.5, 'temperature': 3.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.537,0.846713,0.7883,0.795537,0.7883,0.786874
2,0.5611,0.636377,0.828,0.833843,0.828,0.826828
3,0.3528,0.601434,0.8323,0.83822,0.8323,0.832009


[I 2025-04-01 18:33:21,133] Trial 81 pruned. 


Trial 82 with params: {'learning_rate': 0.0002677585757152088, 'weight_decay': 0.004, 'warmup_steps': 1, 'lambda_param': 0.7000000000000001, 'temperature': 3.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4238,0.87227,0.7658,0.778776,0.7658,0.765165
2,0.5642,0.678861,0.812,0.820269,0.812,0.811724
3,0.3586,0.620881,0.8249,0.832238,0.8249,0.824835


[I 2025-04-01 18:37:54,174] Trial 82 pruned. 


Trial 83 with params: {'learning_rate': 0.0001449056110308957, 'weight_decay': 0.004, 'warmup_steps': 2, 'lambda_param': 0.8, 'temperature': 4.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6165,0.877756,0.7877,0.794806,0.7877,0.786466
2,0.5915,0.647834,0.8259,0.832895,0.8259,0.825813
3,0.3714,0.604963,0.8357,0.842671,0.8357,0.835531
4,0.2609,0.567432,0.8436,0.849966,0.8436,0.844174
5,0.1978,0.545084,0.8509,0.856304,0.8509,0.851907


[I 2025-04-01 18:45:36,451] Trial 83 pruned. 


Trial 84 with params: {'learning_rate': 0.00014323466854092214, 'weight_decay': 0.007, 'warmup_steps': 9, 'lambda_param': 0.9, 'temperature': 3.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6197,0.870354,0.7858,0.791006,0.7858,0.783661
2,0.5863,0.641776,0.8289,0.834518,0.8289,0.828366
3,0.3664,0.592192,0.8407,0.846494,0.8407,0.84041
4,0.2574,0.564745,0.8454,0.849555,0.8454,0.845514
5,0.195,0.556363,0.8491,0.854854,0.8491,0.849674


[I 2025-04-01 18:53:08,196] Trial 84 pruned. 


Trial 85 with params: {'learning_rate': 0.0002996812058967516, 'weight_decay': 0.008, 'warmup_steps': 5, 'lambda_param': 1.0, 'temperature': 2.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.3849,0.850567,0.7697,0.778387,0.7697,0.766821
2,0.5635,0.68442,0.8077,0.818427,0.8077,0.807635


[I 2025-04-01 18:56:12,165] Trial 85 pruned. 


Trial 86 with params: {'learning_rate': 0.0001426659655240895, 'weight_decay': 0.001, 'warmup_steps': 0, 'lambda_param': 0.5, 'temperature': 3.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6099,0.859793,0.7861,0.792336,0.7861,0.784849
2,0.5736,0.64386,0.827,0.832529,0.827,0.826659
3,0.3604,0.58973,0.8416,0.847,0.8416,0.841561
4,0.2528,0.556437,0.8475,0.851229,0.8475,0.847606
5,0.1925,0.552068,0.8499,0.855805,0.8499,0.850448
6,0.1582,0.532367,0.8523,0.85532,0.8523,0.852604
7,0.1382,0.527134,0.8536,0.857112,0.8536,0.853997


[I 2025-04-01 19:07:01,770] Trial 86 finished with value: 0.8539971289977599 and parameters: {'learning_rate': 0.0001426659655240895, 'weight_decay': 0.001, 'warmup_steps': 0, 'lambda_param': 0.5, 'temperature': 3.5}. Best is trial 51 with value: 0.8611237155660776.


Trial 87 with params: {'learning_rate': 0.0003208688968633574, 'weight_decay': 0.008, 'warmup_steps': 2, 'lambda_param': 1.0, 'temperature': 5.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.3804,0.897621,0.7565,0.770931,0.7565,0.754284
2,0.5796,0.706483,0.8063,0.816405,0.8063,0.806129


[I 2025-04-01 19:10:13,703] Trial 87 pruned. 


Trial 88 with params: {'learning_rate': 8.797931572341382e-05, 'weight_decay': 0.004, 'warmup_steps': 0, 'lambda_param': 0.8, 'temperature': 2.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8866,1.011667,0.7743,0.778048,0.7743,0.772035
2,0.709,0.707555,0.819,0.822465,0.819,0.817687
3,0.4473,0.634312,0.831,0.835127,0.831,0.830258


[I 2025-04-01 19:14:48,844] Trial 88 pruned. 


Trial 89 with params: {'learning_rate': 9.835283753498323e-05, 'weight_decay': 0.004, 'warmup_steps': 8, 'lambda_param': 1.0, 'temperature': 4.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8063,0.938793,0.7836,0.788006,0.7836,0.781679
2,0.6475,0.672933,0.825,0.829618,0.825,0.824054
3,0.4074,0.614609,0.8338,0.839357,0.8338,0.833555
4,0.2901,0.574311,0.8443,0.848356,0.8443,0.844692
5,0.2227,0.558445,0.8478,0.852722,0.8478,0.848516


[I 2025-04-01 19:22:27,806] Trial 89 pruned. 


Trial 90 with params: {'learning_rate': 0.0011115662517499805, 'weight_decay': 0.004, 'warmup_steps': 24, 'lambda_param': 0.6000000000000001, 'temperature': 7.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8276,1.508272,0.5904,0.633513,0.5904,0.578856
2,1.0394,1.15658,0.6817,0.724326,0.6817,0.685462
3,0.7155,0.937641,0.7414,0.759764,0.7414,0.739287


[I 2025-04-01 19:26:55,803] Trial 90 pruned. 


Trial 91 with params: {'learning_rate': 0.000264933570160417, 'weight_decay': 0.005, 'warmup_steps': 2, 'lambda_param': 0.9, 'temperature': 6.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.3824,0.853559,0.7715,0.785132,0.7715,0.77124
2,0.5521,0.68342,0.8108,0.822037,0.8108,0.811421


[I 2025-04-01 19:29:58,145] Trial 91 pruned. 


Trial 92 with params: {'learning_rate': 0.0008846159350465202, 'weight_decay': 0.001, 'warmup_steps': 28, 'lambda_param': 1.0, 'temperature': 2.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6756,1.324884,0.6417,0.674537,0.6417,0.634821
2,0.9047,1.037995,0.7154,0.746602,0.7154,0.714413
3,0.6198,0.907907,0.752,0.776815,0.752,0.754447


[I 2025-04-01 19:34:31,264] Trial 92 pruned. 


Trial 93 with params: {'learning_rate': 0.00044764456410795314, 'weight_decay': 0.005, 'warmup_steps': 13, 'lambda_param': 1.0, 'temperature': 5.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.3992,0.950054,0.736,0.752852,0.736,0.732815
2,0.6253,0.766311,0.7898,0.801436,0.7898,0.789502


[I 2025-04-01 19:37:31,259] Trial 93 pruned. 


Trial 94 with params: {'learning_rate': 0.0006161448586038617, 'weight_decay': 0.005, 'warmup_steps': 20, 'lambda_param': 0.1, 'temperature': 5.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([100]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([100, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4774,1.037115,0.7159,0.735059,0.7159,0.713613
2,0.7313,0.82969,0.7678,0.782959,0.7678,0.767522
3,0.4933,0.74111,0.7965,0.807549,0.7965,0.796835


[I 2025-04-01 19:42:02,337] Trial 94 pruned. 


Trial 95 with params: {'learning_rate': 0.00016573911915462533, 'weight_decay': 0.004, 'warmup_steps': 4, 'lambda_param': 1.0, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.551,0.838754,0.7912,0.79499,0.7912,0.789331
2,0.5695,0.646241,0.8248,0.830315,0.8248,0.824439
3,0.3556,0.59075,0.8378,0.842599,0.8378,0.837451
4,0.2502,0.568946,0.8443,0.849083,0.8443,0.844484
5,0.189,0.550554,0.8464,0.852065,0.8464,0.847001


[I 2025-04-01 19:49:43,547] Trial 95 pruned. 


Trial 96 with params: {'learning_rate': 5.399635979922363e-05, 'weight_decay': 0.0, 'warmup_steps': 26, 'lambda_param': 0.30000000000000004, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.3152,1.268355,0.7401,0.744018,0.7401,0.737039
2,0.9195,0.83843,0.8039,0.808218,0.8039,0.802773
3,0.5987,0.705389,0.8218,0.825353,0.8218,0.821165


[I 2025-04-01 19:54:15,261] Trial 96 pruned. 


Trial 97 with params: {'learning_rate': 0.0001806824939484019, 'weight_decay': 0.007, 'warmup_steps': 2, 'lambda_param': 1.0, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5209,0.847702,0.7863,0.793806,0.7863,0.784727
2,0.5615,0.641524,0.826,0.832468,0.826,0.826224
3,0.3554,0.611106,0.8301,0.83823,0.8301,0.830257


[I 2025-04-01 19:58:57,570] Trial 97 pruned. 


Trial 98 with params: {'learning_rate': 0.0001147772186457988, 'weight_decay': 0.008, 'warmup_steps': 29, 'lambda_param': 0.30000000000000004, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7762,0.934277,0.7815,0.787055,0.7815,0.780008
2,0.6357,0.668139,0.8212,0.826341,0.8212,0.82022
3,0.395,0.609068,0.8329,0.838039,0.8329,0.832293


[I 2025-04-01 20:03:32,671] Trial 98 pruned. 


Trial 99 with params: {'learning_rate': 7.012156354747e-05, 'weight_decay': 0.004, 'warmup_steps': 1, 'lambda_param': 0.9, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.0674,1.143844,0.7574,0.760909,0.7574,0.754978
2,0.8175,0.782288,0.8068,0.811472,0.8068,0.805358
3,0.524,0.674672,0.8222,0.825048,0.8222,0.821075


[I 2025-04-01 20:08:13,299] Trial 99 pruned. 


Trial 100 with params: {'learning_rate': 0.004463096479266976, 'weight_decay': 0.003, 'warmup_steps': 18, 'lambda_param': 1.0, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,4.132,4.447169,0.0126,0.001247,0.0126,0.001632
2,4.1264,4.240985,0.0085,0.002806,0.0085,0.000745
3,4.1121,4.247659,0.01,0.0001,0.01,0.000198


[I 2025-04-01 20:12:48,692] Trial 100 pruned. 


Trial 101 with params: {'learning_rate': 8.94540164446801e-05, 'weight_decay': 0.008, 'warmup_steps': 1, 'lambda_param': 0.7000000000000001, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8829,1.013629,0.7738,0.778953,0.7738,0.77203
2,0.7059,0.698256,0.8228,0.826808,0.8228,0.822128
3,0.4422,0.630398,0.8296,0.833969,0.8296,0.828576


[I 2025-04-01 20:17:29,072] Trial 101 pruned. 


Trial 102 with params: {'learning_rate': 0.0001148160691533016, 'weight_decay': 0.005, 'warmup_steps': 2, 'lambda_param': 0.8, 'temperature': 5.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7386,0.92713,0.7827,0.787432,0.7827,0.780768
2,0.6344,0.674152,0.824,0.828381,0.824,0.823049
3,0.3963,0.621031,0.8293,0.835182,0.8293,0.828373


[I 2025-04-01 20:22:09,805] Trial 102 pruned. 


Trial 103 with params: {'learning_rate': 0.0003881193148130847, 'weight_decay': 0.0, 'warmup_steps': 0, 'lambda_param': 0.8, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.3788,0.889534,0.7615,0.779302,0.7615,0.759655
2,0.6019,0.695444,0.8054,0.81358,0.8054,0.804696


[I 2025-04-01 20:25:17,479] Trial 103 pruned. 


Trial 104 with params: {'learning_rate': 0.00016146239281839112, 'weight_decay': 0.006, 'warmup_steps': 2, 'lambda_param': 0.9, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5573,0.851085,0.7862,0.793448,0.7862,0.785306
2,0.5722,0.659902,0.8229,0.830047,0.8229,0.822667
3,0.3638,0.597364,0.8378,0.843648,0.8378,0.83766
4,0.2544,0.561822,0.8474,0.851966,0.8474,0.847605
5,0.1931,0.551446,0.8507,0.855097,0.8507,0.85135
6,0.157,0.529462,0.8567,0.860189,0.8567,0.857252
7,0.1359,0.519267,0.8581,0.862441,0.8581,0.858983


[I 2025-04-01 20:36:03,012] Trial 104 finished with value: 0.8589832423225972 and parameters: {'learning_rate': 0.00016146239281839112, 'weight_decay': 0.006, 'warmup_steps': 2, 'lambda_param': 0.9, 'temperature': 3.5}. Best is trial 51 with value: 0.8611237155660776.


Trial 105 with params: {'learning_rate': 0.0003828197487927862, 'weight_decay': 0.0, 'warmup_steps': 1, 'lambda_param': 0.30000000000000004, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.384,0.941751,0.7419,0.757974,0.7419,0.739531
2,0.6112,0.723023,0.8013,0.811886,0.8013,0.80111


[I 2025-04-01 20:39:08,482] Trial 105 pruned. 


Trial 106 with params: {'learning_rate': 0.00043058010857109744, 'weight_decay': 0.008, 'warmup_steps': 19, 'lambda_param': 0.9, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4103,0.91918,0.7473,0.759874,0.7473,0.744991
2,0.6259,0.756015,0.7893,0.80334,0.7893,0.788279
3,0.4097,0.676506,0.8162,0.823925,0.8162,0.816228


[I 2025-04-01 20:43:46,569] Trial 106 pruned. 


Trial 107 with params: {'learning_rate': 7.60931064441586e-05, 'weight_decay': 0.007, 'warmup_steps': 1, 'lambda_param': 1.0, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.0086,1.089306,0.7616,0.764909,0.7616,0.759523
2,0.7716,0.743448,0.8124,0.816402,0.8124,0.811452


[I 2025-04-01 20:46:52,451] Trial 107 pruned. 


Trial 108 with params: {'learning_rate': 0.00041814560154837044, 'weight_decay': 0.005, 'warmup_steps': 0, 'lambda_param': 0.9, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.3873,0.954545,0.7386,0.761447,0.7386,0.737724
2,0.6229,0.748672,0.796,0.808467,0.796,0.796425


[I 2025-04-01 20:50:00,610] Trial 108 pruned. 


Trial 109 with params: {'learning_rate': 0.00026035212780520543, 'weight_decay': 0.009000000000000001, 'warmup_steps': 16, 'lambda_param': 1.0, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4379,0.83342,0.7792,0.7904,0.7792,0.77826
2,0.5613,0.679262,0.8086,0.818123,0.8086,0.807894
3,0.3622,0.626303,0.8273,0.836939,0.8273,0.827718


[I 2025-04-01 20:54:36,147] Trial 109 pruned. 


Trial 110 with params: {'learning_rate': 0.0003313632543731077, 'weight_decay': 0.004, 'warmup_steps': 2, 'lambda_param': 0.4, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.3765,0.920363,0.7493,0.768876,0.7493,0.747856
2,0.5817,0.695971,0.8097,0.819419,0.8097,0.810034
3,0.3763,0.639033,0.8241,0.8318,0.8241,0.824316


[I 2025-04-01 20:59:20,884] Trial 110 pruned. 


Trial 111 with params: {'learning_rate': 0.00017328374485156897, 'weight_decay': 0.006, 'warmup_steps': 1, 'lambda_param': 1.0, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5251,0.834188,0.785,0.791263,0.785,0.782869
2,0.5635,0.654969,0.822,0.83012,0.822,0.822
3,0.3563,0.599527,0.8376,0.843487,0.8376,0.837588
4,0.2497,0.548762,0.8516,0.855863,0.8516,0.851795
5,0.1889,0.540596,0.8519,0.857597,0.8519,0.85254
6,0.1541,0.523547,0.8564,0.859727,0.8564,0.856909
7,0.1337,0.51337,0.8576,0.860865,0.8576,0.858087


[I 2025-04-01 21:10:13,803] Trial 111 finished with value: 0.8580871698773421 and parameters: {'learning_rate': 0.00017328374485156897, 'weight_decay': 0.006, 'warmup_steps': 1, 'lambda_param': 1.0, 'temperature': 4.0}. Best is trial 51 with value: 0.8611237155660776.


Trial 112 with params: {'learning_rate': 0.000167272461402189, 'weight_decay': 0.003, 'warmup_steps': 11, 'lambda_param': 0.9, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5511,0.835913,0.7916,0.797621,0.7916,0.789846
2,0.5662,0.643185,0.8257,0.832898,0.8257,0.824921
3,0.3579,0.60068,0.8367,0.842697,0.8367,0.836671
4,0.2512,0.573176,0.8456,0.849683,0.8456,0.845842
5,0.1897,0.550591,0.851,0.856382,0.851,0.85145
6,0.1548,0.531149,0.8538,0.856928,0.8538,0.854319
7,0.1343,0.521441,0.8566,0.860681,0.8566,0.857337


[I 2025-04-01 21:20:55,301] Trial 112 finished with value: 0.8573374625013228 and parameters: {'learning_rate': 0.000167272461402189, 'weight_decay': 0.003, 'warmup_steps': 11, 'lambda_param': 0.9, 'temperature': 3.0}. Best is trial 51 with value: 0.8611237155660776.


Trial 113 with params: {'learning_rate': 0.00029111777079629677, 'weight_decay': 0.007, 'warmup_steps': 1, 'lambda_param': 0.8, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.3793,0.871401,0.7612,0.774326,0.7612,0.758628
2,0.5591,0.679695,0.8141,0.823489,0.8141,0.813581
3,0.3611,0.614351,0.8299,0.836568,0.8299,0.82942


[I 2025-04-01 21:25:35,555] Trial 113 pruned. 


Trial 114 with params: {'learning_rate': 0.00016203468494340327, 'weight_decay': 0.007, 'warmup_steps': 4, 'lambda_param': 1.0, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5715,0.848332,0.7853,0.793835,0.7853,0.783723
2,0.5644,0.63932,0.8262,0.833069,0.8262,0.825867
3,0.3533,0.594601,0.8318,0.837405,0.8318,0.831708
4,0.2484,0.563664,0.8474,0.852801,0.8474,0.847906
5,0.1895,0.546038,0.8516,0.856043,0.8516,0.851955
6,0.1547,0.523088,0.8556,0.8589,0.8556,0.856024
7,0.1347,0.517129,0.8552,0.859186,0.8552,0.855807


[I 2025-04-01 21:36:24,386] Trial 114 finished with value: 0.8558074704232184 and parameters: {'learning_rate': 0.00016203468494340327, 'weight_decay': 0.007, 'warmup_steps': 4, 'lambda_param': 1.0, 'temperature': 4.0}. Best is trial 51 with value: 0.8611237155660776.


Trial 115 with params: {'learning_rate': 0.0002452794184793272, 'weight_decay': 0.01, 'warmup_steps': 9, 'lambda_param': 0.8, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4226,0.83598,0.7731,0.785156,0.7731,0.771512
2,0.5485,0.669474,0.8181,0.826406,0.8181,0.818248
3,0.3515,0.609673,0.8302,0.836961,0.8302,0.830048


[I 2025-04-01 21:41:03,318] Trial 115 pruned. 


Trial 116 with params: {'learning_rate': 0.0003481533047431033, 'weight_decay': 0.005, 'warmup_steps': 8, 'lambda_param': 1.0, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.3931,0.918964,0.7446,0.763218,0.7446,0.743495
2,0.595,0.710985,0.8005,0.812252,0.8005,0.800346
3,0.3894,0.649155,0.8203,0.827874,0.8203,0.82022


[I 2025-04-01 21:45:43,223] Trial 116 pruned. 


Trial 117 with params: {'learning_rate': 0.00026696674274803874, 'weight_decay': 0.006, 'warmup_steps': 6, 'lambda_param': 0.8, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4103,0.839757,0.7774,0.788183,0.7774,0.775361
2,0.5559,0.669513,0.8165,0.822718,0.8165,0.815772
3,0.3595,0.619695,0.8271,0.835575,0.8271,0.827653


[I 2025-04-01 21:50:16,483] Trial 117 pruned. 


Trial 118 with params: {'learning_rate': 0.0002488322527924432, 'weight_decay': 0.005, 'warmup_steps': 1, 'lambda_param': 0.9, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4424,0.838131,0.7758,0.786217,0.7758,0.774132
2,0.5605,0.6771,0.8064,0.814735,0.8064,0.806422


[I 2025-04-01 21:53:20,210] Trial 118 pruned. 


Trial 119 with params: {'learning_rate': 0.00012478511665942084, 'weight_decay': 0.004, 'warmup_steps': 1, 'lambda_param': 1.0, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6753,0.89379,0.7831,0.789987,0.7831,0.78189
2,0.6047,0.657496,0.8248,0.829967,0.8248,0.823787
3,0.3779,0.608982,0.8361,0.840005,0.8361,0.83544
4,0.2681,0.570289,0.8438,0.848265,0.8438,0.844035
5,0.2036,0.559152,0.8462,0.850991,0.8462,0.846544


[I 2025-04-01 22:01:06,079] Trial 119 pruned. 


Trial 120 with params: {'learning_rate': 0.00011140545283319333, 'weight_decay': 0.006, 'warmup_steps': 11, 'lambda_param': 0.7000000000000001, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.7446,0.912304,0.7823,0.787823,0.7823,0.781017
2,0.6211,0.653108,0.827,0.833323,0.827,0.826802
3,0.3864,0.610397,0.8342,0.839118,0.8342,0.833636
4,0.2754,0.574254,0.8459,0.850216,0.8459,0.846245
5,0.2099,0.553849,0.8489,0.853801,0.8489,0.849584


[I 2025-04-01 22:08:50,735] Trial 120 pruned. 


Trial 121 with params: {'learning_rate': 8.532115701682182e-05, 'weight_decay': 0.003, 'warmup_steps': 21, 'lambda_param': 1.0, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9131,1.002618,0.777,0.779959,0.777,0.775281
2,0.6947,0.694518,0.8244,0.828141,0.8244,0.82387
3,0.439,0.624959,0.8324,0.837682,0.8324,0.832016
4,0.3155,0.584783,0.8428,0.846802,0.8428,0.84305
5,0.2447,0.569396,0.846,0.85073,0.846,0.846394


[I 2025-04-01 22:16:38,890] Trial 121 pruned. 


Trial 122 with params: {'learning_rate': 6.735821664620213e-05, 'weight_decay': 0.002, 'warmup_steps': 10, 'lambda_param': 1.0, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.1451,1.18638,0.7491,0.752863,0.7491,0.746719
2,0.85,0.797286,0.8048,0.808415,0.8048,0.803308


[I 2025-04-01 22:19:43,901] Trial 122 pruned. 


Trial 123 with params: {'learning_rate': 5.34759282725924e-05, 'weight_decay': 0.01, 'warmup_steps': 4, 'lambda_param': 0.9, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.3284,1.294507,0.7333,0.736856,0.7333,0.729823
2,0.9444,0.855294,0.8012,0.804876,0.8012,0.799972
3,0.6191,0.718001,0.8173,0.821513,0.8173,0.816865


[I 2025-04-01 22:24:24,052] Trial 123 pruned. 


Trial 124 with params: {'learning_rate': 0.00014680154537818888, 'weight_decay': 0.005, 'warmup_steps': 9, 'lambda_param': 1.0, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.6398,0.886667,0.7791,0.786659,0.7791,0.777475
2,0.5892,0.648358,0.8246,0.829676,0.8246,0.823946
3,0.3691,0.604564,0.834,0.839405,0.834,0.833816
4,0.2586,0.575045,0.8412,0.846025,0.8412,0.841581
5,0.1969,0.552728,0.8501,0.856169,0.8501,0.85098


[I 2025-04-01 22:32:03,149] Trial 124 pruned. 


Trial 125 with params: {'learning_rate': 0.00019870271204584246, 'weight_decay': 0.004, 'warmup_steps': 14, 'lambda_param': 0.7000000000000001, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5197,0.843124,0.7835,0.79201,0.7835,0.781942
2,0.5597,0.652736,0.8209,0.830002,0.8209,0.820869
3,0.3585,0.603887,0.8326,0.83952,0.8326,0.833225
4,0.2495,0.574057,0.8456,0.849201,0.8456,0.845725


[W 2025-04-01 22:39:07,371] Trial 125 failed with parameters: {'learning_rate': 0.00019870271204584246, 'weight_decay': 0.004, 'warmup_steps': 14, 'lambda_param': 0.7000000000000001, 'temperature': 3.0} because of the following error: KeyboardInterrupt().
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/optuna/study/_optimize.py", line 197, in _run_trial
    value_or_values = func(trial)
  File "/usr/local/lib/python3.10/dist-packages/transformers/integrations/integration_utils.py", line 250, in _objective
    trainer.train(resume_from_checkpoint=checkpoint, trial=trial)
  File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 2241, in train
    return inner_training_loop(
  File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 2548, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs, num_items_in_batch)
  File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 3

KeyboardInterrupt: 

    r = in_queue.get(timeout=MP_STATUS_CHECK_INTERVAL)
  File "/usr/lib/python3.10/multiprocessing/queues.py", line 122, in get
    return _ForkingPickler.loads(res)
  File "/usr/local/lib/python3.10/dist-packages/torch/multiprocessing/reductions.py", line 541, in rebuild_storage_fd
    fd = df.detach()
  File "/usr/lib/python3.10/multiprocessing/resource_sharer.py", line 57, in detach


    with _resource_sharer.get_connection(self._id) as conn:
  File "/usr/lib/python3.10/multiprocessing/resource_sharer.py", line 86, in get_connection
    c = Client(address, authkey=process.current_process().authkey)
  File "/usr/lib/python3.10/multiprocessing/connection.py", line 502, in Client
    c = SocketClient(address)
  File "/usr/lib/python3.10/multiprocessing/connection.py", line 630, in SocketClient
    s.connect(address)
FileNotFoundError: [Errno 2] No such file or directory
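
The log above mixes pruned, finished, and failed trials, which makes it hard to see how many configurations ran to completion. The following is a minimal sketch, not part of the original notebook, that tallies trial outcomes from captured log text. The helper name `summarize_optuna_log` and the regular expressions are assumptions based only on the message formats visible in this output.

```python
import re

# Patterns derived from the log lines shown above (assumed stable for this run).
FINISHED = re.compile(r"Trial (\d+) finished with value: ([0-9.eE+-]+) and parameters")
PRUNED = re.compile(r"Trial (\d+) pruned\.")
FAILED = re.compile(r"Trial (\d+) failed with parameters")


def summarize_optuna_log(text):
    """Hypothetical helper: return (finished, pruned, failed) parsed from log text."""
    # Map each finished trial number to its reported objective value (F1).
    finished = {int(num): float(value) for num, value in FINISHED.findall(text)}
    pruned = [int(num) for num in PRUNED.findall(text)]
    failed = [int(num) for num in FAILED.findall(text)]
    return finished, pruned, failed


# Three lines copied from the output above, used as sample input.
sample_log = """
[I 2025-04-01 20:25:17,479] Trial 103 pruned.
[I 2025-04-01 20:36:03,012] Trial 104 finished with value: 0.8589832423225972 and parameters: {'learning_rate': 0.00016146239281839112, 'weight_decay': 0.006, 'warmup_steps': 2, 'lambda_param': 0.9, 'temperature': 3.5}. Best is trial 51 with value: 0.8611237155660776.
[W 2025-04-01 22:39:07,371] Trial 125 failed with parameters: {'learning_rate': 0.00019870271204584246, 'weight_decay': 0.004, 'warmup_steps': 14, 'lambda_param': 0.7000000000000001, 'temperature': 3.0} because of the following error: KeyboardInterrupt().
"""

finished, pruned, failed = summarize_optuna_log(sample_log)
print("finished:", finished)
print("pruned:", pruned)
print("failed:", failed)
```

If the study object is still in memory, reading trial states directly from it is likely more robust than parsing text; the sketch is meant for cases where only the saved output remains.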


In [None]:
print(best_distill_aug)

BestRun(run_id='44', objective=0.7611287079618219, hyperparameters={'learning_rate': 0.00063155918393816, 'weight_decay': 0.009000000000000001, 'warmup_steps': 5, 'lambda_param': 0.6000000000000001, 'temperature': 4.0}, run_summary=None)
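
The printed `BestRun` bundles optimizer settings together with distillation settings. As a hedged sketch, the hyperparameters could be split into the keys that map onto `TrainingArguments` fields (`learning_rate`, `weight_decay`, `warmup_steps`) and the remaining keys (`lambda_param`, `temperature`), which are assumed here to be consumed by a custom distillation loss. The local `BestRun` class is a stand-in that only mirrors the fields printed above so the example runs without `transformers`; `split_hyperparameters` is a hypothetical helper.

```python
from typing import Any, Dict, NamedTuple, Optional, Tuple


class BestRun(NamedTuple):
    """Local stand-in mirroring the fields shown in the printed output."""
    run_id: str
    objective: float
    hyperparameters: Dict[str, Any]
    run_summary: Optional[Any] = None


# Keys that correspond to TrainingArguments fields of the same name.
TRAINING_ARG_KEYS = ("learning_rate", "weight_decay", "warmup_steps")


def split_hyperparameters(best: BestRun) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Hypothetical helper: separate optimizer settings from distillation settings."""
    training = {k: v for k, v in best.hyperparameters.items() if k in TRAINING_ARG_KEYS}
    distill = {k: v for k, v in best.hyperparameters.items() if k not in TRAINING_ARG_KEYS}
    return training, distill


# Values copied from the output above.
example_best = BestRun(
    run_id="44",
    objective=0.7611287079618219,
    hyperparameters={
        "learning_rate": 0.00063155918393816,
        "weight_decay": 0.009000000000000001,
        "warmup_steps": 5,
        "lambda_param": 0.6000000000000001,
        "temperature": 4.0,
    },
)

training_kwargs, distill_kwargs = split_hyperparameters(example_best)
print(training_kwargs)
print(distill_kwargs)
```

The `training_kwargs` dictionary could then be unpacked into the training configuration for the final run, while `distill_kwargs` would go to whatever component computes the distillation loss in this project.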


In [None]:
print("Best normal training run (original dataset):", best_base)
print("Best distillation training run (original dataset):", best_distill)
print("Best normal training run (augmented dataset):", best_base_aug)
print("Best distillation training run (augmented dataset):", best_distill_aug)