# Hyperparameter search for the TinyViT model on the CIFAR10 dataset in its original and augmented form

This notebook searches for optimal hyperparameters for the TinyViT model on the CIFAR10 dataset. Hyperparameters are searched for both the original and the augmented dataset, and for both standard training and distillation.

The search uses the Optuna library with the Hyperband algorithm (a `HyperbandPruner` combined with a TPE sampler). The best configuration is selected by F1 score, and 150 hyperparameter combinations are tried for each variant.

## Library imports and basic setup

In [None]:
from transformers import Trainer, AutoModelForImageClassification
from torch.utils.data import ConcatDataset
import optuna
import torch
import math
import base
import os


In [None]:
dataset_part = base.get_dataset_part()

Reset the random seed so that the results are reproducible.

In [None]:
base.reset_seed()

Check GPU availability.

In [4]:
if torch.cuda.is_available():
    device = torch.device("cuda")
    print("GPU is available and will be used:", torch.cuda.get_device_name(0))
else:
    device = torch.device("cpu")
    print("GPU is not available, using CPU.")

GPU is available and will be used: NVIDIA A100 80GB PCIe MIG 2g.20gb


Load the dataset and apply the base and augmentation transforms.

In [5]:
DATASET = "cifar10"

In [None]:
transform = base.base_transforms()

test = base.CustomCIFAR10L(root=f"{os.path.expanduser('~')}/data/10-logits", dataset_part=dataset_part.TEST, transform=transform)
train = base.CustomCIFAR10L(root=f"{os.path.expanduser('~')}/data/10-logits", dataset_part=dataset_part.TRAIN, transform=transform)
eval = base.CustomCIFAR10L(root=f"{os.path.expanduser('~')}/data/10-logits", dataset_part=dataset_part.EVAL, transform=transform)

In [7]:
augment_transform = base.aug_transforms()
train_aug = base.CustomCIFAR10L(root=f"{os.path.expanduser('~')}/data/10-logits", dataset_part=dataset_part.TRAIN, transform=augment_transform)

Filter the augmented dataset according to the described mechanism.

In [8]:
train_aug = base.remove_diff_pred_class(train, train_aug, pytorch_dataset=True)
train_combo = ConcatDataset([train, train_aug])

Removing entries from augmented dataset that are different from the base one - based on saved logits:   0%|   …
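
The filtering step can be pictured with a small stand-alone sketch. It assumes, based only on the function name and the progress message above, that an augmented sample is kept when the class predicted from its saved logits matches the class predicted for the original sample. The helper `keep_matching_predictions` and the toy logits are hypothetical and are not the implementation in `base`.

```python
from typing import List, Sequence


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value (the predicted class for a logit vector)."""
    return max(range(len(values)), key=lambda i: values[i])


def keep_matching_predictions(
    base_logits: Sequence[Sequence[float]],
    aug_logits: Sequence[Sequence[float]],
) -> List[int]:
    """Return indices of augmented samples whose predicted class equals
    the predicted class of the corresponding original sample."""
    if len(base_logits) != len(aug_logits):
        raise ValueError("both logit lists must describe the same samples")
    return [
        i
        for i, (b, a) in enumerate(zip(base_logits, aug_logits))
        if argmax(b) == argmax(a)
    ]


# Toy logits for three samples and three classes (made-up numbers).
base_logits = [[2.0, 0.1, 0.3], [0.2, 1.5, 0.1], [0.0, 0.4, 3.0]]
aug_logits = [[1.8, 0.2, 0.1], [1.1, 0.9, 0.2], [0.1, 0.2, 2.5]]

kept = keep_matching_predictions(base_logits, aug_logits)
print(kept)  # [0, 2]
```

In this toy case the second augmented sample is dropped because augmentation changed the predicted class from 1 to 0.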

Base training configuration used during the search. Optuna works with steps rather than epochs, so the conversion is done below.

The minimum training length is two epochs and the maximum is seven. The maximum number of warm-up steps is set to 10% of the first epoch.

In [None]:
num_epochs = 7
batch_size = 128

In [None]:
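data_length = len(train)
steps_per_epoch = math.ceil(data_length / batch_size)  # optimizer steps in one epoch
min_r = steps_per_epoch * 2  # shortest run: two epochs
max_r = steps_per_epoch * num_epochs  # longest run: all num_epochs epochs
warm_up = math.ceil(data_length / batch_size / 10)  # warm-up upper bound: 10% of one epoch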
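
As a worked example of the epoch-to-step conversion, the sketch below repeats the same arithmetic for a hypothetical training split of 40,000 images. The split size and the helper `epochs_to_steps` are assumptions made for illustration; the notebook takes the real size from `len(train)`.

```python
import math


def epochs_to_steps(num_samples: int, batch_size: int, epochs: int) -> int:
    """Number of optimizer steps needed to pass over the data `epochs` times,
    assuming the last, possibly smaller batch is kept and there is no
    gradient accumulation."""
    return math.ceil(num_samples / batch_size) * epochs


# Hypothetical split size, chosen only for illustration.
num_samples = 40_000
batch_size = 128

min_steps = epochs_to_steps(num_samples, batch_size, 2)
max_steps = epochs_to_steps(num_samples, batch_size, 7)
max_warmup = math.ceil(num_samples / batch_size / 10)

print(min_steps, max_steps, max_warmup)  # 626 2191 32
```

Rounding up matters here: 40,000 / 128 is 312.5, so one epoch takes 313 steps, not 312.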

## Search with standard training on the original dataset

Definition of the searched hyperparameters and their ranges.

In [12]:
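def hp_space(trial):
    params = {
        # Learning rate is sampled on a logarithmic scale.
        "learning_rate": trial.suggest_float("learning_rate", 5e-5, 5e-3, log=True),
        # Weight decay is restricted to a grid with a step of 1e-3.
        "weight_decay": trial.suggest_float("weight_decay", 0, 1e-2, step=1e-3),
        # Warm-up length in optimizer steps, at most 10% of one epoch.
        "warmup_steps": trial.suggest_int("warmup_steps", 0, warm_up),
    }
    print(f"Trial {trial.number} with params: {params}")
    return params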
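
To make the three kinds of ranges concrete, the sketch below draws values from equivalent distributions using only the standard library. It is a stand-in for illustration, not Optuna's TPE sampler: the helpers `sample_log_uniform` and `sample_stepped` are hypothetical, and the warm-up upper bound of 32 is an assumed value.

```python
import math
import random


def sample_log_uniform(low: float, high: float, rng: random.Random) -> float:
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    value = math.exp(rng.uniform(math.log(low), math.log(high)))
    return min(max(value, low), high)  # guard against floating-point overshoot


def sample_stepped(low: float, high: float, step: float, rng: random.Random) -> float:
    """Draw a value from the grid low, low + step, ..., high."""
    n_steps = round((high - low) / step)
    return low + step * rng.randint(0, n_steps)


rng = random.Random(42)
for _ in range(3):
    params = {
        "learning_rate": sample_log_uniform(5e-5, 5e-3, rng),
        "weight_decay": sample_stepped(0.0, 1e-2, 1e-3, rng),
        "warmup_steps": rng.randint(0, 32),  # 32 is a hypothetical upper bound
    }
    print(params)
```

Sampling the learning rate on a logarithmic scale gives each order of magnitude the same chance of being tried. Grid values such as `0.009000000000000001` in trial logs are ordinary floating-point rounding of a grid point.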

Optuna configuration.

In [None]:
pruner = optuna.pruners.HyperbandPruner(min_resource=min_r, max_resource=max_r, reduction_factor=2, bootstrap_count=2)
sampler = optuna.samplers.TPESampler(seed=42, multivariate=True)
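
The effect of `min_resource`, `max_resource` and `reduction_factor` can be sketched with the formulas given in the Optuna documentation: the number of brackets is floor(log_eta(max_resource / min_resource)) + 1, and a bracket checks trials at steps min_resource * eta^(bracket + rung). The code below is an approximation for intuition only; it ignores `bootstrap_count` and the way Optuna assigns trials to brackets, and the step budgets are hypothetical values for 313 steps per epoch.

```python
import math
from typing import List


def hyperband_bracket_count(min_resource: int, max_resource: int, reduction_factor: int) -> int:
    """Number of brackets: floor(log_eta(max / min)) + 1."""
    return math.floor(math.log(max_resource / min_resource, reduction_factor)) + 1


def rung_steps(min_resource: int, max_resource: int, reduction_factor: int, bracket: int) -> List[int]:
    """Steps at which a bracket may prune a trial, capped by max_resource."""
    steps = []
    rung = 0
    while True:
        step = min_resource * reduction_factor ** (bracket + rung)
        if step > max_resource:
            break
        steps.append(step)
        rung += 1
    return steps


# Hypothetical budgets: 2 and 7 epochs at 313 steps per epoch.
min_r, max_r, eta = 626, 2191, 2

n_brackets = hyperband_bracket_count(min_r, max_r, eta)
print(n_brackets)  # 2
for bracket in range(n_brackets):
    print(bracket, rung_steps(min_r, max_r, eta, bracket))
# 0 [626, 1252]
# 1 [1252]
```

Under these assumptions a trial can be stopped after two or four epochs, or it runs for the full seven.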



In [None]:
base.reset_seed()

Configuration of the individual training runs.

In [15]:
training_args = base.get_training_args(output_dir=f"~/results/{DATASET}/_hp-search", logging_dir=f"~/logs/{DATASET}/_hp-search", epochs=num_epochs, batch_size=batch_size)

Definition of how the student model is obtained.

In [None]:
def get_model():
    return AutoModelForImageClassification.from_pretrained("timm/tiny_vit_5m_224.in1k", num_labels=10, ignore_mismatched_sizes=True)

Trainer configuration for the individual training runs.

In [17]:
trainer = Trainer(
    args=training_args,
    train_dataset=train,
    eval_dataset=eval,
    compute_metrics=base.compute_metrics,
    model_init=get_model,  # a fresh model is built for every trial
)

Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Search setup.

In [18]:
best_base = trainer.hyperparameter_search(
    direction="maximize",
    backend="optuna",
    hp_space=hp_space,
    compute_objective=lambda metrics: metrics["eval_f1"],
    pruner=pruner,
    sampler=sampler,
    study_name="Base",
    n_trials=150
)

[I 2025-03-29 16:18:44,733] A new study created in memory with name: Base


Trial 0 with params: {'learning_rate': 0.0002805758207667253, 'weight_decay': 0.01, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4427,0.264957,0.9117,0.919823,0.911798,0.911894
2,0.1384,0.190786,0.9374,0.93995,0.93788,0.937399
3,0.0747,0.146533,0.9559,0.956957,0.955881,0.956239
4,0.0425,0.150831,0.9584,0.958616,0.958541,0.958336


[I 2025-03-29 16:29:31,752] Trial 0 pruned. 


Trial 1 with params: {'learning_rate': 0.0007875660249889869, 'weight_decay': 0.001, 'warmup_steps': 5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5514,0.41773,0.8557,0.882328,0.85637,0.855427
2,0.2784,0.339531,0.8819,0.892951,0.882365,0.880033


[I 2025-03-29 16:34:52,224] Trial 1 pruned. 


Trial 2 with params: {'learning_rate': 6.533369619026643e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 19}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5266,0.174688,0.9451,0.947232,0.945241,0.945465
2,0.0922,0.130627,0.9562,0.956605,0.95643,0.95628
3,0.0396,0.12304,0.9618,0.961901,0.962067,0.961894
4,0.0161,0.121924,0.9696,0.96972,0.969756,0.969725


[I 2025-03-29 16:45:37,164] Trial 2 pruned. 


Trial 3 with params: {'learning_rate': 0.0013035123791853842, 'weight_decay': 0.0, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.8162,0.680534,0.7682,0.795655,0.768655,0.766107
2,0.4594,0.387826,0.8659,0.870784,0.866465,0.865212


[I 2025-03-29 16:50:59,362] Trial 3 pruned. 


Trial 4 with params: {'learning_rate': 0.002311294500510415, 'weight_decay': 0.002, 'warmup_steps': 6}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.3528,1.407308,0.5228,0.576865,0.52367,0.493646
2,0.9994,1.008663,0.6558,0.68169,0.657148,0.643225
3,0.7787,0.749806,0.7491,0.763195,0.749192,0.747998
4,0.6159,0.661946,0.7716,0.795378,0.771904,0.77347


[I 2025-03-29 17:01:46,736] Trial 4 pruned. 


Trial 5 with params: {'learning_rate': 0.00011635338541918901, 'weight_decay': 0.003, 'warmup_steps': 17}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4403,0.168277,0.9468,0.94893,0.946834,0.947092
2,0.088,0.123281,0.9612,0.961262,0.961493,0.961186
3,0.0411,0.120905,0.9647,0.965272,0.9647,0.964859
4,0.0175,0.132797,0.9679,0.96815,0.968049,0.968094


[I 2025-03-29 17:12:29,708] Trial 5 pruned. 


Trial 6 with params: {'learning_rate': 0.0003654769917956456, 'weight_decay': 0.003, 'warmup_steps': 20}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.451,0.227819,0.9229,0.926144,0.923448,0.923253
2,0.1578,0.209714,0.9301,0.937129,0.930442,0.930244
3,0.0971,0.160412,0.9502,0.950742,0.950349,0.950313
4,0.0553,0.151367,0.9573,0.958074,0.957273,0.957539


[I 2025-03-29 17:23:13,998] Trial 6 pruned. 


Trial 7 with params: {'learning_rate': 9.505122659935192e-05, 'weight_decay': 0.003, 'warmup_steps': 12}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4495,0.168017,0.9442,0.947011,0.94415,0.94474
2,0.0877,0.130934,0.9565,0.957313,0.956724,0.956609
3,0.0382,0.115613,0.9653,0.965487,0.965489,0.965436
4,0.0146,0.132549,0.9662,0.966456,0.966369,0.966374


[I 2025-03-29 17:33:59,659] Trial 7 pruned. 


Trial 8 with params: {'learning_rate': 0.00040842279473800845, 'weight_decay': 0.008, 'warmup_steps': 6}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.425,0.386194,0.8755,0.890119,0.87655,0.87544
2,0.1675,0.201339,0.9328,0.934603,0.933255,0.931989


[I 2025-03-29 17:37:38,428] Trial 8 pruned. 


Trial 9 with params: {'learning_rate': 0.0005338741354740678, 'weight_decay': 0.006, 'warmup_steps': 1}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4618,0.286336,0.9062,0.910548,0.906179,0.906465
2,0.2085,0.270973,0.909,0.918543,0.90956,0.908993
3,0.1291,0.193552,0.9348,0.936477,0.934689,0.935144
4,0.0737,0.173231,0.9466,0.947142,0.946908,0.946483


[I 2025-03-29 17:43:12,242] Trial 9 pruned. 


Trial 10 with params: {'learning_rate': 6.533528818763353e-05, 'weight_decay': 0.01, 'warmup_steps': 13}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5037,0.158559,0.9494,0.950632,0.949494,0.94965
2,0.0869,0.132052,0.9552,0.956138,0.955405,0.955445
3,0.0366,0.127167,0.9628,0.963155,0.962956,0.96298
4,0.0141,0.141551,0.9658,0.966127,0.965927,0.965906


[I 2025-03-29 17:48:46,552] Trial 10 pruned. 


Trial 11 with params: {'learning_rate': 7.708968913466938e-05, 'weight_decay': 0.006, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5066,0.160335,0.9476,0.949904,0.947825,0.947975
2,0.0914,0.11888,0.961,0.961458,0.961057,0.961126
3,0.038,0.118747,0.9655,0.965752,0.965588,0.965642
4,0.0163,0.133683,0.9668,0.96698,0.96694,0.966915
5,0.0052,0.130086,0.9684,0.968645,0.968529,0.968522
6,0.0012,0.132783,0.9708,0.97098,0.970872,0.970903
7,0.0005,0.133448,0.9711,0.971246,0.971205,0.971205


[I 2025-03-29 17:58:34,457] Trial 11 finished with value: 0.9712048758910423 and parameters: {'learning_rate': 7.708968913466938e-05, 'weight_decay': 0.006, 'warmup_steps': 26}. Best is trial 11 with value: 0.9712048758910423.


Trial 12 with params: {'learning_rate': 5.217026363807214e-05, 'weight_decay': 0.004, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.6313,0.180836,0.9423,0.94368,0.942567,0.942575
2,0.1056,0.120743,0.9616,0.962072,0.961704,0.961793
3,0.0442,0.116826,0.966,0.966138,0.966118,0.966074
4,0.0143,0.138144,0.9651,0.96526,0.965298,0.9652


[I 2025-03-29 18:04:07,758] Trial 12 pruned. 


Trial 13 with params: {'learning_rate': 5.226430585490316e-05, 'weight_decay': 0.007, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.6229,0.162117,0.948,0.949867,0.948113,0.948362
2,0.0937,0.119216,0.9623,0.962655,0.962396,0.962444
3,0.04,0.11778,0.9635,0.963617,0.963599,0.963547
4,0.0159,0.135057,0.9646,0.964619,0.964764,0.964605
5,0.0045,0.143058,0.9663,0.966513,0.96637,0.966393
6,0.0023,0.143602,0.969,0.969161,0.9691,0.969122
7,0.001,0.144122,0.9685,0.96861,0.968627,0.968602


[I 2025-03-29 18:13:52,622] Trial 13 finished with value: 0.9686021568730794 and parameters: {'learning_rate': 5.226430585490316e-05, 'weight_decay': 0.007, 'warmup_steps': 26}. Best is trial 11 with value: 0.9712048758910423.


Trial 14 with params: {'learning_rate': 9.95605435141112e-05, 'weight_decay': 0.007, 'warmup_steps': 28}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.488,0.186999,0.9393,0.944452,0.939422,0.939996
2,0.0905,0.122951,0.9593,0.959491,0.959471,0.959415
3,0.0412,0.117226,0.9653,0.965525,0.965482,0.965473
4,0.0169,0.122421,0.9686,0.968717,0.968783,0.968669
5,0.006,0.123625,0.9709,0.971367,0.971008,0.971096
6,0.0016,0.126107,0.972,0.972076,0.972148,0.972095
7,0.0007,0.125646,0.9729,0.973052,0.973021,0.973024


[I 2025-03-29 18:23:34,906] Trial 14 finished with value: 0.9730240184701999 and parameters: {'learning_rate': 9.95605435141112e-05, 'weight_decay': 0.007, 'warmup_steps': 28}. Best is trial 14 with value: 0.9730240184701999.


Trial 15 with params: {'learning_rate': 0.0003662169232204062, 'weight_decay': 0.006, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4851,0.252533,0.9189,0.922387,0.919356,0.919473
2,0.1636,0.19222,0.9349,0.937273,0.935188,0.934952
3,0.0928,0.16212,0.9478,0.948848,0.948042,0.948096
4,0.054,0.162263,0.9534,0.953965,0.953588,0.953452


[I 2025-03-29 18:29:04,981] Trial 15 pruned. 


Trial 16 with params: {'learning_rate': 0.00038309918336020546, 'weight_decay': 0.007, 'warmup_steps': 19}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4508,0.286934,0.9057,0.912826,0.906158,0.906662
2,0.1627,0.206656,0.9301,0.931591,0.930497,0.929844


[I 2025-03-29 18:31:51,745] Trial 16 pruned. 


Trial 17 with params: {'learning_rate': 0.0020085822314002493, 'weight_decay': 0.008, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.1278,1.028901,0.6516,0.689256,0.652173,0.651455
2,0.7247,0.781852,0.7386,0.775961,0.738718,0.73893
3,0.5041,0.59359,0.7976,0.811637,0.798066,0.792535
4,0.3691,0.406194,0.8686,0.883081,0.868396,0.871252


[I 2025-03-29 18:37:22,966] Trial 17 pruned. 


Trial 18 with params: {'learning_rate': 0.0026868566033176914, 'weight_decay': 0.01, 'warmup_steps': 11}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8859,2.314781,0.1417,0.171584,0.140837,0.092388
2,2.1076,2.089234,0.2119,0.192986,0.21148,0.188983
3,2.0578,2.083014,0.2073,0.213258,0.205387,0.162042
4,2.0793,2.016677,0.2433,0.223367,0.241827,0.212747


[I 2025-03-29 18:42:54,846] Trial 18 pruned. 


Trial 19 with params: {'learning_rate': 0.00015627747538495373, 'weight_decay': 0.007, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4432,0.166431,0.9477,0.949691,0.94783,0.948009
2,0.0987,0.140477,0.9544,0.955254,0.954645,0.95455


[I 2025-03-29 18:45:41,106] Trial 19 pruned. 


Trial 20 with params: {'learning_rate': 7.639542885278315e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5332,0.152277,0.9525,0.953275,0.952709,0.952691
2,0.09,0.119022,0.9601,0.960693,0.960234,0.960347
3,0.0391,0.131622,0.9617,0.961712,0.961984,0.961659
4,0.0152,0.138279,0.966,0.966798,0.966095,0.966205
5,0.0047,0.137755,0.9692,0.969378,0.969317,0.969342
6,0.0014,0.142714,0.9705,0.97067,0.970633,0.970647
7,0.0005,0.142881,0.971,0.971137,0.971157,0.97114


[I 2025-03-29 18:55:22,884] Trial 20 finished with value: 0.9711397392439892 and parameters: {'learning_rate': 7.639542885278315e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 32}. Best is trial 14 with value: 0.9730240184701999.


Trial 21 with params: {'learning_rate': 6.804198974992601e-05, 'weight_decay': 0.008, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5522,0.184256,0.9417,0.94464,0.941812,0.94229
2,0.0954,0.120527,0.9606,0.960914,0.960722,0.960676
3,0.0387,0.115763,0.9671,0.967535,0.967227,0.967206
4,0.0143,0.133903,0.9678,0.967836,0.968005,0.967823
5,0.0051,0.141145,0.9706,0.971114,0.970654,0.970745
6,0.0017,0.137425,0.9697,0.969858,0.9698,0.969814
7,0.0007,0.13898,0.9709,0.970967,0.97103,0.970987


[I 2025-03-29 19:05:04,917] Trial 21 finished with value: 0.970986506707678 and parameters: {'learning_rate': 6.804198974992601e-05, 'weight_decay': 0.008, 'warmup_steps': 32}. Best is trial 14 with value: 0.9730240184701999.


Trial 22 with params: {'learning_rate': 0.00019913817180425286, 'weight_decay': 0.008, 'warmup_steps': 30}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4516,0.215778,0.9265,0.931213,0.926719,0.926925
2,0.1159,0.159373,0.9473,0.949858,0.947442,0.94775
3,0.061,0.120467,0.962,0.962484,0.962111,0.962203
4,0.0286,0.131587,0.9636,0.963788,0.963698,0.963716
5,0.0112,0.132403,0.968,0.968147,0.968156,0.968126
6,0.0036,0.129192,0.9714,0.971634,0.97151,0.971546
7,0.0008,0.12885,0.9733,0.973376,0.973434,0.973383


[I 2025-03-29 19:14:46,002] Trial 22 finished with value: 0.9733827228355143 and parameters: {'learning_rate': 0.00019913817180425286, 'weight_decay': 0.008, 'warmup_steps': 30}. Best is trial 22 with value: 0.9733827228355143.


Trial 23 with params: {'learning_rate': 9.496688021669307e-05, 'weight_decay': 0.005, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4814,0.195403,0.935,0.944111,0.93522,0.936601
2,0.0914,0.127305,0.9584,0.958982,0.958643,0.958548
3,0.0406,0.117421,0.964,0.964429,0.96413,0.964192
4,0.0173,0.136321,0.9668,0.967283,0.96696,0.966996


[I 2025-03-29 19:20:18,460] Trial 23 pruned. 


Trial 24 with params: {'learning_rate': 0.00011865097794262479, 'weight_decay': 0.006, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.476,0.160698,0.9466,0.948305,0.946541,0.946851
2,0.096,0.127472,0.9565,0.957121,0.956624,0.956618
3,0.0435,0.127944,0.9638,0.964849,0.96389,0.964029
4,0.0197,0.128134,0.9674,0.968073,0.96751,0.967614
5,0.006,0.135274,0.9698,0.970111,0.969916,0.969965
6,0.0018,0.128943,0.9732,0.973455,0.973298,0.973361
7,0.0007,0.130094,0.9731,0.973254,0.973226,0.973229


[I 2025-03-29 19:29:59,931] Trial 24 finished with value: 0.9732291600489491 and parameters: {'learning_rate': 0.00011865097794262479, 'weight_decay': 0.006, 'warmup_steps': 27}. Best is trial 22 with value: 0.9733827228355143.


Trial 25 with params: {'learning_rate': 0.00020601407276034348, 'weight_decay': 0.003, 'warmup_steps': 30}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4529,0.22015,0.9265,0.929263,0.926812,0.926729
2,0.1105,0.160777,0.9484,0.949656,0.948564,0.948669


[I 2025-03-29 19:32:46,753] Trial 25 pruned. 


Trial 26 with params: {'learning_rate': 0.00024009854177757173, 'weight_decay': 0.009000000000000001, 'warmup_steps': 28}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.443,0.239708,0.9211,0.924993,0.921206,0.921138
2,0.1258,0.146923,0.9495,0.950336,0.949771,0.949617
3,0.0655,0.127077,0.9607,0.961614,0.960873,0.960989
4,0.0338,0.134583,0.9646,0.965048,0.964578,0.964718
5,0.0153,0.142064,0.9652,0.96569,0.965229,0.965309
6,0.0035,0.12639,0.9717,0.971953,0.971812,0.97186
7,0.0012,0.125383,0.9721,0.972385,0.972205,0.972276


[I 2025-03-29 19:42:28,614] Trial 26 finished with value: 0.9722758907507199 and parameters: {'learning_rate': 0.00024009854177757173, 'weight_decay': 0.009000000000000001, 'warmup_steps': 28}. Best is trial 22 with value: 0.9733827228355143.


Trial 27 with params: {'learning_rate': 0.0002467077135460003, 'weight_decay': 0.006, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.462,0.229358,0.9229,0.930496,0.922725,0.924326
2,0.1227,0.148176,0.9497,0.95096,0.949809,0.949961


[I 2025-03-29 19:45:14,970] Trial 27 pruned. 


Trial 28 with params: {'learning_rate': 0.002953666986018182, 'weight_decay': 0.002, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9899,2.270544,0.1752,0.120661,0.173074,0.125013
2,2.1675,2.218947,0.1808,0.170645,0.18035,0.14429


[I 2025-03-29 19:48:01,025] Trial 28 pruned. 


Trial 29 with params: {'learning_rate': 0.00011735172641973649, 'weight_decay': 0.003, 'warmup_steps': 0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3922,0.171859,0.9448,0.948764,0.945024,0.945414
2,0.0866,0.110174,0.9631,0.963302,0.963305,0.963241
3,0.0408,0.123265,0.9633,0.96444,0.96343,0.963679
4,0.0177,0.122345,0.9682,0.968252,0.968392,0.968251
5,0.0056,0.141341,0.969,0.96939,0.969073,0.969177
6,0.0016,0.138268,0.9703,0.970583,0.970407,0.970469
7,0.0004,0.13675,0.9706,0.970681,0.970753,0.970708


[I 2025-03-29 19:57:41,137] Trial 29 finished with value: 0.970708143786813 and parameters: {'learning_rate': 0.00011735172641973649, 'weight_decay': 0.003, 'warmup_steps': 0}. Best is trial 22 with value: 0.9733827228355143.


Trial 30 with params: {'learning_rate': 0.00028100291767653175, 'weight_decay': 0.007, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4396,0.263432,0.911,0.917949,0.911041,0.911214
2,0.1379,0.180815,0.9405,0.942103,0.940826,0.940662
3,0.0752,0.15109,0.9527,0.954112,0.952643,0.953095
4,0.0418,0.146234,0.9598,0.959802,0.96,0.959856


[I 2025-03-29 20:03:12,372] Trial 30 pruned. 


Trial 31 with params: {'learning_rate': 0.0003282029820771861, 'weight_decay': 0.01, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4606,0.284063,0.9081,0.913485,0.908336,0.908256
2,0.1468,0.176599,0.9406,0.942442,0.940695,0.940876


[I 2025-03-29 20:05:58,996] Trial 31 pruned. 


Trial 32 with params: {'learning_rate': 0.0002644965932082481, 'weight_decay': 0.009000000000000001, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4661,0.252513,0.9178,0.922497,0.917815,0.918037
2,0.132,0.185334,0.9389,0.942004,0.939248,0.938791
3,0.0786,0.144427,0.9556,0.956549,0.955725,0.95587
4,0.0394,0.126335,0.9619,0.962133,0.962028,0.962043


[I 2025-03-29 20:11:29,842] Trial 32 pruned. 


Trial 33 with params: {'learning_rate': 0.0004211137487642013, 'weight_decay': 0.008, 'warmup_steps': 28}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4909,0.258089,0.9143,0.916936,0.914541,0.914683
2,0.1829,0.199585,0.9308,0.932218,0.930988,0.931096


[I 2025-03-29 20:14:15,801] Trial 33 pruned. 


Trial 34 with params: {'learning_rate': 0.00019066411536696978, 'weight_decay': 0.009000000000000001, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4455,0.268575,0.9118,0.922968,0.912035,0.913265
2,0.1131,0.1263,0.9586,0.958792,0.958775,0.958678
3,0.0582,0.131439,0.9586,0.959712,0.958708,0.958863
4,0.0255,0.138647,0.9623,0.962984,0.962367,0.962557
5,0.0132,0.131235,0.9678,0.968373,0.967896,0.968002
6,0.0028,0.131631,0.9719,0.972222,0.97198,0.972065
7,0.0007,0.130106,0.9712,0.971347,0.971334,0.971322


[I 2025-03-29 20:23:57,846] Trial 34 finished with value: 0.9713224748941587 and parameters: {'learning_rate': 0.00019066411536696978, 'weight_decay': 0.009000000000000001, 'warmup_steps': 26}. Best is trial 22 with value: 0.9733827228355143.


Trial 35 with params: {'learning_rate': 0.00017057009867124738, 'weight_decay': 0.007, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4481,0.228622,0.9261,0.933495,0.926568,0.926627
2,0.1031,0.146528,0.9518,0.952248,0.952089,0.95171
3,0.0542,0.136457,0.9589,0.959592,0.95911,0.959087
4,0.0268,0.137977,0.9611,0.961338,0.961399,0.961165


[I 2025-03-29 20:29:31,614] Trial 35 pruned. 


Trial 36 with params: {'learning_rate': 0.004049761177508626, 'weight_decay': 0.006, 'warmup_steps': 3}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.3119,2.435007,0.1599,0.053823,0.158469,0.061093
2,2.3176,2.315834,0.0905,0.033091,0.090894,0.042763
3,2.299,2.30991,0.1012,0.083057,0.10163,0.06812
4,2.3324,2.322517,0.0979,0.019637,0.098441,0.032539


[I 2025-03-29 20:35:03,961] Trial 36 pruned. 


Trial 37 with params: {'learning_rate': 9.286230673587775e-05, 'weight_decay': 0.01, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4787,0.195805,0.9343,0.941442,0.934547,0.934903
2,0.095,0.127605,0.959,0.95975,0.95915,0.959204
3,0.0416,0.13322,0.9615,0.963116,0.961612,0.961845
4,0.0167,0.13403,0.9668,0.967241,0.966971,0.966983


[I 2025-03-29 20:40:35,724] Trial 37 pruned. 


Trial 38 with params: {'learning_rate': 0.00018692749398230822, 'weight_decay': 0.007, 'warmup_steps': 31}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4533,0.222554,0.9253,0.93549,0.925648,0.926935
2,0.1098,0.140472,0.9535,0.954715,0.953717,0.95356
3,0.0503,0.125461,0.9636,0.963826,0.963718,0.963694
4,0.0258,0.136012,0.9661,0.96655,0.966363,0.966162


[I 2025-03-29 20:46:06,977] Trial 38 pruned. 


Trial 39 with params: {'learning_rate': 0.0010475348879951107, 'weight_decay': 0.009000000000000001, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.7077,0.494804,0.8371,0.848305,0.837999,0.836959
2,0.36,0.373679,0.8716,0.880331,0.871831,0.870768
3,0.2491,0.346505,0.8842,0.89485,0.883587,0.884876
4,0.156,0.219374,0.9299,0.930334,0.930147,0.930064


[I 2025-03-29 20:51:39,469] Trial 39 pruned. 


Trial 40 with params: {'learning_rate': 0.0003364737045777045, 'weight_decay': 0.01, 'warmup_steps': 18}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.437,0.288007,0.9028,0.909155,0.903288,0.902647
2,0.1536,0.189819,0.9357,0.937956,0.9359,0.935996


[I 2025-03-29 20:54:26,308] Trial 40 pruned. 


Trial 41 with params: {'learning_rate': 0.0001456647286080767, 'weight_decay': 0.009000000000000001, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4408,0.186801,0.9389,0.94408,0.939042,0.939524
2,0.0993,0.142152,0.9517,0.95274,0.952018,0.951638
3,0.0467,0.118045,0.9641,0.964396,0.964277,0.964232
4,0.0238,0.136592,0.9648,0.965133,0.964942,0.964923
5,0.0093,0.150167,0.9679,0.968492,0.968142,0.968083
6,0.002,0.129746,0.972,0.972164,0.972223,0.972152
7,0.0005,0.130666,0.9726,0.97269,0.972788,0.972727


[I 2025-03-29 21:04:11,230] Trial 41 finished with value: 0.9727269225527291 and parameters: {'learning_rate': 0.0001456647286080767, 'weight_decay': 0.009000000000000001, 'warmup_steps': 26}. Best is trial 22 with value: 0.9733827228355143.


Trial 42 with params: {'learning_rate': 0.0001818125580572801, 'weight_decay': 0.01, 'warmup_steps': 31}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4621,0.215622,0.9295,0.933378,0.929449,0.929614
2,0.1094,0.159993,0.9466,0.949075,0.946963,0.946631


[I 2025-03-29 21:06:58,337] Trial 42 pruned. 


Trial 43 with params: {'learning_rate': 0.00017882142807170676, 'weight_decay': 0.008, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4527,0.183903,0.9398,0.942148,0.940063,0.940083
2,0.1077,0.140467,0.9535,0.954359,0.953651,0.953654
3,0.0505,0.130678,0.9613,0.962081,0.961326,0.961575
4,0.0264,0.1246,0.9649,0.965076,0.965132,0.965038
5,0.0111,0.117063,0.9707,0.971177,0.970829,0.970923
6,0.0025,0.122644,0.9732,0.973555,0.973341,0.973361
7,0.0006,0.119076,0.9734,0.973597,0.973533,0.973544


[I 2025-03-29 21:16:40,809] Trial 43 finished with value: 0.9735436774258494 and parameters: {'learning_rate': 0.00017882142807170676, 'weight_decay': 0.008, 'warmup_steps': 23}. Best is trial 43 with value: 0.9735436774258494.


Trial 44 with params: {'learning_rate': 7.012112975444019e-05, 'weight_decay': 0.0, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5314,0.190051,0.9388,0.943309,0.938922,0.939424
2,0.0921,0.115931,0.9618,0.962005,0.961981,0.96185
3,0.0392,0.110456,0.9659,0.966239,0.966034,0.966107
4,0.0174,0.1205,0.9678,0.967948,0.967883,0.967902
5,0.0062,0.139446,0.968,0.968288,0.968169,0.968151
6,0.0017,0.135429,0.969,0.969226,0.969125,0.969148
7,0.0007,0.135128,0.9693,0.96941,0.969458,0.969428


[I 2025-03-29 21:26:28,760] Trial 44 finished with value: 0.9694279207563451 and parameters: {'learning_rate': 7.012112975444019e-05, 'weight_decay': 0.0, 'warmup_steps': 24}. Best is trial 43 with value: 0.9735436774258494.


Trial 45 with params: {'learning_rate': 0.00013563560676260026, 'weight_decay': 0.008, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4725,0.155023,0.951,0.951477,0.951083,0.95115
2,0.0956,0.1339,0.9568,0.95715,0.957034,0.956867
3,0.0471,0.136746,0.9599,0.961421,0.96002,0.960337
4,0.0237,0.126859,0.9663,0.966381,0.966458,0.966373
5,0.008,0.136413,0.9685,0.96865,0.968603,0.968615
6,0.0027,0.134947,0.97,0.970251,0.970129,0.97016
7,0.0007,0.133567,0.971,0.97112,0.971163,0.971127


[I 2025-03-29 21:36:17,079] Trial 45 finished with value: 0.9711269670216645 and parameters: {'learning_rate': 0.00013563560676260026, 'weight_decay': 0.008, 'warmup_steps': 25}. Best is trial 43 with value: 0.9735436774258494.


Trial 46 with params: {'learning_rate': 0.00010750244532497942, 'weight_decay': 0.007, 'warmup_steps': 19}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4602,0.168979,0.9438,0.945788,0.943889,0.943957
2,0.0895,0.123324,0.961,0.96112,0.961155,0.961055
3,0.0408,0.116944,0.9653,0.966106,0.965374,0.96556
4,0.0182,0.136171,0.9654,0.965523,0.965564,0.965494


[I 2025-03-29 21:41:49,589] Trial 46 pruned. 


Trial 47 with params: {'learning_rate': 0.00010887451629772067, 'weight_decay': 0.005, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.473,0.209213,0.9285,0.935446,0.928765,0.928972
2,0.0928,0.12634,0.959,0.959661,0.95913,0.959288
3,0.0444,0.11384,0.9662,0.966276,0.96638,0.966264
4,0.0175,0.121459,0.968,0.968048,0.968188,0.968096
5,0.0068,0.12541,0.9713,0.97141,0.971463,0.971386
6,0.0016,0.128498,0.9726,0.972798,0.972677,0.972704
7,0.0005,0.126973,0.9722,0.972323,0.972319,0.972316


[I 2025-03-29 21:51:59,083] Trial 47 finished with value: 0.9723158718719297 and parameters: {'learning_rate': 0.00010887451629772067, 'weight_decay': 0.005, 'warmup_steps': 27}. Best is trial 43 with value: 0.9735436774258494.


Trial 48 with params: {'learning_rate': 0.00012147190692302132, 'weight_decay': 0.007, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4696,0.194579,0.9335,0.939963,0.933958,0.934283
2,0.0991,0.116843,0.9625,0.963129,0.962608,0.962688
3,0.0462,0.125989,0.9619,0.963254,0.962109,0.962094
4,0.0186,0.123807,0.9694,0.969666,0.969547,0.969552
5,0.007,0.137423,0.968,0.968142,0.968143,0.968116
6,0.0018,0.142694,0.9709,0.971183,0.970996,0.971051
7,0.0005,0.137763,0.9703,0.970454,0.970454,0.970445


[I 2025-03-29 22:01:41,232] Trial 48 finished with value: 0.9704449399761612 and parameters: {'learning_rate': 0.00012147190692302132, 'weight_decay': 0.007, 'warmup_steps': 24}. Best is trial 43 with value: 0.9735436774258494.


Trial 49 with params: {'learning_rate': 0.00020338031147463888, 'weight_decay': 0.009000000000000001, 'warmup_steps': 19}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4252,0.193183,0.9378,0.941039,0.937924,0.938353
2,0.11,0.145897,0.9502,0.951443,0.950415,0.95037


[I 2025-03-29 22:04:26,329] Trial 49 pruned. 


Trial 50 with params: {'learning_rate': 0.0027800474932883233, 'weight_decay': 0.0, 'warmup_steps': 12}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8725,2.200063,0.1886,0.220976,0.187333,0.151398
2,2.0761,2.243609,0.1518,0.180142,0.151297,0.111972


[I 2025-03-29 22:07:12,889] Trial 50 pruned. 


Trial 51 with params: {'learning_rate': 0.0002326906365354164, 'weight_decay': 0.005, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4434,0.208238,0.9329,0.937489,0.932971,0.933748
2,0.1229,0.156702,0.9493,0.95135,0.94967,0.949365
3,0.0619,0.132097,0.9606,0.960954,0.960639,0.96072
4,0.0336,0.129805,0.9631,0.963476,0.963283,0.963316


[I 2025-03-29 22:12:44,343] Trial 51 pruned. 


Trial 52 with params: {'learning_rate': 6.1005881023266626e-05, 'weight_decay': 0.007, 'warmup_steps': 7}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5168,0.164689,0.9483,0.949997,0.948293,0.948674
2,0.0957,0.125215,0.9592,0.95967,0.959413,0.959368
3,0.037,0.128637,0.9629,0.962982,0.963079,0.962942
4,0.0157,0.139914,0.9651,0.965234,0.965256,0.96523


[I 2025-03-29 22:18:17,269] Trial 52 pruned. 


Trial 53 with params: {'learning_rate': 9.335977849844236e-05, 'weight_decay': 0.006, 'warmup_steps': 30}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5156,0.164909,0.9456,0.947876,0.945861,0.945872
2,0.0951,0.12318,0.9592,0.959896,0.95925,0.959379
3,0.0411,0.124694,0.9635,0.963677,0.963621,0.963585
4,0.0182,0.128578,0.9681,0.968284,0.968204,0.96816
5,0.0052,0.140075,0.9692,0.969279,0.969306,0.969253
6,0.0016,0.140276,0.9705,0.970659,0.970568,0.970605
7,0.0005,0.139486,0.9711,0.971168,0.971163,0.971161


[I 2025-03-29 22:28:00,649] Trial 53 finished with value: 0.971161361651063 and parameters: {'learning_rate': 9.335977849844236e-05, 'weight_decay': 0.006, 'warmup_steps': 30}. Best is trial 43 with value: 0.9735436774258494.


Trial 54 with params: {'learning_rate': 0.000403916017640712, 'weight_decay': 0.0, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4707,0.291617,0.9043,0.910795,0.904971,0.904341
2,0.1649,0.218964,0.9268,0.928785,0.927255,0.926721


[I 2025-03-29 22:30:46,989] Trial 54 pruned. 


Trial 55 with params: {'learning_rate': 0.0002606336830980987, 'weight_decay': 0.0, 'warmup_steps': 6}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3941,0.257073,0.9131,0.921715,0.913562,0.913187
2,0.1264,0.147241,0.9522,0.952736,0.952555,0.952348
3,0.0674,0.154349,0.9544,0.955422,0.954847,0.954531
4,0.0373,0.14077,0.9605,0.960801,0.960808,0.960574


[I 2025-03-29 22:36:18,400] Trial 55 pruned. 


Trial 56 with params: {'learning_rate': 6.358026237171493e-05, 'weight_decay': 0.005, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5657,0.175585,0.9429,0.945054,0.943064,0.943401
2,0.0977,0.122658,0.9618,0.962183,0.961959,0.961962
3,0.0427,0.119655,0.9662,0.966514,0.966362,0.966287
4,0.0148,0.130585,0.967,0.967152,0.967165,0.967099
5,0.0048,0.135921,0.9694,0.96959,0.969537,0.969544
6,0.0013,0.140654,0.969,0.969291,0.969089,0.969161
7,0.0007,0.14093,0.9688,0.968911,0.968939,0.96891


[I 2025-03-29 22:46:01,518] Trial 56 finished with value: 0.968909508253556 and parameters: {'learning_rate': 6.358026237171493e-05, 'weight_decay': 0.005, 'warmup_steps': 26}. Best is trial 43 with value: 0.9735436774258494.


Trial 57 with params: {'learning_rate': 0.00011887515276957258, 'weight_decay': 0.008, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4756,0.175363,0.9418,0.945091,0.942009,0.94229
2,0.0911,0.114684,0.9615,0.961794,0.96175,0.961617
3,0.0414,0.118665,0.9638,0.963768,0.964035,0.963815
4,0.0181,0.119333,0.9678,0.968153,0.967963,0.967923
5,0.0066,0.124018,0.971,0.971074,0.971121,0.971055
6,0.0014,0.128777,0.9718,0.971978,0.971863,0.971896
7,0.0004,0.12406,0.9739,0.973988,0.973991,0.973977


[I 2025-03-29 22:55:42,088] Trial 57 finished with value: 0.9739767587933696 and parameters: {'learning_rate': 0.00011887515276957258, 'weight_decay': 0.008, 'warmup_steps': 25}. Best is trial 57 with value: 0.9739767587933696.


Trial 58 with params: {'learning_rate': 7.081459585768469e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 22}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5166,0.157713,0.9484,0.949999,0.948519,0.948801
2,0.0896,0.135183,0.9543,0.955158,0.954508,0.954464
3,0.0387,0.119601,0.9634,0.963434,0.96362,0.96346
4,0.0142,0.14231,0.9674,0.967523,0.967569,0.967505
5,0.0045,0.136179,0.971,0.971363,0.971103,0.971162
6,0.0017,0.142174,0.9704,0.970628,0.970506,0.970554
7,0.0006,0.140919,0.9703,0.970452,0.970437,0.970436


[I 2025-03-29 23:05:22,242] Trial 58 finished with value: 0.970435554108661 and parameters: {'learning_rate': 7.081459585768469e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 22}. Best is trial 57 with value: 0.9739767587933696.


Trial 59 with params: {'learning_rate': 0.00012028135740743376, 'weight_decay': 0.008, 'warmup_steps': 29}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4666,0.172004,0.9422,0.944974,0.942428,0.942612
2,0.0918,0.123715,0.9594,0.95977,0.95956,0.959529
3,0.0458,0.114339,0.967,0.967412,0.967211,0.967137
4,0.0198,0.130145,0.966,0.966401,0.966112,0.966088


[I 2025-03-29 23:10:54,268] Trial 59 pruned. 


Trial 60 with params: {'learning_rate': 0.0011700191952905836, 'weight_decay': 0.003, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.7844,0.606262,0.7978,0.814055,0.798145,0.798925
2,0.4278,0.394319,0.8686,0.875057,0.868988,0.868295
3,0.2832,0.37096,0.8783,0.883766,0.878477,0.877122
4,0.1838,0.2482,0.9198,0.922083,0.919851,0.920471


[I 2025-03-29 23:16:26,335] Trial 60 pruned. 


Trial 61 with params: {'learning_rate': 0.00015444204635882978, 'weight_decay': 0.005, 'warmup_steps': 30}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4578,0.185739,0.9382,0.940062,0.938438,0.938308
2,0.1038,0.137587,0.9553,0.955526,0.955515,0.955384
3,0.0525,0.131201,0.9616,0.961605,0.961799,0.961626
4,0.0248,0.119548,0.9677,0.967785,0.967862,0.96781
5,0.0104,0.122705,0.9692,0.969304,0.969377,0.969314
6,0.0025,0.121753,0.9732,0.973276,0.973361,0.973302
7,0.0007,0.12095,0.9719,0.971907,0.972101,0.971982


[I 2025-03-29 23:26:23,149] Trial 61 finished with value: 0.9719819824160216 and parameters: {'learning_rate': 0.00015444204635882978, 'weight_decay': 0.005, 'warmup_steps': 30}. Best is trial 57 with value: 0.9739767587933696.


Trial 62 with params: {'learning_rate': 0.00012570938701673154, 'weight_decay': 0.007, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4502,0.174311,0.9423,0.94611,0.942293,0.942867
2,0.0914,0.152564,0.9512,0.953606,0.95141,0.951516
3,0.0424,0.127247,0.9629,0.963182,0.963154,0.963019
4,0.0182,0.119976,0.9679,0.96811,0.96805,0.968056
5,0.0072,0.125787,0.9707,0.970815,0.970829,0.970797
6,0.0019,0.123235,0.9733,0.973433,0.97343,0.973416
7,0.0005,0.122741,0.9732,0.973319,0.973316,0.973314


[I 2025-03-29 23:36:36,046] Trial 62 finished with value: 0.9733137782981413 and parameters: {'learning_rate': 0.00012570938701673154, 'weight_decay': 0.007, 'warmup_steps': 23}. Best is trial 57 with value: 0.9739767587933696.


Trial 63 with params: {'learning_rate': 8.738951618852924e-05, 'weight_decay': 0.007, 'warmup_steps': 22}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4901,0.1585,0.9469,0.949671,0.947015,0.947436
2,0.0919,0.134571,0.9554,0.956417,0.95564,0.955632
3,0.0406,0.126198,0.9628,0.963202,0.962866,0.962961
4,0.0161,0.131945,0.9655,0.96577,0.965672,0.965625


[I 2025-03-29 23:42:21,340] Trial 63 pruned. 


Trial 64 with params: {'learning_rate': 0.0014740970021661379, 'weight_decay': 0.005, 'warmup_steps': 13}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.8538,0.722874,0.7606,0.775777,0.761405,0.757537
2,0.5142,0.415468,0.8566,0.86794,0.856551,0.858756
3,0.3461,0.413751,0.8581,0.870598,0.858578,0.857625
4,0.2339,0.27061,0.9089,0.915556,0.908956,0.910306


[I 2025-03-29 23:47:55,783] Trial 64 pruned. 


Trial 65 with params: {'learning_rate': 0.0003061126129336506, 'weight_decay': 0.004, 'warmup_steps': 10}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4077,0.237446,0.923,0.927004,0.923444,0.923557
2,0.139,0.192619,0.9423,0.946188,0.942542,0.942608


[I 2025-03-29 23:50:43,302] Trial 65 pruned. 


Trial 66 with params: {'learning_rate': 0.00019251840253040213, 'weight_decay': 0.007, 'warmup_steps': 15}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4145,0.211831,0.9306,0.935297,0.930927,0.93115
2,0.1066,0.149509,0.9504,0.951337,0.95065,0.950608
3,0.0534,0.133214,0.9605,0.961084,0.960595,0.960703
4,0.0268,0.143382,0.962,0.962287,0.962153,0.962158


[I 2025-03-29 23:56:48,051] Trial 66 pruned. 


Trial 67 with params: {'learning_rate': 9.777098843358782e-05, 'weight_decay': 0.007, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.484,0.18063,0.9393,0.942906,0.939729,0.9396
2,0.0887,0.113693,0.9616,0.961775,0.961758,0.961648
3,0.0388,0.115217,0.9657,0.965841,0.965846,0.965788
4,0.0169,0.122903,0.9698,0.970278,0.969851,0.969986
5,0.0052,0.123374,0.9719,0.972085,0.972034,0.972037
6,0.0018,0.125322,0.9739,0.974106,0.974027,0.974044
7,0.0006,0.125152,0.9745,0.974653,0.974612,0.974629


[I 2025-03-30 00:06:37,997] Trial 67 finished with value: 0.9746290627725797 and parameters: {'learning_rate': 9.777098843358782e-05, 'weight_decay': 0.007, 'warmup_steps': 24}. Best is trial 67 with value: 0.9746290627725797.


Trial 68 with params: {'learning_rate': 0.00011106805870942286, 'weight_decay': 0.006, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4624,0.169977,0.9447,0.946935,0.944822,0.945205
2,0.0959,0.143071,0.9526,0.95406,0.952775,0.95289


[I 2025-03-30 00:09:24,500] Trial 68 pruned. 


Trial 69 with params: {'learning_rate': 0.00026379078208589916, 'weight_decay': 0.007, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4441,0.222593,0.9265,0.928663,0.927032,0.926341
2,0.1334,0.159377,0.9481,0.948861,0.948333,0.948255
3,0.0736,0.130405,0.9594,0.959929,0.959591,0.959609
4,0.0385,0.128816,0.9635,0.963646,0.963622,0.963592


[I 2025-03-30 00:14:59,569] Trial 69 pruned. 


Trial 70 with params: {'learning_rate': 5.1939313310282055e-05, 'weight_decay': 0.008, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5933,0.170069,0.9475,0.948246,0.947505,0.947741
2,0.1025,0.130518,0.9586,0.958863,0.958806,0.958685
3,0.044,0.131537,0.961,0.961754,0.961118,0.961203
4,0.0172,0.138142,0.9638,0.96396,0.963907,0.963899
5,0.0059,0.142225,0.9666,0.96677,0.966693,0.966708
6,0.0019,0.154488,0.9657,0.965878,0.965794,0.965821
7,0.0009,0.157517,0.9671,0.967139,0.967213,0.967152


[I 2025-03-30 00:24:43,523] Trial 70 finished with value: 0.9671521366228003 and parameters: {'learning_rate': 5.1939313310282055e-05, 'weight_decay': 0.008, 'warmup_steps': 25}. Best is trial 67 with value: 0.9746290627725797.


Trial 71 with params: {'learning_rate': 0.00014524706562936044, 'weight_decay': 0.008, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4472,0.193107,0.9371,0.943006,0.937234,0.937835
2,0.0985,0.129569,0.9569,0.957309,0.957132,0.957059
3,0.0469,0.120871,0.9645,0.96482,0.964678,0.964671
4,0.0214,0.127811,0.9673,0.967792,0.967473,0.967487
5,0.0078,0.141804,0.9683,0.968597,0.968436,0.968397
6,0.0017,0.126927,0.9728,0.972941,0.9729,0.972906
7,0.0006,0.128563,0.9722,0.972338,0.972329,0.972319


[I 2025-03-30 00:34:37,383] Trial 71 finished with value: 0.9723193002334269 and parameters: {'learning_rate': 0.00014524706562936044, 'weight_decay': 0.008, 'warmup_steps': 26}. Best is trial 67 with value: 0.9746290627725797.


Trial 72 with params: {'learning_rate': 5.507864621388507e-05, 'weight_decay': 0.006, 'warmup_steps': 17}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5677,0.165092,0.9479,0.948916,0.948123,0.947962
2,0.0989,0.121365,0.9602,0.960538,0.96033,0.960381
3,0.0392,0.126578,0.9638,0.964132,0.963942,0.963888
4,0.0141,0.133393,0.9665,0.966559,0.966659,0.966587
5,0.0055,0.139647,0.9685,0.968777,0.968597,0.968638
6,0.0018,0.142426,0.9692,0.969382,0.969321,0.969346
7,0.0009,0.143794,0.9702,0.970323,0.970353,0.970325


[I 2025-03-30 00:44:32,600] Trial 72 finished with value: 0.9703253748951186 and parameters: {'learning_rate': 5.507864621388507e-05, 'weight_decay': 0.006, 'warmup_steps': 17}. Best is trial 67 with value: 0.9746290627725797.


Trial 73 with params: {'learning_rate': 0.00016046280027725454, 'weight_decay': 0.008, 'warmup_steps': 20}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4351,0.21221,0.9289,0.934863,0.929134,0.929615
2,0.1011,0.142241,0.9507,0.952456,0.95095,0.950924


[I 2025-03-30 00:47:20,179] Trial 73 pruned. 


Trial 74 with params: {'learning_rate': 9.37652748553604e-05, 'weight_decay': 0.008, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4919,0.160033,0.9499,0.951871,0.950038,0.950202
2,0.0912,0.114528,0.9623,0.963014,0.962444,0.962569
3,0.0396,0.124348,0.9639,0.964135,0.964013,0.96405
4,0.0172,0.138689,0.9656,0.965844,0.965663,0.965732
5,0.0062,0.148131,0.9677,0.96789,0.967817,0.967806
6,0.0017,0.143545,0.9686,0.968725,0.968748,0.968729
7,0.0005,0.143692,0.9705,0.970645,0.970633,0.970633


[I 2025-03-30 00:57:09,320] Trial 74 finished with value: 0.9706326901919524 and parameters: {'learning_rate': 9.37652748553604e-05, 'weight_decay': 0.008, 'warmup_steps': 26}. Best is trial 67 with value: 0.9746290627725797.


Trial 75 with params: {'learning_rate': 0.0001248004164266306, 'weight_decay': 0.007, 'warmup_steps': 28}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4823,0.173416,0.9453,0.947302,0.945634,0.945508
2,0.1069,0.133538,0.9555,0.956817,0.955664,0.955781
3,0.0484,0.126762,0.9606,0.961098,0.960756,0.960817
4,0.0224,0.126711,0.9671,0.967185,0.967207,0.967142
5,0.0073,0.12235,0.9713,0.971565,0.971462,0.971426
6,0.0014,0.124575,0.9738,0.973994,0.973922,0.973939
7,0.0006,0.125869,0.9739,0.973992,0.974063,0.974013


[I 2025-03-30 01:06:56,130] Trial 75 finished with value: 0.9740127201248043 and parameters: {'learning_rate': 0.0001248004164266306, 'weight_decay': 0.007, 'warmup_steps': 28}. Best is trial 67 with value: 0.9746290627725797.


Trial 76 with params: {'learning_rate': 0.0001696458210351118, 'weight_decay': 0.007, 'warmup_steps': 29}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4505,0.165953,0.9476,0.949679,0.947445,0.947733
2,0.1046,0.149689,0.9511,0.952606,0.951242,0.951238
3,0.0535,0.137503,0.9571,0.957555,0.957189,0.957243
4,0.024,0.135038,0.9646,0.965266,0.964684,0.964856
5,0.0113,0.132872,0.97,0.970516,0.970166,0.970199
6,0.0027,0.125815,0.9705,0.970777,0.970603,0.970653
7,0.0006,0.121762,0.9708,0.97106,0.970909,0.970953


[I 2025-03-30 01:16:56,980] Trial 76 finished with value: 0.9709527966468403 and parameters: {'learning_rate': 0.0001696458210351118, 'weight_decay': 0.007, 'warmup_steps': 29}. Best is trial 67 with value: 0.9746290627725797.


Trial 77 with params: {'learning_rate': 0.00014335654906866193, 'weight_decay': 0.006, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4612,0.170037,0.9432,0.94672,0.94346,0.943869
2,0.0969,0.12812,0.9575,0.958406,0.957669,0.957737
3,0.0446,0.132042,0.9623,0.963247,0.962293,0.962631
4,0.0232,0.127355,0.9658,0.966171,0.965945,0.965984


[I 2025-03-30 01:22:30,368] Trial 77 pruned. 


Trial 78 with params: {'learning_rate': 0.00010454672389277825, 'weight_decay': 0.007, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4693,0.162285,0.9465,0.948813,0.946548,0.946716
2,0.088,0.126648,0.9581,0.958615,0.958342,0.958189
3,0.0412,0.12447,0.9624,0.962739,0.962549,0.962576
4,0.0178,0.119486,0.9692,0.969359,0.969421,0.969336
5,0.0066,0.126724,0.9709,0.971115,0.971057,0.971008
6,0.0016,0.123238,0.9731,0.9732,0.973232,0.973203
7,0.0006,0.122065,0.9735,0.973557,0.973628,0.973585


[I 2025-03-30 01:32:16,511] Trial 78 finished with value: 0.9735846964699236 and parameters: {'learning_rate': 0.00010454672389277825, 'weight_decay': 0.007, 'warmup_steps': 25}. Best is trial 67 with value: 0.9746290627725797.


Trial 79 with params: {'learning_rate': 0.0002079601235835947, 'weight_decay': 0.006, 'warmup_steps': 21}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.431,0.192682,0.9349,0.937382,0.934886,0.935424
2,0.1116,0.156186,0.9485,0.949829,0.948703,0.948384
3,0.0601,0.125112,0.9614,0.961804,0.961419,0.961531
4,0.03,0.130207,0.9635,0.9639,0.96367,0.963682


[I 2025-03-30 01:37:51,349] Trial 79 pruned. 


Trial 80 with params: {'learning_rate': 8.27169910109526e-05, 'weight_decay': 0.007, 'warmup_steps': 21}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4957,0.165876,0.9455,0.947117,0.945629,0.945811
2,0.0895,0.1286,0.9596,0.960093,0.959769,0.959731
3,0.0365,0.119121,0.9645,0.964592,0.964664,0.964571
4,0.0154,0.132444,0.9656,0.965782,0.965809,0.965715


[I 2025-03-30 01:43:23,766] Trial 80 pruned. 


Trial 81 with params: {'learning_rate': 8.194846030220038e-05, 'weight_decay': 0.007, 'warmup_steps': 30}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5238,0.172659,0.9432,0.945275,0.943411,0.943322
2,0.0901,0.124284,0.9596,0.960043,0.95988,0.959702
3,0.0387,0.121594,0.9641,0.964388,0.964264,0.964217
4,0.0157,0.130117,0.9674,0.967553,0.967521,0.967515
5,0.0051,0.139408,0.9682,0.968445,0.968262,0.968287
6,0.0017,0.139534,0.9707,0.970852,0.970789,0.970795
7,0.0007,0.138082,0.9705,0.970617,0.970608,0.970588


[I 2025-03-30 01:53:29,830] Trial 81 finished with value: 0.970587624728708 and parameters: {'learning_rate': 8.194846030220038e-05, 'weight_decay': 0.007, 'warmup_steps': 30}. Best is trial 67 with value: 0.9746290627725797.


Trial 82 with params: {'learning_rate': 0.0001986188637046071, 'weight_decay': 0.007, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4427,0.210421,0.9298,0.934271,0.929868,0.930368
2,0.1071,0.151278,0.9488,0.951315,0.949032,0.949219


[I 2025-03-30 01:56:17,738] Trial 82 pruned. 


Trial 83 with params: {'learning_rate': 5.380559807793641e-05, 'weight_decay': 0.007, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5949,0.183758,0.9409,0.943337,0.941075,0.941179
2,0.1051,0.122399,0.9587,0.959288,0.958904,0.958895
3,0.0426,0.123192,0.9631,0.963355,0.963218,0.963267
4,0.015,0.133967,0.9652,0.965336,0.965435,0.965267
5,0.0056,0.135403,0.9686,0.968714,0.96877,0.968685
6,0.0018,0.13866,0.9706,0.970894,0.970696,0.970768
7,0.0006,0.138683,0.9699,0.970049,0.97004,0.970034


[I 2025-03-30 03:06:05,544] Trial 83 finished with value: 0.9700344276167805 and parameters: {'learning_rate': 5.380559807793641e-05, 'weight_decay': 0.007, 'warmup_steps': 26}. Best is trial 67 with value: 0.9746290627725797.


Trial 84 with params: {'learning_rate': 0.00017677589724360998, 'weight_decay': 0.007, 'warmup_steps': 29}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4453,0.203578,0.9333,0.939881,0.93328,0.934452
2,0.1053,0.155676,0.9494,0.949931,0.949671,0.949516


[I 2025-03-30 03:08:53,037] Trial 84 pruned. 


Trial 85 with params: {'learning_rate': 0.00011839895364819732, 'weight_decay': 0.008, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4669,0.182481,0.9414,0.944823,0.941621,0.94197
2,0.0894,0.135924,0.9571,0.957336,0.957382,0.957001
3,0.0414,0.114988,0.9656,0.965838,0.965772,0.96578
4,0.0187,0.122173,0.9687,0.968832,0.96886,0.968821
5,0.0065,0.12917,0.9701,0.970231,0.970276,0.970194
6,0.0018,0.126243,0.9709,0.971052,0.971007,0.971012
7,0.0007,0.124432,0.972,0.972103,0.972158,0.972116


[I 2025-03-30 03:18:43,788] Trial 85 finished with value: 0.9721158721004268 and parameters: {'learning_rate': 0.00011839895364819732, 'weight_decay': 0.008, 'warmup_steps': 25}. Best is trial 67 with value: 0.9746290627725797.


Trial 86 with params: {'learning_rate': 0.0002597113179487162, 'weight_decay': 0.01, 'warmup_steps': 6}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3861,0.255988,0.9183,0.924961,0.918852,0.9187
2,0.1281,0.153887,0.9459,0.948216,0.946154,0.946101


[I 2025-03-30 03:21:30,384] Trial 86 pruned. 


Trial 87 with params: {'learning_rate': 8.810867924200206e-05, 'weight_decay': 0.006, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5046,0.158865,0.9471,0.948808,0.947287,0.947466
2,0.0894,0.115137,0.9633,0.963811,0.963436,0.963496
3,0.0399,0.124236,0.9641,0.965033,0.964235,0.96428
4,0.0165,0.119988,0.9697,0.969858,0.969838,0.969794
5,0.0052,0.13318,0.9706,0.97087,0.970734,0.970732
6,0.0014,0.130003,0.9732,0.973439,0.973266,0.973339
7,0.0005,0.132448,0.973,0.973143,0.973104,0.973117


[I 2025-03-30 03:31:12,459] Trial 87 finished with value: 0.9731167875703608 and parameters: {'learning_rate': 8.810867924200206e-05, 'weight_decay': 0.006, 'warmup_steps': 26}. Best is trial 67 with value: 0.9746290627725797.


Trial 88 with params: {'learning_rate': 0.00010532808384570563, 'weight_decay': 0.003, 'warmup_steps': 18}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4537,0.161396,0.9478,0.950757,0.947878,0.948341
2,0.0904,0.116728,0.9613,0.961792,0.961423,0.961526
3,0.0394,0.117158,0.9653,0.965646,0.965448,0.965382
4,0.0171,0.12592,0.9682,0.968345,0.968421,0.968288
5,0.0066,0.128168,0.9685,0.968847,0.96859,0.968684
6,0.0018,0.126169,0.9716,0.971886,0.971687,0.971749
7,0.0007,0.124042,0.9724,0.972574,0.97252,0.972522


[I 2025-03-30 03:41:24,416] Trial 88 finished with value: 0.9725216660618647 and parameters: {'learning_rate': 0.00010532808384570563, 'weight_decay': 0.003, 'warmup_steps': 18}. Best is trial 67 with value: 0.9746290627725797.


Trial 89 with params: {'learning_rate': 9.02154822352379e-05, 'weight_decay': 0.006, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4959,0.15244,0.9497,0.951645,0.949748,0.950114
2,0.094,0.120436,0.9608,0.961247,0.960891,0.960993
3,0.0432,0.123506,0.9659,0.966203,0.966028,0.966023
4,0.0182,0.129197,0.9671,0.967468,0.96727,0.967214
5,0.0062,0.136298,0.9679,0.968166,0.968022,0.968022
6,0.0014,0.130983,0.9715,0.97169,0.971674,0.971654
7,0.0005,0.128676,0.9709,0.971079,0.971025,0.971044


[I 2025-03-30 03:51:06,458] Trial 89 finished with value: 0.9710441023231715 and parameters: {'learning_rate': 9.02154822352379e-05, 'weight_decay': 0.006, 'warmup_steps': 24}. Best is trial 67 with value: 0.9746290627725797.


Trial 90 with params: {'learning_rate': 0.0005558154008655438, 'weight_decay': 0.006, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5202,0.31641,0.8956,0.908271,0.895688,0.896819
2,0.2148,0.236014,0.9192,0.921985,0.919682,0.918001


[I 2025-03-30 03:53:52,329] Trial 90 pruned. 


Trial 91 with params: {'learning_rate': 0.00010109337133292047, 'weight_decay': 0.006, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4693,0.181072,0.9379,0.94362,0.937973,0.938952
2,0.0882,0.136194,0.9568,0.957225,0.95706,0.956862
3,0.04,0.116333,0.964,0.96433,0.964152,0.964192
4,0.0155,0.126719,0.9678,0.967956,0.967943,0.967907
5,0.0058,0.133262,0.9694,0.969698,0.969508,0.969534
6,0.0016,0.129357,0.9714,0.971797,0.971458,0.971572
7,0.0005,0.128628,0.9728,0.972969,0.972927,0.972935


[I 2025-03-30 04:03:36,367] Trial 91 finished with value: 0.9729347451315938 and parameters: {'learning_rate': 0.00010109337133292047, 'weight_decay': 0.006, 'warmup_steps': 26}. Best is trial 67 with value: 0.9746290627725797.


Trial 92 with params: {'learning_rate': 6.965676100182774e-05, 'weight_decay': 0.006, 'warmup_steps': 29}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5394,0.155437,0.95,0.951823,0.950035,0.950433
2,0.0931,0.119745,0.9615,0.96188,0.961575,0.961651
3,0.0422,0.113075,0.9665,0.966632,0.966612,0.966589
4,0.0163,0.131795,0.9672,0.967514,0.967297,0.967366
5,0.0063,0.141808,0.9698,0.969959,0.969887,0.969885
6,0.0024,0.140139,0.9705,0.970658,0.970597,0.970617
7,0.0008,0.142113,0.9697,0.969754,0.969819,0.969762


[I 2025-03-30 04:13:21,778] Trial 92 finished with value: 0.9697620985698204 and parameters: {'learning_rate': 6.965676100182774e-05, 'weight_decay': 0.006, 'warmup_steps': 29}. Best is trial 67 with value: 0.9746290627725797.


Trial 93 with params: {'learning_rate': 0.00014386945094024955, 'weight_decay': 0.007, 'warmup_steps': 22}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4384,0.196651,0.9347,0.939447,0.934881,0.935347
2,0.0996,0.143961,0.9529,0.954151,0.952991,0.953088
3,0.0456,0.120739,0.9648,0.964979,0.96502,0.964954
4,0.0212,0.137572,0.9647,0.965468,0.964909,0.964779


[I 2025-03-30 04:18:54,750] Trial 93 pruned. 


Trial 94 with params: {'learning_rate': 0.0002009904356943865, 'weight_decay': 0.008, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4635,0.192116,0.9378,0.940142,0.937821,0.938123
2,0.1112,0.168058,0.9442,0.945296,0.944647,0.943987


[I 2025-03-30 04:21:51,830] Trial 94 pruned. 


Trial 95 with params: {'learning_rate': 8.952659244058166e-05, 'weight_decay': 0.007, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4988,0.15468,0.9486,0.951116,0.948765,0.948983
2,0.0895,0.131008,0.9569,0.957209,0.957039,0.95691
3,0.0393,0.126802,0.9639,0.964696,0.96415,0.964102
4,0.0163,0.139509,0.9656,0.966161,0.965605,0.965766
5,0.0062,0.136443,0.9698,0.969988,0.96987,0.969883
6,0.0018,0.131995,0.9729,0.973196,0.97297,0.973059
7,0.0005,0.135058,0.972,0.972105,0.972107,0.972097


[I 2025-03-30 04:31:45,530] Trial 95 finished with value: 0.9720970570488252 and parameters: {'learning_rate': 8.952659244058166e-05, 'weight_decay': 0.007, 'warmup_steps': 27}. Best is trial 67 with value: 0.9746290627725797.


Trial 96 with params: {'learning_rate': 0.0001758223185828369, 'weight_decay': 0.009000000000000001, 'warmup_steps': 22}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.436,0.195483,0.9345,0.936979,0.934772,0.934788
2,0.1045,0.1239,0.9565,0.957509,0.956571,0.956833
3,0.0535,0.116707,0.9637,0.963962,0.963817,0.963854
4,0.0284,0.13338,0.9628,0.963219,0.962982,0.962993


[I 2025-03-30 04:37:18,639] Trial 96 pruned. 


Trial 97 with params: {'learning_rate': 0.00014435651061544283, 'weight_decay': 0.004, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4547,0.151858,0.9505,0.9511,0.950742,0.950577
2,0.096,0.143365,0.9533,0.954292,0.953537,0.953421
3,0.0473,0.137357,0.9564,0.956996,0.956528,0.95659
4,0.0208,0.134886,0.9654,0.965703,0.965533,0.965535
5,0.0085,0.124345,0.9698,0.969993,0.969932,0.969928
6,0.0016,0.129113,0.972,0.972136,0.972115,0.972087
7,0.0004,0.125463,0.9722,0.972279,0.972338,0.972292


[I 2025-03-30 04:47:13,028] Trial 97 finished with value: 0.9722917827979171 and parameters: {'learning_rate': 0.00014435651061544283, 'weight_decay': 0.004, 'warmup_steps': 27}. Best is trial 67 with value: 0.9746290627725797.


Trial 98 with params: {'learning_rate': 0.0035054904723296637, 'weight_decay': 0.009000000000000001, 'warmup_steps': 0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.4411,2.315875,0.1016,0.01016,0.1,0.018446
2,2.3358,2.307172,0.0977,0.00977,0.1,0.017801
3,2.3065,2.307979,0.0997,0.00997,0.1,0.018132
4,2.306,2.303693,0.1022,0.01022,0.1,0.018545


[I 2025-03-30 04:52:45,391] Trial 98 pruned. 
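
The Trial 98 metrics above (recall of exactly 0.1, precision equal to one tenth of accuracy) are consistent with a model that has collapsed to predicting a single class for every validation image. The sketch below reproduces the logged values under two assumptions that this log does not confirm: that precision, recall and F1 are macro-averaged over the 10 classes, and that precision for a class that is never predicted is counted as 0. The function name `collapsed_macro_metrics` is hypothetical.

```python
def collapsed_macro_metrics(accuracy, num_classes=10):
    """Macro precision/recall/F1 when every sample gets the same predicted class.

    Assumption: `accuracy` equals the share of validation samples that belong
    to the single predicted class, and never-predicted classes score 0.
    """
    precision_hit = accuracy   # all predictions go to one class
    recall_hit = 1.0           # every sample of that class is recovered
    f1_hit = 2 * precision_hit * recall_hit / (precision_hit + recall_hit)
    # The remaining classes contribute 0 to each macro average.
    return (precision_hit / num_classes,
            recall_hit / num_classes,
            f1_hit / num_classes)


# Accuracy values logged for epochs 1-4 of Trial 98.
trial_98_accuracy = [0.1016, 0.0977, 0.0997, 0.1022]
trial_98_metrics = [collapsed_macro_metrics(a) for a in trial_98_accuracy]

for epoch, (p, r, f1) in enumerate(trial_98_metrics, start=1):
    print(f"epoch {epoch}: precision={p:.5f} recall={r:.1f} f1={f1:.6f}")
```

The computed values agree with the logged Precision, Recall and F1 columns to the printed precision, which suggests that a learning rate of about 3.5e-3 with no warmup made training diverge rather than merely converge slowly.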


Trial 99 with params: {'learning_rate': 8.090843589470582e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 30}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5192,0.191693,0.9357,0.942236,0.935953,0.936642
2,0.0939,0.125641,0.9586,0.959039,0.958773,0.958664
3,0.0412,0.112248,0.9659,0.966181,0.965993,0.966046
4,0.016,0.12339,0.9672,0.967253,0.967396,0.967268
5,0.005,0.136233,0.9691,0.969222,0.969218,0.96919
6,0.002,0.133604,0.9716,0.971717,0.971696,0.971701
7,0.0007,0.134227,0.9716,0.971691,0.971725,0.971703


[I 2025-03-30 05:02:28,186] Trial 99 finished with value: 0.9717032287019307 and parameters: {'learning_rate': 8.090843589470582e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 30}. Best is trial 67 with value: 0.9746290627725797.


Trial 100 with params: {'learning_rate': 0.00013333557567672514, 'weight_decay': 0.006, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4703,0.184463,0.938,0.942025,0.938115,0.938742
2,0.0915,0.127593,0.9566,0.956952,0.956817,0.956721
3,0.0449,0.117571,0.9639,0.964046,0.964087,0.96404
4,0.0208,0.133898,0.9651,0.965296,0.965317,0.965217
5,0.0083,0.135006,0.9689,0.969183,0.969024,0.969036
6,0.0023,0.127831,0.9712,0.971395,0.971355,0.971371
7,0.0006,0.130229,0.972,0.972134,0.972161,0.972144


[I 2025-03-30 05:12:10,361] Trial 100 finished with value: 0.9721438763030031 and parameters: {'learning_rate': 0.00013333557567672514, 'weight_decay': 0.006, 'warmup_steps': 32}. Best is trial 67 with value: 0.9746290627725797.


Trial 101 with params: {'learning_rate': 8.457343840273425e-05, 'weight_decay': 0.005, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5227,0.175287,0.9418,0.944126,0.942023,0.942248
2,0.093,0.130851,0.9575,0.958486,0.957633,0.957711
3,0.0361,0.122858,0.9643,0.964424,0.964432,0.964398
4,0.0157,0.130294,0.9671,0.967277,0.967275,0.967196
5,0.0049,0.138564,0.9689,0.968933,0.969025,0.968957
6,0.0018,0.137228,0.9721,0.972346,0.972232,0.972235
7,0.0006,0.134674,0.9714,0.971514,0.971532,0.971508


[I 2025-03-30 05:21:52,790] Trial 101 finished with value: 0.971507647241191 and parameters: {'learning_rate': 8.457343840273425e-05, 'weight_decay': 0.005, 'warmup_steps': 23}. Best is trial 67 with value: 0.9746290627725797.


Trial 102 with params: {'learning_rate': 0.00013450126417323204, 'weight_decay': 0.007, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4523,0.166758,0.9471,0.949957,0.947201,0.947506
2,0.0951,0.142674,0.954,0.955368,0.954203,0.954179
3,0.0471,0.134535,0.9598,0.960467,0.959977,0.960046
4,0.0213,0.134079,0.966,0.966386,0.966117,0.966203
5,0.0074,0.132766,0.9696,0.96985,0.969751,0.969736
6,0.002,0.128227,0.9712,0.971467,0.971268,0.971343
7,0.0007,0.125296,0.9733,0.973404,0.973452,0.973418


[I 2025-03-30 05:31:38,637] Trial 102 finished with value: 0.9734180083133032 and parameters: {'learning_rate': 0.00013450126417323204, 'weight_decay': 0.007, 'warmup_steps': 25}. Best is trial 67 with value: 0.9746290627725797.


Trial 103 with params: {'learning_rate': 0.0001191596467352941, 'weight_decay': 0.007, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4528,0.188721,0.9365,0.941793,0.936635,0.937069
2,0.096,0.129955,0.9563,0.956817,0.956492,0.956451
3,0.0438,0.121136,0.9641,0.964253,0.964271,0.964136
4,0.0187,0.131935,0.9662,0.9664,0.966365,0.966346


[I 2025-03-30 05:37:13,158] Trial 103 pruned. 


Trial 104 with params: {'learning_rate': 0.00020340932723015692, 'weight_decay': 0.008, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4434,0.182599,0.9404,0.943583,0.940435,0.940956
2,0.1114,0.165758,0.9458,0.946534,0.946161,0.945917
3,0.0583,0.128684,0.96,0.960493,0.960175,0.960226
4,0.0301,0.132907,0.9627,0.962767,0.962881,0.962718


[I 2025-03-30 05:42:45,754] Trial 104 pruned. 


Trial 105 with params: {'learning_rate': 0.00013280021760138654, 'weight_decay': 0.007, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4455,0.166273,0.9454,0.947371,0.945513,0.945864
2,0.0969,0.113096,0.9611,0.961483,0.961291,0.961246
3,0.0477,0.120004,0.9631,0.96352,0.963244,0.963337
4,0.019,0.127062,0.9678,0.96796,0.967998,0.967915
5,0.0083,0.129998,0.969,0.969066,0.969209,0.969096
6,0.0025,0.125647,0.9708,0.970977,0.970969,0.970964
7,0.0005,0.123952,0.9726,0.972624,0.972792,0.972698


[I 2025-03-30 05:52:32,530] Trial 105 finished with value: 0.9726977797037911 and parameters: {'learning_rate': 0.00013280021760138654, 'weight_decay': 0.007, 'warmup_steps': 23}. Best is trial 67 with value: 0.9746290627725797.


Trial 106 with params: {'learning_rate': 0.0001829983417738769, 'weight_decay': 0.006, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4454,0.204333,0.9316,0.936346,0.931646,0.932153
2,0.1066,0.147652,0.9505,0.951349,0.950818,0.950712
3,0.0564,0.12977,0.9617,0.962244,0.961869,0.961952
4,0.0287,0.135011,0.9639,0.964518,0.964003,0.964168


[I 2025-03-30 05:58:06,958] Trial 106 pruned. 


Trial 107 with params: {'learning_rate': 0.00013656273349393398, 'weight_decay': 0.008, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4643,0.184143,0.9378,0.940551,0.937881,0.938058
2,0.1018,0.122665,0.9598,0.960125,0.960015,0.959929
3,0.046,0.118513,0.964,0.964471,0.964185,0.964242
4,0.0202,0.126552,0.9667,0.966969,0.966898,0.96685


[I 2025-03-30 06:03:39,936] Trial 107 pruned. 


Trial 108 with params: {'learning_rate': 0.00035245706866971816, 'weight_decay': 0.008, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4611,0.260278,0.9093,0.919522,0.909054,0.910429
2,0.161,0.213905,0.9273,0.929813,0.927883,0.927233


[I 2025-03-30 06:06:26,330] Trial 108 pruned. 


Trial 109 with params: {'learning_rate': 8.967758113070001e-05, 'weight_decay': 0.007, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4971,0.164796,0.9488,0.950274,0.948963,0.949021
2,0.0919,0.139902,0.952,0.952738,0.952196,0.952106
3,0.0401,0.12399,0.9634,0.963607,0.963576,0.963521
4,0.0163,0.123507,0.9689,0.969029,0.969037,0.969018
5,0.0044,0.135821,0.9691,0.969304,0.969253,0.969241
6,0.0015,0.132404,0.971,0.971226,0.971109,0.97116
7,0.0005,0.132394,0.973,0.973215,0.973103,0.973147


[I 2025-03-30 06:16:16,338] Trial 109 finished with value: 0.9731471393902427 and parameters: {'learning_rate': 8.967758113070001e-05, 'weight_decay': 0.007, 'warmup_steps': 25}. Best is trial 67 with value: 0.9746290627725797.


Trial 110 with params: {'learning_rate': 8.151680199857094e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 18}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4999,0.170178,0.9435,0.945741,0.943742,0.943889
2,0.0946,0.122508,0.9607,0.960957,0.9609,0.96081
3,0.0383,0.11892,0.9661,0.966055,0.966297,0.966145
4,0.0154,0.138279,0.9642,0.964755,0.964334,0.96432


[I 2025-03-30 06:21:53,230] Trial 110 pruned. 


Trial 111 with params: {'learning_rate': 5.320320856550781e-05, 'weight_decay': 0.007, 'warmup_steps': 22}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5919,0.167232,0.9477,0.949095,0.947801,0.94804
2,0.0976,0.132922,0.9569,0.957303,0.957018,0.957024
3,0.0422,0.12692,0.9625,0.96262,0.962637,0.962568
4,0.016,0.136806,0.9648,0.964908,0.964941,0.96491


[I 2025-03-30 06:27:27,605] Trial 111 pruned. 


Trial 112 with params: {'learning_rate': 9.712315393149582e-05, 'weight_decay': 0.007, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4901,0.167578,0.9442,0.9468,0.944355,0.944523
2,0.0901,0.116376,0.9617,0.961936,0.961906,0.961842
3,0.0403,0.126801,0.9636,0.964275,0.963725,0.963785
4,0.0184,0.131314,0.9679,0.968178,0.968073,0.968014
5,0.006,0.130232,0.9712,0.971257,0.971352,0.97128
6,0.0016,0.133706,0.9722,0.972338,0.972348,0.972333
7,0.0007,0.133325,0.9724,0.972443,0.972554,0.972489


[I 2025-03-30 06:37:14,385] Trial 112 finished with value: 0.9724886290579775 and parameters: {'learning_rate': 9.712315393149582e-05, 'weight_decay': 0.007, 'warmup_steps': 27}. Best is trial 67 with value: 0.9746290627725797.


Trial 113 with params: {'learning_rate': 9.531486307332714e-05, 'weight_decay': 0.007, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4951,0.175469,0.9426,0.946678,0.942685,0.943326
2,0.0917,0.116126,0.9613,0.961878,0.961368,0.961519
3,0.0382,0.123098,0.9642,0.964553,0.964336,0.964319
4,0.0169,0.132723,0.9683,0.968395,0.968462,0.968397
5,0.0057,0.132848,0.9706,0.971017,0.97068,0.970751
6,0.0014,0.129039,0.9726,0.972825,0.972685,0.972728
7,0.0005,0.128906,0.9733,0.973419,0.973396,0.973392


[I 2025-03-30 06:46:59,315] Trial 113 finished with value: 0.9733916772087708 and parameters: {'learning_rate': 9.531486307332714e-05, 'weight_decay': 0.007, 'warmup_steps': 27}. Best is trial 67 with value: 0.9746290627725797.


Trial 114 with params: {'learning_rate': 0.00011092880575613935, 'weight_decay': 0.007, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4812,0.176779,0.9418,0.946343,0.941884,0.942691
2,0.0918,0.133048,0.9555,0.955985,0.95573,0.955618
3,0.0401,0.120944,0.9654,0.965693,0.965508,0.965544
4,0.0193,0.129736,0.9659,0.966233,0.966026,0.966089
5,0.0067,0.137211,0.9701,0.970667,0.970268,0.97027
6,0.0018,0.134849,0.9719,0.97203,0.97199,0.972003
7,0.0005,0.135013,0.9716,0.971683,0.971742,0.971706


[I 2025-03-30 06:56:44,490] Trial 114 finished with value: 0.9717064882341813 and parameters: {'learning_rate': 0.00011092880575613935, 'weight_decay': 0.007, 'warmup_steps': 25}. Best is trial 67 with value: 0.9746290627725797.


Trial 115 with params: {'learning_rate': 0.0001745698577191295, 'weight_decay': 0.005, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4495,0.19754,0.9357,0.938077,0.935965,0.935656
2,0.1064,0.166857,0.9457,0.94716,0.945957,0.945719


[I 2025-03-30 06:59:30,539] Trial 115 pruned. 


Trial 116 with params: {'learning_rate': 5.2717703637833475e-05, 'weight_decay': 0.006, 'warmup_steps': 30}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.6035,0.16351,0.9484,0.950402,0.948422,0.948996
2,0.1055,0.123079,0.9616,0.96219,0.961754,0.961792
3,0.0446,0.125655,0.9638,0.964023,0.963881,0.9639
4,0.0182,0.133344,0.9668,0.966967,0.966911,0.96691
5,0.0065,0.145029,0.9676,0.967876,0.967676,0.967725
6,0.0025,0.152239,0.9671,0.967444,0.967159,0.967265
7,0.0012,0.154766,0.9677,0.967872,0.967802,0.967823


[I 2025-03-30 07:10:04,584] Trial 116 finished with value: 0.9678226720898128 and parameters: {'learning_rate': 5.2717703637833475e-05, 'weight_decay': 0.006, 'warmup_steps': 30}. Best is trial 67 with value: 0.9746290627725797.


Trial 117 with params: {'learning_rate': 0.0027121193476131807, 'weight_decay': 0.009000000000000001, 'warmup_steps': 18}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.499,1.830881,0.3853,0.411548,0.386306,0.358818
2,1.32,1.377711,0.5035,0.547051,0.504523,0.478944


[I 2025-03-30 07:12:51,402] Trial 117 pruned. 


Trial 118 with params: {'learning_rate': 0.00010188457962913947, 'weight_decay': 0.008, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4955,0.175822,0.9405,0.945125,0.94065,0.941479
2,0.0933,0.129425,0.9588,0.95942,0.958979,0.958954
3,0.0428,0.126026,0.9642,0.964344,0.964223,0.964208
4,0.0184,0.130308,0.9655,0.96589,0.965592,0.965662


[I 2025-03-30 07:18:28,106] Trial 118 pruned. 


Trial 119 with params: {'learning_rate': 0.00019675195405497828, 'weight_decay': 0.007, 'warmup_steps': 28}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4404,0.202859,0.9337,0.938311,0.933846,0.934515
2,0.1063,0.148895,0.9527,0.954249,0.95279,0.953048
3,0.0603,0.131246,0.9603,0.96085,0.960529,0.96052
4,0.0298,0.136256,0.9631,0.963482,0.963263,0.963306


[I 2025-03-30 07:24:01,369] Trial 119 pruned. 


Trial 120 with params: {'learning_rate': 0.00013501072872136455, 'weight_decay': 0.006, 'warmup_steps': 17}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4414,0.175811,0.9437,0.946499,0.943714,0.944056
2,0.096,0.135413,0.9547,0.955932,0.954841,0.95505


[I 2025-03-30 07:26:47,918] Trial 120 pruned. 


Trial 121 with params: {'learning_rate': 0.00013070972010450813, 'weight_decay': 0.008, 'warmup_steps': 30}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4714,0.196855,0.9345,0.940368,0.934741,0.935331
2,0.0981,0.136377,0.9555,0.956317,0.955712,0.955815
3,0.0485,0.121169,0.9635,0.963964,0.963576,0.963664
4,0.0203,0.140132,0.9641,0.964639,0.964268,0.964275


[I 2025-03-30 07:32:30,659] Trial 121 pruned. 


Trial 122 with params: {'learning_rate': 6.540779845278149e-05, 'weight_decay': 0.006, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5374,0.161358,0.949,0.951307,0.948992,0.949528
2,0.0965,0.135395,0.9556,0.956149,0.955745,0.955729
3,0.0463,0.127693,0.9626,0.963289,0.962672,0.962801
4,0.0182,0.135122,0.966,0.966208,0.966113,0.966114


[I 2025-03-30 07:38:04,456] Trial 122 pruned. 


Trial 123 with params: {'learning_rate': 0.00011008229952490226, 'weight_decay': 0.006, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4686,0.1871,0.9382,0.942538,0.938205,0.938872
2,0.0913,0.131699,0.9579,0.958776,0.958024,0.95804
3,0.0412,0.118333,0.9653,0.965705,0.965333,0.965453
4,0.0165,0.135455,0.9669,0.967065,0.967029,0.967
5,0.0062,0.135735,0.9691,0.969256,0.969276,0.969221
6,0.0014,0.135739,0.9721,0.972386,0.972157,0.972247
7,0.0005,0.134396,0.9719,0.972056,0.972039,0.972026


[I 2025-03-30 07:47:47,783] Trial 123 finished with value: 0.9720258086672494 and parameters: {'learning_rate': 0.00011008229952490226, 'weight_decay': 0.006, 'warmup_steps': 26}. Best is trial 67 with value: 0.9746290627725797.


Trial 124 with params: {'learning_rate': 8.497260814999432e-05, 'weight_decay': 0.008, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5085,0.147665,0.9531,0.954428,0.953174,0.9534
2,0.0908,0.129249,0.9596,0.960254,0.959781,0.959748
3,0.0414,0.122767,0.9653,0.965769,0.965378,0.965474
4,0.0164,0.130805,0.9676,0.967976,0.967638,0.967762
5,0.0049,0.141563,0.9694,0.969615,0.969525,0.969552
6,0.0016,0.142032,0.9703,0.970556,0.970358,0.970429
7,0.0006,0.141238,0.972,0.972178,0.972097,0.972123


[I 2025-03-30 07:57:35,876] Trial 124 finished with value: 0.9721228189106135 and parameters: {'learning_rate': 8.497260814999432e-05, 'weight_decay': 0.008, 'warmup_steps': 26}. Best is trial 67 with value: 0.9746290627725797.


Trial 125 with params: {'learning_rate': 0.00010705347971676416, 'weight_decay': 0.007, 'warmup_steps': 29}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4931,0.195239,0.9331,0.937072,0.933398,0.93332
2,0.0915,0.120881,0.9603,0.960761,0.960493,0.960503
3,0.0431,0.122319,0.9647,0.965341,0.964764,0.964925
4,0.0184,0.135559,0.9656,0.966025,0.965672,0.965702
5,0.0079,0.129502,0.9711,0.971342,0.971268,0.971249
6,0.0017,0.130699,0.9707,0.970995,0.970808,0.970879
7,0.0006,0.129199,0.9719,0.972165,0.972024,0.972061


[I 2025-03-30 08:07:38,835] Trial 125 finished with value: 0.9720614753577481 and parameters: {'learning_rate': 0.00010705347971676416, 'weight_decay': 0.007, 'warmup_steps': 29}. Best is trial 67 with value: 0.9746290627725797.


Trial 126 with params: {'learning_rate': 0.00015536347405307435, 'weight_decay': 0.007, 'warmup_steps': 20}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.445,0.204069,0.9339,0.937964,0.933977,0.934634
2,0.1008,0.137909,0.9538,0.954446,0.95401,0.953844


[I 2025-03-30 08:10:25,062] Trial 126 pruned. 


Trial 127 with params: {'learning_rate': 0.0016071794381718252, 'weight_decay': 0.001, 'warmup_steps': 4}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.9394,0.828077,0.7233,0.766336,0.72332,0.725879
2,0.5621,0.523616,0.8266,0.84259,0.827186,0.821433
3,0.3934,0.436955,0.8542,0.860253,0.854677,0.852696
4,0.2771,0.303247,0.9029,0.907215,0.902895,0.904158


[I 2025-03-30 08:15:57,771] Trial 127 pruned. 


Trial 128 with params: {'learning_rate': 0.0003399928889305973, 'weight_decay': 0.007, 'warmup_steps': 28}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4656,0.332113,0.8892,0.902108,0.889466,0.889477
2,0.1523,0.184879,0.9398,0.940712,0.940291,0.939544
3,0.089,0.143696,0.956,0.95619,0.956114,0.9561
4,0.0491,0.137035,0.9608,0.961054,0.960918,0.960896


[I 2025-03-30 08:21:52,688] Trial 128 pruned. 


Trial 129 with params: {'learning_rate': 8.439056309864063e-05, 'weight_decay': 0.008, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5198,0.168952,0.9443,0.947392,0.944472,0.944931
2,0.0949,0.130793,0.957,0.957639,0.957154,0.957066
3,0.0406,0.119988,0.964,0.964499,0.964052,0.964174
4,0.0176,0.129575,0.968,0.968493,0.968113,0.96808
5,0.0059,0.128367,0.9688,0.968991,0.968921,0.968911
6,0.0019,0.133195,0.9708,0.971114,0.970908,0.970974
7,0.0007,0.130736,0.9726,0.972769,0.972712,0.972736


[I 2025-03-30 08:31:43,052] Trial 129 finished with value: 0.9727355858752323 and parameters: {'learning_rate': 8.439056309864063e-05, 'weight_decay': 0.008, 'warmup_steps': 23}. Best is trial 67 with value: 0.9746290627725797.


Trial 130 with params: {'learning_rate': 5.039116231539376e-05, 'weight_decay': 0.004, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.6051,0.170574,0.9463,0.948633,0.946257,0.946861
2,0.1047,0.131284,0.9586,0.959057,0.958788,0.958741
3,0.0445,0.122122,0.9644,0.964512,0.964515,0.964473
4,0.0164,0.134896,0.9659,0.966116,0.966005,0.966044


[I 2025-03-30 08:37:17,959] Trial 130 pruned. 


Trial 131 with params: {'learning_rate': 9.865676035790842e-05, 'weight_decay': 0.006, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4997,0.176426,0.942,0.946689,0.94204,0.942873
2,0.0936,0.129945,0.958,0.95901,0.957989,0.958273
3,0.0433,0.111188,0.9669,0.967132,0.967039,0.96706
4,0.0173,0.125201,0.967,0.967184,0.967165,0.967119
5,0.0069,0.125166,0.9714,0.971555,0.971514,0.9715
6,0.0018,0.128501,0.971,0.971184,0.97113,0.971136
7,0.0006,0.128971,0.9724,0.972513,0.972526,0.972509


[I 2025-03-30 08:47:02,789] Trial 131 finished with value: 0.9725088269779644 and parameters: {'learning_rate': 9.865676035790842e-05, 'weight_decay': 0.006, 'warmup_steps': 27}. Best is trial 67 with value: 0.9746290627725797.


Trial 132 with params: {'learning_rate': 0.0001031135430792371, 'weight_decay': 0.007, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.477,0.174832,0.941,0.945864,0.941355,0.941561
2,0.0886,0.12127,0.9575,0.958093,0.957661,0.957642
3,0.0391,0.125873,0.964,0.964698,0.963992,0.964121
4,0.0174,0.130478,0.9675,0.967788,0.967628,0.967621
5,0.0062,0.133686,0.9696,0.96963,0.969751,0.969655
6,0.0018,0.131681,0.971,0.971135,0.971104,0.971113
7,0.0005,0.132101,0.9724,0.972531,0.972488,0.972494


[I 2025-03-30 08:56:45,708] Trial 132 finished with value: 0.9724943263816586 and parameters: {'learning_rate': 0.0001031135430792371, 'weight_decay': 0.007, 'warmup_steps': 27}. Best is trial 67 with value: 0.9746290627725797.


Trial 133 with params: {'learning_rate': 0.00024272350993485774, 'weight_decay': 0.006, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4416,0.240931,0.9179,0.926741,0.91819,0.918118
2,0.1249,0.196565,0.9355,0.939339,0.935664,0.935763
3,0.0654,0.135111,0.9603,0.960635,0.960378,0.960467
4,0.0389,0.118608,0.9658,0.966313,0.965902,0.966037
5,0.0145,0.133749,0.9679,0.968311,0.968078,0.968119
6,0.0041,0.123594,0.9718,0.972237,0.971876,0.972002
7,0.0007,0.119107,0.9723,0.97237,0.972453,0.972402


[I 2025-03-30 09:07:12,082] Trial 133 finished with value: 0.9724015868347069 and parameters: {'learning_rate': 0.00024272350993485774, 'weight_decay': 0.006, 'warmup_steps': 24}. Best is trial 67 with value: 0.9746290627725797.


Trial 134 with params: {'learning_rate': 0.00011822094169472689, 'weight_decay': 0.006, 'warmup_steps': 28}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4756,0.214333,0.9309,0.936297,0.931096,0.931417
2,0.0956,0.123982,0.9598,0.959998,0.960041,0.9599
3,0.0436,0.130867,0.9623,0.962942,0.962474,0.962571
4,0.0187,0.137892,0.9641,0.964398,0.964288,0.964261


[I 2025-03-30 09:12:45,121] Trial 134 pruned. 


Trial 135 with params: {'learning_rate': 0.0001554632484654868, 'weight_decay': 0.009000000000000001, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4367,0.190347,0.9363,0.942769,0.936471,0.937506
2,0.098,0.135401,0.9565,0.956915,0.956738,0.956654
3,0.0483,0.134247,0.9616,0.962221,0.961697,0.961786
4,0.0244,0.125289,0.9664,0.966427,0.966617,0.966414
5,0.0082,0.123202,0.9712,0.971458,0.971344,0.971347
6,0.0021,0.125235,0.9711,0.971504,0.97122,0.97129
7,0.0006,0.120516,0.9736,0.973783,0.973711,0.973724


[I 2025-03-30 09:22:29,854] Trial 135 finished with value: 0.9737240545587218 and parameters: {'learning_rate': 0.0001554632484654868, 'weight_decay': 0.009000000000000001, 'warmup_steps': 23}. Best is trial 67 with value: 0.9746290627725797.


Trial 136 with params: {'learning_rate': 0.0001491088894733688, 'weight_decay': 0.008, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4437,0.193755,0.9342,0.938147,0.934399,0.934664
2,0.0959,0.135124,0.9566,0.956872,0.956778,0.95661
3,0.05,0.115838,0.9654,0.965467,0.96557,0.965484
4,0.0217,0.142573,0.9631,0.963248,0.963335,0.96313


[I 2025-03-30 09:28:30,756] Trial 136 pruned. 


Trial 137 with params: {'learning_rate': 0.00020373553713241103, 'weight_decay': 0.01, 'warmup_steps': 21}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4307,0.204079,0.9305,0.936795,0.930902,0.93104
2,0.1125,0.156992,0.951,0.951342,0.951205,0.950917
3,0.0613,0.132714,0.9586,0.958946,0.958776,0.958741
4,0.0296,0.126875,0.9656,0.966118,0.965588,0.965739
5,0.0124,0.136907,0.968,0.968456,0.968095,0.968057
6,0.003,0.120398,0.9711,0.971344,0.971209,0.971254
7,0.0007,0.11675,0.9726,0.972669,0.972761,0.972696


[I 2025-03-30 09:38:21,843] Trial 137 finished with value: 0.9726962609113586 and parameters: {'learning_rate': 0.00020373553713241103, 'weight_decay': 0.01, 'warmup_steps': 21}. Best is trial 67 with value: 0.9746290627725797.


Trial 138 with params: {'learning_rate': 0.00021360881101219152, 'weight_decay': 0.009000000000000001, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4505,0.211936,0.9319,0.936072,0.932058,0.932434
2,0.1184,0.145459,0.9511,0.951376,0.951423,0.951061
3,0.0594,0.121639,0.9628,0.963214,0.962852,0.962987
4,0.0284,0.131662,0.9647,0.965043,0.964897,0.964905


[I 2025-03-30 09:43:55,121] Trial 138 pruned. 


Trial 139 with params: {'learning_rate': 7.779035601268777e-05, 'weight_decay': 0.01, 'warmup_steps': 21}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5024,0.165126,0.9452,0.948349,0.945357,0.945766
2,0.0889,0.120134,0.9618,0.962133,0.96194,0.961959
3,0.0385,0.124889,0.9625,0.962643,0.962648,0.962596
4,0.0147,0.134513,0.966,0.966071,0.966156,0.966097


[I 2025-03-30 09:49:28,610] Trial 139 pruned. 


Trial 140 with params: {'learning_rate': 0.0001273571460488291, 'weight_decay': 0.01, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.463,0.197419,0.9338,0.938253,0.933963,0.93433
2,0.0968,0.150764,0.9499,0.951056,0.950199,0.949997
3,0.0439,0.12697,0.9614,0.961796,0.961594,0.961592
4,0.0209,0.143053,0.9645,0.964784,0.964609,0.964623


[I 2025-03-30 09:55:18,624] Trial 140 pruned. 


Trial 141 with params: {'learning_rate': 0.00011871615512500498, 'weight_decay': 0.006, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4633,0.174251,0.9423,0.946695,0.942337,0.943179
2,0.0935,0.122295,0.961,0.961437,0.961077,0.961088
3,0.044,0.11873,0.9642,0.964963,0.964433,0.964385
4,0.0182,0.124497,0.9689,0.96898,0.96906,0.969
5,0.0056,0.130972,0.9684,0.968876,0.968548,0.968604
6,0.0017,0.12967,0.9708,0.970972,0.970933,0.97094
7,0.0004,0.128852,0.9719,0.972016,0.972049,0.97202


[I 2025-03-30 10:05:35,594] Trial 141 finished with value: 0.9720196300173409 and parameters: {'learning_rate': 0.00011871615512500498, 'weight_decay': 0.006, 'warmup_steps': 25}. Best is trial 67 with value: 0.9746290627725797.


Trial 142 with params: {'learning_rate': 6.322290328638982e-05, 'weight_decay': 0.008, 'warmup_steps': 30}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5932,0.168159,0.9462,0.947418,0.946289,0.946424
2,0.0945,0.116851,0.9616,0.961874,0.96167,0.961716
3,0.0382,0.115456,0.966,0.966086,0.966153,0.966082
4,0.0144,0.132907,0.9667,0.966842,0.966826,0.966821
5,0.005,0.145341,0.9665,0.966601,0.966665,0.966585
6,0.0018,0.14751,0.9679,0.968132,0.968022,0.968057
7,0.0008,0.14704,0.9683,0.96838,0.968435,0.968397


[I 2025-03-30 10:15:20,642] Trial 142 finished with value: 0.9683973996710821 and parameters: {'learning_rate': 6.322290328638982e-05, 'weight_decay': 0.008, 'warmup_steps': 30}. Best is trial 67 with value: 0.9746290627725797.


Trial 143 with params: {'learning_rate': 9.817682263120722e-05, 'weight_decay': 0.007, 'warmup_steps': 21}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.47,0.166883,0.9458,0.948582,0.945946,0.946128
2,0.0883,0.139596,0.9551,0.955908,0.955327,0.955096


[I 2025-03-30 10:18:06,309] Trial 143 pruned. 


Trial 144 with params: {'learning_rate': 0.00012998111661535324, 'weight_decay': 0.005, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4524,0.189832,0.9366,0.939831,0.936816,0.936797
2,0.0929,0.131191,0.956,0.95642,0.956239,0.956027
3,0.0456,0.120929,0.9628,0.963373,0.962981,0.962971
4,0.022,0.132565,0.9662,0.966453,0.966349,0.966343
5,0.0072,0.133132,0.9702,0.970478,0.970286,0.970335
6,0.0023,0.132746,0.9715,0.971826,0.971625,0.971686
7,0.0005,0.133915,0.9723,0.972465,0.972457,0.972449


[I 2025-03-30 10:27:53,000] Trial 144 finished with value: 0.9724491874340389 and parameters: {'learning_rate': 0.00012998111661535324, 'weight_decay': 0.005, 'warmup_steps': 26}. Best is trial 67 with value: 0.9746290627725797.


Trial 145 with params: {'learning_rate': 0.00033617132254965394, 'weight_decay': 0.008, 'warmup_steps': 19}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4433,0.2491,0.9194,0.924907,0.919641,0.919943
2,0.1543,0.184701,0.9366,0.938142,0.936901,0.936862


[I 2025-03-30 10:30:40,478] Trial 145 pruned. 


Trial 146 with params: {'learning_rate': 0.0001420927094320487, 'weight_decay': 0.008, 'warmup_steps': 20}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4332,0.177322,0.9417,0.944145,0.941765,0.941982
2,0.0987,0.147004,0.9513,0.952629,0.95151,0.951653
3,0.0491,0.13949,0.9595,0.959934,0.959731,0.959562
4,0.0223,0.126913,0.9685,0.968732,0.968658,0.968659
5,0.0084,0.134333,0.9701,0.97039,0.970262,0.970246
6,0.0026,0.134138,0.9713,0.971466,0.971443,0.971443
7,0.0005,0.133518,0.9717,0.971862,0.971848,0.971846


[I 2025-03-30 10:40:29,461] Trial 146 finished with value: 0.9718459221200899 and parameters: {'learning_rate': 0.0001420927094320487, 'weight_decay': 0.008, 'warmup_steps': 20}. Best is trial 67 with value: 0.9746290627725797.


Trial 147 with params: {'learning_rate': 0.0003079538495067879, 'weight_decay': 0.008, 'warmup_steps': 31}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4617,0.253198,0.9165,0.918857,0.916963,0.916414
2,0.1417,0.230481,0.9276,0.932726,0.92799,0.92701


[I 2025-03-30 10:43:17,116] Trial 147 pruned. 


Trial 148 with params: {'learning_rate': 9.806964711146234e-05, 'weight_decay': 0.005, 'warmup_steps': 31}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5107,0.20066,0.933,0.941627,0.933199,0.934555
2,0.0923,0.119668,0.9598,0.96045,0.959939,0.959969
3,0.0409,0.121229,0.965,0.965443,0.965102,0.965193
4,0.0158,0.132176,0.967,0.967488,0.967103,0.967153
5,0.0067,0.124224,0.9712,0.971412,0.971332,0.971347
6,0.0015,0.125096,0.9724,0.972615,0.972512,0.972556
7,0.0006,0.124561,0.974,0.974159,0.974137,0.974144


[I 2025-03-30 10:53:28,623] Trial 148 finished with value: 0.9741444675782933 and parameters: {'learning_rate': 9.806964711146234e-05, 'weight_decay': 0.005, 'warmup_steps': 31}. Best is trial 67 with value: 0.9746290627725797.


Trial 149 with params: {'learning_rate': 0.00015367446969151156, 'weight_decay': 0.008, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4403,0.20293,0.9333,0.938917,0.933727,0.933899
2,0.0976,0.138907,0.9527,0.954045,0.95301,0.952865
3,0.0469,0.12047,0.964,0.964423,0.96416,0.964183
4,0.0227,0.143753,0.9622,0.962246,0.962467,0.962233


[I 2025-03-30 10:59:03,079] Trial 149 pruned. 


In [19]:
print(best_base)

BestRun(run_id='67', objective=0.9746290627725797, hyperparameters={'learning_rate': 9.777098843358782e-05, 'weight_decay': 0.007, 'warmup_steps': 24}, run_summary=None)
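
The Optuna log lines printed during the search can be tallied after the fact. The following is a minimal sketch, assuming the log format shown in the output above; the helper name `summarize_log` and the two regular expressions are illustrative and not part of the notebook or of Optuna.

```python
import ast
import re

# Patterns for Optuna's "Trial N finished ..." and "Trial N pruned." log lines.
FINISHED = re.compile(
    r"Trial (?P<number>\d+) finished with value: (?P<value>[-+0-9.eE]+) "
    r"and parameters: (?P<params>\{.*?\})\."
)
PRUNED = re.compile(r"Trial (?P<number>\d+) pruned\.")


def summarize_log(lines):
    """Return (finished, pruned); finished maps trial number -> (value, params)."""
    finished, pruned = {}, []
    for line in lines:
        match = FINISHED.search(line)
        if match:
            # The parameter dictionary is printed as a Python literal.
            params = ast.literal_eval(match.group("params"))
            finished[int(match.group("number"))] = (float(match.group("value")), params)
            continue
        match = PRUNED.search(line)
        if match:
            pruned.append(int(match.group("number")))
    return finished, pruned


# Two lines copied from the log above.
sample = [
    "[I 2025-03-30 09:55:18,624] Trial 140 pruned. ",
    "[I 2025-03-30 10:05:35,594] Trial 141 finished with value: 0.9720196300173409 "
    "and parameters: {'learning_rate': 0.00011871615512500498, 'weight_decay': 0.006, "
    "'warmup_steps': 25}. Best is trial 67 with value: 0.9746290627725797.",
]
finished, pruned = summarize_log(sample)
best_trial = max(finished, key=lambda number: finished[number][0])
```

Applied to the full log, `len(pruned)` and `len(finished)` give a quick view of how many trials the pruner stopped early.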


In [20]:
base.reset_seed()

## Hyperparameter search with distillation on the original dataset
Configuration of the individual training runs.

In [21]:
training_args = base.get_training_args(
    output_dir=f"~/results/{DATASET}/-KD_hp-search",
    logging_dir=f"~/logs/{DATASET}/-KD_hp-search",
    remove_unused_columns=False,
    epochs=num_epochs,
    batch_size=batch_size,
)

Definition of the searched hyperparameters and their ranges, extended with the distillation hyperparameters.

In [22]:
def hp_space(trial):
    params = {
        "learning_rate": trial.suggest_float("learning_rate", 5e-5, 5e-3, log=True),
        "weight_decay": trial.suggest_float("weight_decay", 0, 1e-2, step=1e-3),
        "warmup_steps": trial.suggest_int("warmup_steps", 0, warm_up),
        "lambda_param": trial.suggest_float("lambda_param", 0, 1, step=0.1),
        "temperature": trial.suggest_float("temperature", 2, 7, step=0.5),
    }
    print(f"Trial {trial.number} with params: {params}")
    return params
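
The search space mixes a log-uniform range (`log=True`) with stepped grids (`step=...`). The sketch below uses only the standard library to illustrate what these two kinds of ranges mean; it is not Optuna's implementation, and the helpers `sample_log_uniform` and `sample_stepped` are hypothetical names introduced for illustration. It also shows why stepped values can appear in the trial log as `0.6000000000000001` rather than `0.6`, assuming they are computed as `low + k * step`.

```python
import math
import random


def sample_log_uniform(rng, low, high):
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def sample_stepped(rng, low, high, step):
    """Draw low + k * step for a random integer k, staying within [low, high]."""
    n_steps = round((high - low) / step)
    return low + rng.randint(0, n_steps) * step


rng = random.Random(42)
learning_rate = sample_log_uniform(rng, 5e-5, 5e-3)  # same range as hp_space
lambda_param = sample_stepped(rng, 0.0, 1.0, 0.1)    # same grid as hp_space

# Binary floating point cannot represent 0.1 exactly, so the sixth grid point
# is not exactly 0.6; rounding recovers the intended value.
sixth_point = 0.0 + 6 * 0.1
print(sixth_point == 0.6, round(sixth_point, 10) == 0.6)
```

Log-uniform sampling spends as many draws between 5e-5 and 5e-4 as between 5e-4 and 5e-3, which suits a learning rate whose useful values span orders of magnitude.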

Optuna configuration.

In [23]:
pruner = optuna.pruners.HyperbandPruner(min_resource=min_r, max_resource=max_r, reduction_factor=2, bootstrap_count=2)
sampler = optuna.samplers.TPESampler(seed=42, multivariate=True)
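
`HyperbandPruner` runs several successive-halving brackets, each with a different earliest pruning point. The sketch below reproduces the bracket count, floor(log_eta(max_resource / min_resource)) + 1, and the resource levels at which a trial in each bracket can be pruned, following the description in the Optuna documentation. The function `hyperband_rungs` is a hypothetical helper, and the example values are assumptions: `min_r` and `max_r` are defined elsewhere in the notebook and are not visible in this section.

```python
def hyperband_rungs(min_resource, max_resource, reduction_factor):
    """Return, per bracket, the resource levels at which trials can be pruned."""
    # Number of brackets: floor(log_eta(max / min)) + 1, computed with integers
    # to avoid floating-point rounding in the logarithm.
    n_brackets = 1
    while min_resource * reduction_factor ** n_brackets <= max_resource:
        n_brackets += 1

    brackets = []
    for bracket_id in range(n_brackets):
        rungs = []
        # Bracket i starts pruning at min_resource * eta**i.
        level = min_resource * reduction_factor ** bracket_id
        while level <= max_resource:
            rungs.append(level)
            level *= reduction_factor
        brackets.append(rungs)
    return brackets


# Hypothetical values chosen for illustration only.
example = hyperband_rungs(min_resource=2, max_resource=7, reduction_factor=2)
print(example)  # [[2, 4], [4]]
```

With these example values, trials in the first bracket can be stopped at resource level 2 or 4, while trials in the second bracket are first checked at level 4. Whether the resource is counted in epochs or in optimizer steps depends on what the trainer reports to Optuna.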



Configuration of the distillation trainer for the individual training runs.

In [24]:
trainer = base.DistilTrainer(
    args=training_args,
    train_dataset=train,
    eval_dataset=eval,
    compute_metrics=base.compute_metrics,
    model_init=lambda: get_model(),
)

Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
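
The implementation of `base.DistilTrainer` is not shown in this section. A common formulation of knowledge distillation combines the cross-entropy on the hard label with a temperature-softened KL divergence towards the teacher's distribution. The plain-Python sketch below shows how `lambda_param` and `temperature` would enter such a loss; both the formula and the convention that `lambda_param` weights the soft term are assumptions, and `distillation_loss` is a hypothetical helper.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax of logits / temperature, shifted by the maximum for stability."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label, lambda_param, temperature):
    """Assumed form: (1 - lambda) * CE(student, label) + lambda * T^2 * KL(teacher || student)."""
    # Hard-label term: ordinary cross-entropy at temperature 1.
    hard = -math.log(softmax(student_logits)[label])
    # Soft term: KL divergence between the temperature-softened distributions.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student) if t > 0)
    # The T^2 factor keeps the gradient scale of the soft term comparable across temperatures.
    return (1 - lambda_param) * hard + lambda_param * temperature ** 2 * soft


student = [2.0, 0.5, -1.0]
teacher = [3.0, 0.2, -2.0]
loss = distillation_loss(student, teacher, label=0, lambda_param=0.9, temperature=6.0)
```

Under this assumed convention, `lambda_param=0` reduces to ordinary supervised training and `lambda_param=1` trains on the teacher's outputs only.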


Search setup.

In [25]:
best_distill = trainer.hyperparameter_search(
    direction="maximize",
    backend="optuna",
    hp_space=hp_space,
    compute_objective=lambda metrics: metrics["eval_f1"],
    pruner=pruner,
    sampler=sampler,
    study_name="Distill",
    n_trials=150
)

[I 2025-03-30 10:59:03,800] A new study created in memory with name: Distill


Trial 0 with params: {'learning_rate': 0.0002805758207667253, 'weight_decay': 0.01, 'warmup_steps': 24, 'lambda_param': 0.6000000000000001, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3557,0.212412,0.9306,0.935032,0.930799,0.931365
2,0.1754,0.18813,0.9481,0.949182,0.948338,0.948282


[I 2025-03-30 11:01:53,132] Trial 0 pruned. 


Trial 1 with params: {'learning_rate': 0.00010255552094216992, 'weight_decay': 0.0, 'warmup_steps': 28, 'lambda_param': 0.6000000000000001, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3818,0.182053,0.9508,0.952567,0.950948,0.951066
2,0.1515,0.160317,0.9634,0.963918,0.963531,0.963576
3,0.1289,0.152544,0.9663,0.966602,0.966435,0.966452
4,0.1185,0.146904,0.9712,0.971227,0.971397,0.97127


[I 2025-03-30 11:07:28,309] Trial 1 pruned. 


Trial 2 with params: {'learning_rate': 5.497167787383099e-05, 'weight_decay': 0.01, 'warmup_steps': 27, 'lambda_param': 0.2, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4329,0.185381,0.9494,0.950536,0.949473,0.949682
2,0.1553,0.160801,0.9609,0.961298,0.961113,0.961059


[I 2025-03-30 11:10:14,755] Trial 2 pruned. 


Trial 3 with params: {'learning_rate': 0.00011635338541918901, 'weight_decay': 0.003, 'warmup_steps': 17, 'lambda_param': 0.4, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3487,0.185177,0.948,0.950778,0.948282,0.948494
2,0.1488,0.165327,0.9602,0.960533,0.960442,0.960264
3,0.1288,0.158627,0.9628,0.963369,0.963121,0.962927
4,0.1196,0.148593,0.9675,0.968181,0.967725,0.967667


[I 2025-03-30 11:16:05,982] Trial 3 pruned. 


Trial 4 with params: {'learning_rate': 0.0008369042894376068, 'weight_decay': 0.001, 'warmup_steps': 9, 'lambda_param': 0.4, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4396,0.36551,0.8545,0.861139,0.85528,0.854383
2,0.267,0.301993,0.8838,0.892392,0.884524,0.88383


[I 2025-03-30 11:18:53,396] Trial 4 pruned. 


Trial 5 with params: {'learning_rate': 0.0018591820902866042, 'weight_decay': 0.002, 'warmup_steps': 16, 'lambda_param': 0.6000000000000001, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.7343,0.696554,0.6657,0.712928,0.667039,0.652811
2,0.4866,0.434172,0.8197,0.829463,0.819996,0.816106
3,0.3655,0.367814,0.8528,0.855263,0.852756,0.851811
4,0.2825,0.30841,0.881,0.892329,0.881014,0.88229


[I 2025-03-30 11:24:30,517] Trial 5 pruned. 


Trial 6 with params: {'learning_rate': 0.0008204643365323959, 'weight_decay': 0.001, 'warmup_steps': 2, 'lambda_param': 1.0, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4234,0.40445,0.8371,0.84239,0.837095,0.835689
2,0.2652,0.269981,0.9023,0.904466,0.902606,0.902513


[I 2025-03-30 11:27:18,198] Trial 6 pruned. 


Trial 7 with params: {'learning_rate': 0.0020690200562805084, 'weight_decay': 0.003, 'warmup_steps': 3, 'lambda_param': 0.7000000000000001, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.8218,0.699356,0.6694,0.688589,0.669481,0.664684
2,0.5543,0.539932,0.7604,0.790052,0.760039,0.765404
3,0.4209,0.4211,0.8226,0.829499,0.822219,0.821837
4,0.3312,0.33505,0.8714,0.880605,0.871141,0.873199


[I 2025-03-30 11:32:52,978] Trial 7 pruned. 


Trial 8 with params: {'learning_rate': 8.770946743725407e-05, 'weight_decay': 0.005, 'warmup_steps': 1, 'lambda_param': 1.0, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3364,0.186132,0.9483,0.951483,0.948471,0.948956
2,0.1477,0.167558,0.9605,0.961283,0.960719,0.960673
3,0.1275,0.153626,0.9661,0.966731,0.966252,0.966318
4,0.1191,0.149911,0.9679,0.967981,0.968054,0.968007
5,0.1139,0.144899,0.9709,0.971303,0.971029,0.971056
6,0.1114,0.143224,0.9712,0.971625,0.971328,0.971353
7,0.1101,0.141783,0.972,0.972181,0.972151,0.972117


[I 2025-03-30 11:43:05,426] Trial 8 finished with value: 0.9721166180268345 and parameters: {'learning_rate': 8.770946743725407e-05, 'weight_decay': 0.005, 'warmup_steps': 1, 'lambda_param': 1.0, 'temperature': 3.0}. Best is trial 8 with value: 0.9721166180268345.


Trial 9 with params: {'learning_rate': 0.0010568529720322872, 'weight_decay': 0.003, 'warmup_steps': 17, 'lambda_param': 0.6000000000000001, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5079,0.444718,0.8085,0.826369,0.809057,0.807433
2,0.3038,0.333859,0.8734,0.888738,0.873605,0.874092


[I 2025-03-30 11:45:52,643] Trial 9 pruned. 


Trial 10 with params: {'learning_rate': 5.622306732978549e-05, 'weight_decay': 0.004, 'warmup_steps': 6, 'lambda_param': 1.0, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3921,0.178731,0.953,0.95432,0.953251,0.953311
2,0.1535,0.165265,0.9618,0.962035,0.961956,0.961907
3,0.1287,0.155834,0.9648,0.965164,0.964951,0.964924
4,0.1186,0.151742,0.9667,0.966901,0.96687,0.96685


[I 2025-03-30 11:51:28,426] Trial 10 pruned. 


Trial 11 with params: {'learning_rate': 0.00020808715310578245, 'weight_decay': 0.003, 'warmup_steps': 32, 'lambda_param': 0.6000000000000001, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3604,0.198915,0.9416,0.943461,0.941996,0.941815
2,0.1619,0.175332,0.9541,0.954815,0.954274,0.954336
3,0.1384,0.162317,0.961,0.961478,0.961092,0.961174
4,0.1244,0.153341,0.9669,0.967243,0.967045,0.966979


[I 2025-03-30 11:57:06,022] Trial 11 pruned. 


Trial 12 with params: {'learning_rate': 0.00014318207047557446, 'weight_decay': 0.001, 'warmup_steps': 21, 'lambda_param': 0.8, 'temperature': 5.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3487,0.189529,0.9457,0.948857,0.946004,0.946151
2,0.1538,0.166781,0.9602,0.961066,0.960386,0.960273
3,0.1312,0.160574,0.9629,0.962981,0.963057,0.962901
4,0.121,0.151962,0.9674,0.967371,0.967575,0.967394


[I 2025-03-30 12:02:40,396] Trial 12 pruned. 


Trial 13 with params: {'learning_rate': 0.0001679567168095784, 'weight_decay': 0.008, 'warmup_steps': 7, 'lambda_param': 0.5, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3207,0.187072,0.9494,0.951799,0.949455,0.94978
2,0.155,0.172692,0.9557,0.956504,0.955954,0.955779
3,0.1339,0.150654,0.9716,0.971897,0.971743,0.971745
4,0.1213,0.148089,0.9703,0.970448,0.970469,0.970403


[I 2025-03-30 12:08:16,793] Trial 13 pruned. 


Trial 14 with params: {'learning_rate': 9.781484202771949e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 5, 'lambda_param': 1.0, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3459,0.176221,0.9564,0.95723,0.956616,0.956565
2,0.1488,0.158917,0.9646,0.964786,0.96481,0.964717
3,0.1286,0.152697,0.9661,0.966667,0.966283,0.966268
4,0.1192,0.149028,0.9693,0.969748,0.969513,0.969453
5,0.1136,0.143631,0.9712,0.971804,0.971349,0.971395
6,0.1117,0.140586,0.9743,0.974509,0.974409,0.974422
7,0.11,0.139735,0.9733,0.973505,0.97344,0.973423


[I 2025-03-30 12:18:04,458] Trial 14 finished with value: 0.9734230267943806 and parameters: {'learning_rate': 9.781484202771949e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 5, 'lambda_param': 1.0, 'temperature': 2.5}. Best is trial 14 with value: 0.9734230267943806.


Trial 15 with params: {'learning_rate': 0.00018002615153235487, 'weight_decay': 0.008, 'warmup_steps': 10, 'lambda_param': 1.0, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3266,0.209343,0.9368,0.940688,0.937078,0.936974
2,0.1589,0.174025,0.9548,0.956361,0.954929,0.955084
3,0.1345,0.159286,0.9647,0.965132,0.964815,0.964853
4,0.1232,0.151518,0.9671,0.967592,0.967137,0.967288


[I 2025-03-30 12:23:40,138] Trial 15 pruned. 


Trial 16 with params: {'learning_rate': 7.384419630274902e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 1, 'lambda_param': 0.9, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3541,0.184959,0.9507,0.952677,0.950962,0.950929
2,0.1493,0.164179,0.961,0.961764,0.961117,0.961262
3,0.1285,0.153277,0.9673,0.967406,0.967431,0.967384
4,0.1188,0.147851,0.9696,0.969767,0.969697,0.969713
5,0.1141,0.144161,0.9709,0.971293,0.971042,0.971039
6,0.1114,0.142653,0.9713,0.971468,0.971437,0.971433
7,0.1101,0.14191,0.9724,0.972476,0.972548,0.972501


[I 2025-03-30 12:33:28,702] Trial 16 finished with value: 0.9725011462973223 and parameters: {'learning_rate': 7.384419630274902e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 1, 'lambda_param': 0.9, 'temperature': 2.0}. Best is trial 14 with value: 0.9734230267943806.


Trial 17 with params: {'learning_rate': 0.000124594001444187, 'weight_decay': 0.01, 'warmup_steps': 3, 'lambda_param': 0.9, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3145,0.201343,0.9375,0.942745,0.937894,0.93818
2,0.1527,0.176041,0.9551,0.956645,0.95534,0.955292
3,0.1313,0.157594,0.9641,0.964995,0.96425,0.964289
4,0.12,0.149439,0.9675,0.968093,0.967692,0.967659
5,0.1144,0.144078,0.9706,0.971143,0.970823,0.970777
6,0.1115,0.140014,0.9721,0.97235,0.972217,0.972249
7,0.11,0.138475,0.9733,0.973527,0.973429,0.973448


[I 2025-03-30 12:43:23,966] Trial 17 finished with value: 0.9734480866106539 and parameters: {'learning_rate': 0.000124594001444187, 'weight_decay': 0.01, 'warmup_steps': 3, 'lambda_param': 0.9, 'temperature': 6.0}. Best is trial 17 with value: 0.9734480866106539.


Trial 18 with params: {'learning_rate': 0.00014341173135625626, 'weight_decay': 0.009000000000000001, 'warmup_steps': 0, 'lambda_param': 0.9, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3018,0.197755,0.942,0.945034,0.942297,0.942168
2,0.1525,0.162885,0.9617,0.961891,0.961867,0.961785
3,0.1318,0.158134,0.9638,0.964261,0.964041,0.963849
4,0.1197,0.150287,0.9688,0.968895,0.969032,0.968919
5,0.1151,0.142874,0.9711,0.971512,0.971302,0.971266
6,0.1115,0.138603,0.9764,0.976484,0.976566,0.976498
7,0.1102,0.138232,0.9752,0.975305,0.975382,0.975326


[I 2025-03-30 12:53:11,011] Trial 18 finished with value: 0.9753260189154085 and parameters: {'learning_rate': 0.00014341173135625626, 'weight_decay': 0.009000000000000001, 'warmup_steps': 0, 'lambda_param': 0.9, 'temperature': 6.0}. Best is trial 18 with value: 0.9753260189154085.


Trial 19 with params: {'learning_rate': 0.00012899425163390336, 'weight_decay': 0.008, 'warmup_steps': 3, 'lambda_param': 1.0, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3174,0.192056,0.9472,0.949387,0.947434,0.947494
2,0.1497,0.165637,0.9616,0.961908,0.961822,0.961627
3,0.1289,0.159055,0.9618,0.963253,0.961916,0.962183
4,0.1203,0.146536,0.9718,0.971902,0.971908,0.971878
5,0.1143,0.144155,0.9722,0.972596,0.972356,0.972366
6,0.1115,0.140706,0.9725,0.972841,0.972628,0.972672
7,0.11,0.140125,0.9728,0.973019,0.972951,0.972959


[I 2025-03-30 13:02:56,970] Trial 19 finished with value: 0.9729594402061267 and parameters: {'learning_rate': 0.00012899425163390336, 'weight_decay': 0.008, 'warmup_steps': 3, 'lambda_param': 1.0, 'temperature': 7.0}. Best is trial 18 with value: 0.9753260189154085.


Trial 20 with params: {'learning_rate': 0.00041588197261701134, 'weight_decay': 0.009000000000000001, 'warmup_steps': 13, 'lambda_param': 0.7000000000000001, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3477,0.244635,0.9165,0.920994,0.916883,0.916772
2,0.1937,0.212577,0.9341,0.937127,0.934396,0.934604


[I 2025-03-30 13:05:44,632] Trial 20 pruned. 


Trial 21 with params: {'learning_rate': 7.917372034759902e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 10, 'lambda_param': 0.7000000000000001, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3645,0.176224,0.954,0.955064,0.954162,0.954233
2,0.1489,0.16252,0.9631,0.963297,0.963262,0.963171
3,0.1277,0.154292,0.9668,0.967186,0.966979,0.966889
4,0.1185,0.149912,0.9673,0.967474,0.967443,0.967439


[I 2025-03-30 13:11:22,817] Trial 21 pruned. 


Trial 22 with params: {'learning_rate': 0.000447661846734586, 'weight_decay': 0.01, 'warmup_steps': 3, 'lambda_param': 0.8, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3318,0.263688,0.9048,0.911067,0.905185,0.905347
2,0.1982,0.206848,0.9376,0.938907,0.937881,0.937373


[I 2025-03-30 13:14:10,330] Trial 22 pruned. 


Trial 23 with params: {'learning_rate': 0.00021374902549225927, 'weight_decay': 0.009000000000000001, 'warmup_steps': 0, 'lambda_param': 0.8, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.2947,0.201968,0.9398,0.940956,0.940067,0.93983
2,0.1633,0.199826,0.9423,0.944919,0.942361,0.942396
3,0.1389,0.164633,0.9598,0.960032,0.959995,0.959907
4,0.1248,0.154837,0.9655,0.96579,0.965615,0.965645


[I 2025-03-30 13:19:43,668] Trial 23 pruned. 


Trial 24 with params: {'learning_rate': 0.00010112961434437739, 'weight_decay': 0.01, 'warmup_steps': 1, 'lambda_param': 0.6000000000000001, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.329,0.191202,0.9435,0.947631,0.943739,0.94401
2,0.1488,0.161645,0.9621,0.9627,0.962152,0.962303
3,0.128,0.151421,0.9679,0.968116,0.968094,0.968043
4,0.119,0.146577,0.9694,0.969643,0.969561,0.969585
5,0.1137,0.143095,0.9714,0.971875,0.971561,0.971589
6,0.1113,0.140731,0.9722,0.972464,0.97238,0.972352
7,0.11,0.139209,0.9729,0.97316,0.973053,0.973055


[I 2025-03-30 13:29:31,227] Trial 24 finished with value: 0.9730549110777786 and parameters: {'learning_rate': 0.00010112961434437739, 'weight_decay': 0.01, 'warmup_steps': 1, 'lambda_param': 0.6000000000000001, 'temperature': 7.0}. Best is trial 18 with value: 0.9753260189154085.


Trial 25 with params: {'learning_rate': 5.761199644855385e-05, 'weight_decay': 0.008, 'warmup_steps': 0, 'lambda_param': 1.0, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3758,0.187012,0.9503,0.952849,0.950466,0.950698
2,0.1508,0.165607,0.9612,0.961632,0.961375,0.961348
3,0.1287,0.15379,0.9665,0.966663,0.966723,0.966619
4,0.1182,0.14964,0.9689,0.969185,0.969021,0.969076
5,0.1141,0.146575,0.9703,0.97054,0.970471,0.970446
6,0.1117,0.145736,0.9699,0.970295,0.970051,0.970093
7,0.1101,0.145006,0.9696,0.969843,0.969783,0.969744


[I 2025-03-30 13:40:30,487] Trial 25 finished with value: 0.9697437502776565 and parameters: {'learning_rate': 5.761199644855385e-05, 'weight_decay': 0.008, 'warmup_steps': 0, 'lambda_param': 1.0, 'temperature': 6.0}. Best is trial 18 with value: 0.9753260189154085.


Trial 26 with params: {'learning_rate': 0.00036673897334545683, 'weight_decay': 0.003, 'warmup_steps': 0, 'lambda_param': 0.0, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3179,0.240005,0.9178,0.9216,0.91823,0.918004
2,0.1857,0.214993,0.9347,0.938431,0.934802,0.935628


[I 2025-03-30 13:43:25,861] Trial 26 pruned. 


Trial 27 with params: {'learning_rate': 0.00018775431018063502, 'weight_decay': 0.01, 'warmup_steps': 3, 'lambda_param': 0.9, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3058,0.195814,0.9409,0.942722,0.941248,0.941142
2,0.159,0.174802,0.9551,0.956941,0.955129,0.95545
3,0.1346,0.155827,0.9674,0.96805,0.967475,0.967624
4,0.1233,0.154901,0.9666,0.96731,0.966704,0.966801


[I 2025-03-30 13:49:00,336] Trial 27 pruned. 


Trial 28 with params: {'learning_rate': 8.56035984463901e-05, 'weight_decay': 0.01, 'warmup_steps': 10, 'lambda_param': 1.0, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3584,0.180193,0.9524,0.95426,0.952565,0.952763
2,0.149,0.165996,0.9616,0.96219,0.961725,0.961786
3,0.1281,0.153238,0.9659,0.966555,0.966046,0.966109
4,0.1181,0.148077,0.9692,0.969227,0.96941,0.969295
5,0.1138,0.144236,0.9722,0.972728,0.972341,0.972393
6,0.1115,0.142334,0.9722,0.97241,0.972326,0.972336
7,0.11,0.14133,0.9731,0.973294,0.973241,0.973229


[I 2025-03-30 13:58:58,507] Trial 28 finished with value: 0.973229236328114 and parameters: {'learning_rate': 8.56035984463901e-05, 'weight_decay': 0.01, 'warmup_steps': 10, 'lambda_param': 1.0, 'temperature': 4.5}. Best is trial 18 with value: 0.9753260189154085.


Trial 29 with params: {'learning_rate': 0.0011267334199977662, 'weight_decay': 0.007, 'warmup_steps': 0, 'lambda_param': 1.0, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5079,0.514883,0.7843,0.811899,0.785394,0.777068
2,0.3273,0.32309,0.8762,0.882995,0.876168,0.875305


[I 2025-03-30 14:01:47,885] Trial 29 pruned. 


Trial 30 with params: {'learning_rate': 7.710729969126271e-05, 'weight_decay': 0.005, 'warmup_steps': 10, 'lambda_param': 0.2, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3658,0.180483,0.9517,0.953385,0.951963,0.952003
2,0.1491,0.165527,0.9594,0.95992,0.959553,0.9596
3,0.1285,0.154159,0.9667,0.967205,0.966882,0.96688
4,0.1185,0.149366,0.9678,0.967978,0.968021,0.967925


[I 2025-03-30 14:07:23,336] Trial 30 pruned. 


Trial 31 with params: {'learning_rate': 0.00016724791613560432, 'weight_decay': 0.01, 'warmup_steps': 19, 'lambda_param': 0.9, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.346,0.222287,0.9273,0.932231,0.927743,0.927487
2,0.1577,0.169437,0.9568,0.957385,0.957051,0.956957
3,0.1348,0.157775,0.966,0.966836,0.966162,0.966215
4,0.1222,0.149021,0.9684,0.968565,0.968569,0.968536
5,0.1157,0.142947,0.973,0.973158,0.97308,0.973097
6,0.1123,0.139199,0.9745,0.974715,0.974596,0.97462
7,0.1106,0.137768,0.9749,0.975069,0.975014,0.97502


[I 2025-03-30 14:17:11,728] Trial 31 finished with value: 0.9750200670357133 and parameters: {'learning_rate': 0.00016724791613560432, 'weight_decay': 0.01, 'warmup_steps': 19, 'lambda_param': 0.9, 'temperature': 4.0}. Best is trial 18 with value: 0.9753260189154085.


Trial 32 with params: {'learning_rate': 0.00013553561983282748, 'weight_decay': 0.01, 'warmup_steps': 24, 'lambda_param': 1.0, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3572,0.190811,0.9428,0.947579,0.942932,0.943662
2,0.1538,0.175379,0.9552,0.956275,0.955348,0.955361
3,0.1329,0.154949,0.9644,0.966042,0.964504,0.964658
4,0.121,0.14986,0.9681,0.968513,0.968241,0.968267
5,0.1148,0.14153,0.9728,0.973431,0.972914,0.97298
6,0.1118,0.137841,0.9763,0.976481,0.976422,0.976408
7,0.1102,0.136758,0.9763,0.976497,0.976441,0.976412


[I 2025-03-30 14:27:05,829] Trial 32 finished with value: 0.9764120594095738 and parameters: {'learning_rate': 0.00013553561983282748, 'weight_decay': 0.01, 'warmup_steps': 24, 'lambda_param': 1.0, 'temperature': 3.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 33 with params: {'learning_rate': 9.798842916219257e-05, 'weight_decay': 0.01, 'warmup_steps': 27, 'lambda_param': 0.8, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3864,0.172762,0.9577,0.9584,0.957864,0.957895
2,0.1479,0.163882,0.9605,0.960941,0.960618,0.960679
3,0.1287,0.152813,0.9677,0.968224,0.967896,0.967814
4,0.1186,0.148163,0.9697,0.969905,0.969905,0.969835
5,0.1138,0.143475,0.972,0.972314,0.972107,0.97213
6,0.1112,0.141525,0.9729,0.97315,0.973055,0.97303
7,0.1099,0.140777,0.9729,0.972983,0.973043,0.972984


[I 2025-03-30 14:37:03,502] Trial 33 finished with value: 0.9729843818379591 and parameters: {'learning_rate': 9.798842916219257e-05, 'weight_decay': 0.01, 'warmup_steps': 27, 'lambda_param': 0.8, 'temperature': 4.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 34 with params: {'learning_rate': 0.0004765833477578671, 'weight_decay': 0.01, 'warmup_steps': 22, 'lambda_param': 0.9, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3712,0.283862,0.897,0.906492,0.897288,0.898169
2,0.2028,0.227252,0.9268,0.929384,0.926981,0.926536


[I 2025-03-30 14:39:52,818] Trial 34 pruned. 


Trial 35 with params: {'learning_rate': 5.479487851074696e-05, 'weight_decay': 0.01, 'warmup_steps': 21, 'lambda_param': 1.0, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4237,0.181257,0.9542,0.955529,0.954367,0.954473
2,0.1533,0.163609,0.9637,0.964157,0.963849,0.963876
3,0.1296,0.154649,0.9641,0.964291,0.964266,0.964192
4,0.1191,0.148328,0.9691,0.96927,0.969281,0.96923
5,0.1146,0.147145,0.9702,0.970597,0.97035,0.970344
6,0.112,0.145104,0.9711,0.971382,0.971246,0.971247
7,0.1105,0.144626,0.9707,0.970947,0.970845,0.970829


[I 2025-03-30 14:49:40,812] Trial 35 finished with value: 0.9708288001325898 and parameters: {'learning_rate': 5.479487851074696e-05, 'weight_decay': 0.01, 'warmup_steps': 21, 'lambda_param': 1.0, 'temperature': 4.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 36 with params: {'learning_rate': 0.00025975114163242537, 'weight_decay': 0.01, 'warmup_steps': 18, 'lambda_param': 0.9, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3407,0.218588,0.9265,0.933388,0.92692,0.927375
2,0.1705,0.194564,0.9425,0.944456,0.942634,0.942806


[I 2025-03-30 14:52:28,428] Trial 36 pruned. 


Trial 37 with params: {'learning_rate': 0.0001055526602227995, 'weight_decay': 0.009000000000000001, 'warmup_steps': 23, 'lambda_param': 1.0, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.363,0.18189,0.9502,0.95292,0.950421,0.950672
2,0.151,0.163114,0.9616,0.961745,0.961824,0.961689
3,0.1285,0.155386,0.9643,0.965343,0.964499,0.964577
4,0.1185,0.146137,0.9697,0.969894,0.969865,0.969815
5,0.1138,0.143183,0.9719,0.972315,0.972025,0.972068
6,0.1112,0.14021,0.9735,0.973814,0.973614,0.973661
7,0.1099,0.138863,0.9745,0.974617,0.974659,0.974612


[I 2025-03-30 15:02:17,589] Trial 37 finished with value: 0.9746118380275559 and parameters: {'learning_rate': 0.0001055526602227995, 'weight_decay': 0.009000000000000001, 'warmup_steps': 23, 'lambda_param': 1.0, 'temperature': 2.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 38 with params: {'learning_rate': 9.646392086313548e-05, 'weight_decay': 0.008, 'warmup_steps': 23, 'lambda_param': 1.0, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3709,0.188322,0.9462,0.94833,0.946603,0.946387
2,0.1485,0.158581,0.9643,0.964981,0.964414,0.964484
3,0.1275,0.15204,0.9681,0.968618,0.968199,0.968315
4,0.1186,0.149426,0.9679,0.968065,0.968089,0.968035


[I 2025-03-30 15:07:59,302] Trial 38 pruned. 


Trial 39 with params: {'learning_rate': 5.7230182765429275e-05, 'weight_decay': 0.01, 'warmup_steps': 23, 'lambda_param': 0.9, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4265,0.180569,0.9514,0.952157,0.95159,0.951609
2,0.1523,0.161044,0.9611,0.961569,0.961233,0.961258
3,0.1293,0.154021,0.9663,0.966823,0.966436,0.966464
4,0.1194,0.151002,0.9672,0.967283,0.967393,0.967305


[I 2025-03-30 15:13:35,139] Trial 39 pruned. 


Trial 40 with params: {'learning_rate': 0.002301313995834585, 'weight_decay': 0.007, 'warmup_steps': 13, 'lambda_param': 1.0, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.8981,0.980763,0.5324,0.639532,0.532476,0.519171
2,0.6495,0.666543,0.7057,0.725882,0.70624,0.699218
3,0.4994,0.570375,0.7465,0.80077,0.746346,0.751928
4,0.3953,0.442498,0.8084,0.838704,0.808332,0.812727


[I 2025-03-30 15:19:10,461] Trial 40 pruned. 


Trial 41 with params: {'learning_rate': 9.869734112270565e-05, 'weight_decay': 0.01, 'warmup_steps': 32, 'lambda_param': 1.0, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3873,0.181805,0.9507,0.952847,0.950869,0.951165
2,0.1486,0.16496,0.9599,0.960436,0.960083,0.959999
3,0.1277,0.15205,0.9682,0.96845,0.968428,0.968377
4,0.1186,0.145379,0.971,0.971293,0.971127,0.971184
5,0.1134,0.14303,0.9721,0.972535,0.972186,0.972279
6,0.1112,0.141171,0.9726,0.972856,0.972749,0.972748
7,0.11,0.139688,0.9727,0.972858,0.972882,0.972829


[I 2025-03-30 15:28:57,987] Trial 41 finished with value: 0.9728293194330183 and parameters: {'learning_rate': 9.869734112270565e-05, 'weight_decay': 0.01, 'warmup_steps': 32, 'lambda_param': 1.0, 'temperature': 3.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 42 with params: {'learning_rate': 0.0003597284442432274, 'weight_decay': 0.006, 'warmup_steps': 22, 'lambda_param': 1.0, 'temperature': 5.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3551,0.250065,0.9157,0.919228,0.915965,0.916016
2,0.1881,0.210241,0.9345,0.93655,0.934823,0.934764
3,0.1548,0.174082,0.9544,0.954591,0.954607,0.954524
4,0.1353,0.160222,0.964,0.964181,0.964143,0.964126


[I 2025-03-30 15:34:33,127] Trial 42 pruned. 


Trial 43 with params: {'learning_rate': 0.0032088988731785663, 'weight_decay': 0.003, 'warmup_steps': 32, 'lambda_param': 0.2, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.3704,1.657869,0.1034,0.075995,0.102519,0.047257
2,1.5161,1.517255,0.1373,0.094602,0.136857,0.075044


[I 2025-03-30 15:37:20,476] Trial 43 pruned. 


Trial 44 with params: {'learning_rate': 0.0014691315499909523, 'weight_decay': 0.009000000000000001, 'warmup_steps': 29, 'lambda_param': 0.9, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.6272,0.562682,0.7521,0.77285,0.751865,0.749841
2,0.4046,0.389463,0.8419,0.847449,0.842599,0.83954
3,0.305,0.340875,0.8676,0.876097,0.867535,0.867179
4,0.2399,0.252262,0.9116,0.912983,0.911548,0.911825


[I 2025-03-30 15:43:29,246] Trial 44 pruned. 


Trial 45 with params: {'learning_rate': 0.004229168606699789, 'weight_decay': 0.009000000000000001, 'warmup_steps': 24, 'lambda_param': 0.5, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4508,1.52468,0.1449,0.093415,0.145276,0.088245
2,1.5082,1.505709,0.1487,0.057201,0.148372,0.070046


[I 2025-03-30 15:46:16,892] Trial 45 pruned. 


Trial 46 with params: {'learning_rate': 0.00032851466793796933, 'weight_decay': 0.007, 'warmup_steps': 23, 'lambda_param': 0.7000000000000001, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3559,0.223177,0.9267,0.932606,0.926913,0.927708
2,0.183,0.203725,0.9385,0.940901,0.938672,0.938529
3,0.153,0.184675,0.95,0.952304,0.950025,0.950602
4,0.1338,0.160041,0.9628,0.963283,0.962985,0.962984


[I 2025-03-30 15:51:51,186] Trial 46 pruned. 


Trial 47 with params: {'learning_rate': 0.0025789104733638904, 'weight_decay': 0.002, 'warmup_steps': 27, 'lambda_param': 1.0, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.0701,1.567796,0.3463,0.483715,0.345982,0.320358
2,0.9351,0.974829,0.5181,0.552043,0.518281,0.508579


[I 2025-03-30 15:54:38,694] Trial 47 pruned. 


Trial 48 with params: {'learning_rate': 0.0027511979602444763, 'weight_decay': 0.005, 'warmup_steps': 1, 'lambda_param': 0.7000000000000001, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5496,1.572123,0.0853,0.063885,0.087715,0.044881
2,1.5205,1.528469,0.1413,0.131581,0.14062,0.107607


[I 2025-03-30 15:57:25,858] Trial 48 pruned. 


Trial 49 with params: {'learning_rate': 0.0015898708923464957, 'weight_decay': 0.004, 'warmup_steps': 17, 'lambda_param': 0.1, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.6412,0.574255,0.7476,0.765795,0.749107,0.747548
2,0.4232,0.375255,0.8507,0.862475,0.850888,0.850751
3,0.3185,0.32559,0.8744,0.880044,0.874527,0.875043
4,0.2486,0.26285,0.9047,0.907956,0.904687,0.905506


[I 2025-03-30 16:02:59,764] Trial 49 pruned. 


Trial 50 with params: {'learning_rate': 5.9361329005039714e-05, 'weight_decay': 0.01, 'warmup_steps': 4, 'lambda_param': 0.9, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3871,0.187124,0.9478,0.950539,0.947966,0.948114
2,0.1534,0.160328,0.9631,0.963759,0.963231,0.963328
3,0.1284,0.153439,0.9655,0.965712,0.965635,0.96566
4,0.1186,0.148924,0.9683,0.968385,0.968543,0.968418
5,0.1137,0.147572,0.969,0.969411,0.969164,0.969153
6,0.1116,0.144659,0.9698,0.970114,0.969974,0.969979
7,0.1101,0.144138,0.9709,0.971048,0.971058,0.971029


[I 2025-03-30 16:12:46,652] Trial 50 finished with value: 0.9710285170360388 and parameters: {'learning_rate': 5.9361329005039714e-05, 'weight_decay': 0.01, 'warmup_steps': 4, 'lambda_param': 0.9, 'temperature': 6.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 51 with params: {'learning_rate': 0.00018204615991542676, 'weight_decay': 0.008, 'warmup_steps': 19, 'lambda_param': 1.0, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3402,0.206386,0.9369,0.941645,0.936957,0.937803
2,0.1594,0.173622,0.9562,0.957019,0.956255,0.956474
3,0.1345,0.154647,0.9661,0.966978,0.966219,0.9664
4,0.1227,0.151076,0.9693,0.969805,0.969461,0.969508
5,0.1157,0.144026,0.9733,0.973492,0.973399,0.973409
6,0.1122,0.140447,0.9747,0.974885,0.974808,0.974822
7,0.1104,0.139311,0.9751,0.975294,0.975235,0.975237


[I 2025-03-30 16:22:34,243] Trial 51 finished with value: 0.9752365662843173 and parameters: {'learning_rate': 0.00018204615991542676, 'weight_decay': 0.008, 'warmup_steps': 19, 'lambda_param': 1.0, 'temperature': 2.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 52 with params: {'learning_rate': 0.00043569463522663814, 'weight_decay': 0.005, 'warmup_steps': 1, 'lambda_param': 0.9, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.323,0.265347,0.9089,0.910352,0.909367,0.908272
2,0.1978,0.211717,0.9353,0.937959,0.935669,0.935422
3,0.1616,0.193247,0.945,0.945979,0.944986,0.945115
4,0.1409,0.171935,0.9569,0.958221,0.956983,0.957242


[I 2025-03-30 16:28:09,842] Trial 52 pruned. 


Trial 53 with params: {'learning_rate': 0.00019274299123550742, 'weight_decay': 0.009000000000000001, 'warmup_steps': 23, 'lambda_param': 1.0, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3535,0.199232,0.9437,0.945001,0.943839,0.943987
2,0.1617,0.172106,0.9559,0.956328,0.956242,0.956035
3,0.1379,0.156376,0.9673,0.967777,0.967369,0.967428
4,0.125,0.153188,0.9669,0.967447,0.967088,0.967088


[I 2025-03-30 16:33:43,991] Trial 53 pruned. 


Trial 54 with params: {'learning_rate': 0.0002340759161127536, 'weight_decay': 0.005, 'warmup_steps': 29, 'lambda_param': 0.0, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3586,0.23,0.9213,0.928206,0.921453,0.921959
2,0.1691,0.192369,0.9447,0.94682,0.944852,0.945243
3,0.1421,0.162305,0.9616,0.962197,0.961697,0.961821
4,0.1272,0.154978,0.9663,0.966595,0.966334,0.966385


[I 2025-03-30 16:39:18,998] Trial 54 pruned. 


Trial 55 with params: {'learning_rate': 0.00046529059578259626, 'weight_decay': 0.007, 'warmup_steps': 18, 'lambda_param': 1.0, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3621,0.254579,0.9102,0.918926,0.910549,0.911307
2,0.2034,0.210571,0.9345,0.936562,0.93467,0.934734


[I 2025-03-30 16:42:06,166] Trial 55 pruned. 


Trial 56 with params: {'learning_rate': 8.274316768557815e-05, 'weight_decay': 0.008, 'warmup_steps': 16, 'lambda_param': 0.8, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3778,0.180278,0.9513,0.953356,0.951491,0.95154
2,0.1493,0.159726,0.9607,0.961644,0.9609,0.960984
3,0.1284,0.151752,0.9673,0.967371,0.967556,0.967373
4,0.118,0.144469,0.9717,0.971805,0.971867,0.971808
5,0.1137,0.141827,0.9733,0.973664,0.973477,0.973435
6,0.1113,0.140873,0.9736,0.973815,0.973722,0.973722
7,0.1099,0.140074,0.974,0.974101,0.974157,0.974104


[I 2025-03-30 16:51:54,171] Trial 56 finished with value: 0.974103624688017 and parameters: {'learning_rate': 8.274316768557815e-05, 'weight_decay': 0.008, 'warmup_steps': 16, 'lambda_param': 0.8, 'temperature': 3.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 57 with params: {'learning_rate': 6.432079156127297e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 16, 'lambda_param': 0.5, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3961,0.188329,0.9458,0.948334,0.946053,0.946218
2,0.1512,0.160956,0.9646,0.965254,0.964755,0.96481
3,0.1287,0.155237,0.9653,0.965662,0.965473,0.965437
4,0.1191,0.148097,0.9701,0.970286,0.970252,0.970214
5,0.1138,0.14443,0.9726,0.972971,0.972693,0.972718
6,0.1115,0.142386,0.9732,0.973311,0.973354,0.973319
7,0.11,0.141847,0.9721,0.97223,0.972264,0.972209


[I 2025-03-30 17:02:11,841] Trial 57 finished with value: 0.9722091702278648 and parameters: {'learning_rate': 6.432079156127297e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 16, 'lambda_param': 0.5, 'temperature': 3.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 58 with params: {'learning_rate': 0.00016893047242669506, 'weight_decay': 0.004, 'warmup_steps': 23, 'lambda_param': 0.9, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3475,0.22251,0.924,0.933395,0.924394,0.925195
2,0.1571,0.168189,0.9589,0.959265,0.959031,0.959018
3,0.136,0.159036,0.9632,0.963336,0.963408,0.963295
4,0.1229,0.153184,0.9665,0.967639,0.966732,0.966723


[I 2025-03-30 17:07:46,876] Trial 58 pruned. 


Trial 59 with params: {'learning_rate': 0.0028085976163393445, 'weight_decay': 0.002, 'warmup_steps': 17, 'lambda_param': 0.7000000000000001, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.1243,1.457994,0.2861,0.380311,0.286232,0.235398
2,1.0453,1.216419,0.3836,0.487607,0.383443,0.339553
3,0.9143,0.97837,0.4984,0.591465,0.498591,0.49634
4,0.8279,0.869814,0.568,0.60785,0.568162,0.568652


[I 2025-03-30 17:13:22,068] Trial 59 pruned. 


Trial 60 with params: {'learning_rate': 5.976931804392223e-05, 'weight_decay': 0.008, 'warmup_steps': 14, 'lambda_param': 0.7000000000000001, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4049,0.185022,0.9505,0.952126,0.950679,0.950767
2,0.1501,0.157181,0.9645,0.964984,0.964656,0.964744
3,0.1278,0.152666,0.9662,0.966551,0.966332,0.96635
4,0.1178,0.149727,0.9677,0.967776,0.967899,0.967811


[I 2025-03-30 17:18:56,326] Trial 60 pruned. 


Trial 61 with params: {'learning_rate': 0.00016980566072716556, 'weight_decay': 0.009000000000000001, 'warmup_steps': 17, 'lambda_param': 0.9, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3395,0.200779,0.9392,0.944286,0.939394,0.939963
2,0.1576,0.165803,0.9612,0.961797,0.96139,0.961421
3,0.1346,0.156845,0.9651,0.96544,0.965178,0.965239
4,0.1218,0.148291,0.9683,0.968578,0.968468,0.968478
5,0.1155,0.144012,0.9723,0.972591,0.972441,0.972393
6,0.1118,0.138811,0.9741,0.974252,0.97421,0.974215
7,0.1104,0.137963,0.9737,0.973895,0.973835,0.973832


[I 2025-03-30 17:28:41,477] Trial 61 finished with value: 0.973832399586948 and parameters: {'learning_rate': 0.00016980566072716556, 'weight_decay': 0.009000000000000001, 'warmup_steps': 17, 'lambda_param': 0.9, 'temperature': 2.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 62 with params: {'learning_rate': 0.00018710210526752272, 'weight_decay': 0.009000000000000001, 'warmup_steps': 12, 'lambda_param': 1.0, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.327,0.216795,0.9296,0.934208,0.930024,0.929951
2,0.1576,0.179418,0.9515,0.952489,0.951712,0.951585


[I 2025-03-30 17:31:27,973] Trial 62 pruned. 


Trial 63 with params: {'learning_rate': 0.00023171151664329458, 'weight_decay': 0.008, 'warmup_steps': 12, 'lambda_param': 0.6000000000000001, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3201,0.212095,0.9318,0.938166,0.931946,0.932893
2,0.1655,0.181174,0.9516,0.952671,0.951742,0.951788
3,0.1393,0.166486,0.9594,0.959751,0.95961,0.959512
4,0.1264,0.154456,0.9662,0.966338,0.966381,0.966289


[I 2025-03-30 17:37:03,248] Trial 63 pruned. 


Trial 64 with params: {'learning_rate': 0.00014037189452890584, 'weight_decay': 0.01, 'warmup_steps': 20, 'lambda_param': 0.8, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3497,0.185452,0.9493,0.951008,0.949532,0.949601
2,0.1547,0.165478,0.9596,0.960151,0.959713,0.959684
3,0.1319,0.156361,0.9644,0.96463,0.964633,0.96454
4,0.1205,0.146713,0.9702,0.970408,0.970432,0.970338
5,0.1145,0.142211,0.9732,0.973361,0.973351,0.973315
6,0.1116,0.139662,0.9745,0.974738,0.974642,0.974627
7,0.1101,0.138816,0.9744,0.974502,0.974581,0.974492


[I 2025-03-30 17:46:50,775] Trial 64 finished with value: 0.974492136838864 and parameters: {'learning_rate': 0.00014037189452890584, 'weight_decay': 0.01, 'warmup_steps': 20, 'lambda_param': 0.8, 'temperature': 3.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 65 with params: {'learning_rate': 0.0002071169980195919, 'weight_decay': 0.01, 'warmup_steps': 22, 'lambda_param': 0.8, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3424,0.212899,0.9316,0.935337,0.931838,0.932093
2,0.1621,0.179913,0.9546,0.955551,0.954857,0.954646
3,0.1391,0.160608,0.9624,0.962863,0.962631,0.962535
4,0.1251,0.153168,0.966,0.966404,0.966229,0.966162


[I 2025-03-30 17:52:26,588] Trial 65 pruned. 


Trial 66 with params: {'learning_rate': 5.0098619486030555e-05, 'weight_decay': 0.007, 'warmup_steps': 16, 'lambda_param': 1.0, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4287,0.183122,0.95,0.950904,0.950162,0.950178
2,0.1524,0.165513,0.9615,0.961837,0.961681,0.961637
3,0.1282,0.153858,0.9657,0.966263,0.965814,0.965901
4,0.1188,0.150746,0.9683,0.968597,0.968471,0.968417
5,0.1141,0.148432,0.9692,0.96963,0.969352,0.969368
6,0.1118,0.146714,0.9698,0.97001,0.969958,0.969948
7,0.1105,0.145982,0.9701,0.970308,0.970252,0.970233


[I 2025-03-30 18:02:13,594] Trial 66 finished with value: 0.9702332808336276 and parameters: {'learning_rate': 5.0098619486030555e-05, 'weight_decay': 0.007, 'warmup_steps': 16, 'lambda_param': 1.0, 'temperature': 2.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 67 with params: {'learning_rate': 0.00018437386835220431, 'weight_decay': 0.01, 'warmup_steps': 26, 'lambda_param': 0.7000000000000001, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3525,0.223034,0.9247,0.933881,0.925196,0.925496
2,0.1577,0.166898,0.9602,0.960715,0.960357,0.960363
3,0.1344,0.15695,0.9637,0.964087,0.963829,0.963913
4,0.123,0.148577,0.9714,0.971616,0.971529,0.971515
5,0.1159,0.141565,0.974,0.974346,0.974128,0.974127
6,0.1122,0.138625,0.9744,0.974691,0.974475,0.974517
7,0.1105,0.137614,0.9755,0.975628,0.975627,0.975603


[I 2025-03-30 18:12:00,868] Trial 67 finished with value: 0.9756034131850368 and parameters: {'learning_rate': 0.00018437386835220431, 'weight_decay': 0.01, 'warmup_steps': 26, 'lambda_param': 0.7000000000000001, 'temperature': 2.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 68 with params: {'learning_rate': 0.00017557083916535206, 'weight_decay': 0.01, 'warmup_steps': 31, 'lambda_param': 0.5, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3555,0.192884,0.9461,0.948304,0.946089,0.946632
2,0.1573,0.170393,0.9566,0.957141,0.956845,0.956665
3,0.1371,0.156116,0.9657,0.965863,0.96593,0.965767
4,0.1231,0.148301,0.9687,0.968893,0.968827,0.968791
5,0.1159,0.143751,0.972,0.972237,0.972167,0.972163
6,0.1123,0.139354,0.9725,0.972763,0.972604,0.972634
7,0.1104,0.137398,0.9743,0.974444,0.974436,0.974412


[I 2025-03-30 18:21:49,383] Trial 68 finished with value: 0.9744115924935196 and parameters: {'learning_rate': 0.00017557083916535206, 'weight_decay': 0.01, 'warmup_steps': 31, 'lambda_param': 0.5, 'temperature': 2.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 69 with params: {'learning_rate': 0.0004850647223008225, 'weight_decay': 0.009000000000000001, 'warmup_steps': 31, 'lambda_param': 0.8, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3941,0.254605,0.9089,0.915069,0.90888,0.909468
2,0.2106,0.23048,0.9225,0.929752,0.922957,0.922853


[I 2025-03-30 18:24:36,521] Trial 69 pruned. 


Trial 70 with params: {'learning_rate': 0.0001812744264729855, 'weight_decay': 0.01, 'warmup_steps': 20, 'lambda_param': 0.9, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3461,0.182199,0.9499,0.951387,0.950066,0.950171
2,0.1589,0.164744,0.9599,0.960428,0.960092,0.960064
3,0.1353,0.155177,0.9656,0.966197,0.965706,0.965847
4,0.1235,0.15502,0.967,0.967152,0.967157,0.967101


[I 2025-03-30 18:30:10,756] Trial 70 pruned. 


Trial 71 with params: {'learning_rate': 0.0004384700251936054, 'weight_decay': 0.009000000000000001, 'warmup_steps': 30, 'lambda_param': 0.30000000000000004, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3804,0.254584,0.9123,0.915484,0.912913,0.911979
2,0.2024,0.221374,0.9294,0.932067,0.929641,0.929173
3,0.1647,0.186272,0.9487,0.949899,0.948775,0.949026
4,0.1417,0.166289,0.9611,0.961444,0.961198,0.961233


[I 2025-03-30 18:35:45,000] Trial 71 pruned. 


Trial 72 with params: {'learning_rate': 0.0001157193379607402, 'weight_decay': 0.01, 'warmup_steps': 30, 'lambda_param': 0.5, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3732,0.187135,0.9473,0.949753,0.947616,0.947516
2,0.1503,0.172185,0.9557,0.956735,0.955959,0.955932
3,0.1295,0.151521,0.9678,0.968054,0.967958,0.967961
4,0.1197,0.146844,0.9712,0.971361,0.971376,0.971334
5,0.1143,0.143321,0.9708,0.971576,0.970961,0.971023
6,0.1117,0.139744,0.9732,0.973423,0.973326,0.973335
7,0.1103,0.138604,0.9749,0.975045,0.975072,0.975019


[I 2025-03-30 18:46:02,690] Trial 72 finished with value: 0.9750186842896913 and parameters: {'learning_rate': 0.0001157193379607402, 'weight_decay': 0.01, 'warmup_steps': 30, 'lambda_param': 0.5, 'temperature': 2.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 73 with params: {'learning_rate': 0.00016433668825404572, 'weight_decay': 0.01, 'warmup_steps': 22, 'lambda_param': 0.6000000000000001, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3447,0.19176,0.9462,0.948953,0.946248,0.946636
2,0.1569,0.162277,0.9623,0.962926,0.96249,0.962574
3,0.1341,0.156675,0.9656,0.96597,0.965764,0.965817
4,0.1214,0.15066,0.9692,0.96956,0.969441,0.969348
5,0.1154,0.145824,0.9709,0.971678,0.971095,0.971093
6,0.112,0.14008,0.9738,0.973916,0.973952,0.97391
7,0.1103,0.138404,0.9757,0.975743,0.975859,0.975779


[I 2025-03-30 18:56:23,631] Trial 73 finished with value: 0.9757787767116017 and parameters: {'learning_rate': 0.00016433668825404572, 'weight_decay': 0.01, 'warmup_steps': 22, 'lambda_param': 0.6000000000000001, 'temperature': 2.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 74 with params: {'learning_rate': 0.00024147004208896432, 'weight_decay': 0.01, 'warmup_steps': 24, 'lambda_param': 0.6000000000000001, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3454,0.211408,0.933,0.938416,0.933404,0.933459
2,0.1686,0.178357,0.9526,0.953069,0.952839,0.952669


[I 2025-03-30 18:59:11,772] Trial 74 pruned. 


Trial 75 with params: {'learning_rate': 0.00011732516921287371, 'weight_decay': 0.01, 'warmup_steps': 29, 'lambda_param': 0.6000000000000001, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3658,0.18651,0.9467,0.949909,0.946781,0.947177
2,0.1522,0.163599,0.9597,0.960712,0.959727,0.959995
3,0.1302,0.157304,0.9638,0.964518,0.964046,0.964034
4,0.1189,0.150052,0.9688,0.968828,0.969016,0.968893
5,0.1139,0.143664,0.971,0.971478,0.971129,0.971197
6,0.1115,0.141027,0.9732,0.9735,0.973308,0.97337
7,0.1101,0.139608,0.9736,0.973831,0.973739,0.973757


[I 2025-03-30 19:08:59,478] Trial 75 finished with value: 0.9737566722834966 and parameters: {'learning_rate': 0.00011732516921287371, 'weight_decay': 0.01, 'warmup_steps': 29, 'lambda_param': 0.6000000000000001, 'temperature': 2.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 76 with params: {'learning_rate': 6.499882416252976e-05, 'weight_decay': 0.008, 'warmup_steps': 24, 'lambda_param': 0.2, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.409,0.178598,0.9539,0.954815,0.954053,0.954103
2,0.1524,0.164578,0.9613,0.961614,0.961463,0.961414
3,0.129,0.154315,0.967,0.967099,0.967171,0.96706
4,0.1183,0.150236,0.9683,0.968421,0.968428,0.9684
5,0.1139,0.147812,0.9683,0.968845,0.968445,0.968466
6,0.1118,0.144484,0.9703,0.970612,0.970397,0.970427
7,0.1103,0.143547,0.9708,0.970922,0.970948,0.970883


[I 2025-03-30 19:18:47,739] Trial 76 finished with value: 0.970882645809992 and parameters: {'learning_rate': 6.499882416252976e-05, 'weight_decay': 0.008, 'warmup_steps': 24, 'lambda_param': 0.2, 'temperature': 2.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 77 with params: {'learning_rate': 5.112429509287801e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 22, 'lambda_param': 0.6000000000000001, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4445,0.181496,0.9501,0.952788,0.950338,0.950538
2,0.1518,0.16298,0.961,0.961463,0.961128,0.961169
3,0.1289,0.154632,0.9642,0.964398,0.964328,0.964319
4,0.1183,0.151474,0.9666,0.966756,0.966773,0.966736


[I 2025-03-30 19:24:22,853] Trial 77 pruned. 


Trial 78 with params: {'learning_rate': 0.00027811595208962893, 'weight_decay': 0.004, 'warmup_steps': 3, 'lambda_param': 0.2, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3099,0.213037,0.9352,0.938137,0.935488,0.935351
2,0.1724,0.185315,0.9507,0.952196,0.951004,0.950766
3,0.1459,0.165353,0.9608,0.960995,0.960971,0.960928
4,0.1298,0.156759,0.9648,0.965641,0.964936,0.965078


[I 2025-03-30 19:29:58,093] Trial 78 pruned. 


Trial 79 with params: {'learning_rate': 0.0001886911249849553, 'weight_decay': 0.01, 'warmup_steps': 23, 'lambda_param': 1.0, 'temperature': 5.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3422,0.213555,0.9299,0.938377,0.930249,0.931171
2,0.1603,0.169651,0.9597,0.960071,0.95987,0.959738
3,0.137,0.165168,0.9586,0.959365,0.958796,0.958879
4,0.1236,0.156781,0.9638,0.964327,0.963925,0.964046


[I 2025-03-30 19:35:32,247] Trial 79 pruned. 


Trial 80 with params: {'learning_rate': 0.0029063834285411286, 'weight_decay': 0.01, 'warmup_steps': 6, 'lambda_param': 0.5, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5044,1.540771,0.1092,0.089861,0.112691,0.05803
2,1.5303,1.55446,0.1073,0.038104,0.105974,0.041242
3,1.5539,1.546471,0.1259,0.066471,0.124758,0.050592
4,1.5403,1.542118,0.1195,0.108646,0.118952,0.064561


[I 2025-03-30 19:41:07,192] Trial 80 pruned. 


Trial 81 with params: {'learning_rate': 0.00012529294005663154, 'weight_decay': 0.01, 'warmup_steps': 19, 'lambda_param': 0.4, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3553,0.180832,0.9498,0.953294,0.949933,0.950404
2,0.1502,0.162118,0.9623,0.962488,0.962497,0.962395
3,0.1304,0.153666,0.9654,0.966526,0.965531,0.965638
4,0.1199,0.146885,0.9729,0.973086,0.973043,0.973001
5,0.1141,0.141135,0.9741,0.974379,0.974241,0.974242
6,0.1115,0.139685,0.9738,0.974096,0.973943,0.973963
7,0.11,0.137842,0.9746,0.974804,0.974759,0.974716


[I 2025-03-30 19:51:21,176] Trial 81 finished with value: 0.9747164107173752 and parameters: {'learning_rate': 0.00012529294005663154, 'weight_decay': 0.01, 'warmup_steps': 19, 'lambda_param': 0.4, 'temperature': 3.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 82 with params: {'learning_rate': 0.0001749207784979569, 'weight_decay': 0.01, 'warmup_steps': 16, 'lambda_param': 0.4, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3353,0.206263,0.9349,0.938002,0.935396,0.935076
2,0.159,0.166496,0.9609,0.961526,0.961027,0.961055
3,0.135,0.155126,0.9671,0.967236,0.967317,0.967233
4,0.1226,0.14932,0.9689,0.969049,0.969046,0.968993
5,0.1158,0.142377,0.9712,0.971365,0.971282,0.971305
6,0.1123,0.139334,0.9749,0.975028,0.97504,0.974992
7,0.1105,0.137911,0.9755,0.975587,0.975656,0.975601


[I 2025-03-30 20:01:10,356] Trial 82 finished with value: 0.9756009071838184 and parameters: {'learning_rate': 0.0001749207784979569, 'weight_decay': 0.01, 'warmup_steps': 16, 'lambda_param': 0.4, 'temperature': 2.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 83 with params: {'learning_rate': 0.0002580032025713817, 'weight_decay': 0.01, 'warmup_steps': 19, 'lambda_param': 0.4, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3431,0.216035,0.9336,0.937974,0.933845,0.934399
2,0.1717,0.196299,0.944,0.944625,0.944206,0.943832


[I 2025-03-30 20:04:12,267] Trial 83 pruned. 


Trial 84 with params: {'learning_rate': 0.0002588018899047277, 'weight_decay': 0.009000000000000001, 'warmup_steps': 15, 'lambda_param': 0.2, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3345,0.208692,0.934,0.938839,0.934418,0.934423
2,0.1698,0.175797,0.9548,0.954927,0.955001,0.954798
3,0.1425,0.173455,0.9576,0.958385,0.957829,0.957728
4,0.1271,0.156149,0.9661,0.966251,0.966349,0.966187


[I 2025-03-30 20:09:50,178] Trial 84 pruned. 


Trial 85 with params: {'learning_rate': 0.00019889637834019354, 'weight_decay': 0.009000000000000001, 'warmup_steps': 19, 'lambda_param': 0.5, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3374,0.189616,0.9458,0.946926,0.945947,0.946172
2,0.1614,0.175965,0.9567,0.95736,0.956886,0.95682
3,0.1366,0.163107,0.9604,0.961064,0.960566,0.960591
4,0.1247,0.152604,0.9681,0.968327,0.968192,0.968199


[I 2025-03-30 20:15:27,369] Trial 85 pruned. 


Trial 86 with params: {'learning_rate': 0.00015278486360010863, 'weight_decay': 0.01, 'warmup_steps': 17, 'lambda_param': 0.4, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3381,0.185679,0.9481,0.951137,0.948312,0.948558
2,0.1557,0.165597,0.9594,0.959352,0.959712,0.959398
3,0.1334,0.155223,0.9657,0.96595,0.965833,0.96586
4,0.1227,0.148632,0.9702,0.970466,0.97037,0.970279
5,0.1155,0.141972,0.9729,0.973468,0.973086,0.97304
6,0.1122,0.140205,0.973,0.973499,0.973132,0.973194
7,0.1104,0.138208,0.9737,0.973933,0.973829,0.973816


[I 2025-03-30 20:25:23,418] Trial 86 finished with value: 0.9738163054614686 and parameters: {'learning_rate': 0.00015278486360010863, 'weight_decay': 0.01, 'warmup_steps': 17, 'lambda_param': 0.4, 'temperature': 3.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 87 with params: {'learning_rate': 0.00013537140485040273, 'weight_decay': 0.006, 'warmup_steps': 27, 'lambda_param': 0.4, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3668,0.19217,0.9416,0.945538,0.941974,0.941968
2,0.1538,0.161411,0.9617,0.961995,0.961847,0.961866
3,0.1305,0.1533,0.967,0.967381,0.967188,0.96709
4,0.1199,0.15088,0.9654,0.966007,0.96558,0.965573


[I 2025-03-30 20:31:11,901] Trial 87 pruned. 


Trial 88 with params: {'learning_rate': 0.0001033215290468983, 'weight_decay': 0.01, 'warmup_steps': 19, 'lambda_param': 0.30000000000000004, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3588,0.171795,0.9575,0.958062,0.957589,0.957714
2,0.1511,0.158637,0.9641,0.964435,0.964279,0.964275
3,0.1286,0.155178,0.9646,0.965101,0.964806,0.964766
4,0.1191,0.148693,0.9686,0.968806,0.968775,0.968741
5,0.1142,0.143048,0.9722,0.972413,0.97237,0.972304
6,0.1116,0.141215,0.9725,0.972744,0.972626,0.972649
7,0.1101,0.140317,0.9732,0.97333,0.973377,0.973322


[I 2025-03-30 20:41:11,891] Trial 88 finished with value: 0.9733221379420354 and parameters: {'learning_rate': 0.0001033215290468983, 'weight_decay': 0.01, 'warmup_steps': 19, 'lambda_param': 0.30000000000000004, 'temperature': 2.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 89 with params: {'learning_rate': 0.00013193531441646044, 'weight_decay': 0.01, 'warmup_steps': 22, 'lambda_param': 0.4, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3525,0.199868,0.9416,0.945038,0.941795,0.942145
2,0.1528,0.164561,0.9587,0.959017,0.958966,0.958773
3,0.1306,0.151762,0.9669,0.967158,0.967131,0.967032
4,0.1197,0.144765,0.9734,0.973589,0.973496,0.973526
5,0.1151,0.144946,0.971,0.971806,0.971158,0.971215
6,0.1119,0.139577,0.9734,0.973674,0.973527,0.973565
7,0.1103,0.137958,0.9739,0.974037,0.974051,0.974031


[I 2025-03-30 20:51:06,501] Trial 89 finished with value: 0.9740306476430345 and parameters: {'learning_rate': 0.00013193531441646044, 'weight_decay': 0.01, 'warmup_steps': 22, 'lambda_param': 0.4, 'temperature': 3.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 90 with params: {'learning_rate': 0.0011115662517499805, 'weight_decay': 0.004, 'warmup_steps': 24, 'lambda_param': 0.6000000000000001, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5212,0.410641,0.8337,0.844341,0.834078,0.833282
2,0.3272,0.328325,0.8695,0.877821,0.869563,0.869612
3,0.2529,0.300791,0.8857,0.892588,0.885482,0.885378
4,0.2013,0.218692,0.9304,0.932336,0.930563,0.930764


[I 2025-03-30 20:56:49,263] Trial 90 pruned. 


Trial 91 with params: {'learning_rate': 0.0002157871176617988, 'weight_decay': 0.009000000000000001, 'warmup_steps': 4, 'lambda_param': 1.0, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3101,0.229145,0.9252,0.928173,0.9257,0.925362
2,0.1656,0.186202,0.9509,0.952606,0.951106,0.951179


[I 2025-03-30 20:59:43,674] Trial 91 pruned. 


Trial 92 with params: {'learning_rate': 0.0015837356481811218, 'weight_decay': 0.006, 'warmup_steps': 15, 'lambda_param': 0.1, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.6416,0.553646,0.7551,0.775525,0.756181,0.752897
2,0.4099,0.415454,0.8268,0.837709,0.827922,0.821098
3,0.3174,0.35894,0.8581,0.867919,0.858045,0.858572
4,0.2524,0.273181,0.9044,0.905919,0.904349,0.903942


[I 2025-03-30 21:05:53,902] Trial 92 pruned. 


Trial 93 with params: {'learning_rate': 5.9059829250360414e-05, 'weight_decay': 0.008, 'warmup_steps': 31, 'lambda_param': 0.5, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4466,0.191074,0.9464,0.947795,0.946538,0.946521
2,0.154,0.167823,0.96,0.960328,0.960146,0.960085
3,0.1289,0.155333,0.9658,0.966287,0.966015,0.965925
4,0.1193,0.150021,0.9682,0.968307,0.968371,0.96828
5,0.1143,0.147801,0.9693,0.969788,0.969456,0.96946
6,0.1119,0.145963,0.9706,0.970919,0.970763,0.970741
7,0.1103,0.14494,0.9709,0.971064,0.971056,0.971019


[I 2025-03-30 21:15:48,436] Trial 93 finished with value: 0.9710189663247982 and parameters: {'learning_rate': 5.9059829250360414e-05, 'weight_decay': 0.008, 'warmup_steps': 31, 'lambda_param': 0.5, 'temperature': 2.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 94 with params: {'learning_rate': 8.785284362480978e-05, 'weight_decay': 0.006, 'warmup_steps': 26, 'lambda_param': 0.8, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3916,0.17503,0.9536,0.954733,0.953812,0.953885
2,0.149,0.157145,0.965,0.965488,0.965028,0.965164
3,0.1283,0.155567,0.9664,0.966837,0.966587,0.966514
4,0.1181,0.147408,0.9699,0.970041,0.970054,0.970015
5,0.1137,0.142978,0.9726,0.972812,0.972716,0.972729
6,0.1111,0.141445,0.9725,0.972809,0.972608,0.972677
7,0.1098,0.140168,0.9725,0.972588,0.972649,0.972599


[I 2025-03-30 21:26:19,208] Trial 94 finished with value: 0.9725990447347085 and parameters: {'learning_rate': 8.785284362480978e-05, 'weight_decay': 0.006, 'warmup_steps': 26, 'lambda_param': 0.8, 'temperature': 2.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 95 with params: {'learning_rate': 0.00033622652480271855, 'weight_decay': 0.0, 'warmup_steps': 5, 'lambda_param': 0.4, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3194,0.22867,0.9252,0.928747,0.925529,0.92559
2,0.1809,0.178087,0.9546,0.955043,0.954878,0.954772


[I 2025-03-30 21:29:11,013] Trial 95 pruned. 


Trial 96 with params: {'learning_rate': 5.399635979922363e-05, 'weight_decay': 0.0, 'warmup_steps': 26, 'lambda_param': 0.30000000000000004, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4406,0.183983,0.9507,0.951909,0.950874,0.950932
2,0.1544,0.16247,0.9613,0.961669,0.961419,0.961481
3,0.1293,0.153116,0.9669,0.967098,0.96711,0.966996
4,0.1188,0.149185,0.9679,0.968085,0.968078,0.968017
5,0.1141,0.146138,0.9703,0.970538,0.970494,0.970427
6,0.1117,0.145281,0.9705,0.970724,0.970618,0.970644
7,0.1104,0.143986,0.9697,0.969857,0.969851,0.969816


[I 2025-03-30 21:39:24,925] Trial 96 finished with value: 0.969816101362144 and parameters: {'learning_rate': 5.399635979922363e-05, 'weight_decay': 0.0, 'warmup_steps': 26, 'lambda_param': 0.30000000000000004, 'temperature': 3.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 97 with params: {'learning_rate': 0.0002985710024151608, 'weight_decay': 0.008, 'warmup_steps': 26, 'lambda_param': 1.0, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3544,0.20852,0.9328,0.935365,0.933206,0.933129
2,0.1762,0.204137,0.9408,0.9436,0.940765,0.94119


[I 2025-03-30 21:42:17,866] Trial 97 pruned. 


Trial 98 with params: {'learning_rate': 0.00010935130174798839, 'weight_decay': 0.007, 'warmup_steps': 23, 'lambda_param': 1.0, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3653,0.183729,0.9481,0.950917,0.948132,0.948606
2,0.1508,0.164699,0.9623,0.962874,0.962483,0.962471
3,0.1291,0.154522,0.966,0.966215,0.966142,0.966111
4,0.119,0.146897,0.9691,0.969352,0.969292,0.969224
5,0.1142,0.144389,0.9712,0.971837,0.971304,0.971415
6,0.1116,0.140245,0.9737,0.973947,0.973828,0.973848
7,0.1101,0.139126,0.9733,0.973497,0.97344,0.97344


[I 2025-03-30 21:52:16,624] Trial 98 finished with value: 0.9734400880373913 and parameters: {'learning_rate': 0.00010935130174798839, 'weight_decay': 0.007, 'warmup_steps': 23, 'lambda_param': 1.0, 'temperature': 4.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 99 with params: {'learning_rate': 8.710007471084877e-05, 'weight_decay': 0.01, 'warmup_steps': 17, 'lambda_param': 0.30000000000000004, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.371,0.177187,0.9549,0.956002,0.954946,0.955212
2,0.1502,0.158277,0.9642,0.964298,0.964379,0.964284
3,0.1287,0.155291,0.9637,0.965129,0.963854,0.963993
4,0.1191,0.148223,0.9695,0.969717,0.969641,0.969665
5,0.1139,0.145838,0.9694,0.970057,0.969551,0.969609
6,0.1116,0.14153,0.9721,0.972489,0.972203,0.972288
7,0.1102,0.140745,0.9725,0.972732,0.972635,0.972658


[I 2025-03-30 22:02:50,484] Trial 99 finished with value: 0.9726580815624283 and parameters: {'learning_rate': 8.710007471084877e-05, 'weight_decay': 0.01, 'warmup_steps': 17, 'lambda_param': 0.30000000000000004, 'temperature': 6.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 100 with params: {'learning_rate': 0.00012162617401836313, 'weight_decay': 0.008, 'warmup_steps': 27, 'lambda_param': 0.8, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3681,0.186652,0.9478,0.950634,0.948114,0.948186
2,0.1519,0.17029,0.9574,0.958173,0.957605,0.95743
3,0.131,0.152698,0.9678,0.968138,0.967977,0.968005
4,0.1194,0.149427,0.9693,0.96974,0.969459,0.969499
5,0.1145,0.142968,0.9733,0.97348,0.973484,0.973418
6,0.1116,0.140488,0.9725,0.972728,0.972636,0.972651
7,0.11,0.139907,0.9721,0.972163,0.972268,0.972196


[I 2025-03-30 22:13:04,239] Trial 100 finished with value: 0.9721960489643802 and parameters: {'learning_rate': 0.00012162617401836313, 'weight_decay': 0.008, 'warmup_steps': 27, 'lambda_param': 0.8, 'temperature': 2.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 101 with params: {'learning_rate': 8.468238290855145e-05, 'weight_decay': 0.01, 'warmup_steps': 23, 'lambda_param': 0.7000000000000001, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3854,0.179661,0.9529,0.955173,0.953059,0.953278
2,0.1492,0.159693,0.9648,0.965009,0.964953,0.964903
3,0.1283,0.151848,0.9689,0.969195,0.969082,0.968981
4,0.1187,0.146258,0.9724,0.972666,0.972434,0.97253
5,0.1139,0.14122,0.9741,0.97448,0.974181,0.974249
6,0.1112,0.139623,0.9732,0.973523,0.973298,0.973352
7,0.1099,0.138841,0.9736,0.97381,0.973706,0.973717


[I 2025-03-30 22:23:05,224] Trial 101 finished with value: 0.9737171745496273 and parameters: {'learning_rate': 8.468238290855145e-05, 'weight_decay': 0.01, 'warmup_steps': 23, 'lambda_param': 0.7000000000000001, 'temperature': 3.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 102 with params: {'learning_rate': 0.00021237173133186566, 'weight_decay': 0.01, 'warmup_steps': 25, 'lambda_param': 1.0, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3525,0.203906,0.939,0.941147,0.939122,0.939338
2,0.1624,0.182137,0.9526,0.953741,0.95294,0.952592
3,0.1384,0.163992,0.9608,0.961522,0.960925,0.961047
4,0.1249,0.149932,0.97,0.97015,0.970133,0.970103
5,0.1162,0.146134,0.9699,0.970436,0.970045,0.9701
6,0.1126,0.140681,0.9741,0.974361,0.974207,0.974242
7,0.1106,0.139535,0.9744,0.974558,0.97455,0.974523


[I 2025-03-30 22:33:25,513] Trial 102 finished with value: 0.9745229570554542 and parameters: {'learning_rate': 0.00021237173133186566, 'weight_decay': 0.01, 'warmup_steps': 25, 'lambda_param': 1.0, 'temperature': 3.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 103 with params: {'learning_rate': 0.0003439804710936911, 'weight_decay': 0.01, 'warmup_steps': 24, 'lambda_param': 1.0, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3616,0.253643,0.9099,0.915584,0.910371,0.909996
2,0.1833,0.210072,0.9387,0.941549,0.938751,0.939


[I 2025-03-30 22:36:12,668] Trial 103 pruned. 


Trial 104 with params: {'learning_rate': 0.00023793889138512282, 'weight_decay': 0.01, 'warmup_steps': 26, 'lambda_param': 1.0, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3493,0.24071,0.9167,0.925365,0.91724,0.91788
2,0.1665,0.172419,0.9578,0.958254,0.958069,0.957889
3,0.1405,0.163229,0.9629,0.963249,0.963158,0.963091
4,0.1276,0.154535,0.9664,0.966471,0.966631,0.966493


[I 2025-03-30 22:42:07,362] Trial 104 pruned. 


Trial 105 with params: {'learning_rate': 0.001394113520827695, 'weight_decay': 0.002, 'warmup_steps': 31, 'lambda_param': 1.0, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.6094,0.542065,0.7605,0.805146,0.760081,0.766905
2,0.3786,0.394766,0.8407,0.847074,0.841383,0.840135


[I 2025-03-30 22:44:59,148] Trial 105 pruned. 


Trial 106 with params: {'learning_rate': 0.00016644555832767357, 'weight_decay': 0.0, 'warmup_steps': 2, 'lambda_param': 0.30000000000000004, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3118,0.196814,0.9435,0.947005,0.943655,0.943985
2,0.1569,0.183912,0.9498,0.951104,0.950078,0.949986
3,0.1356,0.159749,0.9638,0.963886,0.964032,0.963875
4,0.1221,0.150775,0.9673,0.967338,0.967495,0.967372


[I 2025-03-30 22:50:37,926] Trial 106 pruned. 


Trial 107 with params: {'learning_rate': 0.00012018461491622113, 'weight_decay': 0.01, 'warmup_steps': 14, 'lambda_param': 0.8, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3417,0.195284,0.9411,0.94525,0.941308,0.94153
2,0.1505,0.166771,0.9611,0.962002,0.961257,0.961313
3,0.1293,0.157388,0.9653,0.965804,0.965457,0.965452
4,0.1199,0.146623,0.9715,0.971813,0.971678,0.971632
5,0.1144,0.14289,0.9727,0.97318,0.972834,0.972895
6,0.1118,0.139618,0.9745,0.974637,0.974659,0.974633
7,0.11,0.138602,0.9752,0.975362,0.975351,0.975335


[I 2025-03-30 23:00:49,703] Trial 107 finished with value: 0.9753346264809897 and parameters: {'learning_rate': 0.00012018461491622113, 'weight_decay': 0.01, 'warmup_steps': 14, 'lambda_param': 0.8, 'temperature': 4.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 108 with params: {'learning_rate': 0.00014516609330979537, 'weight_decay': 0.01, 'warmup_steps': 14, 'lambda_param': 1.0, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3315,0.192378,0.9445,0.947608,0.944885,0.94477
2,0.1523,0.169534,0.958,0.958637,0.958133,0.958259
3,0.1302,0.151798,0.9682,0.968529,0.968275,0.968291
4,0.1207,0.148449,0.9691,0.96937,0.969215,0.969235
5,0.1152,0.141565,0.9735,0.973721,0.973609,0.973607
6,0.1117,0.13929,0.9742,0.974492,0.974333,0.974343
7,0.1104,0.138052,0.9754,0.975529,0.975543,0.975494


[I 2025-03-30 23:10:51,000] Trial 108 finished with value: 0.9754935135193004 and parameters: {'learning_rate': 0.00014516609330979537, 'weight_decay': 0.01, 'warmup_steps': 14, 'lambda_param': 1.0, 'temperature': 3.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 109 with params: {'learning_rate': 8.642340091115601e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 12, 'lambda_param': 0.9, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3618,0.191682,0.9462,0.949399,0.946402,0.946526
2,0.1495,0.159697,0.9646,0.965139,0.964698,0.964797
3,0.1286,0.152519,0.966,0.96661,0.966215,0.966162
4,0.1186,0.14729,0.9704,0.9705,0.970572,0.97051
5,0.1137,0.141227,0.9723,0.972674,0.972452,0.972471
6,0.1112,0.140252,0.973,0.973372,0.973136,0.973153
7,0.1098,0.139498,0.973,0.973204,0.973165,0.973137


[I 2025-03-30 23:20:57,368] Trial 109 finished with value: 0.9731369341767359 and parameters: {'learning_rate': 8.642340091115601e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 12, 'lambda_param': 0.9, 'temperature': 4.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 110 with params: {'learning_rate': 0.00021474458984009075, 'weight_decay': 0.01, 'warmup_steps': 12, 'lambda_param': 0.7000000000000001, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3274,0.214375,0.9348,0.939259,0.934819,0.935019
2,0.1634,0.176028,0.9537,0.955206,0.953886,0.954019
3,0.1383,0.164537,0.9608,0.962426,0.961035,0.961095
4,0.1244,0.155969,0.9661,0.966883,0.966366,0.966236


[I 2025-03-30 23:26:47,552] Trial 110 pruned. 


Trial 111 with params: {'learning_rate': 0.00034532268462300755, 'weight_decay': 0.01, 'warmup_steps': 14, 'lambda_param': 1.0, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3351,0.224306,0.9282,0.931075,0.928273,0.928871
2,0.181,0.208908,0.9356,0.938831,0.935757,0.935887


[I 2025-03-30 23:29:39,843] Trial 111 pruned. 


Trial 112 with params: {'learning_rate': 6.786706512825958e-05, 'weight_decay': 0.007, 'warmup_steps': 14, 'lambda_param': 0.6000000000000001, 'temperature': 5.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3914,0.181625,0.9525,0.953315,0.952575,0.952641
2,0.1526,0.161306,0.9638,0.964233,0.963904,0.963943
3,0.1291,0.15461,0.9654,0.965641,0.965619,0.965452
4,0.1182,0.147838,0.9715,0.971606,0.971601,0.971577
5,0.1139,0.14535,0.9706,0.970847,0.97073,0.9707
6,0.1118,0.143499,0.9718,0.971993,0.971952,0.97189
7,0.1101,0.142822,0.9727,0.972784,0.972854,0.97279


[I 2025-03-30 23:39:55,502] Trial 112 finished with value: 0.9727903980069741 and parameters: {'learning_rate': 6.786706512825958e-05, 'weight_decay': 0.007, 'warmup_steps': 14, 'lambda_param': 0.6000000000000001, 'temperature': 5.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 113 with params: {'learning_rate': 0.00041802949909460917, 'weight_decay': 0.01, 'warmup_steps': 14, 'lambda_param': 0.4, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3561,0.241661,0.919,0.921333,0.919318,0.919345
2,0.194,0.219355,0.9307,0.933865,0.931042,0.930584


[I 2025-03-30 23:42:50,878] Trial 113 pruned. 


Trial 114 with params: {'learning_rate': 0.00011017032474017522, 'weight_decay': 0.009000000000000001, 'warmup_steps': 23, 'lambda_param': 1.0, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3613,0.186369,0.949,0.951559,0.949265,0.949462
2,0.1492,0.167402,0.9588,0.959112,0.959077,0.958893
3,0.129,0.155418,0.965,0.965369,0.965215,0.965133
4,0.1191,0.150452,0.9681,0.968363,0.968333,0.968247
5,0.1141,0.14355,0.9707,0.970911,0.970867,0.970816
6,0.1117,0.141552,0.9724,0.972598,0.972557,0.972489
7,0.1101,0.140479,0.9721,0.972264,0.972294,0.972208


[I 2025-03-30 23:52:50,190] Trial 114 finished with value: 0.9722081968816582 and parameters: {'learning_rate': 0.00011017032474017522, 'weight_decay': 0.009000000000000001, 'warmup_steps': 23, 'lambda_param': 1.0, 'temperature': 2.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 115 with params: {'learning_rate': 0.00010663191173618783, 'weight_decay': 0.01, 'warmup_steps': 17, 'lambda_param': 0.9, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3598,0.194474,0.945,0.948549,0.945104,0.945625
2,0.1507,0.164556,0.9604,0.960774,0.960616,0.960509
3,0.1286,0.154521,0.9654,0.965741,0.965586,0.965524
4,0.1188,0.149716,0.9685,0.968863,0.968671,0.968664
5,0.1138,0.144369,0.97,0.970451,0.970155,0.97017
6,0.1113,0.140491,0.9731,0.973464,0.973184,0.973248
7,0.1099,0.139141,0.974,0.974152,0.974133,0.974095


[I 2025-03-31 00:02:37,059] Trial 115 finished with value: 0.974095377621454 and parameters: {'learning_rate': 0.00010663191173618783, 'weight_decay': 0.01, 'warmup_steps': 17, 'lambda_param': 0.9, 'temperature': 3.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 116 with params: {'learning_rate': 0.0001505315972511228, 'weight_decay': 0.008, 'warmup_steps': 17, 'lambda_param': 0.7000000000000001, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3439,0.188894,0.9461,0.948645,0.946306,0.946464
2,0.1534,0.168651,0.9604,0.960995,0.960543,0.960523
3,0.1324,0.156135,0.964,0.964235,0.964143,0.964158
4,0.1217,0.14794,0.9696,0.969945,0.969775,0.96978
5,0.1154,0.142763,0.973,0.973224,0.973117,0.973098
6,0.1121,0.140127,0.9744,0.97473,0.974533,0.974525
7,0.1103,0.138518,0.9747,0.974896,0.974835,0.974787


[I 2025-03-31 00:12:23,050] Trial 116 finished with value: 0.9747865955413262 and parameters: {'learning_rate': 0.0001505315972511228, 'weight_decay': 0.008, 'warmup_steps': 17, 'lambda_param': 0.7000000000000001, 'temperature': 4.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 117 with params: {'learning_rate': 0.00017806160911382557, 'weight_decay': 0.007, 'warmup_steps': 18, 'lambda_param': 0.8, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3373,0.197981,0.9415,0.94396,0.941676,0.941892
2,0.1587,0.179313,0.9532,0.95429,0.953404,0.953322


[I 2025-03-31 00:15:09,251] Trial 117 pruned. 


Trial 118 with params: {'learning_rate': 0.0001228070896986533, 'weight_decay': 0.01, 'warmup_steps': 10, 'lambda_param': 0.30000000000000004, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3349,0.192172,0.9436,0.948215,0.943675,0.944322
2,0.1501,0.159745,0.9627,0.962969,0.962889,0.962877
3,0.1296,0.152061,0.967,0.967151,0.967161,0.967137
4,0.1196,0.147309,0.9719,0.972325,0.972031,0.972078
5,0.1144,0.143055,0.9717,0.972336,0.971849,0.971879
6,0.1116,0.140576,0.9734,0.973711,0.973531,0.97356
7,0.1103,0.139378,0.973,0.973201,0.973166,0.973149


[I 2025-03-31 00:25:09,692] Trial 118 finished with value: 0.9731491579885653 and parameters: {'learning_rate': 0.0001228070896986533, 'weight_decay': 0.01, 'warmup_steps': 10, 'lambda_param': 0.30000000000000004, 'temperature': 2.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 119 with params: {'learning_rate': 7.77983144665788e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 17, 'lambda_param': 0.9, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3789,0.178716,0.9546,0.95598,0.95477,0.95491
2,0.1487,0.16013,0.9657,0.965906,0.965799,0.965763
3,0.1269,0.153728,0.9658,0.966491,0.965969,0.966003
4,0.118,0.14843,0.9708,0.971158,0.970878,0.970945
5,0.1138,0.144991,0.971,0.971378,0.971086,0.971153
6,0.1115,0.143566,0.9707,0.971069,0.970812,0.970856
7,0.11,0.142682,0.9713,0.971584,0.971418,0.971441


[I 2025-03-31 00:34:54,808] Trial 119 finished with value: 0.9714414969843336 and parameters: {'learning_rate': 7.77983144665788e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 17, 'lambda_param': 0.9, 'temperature': 6.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 120 with params: {'learning_rate': 0.00028723135334077736, 'weight_decay': 0.01, 'warmup_steps': 28, 'lambda_param': 0.5, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3575,0.237404,0.9196,0.927615,0.920034,0.920669
2,0.1771,0.182807,0.9497,0.950508,0.949972,0.949842


[I 2025-03-31 00:37:40,804] Trial 120 pruned. 


Trial 121 with params: {'learning_rate': 0.0015283187811351835, 'weight_decay': 0.005, 'warmup_steps': 8, 'lambda_param': 0.0, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.6085,0.550041,0.7588,0.782187,0.758934,0.758439
2,0.4028,0.416302,0.834,0.845179,0.834075,0.8355
3,0.306,0.357577,0.8614,0.873844,0.862113,0.860855
4,0.2397,0.250822,0.9131,0.914578,0.913168,0.913407


[I 2025-03-31 00:43:43,739] Trial 121 pruned. 


Trial 122 with params: {'learning_rate': 0.0003223247629969315, 'weight_decay': 0.006, 'warmup_steps': 10, 'lambda_param': 0.6000000000000001, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3326,0.237844,0.9155,0.922903,0.916299,0.915887
2,0.1786,0.207151,0.9378,0.938499,0.937976,0.93783


[I 2025-03-31 00:46:30,872] Trial 122 pruned. 


Trial 123 with params: {'learning_rate': 0.00035246277079769596, 'weight_decay': 0.008, 'warmup_steps': 16, 'lambda_param': 0.6000000000000001, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3409,0.235944,0.9211,0.924539,0.921117,0.921615
2,0.1866,0.201121,0.9427,0.943987,0.943143,0.942623
3,0.1539,0.178268,0.955,0.955666,0.95529,0.955057
4,0.1352,0.162766,0.9618,0.962871,0.961869,0.962087


[I 2025-03-31 00:52:03,698] Trial 123 pruned. 


Trial 124 with params: {'learning_rate': 0.00014061007989411934, 'weight_decay': 0.009000000000000001, 'warmup_steps': 16, 'lambda_param': 1.0, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3445,0.20611,0.9362,0.941355,0.9365,0.9368
2,0.1538,0.166188,0.9603,0.960613,0.960517,0.960443
3,0.1321,0.154812,0.9655,0.966382,0.965638,0.965707
4,0.12,0.148529,0.9696,0.969801,0.969696,0.969725
5,0.1144,0.144041,0.9721,0.972521,0.972241,0.97224
6,0.1116,0.140736,0.9731,0.973262,0.973207,0.973223
7,0.1101,0.139042,0.974,0.974088,0.974147,0.974094


[I 2025-03-31 01:01:47,617] Trial 124 finished with value: 0.9740937025476694 and parameters: {'learning_rate': 0.00014061007989411934, 'weight_decay': 0.009000000000000001, 'warmup_steps': 16, 'lambda_param': 1.0, 'temperature': 4.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 125 with params: {'learning_rate': 0.0001149349857523226, 'weight_decay': 0.007, 'warmup_steps': 20, 'lambda_param': 0.6000000000000001, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3571,0.188367,0.9465,0.948494,0.946779,0.946713
2,0.1505,0.159795,0.9627,0.963103,0.962882,0.962872
3,0.129,0.150603,0.9696,0.969658,0.969738,0.969678
4,0.1194,0.146934,0.9706,0.970928,0.970741,0.970794
5,0.1145,0.143395,0.9719,0.972255,0.972029,0.972011
6,0.1116,0.140011,0.9732,0.97348,0.973303,0.973348
7,0.11,0.139344,0.9731,0.973305,0.973203,0.973215


[I 2025-03-31 01:11:34,849] Trial 125 finished with value: 0.9732145748534983 and parameters: {'learning_rate': 0.0001149349857523226, 'weight_decay': 0.007, 'warmup_steps': 20, 'lambda_param': 0.6000000000000001, 'temperature': 4.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 126 with params: {'learning_rate': 0.0009049791490282845, 'weight_decay': 0.0, 'warmup_steps': 25, 'lambda_param': 0.0, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.468,0.420437,0.8252,0.842027,0.826462,0.823682
2,0.2824,0.310902,0.8837,0.890277,0.883935,0.883599


[I 2025-03-31 01:14:23,812] Trial 126 pruned. 


Trial 127 with params: {'learning_rate': 0.00013522515719363606, 'weight_decay': 0.01, 'warmup_steps': 18, 'lambda_param': 0.6000000000000001, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3461,0.183971,0.9507,0.952011,0.950894,0.950921
2,0.1534,0.171212,0.9555,0.956151,0.955809,0.955595
3,0.1309,0.153194,0.9668,0.967143,0.966972,0.966923
4,0.1204,0.144741,0.9713,0.971386,0.97143,0.971397
5,0.1148,0.144274,0.9704,0.971045,0.970591,0.970593
6,0.1119,0.140674,0.9725,0.972761,0.972654,0.972668
7,0.1102,0.138661,0.9735,0.973651,0.973672,0.973636


[I 2025-03-31 01:24:07,877] Trial 127 finished with value: 0.9736357397141155 and parameters: {'learning_rate': 0.00013522515719363606, 'weight_decay': 0.01, 'warmup_steps': 18, 'lambda_param': 0.6000000000000001, 'temperature': 2.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 128 with params: {'learning_rate': 0.0002343689992306515, 'weight_decay': 0.008, 'warmup_steps': 24, 'lambda_param': 0.7000000000000001, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3448,0.206436,0.9384,0.940638,0.938664,0.938833
2,0.1678,0.178999,0.9543,0.955206,0.954482,0.954542
3,0.1417,0.163756,0.9605,0.961015,0.960672,0.960614
4,0.1262,0.152252,0.9684,0.968664,0.968483,0.968558
5,0.1173,0.146643,0.9698,0.970376,0.969954,0.969967
6,0.1129,0.139994,0.9734,0.973647,0.973572,0.973536
7,0.1108,0.138251,0.9744,0.974585,0.974537,0.97453


[I 2025-03-31 01:33:53,370] Trial 128 finished with value: 0.9745296879501943 and parameters: {'learning_rate': 0.0002343689992306515, 'weight_decay': 0.008, 'warmup_steps': 24, 'lambda_param': 0.7000000000000001, 'temperature': 4.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 129 with params: {'learning_rate': 0.000128606050919097, 'weight_decay': 0.004, 'warmup_steps': 13, 'lambda_param': 1.0, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3383,0.203082,0.941,0.944954,0.941179,0.941282
2,0.1535,0.163054,0.9619,0.962164,0.96208,0.961978
3,0.1325,0.156904,0.9647,0.965686,0.964877,0.964909
4,0.1214,0.144331,0.9714,0.971938,0.971513,0.971606
5,0.1146,0.141423,0.9742,0.974533,0.974313,0.974323
6,0.1116,0.138865,0.9747,0.974946,0.97481,0.974838
7,0.1101,0.137821,0.9758,0.975978,0.975945,0.975909


[I 2025-03-31 01:43:37,244] Trial 129 finished with value: 0.9759092839105643 and parameters: {'learning_rate': 0.000128606050919097, 'weight_decay': 0.004, 'warmup_steps': 13, 'lambda_param': 1.0, 'temperature': 2.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 130 with params: {'learning_rate': 0.0001925204958916948, 'weight_decay': 0.003, 'warmup_steps': 9, 'lambda_param': 0.9, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3164,0.196054,0.9407,0.944424,0.941027,0.941114
2,0.1589,0.179052,0.9545,0.955023,0.954765,0.95452


[I 2025-03-31 01:46:23,820] Trial 130 pruned. 


Trial 131 with params: {'learning_rate': 0.00011867450015846645, 'weight_decay': 0.01, 'warmup_steps': 32, 'lambda_param': 0.5, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3711,0.19398,0.9454,0.948052,0.945507,0.945981
2,0.1516,0.160336,0.962,0.962366,0.962198,0.962185
3,0.1302,0.153659,0.9666,0.966943,0.966779,0.966736
4,0.1203,0.144576,0.9708,0.971089,0.97094,0.970963
5,0.1144,0.141713,0.974,0.974338,0.974142,0.974121
6,0.1118,0.139991,0.9735,0.973906,0.973638,0.973659
7,0.1103,0.139061,0.9741,0.974227,0.974221,0.974187


[I 2025-03-31 01:56:06,225] Trial 131 finished with value: 0.9741870328191636 and parameters: {'learning_rate': 0.00011867450015846645, 'weight_decay': 0.01, 'warmup_steps': 32, 'lambda_param': 0.5, 'temperature': 2.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 132 with params: {'learning_rate': 0.00041171620928047214, 'weight_decay': 0.003, 'warmup_steps': 13, 'lambda_param': 1.0, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3579,0.258111,0.9089,0.914702,0.908934,0.909968
2,0.1962,0.214297,0.935,0.937963,0.935298,0.934998


[I 2025-03-31 01:58:52,999] Trial 132 pruned. 


Trial 133 with params: {'learning_rate': 8.161381617817908e-05, 'weight_decay': 0.003, 'warmup_steps': 9, 'lambda_param': 0.9, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3727,0.17955,0.9521,0.952782,0.952389,0.952283
2,0.1492,0.164459,0.9617,0.962042,0.961844,0.961819
3,0.128,0.153239,0.9664,0.96698,0.966646,0.966564
4,0.1184,0.149814,0.9683,0.968709,0.968498,0.968478
5,0.1138,0.145128,0.9697,0.970285,0.969848,0.969912
6,0.1114,0.14313,0.9712,0.971532,0.971341,0.971381
7,0.1101,0.141851,0.9714,0.971611,0.971569,0.971561


[I 2025-03-31 02:08:47,235] Trial 133 finished with value: 0.971561459736203 and parameters: {'learning_rate': 8.161381617817908e-05, 'weight_decay': 0.003, 'warmup_steps': 9, 'lambda_param': 0.9, 'temperature': 2.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 134 with params: {'learning_rate': 6.558978114640059e-05, 'weight_decay': 0.0, 'warmup_steps': 14, 'lambda_param': 0.1, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3967,0.189781,0.9465,0.948809,0.946804,0.946711
2,0.1533,0.164902,0.9612,0.96182,0.961374,0.961459
3,0.1279,0.155509,0.9649,0.965648,0.965043,0.965127
4,0.118,0.150671,0.9678,0.968058,0.967959,0.967951


[I 2025-03-31 02:14:34,936] Trial 134 pruned. 


Trial 135 with params: {'learning_rate': 0.00021472339680716937, 'weight_decay': 0.006, 'warmup_steps': 13, 'lambda_param': 1.0, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3215,0.208764,0.9325,0.936123,0.932882,0.93281
2,0.1632,0.177066,0.9545,0.955533,0.954658,0.954489
3,0.139,0.160592,0.962,0.962469,0.962175,0.962257
4,0.1257,0.153561,0.9662,0.966658,0.966389,0.966399


[I 2025-03-31 02:20:27,979] Trial 135 pruned. 


Trial 136 with params: {'learning_rate': 0.00011274383862101934, 'weight_decay': 0.008, 'warmup_steps': 3, 'lambda_param': 0.7000000000000001, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3259,0.180525,0.9525,0.954514,0.952817,0.952727
2,0.1486,0.161159,0.9632,0.96352,0.963409,0.96336
3,0.1286,0.153292,0.9658,0.966303,0.965896,0.966024
4,0.1197,0.147088,0.9705,0.970646,0.970642,0.970622
5,0.1138,0.142407,0.9726,0.972688,0.972775,0.972697
6,0.1113,0.140952,0.9729,0.973097,0.973007,0.973
7,0.11,0.138972,0.9735,0.973654,0.973634,0.973606


[I 2025-03-31 02:31:24,438] Trial 136 finished with value: 0.9736055250559993 and parameters: {'learning_rate': 0.00011274383862101934, 'weight_decay': 0.008, 'warmup_steps': 3, 'lambda_param': 0.7000000000000001, 'temperature': 5.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 137 with params: {'learning_rate': 0.0001151799035753457, 'weight_decay': 0.004, 'warmup_steps': 13, 'lambda_param': 1.0, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3463,0.183441,0.9482,0.950142,0.948478,0.948508
2,0.1523,0.165504,0.9614,0.962186,0.96151,0.961675
3,0.1304,0.155744,0.9664,0.966953,0.966422,0.966547
4,0.1197,0.150917,0.9676,0.968115,0.967743,0.967767


[I 2025-03-31 02:37:11,837] Trial 137 pruned. 


Trial 138 with params: {'learning_rate': 0.00012307184596887328, 'weight_decay': 0.009000000000000001, 'warmup_steps': 0, 'lambda_param': 0.8, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3032,0.184756,0.9477,0.951641,0.948019,0.948163
2,0.1494,0.169617,0.9597,0.96047,0.959859,0.959898
3,0.1299,0.155927,0.9629,0.964328,0.963003,0.96318
4,0.1204,0.147453,0.969,0.96998,0.969189,0.969226
5,0.1148,0.144365,0.9701,0.970431,0.97025,0.970252
6,0.1118,0.14068,0.9731,0.973403,0.973217,0.973255
7,0.1103,0.139208,0.9734,0.973596,0.97354,0.973533


[I 2025-03-31 02:46:57,985] Trial 138 finished with value: 0.9735326716107711 and parameters: {'learning_rate': 0.00012307184596887328, 'weight_decay': 0.009000000000000001, 'warmup_steps': 0, 'lambda_param': 0.8, 'temperature': 7.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 139 with params: {'learning_rate': 0.00018912622443921634, 'weight_decay': 0.01, 'warmup_steps': 8, 'lambda_param': 0.9, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3188,0.197964,0.9411,0.945473,0.941341,0.941505
2,0.1572,0.171788,0.9556,0.956315,0.955846,0.955771
3,0.1349,0.160578,0.962,0.96311,0.962085,0.962297
4,0.1229,0.150953,0.9681,0.9687,0.968211,0.968257


[I 2025-03-31 02:52:32,940] Trial 139 pruned. 


Trial 140 with params: {'learning_rate': 0.00021081959205105003, 'weight_decay': 0.009000000000000001, 'warmup_steps': 16, 'lambda_param': 1.0, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3341,0.208184,0.9399,0.943054,0.940311,0.940092
2,0.1606,0.187871,0.9493,0.95183,0.949489,0.949717
3,0.1392,0.169218,0.9579,0.958878,0.958092,0.958224
4,0.1245,0.157535,0.9644,0.965397,0.96466,0.964522


[I 2025-03-31 02:58:06,300] Trial 140 pruned. 


Trial 141 with params: {'learning_rate': 0.0002365521234939826, 'weight_decay': 0.009000000000000001, 'warmup_steps': 29, 'lambda_param': 0.7000000000000001, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3563,0.206908,0.939,0.942261,0.939301,0.939259
2,0.168,0.193493,0.9455,0.94862,0.945717,0.945614


[I 2025-03-31 03:00:54,809] Trial 141 pruned. 


Trial 142 with params: {'learning_rate': 0.00023010707276175665, 'weight_decay': 0.01, 'warmup_steps': 26, 'lambda_param': 0.0, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3531,0.196285,0.9409,0.943401,0.941058,0.94111
2,0.167,0.183156,0.9507,0.951147,0.950986,0.950813
3,0.141,0.157026,0.9649,0.965482,0.964984,0.965176
4,0.1266,0.153743,0.9666,0.967582,0.966689,0.966849


[I 2025-03-31 03:06:30,289] Trial 142 pruned. 


Trial 143 with params: {'learning_rate': 0.00020002849605365978, 'weight_decay': 0.009000000000000001, 'warmup_steps': 16, 'lambda_param': 0.6000000000000001, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3453,0.202723,0.9375,0.940583,0.937862,0.937688
2,0.1619,0.175277,0.9569,0.957105,0.957233,0.956869
3,0.138,0.154382,0.9671,0.967593,0.967194,0.967273
4,0.1243,0.150786,0.9694,0.969947,0.969576,0.969542
5,0.1167,0.14592,0.9717,0.97193,0.971739,0.971767
6,0.1125,0.139816,0.9745,0.974725,0.974632,0.974623
7,0.1107,0.138592,0.9756,0.975741,0.975742,0.975704


[I 2025-03-31 03:16:28,797] Trial 143 finished with value: 0.9757037670193125 and parameters: {'learning_rate': 0.00020002849605365978, 'weight_decay': 0.009000000000000001, 'warmup_steps': 16, 'lambda_param': 0.6000000000000001, 'temperature': 4.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 144 with params: {'learning_rate': 0.004062992839455107, 'weight_decay': 0.007, 'warmup_steps': 17, 'lambda_param': 0.8, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4828,1.562363,0.1038,0.091354,0.104806,0.058878
2,1.5472,1.554266,0.129,0.054715,0.127302,0.059037
3,1.5352,1.509922,0.1559,0.163515,0.15749,0.124695
4,1.5077,1.540288,0.1332,0.111417,0.13275,0.100745


[I 2025-03-31 03:22:01,514] Trial 144 pruned. 


Trial 145 with params: {'learning_rate': 0.00013760204739950906, 'weight_decay': 0.009000000000000001, 'warmup_steps': 19, 'lambda_param': 0.6000000000000001, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3424,0.19314,0.9442,0.94668,0.944393,0.944648
2,0.1538,0.164463,0.9621,0.962442,0.962319,0.962279
3,0.1311,0.156012,0.9647,0.965445,0.964844,0.964985
4,0.1209,0.150981,0.9675,0.968084,0.967586,0.967697


[I 2025-03-31 03:27:34,173] Trial 145 pruned. 


Trial 146 with params: {'learning_rate': 0.0011607614784531854, 'weight_decay': 0.0, 'warmup_steps': 1, 'lambda_param': 0.8, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5073,0.413604,0.8314,0.838242,0.832031,0.829353
2,0.3272,0.316566,0.8813,0.886238,0.881327,0.881065
3,0.253,0.264121,0.9066,0.908586,0.907018,0.906649
4,0.2017,0.221558,0.9283,0.929225,0.928593,0.928428


[I 2025-03-31 03:33:09,072] Trial 146 pruned. 


Trial 147 with params: {'learning_rate': 0.001519271985143758, 'weight_decay': 0.007, 'warmup_steps': 22, 'lambda_param': 0.0, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.6426,0.612859,0.7196,0.796078,0.720335,0.720213
2,0.4078,0.440694,0.8046,0.834118,0.805174,0.799719


[I 2025-03-31 03:35:57,516] Trial 147 pruned. 


Trial 148 with params: {'learning_rate': 0.00021584650027088676, 'weight_decay': 0.01, 'warmup_steps': 28, 'lambda_param': 0.9, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3569,0.211911,0.9321,0.938119,0.932583,0.932875
2,0.1642,0.177932,0.9538,0.954652,0.953948,0.953938


[I 2025-03-31 03:38:44,507] Trial 148 pruned. 


Trial 149 with params: {'learning_rate': 7.376314367140723e-05, 'weight_decay': 0.01, 'warmup_steps': 12, 'lambda_param': 0.7000000000000001, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.377,0.187315,0.9469,0.949938,0.947197,0.947298
2,0.1506,0.162884,0.9609,0.961098,0.961091,0.960959
3,0.1286,0.151853,0.9681,0.968263,0.968244,0.968219
4,0.1187,0.150022,0.9683,0.968796,0.968439,0.968466
5,0.1139,0.144931,0.9711,0.971403,0.971213,0.971239
6,0.1116,0.143044,0.9709,0.971185,0.971047,0.971051
7,0.1101,0.142287,0.9715,0.971642,0.971659,0.971619


[I 2025-03-31 03:48:35,832] Trial 149 finished with value: 0.9716190451722241 and parameters: {'learning_rate': 7.376314367140723e-05, 'weight_decay': 0.01, 'warmup_steps': 12, 'lambda_param': 0.7000000000000001, 'temperature': 4.5}. Best is trial 32 with value: 0.9764120594095738.


In [26]:
print(best_distill)

BestRun(run_id='32', objective=0.9764120594095738, hyperparameters={'learning_rate': 0.00013553561983282748, 'weight_decay': 0.01, 'warmup_steps': 24, 'lambda_param': 1.0, 'temperature': 3.0}, run_summary=None)


In [27]:
base.reset_seed()

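Recomputing the step counts to account for the changed dataset size: the augmented training set `train_combo` has a different length, so the Hyperband resource limits and the warm-up bound, all expressed in optimizer steps, are derived again.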

In [None]:
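data_length = len(train_combo)
steps_per_epoch = math.ceil(data_length / batch_size)
min_r = steps_per_epoch * 2           # Hyperband minimum resource: two epochs' worth of steps
max_r = steps_per_epoch * num_epochs  # Hyperband maximum resource: the full training run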
warm_up = math.ceil(data_length/batch_size/10)
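
As a worked example of the arithmetic in the cell above, the sketch below wraps the same formulas in a function and evaluates them once. The function name `hyperband_step_budget` and the input values (80,000 samples, batch size 256, 7 epochs) are hypothetical and chosen only for illustration; they are not taken from this notebook.

```python
import math


def hyperband_step_budget(data_length, batch_size, num_epochs):
    """Return (min_r, max_r, warm_up) in optimizer steps, mirroring the cell above."""
    steps_per_epoch = math.ceil(data_length / batch_size)
    min_r = steps_per_epoch * 2                         # two epochs before pruning may start
    max_r = steps_per_epoch * num_epochs                # length of a complete run
    warm_up = math.ceil(data_length / batch_size / 10)  # about a tenth of an epoch
    return min_r, max_r, warm_up


# Hypothetical inputs, not the notebook's real dataset size or batch size.
min_r, max_r, warm_up = hyperband_step_budget(data_length=80_000, batch_size=256, num_epochs=7)
print(min_r, max_r, warm_up)  # 626 2191 32
```

With these assumed inputs one epoch is 313 steps, so a trial can be pruned no earlier than step 626 and a full run takes 2191 steps.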

## Search with standard training on the augmented dataset
Configuration of the individual training runs.

In [28]:
training_args = base.get_training_args(
    output_dir=f"~/results/{DATASET}/-aug_hp-search",
    logging_dir=f"~/logs/{DATASET}/-aug_hp-search",
    epochs=num_epochs,
    batch_size=batch_size,
)

Definition of the searched hyperparameters and their ranges.

In [29]:
def hp_space(trial):
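    params = {
        # log-uniform between 5e-5 and 5e-3
        "learning_rate": trial.suggest_float("learning_rate", 5e-5, 5e-3, log=True),
        # grid from 0 to 1e-2 in steps of 1e-3
        "weight_decay": trial.suggest_float("weight_decay", 0, 1e-2, step=1e-3),
        # integer between 0 and the warm-up bound computed earlier
        "warmup_steps": trial.suggest_int("warmup_steps", 0, warm_up),
    }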
    print(f"Trial {trial.number} with params: {params}")
    return params
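
To illustrate what the three ranges mean, the following standard-library sketch draws one configuration by hand. It only mimics the shape of the distributions (log-uniform, stepped grid, integer range); it is not how Optuna's TPE sampler picks values. The helper names and the bound `warm_up = 32` are assumptions made for this example.

```python
import math
import random


def sample_log_uniform(rng, low, high):
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    value = math.exp(rng.uniform(math.log(low), math.log(high)))
    return min(high, max(low, value))  # guard against rounding at the edges


def sample_stepped(rng, low, high, step):
    """Draw one point from the grid low, low + step, ..., high."""
    n_intervals = round((high - low) / step)
    return low + rng.randint(0, n_intervals) * step


rng = random.Random(42)  # seed chosen arbitrarily for the illustration
warm_up = 32             # hypothetical upper bound for warmup_steps
params = {
    "learning_rate": sample_log_uniform(rng, 5e-5, 5e-3),
    "weight_decay": sample_stepped(rng, 0.0, 1e-2, 1e-3),
    "warmup_steps": rng.randint(0, warm_up),
}
print(params)
```

Sampling the learning rate on a logarithmic scale makes each order of magnitude between 5e-5 and 5e-3 equally likely, which suits a parameter whose useful values span two decades.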

Optuna configuration.

In [30]:
pruner = optuna.pruners.HyperbandPruner(min_resource=min_r, max_resource=max_r, reduction_factor=2, bootstrap_count=2)
sampler = optuna.samplers.TPESampler(seed=42, multivariate=True)
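
The sketch below gives a rough picture of where a Hyperband-style pruner can stop a trial. It is a simplified model of successive-halving rungs and not Optuna's implementation: each bracket starts checking at `min_resource` multiplied by a power of `reduction_factor`, and later checks follow at multiples of `reduction_factor`. The function name and the resource values 626 and 2191 are hypothetical.

```python
def hyperband_rungs(min_resource, max_resource, reduction_factor):
    """List, per bracket, the step counts at which a trial may be pruned."""
    brackets = []
    first_rung = min_resource
    while first_rung <= max_resource:
        rungs = []
        step = first_rung
        while step <= max_resource:
            rungs.append(step)
            step *= reduction_factor
        brackets.append(rungs)
        # The next bracket is more patient: its first check comes later.
        first_rung *= reduction_factor
    return brackets


# Hypothetical budget: 313 steps per epoch, 2 to 7 epochs.
print(hyperband_rungs(626, 2191, 2))  # [[626, 1252], [1252]]
```

Under these assumptions there are two brackets: one that may prune after two or four epochs, and one that waits until four epochs before making a decision.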



Trainer configuration for the individual training runs.

In [31]:
trainer = Trainer(
    args=training_args,
    train_dataset=train_combo,
    eval_dataset=eval,
    compute_metrics=base.compute_metrics,
    model_init=lambda: get_model(),  # build a fresh model for every trial
)

Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
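
The `compute_metrics` callback passed to the trainer reports, among other values, the F1 score on which configurations are compared. How `base.compute_metrics` averages F1 over the ten classes is not shown in this notebook; the sketch below assumes macro averaging, which is only one plausible choice, and uses a hypothetical helper `macro_f1` written in plain Python.

```python
def macro_f1(y_true, y_pred, num_classes):
    """Unweighted mean of the per-class F1 scores (macro averaging, assumed here)."""
    f1_scores = []
    for c in range(num_classes):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(f1_scores) / num_classes


# Toy labels for three classes, invented for the illustration.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]
metrics = {"eval_f1": macro_f1(y_true, y_pred, num_classes=3)}
print(round(metrics["eval_f1"], 4))  # 0.8222
```

In the toy example the per-class F1 scores are 1.0, 2/3 and 0.8, so the macro average is about 0.8222.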


Search setup.

In [32]:
best_base_aug = trainer.hyperparameter_search(
    direction="maximize",
    backend="optuna",
    hp_space=hp_space,
    compute_objective=lambda metrics: metrics["eval_f1"],
    pruner=pruner,
    sampler=sampler,
    study_name="Base-head",
    n_trials=150
)

[I 2025-03-31 03:48:36,446] A new study created in memory with name: Base-head


Trial 0 with params: {'learning_rate': 0.0002805758207667253, 'weight_decay': 0.01, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3883,0.221293,0.926,0.929451,0.926255,0.926541
2,0.1312,0.166291,0.9474,0.947967,0.947885,0.947319


[I 2025-03-31 03:53:01,180] Trial 0 pruned. 


Trial 1 with params: {'learning_rate': 0.0007875660249889869, 'weight_decay': 0.001, 'warmup_steps': 5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5386,0.413269,0.858,0.874213,0.857768,0.859255
2,0.2723,0.242039,0.9227,0.922562,0.923158,0.92202
3,0.1734,0.220825,0.9299,0.931425,0.930256,0.929911


[I 2025-03-31 03:59:36,657] Trial 1 pruned. 


Trial 2 with params: {'learning_rate': 6.533369619026643e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 19}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4483,0.140322,0.9532,0.953476,0.953506,0.953211
2,0.0915,0.125343,0.9608,0.961229,0.961065,0.960828


[I 2025-03-31 04:03:59,557] Trial 2 pruned. 


Trial 3 with params: {'learning_rate': 0.0013035123791853842, 'weight_decay': 0.0, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.8112,0.621493,0.7873,0.799103,0.787041,0.787377
2,0.4513,0.441679,0.861,0.870828,0.861079,0.861713
3,0.3002,0.328119,0.89,0.8968,0.890091,0.89042


[I 2025-03-31 04:10:33,921] Trial 3 pruned. 


Trial 4 with params: {'learning_rate': 0.002311294500510415, 'weight_decay': 0.002, 'warmup_steps': 6}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5232,1.534588,0.4391,0.505404,0.439281,0.406545
2,1.2849,1.277727,0.5537,0.577794,0.553947,0.541602


[I 2025-03-31 04:14:56,281] Trial 4 pruned. 


Trial 5 with params: {'learning_rate': 0.00011635338541918901, 'weight_decay': 0.003, 'warmup_steps': 17}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3789,0.152657,0.9502,0.951018,0.950477,0.950321
2,0.0924,0.139739,0.9575,0.957688,0.957796,0.957496
3,0.0401,0.124862,0.9645,0.965055,0.964656,0.964743
4,0.0182,0.140248,0.9657,0.966193,0.965862,0.965781
5,0.0061,0.140632,0.9698,0.970039,0.96989,0.969938


[I 2025-03-31 04:25:56,194] Trial 5 pruned. 


Trial 6 with params: {'learning_rate': 0.0003654769917956456, 'weight_decay': 0.003, 'warmup_steps': 20}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4128,0.196447,0.9346,0.935647,0.934541,0.934766
2,0.1551,0.164474,0.9465,0.947851,0.946636,0.946673


[I 2025-03-31 04:30:22,481] Trial 6 pruned. 


Trial 7 with params: {'learning_rate': 9.505122659935192e-05, 'weight_decay': 0.003, 'warmup_steps': 12}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.38,0.138533,0.9546,0.954898,0.954797,0.95475
2,0.0903,0.120398,0.962,0.962363,0.962132,0.962062
3,0.0378,0.128409,0.9654,0.965599,0.965545,0.965492
4,0.0154,0.143817,0.968,0.968262,0.968077,0.968097
5,0.0055,0.15025,0.9696,0.96991,0.969718,0.969739


[I 2025-03-31 04:41:25,196] Trial 7 pruned. 


Trial 8 with params: {'learning_rate': 0.00040842279473800845, 'weight_decay': 0.008, 'warmup_steps': 6}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4059,0.257876,0.9141,0.917884,0.91412,0.91424
2,0.1703,0.187183,0.9379,0.938464,0.938022,0.938001
3,0.0985,0.177912,0.9482,0.949572,0.948237,0.94848


[I 2025-03-31 04:48:03,908] Trial 8 pruned. 


Trial 9 with params: {'learning_rate': 0.0005338741354740678, 'weight_decay': 0.006, 'warmup_steps': 1}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4371,0.29461,0.8991,0.911771,0.899193,0.901218
2,0.2003,0.193124,0.937,0.937461,0.937365,0.936924


[I 2025-03-31 04:52:28,094] Trial 9 pruned. 


Trial 10 with params: {'learning_rate': 5.765419213017514e-05, 'weight_decay': 0.0, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4765,0.143942,0.9533,0.953626,0.953621,0.953376
2,0.1021,0.115302,0.9635,0.963498,0.963733,0.963578
3,0.0413,0.118464,0.9656,0.9659,0.965659,0.965728
4,0.0176,0.133836,0.9661,0.966195,0.966259,0.966191
5,0.0053,0.151298,0.9679,0.968218,0.968047,0.968035


[I 2025-03-31 05:03:28,414] Trial 10 pruned. 


Trial 11 with params: {'learning_rate': 8.864358030226235e-05, 'weight_decay': 0.003, 'warmup_steps': 5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3794,0.128561,0.957,0.957166,0.957251,0.957022
2,0.0881,0.118847,0.9631,0.963207,0.963343,0.963103
3,0.0352,0.128566,0.9663,0.9664,0.966447,0.966378


[I 2025-03-31 05:10:05,623] Trial 11 pruned. 


Trial 12 with params: {'learning_rate': 7.882328855146668e-05, 'weight_decay': 0.004, 'warmup_steps': 17}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.425,0.140237,0.953,0.953468,0.953271,0.953079
2,0.0897,0.112924,0.9647,0.96494,0.964812,0.96477
3,0.0376,0.127851,0.9636,0.963776,0.963723,0.963633


[I 2025-03-31 05:16:45,780] Trial 12 pruned. 


Trial 13 with params: {'learning_rate': 0.0001642985400515745, 'weight_decay': 0.0, 'warmup_steps': 13}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3616,0.153377,0.9487,0.949515,0.948945,0.948797
2,0.0982,0.139231,0.9553,0.955966,0.955505,0.955371
3,0.0476,0.125522,0.9644,0.964613,0.964508,0.964522


[I 2025-03-31 05:23:20,883] Trial 13 pruned. 


Trial 14 with params: {'learning_rate': 0.00023364707944876568, 'weight_decay': 0.004, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3883,0.192952,0.9359,0.937374,0.935927,0.936189
2,0.121,0.147653,0.9506,0.951246,0.950975,0.95068
3,0.0637,0.14105,0.9568,0.957485,0.956776,0.957036


[I 2025-03-31 05:29:54,119] Trial 14 pruned. 


Trial 15 with params: {'learning_rate': 0.003590246670113587, 'weight_decay': 0.007, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.1915,2.309329,0.1252,0.063624,0.123805,0.062392
2,2.2661,2.299206,0.1185,0.136411,0.11624,0.041247


[I 2025-03-31 05:34:17,148] Trial 15 pruned. 


Trial 16 with params: {'learning_rate': 0.00018649990770712045, 'weight_decay': 0.005, 'warmup_steps': 12}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3541,0.157586,0.9466,0.947119,0.946799,0.946688
2,0.1049,0.132207,0.9574,0.95764,0.957574,0.957459
3,0.0517,0.134301,0.964,0.964156,0.964148,0.964014
4,0.0281,0.137392,0.9643,0.964603,0.964392,0.964413
5,0.0104,0.139085,0.968,0.968187,0.968171,0.968153


[I 2025-03-31 05:45:15,318] Trial 16 pruned. 


Trial 17 with params: {'learning_rate': 6.832986140924479e-05, 'weight_decay': 0.006, 'warmup_steps': 6}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4166,0.13488,0.9556,0.955696,0.9559,0.955593
2,0.0918,0.116984,0.9623,0.962541,0.962584,0.962343
3,0.0368,0.123405,0.9657,0.966285,0.965743,0.965883


[I 2025-03-31 05:51:50,809] Trial 17 pruned. 


Trial 18 with params: {'learning_rate': 0.0026868566033176914, 'weight_decay': 0.01, 'warmup_steps': 11}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.923,2.05156,0.233,0.247356,0.23149,0.194652
2,2.0933,2.106786,0.2065,0.17387,0.205746,0.150136


[I 2025-03-31 05:56:13,178] Trial 18 pruned. 


Trial 19 with params: {'learning_rate': 6.274902830770433e-05, 'weight_decay': 0.0, 'warmup_steps': 15}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4579,0.148232,0.951,0.951485,0.951257,0.951103
2,0.0986,0.116854,0.9623,0.96262,0.962512,0.962388
3,0.0405,0.118519,0.9658,0.966153,0.965915,0.965971
4,0.0157,0.142976,0.9671,0.967126,0.967273,0.967163
5,0.005,0.153646,0.9673,0.967527,0.967417,0.967403


[I 2025-03-31 06:07:11,827] Trial 19 pruned. 


Trial 20 with params: {'learning_rate': 7.536233099932573e-05, 'weight_decay': 0.006, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4351,0.139801,0.9523,0.952504,0.952611,0.95235
2,0.0953,0.120875,0.9633,0.963874,0.963413,0.963359
3,0.0386,0.121165,0.9671,0.967331,0.967193,0.967208
4,0.0158,0.142838,0.9667,0.967139,0.966829,0.966856
5,0.0053,0.145039,0.9704,0.97071,0.970467,0.970554


[I 2025-03-31 06:18:11,143] Trial 20 pruned. 


Trial 21 with params: {'learning_rate': 7.808987905772294e-05, 'weight_decay': 0.006, 'warmup_steps': 28}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.437,0.136018,0.955,0.95548,0.955248,0.955131
2,0.0878,0.126911,0.9615,0.961722,0.961817,0.961496
3,0.0363,0.117692,0.968,0.96859,0.968059,0.968178
4,0.0132,0.138904,0.9685,0.968715,0.968665,0.968627
5,0.0047,0.15447,0.9688,0.968879,0.968972,0.968909
6,0.0015,0.155973,0.9705,0.970676,0.970641,0.970644
7,0.0005,0.15542,0.9717,0.971828,0.971823,0.971823


[I 2025-03-31 06:33:37,492] Trial 21 finished with value: 0.9718228491410248 and parameters: {'learning_rate': 7.808987905772294e-05, 'weight_decay': 0.006, 'warmup_steps': 28}. Best is trial 21 with value: 0.9718228491410248.


Trial 22 with params: {'learning_rate': 5.622823959282174e-05, 'weight_decay': 0.005, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4939,0.140334,0.9538,0.954114,0.954163,0.953831
2,0.097,0.112515,0.9654,0.965498,0.96558,0.965505
3,0.0397,0.118252,0.9678,0.968167,0.96788,0.967907
4,0.0151,0.136181,0.9693,0.969585,0.969507,0.96945
5,0.0049,0.149947,0.9691,0.969322,0.969265,0.969241


[I 2025-03-31 06:44:36,062] Trial 22 pruned. 


Trial 23 with params: {'learning_rate': 8.733972674215324e-05, 'weight_decay': 0.007, 'warmup_steps': 30}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4347,0.143805,0.9535,0.954196,0.953841,0.953531
2,0.0927,0.12793,0.9599,0.95996,0.960176,0.959855
3,0.0389,0.126847,0.965,0.96531,0.965165,0.965109
4,0.0155,0.128109,0.9681,0.968401,0.968246,0.968219
5,0.005,0.138471,0.971,0.971135,0.971112,0.971097
6,0.0017,0.144585,0.9712,0.97137,0.971321,0.971329
7,0.0005,0.147829,0.9728,0.972937,0.9729,0.972909


[I 2025-03-31 06:59:56,861] Trial 23 finished with value: 0.9729094964315547 and parameters: {'learning_rate': 8.733972674215324e-05, 'weight_decay': 0.007, 'warmup_steps': 30}. Best is trial 23 with value: 0.9729094964315547.


Trial 24 with params: {'learning_rate': 6.768180919825604e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 30}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4623,0.141292,0.9529,0.953446,0.953087,0.952964
2,0.0902,0.116579,0.9649,0.965195,0.965004,0.965029
3,0.0365,0.127392,0.9653,0.965628,0.965393,0.965436


[I 2025-03-31 07:06:30,545] Trial 24 pruned. 


Trial 25 with params: {'learning_rate': 9.558860928486504e-05, 'weight_decay': 0.006, 'warmup_steps': 30}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4215,0.134564,0.9558,0.955965,0.956049,0.955825
2,0.0925,0.114988,0.9628,0.962835,0.963048,0.962808
3,0.0388,0.11588,0.9685,0.968648,0.968621,0.968605
4,0.016,0.132751,0.9674,0.967793,0.967477,0.967581
5,0.0065,0.148785,0.9704,0.970634,0.970484,0.970521
6,0.0016,0.147047,0.9713,0.971544,0.971404,0.971459
7,0.0006,0.145786,0.9721,0.97231,0.972184,0.972232


[I 2025-03-31 07:21:48,782] Trial 25 finished with value: 0.9722321933173793 and parameters: {'learning_rate': 9.558860928486504e-05, 'weight_decay': 0.006, 'warmup_steps': 30}. Best is trial 23 with value: 0.9729094964315547.


Trial 26 with params: {'learning_rate': 0.0002643861023868768, 'weight_decay': 0.006, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3868,0.186732,0.938,0.939146,0.938335,0.938128
2,0.1272,0.15671,0.9482,0.948872,0.948512,0.948203


[I 2025-03-31 07:26:10,072] Trial 26 pruned. 


Trial 27 with params: {'learning_rate': 0.00011645934996272619, 'weight_decay': 0.007, 'warmup_steps': 31}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3993,0.142402,0.9524,0.952742,0.952592,0.952484
2,0.0909,0.119371,0.9615,0.961753,0.96169,0.961546
3,0.0392,0.121957,0.9655,0.965938,0.96555,0.965687
4,0.0176,0.140856,0.9658,0.966065,0.965974,0.96597
5,0.0061,0.147445,0.9688,0.96901,0.968918,0.96894


[I 2025-03-31 07:37:03,692] Trial 27 pruned. 


Trial 28 with params: {'learning_rate': 0.00011108512918019929, 'weight_decay': 0.005, 'warmup_steps': 29}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4042,0.146582,0.9493,0.950043,0.949549,0.949494
2,0.09,0.125932,0.9607,0.960765,0.960942,0.960624
3,0.04,0.126385,0.9653,0.965818,0.965332,0.965494
4,0.0186,0.149454,0.9651,0.96539,0.965247,0.965201
5,0.0067,0.146485,0.9688,0.968981,0.968967,0.968951
6,0.0019,0.147133,0.9706,0.970801,0.97073,0.970759
7,0.0005,0.144453,0.9714,0.97155,0.971513,0.971521


[I 2025-03-31 07:52:21,466] Trial 28 finished with value: 0.9715205318229193 and parameters: {'learning_rate': 0.00011108512918019929, 'weight_decay': 0.005, 'warmup_steps': 29}. Best is trial 23 with value: 0.9729094964315547.


Trial 29 with params: {'learning_rate': 0.0007164134462450241, 'weight_decay': 0.01, 'warmup_steps': 30}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.548,0.325916,0.8904,0.896086,0.890731,0.890492
2,0.2584,0.231203,0.9241,0.924394,0.924554,0.923852


[I 2025-03-31 07:56:44,851] Trial 29 pruned. 


Trial 30 with params: {'learning_rate': 7.475166556151518e-05, 'weight_decay': 0.006, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.45,0.148498,0.9501,0.950863,0.950285,0.95021
2,0.0936,0.119726,0.9621,0.962064,0.962319,0.962092
3,0.0394,0.117168,0.9669,0.967287,0.967055,0.967025
4,0.0161,0.126871,0.969,0.969082,0.969185,0.969109
5,0.0058,0.143265,0.9702,0.970354,0.970329,0.970325


[I 2025-03-31 08:07:39,402] Trial 30 pruned. 


Trial 31 with params: {'learning_rate': 0.00010311076532392113, 'weight_decay': 0.005, 'warmup_steps': 29}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4058,0.141135,0.9523,0.952821,0.952524,0.952353
2,0.0901,0.13846,0.9557,0.955817,0.955968,0.955586
3,0.0393,0.122462,0.9679,0.968165,0.968037,0.967993
4,0.0171,0.134188,0.9682,0.968653,0.968221,0.968373
5,0.0069,0.14372,0.9701,0.970212,0.970269,0.970202
6,0.0017,0.140668,0.973,0.97318,0.973123,0.973135
7,0.0006,0.136299,0.974,0.974202,0.974135,0.97416


[I 2025-03-31 08:23:00,517] Trial 31 finished with value: 0.9741598985804594 and parameters: {'learning_rate': 0.00010311076532392113, 'weight_decay': 0.005, 'warmup_steps': 29}. Best is trial 31 with value: 0.9741598985804594.


Trial 32 with params: {'learning_rate': 0.00023460239846333, 'weight_decay': 0.007, 'warmup_steps': 31}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3942,0.17649,0.9412,0.941607,0.94148,0.940998
2,0.1191,0.158125,0.9506,0.950742,0.950939,0.950429


[I 2025-03-31 08:27:22,611] Trial 32 pruned. 


Trial 33 with params: {'learning_rate': 6.268895006525956e-05, 'weight_decay': 0.003, 'warmup_steps': 29}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4601,0.134029,0.9547,0.955086,0.955,0.954805
2,0.0918,0.120164,0.9636,0.963677,0.963749,0.963659
3,0.0367,0.126911,0.9644,0.964721,0.964494,0.964548


[I 2025-03-31 08:33:55,993] Trial 33 pruned. 


Trial 34 with params: {'learning_rate': 5.7310937030315135e-05, 'weight_decay': 0.007, 'warmup_steps': 22}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4765,0.14195,0.9525,0.953129,0.952729,0.952621
2,0.0984,0.115442,0.963,0.963169,0.963183,0.963098
3,0.0389,0.115747,0.9665,0.96666,0.966688,0.966646
4,0.016,0.135482,0.9664,0.966617,0.966528,0.966541
5,0.0054,0.146186,0.97,0.970251,0.970128,0.970163


[I 2025-03-31 08:44:52,401] Trial 34 pruned. 


Trial 35 with params: {'learning_rate': 0.00020525994361904562, 'weight_decay': 0.008, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.383,0.166246,0.9465,0.946981,0.94684,0.946568
2,0.1104,0.141245,0.9589,0.959075,0.959181,0.958806
3,0.0561,0.128454,0.9605,0.961001,0.960562,0.960687


[I 2025-03-31 08:51:26,334] Trial 35 pruned. 


Trial 36 with params: {'learning_rate': 0.004049761177508626, 'weight_decay': 0.006, 'warmup_steps': 3}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.3313,2.312519,0.1022,0.01022,0.1,0.018545
2,2.3021,2.310024,0.1014,0.01014,0.1,0.018413


[I 2025-03-31 08:55:48,587] Trial 36 pruned. 


Trial 37 with params: {'learning_rate': 9.041355111457508e-05, 'weight_decay': 0.006, 'warmup_steps': 29}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4219,0.139204,0.9523,0.953042,0.952448,0.952394
2,0.0882,0.119241,0.9616,0.961673,0.96182,0.961651
3,0.0369,0.124808,0.9659,0.966036,0.966089,0.965981
4,0.0159,0.141868,0.9669,0.96741,0.966991,0.967094
5,0.0055,0.149213,0.9692,0.969419,0.969322,0.96932
6,0.0013,0.152882,0.9698,0.969946,0.969925,0.969922
7,0.0004,0.150791,0.971,0.971194,0.971109,0.971143


[I 2025-03-31 09:11:09,141] Trial 37 finished with value: 0.9711425272140772 and parameters: {'learning_rate': 9.041355111457508e-05, 'weight_decay': 0.006, 'warmup_steps': 29}. Best is trial 31 with value: 0.9741598985804594.


Trial 38 with params: {'learning_rate': 5.014889563511618e-05, 'weight_decay': 0.007, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5255,0.158376,0.9484,0.948843,0.948693,0.948384
2,0.1115,0.121143,0.9623,0.962433,0.962542,0.962387
3,0.0445,0.129912,0.9642,0.964604,0.964311,0.964357


[I 2025-03-31 09:17:43,110] Trial 38 pruned. 


Trial 39 with params: {'learning_rate': 0.0001597034557883444, 'weight_decay': 0.004, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3809,0.149656,0.9499,0.951115,0.950004,0.950195
2,0.0966,0.148021,0.9509,0.951794,0.951276,0.950835
3,0.0462,0.118048,0.9641,0.964223,0.96423,0.964206


[I 2025-03-31 09:24:18,941] Trial 39 pruned. 


Trial 40 with params: {'learning_rate': 0.004241076779716196, 'weight_decay': 0.003, 'warmup_steps': 19}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.233,2.268663,0.1337,0.106395,0.133643,0.081282
2,2.2736,2.289341,0.1116,0.076695,0.11285,0.063059
3,2.2543,2.329412,0.1192,0.051694,0.117767,0.048673


[I 2025-03-31 09:30:53,485] Trial 40 pruned. 


Trial 41 with params: {'learning_rate': 8.109659508740162e-05, 'weight_decay': 0.004, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4342,0.131848,0.9563,0.956763,0.956479,0.956497
2,0.0929,0.111671,0.9662,0.966404,0.966327,0.966308
3,0.0385,0.125281,0.9659,0.966285,0.965996,0.966028


[I 2025-03-31 09:37:27,672] Trial 41 pruned. 


Trial 42 with params: {'learning_rate': 0.00015117006764000902, 'weight_decay': 0.006, 'warmup_steps': 28}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3911,0.164348,0.9442,0.945258,0.944336,0.944298
2,0.0979,0.134968,0.9563,0.956757,0.95654,0.956367
3,0.0462,0.128038,0.9625,0.963218,0.962643,0.962742


[I 2025-03-31 09:44:01,452] Trial 42 pruned. 


Trial 43 with params: {'learning_rate': 0.00016361367125994418, 'weight_decay': 0.005, 'warmup_steps': 29}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3811,0.161102,0.946,0.947496,0.94613,0.946206
2,0.0992,0.125834,0.9602,0.960692,0.96039,0.960285
3,0.0486,0.14209,0.9576,0.958325,0.957809,0.957863


[I 2025-03-31 09:50:35,459] Trial 43 pruned. 


Trial 44 with params: {'learning_rate': 9.074012529354292e-05, 'weight_decay': 0.005, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4131,0.140732,0.9531,0.953725,0.953338,0.953149
2,0.0897,0.130994,0.9609,0.961165,0.961211,0.960826
3,0.0365,0.120417,0.9676,0.967782,0.967763,0.967751
4,0.0156,0.140645,0.9689,0.969191,0.968997,0.969047
5,0.0055,0.156934,0.9688,0.968944,0.968955,0.968866


[I 2025-03-31 10:01:31,361] Trial 44 pruned. 


Trial 45 with params: {'learning_rate': 0.0008102177669590671, 'weight_decay': 0.009000000000000001, 'warmup_steps': 17}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.582,0.330255,0.8901,0.895729,0.890091,0.891065
2,0.2836,0.229589,0.9238,0.924301,0.923911,0.924033


[I 2025-03-31 10:05:54,555] Trial 45 pruned. 


Trial 46 with params: {'learning_rate': 0.00012202145027053816, 'weight_decay': 0.006, 'warmup_steps': 31}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4042,0.144775,0.9536,0.954607,0.953706,0.95383
2,0.093,0.129906,0.9591,0.959404,0.959432,0.959069
3,0.0414,0.115518,0.9672,0.967512,0.967347,0.967356
4,0.0188,0.125247,0.969,0.969089,0.969183,0.969068
5,0.0066,0.141379,0.9713,0.971485,0.971429,0.97143
6,0.0018,0.140452,0.972,0.97216,0.972121,0.972122
7,0.0006,0.140364,0.9728,0.972956,0.972918,0.972923


[I 2025-03-31 10:21:16,892] Trial 46 finished with value: 0.9729232846106017 and parameters: {'learning_rate': 0.00012202145027053816, 'weight_decay': 0.006, 'warmup_steps': 31}. Best is trial 31 with value: 0.9741598985804594.


Trial 47 with params: {'learning_rate': 0.00010481877791892262, 'weight_decay': 0.008, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.397,0.152346,0.9487,0.949389,0.948921,0.948723
2,0.0905,0.132477,0.9571,0.95756,0.957289,0.957129
3,0.0381,0.140734,0.9615,0.962084,0.961635,0.961662


[I 2025-03-31 10:27:50,140] Trial 47 pruned. 


Trial 48 with params: {'learning_rate': 0.00015433736178353414, 'weight_decay': 0.01, 'warmup_steps': 9}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3509,0.159635,0.9444,0.945284,0.944538,0.944561
2,0.1006,0.131562,0.9585,0.958643,0.958727,0.958423
3,0.0478,0.11242,0.9661,0.966238,0.966246,0.966206
4,0.0222,0.127422,0.9687,0.969025,0.968832,0.968819
5,0.0076,0.143846,0.9697,0.97012,0.969833,0.969914


[I 2025-03-31 10:38:45,178] Trial 48 pruned. 


Trial 49 with params: {'learning_rate': 0.00013783840697717263, 'weight_decay': 0.007, 'warmup_steps': 31}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3981,0.165412,0.9454,0.946361,0.945602,0.945466
2,0.0973,0.128966,0.9584,0.958355,0.95863,0.958311
3,0.0423,0.132326,0.9637,0.964034,0.963798,0.96386


[I 2025-03-31 10:45:17,644] Trial 49 pruned. 


Trial 50 with params: {'learning_rate': 0.0027800474932883233, 'weight_decay': 0.0, 'warmup_steps': 12}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.0818,2.217227,0.1644,0.16455,0.161322,0.108389
2,2.0973,2.139922,0.1993,0.241263,0.198764,0.149261
3,2.1182,2.102571,0.2055,0.157001,0.204483,0.14226


[I 2025-03-31 10:51:51,518] Trial 50 pruned. 


Trial 51 with params: {'learning_rate': 0.0001280188186280819, 'weight_decay': 0.006, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3964,0.148613,0.9502,0.951131,0.95035,0.950402
2,0.0907,0.13751,0.9565,0.956584,0.956785,0.956364
3,0.0409,0.121661,0.9645,0.964737,0.964588,0.964611


[I 2025-03-31 10:58:25,596] Trial 51 pruned. 


Trial 52 with params: {'learning_rate': 6.726436861315805e-05, 'weight_decay': 0.005, 'warmup_steps': 30}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4597,0.146592,0.9506,0.95133,0.950867,0.95067
2,0.0938,0.118587,0.9621,0.961975,0.962384,0.962068
3,0.037,0.132396,0.9632,0.963485,0.963368,0.963309


[I 2025-03-31 11:04:59,240] Trial 52 pruned. 


Trial 53 with params: {'learning_rate': 0.00023354498448728003, 'weight_decay': 0.005, 'warmup_steps': 29}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3936,0.175906,0.9381,0.939643,0.938398,0.938258
2,0.1202,0.14264,0.9542,0.955514,0.954315,0.954261


[I 2025-03-31 11:09:23,529] Trial 53 pruned. 


Trial 54 with params: {'learning_rate': 0.000403916017640712, 'weight_decay': 0.0, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4284,0.231204,0.9216,0.924354,0.921945,0.921928
2,0.1695,0.182078,0.9425,0.94349,0.942658,0.942507
3,0.0968,0.150375,0.9508,0.951059,0.950985,0.950856


[I 2025-03-31 11:15:57,882] Trial 54 pruned. 


Trial 55 with params: {'learning_rate': 0.00017719091300380597, 'weight_decay': 0.004, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3974,0.160737,0.944,0.944657,0.944199,0.944181
2,0.1067,0.159551,0.9495,0.95046,0.949822,0.949621


[I 2025-03-31 11:20:20,716] Trial 55 pruned. 


Trial 56 with params: {'learning_rate': 0.004913837305728667, 'weight_decay': 0.002, 'warmup_steps': 0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.3854,2.314964,0.1016,0.01016,0.1,0.018446
2,2.3019,2.310244,0.1014,0.01014,0.1,0.018413
3,2.3028,2.306221,0.1014,0.01014,0.1,0.018413


[I 2025-03-31 11:26:54,198] Trial 56 pruned. 


Trial 57 with params: {'learning_rate': 0.00013701969678280232, 'weight_decay': 0.008, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4004,0.164353,0.9445,0.945298,0.944704,0.94447
2,0.0986,0.1373,0.9572,0.957764,0.957344,0.957271
3,0.0445,0.123798,0.9646,0.964854,0.964731,0.964728


[I 2025-03-31 11:33:27,661] Trial 57 pruned. 


Trial 58 with params: {'learning_rate': 9.567504987432412e-05, 'weight_decay': 0.006, 'warmup_steps': 28}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4258,0.139733,0.9533,0.953828,0.953455,0.95335
2,0.0936,0.11659,0.9618,0.961849,0.962017,0.961823
3,0.0388,0.128567,0.9653,0.965395,0.965466,0.965402


[I 2025-03-31 11:40:02,216] Trial 58 pruned. 


Trial 59 with params: {'learning_rate': 0.00010517669827533925, 'weight_decay': 0.005, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4083,0.137214,0.9542,0.95503,0.954419,0.954299
2,0.0882,0.120821,0.963,0.962976,0.963268,0.963016
3,0.0401,0.123621,0.9655,0.965932,0.965558,0.96566
4,0.0181,0.130176,0.9679,0.967994,0.968035,0.967951
5,0.0053,0.142674,0.9711,0.971224,0.971229,0.971208
6,0.0025,0.138706,0.972,0.972161,0.972115,0.972126
7,0.0005,0.139184,0.9739,0.974045,0.974015,0.974011


[I 2025-03-31 11:55:25,878] Trial 59 finished with value: 0.9740110457684894 and parameters: {'learning_rate': 0.00010517669827533925, 'weight_decay': 0.005, 'warmup_steps': 32}. Best is trial 31 with value: 0.9741598985804594.


Trial 60 with params: {'learning_rate': 0.0011700191952905836, 'weight_decay': 0.003, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.724,0.721868,0.763,0.807957,0.762982,0.75987
2,0.3941,0.346669,0.8833,0.890399,0.883566,0.883791


[I 2025-03-31 11:59:49,553] Trial 60 pruned. 


Trial 61 with params: {'learning_rate': 0.0001654263497446901, 'weight_decay': 0.006, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3913,0.152645,0.9496,0.950096,0.949789,0.949701
2,0.101,0.135188,0.9561,0.956173,0.956376,0.956022
3,0.0493,0.122397,0.9653,0.965558,0.965421,0.96544
4,0.0249,0.128544,0.9667,0.967046,0.966802,0.966876
5,0.0091,0.142817,0.9695,0.969733,0.969619,0.969656
6,0.0025,0.148148,0.9712,0.971296,0.971351,0.971304
7,0.0005,0.143284,0.9729,0.973048,0.973032,0.973033


[I 2025-03-31 12:15:19,520] Trial 61 finished with value: 0.9730328967694384 and parameters: {'learning_rate': 0.0001654263497446901, 'weight_decay': 0.006, 'warmup_steps': 32}. Best is trial 31 with value: 0.9741598985804594.


Trial 62 with params: {'learning_rate': 0.00016135593778915874, 'weight_decay': 0.005, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3859,0.145648,0.9514,0.951794,0.951476,0.951564
2,0.0987,0.140101,0.9558,0.956869,0.955949,0.955894
3,0.0472,0.138758,0.9604,0.960425,0.960619,0.960422


[I 2025-03-31 12:21:56,437] Trial 62 pruned. 


Trial 63 with params: {'learning_rate': 0.00011905113904484478, 'weight_decay': 0.006, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4134,0.157004,0.9461,0.947612,0.946323,0.946064
2,0.0931,0.124734,0.9606,0.960755,0.960844,0.960639
3,0.0424,0.115685,0.9669,0.967213,0.966956,0.967033
4,0.0177,0.13842,0.9657,0.966104,0.965823,0.96592
5,0.0068,0.15593,0.9682,0.968538,0.968323,0.968391


[I 2025-03-31 12:33:25,351] Trial 63 pruned. 


Trial 64 with params: {'learning_rate': 0.0014740970021661379, 'weight_decay': 0.005, 'warmup_steps': 13}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.8521,0.782205,0.7412,0.785827,0.740853,0.740864
2,0.4973,0.563719,0.8229,0.839579,0.82377,0.822401


[I 2025-03-31 12:37:48,860] Trial 64 pruned. 


Trial 65 with params: {'learning_rate': 0.0006901925931088882, 'weight_decay': 0.005, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5292,0.358015,0.882,0.8927,0.882202,0.88402
2,0.2535,0.234031,0.9239,0.924867,0.924271,0.923871
3,0.1574,0.200719,0.9355,0.938011,0.935747,0.935884


[I 2025-03-31 12:44:22,781] Trial 65 pruned. 


Trial 66 with params: {'learning_rate': 0.0005606968603036112, 'weight_decay': 0.005, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4911,0.29545,0.8967,0.901575,0.896616,0.896974
2,0.2131,0.245672,0.9188,0.921018,0.919199,0.91866


[I 2025-03-31 12:48:45,619] Trial 66 pruned. 


Trial 67 with params: {'learning_rate': 0.00011036336399789536, 'weight_decay': 0.006, 'warmup_steps': 22}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3968,0.131074,0.957,0.957167,0.957251,0.957095
2,0.0916,0.131064,0.9588,0.959584,0.958906,0.958925
3,0.0378,0.12455,0.9663,0.966586,0.96641,0.966411
4,0.0168,0.131,0.9677,0.967735,0.967815,0.967754
5,0.0067,0.143039,0.9701,0.970417,0.970181,0.970274
6,0.002,0.139431,0.9721,0.972392,0.972201,0.972266
7,0.0004,0.137435,0.9731,0.973338,0.973202,0.973252


[I 2025-03-31 13:04:10,866] Trial 67 finished with value: 0.973251978300129 and parameters: {'learning_rate': 0.00011036336399789536, 'weight_decay': 0.006, 'warmup_steps': 22}. Best is trial 31 with value: 0.9741598985804594.


Trial 68 with params: {'learning_rate': 9.896656488739246e-05, 'weight_decay': 0.006, 'warmup_steps': 16}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3878,0.1412,0.9523,0.952939,0.952451,0.952482
2,0.0924,0.126248,0.96,0.960083,0.960195,0.959994
3,0.0377,0.124472,0.9662,0.966495,0.966311,0.966359
4,0.0171,0.141263,0.9673,0.96765,0.967365,0.967462
5,0.0058,0.155246,0.97,0.970407,0.970052,0.970183
6,0.0019,0.147492,0.9708,0.971054,0.970896,0.970963
7,0.0006,0.147384,0.972,0.972187,0.972074,0.97212


[I 2025-03-31 13:19:36,048] Trial 68 finished with value: 0.9721196393145413 and parameters: {'learning_rate': 9.896656488739246e-05, 'weight_decay': 0.006, 'warmup_steps': 16}. Best is trial 31 with value: 0.9741598985804594.


Trial 69 with params: {'learning_rate': 0.00029257413831553685, 'weight_decay': 0.006, 'warmup_steps': 22}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4029,0.2199,0.9282,0.930918,0.928413,0.928095
2,0.1378,0.154017,0.9503,0.950704,0.950639,0.950221


[I 2025-03-31 13:23:59,121] Trial 69 pruned. 


Trial 70 with params: {'learning_rate': 8.258996838453831e-05, 'weight_decay': 0.005, 'warmup_steps': 21}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4179,0.139503,0.9542,0.954758,0.954462,0.954312
2,0.0965,0.117885,0.9635,0.963806,0.963737,0.963483
3,0.0407,0.11724,0.9672,0.967472,0.96733,0.967342
4,0.0165,0.135607,0.9651,0.965366,0.965236,0.965162
5,0.0064,0.148658,0.9678,0.968075,0.967926,0.967933


[I 2025-03-31 13:34:56,514] Trial 70 pruned. 


Trial 71 with params: {'learning_rate': 0.00013900432881088528, 'weight_decay': 0.007, 'warmup_steps': 13}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3607,0.146687,0.9523,0.953018,0.952468,0.952409
2,0.0949,0.136188,0.9554,0.955601,0.955711,0.955328
3,0.0445,0.11297,0.9648,0.965106,0.964923,0.964956
4,0.0209,0.127628,0.9689,0.96915,0.969009,0.969031
5,0.0071,0.130247,0.9731,0.97331,0.973171,0.97322
6,0.0022,0.127702,0.9747,0.974812,0.974829,0.974808
7,0.0006,0.127135,0.9749,0.975021,0.974996,0.975002


[I 2025-03-31 13:50:23,921] Trial 71 finished with value: 0.9750019868944989 and parameters: {'learning_rate': 0.00013900432881088528, 'weight_decay': 0.007, 'warmup_steps': 13}. Best is trial 71 with value: 0.9750019868944989.


Trial 72 with params: {'learning_rate': 0.00014807185359432985, 'weight_decay': 0.008, 'warmup_steps': 12}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.36,0.14309,0.9511,0.951915,0.951163,0.951407
2,0.0975,0.130351,0.9572,0.957723,0.957391,0.957124
3,0.0453,0.131928,0.963,0.963236,0.963174,0.9631


[I 2025-03-31 13:56:59,688] Trial 72 pruned. 


Trial 73 with params: {'learning_rate': 5.489595730333042e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 14}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4795,0.144828,0.9538,0.954141,0.954028,0.953837
2,0.1011,0.107776,0.967,0.967203,0.967183,0.967107
3,0.0426,0.125803,0.9642,0.964608,0.964357,0.964366


[I 2025-03-31 14:03:35,335] Trial 73 pruned. 


Trial 74 with params: {'learning_rate': 0.0001420194299814645, 'weight_decay': 0.007, 'warmup_steps': 15}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3663,0.165377,0.9442,0.945452,0.944388,0.94412
2,0.0938,0.131507,0.9567,0.957279,0.956964,0.956753
3,0.0444,0.128995,0.9635,0.963592,0.963624,0.963523


[I 2025-03-31 14:10:10,244] Trial 74 pruned. 


Trial 75 with params: {'learning_rate': 0.00011221834328945825, 'weight_decay': 0.007, 'warmup_steps': 22}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4011,0.161015,0.9465,0.947505,0.946682,0.946713
2,0.0943,0.120991,0.9604,0.960756,0.960633,0.960483
3,0.0384,0.114727,0.9676,0.967725,0.967679,0.967649
4,0.0175,0.1402,0.9655,0.965696,0.965631,0.965572
5,0.0067,0.145365,0.9693,0.969509,0.969474,0.969447


[I 2025-03-31 14:21:08,369] Trial 75 pruned. 


Trial 76 with params: {'learning_rate': 0.00024402219890752626, 'weight_decay': 0.005, 'warmup_steps': 9}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3603,0.197819,0.9346,0.93922,0.93474,0.935061
2,0.124,0.179124,0.9456,0.946632,0.946058,0.945413


[I 2025-03-31 14:25:32,523] Trial 76 pruned. 


Trial 77 with params: {'learning_rate': 0.00020296540471280374, 'weight_decay': 0.007, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.387,0.201739,0.9296,0.932932,0.929677,0.930074
2,0.1097,0.165479,0.9467,0.947579,0.947135,0.946616
3,0.0568,0.138447,0.9599,0.960532,0.960006,0.96012


[I 2025-03-31 14:32:09,745] Trial 77 pruned. 


Trial 78 with params: {'learning_rate': 0.00015188915245702766, 'weight_decay': 0.006, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3881,0.162335,0.946,0.946942,0.946138,0.946189
2,0.0978,0.153584,0.9528,0.952922,0.953145,0.952628
3,0.0475,0.12916,0.9636,0.963855,0.96371,0.963636


[I 2025-03-31 14:38:46,061] Trial 78 pruned. 


Trial 79 with params: {'learning_rate': 7.092826008348294e-05, 'weight_decay': 0.006, 'warmup_steps': 31}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4682,0.144378,0.9524,0.953183,0.952619,0.952539
2,0.0932,0.122207,0.9617,0.961997,0.961904,0.961721
3,0.0376,0.122986,0.9671,0.967514,0.967203,0.967178
4,0.0152,0.132287,0.9679,0.968192,0.968013,0.968048
5,0.0053,0.144381,0.9691,0.969387,0.96917,0.969228


[I 2025-03-31 14:49:44,980] Trial 79 pruned. 


Trial 80 with params: {'learning_rate': 0.00018807914813120138, 'weight_decay': 0.003, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3749,0.166874,0.9459,0.946921,0.946092,0.946045
2,0.1076,0.147768,0.9523,0.953053,0.952733,0.95232
3,0.0546,0.152002,0.9559,0.956284,0.956057,0.955975


[I 2025-03-31 14:56:20,217] Trial 80 pruned. 


Trial 81 with params: {'learning_rate': 7.463271683902057e-05, 'weight_decay': 0.007, 'warmup_steps': 12}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.413,0.137655,0.9539,0.954398,0.954147,0.953987
2,0.0893,0.128039,0.9614,0.961348,0.961716,0.961346
3,0.0363,0.128998,0.9633,0.963772,0.963452,0.96349


[I 2025-03-31 15:02:56,842] Trial 81 pruned. 


Trial 82 with params: {'learning_rate': 0.0002580378349367608, 'weight_decay': 0.007, 'warmup_steps': 14}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3734,0.192195,0.9377,0.939299,0.937728,0.937955
2,0.1275,0.199652,0.9381,0.939637,0.938523,0.937794
3,0.069,0.142249,0.956,0.95701,0.956031,0.95627


[I 2025-03-31 15:09:33,492] Trial 82 pruned. 


Trial 83 with params: {'learning_rate': 5.2398098069044525e-05, 'weight_decay': 0.006, 'warmup_steps': 15}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4806,0.145899,0.9515,0.951954,0.951866,0.951492
2,0.1006,0.112968,0.9655,0.965773,0.965735,0.965603
3,0.0421,0.121312,0.9652,0.965471,0.965329,0.965296


[I 2025-03-31 15:16:08,192] Trial 83 pruned. 


Trial 84 with params: {'learning_rate': 9.189888369605919e-05, 'weight_decay': 0.007, 'warmup_steps': 19}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4091,0.136561,0.9548,0.955196,0.955129,0.95484
2,0.0895,0.112422,0.9637,0.963905,0.963869,0.963789
3,0.0384,0.121325,0.9665,0.966752,0.966656,0.966584
4,0.0164,0.13486,0.9699,0.970116,0.97001,0.970033
5,0.0049,0.147159,0.9701,0.970279,0.970247,0.970235
6,0.0014,0.146495,0.9704,0.970561,0.970533,0.970539
7,0.0004,0.14795,0.9714,0.971616,0.971488,0.971542


[I 2025-03-31 15:31:34,495] Trial 84 finished with value: 0.9715415534732854 and parameters: {'learning_rate': 9.189888369605919e-05, 'weight_decay': 0.007, 'warmup_steps': 19}. Best is trial 71 with value: 0.9750019868944989.


Trial 85 with params: {'learning_rate': 0.0005758128746324003, 'weight_decay': 0.0, 'warmup_steps': 12}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.469,0.314297,0.8968,0.904265,0.896784,0.897607
2,0.2165,0.243651,0.9185,0.921164,0.918735,0.918498


[I 2025-03-31 15:35:57,499] Trial 85 pruned. 


Trial 86 with params: {'learning_rate': 0.00010465161260600638, 'weight_decay': 0.006, 'warmup_steps': 20}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4027,0.156446,0.947,0.947998,0.947202,0.947065
2,0.0965,0.130017,0.9581,0.958532,0.958333,0.95812
3,0.0421,0.12284,0.9643,0.964757,0.964368,0.964426


[I 2025-03-31 15:42:31,730] Trial 86 pruned. 


Trial 87 with params: {'learning_rate': 0.00020172388478995225, 'weight_decay': 0.006, 'warmup_steps': 16}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3716,0.194661,0.9372,0.937884,0.937466,0.937161
2,0.1109,0.132164,0.9575,0.95748,0.957752,0.957488
3,0.0554,0.139582,0.96,0.960077,0.960212,0.960036


[I 2025-03-31 15:49:06,242] Trial 87 pruned. 


Trial 88 with params: {'learning_rate': 0.00012500179425159816, 'weight_decay': 0.002, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3959,0.140462,0.9534,0.954191,0.953513,0.953571
2,0.093,0.14064,0.9551,0.955329,0.95544,0.955021


[I 2025-03-31 15:53:30,208] Trial 88 pruned. 


Trial 89 with params: {'learning_rate': 0.00013527594175644146, 'weight_decay': 0.007, 'warmup_steps': 4}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3547,0.147898,0.9495,0.949872,0.949771,0.949509
2,0.0935,0.137624,0.9577,0.958133,0.95791,0.95766
3,0.0447,0.129782,0.9628,0.963033,0.963022,0.962969


[I 2025-03-31 16:00:03,341] Trial 89 pruned. 


Trial 90 with params: {'learning_rate': 5.097688155314207e-05, 'weight_decay': 0.006, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5086,0.143748,0.9526,0.952999,0.952813,0.952702
2,0.1046,0.11639,0.9629,0.962944,0.963143,0.962923
3,0.0409,0.117943,0.9653,0.965562,0.965394,0.965387
4,0.0163,0.127293,0.967,0.967148,0.967133,0.967122
5,0.0056,0.140935,0.9691,0.969138,0.969279,0.96918


[I 2025-03-31 16:11:01,355] Trial 90 pruned. 


Trial 91 with params: {'learning_rate': 5.278226667584257e-05, 'weight_decay': 0.004, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5009,0.144643,0.9517,0.952169,0.95202,0.951756
2,0.1003,0.115152,0.9656,0.965791,0.965775,0.965731
3,0.0404,0.122871,0.9653,0.965565,0.965442,0.965454
4,0.0169,0.13669,0.9661,0.966385,0.966254,0.966201
5,0.0054,0.156116,0.9679,0.968087,0.968034,0.968027


[I 2025-03-31 16:22:01,073] Trial 91 pruned. 


Trial 92 with params: {'learning_rate': 8.967002959671585e-05, 'weight_decay': 0.008, 'warmup_steps': 29}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.425,0.134099,0.9558,0.956139,0.956035,0.955933
2,0.0899,0.125102,0.9607,0.960823,0.960953,0.960684
3,0.0374,0.13352,0.9632,0.963492,0.963278,0.963305


[I 2025-03-31 16:28:36,824] Trial 92 pruned. 


Trial 93 with params: {'learning_rate': 0.00014996012751069385, 'weight_decay': 0.004, 'warmup_steps': 15}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3689,0.162454,0.9471,0.947365,0.947287,0.947095
2,0.0962,0.122266,0.9597,0.959805,0.959854,0.959734
3,0.0466,0.127178,0.962,0.962312,0.962096,0.962175


[I 2025-03-31 16:35:12,190] Trial 93 pruned. 


Trial 94 with params: {'learning_rate': 0.00010656572287281419, 'weight_decay': 0.004, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4101,0.135753,0.955,0.955692,0.955162,0.955058
2,0.0916,0.124782,0.9616,0.96186,0.961802,0.961617
3,0.0377,0.12838,0.9664,0.966742,0.966504,0.966552
4,0.0175,0.131934,0.9683,0.968475,0.968457,0.968448
5,0.0068,0.145984,0.9704,0.970594,0.970535,0.970516
6,0.002,0.139759,0.974,0.974165,0.9741,0.974111
7,0.0006,0.14042,0.9742,0.974337,0.974279,0.974302


[I 2025-03-31 16:50:35,974] Trial 94 finished with value: 0.9743022969453872 and parameters: {'learning_rate': 0.00010656572287281419, 'weight_decay': 0.004, 'warmup_steps': 25}. Best is trial 71 with value: 0.9750019868944989.


Trial 95 with params: {'learning_rate': 8.007471738963006e-05, 'weight_decay': 0.003, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4279,0.142969,0.9536,0.953906,0.953801,0.953699
2,0.0934,0.118455,0.9621,0.9625,0.962215,0.962222
3,0.0365,0.124175,0.9652,0.965441,0.965336,0.965312


[I 2025-03-31 16:57:10,499] Trial 95 pruned. 


Trial 96 with params: {'learning_rate': 0.00021639034482612632, 'weight_decay': 0.005, 'warmup_steps': 28}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3861,0.188421,0.9397,0.94142,0.939929,0.939788
2,0.1144,0.149017,0.9525,0.952812,0.952663,0.95249
3,0.0591,0.141019,0.9595,0.959612,0.959729,0.959611


[I 2025-03-31 17:03:44,927] Trial 96 pruned. 


Trial 97 with params: {'learning_rate': 9.472663200258537e-05, 'weight_decay': 0.004, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4075,0.152727,0.9482,0.949161,0.948375,0.948381
2,0.088,0.122585,0.9618,0.961861,0.962006,0.961887
3,0.0373,0.120479,0.9664,0.966818,0.966505,0.966618
4,0.0158,0.143005,0.9656,0.965907,0.965739,0.965759
5,0.006,0.138221,0.9711,0.971238,0.971221,0.97121
6,0.0013,0.143555,0.9707,0.970825,0.970852,0.970826
7,0.0005,0.14348,0.9713,0.971465,0.971428,0.971433


[I 2025-03-31 17:19:07,908] Trial 97 finished with value: 0.9714328845772571 and parameters: {'learning_rate': 9.472663200258537e-05, 'weight_decay': 0.004, 'warmup_steps': 26}. Best is trial 71 with value: 0.9750019868944989.


Trial 98 with params: {'learning_rate': 0.0035054904723296637, 'weight_decay': 0.009000000000000001, 'warmup_steps': 0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.3287,2.309231,0.1294,0.037129,0.128184,0.053074
2,2.3131,2.506704,0.0952,0.00952,0.1,0.017385


[I 2025-03-31 17:23:31,241] Trial 98 pruned. 


Trial 99 with params: {'learning_rate': 0.0001739989607172103, 'weight_decay': 0.007, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3917,0.153653,0.9464,0.947232,0.946524,0.946713
2,0.1036,0.127979,0.9604,0.961096,0.960475,0.96056
3,0.0495,0.15118,0.9556,0.956276,0.95578,0.955851


[I 2025-03-31 17:30:06,432] Trial 99 pruned. 


Trial 100 with params: {'learning_rate': 0.00012839653725627988, 'weight_decay': 0.005, 'warmup_steps': 21}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3761,0.160195,0.9467,0.947942,0.946956,0.946787
2,0.0925,0.12979,0.9591,0.959306,0.959301,0.959197
3,0.0428,0.121591,0.9639,0.964039,0.964063,0.964015


[I 2025-03-31 17:36:40,419] Trial 100 pruned. 


Trial 101 with params: {'learning_rate': 7.418250194956502e-05, 'weight_decay': 0.007, 'warmup_steps': 28}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4508,0.134968,0.9578,0.957742,0.958046,0.957818
2,0.0981,0.110051,0.9665,0.966584,0.966671,0.966605
3,0.0392,0.118203,0.9656,0.966026,0.96568,0.965814
4,0.0165,0.134701,0.9681,0.968654,0.968239,0.968291
5,0.0052,0.144849,0.9714,0.971633,0.971496,0.971533
6,0.0015,0.145579,0.9707,0.970864,0.970835,0.970821
7,0.0005,0.145032,0.9719,0.972086,0.972003,0.972032


[I 2025-03-31 17:52:09,170] Trial 101 finished with value: 0.9720324453117193 and parameters: {'learning_rate': 7.418250194956502e-05, 'weight_decay': 0.007, 'warmup_steps': 28}. Best is trial 71 with value: 0.9750019868944989.


Trial 102 with params: {'learning_rate': 8.218751434188152e-05, 'weight_decay': 0.007, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4379,0.141238,0.9522,0.952578,0.952448,0.952171
2,0.0939,0.118699,0.9629,0.963616,0.963045,0.963013
3,0.0378,0.125661,0.9652,0.965319,0.965332,0.965299


[I 2025-03-31 17:58:44,477] Trial 102 pruned. 


Trial 103 with params: {'learning_rate': 5.059652985464632e-05, 'weight_decay': 0.008, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4968,0.140388,0.9548,0.955002,0.955095,0.954874
2,0.1026,0.114689,0.9643,0.964335,0.964528,0.964393
3,0.0429,0.128175,0.9635,0.963801,0.963623,0.963589


[I 2025-03-31 18:05:21,474] Trial 103 pruned. 


Trial 104 with params: {'learning_rate': 0.00010982893323116206, 'weight_decay': 0.006, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3937,0.132963,0.9551,0.955356,0.955325,0.95516
2,0.0959,0.129678,0.959,0.959909,0.959209,0.959059
3,0.0398,0.1149,0.967,0.967114,0.967151,0.967087
4,0.0178,0.123919,0.9692,0.96941,0.969274,0.969304
5,0.0072,0.140279,0.9712,0.971429,0.97131,0.971323
6,0.0018,0.138578,0.9709,0.971127,0.971017,0.971062
7,0.0007,0.137605,0.972,0.972257,0.972096,0.972163


[I 2025-03-31 18:20:44,606] Trial 104 finished with value: 0.9721629154130721 and parameters: {'learning_rate': 0.00010982893323116206, 'weight_decay': 0.006, 'warmup_steps': 25}. Best is trial 71 with value: 0.9750019868944989.


Trial 105 with params: {'learning_rate': 0.0006078662726350267, 'weight_decay': 0.01, 'warmup_steps': 2}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4568,0.275903,0.9085,0.911945,0.908539,0.908651
2,0.2203,0.269577,0.9141,0.918892,0.914155,0.914254
3,0.1402,0.180249,0.9433,0.943801,0.943426,0.943398


[I 2025-03-31 18:27:18,458] Trial 105 pruned. 


Trial 106 with params: {'learning_rate': 0.00016497820663983865, 'weight_decay': 0.006, 'warmup_steps': 22}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.373,0.174056,0.9418,0.943446,0.941903,0.942073
2,0.1025,0.154856,0.9527,0.953531,0.95296,0.952636


[I 2025-03-31 18:31:43,230] Trial 106 pruned. 


Trial 107 with params: {'learning_rate': 0.00015356685144789174, 'weight_decay': 0.005, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3897,0.150206,0.9501,0.950921,0.95036,0.950144
2,0.0987,0.12639,0.96,0.960234,0.960265,0.960076
3,0.0457,0.13681,0.9604,0.960788,0.96056,0.960553


[I 2025-03-31 18:38:19,847] Trial 107 pruned. 


Trial 108 with params: {'learning_rate': 6.944863740592477e-05, 'weight_decay': 0.004, 'warmup_steps': 20}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4342,0.141312,0.9536,0.954026,0.953892,0.953602
2,0.0943,0.118329,0.9626,0.963011,0.962867,0.962683
3,0.0364,0.116897,0.9669,0.967079,0.967004,0.966987
4,0.0149,0.129371,0.9679,0.968294,0.967946,0.968068
5,0.0045,0.15136,0.9682,0.96841,0.968333,0.968357


[I 2025-03-31 18:49:19,090] Trial 108 pruned. 


Trial 109 with params: {'learning_rate': 0.00013212941343850463, 'weight_decay': 0.005, 'warmup_steps': 28}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3928,0.151349,0.9492,0.950106,0.949346,0.949397
2,0.0912,0.138996,0.9566,0.956615,0.956881,0.956576


[I 2025-03-31 18:53:42,594] Trial 109 pruned. 


Trial 110 with params: {'learning_rate': 0.00013257381844268332, 'weight_decay': 0.005, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.387,0.152372,0.9482,0.948997,0.948411,0.948259
2,0.0962,0.128123,0.9593,0.959847,0.959475,0.959447
3,0.0437,0.121272,0.9654,0.965644,0.965429,0.965496
4,0.0207,0.134388,0.9676,0.967897,0.967672,0.967745
5,0.008,0.145728,0.97,0.970357,0.970125,0.970084


[I 2025-03-31 19:04:40,976] Trial 110 pruned. 


Trial 111 with params: {'learning_rate': 0.00014054355003985592, 'weight_decay': 0.008, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3888,0.140337,0.953,0.953265,0.953155,0.952961
2,0.0955,0.134681,0.9559,0.956453,0.956119,0.955941
3,0.0434,0.134736,0.9611,0.961212,0.961292,0.961194


[I 2025-03-31 19:11:16,587] Trial 111 pruned. 


Trial 112 with params: {'learning_rate': 9.206052553146995e-05, 'weight_decay': 0.006, 'warmup_steps': 28}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4157,0.138651,0.9541,0.954145,0.95434,0.954161
2,0.0915,0.122896,0.9619,0.961978,0.962096,0.9619
3,0.0382,0.123776,0.9657,0.965971,0.965786,0.965852
4,0.0162,0.138799,0.9682,0.968387,0.968337,0.968287
5,0.0055,0.14981,0.9689,0.969162,0.969035,0.96907


[I 2025-03-31 19:22:14,995] Trial 112 pruned. 


Trial 113 with params: {'learning_rate': 0.00010077603890986858, 'weight_decay': 0.006, 'warmup_steps': 28}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4139,0.137767,0.954,0.954331,0.954252,0.954097
2,0.0903,0.122514,0.9616,0.962065,0.961874,0.961656
3,0.0389,0.120503,0.9666,0.966845,0.966735,0.966741
4,0.0173,0.132975,0.9688,0.969183,0.968847,0.968935
5,0.0055,0.141619,0.9703,0.970637,0.970394,0.970382
6,0.0017,0.137644,0.9722,0.97235,0.972312,0.972314
7,0.0004,0.1366,0.9741,0.974334,0.974163,0.974233


[I 2025-03-31 19:37:39,945] Trial 113 finished with value: 0.9742334367768942 and parameters: {'learning_rate': 0.00010077603890986858, 'weight_decay': 0.006, 'warmup_steps': 28}. Best is trial 71 with value: 0.9750019868944989.


Trial 114 with params: {'learning_rate': 0.00024076189381611062, 'weight_decay': 0.006, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4012,0.184966,0.9377,0.938164,0.937927,0.937541
2,0.1207,0.148287,0.9498,0.950313,0.950033,0.94978


[I 2025-03-31 19:42:03,414] Trial 114 pruned. 


Trial 115 with params: {'learning_rate': 0.00011343824777524984, 'weight_decay': 0.005, 'warmup_steps': 29}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3983,0.145244,0.9531,0.953829,0.953269,0.953167
2,0.0904,0.119136,0.9632,0.96342,0.963476,0.963252
3,0.0407,0.122642,0.9644,0.964476,0.964543,0.964471
4,0.0179,0.135236,0.9675,0.967717,0.967647,0.967625
5,0.0062,0.151287,0.9694,0.969648,0.969534,0.969573


[I 2025-03-31 19:53:02,139] Trial 115 pruned. 


Trial 116 with params: {'learning_rate': 7.81606605201484e-05, 'weight_decay': 0.004, 'warmup_steps': 10}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4113,0.134561,0.955,0.955385,0.955257,0.95509
2,0.0912,0.109558,0.967,0.967174,0.967161,0.967101
3,0.0368,0.112377,0.9662,0.966617,0.966307,0.96642
4,0.0162,0.127811,0.9693,0.96976,0.969476,0.969497
5,0.0053,0.140038,0.9714,0.97179,0.971462,0.971583
6,0.0015,0.137651,0.9715,0.971863,0.971607,0.971712
7,0.0005,0.137697,0.9733,0.973563,0.973395,0.973464


[I 2025-03-31 20:08:26,503] Trial 116 finished with value: 0.9734640260915528 and parameters: {'learning_rate': 7.81606605201484e-05, 'weight_decay': 0.004, 'warmup_steps': 10}. Best is trial 71 with value: 0.9750019868944989.


Trial 117 with params: {'learning_rate': 8.96183325971753e-05, 'weight_decay': 0.006, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4226,0.1342,0.9542,0.954616,0.954486,0.954355
2,0.091,0.117043,0.9637,0.963944,0.963877,0.963791
3,0.0381,0.121536,0.9669,0.967022,0.96707,0.966972
4,0.0162,0.131888,0.9684,0.968774,0.968459,0.968571
5,0.0059,0.135504,0.972,0.972261,0.972145,0.972172
6,0.0019,0.140516,0.9717,0.972047,0.971841,0.971923
7,0.0007,0.142346,0.973,0.973283,0.973126,0.973192


[I 2025-03-31 20:23:50,292] Trial 117 finished with value: 0.9731920743968816 and parameters: {'learning_rate': 8.96183325971753e-05, 'weight_decay': 0.006, 'warmup_steps': 26}. Best is trial 71 with value: 0.9750019868944989.


Trial 118 with params: {'learning_rate': 6.249646173346577e-05, 'weight_decay': 0.005, 'warmup_steps': 8}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4341,0.139779,0.9537,0.954275,0.953983,0.953757
2,0.0924,0.118204,0.9628,0.96331,0.963053,0.962952
3,0.0356,0.12724,0.9643,0.964679,0.964348,0.964401


[I 2025-03-31 20:30:26,062] Trial 118 pruned. 


Trial 119 with params: {'learning_rate': 6.888232835022142e-05, 'weight_decay': 0.002, 'warmup_steps': 11}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4255,0.140131,0.9527,0.952755,0.953086,0.952757
2,0.0949,0.119521,0.9626,0.962964,0.962884,0.962637
3,0.0362,0.12045,0.9683,0.968688,0.968347,0.968451
4,0.0137,0.139101,0.9676,0.967925,0.96775,0.967762
5,0.0056,0.136605,0.9708,0.971054,0.970953,0.970982
6,0.0013,0.145489,0.972,0.972194,0.972123,0.972145
7,0.0006,0.1454,0.9731,0.973288,0.973191,0.973232


[I 2025-03-31 20:45:52,240] Trial 119 finished with value: 0.9732315491187495 and parameters: {'learning_rate': 6.888232835022142e-05, 'weight_decay': 0.002, 'warmup_steps': 11}. Best is trial 71 with value: 0.9750019868944989.


Trial 120 with params: {'learning_rate': 6.389952747944363e-05, 'weight_decay': 0.003, 'warmup_steps': 13}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4348,0.139865,0.9528,0.953502,0.953102,0.952854
2,0.0946,0.116071,0.9654,0.965542,0.965557,0.965506
3,0.0373,0.128316,0.9656,0.966294,0.965747,0.965768
4,0.0144,0.145877,0.9648,0.965098,0.964923,0.964962
5,0.0052,0.159401,0.9673,0.967391,0.967505,0.967399


[I 2025-03-31 20:56:50,644] Trial 120 pruned. 


Trial 121 with params: {'learning_rate': 0.00016810195015580037, 'weight_decay': 0.001, 'warmup_steps': 1}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3434,0.155801,0.9454,0.946429,0.945579,0.945609
2,0.1027,0.138639,0.9562,0.956336,0.956553,0.956151


[I 2025-03-31 21:01:15,135] Trial 121 pruned. 


Trial 122 with params: {'learning_rate': 9.544870481880747e-05, 'weight_decay': 0.004, 'warmup_steps': 4}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3688,0.145676,0.9511,0.952364,0.951101,0.95132
2,0.0885,0.121804,0.962,0.962196,0.962232,0.962064
3,0.037,0.131781,0.9672,0.967463,0.96735,0.967358
4,0.0178,0.13822,0.9677,0.968112,0.967847,0.967899
5,0.0058,0.143374,0.97,0.970401,0.970066,0.970162
6,0.0015,0.146506,0.9715,0.971734,0.971589,0.971643
7,0.0005,0.140775,0.9735,0.973678,0.973595,0.973629


[I 2025-03-31 21:16:39,503] Trial 122 finished with value: 0.9736288554187194 and parameters: {'learning_rate': 9.544870481880747e-05, 'weight_decay': 0.004, 'warmup_steps': 4}. Best is trial 71 with value: 0.9750019868944989.


Trial 123 with params: {'learning_rate': 9.356030302385796e-05, 'weight_decay': 0.001, 'warmup_steps': 12}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.382,0.13562,0.9555,0.955844,0.955803,0.955503
2,0.0902,0.119748,0.9613,0.961618,0.961535,0.961327
3,0.038,0.119852,0.9671,0.967497,0.967165,0.967241
4,0.0166,0.146038,0.9656,0.966011,0.965747,0.965769
5,0.0052,0.144043,0.969,0.969039,0.969156,0.969077


[I 2025-03-31 21:27:36,068] Trial 123 pruned. 


Trial 124 with params: {'learning_rate': 0.00011925156364483786, 'weight_decay': 0.004, 'warmup_steps': 3}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.351,0.16064,0.9454,0.946852,0.945498,0.945557
2,0.0911,0.140784,0.9568,0.957189,0.957039,0.956824
3,0.0413,0.120398,0.966,0.966118,0.966171,0.966098
4,0.0193,0.138979,0.9668,0.967231,0.966923,0.966967
5,0.0071,0.149608,0.9694,0.969691,0.969508,0.969518


[I 2025-03-31 21:38:35,780] Trial 124 pruned. 


Trial 125 with params: {'learning_rate': 8.489034863744123e-05, 'weight_decay': 0.001, 'warmup_steps': 7}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3879,0.139169,0.9518,0.952008,0.952206,0.951745
2,0.0902,0.114598,0.9637,0.964075,0.963926,0.963788
3,0.037,0.117383,0.9671,0.967584,0.967201,0.9673
4,0.0157,0.141909,0.9669,0.967348,0.966984,0.967091
5,0.0048,0.143681,0.9698,0.970146,0.969909,0.969945


[I 2025-03-31 21:49:35,320] Trial 125 pruned. 


Trial 126 with params: {'learning_rate': 0.00010622498212202575, 'weight_decay': 0.004, 'warmup_steps': 7}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3731,0.130872,0.959,0.959299,0.959198,0.959071
2,0.0911,0.130655,0.9589,0.959353,0.959085,0.958972
3,0.0395,0.127587,0.9645,0.964453,0.964719,0.964508
4,0.0182,0.143003,0.9642,0.964547,0.964383,0.964312
5,0.0061,0.151225,0.9699,0.970135,0.970049,0.970045
6,0.0018,0.146851,0.9717,0.971878,0.971862,0.971848
7,0.0007,0.147397,0.9723,0.972439,0.972453,0.972431


[I 2025-03-31 22:05:00,620] Trial 126 finished with value: 0.9724307474752221 and parameters: {'learning_rate': 0.00010622498212202575, 'weight_decay': 0.004, 'warmup_steps': 7}. Best is trial 71 with value: 0.9750019868944989.


Trial 127 with params: {'learning_rate': 5.730962712611121e-05, 'weight_decay': 0.002, 'warmup_steps': 6}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4388,0.136128,0.9549,0.955213,0.955182,0.95498
2,0.0954,0.116417,0.9644,0.964526,0.964618,0.964418
3,0.0385,0.124887,0.9668,0.967103,0.966904,0.966946
4,0.0152,0.137955,0.9676,0.967809,0.967744,0.967722
5,0.0056,0.153028,0.9674,0.967652,0.967536,0.967558


[I 2025-03-31 22:15:59,909] Trial 127 pruned. 


Trial 128 with params: {'learning_rate': 0.00037696430184407483, 'weight_decay': 0.003, 'warmup_steps': 2}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3794,0.249777,0.916,0.918452,0.916113,0.916064
2,0.158,0.161239,0.9497,0.950064,0.949922,0.949706


[I 2025-03-31 22:20:23,996] Trial 128 pruned. 


Trial 129 with params: {'learning_rate': 7.447386369523557e-05, 'weight_decay': 0.003, 'warmup_steps': 13}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4077,0.135431,0.9544,0.954733,0.954647,0.954424
2,0.0878,0.124751,0.9616,0.961759,0.961792,0.961544
3,0.0349,0.129892,0.9634,0.963885,0.963512,0.963598


[I 2025-03-31 22:26:58,109] Trial 129 pruned. 


Trial 130 with params: {'learning_rate': 0.00012039065039566897, 'weight_decay': 0.003, 'warmup_steps': 6}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3569,0.142298,0.9533,0.953636,0.953486,0.953405
2,0.0927,0.129347,0.9603,0.960829,0.960466,0.960389
3,0.0411,0.121481,0.9648,0.964976,0.964985,0.964952
4,0.0188,0.145236,0.9662,0.966701,0.966384,0.966474
5,0.0059,0.149498,0.9693,0.969553,0.96946,0.969471


[I 2025-03-31 22:37:56,853] Trial 130 pruned. 


Trial 131 with params: {'learning_rate': 7.520774205667585e-05, 'weight_decay': 0.004, 'warmup_steps': 3}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4033,0.128357,0.9561,0.956553,0.956238,0.956242
2,0.0886,0.122819,0.9614,0.961994,0.961686,0.961444
3,0.0374,0.130873,0.9657,0.965925,0.965885,0.965782
4,0.0157,0.135575,0.9667,0.966951,0.966881,0.966869
5,0.0046,0.149369,0.9697,0.969845,0.969887,0.969826


[I 2025-03-31 22:48:57,547] Trial 131 pruned. 


Trial 132 with params: {'learning_rate': 0.00015559793413379416, 'weight_decay': 0.002, 'warmup_steps': 9}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3565,0.143943,0.9502,0.950753,0.950388,0.950451
2,0.0977,0.147107,0.9543,0.954677,0.954562,0.954251
3,0.0463,0.131947,0.9619,0.962525,0.961968,0.962077


[I 2025-03-31 22:55:33,357] Trial 132 pruned. 


Trial 133 with params: {'learning_rate': 6.291255792485352e-05, 'weight_decay': 0.004, 'warmup_steps': 9}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4383,0.138429,0.9549,0.955166,0.955159,0.955006
2,0.1001,0.113339,0.9629,0.963255,0.963075,0.962963
3,0.0384,0.122045,0.9673,0.967782,0.967389,0.967487
4,0.016,0.142893,0.9654,0.965764,0.965525,0.965554
5,0.0055,0.149045,0.9688,0.96911,0.968866,0.968929


[I 2025-03-31 23:06:34,103] Trial 133 pruned. 


Trial 134 with params: {'learning_rate': 9.95878837716165e-05, 'weight_decay': 0.004, 'warmup_steps': 7}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3714,0.148323,0.9512,0.951783,0.951527,0.951191
2,0.0917,0.132513,0.96,0.96005,0.960203,0.959941
3,0.0401,0.130254,0.9638,0.964484,0.963766,0.963962


[I 2025-03-31 23:13:08,283] Trial 134 pruned. 


Trial 135 with params: {'learning_rate': 9.285787592587367e-05, 'weight_decay': 0.005, 'warmup_steps': 7}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3745,0.133892,0.954,0.954384,0.954247,0.954088
2,0.0903,0.123666,0.9627,0.962621,0.962993,0.962632
3,0.0378,0.124146,0.9615,0.961616,0.961616,0.961564


[I 2025-03-31 23:19:43,066] Trial 135 pruned. 


Trial 136 with params: {'learning_rate': 0.0003194366766741562, 'weight_decay': 0.003, 'warmup_steps': 11}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3762,0.222111,0.9286,0.930929,0.928767,0.928721
2,0.1401,0.172091,0.9426,0.943824,0.942858,0.942944
3,0.0813,0.139566,0.9566,0.956973,0.956732,0.956702


[I 2025-03-31 23:26:17,870] Trial 136 pruned. 


Trial 137 with params: {'learning_rate': 0.00013779430890626954, 'weight_decay': 0.005, 'warmup_steps': 11}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3569,0.148084,0.9504,0.951232,0.9506,0.950607
2,0.0924,0.122058,0.9605,0.960699,0.960799,0.960611
3,0.0441,0.122213,0.9648,0.964883,0.965053,0.96486


[I 2025-03-31 23:32:52,725] Trial 137 pruned. 


Trial 138 with params: {'learning_rate': 0.00013086161901401913, 'weight_decay': 0.004, 'warmup_steps': 2}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3516,0.139352,0.9515,0.952545,0.95168,0.951662
2,0.0946,0.125803,0.9599,0.96016,0.960202,0.959972
3,0.0423,0.127059,0.9627,0.963312,0.962756,0.962924


[I 2025-03-31 23:39:27,368] Trial 138 pruned. 


Trial 139 with params: {'learning_rate': 0.00014117267894586278, 'weight_decay': 0.005, 'warmup_steps': 31}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3994,0.14629,0.9529,0.954282,0.953019,0.953109
2,0.0937,0.134751,0.9573,0.957428,0.957572,0.957227
3,0.0426,0.128022,0.9643,0.964463,0.964499,0.964436
4,0.0206,0.141076,0.9644,0.964977,0.964491,0.964644
5,0.0073,0.148212,0.969,0.96928,0.969102,0.969136


[I 2025-03-31 23:50:26,786] Trial 139 pruned. 


Trial 140 with params: {'learning_rate': 0.00010273562540320119, 'weight_decay': 0.006, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4155,0.136418,0.9548,0.955221,0.95497,0.954948
2,0.0903,0.124149,0.9591,0.959247,0.959314,0.959112
3,0.0383,0.127241,0.9654,0.96569,0.965576,0.965547
4,0.0164,0.135694,0.968,0.968157,0.968171,0.968108
5,0.0063,0.151352,0.969,0.969371,0.969095,0.969178


[I 2025-04-01 00:01:26,511] Trial 140 pruned. 


Trial 141 with params: {'learning_rate': 9.267025562646518e-05, 'weight_decay': 0.005, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4274,0.138101,0.9556,0.955981,0.955767,0.955688
2,0.0928,0.115374,0.9628,0.963026,0.962926,0.962884
3,0.0371,0.114697,0.9682,0.968413,0.968237,0.968281
4,0.017,0.139788,0.9653,0.965689,0.965451,0.965411
5,0.0054,0.147315,0.9686,0.968682,0.968789,0.968703


[I 2025-04-01 00:12:22,853] Trial 141 pruned. 


Trial 142 with params: {'learning_rate': 6.358284950952197e-05, 'weight_decay': 0.005, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4587,0.143425,0.9523,0.952477,0.952567,0.952267
2,0.0934,0.116747,0.9639,0.964046,0.964083,0.963949
3,0.0368,0.12517,0.9649,0.965282,0.965001,0.965089


[I 2025-04-01 00:18:56,652] Trial 142 pruned. 


Trial 143 with params: {'learning_rate': 9.19710849294985e-05, 'weight_decay': 0.006, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4259,0.1293,0.9574,0.957658,0.957567,0.957505
2,0.0952,0.114776,0.9632,0.963413,0.963418,0.963292
3,0.0401,0.122802,0.9647,0.964747,0.964896,0.964781
4,0.0175,0.136582,0.9662,0.966799,0.966378,0.96645
5,0.007,0.145212,0.969,0.969455,0.969115,0.969184


[I 2025-04-01 00:29:54,792] Trial 143 pruned. 


Trial 144 with params: {'learning_rate': 0.0003457428530838075, 'weight_decay': 0.009000000000000001, 'warmup_steps': 31}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.421,0.252914,0.9126,0.917791,0.913083,0.912931
2,0.1502,0.15436,0.9498,0.950012,0.950047,0.949655


[I 2025-04-01 00:34:18,538] Trial 144 pruned. 


Trial 145 with params: {'learning_rate': 0.00010550389942031643, 'weight_decay': 0.007, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3957,0.138273,0.9546,0.954866,0.954824,0.95466
2,0.0877,0.128164,0.9593,0.959804,0.959426,0.959364
3,0.0377,0.117966,0.9655,0.965753,0.965646,0.965664
4,0.0167,0.140443,0.9687,0.968935,0.968798,0.968786
5,0.0069,0.140423,0.9708,0.970957,0.970906,0.970902
6,0.0016,0.141638,0.9726,0.972921,0.972653,0.972753
7,0.0005,0.142052,0.9737,0.973879,0.973753,0.973803


[I 2025-04-01 00:49:42,662] Trial 145 finished with value: 0.9738031017027329 and parameters: {'learning_rate': 0.00010550389942031643, 'weight_decay': 0.007, 'warmup_steps': 27}. Best is trial 71 with value: 0.9750019868944989.


Trial 146 with params: {'learning_rate': 0.004283355770338839, 'weight_decay': 0.01, 'warmup_steps': 17}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.2475,2.237669,0.1495,0.060745,0.147128,0.07863
2,2.2739,2.325901,0.1213,0.03751,0.122767,0.045172


[I 2025-04-01 00:54:05,650] Trial 146 pruned. 


Trial 147 with params: {'learning_rate': 6.20420623258273e-05, 'weight_decay': 0.001, 'warmup_steps': 17}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4527,0.139397,0.9529,0.953703,0.953081,0.953029
2,0.0979,0.120297,0.9629,0.963299,0.963093,0.962992
3,0.039,0.126411,0.9662,0.96643,0.966313,0.966281
4,0.0159,0.142843,0.9648,0.965269,0.964807,0.964957
5,0.0062,0.149181,0.9676,0.967944,0.967727,0.967781


[I 2025-04-01 01:05:02,677] Trial 147 pruned. 


Trial 148 with params: {'learning_rate': 0.00010566180374249967, 'weight_decay': 0.007, 'warmup_steps': 28}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4105,0.141938,0.9512,0.951749,0.951556,0.951254
2,0.0899,0.112334,0.9644,0.964637,0.964522,0.964544
3,0.0374,0.132665,0.9629,0.963554,0.962904,0.963161


[I 2025-04-01 01:11:37,935] Trial 148 pruned. 


Trial 149 with params: {'learning_rate': 0.0001397082047711854, 'weight_decay': 0.009000000000000001, 'warmup_steps': 18}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.371,0.153391,0.9482,0.949074,0.948247,0.94845
2,0.0935,0.132794,0.9566,0.956868,0.956851,0.95654


[I 2025-04-01 01:16:02,133] Trial 149 pruned. 


In [33]:
print(best_base_aug)

BestRun(run_id='71', objective=0.9750019868944989, hyperparameters={'learning_rate': 0.00013900432881088528, 'weight_decay': 0.007, 'warmup_steps': 13}, run_summary=None)


In [34]:
base.reset_seed()

## Hyperparameter search with distillation on the augmented dataset
Configuration of the individual training runs.

In [35]:
training_args = base.get_training_args(output_dir=f"~/results/{DATASET}/-aug-KD_hp-search", logging_dir=f"~/logs/{DATASET}/-aug-KD_hp-search", remove_unused_columns=False, epochs=num_epochs, batch_size=batch_size)

Definition of the searched hyperparameters and their ranges, extended with the distillation hyperparameters.

In [36]:
def hp_space(trial):
    params = {
        # Optimizer hyperparameters.
        "learning_rate": trial.suggest_float("learning_rate", 5e-5, 5e-3, log=True),
        "weight_decay": trial.suggest_float("weight_decay", 0, 1e-2, step=1e-3),
        "warmup_steps": trial.suggest_int("warmup_steps", 0, warm_up),
        # Distillation hyperparameters.
        "lambda_param": trial.suggest_float("lambda_param", 0, 1, step=0.1),
        "temperature": trial.suggest_float("temperature", 2, 7, step=0.5),
    }
    print(f"Trial {trial.number} with params: {params}")
    return params
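
To make the ranges concrete, the following is a minimal standard-library sketch of the two kinds of distributions used in `hp_space`: a log-uniform draw for `learning_rate` and an evenly stepped grid for the other float parameters. It is not Optuna's implementation (the TPE sampler chooses values adaptively rather than at random), and the helper names `sample_log_uniform` and `stepped_grid` are hypothetical.

```python
import math
import random


def sample_log_uniform(rng: random.Random, low: float, high: float) -> float:
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def stepped_grid(low: float, high: float, step: float) -> list[float]:
    """Return the candidates low, low + step, ..., up to high."""
    n_steps = round((high - low) / step)
    return [low + k * step for k in range(n_steps + 1)]


rng = random.Random(42)
learning_rates = [sample_log_uniform(rng, 5e-5, 5e-3) for _ in range(5)]
weight_decays = stepped_grid(0.0, 1e-2, 1e-3)  # 11 candidates
lambdas = stepped_grid(0.0, 1.0, 0.1)          # 11 candidates
temperatures = stepped_grid(2.0, 7.0, 0.5)     # 11 candidates

print(learning_rates)
print(lambdas)
```

Because grid values of this kind are computed in binary floating point, some of them print with a trailing digit; for example, `6 * 0.1` evaluates to `0.6000000000000001`. Values of the same shape appear in the trial logs and are harmless.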

Optuna configuration.

In [37]:
pruner = optuna.pruners.HyperbandPruner(min_resource=min_r, max_resource=max_r, reduction_factor=2, bootstrap_count=2)
sampler = optuna.samplers.TPESampler(seed=42, multivariate=True)
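
As a rough illustration of what the pruner does with `min_resource`, `max_resource` and `reduction_factor`, the sketch below lists the evaluation budgets (rungs) of each bracket in a textbook Hyperband schedule. The values of `min_r` and `max_r` are not visible in this section, so the numbers used here (1 and 8) are purely hypothetical, and `hyperband_brackets` is an illustrative helper, not an Optuna function.

```python
def hyperband_brackets(min_resource: int, max_resource: int, reduction_factor: int) -> list[list[int]]:
    """List the rung budgets of every bracket in a textbook Hyperband schedule.

    Bracket s starts pruning at min_resource * reduction_factor**s and
    multiplies the budget by reduction_factor at each further rung.
    """
    if min_resource < 1 or max_resource < min_resource or reduction_factor < 2:
        raise ValueError("expected 1 <= min_resource <= max_resource and reduction_factor >= 2")

    brackets = []
    first_rung = min_resource
    while first_rung <= max_resource:
        rungs = []
        budget = first_rung
        while budget <= max_resource:
            rungs.append(budget)
            budget *= reduction_factor
        brackets.append(rungs)
        first_rung *= reduction_factor
    return brackets


# Hypothetical values; the notebook's min_r and max_r are defined elsewhere.
schedule = hyperband_brackets(min_resource=1, max_resource=8, reduction_factor=2)
for index, rungs in enumerate(schedule):
    print(f"bracket {index}: rungs at {rungs}")
```

With these example values the schedule has four brackets, which matches the count floor(log_eta(max_resource / min_resource)) + 1 for eta = 2. Brackets that start late give slowly converging configurations more training before they can be pruned, at the cost of more compute per trial.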



Configuration of the distillation trainer for the individual training runs.
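
`base.DistilTrainer` is defined outside this notebook, so its loss is not shown here. A common formulation that uses exactly the two searched distillation hyperparameters is the knowledge-distillation loss of Hinton et al.: a weighted sum of the cross-entropy on the true label and the temperature-softened KL divergence between teacher and student. The sketch below assumes that formulation and assumes that `lambda_param` weights the distillation term; the actual trainer may combine the terms differently. It is written in plain Python for a single sample so that it runs without PyTorch.

```python
import math


def softmax(logits: list[float], temperature: float = 1.0) -> list[float]:
    """Numerically stable softmax of logits divided by the temperature."""
    scaled = [value / temperature for value in logits]
    peak = max(scaled)
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


def distillation_loss(student_logits, teacher_logits, label, lambda_param, temperature):
    """Assumed form: (1 - lambda) * CE(student, label) + lambda * T^2 * KL(teacher || student)."""
    # Hard-label term: ordinary cross-entropy at temperature 1.
    hard = -math.log(softmax(student_logits)[label])

    # Soft-label term: KL divergence between the softened distributions.
    teacher_soft = softmax(teacher_logits, temperature)
    student_soft = softmax(student_logits, temperature)
    soft = sum(t * math.log(t / s) for t, s in zip(teacher_soft, student_soft) if t > 0.0)

    # T^2 keeps the gradient scale of the soft term comparable across temperatures.
    return (1.0 - lambda_param) * hard + lambda_param * temperature**2 * soft


# Hypothetical logits for a 3-class toy problem.
loss = distillation_loss(
    student_logits=[2.0, 0.5, -1.0],
    teacher_logits=[3.0, 0.2, -2.0],
    label=0,
    lambda_param=0.9,
    temperature=3.0,
)
print(f"distillation loss: {loss:.4f}")
```

Under this assumed form, `lambda_param = 0` reduces the loss to plain cross-entropy and `lambda_param = 1` makes the student learn only from the teacher's logits, which is one reason to include both endpoints in the search range.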

In [38]:
trainer = base.DistilTrainer(
    args=training_args,
    train_dataset=train_combo,
    eval_dataset=eval,
    compute_metrics=base.compute_metrics,
    model_init=lambda: get_model()
)

Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Search setup.
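
The objective used by the search is the `eval_f1` entry produced by `base.compute_metrics`, whose implementation is not shown in this section. The logged precision, recall and F1 values differ slightly from accuracy, which would be consistent with macro averaging over the ten classes. The sketch below assumes macro averaging and shows how such a score is computed from predictions; `macro_f1` is a hypothetical helper, not part of `base`.

```python
def macro_f1(y_true: list[int], y_pred: list[int], num_classes: int) -> float:
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for cls in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        denominator = 2 * tp + fp + fn
        # A class with no true and no predicted samples contributes 0 here.
        scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(scores) / num_classes


# Toy labels for a 3-class problem (hypothetical data).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
print(round(macro_f1(y_true, y_pred, num_classes=3), 4))
```

Macro averaging gives every class the same weight regardless of how many samples it has, so a configuration cannot reach a high score by performing well only on the most frequent classes.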

In [39]:
best_distill_aug = trainer.hyperparameter_search(
    direction="maximize",
    backend="optuna",
    hp_space=hp_space,
    compute_objective=lambda metrics: metrics["eval_f1"],
    pruner=pruner,
    sampler=sampler,
    study_name="Distill",
    n_trials=150
)

[I 2025-04-01 01:16:02,812] A new study created in memory with name: Distill


Trial 0 with params: {'learning_rate': 0.0002805758207667253, 'weight_decay': 0.01, 'warmup_steps': 24, 'lambda_param': 0.6000000000000001, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3304,0.22072,0.93,0.93234,0.930202,0.929863
2,0.1931,0.202633,0.9386,0.940075,0.938903,0.938784


[I 2025-04-01 01:20:25,975] Trial 0 pruned. 


Trial 1 with params: {'learning_rate': 0.00010255552094216992, 'weight_decay': 0.0, 'warmup_steps': 28, 'lambda_param': 0.6000000000000001, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3421,0.172162,0.9551,0.955796,0.955348,0.955329
2,0.1707,0.158439,0.966,0.966276,0.966153,0.966099
3,0.1478,0.155724,0.9663,0.966394,0.966457,0.966376


[I 2025-04-01 01:27:00,612] Trial 1 pruned. 


Trial 2 with params: {'learning_rate': 5.497167787383099e-05, 'weight_decay': 0.01, 'warmup_steps': 27, 'lambda_param': 0.2, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3855,0.172489,0.9591,0.959443,0.959365,0.959181
2,0.1742,0.15696,0.9643,0.964419,0.964535,0.964383


[I 2025-04-01 01:31:24,599] Trial 2 pruned. 


Trial 3 with params: {'learning_rate': 0.00011635338541918901, 'weight_decay': 0.003, 'warmup_steps': 17, 'lambda_param': 0.4, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3217,0.174886,0.9569,0.957499,0.957209,0.95686
2,0.1703,0.16258,0.9632,0.963336,0.963436,0.963235
3,0.1491,0.154401,0.9674,0.967767,0.967454,0.96757


[I 2025-04-01 01:37:59,910] Trial 3 pruned. 


Trial 4 with params: {'learning_rate': 0.0008369042894376068, 'weight_decay': 0.001, 'warmup_steps': 9, 'lambda_param': 0.4, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4361,0.365217,0.8544,0.867651,0.854549,0.854211
2,0.2792,0.249172,0.915,0.917656,0.914894,0.915168


[I 2025-04-01 01:42:24,011] Trial 4 pruned. 


Trial 5 with params: {'learning_rate': 0.0018591820902866042, 'weight_decay': 0.002, 'warmup_steps': 16, 'lambda_param': 0.6000000000000001, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.7222,0.5734,0.7414,0.761584,0.741653,0.742
2,0.4813,0.409342,0.8321,0.845059,0.83199,0.834181
3,0.3619,0.3427,0.8641,0.874686,0.863955,0.865564


[I 2025-04-01 01:48:58,580] Trial 5 pruned. 


Trial 6 with params: {'learning_rate': 0.0008204643365323959, 'weight_decay': 0.001, 'warmup_steps': 2, 'lambda_param': 1.0, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4288,0.324954,0.8777,0.882897,0.877409,0.877816
2,0.2782,0.246801,0.916,0.91737,0.916095,0.916136


[I 2025-04-01 01:53:23,204] Trial 6 pruned. 


Trial 7 with params: {'learning_rate': 0.0020690200562805084, 'weight_decay': 0.003, 'warmup_steps': 3, 'lambda_param': 0.7000000000000001, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.8088,0.675814,0.6946,0.72432,0.693017,0.693185
2,0.5478,0.486319,0.7937,0.813249,0.793442,0.794391
3,0.4212,0.402456,0.8338,0.844942,0.833666,0.833144


[I 2025-04-01 01:59:59,238] Trial 7 pruned. 


Trial 8 with params: {'learning_rate': 8.770946743725407e-05, 'weight_decay': 0.005, 'warmup_steps': 1, 'lambda_param': 1.0, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3191,0.176946,0.9538,0.95448,0.95405,0.953896
2,0.1706,0.157824,0.9632,0.963625,0.963373,0.963355
3,0.1475,0.151727,0.9678,0.968174,0.967895,0.968005
4,0.1379,0.14858,0.9697,0.969785,0.969848,0.969788
5,0.1327,0.144876,0.9717,0.971862,0.971816,0.971825


[I 2025-04-01 02:11:01,837] Trial 8 pruned. 


Trial 9 with params: {'learning_rate': 0.0010568529720322872, 'weight_decay': 0.003, 'warmup_steps': 17, 'lambda_param': 0.6000000000000001, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.489,0.359962,0.8553,0.862868,0.855351,0.85465
2,0.3166,0.290084,0.8942,0.897164,0.894659,0.894092


[I 2025-04-01 02:15:25,337] Trial 9 pruned. 


Trial 10 with params: {'learning_rate': 5.622306732978549e-05, 'weight_decay': 0.004, 'warmup_steps': 6, 'lambda_param': 1.0, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3592,0.175722,0.9562,0.956551,0.956439,0.956286
2,0.1743,0.155539,0.9663,0.966457,0.966539,0.966419
3,0.1494,0.155678,0.9656,0.965682,0.965813,0.965718


[I 2025-04-01 02:22:01,361] Trial 10 pruned. 


Trial 11 with params: {'learning_rate': 0.00010214640646150033, 'weight_decay': 0.006, 'warmup_steps': 4, 'lambda_param': 0.4, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3156,0.172055,0.9557,0.95607,0.955923,0.955777
2,0.1686,0.158884,0.9657,0.966323,0.965901,0.965803
3,0.147,0.151116,0.9698,0.969817,0.969964,0.969847


[I 2025-04-01 02:28:36,754] Trial 11 pruned. 


Trial 12 with params: {'learning_rate': 0.00016480631021835324, 'weight_decay': 0.007, 'warmup_steps': 0, 'lambda_param': 0.5, 'temperature': 5.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.2977,0.18343,0.9494,0.949882,0.94971,0.949524
2,0.1778,0.16788,0.9578,0.958686,0.958082,0.957915
3,0.1532,0.155845,0.9667,0.967001,0.966844,0.966817
4,0.1412,0.155683,0.9664,0.966666,0.966551,0.966493
5,0.1344,0.148725,0.9706,0.970923,0.9707,0.970766


[I 2025-04-01 02:39:37,459] Trial 12 pruned. 


Trial 13 with params: {'learning_rate': 0.0002789558923318467, 'weight_decay': 0.001, 'warmup_steps': 1, 'lambda_param': 0.8, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3106,0.220361,0.9293,0.933118,0.929275,0.929978
2,0.1945,0.185312,0.95,0.950765,0.950303,0.949938
3,0.1635,0.172961,0.9569,0.957672,0.957018,0.957061


[I 2025-04-01 02:46:15,055] Trial 13 pruned. 


Trial 14 with params: {'learning_rate': 0.00014946504427538972, 'weight_decay': 0.01, 'warmup_steps': 3, 'lambda_param': 0.9, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.301,0.174032,0.9558,0.956697,0.955972,0.955903
2,0.1733,0.166635,0.9604,0.960325,0.960682,0.960352
3,0.1511,0.157201,0.9663,0.966666,0.966381,0.96648
4,0.1402,0.149618,0.97,0.970139,0.97016,0.970114
5,0.1334,0.142761,0.973,0.973249,0.973107,0.973166
6,0.1303,0.139762,0.9754,0.975582,0.975486,0.975527
7,0.1287,0.138868,0.9759,0.976092,0.975993,0.976027


[I 2025-04-01 03:01:39,063] Trial 14 finished with value: 0.9760268868449631 and parameters: {'learning_rate': 0.00014946504427538972, 'weight_decay': 0.01, 'warmup_steps': 3, 'lambda_param': 0.9, 'temperature': 3.0}. Best is trial 14 with value: 0.9760268868449631.


Trial 15 with params: {'learning_rate': 0.00023747437511073197, 'weight_decay': 0.008, 'warmup_steps': 9, 'lambda_param': 1.0, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3136,0.198792,0.941,0.941906,0.941574,0.940658
2,0.1858,0.174611,0.9544,0.955172,0.954642,0.954531
3,0.1603,0.160498,0.9637,0.96394,0.963843,0.963817


[I 2025-04-01 03:08:15,202] Trial 15 pruned. 


Trial 16 with params: {'learning_rate': 0.00026247635814697094, 'weight_decay': 0.009000000000000001, 'warmup_steps': 1, 'lambda_param': 0.5, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3061,0.201139,0.9379,0.938682,0.937931,0.937877
2,0.1896,0.187108,0.9471,0.948449,0.947423,0.947057
3,0.161,0.162222,0.9622,0.96227,0.96236,0.96228


[I 2025-04-01 03:14:49,838] Trial 16 pruned. 


Trial 17 with params: {'learning_rate': 0.0010845830712810907, 'weight_decay': 0.01, 'warmup_steps': 5, 'lambda_param': 0.7000000000000001, 'temperature': 5.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5007,0.358021,0.8557,0.860445,0.855873,0.855911
2,0.3241,0.324138,0.8728,0.879151,0.873832,0.872292


[I 2025-04-01 03:19:13,254] Trial 17 pruned. 


Trial 18 with params: {'learning_rate': 0.0001600430576664745, 'weight_decay': 0.009000000000000001, 'warmup_steps': 0, 'lambda_param': 0.9, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3035,0.191826,0.9437,0.945082,0.944016,0.943575
2,0.1776,0.170258,0.9603,0.960643,0.960558,0.960295
3,0.1539,0.157098,0.9653,0.965366,0.965505,0.965409
4,0.1409,0.149512,0.9699,0.970213,0.969945,0.970028
5,0.1337,0.145083,0.9726,0.972866,0.97271,0.972763


[I 2025-04-01 03:30:11,911] Trial 18 pruned. 


Trial 19 with params: {'learning_rate': 0.00013750796310717456, 'weight_decay': 0.008, 'warmup_steps': 3, 'lambda_param': 1.0, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3004,0.183768,0.9496,0.950191,0.949962,0.949586
2,0.1714,0.163142,0.9616,0.961865,0.961818,0.961644
3,0.15,0.156914,0.9661,0.966194,0.96626,0.966174


[I 2025-04-01 03:36:50,120] Trial 19 pruned. 


Trial 20 with params: {'learning_rate': 0.00014766637242423952, 'weight_decay': 0.008, 'warmup_steps': 23, 'lambda_param': 0.9, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3262,0.181583,0.9521,0.953158,0.952259,0.95226
2,0.1738,0.159399,0.9639,0.964338,0.964106,0.96399
3,0.1497,0.153938,0.9676,0.967993,0.967733,0.967807
4,0.1397,0.14835,0.9701,0.970343,0.970225,0.970227
5,0.1333,0.143832,0.9718,0.972182,0.971909,0.97197


[I 2025-04-01 03:47:58,095] Trial 20 pruned. 


Trial 21 with params: {'learning_rate': 8.025588615307563e-05, 'weight_decay': 0.008, 'warmup_steps': 28, 'lambda_param': 0.7000000000000001, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3577,0.174056,0.9556,0.955937,0.95585,0.955717
2,0.1712,0.161141,0.9639,0.964283,0.964152,0.96396
3,0.1469,0.156249,0.9668,0.967003,0.967008,0.966921
4,0.1377,0.148761,0.9705,0.970743,0.970619,0.970625
5,0.1325,0.145445,0.9724,0.972626,0.972511,0.972518
6,0.1302,0.143977,0.9722,0.972389,0.972318,0.972341
7,0.129,0.142931,0.9722,0.972426,0.972316,0.972338


[I 2025-04-01 04:03:20,765] Trial 21 finished with value: 0.9723384394570976 and parameters: {'learning_rate': 8.025588615307563e-05, 'weight_decay': 0.008, 'warmup_steps': 28, 'lambda_param': 0.7000000000000001, 'temperature': 5.5}. Best is trial 14 with value: 0.9760268868449631.


Trial 22 with params: {'learning_rate': 6.45219786851023e-05, 'weight_decay': 0.01, 'warmup_steps': 2, 'lambda_param': 1.0, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3471,0.171743,0.9562,0.956521,0.956395,0.956294
2,0.1722,0.158432,0.9652,0.965343,0.965356,0.965295
3,0.1471,0.150838,0.9694,0.969653,0.969494,0.969536
4,0.1373,0.149849,0.9691,0.969355,0.969198,0.969218
5,0.1328,0.145646,0.9721,0.972232,0.972249,0.972209


[I 2025-04-01 04:14:18,863] Trial 22 pruned. 


Trial 23 with params: {'learning_rate': 0.00011126749225155437, 'weight_decay': 0.006, 'warmup_steps': 32, 'lambda_param': 0.30000000000000004, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3401,0.174862,0.9535,0.954276,0.953673,0.953642
2,0.171,0.162946,0.9621,0.962338,0.962424,0.962085
3,0.1477,0.153637,0.9678,0.967915,0.967899,0.967878
4,0.1378,0.14688,0.9698,0.970028,0.969946,0.96996
5,0.1327,0.144345,0.9715,0.971735,0.971618,0.971633


[I 2025-04-01 04:25:19,226] Trial 23 pruned. 


Trial 24 with params: {'learning_rate': 7.730572019158064e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 20, 'lambda_param': 0.4, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3494,0.172255,0.9573,0.95805,0.957506,0.957559
2,0.1701,0.156308,0.967,0.967037,0.967174,0.967071
3,0.147,0.153103,0.9681,0.968407,0.968139,0.968248
4,0.1374,0.150621,0.9699,0.970044,0.970029,0.969985
5,0.1328,0.14488,0.9724,0.972468,0.972576,0.972498
6,0.1303,0.143593,0.9718,0.971996,0.97197,0.971959
7,0.129,0.142426,0.9723,0.972528,0.972445,0.972456


[I 2025-04-01 04:40:40,146] Trial 24 finished with value: 0.9724562731008737 and parameters: {'learning_rate': 7.730572019158064e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 20, 'lambda_param': 0.4, 'temperature': 7.0}. Best is trial 14 with value: 0.9760268868449631.


Trial 25 with params: {'learning_rate': 8.046448076276604e-05, 'weight_decay': 0.007, 'warmup_steps': 20, 'lambda_param': 0.5, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3478,0.167709,0.959,0.95907,0.959206,0.959044
2,0.1696,0.156727,0.9649,0.965253,0.9651,0.965024
3,0.1468,0.150226,0.9693,0.969513,0.969373,0.969424
4,0.1372,0.149328,0.9689,0.969452,0.968959,0.969048
5,0.1323,0.143617,0.9721,0.972198,0.972259,0.972199
6,0.1299,0.141681,0.973,0.973173,0.973123,0.973129
7,0.1288,0.141003,0.9728,0.972974,0.972903,0.972912


[I 2025-04-01 04:56:01,631] Trial 25 finished with value: 0.9729122059730481 and parameters: {'learning_rate': 8.046448076276604e-05, 'weight_decay': 0.007, 'warmup_steps': 20, 'lambda_param': 0.5, 'temperature': 7.0}. Best is trial 14 with value: 0.9760268868449631.


Trial 26 with params: {'learning_rate': 6.173349024029844e-05, 'weight_decay': 0.01, 'warmup_steps': 17, 'lambda_param': 0.2, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3584,0.172814,0.9571,0.957863,0.957278,0.957213
2,0.1718,0.157894,0.9636,0.963932,0.963796,0.963661
3,0.1477,0.153494,0.9674,0.967872,0.967429,0.967596


[I 2025-04-01 05:02:37,247] Trial 26 pruned. 


Trial 27 with params: {'learning_rate': 0.00023716965914421723, 'weight_decay': 0.006, 'warmup_steps': 18, 'lambda_param': 0.30000000000000004, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3216,0.2082,0.9374,0.939527,0.937459,0.937634
2,0.1862,0.172391,0.9555,0.956091,0.955627,0.955748
3,0.16,0.167672,0.9607,0.961795,0.960824,0.960927


[I 2025-04-01 05:09:11,990] Trial 27 pruned. 


Trial 28 with params: {'learning_rate': 0.0004402641242980844, 'weight_decay': 0.01, 'warmup_steps': 30, 'lambda_param': 0.4, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3632,0.243572,0.9183,0.923395,0.918258,0.919107
2,0.2182,0.206269,0.9381,0.940268,0.938373,0.93826
3,0.1817,0.188591,0.9489,0.948953,0.949342,0.948941


[I 2025-04-01 05:15:46,392] Trial 28 pruned. 


Trial 29 with params: {'learning_rate': 7.16182340018055e-05, 'weight_decay': 0.003, 'warmup_steps': 18, 'lambda_param': 0.8, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3528,0.177438,0.9538,0.954237,0.954109,0.953891
2,0.1719,0.155596,0.9657,0.966017,0.965871,0.965863
3,0.1471,0.156545,0.9649,0.965086,0.965027,0.965026


[I 2025-04-01 05:22:20,600] Trial 29 pruned. 


Trial 30 with params: {'learning_rate': 5.397327843889936e-05, 'weight_decay': 0.006, 'warmup_steps': 20, 'lambda_param': 0.6000000000000001, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3836,0.183296,0.9504,0.95168,0.950627,0.9506
2,0.177,0.158136,0.9671,0.967553,0.967293,0.96723
3,0.1491,0.153052,0.9674,0.96768,0.967503,0.967561


[I 2025-04-01 05:28:55,611] Trial 30 pruned. 


Trial 31 with params: {'learning_rate': 6.863679244317977e-05, 'weight_decay': 0.008, 'warmup_steps': 30, 'lambda_param': 0.4, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3722,0.171654,0.9571,0.957605,0.957301,0.957215
2,0.1708,0.157649,0.9655,0.965798,0.965758,0.965596
3,0.1469,0.151146,0.968,0.968264,0.968158,0.968143
4,0.1369,0.147406,0.9695,0.969742,0.969652,0.969655
5,0.1325,0.143918,0.9722,0.972399,0.972325,0.972339
6,0.1302,0.141986,0.9726,0.972814,0.972716,0.972749
7,0.129,0.141288,0.9716,0.971907,0.971731,0.971768


[I 2025-04-01 05:44:19,440] Trial 31 finished with value: 0.9717683192890331 and parameters: {'learning_rate': 6.863679244317977e-05, 'weight_decay': 0.008, 'warmup_steps': 30, 'lambda_param': 0.4, 'temperature': 6.0}. Best is trial 14 with value: 0.9760268868449631.


Trial 32 with params: {'learning_rate': 0.00012580001759204294, 'weight_decay': 0.007, 'warmup_steps': 30, 'lambda_param': 0.8, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3347,0.176862,0.9544,0.954917,0.954471,0.954554
2,0.1718,0.158732,0.965,0.965125,0.96524,0.964982
3,0.1494,0.151819,0.968,0.967978,0.968186,0.968036
4,0.1382,0.147705,0.9719,0.971973,0.971991,0.971967
5,0.133,0.143422,0.9728,0.97295,0.972928,0.972899
6,0.1302,0.139834,0.9739,0.97407,0.973998,0.974008
7,0.1288,0.139124,0.974,0.974265,0.97407,0.974131


[I 2025-04-01 05:59:40,784] Trial 32 finished with value: 0.9741306146134778 and parameters: {'learning_rate': 0.00012580001759204294, 'weight_decay': 0.007, 'warmup_steps': 30, 'lambda_param': 0.8, 'temperature': 4.5}. Best is trial 14 with value: 0.9760268868449631.


Trial 33 with params: {'learning_rate': 8.056476192812695e-05, 'weight_decay': 0.004, 'warmup_steps': 29, 'lambda_param': 0.8, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3518,0.172302,0.9573,0.957943,0.957488,0.957441
2,0.1698,0.158923,0.9644,0.96464,0.964646,0.964445
3,0.1474,0.151185,0.9692,0.969353,0.969334,0.969304
4,0.1373,0.148702,0.9703,0.97044,0.970425,0.970397
5,0.1326,0.143806,0.9719,0.972109,0.972034,0.972051


[I 2025-04-01 06:10:37,643] Trial 33 pruned. 


Trial 34 with params: {'learning_rate': 0.0004458668849587371, 'weight_decay': 0.008, 'warmup_steps': 27, 'lambda_param': 0.8, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3664,0.254831,0.91,0.913365,0.910187,0.910451
2,0.2189,0.207371,0.9363,0.937404,0.936827,0.936123


[I 2025-04-01 06:15:00,502] Trial 34 pruned. 


Trial 35 with params: {'learning_rate': 0.0004484974207832509, 'weight_decay': 0.01, 'warmup_steps': 3, 'lambda_param': 0.8, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3423,0.240137,0.9193,0.92318,0.91949,0.919521
2,0.2163,0.203659,0.9397,0.940461,0.939869,0.939496
3,0.1818,0.186739,0.9482,0.949688,0.948214,0.948565


[I 2025-04-01 06:21:35,171] Trial 35 pruned. 


Trial 36 with params: {'learning_rate': 0.00014656152995659458, 'weight_decay': 0.01, 'warmup_steps': 19, 'lambda_param': 0.5, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3206,0.183197,0.9487,0.949451,0.948908,0.948676
2,0.1729,0.166661,0.9592,0.95957,0.959396,0.959231


[I 2025-04-01 06:25:58,380] Trial 36 pruned. 


Trial 37 with params: {'learning_rate': 5.626231405953138e-05, 'weight_decay': 0.007, 'warmup_steps': 17, 'lambda_param': 0.6000000000000001, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3674,0.173417,0.9578,0.958395,0.958009,0.95793
2,0.1762,0.158661,0.9646,0.964795,0.964804,0.96469
3,0.1504,0.153092,0.9673,0.967411,0.967493,0.967425
4,0.1388,0.148987,0.9688,0.969032,0.968926,0.96894
5,0.1334,0.147037,0.9702,0.970375,0.970351,0.970343


[I 2025-04-01 06:36:54,735] Trial 37 pruned. 


Trial 38 with params: {'learning_rate': 5.372440025280784e-05, 'weight_decay': 0.006, 'warmup_steps': 17, 'lambda_param': 0.2, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3776,0.176025,0.9555,0.956127,0.955738,0.955628
2,0.1724,0.154762,0.9658,0.966003,0.965939,0.965937
3,0.1484,0.152793,0.9686,0.968773,0.968689,0.968712
4,0.1377,0.149677,0.9686,0.968807,0.968748,0.9687
5,0.133,0.147253,0.9698,0.969953,0.969948,0.969912


[I 2025-04-01 06:47:52,748] Trial 38 pruned. 


Trial 39 with params: {'learning_rate': 0.00020817094511356652, 'weight_decay': 0.006, 'warmup_steps': 31, 'lambda_param': 0.8, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3279,0.18931,0.9471,0.948055,0.947339,0.947103
2,0.1816,0.172827,0.9562,0.956268,0.956527,0.956172


[I 2025-04-01 06:52:16,094] Trial 39 pruned. 


Trial 40 with params: {'learning_rate': 6.481736891825345e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 19, 'lambda_param': 0.5, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.368,0.171485,0.9563,0.956662,0.956472,0.956388
2,0.1705,0.157835,0.9647,0.964928,0.964903,0.964796
3,0.1472,0.153979,0.9678,0.967985,0.967929,0.9679
4,0.1377,0.149335,0.9694,0.969716,0.969476,0.969556
5,0.1329,0.145804,0.9694,0.969607,0.96951,0.969538


[I 2025-04-01 07:03:14,423] Trial 40 pruned. 


Trial 41 with params: {'learning_rate': 6.512621333191503e-05, 'weight_decay': 0.008, 'warmup_steps': 30, 'lambda_param': 0.9, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3733,0.174267,0.9553,0.956728,0.955593,0.955525
2,0.1723,0.155914,0.9653,0.965476,0.965492,0.965381
3,0.1478,0.153452,0.9679,0.96853,0.968033,0.96815
4,0.1373,0.148566,0.9714,0.971543,0.971542,0.971501
5,0.1328,0.146134,0.9711,0.971351,0.971223,0.97125


[I 2025-04-01 07:14:09,842] Trial 41 pruned. 


Trial 42 with params: {'learning_rate': 0.002298170148918028, 'weight_decay': 0.006, 'warmup_steps': 10, 'lambda_param': 0.1, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.1852,1.303163,0.2846,0.33946,0.284846,0.239429
2,1.1386,1.275572,0.3505,0.451159,0.352165,0.320308
3,1.0275,1.131607,0.4437,0.46905,0.443517,0.414346


[I 2025-04-01 07:20:43,183] Trial 42 pruned. 


Trial 43 with params: {'learning_rate': 0.00015205959741666948, 'weight_decay': 0.008, 'warmup_steps': 26, 'lambda_param': 0.6000000000000001, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3335,0.173897,0.9559,0.956336,0.956015,0.956016
2,0.1751,0.164121,0.9601,0.960174,0.960393,0.960055


[I 2025-04-01 07:25:05,876] Trial 43 pruned. 


Trial 44 with params: {'learning_rate': 0.00011008601609400506, 'weight_decay': 0.008, 'warmup_steps': 31, 'lambda_param': 1.0, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3393,0.17482,0.9557,0.956362,0.955833,0.955899
2,0.1709,0.162341,0.9615,0.961575,0.961762,0.961532
3,0.1484,0.152621,0.9689,0.969267,0.969018,0.969094
4,0.138,0.150398,0.9691,0.969402,0.969275,0.969268
5,0.1323,0.143635,0.9718,0.971979,0.971929,0.971924
6,0.13,0.140669,0.9731,0.973344,0.973229,0.973254
7,0.1286,0.139894,0.9739,0.974196,0.974033,0.974064


[I 2025-04-01 07:40:26,370] Trial 44 finished with value: 0.9740636462843719 and parameters: {'learning_rate': 0.00011008601609400506, 'weight_decay': 0.008, 'warmup_steps': 31, 'lambda_param': 1.0, 'temperature': 6.5}. Best is trial 14 with value: 0.9760268868449631.


Trial 45 with params: {'learning_rate': 0.00014552426362589955, 'weight_decay': 0.008, 'warmup_steps': 29, 'lambda_param': 1.0, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3318,0.17715,0.9539,0.955061,0.954012,0.954222
2,0.1738,0.162501,0.9615,0.961739,0.961668,0.961545
3,0.1507,0.157806,0.9643,0.964706,0.964506,0.964475


[I 2025-04-01 07:46:59,855] Trial 45 pruned. 


Trial 46 with params: {'learning_rate': 0.00027628084200139716, 'weight_decay': 0.005, 'warmup_steps': 30, 'lambda_param': 1.0, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.335,0.203695,0.9401,0.941773,0.940399,0.94012
2,0.1941,0.183493,0.9481,0.948712,0.948364,0.948211
3,0.1646,0.165102,0.9609,0.961067,0.961106,0.961035


[I 2025-04-01 07:53:35,452] Trial 46 pruned. 


Trial 47 with params: {'learning_rate': 0.0025789104733638904, 'weight_decay': 0.002, 'warmup_steps': 27, 'lambda_param': 1.0, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.0445,1.43857,0.3282,0.422042,0.326817,0.28112
2,0.88,0.81711,0.5908,0.608719,0.589763,0.585143


[I 2025-04-01 07:57:57,796] Trial 47 pruned. 


Trial 48 with params: {'learning_rate': 0.0027511979602444763, 'weight_decay': 0.005, 'warmup_steps': 1, 'lambda_param': 0.7000000000000001, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5049,1.544276,0.1172,0.035162,0.115428,0.045618
2,1.4769,1.634135,0.1258,0.068377,0.124703,0.055477


[I 2025-04-01 08:02:20,621] Trial 48 pruned. 


Trial 49 with params: {'learning_rate': 0.0015898708923464957, 'weight_decay': 0.004, 'warmup_steps': 17, 'lambda_param': 0.1, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.6421,0.537734,0.7627,0.790579,0.762025,0.764287
2,0.4227,0.365432,0.8553,0.86221,0.855163,0.854396
3,0.3222,0.303282,0.883,0.890436,0.88285,0.884442


[I 2025-04-01 08:08:55,169] Trial 49 pruned. 


Trial 50 with params: {'learning_rate': 0.00011047169753416782, 'weight_decay': 0.007, 'warmup_steps': 22, 'lambda_param': 0.30000000000000004, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3307,0.17364,0.9575,0.958088,0.957649,0.957602
2,0.1701,0.159403,0.9622,0.962815,0.962378,0.962364
3,0.1484,0.155881,0.9663,0.966251,0.96645,0.966308


[I 2025-04-01 08:15:34,179] Trial 50 pruned. 


Trial 51 with params: {'learning_rate': 8.990889568762435e-05, 'weight_decay': 0.006, 'warmup_steps': 31, 'lambda_param': 0.8, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3493,0.177116,0.9552,0.955774,0.955534,0.955254
2,0.1693,0.159368,0.9639,0.964288,0.964142,0.963965
3,0.147,0.152261,0.9679,0.968301,0.968012,0.968119
4,0.1375,0.147812,0.9708,0.970963,0.970975,0.970916
5,0.1325,0.142914,0.9736,0.973736,0.97373,0.973715
6,0.13,0.140821,0.9733,0.973484,0.973448,0.973446
7,0.1287,0.139892,0.973,0.973283,0.973153,0.97317


[I 2025-04-01 08:30:56,486] Trial 51 finished with value: 0.9731703621331815 and parameters: {'learning_rate': 8.990889568762435e-05, 'weight_decay': 0.006, 'warmup_steps': 31, 'lambda_param': 0.8, 'temperature': 6.0}. Best is trial 14 with value: 0.9760268868449631.


Trial 52 with params: {'learning_rate': 0.00010890809542845314, 'weight_decay': 0.004, 'warmup_steps': 29, 'lambda_param': 0.7000000000000001, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.34,0.172315,0.9571,0.95756,0.957372,0.957208
2,0.1702,0.157302,0.9652,0.965388,0.965384,0.965261
3,0.1484,0.153011,0.9673,0.967594,0.967453,0.96748
4,0.1379,0.146206,0.9721,0.972241,0.972193,0.972163
5,0.1326,0.143366,0.9725,0.972825,0.972594,0.972652
6,0.1302,0.141552,0.9737,0.973914,0.973814,0.973849
7,0.1288,0.139416,0.9754,0.975628,0.97551,0.975549


[I 2025-04-01 08:46:20,016] Trial 52 finished with value: 0.9755491859490355 and parameters: {'learning_rate': 0.00010890809542845314, 'weight_decay': 0.004, 'warmup_steps': 29, 'lambda_param': 0.7000000000000001, 'temperature': 5.5}. Best is trial 14 with value: 0.9760268868449631.


Trial 53 with params: {'learning_rate': 0.0003353858567977691, 'weight_decay': 0.005, 'warmup_steps': 32, 'lambda_param': 0.5, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3409,0.226621,0.9261,0.930143,0.926007,0.926917
2,0.1994,0.18731,0.9505,0.950962,0.950609,0.950502


[I 2025-04-01 08:50:44,188] Trial 53 pruned. 


Trial 54 with params: {'learning_rate': 0.00010938372878601712, 'weight_decay': 0.003, 'warmup_steps': 28, 'lambda_param': 0.5, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.341,0.171446,0.9567,0.956986,0.956886,0.956821
2,0.1706,0.159202,0.963,0.963282,0.963209,0.963022
3,0.1481,0.152086,0.9689,0.969023,0.969061,0.969034
4,0.1378,0.149162,0.9698,0.970158,0.969866,0.969955
5,0.1322,0.143531,0.9726,0.972862,0.972695,0.972751
6,0.1301,0.141483,0.9721,0.972297,0.972218,0.97224
7,0.1287,0.140461,0.9728,0.97306,0.972931,0.972959


[I 2025-04-01 09:06:05,421] Trial 54 finished with value: 0.9729594416185702 and parameters: {'learning_rate': 0.00010938372878601712, 'weight_decay': 0.003, 'warmup_steps': 28, 'lambda_param': 0.5, 'temperature': 5.5}. Best is trial 14 with value: 0.9760268868449631.


Trial 55 with params: {'learning_rate': 0.0002941526437372032, 'weight_decay': 0.002, 'warmup_steps': 26, 'lambda_param': 0.7000000000000001, 'temperature': 5.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3318,0.201582,0.9416,0.944456,0.941536,0.942212
2,0.1954,0.186419,0.9464,0.947063,0.946796,0.946421


[I 2025-04-01 09:10:28,386] Trial 55 pruned. 


Trial 56 with params: {'learning_rate': 8.333105058396458e-05, 'weight_decay': 0.003, 'warmup_steps': 30, 'lambda_param': 0.6000000000000001, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3532,0.172577,0.9567,0.957381,0.956812,0.956907
2,0.17,0.156805,0.9658,0.965873,0.965997,0.9659
3,0.1472,0.153996,0.9665,0.966811,0.966623,0.966684


[I 2025-04-01 09:17:02,716] Trial 56 pruned. 


Trial 57 with params: {'learning_rate': 0.00012855439879510747, 'weight_decay': 0.002, 'warmup_steps': 30, 'lambda_param': 0.4, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3338,0.170421,0.9578,0.957896,0.958039,0.957843
2,0.1718,0.162004,0.9633,0.963385,0.963508,0.963328
3,0.1496,0.153068,0.9682,0.96834,0.968332,0.968317
4,0.1386,0.150021,0.9708,0.970954,0.970903,0.97086
5,0.1329,0.143219,0.9743,0.974514,0.974398,0.974435
6,0.1301,0.141278,0.9743,0.974451,0.97439,0.974403
7,0.1287,0.140406,0.9747,0.974949,0.974807,0.974829


[I 2025-04-01 09:33:15,474] Trial 57 finished with value: 0.9748288267069476 and parameters: {'learning_rate': 0.00012855439879510747, 'weight_decay': 0.002, 'warmup_steps': 30, 'lambda_param': 0.4, 'temperature': 6.0}. Best is trial 14 with value: 0.9760268868449631.


Trial 58 with params: {'learning_rate': 5.8118083037899305e-05, 'weight_decay': 0.007, 'warmup_steps': 28, 'lambda_param': 0.9, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3776,0.170258,0.9577,0.957932,0.957941,0.957765
2,0.171,0.158216,0.9652,0.965426,0.9654,0.965351
3,0.1476,0.153848,0.9676,0.96773,0.967749,0.967729


[I 2025-04-01 09:39:48,665] Trial 58 pruned. 


Trial 59 with params: {'learning_rate': 0.00018678513071366875, 'weight_decay': 0.005, 'warmup_steps': 25, 'lambda_param': 0.8, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3241,0.204424,0.9401,0.941668,0.940193,0.940342
2,0.1794,0.167201,0.9598,0.959929,0.959931,0.9598
3,0.1546,0.155323,0.967,0.967089,0.967153,0.967107
4,0.1416,0.152164,0.9685,0.96876,0.968605,0.968625
5,0.1345,0.145124,0.9702,0.970398,0.970309,0.970343


[I 2025-04-01 09:50:47,474] Trial 59 pruned. 


Trial 60 with params: {'learning_rate': 0.00013822919527861567, 'weight_decay': 0.009000000000000001, 'warmup_steps': 32, 'lambda_param': 0.9, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.333,0.180386,0.9527,0.952809,0.95304,0.952625
2,0.1723,0.165733,0.9606,0.960854,0.960885,0.960594
3,0.1501,0.155111,0.967,0.967279,0.967148,0.967128


[I 2025-04-01 09:57:25,458] Trial 60 pruned. 


Trial 61 with params: {'learning_rate': 0.00012044085386519688, 'weight_decay': 0.002, 'warmup_steps': 27, 'lambda_param': 0.2, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3341,0.179865,0.9511,0.951676,0.95136,0.951092
2,0.1707,0.164269,0.9608,0.961127,0.961095,0.960852
3,0.1487,0.153201,0.9686,0.969096,0.968622,0.968813
4,0.1384,0.150282,0.9688,0.968774,0.968965,0.968796
5,0.1325,0.142486,0.9755,0.97562,0.975558,0.975571
6,0.1302,0.138994,0.9754,0.975661,0.975517,0.975563
7,0.1287,0.138373,0.9752,0.97541,0.975303,0.975334


[I 2025-04-01 10:12:50,223] Trial 61 finished with value: 0.975333613683923 and parameters: {'learning_rate': 0.00012044085386519688, 'weight_decay': 0.002, 'warmup_steps': 27, 'lambda_param': 0.2, 'temperature': 5.5}. Best is trial 14 with value: 0.9760268868449631.


Trial 62 with params: {'learning_rate': 0.00010518522420507488, 'weight_decay': 0.001, 'warmup_steps': 28, 'lambda_param': 0.0, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.343,0.179367,0.952,0.952872,0.952206,0.952123
2,0.1721,0.160213,0.9648,0.964818,0.964982,0.964788
3,0.1486,0.149699,0.9703,0.970472,0.970367,0.970398
4,0.1383,0.147644,0.9713,0.971581,0.971356,0.971441
5,0.1327,0.142128,0.9731,0.973254,0.973191,0.973212
6,0.1299,0.140012,0.9743,0.974497,0.974418,0.974448
7,0.1286,0.139186,0.9744,0.974637,0.974481,0.97454


[I 2025-04-01 10:28:12,895] Trial 62 finished with value: 0.9745397163397967 and parameters: {'learning_rate': 0.00010518522420507488, 'weight_decay': 0.001, 'warmup_steps': 28, 'lambda_param': 0.0, 'temperature': 6.0}. Best is trial 14 with value: 0.9760268868449631.


Trial 63 with params: {'learning_rate': 9.712390035183193e-05, 'weight_decay': 0.0, 'warmup_steps': 29, 'lambda_param': 0.0, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.346,0.176072,0.9557,0.95629,0.955951,0.955826
2,0.17,0.15744,0.9647,0.965353,0.964884,0.964833
3,0.1479,0.151844,0.9684,0.96884,0.968578,0.968548
4,0.1379,0.147885,0.9698,0.970145,0.969903,0.96995
5,0.1325,0.143071,0.9732,0.973475,0.973287,0.973331
6,0.1301,0.140895,0.9738,0.974013,0.973936,0.973954
7,0.1288,0.14004,0.9744,0.974689,0.974534,0.974573


[I 2025-04-01 10:43:37,693] Trial 63 finished with value: 0.9745731170887337 and parameters: {'learning_rate': 9.712390035183193e-05, 'weight_decay': 0.0, 'warmup_steps': 29, 'lambda_param': 0.0, 'temperature': 6.0}. Best is trial 14 with value: 0.9760268868449631.


Trial 64 with params: {'learning_rate': 0.0002351244550083826, 'weight_decay': 0.001, 'warmup_steps': 31, 'lambda_param': 0.0, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.334,0.199864,0.9424,0.943802,0.942557,0.942585
2,0.1863,0.176153,0.9547,0.95494,0.954947,0.954728


[I 2025-04-01 10:48:01,253] Trial 64 pruned. 


Trial 65 with params: {'learning_rate': 0.00016525141364738446, 'weight_decay': 0.0, 'warmup_steps': 24, 'lambda_param': 0.1, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3243,0.190675,0.9461,0.948209,0.946288,0.946135
2,0.1752,0.171624,0.958,0.958257,0.958329,0.957854
3,0.1527,0.155604,0.9667,0.966854,0.966847,0.966806


[I 2025-04-01 10:54:41,250] Trial 65 pruned. 


Trial 66 with params: {'learning_rate': 6.323351059973518e-05, 'weight_decay': 0.001, 'warmup_steps': 30, 'lambda_param': 0.1, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3755,0.172392,0.9548,0.95577,0.955071,0.954941
2,0.1712,0.155482,0.9651,0.965286,0.965271,0.965178
3,0.1478,0.151177,0.969,0.969097,0.969111,0.969072
4,0.1377,0.147927,0.9709,0.971056,0.971014,0.971015
5,0.1329,0.145317,0.9713,0.971431,0.971449,0.971405


[I 2025-04-01 11:05:41,710] Trial 66 pruned. 


Trial 67 with params: {'learning_rate': 0.00019820228176629646, 'weight_decay': 0.001, 'warmup_steps': 25, 'lambda_param': 0.1, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3259,0.182158,0.9504,0.950922,0.950546,0.950582
2,0.1809,0.165581,0.9572,0.957829,0.957433,0.957318
3,0.1552,0.160948,0.9635,0.963745,0.963714,0.963591


[I 2025-04-01 11:12:17,200] Trial 67 pruned. 


Trial 68 with params: {'learning_rate': 7.276193589192486e-05, 'weight_decay': 0.0, 'warmup_steps': 25, 'lambda_param': 0.4, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3611,0.175514,0.9562,0.956546,0.956445,0.956259
2,0.1708,0.157619,0.9638,0.96415,0.963982,0.963984
3,0.1474,0.156293,0.9663,0.966472,0.966446,0.966421


[I 2025-04-01 11:18:50,981] Trial 68 pruned. 


Trial 69 with params: {'learning_rate': 0.0002892714342204719, 'weight_decay': 0.001, 'warmup_steps': 31, 'lambda_param': 0.30000000000000004, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.333,0.214453,0.9305,0.933706,0.93063,0.930825
2,0.1931,0.181246,0.9524,0.952961,0.952527,0.952449


[I 2025-04-01 11:23:14,454] Trial 69 pruned. 


Trial 70 with params: {'learning_rate': 9.88558411540213e-05, 'weight_decay': 0.003, 'warmup_steps': 28, 'lambda_param': 0.2, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3445,0.17851,0.9537,0.954529,0.953947,0.953815
2,0.171,0.164783,0.961,0.961063,0.961115,0.960981
3,0.1475,0.152218,0.9682,0.968543,0.968306,0.968398
4,0.1381,0.150135,0.9697,0.969749,0.969848,0.969752
5,0.1326,0.144717,0.973,0.973226,0.973059,0.973101
6,0.1301,0.142502,0.9726,0.97294,0.972707,0.972775
7,0.1288,0.14177,0.9734,0.973627,0.973502,0.973538


[I 2025-04-01 11:38:32,638] Trial 70 finished with value: 0.9735378726273207 and parameters: {'learning_rate': 9.88558411540213e-05, 'weight_decay': 0.003, 'warmup_steps': 28, 'lambda_param': 0.2, 'temperature': 7.0}. Best is trial 14 with value: 0.9760268868449631.


Trial 71 with params: {'learning_rate': 6.166645001939548e-05, 'weight_decay': 0.01, 'warmup_steps': 7, 'lambda_param': 0.9, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3468,0.171596,0.9574,0.957766,0.957555,0.957537
2,0.172,0.162065,0.9614,0.962081,0.961666,0.961543
3,0.1489,0.154785,0.9683,0.968405,0.968479,0.968392
4,0.1385,0.150608,0.9694,0.969584,0.969564,0.969524
5,0.1333,0.147702,0.9704,0.970597,0.970543,0.970542


[I 2025-04-01 11:49:27,955] Trial 71 pruned. 


Trial 72 with params: {'learning_rate': 0.0001108531036505466, 'weight_decay': 0.001, 'warmup_steps': 25, 'lambda_param': 0.0, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3362,0.176187,0.9533,0.95364,0.953475,0.95336
2,0.1711,0.160594,0.9637,0.964019,0.963767,0.963732
3,0.1482,0.149914,0.9697,0.969852,0.969767,0.969779
4,0.1379,0.146548,0.9711,0.971311,0.971204,0.971195
5,0.1326,0.142965,0.973,0.973337,0.973123,0.973116
6,0.13,0.140462,0.9752,0.975327,0.975311,0.975303
7,0.1288,0.139512,0.9748,0.975037,0.974905,0.974932


[I 2025-04-01 12:04:48,419] Trial 72 finished with value: 0.9749324670580585 and parameters: {'learning_rate': 0.0001108531036505466, 'weight_decay': 0.001, 'warmup_steps': 25, 'lambda_param': 0.0, 'temperature': 6.5}. Best is trial 14 with value: 0.9760268868449631.


Trial 73 with params: {'learning_rate': 0.00010622749796426139, 'weight_decay': 0.002, 'warmup_steps': 15, 'lambda_param': 0.1, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3269,0.178575,0.9529,0.953863,0.953222,0.952955
2,0.1691,0.159927,0.9635,0.963594,0.963763,0.963474
3,0.1477,0.154043,0.9678,0.967984,0.967953,0.967914
4,0.1373,0.14704,0.9718,0.971938,0.9719,0.971901
5,0.1326,0.144904,0.9738,0.973868,0.973951,0.973889
6,0.13,0.142058,0.9718,0.972017,0.971948,0.971953
7,0.1288,0.140615,0.9745,0.974685,0.974629,0.974625


[I 2025-04-01 12:21:06,723] Trial 73 finished with value: 0.9746253904150363 and parameters: {'learning_rate': 0.00010622749796426139, 'weight_decay': 0.002, 'warmup_steps': 15, 'lambda_param': 0.1, 'temperature': 6.0}. Best is trial 14 with value: 0.9760268868449631.


Trial 74 with params: {'learning_rate': 8.875293218598164e-05, 'weight_decay': 0.001, 'warmup_steps': 16, 'lambda_param': 0.2, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3442,0.173469,0.9557,0.956155,0.955929,0.955845
2,0.1707,0.155648,0.9653,0.965403,0.965502,0.965349
3,0.1466,0.150896,0.9687,0.968834,0.968875,0.968799
4,0.1373,0.149155,0.9698,0.970205,0.969913,0.96996
5,0.1321,0.144812,0.9718,0.972021,0.971856,0.971912


[I 2025-04-01 12:32:05,748] Trial 74 pruned. 


Trial 75 with params: {'learning_rate': 0.00013212411314174807, 'weight_decay': 0.001, 'warmup_steps': 13, 'lambda_param': 0.1, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3153,0.176868,0.953,0.953777,0.953189,0.953197
2,0.173,0.164582,0.9592,0.959448,0.959442,0.959256
3,0.1502,0.160054,0.9637,0.963744,0.963943,0.96378


[I 2025-04-01 12:38:42,066] Trial 75 pruned. 


Trial 76 with params: {'learning_rate': 0.0001478830912282871, 'weight_decay': 0.001, 'warmup_steps': 32, 'lambda_param': 0.0, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3376,0.174691,0.9559,0.956128,0.956147,0.955964
2,0.1747,0.16119,0.9632,0.963319,0.96343,0.963228
3,0.1516,0.162542,0.9626,0.962931,0.96274,0.962663


[I 2025-04-01 12:45:19,390] Trial 76 pruned. 


Trial 77 with params: {'learning_rate': 5.029781934024544e-05, 'weight_decay': 0.001, 'warmup_steps': 24, 'lambda_param': 0.0, 'temperature': 5.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3914,0.17522,0.9553,0.956064,0.955438,0.955476
2,0.1742,0.158792,0.9641,0.964349,0.964257,0.964231
3,0.149,0.153669,0.967,0.967347,0.967111,0.967198
4,0.1383,0.150114,0.9694,0.969611,0.969535,0.96954
5,0.1333,0.148406,0.9687,0.968902,0.968848,0.968847


[I 2025-04-01 12:56:19,305] Trial 77 pruned. 


Trial 78 with params: {'learning_rate': 6.979731308215335e-05, 'weight_decay': 0.0, 'warmup_steps': 0, 'lambda_param': 0.0, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3295,0.171568,0.9574,0.957689,0.957617,0.957544
2,0.1708,0.158637,0.9637,0.964174,0.963902,0.963906
3,0.1471,0.153879,0.966,0.966408,0.966155,0.966199


[I 2025-04-01 13:02:56,924] Trial 78 pruned. 


Trial 79 with params: {'learning_rate': 9.285093602836657e-05, 'weight_decay': 0.002, 'warmup_steps': 17, 'lambda_param': 0.1, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3378,0.174601,0.9549,0.955723,0.955148,0.955048
2,0.1685,0.159778,0.9633,0.96381,0.963502,0.963423
3,0.1472,0.155737,0.9662,0.966172,0.966377,0.96621


[I 2025-04-01 13:09:32,948] Trial 79 pruned. 


Trial 80 with params: {'learning_rate': 0.0003232397187280173, 'weight_decay': 0.004, 'warmup_steps': 17, 'lambda_param': 0.0, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3308,0.219156,0.9295,0.932241,0.929707,0.929816
2,0.1984,0.181443,0.9519,0.95338,0.952093,0.952223
3,0.1698,0.173137,0.9583,0.958913,0.958444,0.958542


[I 2025-04-01 13:16:07,874] Trial 80 pruned. 


Trial 81 with params: {'learning_rate': 0.00012504568233531515, 'weight_decay': 0.003, 'warmup_steps': 25, 'lambda_param': 0.0, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3318,0.174197,0.9546,0.954989,0.954832,0.954699
2,0.1722,0.164681,0.9617,0.962134,0.961931,0.961716
3,0.1494,0.153112,0.9678,0.967936,0.967994,0.967924


[I 2025-04-01 13:22:42,654] Trial 81 pruned. 


Trial 82 with params: {'learning_rate': 5.919333183182066e-05, 'weight_decay': 0.0, 'warmup_steps': 26, 'lambda_param': 0.1, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3807,0.178634,0.954,0.954457,0.954272,0.95405
2,0.175,0.157507,0.9644,0.964578,0.964557,0.96451
3,0.1489,0.153616,0.9672,0.967524,0.967281,0.967331
4,0.1378,0.146951,0.9723,0.97264,0.972437,0.972472
5,0.1329,0.147055,0.9721,0.972208,0.972237,0.972202
6,0.1307,0.144423,0.9719,0.972158,0.972039,0.97208
7,0.1295,0.142911,0.9732,0.97345,0.973299,0.973351


[I 2025-04-01 13:38:03,184] Trial 82 finished with value: 0.9733507040678624 and parameters: {'learning_rate': 5.919333183182066e-05, 'weight_decay': 0.0, 'warmup_steps': 26, 'lambda_param': 0.1, 'temperature': 6.5}. Best is trial 14 with value: 0.9760268868449631.


Trial 83 with params: {'learning_rate': 0.0003354827807830549, 'weight_decay': 0.009000000000000001, 'warmup_steps': 7, 'lambda_param': 0.1, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3243,0.229144,0.9238,0.92718,0.923596,0.924007
2,0.1995,0.193766,0.9466,0.947137,0.947006,0.946615


[I 2025-04-01 13:42:26,497] Trial 83 pruned. 


Trial 84 with params: {'learning_rate': 0.00016865082573354006, 'weight_decay': 0.002, 'warmup_steps': 14, 'lambda_param': 0.0, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3127,0.181851,0.9504,0.951788,0.950731,0.95054
2,0.1751,0.163008,0.9607,0.961097,0.960945,0.960852
3,0.152,0.161209,0.964,0.964249,0.964139,0.964116


[I 2025-04-01 13:49:01,012] Trial 84 pruned. 


Trial 85 with params: {'learning_rate': 0.00017953794968901505, 'weight_decay': 0.002, 'warmup_steps': 26, 'lambda_param': 0.0, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3266,0.186198,0.9488,0.949496,0.948912,0.9488
2,0.1767,0.165079,0.9614,0.961793,0.961722,0.961487


[I 2025-04-01 13:53:25,087] Trial 85 pruned. 


Trial 86 with params: {'learning_rate': 0.00011094653078201996, 'weight_decay': 0.0, 'warmup_steps': 29, 'lambda_param': 0.0, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3354,0.173084,0.956,0.956561,0.956146,0.956118
2,0.1696,0.161507,0.962,0.96336,0.962265,0.962128
3,0.1472,0.15554,0.9662,0.96626,0.966378,0.966243


[I 2025-04-01 13:59:59,500] Trial 86 pruned. 


Trial 87 with params: {'learning_rate': 0.00026223121889014676, 'weight_decay': 0.01, 'warmup_steps': 3, 'lambda_param': 1.0, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3067,0.196881,0.9418,0.943261,0.942108,0.941885
2,0.1881,0.182445,0.9506,0.952846,0.951098,0.950555


[I 2025-04-01 14:04:21,647] Trial 87 pruned. 


Trial 88 with params: {'learning_rate': 0.00040369793783588123, 'weight_decay': 0.003, 'warmup_steps': 21, 'lambda_param': 0.30000000000000004, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3499,0.229702,0.921,0.924917,0.921297,0.921587
2,0.2103,0.203553,0.9419,0.943823,0.94194,0.942112


[I 2025-04-01 14:08:46,575] Trial 88 pruned. 


Trial 89 with params: {'learning_rate': 0.00012581818819560388, 'weight_decay': 0.004, 'warmup_steps': 25, 'lambda_param': 0.2, 'temperature': 5.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3309,0.1814,0.9522,0.953251,0.952503,0.952273
2,0.1708,0.159254,0.9633,0.963446,0.963566,0.963285
3,0.1488,0.157153,0.9649,0.965148,0.965043,0.965041


[I 2025-04-01 14:15:20,845] Trial 89 pruned. 


Trial 90 with params: {'learning_rate': 0.000928277511187833, 'weight_decay': 0.01, 'warmup_steps': 23, 'lambda_param': 0.4, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4747,0.374383,0.8493,0.864394,0.848812,0.849893
2,0.301,0.266165,0.908,0.910177,0.907736,0.907715
3,0.2383,0.241525,0.9213,0.921271,0.92177,0.920553


[I 2025-04-01 14:21:54,452] Trial 90 pruned. 


Trial 91 with params: {'learning_rate': 7.121734963273843e-05, 'weight_decay': 0.005, 'warmup_steps': 23, 'lambda_param': 0.6000000000000001, 'temperature': 5.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3696,0.173671,0.956,0.956097,0.956236,0.956068
2,0.1741,0.157246,0.9655,0.96581,0.965721,0.96565
3,0.1486,0.153239,0.9667,0.966853,0.966873,0.966772


[I 2025-04-01 14:28:29,843] Trial 91 pruned. 


Trial 92 with params: {'learning_rate': 0.00010887973765167546, 'weight_decay': 0.006, 'warmup_steps': 30, 'lambda_param': 1.0, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3357,0.172903,0.9568,0.957178,0.957091,0.956848
2,0.1692,0.158292,0.9658,0.965936,0.965989,0.965906
3,0.1474,0.153323,0.9682,0.968541,0.968267,0.968332
4,0.1374,0.148757,0.9701,0.970268,0.970223,0.970187
5,0.1325,0.144626,0.9716,0.971953,0.97171,0.971729


[I 2025-04-01 14:39:31,590] Trial 92 pruned. 


Trial 93 with params: {'learning_rate': 0.0002793182325266481, 'weight_decay': 0.001, 'warmup_steps': 31, 'lambda_param': 0.7000000000000001, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3408,0.208263,0.9343,0.937547,0.934615,0.934259
2,0.1926,0.180392,0.9515,0.951736,0.951846,0.951415


[I 2025-04-01 14:43:55,750] Trial 93 pruned. 


Trial 94 with params: {'learning_rate': 0.00010338642064276396, 'weight_decay': 0.003, 'warmup_steps': 32, 'lambda_param': 0.30000000000000004, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3456,0.175988,0.9538,0.954356,0.954153,0.953899
2,0.17,0.164034,0.9624,0.962551,0.96263,0.962407
3,0.1482,0.154945,0.9669,0.967223,0.967,0.967042
4,0.1376,0.147417,0.9708,0.971073,0.970886,0.970947
5,0.1326,0.145793,0.9715,0.971619,0.971615,0.971584


[I 2025-04-01 14:54:56,460] Trial 94 pruned. 


Trial 95 with params: {'learning_rate': 0.00033622652480271855, 'weight_decay': 0.0, 'warmup_steps': 5, 'lambda_param': 0.4, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3195,0.216805,0.932,0.932787,0.932228,0.931976
2,0.2009,0.188242,0.9478,0.947923,0.948263,0.9476


[I 2025-04-01 14:59:19,719] Trial 95 pruned. 


Trial 96 with params: {'learning_rate': 5.399635979922363e-05, 'weight_decay': 0.0, 'warmup_steps': 26, 'lambda_param': 0.30000000000000004, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3885,0.174397,0.9564,0.95692,0.956602,0.956561
2,0.1762,0.157521,0.9649,0.965069,0.965113,0.964989
3,0.1497,0.152278,0.9663,0.966441,0.966429,0.966429


[I 2025-04-01 15:05:55,051] Trial 96 pruned. 


Trial 97 with params: {'learning_rate': 0.00012180589154004539, 'weight_decay': 0.006, 'warmup_steps': 30, 'lambda_param': 0.6000000000000001, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3418,0.169726,0.9555,0.955658,0.955769,0.955634
2,0.1715,0.167395,0.9569,0.957084,0.957172,0.956907


[I 2025-04-01 15:10:19,116] Trial 97 pruned. 


Trial 98 with params: {'learning_rate': 7.944280159622164e-05, 'weight_decay': 0.0, 'warmup_steps': 23, 'lambda_param': 0.0, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.347,0.178515,0.9532,0.954047,0.953363,0.953342
2,0.1692,0.163366,0.9606,0.960982,0.960878,0.96067
3,0.147,0.15854,0.9641,0.964302,0.964251,0.964226


[I 2025-04-01 15:16:53,669] Trial 98 pruned. 


Trial 99 with params: {'learning_rate': 0.00014171065540309326, 'weight_decay': 0.008, 'warmup_steps': 12, 'lambda_param': 0.8, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3122,0.177673,0.9514,0.952597,0.951703,0.951425
2,0.1737,0.166312,0.96,0.96072,0.960173,0.960047
3,0.1508,0.158876,0.9652,0.965449,0.965372,0.965282


[I 2025-04-01 15:23:28,464] Trial 99 pruned. 


Trial 100 with params: {'learning_rate': 0.004463096479266976, 'weight_decay': 0.003, 'warmup_steps': 18, 'lambda_param': 1.0, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4852,1.566701,0.108,0.093208,0.106355,0.066027
2,1.4953,1.529764,0.126,0.080673,0.124863,0.072312
3,1.5081,1.553503,0.1014,0.01014,0.1,0.018413


[I 2025-04-01 15:30:04,948] Trial 100 pruned. 


Trial 101 with params: {'learning_rate': 0.00031502971397332646, 'weight_decay': 0.01, 'warmup_steps': 1, 'lambda_param': 0.4, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3178,0.221084,0.9289,0.93125,0.928806,0.929163
2,0.1958,0.183444,0.9483,0.948704,0.948594,0.948285


[I 2025-04-01 15:34:28,677] Trial 101 pruned. 


Trial 102 with params: {'learning_rate': 9.181003046476271e-05, 'weight_decay': 0.005, 'warmup_steps': 27, 'lambda_param': 0.30000000000000004, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3453,0.175041,0.9543,0.955186,0.95444,0.954527
2,0.1717,0.157032,0.9657,0.96568,0.965953,0.965747
3,0.1486,0.153638,0.9673,0.967883,0.967465,0.967512
4,0.1375,0.146915,0.9714,0.971649,0.971511,0.971541
5,0.1324,0.144514,0.9713,0.971658,0.971419,0.9715


[I 2025-04-01 15:45:26,704] Trial 102 pruned. 


Trial 103 with params: {'learning_rate': 0.0002253184406057868, 'weight_decay': 0.001, 'warmup_steps': 7, 'lambda_param': 0.4, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3065,0.196164,0.9425,0.944137,0.942793,0.942632
2,0.1828,0.174038,0.953,0.95406,0.95318,0.953122


[I 2025-04-01 15:49:48,849] Trial 103 pruned. 


Trial 104 with params: {'learning_rate': 0.0003133130297405107, 'weight_decay': 0.008, 'warmup_steps': 29, 'lambda_param': 1.0, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3401,0.218305,0.9307,0.933843,0.930835,0.931082
2,0.1973,0.185714,0.9506,0.95143,0.950851,0.950628


[I 2025-04-01 16:07:24,656] Trial 105 pruned. 


Trial 106 with params: {'learning_rate': 9.485007186583606e-05, 'weight_decay': 0.003, 'warmup_steps': 21, 'lambda_param': 0.30000000000000004, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3359,0.175532,0.9539,0.954389,0.954071,0.953992
2,0.1717,0.1572,0.9656,0.966181,0.96575,0.965765
3,0.149,0.152899,0.969,0.969384,0.969095,0.969184
4,0.1378,0.148356,0.9699,0.970132,0.970015,0.970045
5,0.1325,0.144197,0.9718,0.972091,0.971937,0.971946
6,0.13,0.141211,0.9737,0.973794,0.97387,0.973824


[I 2025-04-01 16:33:40,230] Trial 107 pruned. 


Trial 108 with params: {'learning_rate': 9.85414936577383e-05, 'weight_decay': 0.004, 'warmup_steps': 20, 'lambda_param': 0.30000000000000004, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3377,0.175885,0.9566,0.957088,0.956655,0.956695
2,0.1711,0.156237,0.9664,0.96637,0.966617,0.966437
3,0.1479,0.153563,0.9679,0.968136,0.967973,0.968018
4,0.1374,0.150909,0.9681,0.968347,0.968251,0.968241
5,0.1327,0.145863,0.972,0.972266,0.972099,0.972142


[I 2025-04-01 16:44:35,550] Trial 108 pruned. 


Trial 109 with params: {'learning_rate': 6.999007988097729e-05, 'weight_decay': 0.006, 'warmup_steps': 32, 'lambda_param': 1.0, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3719,0.17544,0.9547,0.95547,0.954989,0.954788
2,0.1707,0.157702,0.9652,0.965411,0.965316,0.965261
3,0.1466,0.153192,0.967,0.96731,0.967059,0.967134
4,0.1371,0.148945,0.9696,0.969788,0.969686,0.969715
5,0.1324,0.146183,0.9698,0.969974,0.969932,0.96991


[I 2025-04-01 16:55:34,003] Trial 109 pruned. 


Trial 110 with params: {'learning_rate': 7.485988197891542e-05, 'weight_decay': 0.004, 'warmup_steps': 6, 'lambda_param': 0.0, 'temperature': 5.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3375,0.175741,0.9555,0.956029,0.955745,0.955616
2,0.1693,0.157295,0.9644,0.964564,0.964564,0.964494
3,0.1464,0.15185,0.9682,0.968343,0.968317,0.968308
4,0.1375,0.147823,0.9706,0.970698,0.970774,0.970687
5,0.1325,0.145062,0.9717,0.971857,0.971877,0.971831


[I 2025-04-01 17:06:33,712] Trial 110 pruned. 


Trial 111 with params: {'learning_rate': 0.00018510712397089887, 'weight_decay': 0.003, 'warmup_steps': 27, 'lambda_param': 0.2, 'temperature': 7.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3296,0.186584,0.9488,0.949826,0.949041,0.948912
2,0.1792,0.169047,0.9579,0.958265,0.958303,0.957829


[I 2025-04-01 17:10:55,829] Trial 111 pruned. 


Trial 112 with params: {'learning_rate': 0.000802279212394968, 'weight_decay': 0.005, 'warmup_steps': 2, 'lambda_param': 1.0, 'temperature': 3.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4259,0.27728,0.8941,0.90305,0.894024,0.89491
2,0.2781,0.271252,0.9031,0.905982,0.903626,0.902929
3,0.2239,0.210537,0.9339,0.93445,0.934029,0.933944


[I 2025-04-01 17:17:30,264] Trial 112 pruned. 


Trial 113 with params: {'learning_rate': 6.362806984860492e-05, 'weight_decay': 0.002, 'warmup_steps': 23, 'lambda_param': 0.30000000000000004, 'temperature': 6.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3703,0.1761,0.9545,0.955141,0.954749,0.954675
2,0.1746,0.158541,0.9646,0.965028,0.964774,0.964724
3,0.148,0.15257,0.9676,0.967915,0.967793,0.967773


[I 2025-04-01 17:24:03,567] Trial 113 pruned. 


Trial 114 with params: {'learning_rate': 0.0001807071751854924, 'weight_decay': 0.01, 'warmup_steps': 4, 'lambda_param': 1.0, 'temperature': 4.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.304,0.191998,0.945,0.947956,0.945402,0.945068
2,0.178,0.168138,0.9587,0.959021,0.959018,0.958655
3,0.1537,0.15505,0.9675,0.967707,0.967656,0.967642
4,0.141,0.155765,0.9673,0.96781,0.967399,0.967382
5,0.1341,0.146668,0.9722,0.97234,0.972321,0.972306
6,0.1304,0.142501,0.9733,0.973494,0.973385,0.973415
7,0.1289,0.140524,0.9755,0.975767,0.975557,0.975633


[I 2025-04-01 17:39:28,665] Trial 114 finished with value: 0.9756332785384745 and parameters: {'learning_rate': 0.0001807071751854924, 'weight_decay': 0.01, 'warmup_steps': 4, 'lambda_param': 1.0, 'temperature': 4.5}. Best is trial 14 with value: 0.9760268868449631.


Trial 115 with params: {'learning_rate': 0.00013777911241493818, 'weight_decay': 0.01, 'warmup_steps': 0, 'lambda_param': 1.0, 'temperature': 5.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.2997,0.1864,0.9487,0.950037,0.948953,0.948806
2,0.1739,0.163497,0.9609,0.961273,0.961024,0.961017
3,0.1505,0.153977,0.9666,0.967,0.966725,0.966781


[I 2025-04-01 17:46:02,383] Trial 115 pruned. 


Trial 116 with params: {'learning_rate': 0.0001823095417200476, 'weight_decay': 0.008, 'warmup_steps': 8, 'lambda_param': 0.7000000000000001, 'temperature': 5.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3111,0.188233,0.9476,0.948781,0.947721,0.947688
2,0.1776,0.169438,0.9577,0.958018,0.957874,0.957634
3,0.1546,0.155177,0.9663,0.966769,0.966414,0.966488


[I 2025-04-01 17:52:42,630] Trial 116 pruned. 


Trial 117 with params: {'learning_rate': 0.0002123616647310967, 'weight_decay': 0.002, 'warmup_steps': 21, 'lambda_param': 0.4, 'temperature': 6.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.326,0.196296,0.9429,0.944174,0.94312,0.942769
2,0.1834,0.187296,0.9474,0.947939,0.947879,0.947282


[I 2025-04-01 17:57:06,128] Trial 117 pruned. 


Trial 118 with params: {'learning_rate': 0.00031828851234068145, 'weight_decay': 0.01, 'warmup_steps': 8, 'lambda_param': 0.8, 'temperature': 4.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3201,0.216916,0.9323,0.933722,0.932683,0.932233
2,0.198,0.186955,0.9484,0.949499,0.948564,0.948477


[I 2025-04-01 18:01:30,792] Trial 118 pruned. 


Trial 119 with params: {'learning_rate': 6.283249007042146e-05, 'weight_decay': 0.002, 'warmup_steps': 26, 'lambda_param': 0.9, 'temperature': 5.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3695,0.175739,0.9552,0.955333,0.955429,0.955203
2,0.1718,0.158991,0.9632,0.963519,0.963369,0.963362
3,0.1479,0.152413,0.9676,0.967807,0.967743,0.967737
4,0.1376,0.148594,0.9699,0.970001,0.970051,0.970005
5,0.1327,0.14592,0.9709,0.97104,0.971032,0.971019


[I 2025-04-01 18:12:31,733] Trial 119 pruned. 


Trial 120 with params: {'learning_rate': 0.00010593086649693671, 'weight_decay': 0.009000000000000001, 'warmup_steps': 3, 'lambda_param': 0.7000000000000001, 'temperature': 4.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3136,0.169065,0.9591,0.959295,0.959296,0.959213
2,0.1693,0.160609,0.9637,0.96418,0.96381,0.963804
3,0.1472,0.152038,0.9677,0.967919,0.967853,0.967843


[I 2025-04-01 18:19:10,428] Trial 120 pruned. 


Trial 121 with params: {'learning_rate': 0.000253886239956771, 'weight_decay': 0.01, 'warmup_steps': 3, 'lambda_param': 0.9, 'temperature': 4.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3047,0.211146,0.9365,0.938982,0.9367,0.936615
2,0.1866,0.179585,0.9532,0.9534,0.953422,0.952968
3,0.161,0.162623,0.9619,0.961998,0.962111,0.961966


[I 2025-04-01 18:25:46,373] Trial 121 pruned. 


Trial 122 with params: {'learning_rate': 6.770731207203462e-05, 'weight_decay': 0.002, 'warmup_steps': 29, 'lambda_param': 0.1, 'temperature': 6.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3656,0.171669,0.9578,0.958006,0.958084,0.957896
2,0.1716,0.156444,0.9664,0.966533,0.96655,0.966467
3,0.1468,0.150599,0.9684,0.968545,0.968569,0.968525
4,0.1373,0.146347,0.9704,0.970586,0.970524,0.970524
5,0.1324,0.144926,0.9723,0.972522,0.972418,0.972416


[I 2025-04-01 18:36:45,467] Trial 122 pruned. 


Trial 123 with params: {'learning_rate': 0.00012249553159820464, 'weight_decay': 0.01, 'warmup_steps': 2, 'lambda_param': 0.8, 'temperature': 2.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3074,0.178836,0.9517,0.952912,0.951957,0.951919
2,0.1717,0.163937,0.961,0.961633,0.961175,0.961031
3,0.1486,0.153742,0.9678,0.968066,0.967945,0.967933
4,0.1377,0.147755,0.9714,0.971582,0.971513,0.971507
5,0.1325,0.144707,0.9722,0.972217,0.972411,0.97229
6,0.13,0.141325,0.9737,0.973852,0.973844,0.973843
7,0.1287,0.140049,0.9742,0.974434,0.974323,0.974361


[I 2025-04-01 18:52:15,689] Trial 123 finished with value: 0.9743607711711821 and parameters: {'learning_rate': 0.00012249553159820464, 'weight_decay': 0.01, 'warmup_steps': 2, 'lambda_param': 0.8, 'temperature': 2.5}. Best is trial 14 with value: 0.9760268868449631.


Trial 124 with params: {'learning_rate': 0.00014956050667072713, 'weight_decay': 0.008, 'warmup_steps': 1, 'lambda_param': 1.0, 'temperature': 4.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3023,0.181445,0.9518,0.952811,0.951917,0.951906
2,0.1737,0.161697,0.9623,0.962352,0.962566,0.96235
3,0.1521,0.155363,0.9663,0.966479,0.966426,0.966383


[I 2025-04-01 18:58:52,308] Trial 124 pruned. 


Trial 125 with params: {'learning_rate': 5.811435692493989e-05, 'weight_decay': 0.008, 'warmup_steps': 1, 'lambda_param': 0.7000000000000001, 'temperature': 2.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.355,0.173143,0.9557,0.955752,0.956022,0.955728
2,0.1729,0.158205,0.9638,0.963974,0.964004,0.963882
3,0.148,0.156536,0.9649,0.965183,0.965004,0.96502


[I 2025-04-01 19:05:34,799] Trial 125 pruned. 


Trial 126 with params: {'learning_rate': 0.00013951957193561058, 'weight_decay': 0.008, 'warmup_steps': 6, 'lambda_param': 0.8, 'temperature': 3.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3091,0.182267,0.9486,0.949264,0.948868,0.9486
2,0.1731,0.165172,0.9603,0.960339,0.96054,0.960253


[I 2025-04-01 19:09:59,765] Trial 126 pruned. 


Trial 127 with params: {'learning_rate': 0.00014711366253096326, 'weight_decay': 0.007, 'warmup_steps': 14, 'lambda_param': 1.0, 'temperature': 5.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.311,0.181619,0.9511,0.951717,0.951355,0.95108
2,0.1735,0.168833,0.9569,0.957618,0.957221,0.957001
3,0.1507,0.157213,0.9649,0.965338,0.964965,0.965083


[I 2025-04-01 19:16:35,320] Trial 127 pruned. 


Trial 128 with params: {'learning_rate': 0.00019425069586510193, 'weight_decay': 0.01, 'warmup_steps': 6, 'lambda_param': 0.8, 'temperature': 2.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3055,0.191269,0.9456,0.946677,0.945934,0.945655
2,0.1786,0.172106,0.954,0.955117,0.954182,0.954065
3,0.1547,0.159732,0.9631,0.96326,0.963269,0.963201


[I 2025-04-01 19:23:11,768] Trial 128 pruned. 


Trial 129 with params: {'learning_rate': 0.00014440972658390848, 'weight_decay': 0.01, 'warmup_steps': 1, 'lambda_param': 1.0, 'temperature': 3.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3011,0.175202,0.9552,0.955766,0.955359,0.955414
2,0.1735,0.162169,0.9632,0.963248,0.963375,0.963188
3,0.1506,0.155447,0.967,0.967255,0.967078,0.967104


[I 2025-04-01 19:29:47,045] Trial 129 pruned. 


Trial 130 with params: {'learning_rate': 0.00019906660940768, 'weight_decay': 0.01, 'warmup_steps': 32, 'lambda_param': 0.7000000000000001, 'temperature': 5.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3333,0.18972,0.946,0.946171,0.946335,0.945912
2,0.1826,0.165058,0.9602,0.96057,0.960486,0.960351


[I 2025-04-01 19:34:10,634] Trial 130 pruned. 


Trial 131 with params: {'learning_rate': 0.0005612567161548509, 'weight_decay': 0.01, 'warmup_steps': 29, 'lambda_param': 0.0, 'temperature': 4.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3839,0.272305,0.899,0.909662,0.899057,0.900513
2,0.2367,0.213475,0.9321,0.933513,0.932193,0.932221
3,0.1953,0.194116,0.9397,0.941326,0.940165,0.939766


[I 2025-04-01 19:40:45,541] Trial 131 pruned. 


Trial 132 with params: {'learning_rate': 7.364852168472748e-05, 'weight_decay': 0.0, 'warmup_steps': 32, 'lambda_param': 0.0, 'temperature': 6.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3653,0.171879,0.9573,0.95767,0.957485,0.957404
2,0.1701,0.159002,0.9646,0.96464,0.964845,0.964642
3,0.1469,0.152573,0.9694,0.969584,0.969514,0.969504
4,0.1372,0.146032,0.9702,0.970431,0.970354,0.970359
5,0.1324,0.143557,0.9725,0.972754,0.972681,0.972681
6,0.1302,0.141694,0.9738,0.974023,0.973948,0.973956
7,0.129,0.140288,0.9737,0.97395,0.973827,0.973851


[I 2025-04-01 19:56:08,726] Trial 132 finished with value: 0.9738513991111961 and parameters: {'learning_rate': 7.364852168472748e-05, 'weight_decay': 0.0, 'warmup_steps': 32, 'lambda_param': 0.0, 'temperature': 6.0}. Best is trial 14 with value: 0.9760268868449631.


Trial 133 with params: {'learning_rate': 8.351329846491984e-05, 'weight_decay': 0.002, 'warmup_steps': 29, 'lambda_param': 0.0, 'temperature': 5.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3542,0.173734,0.9554,0.95637,0.955607,0.95569
2,0.1698,0.159423,0.9634,0.963561,0.963531,0.963481
3,0.1471,0.154239,0.9666,0.966768,0.966754,0.966727
4,0.1376,0.151405,0.9685,0.96889,0.968622,0.968666
5,0.1324,0.146507,0.9711,0.971274,0.971254,0.971214


[I 2025-04-01 20:07:06,266] Trial 133 pruned. 


Trial 134 with params: {'learning_rate': 0.00020562651776151767, 'weight_decay': 0.0, 'warmup_steps': 30, 'lambda_param': 0.1, 'temperature': 5.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3293,0.189912,0.9464,0.947268,0.946596,0.946395
2,0.1822,0.171593,0.9568,0.957091,0.956969,0.956709


[I 2025-04-01 20:11:29,176] Trial 134 pruned. 


Trial 135 with params: {'learning_rate': 0.003691578768913055, 'weight_decay': 0.003, 'warmup_steps': 26, 'lambda_param': 0.1, 'temperature': 4.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4158,1.513016,0.1471,0.063309,0.145204,0.082484
2,1.4918,1.551083,0.111,0.035164,0.110694,0.047749
3,1.4979,1.541335,0.1247,0.025185,0.122874,0.041581


[I 2025-04-01 20:18:03,829] Trial 135 pruned. 


Trial 136 with params: {'learning_rate': 0.00013280105304577745, 'weight_decay': 0.002, 'warmup_steps': 25, 'lambda_param': 0.30000000000000004, 'temperature': 5.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3312,0.175621,0.9558,0.956926,0.956002,0.955982
2,0.1719,0.163813,0.9611,0.961606,0.961254,0.961188
3,0.1501,0.152704,0.967,0.967297,0.967076,0.967159


[I 2025-04-01 20:24:38,678] Trial 136 pruned. 


Trial 137 with params: {'learning_rate': 0.002472023290700323, 'weight_decay': 0.009000000000000001, 'warmup_steps': 15, 'lambda_param': 0.7000000000000001, 'temperature': 6.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.0729,1.148605,0.403,0.51612,0.401753,0.376925
2,0.9501,0.998832,0.4898,0.568669,0.488409,0.481923
3,0.8356,0.872047,0.5683,0.598736,0.567484,0.554765


[I 2025-04-01 20:31:15,480] Trial 137 pruned. 


Trial 138 with params: {'learning_rate': 0.002819055822915683, 'weight_decay': 0.001, 'warmup_steps': 9, 'lambda_param': 0.6000000000000001, 'temperature': 6.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.2977,1.421715,0.214,0.2645,0.213153,0.187006
2,1.4206,1.512731,0.1362,0.137891,0.135183,0.070473
3,1.447,1.490832,0.1757,0.143672,0.174008,0.113714


[I 2025-04-01 20:37:50,797] Trial 138 pruned. 


Trial 139 with params: {'learning_rate': 9.765375237194382e-05, 'weight_decay': 0.01, 'warmup_steps': 7, 'lambda_param': 0.6000000000000001, 'temperature': 3.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3229,0.169375,0.957,0.957511,0.957198,0.957199
2,0.1692,0.16341,0.9611,0.961308,0.96136,0.96109
3,0.1467,0.155085,0.9678,0.967855,0.96801,0.967898
4,0.137,0.148347,0.9705,0.970653,0.970634,0.97063
5,0.1324,0.145591,0.9723,0.972495,0.972384,0.972424


[I 2025-04-01 20:48:49,835] Trial 139 pruned. 


Trial 140 with params: {'learning_rate': 6.0716123101733206e-05, 'weight_decay': 0.0, 'warmup_steps': 29, 'lambda_param': 0.1, 'temperature': 6.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3841,0.183056,0.9526,0.953287,0.952759,0.952737
2,0.1787,0.159246,0.9645,0.964924,0.964702,0.964678
3,0.1486,0.151613,0.9681,0.968358,0.968215,0.968244
4,0.1378,0.149167,0.9694,0.969603,0.969566,0.969559
5,0.1329,0.145176,0.9699,0.970124,0.97006,0.970068


[I 2025-04-01 20:59:48,900] Trial 140 pruned. 


Trial 141 with params: {'learning_rate': 0.0001552622806408454, 'weight_decay': 0.009000000000000001, 'warmup_steps': 0, 'lambda_param': 0.8, 'temperature': 3.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.2978,0.185684,0.9461,0.946999,0.946333,0.946217
2,0.1758,0.165869,0.9587,0.958698,0.95895,0.958702


[I 2025-04-01 21:04:16,548] Trial 141 pruned. 


Trial 142 with params: {'learning_rate': 0.00015473107796255142, 'weight_decay': 0.003, 'warmup_steps': 32, 'lambda_param': 0.0, 'temperature': 6.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3326,0.185839,0.9482,0.948557,0.948382,0.948188
2,0.1748,0.163709,0.9602,0.96065,0.960451,0.960315
3,0.1516,0.155091,0.9677,0.968439,0.967685,0.967913
4,0.1397,0.151866,0.9699,0.970174,0.969943,0.969999
5,0.1334,0.145739,0.9723,0.972588,0.972354,0.972432
6,0.1304,0.141246,0.9736,0.973823,0.973711,0.973723
7,0.1287,0.139732,0.9738,0.974051,0.97389,0.973934


[I 2025-04-01 21:19:39,270] Trial 142 finished with value: 0.9739339113388435 and parameters: {'learning_rate': 0.00015473107796255142, 'weight_decay': 0.003, 'warmup_steps': 32, 'lambda_param': 0.0, 'temperature': 6.5}. Best is trial 14 with value: 0.9760268868449631.


Trial 143 with params: {'learning_rate': 0.0005238669986758576, 'weight_decay': 0.002, 'warmup_steps': 30, 'lambda_param': 0.0, 'temperature': 7.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3833,0.256272,0.9052,0.914075,0.905509,0.906058
2,0.2308,0.215133,0.9334,0.935156,0.933602,0.933497


[I 2025-04-01 21:24:02,536] Trial 143 pruned. 


Trial 144 with params: {'learning_rate': 0.00010572603226995668, 'weight_decay': 0.001, 'warmup_steps': 30, 'lambda_param': 0.0, 'temperature': 6.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3427,0.176223,0.9555,0.955968,0.955751,0.955619
2,0.1704,0.158019,0.9651,0.965342,0.965324,0.965172
3,0.1482,0.154461,0.9655,0.965713,0.965694,0.965644


[I 2025-04-01 21:30:39,160] Trial 144 pruned. 


Trial 145 with params: {'learning_rate': 0.00013781398998660572, 'weight_decay': 0.002, 'warmup_steps': 24, 'lambda_param': 0.0, 'temperature': 6.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3276,0.181116,0.9525,0.953211,0.9527,0.95259
2,0.1732,0.164319,0.9607,0.960862,0.960985,0.960617


[I 2025-04-01 21:35:04,552] Trial 145 pruned. 


Trial 146 with params: {'learning_rate': 0.0001868104124089473, 'weight_decay': 0.003, 'warmup_steps': 32, 'lambda_param': 0.0, 'temperature': 6.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3316,0.185488,0.9487,0.94901,0.94879,0.948708
2,0.1805,0.173608,0.9551,0.955377,0.9553,0.955049
3,0.154,0.162717,0.9623,0.962316,0.962537,0.962347


[I 2025-04-01 21:41:39,013] Trial 146 pruned. 


Trial 147 with params: {'learning_rate': 6.036043339829239e-05, 'weight_decay': 0.0, 'warmup_steps': 32, 'lambda_param': 0.0, 'temperature': 5.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3922,0.169463,0.9571,0.957896,0.957316,0.957237
2,0.1725,0.155051,0.9648,0.965167,0.965023,0.964898
3,0.1475,0.150114,0.9687,0.968856,0.968854,0.968801
4,0.1371,0.148649,0.9693,0.969421,0.969449,0.969377
5,0.1324,0.143692,0.972,0.972241,0.972125,0.972137


[I 2025-04-01 21:52:35,842] Trial 147 pruned. 


Trial 148 with params: {'learning_rate': 0.003199645143713299, 'weight_decay': 0.007, 'warmup_steps': 0, 'lambda_param': 0.1, 'temperature': 5.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5248,1.564019,0.1014,0.01014,0.1,0.018413
2,1.5069,1.553261,0.1014,0.01014,0.1,0.018413


[I 2025-04-01 21:56:59,597] Trial 148 pruned. 
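
The metrics logged for trial 148 look like those of a model that predicts the same class for every image. The sketch below checks that arithmetic. It assumes that precision, recall, and F1 are macro-averaged over the 10 classes and that an undefined per-class precision is counted as 0; the averaging mode is not shown in this part of the notebook, so both points are assumptions. The function name is hypothetical.

```python
# Sketch: macro-averaged metrics for a classifier that always predicts one class.
# Assumptions: macro averaging over 10 classes, undefined precision counted as 0.

NUM_CLASSES = 10
ACCURACY = 0.1014  # from the trial 148 log; equals the share of the predicted class


def collapsed_macro_metrics(class_share: float, num_classes: int):
    """Return macro (precision, recall, f1) for a constant predictor."""
    # Only the single predicted class contributes non-zero per-class scores.
    precision_k = class_share  # correct predictions / all predictions
    recall_k = 1.0             # every sample of that class is predicted as it
    f1_k = 2 * precision_k * recall_k / (precision_k + recall_k)
    # The other classes contribute 0, so the macro average divides by num_classes.
    return precision_k / num_classes, recall_k / num_classes, f1_k / num_classes


precision, recall, f1 = collapsed_macro_metrics(ACCURACY, NUM_CLASSES)
print(round(precision, 5), round(recall, 1), round(f1, 6))
```

Under these assumptions the rounded values match the logged row (0.01014, 0.1, 0.018413), which suggests that the learning rate of about 3.2e-3 made training collapse to a single class.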


Trial 149 with params: {'learning_rate': 0.00011872523215251572, 'weight_decay': 0.009000000000000001, 'warmup_steps': 24, 'lambda_param': 0.8, 'temperature': 3.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3346,0.179448,0.9511,0.952146,0.951246,0.951254
2,0.1712,0.163259,0.9611,0.961534,0.961335,0.961152
3,0.1482,0.154251,0.9684,0.968782,0.968551,0.968601
4,0.1379,0.147217,0.971,0.971261,0.971172,0.971161
5,0.1327,0.144941,0.9711,0.971304,0.971274,0.971274


[I 2025-04-01 22:07:59,936] Trial 149 pruned. 
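
Most trials in the log above end as pruned and only a few run to completion. As a rough way to summarize this, the sketch below counts trial outcomes from the captured Optuna log lines. The regular expression and the variable names are illustrative and are not part of the notebook; only five lines of the log are embedded here.

```python
import re
from collections import Counter

# Five lines copied from the log above; in practice, pass the whole captured output.
LOG = """\
[I 2025-04-01 21:19:39,270] Trial 142 finished with value: 0.9739339113388435 and parameters: {'learning_rate': 0.00015473107796255142, 'weight_decay': 0.003, 'warmup_steps': 32, 'lambda_param': 0.0, 'temperature': 6.5}. Best is trial 14 with value: 0.9760268868449631.
[I 2025-04-01 21:24:02,536] Trial 143 pruned.
[I 2025-04-01 21:30:39,160] Trial 144 pruned.
[I 2025-04-01 21:35:04,552] Trial 145 pruned.
[I 2025-04-01 21:41:39,013] Trial 146 pruned.
"""

# Matches "Trial N pruned" or "Trial N finished with value: X".
# "Best is trial 14 ..." is not matched because the search is case-sensitive.
TRIAL_RE = re.compile(r"\] Trial (\d+) (pruned|finished)(?: with value: ([0-9.]+))?")

states = Counter()
finished_values = {}
for match in TRIAL_RE.finditer(LOG):
    number, state, value = match.groups()
    states[state] += 1
    if value is not None:
        finished_values[int(number)] = float(value)

print(dict(states))
print(finished_values)
```

If the underlying `optuna` study object is accessible, the same counts can be read from the `state` of each entry in `study.trials` instead of parsing text.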


In [40]:
print(best_distill_aug)

BestRun(run_id='14', objective=0.9760268868449631, hyperparameters={'learning_rate': 0.00014946504427538972, 'weight_decay': 0.01, 'warmup_steps': 3, 'lambda_param': 0.9, 'temperature': 3.0}, run_summary=None)
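
The best distillation run above combines `lambda_param=0.9` with `temperature=3.0`. The notebook's own loss is defined outside this section, so the sketch below only illustrates one common convention: the total loss is `(1 - lambda_param)` times the hard-label cross-entropy plus `lambda_param` times the temperature-scaled KL divergence between teacher and student, multiplied by `temperature ** 2`. This formula, the function names, and the example logits are assumptions. Plain Python is used so that the example runs without a GPU.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits divided by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label, lambda_param, temperature):
    """Assumed convention: (1 - lambda) * CE + lambda * T^2 * KL(teacher || student)."""
    # Hard-label cross-entropy of the student at temperature 1.
    cross_entropy = -math.log(softmax(student_logits)[label])
    # Softened distributions of teacher and student.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student) if t > 0)
    return (1 - lambda_param) * cross_entropy + lambda_param * temperature**2 * kl


# Hypothetical logits for a single 3-class example.
student = [2.0, 0.5, -1.0]
teacher = [3.0, 1.0, -2.0]
print(distillation_loss(student, teacher, label=0, lambda_param=0.9, temperature=3.0))
```

Under this convention, `lambda_param=0.0` reduces the loss to plain cross-entropy and makes `temperature` irrelevant, while `lambda_param=1.0` trains on the teacher's soft targets only.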


In [41]:
print("Best original dataset training score: ", best_base)
print("Best original dataset distillation training score: ", best_distill)
print("Best augmented dataset training score: ", best_base_aug)
print("Best augmented dataset distillation training score: ", best_distill_aug)

Best original dataset training score:  BestRun(run_id='67', objective=0.9746290627725797, hyperparameters={'learning_rate': 9.777098843358782e-05, 'weight_decay': 0.007, 'warmup_steps': 24}, run_summary=None)
Best original dataset distillation training score:  BestRun(run_id='32', objective=0.9764120594095738, hyperparameters={'learning_rate': 0.00013553561983282748, 'weight_decay': 0.01, 'warmup_steps': 24, 'lambda_param': 1.0, 'temperature': 3.0}, run_summary=None)
Best augmented dataset training score:  BestRun(run_id='71', objective=0.9750019868944989, hyperparameters={'learning_rate': 0.00013900432881088528, 'weight_decay': 0.007, 'warmup_steps': 13}, run_summary=None)
Best augmented dataset distillation training score:  BestRun(run_id='14', objective=0.9760268868449631, hyperparameters={'learning_rate': 0.00014946504427538972, 'weight_decay': 0.01, 'warmup_steps': 3, 'lambda_param': 0.9, 'temperature': 3.0}, run_summary=None)
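
To compare the four searches side by side, the printed results can be ranked by their objective (F1). The sketch below uses a local `NamedTuple` stand-in with the same fields as the printed `BestRun` objects so that it runs without the notebook state; the values are copied from the output above, and the dictionary keys reuse the notebook's variable names.

```python
from typing import NamedTuple, Optional


class BestRun(NamedTuple):
    """Local stand-in mirroring the fields printed above (not the library class)."""
    run_id: str
    objective: float
    hyperparameters: dict
    run_summary: Optional[dict] = None


results = {
    "best_base": BestRun("67", 0.9746290627725797, {
        "learning_rate": 9.777098843358782e-05, "weight_decay": 0.007, "warmup_steps": 24}),
    "best_distill": BestRun("32", 0.9764120594095738, {
        "learning_rate": 0.00013553561983282748, "weight_decay": 0.01, "warmup_steps": 24,
        "lambda_param": 1.0, "temperature": 3.0}),
    "best_base_aug": BestRun("71", 0.9750019868944989, {
        "learning_rate": 0.00013900432881088528, "weight_decay": 0.007, "warmup_steps": 13}),
    "best_distill_aug": BestRun("14", 0.9760268868449631, {
        "learning_rate": 0.00014946504427538972, "weight_decay": 0.01, "warmup_steps": 3,
        "lambda_param": 0.9, "temperature": 3.0}),
}

# Sort by objective, highest F1 first.
ranked = sorted(results.items(), key=lambda item: item[1].objective, reverse=True)
for name, run in ranked:
    print(f"{name:<18} trial {run.run_id:>3}  F1={run.objective:.4f}")

# Gap between the best and the worst of the four configurations.
spread = ranked[0][1].objective - ranked[-1][1].objective
print(f"spread: {spread:.4f}")
```

Both distillation variants rank above both non-distilled variants, but all four objectives lie within 0.002 of each other, so the ordering should be read with caution for a single search per variant.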
