# Notebook for distillation training on the CIFAR10 dataset
This notebook trains MobileNetV2 on the CIFAR10 dataset, using a ViT fine-tuned on the same dataset as the teacher model.

MobileNetV2 is used in three configurations: with random initialization, with only the classification head trained on an initialized model (pretrained on ImageNet), and with the whole initialized model trained. Each of these three tasks is trained both in the standard way and with distillation from the teacher model mentioned above.

Distillation uses precomputed logits from the precompute_logits notebook.
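As a rough illustration of how precomputed teacher logits can enter the training objective, the sketch below implements one common distillation loss: a weighted sum of cross-entropy on the true label and a temperature-scaled KL divergence between teacher and student distributions. The loss actually used by this notebook lives in the `base` module and is not shown, so the function name, `alpha`, and `temperature` are assumptions for illustration only. Plain Python is used so the sketch runs without a GPU.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      alpha=0.5, temperature=2.0):
    """Hypothetical per-sample loss: alpha * CE + (1 - alpha) * T^2 * KL."""
    # Hard-label term: cross-entropy of the student against the true class.
    hard = -math.log(softmax(student_logits)[label])
    # Soft-label term: KL(teacher || student) on temperature-softened outputs.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = sum(pt * math.log(pt / ps)
               for pt, ps in zip(p_teacher, p_student) if pt > 0)
    return alpha * hard + (1 - alpha) * temperature ** 2 * soft


# Teacher logits would be read from disk instead of running the teacher.
print(distillation_loss([2.0, 0.5, -1.0], [0.1, 3.0, 0.2], label=0))
```

Because the teacher logits are stored with the dataset, the teacher never has to be run during student training, which is the point of precomputing them.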

## Library imports and method definitions

In [1]:
from transformers import Trainer, EarlyStoppingCallback, AutoModelForImageClassification
from torch.utils.data import DataLoader, ConcatDataset
import pandas as pd
import optuna
import torch
import math
import base
import os


In [None]:
dataset_part = base.get_dataset_part()

Reset the random seed so that the results are reproducible.

In [None]:
base.reset_seed()
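The body of `base.reset_seed` is not shown in this notebook. A seed reset of this kind typically re-seeds every random number generator in use; the sketch below is one possible shape, with the function name and the default seed value of 42 being assumptions. NumPy and PyTorch are seeded only if they are installed.

```python
import random


def reset_seed_sketch(seed=42):
    """Hypothetical stand-in for base.reset_seed: re-seed the common RNGs."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed.
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # Safe to call without a GPU.
    except ImportError:
        pass  # PyTorch not installed; nothing to seed.


reset_seed_sketch()
first = random.random()
reset_seed_sketch()
second = random.random()
print(first == second)
```

Calling the reset again before each experiment, as this notebook does, keeps one run's random draws from shifting the next run's initialization.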

In [4]:
if torch.cuda.is_available():
    device = torch.device("cuda")
    print("GPU is available and will be used:", torch.cuda.get_device_name(0))
else:
    device = torch.device("cpu")
    print("GPU is not available, using CPU.")

GPU is available and will be used: NVIDIA A100 80GB PCIe MIG 2g.20gb


Apply the transformations to the dataset.

In [5]:
DATASET = "cifar10"

In [6]:
transform = base.base_transforms()

# The last train batch is used as the eval split...
test = base.CustomCIFAR10L(root=f"{os.path.expanduser('~')}/data/10-logits", dataset_part=dataset_part.TEST, transform=transform)
train = base.CustomCIFAR10L(root=f"{os.path.expanduser('~')}/data/10-logits", dataset_part=dataset_part.TRAIN, transform=transform)
eval = base.CustomCIFAR10L(root=f"{os.path.expanduser('~')}/data/10-logits", dataset_part=dataset_part.EVAL, transform=transform)

In [7]:
augment_transform = base.aug_transforms()
train_aug = base.CustomCIFAR10L(root=f"{os.path.expanduser('~')}/data/10-logits", dataset_part=dataset_part.TRAIN, transform=augment_transform)

In [8]:
train_aug = base.remove_diff_pred_class(train, train_aug, pytorch_dataset=True)
train_combo = ConcatDataset([train, train_aug])

Removing entries from augmented dataset that are different from the base one - based on saved logits:   0%|   …
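The progress message suggests that `remove_diff_pred_class` drops augmented samples whose saved teacher logits predict a different class than the logits saved for the unaugmented sample. The helper itself is not shown, so the following is only one plausible reading, with hypothetical names and toy data:

```python
def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=values.__getitem__)


def consistent_indices(base_logits, aug_logits):
    """Indices where base and augmented logits agree on the predicted class."""
    return [
        i
        for i, (base_row, aug_row) in enumerate(zip(base_logits, aug_logits))
        if argmax(base_row) == argmax(aug_row)
    ]


# Toy logits for three samples and two classes (illustrative values only).
base_logits = [[3.0, 1.0], [0.0, 2.0], [1.0, 5.0]]
aug_logits = [[2.0, 1.0], [4.0, 2.0], [0.0, 1.0]]
print(consistent_indices(base_logits, aug_logits))  # → [0, 2]
```

Filtering this way would keep only augmentations that the teacher still classifies the same way, so the concatenated training set does not carry contradictory soft targets for the same image.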

In [None]:
# Check the class distribution --> good enough
df = pd.DataFrame(eval.labels)
print(df.value_counts())

0
5    1025
9    1022
3    1016
0    1014
1    1014
8    1003
4     997
6     980
7     977
2     952
Name: count, dtype: int64
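To put a number on "good enough", the counts printed above can be compared with a perfectly uniform split. This is a small check on the reported values only, not part of the original notebook:

```python
# Class counts copied from the output above (label -> number of eval samples).
counts = {5: 1025, 9: 1022, 3: 1016, 0: 1014, 1: 1014,
          8: 1003, 4: 997, 6: 980, 7: 977, 2: 952}

total = sum(counts.values())
expected = total / len(counts)  # Per-class count under a uniform split.
worst_class = max(counts, key=lambda label: abs(counts[label] - expected))
max_deviation = abs(counts[worst_class] - expected) / expected

print(total, worst_class, f"{max_deviation:.1%}")  # → 10000 2 4.8%
```

The eval split therefore holds 10,000 samples, and no class deviates from the uniform count by more than about 5%.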


### Standard training of a randomly initialized model

In [None]:
num_epochs = 7
batch_size = 128

In [None]:
# Convert epochs to optimizer steps
data_length = len(train)
min_r = math.ceil(data_length/batch_size)*2
max_r = math.ceil(data_length/batch_size)*num_epochs
warm_up = math.ceil(data_length/batch_size/10)
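As a worked example of the three quantities above, assume the training split holds 40,000 samples. That figure is an inference, not stated in the notebook: it follows from the 10,000-sample eval split carved out of CIFAR10's 50,000 training images, and it agrees with the largest `warmup_steps` value (32) seen in the trial logs.

```python
import math

data_length = 40_000  # Assumed size of the training split (see note above).
batch_size = 128
num_epochs = 7

steps_per_epoch = math.ceil(data_length / batch_size)
min_r = steps_per_epoch * 2             # Earliest pruning point: 2 epochs.
max_r = steps_per_epoch * num_epochs    # Full training budget: 7 epochs.
warm_up = math.ceil(data_length / batch_size / 10)  # About 10% of one epoch.

print(steps_per_epoch, min_r, max_r, warm_up)  # → 313 626 2191 32
```

Under that assumption, the pruner cannot stop a trial before two full epochs, and warmup never exceeds roughly a tenth of the first epoch.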

In [12]:
def hp_space(trial):
    params = {
        "learning_rate": trial.suggest_float("learning_rate", 5e-5, 5e-3, log=True),
        "weight_decay": trial.suggest_float("weight_decay", 0, 1e-2, step=1e-3),
        "warmup_steps": trial.suggest_int("warmup_steps", 0, warm_up),
    }
    print(f"Trial {trial.number} with params: {params}")
    return params

In [None]:
pruner = optuna.pruners.HyperbandPruner(min_resource=min_r, max_resource=max_r, reduction_factor=2, bootstrap_count=2)
sampler = optuna.samplers.TPESampler(seed=42, multivariate=True)
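To get a feel for what these pruner settings imply, the sketch below estimates the Hyperband layout. It uses the bracket-count formula given in Optuna's documentation, floor(log_η(max_resource / min_resource)) + 1, and assumes that bracket `b` starts its successive-halving rungs at `min_resource * η**b`. The resource values 626 and 2191 assume 313 steps per epoch (a 40,000-sample training split), and the function name is hypothetical. The exact checkpoints depend on Optuna internals, so treat this as an approximation.

```python
import math


def hyperband_layout(min_resource, max_resource, reduction_factor):
    """Approximate rung steps per Hyperband bracket (illustrative only)."""
    n_brackets = math.floor(
        math.log(max_resource / min_resource, reduction_factor)
    ) + 1
    layout = {}
    for bracket in range(n_brackets):
        steps = []
        step = min_resource * reduction_factor ** bracket
        while step <= max_resource:
            steps.append(step)
            step *= reduction_factor
        layout[bracket] = steps
    return layout


print(hyperband_layout(626, 2191, 2))  # → {0: [626, 1252], 1: [1252]}
```

With 313 steps per epoch, the rungs at 626 and 1252 steps correspond to the end of epochs 2 and 4, which appears consistent with the trial logs, where pruned trials stop after either 2 or 4 epochs.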



In [None]:
base.reset_seed()

In [15]:
training_args = base.get_training_args(output_dir=f"~/results/{DATASET}/_hp-search", logging_dir=f"~/logs/{DATASET}/_hp-search", epochs=num_epochs, batch_size=batch_size)

In [None]:
def get_model():
    return AutoModelForImageClassification.from_pretrained("timm/tiny_vit_5m_224.in1k", num_labels=10, ignore_mismatched_sizes=True)

In [17]:
trainer = Trainer(
    args=training_args,
    train_dataset=train,
    eval_dataset=eval,
    compute_metrics=base.compute_metrics,
    model_init=get_model,
)
  

Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [18]:
best_base = trainer.hyperparameter_search(
    direction="maximize",
    backend="optuna",
    hp_space=hp_space,
    compute_objective=lambda metrics: metrics["eval_f1"],
    pruner=pruner,
    sampler=sampler,
    study_name="Base",
    n_trials=150
)
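The objective `eval_f1` is produced by `base.compute_metrics`, which is not shown here. In the trial logs the recall column differs from accuracy, which suggests macro averaging over classes, though that is an inference. A minimal sketch of macro-averaged F1 under that assumption:

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for cls in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == cls and p == cls)
        fp = sum(1 for t, p in pairs if t != cls and p == cls)
        fn = sum(1 for t, p in pairs if t == cls and p != cls)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Class 0 scores 2/3 and class 1 scores 4/5, so the macro average is 11/15.
print(macro_f1([0, 0, 1, 1], [0, 1, 1, 1]))
```

Macro averaging weights every class equally, which is a reasonable objective for a split as close to balanced as this eval set.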

[I 2025-03-29 16:18:44,733] A new study created in memory with name: Base


Trial 0 with params: {'learning_rate': 0.0002805758207667253, 'weight_decay': 0.01, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4427,0.264957,0.9117,0.919823,0.911798,0.911894
2,0.1384,0.190786,0.9374,0.93995,0.93788,0.937399
3,0.0747,0.146533,0.9559,0.956957,0.955881,0.956239
4,0.0425,0.150831,0.9584,0.958616,0.958541,0.958336


[I 2025-03-29 16:29:31,752] Trial 0 pruned. 


Trial 1 with params: {'learning_rate': 0.0007875660249889869, 'weight_decay': 0.001, 'warmup_steps': 5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5514,0.41773,0.8557,0.882328,0.85637,0.855427
2,0.2784,0.339531,0.8819,0.892951,0.882365,0.880033


[I 2025-03-29 16:34:52,224] Trial 1 pruned. 


Trial 2 with params: {'learning_rate': 6.533369619026643e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 19}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5266,0.174688,0.9451,0.947232,0.945241,0.945465
2,0.0922,0.130627,0.9562,0.956605,0.95643,0.95628
3,0.0396,0.12304,0.9618,0.961901,0.962067,0.961894
4,0.0161,0.121924,0.9696,0.96972,0.969756,0.969725


[I 2025-03-29 16:45:37,164] Trial 2 pruned. 


Trial 3 with params: {'learning_rate': 0.0013035123791853842, 'weight_decay': 0.0, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.8162,0.680534,0.7682,0.795655,0.768655,0.766107
2,0.4594,0.387826,0.8659,0.870784,0.866465,0.865212


[I 2025-03-29 16:50:59,362] Trial 3 pruned. 


Trial 4 with params: {'learning_rate': 0.002311294500510415, 'weight_decay': 0.002, 'warmup_steps': 6}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.3528,1.407308,0.5228,0.576865,0.52367,0.493646
2,0.9994,1.008663,0.6558,0.68169,0.657148,0.643225
3,0.7787,0.749806,0.7491,0.763195,0.749192,0.747998
4,0.6159,0.661946,0.7716,0.795378,0.771904,0.77347


[I 2025-03-29 17:01:46,736] Trial 4 pruned. 


Trial 5 with params: {'learning_rate': 0.00011635338541918901, 'weight_decay': 0.003, 'warmup_steps': 17}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4403,0.168277,0.9468,0.94893,0.946834,0.947092
2,0.088,0.123281,0.9612,0.961262,0.961493,0.961186
3,0.0411,0.120905,0.9647,0.965272,0.9647,0.964859
4,0.0175,0.132797,0.9679,0.96815,0.968049,0.968094


[I 2025-03-29 17:12:29,708] Trial 5 pruned. 


Trial 6 with params: {'learning_rate': 0.0003654769917956456, 'weight_decay': 0.003, 'warmup_steps': 20}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.451,0.227819,0.9229,0.926144,0.923448,0.923253
2,0.1578,0.209714,0.9301,0.937129,0.930442,0.930244
3,0.0971,0.160412,0.9502,0.950742,0.950349,0.950313
4,0.0553,0.151367,0.9573,0.958074,0.957273,0.957539


[I 2025-03-29 17:23:13,998] Trial 6 pruned. 


Trial 7 with params: {'learning_rate': 9.505122659935192e-05, 'weight_decay': 0.003, 'warmup_steps': 12}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4495,0.168017,0.9442,0.947011,0.94415,0.94474
2,0.0877,0.130934,0.9565,0.957313,0.956724,0.956609
3,0.0382,0.115613,0.9653,0.965487,0.965489,0.965436
4,0.0146,0.132549,0.9662,0.966456,0.966369,0.966374


[I 2025-03-29 17:33:59,659] Trial 7 pruned. 


Trial 8 with params: {'learning_rate': 0.00040842279473800845, 'weight_decay': 0.008, 'warmup_steps': 6}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.425,0.386194,0.8755,0.890119,0.87655,0.87544
2,0.1675,0.201339,0.9328,0.934603,0.933255,0.931989


[I 2025-03-29 17:37:38,428] Trial 8 pruned. 


Trial 9 with params: {'learning_rate': 0.0005338741354740678, 'weight_decay': 0.006, 'warmup_steps': 1}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4618,0.286336,0.9062,0.910548,0.906179,0.906465
2,0.2085,0.270973,0.909,0.918543,0.90956,0.908993
3,0.1291,0.193552,0.9348,0.936477,0.934689,0.935144
4,0.0737,0.173231,0.9466,0.947142,0.946908,0.946483


[I 2025-03-29 17:43:12,242] Trial 9 pruned. 


Trial 10 with params: {'learning_rate': 6.533528818763353e-05, 'weight_decay': 0.01, 'warmup_steps': 13}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5037,0.158559,0.9494,0.950632,0.949494,0.94965
2,0.0869,0.132052,0.9552,0.956138,0.955405,0.955445
3,0.0366,0.127167,0.9628,0.963155,0.962956,0.96298
4,0.0141,0.141551,0.9658,0.966127,0.965927,0.965906


[I 2025-03-29 17:48:46,552] Trial 10 pruned. 


Trial 11 with params: {'learning_rate': 7.708968913466938e-05, 'weight_decay': 0.006, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5066,0.160335,0.9476,0.949904,0.947825,0.947975
2,0.0914,0.11888,0.961,0.961458,0.961057,0.961126
3,0.038,0.118747,0.9655,0.965752,0.965588,0.965642
4,0.0163,0.133683,0.9668,0.96698,0.96694,0.966915
5,0.0052,0.130086,0.9684,0.968645,0.968529,0.968522
6,0.0012,0.132783,0.9708,0.97098,0.970872,0.970903
7,0.0005,0.133448,0.9711,0.971246,0.971205,0.971205


[I 2025-03-29 17:58:34,457] Trial 11 finished with value: 0.9712048758910423 and parameters: {'learning_rate': 7.708968913466938e-05, 'weight_decay': 0.006, 'warmup_steps': 26}. Best is trial 11 with value: 0.9712048758910423.


Trial 12 with params: {'learning_rate': 5.217026363807214e-05, 'weight_decay': 0.004, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.6313,0.180836,0.9423,0.94368,0.942567,0.942575
2,0.1056,0.120743,0.9616,0.962072,0.961704,0.961793
3,0.0442,0.116826,0.966,0.966138,0.966118,0.966074
4,0.0143,0.138144,0.9651,0.96526,0.965298,0.9652


[I 2025-03-29 18:04:07,758] Trial 12 pruned. 


Trial 13 with params: {'learning_rate': 5.226430585490316e-05, 'weight_decay': 0.007, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.6229,0.162117,0.948,0.949867,0.948113,0.948362
2,0.0937,0.119216,0.9623,0.962655,0.962396,0.962444
3,0.04,0.11778,0.9635,0.963617,0.963599,0.963547
4,0.0159,0.135057,0.9646,0.964619,0.964764,0.964605
5,0.0045,0.143058,0.9663,0.966513,0.96637,0.966393
6,0.0023,0.143602,0.969,0.969161,0.9691,0.969122
7,0.001,0.144122,0.9685,0.96861,0.968627,0.968602


[I 2025-03-29 18:13:52,622] Trial 13 finished with value: 0.9686021568730794 and parameters: {'learning_rate': 5.226430585490316e-05, 'weight_decay': 0.007, 'warmup_steps': 26}. Best is trial 11 with value: 0.9712048758910423.


Trial 14 with params: {'learning_rate': 9.95605435141112e-05, 'weight_decay': 0.007, 'warmup_steps': 28}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.488,0.186999,0.9393,0.944452,0.939422,0.939996
2,0.0905,0.122951,0.9593,0.959491,0.959471,0.959415
3,0.0412,0.117226,0.9653,0.965525,0.965482,0.965473
4,0.0169,0.122421,0.9686,0.968717,0.968783,0.968669
5,0.006,0.123625,0.9709,0.971367,0.971008,0.971096
6,0.0016,0.126107,0.972,0.972076,0.972148,0.972095
7,0.0007,0.125646,0.9729,0.973052,0.973021,0.973024


[I 2025-03-29 18:23:34,906] Trial 14 finished with value: 0.9730240184701999 and parameters: {'learning_rate': 9.95605435141112e-05, 'weight_decay': 0.007, 'warmup_steps': 28}. Best is trial 14 with value: 0.9730240184701999.


Trial 15 with params: {'learning_rate': 0.0003662169232204062, 'weight_decay': 0.006, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4851,0.252533,0.9189,0.922387,0.919356,0.919473
2,0.1636,0.19222,0.9349,0.937273,0.935188,0.934952
3,0.0928,0.16212,0.9478,0.948848,0.948042,0.948096
4,0.054,0.162263,0.9534,0.953965,0.953588,0.953452


[I 2025-03-29 18:29:04,981] Trial 15 pruned. 


Trial 16 with params: {'learning_rate': 0.00038309918336020546, 'weight_decay': 0.007, 'warmup_steps': 19}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4508,0.286934,0.9057,0.912826,0.906158,0.906662
2,0.1627,0.206656,0.9301,0.931591,0.930497,0.929844


[I 2025-03-29 18:31:51,745] Trial 16 pruned. 


Trial 17 with params: {'learning_rate': 0.0020085822314002493, 'weight_decay': 0.008, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.1278,1.028901,0.6516,0.689256,0.652173,0.651455
2,0.7247,0.781852,0.7386,0.775961,0.738718,0.73893
3,0.5041,0.59359,0.7976,0.811637,0.798066,0.792535
4,0.3691,0.406194,0.8686,0.883081,0.868396,0.871252


[I 2025-03-29 18:37:22,966] Trial 17 pruned. 


Trial 18 with params: {'learning_rate': 0.0026868566033176914, 'weight_decay': 0.01, 'warmup_steps': 11}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8859,2.314781,0.1417,0.171584,0.140837,0.092388
2,2.1076,2.089234,0.2119,0.192986,0.21148,0.188983
3,2.0578,2.083014,0.2073,0.213258,0.205387,0.162042
4,2.0793,2.016677,0.2433,0.223367,0.241827,0.212747


[I 2025-03-29 18:42:54,846] Trial 18 pruned. 


Trial 19 with params: {'learning_rate': 0.00015627747538495373, 'weight_decay': 0.007, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4432,0.166431,0.9477,0.949691,0.94783,0.948009
2,0.0987,0.140477,0.9544,0.955254,0.954645,0.95455


[I 2025-03-29 18:45:41,106] Trial 19 pruned. 


Trial 20 with params: {'learning_rate': 7.639542885278315e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5332,0.152277,0.9525,0.953275,0.952709,0.952691
2,0.09,0.119022,0.9601,0.960693,0.960234,0.960347
3,0.0391,0.131622,0.9617,0.961712,0.961984,0.961659
4,0.0152,0.138279,0.966,0.966798,0.966095,0.966205
5,0.0047,0.137755,0.9692,0.969378,0.969317,0.969342
6,0.0014,0.142714,0.9705,0.97067,0.970633,0.970647
7,0.0005,0.142881,0.971,0.971137,0.971157,0.97114


[I 2025-03-29 18:55:22,884] Trial 20 finished with value: 0.9711397392439892 and parameters: {'learning_rate': 7.639542885278315e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 32}. Best is trial 14 with value: 0.9730240184701999.


Trial 21 with params: {'learning_rate': 6.804198974992601e-05, 'weight_decay': 0.008, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5522,0.184256,0.9417,0.94464,0.941812,0.94229
2,0.0954,0.120527,0.9606,0.960914,0.960722,0.960676
3,0.0387,0.115763,0.9671,0.967535,0.967227,0.967206
4,0.0143,0.133903,0.9678,0.967836,0.968005,0.967823
5,0.0051,0.141145,0.9706,0.971114,0.970654,0.970745
6,0.0017,0.137425,0.9697,0.969858,0.9698,0.969814
7,0.0007,0.13898,0.9709,0.970967,0.97103,0.970987


[I 2025-03-29 19:05:04,917] Trial 21 finished with value: 0.970986506707678 and parameters: {'learning_rate': 6.804198974992601e-05, 'weight_decay': 0.008, 'warmup_steps': 32}. Best is trial 14 with value: 0.9730240184701999.


Trial 22 with params: {'learning_rate': 0.00019913817180425286, 'weight_decay': 0.008, 'warmup_steps': 30}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4516,0.215778,0.9265,0.931213,0.926719,0.926925
2,0.1159,0.159373,0.9473,0.949858,0.947442,0.94775
3,0.061,0.120467,0.962,0.962484,0.962111,0.962203
4,0.0286,0.131587,0.9636,0.963788,0.963698,0.963716
5,0.0112,0.132403,0.968,0.968147,0.968156,0.968126
6,0.0036,0.129192,0.9714,0.971634,0.97151,0.971546
7,0.0008,0.12885,0.9733,0.973376,0.973434,0.973383


[I 2025-03-29 19:14:46,002] Trial 22 finished with value: 0.9733827228355143 and parameters: {'learning_rate': 0.00019913817180425286, 'weight_decay': 0.008, 'warmup_steps': 30}. Best is trial 22 with value: 0.9733827228355143.


Trial 23 with params: {'learning_rate': 9.496688021669307e-05, 'weight_decay': 0.005, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4814,0.195403,0.935,0.944111,0.93522,0.936601
2,0.0914,0.127305,0.9584,0.958982,0.958643,0.958548
3,0.0406,0.117421,0.964,0.964429,0.96413,0.964192
4,0.0173,0.136321,0.9668,0.967283,0.96696,0.966996


[I 2025-03-29 19:20:18,460] Trial 23 pruned. 


Trial 24 with params: {'learning_rate': 0.00011865097794262479, 'weight_decay': 0.006, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.476,0.160698,0.9466,0.948305,0.946541,0.946851
2,0.096,0.127472,0.9565,0.957121,0.956624,0.956618
3,0.0435,0.127944,0.9638,0.964849,0.96389,0.964029
4,0.0197,0.128134,0.9674,0.968073,0.96751,0.967614
5,0.006,0.135274,0.9698,0.970111,0.969916,0.969965
6,0.0018,0.128943,0.9732,0.973455,0.973298,0.973361
7,0.0007,0.130094,0.9731,0.973254,0.973226,0.973229


[I 2025-03-29 19:29:59,931] Trial 24 finished with value: 0.9732291600489491 and parameters: {'learning_rate': 0.00011865097794262479, 'weight_decay': 0.006, 'warmup_steps': 27}. Best is trial 22 with value: 0.9733827228355143.


Trial 25 with params: {'learning_rate': 0.00020601407276034348, 'weight_decay': 0.003, 'warmup_steps': 30}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4529,0.22015,0.9265,0.929263,0.926812,0.926729
2,0.1105,0.160777,0.9484,0.949656,0.948564,0.948669


[I 2025-03-29 19:32:46,753] Trial 25 pruned. 


Trial 26 with params: {'learning_rate': 0.00024009854177757173, 'weight_decay': 0.009000000000000001, 'warmup_steps': 28}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.443,0.239708,0.9211,0.924993,0.921206,0.921138
2,0.1258,0.146923,0.9495,0.950336,0.949771,0.949617
3,0.0655,0.127077,0.9607,0.961614,0.960873,0.960989
4,0.0338,0.134583,0.9646,0.965048,0.964578,0.964718
5,0.0153,0.142064,0.9652,0.96569,0.965229,0.965309
6,0.0035,0.12639,0.9717,0.971953,0.971812,0.97186
7,0.0012,0.125383,0.9721,0.972385,0.972205,0.972276


[I 2025-03-29 19:42:28,614] Trial 26 finished with value: 0.9722758907507199 and parameters: {'learning_rate': 0.00024009854177757173, 'weight_decay': 0.009000000000000001, 'warmup_steps': 28}. Best is trial 22 with value: 0.9733827228355143.


Trial 27 with params: {'learning_rate': 0.0002467077135460003, 'weight_decay': 0.006, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.462,0.229358,0.9229,0.930496,0.922725,0.924326
2,0.1227,0.148176,0.9497,0.95096,0.949809,0.949961


[I 2025-03-29 19:45:14,970] Trial 27 pruned. 


Trial 28 with params: {'learning_rate': 0.002953666986018182, 'weight_decay': 0.002, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.9899,2.270544,0.1752,0.120661,0.173074,0.125013
2,2.1675,2.218947,0.1808,0.170645,0.18035,0.14429


[I 2025-03-29 19:48:01,025] Trial 28 pruned. 


Trial 29 with params: {'learning_rate': 0.00011735172641973649, 'weight_decay': 0.003, 'warmup_steps': 0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3922,0.171859,0.9448,0.948764,0.945024,0.945414
2,0.0866,0.110174,0.9631,0.963302,0.963305,0.963241
3,0.0408,0.123265,0.9633,0.96444,0.96343,0.963679
4,0.0177,0.122345,0.9682,0.968252,0.968392,0.968251
5,0.0056,0.141341,0.969,0.96939,0.969073,0.969177
6,0.0016,0.138268,0.9703,0.970583,0.970407,0.970469
7,0.0004,0.13675,0.9706,0.970681,0.970753,0.970708


[I 2025-03-29 19:57:41,137] Trial 29 finished with value: 0.970708143786813 and parameters: {'learning_rate': 0.00011735172641973649, 'weight_decay': 0.003, 'warmup_steps': 0}. Best is trial 22 with value: 0.9733827228355143.


Trial 30 with params: {'learning_rate': 0.00028100291767653175, 'weight_decay': 0.007, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4396,0.263432,0.911,0.917949,0.911041,0.911214
2,0.1379,0.180815,0.9405,0.942103,0.940826,0.940662
3,0.0752,0.15109,0.9527,0.954112,0.952643,0.953095
4,0.0418,0.146234,0.9598,0.959802,0.96,0.959856


[I 2025-03-29 20:03:12,372] Trial 30 pruned. 


Trial 31 with params: {'learning_rate': 0.0003282029820771861, 'weight_decay': 0.01, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4606,0.284063,0.9081,0.913485,0.908336,0.908256
2,0.1468,0.176599,0.9406,0.942442,0.940695,0.940876


[I 2025-03-29 20:05:58,996] Trial 31 pruned. 


Trial 32 with params: {'learning_rate': 0.0002644965932082481, 'weight_decay': 0.009000000000000001, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4661,0.252513,0.9178,0.922497,0.917815,0.918037
2,0.132,0.185334,0.9389,0.942004,0.939248,0.938791
3,0.0786,0.144427,0.9556,0.956549,0.955725,0.95587
4,0.0394,0.126335,0.9619,0.962133,0.962028,0.962043


[I 2025-03-29 20:11:29,842] Trial 32 pruned. 


Trial 33 with params: {'learning_rate': 0.0004211137487642013, 'weight_decay': 0.008, 'warmup_steps': 28}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4909,0.258089,0.9143,0.916936,0.914541,0.914683
2,0.1829,0.199585,0.9308,0.932218,0.930988,0.931096


[I 2025-03-29 20:14:15,801] Trial 33 pruned. 


Trial 34 with params: {'learning_rate': 0.00019066411536696978, 'weight_decay': 0.009000000000000001, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4455,0.268575,0.9118,0.922968,0.912035,0.913265
2,0.1131,0.1263,0.9586,0.958792,0.958775,0.958678
3,0.0582,0.131439,0.9586,0.959712,0.958708,0.958863
4,0.0255,0.138647,0.9623,0.962984,0.962367,0.962557
5,0.0132,0.131235,0.9678,0.968373,0.967896,0.968002
6,0.0028,0.131631,0.9719,0.972222,0.97198,0.972065
7,0.0007,0.130106,0.9712,0.971347,0.971334,0.971322


[I 2025-03-29 20:23:57,846] Trial 34 finished with value: 0.9713224748941587 and parameters: {'learning_rate': 0.00019066411536696978, 'weight_decay': 0.009000000000000001, 'warmup_steps': 26}. Best is trial 22 with value: 0.9733827228355143.


Trial 35 with params: {'learning_rate': 0.00017057009867124738, 'weight_decay': 0.007, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4481,0.228622,0.9261,0.933495,0.926568,0.926627
2,0.1031,0.146528,0.9518,0.952248,0.952089,0.95171
3,0.0542,0.136457,0.9589,0.959592,0.95911,0.959087
4,0.0268,0.137977,0.9611,0.961338,0.961399,0.961165


[I 2025-03-29 20:29:31,614] Trial 35 pruned. 


Trial 36 with params: {'learning_rate': 0.004049761177508626, 'weight_decay': 0.006, 'warmup_steps': 3}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.3119,2.435007,0.1599,0.053823,0.158469,0.061093
2,2.3176,2.315834,0.0905,0.033091,0.090894,0.042763
3,2.299,2.30991,0.1012,0.083057,0.10163,0.06812
4,2.3324,2.322517,0.0979,0.019637,0.098441,0.032539


[I 2025-03-29 20:35:03,961] Trial 36 pruned. 


Trial 37 with params: {'learning_rate': 9.286230673587775e-05, 'weight_decay': 0.01, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4787,0.195805,0.9343,0.941442,0.934547,0.934903
2,0.095,0.127605,0.959,0.95975,0.95915,0.959204
3,0.0416,0.13322,0.9615,0.963116,0.961612,0.961845
4,0.0167,0.13403,0.9668,0.967241,0.966971,0.966983


[I 2025-03-29 20:40:35,724] Trial 37 pruned. 


Trial 38 with params: {'learning_rate': 0.00018692749398230822, 'weight_decay': 0.007, 'warmup_steps': 31}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4533,0.222554,0.9253,0.93549,0.925648,0.926935
2,0.1098,0.140472,0.9535,0.954715,0.953717,0.95356
3,0.0503,0.125461,0.9636,0.963826,0.963718,0.963694
4,0.0258,0.136012,0.9661,0.96655,0.966363,0.966162


[I 2025-03-29 20:46:06,977] Trial 38 pruned. 


Trial 39 with params: {'learning_rate': 0.0010475348879951107, 'weight_decay': 0.009000000000000001, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.7077,0.494804,0.8371,0.848305,0.837999,0.836959
2,0.36,0.373679,0.8716,0.880331,0.871831,0.870768
3,0.2491,0.346505,0.8842,0.89485,0.883587,0.884876
4,0.156,0.219374,0.9299,0.930334,0.930147,0.930064


[I 2025-03-29 20:51:39,469] Trial 39 pruned. 


Trial 40 with params: {'learning_rate': 0.0003364737045777045, 'weight_decay': 0.01, 'warmup_steps': 18}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.437,0.288007,0.9028,0.909155,0.903288,0.902647
2,0.1536,0.189819,0.9357,0.937956,0.9359,0.935996


[I 2025-03-29 20:54:26,308] Trial 40 pruned. 


Trial 41 with params: {'learning_rate': 0.0001456647286080767, 'weight_decay': 0.009000000000000001, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4408,0.186801,0.9389,0.94408,0.939042,0.939524
2,0.0993,0.142152,0.9517,0.95274,0.952018,0.951638
3,0.0467,0.118045,0.9641,0.964396,0.964277,0.964232
4,0.0238,0.136592,0.9648,0.965133,0.964942,0.964923
5,0.0093,0.150167,0.9679,0.968492,0.968142,0.968083
6,0.002,0.129746,0.972,0.972164,0.972223,0.972152
7,0.0005,0.130666,0.9726,0.97269,0.972788,0.972727


[I 2025-03-29 21:04:11,230] Trial 41 finished with value: 0.9727269225527291 and parameters: {'learning_rate': 0.0001456647286080767, 'weight_decay': 0.009000000000000001, 'warmup_steps': 26}. Best is trial 22 with value: 0.9733827228355143.


Trial 42 with params: {'learning_rate': 0.0001818125580572801, 'weight_decay': 0.01, 'warmup_steps': 31}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4621,0.215622,0.9295,0.933378,0.929449,0.929614
2,0.1094,0.159993,0.9466,0.949075,0.946963,0.946631


[I 2025-03-29 21:06:58,337] Trial 42 pruned. 


Trial 43 with params: {'learning_rate': 0.00017882142807170676, 'weight_decay': 0.008, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4527,0.183903,0.9398,0.942148,0.940063,0.940083
2,0.1077,0.140467,0.9535,0.954359,0.953651,0.953654
3,0.0505,0.130678,0.9613,0.962081,0.961326,0.961575
4,0.0264,0.1246,0.9649,0.965076,0.965132,0.965038
5,0.0111,0.117063,0.9707,0.971177,0.970829,0.970923
6,0.0025,0.122644,0.9732,0.973555,0.973341,0.973361
7,0.0006,0.119076,0.9734,0.973597,0.973533,0.973544


[I 2025-03-29 21:16:40,809] Trial 43 finished with value: 0.9735436774258494 and parameters: {'learning_rate': 0.00017882142807170676, 'weight_decay': 0.008, 'warmup_steps': 23}. Best is trial 43 with value: 0.9735436774258494.


Trial 44 with params: {'learning_rate': 7.012112975444019e-05, 'weight_decay': 0.0, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5314,0.190051,0.9388,0.943309,0.938922,0.939424
2,0.0921,0.115931,0.9618,0.962005,0.961981,0.96185
3,0.0392,0.110456,0.9659,0.966239,0.966034,0.966107
4,0.0174,0.1205,0.9678,0.967948,0.967883,0.967902
5,0.0062,0.139446,0.968,0.968288,0.968169,0.968151
6,0.0017,0.135429,0.969,0.969226,0.969125,0.969148
7,0.0007,0.135128,0.9693,0.96941,0.969458,0.969428


[I 2025-03-29 21:26:28,760] Trial 44 finished with value: 0.9694279207563451 and parameters: {'learning_rate': 7.012112975444019e-05, 'weight_decay': 0.0, 'warmup_steps': 24}. Best is trial 43 with value: 0.9735436774258494.


Trial 45 with params: {'learning_rate': 0.00013563560676260026, 'weight_decay': 0.008, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4725,0.155023,0.951,0.951477,0.951083,0.95115
2,0.0956,0.1339,0.9568,0.95715,0.957034,0.956867
3,0.0471,0.136746,0.9599,0.961421,0.96002,0.960337
4,0.0237,0.126859,0.9663,0.966381,0.966458,0.966373
5,0.008,0.136413,0.9685,0.96865,0.968603,0.968615
6,0.0027,0.134947,0.97,0.970251,0.970129,0.97016
7,0.0007,0.133567,0.971,0.97112,0.971163,0.971127


[I 2025-03-29 21:36:17,079] Trial 45 finished with value: 0.9711269670216645 and parameters: {'learning_rate': 0.00013563560676260026, 'weight_decay': 0.008, 'warmup_steps': 25}. Best is trial 43 with value: 0.9735436774258494.


Trial 46 with params: {'learning_rate': 0.00010750244532497942, 'weight_decay': 0.007, 'warmup_steps': 19}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4602,0.168979,0.9438,0.945788,0.943889,0.943957
2,0.0895,0.123324,0.961,0.96112,0.961155,0.961055
3,0.0408,0.116944,0.9653,0.966106,0.965374,0.96556
4,0.0182,0.136171,0.9654,0.965523,0.965564,0.965494


[I 2025-03-29 21:41:49,589] Trial 46 pruned. 


Trial 47 with params: {'learning_rate': 0.00010887451629772067, 'weight_decay': 0.005, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.473,0.209213,0.9285,0.935446,0.928765,0.928972
2,0.0928,0.12634,0.959,0.959661,0.95913,0.959288
3,0.0444,0.11384,0.9662,0.966276,0.96638,0.966264
4,0.0175,0.121459,0.968,0.968048,0.968188,0.968096
5,0.0068,0.12541,0.9713,0.97141,0.971463,0.971386
6,0.0016,0.128498,0.9726,0.972798,0.972677,0.972704
7,0.0005,0.126973,0.9722,0.972323,0.972319,0.972316


[I 2025-03-29 21:51:59,083] Trial 47 finished with value: 0.9723158718719297 and parameters: {'learning_rate': 0.00010887451629772067, 'weight_decay': 0.005, 'warmup_steps': 27}. Best is trial 43 with value: 0.9735436774258494.


Trial 48 with params: {'learning_rate': 0.00012147190692302132, 'weight_decay': 0.007, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4696,0.194579,0.9335,0.939963,0.933958,0.934283
2,0.0991,0.116843,0.9625,0.963129,0.962608,0.962688
3,0.0462,0.125989,0.9619,0.963254,0.962109,0.962094
4,0.0186,0.123807,0.9694,0.969666,0.969547,0.969552
5,0.007,0.137423,0.968,0.968142,0.968143,0.968116
6,0.0018,0.142694,0.9709,0.971183,0.970996,0.971051
7,0.0005,0.137763,0.9703,0.970454,0.970454,0.970445


[I 2025-03-29 22:01:41,232] Trial 48 finished with value: 0.9704449399761612 and parameters: {'learning_rate': 0.00012147190692302132, 'weight_decay': 0.007, 'warmup_steps': 24}. Best is trial 43 with value: 0.9735436774258494.


Trial 49 with params: {'learning_rate': 0.00020338031147463888, 'weight_decay': 0.009000000000000001, 'warmup_steps': 19}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4252,0.193183,0.9378,0.941039,0.937924,0.938353
2,0.11,0.145897,0.9502,0.951443,0.950415,0.95037


[I 2025-03-29 22:04:26,329] Trial 49 pruned. 


Trial 50 with params: {'learning_rate': 0.0027800474932883233, 'weight_decay': 0.0, 'warmup_steps': 12}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.8725,2.200063,0.1886,0.220976,0.187333,0.151398
2,2.0761,2.243609,0.1518,0.180142,0.151297,0.111972


[I 2025-03-29 22:07:12,889] Trial 50 pruned. 


Trial 51 with params: {'learning_rate': 0.0002326906365354164, 'weight_decay': 0.005, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4434,0.208238,0.9329,0.937489,0.932971,0.933748
2,0.1229,0.156702,0.9493,0.95135,0.94967,0.949365
3,0.0619,0.132097,0.9606,0.960954,0.960639,0.96072
4,0.0336,0.129805,0.9631,0.963476,0.963283,0.963316


[I 2025-03-29 22:12:44,343] Trial 51 pruned. 


Trial 52 with params: {'learning_rate': 6.1005881023266626e-05, 'weight_decay': 0.007, 'warmup_steps': 7}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5168,0.164689,0.9483,0.949997,0.948293,0.948674
2,0.0957,0.125215,0.9592,0.95967,0.959413,0.959368
3,0.037,0.128637,0.9629,0.962982,0.963079,0.962942
4,0.0157,0.139914,0.9651,0.965234,0.965256,0.96523


[I 2025-03-29 22:18:17,269] Trial 52 pruned. 


Trial 53 with params: {'learning_rate': 9.335977849844236e-05, 'weight_decay': 0.006, 'warmup_steps': 30}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5156,0.164909,0.9456,0.947876,0.945861,0.945872
2,0.0951,0.12318,0.9592,0.959896,0.95925,0.959379
3,0.0411,0.124694,0.9635,0.963677,0.963621,0.963585
4,0.0182,0.128578,0.9681,0.968284,0.968204,0.96816
5,0.0052,0.140075,0.9692,0.969279,0.969306,0.969253
6,0.0016,0.140276,0.9705,0.970659,0.970568,0.970605
7,0.0005,0.139486,0.9711,0.971168,0.971163,0.971161


[I 2025-03-29 22:28:00,649] Trial 53 finished with value: 0.971161361651063 and parameters: {'learning_rate': 9.335977849844236e-05, 'weight_decay': 0.006, 'warmup_steps': 30}. Best is trial 43 with value: 0.9735436774258494.


Trial 54 with params: {'learning_rate': 0.000403916017640712, 'weight_decay': 0.0, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4707,0.291617,0.9043,0.910795,0.904971,0.904341
2,0.1649,0.218964,0.9268,0.928785,0.927255,0.926721


[I 2025-03-29 22:30:46,989] Trial 54 pruned. 


Trial 55 with params: {'learning_rate': 0.0002606336830980987, 'weight_decay': 0.0, 'warmup_steps': 6}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3941,0.257073,0.9131,0.921715,0.913562,0.913187
2,0.1264,0.147241,0.9522,0.952736,0.952555,0.952348
3,0.0674,0.154349,0.9544,0.955422,0.954847,0.954531
4,0.0373,0.14077,0.9605,0.960801,0.960808,0.960574


[I 2025-03-29 22:36:18,400] Trial 55 pruned. 


Trial 56 with params: {'learning_rate': 6.358026237171493e-05, 'weight_decay': 0.005, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5657,0.175585,0.9429,0.945054,0.943064,0.943401
2,0.0977,0.122658,0.9618,0.962183,0.961959,0.961962
3,0.0427,0.119655,0.9662,0.966514,0.966362,0.966287
4,0.0148,0.130585,0.967,0.967152,0.967165,0.967099
5,0.0048,0.135921,0.9694,0.96959,0.969537,0.969544
6,0.0013,0.140654,0.969,0.969291,0.969089,0.969161
7,0.0007,0.14093,0.9688,0.968911,0.968939,0.96891


[I 2025-03-29 22:46:01,518] Trial 56 finished with value: 0.968909508253556 and parameters: {'learning_rate': 6.358026237171493e-05, 'weight_decay': 0.005, 'warmup_steps': 26}. Best is trial 43 with value: 0.9735436774258494.


Trial 57 with params: {'learning_rate': 0.00011887515276957258, 'weight_decay': 0.008, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4756,0.175363,0.9418,0.945091,0.942009,0.94229
2,0.0911,0.114684,0.9615,0.961794,0.96175,0.961617
3,0.0414,0.118665,0.9638,0.963768,0.964035,0.963815
4,0.0181,0.119333,0.9678,0.968153,0.967963,0.967923
5,0.0066,0.124018,0.971,0.971074,0.971121,0.971055
6,0.0014,0.128777,0.9718,0.971978,0.971863,0.971896
7,0.0004,0.12406,0.9739,0.973988,0.973991,0.973977


[I 2025-03-29 22:55:42,088] Trial 57 finished with value: 0.9739767587933696 and parameters: {'learning_rate': 0.00011887515276957258, 'weight_decay': 0.008, 'warmup_steps': 25}. Best is trial 57 with value: 0.9739767587933696.


Trial 58 with params: {'learning_rate': 7.081459585768469e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 22}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5166,0.157713,0.9484,0.949999,0.948519,0.948801
2,0.0896,0.135183,0.9543,0.955158,0.954508,0.954464
3,0.0387,0.119601,0.9634,0.963434,0.96362,0.96346
4,0.0142,0.14231,0.9674,0.967523,0.967569,0.967505
5,0.0045,0.136179,0.971,0.971363,0.971103,0.971162
6,0.0017,0.142174,0.9704,0.970628,0.970506,0.970554
7,0.0006,0.140919,0.9703,0.970452,0.970437,0.970436


[I 2025-03-29 23:05:22,242] Trial 58 finished with value: 0.970435554108661 and parameters: {'learning_rate': 7.081459585768469e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 22}. Best is trial 57 with value: 0.9739767587933696.


Trial 59 with params: {'learning_rate': 0.00012028135740743376, 'weight_decay': 0.008, 'warmup_steps': 29}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4666,0.172004,0.9422,0.944974,0.942428,0.942612
2,0.0918,0.123715,0.9594,0.95977,0.95956,0.959529
3,0.0458,0.114339,0.967,0.967412,0.967211,0.967137
4,0.0198,0.130145,0.966,0.966401,0.966112,0.966088


[I 2025-03-29 23:10:54,268] Trial 59 pruned. 


Trial 60 with params: {'learning_rate': 0.0011700191952905836, 'weight_decay': 0.003, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.7844,0.606262,0.7978,0.814055,0.798145,0.798925
2,0.4278,0.394319,0.8686,0.875057,0.868988,0.868295
3,0.2832,0.37096,0.8783,0.883766,0.878477,0.877122
4,0.1838,0.2482,0.9198,0.922083,0.919851,0.920471


[I 2025-03-29 23:16:26,335] Trial 60 pruned. 


Trial 61 with params: {'learning_rate': 0.00015444204635882978, 'weight_decay': 0.005, 'warmup_steps': 30}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4578,0.185739,0.9382,0.940062,0.938438,0.938308
2,0.1038,0.137587,0.9553,0.955526,0.955515,0.955384
3,0.0525,0.131201,0.9616,0.961605,0.961799,0.961626
4,0.0248,0.119548,0.9677,0.967785,0.967862,0.96781
5,0.0104,0.122705,0.9692,0.969304,0.969377,0.969314
6,0.0025,0.121753,0.9732,0.973276,0.973361,0.973302
7,0.0007,0.12095,0.9719,0.971907,0.972101,0.971982


[I 2025-03-29 23:26:23,149] Trial 61 finished with value: 0.9719819824160216 and parameters: {'learning_rate': 0.00015444204635882978, 'weight_decay': 0.005, 'warmup_steps': 30}. Best is trial 57 with value: 0.9739767587933696.


Trial 62 with params: {'learning_rate': 0.00012570938701673154, 'weight_decay': 0.007, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4502,0.174311,0.9423,0.94611,0.942293,0.942867
2,0.0914,0.152564,0.9512,0.953606,0.95141,0.951516
3,0.0424,0.127247,0.9629,0.963182,0.963154,0.963019
4,0.0182,0.119976,0.9679,0.96811,0.96805,0.968056
5,0.0072,0.125787,0.9707,0.970815,0.970829,0.970797
6,0.0019,0.123235,0.9733,0.973433,0.97343,0.973416
7,0.0005,0.122741,0.9732,0.973319,0.973316,0.973314


[I 2025-03-29 23:36:36,046] Trial 62 finished with value: 0.9733137782981413 and parameters: {'learning_rate': 0.00012570938701673154, 'weight_decay': 0.007, 'warmup_steps': 23}. Best is trial 57 with value: 0.9739767587933696.


Trial 63 with params: {'learning_rate': 8.738951618852924e-05, 'weight_decay': 0.007, 'warmup_steps': 22}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4901,0.1585,0.9469,0.949671,0.947015,0.947436
2,0.0919,0.134571,0.9554,0.956417,0.95564,0.955632
3,0.0406,0.126198,0.9628,0.963202,0.962866,0.962961
4,0.0161,0.131945,0.9655,0.96577,0.965672,0.965625


[I 2025-03-29 23:42:21,340] Trial 63 pruned. 


Trial 64 with params: {'learning_rate': 0.0014740970021661379, 'weight_decay': 0.005, 'warmup_steps': 13}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.8538,0.722874,0.7606,0.775777,0.761405,0.757537
2,0.5142,0.415468,0.8566,0.86794,0.856551,0.858756
3,0.3461,0.413751,0.8581,0.870598,0.858578,0.857625
4,0.2339,0.27061,0.9089,0.915556,0.908956,0.910306


[I 2025-03-29 23:47:55,783] Trial 64 pruned. 


Trial 65 with params: {'learning_rate': 0.0003061126129336506, 'weight_decay': 0.004, 'warmup_steps': 10}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4077,0.237446,0.923,0.927004,0.923444,0.923557
2,0.139,0.192619,0.9423,0.946188,0.942542,0.942608


[I 2025-03-29 23:50:43,302] Trial 65 pruned. 


Trial 66 with params: {'learning_rate': 0.00019251840253040213, 'weight_decay': 0.007, 'warmup_steps': 15}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4145,0.211831,0.9306,0.935297,0.930927,0.93115
2,0.1066,0.149509,0.9504,0.951337,0.95065,0.950608
3,0.0534,0.133214,0.9605,0.961084,0.960595,0.960703
4,0.0268,0.143382,0.962,0.962287,0.962153,0.962158


[I 2025-03-29 23:56:48,051] Trial 66 pruned. 


Trial 67 with params: {'learning_rate': 9.777098843358782e-05, 'weight_decay': 0.007, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.484,0.18063,0.9393,0.942906,0.939729,0.9396
2,0.0887,0.113693,0.9616,0.961775,0.961758,0.961648
3,0.0388,0.115217,0.9657,0.965841,0.965846,0.965788
4,0.0169,0.122903,0.9698,0.970278,0.969851,0.969986
5,0.0052,0.123374,0.9719,0.972085,0.972034,0.972037
6,0.0018,0.125322,0.9739,0.974106,0.974027,0.974044
7,0.0006,0.125152,0.9745,0.974653,0.974612,0.974629


[I 2025-03-30 00:06:37,997] Trial 67 finished with value: 0.9746290627725797 and parameters: {'learning_rate': 9.777098843358782e-05, 'weight_decay': 0.007, 'warmup_steps': 24}. Best is trial 67 with value: 0.9746290627725797.


Trial 68 with params: {'learning_rate': 0.00011106805870942286, 'weight_decay': 0.006, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4624,0.169977,0.9447,0.946935,0.944822,0.945205
2,0.0959,0.143071,0.9526,0.95406,0.952775,0.95289


[I 2025-03-30 00:09:24,500] Trial 68 pruned. 


Trial 69 with params: {'learning_rate': 0.00026379078208589916, 'weight_decay': 0.007, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4441,0.222593,0.9265,0.928663,0.927032,0.926341
2,0.1334,0.159377,0.9481,0.948861,0.948333,0.948255
3,0.0736,0.130405,0.9594,0.959929,0.959591,0.959609
4,0.0385,0.128816,0.9635,0.963646,0.963622,0.963592


[I 2025-03-30 00:14:59,569] Trial 69 pruned. 


Trial 70 with params: {'learning_rate': 5.1939313310282055e-05, 'weight_decay': 0.008, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5933,0.170069,0.9475,0.948246,0.947505,0.947741
2,0.1025,0.130518,0.9586,0.958863,0.958806,0.958685
3,0.044,0.131537,0.961,0.961754,0.961118,0.961203
4,0.0172,0.138142,0.9638,0.96396,0.963907,0.963899
5,0.0059,0.142225,0.9666,0.96677,0.966693,0.966708
6,0.0019,0.154488,0.9657,0.965878,0.965794,0.965821
7,0.0009,0.157517,0.9671,0.967139,0.967213,0.967152


[I 2025-03-30 00:24:43,523] Trial 70 finished with value: 0.9671521366228003 and parameters: {'learning_rate': 5.1939313310282055e-05, 'weight_decay': 0.008, 'warmup_steps': 25}. Best is trial 67 with value: 0.9746290627725797.


Trial 71 with params: {'learning_rate': 0.00014524706562936044, 'weight_decay': 0.008, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4472,0.193107,0.9371,0.943006,0.937234,0.937835
2,0.0985,0.129569,0.9569,0.957309,0.957132,0.957059
3,0.0469,0.120871,0.9645,0.96482,0.964678,0.964671
4,0.0214,0.127811,0.9673,0.967792,0.967473,0.967487
5,0.0078,0.141804,0.9683,0.968597,0.968436,0.968397
6,0.0017,0.126927,0.9728,0.972941,0.9729,0.972906
7,0.0006,0.128563,0.9722,0.972338,0.972329,0.972319


[I 2025-03-30 00:34:37,383] Trial 71 finished with value: 0.9723193002334269 and parameters: {'learning_rate': 0.00014524706562936044, 'weight_decay': 0.008, 'warmup_steps': 26}. Best is trial 67 with value: 0.9746290627725797.


Trial 72 with params: {'learning_rate': 5.507864621388507e-05, 'weight_decay': 0.006, 'warmup_steps': 17}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5677,0.165092,0.9479,0.948916,0.948123,0.947962
2,0.0989,0.121365,0.9602,0.960538,0.96033,0.960381
3,0.0392,0.126578,0.9638,0.964132,0.963942,0.963888
4,0.0141,0.133393,0.9665,0.966559,0.966659,0.966587
5,0.0055,0.139647,0.9685,0.968777,0.968597,0.968638
6,0.0018,0.142426,0.9692,0.969382,0.969321,0.969346
7,0.0009,0.143794,0.9702,0.970323,0.970353,0.970325


[I 2025-03-30 00:44:32,600] Trial 72 finished with value: 0.9703253748951186 and parameters: {'learning_rate': 5.507864621388507e-05, 'weight_decay': 0.006, 'warmup_steps': 17}. Best is trial 67 with value: 0.9746290627725797.


Trial 73 with params: {'learning_rate': 0.00016046280027725454, 'weight_decay': 0.008, 'warmup_steps': 20}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4351,0.21221,0.9289,0.934863,0.929134,0.929615
2,0.1011,0.142241,0.9507,0.952456,0.95095,0.950924


[I 2025-03-30 00:47:20,179] Trial 73 pruned. 


Trial 74 with params: {'learning_rate': 9.37652748553604e-05, 'weight_decay': 0.008, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4919,0.160033,0.9499,0.951871,0.950038,0.950202
2,0.0912,0.114528,0.9623,0.963014,0.962444,0.962569
3,0.0396,0.124348,0.9639,0.964135,0.964013,0.96405
4,0.0172,0.138689,0.9656,0.965844,0.965663,0.965732
5,0.0062,0.148131,0.9677,0.96789,0.967817,0.967806
6,0.0017,0.143545,0.9686,0.968725,0.968748,0.968729
7,0.0005,0.143692,0.9705,0.970645,0.970633,0.970633


[I 2025-03-30 00:57:09,320] Trial 74 finished with value: 0.9706326901919524 and parameters: {'learning_rate': 9.37652748553604e-05, 'weight_decay': 0.008, 'warmup_steps': 26}. Best is trial 67 with value: 0.9746290627725797.


Trial 75 with params: {'learning_rate': 0.0001248004164266306, 'weight_decay': 0.007, 'warmup_steps': 28}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4823,0.173416,0.9453,0.947302,0.945634,0.945508
2,0.1069,0.133538,0.9555,0.956817,0.955664,0.955781
3,0.0484,0.126762,0.9606,0.961098,0.960756,0.960817
4,0.0224,0.126711,0.9671,0.967185,0.967207,0.967142
5,0.0073,0.12235,0.9713,0.971565,0.971462,0.971426
6,0.0014,0.124575,0.9738,0.973994,0.973922,0.973939
7,0.0006,0.125869,0.9739,0.973992,0.974063,0.974013


[I 2025-03-30 01:06:56,130] Trial 75 finished with value: 0.9740127201248043 and parameters: {'learning_rate': 0.0001248004164266306, 'weight_decay': 0.007, 'warmup_steps': 28}. Best is trial 67 with value: 0.9746290627725797.


Trial 76 with params: {'learning_rate': 0.0001696458210351118, 'weight_decay': 0.007, 'warmup_steps': 29}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4505,0.165953,0.9476,0.949679,0.947445,0.947733
2,0.1046,0.149689,0.9511,0.952606,0.951242,0.951238
3,0.0535,0.137503,0.9571,0.957555,0.957189,0.957243
4,0.024,0.135038,0.9646,0.965266,0.964684,0.964856
5,0.0113,0.132872,0.97,0.970516,0.970166,0.970199
6,0.0027,0.125815,0.9705,0.970777,0.970603,0.970653
7,0.0006,0.121762,0.9708,0.97106,0.970909,0.970953


[I 2025-03-30 01:16:56,980] Trial 76 finished with value: 0.9709527966468403 and parameters: {'learning_rate': 0.0001696458210351118, 'weight_decay': 0.007, 'warmup_steps': 29}. Best is trial 67 with value: 0.9746290627725797.


Trial 77 with params: {'learning_rate': 0.00014335654906866193, 'weight_decay': 0.006, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4612,0.170037,0.9432,0.94672,0.94346,0.943869
2,0.0969,0.12812,0.9575,0.958406,0.957669,0.957737
3,0.0446,0.132042,0.9623,0.963247,0.962293,0.962631
4,0.0232,0.127355,0.9658,0.966171,0.965945,0.965984


[I 2025-03-30 01:22:30,368] Trial 77 pruned. 


Trial 78 with params: {'learning_rate': 0.00010454672389277825, 'weight_decay': 0.007, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4693,0.162285,0.9465,0.948813,0.946548,0.946716
2,0.088,0.126648,0.9581,0.958615,0.958342,0.958189
3,0.0412,0.12447,0.9624,0.962739,0.962549,0.962576
4,0.0178,0.119486,0.9692,0.969359,0.969421,0.969336
5,0.0066,0.126724,0.9709,0.971115,0.971057,0.971008
6,0.0016,0.123238,0.9731,0.9732,0.973232,0.973203
7,0.0006,0.122065,0.9735,0.973557,0.973628,0.973585


[I 2025-03-30 01:32:16,511] Trial 78 finished with value: 0.9735846964699236 and parameters: {'learning_rate': 0.00010454672389277825, 'weight_decay': 0.007, 'warmup_steps': 25}. Best is trial 67 with value: 0.9746290627725797.


Trial 79 with params: {'learning_rate': 0.0002079601235835947, 'weight_decay': 0.006, 'warmup_steps': 21}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.431,0.192682,0.9349,0.937382,0.934886,0.935424
2,0.1116,0.156186,0.9485,0.949829,0.948703,0.948384
3,0.0601,0.125112,0.9614,0.961804,0.961419,0.961531
4,0.03,0.130207,0.9635,0.9639,0.96367,0.963682


[I 2025-03-30 01:37:51,349] Trial 79 pruned. 


Trial 80 with params: {'learning_rate': 8.27169910109526e-05, 'weight_decay': 0.007, 'warmup_steps': 21}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4957,0.165876,0.9455,0.947117,0.945629,0.945811
2,0.0895,0.1286,0.9596,0.960093,0.959769,0.959731
3,0.0365,0.119121,0.9645,0.964592,0.964664,0.964571
4,0.0154,0.132444,0.9656,0.965782,0.965809,0.965715


[I 2025-03-30 01:43:23,766] Trial 80 pruned. 


Trial 81 with params: {'learning_rate': 8.194846030220038e-05, 'weight_decay': 0.007, 'warmup_steps': 30}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5238,0.172659,0.9432,0.945275,0.943411,0.943322
2,0.0901,0.124284,0.9596,0.960043,0.95988,0.959702
3,0.0387,0.121594,0.9641,0.964388,0.964264,0.964217
4,0.0157,0.130117,0.9674,0.967553,0.967521,0.967515
5,0.0051,0.139408,0.9682,0.968445,0.968262,0.968287
6,0.0017,0.139534,0.9707,0.970852,0.970789,0.970795
7,0.0007,0.138082,0.9705,0.970617,0.970608,0.970588


[I 2025-03-30 01:53:29,830] Trial 81 finished with value: 0.970587624728708 and parameters: {'learning_rate': 8.194846030220038e-05, 'weight_decay': 0.007, 'warmup_steps': 30}. Best is trial 67 with value: 0.9746290627725797.


Trial 82 with params: {'learning_rate': 0.0001986188637046071, 'weight_decay': 0.007, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4427,0.210421,0.9298,0.934271,0.929868,0.930368
2,0.1071,0.151278,0.9488,0.951315,0.949032,0.949219


[I 2025-03-30 01:56:17,738] Trial 82 pruned. 


Trial 83 with params: {'learning_rate': 5.380559807793641e-05, 'weight_decay': 0.007, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5949,0.183758,0.9409,0.943337,0.941075,0.941179
2,0.1051,0.122399,0.9587,0.959288,0.958904,0.958895
3,0.0426,0.123192,0.9631,0.963355,0.963218,0.963267
4,0.015,0.133967,0.9652,0.965336,0.965435,0.965267
5,0.0056,0.135403,0.9686,0.968714,0.96877,0.968685
6,0.0018,0.13866,0.9706,0.970894,0.970696,0.970768
7,0.0006,0.138683,0.9699,0.970049,0.97004,0.970034


[I 2025-03-30 03:06:05,544] Trial 83 finished with value: 0.9700344276167805 and parameters: {'learning_rate': 5.380559807793641e-05, 'weight_decay': 0.007, 'warmup_steps': 26}. Best is trial 67 with value: 0.9746290627725797.


Trial 84 with params: {'learning_rate': 0.00017677589724360998, 'weight_decay': 0.007, 'warmup_steps': 29}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4453,0.203578,0.9333,0.939881,0.93328,0.934452
2,0.1053,0.155676,0.9494,0.949931,0.949671,0.949516


[I 2025-03-30 03:08:53,037] Trial 84 pruned. 
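
Because the search output is long, it can help to reduce the Optuna log lines to one record per trial. The sketch below uses only the standard library and assumes the line format seen in this output ("Trial N finished with value: ..." and "Trial N pruned."); the names `LOG`, `PATTERN` and `summarize` are hypothetical helpers, not part of the notebook or of Optuna.

```python
import re

# Two log lines copied from the output above (trailing whitespace dropped).
LOG = """\
[I 2025-03-30 03:06:05,544] Trial 83 finished with value: 0.9700344276167805 and parameters: {'learning_rate': 5.380559807793641e-05, 'weight_decay': 0.007, 'warmup_steps': 26}. Best is trial 67 with value: 0.9746290627725797.
[I 2025-03-30 03:08:53,037] Trial 84 pruned.
"""

# Matches either a finished trial (capturing its value) or a pruned trial.
PATTERN = re.compile(
    r"Trial (\d+) (?:finished with value: ([0-9.eE+-]+)|(pruned))"
)


def summarize(log):
    """Return a list of (trial_number, status, value) tuples; value is None if pruned."""
    results = []
    for line in log.splitlines():
        match = PATTERN.search(line)
        if match is None:
            continue  # not an Optuna trial status line
        number = int(match.group(1))
        if match.group(3):
            results.append((number, "pruned", None))
        else:
            results.append((number, "finished", float(match.group(2))))
    return results


for number, status, value in summarize(LOG):
    print(number, status, value)
```

When the study object is still available in memory, iterating over `study.trials` gives the same information without parsing text; the parser is only meant for saved output such as this one.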


Trial 85 with params: {'learning_rate': 0.00011839895364819732, 'weight_decay': 0.008, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4669,0.182481,0.9414,0.944823,0.941621,0.94197
2,0.0894,0.135924,0.9571,0.957336,0.957382,0.957001
3,0.0414,0.114988,0.9656,0.965838,0.965772,0.96578
4,0.0187,0.122173,0.9687,0.968832,0.96886,0.968821
5,0.0065,0.12917,0.9701,0.970231,0.970276,0.970194
6,0.0018,0.126243,0.9709,0.971052,0.971007,0.971012
7,0.0007,0.124432,0.972,0.972103,0.972158,0.972116


[I 2025-03-30 03:18:43,788] Trial 85 finished with value: 0.9721158721004268 and parameters: {'learning_rate': 0.00011839895364819732, 'weight_decay': 0.008, 'warmup_steps': 25}. Best is trial 67 with value: 0.9746290627725797.


Trial 86 with params: {'learning_rate': 0.0002597113179487162, 'weight_decay': 0.01, 'warmup_steps': 6}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3861,0.255988,0.9183,0.924961,0.918852,0.9187
2,0.1281,0.153887,0.9459,0.948216,0.946154,0.946101


[I 2025-03-30 03:21:30,384] Trial 86 pruned. 


Trial 87 with params: {'learning_rate': 8.810867924200206e-05, 'weight_decay': 0.006, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5046,0.158865,0.9471,0.948808,0.947287,0.947466
2,0.0894,0.115137,0.9633,0.963811,0.963436,0.963496
3,0.0399,0.124236,0.9641,0.965033,0.964235,0.96428
4,0.0165,0.119988,0.9697,0.969858,0.969838,0.969794
5,0.0052,0.13318,0.9706,0.97087,0.970734,0.970732
6,0.0014,0.130003,0.9732,0.973439,0.973266,0.973339
7,0.0005,0.132448,0.973,0.973143,0.973104,0.973117


[I 2025-03-30 03:31:12,459] Trial 87 finished with value: 0.9731167875703608 and parameters: {'learning_rate': 8.810867924200206e-05, 'weight_decay': 0.006, 'warmup_steps': 26}. Best is trial 67 with value: 0.9746290627725797.


Trial 88 with params: {'learning_rate': 0.00010532808384570563, 'weight_decay': 0.003, 'warmup_steps': 18}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4537,0.161396,0.9478,0.950757,0.947878,0.948341
2,0.0904,0.116728,0.9613,0.961792,0.961423,0.961526
3,0.0394,0.117158,0.9653,0.965646,0.965448,0.965382
4,0.0171,0.12592,0.9682,0.968345,0.968421,0.968288
5,0.0066,0.128168,0.9685,0.968847,0.96859,0.968684
6,0.0018,0.126169,0.9716,0.971886,0.971687,0.971749
7,0.0007,0.124042,0.9724,0.972574,0.97252,0.972522


[I 2025-03-30 03:41:24,416] Trial 88 finished with value: 0.9725216660618647 and parameters: {'learning_rate': 0.00010532808384570563, 'weight_decay': 0.003, 'warmup_steps': 18}. Best is trial 67 with value: 0.9746290627725797.


Trial 89 with params: {'learning_rate': 9.02154822352379e-05, 'weight_decay': 0.006, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4959,0.15244,0.9497,0.951645,0.949748,0.950114
2,0.094,0.120436,0.9608,0.961247,0.960891,0.960993
3,0.0432,0.123506,0.9659,0.966203,0.966028,0.966023
4,0.0182,0.129197,0.9671,0.967468,0.96727,0.967214
5,0.0062,0.136298,0.9679,0.968166,0.968022,0.968022
6,0.0014,0.130983,0.9715,0.97169,0.971674,0.971654
7,0.0005,0.128676,0.9709,0.971079,0.971025,0.971044


[I 2025-03-30 03:51:06,458] Trial 89 finished with value: 0.9710441023231715 and parameters: {'learning_rate': 9.02154822352379e-05, 'weight_decay': 0.006, 'warmup_steps': 24}. Best is trial 67 with value: 0.9746290627725797.


Trial 90 with params: {'learning_rate': 0.0005558154008655438, 'weight_decay': 0.006, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5202,0.31641,0.8956,0.908271,0.895688,0.896819
2,0.2148,0.236014,0.9192,0.921985,0.919682,0.918001


[I 2025-03-30 03:53:52,329] Trial 90 pruned. 


Trial 91 with params: {'learning_rate': 0.00010109337133292047, 'weight_decay': 0.006, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4693,0.181072,0.9379,0.94362,0.937973,0.938952
2,0.0882,0.136194,0.9568,0.957225,0.95706,0.956862
3,0.04,0.116333,0.964,0.96433,0.964152,0.964192
4,0.0155,0.126719,0.9678,0.967956,0.967943,0.967907
5,0.0058,0.133262,0.9694,0.969698,0.969508,0.969534
6,0.0016,0.129357,0.9714,0.971797,0.971458,0.971572
7,0.0005,0.128628,0.9728,0.972969,0.972927,0.972935


[I 2025-03-30 04:03:36,367] Trial 91 finished with value: 0.9729347451315938 and parameters: {'learning_rate': 0.00010109337133292047, 'weight_decay': 0.006, 'warmup_steps': 26}. Best is trial 67 with value: 0.9746290627725797.


Trial 92 with params: {'learning_rate': 6.965676100182774e-05, 'weight_decay': 0.006, 'warmup_steps': 29}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5394,0.155437,0.95,0.951823,0.950035,0.950433
2,0.0931,0.119745,0.9615,0.96188,0.961575,0.961651
3,0.0422,0.113075,0.9665,0.966632,0.966612,0.966589
4,0.0163,0.131795,0.9672,0.967514,0.967297,0.967366
5,0.0063,0.141808,0.9698,0.969959,0.969887,0.969885
6,0.0024,0.140139,0.9705,0.970658,0.970597,0.970617
7,0.0008,0.142113,0.9697,0.969754,0.969819,0.969762


[I 2025-03-30 04:13:21,778] Trial 92 finished with value: 0.9697620985698204 and parameters: {'learning_rate': 6.965676100182774e-05, 'weight_decay': 0.006, 'warmup_steps': 29}. Best is trial 67 with value: 0.9746290627725797.


Trial 93 with params: {'learning_rate': 0.00014386945094024955, 'weight_decay': 0.007, 'warmup_steps': 22}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4384,0.196651,0.9347,0.939447,0.934881,0.935347
2,0.0996,0.143961,0.9529,0.954151,0.952991,0.953088
3,0.0456,0.120739,0.9648,0.964979,0.96502,0.964954
4,0.0212,0.137572,0.9647,0.965468,0.964909,0.964779


[I 2025-03-30 04:18:54,750] Trial 93 pruned. 


Trial 94 with params: {'learning_rate': 0.0002009904356943865, 'weight_decay': 0.008, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4635,0.192116,0.9378,0.940142,0.937821,0.938123
2,0.1112,0.168058,0.9442,0.945296,0.944647,0.943987


[I 2025-03-30 04:21:51,830] Trial 94 pruned. 


Trial 95 with params: {'learning_rate': 8.952659244058166e-05, 'weight_decay': 0.007, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4988,0.15468,0.9486,0.951116,0.948765,0.948983
2,0.0895,0.131008,0.9569,0.957209,0.957039,0.95691
3,0.0393,0.126802,0.9639,0.964696,0.96415,0.964102
4,0.0163,0.139509,0.9656,0.966161,0.965605,0.965766
5,0.0062,0.136443,0.9698,0.969988,0.96987,0.969883
6,0.0018,0.131995,0.9729,0.973196,0.97297,0.973059
7,0.0005,0.135058,0.972,0.972105,0.972107,0.972097


[I 2025-03-30 04:31:45,530] Trial 95 finished with value: 0.9720970570488252 and parameters: {'learning_rate': 8.952659244058166e-05, 'weight_decay': 0.007, 'warmup_steps': 27}. Best is trial 67 with value: 0.9746290627725797.


Trial 96 with params: {'learning_rate': 0.0001758223185828369, 'weight_decay': 0.009000000000000001, 'warmup_steps': 22}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.436,0.195483,0.9345,0.936979,0.934772,0.934788
2,0.1045,0.1239,0.9565,0.957509,0.956571,0.956833
3,0.0535,0.116707,0.9637,0.963962,0.963817,0.963854
4,0.0284,0.13338,0.9628,0.963219,0.962982,0.962993


[I 2025-03-30 04:37:18,639] Trial 96 pruned. 


Trial 97 with params: {'learning_rate': 0.00014435651061544283, 'weight_decay': 0.004, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4547,0.151858,0.9505,0.9511,0.950742,0.950577
2,0.096,0.143365,0.9533,0.954292,0.953537,0.953421
3,0.0473,0.137357,0.9564,0.956996,0.956528,0.95659
4,0.0208,0.134886,0.9654,0.965703,0.965533,0.965535
5,0.0085,0.124345,0.9698,0.969993,0.969932,0.969928
6,0.0016,0.129113,0.972,0.972136,0.972115,0.972087
7,0.0004,0.125463,0.9722,0.972279,0.972338,0.972292


[I 2025-03-30 04:47:13,028] Trial 97 finished with value: 0.9722917827979171 and parameters: {'learning_rate': 0.00014435651061544283, 'weight_decay': 0.004, 'warmup_steps': 27}. Best is trial 67 with value: 0.9746290627725797.


Trial 98 with params: {'learning_rate': 0.0035054904723296637, 'weight_decay': 0.009000000000000001, 'warmup_steps': 0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,2.4411,2.315875,0.1016,0.01016,0.1,0.018446
2,2.3358,2.307172,0.0977,0.00977,0.1,0.017801
3,2.3065,2.307979,0.0997,0.00997,0.1,0.018132
4,2.306,2.303693,0.1022,0.01022,0.1,0.018545


[I 2025-03-30 04:52:45,391] Trial 98 pruned. 


Trial 99 with params: {'learning_rate': 8.090843589470582e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 30}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5192,0.191693,0.9357,0.942236,0.935953,0.936642
2,0.0939,0.125641,0.9586,0.959039,0.958773,0.958664
3,0.0412,0.112248,0.9659,0.966181,0.965993,0.966046
4,0.016,0.12339,0.9672,0.967253,0.967396,0.967268
5,0.005,0.136233,0.9691,0.969222,0.969218,0.96919
6,0.002,0.133604,0.9716,0.971717,0.971696,0.971701
7,0.0007,0.134227,0.9716,0.971691,0.971725,0.971703


[I 2025-03-30 05:02:28,186] Trial 99 finished with value: 0.9717032287019307 and parameters: {'learning_rate': 8.090843589470582e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 30}. Best is trial 67 with value: 0.9746290627725797.


Trial 100 with params: {'learning_rate': 0.00013333557567672514, 'weight_decay': 0.006, 'warmup_steps': 32}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4703,0.184463,0.938,0.942025,0.938115,0.938742
2,0.0915,0.127593,0.9566,0.956952,0.956817,0.956721
3,0.0449,0.117571,0.9639,0.964046,0.964087,0.96404
4,0.0208,0.133898,0.9651,0.965296,0.965317,0.965217
5,0.0083,0.135006,0.9689,0.969183,0.969024,0.969036
6,0.0023,0.127831,0.9712,0.971395,0.971355,0.971371
7,0.0006,0.130229,0.972,0.972134,0.972161,0.972144


[I 2025-03-30 05:12:10,361] Trial 100 finished with value: 0.9721438763030031 and parameters: {'learning_rate': 0.00013333557567672514, 'weight_decay': 0.006, 'warmup_steps': 32}. Best is trial 67 with value: 0.9746290627725797.


Trial 101 with params: {'learning_rate': 8.457343840273425e-05, 'weight_decay': 0.005, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5227,0.175287,0.9418,0.944126,0.942023,0.942248
2,0.093,0.130851,0.9575,0.958486,0.957633,0.957711
3,0.0361,0.122858,0.9643,0.964424,0.964432,0.964398
4,0.0157,0.130294,0.9671,0.967277,0.967275,0.967196
5,0.0049,0.138564,0.9689,0.968933,0.969025,0.968957
6,0.0018,0.137228,0.9721,0.972346,0.972232,0.972235
7,0.0006,0.134674,0.9714,0.971514,0.971532,0.971508


[I 2025-03-30 05:21:52,790] Trial 101 finished with value: 0.971507647241191 and parameters: {'learning_rate': 8.457343840273425e-05, 'weight_decay': 0.005, 'warmup_steps': 23}. Best is trial 67 with value: 0.9746290627725797.


Trial 102 with params: {'learning_rate': 0.00013450126417323204, 'weight_decay': 0.007, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4523,0.166758,0.9471,0.949957,0.947201,0.947506
2,0.0951,0.142674,0.954,0.955368,0.954203,0.954179
3,0.0471,0.134535,0.9598,0.960467,0.959977,0.960046
4,0.0213,0.134079,0.966,0.966386,0.966117,0.966203
5,0.0074,0.132766,0.9696,0.96985,0.969751,0.969736
6,0.002,0.128227,0.9712,0.971467,0.971268,0.971343
7,0.0007,0.125296,0.9733,0.973404,0.973452,0.973418


[I 2025-03-30 05:31:38,637] Trial 102 finished with value: 0.9734180083133032 and parameters: {'learning_rate': 0.00013450126417323204, 'weight_decay': 0.007, 'warmup_steps': 25}. Best is trial 67 with value: 0.9746290627725797.


Trial 103 with params: {'learning_rate': 0.0001191596467352941, 'weight_decay': 0.007, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4528,0.188721,0.9365,0.941793,0.936635,0.937069
2,0.096,0.129955,0.9563,0.956817,0.956492,0.956451
3,0.0438,0.121136,0.9641,0.964253,0.964271,0.964136
4,0.0187,0.131935,0.9662,0.9664,0.966365,0.966346


[I 2025-03-30 05:37:13,158] Trial 103 pruned. 


Trial 104 with params: {'learning_rate': 0.00020340932723015692, 'weight_decay': 0.008, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4434,0.182599,0.9404,0.943583,0.940435,0.940956
2,0.1114,0.165758,0.9458,0.946534,0.946161,0.945917
3,0.0583,0.128684,0.96,0.960493,0.960175,0.960226
4,0.0301,0.132907,0.9627,0.962767,0.962881,0.962718


[I 2025-03-30 05:42:45,754] Trial 104 pruned. 


Trial 105 with params: {'learning_rate': 0.00013280021760138654, 'weight_decay': 0.007, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4455,0.166273,0.9454,0.947371,0.945513,0.945864
2,0.0969,0.113096,0.9611,0.961483,0.961291,0.961246
3,0.0477,0.120004,0.9631,0.96352,0.963244,0.963337
4,0.019,0.127062,0.9678,0.96796,0.967998,0.967915
5,0.0083,0.129998,0.969,0.969066,0.969209,0.969096
6,0.0025,0.125647,0.9708,0.970977,0.970969,0.970964
7,0.0005,0.123952,0.9726,0.972624,0.972792,0.972698


[I 2025-03-30 05:52:32,530] Trial 105 finished with value: 0.9726977797037911 and parameters: {'learning_rate': 0.00013280021760138654, 'weight_decay': 0.007, 'warmup_steps': 23}. Best is trial 67 with value: 0.9746290627725797.


Trial 106 with params: {'learning_rate': 0.0001829983417738769, 'weight_decay': 0.006, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4454,0.204333,0.9316,0.936346,0.931646,0.932153
2,0.1066,0.147652,0.9505,0.951349,0.950818,0.950712
3,0.0564,0.12977,0.9617,0.962244,0.961869,0.961952
4,0.0287,0.135011,0.9639,0.964518,0.964003,0.964168


[I 2025-03-30 05:58:06,958] Trial 106 pruned. 


Trial 107 with params: {'learning_rate': 0.00013656273349393398, 'weight_decay': 0.008, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4643,0.184143,0.9378,0.940551,0.937881,0.938058
2,0.1018,0.122665,0.9598,0.960125,0.960015,0.959929
3,0.046,0.118513,0.964,0.964471,0.964185,0.964242
4,0.0202,0.126552,0.9667,0.966969,0.966898,0.96685


[I 2025-03-30 06:03:39,936] Trial 107 pruned. 


Trial 108 with params: {'learning_rate': 0.00035245706866971816, 'weight_decay': 0.008, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4611,0.260278,0.9093,0.919522,0.909054,0.910429
2,0.161,0.213905,0.9273,0.929813,0.927883,0.927233


[I 2025-03-30 06:06:26,330] Trial 108 pruned. 


Trial 109 with params: {'learning_rate': 8.967758113070001e-05, 'weight_decay': 0.007, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4971,0.164796,0.9488,0.950274,0.948963,0.949021
2,0.0919,0.139902,0.952,0.952738,0.952196,0.952106
3,0.0401,0.12399,0.9634,0.963607,0.963576,0.963521
4,0.0163,0.123507,0.9689,0.969029,0.969037,0.969018
5,0.0044,0.135821,0.9691,0.969304,0.969253,0.969241
6,0.0015,0.132404,0.971,0.971226,0.971109,0.97116
7,0.0005,0.132394,0.973,0.973215,0.973103,0.973147


[I 2025-03-30 06:16:16,338] Trial 109 finished with value: 0.9731471393902427 and parameters: {'learning_rate': 8.967758113070001e-05, 'weight_decay': 0.007, 'warmup_steps': 25}. Best is trial 67 with value: 0.9746290627725797.


Trial 110 with params: {'learning_rate': 8.151680199857094e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 18}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4999,0.170178,0.9435,0.945741,0.943742,0.943889
2,0.0946,0.122508,0.9607,0.960957,0.9609,0.96081
3,0.0383,0.11892,0.9661,0.966055,0.966297,0.966145
4,0.0154,0.138279,0.9642,0.964755,0.964334,0.96432


[I 2025-03-30 06:21:53,230] Trial 110 pruned. 


Trial 111 with params: {'learning_rate': 5.320320856550781e-05, 'weight_decay': 0.007, 'warmup_steps': 22}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5919,0.167232,0.9477,0.949095,0.947801,0.94804
2,0.0976,0.132922,0.9569,0.957303,0.957018,0.957024
3,0.0422,0.12692,0.9625,0.96262,0.962637,0.962568
4,0.016,0.136806,0.9648,0.964908,0.964941,0.96491


[I 2025-03-30 06:27:27,605] Trial 111 pruned. 


Trial 112 with params: {'learning_rate': 9.712315393149582e-05, 'weight_decay': 0.007, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4901,0.167578,0.9442,0.9468,0.944355,0.944523
2,0.0901,0.116376,0.9617,0.961936,0.961906,0.961842
3,0.0403,0.126801,0.9636,0.964275,0.963725,0.963785
4,0.0184,0.131314,0.9679,0.968178,0.968073,0.968014
5,0.006,0.130232,0.9712,0.971257,0.971352,0.97128
6,0.0016,0.133706,0.9722,0.972338,0.972348,0.972333
7,0.0007,0.133325,0.9724,0.972443,0.972554,0.972489


[I 2025-03-30 06:37:14,385] Trial 112 finished with value: 0.9724886290579775 and parameters: {'learning_rate': 9.712315393149582e-05, 'weight_decay': 0.007, 'warmup_steps': 27}. Best is trial 67 with value: 0.9746290627725797.


Trial 113 with params: {'learning_rate': 9.531486307332714e-05, 'weight_decay': 0.007, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4951,0.175469,0.9426,0.946678,0.942685,0.943326
2,0.0917,0.116126,0.9613,0.961878,0.961368,0.961519
3,0.0382,0.123098,0.9642,0.964553,0.964336,0.964319
4,0.0169,0.132723,0.9683,0.968395,0.968462,0.968397
5,0.0057,0.132848,0.9706,0.971017,0.97068,0.970751
6,0.0014,0.129039,0.9726,0.972825,0.972685,0.972728
7,0.0005,0.128906,0.9733,0.973419,0.973396,0.973392


[I 2025-03-30 06:46:59,315] Trial 113 finished with value: 0.9733916772087708 and parameters: {'learning_rate': 9.531486307332714e-05, 'weight_decay': 0.007, 'warmup_steps': 27}. Best is trial 67 with value: 0.9746290627725797.


Trial 114 with params: {'learning_rate': 0.00011092880575613935, 'weight_decay': 0.007, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4812,0.176779,0.9418,0.946343,0.941884,0.942691
2,0.0918,0.133048,0.9555,0.955985,0.95573,0.955618
3,0.0401,0.120944,0.9654,0.965693,0.965508,0.965544
4,0.0193,0.129736,0.9659,0.966233,0.966026,0.966089
5,0.0067,0.137211,0.9701,0.970667,0.970268,0.97027
6,0.0018,0.134849,0.9719,0.97203,0.97199,0.972003
7,0.0005,0.135013,0.9716,0.971683,0.971742,0.971706


[I 2025-03-30 06:56:44,490] Trial 114 finished with value: 0.9717064882341813 and parameters: {'learning_rate': 0.00011092880575613935, 'weight_decay': 0.007, 'warmup_steps': 25}. Best is trial 67 with value: 0.9746290627725797.


Trial 115 with params: {'learning_rate': 0.0001745698577191295, 'weight_decay': 0.005, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4495,0.19754,0.9357,0.938077,0.935965,0.935656
2,0.1064,0.166857,0.9457,0.94716,0.945957,0.945719


[I 2025-03-30 06:59:30,539] Trial 115 pruned. 


Trial 116 with params: {'learning_rate': 5.2717703637833475e-05, 'weight_decay': 0.006, 'warmup_steps': 30}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.6035,0.16351,0.9484,0.950402,0.948422,0.948996
2,0.1055,0.123079,0.9616,0.96219,0.961754,0.961792
3,0.0446,0.125655,0.9638,0.964023,0.963881,0.9639
4,0.0182,0.133344,0.9668,0.966967,0.966911,0.96691
5,0.0065,0.145029,0.9676,0.967876,0.967676,0.967725
6,0.0025,0.152239,0.9671,0.967444,0.967159,0.967265
7,0.0012,0.154766,0.9677,0.967872,0.967802,0.967823


[I 2025-03-30 07:10:04,584] Trial 116 finished with value: 0.9678226720898128 and parameters: {'learning_rate': 5.2717703637833475e-05, 'weight_decay': 0.006, 'warmup_steps': 30}. Best is trial 67 with value: 0.9746290627725797.


Trial 117 with params: {'learning_rate': 0.0027121193476131807, 'weight_decay': 0.009000000000000001, 'warmup_steps': 18}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.499,1.830881,0.3853,0.411548,0.386306,0.358818
2,1.32,1.377711,0.5035,0.547051,0.504523,0.478944


[I 2025-03-30 07:12:51,402] Trial 117 pruned. 


Trial 118 with params: {'learning_rate': 0.00010188457962913947, 'weight_decay': 0.008, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4955,0.175822,0.9405,0.945125,0.94065,0.941479
2,0.0933,0.129425,0.9588,0.95942,0.958979,0.958954
3,0.0428,0.126026,0.9642,0.964344,0.964223,0.964208
4,0.0184,0.130308,0.9655,0.96589,0.965592,0.965662


[I 2025-03-30 07:18:28,106] Trial 118 pruned. 


Trial 119 with params: {'learning_rate': 0.00019675195405497828, 'weight_decay': 0.007, 'warmup_steps': 28}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4404,0.202859,0.9337,0.938311,0.933846,0.934515
2,0.1063,0.148895,0.9527,0.954249,0.95279,0.953048
3,0.0603,0.131246,0.9603,0.96085,0.960529,0.96052
4,0.0298,0.136256,0.9631,0.963482,0.963263,0.963306


[I 2025-03-30 07:24:01,369] Trial 119 pruned. 


Trial 120 with params: {'learning_rate': 0.00013501072872136455, 'weight_decay': 0.006, 'warmup_steps': 17}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4414,0.175811,0.9437,0.946499,0.943714,0.944056
2,0.096,0.135413,0.9547,0.955932,0.954841,0.95505


[I 2025-03-30 07:26:47,918] Trial 120 pruned. 


Trial 121 with params: {'learning_rate': 0.00013070972010450813, 'weight_decay': 0.008, 'warmup_steps': 30}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4714,0.196855,0.9345,0.940368,0.934741,0.935331
2,0.0981,0.136377,0.9555,0.956317,0.955712,0.955815
3,0.0485,0.121169,0.9635,0.963964,0.963576,0.963664
4,0.0203,0.140132,0.9641,0.964639,0.964268,0.964275


[I 2025-03-30 07:32:30,659] Trial 121 pruned. 


Trial 122 with params: {'learning_rate': 6.540779845278149e-05, 'weight_decay': 0.006, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5374,0.161358,0.949,0.951307,0.948992,0.949528
2,0.0965,0.135395,0.9556,0.956149,0.955745,0.955729
3,0.0463,0.127693,0.9626,0.963289,0.962672,0.962801
4,0.0182,0.135122,0.966,0.966208,0.966113,0.966114


[I 2025-03-30 07:38:04,456] Trial 122 pruned. 


Trial 123 with params: {'learning_rate': 0.00011008229952490226, 'weight_decay': 0.006, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4686,0.1871,0.9382,0.942538,0.938205,0.938872
2,0.0913,0.131699,0.9579,0.958776,0.958024,0.95804
3,0.0412,0.118333,0.9653,0.965705,0.965333,0.965453
4,0.0165,0.135455,0.9669,0.967065,0.967029,0.967
5,0.0062,0.135735,0.9691,0.969256,0.969276,0.969221
6,0.0014,0.135739,0.9721,0.972386,0.972157,0.972247
7,0.0005,0.134396,0.9719,0.972056,0.972039,0.972026


[I 2025-03-30 07:47:47,783] Trial 123 finished with value: 0.9720258086672494 and parameters: {'learning_rate': 0.00011008229952490226, 'weight_decay': 0.006, 'warmup_steps': 26}. Best is trial 67 with value: 0.9746290627725797.


Trial 124 with params: {'learning_rate': 8.497260814999432e-05, 'weight_decay': 0.008, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5085,0.147665,0.9531,0.954428,0.953174,0.9534
2,0.0908,0.129249,0.9596,0.960254,0.959781,0.959748
3,0.0414,0.122767,0.9653,0.965769,0.965378,0.965474
4,0.0164,0.130805,0.9676,0.967976,0.967638,0.967762
5,0.0049,0.141563,0.9694,0.969615,0.969525,0.969552
6,0.0016,0.142032,0.9703,0.970556,0.970358,0.970429
7,0.0006,0.141238,0.972,0.972178,0.972097,0.972123


[I 2025-03-30 07:57:35,876] Trial 124 finished with value: 0.9721228189106135 and parameters: {'learning_rate': 8.497260814999432e-05, 'weight_decay': 0.008, 'warmup_steps': 26}. Best is trial 67 with value: 0.9746290627725797.


Trial 125 with params: {'learning_rate': 0.00010705347971676416, 'weight_decay': 0.007, 'warmup_steps': 29}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4931,0.195239,0.9331,0.937072,0.933398,0.93332
2,0.0915,0.120881,0.9603,0.960761,0.960493,0.960503
3,0.0431,0.122319,0.9647,0.965341,0.964764,0.964925
4,0.0184,0.135559,0.9656,0.966025,0.965672,0.965702
5,0.0079,0.129502,0.9711,0.971342,0.971268,0.971249
6,0.0017,0.130699,0.9707,0.970995,0.970808,0.970879
7,0.0006,0.129199,0.9719,0.972165,0.972024,0.972061


[I 2025-03-30 08:07:38,835] Trial 125 finished with value: 0.9720614753577481 and parameters: {'learning_rate': 0.00010705347971676416, 'weight_decay': 0.007, 'warmup_steps': 29}. Best is trial 67 with value: 0.9746290627725797.


Trial 126 with params: {'learning_rate': 0.00015536347405307435, 'weight_decay': 0.007, 'warmup_steps': 20}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.445,0.204069,0.9339,0.937964,0.933977,0.934634
2,0.1008,0.137909,0.9538,0.954446,0.95401,0.953844


[I 2025-03-30 08:10:25,062] Trial 126 pruned. 


Trial 127 with params: {'learning_rate': 0.0016071794381718252, 'weight_decay': 0.001, 'warmup_steps': 4}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.9394,0.828077,0.7233,0.766336,0.72332,0.725879
2,0.5621,0.523616,0.8266,0.84259,0.827186,0.821433
3,0.3934,0.436955,0.8542,0.860253,0.854677,0.852696
4,0.2771,0.303247,0.9029,0.907215,0.902895,0.904158


[I 2025-03-30 08:15:57,771] Trial 127 pruned. 


Trial 128 with params: {'learning_rate': 0.0003399928889305973, 'weight_decay': 0.007, 'warmup_steps': 28}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4656,0.332113,0.8892,0.902108,0.889466,0.889477
2,0.1523,0.184879,0.9398,0.940712,0.940291,0.939544
3,0.089,0.143696,0.956,0.95619,0.956114,0.9561
4,0.0491,0.137035,0.9608,0.961054,0.960918,0.960896


[I 2025-03-30 08:21:52,688] Trial 128 pruned. 


Trial 129 with params: {'learning_rate': 8.439056309864063e-05, 'weight_decay': 0.008, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5198,0.168952,0.9443,0.947392,0.944472,0.944931
2,0.0949,0.130793,0.957,0.957639,0.957154,0.957066
3,0.0406,0.119988,0.964,0.964499,0.964052,0.964174
4,0.0176,0.129575,0.968,0.968493,0.968113,0.96808
5,0.0059,0.128367,0.9688,0.968991,0.968921,0.968911
6,0.0019,0.133195,0.9708,0.971114,0.970908,0.970974
7,0.0007,0.130736,0.9726,0.972769,0.972712,0.972736


[I 2025-03-30 08:31:43,052] Trial 129 finished with value: 0.9727355858752323 and parameters: {'learning_rate': 8.439056309864063e-05, 'weight_decay': 0.008, 'warmup_steps': 23}. Best is trial 67 with value: 0.9746290627725797.


Trial 130 with params: {'learning_rate': 5.039116231539376e-05, 'weight_decay': 0.004, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.6051,0.170574,0.9463,0.948633,0.946257,0.946861
2,0.1047,0.131284,0.9586,0.959057,0.958788,0.958741
3,0.0445,0.122122,0.9644,0.964512,0.964515,0.964473
4,0.0164,0.134896,0.9659,0.966116,0.966005,0.966044


[I 2025-03-30 08:37:17,959] Trial 130 pruned. 


Trial 131 with params: {'learning_rate': 9.865676035790842e-05, 'weight_decay': 0.006, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4997,0.176426,0.942,0.946689,0.94204,0.942873
2,0.0936,0.129945,0.958,0.95901,0.957989,0.958273
3,0.0433,0.111188,0.9669,0.967132,0.967039,0.96706
4,0.0173,0.125201,0.967,0.967184,0.967165,0.967119
5,0.0069,0.125166,0.9714,0.971555,0.971514,0.9715
6,0.0018,0.128501,0.971,0.971184,0.97113,0.971136
7,0.0006,0.128971,0.9724,0.972513,0.972526,0.972509


[I 2025-03-30 08:47:02,789] Trial 131 finished with value: 0.9725088269779644 and parameters: {'learning_rate': 9.865676035790842e-05, 'weight_decay': 0.006, 'warmup_steps': 27}. Best is trial 67 with value: 0.9746290627725797.


Trial 132 with params: {'learning_rate': 0.0001031135430792371, 'weight_decay': 0.007, 'warmup_steps': 27}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.477,0.174832,0.941,0.945864,0.941355,0.941561
2,0.0886,0.12127,0.9575,0.958093,0.957661,0.957642
3,0.0391,0.125873,0.964,0.964698,0.963992,0.964121
4,0.0174,0.130478,0.9675,0.967788,0.967628,0.967621
5,0.0062,0.133686,0.9696,0.96963,0.969751,0.969655
6,0.0018,0.131681,0.971,0.971135,0.971104,0.971113
7,0.0005,0.132101,0.9724,0.972531,0.972488,0.972494


[I 2025-03-30 08:56:45,708] Trial 132 finished with value: 0.9724943263816586 and parameters: {'learning_rate': 0.0001031135430792371, 'weight_decay': 0.007, 'warmup_steps': 27}. Best is trial 67 with value: 0.9746290627725797.


Trial 133 with params: {'learning_rate': 0.00024272350993485774, 'weight_decay': 0.006, 'warmup_steps': 24}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4416,0.240931,0.9179,0.926741,0.91819,0.918118
2,0.1249,0.196565,0.9355,0.939339,0.935664,0.935763
3,0.0654,0.135111,0.9603,0.960635,0.960378,0.960467
4,0.0389,0.118608,0.9658,0.966313,0.965902,0.966037
5,0.0145,0.133749,0.9679,0.968311,0.968078,0.968119
6,0.0041,0.123594,0.9718,0.972237,0.971876,0.972002
7,0.0007,0.119107,0.9723,0.97237,0.972453,0.972402


[I 2025-03-30 09:07:12,082] Trial 133 finished with value: 0.9724015868347069 and parameters: {'learning_rate': 0.00024272350993485774, 'weight_decay': 0.006, 'warmup_steps': 24}. Best is trial 67 with value: 0.9746290627725797.


Trial 134 with params: {'learning_rate': 0.00011822094169472689, 'weight_decay': 0.006, 'warmup_steps': 28}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4756,0.214333,0.9309,0.936297,0.931096,0.931417
2,0.0956,0.123982,0.9598,0.959998,0.960041,0.9599
3,0.0436,0.130867,0.9623,0.962942,0.962474,0.962571
4,0.0187,0.137892,0.9641,0.964398,0.964288,0.964261


[I 2025-03-30 09:12:45,121] Trial 134 pruned. 


Trial 135 with params: {'learning_rate': 0.0001554632484654868, 'weight_decay': 0.009000000000000001, 'warmup_steps': 23}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4367,0.190347,0.9363,0.942769,0.936471,0.937506
2,0.098,0.135401,0.9565,0.956915,0.956738,0.956654
3,0.0483,0.134247,0.9616,0.962221,0.961697,0.961786
4,0.0244,0.125289,0.9664,0.966427,0.966617,0.966414
5,0.0082,0.123202,0.9712,0.971458,0.971344,0.971347
6,0.0021,0.125235,0.9711,0.971504,0.97122,0.97129
7,0.0006,0.120516,0.9736,0.973783,0.973711,0.973724


[I 2025-03-30 09:22:29,854] Trial 135 finished with value: 0.9737240545587218 and parameters: {'learning_rate': 0.0001554632484654868, 'weight_decay': 0.009000000000000001, 'warmup_steps': 23}. Best is trial 67 with value: 0.9746290627725797.


Trial 136 with params: {'learning_rate': 0.0001491088894733688, 'weight_decay': 0.008, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4437,0.193755,0.9342,0.938147,0.934399,0.934664
2,0.0959,0.135124,0.9566,0.956872,0.956778,0.95661
3,0.05,0.115838,0.9654,0.965467,0.96557,0.965484
4,0.0217,0.142573,0.9631,0.963248,0.963335,0.96313


[I 2025-03-30 09:28:30,756] Trial 136 pruned. 


Trial 137 with params: {'learning_rate': 0.00020373553713241103, 'weight_decay': 0.01, 'warmup_steps': 21}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4307,0.204079,0.9305,0.936795,0.930902,0.93104
2,0.1125,0.156992,0.951,0.951342,0.951205,0.950917
3,0.0613,0.132714,0.9586,0.958946,0.958776,0.958741
4,0.0296,0.126875,0.9656,0.966118,0.965588,0.965739
5,0.0124,0.136907,0.968,0.968456,0.968095,0.968057
6,0.003,0.120398,0.9711,0.971344,0.971209,0.971254
7,0.0007,0.11675,0.9726,0.972669,0.972761,0.972696


[I 2025-03-30 09:38:21,843] Trial 137 finished with value: 0.9726962609113586 and parameters: {'learning_rate': 0.00020373553713241103, 'weight_decay': 0.01, 'warmup_steps': 21}. Best is trial 67 with value: 0.9746290627725797.


Trial 138 with params: {'learning_rate': 0.00021360881101219152, 'weight_decay': 0.009000000000000001, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4505,0.211936,0.9319,0.936072,0.932058,0.932434
2,0.1184,0.145459,0.9511,0.951376,0.951423,0.951061
3,0.0594,0.121639,0.9628,0.963214,0.962852,0.962987
4,0.0284,0.131662,0.9647,0.965043,0.964897,0.964905


[I 2025-03-30 09:43:55,121] Trial 138 pruned. 


Trial 139 with params: {'learning_rate': 7.779035601268777e-05, 'weight_decay': 0.01, 'warmup_steps': 21}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5024,0.165126,0.9452,0.948349,0.945357,0.945766
2,0.0889,0.120134,0.9618,0.962133,0.96194,0.961959
3,0.0385,0.124889,0.9625,0.962643,0.962648,0.962596
4,0.0147,0.134513,0.966,0.966071,0.966156,0.966097


[I 2025-03-30 09:49:28,610] Trial 139 pruned. 


Trial 140 with params: {'learning_rate': 0.0001273571460488291, 'weight_decay': 0.01, 'warmup_steps': 26}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.463,0.197419,0.9338,0.938253,0.933963,0.93433
2,0.0968,0.150764,0.9499,0.951056,0.950199,0.949997
3,0.0439,0.12697,0.9614,0.961796,0.961594,0.961592
4,0.0209,0.143053,0.9645,0.964784,0.964609,0.964623


[I 2025-03-30 09:55:18,624] Trial 140 pruned. 


Trial 141 with params: {'learning_rate': 0.00011871615512500498, 'weight_decay': 0.006, 'warmup_steps': 25}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4633,0.174251,0.9423,0.946695,0.942337,0.943179
2,0.0935,0.122295,0.961,0.961437,0.961077,0.961088
3,0.044,0.11873,0.9642,0.964963,0.964433,0.964385
4,0.0182,0.124497,0.9689,0.96898,0.96906,0.969
5,0.0056,0.130972,0.9684,0.968876,0.968548,0.968604
6,0.0017,0.12967,0.9708,0.970972,0.970933,0.97094
7,0.0004,0.128852,0.9719,0.972016,0.972049,0.97202


[I 2025-03-30 10:05:35,594] Trial 141 finished with value: 0.9720196300173409 and parameters: {'learning_rate': 0.00011871615512500498, 'weight_decay': 0.006, 'warmup_steps': 25}. Best is trial 67 with value: 0.9746290627725797.


Trial 142 with params: {'learning_rate': 6.322290328638982e-05, 'weight_decay': 0.008, 'warmup_steps': 30}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5932,0.168159,0.9462,0.947418,0.946289,0.946424
2,0.0945,0.116851,0.9616,0.961874,0.96167,0.961716
3,0.0382,0.115456,0.966,0.966086,0.966153,0.966082
4,0.0144,0.132907,0.9667,0.966842,0.966826,0.966821
5,0.005,0.145341,0.9665,0.966601,0.966665,0.966585
6,0.0018,0.14751,0.9679,0.968132,0.968022,0.968057
7,0.0008,0.14704,0.9683,0.96838,0.968435,0.968397


[I 2025-03-30 10:15:20,642] Trial 142 finished with value: 0.9683973996710821 and parameters: {'learning_rate': 6.322290328638982e-05, 'weight_decay': 0.008, 'warmup_steps': 30}. Best is trial 67 with value: 0.9746290627725797.


Trial 143 with params: {'learning_rate': 9.817682263120722e-05, 'weight_decay': 0.007, 'warmup_steps': 21}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.47,0.166883,0.9458,0.948582,0.945946,0.946128
2,0.0883,0.139596,0.9551,0.955908,0.955327,0.955096


[I 2025-03-30 10:18:06,309] Trial 143 pruned. 


Trial 144 with params: {'learning_rate': 0.00012998111661535324, 'weight_decay': 0.005, 'warmup_steps': 26}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4524,0.189832,0.9366,0.939831,0.936816,0.936797
2,0.0929,0.131191,0.956,0.95642,0.956239,0.956027
3,0.0456,0.120929,0.9628,0.963373,0.962981,0.962971
4,0.022,0.132565,0.9662,0.966453,0.966349,0.966343
5,0.0072,0.133132,0.9702,0.970478,0.970286,0.970335
6,0.0023,0.132746,0.9715,0.971826,0.971625,0.971686
7,0.0005,0.133915,0.9723,0.972465,0.972457,0.972449


[I 2025-03-30 10:27:53,000] Trial 144 finished with value: 0.9724491874340389 and parameters: {'learning_rate': 0.00012998111661535324, 'weight_decay': 0.005, 'warmup_steps': 26}. Best is trial 67 with value: 0.9746290627725797.


Trial 145 with params: {'learning_rate': 0.00033617132254965394, 'weight_decay': 0.008, 'warmup_steps': 19}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4433,0.2491,0.9194,0.924907,0.919641,0.919943
2,0.1543,0.184701,0.9366,0.938142,0.936901,0.936862


[I 2025-03-30 10:30:40,478] Trial 145 pruned. 


Trial 146 with params: {'learning_rate': 0.0001420927094320487, 'weight_decay': 0.008, 'warmup_steps': 20}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4332,0.177322,0.9417,0.944145,0.941765,0.941982
2,0.0987,0.147004,0.9513,0.952629,0.95151,0.951653
3,0.0491,0.13949,0.9595,0.959934,0.959731,0.959562
4,0.0223,0.126913,0.9685,0.968732,0.968658,0.968659
5,0.0084,0.134333,0.9701,0.97039,0.970262,0.970246
6,0.0026,0.134138,0.9713,0.971466,0.971443,0.971443
7,0.0005,0.133518,0.9717,0.971862,0.971848,0.971846


[I 2025-03-30 10:40:29,461] Trial 146 finished with value: 0.9718459221200899 and parameters: {'learning_rate': 0.0001420927094320487, 'weight_decay': 0.008, 'warmup_steps': 20}. Best is trial 67 with value: 0.9746290627725797.


Trial 147 with params: {'learning_rate': 0.0003079538495067879, 'weight_decay': 0.008, 'warmup_steps': 31}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4617,0.253198,0.9165,0.918857,0.916963,0.916414
2,0.1417,0.230481,0.9276,0.932726,0.92799,0.92701


[I 2025-03-30 10:43:17,116] Trial 147 pruned. 


Trial 148 with params: {'learning_rate': 9.806964711146234e-05, 'weight_decay': 0.005, 'warmup_steps': 31}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5107,0.20066,0.933,0.941627,0.933199,0.934555
2,0.0923,0.119668,0.9598,0.96045,0.959939,0.959969
3,0.0409,0.121229,0.965,0.965443,0.965102,0.965193
4,0.0158,0.132176,0.967,0.967488,0.967103,0.967153
5,0.0067,0.124224,0.9712,0.971412,0.971332,0.971347
6,0.0015,0.125096,0.9724,0.972615,0.972512,0.972556
7,0.0006,0.124561,0.974,0.974159,0.974137,0.974144


[I 2025-03-30 10:53:28,623] Trial 148 finished with value: 0.9741444675782933 and parameters: {'learning_rate': 9.806964711146234e-05, 'weight_decay': 0.005, 'warmup_steps': 31}. Best is trial 67 with value: 0.9746290627725797.


Trial 149 with params: {'learning_rate': 0.00015367446969151156, 'weight_decay': 0.008, 'warmup_steps': 24}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4403,0.20293,0.9333,0.938917,0.933727,0.933899
2,0.0976,0.138907,0.9527,0.954045,0.95301,0.952865
3,0.0469,0.12047,0.964,0.964423,0.96416,0.964183
4,0.0227,0.143753,0.9622,0.962246,0.962467,0.962233


[I 2025-03-30 10:59:03,079] Trial 149 pruned. 


In [19]:
print(best_base)

BestRun(run_id='67', objective=0.9746290627725797, hyperparameters={'learning_rate': 9.777098843358782e-05, 'weight_decay': 0.007, 'warmup_steps': 24}, run_summary=None)
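
The printed `BestRun` keeps the winning trial's settings in its `hyperparameters` field. Below is a minimal sketch of how such a result could be copied onto a training configuration for a final run. `BestRunStandIn`, `apply_best_run` and the `SimpleNamespace` arguments object are hypothetical stand-ins so that the snippet runs on its own; the values are the ones printed above.

```python
from types import SimpleNamespace
from typing import NamedTuple, Optional


class BestRunStandIn(NamedTuple):
    # Hypothetical stand-in with the same fields as the printed BestRun.
    run_id: str
    objective: float
    hyperparameters: dict
    run_summary: Optional[dict] = None


def apply_best_run(args, best_run):
    """Copy every tuned hyperparameter onto the arguments object."""
    for name, value in best_run.hyperparameters.items():
        setattr(args, name, value)
    return args


best = BestRunStandIn(
    run_id="67",
    objective=0.9746290627725797,
    hyperparameters={
        "learning_rate": 9.777098843358782e-05,
        "weight_decay": 0.007,
        "warmup_steps": 24,
    },
)

# Assumed placeholder defaults; a real run would start from the notebook's training arguments.
final_args = apply_best_run(
    SimpleNamespace(learning_rate=5e-5, weight_decay=0.0, warmup_steps=0), best
)
print(final_args)
```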


In [20]:
base.reset_seed()

In [21]:
training_args = base.get_training_args(
    output_dir=f"~/results/{DATASET}/-KD_hp-search",
    logging_dir=f"~/logs/{DATASET}/-KD_hp-search",
    remove_unused_columns=False,
    epochs=num_epochs,
    batch_size=batch_size,
)

In [22]:
def hp_space(trial):
    # Search space for the distillation study: optimizer settings plus two distillation knobs.
    params = {
        "learning_rate": trial.suggest_float("learning_rate", 5e-5, 5e-3, log=True),
        "weight_decay": trial.suggest_float("weight_decay", 0, 1e-2, step=1e-3),
        "warmup_steps": trial.suggest_int("warmup_steps", 0, warm_up),
        # Distillation hyperparameters: loss-mixing weight and softmax temperature.
        "lambda_param": trial.suggest_float("lambda_param", 0, 1, step=0.1),
        "temperature": trial.suggest_float("temperature", 2, 7, step=0.5),
    }
    print(f"Trial {trial.number} with params: {params}")
    return params
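
The two extra entries, `lambda_param` and `temperature`, are the usual knobs of a knowledge-distillation loss. How `base.DistilTrainer` combines them is not shown in this notebook, so the following is only a sketch of one common formulation, assumed here: a hard-label cross-entropy weighted by `1 - lambda_param`, plus a temperature-softened KL divergence between teacher and student, scaled by `temperature ** 2` and weighted by `lambda_param`. It is written in plain Python for a single example, and the function names are illustrative.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label, lambda_param, temperature):
    """Assumed mixing rule: (1 - lambda) * CE + lambda * T^2 * KL(teacher || student)."""
    # Hard-label cross-entropy, computed at temperature 1.
    ce = -math.log(softmax(student_logits)[label])

    # Softened distributions from the teacher and student logits.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)

    # KL divergence from the student to the teacher, rescaled by T^2
    # so its gradient magnitude stays comparable across temperatures.
    kl = sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student) if t > 0)
    kd = kl * temperature ** 2

    return (1.0 - lambda_param) * ce + lambda_param * kd


student = [2.0, 0.5, -1.0]
teacher = [3.0, 0.0, -2.0]
print(distillation_loss(student, teacher, label=0, lambda_param=0.5, temperature=4.0))
```

Under this assumed convention, `lambda_param = 0` reduces to ordinary supervised training, while `lambda_param = 1` trains on the teacher's softened outputs alone.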

In [23]:
pruner = optuna.pruners.HyperbandPruner(min_resource=min_r, max_resource=max_r, reduction_factor=2, bootstrap_count=2)
sampler = optuna.samplers.TPESampler(seed=42, multivariate=True)



In [24]:
trainer = base.DistilTrainer(
    args=training_args,
    train_dataset=train,
    eval_dataset=eval,
    compute_metrics=base.compute_metrics,
    # model_init lets the hyperparameter search build a fresh model for every trial.
    model_init=lambda: get_model(),
)

Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [None]:
best_distill = trainer.hyperparameter_search(
    direction="maximize",
    backend="optuna",
    hp_space=hp_space,
    compute_objective=lambda metrics: metrics["eval_f1"],
    pruner=pruner,
    sampler=sampler,
    study_name="Distill",
    n_trials=150
)

[I 2025-03-30 10:59:03,800] A new study created in memory with name: Distill


Trial 0 with params: {'learning_rate': 0.0002805758207667253, 'weight_decay': 0.01, 'warmup_steps': 24, 'lambda_param': 0.6000000000000001, 'temperature': 2.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3557,0.212412,0.9306,0.935032,0.930799,0.931365
2,0.1754,0.18813,0.9481,0.949182,0.948338,0.948282


[I 2025-03-30 11:01:53,132] Trial 0 pruned. 


Trial 1 with params: {'learning_rate': 0.00010255552094216992, 'weight_decay': 0.0, 'warmup_steps': 28, 'lambda_param': 0.6000000000000001, 'temperature': 5.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3818,0.182053,0.9508,0.952567,0.950948,0.951066
2,0.1515,0.160317,0.9634,0.963918,0.963531,0.963576
3,0.1289,0.152544,0.9663,0.966602,0.966435,0.966452
4,0.1185,0.146904,0.9712,0.971227,0.971397,0.97127


[I 2025-03-30 11:07:28,309] Trial 1 pruned. 


Trial 2 with params: {'learning_rate': 5.497167787383099e-05, 'weight_decay': 0.01, 'warmup_steps': 27, 'lambda_param': 0.2, 'temperature': 3.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4329,0.185381,0.9494,0.950536,0.949473,0.949682
2,0.1553,0.160801,0.9609,0.961298,0.961113,0.961059


[I 2025-03-30 11:10:14,755] Trial 2 pruned. 


Trial 3 with params: {'learning_rate': 0.00011635338541918901, 'weight_decay': 0.003, 'warmup_steps': 17, 'lambda_param': 0.4, 'temperature': 3.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3487,0.185177,0.948,0.950778,0.948282,0.948494
2,0.1488,0.165327,0.9602,0.960533,0.960442,0.960264
3,0.1288,0.158627,0.9628,0.963369,0.963121,0.962927
4,0.1196,0.148593,0.9675,0.968181,0.967725,0.967667


[I 2025-03-30 11:16:05,982] Trial 3 pruned. 


Trial 4 with params: {'learning_rate': 0.0008369042894376068, 'weight_decay': 0.001, 'warmup_steps': 9, 'lambda_param': 0.4, 'temperature': 4.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4396,0.36551,0.8545,0.861139,0.85528,0.854383
2,0.267,0.301993,0.8838,0.892392,0.884524,0.88383


[I 2025-03-30 11:18:53,396] Trial 4 pruned. 


Trial 5 with params: {'learning_rate': 0.0018591820902866042, 'weight_decay': 0.002, 'warmup_steps': 16, 'lambda_param': 0.6000000000000001, 'temperature': 2.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.7343,0.696554,0.6657,0.712928,0.667039,0.652811
2,0.4866,0.434172,0.8197,0.829463,0.819996,0.816106
3,0.3655,0.367814,0.8528,0.855263,0.852756,0.851811
4,0.2825,0.30841,0.881,0.892329,0.881014,0.88229


[I 2025-03-30 11:24:30,517] Trial 5 pruned. 


Trial 6 with params: {'learning_rate': 0.0008204643365323959, 'weight_decay': 0.001, 'warmup_steps': 2, 'lambda_param': 1.0, 'temperature': 7.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4234,0.40445,0.8371,0.84239,0.837095,0.835689
2,0.2652,0.269981,0.9023,0.904466,0.902606,0.902513


[I 2025-03-30 11:27:18,198] Trial 6 pruned. 


Trial 7 with params: {'learning_rate': 0.0020690200562805084, 'weight_decay': 0.003, 'warmup_steps': 3, 'lambda_param': 0.7000000000000001, 'temperature': 4.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.8218,0.699356,0.6694,0.688589,0.669481,0.664684
2,0.5543,0.539932,0.7604,0.790052,0.760039,0.765404
3,0.4209,0.4211,0.8226,0.829499,0.822219,0.821837
4,0.3312,0.33505,0.8714,0.880605,0.871141,0.873199


[I 2025-03-30 11:32:52,978] Trial 7 pruned. 


Trial 8 with params: {'learning_rate': 8.770946743725407e-05, 'weight_decay': 0.005, 'warmup_steps': 1, 'lambda_param': 1.0, 'temperature': 3.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3364,0.186132,0.9483,0.951483,0.948471,0.948956
2,0.1477,0.167558,0.9605,0.961283,0.960719,0.960673
3,0.1275,0.153626,0.9661,0.966731,0.966252,0.966318
4,0.1191,0.149911,0.9679,0.967981,0.968054,0.968007
5,0.1139,0.144899,0.9709,0.971303,0.971029,0.971056
6,0.1114,0.143224,0.9712,0.971625,0.971328,0.971353
7,0.1101,0.141783,0.972,0.972181,0.972151,0.972117


[I 2025-03-30 11:43:05,426] Trial 8 finished with value: 0.9721166180268345 and parameters: {'learning_rate': 8.770946743725407e-05, 'weight_decay': 0.005, 'warmup_steps': 1, 'lambda_param': 1.0, 'temperature': 3.0}. Best is trial 8 with value: 0.9721166180268345.


Trial 9 with params: {'learning_rate': 0.0010568529720322872, 'weight_decay': 0.003, 'warmup_steps': 17, 'lambda_param': 0.6000000000000001, 'temperature': 3.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5079,0.444718,0.8085,0.826369,0.809057,0.807433
2,0.3038,0.333859,0.8734,0.888738,0.873605,0.874092


[I 2025-03-30 11:45:52,643] Trial 9 pruned. 


Trial 10 with params: {'learning_rate': 5.622306732978549e-05, 'weight_decay': 0.004, 'warmup_steps': 6, 'lambda_param': 1.0, 'temperature': 3.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3921,0.178731,0.953,0.95432,0.953251,0.953311
2,0.1535,0.165265,0.9618,0.962035,0.961956,0.961907
3,0.1287,0.155834,0.9648,0.965164,0.964951,0.964924
4,0.1186,0.151742,0.9667,0.966901,0.96687,0.96685


[I 2025-03-30 11:51:28,426] Trial 10 pruned. 


Trial 11 with params: {'learning_rate': 0.00020808715310578245, 'weight_decay': 0.003, 'warmup_steps': 32, 'lambda_param': 0.6000000000000001, 'temperature': 6.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3604,0.198915,0.9416,0.943461,0.941996,0.941815
2,0.1619,0.175332,0.9541,0.954815,0.954274,0.954336
3,0.1384,0.162317,0.961,0.961478,0.961092,0.961174
4,0.1244,0.153341,0.9669,0.967243,0.967045,0.966979


[I 2025-03-30 11:57:06,022] Trial 11 pruned. 


Trial 12 with params: {'learning_rate': 0.00014318207047557446, 'weight_decay': 0.001, 'warmup_steps': 21, 'lambda_param': 0.8, 'temperature': 5.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3487,0.189529,0.9457,0.948857,0.946004,0.946151
2,0.1538,0.166781,0.9602,0.961066,0.960386,0.960273
3,0.1312,0.160574,0.9629,0.962981,0.963057,0.962901
4,0.121,0.151962,0.9674,0.967371,0.967575,0.967394


[I 2025-03-30 12:02:40,396] Trial 12 pruned. 


Trial 13 with params: {'learning_rate': 0.0001679567168095784, 'weight_decay': 0.008, 'warmup_steps': 7, 'lambda_param': 0.5, 'temperature': 3.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3207,0.187072,0.9494,0.951799,0.949455,0.94978
2,0.155,0.172692,0.9557,0.956504,0.955954,0.955779
3,0.1339,0.150654,0.9716,0.971897,0.971743,0.971745
4,0.1213,0.148089,0.9703,0.970448,0.970469,0.970403


[I 2025-03-30 12:08:16,793] Trial 13 pruned. 


Trial 14 with params: {'learning_rate': 9.781484202771949e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 5, 'lambda_param': 1.0, 'temperature': 2.5}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3459,0.176221,0.9564,0.95723,0.956616,0.956565
2,0.1488,0.158917,0.9646,0.964786,0.96481,0.964717
3,0.1286,0.152697,0.9661,0.966667,0.966283,0.966268
4,0.1192,0.149028,0.9693,0.969748,0.969513,0.969453
5,0.1136,0.143631,0.9712,0.971804,0.971349,0.971395
6,0.1117,0.140586,0.9743,0.974509,0.974409,0.974422
7,0.11,0.139735,0.9733,0.973505,0.97344,0.973423


[I 2025-03-30 12:18:04,458] Trial 14 finished with value: 0.9734230267943806 and parameters: {'learning_rate': 9.781484202771949e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 5, 'lambda_param': 1.0, 'temperature': 2.5}. Best is trial 14 with value: 0.9734230267943806.


Trial 15 with params: {'learning_rate': 0.00018002615153235487, 'weight_decay': 0.008, 'warmup_steps': 10, 'lambda_param': 1.0, 'temperature': 3.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3266,0.209343,0.9368,0.940688,0.937078,0.936974
2,0.1589,0.174025,0.9548,0.956361,0.954929,0.955084
3,0.1345,0.159286,0.9647,0.965132,0.964815,0.964853
4,0.1232,0.151518,0.9671,0.967592,0.967137,0.967288


[I 2025-03-30 12:23:40,138] Trial 15 pruned. 


Trial 16 with params: {'learning_rate': 7.384419630274902e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 1, 'lambda_param': 0.9, 'temperature': 2.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3541,0.184959,0.9507,0.952677,0.950962,0.950929
2,0.1493,0.164179,0.961,0.961764,0.961117,0.961262
3,0.1285,0.153277,0.9673,0.967406,0.967431,0.967384
4,0.1188,0.147851,0.9696,0.969767,0.969697,0.969713
5,0.1141,0.144161,0.9709,0.971293,0.971042,0.971039
6,0.1114,0.142653,0.9713,0.971468,0.971437,0.971433
7,0.1101,0.14191,0.9724,0.972476,0.972548,0.972501


[I 2025-03-30 12:33:28,702] Trial 16 finished with value: 0.9725011462973223 and parameters: {'learning_rate': 7.384419630274902e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 1, 'lambda_param': 0.9, 'temperature': 2.0}. Best is trial 14 with value: 0.9734230267943806.


Trial 17 with params: {'learning_rate': 0.000124594001444187, 'weight_decay': 0.01, 'warmup_steps': 3, 'lambda_param': 0.9, 'temperature': 6.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3145,0.201343,0.9375,0.942745,0.937894,0.93818
2,0.1527,0.176041,0.9551,0.956645,0.95534,0.955292
3,0.1313,0.157594,0.9641,0.964995,0.96425,0.964289
4,0.12,0.149439,0.9675,0.968093,0.967692,0.967659
5,0.1144,0.144078,0.9706,0.971143,0.970823,0.970777
6,0.1115,0.140014,0.9721,0.97235,0.972217,0.972249
7,0.11,0.138475,0.9733,0.973527,0.973429,0.973448


[I 2025-03-30 12:43:23,966] Trial 17 finished with value: 0.9734480866106539 and parameters: {'learning_rate': 0.000124594001444187, 'weight_decay': 0.01, 'warmup_steps': 3, 'lambda_param': 0.9, 'temperature': 6.0}. Best is trial 17 with value: 0.9734480866106539.


Trial 18 with params: {'learning_rate': 0.00014341173135625626, 'weight_decay': 0.009000000000000001, 'warmup_steps': 0, 'lambda_param': 0.9, 'temperature': 6.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3018,0.197755,0.942,0.945034,0.942297,0.942168
2,0.1525,0.162885,0.9617,0.961891,0.961867,0.961785
3,0.1318,0.158134,0.9638,0.964261,0.964041,0.963849
4,0.1197,0.150287,0.9688,0.968895,0.969032,0.968919
5,0.1151,0.142874,0.9711,0.971512,0.971302,0.971266
6,0.1115,0.138603,0.9764,0.976484,0.976566,0.976498
7,0.1102,0.138232,0.9752,0.975305,0.975382,0.975326


[I 2025-03-30 12:53:11,011] Trial 18 finished with value: 0.9753260189154085 and parameters: {'learning_rate': 0.00014341173135625626, 'weight_decay': 0.009000000000000001, 'warmup_steps': 0, 'lambda_param': 0.9, 'temperature': 6.0}. Best is trial 18 with value: 0.9753260189154085.


Trial 19 with params: {'learning_rate': 0.00012899425163390336, 'weight_decay': 0.008, 'warmup_steps': 3, 'lambda_param': 1.0, 'temperature': 7.0}


Some weights of TimmWrapperForImageClassification were not initialized from the model checkpoint at timm/tiny_vit_5m_224.in1k and are newly initialized because the shapes did not match:
- head.fc.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated
- head.fc.weight: found shape torch.Size([1000, 320]) in the checkpoint and torch.Size([10, 320]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3174,0.192056,0.9472,0.949387,0.947434,0.947494
2,0.1497,0.165637,0.9616,0.961908,0.961822,0.961627
3,0.1289,0.159055,0.9618,0.963253,0.961916,0.962183
4,0.1203,0.146536,0.9718,0.971902,0.971908,0.971878
5,0.1143,0.144155,0.9722,0.972596,0.972356,0.972366
6,0.1115,0.140706,0.9725,0.972841,0.972628,0.972672
7,0.11,0.140125,0.9728,0.973019,0.972951,0.972959


[I 2025-03-30 13:02:56,970] Trial 19 finished with value: 0.9729594402061267 and parameters: {'learning_rate': 0.00012899425163390336, 'weight_decay': 0.008, 'warmup_steps': 3, 'lambda_param': 1.0, 'temperature': 7.0}. Best is trial 18 with value: 0.9753260189154085.


Trial 20 with params: {'learning_rate': 0.00041588197261701134, 'weight_decay': 0.009000000000000001, 'warmup_steps': 13, 'lambda_param': 0.7000000000000001, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3477,0.244635,0.9165,0.920994,0.916883,0.916772
2,0.1937,0.212577,0.9341,0.937127,0.934396,0.934604


[I 2025-03-30 13:05:44,632] Trial 20 pruned. 


Trial 21 with params: {'learning_rate': 7.917372034759902e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 10, 'lambda_param': 0.7000000000000001, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3645,0.176224,0.954,0.955064,0.954162,0.954233
2,0.1489,0.16252,0.9631,0.963297,0.963262,0.963171
3,0.1277,0.154292,0.9668,0.967186,0.966979,0.966889
4,0.1185,0.149912,0.9673,0.967474,0.967443,0.967439


[I 2025-03-30 13:11:22,817] Trial 21 pruned. 


Trial 22 with params: {'learning_rate': 0.000447661846734586, 'weight_decay': 0.01, 'warmup_steps': 3, 'lambda_param': 0.8, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3318,0.263688,0.9048,0.911067,0.905185,0.905347
2,0.1982,0.206848,0.9376,0.938907,0.937881,0.937373


[I 2025-03-30 13:14:10,330] Trial 22 pruned. 


Trial 23 with params: {'learning_rate': 0.00021374902549225927, 'weight_decay': 0.009000000000000001, 'warmup_steps': 0, 'lambda_param': 0.8, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.2947,0.201968,0.9398,0.940956,0.940067,0.93983
2,0.1633,0.199826,0.9423,0.944919,0.942361,0.942396
3,0.1389,0.164633,0.9598,0.960032,0.959995,0.959907
4,0.1248,0.154837,0.9655,0.96579,0.965615,0.965645


[I 2025-03-30 13:19:43,668] Trial 23 pruned. 


Trial 24 with params: {'learning_rate': 0.00010112961434437739, 'weight_decay': 0.01, 'warmup_steps': 1, 'lambda_param': 0.6000000000000001, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.329,0.191202,0.9435,0.947631,0.943739,0.94401
2,0.1488,0.161645,0.9621,0.9627,0.962152,0.962303
3,0.128,0.151421,0.9679,0.968116,0.968094,0.968043
4,0.119,0.146577,0.9694,0.969643,0.969561,0.969585
5,0.1137,0.143095,0.9714,0.971875,0.971561,0.971589
6,0.1113,0.140731,0.9722,0.972464,0.97238,0.972352
7,0.11,0.139209,0.9729,0.97316,0.973053,0.973055


[I 2025-03-30 13:29:31,227] Trial 24 finished with value: 0.9730549110777786 and parameters: {'learning_rate': 0.00010112961434437739, 'weight_decay': 0.01, 'warmup_steps': 1, 'lambda_param': 0.6000000000000001, 'temperature': 7.0}. Best is trial 18 with value: 0.9753260189154085.


Trial 25 with params: {'learning_rate': 5.761199644855385e-05, 'weight_decay': 0.008, 'warmup_steps': 0, 'lambda_param': 1.0, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3758,0.187012,0.9503,0.952849,0.950466,0.950698
2,0.1508,0.165607,0.9612,0.961632,0.961375,0.961348
3,0.1287,0.15379,0.9665,0.966663,0.966723,0.966619
4,0.1182,0.14964,0.9689,0.969185,0.969021,0.969076
5,0.1141,0.146575,0.9703,0.97054,0.970471,0.970446
6,0.1117,0.145736,0.9699,0.970295,0.970051,0.970093
7,0.1101,0.145006,0.9696,0.969843,0.969783,0.969744


[I 2025-03-30 13:40:30,487] Trial 25 finished with value: 0.9697437502776565 and parameters: {'learning_rate': 5.761199644855385e-05, 'weight_decay': 0.008, 'warmup_steps': 0, 'lambda_param': 1.0, 'temperature': 6.0}. Best is trial 18 with value: 0.9753260189154085.


Trial 26 with params: {'learning_rate': 0.00036673897334545683, 'weight_decay': 0.003, 'warmup_steps': 0, 'lambda_param': 0.0, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3179,0.240005,0.9178,0.9216,0.91823,0.918004
2,0.1857,0.214993,0.9347,0.938431,0.934802,0.935628


[I 2025-03-30 13:43:25,861] Trial 26 pruned. 


Trial 27 with params: {'learning_rate': 0.00018775431018063502, 'weight_decay': 0.01, 'warmup_steps': 3, 'lambda_param': 0.9, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3058,0.195814,0.9409,0.942722,0.941248,0.941142
2,0.159,0.174802,0.9551,0.956941,0.955129,0.95545
3,0.1346,0.155827,0.9674,0.96805,0.967475,0.967624
4,0.1233,0.154901,0.9666,0.96731,0.966704,0.966801


[I 2025-03-30 13:49:00,336] Trial 27 pruned. 


Trial 28 with params: {'learning_rate': 8.56035984463901e-05, 'weight_decay': 0.01, 'warmup_steps': 10, 'lambda_param': 1.0, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3584,0.180193,0.9524,0.95426,0.952565,0.952763
2,0.149,0.165996,0.9616,0.96219,0.961725,0.961786
3,0.1281,0.153238,0.9659,0.966555,0.966046,0.966109
4,0.1181,0.148077,0.9692,0.969227,0.96941,0.969295
5,0.1138,0.144236,0.9722,0.972728,0.972341,0.972393
6,0.1115,0.142334,0.9722,0.97241,0.972326,0.972336
7,0.11,0.14133,0.9731,0.973294,0.973241,0.973229


[I 2025-03-30 13:58:58,507] Trial 28 finished with value: 0.973229236328114 and parameters: {'learning_rate': 8.56035984463901e-05, 'weight_decay': 0.01, 'warmup_steps': 10, 'lambda_param': 1.0, 'temperature': 4.5}. Best is trial 18 with value: 0.9753260189154085.


Trial 29 with params: {'learning_rate': 0.0011267334199977662, 'weight_decay': 0.007, 'warmup_steps': 0, 'lambda_param': 1.0, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5079,0.514883,0.7843,0.811899,0.785394,0.777068
2,0.3273,0.32309,0.8762,0.882995,0.876168,0.875305


[I 2025-03-30 14:01:47,885] Trial 29 pruned. 


Trial 30 with params: {'learning_rate': 7.710729969126271e-05, 'weight_decay': 0.005, 'warmup_steps': 10, 'lambda_param': 0.2, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3658,0.180483,0.9517,0.953385,0.951963,0.952003
2,0.1491,0.165527,0.9594,0.95992,0.959553,0.9596
3,0.1285,0.154159,0.9667,0.967205,0.966882,0.96688
4,0.1185,0.149366,0.9678,0.967978,0.968021,0.967925


[I 2025-03-30 14:07:23,336] Trial 30 pruned. 


Trial 31 with params: {'learning_rate': 0.00016724791613560432, 'weight_decay': 0.01, 'warmup_steps': 19, 'lambda_param': 0.9, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.346,0.222287,0.9273,0.932231,0.927743,0.927487
2,0.1577,0.169437,0.9568,0.957385,0.957051,0.956957
3,0.1348,0.157775,0.966,0.966836,0.966162,0.966215
4,0.1222,0.149021,0.9684,0.968565,0.968569,0.968536
5,0.1157,0.142947,0.973,0.973158,0.97308,0.973097
6,0.1123,0.139199,0.9745,0.974715,0.974596,0.97462
7,0.1106,0.137768,0.9749,0.975069,0.975014,0.97502


[I 2025-03-30 14:17:11,728] Trial 31 finished with value: 0.9750200670357133 and parameters: {'learning_rate': 0.00016724791613560432, 'weight_decay': 0.01, 'warmup_steps': 19, 'lambda_param': 0.9, 'temperature': 4.0}. Best is trial 18 with value: 0.9753260189154085.


Trial 32 with params: {'learning_rate': 0.00013553561983282748, 'weight_decay': 0.01, 'warmup_steps': 24, 'lambda_param': 1.0, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3572,0.190811,0.9428,0.947579,0.942932,0.943662
2,0.1538,0.175379,0.9552,0.956275,0.955348,0.955361
3,0.1329,0.154949,0.9644,0.966042,0.964504,0.964658
4,0.121,0.14986,0.9681,0.968513,0.968241,0.968267
5,0.1148,0.14153,0.9728,0.973431,0.972914,0.97298
6,0.1118,0.137841,0.9763,0.976481,0.976422,0.976408
7,0.1102,0.136758,0.9763,0.976497,0.976441,0.976412


[I 2025-03-30 14:27:05,829] Trial 32 finished with value: 0.9764120594095738 and parameters: {'learning_rate': 0.00013553561983282748, 'weight_decay': 0.01, 'warmup_steps': 24, 'lambda_param': 1.0, 'temperature': 3.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 33 with params: {'learning_rate': 9.798842916219257e-05, 'weight_decay': 0.01, 'warmup_steps': 27, 'lambda_param': 0.8, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3864,0.172762,0.9577,0.9584,0.957864,0.957895
2,0.1479,0.163882,0.9605,0.960941,0.960618,0.960679
3,0.1287,0.152813,0.9677,0.968224,0.967896,0.967814
4,0.1186,0.148163,0.9697,0.969905,0.969905,0.969835
5,0.1138,0.143475,0.972,0.972314,0.972107,0.97213
6,0.1112,0.141525,0.9729,0.97315,0.973055,0.97303
7,0.1099,0.140777,0.9729,0.972983,0.973043,0.972984


[I 2025-03-30 14:37:03,502] Trial 33 finished with value: 0.9729843818379591 and parameters: {'learning_rate': 9.798842916219257e-05, 'weight_decay': 0.01, 'warmup_steps': 27, 'lambda_param': 0.8, 'temperature': 4.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 34 with params: {'learning_rate': 0.0004765833477578671, 'weight_decay': 0.01, 'warmup_steps': 22, 'lambda_param': 0.9, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3712,0.283862,0.897,0.906492,0.897288,0.898169
2,0.2028,0.227252,0.9268,0.929384,0.926981,0.926536


[I 2025-03-30 14:39:52,818] Trial 34 pruned. 


Trial 35 with params: {'learning_rate': 5.479487851074696e-05, 'weight_decay': 0.01, 'warmup_steps': 21, 'lambda_param': 1.0, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4237,0.181257,0.9542,0.955529,0.954367,0.954473
2,0.1533,0.163609,0.9637,0.964157,0.963849,0.963876
3,0.1296,0.154649,0.9641,0.964291,0.964266,0.964192
4,0.1191,0.148328,0.9691,0.96927,0.969281,0.96923
5,0.1146,0.147145,0.9702,0.970597,0.97035,0.970344
6,0.112,0.145104,0.9711,0.971382,0.971246,0.971247
7,0.1105,0.144626,0.9707,0.970947,0.970845,0.970829


[I 2025-03-30 14:49:40,812] Trial 35 finished with value: 0.9708288001325898 and parameters: {'learning_rate': 5.479487851074696e-05, 'weight_decay': 0.01, 'warmup_steps': 21, 'lambda_param': 1.0, 'temperature': 4.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 36 with params: {'learning_rate': 0.00025975114163242537, 'weight_decay': 0.01, 'warmup_steps': 18, 'lambda_param': 0.9, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3407,0.218588,0.9265,0.933388,0.92692,0.927375
2,0.1705,0.194564,0.9425,0.944456,0.942634,0.942806


[I 2025-03-30 14:52:28,428] Trial 36 pruned. 


Trial 37 with params: {'learning_rate': 0.0001055526602227995, 'weight_decay': 0.009000000000000001, 'warmup_steps': 23, 'lambda_param': 1.0, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.363,0.18189,0.9502,0.95292,0.950421,0.950672
2,0.151,0.163114,0.9616,0.961745,0.961824,0.961689
3,0.1285,0.155386,0.9643,0.965343,0.964499,0.964577
4,0.1185,0.146137,0.9697,0.969894,0.969865,0.969815
5,0.1138,0.143183,0.9719,0.972315,0.972025,0.972068
6,0.1112,0.14021,0.9735,0.973814,0.973614,0.973661
7,0.1099,0.138863,0.9745,0.974617,0.974659,0.974612


[I 2025-03-30 15:02:17,589] Trial 37 finished with value: 0.9746118380275559 and parameters: {'learning_rate': 0.0001055526602227995, 'weight_decay': 0.009000000000000001, 'warmup_steps': 23, 'lambda_param': 1.0, 'temperature': 2.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 38 with params: {'learning_rate': 9.646392086313548e-05, 'weight_decay': 0.008, 'warmup_steps': 23, 'lambda_param': 1.0, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3709,0.188322,0.9462,0.94833,0.946603,0.946387
2,0.1485,0.158581,0.9643,0.964981,0.964414,0.964484
3,0.1275,0.15204,0.9681,0.968618,0.968199,0.968315
4,0.1186,0.149426,0.9679,0.968065,0.968089,0.968035


[I 2025-03-30 15:07:59,302] Trial 38 pruned. 


Trial 39 with params: {'learning_rate': 5.7230182765429275e-05, 'weight_decay': 0.01, 'warmup_steps': 23, 'lambda_param': 0.9, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4265,0.180569,0.9514,0.952157,0.95159,0.951609
2,0.1523,0.161044,0.9611,0.961569,0.961233,0.961258
3,0.1293,0.154021,0.9663,0.966823,0.966436,0.966464
4,0.1194,0.151002,0.9672,0.967283,0.967393,0.967305


[I 2025-03-30 15:13:35,139] Trial 39 pruned. 


Trial 40 with params: {'learning_rate': 0.002301313995834585, 'weight_decay': 0.007, 'warmup_steps': 13, 'lambda_param': 1.0, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.8981,0.980763,0.5324,0.639532,0.532476,0.519171
2,0.6495,0.666543,0.7057,0.725882,0.70624,0.699218
3,0.4994,0.570375,0.7465,0.80077,0.746346,0.751928
4,0.3953,0.442498,0.8084,0.838704,0.808332,0.812727


[I 2025-03-30 15:19:10,461] Trial 40 pruned. 


Trial 41 with params: {'learning_rate': 9.869734112270565e-05, 'weight_decay': 0.01, 'warmup_steps': 32, 'lambda_param': 1.0, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3873,0.181805,0.9507,0.952847,0.950869,0.951165
2,0.1486,0.16496,0.9599,0.960436,0.960083,0.959999
3,0.1277,0.15205,0.9682,0.96845,0.968428,0.968377
4,0.1186,0.145379,0.971,0.971293,0.971127,0.971184
5,0.1134,0.14303,0.9721,0.972535,0.972186,0.972279
6,0.1112,0.141171,0.9726,0.972856,0.972749,0.972748
7,0.11,0.139688,0.9727,0.972858,0.972882,0.972829


[I 2025-03-30 15:28:57,987] Trial 41 finished with value: 0.9728293194330183 and parameters: {'learning_rate': 9.869734112270565e-05, 'weight_decay': 0.01, 'warmup_steps': 32, 'lambda_param': 1.0, 'temperature': 3.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 42 with params: {'learning_rate': 0.0003597284442432274, 'weight_decay': 0.006, 'warmup_steps': 22, 'lambda_param': 1.0, 'temperature': 5.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3551,0.250065,0.9157,0.919228,0.915965,0.916016
2,0.1881,0.210241,0.9345,0.93655,0.934823,0.934764
3,0.1548,0.174082,0.9544,0.954591,0.954607,0.954524
4,0.1353,0.160222,0.964,0.964181,0.964143,0.964126


[I 2025-03-30 15:34:33,127] Trial 42 pruned. 


Trial 43 with params: {'learning_rate': 0.0032088988731785663, 'weight_decay': 0.003, 'warmup_steps': 32, 'lambda_param': 0.2, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.3704,1.657869,0.1034,0.075995,0.102519,0.047257
2,1.5161,1.517255,0.1373,0.094602,0.136857,0.075044


[I 2025-03-30 15:37:20,476] Trial 43 pruned. 


Trial 44 with params: {'learning_rate': 0.0014691315499909523, 'weight_decay': 0.009000000000000001, 'warmup_steps': 29, 'lambda_param': 0.9, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.6272,0.562682,0.7521,0.77285,0.751865,0.749841
2,0.4046,0.389463,0.8419,0.847449,0.842599,0.83954
3,0.305,0.340875,0.8676,0.876097,0.867535,0.867179
4,0.2399,0.252262,0.9116,0.912983,0.911548,0.911825


[I 2025-03-30 15:43:29,246] Trial 44 pruned. 


Trial 45 with params: {'learning_rate': 0.004229168606699789, 'weight_decay': 0.009000000000000001, 'warmup_steps': 24, 'lambda_param': 0.5, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.4508,1.52468,0.1449,0.093415,0.145276,0.088245
2,1.5082,1.505709,0.1487,0.057201,0.148372,0.070046


[I 2025-03-30 15:46:16,892] Trial 45 pruned. 


Trial 46 with params: {'learning_rate': 0.00032851466793796933, 'weight_decay': 0.007, 'warmup_steps': 23, 'lambda_param': 0.7000000000000001, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3559,0.223177,0.9267,0.932606,0.926913,0.927708
2,0.183,0.203725,0.9385,0.940901,0.938672,0.938529
3,0.153,0.184675,0.95,0.952304,0.950025,0.950602
4,0.1338,0.160041,0.9628,0.963283,0.962985,0.962984


[I 2025-03-30 15:51:51,186] Trial 46 pruned. 


Trial 47 with params: {'learning_rate': 0.0025789104733638904, 'weight_decay': 0.002, 'warmup_steps': 27, 'lambda_param': 1.0, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.0701,1.567796,0.3463,0.483715,0.345982,0.320358
2,0.9351,0.974829,0.5181,0.552043,0.518281,0.508579


[I 2025-03-30 15:54:38,694] Trial 47 pruned. 


Trial 48 with params: {'learning_rate': 0.0027511979602444763, 'weight_decay': 0.005, 'warmup_steps': 1, 'lambda_param': 0.7000000000000001, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5496,1.572123,0.0853,0.063885,0.087715,0.044881
2,1.5205,1.528469,0.1413,0.131581,0.14062,0.107607


[I 2025-03-30 15:57:25,858] Trial 48 pruned. 


Trial 49 with params: {'learning_rate': 0.0015898708923464957, 'weight_decay': 0.004, 'warmup_steps': 17, 'lambda_param': 0.1, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.6412,0.574255,0.7476,0.765795,0.749107,0.747548
2,0.4232,0.375255,0.8507,0.862475,0.850888,0.850751
3,0.3185,0.32559,0.8744,0.880044,0.874527,0.875043
4,0.2486,0.26285,0.9047,0.907956,0.904687,0.905506


[I 2025-03-30 16:02:59,764] Trial 49 pruned. 


Trial 50 with params: {'learning_rate': 5.9361329005039714e-05, 'weight_decay': 0.01, 'warmup_steps': 4, 'lambda_param': 0.9, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3871,0.187124,0.9478,0.950539,0.947966,0.948114
2,0.1534,0.160328,0.9631,0.963759,0.963231,0.963328
3,0.1284,0.153439,0.9655,0.965712,0.965635,0.96566
4,0.1186,0.148924,0.9683,0.968385,0.968543,0.968418
5,0.1137,0.147572,0.969,0.969411,0.969164,0.969153
6,0.1116,0.144659,0.9698,0.970114,0.969974,0.969979
7,0.1101,0.144138,0.9709,0.971048,0.971058,0.971029


[I 2025-03-30 16:12:46,652] Trial 50 finished with value: 0.9710285170360388 and parameters: {'learning_rate': 5.9361329005039714e-05, 'weight_decay': 0.01, 'warmup_steps': 4, 'lambda_param': 0.9, 'temperature': 6.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 51 with params: {'learning_rate': 0.00018204615991542676, 'weight_decay': 0.008, 'warmup_steps': 19, 'lambda_param': 1.0, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3402,0.206386,0.9369,0.941645,0.936957,0.937803
2,0.1594,0.173622,0.9562,0.957019,0.956255,0.956474
3,0.1345,0.154647,0.9661,0.966978,0.966219,0.9664
4,0.1227,0.151076,0.9693,0.969805,0.969461,0.969508
5,0.1157,0.144026,0.9733,0.973492,0.973399,0.973409
6,0.1122,0.140447,0.9747,0.974885,0.974808,0.974822
7,0.1104,0.139311,0.9751,0.975294,0.975235,0.975237


[I 2025-03-30 16:22:34,243] Trial 51 finished with value: 0.9752365662843173 and parameters: {'learning_rate': 0.00018204615991542676, 'weight_decay': 0.008, 'warmup_steps': 19, 'lambda_param': 1.0, 'temperature': 2.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 52 with params: {'learning_rate': 0.00043569463522663814, 'weight_decay': 0.005, 'warmup_steps': 1, 'lambda_param': 0.9, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.323,0.265347,0.9089,0.910352,0.909367,0.908272
2,0.1978,0.211717,0.9353,0.937959,0.935669,0.935422
3,0.1616,0.193247,0.945,0.945979,0.944986,0.945115
4,0.1409,0.171935,0.9569,0.958221,0.956983,0.957242


[I 2025-03-30 16:28:09,842] Trial 52 pruned. 


Trial 53 with params: {'learning_rate': 0.00019274299123550742, 'weight_decay': 0.009000000000000001, 'warmup_steps': 23, 'lambda_param': 1.0, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3535,0.199232,0.9437,0.945001,0.943839,0.943987
2,0.1617,0.172106,0.9559,0.956328,0.956242,0.956035
3,0.1379,0.156376,0.9673,0.967777,0.967369,0.967428
4,0.125,0.153188,0.9669,0.967447,0.967088,0.967088


[I 2025-03-30 16:33:43,991] Trial 53 pruned. 


Trial 54 with params: {'learning_rate': 0.0002340759161127536, 'weight_decay': 0.005, 'warmup_steps': 29, 'lambda_param': 0.0, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3586,0.23,0.9213,0.928206,0.921453,0.921959
2,0.1691,0.192369,0.9447,0.94682,0.944852,0.945243
3,0.1421,0.162305,0.9616,0.962197,0.961697,0.961821
4,0.1272,0.154978,0.9663,0.966595,0.966334,0.966385


[I 2025-03-30 16:39:18,998] Trial 54 pruned. 


Trial 55 with params: {'learning_rate': 0.00046529059578259626, 'weight_decay': 0.007, 'warmup_steps': 18, 'lambda_param': 1.0, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3621,0.254579,0.9102,0.918926,0.910549,0.911307
2,0.2034,0.210571,0.9345,0.936562,0.93467,0.934734


[I 2025-03-30 16:42:06,166] Trial 55 pruned. 


Trial 56 with params: {'learning_rate': 8.274316768557815e-05, 'weight_decay': 0.008, 'warmup_steps': 16, 'lambda_param': 0.8, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3778,0.180278,0.9513,0.953356,0.951491,0.95154
2,0.1493,0.159726,0.9607,0.961644,0.9609,0.960984
3,0.1284,0.151752,0.9673,0.967371,0.967556,0.967373
4,0.118,0.144469,0.9717,0.971805,0.971867,0.971808
5,0.1137,0.141827,0.9733,0.973664,0.973477,0.973435
6,0.1113,0.140873,0.9736,0.973815,0.973722,0.973722
7,0.1099,0.140074,0.974,0.974101,0.974157,0.974104


[I 2025-03-30 16:51:54,171] Trial 56 finished with value: 0.974103624688017 and parameters: {'learning_rate': 8.274316768557815e-05, 'weight_decay': 0.008, 'warmup_steps': 16, 'lambda_param': 0.8, 'temperature': 3.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 57 with params: {'learning_rate': 6.432079156127297e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 16, 'lambda_param': 0.5, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3961,0.188329,0.9458,0.948334,0.946053,0.946218
2,0.1512,0.160956,0.9646,0.965254,0.964755,0.96481
3,0.1287,0.155237,0.9653,0.965662,0.965473,0.965437
4,0.1191,0.148097,0.9701,0.970286,0.970252,0.970214
5,0.1138,0.14443,0.9726,0.972971,0.972693,0.972718
6,0.1115,0.142386,0.9732,0.973311,0.973354,0.973319
7,0.11,0.141847,0.9721,0.97223,0.972264,0.972209


[I 2025-03-30 17:02:11,841] Trial 57 finished with value: 0.9722091702278648 and parameters: {'learning_rate': 6.432079156127297e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 16, 'lambda_param': 0.5, 'temperature': 3.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 58 with params: {'learning_rate': 0.00016893047242669506, 'weight_decay': 0.004, 'warmup_steps': 23, 'lambda_param': 0.9, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3475,0.22251,0.924,0.933395,0.924394,0.925195
2,0.1571,0.168189,0.9589,0.959265,0.959031,0.959018
3,0.136,0.159036,0.9632,0.963336,0.963408,0.963295
4,0.1229,0.153184,0.9665,0.967639,0.966732,0.966723


[I 2025-03-30 17:07:46,876] Trial 58 pruned. 


Trial 59 with params: {'learning_rate': 0.0028085976163393445, 'weight_decay': 0.002, 'warmup_steps': 17, 'lambda_param': 0.7000000000000001, 'temperature': 5.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.1243,1.457994,0.2861,0.380311,0.286232,0.235398
2,1.0453,1.216419,0.3836,0.487607,0.383443,0.339553
3,0.9143,0.97837,0.4984,0.591465,0.498591,0.49634
4,0.8279,0.869814,0.568,0.60785,0.568162,0.568652


[I 2025-03-30 17:13:22,068] Trial 59 pruned. 


Trial 60 with params: {'learning_rate': 5.976931804392223e-05, 'weight_decay': 0.008, 'warmup_steps': 14, 'lambda_param': 0.7000000000000001, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4049,0.185022,0.9505,0.952126,0.950679,0.950767
2,0.1501,0.157181,0.9645,0.964984,0.964656,0.964744
3,0.1278,0.152666,0.9662,0.966551,0.966332,0.96635
4,0.1178,0.149727,0.9677,0.967776,0.967899,0.967811


[I 2025-03-30 17:18:56,326] Trial 60 pruned. 


Trial 61 with params: {'learning_rate': 0.00016980566072716556, 'weight_decay': 0.009000000000000001, 'warmup_steps': 17, 'lambda_param': 0.9, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3395,0.200779,0.9392,0.944286,0.939394,0.939963
2,0.1576,0.165803,0.9612,0.961797,0.96139,0.961421
3,0.1346,0.156845,0.9651,0.96544,0.965178,0.965239
4,0.1218,0.148291,0.9683,0.968578,0.968468,0.968478
5,0.1155,0.144012,0.9723,0.972591,0.972441,0.972393
6,0.1118,0.138811,0.9741,0.974252,0.97421,0.974215
7,0.1104,0.137963,0.9737,0.973895,0.973835,0.973832


[I 2025-03-30 17:28:41,477] Trial 61 finished with value: 0.973832399586948 and parameters: {'learning_rate': 0.00016980566072716556, 'weight_decay': 0.009000000000000001, 'warmup_steps': 17, 'lambda_param': 0.9, 'temperature': 2.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 62 with params: {'learning_rate': 0.00018710210526752272, 'weight_decay': 0.009000000000000001, 'warmup_steps': 12, 'lambda_param': 1.0, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.327,0.216795,0.9296,0.934208,0.930024,0.929951
2,0.1576,0.179418,0.9515,0.952489,0.951712,0.951585


[I 2025-03-30 17:31:27,973] Trial 62 pruned. 


Trial 63 with params: {'learning_rate': 0.00023171151664329458, 'weight_decay': 0.008, 'warmup_steps': 12, 'lambda_param': 0.6000000000000001, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3201,0.212095,0.9318,0.938166,0.931946,0.932893
2,0.1655,0.181174,0.9516,0.952671,0.951742,0.951788
3,0.1393,0.166486,0.9594,0.959751,0.95961,0.959512
4,0.1264,0.154456,0.9662,0.966338,0.966381,0.966289


[I 2025-03-30 17:37:03,248] Trial 63 pruned. 


Trial 64 with params: {'learning_rate': 0.00014037189452890584, 'weight_decay': 0.01, 'warmup_steps': 20, 'lambda_param': 0.8, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3497,0.185452,0.9493,0.951008,0.949532,0.949601
2,0.1547,0.165478,0.9596,0.960151,0.959713,0.959684
3,0.1319,0.156361,0.9644,0.96463,0.964633,0.96454
4,0.1205,0.146713,0.9702,0.970408,0.970432,0.970338
5,0.1145,0.142211,0.9732,0.973361,0.973351,0.973315
6,0.1116,0.139662,0.9745,0.974738,0.974642,0.974627
7,0.1101,0.138816,0.9744,0.974502,0.974581,0.974492


[I 2025-03-30 17:46:50,775] Trial 64 finished with value: 0.974492136838864 and parameters: {'learning_rate': 0.00014037189452890584, 'weight_decay': 0.01, 'warmup_steps': 20, 'lambda_param': 0.8, 'temperature': 3.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 65 with params: {'learning_rate': 0.0002071169980195919, 'weight_decay': 0.01, 'warmup_steps': 22, 'lambda_param': 0.8, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3424,0.212899,0.9316,0.935337,0.931838,0.932093
2,0.1621,0.179913,0.9546,0.955551,0.954857,0.954646
3,0.1391,0.160608,0.9624,0.962863,0.962631,0.962535
4,0.1251,0.153168,0.966,0.966404,0.966229,0.966162


[I 2025-03-30 17:52:26,588] Trial 65 pruned. 


Trial 66 with params: {'learning_rate': 5.0098619486030555e-05, 'weight_decay': 0.007, 'warmup_steps': 16, 'lambda_param': 1.0, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4287,0.183122,0.95,0.950904,0.950162,0.950178
2,0.1524,0.165513,0.9615,0.961837,0.961681,0.961637
3,0.1282,0.153858,0.9657,0.966263,0.965814,0.965901
4,0.1188,0.150746,0.9683,0.968597,0.968471,0.968417
5,0.1141,0.148432,0.9692,0.96963,0.969352,0.969368
6,0.1118,0.146714,0.9698,0.97001,0.969958,0.969948
7,0.1105,0.145982,0.9701,0.970308,0.970252,0.970233


[I 2025-03-30 18:02:13,594] Trial 66 finished with value: 0.9702332808336276 and parameters: {'learning_rate': 5.0098619486030555e-05, 'weight_decay': 0.007, 'warmup_steps': 16, 'lambda_param': 1.0, 'temperature': 2.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 67 with params: {'learning_rate': 0.00018437386835220431, 'weight_decay': 0.01, 'warmup_steps': 26, 'lambda_param': 0.7000000000000001, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3525,0.223034,0.9247,0.933881,0.925196,0.925496
2,0.1577,0.166898,0.9602,0.960715,0.960357,0.960363
3,0.1344,0.15695,0.9637,0.964087,0.963829,0.963913
4,0.123,0.148577,0.9714,0.971616,0.971529,0.971515
5,0.1159,0.141565,0.974,0.974346,0.974128,0.974127
6,0.1122,0.138625,0.9744,0.974691,0.974475,0.974517
7,0.1105,0.137614,0.9755,0.975628,0.975627,0.975603


[I 2025-03-30 18:12:00,868] Trial 67 finished with value: 0.9756034131850368 and parameters: {'learning_rate': 0.00018437386835220431, 'weight_decay': 0.01, 'warmup_steps': 26, 'lambda_param': 0.7000000000000001, 'temperature': 2.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 68 with params: {'learning_rate': 0.00017557083916535206, 'weight_decay': 0.01, 'warmup_steps': 31, 'lambda_param': 0.5, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3555,0.192884,0.9461,0.948304,0.946089,0.946632
2,0.1573,0.170393,0.9566,0.957141,0.956845,0.956665
3,0.1371,0.156116,0.9657,0.965863,0.96593,0.965767
4,0.1231,0.148301,0.9687,0.968893,0.968827,0.968791
5,0.1159,0.143751,0.972,0.972237,0.972167,0.972163
6,0.1123,0.139354,0.9725,0.972763,0.972604,0.972634
7,0.1104,0.137398,0.9743,0.974444,0.974436,0.974412


[I 2025-03-30 18:21:49,383] Trial 68 finished with value: 0.9744115924935196 and parameters: {'learning_rate': 0.00017557083916535206, 'weight_decay': 0.01, 'warmup_steps': 31, 'lambda_param': 0.5, 'temperature': 2.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 69 with params: {'learning_rate': 0.0004850647223008225, 'weight_decay': 0.009000000000000001, 'warmup_steps': 31, 'lambda_param': 0.8, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3941,0.254605,0.9089,0.915069,0.90888,0.909468
2,0.2106,0.23048,0.9225,0.929752,0.922957,0.922853


[I 2025-03-30 18:24:36,521] Trial 69 pruned. 


Trial 70 with params: {'learning_rate': 0.0001812744264729855, 'weight_decay': 0.01, 'warmup_steps': 20, 'lambda_param': 0.9, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3461,0.182199,0.9499,0.951387,0.950066,0.950171
2,0.1589,0.164744,0.9599,0.960428,0.960092,0.960064
3,0.1353,0.155177,0.9656,0.966197,0.965706,0.965847
4,0.1235,0.15502,0.967,0.967152,0.967157,0.967101


[I 2025-03-30 18:30:10,756] Trial 70 pruned. 


Trial 71 with params: {'learning_rate': 0.0004384700251936054, 'weight_decay': 0.009000000000000001, 'warmup_steps': 30, 'lambda_param': 0.30000000000000004, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3804,0.254584,0.9123,0.915484,0.912913,0.911979
2,0.2024,0.221374,0.9294,0.932067,0.929641,0.929173
3,0.1647,0.186272,0.9487,0.949899,0.948775,0.949026
4,0.1417,0.166289,0.9611,0.961444,0.961198,0.961233


[I 2025-03-30 18:35:45,000] Trial 71 pruned. 


Trial 72 with params: {'learning_rate': 0.0001157193379607402, 'weight_decay': 0.01, 'warmup_steps': 30, 'lambda_param': 0.5, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3732,0.187135,0.9473,0.949753,0.947616,0.947516
2,0.1503,0.172185,0.9557,0.956735,0.955959,0.955932
3,0.1295,0.151521,0.9678,0.968054,0.967958,0.967961
4,0.1197,0.146844,0.9712,0.971361,0.971376,0.971334
5,0.1143,0.143321,0.9708,0.971576,0.970961,0.971023
6,0.1117,0.139744,0.9732,0.973423,0.973326,0.973335
7,0.1103,0.138604,0.9749,0.975045,0.975072,0.975019


[I 2025-03-30 18:46:02,690] Trial 72 finished with value: 0.9750186842896913 and parameters: {'learning_rate': 0.0001157193379607402, 'weight_decay': 0.01, 'warmup_steps': 30, 'lambda_param': 0.5, 'temperature': 2.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 73 with params: {'learning_rate': 0.00016433668825404572, 'weight_decay': 0.01, 'warmup_steps': 22, 'lambda_param': 0.6000000000000001, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3447,0.19176,0.9462,0.948953,0.946248,0.946636
2,0.1569,0.162277,0.9623,0.962926,0.96249,0.962574
3,0.1341,0.156675,0.9656,0.96597,0.965764,0.965817
4,0.1214,0.15066,0.9692,0.96956,0.969441,0.969348
5,0.1154,0.145824,0.9709,0.971678,0.971095,0.971093
6,0.112,0.14008,0.9738,0.973916,0.973952,0.97391
7,0.1103,0.138404,0.9757,0.975743,0.975859,0.975779


[I 2025-03-30 18:56:23,631] Trial 73 finished with value: 0.9757787767116017 and parameters: {'learning_rate': 0.00016433668825404572, 'weight_decay': 0.01, 'warmup_steps': 22, 'lambda_param': 0.6000000000000001, 'temperature': 2.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 74 with params: {'learning_rate': 0.00024147004208896432, 'weight_decay': 0.01, 'warmup_steps': 24, 'lambda_param': 0.6000000000000001, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3454,0.211408,0.933,0.938416,0.933404,0.933459
2,0.1686,0.178357,0.9526,0.953069,0.952839,0.952669


[I 2025-03-30 18:59:11,772] Trial 74 pruned. 


Trial 75 with params: {'learning_rate': 0.00011732516921287371, 'weight_decay': 0.01, 'warmup_steps': 29, 'lambda_param': 0.6000000000000001, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3658,0.18651,0.9467,0.949909,0.946781,0.947177
2,0.1522,0.163599,0.9597,0.960712,0.959727,0.959995
3,0.1302,0.157304,0.9638,0.964518,0.964046,0.964034
4,0.1189,0.150052,0.9688,0.968828,0.969016,0.968893
5,0.1139,0.143664,0.971,0.971478,0.971129,0.971197
6,0.1115,0.141027,0.9732,0.9735,0.973308,0.97337
7,0.1101,0.139608,0.9736,0.973831,0.973739,0.973757


[I 2025-03-30 19:08:59,478] Trial 75 finished with value: 0.9737566722834966 and parameters: {'learning_rate': 0.00011732516921287371, 'weight_decay': 0.01, 'warmup_steps': 29, 'lambda_param': 0.6000000000000001, 'temperature': 2.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 76 with params: {'learning_rate': 6.499882416252976e-05, 'weight_decay': 0.008, 'warmup_steps': 24, 'lambda_param': 0.2, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.409,0.178598,0.9539,0.954815,0.954053,0.954103
2,0.1524,0.164578,0.9613,0.961614,0.961463,0.961414
3,0.129,0.154315,0.967,0.967099,0.967171,0.96706
4,0.1183,0.150236,0.9683,0.968421,0.968428,0.9684
5,0.1139,0.147812,0.9683,0.968845,0.968445,0.968466
6,0.1118,0.144484,0.9703,0.970612,0.970397,0.970427
7,0.1103,0.143547,0.9708,0.970922,0.970948,0.970883


[I 2025-03-30 19:18:47,739] Trial 76 finished with value: 0.970882645809992 and parameters: {'learning_rate': 6.499882416252976e-05, 'weight_decay': 0.008, 'warmup_steps': 24, 'lambda_param': 0.2, 'temperature': 2.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 77 with params: {'learning_rate': 5.112429509287801e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 22, 'lambda_param': 0.6000000000000001, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4445,0.181496,0.9501,0.952788,0.950338,0.950538
2,0.1518,0.16298,0.961,0.961463,0.961128,0.961169
3,0.1289,0.154632,0.9642,0.964398,0.964328,0.964319
4,0.1183,0.151474,0.9666,0.966756,0.966773,0.966736


[I 2025-03-30 19:24:22,853] Trial 77 pruned. 


Trial 78 with params: {'learning_rate': 0.00027811595208962893, 'weight_decay': 0.004, 'warmup_steps': 3, 'lambda_param': 0.2, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3099,0.213037,0.9352,0.938137,0.935488,0.935351
2,0.1724,0.185315,0.9507,0.952196,0.951004,0.950766
3,0.1459,0.165353,0.9608,0.960995,0.960971,0.960928
4,0.1298,0.156759,0.9648,0.965641,0.964936,0.965078


[I 2025-03-30 19:29:58,093] Trial 78 pruned. 


Trial 79 with params: {'learning_rate': 0.0001886911249849553, 'weight_decay': 0.01, 'warmup_steps': 23, 'lambda_param': 1.0, 'temperature': 5.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3422,0.213555,0.9299,0.938377,0.930249,0.931171
2,0.1603,0.169651,0.9597,0.960071,0.95987,0.959738
3,0.137,0.165168,0.9586,0.959365,0.958796,0.958879
4,0.1236,0.156781,0.9638,0.964327,0.963925,0.964046


[I 2025-03-30 19:35:32,247] Trial 79 pruned. 


Trial 80 with params: {'learning_rate': 0.0029063834285411286, 'weight_decay': 0.01, 'warmup_steps': 6, 'lambda_param': 0.5, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,1.5044,1.540771,0.1092,0.089861,0.112691,0.05803
2,1.5303,1.55446,0.1073,0.038104,0.105974,0.041242
3,1.5539,1.546471,0.1259,0.066471,0.124758,0.050592
4,1.5403,1.542118,0.1195,0.108646,0.118952,0.064561


[I 2025-03-30 19:41:07,192] Trial 80 pruned. 


Trial 81 with params: {'learning_rate': 0.00012529294005663154, 'weight_decay': 0.01, 'warmup_steps': 19, 'lambda_param': 0.4, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3553,0.180832,0.9498,0.953294,0.949933,0.950404
2,0.1502,0.162118,0.9623,0.962488,0.962497,0.962395
3,0.1304,0.153666,0.9654,0.966526,0.965531,0.965638
4,0.1199,0.146885,0.9729,0.973086,0.973043,0.973001
5,0.1141,0.141135,0.9741,0.974379,0.974241,0.974242
6,0.1115,0.139685,0.9738,0.974096,0.973943,0.973963
7,0.11,0.137842,0.9746,0.974804,0.974759,0.974716


[I 2025-03-30 19:51:21,176] Trial 81 finished with value: 0.9747164107173752 and parameters: {'learning_rate': 0.00012529294005663154, 'weight_decay': 0.01, 'warmup_steps': 19, 'lambda_param': 0.4, 'temperature': 3.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 82 with params: {'learning_rate': 0.0001749207784979569, 'weight_decay': 0.01, 'warmup_steps': 16, 'lambda_param': 0.4, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3353,0.206263,0.9349,0.938002,0.935396,0.935076
2,0.159,0.166496,0.9609,0.961526,0.961027,0.961055
3,0.135,0.155126,0.9671,0.967236,0.967317,0.967233
4,0.1226,0.14932,0.9689,0.969049,0.969046,0.968993
5,0.1158,0.142377,0.9712,0.971365,0.971282,0.971305
6,0.1123,0.139334,0.9749,0.975028,0.97504,0.974992
7,0.1105,0.137911,0.9755,0.975587,0.975656,0.975601


[I 2025-03-30 20:01:10,356] Trial 82 finished with value: 0.9756009071838184 and parameters: {'learning_rate': 0.0001749207784979569, 'weight_decay': 0.01, 'warmup_steps': 16, 'lambda_param': 0.4, 'temperature': 2.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 83 with params: {'learning_rate': 0.0002580032025713817, 'weight_decay': 0.01, 'warmup_steps': 19, 'lambda_param': 0.4, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3431,0.216035,0.9336,0.937974,0.933845,0.934399
2,0.1717,0.196299,0.944,0.944625,0.944206,0.943832


[I 2025-03-30 20:04:12,267] Trial 83 pruned. 


Trial 84 with params: {'learning_rate': 0.0002588018899047277, 'weight_decay': 0.009000000000000001, 'warmup_steps': 15, 'lambda_param': 0.2, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3345,0.208692,0.934,0.938839,0.934418,0.934423
2,0.1698,0.175797,0.9548,0.954927,0.955001,0.954798
3,0.1425,0.173455,0.9576,0.958385,0.957829,0.957728
4,0.1271,0.156149,0.9661,0.966251,0.966349,0.966187


[I 2025-03-30 20:09:50,178] Trial 84 pruned. 


Trial 85 with params: {'learning_rate': 0.00019889637834019354, 'weight_decay': 0.009000000000000001, 'warmup_steps': 19, 'lambda_param': 0.5, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3374,0.189616,0.9458,0.946926,0.945947,0.946172
2,0.1614,0.175965,0.9567,0.95736,0.956886,0.95682
3,0.1366,0.163107,0.9604,0.961064,0.960566,0.960591
4,0.1247,0.152604,0.9681,0.968327,0.968192,0.968199


[I 2025-03-30 20:15:27,369] Trial 85 pruned. 


Trial 86 with params: {'learning_rate': 0.00015278486360010863, 'weight_decay': 0.01, 'warmup_steps': 17, 'lambda_param': 0.4, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3381,0.185679,0.9481,0.951137,0.948312,0.948558
2,0.1557,0.165597,0.9594,0.959352,0.959712,0.959398
3,0.1334,0.155223,0.9657,0.96595,0.965833,0.96586
4,0.1227,0.148632,0.9702,0.970466,0.97037,0.970279
5,0.1155,0.141972,0.9729,0.973468,0.973086,0.97304
6,0.1122,0.140205,0.973,0.973499,0.973132,0.973194
7,0.1104,0.138208,0.9737,0.973933,0.973829,0.973816


[I 2025-03-30 20:25:23,418] Trial 86 finished with value: 0.9738163054614686 and parameters: {'learning_rate': 0.00015278486360010863, 'weight_decay': 0.01, 'warmup_steps': 17, 'lambda_param': 0.4, 'temperature': 3.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 87 with params: {'learning_rate': 0.00013537140485040273, 'weight_decay': 0.006, 'warmup_steps': 27, 'lambda_param': 0.4, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3668,0.19217,0.9416,0.945538,0.941974,0.941968
2,0.1538,0.161411,0.9617,0.961995,0.961847,0.961866
3,0.1305,0.1533,0.967,0.967381,0.967188,0.96709
4,0.1199,0.15088,0.9654,0.966007,0.96558,0.965573


[I 2025-03-30 20:31:11,901] Trial 87 pruned. 


Trial 88 with params: {'learning_rate': 0.0001033215290468983, 'weight_decay': 0.01, 'warmup_steps': 19, 'lambda_param': 0.30000000000000004, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3588,0.171795,0.9575,0.958062,0.957589,0.957714
2,0.1511,0.158637,0.9641,0.964435,0.964279,0.964275
3,0.1286,0.155178,0.9646,0.965101,0.964806,0.964766
4,0.1191,0.148693,0.9686,0.968806,0.968775,0.968741
5,0.1142,0.143048,0.9722,0.972413,0.97237,0.972304
6,0.1116,0.141215,0.9725,0.972744,0.972626,0.972649
7,0.1101,0.140317,0.9732,0.97333,0.973377,0.973322


[I 2025-03-30 20:41:11,891] Trial 88 finished with value: 0.9733221379420354 and parameters: {'learning_rate': 0.0001033215290468983, 'weight_decay': 0.01, 'warmup_steps': 19, 'lambda_param': 0.30000000000000004, 'temperature': 2.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 89 with params: {'learning_rate': 0.00013193531441646044, 'weight_decay': 0.01, 'warmup_steps': 22, 'lambda_param': 0.4, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3525,0.199868,0.9416,0.945038,0.941795,0.942145
2,0.1528,0.164561,0.9587,0.959017,0.958966,0.958773
3,0.1306,0.151762,0.9669,0.967158,0.967131,0.967032
4,0.1197,0.144765,0.9734,0.973589,0.973496,0.973526
5,0.1151,0.144946,0.971,0.971806,0.971158,0.971215
6,0.1119,0.139577,0.9734,0.973674,0.973527,0.973565
7,0.1103,0.137958,0.9739,0.974037,0.974051,0.974031


[I 2025-03-30 20:51:06,501] Trial 89 finished with value: 0.9740306476430345 and parameters: {'learning_rate': 0.00013193531441646044, 'weight_decay': 0.01, 'warmup_steps': 22, 'lambda_param': 0.4, 'temperature': 3.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 90 with params: {'learning_rate': 0.0011115662517499805, 'weight_decay': 0.004, 'warmup_steps': 24, 'lambda_param': 0.6000000000000001, 'temperature': 7.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.5212,0.410641,0.8337,0.844341,0.834078,0.833282
2,0.3272,0.328325,0.8695,0.877821,0.869563,0.869612
3,0.2529,0.300791,0.8857,0.892588,0.885482,0.885378
4,0.2013,0.218692,0.9304,0.932336,0.930563,0.930764


[I 2025-03-30 20:56:49,263] Trial 90 pruned. 


Trial 91 with params: {'learning_rate': 0.0002157871176617988, 'weight_decay': 0.009000000000000001, 'warmup_steps': 4, 'lambda_param': 1.0, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3101,0.229145,0.9252,0.928173,0.9257,0.925362
2,0.1656,0.186202,0.9509,0.952606,0.951106,0.951179


[I 2025-03-30 20:59:43,674] Trial 91 pruned. 


Trial 92 with params: {'learning_rate': 0.0015837356481811218, 'weight_decay': 0.006, 'warmup_steps': 15, 'lambda_param': 0.1, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.6416,0.553646,0.7551,0.775525,0.756181,0.752897
2,0.4099,0.415454,0.8268,0.837709,0.827922,0.821098
3,0.3174,0.35894,0.8581,0.867919,0.858045,0.858572
4,0.2524,0.273181,0.9044,0.905919,0.904349,0.903942


[I 2025-03-30 21:05:53,902] Trial 92 pruned. 


Trial 93 with params: {'learning_rate': 5.9059829250360414e-05, 'weight_decay': 0.008, 'warmup_steps': 31, 'lambda_param': 0.5, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4466,0.191074,0.9464,0.947795,0.946538,0.946521
2,0.154,0.167823,0.96,0.960328,0.960146,0.960085
3,0.1289,0.155333,0.9658,0.966287,0.966015,0.965925
4,0.1193,0.150021,0.9682,0.968307,0.968371,0.96828
5,0.1143,0.147801,0.9693,0.969788,0.969456,0.96946
6,0.1119,0.145963,0.9706,0.970919,0.970763,0.970741
7,0.1103,0.14494,0.9709,0.971064,0.971056,0.971019


[I 2025-03-30 21:15:48,436] Trial 93 finished with value: 0.9710189663247982 and parameters: {'learning_rate': 5.9059829250360414e-05, 'weight_decay': 0.008, 'warmup_steps': 31, 'lambda_param': 0.5, 'temperature': 2.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 94 with params: {'learning_rate': 8.785284362480978e-05, 'weight_decay': 0.006, 'warmup_steps': 26, 'lambda_param': 0.8, 'temperature': 2.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3916,0.17503,0.9536,0.954733,0.953812,0.953885
2,0.149,0.157145,0.965,0.965488,0.965028,0.965164
3,0.1283,0.155567,0.9664,0.966837,0.966587,0.966514
4,0.1181,0.147408,0.9699,0.970041,0.970054,0.970015
5,0.1137,0.142978,0.9726,0.972812,0.972716,0.972729
6,0.1111,0.141445,0.9725,0.972809,0.972608,0.972677
7,0.1098,0.140168,0.9725,0.972588,0.972649,0.972599


[I 2025-03-30 21:26:19,208] Trial 94 finished with value: 0.9725990447347085 and parameters: {'learning_rate': 8.785284362480978e-05, 'weight_decay': 0.006, 'warmup_steps': 26, 'lambda_param': 0.8, 'temperature': 2.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 95 with params: {'learning_rate': 0.00033622652480271855, 'weight_decay': 0.0, 'warmup_steps': 5, 'lambda_param': 0.4, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3194,0.22867,0.9252,0.928747,0.925529,0.92559
2,0.1809,0.178087,0.9546,0.955043,0.954878,0.954772


[I 2025-03-30 21:29:11,013] Trial 95 pruned. 


Trial 96 with params: {'learning_rate': 5.399635979922363e-05, 'weight_decay': 0.0, 'warmup_steps': 26, 'lambda_param': 0.30000000000000004, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.4406,0.183983,0.9507,0.951909,0.950874,0.950932
2,0.1544,0.16247,0.9613,0.961669,0.961419,0.961481
3,0.1293,0.153116,0.9669,0.967098,0.96711,0.966996
4,0.1188,0.149185,0.9679,0.968085,0.968078,0.968017
5,0.1141,0.146138,0.9703,0.970538,0.970494,0.970427
6,0.1117,0.145281,0.9705,0.970724,0.970618,0.970644
7,0.1104,0.143986,0.9697,0.969857,0.969851,0.969816


[I 2025-03-30 21:39:24,925] Trial 96 finished with value: 0.969816101362144 and parameters: {'learning_rate': 5.399635979922363e-05, 'weight_decay': 0.0, 'warmup_steps': 26, 'lambda_param': 0.30000000000000004, 'temperature': 3.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 97 with params: {'learning_rate': 0.0002985710024151608, 'weight_decay': 0.008, 'warmup_steps': 26, 'lambda_param': 1.0, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3544,0.20852,0.9328,0.935365,0.933206,0.933129
2,0.1762,0.204137,0.9408,0.9436,0.940765,0.94119


[I 2025-03-30 21:42:17,866] Trial 97 pruned. 


Trial 98 with params: {'learning_rate': 0.00010935130174798839, 'weight_decay': 0.007, 'warmup_steps': 23, 'lambda_param': 1.0, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3653,0.183729,0.9481,0.950917,0.948132,0.948606
2,0.1508,0.164699,0.9623,0.962874,0.962483,0.962471
3,0.1291,0.154522,0.966,0.966215,0.966142,0.966111
4,0.119,0.146897,0.9691,0.969352,0.969292,0.969224
5,0.1142,0.144389,0.9712,0.971837,0.971304,0.971415
6,0.1116,0.140245,0.9737,0.973947,0.973828,0.973848
7,0.1101,0.139126,0.9733,0.973497,0.97344,0.97344


[I 2025-03-30 21:52:16,624] Trial 98 finished with value: 0.9734400880373913 and parameters: {'learning_rate': 0.00010935130174798839, 'weight_decay': 0.007, 'warmup_steps': 23, 'lambda_param': 1.0, 'temperature': 4.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 99 with params: {'learning_rate': 8.710007471084877e-05, 'weight_decay': 0.01, 'warmup_steps': 17, 'lambda_param': 0.30000000000000004, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.371,0.177187,0.9549,0.956002,0.954946,0.955212
2,0.1502,0.158277,0.9642,0.964298,0.964379,0.964284
3,0.1287,0.155291,0.9637,0.965129,0.963854,0.963993
4,0.1191,0.148223,0.9695,0.969717,0.969641,0.969665
5,0.1139,0.145838,0.9694,0.970057,0.969551,0.969609
6,0.1116,0.14153,0.9721,0.972489,0.972203,0.972288
7,0.1102,0.140745,0.9725,0.972732,0.972635,0.972658


[I 2025-03-30 22:02:50,484] Trial 99 finished with value: 0.9726580815624283 and parameters: {'learning_rate': 8.710007471084877e-05, 'weight_decay': 0.01, 'warmup_steps': 17, 'lambda_param': 0.30000000000000004, 'temperature': 6.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 100 with params: {'learning_rate': 0.00012162617401836313, 'weight_decay': 0.008, 'warmup_steps': 27, 'lambda_param': 0.8, 'temperature': 2.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3681,0.186652,0.9478,0.950634,0.948114,0.948186
2,0.1519,0.17029,0.9574,0.958173,0.957605,0.95743
3,0.131,0.152698,0.9678,0.968138,0.967977,0.968005
4,0.1194,0.149427,0.9693,0.96974,0.969459,0.969499
5,0.1145,0.142968,0.9733,0.97348,0.973484,0.973418
6,0.1116,0.140488,0.9725,0.972728,0.972636,0.972651
7,0.11,0.139907,0.9721,0.972163,0.972268,0.972196


[I 2025-03-30 22:13:04,239] Trial 100 finished with value: 0.9721960489643802 and parameters: {'learning_rate': 0.00012162617401836313, 'weight_decay': 0.008, 'warmup_steps': 27, 'lambda_param': 0.8, 'temperature': 2.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 101 with params: {'learning_rate': 8.468238290855145e-05, 'weight_decay': 0.01, 'warmup_steps': 23, 'lambda_param': 0.7000000000000001, 'temperature': 3.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3854,0.179661,0.9529,0.955173,0.953059,0.953278
2,0.1492,0.159693,0.9648,0.965009,0.964953,0.964903
3,0.1283,0.151848,0.9689,0.969195,0.969082,0.968981
4,0.1187,0.146258,0.9724,0.972666,0.972434,0.97253
5,0.1139,0.14122,0.9741,0.97448,0.974181,0.974249
6,0.1112,0.139623,0.9732,0.973523,0.973298,0.973352
7,0.1099,0.138841,0.9736,0.97381,0.973706,0.973717


[I 2025-03-30 22:23:05,224] Trial 101 finished with value: 0.9737171745496273 and parameters: {'learning_rate': 8.468238290855145e-05, 'weight_decay': 0.01, 'warmup_steps': 23, 'lambda_param': 0.7000000000000001, 'temperature': 3.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 102 with params: {'learning_rate': 0.00021237173133186566, 'weight_decay': 0.01, 'warmup_steps': 25, 'lambda_param': 1.0, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3525,0.203906,0.939,0.941147,0.939122,0.939338
2,0.1624,0.182137,0.9526,0.953741,0.95294,0.952592
3,0.1384,0.163992,0.9608,0.961522,0.960925,0.961047
4,0.1249,0.149932,0.97,0.97015,0.970133,0.970103
5,0.1162,0.146134,0.9699,0.970436,0.970045,0.9701
6,0.1126,0.140681,0.9741,0.974361,0.974207,0.974242
7,0.1106,0.139535,0.9744,0.974558,0.97455,0.974523


[I 2025-03-30 22:33:25,513] Trial 102 finished with value: 0.9745229570554542 and parameters: {'learning_rate': 0.00021237173133186566, 'weight_decay': 0.01, 'warmup_steps': 25, 'lambda_param': 1.0, 'temperature': 3.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 103 with params: {'learning_rate': 0.0003439804710936911, 'weight_decay': 0.01, 'warmup_steps': 24, 'lambda_param': 1.0, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3616,0.253643,0.9099,0.915584,0.910371,0.909996
2,0.1833,0.210072,0.9387,0.941549,0.938751,0.939


[I 2025-03-30 22:36:12,668] Trial 103 pruned. 


Trial 104 with params: {'learning_rate': 0.00023793889138512282, 'weight_decay': 0.01, 'warmup_steps': 26, 'lambda_param': 1.0, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3493,0.24071,0.9167,0.925365,0.91724,0.91788
2,0.1665,0.172419,0.9578,0.958254,0.958069,0.957889
3,0.1405,0.163229,0.9629,0.963249,0.963158,0.963091
4,0.1276,0.154535,0.9664,0.966471,0.966631,0.966493


[I 2025-03-30 22:42:07,362] Trial 104 pruned. 


Trial 105 with params: {'learning_rate': 0.001394113520827695, 'weight_decay': 0.002, 'warmup_steps': 31, 'lambda_param': 1.0, 'temperature': 6.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.6094,0.542065,0.7605,0.805146,0.760081,0.766905
2,0.3786,0.394766,0.8407,0.847074,0.841383,0.840135


[I 2025-03-30 22:44:59,148] Trial 105 pruned. 


Trial 106 with params: {'learning_rate': 0.00016644555832767357, 'weight_decay': 0.0, 'warmup_steps': 2, 'lambda_param': 0.30000000000000004, 'temperature': 6.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3118,0.196814,0.9435,0.947005,0.943655,0.943985
2,0.1569,0.183912,0.9498,0.951104,0.950078,0.949986
3,0.1356,0.159749,0.9638,0.963886,0.964032,0.963875
4,0.1221,0.150775,0.9673,0.967338,0.967495,0.967372


[I 2025-03-30 22:50:37,926] Trial 106 pruned. 


Trial 107 with params: {'learning_rate': 0.00012018461491622113, 'weight_decay': 0.01, 'warmup_steps': 14, 'lambda_param': 0.8, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3417,0.195284,0.9411,0.94525,0.941308,0.94153
2,0.1505,0.166771,0.9611,0.962002,0.961257,0.961313
3,0.1293,0.157388,0.9653,0.965804,0.965457,0.965452
4,0.1199,0.146623,0.9715,0.971813,0.971678,0.971632
5,0.1144,0.14289,0.9727,0.97318,0.972834,0.972895
6,0.1118,0.139618,0.9745,0.974637,0.974659,0.974633
7,0.11,0.138602,0.9752,0.975362,0.975351,0.975335


[I 2025-03-30 23:00:49,703] Trial 107 finished with value: 0.9753346264809897 and parameters: {'learning_rate': 0.00012018461491622113, 'weight_decay': 0.01, 'warmup_steps': 14, 'lambda_param': 0.8, 'temperature': 4.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 108 with params: {'learning_rate': 0.00014516609330979537, 'weight_decay': 0.01, 'warmup_steps': 14, 'lambda_param': 1.0, 'temperature': 3.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3315,0.192378,0.9445,0.947608,0.944885,0.94477
2,0.1523,0.169534,0.958,0.958637,0.958133,0.958259
3,0.1302,0.151798,0.9682,0.968529,0.968275,0.968291
4,0.1207,0.148449,0.9691,0.96937,0.969215,0.969235
5,0.1152,0.141565,0.9735,0.973721,0.973609,0.973607
6,0.1117,0.13929,0.9742,0.974492,0.974333,0.974343
7,0.1104,0.138052,0.9754,0.975529,0.975543,0.975494


[I 2025-03-30 23:10:51,000] Trial 108 finished with value: 0.9754935135193004 and parameters: {'learning_rate': 0.00014516609330979537, 'weight_decay': 0.01, 'warmup_steps': 14, 'lambda_param': 1.0, 'temperature': 3.5}. Best is trial 32 with value: 0.9764120594095738.


Trial 109 with params: {'learning_rate': 8.642340091115601e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 12, 'lambda_param': 0.9, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3618,0.191682,0.9462,0.949399,0.946402,0.946526
2,0.1495,0.159697,0.9646,0.965139,0.964698,0.964797
3,0.1286,0.152519,0.966,0.96661,0.966215,0.966162
4,0.1186,0.14729,0.9704,0.9705,0.970572,0.97051
5,0.1137,0.141227,0.9723,0.972674,0.972452,0.972471
6,0.1112,0.140252,0.973,0.973372,0.973136,0.973153
7,0.1098,0.139498,0.973,0.973204,0.973165,0.973137


[I 2025-03-30 23:20:57,368] Trial 109 finished with value: 0.9731369341767359 and parameters: {'learning_rate': 8.642340091115601e-05, 'weight_decay': 0.009000000000000001, 'warmup_steps': 12, 'lambda_param': 0.9, 'temperature': 4.0}. Best is trial 32 with value: 0.9764120594095738.


Trial 110 with params: {'learning_rate': 0.00021474458984009075, 'weight_decay': 0.01, 'warmup_steps': 12, 'lambda_param': 0.7000000000000001, 'temperature': 4.5}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3274,0.214375,0.9348,0.939259,0.934819,0.935019
2,0.1634,0.176028,0.9537,0.955206,0.953886,0.954019
3,0.1383,0.164537,0.9608,0.962426,0.961035,0.961095
4,0.1244,0.155969,0.9661,0.966883,0.966366,0.966236


[I 2025-03-30 23:26:47,552] Trial 110 pruned. 


Trial 111 with params: {'learning_rate': 0.00034532268462300755, 'weight_decay': 0.01, 'warmup_steps': 14, 'lambda_param': 1.0, 'temperature': 4.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3351,0.224306,0.9282,0.931075,0.928273,0.928871
2,0.181,0.208908,0.9356,0.938831,0.935757,0.935887


[I 2025-03-30 23:29:39,843] Trial 111 pruned. 


Trial 112 with params: {'learning_rate': 6.786706512825958e-05, 'weight_decay': 0.007, 'warmup_steps': 14, 'lambda_param': 0.6000000000000001, 'temperature': 5.0}


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,0.3914,0.181625,0.9525,0.953315,0.952575,0.952641
2,0.1526,0.161306,0.9638,0.964233,0.963904,0.963943


In [None]:
print(best_distill)

BestRun(run_id='35', objective=0.8607044045531035, hyperparameters={'learning_rate': 0.0006139968240256416, 'weight_decay': 0.007, 'warmup_steps': 4, 'lambda_param': 0.30000000000000004, 'temperature': 6.0}, run_summary=None)


In [None]:
base.reset_seed()

In [None]:
training_args = base.get_training_args(
    output_dir=f"~/results/{DATASET}/-aug_hp-search",
    logging_dir=f"~/logs/{DATASET}/-aug_hp-search",
    epochs=num_epochs,
    batch_size=batch_size,
)

In [None]:
def hp_space(trial):
    # Search space for the baseline run: optimizer settings only.
    params = {
        "learning_rate": trial.suggest_float("learning_rate", 5e-5, 5e-3, log=True),
        "weight_decay": trial.suggest_float("weight_decay", 0, 1e-2, step=1e-3),
        "warmup_steps": trial.suggest_int("warmup_steps", 0, warm_up),
    }
    print(f"Trial {trial.number} with params: {params}")
    return params
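
The ranges in `hp_space` can be illustrated without Optuna. The sketch below is not how the TPE sampler picks values (TPE is model-based); it only shows what a log-uniform range and a stepped range contain. The helper names `sample_log_uniform` and `stepped_grid` are hypothetical and exist only for this illustration.

```python
import math
import random


def sample_log_uniform(rng, low, high):
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def stepped_grid(low, high, step):
    """List the values low, low + step, ..., high, rounded for display."""
    count = round((high - low) / step)
    return [round(low + i * step, 10) for i in range(count + 1)]


rng = random.Random(42)

# Same bounds as the "learning_rate" entry: 5e-5 to 5e-3 on a log scale.
learning_rates = [sample_log_uniform(rng, 5e-5, 5e-3) for _ in range(5)]

# Same bounds as the "weight_decay" entry: 0 to 1e-2 in steps of 1e-3.
weight_decays = stepped_grid(0.0, 1e-2, 1e-3)

print(learning_rates)
print(weight_decays)
```

On a log scale, each decade of the learning-rate range (for example 5e-5 to 5e-4) is sampled about as often as the next one, which is why small learning rates are well represented in the trial log. The stepped range for `weight_decay` contains eleven candidate values.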

In [None]:
pruner = optuna.pruners.HyperbandPruner(min_resource=min_r, max_resource=max_r, reduction_factor=2, bootstrap_count=2)
sampler = optuna.samplers.TPESampler(seed=42, multivariate=True)
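
As a rough aid for choosing `min_r` and `max_r`, the sketch below counts how many brackets a Hyperband pruner would use. It follows the formula given in the Optuna documentation, floor(log_eta(max_resource / min_resource)) + 1, computed with integers to avoid floating-point logarithms. The resource values in the example are assumptions, since `min_r` and `max_r` are defined elsewhere in the notebook.

```python
def hyperband_bracket_count(min_resource, max_resource, reduction_factor):
    """Return floor(log_eta(max_resource / min_resource)) + 1 using integer math."""
    if min_resource < 1 or max_resource < min_resource or reduction_factor < 2:
        raise ValueError("expected 1 <= min_resource <= max_resource and reduction_factor >= 2")
    count = 1
    budget = min_resource * reduction_factor
    # Each further bracket requires the budget to fit within max_resource.
    while budget <= max_resource:
        count += 1
        budget *= reduction_factor
    return count


# Assumed values for illustration: 1 to 7 epochs with reduction_factor=2.
print(hyperband_bracket_count(1, 7, 2))
```

In the trial log, pruned trials stop after epoch 2 or epoch 4, while surviving trials run for 7 epochs. That pattern is consistent with halving-style checkpoints, although the exact rungs depend on the real values of `min_r` and `max_r`.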



In [None]:
trainer = Trainer(
    args=training_args,
    train_dataset=train_combo,
    eval_dataset=eval,  # note: this name shadows the built-in eval()
    compute_metrics=base.compute_metrics,
    model_init=lambda: get_model(),  # a fresh model is built for every trial
)

In [None]:
best_base_aug = trainer.hyperparameter_search(
    direction="maximize",
    backend="optuna",
    hp_space=hp_space,
    compute_objective=lambda metrics: metrics["eval_f1"],
    pruner=pruner,
    sampler=sampler,
    study_name="Base-head",
    n_trials=150
)

In [None]:
print(best_base_aug)

BestRun(run_id='35', objective=0.7718702742260117, hyperparameters={'learning_rate': 0.0024870786738035154, 'weight_decay': 0.009000000000000001, 'warmup_steps': 20}, run_summary=None)


In [None]:
base.reset_seed()

In [None]:
training_args = base.get_training_args(
    output_dir=f"~/results/{DATASET}/-aug-KD_hp-search",
    logging_dir=f"~/logs/{DATASET}/-aug-KD_hp-search",
    remove_unused_columns=False,
    epochs=num_epochs,
    batch_size=batch_size,
)

In [None]:
def hp_space(trial):
    # Search space for the distillation run: optimizer settings plus the
    # two distillation hyperparameters (loss weight and softmax temperature).
    params = {
        "learning_rate": trial.suggest_float("learning_rate", 5e-5, 5e-3, log=True),
        "weight_decay": trial.suggest_float("weight_decay", 0, 1e-2, step=1e-3),
        "warmup_steps": trial.suggest_int("warmup_steps", 0, warm_up),
        "lambda_param": trial.suggest_float("lambda_param", 0, 1, step=0.1),
        "temperature": trial.suggest_float("temperature", 2, 7, step=0.5),
    }
    print(f"Trial {trial.number} with params: {params}")
    return params

In [None]:
pruner = optuna.pruners.HyperbandPruner(min_resource=min_r, max_resource=max_r, reduction_factor=2, bootstrap_count=2)
sampler = optuna.samplers.TPESampler(seed=42, multivariate=True)



In [None]:
trainer = base.DistilTrainer(
    args=training_args,
    train_dataset=train_combo,
    eval_dataset=eval,
    compute_metrics=base.compute_metrics,
    model_init=lambda: get_model(),
)
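
The implementation of `base.DistilTrainer` is not shown in this notebook. The sketch below illustrates one common way `lambda_param` and `temperature` enter a distillation loss: a weighted sum of cross-entropy on the hard label and a temperature-scaled KL divergence between teacher and student. The weighting direction (`lambda_param` on the soft term) and the `temperature ** 2` scaling are assumptions borrowed from the usual formulation, not confirmed details of `base`. It is written in plain Python for a single example so that it runs without a GPU or PyTorch.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits divided by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label, lambda_param, temperature):
    """Assumed form: (1 - lambda) * CE(student, label) + lambda * T^2 * KL(teacher || student)."""
    # Hard-label term: cross-entropy of the student at temperature 1.
    hard = -math.log(softmax(student_logits)[label])
    # Soft term: KL divergence between the softened teacher and student distributions.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student) if t > 0)
    soft = kl * temperature ** 2
    return (1.0 - lambda_param) * hard + lambda_param * soft


# Toy logits for a 3-class problem; the values are made up for illustration.
student = [1.0, 2.0, 0.5]
teacher = [0.5, 3.0, 0.2]
print(distillation_loss(student, teacher, label=1, lambda_param=0.7, temperature=3.0))
```

Under this formulation, `lambda_param=1.0` (which several trials in the log use) would train the student on the teacher's softened outputs alone, and `lambda_param=0.0` would reduce to ordinary supervised training.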

In [None]:
best_distill_aug = trainer.hyperparameter_search(
    direction="maximize",
    backend="optuna",
    hp_space=hp_space,
    compute_objective=lambda metrics: metrics["eval_f1"],
    pruner=pruner,
    sampler=sampler,
    study_name="Distill",
    n_trials=150
)

In [None]:
print(best_distill_aug)

BestRun(run_id='44', objective=0.7611287079618219, hyperparameters={'learning_rate': 0.00063155918393816, 'weight_decay': 0.009000000000000001, 'warmup_steps': 5, 'lambda_param': 0.6000000000000001, 'temperature': 4.0}, run_summary=None)


In [None]:
print("Best random init training score: ", best_base)
print("Best random init distillation training score: ", best_distill)
print("Best pretrained (head only) training score: ", best_base_aug)
print("Best pretrained distillation (head only) training score: ", best_distill_aug)

NameError: name 'best_base_random' is not defined
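
One way to make the final comparison less dependent on individually named variables is to collect the search results in a dictionary and summarize them in one place. The sketch below uses a hypothetical `RunResult` stand-in with the same fields as the printed `BestRun` objects, filled with the objective and learning-rate values shown in the outputs above. The `summarize` helper and the dictionary keys are assumptions made for this illustration.

```python
from typing import Any, Dict, NamedTuple, Optional


class RunResult(NamedTuple):
    """Hypothetical stand-in mirroring the fields of the printed BestRun."""
    run_id: str
    objective: float
    hyperparameters: Dict[str, Any]
    run_summary: Optional[Any] = None


def summarize(results):
    """Return one formatted line per run, best objective first."""
    rows = sorted(results.items(), key=lambda item: item[1].objective, reverse=True)
    return [
        f"{name:<12} F1={run.objective:.4f} lr={run.hyperparameters['learning_rate']:.2e}"
        for name, run in rows
    ]


# Values copied from the BestRun outputs printed earlier in this section.
results = {
    "distill": RunResult("35", 0.8607044045531035, {"learning_rate": 0.0006139968240256416}),
    "base_aug": RunResult("35", 0.7718702742260117, {"learning_rate": 0.0024870786738035154}),
    "distill_aug": RunResult("44", 0.7611287079618219, {"learning_rate": 0.00063155918393816}),
}

for line in summarize(results):
    print(line)
```

In the notebook itself, the dictionary values would be the objects returned by `trainer.hyperparameter_search`, so a missing entry shows up when the dictionary is built rather than in the last cell.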