## Evaluate the finetuned model
The original goal was to check whether the new code (systematically refactored to be command-line based) reproduces the performance of the old code. However, because the new code handles only one kind of measurement per model, there is no straightforward way to compare the two versions. The old model did not perform particularly well either. I therefore changed the target to checking whether the new code gives acceptable performance. Later, we will run a series of hyperparameter tuning experiments to improve it.

In [3]:
import torch
from src.models.mae_vit_regressor import mae_vit_base_patch16
from src.datas import transforms
from src.datas.dataloader import get_dataloader

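# Use the GPU when available, otherwise fall back to the CPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = mae_vit_base_patch16(pretrained=True, weights="results/model.ckpt").to(device)
# Reload the finetuned weights explicitly, mapped onto the chosen device.
checkpoint = torch.load('results/model.ckpt', map_location=device)
model.load_state_dict(checkpoint)

# Build the finetuning dataloaders for the CaCO3 measurement.
dataloader = get_dataloader(ispretrain=False, annotations_file='info.csv', input_dir='data/finetune/CaCO3%/train',
                            batch_size=256, transform=transforms.standardize_numpy, num_workers=8)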

In [4]:
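from archives.src.util.evaluate import finetune_evaluator

eva = finetune_evaluator()
# MSE of the finetuned model and of the base model on the validation split.
model_mse = eva.evaluate(model=model, dataloader=dataloader['val'])
base_mse = eva.evaluate_base(dataloader['val'])

# R2 expressed as the improvement of the model over the base model.
r_square = 1 - model_mse / base_mse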

print('CaCO3')
print(f'MSE: {model_mse}')
print(f'RMSE: {model_mse**0.5}')
print(f'MSE of base model: {base_mse}')
print(f'R2: {r_square}')

CaCO3
MSE: [67.022675]
RMSE: [8.186738]
MSE of base model: [335.09964214]
R2: [0.79999181]


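The CaCO3 performance is acceptable: on the validation split, the model reaches an R2 of about 0.80 and an RMSE of about 8.19, against a base-model MSE of about 335.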
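
The R2 above is computed as one minus the ratio of the model MSE to the base-model MSE. The sketch below recomputes the reported figures and then checks the same formula on made-up toy data. It assumes the base model predicts the mean of the targets, which this notebook does not confirm; `r2_from_mse` and the toy arrays are hypothetical names introduced only for illustration.

```python
import numpy as np


def r2_from_mse(model_mse, base_mse):
    """R2 as one minus the ratio of model MSE to base-model MSE."""
    return 1.0 - model_mse / base_mse


# Values reported in the output above for CaCO3.
model_mse = 67.022675
base_mse = 335.09964214
rmse = model_mse ** 0.5
r2 = r2_from_mse(model_mse, base_mse)
print(f"RMSE: {rmse:.4f}, R2: {r2:.4f}")

# Toy data (made up). ASSUMPTION: the base model predicts the target mean,
# in which case this ratio equals the usual coefficient of determination.
y_true = np.array([10.0, 20.0, 30.0, 40.0])
y_pred = np.array([12.0, 18.0, 33.0, 39.0])
toy_model_mse = np.mean((y_true - y_pred) ** 2)
toy_base_mse = np.mean((y_true - y_true.mean()) ** 2)
toy_r2 = r2_from_mse(toy_model_mse, toy_base_mse)
print(f"Toy R2: {toy_r2:.3f}")
```

If the base model in `evaluate_base` is something other than a mean predictor, the ratio is still a valid skill score relative to that base model, but it is no longer the textbook R2.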

However, finetuning on the TOC dataset yields NaN for both the training and validation losses from the first epoch onward. We need to fix this.
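
One possible first diagnostic, since the loss is NaN from the very first epoch, is to check whether the data itself contains non-finite values (for example, missing TOC entries in the annotations, or a division by zero during standardization). The sketch below is not the project's code: `find_nonfinite_batches` is a hypothetical helper, and it assumes each batch is an `(inputs, targets)` pair that can be converted to a NumPy array, which may not match what `get_dataloader` actually returns.

```python
import numpy as np


def find_nonfinite_batches(batches):
    """Return the indices of batches whose inputs or targets contain NaN or inf."""
    bad = []
    for i, (inputs, targets) in enumerate(batches):
        # ASSUMPTION: both items convert cleanly to float arrays.
        x = np.asarray(inputs, dtype=float)
        y = np.asarray(targets, dtype=float)
        if not (np.isfinite(x).all() and np.isfinite(y).all()):
            bad.append(i)
    return bad


# Toy batches standing in for a real training loader; the second one
# carries a NaN target, as a missing annotation value might.
toy_batches = [
    (np.ones((2, 4)), np.array([0.5, 1.2])),
    (np.ones((2, 4)), np.array([np.nan, 0.7])),
]
bad_batches = find_nonfinite_batches(toy_batches)
print(bad_batches)
```

If every batch turns out to be finite, the cause is more likely on the optimization side (for instance, the learning rate or mixed-precision overflow), which this check does not cover.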