# Exploration and Comparison of Transformers for Image Classification

Comparison of linear-probing results across six datasets for each of the following models:
- ViT
- DeiT
- Swin
- CLIP

The results are shown in a separate notebook to avoid informational and visual clutter.

NOTE: Since variables are not shared across Jupyter notebook files, the results must be copied from the individual notebooks and hardcoded here.
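One way to avoid hand-copying values would be for each model notebook to write its results to a small file that this notebook then reads. The following is only a sketch of that idea, not something the project currently does: the helper names `save_results` and `load_results`, the file name `vit_results.json`, and the JSON layout are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def save_results(path, model, accuracies):
    """Write one model's per-dataset accuracies to a JSON file (hypothetical helper)."""
    payload = {"model": model, "accuracies": accuracies}
    Path(path).write_text(json.dumps(payload, indent=2))


def load_results(path):
    """Read back the model name and its accuracies (hypothetical helper)."""
    data = json.loads(Path(path).read_text())
    return data["model"], data["accuracies"]


# Demo in a temporary directory so nothing is left behind on disk.
with tempfile.TemporaryDirectory() as tmp:
    vit_path = Path(tmp) / "vit_results.json"  # assumed file name
    save_results(vit_path, "ViT", {"RESISC45": 0.846349, "DTD": 0.730319})
    model, accuracies = load_results(vit_path)

print(model, accuracies)
```

In practice the model notebook would call `save_results` once after evaluation, and this notebook would call `load_results` for each model instead of hardcoding numbers.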

### Prerequisites

Load necessary packages.

In [9]:
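import os

import pandas as pd

# Switch to the parent directory, where the `utils` package is expected to live.
os.chdir('..')
from utils.data_utils import create_accuracy_dict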

### Results

Collect the linear-probing accuracy of each model on each dataset.

NOTE: Values are taken from the "Results" section of each model's notebook.

In [2]:
resisc45_vit = 0.846349
food101_vit = 0.859644
fer2013_vit = 0.601839
pcam_vit = 0.843079
sun397_vit = 0.774529
dtd_vit = 0.730319

resisc45_deit = 0.886667
food101_deit = 0.786297
fer2013_deit = 0.527445
pcam_deit = 0.832336
sun397_deit = 0.726851
dtd_deit = 0.714894

resisc45_swin = 0.885397
food101_swin = 0.894733
fer2013_swin = 0.612148
pcam_swin = 0.845551
sun397_swin = 0.804184
dtd_swin = 0.782447

resisc45_clip = 0.898730
food101_clip = 0.889149
fer2013_clip = 0.670939
pcam_clip = 0.836365
sun397_clip = 0.790575
dtd_clip = 0.726064

Group the results by dataset, keeping the model order ViT, DeiT, Swin, CLIP.

In [3]:
results_resisc45 = [resisc45_vit, resisc45_deit, resisc45_swin, resisc45_clip]
results_food101 = [food101_vit, food101_deit, food101_swin, food101_clip]
results_fer2013 = [fer2013_vit, fer2013_deit, fer2013_swin, fer2013_clip]
results_pcam = [pcam_vit, pcam_deit, pcam_swin, pcam_clip]
results_sun397 = [sun397_vit, sun397_deit, sun397_swin, sun397_clip]
results_dtd = [dtd_vit, dtd_deit, dtd_swin, dtd_clip]

Collect the per-dataset lists into a single nested list, with one row per dataset.

In [4]:
results = [
    results_resisc45,
    results_food101,
    results_fer2013,
    results_pcam,
    results_sun397,
    results_dtd,
]

In [13]:
labels = ['RESISC45', 'Food-101', 'FER2013', 'PatchCamelyon', 'SUN397', 'DTD']
models = ['ViT', 'DeiT', 'Swin', 'CLIP']

In [10]:
acc_dict = create_accuracy_dict(
    results,
    labels
)

In [11]:
acc_dict

{'RESISC45': [0.846349, 0.886667, 0.885397, 0.89873],
 'Food-101': [0.859644, 0.786297, 0.894733, 0.889149],
 'FER2013': [0.601839, 0.527445, 0.612148, 0.670939],
 'PatchCamelyon': [0.843079, 0.832336, 0.845551, 0.836365],
 'SUN397': [0.774529, 0.726851, 0.804184, 0.790575],
 'DTD': [0.730319, 0.714894, 0.782447, 0.726064]}
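`create_accuracy_dict` comes from `utils.data_utils`, whose source is not shown in this notebook. Judging only from the output above, it appears to pair each dataset label with that dataset's list of per-model accuracies. A minimal stand-in under that assumption could look like the following; the name `create_accuracy_dict_sketch` is hypothetical, and the real helper may do more.

```python
def create_accuracy_dict_sketch(results, labels):
    """Hypothetical stand-in: map each dataset label to its per-model accuracies."""
    if len(results) != len(labels):
        raise ValueError("results and labels must have the same length")
    # zip pairs the i-th label with the i-th row of accuracies.
    return dict(zip(labels, results))


# Two rows copied from the results above (model order: ViT, DeiT, Swin, CLIP).
example = create_accuracy_dict_sketch(
    [
        [0.846349, 0.886667, 0.885397, 0.898730],
        [0.859644, 0.786297, 0.894733, 0.889149],
    ],
    ["RESISC45", "Food-101"],
)
print(example)
```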

Display a DataFrame of the results, with one row per dataset and one column per model.

In [14]:
df = pd.DataFrame(results, columns=models, index=labels)
df

| Dataset | ViT | DeiT | Swin | CLIP |
|---|---|---|---|---|
| RESISC45 | 0.846349 | 0.886667 | 0.885397 | 0.898730 |
| Food-101 | 0.859644 | 0.786297 | 0.894733 | 0.889149 |
| FER2013 | 0.601839 | 0.527445 | 0.612148 | 0.670939 |
| PatchCamelyon | 0.843079 | 0.832336 | 0.845551 | 0.836365 |
| SUN397 | 0.774529 | 0.726851 | 0.804184 | 0.790575 |
| DTD | 0.730319 | 0.714894 | 0.782447 | 0.726064 |


Print the best-performing model for each dataset.

In [27]:
for dataset, values in acc_dict.items():
    max_value = max(values)
    best_model = models[values.index(max_value)]
    print(f"For dataset \033[1m{dataset}\033[0m, the best model is \033[1m{best_model}\033[0m with an accuracy of \033[1m{max_value:.6}\033[0m")

For dataset [1mRESISC45[0m, the best model is [1mCLIP[0m with an accuracy of [1m0.89873[0m
For dataset [1mFood-101[0m, the best model is [1mSwin[0m with an accuracy of [1m0.894733[0m
For dataset [1mFER2013[0m, the best model is [1mCLIP[0m with an accuracy of [1m0.670939[0m
For dataset [1mPatchCamelyon[0m, the best model is [1mSwin[0m with an accuracy of [1m0.845551[0m
For dataset [1mSUN397[0m, the best model is [1mSwin[0m with an accuracy of [1m0.804184[0m
For dataset [1mDTD[0m, the best model is [1mSwin[0m with an accuracy of [1m0.782447[0m
