# Fine-tuning vision models on Lambda Cloud: a cost-performance analysis

*This notebook presents a cost-performance analysis of fine-tuning, on Lambda Cloud, the vision models covered in the fast.ai article "The best vision models for fine-tuning".*

## Setup

Install dependencies

In [1]:
!pip install fastcore >/dev/null 2>&1
!pip install wandb >/dev/null 2>&1
!pip install ghapi >/dev/null 2>&1

Wandb login

In [1]:
import wandb
wandb.login()

wandb: Logging into wandb.ai. (Learn how to deploy a W&B server locally: https://wandb.me/wandb-server)
wandb: You can find your API key in your browser here: https://wandb.ai/authorize
wandb: Paste an API key from your profile and hit enter, or press ctrl+c to quit: 

 ········································


wandb: Appending key for api.wandb.ai to your netrc file: /home/eole/.netrc


True

Fetch sweep output data

In [None]:
import wandb,os
from fastcore.all import *
import pandas as pd

api = wandb.Api()

sweep_ids = [
    'eolecvka/fastai_timm/sweeps/ygh90vj4'
]
sweeps = concat(api.sweep(o).runs for o in sweep_ids)
summs = [{**r.summary, 'model_name':r.config['model_name']} for r in sweeps]

df = pd.DataFrame(summs)
df['dataset'] = 'planet'
df.loc[df['accuracy_multi'].isna(), 'dataset'] = 'pets'
df.loc[df['dataset']=='planet', 'accuracy'] = df.loc[df['dataset']=='planet', 'accuracy_multi']
df['error_rate'] = 1-df.accuracy

pd.set_option('display.max_columns', None)
# df
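
The labelling logic in the cell above can be checked on a toy table. The sketch below assumes, as the cell does, that a run with no `accuracy_multi` value is a Pets run and that every other run is a Planet run. The model names and metric values are made up for illustration and do not come from the sweep.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the W&B run summaries (values are invented for illustration).
toy = pd.DataFrame({
    "model_name": ["resnet18", "resnet18", "convnext_tiny"],
    "accuracy": [0.93, np.nan, 0.95],
    "accuracy_multi": [np.nan, 0.96, np.nan],
})

# Default every run to 'planet', then relabel runs that lack the multi-label metric.
toy["dataset"] = "planet"
toy.loc[toy["accuracy_multi"].isna(), "dataset"] = "pets"

# For Planet runs, copy the multi-label accuracy into the shared 'accuracy' column.
is_planet = toy["dataset"] == "planet"
toy.loc[is_planet, "accuracy"] = toy.loc[is_planet, "accuracy_multi"]
toy["error_rate"] = 1 - toy["accuracy"]

print(toy[["model_name", "dataset", "accuracy", "error_rate"]])
```

After this step every row has a single `accuracy` value regardless of dataset, which is what the later plots rely on.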

Build summary tables

In [None]:
# cols = ['dataset', 'model_name', 'GPU_mem', 'error_rate', 'valid_loss', 'train_loss', 'fit_time']
cols = ['model_name', 'dataset', 'fit_time', '_runtime', 'accuracy']

df_pets = df.loc[df['dataset']=='pets']
df_planets = df.loc[df['dataset']=='planet']

# Note: agg('max') takes the maximum of each column independently within a model.
df_pets_top_models_acc = df_pets[cols].groupby(['model_name']).agg('max').sort_values('accuracy', ascending=False).dropna()
df_planets_top_models_acc = df_planets[cols].groupby(['model_name']).agg('max').sort_values('accuracy', ascending=False).dropna()

In [None]:
df_pets_top_models_acc.head()

In [None]:
df_planets_top_models_acc.head()
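
One thing to keep in mind when reading these tables: a grouped `max` is computed per column, so the `fit_time` and `accuracy` shown for a model can come from different runs. The sketch below is one possible alternative, not the method used in this notebook. It keeps the full row of the most accurate run per model using `idxmax`. The `runs` table and its values are hypothetical.

```python
import pandas as pd

# Hypothetical per-run results: two runs for each of two models.
runs = pd.DataFrame({
    "model_name": ["model_a", "model_a", "model_b", "model_b"],
    "fit_time": [100.0, 140.0, 60.0, 55.0],
    "accuracy": [0.95, 0.93, 0.90, 0.91],
})

# Column-wise max: fit_time and accuracy may be taken from different runs.
colwise_max = runs.groupby("model_name").agg("max")

# Alternative: select the row index of the most accurate run per model,
# then keep that whole row so fit_time and accuracy belong together.
best_idx = runs.groupby("model_name")["accuracy"].idxmax()
best_runs = (
    runs.loc[best_idx]
    .set_index("model_name")
    .sort_values("accuracy", ascending=False)
)

print(colwise_max)
print(best_runs)
```

With real sweep data, rows with a missing `accuracy` would need to be dropped before calling `idxmax`.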

## Visual analysis: fit time vs accuracy

In [None]:
import numpy as np
import matplotlib.pyplot as plt

x = df_planets_top_models_acc.fit_time.tolist()
y = df_planets_top_models_acc.accuracy.tolist()
plt.scatter(x, y, alpha=0.5)
plt.title("Cost-performance analysis of model fine-tuning on the Planet dataset")
plt.xlabel("Fit time (seconds)")
plt.ylabel("Accuracy")
plt.show()

In [None]:
import numpy as np
import matplotlib.pyplot as plt

x = df_pets_top_models_acc._runtime.tolist()
y = df_pets_top_models_acc.accuracy.tolist()
plt.scatter(x, y, alpha=0.5)
plt.title("Cost-performance analysis of model fine-tuning on the Oxford-IIIT Pet dataset")
plt.xlabel("Runtime (seconds)")
plt.ylabel("Accuracy")
plt.show()
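
The plots use time as a proxy for cost. To express it in dollars, the runtime can be multiplied by an hourly instance price. The sketch below assumes `_runtime` is in seconds, as the axis labels state. `HOURLY_RATE_USD` is a hypothetical placeholder, not an actual Lambda Cloud price, and the toy values are invented.

```python
import pandas as pd

# HYPOTHETICAL hourly price in USD; replace with the real rate of the instance used.
HOURLY_RATE_USD = 1.10


def runtime_to_cost(runtime_s, hourly_rate_usd=HOURLY_RATE_USD):
    """Convert a runtime in seconds to an approximate cost in USD."""
    return runtime_s / 3600 * hourly_rate_usd


# Toy table with the same column names as the summary tables (values are invented).
toy = pd.DataFrame(
    {"_runtime": [1800.0, 3600.0, 7200.0], "accuracy": [0.93, 0.95, 0.96]},
    index=pd.Index(["model_a", "model_b", "model_c"], name="model_name"),
)
toy["cost_usd"] = runtime_to_cost(toy["_runtime"])

print(toy)
```

Plotting `cost_usd` instead of the raw runtime on the x-axis would leave the shape of the scatter unchanged, since the conversion is a constant scaling.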

Per-run results for a single model

In [None]:
model_name = 'swin_large_patch4_window7_224_in22k'
dataset_name = 'planet'

model_finetuning_cost_perf = df.loc[
    (df['model_name']==model_name) &
    (df['dataset']==dataset_name)
]

import numpy as np
import matplotlib.pyplot as plt

x = model_finetuning_cost_perf._runtime.tolist()
y = model_finetuning_cost_perf.accuracy.tolist()
plt.scatter(x, y, alpha=0.5)
plt.title(f"Cost-performance analysis of {model_name} fine-tuning on the {dataset_name} dataset")
plt.xlabel("Runtime (seconds)")
plt.ylabel("Accuracy")
plt.show()

Why does the plot look like this? Inspect the underlying runs:

In [None]:
cols = ['accuracy', 'accuracy_multi']
model_finetuning_cost_perf[cols]

In [None]:
df.epoch.unique()