# Tracking Experiments with AIM

In this guide, we'll look at how to use the open-source tracker [AIM](https://github.com/aimhubio/aim) to keep track of your ML experiments.

In [None]:
%%capture
! pip install aim transformers datasets

## Basic Usage

AIM, like other loggers you may have used in the past, has a fairly simple user-facing API for creating and logging experiments.

  - ✅ First, you initialize a "Run". A run represents a single training experiment.
  - 📝 From there, you can attach metadata to the run object, treating it much like a dictionary. Below, we log some fake hyperparameters for the run.
  - 📈 To log metrics in your run, you call the `run.track` function, supplying the value you want to track, a name for it, and the step associated with it.
    - Additionally, you can supply a "context", which lets you filter, sort, and group this metric with other related metrics. Below we specify "train" as the subset, but you could also use "eval" and "test" subsets (or whatever you want, really).

In [None]:
from aim import Run

# Initialize a new run
run = Run()

# Log run parameters
run["hparams"] = {
    "learning_rate": 0.001,
    "batch_size": 32,
}

# Log metrics (dummy values: the step index stands in for a real loss/accuracy)
for i in range(10):
    run.track(i, name='loss', step=i, context={"subset": "train"})
    run.track(i, name='acc', step=i, context={"subset": "train"})
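To make the role of the context a bit more concrete, here is a minimal sketch of a helper that logs a whole dictionary of metrics under one subset. Note that `track_metrics` is a hypothetical name introduced for illustration and is not part of AIM. It only assumes that the object passed in exposes `track(value, name=..., step=..., context=...)`, the same call used in the cell above.

```python
def track_metrics(run, metrics, step, subset):
    """Log every entry of `metrics` to `run` under a {"subset": subset} context.

    `run` is assumed to be an aim.Run (or anything with a compatible `track` method).
    """
    for name, value in metrics.items():
        run.track(value, name=name, step=step, context={"subset": subset})
```

With the `run` object created earlier, calling `track_metrics(run, {"loss": 0.7, "acc": 0.5}, step=0, subset="eval")` would log both values under an "eval" subset, which should let you compare them against the "train" series of the same name.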

We'll view these logs in the browser in a minute! First, let's see a slightly more realistic example of tracking with AIM...

## Usage with Hugging Face `transformers`

As a more involved example, let's take a look at how you can use AIM to track experiments using the Hugging Face [`transformers`](https://github.com/huggingface/transformers) library.

First, let's write a quick example to train a text classification model. Specifically, we'll be fine-tuning `bert-base-cased` on the [IMDB reviews](https://huggingface.co/datasets/imdb) dataset.

In [None]:
from transformers import AutoModelForSequenceClassification, AutoTokenizer, default_data_collator
from datasets import load_dataset
import numpy as np

# Load the dataset + take a small sample of it
ds = load_dataset("imdb")
del ds['unsupervised']  # discarding this split as we won't need it
ds['train'] = ds['train'].shuffle(seed=42).select(range(5000))
ds['test'] = ds['test'].shuffle(seed=42).select(range(5000))

# Rename label -> labels so inputs can be **unpacked into model call fn in Trainer
ds = ds.rename_column("label", "labels")

# Get the labels from the dataset + sort them so we can add this info to model config
labels = ds["train"].unique("labels")
labels.sort()

# Initialize the model, supplying num_labels/label mappings to reinit classification head
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-cased",
    num_labels=len(labels),
    return_dict=True,
    label2id={x: i for i, x in enumerate(labels)},
    id2label=dict(enumerate(labels)),
)

# Initialize tokenizer and define processing fn to apply to dataset
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
def tokenize_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True)

# Apply the processing fn, removing unnecessary text column
ds = ds.map(tokenize_function, batched=True, remove_columns=['text'])

# Define a fn to compute evaluation accuracy
def compute_metrics(pred):
    preds = pred.predictions.argmax(-1)
    acc = np.sum(preds == pred.label_ids) / preds.shape[0]
    return {'acc': acc}
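As a quick sanity check of the accuracy function, the sketch below calls it on a hand-made batch. The `fake_pred` object is a hypothetical stand-in built with `SimpleNamespace`; it only mimics the two attributes the function reads (`predictions` and `label_ids`) and is not the object the `Trainer` actually constructs. The function body is repeated so the snippet runs on its own.

```python
from types import SimpleNamespace

import numpy as np


def compute_metrics(pred):
    # Same logic as the cell above, repeated so this snippet is self-contained
    preds = pred.predictions.argmax(-1)
    acc = np.sum(preds == pred.label_ids) / preds.shape[0]
    return {'acc': acc}


# Hypothetical logits for 4 examples and 2 classes, plus made-up gold labels
fake_pred = SimpleNamespace(
    predictions=np.array([[2.0, -1.0], [0.1, 0.9], [1.5, 0.3], [-0.2, 0.4]]),
    label_ids=np.array([0, 1, 1, 1]),
)

# The argmax predictions are [0, 1, 0, 1], so 3 of the 4 labels match
result = compute_metrics(fake_pred)
print(result)
```

The dictionary key (`acc` here) is the name the metric is reported under during evaluation.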

Here, we set up the `AimCallback`, which is a handy [`transformers.TrainerCallback`](https://huggingface.co/docs/transformers/v4.24.0/en/main_classes/callback#transformers.TrainerCallback) that comes with `aim`. Then, we define the training configuration and the trainer, making sure to disable other loggers and provide our callback instance.

Then, we train! 🚀

In [None]:
from transformers import Trainer, TrainingArguments
from aim.hugging_face import AimCallback

# Set up AIM logger
aim_callback = AimCallback(experiment='huggingface_experiment')

# Define training configuration, setting report_to='none' because we will use aim instead
training_args = TrainingArguments(
    'bert-base-cased-imdb-sample',
    evaluation_strategy='epoch',  # newer transformers releases rename this to eval_strategy
    num_train_epochs=4,
    report_to="none",
    logging_steps=10,
    per_device_train_batch_size=8,
)

# Initialize trainer, supplying the aim callback
trainer = Trainer(
    model,
    training_args,
    data_collator=default_data_collator,
    callbacks=[aim_callback],
    train_dataset=ds['train'],
    eval_dataset=ds['test'],
    compute_metrics=compute_metrics
)

# Train!
train_result = trainer.train()
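To get a rough feel for how much this run will log, the arithmetic below works out the number of optimizer steps and training log events implied by the settings above. It is a back-of-the-envelope sketch that assumes a single device and no gradient accumulation; with several GPUs or accumulation enabled, the counts would be smaller.

```python
import math

# Values taken from the dataset sample and TrainingArguments above
num_train_examples = 5000
per_device_train_batch_size = 8
num_train_epochs = 4
logging_steps = 10

# Assumption: one device, gradient_accumulation_steps left at 1
steps_per_epoch = math.ceil(num_train_examples / per_device_train_batch_size)
total_steps = steps_per_epoch * num_train_epochs
num_train_logs = total_steps // logging_steps

print(steps_per_epoch, total_steps, num_train_logs)
```

Under those assumptions this gives 625 steps per epoch, 2500 steps in total, and about 250 training log events, in addition to one evaluation per epoch.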

## View the Runs

Finally, we can view both our dummy run and our `transformers` run. 👀

Note: if you aren't running this in a notebook, you can simply run `aim up` in your terminal to launch the viewer locally.

In [None]:
%load_ext aim

In [None]:
%aim up