# Demonstrating a standard Active Learning pipeline
Train accurate classifier models with minimal data labeling (and minimal code) via active learning and AutoML.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/cleanlab/examples/blob/master/active_learning_single_annotator/active_learning_single_annotator.ipynb)

This notebook demonstrates a practical approach to efficiently label data for training an accurate image classifier via active learning and AutoML. We consider standard active learning settings with a pool of unlabeled examples, where we label a batch of examples at a time and collect **at most one label** per example. If your data labeling may be noisy (imperfect), the consider our **active_learning_multiannotator** notebook instead, which helps you decide what data to re-label during active learning.

In **Active Learning**, we aim to construct a labeled dataset by collecting the fewest labels that still allow us to train an accurate classifier model. Here we assume data labeling is done in **batches**, and between these data labeling rounds, we retrain our classifier to decide what previously unlabeled examples (i.e. datapoints) to label next round.


This notebook demonstrates how to compute these scores easily for use in sequential active learning, showing how a classification model iteratively improves after labeling more examples for multiple rounds with the following steps:

1. Establish an initially labeled dataset, `df_labeled` to train the model on. This is a small subset of our training data, `df_train`. The rest of the training data is marked as `df_unlabeled`.
2. Train the model on the available labeled data and get predictions for the unlabeled data, `pred_probs_unlabeled`.
3. Compute active learning scores for all unlabeled examples and select which samples to collect labels for.
4. Label the selected samples and add them to current training set.
5. Repeat steps 2-4 to collect as many labels as your budget permits.

The accuracy of the model trained on the resulting dataset will generally match that of the same model trained on a much larger set of randomly selected examples -- i.e. this is the cheapest way to grow a dataset for training an accurate classifier!

## Import dependencies and data

Here we use images from the [Caltech-256](https://data.caltech.edu/records/nyy15-4j048) [1] classification dataset. Any image dataset in the same format can be substituted instead and the same code should work. The active learning method demonstrated here works for any classification data (text, tabular, audio, etc.) as long as you are able to train a ML model on the labeled subset of the data.

In [1]:
import time
import numpy as np
import pandas as pd
from autogluon.multimodal import MultiModalPredictor
from gluoncv.auto.data.dataset import ImageClassificationDataset
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

from cleanlab.multiannotator import get_label_quality_scores, get_active_learning_scores
from utils.model_training_autogluon import train

  from .autonotebook import tqdm as notebook_tqdm


In [2]:
!wget -nc 'https://cleanlab-public.s3.amazonaws.com/ActiveLearning/Caltech256/256_ObjectCategories.zip' && unzip -o -q 256_ObjectCategories.zip

File ‘256_ObjectCategories.zip’ already there; not retrieving.



## Select initial labeled dataset
We load our data file into a variable called `dataset`. This is a DataFrame containing labels and file paths for each image (i.e. example) from Caltech-256

We then randomly split the dataset into train and test splits. Test data are just used here to demonstrate the accuracy in our model after each active learning round (you may not have any test data in your applications).

In [3]:
dataset = ImageClassificationDataset.from_folder('./256_ObjectCategories/')
dataset = dataset.replace(257, 256) # no class class in dataset is labeled as 257, we need to reindex

# Split data into train and test
df_train, df_test = train_test_split(dataset, test_size=0.33, random_state=123)

The train data are further split into labeled and unlabeled pools. `df_labeled` represents our initial labeled dataset which we use to train an initial classifier model (in your application, this would be all the labeled data you have). `df_unlabeled` represents our unlabeled pool of examples we could consider labeling. In this example, we technically know all the labels for these images too -- given they all come from Caltech-256 -- but we demonstrate how active learning would work in your applications by assuming we don't know their labels. In each active learning round, we only reveal the label of specific images the algorithm decides to collect labels for.

In this demonstration, our initial training set (`df_labeled`) has 8 labeled images from each class, which is not enough data to train a good classifier. The goal is to grow this dataset with the fewest number of additional labeled examples that suffice to train an accurate model.

In [4]:
def get_labeled(dataset,  num_labeled_per_class=8):
    """Splits provided dataset into two datasets. With df_labeled containing num_labeled_per_class labeles for 
    each class and df_unlabeled containing the rest of the rows in dataset"""
    
    df_labeled = dataset.groupby("label").sample(n=num_labeled_per_class, random_state=123)
    labeled_index = list(df_labeled.index)
    unlabeled_index = [i for i in range(len(dataset)) if i not in labeled_index]
    df_unlabeled = dataset.iloc[unlabeled_index]
    df_unlabeled = df_unlabeled.reset_index(drop=True)
    df_labeled = df_labeled.reset_index(drop=True)    
    return df_labeled, df_unlabeled

# Split the train data into labeled and unlabeled with 8 labeled per each class
df_labeled, df_unlabeled = get_labeled(df_train, num_labeled_per_class=8)

## Train model on labeled data and get predicted class probabilites for unlabeled data

The first step of the active learning pipeline is to train your model on the available labeled data. Next we ask the trained model for its predictions on the unlabeled data -- specifically the predicted probability of each class for each unlabeled example. The `train()` function below returns our `predictor` fitted to `df_labeled`. To use a different type of model, modify this `train()` function as needed. All you need to run active learning with cleanlab is code to: (1) train your model on the labeled data, (2) get its predicted class probabilities for the unlabeled data, (3) collect labels for the examples with the lowest active learning scores.

In [5]:
predictor = train(df_labeled, out_folder=None, time_limit=30)
pred_probs_unlabeled = predictor.predict_proba(df_unlabeled)

Global seed set to 123
No path specified. Models will be saved in: "AutogluonModels/ag-20230324_165545/"
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
Auto select gpus: [0]
Using 16bit native Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name              | Type                            | Params
----------------------------------------------------------------------
0 | model             | TimmAutoModelForImagePrediction | 87.0 M
1 | validation_metric | Accuracy                        | 0     
2 | loss_func         | CrossEntropyLoss                | 0     
----------------------------------------------------------------------
87.0 M    Trainable params
0         Non-trainable params
87.0 M    Total params
174.013   Total estimated model params size (MB)


Epoch 0:  33%|█████████████████████████▉                                                    | 103/310 [00:13<00:27,  7.58it/s, loss=5.58, v_num=]
Validation: 0it [00:00, ?it/s][A
Validation:   0%|                                                                                                         | 0/52 [00:00<?, ?it/s][A
Validation DataLoader 0:   0%|                                                                                            | 0/52 [00:00<?, ?it/s][A
Epoch 0:  34%|██████████████████████████▏                                                   | 104/310 [00:13<00:27,  7.52it/s, loss=5.58, v_num=][A
Epoch 0:  34%|██████████████████████████▍                                                   | 105/310 [00:13<00:27,  7.56it/s, loss=5.58, v_num=][A
Epoch 0:  34%|██████████████████████████▋                                                   | 106/310 [00:13<00:26,  7.61it/s, loss=5.58, v_num=][A
Epoch 0:  35%|██████████████████████████▉                                  

Epoch 0, global step 6: 'val_accuracy' reached 0.00728 (best 0.00728), saving model to '/home/ubuntu/examples/active_learning_single_annotator/AutogluonModels/ag-20230324_165545/epoch=0-step=6.ckpt' as top 3


Epoch 0:  66%|███████████████████████████████████████████████████▊                          | 206/310 [00:29<00:15,  6.89it/s, loss=5.49, v_num=]

Time limit reached. Elapsed time is 0:00:30. Signaling Trainer to stop.


Epoch 0:  67%|████████████████████████████████████████████████████                          | 207/310 [00:30<00:14,  6.89it/s, loss=5.49, v_num=]
Validation: 0it [00:00, ?it/s][A
Validation:   0%|                                                                                                         | 0/52 [00:00<?, ?it/s][A
Validation DataLoader 0:   0%|                                                                                            | 0/52 [00:00<?, ?it/s][A
Epoch 0:  67%|████████████████████████████████████████████████████▎                         | 208/310 [00:30<00:14,  6.87it/s, loss=5.49, v_num=][A
Epoch 0:  67%|████████████████████████████████████████████████████▌                         | 209/310 [00:30<00:14,  6.89it/s, loss=5.49, v_num=][A
Epoch 0:  68%|████████████████████████████████████████████████████▊                         | 210/310 [00:30<00:14,  6.91it/s, loss=5.49, v_num=][A
Epoch 0:  68%|█████████████████████████████████████████████████████        

Epoch 0, global step 9: 'val_accuracy' reached 0.00971 (best 0.00971), saving model to '/home/ubuntu/examples/active_learning_single_annotator/AutogluonModels/ag-20230324_165545/epoch=0-step=9.ckpt' as top 3


Epoch 0:  84%|█████████████████████████████████████████████████████████████████▏            | 259/310 [00:47<00:09,  5.41it/s, loss=5.49, v_num=]


INFO:automm:Start to fuse 2 checkpoints via the greedy soup algorithm.


Predicting DataLoader 0: 100%|███████████████████████████████████████████████████████████████████████████████████| 13/13 [00:02<00:00,  5.64it/s]
Predicting DataLoader 0: 100%|███████████████████████████████████████████████████████████████████████████████████| 13/13 [00:02<00:00,  5.72it/s]
Predicting DataLoader 0: 100%|███████████████████████████████████████████████████████████████████████████████████| 13/13 [00:02<00:00,  5.59it/s]


INFO:automm:Models and intermediate outputs are saved to /home/ubuntu/examples/active_learning_single_annotator/AutogluonModels/ag-20230324_165545 


Predicting DataLoader 0: 100%|█████████████████████████████████████████████████████████████████████████████████| 597/597 [01:42<00:00,  5.84it/s]


## Obtain active learning scores for the unlabeled data

Using these predicted class probabilities, you should next compute active learning scores that estimate the informativeness of labeling each datapoint. Since we will collect at most one annotation per example in this pipeline, we only care about scoring the unlabeled data.

These active learning scores represent how confident our model is about an example's true label based on the currently obtained annotations; examples with the lowest scores are those for which additional labels should be collected (i.e. likely the most informative). These scores are estimated via [ActiveLab](https://arxiv.org/abs/2301.11856), an algorithm developed by the Cleanlab team. 

In [6]:
# compute active learning scores
_, active_learning_scores_unlabeled = get_active_learning_scores(
    df_labeled['label'].to_numpy(), pred_probs_unlabeled=pred_probs_unlabeled
)

  pred_probs = pred_probs / np.sum(pred_probs, axis=1)[:, np.newaxis]
  scaled_pred_probs / np.sum(scaled_pred_probs, axis=1)[:, np.newaxis]


In [7]:
# print active learning scores for the first 5 examples in the unlabeld pool:
active_learning_scores_unlabeled[:5]

array([0.00742915, 0.00528911, 0.00507766, 0.00588142, 0.00631748])

## Get index to relabel

Subsequently, rank the unlabeled examples by their active learning scores, and obtain the indices of examples with the lowest scores. These are the **unlabeled** examples whose true label our current model is least confident about. You should prioritize these examples for labeling next.

In [8]:
def get_idx_to_label(active_learning_scores_unlabeled, batch_size_to_label):
    """Function to get indices of examples with the lowest active learning score to collect more labels for."""
    
    return np.argsort(active_learning_scores_unlabeled)[:batch_size_to_label]

In [9]:
batch_size_to_label = 100  # you can pick how many examples to collect more labels for at each round, depending on your setup

# get next idx to label based on batch_size_to_label and magnitude of each example's active learning score
next_idx_to_label = get_idx_to_label(active_learning_scores_unlabeled, batch_size_to_label=batch_size_to_label)
next_idx_to_label[:5],active_learning_scores_unlabeled[next_idx_to_label[:5]]

(array([ 7928,  9244,  3141, 11455,  1645]),
 array([0.00469176, 0.00472476, 0.00474545, 0.00476181, 0.00477689]))

## Improving model accuracy over 15 rounds of active learning (collecting new labels) 

The code below shows a full demonstration of how we can **repeatedly** use the above methods to: select which examples to collect labels for next, add their labels to the current training dataset, and train an improved classifier model.

Here we run 10 rounds of this active learning loop, choosing 100 new unlabeled examples to label in each round. In your applications, you will need to replace the code we used here to reveal the labels of new examples.

[Optional step] After each round, we also report the current model's accuracy on our held-out test dataset (you may not have test data in your applications).

In [10]:
def setup_next_iter_data(df_labeled, df_unlabeled, relabel_idx_unlabeled):
    """Updates inputs after additional labels have been collected in a single active learning round,
    this ensures that the inputs will be well formatted for the next round of active learning."""

    df_labeled = pd.concat([df_labeled,df_unlabeled.iloc[relabel_idx_unlabeled]], ignore_index=True)
    df_unlabeled = df_unlabeled.drop(relabel_idx_unlabeled)
    df_unlabeled = df_unlabeled.reset_index(drop=True)
    df_labeled = df_labeled.reset_index(drop=True)  
    return df_labeled, df_unlabeled

In [11]:
num_rounds = 20
batch_size_to_label = 100

In [None]:
model_accuracy_arr = np.full(num_rounds, np.nan)

for i in range(num_rounds):
    # train model and obtain predicted class probabilities for the unlabeled data
    print('fitting model')
    predictor = train(df_labeled, out_folder=None, time_limit=30)
    
    print('obtaining predicted class probabilities for the unlabeled data')
    pred_probs_unlabeled = predictor.predict_proba(df_unlabeled)
        
    print('computing active learning scores')
    # compute active learning scores
    _, active_learning_scores_unlabeled = get_active_learning_scores(
        df_labeled['label'].to_numpy(), pred_probs_unlabeled=pred_probs_unlabeled
    )
    
    print('getting idx to relabel')
    # get the indices of examples to collect more labels for
    relabel_idx_unlabeled = get_idx_to_label(
        active_learning_scores_unlabeled=active_learning_scores_unlabeled,
        batch_size_to_label=batch_size_to_label,
    )
    
    print('setting up next iter')
    # format the data for the next round of active learning, ie. moving some unlabeled 
    # examples to the labeled pool because we are collecting labels for them
    df_labeled, df_unlabeled = setup_next_iter_data(df_labeled, df_unlabeled, relabel_idx_unlabeled)
    
    # evaluate model accuracy for the current round on held-out test data. This is an optional step 
    # for demonstration purposes, in practical applications you may not have ground truth labels
    print('predicting class labels for test split')
    pred_labels = predictor.predict(data=df_test)
    true_labels_test = np.array(df_test['label'].tolist())
    model_accuracy_arr[i] = np.mean(pred_labels == true_labels_test)
    print('test round: ', i, 'accuracy: ', np.mean(pred_labels == true_labels_test))

Global seed set to 123
No path specified. Models will be saved in: "AutogluonModels/ag-20230324_205347/"


fitting model


Auto select gpus: [0]
Using 16bit native Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name              | Type                            | Params
----------------------------------------------------------------------
0 | model             | TimmAutoModelForImagePrediction | 87.0 M
1 | validation_metric | Accuracy                        | 0     
2 | loss_func         | CrossEntropyLoss                | 0     
----------------------------------------------------------------------
87.0 M    Trainable params
0         Non-trainable params
87.0 M    Total params
174.013   Total estimated model params size (MB)


Epoch 0:  33%|█████████████████████████▉                                                    | 103/310 [00:13<00:27,  7.48it/s, loss=5.58, v_num=]
Validation: 0it [00:00, ?it/s][A
Validation:   0%|                                                                                                         | 0/52 [00:00<?, ?it/s][A
Validation DataLoader 0:   0%|                                                                                            | 0/52 [00:00<?, ?it/s][A
Epoch 0:  34%|██████████████████████████▏                                                   | 104/310 [00:14<00:28,  7.32it/s, loss=5.58, v_num=][A
Epoch 0:  34%|██████████████████████████▍                                                   | 105/310 [00:14<00:27,  7.37it/s, loss=5.58, v_num=][A
Epoch 0:  34%|██████████████████████████▋                                                   | 106/310 [00:14<00:27,  7.41it/s, loss=5.58, v_num=][A
Epoch 0:  35%|██████████████████████████▉                                  

Epoch 0:  50%|███████████████████████████████████████                                       | 155/310 [00:16<00:16,  9.23it/s, loss=5.58, v_num=][A
                                                                                                                                                 [A

Epoch 0, global step 6: 'val_accuracy' reached 0.00728 (best 0.00728), saving model to '/home/ubuntu/examples/active_learning_single_annotator/AutogluonModels/ag-20230324_205347/epoch=0-step=6.ckpt' as top 3
Time limit reached. Elapsed time is 0:00:31. Signaling Trainer to stop.


Epoch 0:  50%|███████████████████████████████████████▎                                      | 156/310 [00:31<00:31,  4.96it/s, loss=5.59, v_num=]
Validation: 0it [00:00, ?it/s][A
Validation:   0%|                                                                                                         | 0/52 [00:00<?, ?it/s][A
Validation DataLoader 0:   0%|                                                                                            | 0/52 [00:00<?, ?it/s][A
Epoch 0:  51%|███████████████████████████████████████▌                                      | 157/310 [00:31<00:31,  4.91it/s, loss=5.59, v_num=][A
Epoch 0:  51%|███████████████████████████████████████▊                                      | 158/310 [00:32<00:30,  4.93it/s, loss=5.59, v_num=][A
Epoch 0:  51%|████████████████████████████████████████                                      | 159/310 [00:32<00:30,  4.96it/s, loss=5.59, v_num=][A
Epoch 0:  52%|████████████████████████████████████████▎                    

Epoch 0:  67%|████████████████████████████████████████████████████▎                         | 208/310 [00:34<00:16,  6.01it/s, loss=5.59, v_num=][A
                                                                                                                                                 [A

Epoch 0, global step 6: 'val_accuracy' reached 0.00728 (best 0.00728), saving model to '/home/ubuntu/examples/active_learning_single_annotator/AutogluonModels/ag-20230324_205347/epoch=0-step=6-v1.ckpt' as top 3


Epoch 0:  67%|████████████████████████████████████████████████████▎                         | 208/310 [00:53<00:26,  3.90it/s, loss=5.59, v_num=]


INFO:automm:Start to fuse 2 checkpoints via the greedy soup algorithm.


Predicting DataLoader 0: 100%|███████████████████████████████████████████████████████████████████████████████████| 13/13 [00:02<00:00,  5.53it/s]
Predicting DataLoader 0: 100%|███████████████████████████████████████████████████████████████████████████████████| 13/13 [00:02<00:00,  5.57it/s]
Predicting DataLoader 0: 100%|███████████████████████████████████████████████████████████████████████████████████| 13/13 [00:02<00:00,  5.56it/s]


INFO:automm:Models and intermediate outputs are saved to /home/ubuntu/examples/active_learning_single_annotator/AutogluonModels/ag-20230324_205347 


obtaining predicted class probabilities for the unlabeled data
Predicting DataLoader 0: 100%|█████████████████████████████████████████████████████████████████████████████████| 597/597 [01:43<00:00,  5.80it/s]
predicting class labels for test split
Predicting DataLoader 0: 100%|█████████████████████████████████████████████████████████████████████████████████| 316/316 [00:56<00:00,  5.63it/s]
test round:  0 accuracy:  0.005148005148005148
computing active learning scores


  pred_probs = pred_probs / np.sum(pred_probs, axis=1)[:, np.newaxis]
  scaled_pred_probs / np.sum(scaled_pred_probs, axis=1)[:, np.newaxis]
Global seed set to 123
No path specified. Models will be saved in: "AutogluonModels/ag-20230324_205805/"


getting idx to relabel
setting up next iter
fitting model


Auto select gpus: [0]
Using 16bit native Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name              | Type                            | Params
----------------------------------------------------------------------
0 | model             | TimmAutoModelForImagePrediction | 87.0 M
1 | validation_metric | Accuracy                        | 0     
2 | loss_func         | CrossEntropyLoss                | 0     
----------------------------------------------------------------------
87.0 M    Trainable params
0         Non-trainable params
87.0 M    Total params
174.013   Total estimated model params size (MB)


Epoch 0:  33%|██████████████████████████                                                    | 108/324 [00:15<00:30,  7.17it/s, loss=5.57, v_num=]
Validation: 0it [00:00, ?it/s][A
Validation:   0%|                                                                                                         | 0/54 [00:00<?, ?it/s][A
Validation DataLoader 0:   0%|                                                                                            | 0/54 [00:00<?, ?it/s][A
Epoch 0:  34%|██████████████████████████▏                                                   | 109/324 [00:15<00:30,  7.03it/s, loss=5.57, v_num=][A
Epoch 0:  34%|██████████████████████████▍                                                   | 110/324 [00:15<00:30,  7.07it/s, loss=5.57, v_num=][A
Epoch 0:  34%|██████████████████████████▋                                                   | 111/324 [00:15<00:29,  7.11it/s, loss=5.57, v_num=][A
Epoch 0:  35%|██████████████████████████▉                                  

Epoch 0:  49%|██████████████████████████████████████▌                                       | 160/324 [00:18<00:18,  8.80it/s, loss=5.57, v_num=][A
Epoch 0:  50%|██████████████████████████████████████▊                                       | 161/324 [00:18<00:18,  8.83it/s, loss=5.57, v_num=][A
Epoch 0:  50%|███████████████████████████████████████                                       | 162/324 [00:18<00:18,  8.86it/s, loss=5.57, v_num=][A
                                                                                                                                                 [A

Epoch 0, global step 6: 'val_accuracy' reached 0.01157 (best 0.01157), saving model to '/home/ubuntu/examples/active_learning_single_annotator/AutogluonModels/ag-20230324_205805/epoch=0-step=6.ckpt' as top 3
Time limit reached. Elapsed time is 0:00:32. Signaling Trainer to stop.


Epoch 0:  50%|███████████████████████████████████████▏                                      | 163/324 [00:32<00:32,  4.94it/s, loss=5.57, v_num=]
Validation: 0it [00:00, ?it/s][A
Validation:   0%|                                                                                                         | 0/54 [00:00<?, ?it/s][A
Validation DataLoader 0:   0%|                                                                                            | 0/54 [00:00<?, ?it/s][A
Epoch 0:  51%|███████████████████████████████████████▍                                      | 164/324 [00:33<00:32,  4.91it/s, loss=5.57, v_num=][A
Epoch 0:  51%|███████████████████████████████████████▋                                      | 165/324 [00:33<00:32,  4.93it/s, loss=5.57, v_num=][A
Epoch 0:  51%|███████████████████████████████████████▉                                      | 166/324 [00:33<00:31,  4.95it/s, loss=5.57, v_num=][A
Epoch 0:  52%|████████████████████████████████████████▏                    

Epoch 0:  66%|███████████████████████████████████████████████████▊                          | 215/324 [00:36<00:18,  5.96it/s, loss=5.57, v_num=][A
Epoch 0:  67%|████████████████████████████████████████████████████                          | 216/324 [00:36<00:18,  5.98it/s, loss=5.57, v_num=][A
Epoch 0:  67%|████████████████████████████████████████████████████▏                         | 217/324 [00:36<00:17,  6.00it/s, loss=5.57, v_num=][A
                                                                                                                                                 [A

Epoch 0, global step 6: 'val_accuracy' reached 0.01157 (best 0.01157), saving model to '/home/ubuntu/examples/active_learning_single_annotator/AutogluonModels/ag-20230324_205805/epoch=0-step=6-v1.ckpt' as top 3


Epoch 0:  67%|████████████████████████████████████████████████████▏                         | 217/324 [00:57<00:28,  3.75it/s, loss=5.57, v_num=]


INFO:automm:Start to fuse 2 checkpoints via the greedy soup algorithm.


Predicting DataLoader 0: 100%|███████████████████████████████████████████████████████████████████████████████████| 14/14 [00:02<00:00,  5.58it/s]
Predicting DataLoader 0: 100%|███████████████████████████████████████████████████████████████████████████████████| 14/14 [00:02<00:00,  5.58it/s]
Predicting DataLoader 0: 100%|███████████████████████████████████████████████████████████████████████████████████| 14/14 [00:02<00:00,  5.55it/s]


INFO:automm:Models and intermediate outputs are saved to /home/ubuntu/examples/active_learning_single_annotator/AutogluonModels/ag-20230324_205805 


obtaining predicted class probabilities for the unlabeled data
Predicting DataLoader 0: 100%|█████████████████████████████████████████████████████████████████████████████████| 594/594 [01:43<00:00,  5.71it/s]
predicting class labels for test split
Predicting DataLoader 0: 100%|█████████████████████████████████████████████████████████████████████████████████| 316/316 [00:56<00:00,  5.54it/s]
test round:  1 accuracy:  0.004158004158004158
computing active learning scores
getting idx to relabel
setting up next iter


  pred_probs = pred_probs / np.sum(pred_probs, axis=1)[:, np.newaxis]
  scaled_pred_probs / np.sum(scaled_pred_probs, axis=1)[:, np.newaxis]
Global seed set to 123
No path specified. Models will be saved in: "AutogluonModels/ag-20230324_210223/"


fitting model


Auto select gpus: [0]
Using 16bit native Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name              | Type                            | Params
----------------------------------------------------------------------
0 | model             | TimmAutoModelForImagePrediction | 87.0 M
1 | validation_metric | Accuracy                        | 0     
2 | loss_func         | CrossEntropyLoss                | 0     
----------------------------------------------------------------------
87.0 M    Trainable params
0         Non-trainable params
87.0 M    Total params
174.013   Total estimated model params size (MB)


Epoch 0:  33%|█████████████████████████▉                                                    | 113/340 [00:15<00:32,  7.07it/s, loss=5.56, v_num=]
Validation: 0it [00:00, ?it/s][A
Validation:   0%|                                                                                                         | 0/57 [00:00<?, ?it/s][A
Validation DataLoader 0:   0%|                                                                                            | 0/57 [00:00<?, ?it/s][A
Epoch 0:  34%|██████████████████████████▏                                                   | 114/340 [00:16<00:32,  6.93it/s, loss=5.56, v_num=][A
Epoch 0:  34%|██████████████████████████▍                                                   | 115/340 [00:16<00:32,  6.97it/s, loss=5.56, v_num=][A
Epoch 0:  34%|██████████████████████████▌                                                   | 116/340 [00:16<00:31,  7.01it/s, loss=5.56, v_num=][A
Epoch 0:  34%|██████████████████████████▊                                  

Epoch 0:  49%|█████████████████████████████████████▊                                        | 165/340 [00:19<00:20,  8.61it/s, loss=5.56, v_num=][A
Epoch 0:  49%|██████████████████████████████████████                                        | 166/340 [00:19<00:20,  8.64it/s, loss=5.56, v_num=][A
Epoch 0:  49%|██████████████████████████████████████▎                                       | 167/340 [00:19<00:19,  8.67it/s, loss=5.56, v_num=][A
Epoch 0:  49%|██████████████████████████████████████▌                                       | 168/340 [00:19<00:19,  8.70it/s, loss=5.56, v_num=][A
Epoch 0:  50%|██████████████████████████████████████▊                                       | 169/340 [00:19<00:19,  8.72it/s, loss=5.56, v_num=][A
Epoch 0:  50%|███████████████████████████████████████                                       | 170/340 [00:19<00:19,  8.76it/s, loss=5.56, v_num=][A
                                                                                                          

Epoch 0, global step 7: 'val_accuracy' reached 0.00664 (best 0.00664), saving model to '/home/ubuntu/examples/active_learning_single_annotator/AutogluonModels/ag-20230324_210223/epoch=0-step=7.ckpt' as top 3
Time limit reached. Elapsed time is 0:00:31. Signaling Trainer to stop.


Epoch 0:  50%|███████████████████████████████████████▏                                      | 171/340 [00:31<00:30,  5.50it/s, loss=5.57, v_num=]
Validation: 0it [00:00, ?it/s][A
Validation:   0%|                                                                                                         | 0/57 [00:00<?, ?it/s][A
Validation DataLoader 0:   0%|                                                                                            | 0/57 [00:00<?, ?it/s][A
Epoch 0:  51%|███████████████████████████████████████▍                                      | 172/340 [00:31<00:30,  5.45it/s, loss=5.57, v_num=][A
Epoch 0:  51%|███████████████████████████████████████▋                                      | 173/340 [00:31<00:30,  5.48it/s, loss=5.57, v_num=][A
Epoch 0:  51%|███████████████████████████████████████▉                                      | 174/340 [00:31<00:30,  5.50it/s, loss=5.57, v_num=][A
Epoch 0:  51%|████████████████████████████████████████▏                    

Epoch 0:  66%|███████████████████████████████████████████████████▏                          | 223/340 [00:34<00:17,  6.51it/s, loss=5.57, v_num=][A
Epoch 0:  66%|███████████████████████████████████████████████████▍                          | 224/340 [00:34<00:17,  6.53it/s, loss=5.57, v_num=][A
Epoch 0:  66%|███████████████████████████████████████████████████▌                          | 225/340 [00:34<00:17,  6.55it/s, loss=5.57, v_num=][A
Epoch 0:  66%|███████████████████████████████████████████████████▊                          | 226/340 [00:34<00:17,  6.57it/s, loss=5.57, v_num=][A
Epoch 0:  67%|████████████████████████████████████████████████████                          | 227/340 [00:34<00:17,  6.59it/s, loss=5.57, v_num=][A
Epoch 0:  67%|████████████████████████████████████████████████████▎                         | 228/340 [00:34<00:16,  6.61it/s, loss=5.57, v_num=][A
                                                                                                          

Epoch 0, global step 7: 'val_accuracy' reached 0.00664 (best 0.00664), saving model to '/home/ubuntu/examples/active_learning_single_annotator/AutogluonModels/ag-20230324_210223/epoch=0-step=7-v1.ckpt' as top 3


Epoch 0:  67%|████████████████████████████████████████████████████▎                         | 228/340 [00:56<00:27,  4.05it/s, loss=5.57, v_num=]


INFO:automm:Start to fuse 2 checkpoints via the greedy soup algorithm.


Predicting DataLoader 0: 100%|███████████████████████████████████████████████████████████████████████████████████| 15/15 [00:02<00:00,  5.70it/s]
Predicting DataLoader 0: 100%|███████████████████████████████████████████████████████████████████████████████████| 15/15 [00:02<00:00,  5.66it/s]
Predicting DataLoader 0: 100%|███████████████████████████████████████████████████████████████████████████████████| 15/15 [00:02<00:00,  5.66it/s]


INFO:automm:Models and intermediate outputs are saved to /home/ubuntu/examples/active_learning_single_annotator/AutogluonModels/ag-20230324_210223 


obtaining predicted class probabilities for the unlabeled data
Predicting DataLoader 0: 100%|█████████████████████████████████████████████████████████████████████████████████| 591/591 [01:44<00:00,  5.66it/s]
predicting class labels for test split
Predicting DataLoader 0: 100%|█████████████████████████████████████████████████████████████████████████████████| 316/316 [00:53<00:00,  5.87it/s]
test round:  2 accuracy:  0.004752004752004752
computing active learning scores


  pred_probs = pred_probs / np.sum(pred_probs, axis=1)[:, np.newaxis]
  scaled_pred_probs / np.sum(scaled_pred_probs, axis=1)[:, np.newaxis]
Global seed set to 123
No path specified. Models will be saved in: "AutogluonModels/ag-20230324_210632/"


getting idx to relabel
setting up next iter
fitting model


Auto select gpus: [0]
Using 16bit native Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name              | Type                            | Params
----------------------------------------------------------------------
0 | model             | TimmAutoModelForImagePrediction | 87.0 M
1 | validation_metric | Accuracy                        | 0     
2 | loss_func         | CrossEntropyLoss                | 0     
----------------------------------------------------------------------
87.0 M    Trainable params
0         Non-trainable params
87.0 M    Total params
174.013   Total estimated model params size (MB)


Epoch 0:  33%|██████████████████████████                                                    | 118/354 [00:15<00:31,  7.46it/s, loss=5.56, v_num=]
Validation: 0it [00:00, ?it/s][A
Validation:   0%|                                                                                                         | 0/59 [00:00<?, ?it/s][A
Validation DataLoader 0:   0%|                                                                                            | 0/59 [00:00<?, ?it/s][A
Epoch 0:  34%|██████████████████████████▏                                                   | 119/354 [00:16<00:32,  7.32it/s, loss=5.56, v_num=][A
Epoch 0:  34%|██████████████████████████▍                                                   | 120/354 [00:16<00:31,  7.34it/s, loss=5.56, v_num=][A
Epoch 0:  34%|██████████████████████████▋                                                   | 121/354 [00:16<00:31,  7.37it/s, loss=5.56, v_num=][A
Epoch 0:  34%|██████████████████████████▉                                  

Epoch 0:  48%|█████████████████████████████████████▍                                        | 170/354 [00:18<00:20,  8.99it/s, loss=5.56, v_num=][A
Epoch 0:  48%|█████████████████████████████████████▋                                        | 171/354 [00:18<00:20,  9.02it/s, loss=5.56, v_num=][A
Epoch 0:  49%|█████████████████████████████████████▉                                        | 172/354 [00:19<00:20,  9.05it/s, loss=5.56, v_num=][A
Epoch 0:  49%|██████████████████████████████████████                                        | 173/354 [00:19<00:19,  9.08it/s, loss=5.56, v_num=][A
Epoch 0:  49%|██████████████████████████████████████▎                                       | 174/354 [00:19<00:19,  9.11it/s, loss=5.56, v_num=][A
Epoch 0:  49%|██████████████████████████████████████▌                                       | 175/354 [00:19<00:19,  9.13it/s, loss=5.56, v_num=][A
Epoch 0:  50%|██████████████████████████████████████▊                                       | 176/354 [00:

Epoch 0, global step 7: 'val_accuracy' reached 0.00424 (best 0.00424), saving model to '/home/ubuntu/examples/active_learning_single_annotator/AutogluonModels/ag-20230324_210632/epoch=0-step=7.ckpt' as top 3
Time limit reached. Elapsed time is 0:00:30. Signaling Trainer to stop.


Epoch 0:  50%|███████████████████████████████████████▏                                      | 178/354 [00:30<00:29,  5.92it/s, loss=5.56, v_num=]
Validation: 0it [00:00, ?it/s][A
Validation:   0%|                                                                                                         | 0/59 [00:00<?, ?it/s][A
Validation DataLoader 0:   0%|                                                                                            | 0/59 [00:00<?, ?it/s][A
Epoch 0:  51%|███████████████████████████████████████▍                                      | 179/354 [00:30<00:29,  5.87it/s, loss=5.56, v_num=][A
Epoch 0:  51%|███████████████████████████████████████▋                                      | 180/354 [00:30<00:29,  5.88it/s, loss=5.56, v_num=][A
Epoch 0:  51%|███████████████████████████████████████▉                                      | 181/354 [00:30<00:29,  5.90it/s, loss=5.56, v_num=][A
Epoch 0:  51%|████████████████████████████████████████                     

Epoch 0:  65%|██████████████████████████████████████████████████▋                           | 230/354 [00:33<00:17,  6.93it/s, loss=5.56, v_num=][A
Epoch 0:  65%|██████████████████████████████████████████████████▉                           | 231/354 [00:33<00:17,  6.95it/s, loss=5.56, v_num=][A
Epoch 0:  66%|███████████████████████████████████████████████████                           | 232/354 [00:33<00:17,  6.97it/s, loss=5.56, v_num=][A
Epoch 0:  66%|███████████████████████████████████████████████████▎                          | 233/354 [00:33<00:17,  6.99it/s, loss=5.56, v_num=][A
Epoch 0:  66%|███████████████████████████████████████████████████▌                          | 234/354 [00:33<00:17,  7.01it/s, loss=5.56, v_num=][A
Epoch 0:  66%|███████████████████████████████████████████████████▊                          | 235/354 [00:33<00:16,  7.03it/s, loss=5.56, v_num=][A
Epoch 0:  67%|████████████████████████████████████████████████████                          | 236/354 [00:

Epoch 0, global step 7: 'val_accuracy' reached 0.00424 (best 0.00424), saving model to '/home/ubuntu/examples/active_learning_single_annotator/AutogluonModels/ag-20230324_210632/epoch=0-step=7-v1.ckpt' as top 3


Epoch 0:  67%|████████████████████████████████████████████████████▏                         | 237/354 [00:56<00:27,  4.22it/s, loss=5.56, v_num=]


INFO:automm:Start to fuse 2 checkpoints via the greedy soup algorithm.


Predicting DataLoader 0: 100%|███████████████████████████████████████████████████████████████████████████████████| 15/15 [00:02<00:00,  5.66it/s]
Predicting DataLoader 0: 100%|███████████████████████████████████████████████████████████████████████████████████| 15/15 [00:02<00:00,  5.76it/s]
Predicting DataLoader 0: 100%|███████████████████████████████████████████████████████████████████████████████████| 15/15 [00:02<00:00,  5.62it/s]


INFO:automm:Models and intermediate outputs are saved to /home/ubuntu/examples/active_learning_single_annotator/AutogluonModels/ag-20230324_210632 


obtaining predicted class probabilities for the unlabeled data
Predicting DataLoader 0: 100%|█████████████████████████████████████████████████████████████████████████████████| 587/587 [01:38<00:00,  5.96it/s]
predicting class labels for test split
Predicting DataLoader 0: 100%|█████████████████████████████████████████████████████████████████████████████████| 316/316 [00:53<00:00,  5.89it/s]
test round:  3 accuracy:  0.006237006237006237
computing active learning scores


  pred_probs = pred_probs / np.sum(pred_probs, axis=1)[:, np.newaxis]
  scaled_pred_probs / np.sum(scaled_pred_probs, axis=1)[:, np.newaxis]
Global seed set to 123
No path specified. Models will be saved in: "AutogluonModels/ag-20230324_211035/"


getting idx to relabel
setting up next iter
fitting model


Auto select gpus: [0]
Using 16bit native Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name              | Type                            | Params
----------------------------------------------------------------------
0 | model             | TimmAutoModelForImagePrediction | 87.0 M
1 | validation_metric | Accuracy                        | 0     
2 | loss_func         | CrossEntropyLoss                | 0     
----------------------------------------------------------------------
87.0 M    Trainable params
0         Non-trainable params
87.0 M    Total params
174.013   Total estimated model params size (MB)


Epoch 0:  33%|█████████████████████████▉                                                    | 123/370 [00:16<00:33,  7.46it/s, loss=5.54, v_num=]
Validation: 0it [00:00, ?it/s][A
Validation:   0%|                                                                                                         | 0/62 [00:00<?, ?it/s][A
Validation DataLoader 0:   0%|                                                                                            | 0/62 [00:00<?, ?it/s][A
Epoch 0:  34%|██████████████████████████▏                                                   | 124/370 [00:16<00:33,  7.33it/s, loss=5.54, v_num=][A
Epoch 0:  34%|██████████████████████████▎                                                   | 125/370 [00:16<00:33,  7.37it/s, loss=5.54, v_num=][A
Epoch 0:  34%|██████████████████████████▌                                                   | 126/370 [00:17<00:32,  7.40it/s, loss=5.54, v_num=][A
Epoch 0:  34%|██████████████████████████▊                                  

Epoch 0:  47%|████████████████████████████████████▉                                         | 175/370 [00:19<00:21,  8.96it/s, loss=5.54, v_num=][A
Epoch 0:  48%|█████████████████████████████████████                                         | 176/370 [00:19<00:21,  8.99it/s, loss=5.54, v_num=][A
Epoch 0:  48%|█████████████████████████████████████▎                                        | 177/370 [00:19<00:21,  9.02it/s, loss=5.54, v_num=][A
Epoch 0:  48%|█████████████████████████████████████▌                                        | 178/370 [00:19<00:21,  9.04it/s, loss=5.54, v_num=][A
Epoch 0:  48%|█████████████████████████████████████▋                                        | 179/370 [00:19<00:21,  9.07it/s, loss=5.54, v_num=][A
Epoch 0:  49%|█████████████████████████████████████▉                                        | 180/370 [00:19<00:20,  9.10it/s, loss=5.54, v_num=][A
Epoch 0:  49%|██████████████████████████████████████▏                                       | 181/370 [00:

Epoch 0, global step 7: 'val_accuracy' reached 0.01220 (best 0.01220), saving model to '/home/ubuntu/examples/active_learning_single_annotator/AutogluonModels/ag-20230324_211035/epoch=0-step=7.ckpt' as top 3
Time limit reached. Elapsed time is 0:00:32. Signaling Trainer to stop.


Epoch 0:  50%|███████████████████████████████████████▏                                      | 186/370 [00:32<00:32,  5.68it/s, loss=5.53, v_num=]
Validation: 0it [00:00, ?it/s][A
Validation:   0%|                                                                                                         | 0/62 [00:00<?, ?it/s][A
Validation DataLoader 0:   0%|                                                                                            | 0/62 [00:00<?, ?it/s][A
Epoch 0:  51%|███████████████████████████████████████▍                                      | 187/370 [00:33<00:32,  5.64it/s, loss=5.53, v_num=][A
Epoch 0:  51%|███████████████████████████████████████▋                                      | 188/370 [00:33<00:32,  5.66it/s, loss=5.53, v_num=][A
Epoch 0:  51%|███████████████████████████████████████▊                                      | 189/370 [00:33<00:31,  5.68it/s, loss=5.53, v_num=][A
Epoch 0:  51%|████████████████████████████████████████                     

Epoch 0:  64%|██████████████████████████████████████████████████▏                           | 238/370 [00:35<00:19,  6.65it/s, loss=5.53, v_num=][A
Epoch 0:  65%|██████████████████████████████████████████████████▍                           | 239/370 [00:35<00:19,  6.66it/s, loss=5.53, v_num=][A
Epoch 0:  65%|██████████████████████████████████████████████████▌                           | 240/370 [00:35<00:19,  6.68it/s, loss=5.53, v_num=][A
Epoch 0:  65%|██████████████████████████████████████████████████▊                           | 241/370 [00:35<00:19,  6.70it/s, loss=5.53, v_num=][A
Epoch 0:  65%|███████████████████████████████████████████████████                           | 242/370 [00:36<00:19,  6.72it/s, loss=5.53, v_num=][A
Epoch 0:  66%|███████████████████████████████████████████████████▏                          | 243/370 [00:36<00:18,  6.74it/s, loss=5.53, v_num=][A
Epoch 0:  66%|███████████████████████████████████████████████████▍                          | 244/370 [00:

Epoch 0, global step 7: 'val_accuracy' reached 0.01220 (best 0.01220), saving model to '/home/ubuntu/examples/active_learning_single_annotator/AutogluonModels/ag-20230324_211035/epoch=0-step=7-v1.ckpt' as top 3


Epoch 0:  67%|████████████████████████████████████████████████████▎                         | 248/370 [00:58<00:28,  4.27it/s, loss=5.53, v_num=]


INFO:automm:Start to fuse 2 checkpoints via the greedy soup algorithm.


Predicting DataLoader 0: 100%|███████████████████████████████████████████████████████████████████████████████████| 16/16 [00:02<00:00,  5.73it/s]
Predicting DataLoader 0: 100%|███████████████████████████████████████████████████████████████████████████████████| 16/16 [00:02<00:00,  5.81it/s]
Predicting DataLoader 0: 100%|███████████████████████████████████████████████████████████████████████████████████| 16/16 [00:02<00:00,  5.76it/s]


INFO:automm:Models and intermediate outputs are saved to /home/ubuntu/examples/active_learning_single_annotator/AutogluonModels/ag-20230324_211035 


obtaining predicted class probabilities for the unlabeled data
Predicting DataLoader 0: 100%|█████████████████████████████████████████████████████████████████████████████████| 584/584 [01:41<00:00,  5.73it/s]
predicting class labels for test split
Predicting DataLoader 0: 100%|█████████████████████████████████████████████████████████████████████████████████| 316/316 [00:54<00:00,  5.81it/s]


  pred_probs = pred_probs / np.sum(pred_probs, axis=1)[:, np.newaxis]
  scaled_pred_probs / np.sum(scaled_pred_probs, axis=1)[:, np.newaxis]
Global seed set to 123
No path specified. Models will be saved in: "AutogluonModels/ag-20230324_211450/"


test round:  4 accuracy:  0.006534006534006534
computing active learning scores
getting idx to relabel
setting up next iter
fitting model


Auto select gpus: [0]
Using 16bit native Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name              | Type                            | Params
----------------------------------------------------------------------
0 | model             | TimmAutoModelForImagePrediction | 87.0 M
1 | validation_metric | Accuracy                        | 0     
2 | loss_func         | CrossEntropyLoss                | 0     
----------------------------------------------------------------------
87.0 M    Trainable params
0         Non-trainable params
87.0 M    Total params
174.013   Total estimated model params size (MB)


Epoch 0:  33%|██████████████████████████                                                    | 128/383 [00:17<00:34,  7.47it/s, loss=5.54, v_num=]
Validation: 0it [00:00, ?it/s][A
Validation:   0%|                                                                                                         | 0/63 [00:00<?, ?it/s][A
Validation DataLoader 0:   0%|                                                                                            | 0/63 [00:00<?, ?it/s][A
Epoch 0:  34%|██████████████████████████▎                                                   | 129/383 [00:17<00:34,  7.35it/s, loss=5.54, v_num=][A
Epoch 0:  34%|██████████████████████████▍                                                   | 130/383 [00:17<00:34,  7.38it/s, loss=5.54, v_num=][A
Epoch 0:  34%|██████████████████████████▋                                                   | 131/383 [00:17<00:33,  7.42it/s, loss=5.54, v_num=][A
Epoch 0:  34%|██████████████████████████▉                                  

Epoch 0:  47%|████████████████████████████████████▋                                         | 180/383 [00:20<00:22,  8.92it/s, loss=5.54, v_num=][A
Epoch 0:  47%|████████████████████████████████████▊                                         | 181/383 [00:20<00:22,  8.95it/s, loss=5.54, v_num=][A
Epoch 0:  48%|█████████████████████████████████████                                         | 182/383 [00:20<00:22,  8.98it/s, loss=5.54, v_num=][A
Epoch 0:  48%|█████████████████████████████████████▎                                        | 183/383 [00:20<00:22,  9.00it/s, loss=5.54, v_num=][A
Epoch 0:  48%|█████████████████████████████████████▍                                        | 184/383 [00:20<00:22,  9.03it/s, loss=5.54, v_num=][A
Epoch 0:  48%|█████████████████████████████████████▋                                        | 185/383 [00:20<00:21,  9.06it/s, loss=5.54, v_num=][A
Epoch 0:  49%|█████████████████████████████████████▉                                        | 186/383 [00:

Epoch 0, global step 8: 'val_accuracy' reached 0.01600 (best 0.01600), saving model to '/home/ubuntu/examples/active_learning_single_annotator/AutogluonModels/ag-20230324_211450/epoch=0-step=8.ckpt' as top 3
Time limit reached. Elapsed time is 0:00:37. Signaling Trainer to stop.


Epoch 0:  50%|███████████████████████████████████████                                       | 192/383 [00:37<00:36,  5.16it/s, loss=5.54, v_num=]
Validation: 0it [00:00, ?it/s][A
Validation:   0%|                                                                                                         | 0/63 [00:00<?, ?it/s][A
Validation DataLoader 0:   0%|                                                                                            | 0/63 [00:00<?, ?it/s][A
Epoch 0:  50%|███████████████████████████████████████▎                                      | 193/383 [00:37<00:37,  5.13it/s, loss=5.54, v_num=][A
Epoch 0:  51%|███████████████████████████████████████▌                                      | 194/383 [00:37<00:36,  5.15it/s, loss=5.54, v_num=][A
Epoch 0:  51%|███████████████████████████████████████▋                                      | 195/383 [00:37<00:36,  5.17it/s, loss=5.54, v_num=][A
Epoch 0:  51%|███████████████████████████████████████▉                     

Epoch 0:  64%|█████████████████████████████████████████████████▋                            | 244/383 [00:40<00:23,  6.04it/s, loss=5.54, v_num=][A
Epoch 0:  64%|█████████████████████████████████████████████████▉                            | 245/383 [00:40<00:22,  6.05it/s, loss=5.54, v_num=][A
Epoch 0:  64%|██████████████████████████████████████████████████                            | 246/383 [00:40<00:22,  6.07it/s, loss=5.54, v_num=][A
Epoch 0:  64%|██████████████████████████████████████████████████▎                           | 247/383 [00:40<00:22,  6.09it/s, loss=5.54, v_num=][A
Epoch 0:  65%|██████████████████████████████████████████████████▌                           | 248/383 [00:40<00:22,  6.10it/s, loss=5.54, v_num=][A
Epoch 0:  65%|██████████████████████████████████████████████████▋                           | 249/383 [00:40<00:21,  6.12it/s, loss=5.54, v_num=][A
Epoch 0:  65%|██████████████████████████████████████████████████▉                           | 250/383 [00:

Epoch 0, global step 8: 'val_accuracy' reached 0.01600 (best 0.01600), saving model to '/home/ubuntu/examples/active_learning_single_annotator/AutogluonModels/ag-20230324_211450/epoch=0-step=8-v1.ckpt' as top 3


Epoch 0:  67%|███████████████████████████████████████████████████▉                          | 255/383 [01:02<00:31,  4.11it/s, loss=5.54, v_num=]


INFO:automm:Start to fuse 2 checkpoints via the greedy soup algorithm.


Predicting DataLoader 0: 100%|███████████████████████████████████████████████████████████████████████████████████| 16/16 [00:02<00:00,  5.76it/s]
Predicting DataLoader 0: 100%|███████████████████████████████████████████████████████████████████████████████████| 16/16 [00:02<00:00,  5.66it/s]
Predicting DataLoader 0: 100%|███████████████████████████████████████████████████████████████████████████████████| 16/16 [00:02<00:00,  5.77it/s]


INFO:automm:Models and intermediate outputs are saved to /home/ubuntu/examples/active_learning_single_annotator/AutogluonModels/ag-20230324_211450 


obtaining predicted class probabilities for the unlabeled data
Predicting DataLoader 0: 100%|█████████████████████████████████████████████████████████████████████████████████| 581/581 [01:38<00:00,  5.92it/s]
predicting class labels for test split
Predicting DataLoader 0: 100%|█████████████████████████████████████████████████████████████████████████████████| 316/316 [00:55<00:00,  5.74it/s]
test round:  5 accuracy:  0.009504009504009503
computing active learning scores
getting idx to relabel
setting up next iter


  pred_probs = pred_probs / np.sum(pred_probs, axis=1)[:, np.newaxis]
  scaled_pred_probs / np.sum(scaled_pred_probs, axis=1)[:, np.newaxis]
Global seed set to 123
No path specified. Models will be saved in: "AutogluonModels/ag-20230324_211912/"


fitting model


Auto select gpus: [0]
Using 16bit native Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name              | Type                            | Params
----------------------------------------------------------------------
0 | model             | TimmAutoModelForImagePrediction | 87.0 M
1 | validation_metric | Accuracy                        | 0     
2 | loss_func         | CrossEntropyLoss                | 0     
----------------------------------------------------------------------
87.0 M    Trainable params
0         Non-trainable params
87.0 M    Total params
174.013   Total estimated model params size (MB)


Epoch 0:  34%|██████████████████████████▌                                                   | 135/396 [00:18<00:36,  7.16it/s, loss=5.53, v_num=]
Validation: 0it [00:00, ?it/s][A
Validation:   0%|                                                                                                         | 0/63 [00:00<?, ?it/s][A
Validation DataLoader 0:   0%|                                                                                            | 0/63 [00:00<?, ?it/s][A
Epoch 0:  34%|██████████████████████████▊                                                   | 136/396 [00:19<00:36,  7.06it/s, loss=5.53, v_num=][A
Epoch 0:  35%|██████████████████████████▉                                                   | 137/396 [00:19<00:36,  7.09it/s, loss=5.53, v_num=][A
Epoch 0:  35%|███████████████████████████▏                                                  | 138/396 [00:19<00:36,  7.13it/s, loss=5.53, v_num=][A
Epoch 0:  35%|███████████████████████████▍                                 

Epoch 0:  47%|████████████████████████████████████▊                                         | 187/396 [00:21<00:24,  8.51it/s, loss=5.53, v_num=][A
Epoch 0:  47%|█████████████████████████████████████                                         | 188/396 [00:22<00:24,  8.54it/s, loss=5.53, v_num=][A
Epoch 0:  48%|█████████████████████████████████████▏                                        | 189/396 [00:22<00:24,  8.56it/s, loss=5.53, v_num=][A
Epoch 0:  48%|█████████████████████████████████████▍                                        | 190/396 [00:22<00:23,  8.59it/s, loss=5.53, v_num=][A
Epoch 0:  48%|█████████████████████████████████████▌                                        | 191/396 [00:22<00:23,  8.61it/s, loss=5.53, v_num=][A
Epoch 0:  48%|█████████████████████████████████████▊                                        | 192/396 [00:22<00:23,  8.64it/s, loss=5.53, v_num=][A
Epoch 0:  49%|██████████████████████████████████████                                        | 193/396 [00:

Epoch 0, global step 8: 'val_accuracy' reached 0.01000 (best 0.01000), saving model to '/home/ubuntu/examples/active_learning_single_annotator/AutogluonModels/ag-20230324_211912/epoch=0-step=8.ckpt' as top 3
Time limit reached. Elapsed time is 0:00:40. Signaling Trainer to stop.


Epoch 0:  50%|███████████████████████████████████████▏                                      | 199/396 [00:40<00:39,  4.97it/s, loss=5.52, v_num=]
Validation: 0it [00:00, ?it/s][A
Validation:   0%|                                                                                                         | 0/63 [00:00<?, ?it/s][A
Validation DataLoader 0:   0%|                                                                                            | 0/63 [00:00<?, ?it/s][A
Epoch 0:  51%|███████████████████████████████████████▍                                      | 200/396 [00:40<00:39,  4.95it/s, loss=5.52, v_num=][A
Epoch 0:  51%|███████████████████████████████████████▌                                      | 201/396 [00:40<00:39,  4.96it/s, loss=5.52, v_num=][A
Epoch 0:  51%|███████████████████████████████████████▊                                      | 202/396 [00:40<00:38,  4.98it/s, loss=5.52, v_num=][A
Epoch 0:  51%|███████████████████████████████████████▉                     

Epoch 0:  63%|█████████████████████████████████████████████████▍                            | 251/396 [00:43<00:24,  5.82it/s, loss=5.52, v_num=][A
Epoch 0:  64%|█████████████████████████████████████████████████▋                            | 252/396 [00:43<00:24,  5.83it/s, loss=5.52, v_num=][A
Epoch 0:  64%|█████████████████████████████████████████████████▊                            | 253/396 [00:43<00:24,  5.85it/s, loss=5.52, v_num=][A
Epoch 0:  64%|██████████████████████████████████████████████████                            | 254/396 [00:43<00:24,  5.86it/s, loss=5.52, v_num=][A
Epoch 0:  64%|██████████████████████████████████████████████████▏                           | 255/396 [00:43<00:23,  5.88it/s, loss=5.52, v_num=][A
Epoch 0:  65%|██████████████████████████████████████████████████▍                           | 256/396 [00:43<00:23,  5.90it/s, loss=5.52, v_num=][A
Epoch 0:  65%|██████████████████████████████████████████████████▌                           | 257/396 [00:

Epoch 0, global step 8: 'val_accuracy' reached 0.01000 (best 0.01000), saving model to '/home/ubuntu/examples/active_learning_single_annotator/AutogluonModels/ag-20230324_211912/epoch=0-step=8-v1.ckpt' as top 3


Epoch 0:  66%|███████████████████████████████████████████████████▌                          | 262/396 [01:04<00:32,  4.07it/s, loss=5.52, v_num=]


INFO:automm:Start to fuse 2 checkpoints via the greedy soup algorithm.


Predicting DataLoader 0: 100%|███████████████████████████████████████████████████████████████████████████████████| 16/16 [00:02<00:00,  5.49it/s]
Predicting DataLoader 0: 100%|███████████████████████████████████████████████████████████████████████████████████| 16/16 [00:02<00:00,  5.47it/s]
Predicting DataLoader 0: 100%|███████████████████████████████████████████████████████████████████████████████████| 16/16 [00:02<00:00,  5.61it/s]


INFO:automm:Models and intermediate outputs are saved to /home/ubuntu/examples/active_learning_single_annotator/AutogluonModels/ag-20230324_211912 


obtaining predicted class probabilities for the unlabeled data
Predicting DataLoader 0:  46%|████████████████████████████████████▊                                            | 263/578 [00:46<00:55,  5.71it/s]

## Results

Below, we can see that the model accuracy increases steadily with each additional round of data labeling and model training.

In [None]:
print(f"Initial model test accuracy: {model_accuracy_arr[0]:.3}")
print(f"Final model test accuracy (after 15 rounds of active learning): {model_accuracy_arr[-1]:.3}")

In [None]:
np.save("model_acc_20_rounds_activelab", model_accuracy_arr)

In [None]:
plt.plot(model_accuracy_arr)
plt.xticks(range(num_rounds))
plt.xlabel("Round")
plt.ylabel("Model Accuracy")
plt.show()

### References

[1] Griffin, G., Holub, A., & Perona, P. (2022). Caltech 256 (1.0). https://doi.org/10.22002/D1.20087

Goh, H. W., & Mueller, J. ActiveLab: Active Learning with Re-Labeling by Multiple Annotators. https://arxiv.org/abs/2301.11856