# Working with pre-extracted embeddings

When you're testing a Few-Shot Learning model, you solve hundreds of randomly sampled few-shot tasks. In doing so, you will very likely process the same images several times. Each of these images then goes through your backbone several times, which wastes time and energy. However, most Few-Shot Learning methods nowadays use a **frozen backbone**: the logic of these methods operates at the feature level. You can therefore extract the features of your images once and for all, and then use these features to solve your few-shot tasks.
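
To get a feel for the redundancy, here is a rough back-of-the-envelope simulation. The dataset size, task size, and number of tasks are illustrative assumptions, not values from this tutorial, and tasks are drawn uniformly at random rather than per class. It only counts how many backbone forward passes are needed with and without pre-extraction.

```python
import random


def count_forward_passes(n_images, images_per_task, n_tasks, seed=0):
    """Compare backbone calls with and without pre-extracted embeddings.

    Each task draws `images_per_task` distinct images uniformly at random
    (a simplification: real task samplers draw per class).
    """
    rng = random.Random(seed)
    without_cache = 0
    seen = set()
    for _ in range(n_tasks):
        task = rng.sample(range(n_images), images_per_task)
        without_cache += len(task)  # every image goes through the backbone again
        seen.update(task)
    # With pre-extraction, every image of the dataset is embedded exactly once.
    with_cache = n_images
    return without_cache, with_cache, len(seen)


without_cache, with_cache, distinct = count_forward_passes(
    n_images=1000, images_per_task=50, n_tasks=500
)
print(f"forward passes without caching: {without_cache}")
print(f"forward passes with pre-extraction: {with_cache}")
print(f"distinct images actually sampled: {distinct}")
```

Under these assumptions, the backbone is called 25 times more often without pre-extraction than with it.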

All the necessary tools to do so are available in EasyFSL. In this tutorial, we will show you how to use them.

## Extracting the features

EasyFSL has a `predict_embeddings()` function, which takes as input a DataLoader and a torch Module, and outputs a DataFrame with all your embeddings. Let's use it to extract all the embeddings from the test set of the CUB dataset. As a backbone, we are going to use the Swin Transformer pre-trained on ImageNet and directly available from torchvision. Note that we can do that because there is no intersection between CUB and ImageNet, so we are not technically cheating. Still, the resulting performance cannot be compared with that of a model trained on CUB's train set, since the training data is not the same.

First, we do some necessary configuration (this is not the interesting part).

In [1]:
%cd ..
import copy
from pathlib import Path
import random
from statistics import mean
import numpy as np
import torch
from torch import nn
from tqdm import tqdm
from notebooks.get_dataset import *
import torch
from torch.utils.data import Dataset
from easyfsl.modules import resnet12
from easyfsl.methods import PrototypicalNetworks
from easyfsl.samplers import TaskSampler
from torch.utils.data import DataLoader
from torch.optim import SGD, Optimizer
from torch.optim.lr_scheduler import MultiStepLR
from torch.utils.tensorboard import SummaryWriter
from easyfsl.utils import evaluate

In [2]:
random_seed = 0
np.random.seed(random_seed)
torch.manual_seed(random_seed)
random.seed(random_seed)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

class GestureDataset(Dataset):
    def __init__(self, gesture_data, labels):
        self.data = gesture_data
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        # Convert one sample and its label to tensors on access.
        gesture_sample = torch.tensor(self.data[idx], dtype=torch.float)
        label = torch.tensor(self.labels[idx].astype(np.int64))
        return (gesture_sample, label)

    def get_labels(self):
        # Return the labels of all samples.
        return self.labels

Then we prepare the data and the model.

In [18]:
import torch
from torch import nn
from torch.utils.data import DataLoader
import torchvision.models

batch_size = 16
num_workers = 0
p_ids = [2,5,6,7,8,9,10,11,23,24]
test_path = 'D:/iib_project/data/Gesture_Dataset/gestures/data/test/'
annotation_path = 'D:/iib_project/data/Gesture_Dataset/gestures/annotations/'
testpart1_path = 'test 1-'
testpart2_path = 'test 2-'
testpart3_path = 'test 3-'

multiframe_gestures_test = {'Grab things':testpart2_path, 'Nozzle rotation':testpart1_path,
                            "Teleport":testpart3_path, "Two hands flick":testpart3_path,
                           'Null': testpart1_path}


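# NOTE: `n_test` is not defined in the cells shown here; it must be set beforehand
# (or be provided by the star import from `notebooks.get_dataset`).
test, test_label = get_data_multiframes(test_path, multiframe_gestures_test, p_ids, n_test)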
test_dataset = GestureDataset(test, test_label)

dataloader = DataLoader(
    test_dataset,
    batch_size=batch_size,
    num_workers=num_workers,
    shuffle=False,
)

model = torchvision.models.swin_v2_t(
    weights=torchvision.models.Swin_V2_T_Weights.IMAGENET1K_V1,
)
# Remove the classification head: we want embeddings, not ImageNet predictions
model.head = nn.Flatten()

# If you have a GPU, use it!
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

And now we extract the embeddings. This gives us a DataFrame with the embeddings of all the images in the test set, along with their labels in the `class_name` column.

In [19]:
from easyfsl.utils import predict_embeddings

embeddings_df = predict_embeddings(dataloader, model, device=device)

print(embeddings_df)

Predicting embeddings: 100%|██████████████████████████████████████████████████████████| 4/4 [00:03<00:00,  1.10batch/s]


                                            embedding  class_name
0   [tensor(-0.1855), tensor(-0.0161), tensor(-0.1...           4
1   [tensor(0.2868), tensor(-0.3351), tensor(-0.01...           4
2   [tensor(-0.1425), tensor(0.0792), tensor(-0.38...           4
3   [tensor(-0.3981), tensor(-0.4016), tensor(-0.1...           4
4   [tensor(-0.7728), tensor(-0.6934), tensor(0.15...           4
5   [tensor(0.3577), tensor(0.1011), tensor(-0.169...           4
6   [tensor(-0.1443), tensor(0.0808), tensor(-0.29...           4
7   [tensor(0.0193), tensor(-0.8467), tensor(-0.24...           4
8   [tensor(0.0058), tensor(-0.0739), tensor(-0.09...           4
9   [tensor(-0.2402), tensor(-0.4507), tensor(-0.0...           4
10  [tensor(0.0966), tensor(-0.0187), tensor(-0.44...           5
11  [tensor(-0.2819), tensor(-0.6512), tensor(-0.2...           5
12  [tensor(0.1954), tensor(-0.1632), tensor(-0.00...           5
13  [tensor(-0.0334), tensor(-0.4077), tensor(-0.6...           5
14  [tenso
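
Conceptually, the extraction step is a single pass over the data that stores each embedding next to its label. The sketch below illustrates that idea in plain Python; it is not the implementation of `predict_embeddings()`, and `extract_embeddings` and `toy_backbone` are hypothetical names introduced for illustration. A real implementation would typically also put the model in evaluation mode and disable gradient tracking.

```python
def extract_embeddings(batches, embed_fn):
    """Run every sample through `embed_fn` exactly once and keep the result.

    `batches` yields (samples, labels) pairs, like a DataLoader would.
    `embed_fn` stands in for the frozen backbone (hypothetical placeholder).
    """
    rows = []
    for samples, labels in batches:
        for sample, label in zip(samples, labels):
            rows.append({"embedding": embed_fn(sample), "class_name": label})
    return rows


def toy_backbone(sample):
    # Toy "backbone": maps a list of numbers to a 2-dimensional feature.
    return [sum(sample), max(sample)]


toy_batches = [
    ([[1.0, 2.0], [3.0, 4.0]], [4, 4]),
    ([[5.0, 6.0]], [5]),
]
table = extract_embeddings(toy_batches, toy_backbone)
for row in table:
    print(row)
```

Each row mirrors the two columns of the DataFrame printed above: one embedding and one `class_name`.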

We now have our embeddings ready to use! We will not use the backbone anymore.

## Performing inference on pre-extracted embeddings

To deliver the embeddings to our Few-Shot Classifier, we need an appropriate DataLoader, and therefore a dataset that serves embeddings instead of images. We will use the `FeaturesDataset` class from EasyFSL. Since we have a DataFrame ready to use, we will use the handy `from_dataframe()` initializer from `FeaturesDataset`, but you can also use `from_dict()` to initialize from a dictionary, or the built-in constructor to initialize it directly from labels and embeddings.

In [20]:
from easyfsl.datasets import FeaturesDataset

features_dataset = FeaturesDataset.from_dataframe(embeddings_df)

print(features_dataset[0])

(tensor([-1.8549e-01, -1.6074e-02, -1.4300e-01,  3.3690e-01,  1.8613e-01,
        -8.8492e-02, -8.1999e-02,  7.1264e-02,  2.1094e-01,  1.2463e-01,
         1.5670e-01, -1.6324e-01,  4.5528e-02,  5.7922e-02, -3.6548e-02,
         8.5379e-02,  1.7829e-01, -1.8733e-02,  2.4848e-01, -7.7767e-02,
         4.4384e-02, -4.3345e-02,  8.5886e-02,  5.2851e-02,  7.6360e-02,
         1.6578e-01,  2.2752e-01, -8.3451e-03,  2.9112e-02, -7.2518e-02,
        -1.0330e-01,  7.9279e-02, -3.7797e-02,  7.6986e-02,  2.9532e-02,
         1.0439e-01,  5.2560e-03, -1.8880e-01,  1.2565e-01,  1.4312e-01,
        -4.9465e-02, -1.9007e-01,  3.3652e-01, -2.0023e-03,  8.1917e-02,
        -1.3372e-01, -4.0351e-02, -9.3523e-02, -1.2170e-01, -5.0943e-02,
        -1.0378e-01, -1.5065e-01, -1.0306e-01,  1.8717e-01, -7.0349e-02,
         1.7137e-01, -4.0008e-01, -2.2976e-01,  2.2477e-02, -7.3796e-03,
        -3.7446e-02,  9.1540e-02,  4.4683e-02,  1.8917e-01,  2.4368e-01,
        -9.5658e-02,  2.1460e-01,  2.0822e-01,  7.
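
To make the shape of such a dataset concrete, here is a minimal stand-in written in plain Python. `TinyFeaturesDataset` is a hypothetical class for illustration only, not EasyFSL's `FeaturesDataset`; it assumes that a features dataset only needs to return `(embedding, label)` pairs and expose all labels through `get_labels()`, as the `GestureDataset` defined earlier does.

```python
class TinyFeaturesDataset:
    """Hypothetical stand-in for a dataset of pre-extracted features."""

    def __init__(self, embeddings, labels):
        if len(embeddings) != len(labels):
            raise ValueError("embeddings and labels must have the same length")
        self.embeddings = embeddings
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, index):
        # Items are (features, label) pairs: no image, no backbone involved.
        return self.embeddings[index], self.labels[index]

    def get_labels(self):
        # Lets a task sampler group item indices by class.
        return self.labels


toy_dataset = TinyFeaturesDataset(
    embeddings=[[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    labels=[4, 4, 5],
)
print(len(toy_dataset), toy_dataset[2], toy_dataset.get_labels())
```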

Then, as in all other few-shot tutorials, we are going to build a DataLoader that loads batches in the shape of few-shot tasks:

In [24]:
from easyfsl.samplers import TaskSampler

task_sampler = TaskSampler(
    features_dataset,
    n_way=5,
    n_shot=1,
    n_query=9,
    n_tasks=100,
)
features_loader = DataLoader(
    features_dataset,
    batch_sampler=task_sampler,
    num_workers=num_workers,
    pin_memory=True,
    collate_fn=task_sampler.episodic_collate_fn,
)
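
For intuition about what "batches in the shape of few-shot tasks" means, the sketch below samples one task from a list of labels. It is a simplified illustration, not the code of `TaskSampler`; `sample_task` and `toy_labels` are hypothetical names. It assumes a task is built by picking `n_way` classes, then `n_shot` support items and `n_query` query items per class.

```python
import random
from collections import defaultdict


def sample_task(labels, n_way, n_shot, n_query, rng):
    """Return (support_indices, query_indices) for one few-shot task."""
    items_per_label = defaultdict(list)
    for index, label in enumerate(labels):
        items_per_label[label].append(index)
    # Pick the classes of the task, then distinct items within each class.
    classes = rng.sample(sorted(items_per_label), n_way)
    support, query = [], []
    for label in classes:
        picked = rng.sample(items_per_label[label], n_shot + n_query)
        support.extend(picked[:n_shot])
        query.extend(picked[n_shot:])
    return support, query


# Toy label list: 6 classes with 10 items each.
toy_labels = [label for label in range(6) for _ in range(10)]
support, query = sample_task(
    toy_labels, n_way=5, n_shot=1, n_query=9, rng=random.Random(0)
)
print(len(support), len(query))
```

With this scheme, every class needs at least `n_shot + n_query` items, which is 10 for the settings used in this tutorial.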

We now need to instantiate our Few-Shot Classifier. We will use a Prototypical Network for simplicity, but you can use any other model from EasyFSL.

Since we are working directly on features, **we don't need to initialize Prototypical Networks with a backbone**.

In [25]:
from easyfsl.methods import PrototypicalNetworks

# Default backbone if we don't specify anything is Identity.
# But we specify it anyway for clarity and robustness.
few_shot_classifier = PrototypicalNetworks(backbone=nn.Identity())
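
To see why no backbone is needed, recall the idea behind Prototypical Networks: each class is represented by the mean of its support embeddings, and each query is assigned to the closest prototype. The sketch below applies that idea directly to feature vectors in plain Python. It is an illustration of the general technique with hypothetical helper names, not EasyFSL's implementation, which works on tensors.

```python
import math


def mean_vector(vectors):
    # Coordinate-wise mean of a list of equally sized vectors.
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def classify_by_prototype(support_features, support_labels, query_features):
    # One prototype per class: the mean of that class's support embeddings.
    by_label = {}
    for feature, label in zip(support_features, support_labels):
        by_label.setdefault(label, []).append(feature)
    prototypes = {label: mean_vector(fs) for label, fs in by_label.items()}
    # Each query gets the label of the closest prototype (Euclidean distance).
    return [
        min(prototypes, key=lambda label: math.dist(query, prototypes[label]))
        for query in query_features
    ]


support_features = [[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [1.0, 0.8]]
support_labels = [4, 4, 5, 5]
predictions = classify_by_prototype(
    support_features, support_labels, [[0.1, 0.1], [0.9, 1.0]]
)
print(predictions)
```

Everything here operates on embeddings, so the only work left at test time is averaging and distance computation.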

We can now evaluate our model on the test set, and just enjoy how fast it goes:

In [26]:
from easyfsl.utils import evaluate

accuracy = evaluate(
    few_shot_classifier,
    features_loader,
    device="cpu",
)

print(f"Average accuracy : {(100 * accuracy):.2f} %")

100%|███████████████████████████████████████████████████████████████| 100/100 [00:00<00:00, 109.41it/s, accuracy=0.323]

Average accuracy : 32.31 %
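
As a sanity check when reading this number, it helps to compare it with chance level: with 5-way tasks, random guessing is right 20 % of the time. The sketch below shows one way to aggregate accuracy over tasks, assuming correct predictions are pooled over all queries of all tasks; the per-task counts are made-up toy values, and `aggregate_accuracy` is a hypothetical helper, not the `evaluate()` function.

```python
def aggregate_accuracy(task_results):
    """`task_results` is a list of (n_correct, n_queries) pairs, one per task."""
    correct = sum(n_correct for n_correct, _ in task_results)
    total = sum(n_queries for _, n_queries in task_results)
    return correct / total


# Toy values: three 5-way tasks with 45 queries each.
toy_results = [(15, 45), (18, 45), (12, 45)]
toy_accuracy = aggregate_accuracy(toy_results)
chance_level = 1 / 5
print(f"accuracy: {100 * toy_accuracy:.2f} %, chance: {100 * chance_level:.2f} %")
```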





And that is it! Notice that when you're working on pre-extracted embeddings, you can process tasks much faster (about 109 tasks/s in the run above). This should be your default setting whenever you're working with a method that uses a frozen backbone at test time (that's most of them).

## Conclusion
Thanks for following this tutorial. If you run into any issue, please [raise one](https://github.com/sicara/easy-few-shot-learning/issues), and if EasyFSL is helping you, do not hesitate to [star the repository](https://github.com/sicara/easy-few-shot-learning).