# Embedding Characteristics

In this notebook we test how good our SSL pretrained weights are.
- We query images from different classes and compare their embeddings. This gives insight into the intraclass and interclass variability.
    - Intraclass variance: variance within one class (it measures the differences between the individual embeddings within each class).
    - Interclass variance: variance between different classes (it measures the differences between the means of the classes).
- Note: run this notebook with a kernel from your venv so that the vissl libs are available: https://janakiev.com/blog/jupyter-virtual-envs/#add-virtual-environment-to-jupyter-notebook
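
The two definitions above can be made concrete with a minimal sketch. It assumes the embeddings are available as an array with one row per image plus a parallel list of class labels. `class_variances` is a hypothetical helper (not part of VISSL) and the toy data is illustrative only.

```python
import numpy as np

def class_variances(embeddings, labels):
    """Return (intraclass, interclass) variance for row-wise embeddings.

    Hypothetical helper: intraclass is the mean squared distance of each
    embedding to its own class mean; interclass is the mean squared distance
    of each class mean to the mean of the class means.
    """
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)

    # One mean vector per class, shape (n_classes, n_features).
    means = np.stack([embeddings[labels == c].mean(axis=0) for c in classes])

    # Spread of the individual embeddings around their own class mean.
    intra = np.mean([
        np.sum((embeddings[labels == c] - m) ** 2, axis=1).mean()
        for c, m in zip(classes, means)
    ])

    # Spread of the class means around their common centre.
    inter = np.sum((means - means.mean(axis=0)) ** 2, axis=1).mean()
    return float(intra), float(inter)

# Toy data (illustrative only): two tight, well-separated classes.
toy_embeddings = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
toy_labels = ["a", "a", "b", "b"]
intra, inter = class_variances(toy_embeddings, toy_labels)
print(intra, inter)
```

Embeddings that separate the classes well should give a small intraclass value relative to the interclass value.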

## Imports
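- matplotlib for visualisation
- torch and torchvision for the model and the image transforms
- numpy and pandas for the statistics tables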

In [1]:
%matplotlib inline

In [2]:
import torch
import torchvision
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from pathlib import Path
from IPython.display import display

## Reading in pretrained weights

### Option 1: Imagenet pretrained
- Load the best ImageNet pretrained weights, docs: https://pytorch.org/vision/stable/models.html
- This is currently ResNet50_Weights.IMAGENET1K_V2 with a top-1 accuracy of 80.858% on ImageNet-1K
- The weights are cached in /home/olivier/.cache/torch/hub/checkpoints/resnet50-11ad3fa6.pth


In [3]:
#imgnet weights
#model = torchvision.models.resnet50(weights=torchvision.models.ResNet50_Weights.DEFAULT)
# model = torchvision.models.resnet50(pretrained=True)
#torch.save(model.state_dict(),"resnet50_imgnet.pth")
#weights = torch.load("resnet50_imgnet.pth")
#print(weights.keys())
#print(model)

### Option 2: SSL pretrained
Load weights from checkpoint according to vissl tutorial:
https://github.com/facebookresearch/vissl/blob/v0.1.6/tutorials/Using_a_pretrained_model_for_inference_V0_1_6.ipynb


In [4]:
#dictionary that summarizes, per model, the path to the training config used and the path to the weights
#train_config is a relative path from the vissl folder
#weights is an absolute path to where the final checkpoint (.torch file) is stored
BASE_DIR_WEIGHTS = "/home/olivier/Documents/master/mp/checkpoints/"
PATHS = {
    "rotnet":
    {
        "train_config": "validation/rotnet_full/train_config.yaml", #relative path from vissl/...
        "weights": BASE_DIR_WEIGHTS + "sku110k/rotnet_full/model_final_checkpoint_phase104.torch",
    },
    "jigsaw":
    {
        "train_config": "validation/jigsaw_full/train_config.yaml",
        "weights": BASE_DIR_WEIGHTS + "sku110k/jigsaw_full/model_final_checkpoint_phase104.torch"
    },
    "moco32":
    {
        "train_config": "validation/moco_full_32/train_config.yaml",
        "weights": BASE_DIR_WEIGHTS + "sku110k/moco_full_32/model_final_checkpoint_phase99.torch"
    },
    "moco64":
    {
        "train_config": "validation/moco_full_64/train_config.yaml",
        "weights": BASE_DIR_WEIGHTS + "sku110k/moco_full_64/model_final_checkpoint_phase99.torch"
    },
    "simclr":
    {
        "train_config": "validation/simclr_full/train_config.yaml",
        "weights": BASE_DIR_WEIGHTS + "sku110k/simclr_full/model_final_checkpoint_phase99.torch"
    },
    "swav":
    {
        "train_config": "validation/swav_full/train_config.yaml",
        "weights": BASE_DIR_WEIGHTS + "sku110k/swav_full/model_final_checkpoint_phase99.torch"
    }
    
}

#CHOOSE the model you want to validate here
train_config = PATHS["moco64"]["train_config"] #change the key of the PATHS dict to the desired model name
weights_file = PATHS["moco64"]["weights"]
print('Train config at (relative path from vissl/...):\n' + train_config)
print('SSL pretrained weights at:\n' + weights_file)

Train config at (relative path from vissl/...):
validation/moco_full_64/train_config.yaml
SSL pretrained weights at:
/home/olivier/Documents/master/mp/checkpoints/sku110k/moco_full_64/model_final_checkpoint_phase99.torch
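
Selecting a model by editing two dictionary lookups is easy to get out of sync. A small lookup function is one possible alternative, sketched below under assumptions: `select_paths` is a hypothetical helper, and `example_paths` is a stand-in with a deliberately non-existent weights path.

```python
from pathlib import Path

def select_paths(paths, model_name):
    """Return (train_config, weights_file) for one entry of a PATHS-style dict."""
    if model_name not in paths:
        raise KeyError(f"Unknown model {model_name!r}; choose one of {sorted(paths)}")
    entry = paths[model_name]
    if not Path(entry["weights"]).is_file():
        # Only warn: the config may still be useful without the checkpoint.
        print(f"Warning: no checkpoint found at {entry['weights']}")
    return entry["train_config"], entry["weights"]

# Illustrative stand-in for the PATHS dict; the weights path is made up.
example_paths = {
    "moco64": {
        "train_config": "validation/moco_full_64/train_config.yaml",
        "weights": "/nonexistent/model_final_checkpoint_phase99.torch",
    }
}
train_config_example, weights_example = select_paths(example_paths, "moco64")
print(train_config_example)
```

With the real dictionary this would be called as `select_paths(PATHS, "moco64")`, so the model name only has to be changed in one place.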


In [5]:
from omegaconf import OmegaConf
from vissl.utils.hydra_config import AttrDict
from vissl.utils.hydra_config import compose_hydra_configuration, convert_to_attrdict

# 1. Checkpoint config is located at vissl/configs/config/validation/*/train_config.yaml.
# 2. Weights are located at /home/olivier/Documents/master/mp/checkpoints/sku110k/*
# The * in the above paths stands for the run folder of the chosen model in PATHS,
# e.g. rotnet_full, jigsaw_full or moco_full_64.
# All other options specified below override the train_config.yaml config.

cfg = [
  'config=' + train_config,
  'config.MODEL.WEIGHTS_INIT.PARAMS_FILE=' + weights_file, # Specify path for the model weights.
  'config.MODEL.FEATURE_EVAL_SETTINGS.EVAL_MODE_ON=True', # Turn on model evaluation mode.
  'config.MODEL.FEATURE_EVAL_SETTINGS.FREEZE_TRUNK_ONLY=True', # Freeze trunk. 
  'config.MODEL.FEATURE_EVAL_SETTINGS.EXTRACT_TRUNK_FEATURES_ONLY=True', # Extract the trunk features, as opposed to the HEAD.
  'config.MODEL.FEATURE_EVAL_SETTINGS.SHOULD_FLATTEN_FEATS=False', # Do not flatten features.
  'config.MODEL.FEATURE_EVAL_SETTINGS.LINEAR_EVAL_FEAT_POOL_OPS_MAP=[["res5avg", ["Identity", []]]]' # Extract only the res5avg features.
]

# Compose the hydra configuration.
cfg = compose_hydra_configuration(cfg)
# Convert to AttrDict. This method will also infer certain config options
# and validate the config.
_, cfg = convert_to_attrdict(cfg)

Missing @package directive config/validation/moco_full_64/train_config.yaml in pkg://configs.
See https://hydra.cc/docs/next/upgrades/0.11_to_1.0/adding_a_package_directive
** Please migrate to the version in iopath repo. **
https://github.com/facebookresearch/iopath 



Now let's build the model with the exact training configs:

In [6]:
from vissl.models import build_model

model = build_model(cfg.MODEL, cfg.OPTIMIZER)

#### Loading the pretrained weights

In [7]:
from classy_vision.generic.util import load_checkpoint
from vissl.utils.checkpoint import init_model_from_consolidated_weights

# Load the checkpoint weights.
weights = load_checkpoint(checkpoint_path=cfg.MODEL.WEIGHTS_INIT.PARAMS_FILE)


# Initialize the model with the weights of the selected SSL checkpoint.
init_model_from_consolidated_weights(
    config=cfg,
    model=model,
    state_dict=weights,
    state_dict_key_name="classy_state_dict",
    skip_layers=[],  # Use this if you do not want to load all layers
)

print("Weights have loaded")

Weights have loaded


#### Extra info
- VISSL uses the ResNeXT50 class, which is its own custom wrapper class.
    - The ResNeXT50 wrapper class is defined at https://github.com/facebookresearch/vissl/blob/04788de934b39278326331f7a4396e03e85f6e55/vissl/models/trunks/resnext.py
    - See the ResNet base class at https://github.com/pytorch/vision/blob/main/torchvision/models/resnet.py for the interface of the __init__ method.
    - The model inside this wrapper class is a torchvision.models.ResNet(), which we will reconstruct here based on the YAML config parameters.
- Checkpoints from pretraining are stored in /home/olivier/Documents/master/mp/checkpoints/sku110k/
    - Checkpoints have phase numbers: in VISSL, if the workflow involves both training and testing, the number of phases = train phases + test epochs. So if we alternate train and test, the phase number is: 0 (train), 1 (test), 2 (train), 3 (test)... and train_phase_idx is then: 0 (corresponds to phase 0), 1 (corresponds to phase 2), ...
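    - The trunk weights are stored in a nested dict inside the checkpoint, under classy_state_dict → base_model → model → trunk (see the commented print at the end of the next cell).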
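
The phase numbering described above comes down to simple arithmetic. The sketch below follows that description only and is not taken from the VISSL code; `phase_idx_for` is a hypothetical helper.

```python
def phase_idx_for(train_phase_idx, alternate_train_test):
    """Map a train phase index to the overall phase index.

    Follows the numbering described above: with alternating train/test phases
    the train phases occupy the even slots (0, 2, 4, ...); with training only,
    both counters coincide.
    """
    return 2 * train_phase_idx if alternate_train_test else train_phase_idx

print([phase_idx_for(i, alternate_train_test=True) for i in range(4)])   # [0, 2, 4, 6]
print([phase_idx_for(i, alternate_train_test=False) for i in range(4)])  # [0, 1, 2, 3]
```

The checkpoint inspected in the next cell reports the same value for phase_idx and train_phase_idx, which is what a run without interleaved test phases would produce.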

In [8]:
print("Loading vissl checkpoint")
ssl_checkpoint = torch.load(Path(weights_file))
print("Checkpoint contains:")
dataframe_dict = dict()
dataframe_dict["phase_idx"] = ssl_checkpoint["phase_idx"]
dataframe_dict["iteration_num"] = ssl_checkpoint["iteration_num"]
dataframe_dict["train_phase_idx"] = ssl_checkpoint["train_phase_idx"]
dataframe_dict["iteration"] = ssl_checkpoint["iteration"]
dataframe_dict["type"] = ssl_checkpoint["type"]
df = pd.DataFrame(data=dataframe_dict.values(), index=dataframe_dict.keys(),columns=["Value"])
display(df)
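#check that both keys are present (a bare tuple in the condition would always be truthy)
if all(key in ssl_checkpoint for key in ("loss", "classy_state_dict")):
    print("Checkpoint also contains elements loss and classy_state_dict")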

#the weights of the trunk resnet network are stored in a nested dict:    
#print(ssl_checkpoint["classy_state_dict"]["base_model"]["model"]["trunk"].keys())

Loading vissl checkpoint
Checkpoint contains:


NVIDIA GeForce RTX 4070 Laptop GPU with CUDA capability sm_89 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
If you want to use the NVIDIA GeForce RTX 4070 Laptop GPU GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/



| | Value |
|---|---|
| phase_idx | 99 |
| iteration_num | 1880199 |
| train_phase_idx | 99 |
| iteration | 1880100 |
| type | consolidated |


Checkpoint also contains elements loss and classy_state_dict
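
To reuse the trunk weights in a plain torchvision ResNet, the keys of the nested trunk dict usually have to be renamed first. The sketch below shows the idea on a toy dict. It assumes the trunk keys carry a `_feature_blocks.` prefix, which should be checked against the real keys by printing them. `strip_trunk_prefix` is a hypothetical helper.

```python
def strip_trunk_prefix(trunk_state, prefix="_feature_blocks."):
    """Return a copy of a trunk state dict with `prefix` removed from the keys.

    Assumption: the VISSL trunk keys start with "_feature_blocks."; print the
    real keys first and adjust `prefix` if they differ.
    """
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in trunk_state.items()
    }

# Toy stand-in for ssl_checkpoint["classy_state_dict"]["base_model"]["model"]["trunk"].
toy_trunk = {"_feature_blocks.conv1.weight": 0, "_feature_blocks.layer1.0.bn1.bias": 1}
renamed = strip_trunk_prefix(toy_trunk)
print(renamed)
```

With the real checkpoint, the renamed dict could then be passed to `load_state_dict(..., strict=False)` of a torchvision ResNet-50; `strict=False` is needed because the trunk has no `fc` layer.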


## Extracting features

In [9]:
from PIL import Image
import torchvision.transforms as transforms

def extract_features(path):
    image = Image.open(path)
    # Convert images to RGB. This is important
    # as the model was trained on RGB images.
    image = image.convert("RGB")

    # Image transformation pipeline.
    pipeline = transforms.Compose([
      transforms.CenterCrop(224),
      transforms.ToTensor(),
    ])
    x = pipeline(image)

    # unsqueeze adds a batch dimension, so the model sees a batch of one image.
    # no_grad: this is inference only, so no autograd graph needs to be built.
    with torch.no_grad():
        features = model(x.unsqueeze(0))

    # The model returns a list of feature tensors; the first entry holds
    # the requested res5avg features.
    return features[0]

In [10]:
CornerShop = Path("/home/olivier/Documents/master/mp/CornerShop/CornerShop/crops")

#collect all jpg files one folder level below the CornerShop crops directory
img_paths = list(CornerShop.glob("*/*.jpg")) #use "**/*.jpg" to search all subdirs recursively
#only the first 20 images are used as query images in this notebook
query_paths = img_paths[:20]
#the label of an image is the name of its parent folder
#(path.name keeps the full name; path.stem would cut off everything after the last dot)
labels = [p.parent.name for p in query_paths]
fts_stack = torch.stack([extract_features(p).squeeze() for p in query_paths])
print(fts_stack.shape)
print(labels[0:10])

torch.Size([20, 2048])
['CawstonDry', 'CawstonDry', 'CawstonDry', 'MinuteMaidAppelPerzik', 'CarrefourSmoothieAardbeiBlauweBessen', 'CarrefourSmoothieAardbeiBlauweBessen', 'CarrefourSmoothieAardbeiBlauweBessen', 'CarrefourSmoothieAardbeiBlauweBessen', 'GiniZeroFles1,5L', 'GiniZeroFles1,5L']


Results of the feature extraction:
- fts_stack: a stack of features from multiple query images, with n rows (one per image) and 2048 columns (features)
- labels: list with the class label (folder name) of each row in fts_stack

## Comparing features
Here we investigate the relations between the features of different images with:
- Inner product
- Cosine similarity
- Euclidean distance
- Euclidean distance (normalized features)
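
For reference, the four measures can be written out for two feature vectors $f_i, f_j \in \mathbb{R}^{2048}$. The symbols $s_{ij}$, $c_{ij}$, $d_{ij}$ and $\hat{d}_{ij}$ are introduced here for illustration only.

```latex
\begin{aligned}
s_{ij} &= f_i^\top f_j
  && \text{(inner product)} \\
c_{ij} &= \frac{f_i^\top f_j}{\lVert f_i \rVert \, \lVert f_j \rVert}
  && \text{(cosine similarity)} \\
d_{ij} &= \lVert f_i - f_j \rVert
  && \text{(Euclidean distance)} \\
\hat{d}_{ij} &= \left\lVert \frac{f_i}{\lVert f_i \rVert} - \frac{f_j}{\lVert f_j \rVert} \right\rVert
  = \sqrt{2 - 2\,c_{ij}}
  && \text{(Euclidean distance, normalized features)}
\end{aligned}
```

The last identity means that cosine similarity and normalized Euclidean distance rank image pairs identically. As a check against the recorded outputs, a cosine similarity of 0.9312 gives $\sqrt{2 - 2 \cdot 0.9312} \approx 0.3709$, which is the normalized distance printed for the same pair.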

In [11]:
#data to save statistics about the feature comparisons
data = np.zeros((4,4),dtype=float)
data_index=["max","min","avg","std_dev"] #labels for dataframe
data_columns=np.empty((4),dtype=object) #numpy array of strings (objects)

### Inner product

In [12]:
#multiply the features of one tensor with all other tensors
inner_product = fts_stack.matmul(fts_stack.T)
print("Three examples from inner_product tensor:\n{}".format(inner_product[0:3]))

#save statistics data
data_columns[0] = "inner_product"
data[0][0]= inner_product.max()
data[1][0]= inner_product.min()
data[2][0]= torch.mean(inner_product)
data[3][0]= torch.std(inner_product)

#display statistics of inner product
df = pd.DataFrame(data=data[:,0],index=data_index, columns=[data_columns[0]])
display(df)

Three examples from inner_product tensor:
tensor([[1.3904, 1.2524, 1.2929, 1.3192, 1.2936, 1.2811, 1.2357, 1.2880, 1.2943,
         1.2757, 1.3023, 1.3063, 1.2646, 1.3191, 1.2566, 1.2995, 1.2536, 1.3049,
         1.2246, 1.3002],
        [1.2524, 1.3009, 1.2350, 1.2795, 1.2477, 1.2393, 1.1961, 1.2399, 1.2564,
         1.2308, 1.2538, 1.2570, 1.2235, 1.2785, 1.2219, 1.2604, 1.2163, 1.2590,
         1.1932, 1.2535],
        [1.2929, 1.2350, 1.3375, 1.2941, 1.2702, 1.2618, 1.2182, 1.2656, 1.2738,
         1.2506, 1.2807, 1.2823, 1.2465, 1.3012, 1.2449, 1.2836, 1.2342, 1.2807,
         1.2075, 1.2773]])


| | inner_product |
|---|---|
| max | 1.465119 |
| min | 1.180677 |
| avg | 1.290121 |
| std_dev | 0.044062 |


### Cosine similarity
Here we L2-normalize the features and then calculate the inner product of each feature vector with all the others.

In [13]:
#NORMALIZE features in feature stack:
fts_stack_norm = fts_stack / fts_stack.norm(dim=1,keepdim=True) 
#fts.norm(dim=1,keepdim=True)
# dim=1: calculate norm over the second dimension (features/columns)
# keepdim=True: keep batch/stack dimension of features

#Tensor.norm is deprecated; for a row-wise vector norm the replacement is torch.linalg.vector_norm (matrix_norm would compute a single norm over the whole matrix): https://pytorch.org/docs/stable/generated/torch.linalg.vector_norm.html
#fts_stack_norm = fts_stack / torch.linalg.vector_norm(fts_stack, dim=1, keepdim=True) #newer version
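
The row-wise normalization can be written in several equivalent ways. The sketch below compares three of them on a small random stand-in tensor (`toy_stack` is illustrative, not the real `fts_stack`). It assumes a PyTorch version that ships `torch.linalg.vector_norm`.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
toy_stack = torch.rand(4, 8)  # stand-in for fts_stack: 4 images, 8 features

# 1) Notebook version: divide each row by its L2 norm.
norm_a = toy_stack / toy_stack.norm(dim=1, keepdim=True)

# 2) Same computation with torch.linalg.vector_norm.
norm_b = toy_stack / torch.linalg.vector_norm(toy_stack, dim=1, keepdim=True)

# 3) Built-in helper; it clamps the norm by a small eps to avoid division by zero.
norm_c = F.normalize(toy_stack, p=2, dim=1)

print(torch.allclose(norm_a, norm_b), torch.allclose(norm_a, norm_c))
print(norm_c.norm(dim=1))  # every row now has unit length
```

`F.normalize` is probably the most compact choice when an all-zero feature vector cannot be ruled out.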

In [14]:
#calculate cosine similarity (cosim)
#fts_stack_norm is a matrix with n rows and 2048 columns (L2-normalized features)
#matrix product fts_stack_norm * fts_stack_norm^T = cosine similarity of every image with every image in the stack
cosim = fts_stack_norm.matmul(fts_stack_norm.T)
print("Three examples from cosim tensor:\n{}".format(cosim[0:3]))

#save statistics data
data_columns[1] = "cosim" 
data[0][1]= cosim.max()
data[1][1]= cosim.min()
data[2][1]= torch.mean(cosim)
data[3][1]= torch.std(cosim)

#display statistics of cosim
df = pd.DataFrame(data=data[:,1],index=data_index, columns=[data_columns[1]])
display(df)

Three examples from cosim tensor:
tensor([[1.0000, 0.9312, 0.9480, 0.9310, 0.9250, 0.9285, 0.9254, 0.9270, 0.9295,
         0.9246, 0.9295, 0.9253, 0.9227, 0.9242, 0.9247, 0.9235, 0.9154, 0.9311,
         0.9180, 0.9259],
        [0.9312, 1.0000, 0.9362, 0.9335, 0.9224, 0.9286, 0.9261, 0.9226, 0.9328,
         0.9223, 0.9252, 0.9205, 0.9229, 0.9260, 0.9296, 0.9261, 0.9182, 0.9288,
         0.9247, 0.9229],
        [0.9480, 0.9362, 1.0000, 0.9312, 0.9261, 0.9325, 0.9302, 0.9288, 0.9327,
         0.9242, 0.9321, 0.9262, 0.9273, 0.9295, 0.9341, 0.9301, 0.9189, 0.9318,
         0.9229, 0.9274]])


| | cosim |
|---|---|
| max | 1.000001 |
| min | 0.915432 |
| avg | 0.936654 |
| std_dev | 0.016745 |


### Euclidean distance

In [15]:
eucl_dist = [] 
for tensor in fts_stack:
    d = [] #store all distances from this tensor to all the other tensors
    for other_tensor in fts_stack:
        d_to = (tensor - other_tensor).pow(2).sum().sqrt() #d(tensor, other_tensor)=euclid distance
        d.append(d_to)
    d = torch.tensor(d)
    #print("distance tensor has shape {}".format(d.shape))
    #add this row of distances to the Euclidean distance matrix
    eucl_dist.append(d)
eucl_dist = torch.stack(eucl_dist)
#print("eucl_dist has shape {}".format(eucl_dist.shape))
print("Three examples from eucl_dist tensor:\n{}".format(eucl_dist[0:3]))

#save statistics data
data_columns[2] = "eucl_dist" 
data[0][2]= eucl_dist.max()
data[1][2]= eucl_dist.min()
data[2][2]= torch.mean(eucl_dist)
data[3][2]= torch.std(eucl_dist)

#display statistics of Euclidean distance
df = pd.DataFrame(data=data[:,2],index=data_index, columns=[data_columns[2]])
display(df)

Three examples from eucl_dist tensor:
tensor([[0.0000, 0.4319, 0.3771, 0.4429, 0.4582, 0.4442, 0.4489, 0.4505, 0.4431,
         0.4561, 0.4446, 0.4595, 0.4606, 0.4662, 0.4531, 0.4642, 0.4816, 0.4394,
         0.4702, 0.4562],
        [0.4319, 0.0000, 0.4105, 0.4313, 0.4605, 0.4374, 0.4371, 0.4576, 0.4274,
         0.4565, 0.4529, 0.4693, 0.4527, 0.4572, 0.4304, 0.4516, 0.4659, 0.4420,
         0.4409, 0.4604],
        [0.3771, 0.4105, 0.0000, 0.4397, 0.4513, 0.4277, 0.4283, 0.4412, 0.4295,
         0.4530, 0.4333, 0.4540, 0.4421, 0.4475, 0.4193, 0.4407, 0.4668, 0.4343,
         0.4500, 0.4483]])


| | eucl_dist |
|---|---|
| max | 0.481599 |
| min | 0.0 |
| avg | 0.407248 |
| std_dev | 0.097231 |
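
The nested loop above makes the computation explicit, but the same pairwise distance matrix can be obtained in one call with `torch.cdist`. The following sketch checks both versions against each other on a small random stand-in tensor (`toy_stack` is illustrative).

```python
import torch

torch.manual_seed(0)
toy_stack = torch.rand(5, 16)  # stand-in for fts_stack

# Loop version, same arithmetic as in the cell above.
loop_dist = torch.stack([
    torch.stack([(a - b).pow(2).sum().sqrt() for b in toy_stack])
    for a in toy_stack
])

# Vectorised version: all pairwise L2 distances in one call.
cdist_dist = torch.cdist(toy_stack, toy_stack, p=2)

print(torch.allclose(loop_dist, cdist_dist, atol=1e-5))
```

For large stacks `torch.cdist` may use a matrix-multiplication shortcut, so its results can differ marginally from the loop, for instance tiny non-zero values on the diagonal.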


### Euclidean distance (normalized features)
The same computation as above, applied to the L2-normalized features.

In [16]:
eucl_dist_norm = [] 
for tensor in fts_stack_norm:
    d = [] #store all distances from this tensor to all the other tensors
    for other_tensor in fts_stack_norm:
        d_to = (tensor - other_tensor).pow(2).sum().sqrt() #d(tensor, other_tensor)=euclid distance
        d.append(d_to)
    d = torch.tensor(d)
    #print("distance tensor has shape {}".format(d.shape))
    #add this row of distances to the normalized Euclidean distance matrix
    eucl_dist_norm.append(d)
eucl_dist_norm = torch.stack(eucl_dist_norm)
#print("eucl_dist_norm has shape {}".format(eucl_dist_norm.shape))
print("Three examples from eucl_dist_norm tensor:\n{}".format(eucl_dist_norm[0:3]))

#save statistics data
data_columns[3] = "eucl_dist_norm"
data[0][3]= eucl_dist_norm.max()
data[1][3]= eucl_dist_norm.min()
data[2][3]= torch.mean(eucl_dist_norm)
data[3][3]= torch.std(eucl_dist_norm)

#display statistics of Euclidean distance (normalized features)
df = pd.DataFrame(data=data[:,3],index=data_index, columns=[data_columns[3]])
display(df)

Three examples from eucl_dist_norm tensor:
tensor([[0.0000, 0.3709, 0.3223, 0.3716, 0.3874, 0.3781, 0.3864, 0.3822, 0.3755,
         0.3883, 0.3755, 0.3864, 0.3932, 0.3893, 0.3880, 0.3911, 0.4113, 0.3711,
         0.4050, 0.3849],
        [0.3709, 0.0000, 0.3571, 0.3647, 0.3940, 0.3778, 0.3845, 0.3934, 0.3667,
         0.3943, 0.3869, 0.3987, 0.3927, 0.3846, 0.3753, 0.3844, 0.4044, 0.3774,
         0.3880, 0.3927],
        [0.3223, 0.3571, 0.0000, 0.3710, 0.3845, 0.3675, 0.3737, 0.3775, 0.3669,
         0.3892, 0.3686, 0.3843, 0.3813, 0.3755, 0.3631, 0.3739, 0.4027, 0.3694,
         0.3928, 0.3809]])


| | eucl_dist_norm |
|---|---|
| max | 0.411261 |
| min | 0.0 |
| avg | 0.346172 |
| std_dev | 0.082906 |


Let's compare the statistics of these 4 methods:

In [17]:
df = pd.DataFrame(data=data,index=data_index, columns=data_columns)
display(df)

| | inner_product | cosim | eucl_dist | eucl_dist_norm |
|---|---|---|---|---|
| max | 1.465119 | 1.000001 | 0.481599 | 0.411261 |
| min | 1.180677 | 0.915432 | 0.0 | 0.0 |
| avg | 1.290121 | 0.936654 | 0.407248 | 0.346172 |
| std_dev | 0.044062 | 0.016745 | 0.097231 | 0.082906 |
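
These statistics are taken over all pairs in the stack, including each image compared with itself, which is why the maximum cosine similarity is 1 and the minimum distance is 0. To relate the numbers to intraclass and interclass variability, the pairs can be split by label. The sketch below shows one possible way on toy data; `split_pair_stats` is a hypothetical helper.

```python
import torch

def split_pair_stats(similarity, labels):
    """Mean similarity for same-class and different-class pairs.

    Hypothetical helper: `similarity` is an (n, n) matrix such as cosim and
    `labels` holds the n class names. Self-comparisons are excluded.
    """
    n = len(labels)
    same = torch.tensor([[a == b for b in labels] for a in labels])
    off_diag = ~torch.eye(n, dtype=torch.bool)
    intra = similarity[same & off_diag].mean().item()
    inter = similarity[~same].mean().item()
    return intra, inter

# Toy similarity matrix (illustrative only): images 0 and 1 share a class.
toy_sim = torch.tensor([
    [1.0, 0.9, 0.2],
    [0.9, 1.0, 0.4],
    [0.2, 0.4, 1.0],
])
toy_labels = ["a", "a", "b"]
intra_mean, inter_mean = split_pair_stats(toy_sim, toy_labels)
print(intra_mean, inter_mean)
```

With the real data this would be called as `split_pair_stats(cosim, labels)`. If no class has more than one image, the intraclass mean is NaN because there are no same-class pairs to average.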
