# **Oxford Flowers Clustering Notebook**

In this notebook, we:
1. Load the validation and test splits of the Oxford Flower Dataset, merging them.
2. Compute (or load) the VLAD vectors from the deep extractor.
3. Cluster these images into 102 clusters (the number of classes).
4. Compute ARI, NMI, and interpret the results.

---

## **1. Setup and Load Data**


In [1]:
import numpy as np
from torchvision import transforms
from torchvision.models import vgg16, VGG16_Weights
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

from src.datasets import OxfordFlowerDataset
from src.utils import cluster_images_and_generate_statistics
from src.utils import *
from src.features._features import DeepConvFeatureExtractor
from src.metrics.vlad import VLADEncoder
from src.config import ROOT, DEVICE

Device used: cuda


Since we are using the deep extractor, we need to resize the images to 224x224 pixels and turn them into tensors.

In [2]:
transformer = transforms.Compose([
    transforms.ToTensor(),
    transforms.Resize((224, 224))
])

# Load the validation and test datasets.

In [3]:
val_dataset = OxfordFlowerDataset(
    purpose="validation",
    transform=transformer
)

test_dataset = OxfordFlowerDataset(
    purpose="test",
    transform=transformer
)

# Merge them

In [4]:
images, labels = [], []
for (val_img, val_label, _), (test_img, test_label, _) in zip(val_dataset, test_dataset):
    images.append(val_img)
    images.append(test_img)
    labels.append(val_label)
    labels.append(test_label)

## **2. Compute VLAD vectors**


### **2.1 Load the PCA and KMeans models**

This requires that you already trained your PCA and KMeans models on the deep features extracted from the Oxford Flowers dataset and saved them as `pickle` files. If not, you can go to notebook `vlad_with_vgg16_embeddings.ipynb` to train them.

In [5]:
pca = load_model(rf'{ROOT}/models/pickle_model_files/pca_vlad_k256_deep_features_vgg16_feature_dim257.pkl')
k_means = load_model(rf'{ROOT}/models/pickle_model_files/k_means_k256_deep_features_vgg16_pca.pkl')

### **2.2 Define the feature extractor and the VLAD encoder**

In [6]:
extractor = DeepConvFeatureExtractor(
    model=vgg16(weights=VGG16_Weights.DEFAULT),
    layer_index=-1,  # Last conv layer
    append_spatial_coords=True,
    device=DEVICE
)

vlad_encoder = VLADEncoder(
    feature_extractor=extractor,
    kmeans_model=k_means,
    pca=pca,
    power_norm_weight=1.0,
)

2025-01-05 13:52:01,717 - Feature_Extractor - INFO - Device used: cuda
2025-01-05 13:52:01,719 - Feature_Extractor - INFO - Selected layer: features.28, Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))


### **2.3 Compute the VLAD vectors for both the validation and test splits**

In [7]:
vlad_vectors = vlad_encoder.transform((img for img in images))

## **3. Cluster into 102 Clusters and Compute ARI, NMI**

`102` is the number of classes in the Oxford Flowers dataset. We want to see how well the clustering algorithm can cluster the images into these classes.


In [8]:
num_classes = 102
results = cluster_images_and_generate_statistics(
    features=vlad_vectors,        # The subset corresponding to val+test
    true_labels=np.array(labels), # The true labels
    n_clusters=num_classes,
    method='kmeans'
)

print(f"Clustering with KMeans into {num_classes} clusters:")
print("RI:", results["ri"])
print("ARI:", results["ari"])
print("NMI:", results["nmi"])




Clustering with KMeans into 102 clusters:
RI: 0.7258474454028792
ARI: 0.009982034303317965
NMI: 0.1241525345925399


## **6. Conclusion**

We've demonstrated:
- How to handle the mismatch between ground-truth class IDs (1..102) and cluster labels (0..101).
- How to cluster images directly on deep-based VLAD vectors.
- How to compute ARI and NMI for objective evaluation.
