### Training PatchCore and EfficientAD Anomaly Detection Models on MVTec AD Dataset and Evaluating AUROC Scores

In [5]:
import numexpr
numexpr.set_num_threads(22) #use all 22 CPU cores on my local device

from anomalib.data import MVTec


In [33]:
#force the "fork" start method so DataLoader worker processes inherit the parent's state instead of receiving pickled copies
#pickling error seen otherwise: anomalib objects failed to serialize when being sent to a newly spawned child process

import multiprocessing
multiprocessing.set_start_method("fork", force=True)
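
The "fork" start method is not offered on every platform, so forcing it unconditionally can raise an error elsewhere. A minimal sketch of a guarded choice follows; `pick_start_method` is a hypothetical helper name, not part of anomalib or the standard library.

```python
import multiprocessing as mp


def pick_start_method(preferred: str = "fork") -> str:
    """Return `preferred` if this platform supports it, else the platform default."""
    available = mp.get_all_start_methods()  # the first entry is the platform default
    return preferred if preferred in available else available[0]


method = pick_start_method("fork")
print(f"available: {mp.get_all_start_methods()}, using: {method}")
# To apply the choice, call this once near the top of the program:
# mp.set_start_method(method, force=True)
```
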


#### Part 1A: PatchCore Model: Training and Performance Evaluation

In [34]:
'''load the {tile, leather, and grid} texture categories from the MVTec AD dataset for the PatchCore model'''


datamodule_tile_pc = MVTec(
    root="./datasets",
    category="tile",
    train_batch_size=32,
    eval_batch_size=32,
    num_workers=21,
    val_split_mode="from_test",  # Create validation set from test set
    val_split_ratio=0.5,  # Use 50% of test set for validation
)

datamodule_leather_pc = MVTec(
    root="./datasets",
    category="leather",
    train_batch_size=32,
    eval_batch_size=32,
    num_workers=21,
    val_split_mode="from_test",  # Create validation set from test set
    val_split_ratio=0.5,  # Use 50% of test set for validation
)

datamodule_grid_pc = MVTec(
    root="./datasets",
    category="grid",
    train_batch_size=32,
    eval_batch_size=32,
    num_workers=21,
    val_split_mode="from_test",  # Create validation set from test set
    val_split_ratio=0.5,  # Use 50% of test set for validation
)
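
The three datamodule definitions above differ only in `category`, so the shared settings could live in one place. The sketch below only builds the keyword arguments; `mvtec_kwargs` is a hypothetical helper, and passing the result as `MVTec(**kwargs)` assumes the same constructor arguments used in the cell above.

```python
def mvtec_kwargs(category: str, batch_size: int = 32, num_workers: int = 21) -> dict:
    """Keyword arguments shared by every MVTec datamodule in this notebook."""
    return {
        "root": "./datasets",
        "category": category,
        "train_batch_size": batch_size,
        "eval_batch_size": batch_size,
        "num_workers": num_workers,
        "val_split_mode": "from_test",  # create the validation set from the test set
        "val_split_ratio": 0.5,         # use 50% of the test set for validation
    }


configs = {cat: mvtec_kwargs(cat) for cat in ("tile", "leather", "grid")}
for cat, kwargs in configs.items():
    print(cat, kwargs)
# With anomalib installed, the datamodules would then be built as:
# datamodules = {cat: MVTec(**kwargs) for cat, kwargs in configs.items()}
```

The same helper would cover the EfficientAD cell later on by calling it with `batch_size=1, num_workers=0`.
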



In [35]:
from anomalib.pre_processing import PreProcessor
from torchvision.transforms.v2 import Compose, Normalize, Resize
from anomalib.engine import Engine
from anomalib.models import Patchcore


In [None]:
#train and evaluate PatchCore Model
auroc_dict = {}

for category, datamodule in [("tile", datamodule_tile_pc), ("leather", datamodule_leather_pc), ("grid", datamodule_grid_pc)]:
    
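    #create transform with the preprocessing requirements for the PatchCore model
    #NOTE: these mean/std values differ from the standard ImageNet statistics
    #(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]) typically used with ImageNet-pretrained backbones
    transform = Compose([
        Resize((224, 224)),
        Normalize(mean=[0.43, 0.48, 0.45], std=[0.23, 0.22, 0.25]),
    ])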

    # Wrap in anomalib's PreProcessor object
    pre_processor = PreProcessor(transform=transform)


    #create PatchCore model for each category
    patchcore_model = Patchcore(
        backbone="wide_resnet50_2",  # Feature extraction backbone
        layers=["layer2", "layer3"],  # Layers to extract features from
        pre_trained=True,  # Use pretrained weights
        num_neighbors=9,  # Number of nearest neighbors
        pre_processor=pre_processor
    )


    # Initialize fresh training engine for each category
    patchcore_engine = Engine(
        max_epochs=1,  # Patchcore typically needs only one epoch
        accelerator="auto",  # Automatically detect GPU/CPU
        devices=1,  # Number of devices to use
    )
    # patchcore_engine is responsible for orchestrating the training process

    # Train the PatchCore model for each category
    patchcore_engine.fit(model=patchcore_model, datamodule=datamodule)
    # WHAT HAPPENS WHEN YOU FIT THE PATCHCORE MODEL:
    # + no fine-tuning of the wide_resnet50_2 backbone; weights remain frozen
    # + instead, the training images are processed and their extracted features (after coreset subsampling)
    #   are stored in the memory bank
    # + only 1 epoch is needed because PatchCore makes a single pass over the training data; no gradient updates are performed
    #
    # LIBRARY OBJECT PERSPECTIVE:
    # + calling patchcore_engine.fit(...) passes the training data through the patchcore_model object
    # + the patchcore_model extracts features from the training data and adds them to the memory bank
    # + the memory bank and any other stateful components (such as subsampled features) are stored in the patchcore_model object;
    #   the engine only orchestrates the run
    #

    # Perform Inference and get performance metrics (AUROC score)
    metrics_list = patchcore_engine.test(model=patchcore_model, datamodule=datamodule) #returns list with 1 dict, each key a score_type and value the numerical value (float)
    metrics = metrics_list[0] if len(metrics_list) > 0 else {}

    # get AUROC scores
    category_auroc = metrics.get("image_AUROC", None)  #image_AUROC score refers to the AUROC computed for the whole image and evaluates classification
    print(f"[{category}] image-level AUROC: {category_auroc}")

    # Store AUROC scores in dictionary
    auroc_dict[category] = category_auroc




INFO:anomalib.models.components.base.anomalib_module:Initializing Patchcore model.
/home/alecpippas/dev_projects/AI_Course_Projects/CS-GY6613-Assignments-awp251/.venv/lib/python3.11/site-packages/lightning/pytorch/utilities/parsing.py:208: Attribute 'pre_processor' is an instance of `nn.Module` and is already saved during checkpointing. It is recommended to ignore them using `self.save_hyperparameters(ignore=['pre_processor'])`.
INFO:timm.models._builder:Loading pretrained weights from Hugging Face hub (timm/wide_resnet50_2.racm_in1k)
INFO:timm.models._hub:[timm/wide_resnet50_2.racm_in1k] Safe alternative available for 'pytorch_model.bin' (as 'model.safetensors'). Loading weights using safetensors.
INFO:timm.models._builder:Missing keys (fc.weight, fc.bias) discovered while loading pretrained weights. This is expected if model is being adapted.
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
You are using a CUDA device

Training: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

INFO:anomalib.models.image.patchcore.lightning_model:Aggregating the embedding extracted from the training set.
INFO:anomalib.models.image.patchcore.lightning_model:Applying core-set subsampling to get the embedding.


Testing: |          | 0/? [00:00<?, ?it/s]

INFO:anomalib.callbacks.timer:Testing took 5.060075283050537 seconds
Throughput (batch_size=32) : 11.659905574454424 FPS


INFO:anomalib.models.components.base.anomalib_module:Initializing Patchcore model.


[tile] image-level AUROC: 1.0


/home/alecpippas/dev_projects/AI_Course_Projects/CS-GY6613-Assignments-awp251/.venv/lib/python3.11/site-packages/lightning/pytorch/utilities/parsing.py:208: Attribute 'pre_processor' is an instance of `nn.Module` and is already saved during checkpointing. It is recommended to ignore them using `self.save_hyperparameters(ignore=['pre_processor'])`.
INFO:timm.models._builder:Loading pretrained weights from Hugging Face hub (timm/wide_resnet50_2.racm_in1k)
INFO:timm.models._hub:[timm/wide_resnet50_2.racm_in1k] Safe alternative available for 'pytorch_model.bin' (as 'model.safetensors'). Loading weights using safetensors.
INFO:timm.models._builder:Missing keys (fc.weight, fc.bias) discovered while loading pretrained weights. This is expected if model is being adapted.
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
INFO:anomalib.data.datamodules.image.mvtec:Found the dataset.
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
/home/

Training: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

INFO:anomalib.models.image.patchcore.lightning_model:Aggregating the embedding extracted from the training set.
INFO:anomalib.models.image.patchcore.lightning_model:Applying core-set subsampling to get the embedding.


Testing: |          | 0/? [00:00<?, ?it/s]

INFO:anomalib.callbacks.timer:Testing took 6.167374610900879 seconds
Throughput (batch_size=32) : 10.052899963367647 FPS


INFO:anomalib.models.components.base.anomalib_module:Initializing Patchcore model.


[leather] image-level AUROC: 0.0


/home/alecpippas/dev_projects/AI_Course_Projects/CS-GY6613-Assignments-awp251/.venv/lib/python3.11/site-packages/lightning/pytorch/utilities/parsing.py:208: Attribute 'pre_processor' is an instance of `nn.Module` and is already saved during checkpointing. It is recommended to ignore them using `self.save_hyperparameters(ignore=['pre_processor'])`.
INFO:timm.models._builder:Loading pretrained weights from Hugging Face hub (timm/wide_resnet50_2.racm_in1k)
INFO:timm.models._hub:[timm/wide_resnet50_2.racm_in1k] Safe alternative available for 'pytorch_model.bin' (as 'model.safetensors'). Loading weights using safetensors.
INFO:timm.models._builder:Missing keys (fc.weight, fc.bias) discovered while loading pretrained weights. This is expected if model is being adapted.
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
INFO:anomalib.data.datamodules.image.mvtec:Found the dataset.
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
/home/

Training: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

INFO:anomalib.models.image.patchcore.lightning_model:Aggregating the embedding extracted from the training set.
INFO:anomalib.models.image.patchcore.lightning_model:Applying core-set subsampling to get the embedding.


Testing: |          | 0/? [00:00<?, ?it/s]

INFO:anomalib.callbacks.timer:Testing took 4.700275659561157 seconds
Throughput (batch_size=32) : 8.510139169951282 FPS


[grid] image-level AUROC: 0.9610389471054077
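
The comments in the training cell describe PatchCore's fit step as filling a memory bank with coreset-subsampled features. The toy sketch below illustrates that idea with 2-D points standing in for patch features. It is a simplification under stated assumptions, not anomalib's implementation: it uses plain greedy k-center selection and takes the image score as the largest nearest-neighbor patch distance, leaving out the neighbor-based reweighting that `num_neighbors` controls. All function names are hypothetical.

```python
import math
import random


def greedy_coreset(features, n_select, seed=0):
    """Greedy k-center selection: repeatedly add the point farthest from the current selection."""
    rng = random.Random(seed)
    selected = [rng.randrange(len(features))]
    # min_dist[i] = distance from features[i] to its nearest selected point
    min_dist = [math.dist(f, features[selected[0]]) for f in features]
    while len(selected) < n_select:
        idx = max(range(len(features)), key=min_dist.__getitem__)
        selected.append(idx)
        min_dist = [min(d, math.dist(f, features[idx])) for d, f in zip(min_dist, features)]
    return [features[i] for i in selected]


def patch_score(patch, memory_bank):
    """Anomaly score of one patch: distance to its nearest memory-bank entry."""
    return min(math.dist(patch, m) for m in memory_bank)


def image_score(patches, memory_bank):
    """Image-level score: the largest patch score in the image."""
    return max(patch_score(p, memory_bank) for p in patches)


rng = random.Random(42)
# Toy "patch features" from normal training images, clustered around the origin.
train_features = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
memory_bank = greedy_coreset(train_features, n_select=20)  # "fit": no gradients involved

normal_image = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(16)]
anomalous_image = normal_image[:-1] + [(8.0, 8.0)]  # one patch far from anything seen in training

print("normal image score:   ", image_score(normal_image, memory_bank))
print("anomalous image score:", image_score(anomalous_image, memory_bank))
```

Because only normal features enter the memory bank, a single unfamiliar patch is enough to raise the image score, which is the property the image-level AUROC above evaluates.
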


In [37]:
#Compute average AUROC for the PatchCore Model
print("PatchCore Model Performance Eval")

valid_auroc_scores = [score for score in auroc_dict.values() if score is not None]
if len(valid_auroc_scores) > 0:
    avg_auroc = sum(valid_auroc_scores) / len(valid_auroc_scores)
    print("\nPer-category AUROC:", auroc_dict)
    print(f"Average AUROC across categories: {avg_auroc}")
else:
    print("No valid AUROC scores found.")


PatchCore Model Performance Eval

Per-category AUROC: {'tile': 1.0, 'leather': 0.0, 'grid': 0.9610389471054077}
Average AUROC across categories: 0.6536796490351359
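
The leather score of 0.0 deserves a second look before it is averaged in. AUROC measures ranking: 0.5 corresponds to uninformative scores, while exactly 0.0 means every anomalous image scored below every normal image. The sketch below computes AUROC from its pairwise definition on made-up scores to show the three cases. It does not diagnose why the leather run produced 0.0, and `auroc` is a hypothetical helper, not the metric implementation anomalib uses.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random anomalous sample outscores a random normal one.

    labels: 1 = anomalous, 0 = normal. Ties count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one sample of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


labels = [0, 0, 0, 1, 1, 1]
scores = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]     # anomalies score higher: perfect ranking
print(auroc(labels, scores))                 # 1.0
print(auroc(labels, [-s for s in scores]))   # 0.0: same separation, inverted ranking
print(auroc(labels, [0.5] * 6))              # 0.5: scores carry no information
```
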


#### Part 1B: EfficientAD Model: Training and Performance Evaluation

In [38]:
'''recreate datamodules for the EfficientAD model, changing the train and eval batch sizes and the number of workers'''

datamodule_tile_efAD = MVTec(
    root="./datasets",
    category="tile",
    train_batch_size=1,
    eval_batch_size=1,
    num_workers=0,  # set to 0 to force single-threaded data loading to avoid the pickling error
    val_split_mode="from_test",  # Create validation set from test set
    val_split_ratio=0.5,  # Use 50% of test set for validation
)

datamodule_leather_efAD = MVTec(
    root="./datasets",
    category="leather",
    train_batch_size=1,
    eval_batch_size=1,
    num_workers=0,
    val_split_mode="from_test",  # Create validation set from test set
    val_split_ratio=0.5,  # Use 50% of test set for validation
)

datamodule_grid_efAD = MVTec(
    root="./datasets",
    category="grid",
    train_batch_size=1,
    eval_batch_size=1,
    num_workers=0,
    val_split_mode="from_test",  # Create validation set from test set
    val_split_ratio=0.5,  # Use 50% of test set for validation
)



In [39]:
#train and evaluate the EfficientAD model
from anomalib.models import EfficientAd

auroc_dict = {}

for category, datamodule in [("tile", datamodule_tile_efAD), ("leather", datamodule_leather_efAD), ("grid", datamodule_grid_efAD)]:
    
    #create transform with the preprocessing requirements for the EfficientAD model
    transform = Compose([
        Resize((300, 300)),
    ])

    # Wrap transform in anomalib's PreProcessor object
    pre_processor = PreProcessor(transform=transform)

    #create EfficientAD model for each category
    efficientAD_model = EfficientAd(
        teacher_out_channels=384,  # Number of teacher output channels
        lr=1e-4,
        pre_processor=pre_processor  #uses callback hooks to intercept the image batches and pre-process them before they are passed to the model
    )

    #Initialize fresh training engine for each category
    efficientAD_engine = Engine(
        max_epochs=2,  
        accelerator="gpu",  # Require a GPU (use "auto" to detect GPU/CPU automatically)
        devices=1,  # Number of devices to use
    )

    #Train the EfficientAD model for each category
    efficientAD_engine.fit(model=efficientAD_model, datamodule=datamodule)

    # Perform Inference and get performance metrics (AUROC score)
    metrics_list = efficientAD_engine.test(model=efficientAD_model, datamodule=datamodule) 
    metrics = metrics_list[0] if len(metrics_list) > 0 else {}

    # get AUROC scores
    category_auroc = metrics.get("image_AUROC", None)
    print(f"[{category}] image-level AUROC: {category_auroc}")

    # Store AUROC scores in dict
    auroc_dict[category] = category_auroc



INFO:anomalib.models.components.base.anomalib_module:Initializing EfficientAd model.
/home/alecpippas/dev_projects/AI_Course_Projects/CS-GY6613-Assignments-awp251/.venv/lib/python3.11/site-packages/lightning/pytorch/utilities/parsing.py:208: Attribute 'pre_processor' is an instance of `nn.Module` and is already saved during checkpointing. It is recommended to ignore them using `self.save_hyperparameters(ignore=['pre_processor'])`.
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
INFO:anomalib.data.datamodules.image.mvtec:Found the dataset.
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name           | Type                  | Params | Mode 
-----------------------------------------------------------------
0 | pre_processor  | PreProcessor          | 0      | train
1 | post_processor | OneClassPostProcessor | 0      | train
2 | evaluator      | Evaluator             | 0      | train
3 | model          | EfficientAdModel 

Training: |          | 0/? [00:00<?, ?it/s]

INFO:anomalib.models.image.efficient_ad.lightning_model:Load pretrained teacher model from pre_trained/efficientad_pretrained_weights/pretrained_teacher_small.pth
Calculate teacher channel mean & std: 100%|██████████| 230/230 [01:04<00:00,  3.57it/s]
/home/alecpippas/dev_projects/AI_Course_Projects/CS-GY6613-Assignments-awp251/.venv/lib/python3.11/site-packages/lightning/pytorch/core/module.py:516: You called `self.log('train_st', ..., logger=True)` but have no logger configured. You can enable one by doing `Trainer(logger=ALogger(...))`
/home/alecpippas/dev_projects/AI_Course_Projects/CS-GY6613-Assignments-awp251/.venv/lib/python3.11/site-packages/lightning/pytorch/utilities/data.py:78: Trying to infer the `batch_size` from an ambiguous collection. The batch size we found is 3. To avoid any miscalculations, use `self.log(..., batch_size=batch_size)`.
/home/alecpippas/dev_projects/AI_Course_Projects/CS-GY6613-Assignments-awp251/.venv/lib/python3.11/site-packages/lightning/pytorch/core/

Validation: |          | 0/? [00:00<?, ?it/s]

INFO:anomalib.models.image.efficient_ad.lightning_model:Calculate Validation Dataset Quantiles
Calculate Validation Dataset Quantiles: 100%|██████████| 58/58 [00:16<00:00,  3.42it/s]


NotImplementedError: ('{} cannot be pickled', '_SingleProcessDataLoaderIter')

In [40]:
#Compute average AUROC for the EfficientAD Model
print("EfficientAD Model")

valid_auroc_scores = [score for score in auroc_dict.values() if score is not None]
if len(valid_auroc_scores) > 0:
    avg_auroc = sum(valid_auroc_scores) / len(valid_auroc_scores)
    print("\nPer-category AUROC:", auroc_dict)
    print(f"Average AUROC: {avg_auroc}")
else:
    print("No valid AUROC scores found.")
    

EfficientAD Model
No valid AUROC scores found.


### Part 2: Similarity 

The similarity search in this project is designed to find images (both normal and anomalous) that are visually similar to the query image, NOT to specifically find anomalous versions of a good query image.

The point of this exercise is to demonstrate that visual similarity != anomaly status because:

1) normal images can be visually similar to anomalous images
2) anomaly detection models should distinguish between normal and anomalous patterns, not just find similar images
3) the embedding space captures overall visual features, not anomaly status

This exercise illustrates why PatchCore, which compares the patch features of each test image against a memory bank of embeddings from good images only, is better suited to anomaly detection than a plain similarity search.

In [28]:
import torch
import torch.nn.functional as F
import torchvision.transforms as T
from PIL import Image
import timm

#Create a backbone model that returns intermediate feature representations (aka "feature vectors" or "embedding vectors") of the input images
backbone = timm.create_model(
    model_name="resnet18",
    pretrained=True,
    features_only=True  # no classification head
)
backbone.eval()  # set to eval mode

#instantiate the transform (resize, convert to tensor, then normalize each channel with the ImageNet mean and std)
transform = T.Compose([
    T.Resize((224, 224)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225])
])

def extract_embedding(model, image: Image.Image) -> torch.Tensor:
    """
    Forward an image through the timm backbone to get a single embedding vector.
    1) transforms the image
    2) does a forward pass
    3) global average pool of the last layer
    """
    # transform given image to a shape of [1, 3, 224, 224]
    x = transform(image).unsqueeze(0)

    with torch.no_grad():
       
        feature_maps = model(x)   # features_only=True makes the model return a list of feature maps
        last_map = feature_maps[-1]  # the final layer4 output, shape [1, 512, H, W]

        # global average pool to get [1, 512]
        pooled = F.adaptive_avg_pool2d(last_map, 1)
        embedding = pooled.view(pooled.size(0), -1)  # shape [1, 512]

    return embedding[0]  # return a 1D tensor of size [512]

INFO:timm.models._builder:Loading pretrained weights from Hugging Face hub (timm/resnet18.a1_in1k)
INFO:timm.models._hub:[timm/resnet18.a1_in1k] Safe alternative available for 'pytorch_model.bin' (as 'model.safetensors'). Loading weights using safetensors.
INFO:timm.models._builder:Missing keys (fc.weight, fc.bias) discovered while loading pretrained weights. This is expected if model is being adapted.
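
The global average pooling step in `extract_embedding` keeps one number per channel and discards spatial position. The pure-Python sketch below mirrors that step on nested lists so the effect is easy to see; `global_average_pool` is a hypothetical stand-in for `F.adaptive_avg_pool2d(..., 1)`, and the 7 x 7 map size is an assumption chosen for illustration.

```python
def global_average_pool(feature_map):
    """Collapse a [C][H][W] nested list to a length-C vector by averaging each channel."""
    pooled = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


feature_map = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [0.0, 8.0]],
]
embedding = global_average_pool(feature_map)
print(embedding)  # [2.5, 2.0]

# A single-channel 7 x 7 map, with and without one strongly activated position.
normal = [[[1.0] * 7 for _ in range(7)]]
defect = [[[1.0] * 7 for _ in range(7)]]
defect[0][3][3] = 2.0  # one spatial position doubles
print(global_average_pool(normal), global_average_pool(defect))
```

A localized change moves the pooled value only slightly, which is one plausible reason an image with a small defect can end up very close to a good image in this embedding space.
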


In [None]:
from qdrant_client import QdrantClient
from qdrant_client.models import VectorParams, Distance, PointStruct

"""Initializes a Qdrant client and recreates an empty 'anomalous_images' collection configured to store 512-dimensional vectors using cosine distance."""


client = QdrantClient(url="localhost", port=6333)

collection_name = "anomalous_images"
embedding_dim = 512

# Create the collection (recreate_collection first deletes any existing collection with the same name)
client.recreate_collection(
    collection_name=collection_name,
    vectors_config=VectorParams(size=embedding_dim, distance=Distance.COSINE)
)

INFO:httpx:HTTP Request: GET http://localhost:6333 "HTTP/1.1 200 OK"
  client.recreate_collection(
INFO:httpx:HTTP Request: DELETE http://localhost:6333/collections/anomalous_images "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: PUT http://localhost:6333/collections/anomalous_images "HTTP/1.1 200 OK"


True
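
The collection is configured with `Distance.COSINE`, so search scores depend only on the direction of the embedding vectors, not their length. A minimal sketch of the underlying quantity follows; `cosine_similarity` is a hypothetical helper written for illustration and says nothing about how Qdrant computes it internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))   # 1.0: same direction, different length
print(cosine_similarity([1.0, 0.0], [0.0, 3.0]))   # 0.0: orthogonal
print(cosine_similarity([1.0, 0.0], [-1.0, 0.0]))  # -1.0: opposite direction
```
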

In [30]:
import os

# Define the categories you want to index
# categories = ["grid", "leather", "tile"]

#index only grid images in Qdrant to save time (sufficient for testing similarity search)
categories = ["grid"]



base_dir = "./datasets" 

#initialize PointStruct list and the index
points_to_upsert = []
point_id = 1

for cat in categories:
    # We focus on the test set here, where anomalies exist.
    test_dir = os.path.join(base_dir, cat, "test")
    # Each test folder contains a "good" folder (normal) and one or more anomaly folders.
    for subfolder in os.listdir(test_dir):
        subfolder_path = os.path.join(test_dir, subfolder)
        if not os.path.isdir(subfolder_path):
            continue

        #assign "normal"/"anomalous" label to images
        # In MVTec, "good" folder has the normal images; all other folders have anomalous images
        label = "normal" if subfolder.lower() == "good" else "anomalous"
        for filename in os.listdir(subfolder_path):
            if filename.lower().endswith((".png", ".jpg", ".jpeg")):
                img_path = os.path.join(subfolder_path, filename)
                try:
                    image = Image.open(img_path).convert("RGB")
                except Exception as e:
                    print(f"Error loading image {img_path}: {e}")
                    continue

                embedding = extract_embedding(backbone, image)
                emb_np = embedding.cpu().numpy().tolist()

                point = PointStruct(
                    id=point_id,
                    vector=emb_np,
                    payload={
                        "filename": filename,
                        "category": cat,
                        "label": label
                    }
                )
                points_to_upsert.append(point)
                point_id += 1

# Upsert all points (PointStruct objects) to Qdrant
client.upsert(collection_name=collection_name, points=points_to_upsert)
print(f"Indexed {len(points_to_upsert)} images from MVTec into Qdrant.")

INFO:httpx:HTTP Request: PUT http://localhost:6333/collections/anomalous_images/points?wait=true "HTTP/1.1 200 OK"


Indexed 78 images from MVTec into Qdrant.


In [None]:
# Extract an embedding (feature vector) from chosen query image:
query_img = Image.open("./datasets/grid/test/good/001.png").convert("RGB")
query_embedding = extract_embedding(backbone, query_img) #returns a Torch Tensor object

print("Extracted embedding shape:", query_embedding.shape)


Extracted embedding shape: torch.Size([512])


In [32]:
from qdrant_client.models import Filter, FieldCondition, MatchValue

'''Query Qdrant Database for Similar Pictures'''

# Convert the query embedding to a list of floats, if it isn't already.
query_vector = query_embedding.cpu().numpy().tolist()

qdrant_filter = None  #no filtering used: we want to see whether similarity search returns images with the opposite label

# Query Qdrant for the top 5 similar images.
hits = client.search(
    collection_name=collection_name,  # the collection created above ("anomalous_images")
    query_vector=query_vector,
    limit=5,
    query_filter=qdrant_filter
)

# Process the results:
print("Top 5 similar images:")
for i, hit in enumerate(hits):
    print(f"Rank {i+1}: ID={hit.id}, Score={hit.score:.4f}, Payload={hit.payload}")

  hits = client.search(
INFO:httpx:HTTP Request: POST http://localhost:6333/collections/anomalous_images/points/search "HTTP/1.1 200 OK"


Top 5 similar images:
Rank 1: ID=1, Score=1.0000, Payload={'filename': '001.png', 'category': 'grid', 'label': 'normal'}
Rank 2: ID=73, Score=0.9837, Payload={'filename': '004.png', 'category': 'grid', 'label': 'anomalous'}
Rank 3: ID=50, Score=0.9825, Payload={'filename': '004.png', 'category': 'grid', 'label': 'anomalous'}
Rank 4: ID=76, Score=0.9801, Payload={'filename': '002.png', 'category': 'grid', 'label': 'anomalous'}
Rank 5: ID=43, Score=0.9782, Payload={'filename': '003.png', 'category': 'grid', 'label': 'anomalous'}
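
To make the result concrete, the short sketch below counts how many of the returned neighbors share the query's label. The tuples are copied from the search output above; the variable names are illustrative only.

```python
# Hits copied from the search output above: (point id, score, label).
hits = [
    (1, 1.0000, "normal"),
    (73, 0.9837, "anomalous"),
    (50, 0.9825, "anomalous"),
    (76, 0.9801, "anomalous"),
    (43, 0.9782, "anomalous"),
]
query_id, query_label = 1, "normal"  # the query image itself, returned at rank 1

neighbors = [h for h in hits if h[0] != query_id]  # drop the query's own entry
same_label = sum(label == query_label for _, _, label in neighbors)
print(f"{same_label}/{len(neighbors)} nearest neighbors share the query label")
```

For this query, none of the four nearest neighbors is a normal image even though all of them score above 0.97, which is the mismatch between visual similarity and anomaly status described at the start of Part 2.
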


### Part 3: Patch Description Network (PDN) Receptive Field Calculation


<div style="text-align: left; font-size: 110%;">

<b>Given the following layers (filter size $f_i$, stride $s_i$):</b>

Layer 1: $f_{1} = 4$, $s_{1} = 1$  (4×4 conv, stride=1)  
Layer 2: $f_{2} = 2$, $s_{2} = 2$  (2×2 pool, stride=2)  
Layer 3: $f_{3} = 4$, $s_{3} = 1$  (4×4 conv, stride=1)  
Layer 4: $f_{4} = 2$, $s_{4} = 2$  (2×2 pool, stride=2)  
Layer 5: $f_{5} = 3$, $s_{5} = 1$  (3×3 conv, stride=1)  
Layer 6: $f_{6} = 4$, $s_{6} = 1$  (4×4 conv, stride=1)


Define $r_{0} = 1$ as the base receptive field (one input pixel).

<p>
<b>Receptive field formula after <span style="font-family: 'Latin Modern Math', serif;">&ell;</span> layers:</b>
</p>

$$
r_{\ell} = r_{\ell-1} + (f_\ell - 1) \prod_{j=1}^{\ell-1} s_j
$$

<p>
<b>Layer-by-layer computation:</b>
</p>

<pre style="font-size: 105%; font-family: 'Fira Mono', 'Consolas', monospace; background: #f8f8f8; padding: 8px;">
r₁ = r₀ + (f₁ - 1) × 1
   = 1 + (4-1) × 1
   = 4

r₂ = r₁ + (f₂ - 1) × s₁
   = 4 + (2-1) × 1
   = 5

r₃ = r₂ + (f₃ - 1) × (s₁ × s₂)
   = 5 + (4-1) × (1 × 2)
   = 5 + 3 × 2
   = 11

r₄ = r₃ + (f₄ - 1) × (s₁ × s₂ × s₃)
   = 11 + (2-1) × (1 × 2 × 1)
   = 11 + 1 × 2
   = 13

r₅ = r₄ + (f₅ - 1) × (s₁ × s₂ × s₃ × s₄)
   = 13 + (3-1) × (1 × 2 × 1 × 2)
   = 13 + 2 × 4
   = 21

r₆ = r₅ + (f₆ - 1) × (s₁ × s₂ × s₃ × s₄ × s₅)
   = 21 + (4-1) × (1 × 2 × 1 × 2 × 1)
   = 21 + 3 × 4
   = 33
</pre>

<p>
<b>Conclusion:</b><br>
The final output neuron in Layer 6 has a receptive field of <b>33 × 33 pixels</b>.
</p>

<div style="border: 2px solid #444; padding: 8px; display: inline-block; font-size: 110%; background: #f0f0f0;">
Therefore, each output feature descriptor indeed covers a <b>33 × 33</b> patch from the input.
</div>
</div>
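
The layer-by-layer computation above can be checked with a few lines of Python. This is a minimal sketch of the same recurrence, with $r_0 = 1$; `receptive_fields` and `pdn_layers` are names introduced here for illustration.

```python
def receptive_fields(layers):
    """Return the receptive field after each layer.

    layers: list of (filter_size, stride) tuples in forward order.
    Implements r_l = r_{l-1} + (f_l - 1) * prod(s_1 .. s_{l-1}) with r_0 = 1.
    """
    r, jump = 1, 1  # jump = product of the strides of all earlier layers
    fields = []
    for f, s in layers:
        r += (f - 1) * jump
        fields.append(r)
        jump *= s
    return fields


pdn_layers = [(4, 1), (2, 2), (4, 1), (2, 2), (3, 1), (4, 1)]
print(receptive_fields(pdn_layers))  # [4, 5, 11, 13, 21, 33]
```

The running product `jump` is updated after each layer, so a layer's own stride only affects the layers that follow it, matching the upper limit $\ell - 1$ in the formula.
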
