# Example: Detecting Similar Queries Using a Similarity-based Detector

This notebook demonstrates how to use the similarity-based detector from the monitoring toolkit to identify repeated or highly similar queries.

We simulate a practical scenario where an attacker sends slightly modified images to probe the model's decision boundary.

## 1. Setup and Imports

We start by importing the necessary libraries, along with the toolkit's ``get_detector`` function and ``Query`` class.

In [None]:
from detectors.registry import get_detector
from utils.query import Query

import torch
import matplotlib.pyplot as plt
import torchvision.transforms as T
from PIL import Image
from pprint import pprint
import requests
from io import BytesIO

## 2. Load Example Image
For this demo, we use the classic dog image from the PyTorch Hub repository (``pytorch/hub``).

In [None]:
URL = "https://raw.githubusercontent.com/pytorch/hub/master/images/dog.jpg"
response = requests.get(URL, timeout=30)
response.raise_for_status()  # fail early if the download did not succeed
image = Image.open(BytesIO(response.content)).convert("RGB")
image

## 3. Simulate Slightly Perturbed Image
We simulate adversarial behaviour by adding low-amplitude Gaussian noise to the original image.

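The noise is imperceptible to humans, much like the perturbation in an adversarial example. Unlike a true adversarial example, it is random rather than optimized against the model, but it produces the kind of near-duplicate query an attacker would send while probing.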



In [None]:
transform = T.Compose([
    T.Resize((224, 224)),
    T.ToTensor()
])
img_tensor = transform(image)

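noise_std = 0.01  # standard deviation of the noise, on the [0, 1] pixel scale
noise = torch.randn_like(img_tensor) * noise_std
img_noised = img_tensor + noise
img_noised = torch.clamp(img_noised, 0, 1)  # keep pixel values in the valid range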
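
To put a number on "low-amplitude", one common check is the peak signal-to-noise ratio (PSNR) between the clean and perturbed pixels. The sketch below is an illustration only: it uses plain Python lists of synthetic pixel values in [0, 1] rather than the notebook's tensors, and the helper name ``psnr`` is hypothetical.

```python
import math
import random


def psnr(original, perturbed, max_value=1.0):
    """Peak signal-to-noise ratio in dB for two equal-length pixel sequences."""
    mse = sum((a - b) ** 2 for a, b in zip(original, perturbed)) / len(original)
    if mse == 0:
        return math.inf  # identical inputs
    return 10 * math.log10(max_value ** 2 / mse)


# Synthetic stand-in for an image: 10,000 pixel values in [0, 1].
rng = random.Random(0)
original = [rng.random() for _ in range(10_000)]

# Same recipe as the cell above: add Gaussian noise (std 0.01), then clamp.
perturbed = [min(1.0, max(0.0, p + rng.gauss(0.0, 0.01))) for p in original]

psnr_db = psnr(original, perturbed)
max_abs_diff = max(abs(a - b) for a, b in zip(original, perturbed))
print(f"PSNR: {psnr_db:.1f} dB, largest pixel change: {max_abs_diff:.4f}")
```

Noise with a standard deviation of 0.01 on a [0, 1] scale has a mean squared error of about 0.0001, which corresponds to roughly 40 dB. With the tensors from this notebook, the same mean squared error could be computed as ``((img_noised - img_tensor) ** 2).mean().item()``.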

## 4. Visualize Original vs Perturbed Image

Below are the original and perturbed images, along with the absolute difference between them, shown both as is and rescaled for visibility.

In [None]:
diff = torch.abs(img_noised - img_tensor)  # per-pixel absolute difference
diff_vis = diff / diff.max()  # rescale so the largest difference maps to 1

fig, axs = plt.subplots(1, 4, figsize=(20, 5))
axs[0].imshow(img_tensor.permute(1, 2, 0))
axs[0].set_title("Original")
axs[0].axis("off")

axs[1].imshow(img_noised.permute(1, 2, 0))
axs[1].set_title("Noised")
axs[1].axis("off")

axs[2].imshow(diff.permute(1, 2, 0))
axs[2].set_title("Visualized Noise (absolute difference)")
axs[2].axis("off")

axs[3].imshow(diff_vis.permute(1, 2, 0))
axs[3].set_title("Visualized Noise (amplified)")
axs[3].axis("off")

plt.tight_layout()
plt.show()

## 5. Initialize Similarity-based Detector
We use ``ImageSimilarityDetector`` to flag any query whose similarity score with any of the last 9 images exceeds 0.9.

The default similarity metric for an image similarity detector is the **cosine similarity** between image **embeddings**, obtained from the penultimate layer of ResNet18 pre-trained on ImageNet. 

In [None]:
detector = get_detector(
    "image_similarity", 
    config={"threshold": 0.9, "max_history_size": 10}
)
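
The thresholding behaviour configured above can be pictured with a small sketch. This is not the toolkit's ``ImageSimilarityDetector``: the class ``SlidingWindowSimilarityCheck`` and its return value are hypothetical, the embeddings are short hand-written vectors instead of ResNet18 features, and the history handling is only one plausible reading of ``max_history_size``.

```python
import math
from collections import deque


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


class SlidingWindowSimilarityCheck:
    """Hypothetical stand-in for a similarity-based detector."""

    def __init__(self, threshold=0.9, max_history_size=10):
        self.threshold = threshold
        # A bounded deque drops the oldest embedding once it is full.
        self.history = deque(maxlen=max_history_size)

    def process(self, embedding):
        # Compare against stored embeddings before storing the new one.
        best = max(
            (cosine_similarity(embedding, past) for past in self.history),
            default=0.0,  # empty history: nothing to compare against
        )
        self.history.append(list(embedding))
        return best > self.threshold, best


check = SlidingWindowSimilarityCheck(threshold=0.9, max_history_size=10)
toy_queries = {
    "first": [0.2, 0.9, 0.4],
    "near copy": [0.21, 0.89, 0.41],
    "unrelated": [0.9, -0.3, 0.1],
}
for name, vector in toy_queries.items():
    suspicious, score = check.process(vector)
    print(f"{name}: suspicious={suspicious}, best similarity={score:.4f}")
```

In this toy run only the near copy is flagged, because it is the only vector whose best match in the history exceeds the threshold.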

## 6. Run Detector
We now feed both images to the similarity detector and track its detection result after each query. 

In [None]:
queries = [
    Query(input_data=img_tensor),
    Query(input_data=img_noised)
]

for i, query in enumerate(queries, 1):
    result = detector.process(query)
    print(f"Query {i}: suspicious={result.is_suspicious}, confidence={result.confidence:.4f}, reason={result.reason}")
    if result.is_suspicious:
        pprint(detector.get_state(include_embedding=False))

## 7. Results and Interpretation
After the second query is processed, the similarity score between the two images is 0.9978, which shows that their embeddings are nearly identical. The exact value can vary slightly between runs because the noise is random.

Because this score exceeds the 0.9 threshold, the second query is classified as suspicious.

The detector therefore flags the imperceptibly modified image with high confidence.