# Visual Search System – System Overview

This notebook provides a high-level overview of the visual search backend. It explains the system architecture, the flow of a search query, and the key components involved in processing image queries and generating AI explanations.

## System Architecture

The system consists of the following components:

- **Frontend (Next.js)**: Accepts user queries and displays results.
- **Backend (FastAPI)**:
  - `search.py`: Encodes text queries and searches image embeddings using FAISS (or an alternative vector database).
  - `explain.py`: Uses a captioning model to explain image relevance to the query.
- **Embeddings**: Pre-computed image embeddings stored in FAISS or another vector store.
- **Data**: Image files (raw and processed).

## Query Flow

1. **User inputs a natural language query**.
2. **Backend encodes query using OpenCLIP**.
3. **Query embedding is matched against image vectors in FAISS**.
4. **Top-K matching images are returned**.
5. **BLIP captioning model generates explanations per image**.
6. **Frontend displays images + explanation text**.
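
Steps 3 and 4 amount to a nearest-neighbour search over normalized vectors. The following is a minimal pure-Python sketch of that idea, not the backend's code: the toy 3-dimensional vectors, the file names, and the `top_k` helper are made up for illustration, and in the real system FAISS performs this search over OpenCLIP embeddings.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def top_k(query_vec, image_vecs, k=2):
    """Return the k (name, score) pairs with the highest cosine similarity."""
    q = normalize(query_vec)
    scored = []
    for name, vec in image_vecs.items():
        v = normalize(vec)
        score = sum(a * b for a, b in zip(q, v))
        scored.append((name, score))
    # Highest similarity first, then keep only the first k entries.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings; real OpenCLIP ViT-B-32 vectors have 512 dimensions.
toy_index = {
    "dog_hoodie.jpg": [0.9, 0.1, 0.0],
    "cat_sofa.jpg": [0.1, 0.9, 0.0],
    "dog_park.jpg": [0.7, 0.0, 0.3],
}
matches = top_k([1.0, 0.0, 0.0], toy_index, k=2)
print([name for name, _ in matches])
```

A brute-force scan like this is only practical for small collections; a vector index such as FAISS exists to make the same lookup fast at scale.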

## Backend Module Overview

- `main.py`: FastAPI entrypoint for `/search` and `/image` routes.
- `search.py`: Encodes queries and searches indexed image embeddings.
- `explain.py`: Uses BLIP to generate captions + explanations.
- `utils.py`: Loads images from a local path or a URL.
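
As an illustration of the local-or-URL split that `utils.py` handles, the sketch below picks a loading path based on the URL scheme. It uses only the standard library and returns raw bytes; `is_url` and `load_image_bytes` are hypothetical names, and the real helper presumably also decodes the bytes into an image object.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlopen


def is_url(source: str) -> bool:
    """Treat only http(s) sources as remote; everything else is a local path."""
    return urlparse(source).scheme in ("http", "https")


def load_image_bytes(source: str, timeout: float = 10.0) -> bytes:
    """Read raw image bytes from a URL or from the local filesystem."""
    if is_url(source):
        with urlopen(source, timeout=timeout) as response:
            return response.read()
    return Path(source).read_bytes()
```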

### Example
```python
from app.search import search_images

results = search_images("a dog in hoodie")
results
```

## Models Used

- **OpenCLIP (`ViT-B-32`)** for text and image embeddings.
- **BLIP** for generating captions and natural language explanations.

These models are loaded once and used across queries.
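
One common way to get this load-once behaviour in Python is to cache the loader function. The sketch below uses `functools.lru_cache` with a stand-in loader; `get_models` and the placeholder dictionary are hypothetical, and the actual backend may instead load its models at import time or in a FastAPI startup hook.

```python
from functools import lru_cache

load_count = 0  # Counts how many times the expensive load actually runs.


@lru_cache(maxsize=1)
def get_models():
    """Load models on the first call and return the cached result afterwards."""
    global load_count
    load_count += 1
    # Placeholder objects; a real loader would build the OpenCLIP and BLIP models here.
    return {"clip": object(), "blip": object()}


first = get_models()
second = get_models()
print(first is second, load_count)
```

Both calls return the same dictionary, so request handlers can call `get_models()` freely without triggering another load.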

The `/search` route can also be queried over HTTP against a locally running backend:

```python
import requests

query = "a dog with hoodie"
# Pass the query via `params` so that requests URL-encodes the spaces.
response = requests.get("http://localhost:8000/search", params={"query": query})
results = response.json()["results"]

for item in results:
    print("Image:", item["image_url"])
    print("Explanation:", item["explanation"])
```

```
Image: /image/10001.jpg
Explanation: a dog wearing a yellow and black shirt. Relevant to query: 'a dog with hoodie'
Image: /image/21275.jpg
Explanation: a small dog wearing a costume on its head. Relevant to query: 'a dog with hoodie'
Image: /image/8025.jpg
Explanation: a dog wearing a sweater. Relevant to query: 'a dog with hoodie'
Image: /image/15813.jpg
Explanation: a small black dog wearing a red, white and blue striped shirt. Relevant to query: 'a dog with hoodie'
Image: /image/6111.jpg
Explanation: a dog sitting on a table. Relevant to query: 'a dog with hoodie'
```
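
Judging from the output above, each explanation looks like a BLIP caption followed by a fixed suffix that echoes the query. A hypothetical helper that produces the same shape might look like this; `format_explanation` is an illustrative name, not a function confirmed to exist in `explain.py`.

```python
def format_explanation(caption: str, query: str) -> str:
    """Join a caption and the query into the explanation format seen in the output."""
    # Drop surrounding whitespace and any trailing period so only one is emitted.
    caption = caption.strip().rstrip(".")
    return f"{caption}. Relevant to query: '{query}'"


print(format_explanation("a dog wearing a sweater", "a dog with hoodie"))
```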


## File Structure Summary
```
visual-search-system/
├── backend/
│   └── app/
│       ├── main.py      ← FastAPI app
│       ├── search.py    ← Embedding-based retrieval
│       ├── explain.py   ← BLIP caption generator
│       └── utils.py     ← Image loading helper
├── embeddings/
│   ├── image.index      ← FAISS index
│   └── metadata.npy     ← Image file mappings
├── data/
│   ├── images/          ← Raw images
│   └── processed/       ← Resized, preprocessed images
├── scripts/
│   ├── generate_embeddings.py ← Embedding generation pipeline