🔍 Visual Search

AI-powered product discovery for e-commerce platforms

About this project

This is a demonstration of visual search technology for e-commerce platforms.

Traditional product search forces customers to describe what they want in words, which is hard when they already have an image in mind. This project shows how AI-powered visual search can replace or complement text search: a shopper uploads a product photo, pastes a screenshot, or drops an image, and the system returns visually similar products from the catalogue.

The goal is to demonstrate the full engineering stack required to build this capability in a real e-commerce environment: image embedding with CLIP, vector similarity search with Qdrant, async indexing via Redis, product metadata in PostgreSQL, image storage in MinIO, and a complete storefront UI — all containerised and production-ready.

Screenshots

Product catalogue — browse, filter by category, sort

Search by image — drag, browse, or paste a screenshot with Ctrl+V

Visual search results — ranked by similarity with match scores

✨ Features

Image-to-product search — CLIP (ViT-B/32) embeds query images into 512-dim vectors, Qdrant finds nearest neighbours by cosine similarity
Paste screenshots — press Ctrl+V anywhere on the page to paste a screenshot directly into the search box; the drawer opens automatically
Drag & drop — or browse files normally
Match scores — every result shows a percentage match, colour-coded by confidence (green / blue / grey)
Category filter — narrow results to a specific product category before searching
Full product catalogue — browse 194 products across 24 categories, filter, sort by name / price / rating, paginate
Product modals — click any card for full details: description, brand, stock level, warranty info
Redis caching — repeated identical searches are served instantly from cache
Rate limiting — Nginx enforces 60 req/min per IP on all API endpoints
Idempotent seeder — re-running make seed never creates duplicates (upsert on conflict)
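The "upsert on conflict" behaviour of the seeder can be illustrated with a minimal sketch. It uses an in-memory SQLite database as a stand-in for PostgreSQL (both support `INSERT ... ON CONFLICT ... DO UPDATE`; SQLite needs version 3.24 or newer). The table layout and the `upsert_product` helper are assumptions made for illustration, not the project's actual schema or seeder code.

```python
import sqlite3


def upsert_product(conn, product_id, name, price):
    """Insert a product, or update it in place if the ID already exists."""
    conn.execute(
        """
        INSERT INTO products (product_id, name, price)
        VALUES (?, ?, ?)
        ON CONFLICT(product_id) DO UPDATE SET
            name = excluded.name,
            price = excluded.price
        """,
        (product_id, name, price),
    )


# In-memory SQLite stands in for the PostgreSQL metadata store (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (product_id TEXT PRIMARY KEY, name TEXT, price REAL)"
)

# Running the same seed step twice leaves exactly one row per product_id.
for _ in range(2):
    upsert_product(conn, "DJS-001", "Essence Mascara Lash Princess", 9.99)
conn.commit()

row_count = conn.execute("SELECT COUNT(*) FROM products").fetchone()[0]
```

Keying the conflict on the product ID is what makes a repeated seed run safe: the second pass overwrites the same row rather than adding a new one.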

🛠️ Tech stack

Layer	Technology
AI / Embeddings	CLIP ViT-B/32 via HuggingFace Transformers
Vector search	Qdrant (cosine similarity ANN)
API services	FastAPI + Uvicorn (4 microservices)
Metadata store	PostgreSQL 15 + SQLAlchemy ORM
Image store	MinIO (S3-compatible object storage)
Job queue / cache	Redis 7
API gateway	Nginx 1.25 (rate limiting, image proxy, routing)
Frontend	Vanilla JS SPA served by nginx:alpine
Orchestration	Docker Compose (10 containers)

🏗️ Architecture

                         ┌─────────────────────────────────┐
                         │           Browser               │
                         └──────────────┬──────────────────┘
                                        │ HTTP :80
                         ┌──────────────▼──────────────────┐
                         │         Nginx Gateway           │
                         │  rate limiting · image proxy    │
                         └──┬──────┬──────┬──────┬─────────┘
                            │      │      │      │
              /search   /index  /images   /   (internal only)
                 │         │       │      │
        ┌────────▼─┐ ┌─────▼──┐   │  ┌───▼─────┐
        │  Search  │ │Indexing│   │  │Frontend │
        │ :8002    │ │ :8001  │   │  │  :80    │
        └────┬─────┘ └───┬────┘   │  └─────────┘
             │           │        │
        ┌────▼───┐   ┌───▼───┐ ┌──▼──────┐
        │  CLIP  │   │ CLIP  │ │  MinIO  │
        │Embedding│  │Embed  │ │ images  │
        │ :8004  │   │:8004  │ └─────────┘
        └────────┘   └───┬───┘
                         │
                    ┌────▼────┐      ┌──────────┐
                    │  Redis  │      │ Main API │ ← internal only
                    │  queue  │      │  :8003   │
                    └────┬────┘      └────┬─────┘
                         │ worker         │
                    ┌────▼────┐      ┌────▼─────┐
                    │ Qdrant  │      │ Postgres │
                    │ vectors │      │ metadata │
                    └─────────┘      └──────────┘

How indexing works

1. POST /index receives image + product metadata
2. Image uploaded to MinIO, metadata saved to PostgreSQL
3. Job pushed to Redis queue
4. Background worker dequeues → fetches image → CLIP embeds it → vector stored in Qdrant
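The queue-and-worker flow above can be sketched with in-memory stand-ins. Everything here is an assumption for illustration: a `deque` replaces the Redis queue, plain dicts replace MinIO and Qdrant, and `fake_embed` is a placeholder for the CLIP call. The job format is hypothetical as well.

```python
import json
from collections import deque

queue = deque()      # stand-in for the Redis queue
vector_store = {}    # stand-in for the Qdrant collection


def fake_embed(image_bytes, dim=512):
    """Placeholder for CLIP: returns a deterministic dim-length vector.

    Expects non-empty image bytes.
    """
    return [image_bytes[i % len(image_bytes)] / 255.0 for i in range(dim)]


def enqueue_index_job(product_id, image_key):
    """Push a small JSON job; the image itself stays in object storage."""
    queue.append(json.dumps({"product_id": product_id, "image_key": image_key}))


def run_worker(load_image):
    """Drain the queue: fetch each image, embed it, store the vector by product ID."""
    processed = 0
    while queue:
        job = json.loads(queue.popleft())
        image_bytes = load_image(job["image_key"])
        vector_store[job["product_id"]] = fake_embed(image_bytes)
        processed += 1
    return processed


images = {"products/SKU-123.jpg": b"\x10\x20\x30"}   # stand-in for MinIO
enqueue_index_job("SKU-123", "products/SKU-123.jpg")
processed = run_worker(images.__getitem__)
```

Passing only a storage key through the queue keeps jobs small, and storing vectors by product ID means re-indexing the same product overwrites its previous vector.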

How search works

1. POST /search receives a query image
2. CLIP generates a 512-dim embedding vector for the image
3. Qdrant performs ANN cosine-similarity search → returns top-K product IDs
4. Full metadata fetched from PostgreSQL in a single batch request
5. Results returned ranked by similarity score
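To make the ranking step concrete, here is a small sketch that scores every stored vector by cosine similarity, keeps the top K, and joins the metadata. It is a brute-force scan, not the approximate nearest-neighbour index Qdrant uses, and the tiny two-dimensional vectors, the `search` function, and the metadata dict are illustrative assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vec, vectors, metadata, top_k=12):
    """Rank all stored vectors against the query and attach product metadata."""
    scored = sorted(
        ((cosine_similarity(query_vec, vec), pid) for pid, vec in vectors.items()),
        reverse=True,
    )[:top_k]
    return [
        {"product_id": pid, "score": round(score, 4), **metadata[pid]}
        for score, pid in scored
    ]


# Toy data: real embeddings would be 512-dimensional CLIP vectors.
vectors = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [1.0, 1.0]}
metadata = {"A": {"name": "Alpha"}, "B": {"name": "Beta"}, "C": {"name": "Gamma"}}
results = search([1.0, 0.1], vectors, metadata, top_k=2)
```

A linear scan like this is fine for a few hundred products; an ANN index trades a little recall for much faster lookups on large catalogues.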

🚀 Quick start

Prerequisites

Docker Desktop (Windows/Mac) or Docker Engine + Compose plugin (Linux)
Python 3.9+ (for the one-time seeder script only)
~4 GB RAM · ~3 GB disk (CLIP model is ~600 MB, downloaded once and cached permanently)

1. Clone

git clone https://github.com/YOUR_USERNAME/visual-search.git
cd visual-search

2. Configure

cp .env.example .env

The defaults in .env.example work out of the box for local development. Change passwords before deploying to any shared environment.

3. Build

make build

4. Start

make up

Starts all containers in dependency order. The first run downloads the CLIP model (~600 MB) — cached permanently in a Docker volume after that.

5. Seed the database

make seed

Fetches 194 real products across 24 categories from DummyJSON, downloads their images, and indexes everything into Qdrant + Postgres. Takes ~2–3 minutes.

6. Open

http://localhost

🧰 Makefile reference

make all            # Full setup: build → start → seed

make build          # Build all Docker images
make build-backend  # Build only the 4 backend service images
make build-frontend # Build only the frontend

make up             # Start all containers (staged startup, no seeding)
make down           # Stop and remove containers (data volumes preserved)
make restart        # Restart all running containers

make seed           # Fetch + index 194 products from DummyJSON

make ps             # Show container status and health
make logs           # Stream all container logs
make logs-search    # Stream logs for one service (here: search)

make clean          # Remove containers, keep data volumes
make fclean         # Full wipe — removes containers, images, AND all data

🌐 API reference

`POST /search` — search by image

curl -X POST "http://localhost/search?top_k=12" \
     -F "file=@product.jpg"

Parameter	Type	Default	Description
`file`	file	required	Query image (JPEG, PNG, WebP, …)
`top_k`	int	12	Number of results to return (max 50)
`category`	string	—	Restrict results to one category

Response

{
  "results": [
    {
      "product_id": "DJS-001",
      "score": 0.9234,
      "name": "Essence Mascara Lash Princess",
      "brand": "Essence",
      "price": 9.99,
      "category": "beauty",
      "description": "...",
      "rating": 4.94,
      "stock": 5,
      "image_url": "http://minio:9000/product-images/products/DJS-001.jpg"
    }
  ],
  "cache": false,
  "count": 12
}
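A client might turn this response into the percentage and colour band shown in the UI. The sketch below parses a trimmed copy of the sample response. The thresholds in `confidence_band` are illustrative assumptions, since the README only names the three colours, and both helper names are hypothetical.

```python
import json


def confidence_band(score, high=0.85, medium=0.70):
    """Map a similarity score to a colour band; thresholds are illustrative."""
    if score >= high:
        return "green"
    if score >= medium:
        return "blue"
    return "grey"


def summarise(response_text):
    """Reduce a /search response body to ID, percentage match, and colour band."""
    payload = json.loads(response_text)
    return [
        {
            "product_id": item["product_id"],
            "match": f"{item['score'] * 100:.0f}%",
            "band": confidence_band(item["score"]),
        }
        for item in payload["results"]
    ]


# Trimmed copy of the sample response above.
sample = (
    '{"results": [{"product_id": "DJS-001", "score": 0.9234}],'
    ' "cache": false, "count": 1}'
)
summary = summarise(sample)
```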

`POST /index` — add a product

curl -X POST http://localhost/index \
     -F "file=@shoe.jpg" \
     -F "product_id=SKU-123" \
     -F "name=Blue Running Shoes" \
     -F "price=89.99" \
     -F "category=footwear" \
     -F "brand=Nike" \
     -F "stock=42" \
     -F "rating=4.5"

`GET /products/` — browse catalogue

GET /products/?skip=0&limit=40&category=beauty&sort=name

Sort options: name · price_asc · price_desc · rating
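Paging through the catalogue comes down to advancing `skip` by `limit`. A minimal sketch of building those URLs follows; the `catalogue_url` helper and its page-based signature are assumptions for illustration, while the parameter names and sort options come from the request above.

```python
from urllib.parse import urlencode

SORT_OPTIONS = {"name", "price_asc", "price_desc", "rating"}


def catalogue_url(base, page, page_size=40, category=None, sort="name"):
    """Build a /products/ URL for a zero-based page number."""
    if sort not in SORT_OPTIONS:
        raise ValueError(f"unsupported sort option: {sort}")
    params = {"skip": page * page_size, "limit": page_size, "sort": sort}
    if category:
        params["category"] = category
    return f"{base}/products/?{urlencode(params)}"


url = catalogue_url("http://localhost", page=2, category="beauty", sort="price_asc")
```

Validating `sort` on the client side avoids a round trip for a value the API does not list.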

Other endpoints

GET  /products/categories               # list all distinct categories
GET  /products/batch?ids=ID1,ID2,ID3    # batch fetch by product ID
GET  /products/{product_id}             # single product detail
DELETE /products/{product_id}           # remove from metadata store
DELETE /index/{product_id}              # remove from vector index + metadata
GET  /health                            # gateway liveness check
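When many product IDs need to be resolved at once, the batch endpoint can be called in chunks. The sketch below only builds the URLs. The `batch_urls` helper, the base URL, and the chunk size of 50 are assumptions, as the README does not state a limit on the number of IDs per request.

```python
from urllib.parse import quote


def batch_urls(base, product_ids, chunk_size=50):
    """Build /products/batch URLs, de-duplicating IDs while keeping their order."""
    unique_ids = list(dict.fromkeys(product_ids))
    urls = []
    for start in range(0, len(unique_ids), chunk_size):
        chunk = unique_ids[start:start + chunk_size]
        # Keep commas literal so the query matches the ids=ID1,ID2 form.
        ids = quote(",".join(chunk), safe=",")
        urls.append(f"{base}/products/batch?ids={ids}")
    return urls


urls = batch_urls(
    "http://localhost",
    ["DJS-001", "DJS-002", "DJS-001", "DJS-003"],
    chunk_size=2,
)
```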

🔌 Integrating into your own e-commerce platform

This project is designed to be dropped into an existing stack. Three integration paths:

Option A — Call the search API from your frontend

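async function searchByImage(imageFile) {
  const fd = new FormData();
  fd.append('file', imageFile);

  const res = await fetch('http://your-host/search?top_k=12', {
    method: 'POST',
    body: fd,
  });
  // Surface HTTP errors (for example when the gateway rate limit is hit)
  // instead of trying to parse an error body as search results.
  if (!res.ok) {
    throw new Error(`Search failed with status ${res.status}`);
  }
  const { results } = await res.json();
  // results[i] → { product_id, score, name, price, image_url, ... }
  return results;
}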

Option B — Index your own product catalogue

Replace make seed with a script that calls POST /index for each product in your database. The indexing service handles image upload, embedding, and storage automatically.

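import requests

# your_products: an iterable of dicts loaded from your own database (placeholder)
for product in your_products:
    # Open the image in a context manager so the file handle is closed after each upload.
    with open(product['image_path'], 'rb') as image:
        resp = requests.post('http://localhost:8001/index',
            data={
                'product_id': product['sku'],
                'name':        product['name'],
                'price':       product['price'],
                'category':    product['category'],
                'brand':       product['brand'],
                'stock':       product['stock'],
                'rating':      product['rating'],
            },
            files={'file': image},
            timeout=30,
        )
    # Stop on the first failed upload instead of silently skipping it.
    resp.raise_for_status()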

Port 8001 (indexing service direct) bypasses Nginx rate limiting — use this for bulk ingestion.

Option C — Replace the dataset entirely

make fclean          # wipe all data volumes
make build && make up
# then run your own indexer

📁 Project structure

visual-search/
├── docker-compose.yml          # 10-service stack
├── .env.example                # config template — copy to .env
├── Makefile                    # all commands
├── docs/
│   └── screenshots/            # project screenshots (used in README)
├── nginx/
│   └── nginx.conf              # API gateway: routing, rate limits, image proxy
├── scripts/
│   ├── seed.py                 # DummyJSON seeder (194 products, 24 categories)
│   └── requirements.txt
└── services/
    ├── frontend/               # Vanilla JS SPA (nginx:alpine)
    │   └── app/index.html
    ├── embedding/              # CLIP inference — POST /embed  (:8004)
    │   └── app/
    │       ├── main.py
    │       └── model.py
    ├── indexing/               # Ingestion + Redis worker  (:8001)
    │   └── app/
    │       ├── routes.py       # POST /index, DELETE /index/:id
    │       ├── storage.py      # MinIO + Postgres helpers
    │       └── worker.py       # Redis consumer → Qdrant upsert
    ├── search/                 # Visual search endpoint  (:8002)
    │   └── app/
    │       ├── routes.py       # POST /search
    │       └── cache.py        # Redis result cache
    └── main_api/               # Products CRUD  (:8003)
        └── app/
            ├── routes.py
            ├── models.py
            └── database.py

⚙️ Key configuration

Variable	Default	Description
`EMBEDDING_MODEL`	`openai/clip-vit-base-patch32`	HuggingFace model ID. Use `openai/clip-vit-large-patch14` for higher accuracy (768-dim, slower)
`EMBEDDING_DIM`	`512`	Must match the chosen model's output dimension
`REDIS_CACHE_TTL`	`3600`	Search result cache lifetime in seconds
`POSTGRES_PASSWORD`	(change me)	Set a strong password before deploying
`MINIO_ROOT_PASSWORD`	(change me)	Set a strong password before deploying
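Because `EMBEDDING_DIM` has to agree with the model, a startup check can catch a mismatch before any vectors are written. The following is a sketch under assumptions: `load_embedding_config` and the lookup table are hypothetical helpers, not part of the project, and only the two models named in the table above are listed.

```python
import os

# Output dimensions of the two CLIP checkpoints mentioned in the table.
KNOWN_MODEL_DIMS = {
    "openai/clip-vit-base-patch32": 512,
    "openai/clip-vit-large-patch14": 768,
}


def load_embedding_config(env=os.environ):
    """Read the model ID and dimension, rejecting known-bad combinations."""
    model = env.get("EMBEDDING_MODEL", "openai/clip-vit-base-patch32")
    dim = int(env.get("EMBEDDING_DIM", "512"))
    expected = KNOWN_MODEL_DIMS.get(model)
    # Unknown models pass through unchecked; only listed models are validated.
    if expected is not None and expected != dim:
        raise ValueError(
            f"EMBEDDING_DIM={dim} does not match {model} (expected {expected})"
        )
    return model, dim


config = load_embedding_config(
    {"EMBEDDING_MODEL": "openai/clip-vit-large-patch14", "EMBEDDING_DIM": "768"}
)
```

Failing fast here is cheaper than discovering the mismatch when the vector store rejects an upsert of the wrong size.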

🐛 Troubleshooting

First startup is slow — The embedding service downloads the CLIP model (~600 MB) on first run. Subsequent starts are instant (cached in hf_cache Docker volume).

make seed fails — Make sure all containers are healthy first: make ps. Then retry.
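A script can also wait for the gateway before seeding. The sketch below polls `GET /health` with the standard library. Note that this endpoint is described only as a gateway liveness check, so a passing probe does not guarantee every backend container is healthy. The function names, retry count, and delay are assumptions.

```python
import time
import urllib.error
import urllib.request


def gateway_is_healthy(url="http://localhost/health", timeout=2.0):
    """Return True if the gateway answers the health check with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_healthy(probe=gateway_is_healthy, attempts=30, delay=2.0,
                       sleep=time.sleep):
    """Call probe up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Calling `wait_until_healthy()` before `make seed` in a wrapper script would turn a confusing mid-seed failure into an explicit timeout.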

502 Bad Gateway after container restart — Nginx may hold a stale IP. Fix: docker compose restart nginx.

Images not loading — The MinIO bucket is set to public-read automatically on startup. If images still 404: docker compose logs indexing | grep -i bucket.

Out of memory during build — PyTorch installs ~2 GB of dependencies. Increase Docker Desktop's memory limit to at least 4 GB in Settings → Resources.

Search returns no results — Qdrant must be populated. Run make seed after make up.

📄 License

MIT — see LICENSE.

Built with FastAPI · CLIP · Qdrant · PostgreSQL · MinIO · Redis · Docker
