This is a demonstration of visual search technology for e-commerce platforms.
Traditional product search forces customers to describe what they want in words, which is hard when they already have an image in mind. This project shows how AI-powered visual search can replace or complement text search: a shopper uploads any product photo, pastes a screenshot, or drops an image, and the system instantly finds visually similar products from the catalogue.
The goal is to demonstrate the full engineering stack required to build this capability in a real e-commerce environment: image embedding with CLIP, vector similarity search with Qdrant, async indexing via Redis, product metadata in PostgreSQL, image storage in MinIO, and a complete storefront UI, all containerised and production-ready.
- Image-to-product search – CLIP (ViT-B/32) embeds query images into 512-dim vectors; Qdrant finds nearest neighbours by cosine similarity
- Paste screenshots – press `Ctrl+V` anywhere on the page to paste a screenshot directly into the search box; the drawer opens automatically
- Drag & drop – or browse files normally
- Match scores – every result shows a percentage match, colour-coded by confidence (green / blue / grey)
- Category filter – narrow results to a specific product category before searching
- Full product catalogue – browse 194 products across 24 categories, filter, sort by name / price / rating, paginate
- Product modals – click any card for full details: description, brand, stock level, warranty info
- Redis caching – repeated identical searches are served instantly from cache
- Rate limiting – Nginx enforces 60 req/min per IP on all API endpoints
- Idempotent seeder – re-running `make seed` never creates duplicates (upsert on conflict)
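The "upsert on conflict" behaviour behind the idempotent seeder can be sketched in plain SQL. This is an illustration against in-memory SQLite (the project itself uses PostgreSQL, where the same `INSERT … ON CONFLICT` clause applies); the table and column names here are hypothetical, not the project's real schema:

```python
import sqlite3

# Hypothetical products table keyed by product_id.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (product_id TEXT PRIMARY KEY, name TEXT, price REAL)"
)

def upsert(product_id, name, price):
    # ON CONFLICT makes the insert idempotent: re-running the seeder
    # updates the existing row instead of creating a duplicate.
    conn.execute(
        """
        INSERT INTO products (product_id, name, price)
        VALUES (?, ?, ?)
        ON CONFLICT(product_id) DO UPDATE
        SET name = excluded.name, price = excluded.price
        """,
        (product_id, name, price),
    )

upsert("DJS-001", "Essence Mascara Lash Princess", 9.99)
upsert("DJS-001", "Essence Mascara Lash Princess", 8.99)  # re-run: no duplicate
count = conn.execute("SELECT COUNT(*) FROM products").fetchone()[0]
price = conn.execute(
    "SELECT price FROM products WHERE product_id = 'DJS-001'"
).fetchone()[0]
```

Running the seeder twice leaves one row per product, with the latest values winning.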
| Layer | Technology |
|---|---|
| AI / Embeddings | CLIP ViT-B/32 via HuggingFace Transformers |
| Vector search | Qdrant (cosine similarity ANN) |
| API services | FastAPI + Uvicorn (4 microservices) |
| Metadata store | PostgreSQL 15 + SQLAlchemy ORM |
| Image store | MinIO (S3-compatible object storage) |
| Job queue / cache | Redis 7 |
| API gateway | Nginx 1.25 (rate limiting, image proxy, routing) |
| Frontend | Vanilla JS SPA served by nginx:alpine |
| Orchestration | Docker Compose (10 containers) |
```
┌──────────────────────────────────┐
│             Browser              │
└────────────────┬─────────────────┘
                 │ HTTP :80
┌────────────────▼─────────────────┐
│          Nginx Gateway           │
│   rate limiting · image proxy    │
└───┬───────┬────────┬───────┬─────┘
    │       │        │       │
 /search  /index  /images    /
    │       │        │       │
┌───▼────┐ ┌▼───────┐ │  ┌───▼─────┐
│ Search │ │Indexing│ │  │Frontend │
│ :8002  │ │ :8001  │ │  │  :80    │
└───┬────┘ └───┬────┘ │  └─────────┘
    │          │      │
┌───▼────┐ ┌───▼───┐ ┌▼────────┐
│  CLIP  │ │ CLIP  │ │  MinIO  │
│ Embed  │ │ Embed │ │ images  │
│ :8004  │ │ :8004 │ └─────────┘
└────────┘ └───┬───┘
               │
         ┌─────▼────┐   ┌──────────┐
         │  Redis   │   │ Main API │ ← internal only
         │  queue   │   │  :8003   │
         └─────┬────┘   └─────┬────┘
               │ worker       │
         ┌─────▼────┐   ┌─────▼────┐
         │  Qdrant  │   │ Postgres │
         │  vectors │   │ metadata │
         └──────────┘   └──────────┘
```
`POST /index` receives an image + product metadata:
1. Image uploaded to MinIO, metadata saved to PostgreSQL
2. Job pushed to Redis queue
3. Background worker dequeues → fetches the image → CLIP embeds it → vector stored in Qdrant
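One iteration of the worker in step 3 can be sketched as below. The job shape and the callable names (`fetch_image`, `embed`, `upsert_vector`) are assumptions for illustration; the real implementation lives in `services/indexing/app/worker.py`:

```python
import json

def process_job(raw_job, fetch_image, embed, upsert_vector):
    """One iteration of the indexing worker: decode the queued job,
    fetch the image, embed it, and upsert the resulting vector."""
    job = json.loads(raw_job)                    # job pushed by POST /index
    image_bytes = fetch_image(job["image_key"])  # e.g. a MinIO get_object call
    vector = embed(image_bytes)                  # e.g. POST to the CLIP service
    upsert_vector(job["product_id"], vector)     # e.g. a Qdrant upsert
    return job["product_id"]

# Stubbed dependencies to show the data flow end to end.
stored = {}
pid = process_job(
    json.dumps({"product_id": "SKU-1", "image_key": "products/SKU-1.jpg"}),
    fetch_image=lambda key: b"fake-image-bytes",
    embed=lambda img: [0.1] * 512,               # ViT-B/32 outputs 512-dim vectors
    upsert_vector=lambda pid, vec: stored.update({pid: vec}),
)
```

Injecting the dependencies keeps the loop testable without a running MinIO, CLIP service, or Qdrant.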
`POST /search` receives a query image:
1. CLIP generates a 512-dim embedding vector for the image
2. Qdrant performs ANN cosine-similarity search → returns top-K product IDs
3. Full metadata fetched from PostgreSQL in a single batch request
4. Results returned ranked by similarity score
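Because repeated identical searches are served from Redis, the cache key must be deterministic over the image bytes plus the query parameters. One plausible scheme (the real one lives in `services/search/app/cache.py` and may differ):

```python
import hashlib

def search_cache_key(image_bytes, top_k=12, category=None):
    """Deterministic Redis key for a search: identical image + parameters
    always hash to the same key, so a repeat search hits the cache."""
    digest = hashlib.sha256(image_bytes).hexdigest()
    return f"search:{digest}:{top_k}:{category or 'all'}"

k1 = search_cache_key(b"same-image", top_k=12)
k2 = search_cache_key(b"same-image", top_k=12)   # identical query -> same key
k3 = search_cache_key(b"same-image", top_k=24)   # different top_k -> new key
```

Hashing the raw bytes means two uploads of the same file share a cache entry, while any change to the image or parameters produces a fresh key.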
- Docker Desktop (Windows/Mac) or Docker Engine + Compose plugin (Linux)
- Python 3.9+ (for the one-time seeder script only)
- ~4 GB RAM · ~3 GB disk (the CLIP model is ~600 MB, downloaded once and cached permanently)
```shell
git clone https://github.com/YOUR_USERNAME/visual-search.git
cd visual-search
cp .env.example .env
```
The defaults in `.env.example` work out of the box for local development. Change passwords before deploying to any shared environment.
```shell
make build
make up
```
Starts all containers in dependency order. The first run downloads the CLIP model (~600 MB); after that it is cached permanently in a Docker volume.
```shell
make seed
```
Fetches 194 real products across 24 categories from DummyJSON, downloads their images, and indexes everything into Qdrant + Postgres. Takes ~2–3 minutes.
Open http://localhost in your browser.
```shell
make all            # Full setup: build -> start -> seed
make build          # Build all Docker images
make build-backend  # Build only the 4 backend service images
make build-frontend # Build only the frontend
make up             # Start all containers (staged startup, no seeding)
make down           # Stop and remove containers (data volumes preserved)
make restart        # Restart all running containers
make seed           # Fetch + index 194 products from DummyJSON
make ps             # Show container status and health
make logs           # Stream all container logs
make logs-search    # Stream a specific service's logs
make clean          # Remove containers, keep data volumes
make fclean         # Full wipe – removes containers, images, AND all data
```

```shell
curl -X POST "http://localhost/search?top_k=12" \
  -F "file=@product.jpg"
```

| Parameter | Type | Default | Description |
|---|---|---|---|
| `file` | file | required | Query image (JPEG, PNG, WebP, …) |
| `top_k` | int | 12 | Number of results to return (max 50) |
| `category` | string | – | Restrict results to one category |
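The `file` parameter travels in the multipart body while `top_k` and `category` go in the query string. A small sketch of how the URL is assembled (host and values are placeholders, not project code):

```python
from urllib.parse import urlencode

def search_url(base="http://localhost", top_k=12, category=None):
    """Build the /search URL; the query image itself is sent in the
    multipart form body, not in the URL."""
    params = {"top_k": top_k}
    if category:
        params["category"] = category
    return f"{base}/search?{urlencode(params)}"

url = search_url(top_k=12, category="beauty")
```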
Response
```json
{
  "results": [
    {
      "product_id": "DJS-001",
      "score": 0.9234,
      "name": "Essence Mascara Lash Princess",
      "brand": "Essence",
      "price": 9.99,
      "category": "beauty",
      "description": "...",
      "rating": 4.94,
      "stock": 5,
      "image_url": "http://minio:9000/product-images/products/DJS-001.jpg"
    }
  ],
  "cache": false,
  "count": 12
}
```

```shell
curl -X POST http://localhost/index \
  -F "file=@shoe.jpg" \
  -F "product_id=SKU-123" \
  -F "name=Blue Running Shoes" \
  -F "price=89.99" \
  -F "category=footwear" \
  -F "brand=Nike" \
  -F "stock=42" \
  -F "rating=4.5"
```

```
GET /products/?skip=0&limit=40&category=beauty&sort=name
```
Sort options: `name` · `price_asc` · `price_desc` · `rating`
```
GET    /products/categories            # list all distinct categories
GET    /products/batch?ids=ID1,ID2,ID3 # batch fetch by product ID
GET    /products/{product_id}          # single product detail
DELETE /products/{product_id}          # remove from metadata store
DELETE /index/{product_id}             # remove from vector index + metadata
GET    /health                         # gateway liveness check
```
This project is designed to be dropped into an existing stack. Three integration paths:
```javascript
async function searchByImage(imageFile) {
  const fd = new FormData();
  fd.append('file', imageFile);
  const res = await fetch('http://your-host/search?top_k=12', {
    method: 'POST',
    body: fd,
  });
  const { results } = await res.json();
  // results[i] -> { product_id, score, name, price, image_url, ... }
  return results;
}
```

Replace `make seed` with a script that calls `POST /index` for each product in your database. The indexing service handles image upload, embedding, and storage automatically.
```python
import requests

for product in your_products:
    with open(product['image_path'], 'rb') as f:
        requests.post(
            'http://localhost:8001/index',
            data={
                'product_id': product['sku'],
                'name': product['name'],
                'price': product['price'],
                'category': product['category'],
                'brand': product['brand'],
                'stock': product['stock'],
                'rating': product['rating'],
            },
            files={'file': f},
        )
```

Port `8001` (indexing service direct) bypasses Nginx rate limiting – use it for bulk ingestion.
```shell
make fclean            # wipe all data volumes
make build && make up
# then run your own indexer
```

```
visual-search/
├── docker-compose.yml         # 10-service stack
├── .env.example               # config template – copy to .env
├── Makefile                   # all commands
├── docs/
│   └── screenshots/           # project screenshots (used in README)
├── nginx/
│   └── nginx.conf             # API gateway: routing, rate limits, image proxy
├── scripts/
│   ├── seed.py                # DummyJSON seeder (194 products, 24 categories)
│   └── requirements.txt
└── services/
    ├── frontend/              # Vanilla JS SPA (nginx:alpine)
    │   └── app/index.html
    ├── embedding/             # CLIP inference – POST /embed (:8004)
    │   └── app/
    │       ├── main.py
    │       └── model.py
    ├── indexing/              # Ingestion + Redis worker (:8001)
    │   └── app/
    │       ├── routes.py      # POST /index, DELETE /index/:id
    │       ├── storage.py     # MinIO + Postgres helpers
    │       └── worker.py      # Redis consumer -> Qdrant upsert
    ├── search/                # Visual search endpoint (:8002)
    │   └── app/
    │       ├── routes.py      # POST /search
    │       └── cache.py       # Redis result cache
    └── main_api/              # Products CRUD (:8003)
        └── app/
            ├── routes.py
            ├── models.py
            └── database.py
```
| Variable | Default | Description |
|---|---|---|
| `EMBEDDING_MODEL` | `openai/clip-vit-base-patch32` | HuggingFace model ID. Use `clip-vit-large-patch14` for higher accuracy (768-dim, slower) |
| `EMBEDDING_DIM` | `512` | Must match the chosen model's output dimension |
| `REDIS_CACHE_TTL` | `3600` | Search result cache lifetime in seconds |
| `POSTGRES_PASSWORD` | (change me) | Set a strong password before deploying |
| `MINIO_ROOT_PASSWORD` | (change me) | Set a strong password before deploying |
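Because `EMBEDDING_DIM` must agree with the chosen model, a fail-fast check at service startup is cheap insurance. A minimal sketch; the `KNOWN_DIMS` table below is an assumption for illustration (covering only the two models mentioned above), not project code:

```python
# Output dimensions for the two CLIP variants mentioned above (assumed table).
KNOWN_DIMS = {
    "openai/clip-vit-base-patch32": 512,
    "openai/clip-vit-large-patch14": 768,
}

def check_embedding_config(model_id, embedding_dim):
    """Raise at startup if EMBEDDING_DIM disagrees with the chosen model."""
    expected = KNOWN_DIMS.get(model_id)
    if expected is not None and expected != embedding_dim:
        raise ValueError(
            f"EMBEDDING_DIM={embedding_dim} but {model_id} "
            f"outputs {expected}-dim vectors"
        )
    return embedding_dim

ok = check_embedding_config("openai/clip-vit-base-patch32", 512)
```

A mismatch would otherwise surface much later, as Qdrant rejecting upserts into a collection created with the wrong vector size.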
- **First startup is slow** – the embedding service downloads the CLIP model (~600 MB) on first run. Subsequent starts are instant (cached in the `hf_cache` Docker volume).
- **`make seed` fails** – make sure all containers are healthy first: `make ps`. Then retry.
- **502 Bad Gateway after container restart** – Nginx may hold a stale IP. Fix: `docker compose restart nginx`.
- **Images not loading** – the MinIO bucket is set to public-read automatically on startup. If images still 404: `docker compose logs indexing | grep -i bucket`.
- **Out of memory during build** – PyTorch installs ~2 GB of dependencies. Increase Docker Desktop's memory limit to at least 4 GB in Settings → Resources.
- **Search returns no results** – Qdrant must be populated. Run `make seed` after `make up`.
MIT β see LICENSE.



