Aashish365/Visual-Search

🔍 Visual Search

AI-powered product discovery for e-commerce platforms

Docker FastAPI Python CLIP Qdrant License: MIT


About this project

This is a demonstration of visual search technology for e-commerce platforms.

Traditional product search forces customers to describe what they want in words, which is hard when they already have an image in mind. This project shows how AI-powered visual search can replace or complement text search: a shopper uploads a product photo, pastes a screenshot, or drops an image, and the system finds visually similar products in the catalogue.

The goal is to demonstrate the full engineering stack required to build this capability in a real e-commerce environment: image embedding with CLIP, vector similarity search with Qdrant, async indexing via Redis, product metadata in PostgreSQL, image storage in MinIO, and a complete storefront UI, all containerised and production-ready.


Screenshots

Product catalogue: browse, filter by category, sort

Search by image: drag, browse, or paste a screenshot with Ctrl+V

Visual search results: ranked by similarity with match scores


✨ Features

  • Image-to-product search: CLIP (ViT-B/32) embeds query images into 512-dim vectors; Qdrant finds nearest neighbours by cosine similarity
  • Paste screenshots: press Ctrl+V anywhere on the page to paste a screenshot directly into the search box; the drawer opens automatically
  • Drag & drop: or browse files normally
  • Match scores: every result shows a percentage match, colour-coded by confidence (green / blue / grey)
  • Category filter: narrow results to a specific product category before searching
  • Full product catalogue: browse 194 products across 24 categories; filter, sort by name / price / rating, and paginate
  • Product modals: click any card for full details (description, brand, stock level, warranty info)
  • Redis caching: repeated identical searches are served from cache
  • Rate limiting: Nginx enforces 60 req/min per IP on all API endpoints
  • Idempotent seeder: re-running make seed never creates duplicates (upsert on conflict)
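
The Redis caching feature implies that identical queries map to the same cache entry. The README does not show how the key is built, so the following is only a sketch under assumptions: the key is a SHA-256 digest of the image bytes plus the query parameters, and `TTLCache` is a hypothetical in-memory stand-in for Redis so the example runs without a server. The names `cache_key` and `TTLCache` are illustrative, not taken from the project.

```python
import hashlib
import time
from typing import Any, Dict, Optional, Tuple


def cache_key(image_bytes: bytes, top_k: int, category: Optional[str]) -> str:
    """Derive a deterministic key from the image content and query parameters."""
    digest = hashlib.sha256(image_bytes).hexdigest()
    return f"search:{digest}:{top_k}:{category or '*'}"


class TTLCache:
    """In-memory stand-in for a Redis cache with expiry (illustration only)."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl = ttl_seconds
        self._store: Dict[str, Tuple[float, Any]] = {}

    def set(self, key: str, value: Any, now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        self._store[key] = (now + self.ttl, value)

    def get(self, key: str, now: Optional[float] = None) -> Any:
        now = time.monotonic() if now is None else now
        entry = self._store.get(key)
        if entry is None or entry[0] <= now:
            # Missing or expired: drop the entry and report a cache miss
            self._store.pop(key, None)
            return None
        return entry[1]
```

Hashing the image content rather than the filename means the same picture uploaded under two names would share one entry, and including `top_k` and `category` keeps differently parameterised searches apart.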

🛠️ Tech stack

| Layer | Technology |
| --- | --- |
| AI / Embeddings | CLIP ViT-B/32 via HuggingFace Transformers |
| Vector search | Qdrant (cosine similarity ANN) |
| API services | FastAPI + Uvicorn (4 microservices) |
| Metadata store | PostgreSQL 15 + SQLAlchemy ORM |
| Image store | MinIO (S3-compatible object storage) |
| Job queue / cache | Redis 7 |
| API gateway | Nginx 1.25 (rate limiting, image proxy, routing) |
| Frontend | Vanilla JS SPA served by nginx:alpine |
| Orchestration | Docker Compose (10 containers) |

🏗️ Architecture

Architecture diagram

                         ┌─────────────────────────────────┐
                         │           Browser               │
                         └──────────────┬──────────────────┘
                                        │ HTTP :80
                         ┌──────────────▼──────────────────┐
                         │         Nginx Gateway           │
                         │  rate limiting · image proxy    │
                         └──┬──────┬──────┬──────┬─────────┘
                            │      │      │      │
              /search   /index  /images   /   (internal only)
                 │         │       │      │
        ┌────────▼─┐ ┌─────▼──┐   │  ┌───▼─────┐
        │  Search  │ │Indexing│   │  │Frontend │
        │ :8002    │ │ :8001  │   │  │  :80    │
        └────┬─────┘ └───┬────┘   │  └─────────┘
             │           │        │
        ┌────▼───┐   ┌───▼───┐ ┌──▼──────┐
        │  CLIP  │   │ CLIP  │ │  MinIO  │
        │Embedding│  │Embed  │ │ images  │
        │ :8004  │   │:8004  │ └─────────┘
        └────────┘   └───┬───┘
                         │
                    ┌────▼────┐      ┌──────────┐
                    │  Redis  │      │ Main API │ ← internal only
                    │  queue  │      │  :8003   │
                    └────┬────┘      └────┬─────┘
                         │ worker         │
                    ┌────▼────┐      ┌────▼─────┐
                    │ Qdrant  │      │ Postgres │
                    │ vectors │      │ metadata │
                    └─────────┘      └──────────┘

How indexing works

  1. POST /index receives image + product metadata
  2. Image uploaded to MinIO, metadata saved to PostgreSQL
  3. Job pushed to Redis queue
  4. Background worker dequeues the job → fetches the image → embeds it with CLIP → stores the vector in Qdrant
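
The four steps above can be sketched with in-memory stand-ins. This is not the project's code: a `queue.Queue` plays the role of the Redis queue, plain dictionaries stand in for MinIO, PostgreSQL, and Qdrant, and `embed` is a hypothetical callable in place of the CLIP service. The function names are illustrative.

```python
import queue
from typing import Callable, Dict, List

Vector = List[float]


def enqueue_index_job(jobs: "queue.Queue[str]", images: Dict[str, bytes],
                      metadata: Dict[str, dict], product_id: str,
                      image: bytes, meta: dict) -> None:
    """Steps 1-3: store the image and metadata, then queue a job."""
    images[product_id] = image      # stand-in for the MinIO upload
    metadata[product_id] = meta     # stand-in for the PostgreSQL row
    jobs.put(product_id)            # stand-in for the Redis queue push


def run_worker(jobs: "queue.Queue[str]", images: Dict[str, bytes],
               vectors: Dict[str, Vector],
               embed: Callable[[bytes], Vector]) -> int:
    """Step 4: drain the queue, embed each image, store the vector."""
    processed = 0
    while True:
        try:
            product_id = jobs.get_nowait()
        except queue.Empty:
            return processed
        # Stand-in for the Qdrant upsert: keyed by product ID, so
        # re-indexing the same product overwrites its vector.
        vectors[product_id] = embed(images[product_id])
        processed += 1
```

Separating the request handler from the worker means the HTTP call can return as soon as the job is queued, while the slower embedding step runs in the background.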

How search works

  1. POST /search receives a query image
  2. CLIP generates a 512-dim embedding vector for the image
  3. Qdrant performs ANN cosine-similarity search → returns top-K product IDs
  4. Full metadata fetched from PostgreSQL in a single batch request
  5. Results returned ranked by similarity score
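
To make the ranking in steps 3 and 5 concrete, here is a minimal sketch of cosine-similarity top-K in plain Python. It is an exact brute-force scan, not the approximate nearest-neighbour index Qdrant uses, and the names `cosine_similarity` and `top_k_matches` are illustrative only.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_matches(query: Sequence[float],
                  index: Dict[str, Sequence[float]],
                  k: int) -> List[Tuple[str, float]]:
    """Score every indexed vector against the query and keep the k best."""
    scored = [(pid, cosine_similarity(query, vec)) for pid, vec in index.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]
```

A brute-force scan like this costs one comparison per indexed product on every query, which is the cost an ANN index is designed to avoid on large catalogues.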

🚀 Quick start

Prerequisites

  • Docker Desktop (Windows/Mac) or Docker Engine + Compose plugin (Linux)
  • Python 3.9+ (for the one-time seeder script only)
  • ~4 GB RAM · ~3 GB disk (CLIP model is ~600 MB, downloaded once and cached permanently)

1. Clone

git clone https://github.com/YOUR_USERNAME/visual-search.git
cd visual-search

2. Configure

cp .env.example .env

The defaults in .env.example work out of the box for local development. Change passwords before deploying to any shared environment.

3. Build

make build

4. Start

make up

Starts all containers in dependency order. The first run downloads the CLIP model (~600 MB), which is then cached in a Docker volume for later runs.

5. Seed the database

make seed

Fetches 194 real products across 24 categories from DummyJSON, downloads their images, and indexes everything into Qdrant + Postgres. Takes ~2–3 minutes.

6. Open

http://localhost
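
The seeding step is described as idempotent through an upsert on conflict. The seeder's actual SQL is not shown here, so the sketch below only illustrates the general technique, using Python's built-in `sqlite3` module so it runs without a database server (PostgreSQL supports the same `INSERT ... ON CONFLICT ... DO UPDATE` form). The table layout and function names are assumptions.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical products table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE products ("
        "product_id TEXT PRIMARY KEY, name TEXT NOT NULL, price REAL NOT NULL)"
    )
    return conn


def upsert_product(conn: sqlite3.Connection, product_id: str,
                   name: str, price: float) -> None:
    """Insert the row, or update it in place if the product_id already exists."""
    conn.execute(
        """
        INSERT INTO products (product_id, name, price)
        VALUES (?, ?, ?)
        ON CONFLICT(product_id) DO UPDATE SET
            name = excluded.name,
            price = excluded.price
        """,
        (product_id, name, price),
    )
```

Because the primary key decides whether a row is inserted or updated, running the same seed twice leaves the row count unchanged.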

🧰 Makefile reference

make all            # Full setup: build β†’ start β†’ seed

make build          # Build all Docker images
make build-backend  # Build only the 4 backend service images
make build-frontend # Build only the frontend

make up             # Start all containers (staged startup, no seeding)
make down           # Stop and remove containers (data volumes preserved)
make restart        # Restart all running containers

make seed           # Fetch + index 194 products from DummyJSON

make ps             # Show container status and health
make logs           # Stream all container logs
make logs-search    # Stream a specific service's logs

make clean          # Remove containers, keep data volumes
make fclean         # Full wipe: removes containers, images, AND all data

🌐 API reference

POST /search: search by image

curl -X POST "http://localhost/search?top_k=12" \
     -F "file=@product.jpg"

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| file | file | required | Query image (JPEG, PNG, WebP, …) |
| top_k | int | 12 | Number of results to return (max 50) |
| category | string | (none) | Restrict results to one category |

Response

{
  "results": [
    {
      "product_id": "DJS-001",
      "score": 0.9234,
      "name": "Essence Mascara Lash Princess",
      "brand": "Essence",
      "price": 9.99,
      "category": "beauty",
      "description": "...",
      "rating": 4.94,
      "stock": 5,
      "image_url": "http://minio:9000/product-images/products/DJS-001.jpg"
    }
  ],
  "cache": false,
  "count": 12
}
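
A client will usually want typed objects rather than raw dictionaries. The sketch below parses the response shape shown above into a small dataclass. `Match` and `parse_search_response` are hypothetical names, and `match_percent` rests on the assumption that the percentage shown in the UI is simply the score scaled to 0-100, which the README does not confirm.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Match:
    product_id: str
    score: float
    name: str
    price: float
    image_url: str

    @property
    def match_percent(self) -> int:
        # Assumption: the UI percentage is the similarity score scaled to 0-100.
        return round(self.score * 100)


def parse_search_response(body: str) -> List[Match]:
    """Turn the JSON body of POST /search into a list of Match objects."""
    payload = json.loads(body)
    return [
        Match(
            product_id=item["product_id"],
            score=float(item["score"]),
            name=item["name"],
            price=float(item["price"]),
            image_url=item["image_url"],
        )
        for item in payload["results"]
    ]
```

Fields not listed in the dataclass (brand, category, rating, and so on) are ignored here; extend `Match` if the caller needs them.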

POST /index: add a product

curl -X POST http://localhost/index \
     -F "file=@shoe.jpg" \
     -F "product_id=SKU-123" \
     -F "name=Blue Running Shoes" \
     -F "price=89.99" \
     -F "category=footwear" \
     -F "brand=Nike" \
     -F "stock=42" \
     -F "rating=4.5"

GET /products/: browse catalogue

GET /products/?skip=0&limit=40&category=beauty&sort=name

Sort options: name · price_asc · price_desc · rating
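
When calling this endpoint from a script, it helps to build the query string programmatically and reject unknown sort values before sending the request. A minimal sketch follows; `products_url` is a hypothetical helper, and its default values are copied from the example request above rather than from documented server defaults.

```python
from typing import Dict, Optional, Union
from urllib.parse import urlencode

# The four sort values listed in the README
SORT_OPTIONS = {"name", "price_asc", "price_desc", "rating"}


def products_url(base: str, skip: int = 0, limit: int = 40,
                 category: Optional[str] = None, sort: str = "name") -> str:
    """Build a GET /products/ URL, validating the sort option first."""
    if sort not in SORT_OPTIONS:
        raise ValueError(f"sort must be one of {sorted(SORT_OPTIONS)}, got {sort!r}")
    params: Dict[str, Union[int, str]] = {"skip": skip, "limit": limit, "sort": sort}
    if category:
        params["category"] = category
    # urlencode percent-escapes values such as category names with spaces
    return f"{base.rstrip('/')}/products/?{urlencode(params)}"
```

For example, `products_url("http://localhost", category="beauty")` yields the same parameters as the request shown above, with `category` placed last.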

Other endpoints

GET  /products/categories               # list all distinct categories
GET  /products/batch?ids=ID1,ID2,ID3    # batch fetch by product ID
GET  /products/{product_id}             # single product detail
DELETE /products/{product_id}           # remove from metadata store
DELETE /index/{product_id}              # remove from vector index + metadata
GET  /health                            # gateway liveness check

🔌 Integrating into your own e-commerce platform

This project is designed to be dropped into an existing stack. Three integration paths:

Option A: Call the search API from your frontend

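async function searchByImage(imageFile) {
  // Send the image as multipart form data under the "file" field
  const fd = new FormData();
  fd.append('file', imageFile);

  const res = await fetch('http://your-host/search?top_k=12', {
    method: 'POST',
    body: fd,
  });
  // Surface HTTP errors (e.g. 429 from the rate limiter) instead of parsing an error body
  if (!res.ok) {
    throw new Error(`Search failed with HTTP ${res.status}`);
  }
  const { results } = await res.json();
  // results[i] → { product_id, score, name, price, image_url, ... }
  return results;
}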

Option B: Index your own product catalogue

Replace make seed with a script that calls POST /index for each product in your database. The indexing service handles image upload, embedding, and storage automatically.

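import requests

# your_products: an iterable of dicts loaded from your own database
for product in your_products:
    # Open the image in a context manager so the file handle is closed after upload
    with open(product['image_path'], 'rb') as image:
        resp = requests.post(
            'http://localhost:8001/index',
            data={
                'product_id': product['sku'],
                'name':        product['name'],
                'price':       product['price'],
                'category':    product['category'],
                'brand':       product['brand'],
                'stock':       product['stock'],
                'rating':      product['rating'],
            },
            files={'file': image},
            timeout=30,
        )
    # Stop on the first failed upload instead of continuing silently
    resp.raise_for_status()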

Port 8001 (the indexing service, reached directly) bypasses Nginx rate limiting; use it for bulk ingestion.

Option C: Replace the dataset entirely

make fclean          # wipe all data volumes
make build && make up
# then run your own indexer

📁 Project structure

visual-search/
├── docker-compose.yml          # 10-service stack
├── .env.example                # config template: copy to .env
├── Makefile                    # all commands
├── docs/
│   └── screenshots/            # project screenshots (used in README)
├── nginx/
│   └── nginx.conf              # API gateway: routing, rate limits, image proxy
├── scripts/
│   ├── seed.py                 # DummyJSON seeder (194 products, 24 categories)
│   └── requirements.txt
└── services/
    ├── frontend/               # Vanilla JS SPA (nginx:alpine)
    │   └── app/index.html
    ├── embedding/              # CLIP inference: POST /embed  (:8004)
    │   └── app/
    │       ├── main.py
    │       └── model.py
    ├── indexing/               # Ingestion + Redis worker  (:8001)
    │   └── app/
    │       ├── routes.py       # POST /index, DELETE /index/:id
    │       ├── storage.py      # MinIO + Postgres helpers
    │       └── worker.py       # Redis consumer → Qdrant upsert
    ├── search/                 # Visual search endpoint  (:8002)
    │   └── app/
    │       ├── routes.py       # POST /search
    │       └── cache.py        # Redis result cache
    └── main_api/               # Products CRUD  (:8003)
        └── app/
            ├── routes.py
            ├── models.py
            └── database.py

⚙️ Key configuration

| Variable | Default | Description |
| --- | --- | --- |
| EMBEDDING_MODEL | openai/clip-vit-base-patch32 | HuggingFace model ID. Use openai/clip-vit-large-patch14 for higher accuracy (768-dim, slower) |
| EMBEDDING_DIM | 512 | Must match the chosen model's output dimension |
| REDIS_CACHE_TTL | 3600 | Search result cache lifetime in seconds |
| POSTGRES_PASSWORD | (change me) | Set a strong password before deploying |
| MINIO_ROOT_PASSWORD | (change me) | Set a strong password before deploying |
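
Because EMBEDDING_DIM must agree with the model, a startup check can catch a mismatch before any vectors are written. The following is a hypothetical sketch, not part of the project: `KNOWN_DIMS` lists only the two models and dimensions mentioned in the table above, and `check_embedding_config` is an illustrative name.

```python
from typing import Dict

# Output dimensions for the two models named in the configuration table
KNOWN_DIMS: Dict[str, int] = {
    "openai/clip-vit-base-patch32": 512,
    "openai/clip-vit-large-patch14": 768,
}


def check_embedding_config(model_id: str, embedding_dim: int) -> None:
    """Raise ValueError if EMBEDDING_DIM does not match the chosen model."""
    expected = KNOWN_DIMS.get(model_id)
    if expected is None:
        raise ValueError(
            f"unknown model {model_id!r}; add its dimension to KNOWN_DIMS"
        )
    if expected != embedding_dim:
        raise ValueError(
            f"EMBEDDING_DIM={embedding_dim} does not match {model_id} "
            f"(expected {expected})"
        )
```

Switching models on an existing deployment would also leave vectors of the old dimension in the vector store, so a change like this normally goes together with a full re-index.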

🐛 Troubleshooting

First startup is slow: The embedding service downloads the CLIP model (~600 MB) on first run. Subsequent starts reuse the copy cached in the hf_cache Docker volume.

make seed fails: Make sure all containers are healthy first with make ps, then retry.

502 Bad Gateway after container restart: Nginx may hold a stale IP. Fix: docker compose restart nginx.

Images not loading: The MinIO bucket is set to public-read automatically on startup. If images still return 404, check: docker compose logs indexing | grep -i bucket.

Out of memory during build: PyTorch installs ~2 GB of dependencies. Increase Docker Desktop's memory limit to at least 4 GB in Settings → Resources.

Search returns no results: Qdrant must be populated. Run make seed after make up.


📄 License

MIT. See LICENSE.


Built with FastAPI · CLIP · Qdrant · PostgreSQL · MinIO · Redis · Docker
