zieen/embedding-container

Qdrant Embedding Service

This project provides a FastAPI-based service for generating multi-modal embeddings (Text, Image, PDF) using ColQwen/ColPali models, suitable for indexing in Qdrant.

Prerequisites

  • Python 3.12+ (managing the environment with Conda is recommended)
  • Qdrant running locally (default: localhost:6334)
  • GPU recommended for inference (MPS and CPU are also supported)

Configuration

The service is configured through environment variables: DEVICE selects the compute device, and BATCH_SIZE sets how many images are processed per batch.

  • DEVICE=auto (Default: Auto-selects CUDA > MPS > CPU)
  • DEVICE=mps (Apple Silicon GPU)
  • DEVICE=cpu (Processor)
  • DEVICE=cuda (NVIDIA GPU)
  • DEVICE=cuda:0 (Specific GPU)
  • BATCH_SIZE=1 (Separate variable: number of images to process per batch. Default: 1. Reduce it if out-of-memory (OOM) errors occur.)
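
The selection order described above (CUDA > MPS > CPU) can be sketched as follows. This is a minimal illustration, not the service's actual code: the helper names `resolve_device` and `read_batch_size` are hypothetical, and the availability flags are passed in explicitly so the logic can be read without PyTorch installed.

```python
import os


def resolve_device(requested, cuda_available, mps_available):
    """Return the device string to use (hypothetical helper).

    Any explicit value (e.g. "cpu", "mps", "cuda:0") is returned unchanged;
    "auto" or an unset value falls back to the CUDA > MPS > CPU order.
    """
    requested = (requested or "auto").strip().lower()
    if requested != "auto":
        return requested
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def read_batch_size(env=os.environ):
    """Read BATCH_SIZE from the environment, defaulting to 1 (hypothetical helper)."""
    value = int(env.get("BATCH_SIZE", "1"))
    if value < 1:
        raise ValueError("BATCH_SIZE must be a positive integer")
    return value


# With PyTorch installed, the flags could come from
# torch.cuda.is_available() and torch.backends.mps.is_available().
device = resolve_device(os.environ.get("DEVICE"), cuda_available=False, mps_available=False)
```

Passing the flags in, rather than querying the hardware inside the function, keeps the precedence rule easy to check on any machine.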

Installation

  1. Create and Activate Environment

    conda create -n qdrant python=3.12
    conda activate qdrant
  2. Install Dependencies

    pip install -r requirements.txt

    Note: Ensure you have poppler installed for PDF processing (e.g., brew install poppler on macOS).

Running with Docker

  1. Build the Image

    docker build -t embedding-service .
  2. Run the Container

    Run the container and set the device with -e DEVICE=... (the Dockerfile defaults to cpu).

    docker run -p 8025:8025 -e DEVICE=cpu embedding-service

    Note: The service listens on port 8025 inside the container.

Running the API Service

Start the FastAPI server using uvicorn:

uvicorn api.main:app --host 0.0.0.0 --port 8001 --reload

The service will load the ColQwen model (approx. 4B params) on startup. This may take a few moments.

API Endpoints

1. Process Query

Generate embeddings for a text query.

  • URL: /process_query
  • Method: POST
  • Payload:
    {
      "query": "your search query"
    }
  • Response:
    {
      "embedding": [0.123, ...]
    }
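
A client call to this endpoint might look like the sketch below, which uses only the Python standard library. It assumes the payload is sent as JSON and that the server is reachable at http://localhost:8001 (the port from the uvicorn command; use 8025 for the Docker container). The function names are illustrative and not part of the service.

```python
import json
import urllib.request


def build_query_request(base_url, query):
    """Build the POST request for /process_query (assumes a JSON body)."""
    payload = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/process_query",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def process_query(base_url, query, timeout=60):
    """Send the query and return the "embedding" field of the response."""
    request = build_query_request(base_url, query)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["embedding"]


# Example (requires a running server):
# embedding = process_query("http://localhost:8001", "your search query")
```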

2. Get PDF Embedding

Generate embeddings for a PDF file, either uploaded with the request or located on the server filesystem.

  • URL: /get_pdf_embedding
  • Method: POST
  • Content-Type: multipart/form-data
  • Form Data:
    • file: PDF file to upload (optional).
    • pdf_path: Absolute path to the PDF on server (optional).
    • Note: One of file or pdf_path must be provided.
  • Response:
    {
      "page_count": 1,
      "pages": [
        {
          "page_number": 1,
          "size": [595, 842],
          "image_base64": "iVBORw0KGgoAAAANSUhEUgAA...",
          "embeddings": [[...], ...],
          "pooled_rows": [[...], ...],
          "pooled_cols": [[...], ...]
        }
      ]
    }
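
The response above can be unpacked page by page. The sketch below assumes that size is [width, height], that image_base64 is standard Base64, and that the three embedding fields are lists of vectors; the helper name `summarize_pdf_response` is hypothetical.

```python
import base64


def summarize_pdf_response(response):
    """Return one summary dict per page of a /get_pdf_embedding response."""
    pages = response["pages"]
    if response["page_count"] != len(pages):
        raise ValueError("page_count does not match the number of pages")

    summary = []
    for page in pages:
        width, height = page["size"]  # assumed order: [width, height]
        image_bytes = base64.b64decode(page["image_base64"])
        summary.append(
            {
                "page_number": page["page_number"],
                "width": width,
                "height": height,
                "image_bytes": len(image_bytes),
                "num_vectors": len(page["embeddings"]),
                "num_pooled_rows": len(page["pooled_rows"]),
                "num_pooled_cols": len(page["pooled_cols"]),
            }
        )
    return summary
```

A summary like this is a quick way to check vector counts before writing the pages to Qdrant.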

3. Encode Image Batch

Generate embeddings for a batch of uploaded images.

  • URL: /encode_image_batch
  • Method: POST
  • Content-Type: multipart/form-data
  • Files: List of image files.
  • Response:
    {
      "embeddings": [[...], ...],
      "pooled_rows": [[...], ...],
      "pooled_cols": [[...], ...]
    }
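
Uploading several images in one multipart/form-data request could be done as sketched below, using only the standard library. The form field name "files" is an assumption (the README only says "List of image files"), as is the base URL; check the FastAPI route signature for the real field name.

```python
import json
import urllib.request
import uuid


def encode_multipart(field_name, files):
    """Encode (filename, content_bytes, mime_type) tuples as multipart/form-data.

    Returns the request body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for filename, content, mime_type in files:
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{field_name}"; '
            f'filename="{filename}"\r\n'
            f"Content-Type: {mime_type}\r\n\r\n"
        )
        parts.append(header.encode("utf-8") + content + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def encode_image_batch(base_url, files, timeout=300):
    """POST the images to /encode_image_batch and return the decoded JSON."""
    body, content_type = encode_multipart("files", files)  # field name assumed
    request = urllib.request.Request(
        url=f"{base_url.rstrip('/')}/encode_image_batch",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example (requires a running server and real image bytes):
# result = encode_image_batch("http://localhost:8001", [("a.png", png_bytes, "image/png")])
```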

4. Health Check

Check if the service is running and the model is loaded.

  • URL: /health
  • Method: GET
  • Response: {"status": "healthy", "model_loaded": true}
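
Because the model loads at startup, a client may want to poll this endpoint before sending work. The following is a sketch under assumptions: the timeout and interval values are arbitrary, the function names are illustrative, and the fetch function is injected so the loop can be exercised without a server.

```python
import json
import time
import urllib.request


def fetch_health(base_url):
    """GET /health and return the decoded JSON body."""
    with urllib.request.urlopen(f"{base_url.rstrip('/')}/health", timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_until_ready(fetch, timeout_s=120.0, interval_s=2.0, sleep=time.sleep):
    """Poll fetch() until it reports a healthy service with a loaded model.

    Returns True once ready, or False if timeout_s elapses first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            status = fetch()
            if status.get("status") == "healthy" and status.get("model_loaded") is True:
                return True
        except OSError:
            pass  # connection refused or timed out: server not up yet
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)


# Example (requires a running server):
# ready = wait_until_ready(lambda: fetch_health("http://localhost:8001"))
```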

Testing

A test script is provided to verify the API endpoints; it mocks the heavy model logic, so the model does not need to be loaded.

python tests/test_api_client.py
