Qdrant Embedding Service

This project provides a FastAPI-based service for generating multi-modal embeddings (Text, Image, PDF) using ColQwen/ColPali models, suitable for indexing in Qdrant.

Prerequisites

Python 3.12+ (a Conda-managed environment is recommended)
Qdrant running locally (default: localhost:6334)
GPU recommended for inference (MPS and CPU are also supported)

Configuration

The service is configured via environment variables. DEVICE selects the inference device, and BATCH_SIZE sets how many images are processed per batch.

DEVICE=auto (Default: Auto-selects CUDA > MPS > CPU)
DEVICE=mps (Apple Silicon GPU)
DEVICE=cpu (Processor)
DEVICE=cuda (NVIDIA GPU)
DEVICE=cuda:0 (Specific GPU)
BATCH_SIZE=1 (Number of images processed per batch. Default: 1. Lower the value if out-of-memory errors occur.)
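
The auto-selection order above (CUDA > MPS > CPU) can be sketched as a small pure function. This is a minimal illustration, not the service's actual code: the name `resolve_device` and its parameters are hypothetical, and the availability flags are passed in explicitly so the logic can be checked without PyTorch installed.

```python
def resolve_device(requested: str, cuda_available: bool, mps_available: bool) -> str:
    """Return the device string to use for a given DEVICE value.

    Hypothetical helper: mirrors the documented order CUDA > MPS > CPU
    when DEVICE=auto, and passes explicit values through unchanged.
    """
    requested = requested.strip().lower()
    if requested != "auto":
        # Explicit choices such as "cpu", "mps", "cuda" or "cuda:0" are kept as-is.
        return requested
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"
```

In a PyTorch-based service the two flags would typically come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`, and `requested` from `os.environ.get("DEVICE", "auto")`.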

Installation

Create and Activate Environment

```
conda create -n qdrant python=3.12
conda activate qdrant
```

Install Dependencies
```
pip install -r requirements.txt
```
Note: Ensure you have poppler installed for PDF processing (e.g., brew install poppler on macOS).

Running with Docker

Build the Image
```
docker build -t embedding-service .
```
Run the Container

Run the container and set the device with the DEVICE environment variable. The Dockerfile defaults to cpu; override it with -e.
```
docker run -p 8025:8025 -e DEVICE=cpu embedding-service
```
Note: The service listens on port 8025 inside the container.

Running the API Service

Start the FastAPI server using uvicorn:

```
uvicorn api.main:app --host 0.0.0.0 --port 8001 --reload
```

On startup, the service loads the ColQwen model (approx. 4B parameters), which may take a few moments.

API Endpoints

1. Process Query

Generate embeddings for a text query.

URL: /process_query
Method: POST
Payload:
```
{
  "query": "your search query"
}
```
Response:
```
{
  "embedding": [0.123, ...]
}
```
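
A minimal client sketch for this endpoint, using only the Python standard library. The function names and the base URL are assumptions for illustration; use port 8001 when the service was started with the uvicorn command above, or 8025 when it runs in Docker.

```python
import json
import urllib.request


def build_query_request(base_url: str, query: str) -> urllib.request.Request:
    """Build the POST request for /process_query (hypothetical helper)."""
    payload = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/process_query",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def process_query(base_url: str, query: str, timeout: float = 60.0):
    """Send the query and return the "embedding" field of the JSON response."""
    request = build_query_request(base_url, query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["embedding"]
```

With the service running locally, `process_query("http://localhost:8001", "your search query")` would return the embedding from the response shown above.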

2. Get PDF Embedding

Generate embeddings for a PDF file, either uploaded with the request or located on the server filesystem.

URL: /get_pdf_embedding
Method: POST
Content-Type: multipart/form-data
Form Data:
- file: PDF file to upload (optional).
- pdf_path: Absolute path to the PDF on server (optional).
- Note: One of file or pdf_path must be provided.

Response:

```
{
  "page_count": 1,
  "pages": [
    {
      "page_number": 1,
      "size": [595, 842],
      "image_base64": "iVBORw0KGgoAAAANSUhEUgAA...",
      "embeddings": [[...], ...],
      "pooled_rows": [[...], ...],
      "pooled_cols": [[...], ...]
    }
  ]
}
```
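
The response above can be unpacked page by page. The sketch below is a hypothetical helper that decodes the base64 page image and counts the vectors in each field. It assumes that `size` is ordered as [width, height], which the response format does not state explicitly.

```python
import base64


def summarize_pdf_response(response: dict) -> list:
    """Return one summary dict per page of a /get_pdf_embedding response."""
    if response["page_count"] != len(response["pages"]):
        raise ValueError("page_count does not match the number of pages")

    summaries = []
    for page in response["pages"]:
        image_bytes = base64.b64decode(page["image_base64"])
        width, height = page["size"]  # assumed order: [width, height]
        summaries.append(
            {
                "page_number": page["page_number"],
                "width": width,
                "height": height,
                "image_size_bytes": len(image_bytes),
                "num_vectors": len(page["embeddings"]),
                "num_pooled_rows": len(page["pooled_rows"]),
                "num_pooled_cols": len(page["pooled_cols"]),
            }
        )
    return summaries
```

The decoded `image_bytes` can be written to disk or stored alongside the vectors if the page image is needed later.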

3. Encode Image Batch

Generate embeddings for a batch of uploaded images.

URL: /encode_image_batch
Method: POST
Content-Type: multipart/form-data
Files: List of image files.

Response:

```
{
  "embeddings": [[...], ...],
  "pooled_rows": [[...], ...],
  "pooled_cols": [[...], ...]
}
```
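
To send several images in one multipart/form-data request without third-party libraries, the request body can be assembled by hand. This is a sketch under one assumption: the form field name `files` is a guess based on the "Files" entry above and may differ in the actual API. The helper name `encode_multipart` is hypothetical.

```python
import uuid


def encode_multipart(files, field_name="files"):
    """Encode (filename, data, content_type) tuples as multipart/form-data.

    Returns the request body and the matching Content-Type header value.
    The default field name "files" is an assumption, not confirmed by the API.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for filename, data, content_type in files:
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{field_name}"; '
            f'filename="{filename}"\r\n'
            f"Content-Type: {content_type}\r\n\r\n"
        ).encode("utf-8")
        parts.append(header + data + b"\r\n")
    # The closing delimiter marks the end of the multipart body.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"
```

The returned body and header value can be passed to `urllib.request.Request` as `data` and as the `Content-Type` header, with `method="POST"` and the `/encode_image_batch` URL.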

4. Health Check

Check if the service is running and the model is loaded.

URL: /health
Method: GET
Response: {"status": "healthy", "model_loaded": true}
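
Because the model loads on startup, a client may want to wait until /health reports that the model is ready before sending requests. The sketch below polls the endpoint with the standard library. The function names, retry count, and delay are illustrative assumptions.

```python
import json
import time
import urllib.request


def is_ready(payload: dict) -> bool:
    """True when the health payload reports a healthy service with a loaded model."""
    return payload.get("status") == "healthy" and payload.get("model_loaded") is True


def wait_until_ready(base_url: str, attempts: int = 30, delay: float = 2.0) -> bool:
    """Poll /health until the service is ready or the attempts run out."""
    url = base_url.rstrip("/") + "/health"
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                if is_ready(json.loads(response.read().decode("utf-8"))):
                    return True
        except (OSError, ValueError):
            # Connection refused, timeout, or a non-JSON body: retry after a pause.
            pass
        time.sleep(delay)
    return False
```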

Testing

A test script is provided to verify the API endpoints (mocking the heavy model logic).

```
python tests/test_api_client.py
```
