v1.11.0

@jztan jztan released this 09 May 13:52

What's New in v1.11.0

Added

  • Bring Your Own Model (BYOM) — embedding model is now configurable via the [embedding] model = "..." setting. Four models validated: BAAI/bge-small-en-v1.5 (default), BAAI/bge-large-en-v1.5, mixedbread-ai/mxbai-embed-large-v1, and nomic-ai/nomic-embed-text-v1.5. See docs/embedding-models.md for MTEB scores and trade-offs.
  • model_name threaded through pdf_search and pdf_cache_stats responses so agents can verify which embedding model produced a given result.
  • model column on page_embeddings cache table — switching models evicts stale rows automatically (no manual cache clear needed).
  • Embedding-model benchmark script (scripts/bench_embedding_models.py) with MRR + latency gate, summary tables, and markdown export.
  • embedding_model property on Config (renamed MODEL_NAME to DEFAULT_MODEL).

Fixed

  • embedder batch_size lowered to 8 (fastembed default is 256) to prevent OOM and hang on long-context models like nomic-embed-text-v1.5 (8192-token window) when processing 75-page PDFs.
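
To illustrate why a smaller batch size helps, here is a rough sketch of splitting page texts into batches of 8. Each batch is typically padded and run through the model in one pass, so peak memory grows with batch size times sequence length. The batched helper and the placeholder page texts are hypothetical; this is not the pdf-mcp or fastembed implementation, only a plain-Python picture of the batching step.

```python
from itertools import islice
from typing import Iterable, Iterator


def batched(items: Iterable[str], batch_size: int = 8) -> Iterator[list[str]]:
    """Yield successive lists of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    # islice pulls up to batch_size items; an empty list ends the loop.
    while batch := list(islice(iterator, batch_size)):
        yield batch


# 75 placeholder page texts, matching the PDF size mentioned above.
pages = [f"page {i}" for i in range(75)]

small = [len(b) for b in batched(pages, batch_size=8)]
large = [len(b) for b in batched(pages, batch_size=256)]
print(small)
print(large)
```

With a batch size of 256, all 75 pages land in a single batch, so every page is held in one forward pass; with 8, at most eight pages are processed together.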

Changed

  • Bumped pip to 26.1.1 and python-multipart to 0.0.27 (transitive dep updates).

Installation

pip install pdf-mcp==1.11.0

Links