feat: add Jina AI embedding provider #245
Conversation
Force-pushed 4139a81 to c7cd135
@MaojiaSheng @ZaynJarvis Could you review this PR when you get a chance? Thanks!
Force-pushed c7cd135 to 658d757
| "model_name": cfg.model, | ||
| "api_key": cfg.api_key, | ||
| "api_base": cfg.api_base, | ||
| "dimension": cfg.dimension, |
It looks like Jina cannot set dimensions other than the configured ones; consider removing this config (including from the docs)? Also, does JinaEmbedder have dimension value validation?
Otherwise LGTM.
v5 supports arbitrary dimensions via MRL: any value (1, 33, 34, 512, etc.) up to the model max works. The max is per-model: 1024 for small, 768 for nano. Validation added.
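Conceptually, MRL truncation to an arbitrary dimension is just "keep the first `dim` components, then L2-renormalize". A client-side sketch in plain numpy (the real service does this server-side; this is only an illustration):

```python
import numpy as np

def truncate_and_renormalize(embedding: np.ndarray, dim: int) -> np.ndarray:
    """Matryoshka-style truncation: keep the first `dim` components,
    then L2-renormalize so the shortened vector is unit length again."""
    truncated = embedding[:dim]
    return truncated / np.linalg.norm(truncated)

# A fake 1024-d unit embedding, standing in for a real model output.
full = np.random.default_rng(0).normal(size=1024)
full /= np.linalg.norm(full)

short = truncate_and_renormalize(full, 512)
print(short.shape, round(float(np.linalg.norm(short)), 6))  # → (512,) 1.0
```

Any target dimension from 1 up to the model max works with the same code, which is why the config value does not need to be one of a few preset sizes.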
hanxiao left a comment
Thanks for the review! Regarding the dimension config:
Jina v5 models support Matryoshka Representation Learning (MRL), so the dimensions parameter can truncate embeddings to any value up to the max dimension (1024 for small, 768 for nano). The API handles truncation + L2 renormalization server-side.
I will add dimension validation in JinaDenseEmbedder.__init__ to ensure the requested dimension does not exceed the model max. Pushing the fix now.
Force-pushed 658d757 to 79957ff
Thanks, will be merged.
Add Jina AI as a supported embedding provider for dense embeddings.
[Figure: MMTEB scores vs. model size. jina-v5-text models (red) outperform models 2-16x their size.]
[Figure: MTEB English v2 scores. v5-text-nano (239M) achieves 71.0, matching models with 2x+ the parameters.]
Both models are open-weight (Apache 2.0) and support Matryoshka dimension reduction, task-specific embeddings, and local deployment via GGUF/MLX.
Paper: arXiv:2602.15547 | Blog | HuggingFace
Features
- OpenAI-compatible API at https://api.jina.ai/v1; set `api_base` to point at your local server
- `task` parameter
- `late_chunking` parameter
- `dimensions` parameter

Changes
- `jina_embedders.py` with `JinaDenseEmbedder`
- `jina` provider in `embedding_config.py`
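A hedged sketch of what a provider config might look like, using the keys visible in the diff (`model_name`, `api_key`, `api_base`, `dimension`); the model name and dimension values here are illustrative assumptions, not documented defaults:

```python
# Illustrative config for the new `jina` provider.
jina_config = {
    "provider": "jina",
    "model_name": "jina-v5-text-small",        # assumed identifier
    "api_key": "<YOUR_JINA_API_KEY>",
    "api_base": "https://api.jina.ai/v1",      # or a local OpenAI-compatible server
    "dimension": 512,                          # MRL truncation target, up to model max
}

print(sorted(jina_config))
```

Because the endpoint is OpenAI-compatible, pointing `api_base` at a self-hosted GGUF/MLX server should require no other changes.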