suggestion: move from qwen3 for embeddings to qwen3-vl-embedding / reranking

**Disclaimer:** This suggestion may be outside the scope of this project and might be better suited as a completely separate, standalone effort. Feel free to close it without further comment.

---

I’m wondering whether the qwen3-vl-embedding / reranking models could be integrated into flux2.c in some way. qwen3-vl-embedding is a state-of-the-art multimodal embedding model (as of early January 2026): it takes text, images, or video, plus an optional instruction, and produces an embedding. qwen3-vl-reranking is the complementary reranking model: given an input, a query, and an optional instruction, it outputs a relevance score. Scoring multiple inputs against the same query yields a set of relevance scores for the whole collection.
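To make the reranking workflow concrete, here is a minimal sketch of scoring a collection against a single query. The `rerank` helper and the `toy_score` function are hypothetical: `toy_score` is a simple word-overlap stand-in so the example runs on its own, and it is not the model's API. In a real setup, the call to qwen3-vl-reranking would take its place.

```python
from typing import Callable, List, Tuple


def rerank(
    query: str,
    candidates: List[str],
    score: Callable[[str, str, str], float],
    instruction: str = "",
) -> List[Tuple[str, float]]:
    """Score every candidate against the same query, highest score first."""
    scored = [(c, score(instruction, query, c)) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def toy_score(instruction: str, query: str, candidate: str) -> float:
    """Hypothetical stand-in for the reranker: fraction of query words
    that also appear in the candidate. The instruction is ignored here."""
    q = set(query.lower().split())
    c = set(candidate.lower().split())
    return len(q & c) / len(q) if q else 0.0


ranked = rerank(
    "red sports car",
    ["a red car on a road", "a bowl of fruit", "red sports car at sunset"],
    toy_score,
)
for text, relevance in ranked:
    print(f"{relevance:.2f}  {text}")
```

The scorer is passed in as a callable so that the ranking logic stays independent of whichever backend ends up computing the relevance score.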

qwen3-vl-embedding was trained with quantization-aware training (QAT) and supports Matryoshka embeddings, so the output vectors can be quantized to lower precision (e.g., int4) and truncated to fewer dimensions (e.g., from 2048 to 256) with minimal loss in quality.
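As a rough illustration of what truncation and quantization mean for a consumer of these vectors, the sketch below keeps the leading components of an embedding, rescales them to unit length, and maps them to 4-bit integer codes. The function names and the symmetric per-vector scheme are assumptions made for this example; the model's own recommended quantization procedure may differ.

```python
import math
from typing import List, Tuple


def truncate_and_normalize(vec: List[float], dim: int) -> List[float]:
    """Keep the first `dim` components (Matryoshka-style) and rescale to unit L2 norm."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head


def quantize_int4(vec: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-vector quantization to integer codes in [-7, 7].

    Returns the codes and the scale needed to map them back to floats.
    This scheme is an assumption for illustration only.
    """
    peak = max((abs(x) for x in vec), default=0.0)
    scale = peak / 7 if peak > 0 else 1.0
    codes = [max(-7, min(7, round(x / scale))) for x in vec]
    return codes, scale


def dequantize(codes: List[int], scale: float) -> List[float]:
    """Map integer codes back to approximate float values."""
    return [c * scale for c in codes]


# Toy 8-dimensional vector standing in for a real 2048-dimensional embedding.
full = [0.5, -0.25, 0.125, 0.0625, 0.4, -0.3, 0.2, -0.1]
small = truncate_and_normalize(full, 4)
codes, scale = quantize_int4(small)
approx = dequantize(codes, scale)
```

Renormalizing after truncation keeps cosine similarity and dot product interchangeable on the shortened vectors.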

Both models are available in 2B and 8B parameter versions.

At the moment, these models are not widely supported by the ML/AI ecosystem: neither llama.cpp nor MLX supports them yet. GGUF conversion appears to be possible, though; I’ve tried using the [ggml-org/gguf-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) conversion tool.

Perhaps qwen3-vl-embedding could enable additional input modes for flux2.c (e.g., video, image, or text, with an optional instruction to steer the representation).

---

My main concern is that even though the backbone used to produce the embeddings is still qwen3, the additional training involved in creating qwen3-vl-embedding might have resulted in embeddings that are **not** compatible with those expected by the flux2 model.

---

References:

- Qwen3 VL Embedding / Reranking repo: https://github.com/QwenLM/Qwen3-VL-Embedding
- Paper: https://github.com/QwenLM/Qwen3-VL-Embedding/blob/main/assets/qwen3vlembedding_technical_report.pdf
- Hugging Face (Qwen3-VL-Embedding): https://huggingface.co/collections/Qwen/qwen3-vl-embedding
- Hugging Face (Qwen3-VL-Reranker): https://huggingface.co/collections/Qwen/qwen3-vl-reranker
