suggestion: move from qwen3 for emb to qwen3-vl-embedding / reranking #35

@S1M0N38

Disclaimer: This suggestion may be outside the scope of this project and might be better suited as a completely separate, standalone effort. Feel free to close it without further comment.


I’m wondering whether the qwen3-vl-embedding / reranking models could be integrated into flux2.c in some way. qwen3-vl-embedding is a state-of-the-art model (as of early Jan 2026) for multimodal embeddings: the input can be text, images, or video, and it produces an embedding from the input plus an optional instruction. qwen3-vl-reranking is the complementary reranking model: given an input, a query, and an optional instruction, it outputs a relevance score. Repeating this with multiple inputs and the same query yields a set of relevance scores for the collection.
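
To make the reranking flow concrete, here is a minimal sketch of scoring a collection against one query and sorting by relevance. The names `rerank` and `toy_overlap_score` are hypothetical, and the word-overlap scorer is only a stand-in so the example runs; it is not the qwen3-vl-reranking model or its API.

```python
def rerank(query, candidates, score_fn, instruction=None):
    """Score every candidate against the same query, highest score first.

    `score_fn` is assumed to take (query, candidate, instruction) and
    return a float relevance score; a real reranker would go here.
    """
    scored = [(score_fn(query, c, instruction), c) for c in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def toy_overlap_score(query, candidate, instruction=None):
    """Stand-in scorer: fraction of query words found in the candidate."""
    q_words = set(query.lower().split())
    c_words = set(candidate.lower().split())
    if not q_words:
        return 0.0
    return len(q_words & c_words) / len(q_words)


if __name__ == "__main__":
    docs = ["a blue boat", "a red car", "red sky"]
    for score, doc in rerank("red car", docs, toy_overlap_score):
        print(f"{score:.2f}  {doc}")
```

With a real model, only `score_fn` would change; the loop over the collection stays the same.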

qwen3-vl-embedding was trained with quantization-aware training (QAT) and supports Matryoshka embeddings, so its output vectors can be quantized to lower precision (e.g., int4) and truncated to fewer dimensions (e.g., from 2048 to 256) with minimal performance degradation.
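
As a rough illustration of what truncation and quantization mean for a consumer of these vectors, the sketch below keeps the leading dimensions of an embedding, renormalizes, and applies a simple symmetric integer quantization. The function names and the per-vector scaling scheme are assumptions for illustration, not the scheme the model was actually trained with.

```python
import math


def truncate_and_normalize(vec, dim):
    """Keep the first `dim` components and rescale to unit L2 norm."""
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head
    return [x / norm for x in head]


def quantize_symmetric(vec, bits=4):
    """Map floats to signed ints in [-qmax, qmax] with one scale per vector.

    Assumed scheme: scale = max(|x|) / qmax, where qmax = 2**(bits-1) - 1.
    """
    qmax = 2 ** (bits - 1) - 1
    peak = max((abs(x) for x in vec), default=0.0)
    if peak == 0.0:
        return [0] * len(vec), 1.0
    scale = peak / qmax
    # Clamp to guard against floating-point rounding at the extremes.
    q = [max(-qmax, min(qmax, round(x / scale))) for x in vec]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from the integer codes."""
    return [v * scale for v in q]


if __name__ == "__main__":
    full = [3.0, 4.0, 12.0, 0.5]          # stand-in for a 2048-d embedding
    small = truncate_and_normalize(full, 2)
    codes, scale = quantize_symmetric(small, bits=4)
    print(small, codes, dequantize(codes, scale))
```

Renormalizing after truncation keeps cosine similarity well defined on the shortened vector; whether flux2.c could consume such vectors at all is a separate question.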

Both models are available in 2B and 8B parameter versions.

At the moment, these models are not widely supported by the ML/AI ecosystem. They are not yet supported by llama.cpp or MLX, although GGUF conversion appears to be possible (I’ve tried using the ggml-org/gguf-my-repo conversion tool).

Perhaps qwen3-vl-embedding could enable additional input modes for flux2.c (e.g., video, image, or text, with an optional instruction to steer the representation).


My main concern is that even though the backbone used to produce the embeddings is still qwen3, the additional training involved in creating qwen3-vl-embedding might have resulted in embeddings that are not compatible with those expected by the flux2 model.

