feat(openai_api_compatible): text embedding prefix #2079

utsumi-fj · 2025-11-17T09:32:26Z

Allow configuration of document and query prefixes for text embedding in the openai_api_compatible model plugin.

Related Issues or Context

Some embedding models produce more accurate results when a prefix is prepended to the input text. This PR makes it possible to set separate prefixes for documents and for queries.

Note: This pull request can be used even while the dify-plugin-sdk bug langgenius/dify-plugin-sdks#241 is still open. Once that bug is fixed, the feature will behave as intended.

This PR contains Changes to LLM Models Plugin

My Changes Affect Message Flow Handling (System Messages and User→Assistant Turn-Taking)

My Changes Affect Tool Interaction Flow (Multi-Round Usage and Output Handling, for both Agent App and Agent Node)

My Changes Affect Multimodal Input Handling (Images, PDFs, Audio, Video, etc.)

My Changes Affect Multimodal Output Generation (Images, Audio, Video, etc.)

My Changes Affect Structured Output Format (JSON, XML, etc.)

My Changes Affect Token Consumption Metrics

My Changes Affect Other LLM Functionalities (Reasoning Process, Grounding, Prompt Caching, etc.)

Other Changes (Add New Models, Fix Model Parameters etc.)

Version Control (Any Changes to the Plugin Will Require Bumping the Version)

I have Bumped Up the Version in Manifest.yaml (Top-Level Version Field, Not in Meta Section)

Dify Plugin SDK Version

I have Ensured dify_plugin>=0.3.0,<0.6.0 is in requirements.txt (SDK docs)

Environment Verification (If Any Code Changes)

Local Deployment Environment

Dify Version is: 1.9.2, I have Tested My Changes on Local Deployment Dify with a Clean Environment That Matches the Production Configuration.

Allow configuration of document and query prefixes for text embedding. Signed-off-by: Yuichiro Utsumi <utsumi.yuichiro@fujitsu.com>

gemini-code-assist · 2025-11-17T09:32:41Z

Summary of Changes

Hello @utsumi-fj, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the openai_api_compatible plugin by enabling the configuration of document and query prefixes for text embedding. This feature addresses the need for certain embedding models to receive contextual prefixes, which can lead to more accurate and relevant embeddings. The changes include updates to the plugin's configuration schema to expose these new settings in the UI and modifications to the core Python logic to apply these prefixes dynamically.

Highlights

Text Embedding Prefixes: Introduced the capability to configure document and query prefixes for text embedding models within the openai_api_compatible plugin, enhancing embedding accuracy for models that benefit from such prefixes.
Configuration UI: Added new UI fields (document_prefix and query_prefix) to the plugin's configuration schema, allowing users to easily set these prefixes.
Core Logic Implementation: Modified the text embedding logic to apply the specified document or query prefixes to input texts before they are sent to the embedding model.
Version Bump: The plugin version in manifest.yaml has been incremented from 0.0.24 to 0.0.25.
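
The prefix handling described in the highlights above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the plugin's actual code: the `credentials` dictionary, the `input_type` strings, and the helper names `get_prefix` and `add_prefix` are hypothetical. Only the `document_prefix` and `query_prefix` field names come from this PR. The example values `passage:` and `query:` follow the convention used by E5-style embedding models.

```python
def get_prefix(credentials: dict, input_type: str) -> str:
    """Return the configured prefix for the given input type, or "" if unset."""
    # Hypothetical input_type values ("query" / "document"); the SDK's real type may differ.
    key = "query_prefix" if input_type == "query" else "document_prefix"
    return credentials.get(key) or ""


def add_prefix(texts: list[str], prefix: str) -> list[str]:
    """Prepend the prefix and a space to each text; return texts unchanged if prefix is empty."""
    return [f"{prefix} {text}" for text in texts] if prefix else texts


# Illustrative configuration; real values depend on the embedding model in use.
credentials = {"document_prefix": "passage:", "query_prefix": "query:"}

docs = add_prefix(["Dify is an LLM app platform."], get_prefix(credentials, "document"))
query = add_prefix(["What is Dify?"], get_prefix(credentials, "query"))
```

With no prefix configured, `get_prefix` returns an empty string and the texts are passed through unmodified.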

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request introduces a useful feature to allow configuring document and query prefixes for text embeddings in the openai_api_compatible model plugin. The implementation is clean and follows existing patterns. I've found one high-severity issue where an empty prefix would incorrectly add a leading space to texts, which could impact embedding results. I've provided a suggestion to fix this. Otherwise, the changes look good.

gemini-code-assist · 2025-11-17T09:38:31Z

models/openai_api_compatible/models/text_embedding/text_embedding.py

+        return ""
+
+    def _add_prefix(self, texts: list[str], prefix: str) -> list[str]:
+        return [f"{prefix} {text}" for text in texts]


The current implementation will add a leading space to each text if the prefix is empty (e.g., f' {text}'). This is likely not the intended behavior and could negatively affect embedding quality. This change ensures prefixes are only added when they are not empty, and avoids creating a new list unnecessarily when no prefix is provided.

Suggested change

-        return [f"{prefix} {text}" for text in texts]
+        return [f"{prefix} {text}" for text in texts] if prefix else texts
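
To make the difference concrete, here is a small standalone comparison of the two variants. The function names `add_prefix_old` and `add_prefix_new` are hypothetical and exist only for this illustration; the bodies mirror the original and the suggested line.

```python
def add_prefix_old(texts: list[str], prefix: str) -> list[str]:
    # Original behavior: always inserts a space, even when prefix is "".
    return [f"{prefix} {text}" for text in texts]


def add_prefix_new(texts: list[str], prefix: str) -> list[str]:
    # Suggested behavior: only prepend when a prefix is actually set.
    return [f"{prefix} {text}" for text in texts] if prefix else texts


texts = ["hello"]
old_result = add_prefix_old(texts, "")        # [" hello"], note the leading space
new_result = add_prefix_new(texts, "")        # ["hello"]
with_prefix = add_prefix_new(texts, "query:")  # ["query: hello"]
```

One consequence of the suggested form is that, with an empty prefix, the function returns the same list object it was given rather than a copy, so a caller that mutates the result would also mutate the input.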

Thanks, fixed!

feat(openai_api_compatible): text embedding prefix

c6bc1a0

Allow configuration of document and query prefixes for text embedding. Signed-off-by: Yuichiro Utsumi <utsumi.yuichiro@fujitsu.com>

dosubot bot added the size:M label (This PR changes 30-99 lines, ignoring generated files.) Nov 17, 2025

dosubot bot added the enhancement label (New feature or request) Nov 17, 2025

gemini-code-assist bot reviewed Nov 17, 2025

check whether prefix is empty or not

e70a15b

crazywoola approved these changes Nov 18, 2025

dosubot bot added the lgtm label (This PR has been approved by a maintainer) Nov 18, 2025

crazywoola merged commit f0930df into langgenius:main Nov 18, 2025
1 check passed

utsumi-fj mentioned this pull request Nov 24, 2025

fix(text_embedding): pass input_type to distinguish between document and query langgenius/dify-plugin-sdks#242

Open

7 tasks
