
🐛 Bugfix: Enhance prompt generation with knowledge base display names part2#2813

Merged
Dallas98 merged 10 commits into develop from develop_fix_kb_4
Apr 27, 2026

Conversation

@Zhi-a
Contributor

@Zhi-a Zhi-a commented Apr 17, 2026

✨ Enhance prompt generation with knowledge base display names
Agent prompts now use the knowledge base names selected in the tool configuration instead of making names up.

  • Added knowledge_base_display_names to the GeneratePromptRequest model to allow frontend-configured names for knowledge bases.
  • Updated backend functions to utilize these display names, improving few-shot example generation without requiring database lookups.
  • Modified frontend components to capture and pass knowledge base display names during prompt generation.
  • Enhanced tests to cover the new functionality and ensure proper integration of knowledge base display names in the prompt generation process.

Before:
[image]

After:
[image]

Zhi-a added 2 commits April 15, 2026 15:01
- Introduced `get_knowledge_name_map_by_index_names` function to retrieve a mapping of index names to their corresponding display names.
- Updated `create_agent_config` and `create_tool_config_list` to utilize the new mapping for generating user-friendly summaries.
- Enhanced `KnowledgeBaseSearchTool` to support conversion from display names to index names during queries.
- Added tests to verify the functionality of the new mapping and its integration within the tool configuration process.
- Added `knowledge_base_display_names` to the `GeneratePromptRequest` model to allow frontend-configured names for knowledge bases.
- Updated backend functions to utilize these display names, improving few-shot example generation without requiring database lookups.
- Modified frontend components to capture and pass knowledge base display names during prompt generation.
- Enhanced tests to cover the new functionality and ensure proper integration of knowledge base display names in the prompt generation process.
Copilot AI review requested due to automatic review settings April 17, 2026 09:23
@Zhi-a Zhi-a requested review from Dallas98 and WMC001 as code owners April 17, 2026 09:23
Contributor

Copilot AI left a comment


Pull request overview

This PR adds end-to-end support for using knowledge base display names (frontend-configured) during system prompt generation and tool execution, reducing reliance on database lookups and improving few-shot example correctness.

Changes:

  • Extends prompt generation APIs/services to accept knowledge_base_display_names (frontend takes precedence over DB-derived names) and inject them into prompt templates.
  • Adds get_knowledge_name_map_by_index_names() to map internal index_name -> knowledge_name, and uses it to build display_name_to_index_map for KB tool parameter conversion.
  • Updates frontend tool config + prompt generation flow to collect and pass KB display names; expands tests across backend and SDK.
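
The mapping flow in the second bullet can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: `invert_name_map` and `convert_to_index_names` are hypothetical helper names, and the sample index and display names are made up.

```python
from typing import Dict, List


def invert_name_map(index_to_display: Dict[str, str]) -> Dict[str, str]:
    """Turn index_name -> knowledge_name into knowledge_name -> index_name."""
    return {display: index for index, display in index_to_display.items()}


def convert_to_index_names(names: List[str], display_to_index: Dict[str, str]) -> List[str]:
    """Map display names to index names; anything else passes through unchanged."""
    return [display_to_index.get(name, name) for name in names]


# Hypothetical data shaped like the result of get_knowledge_name_map_by_index_names().
index_to_display = {"idx_a1b2": "Product FAQ", "idx_c3d4": "HR Handbook"}
display_to_index = invert_name_map(index_to_display)

# The LLM may pass a display name or a raw index name; both resolve to index names.
resolved = convert_to_index_names(["Product FAQ", "idx_c3d4"], display_to_index)
print(resolved)
```

One caveat of this inversion: if two knowledge bases share a display name, the later entry overwrites the earlier one, so callers may want to detect collisions.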

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 7 comments.

Show a summary per file
| File | Description |
| --- | --- |
| sdk/nexent/core/tools/knowledge_base_search_tool.py | Adds display_name_to_index_map and converts display names to index names before querying. |
| sdk/nexent/core/agents/nexent_agent.py | Passes display_name_to_index_map via metadata after tool instantiation (excluded params). |
| backend/database/knowledge_db.py | Adds get_knowledge_name_map_by_index_names() for index-name → display-name lookup. |
| backend/services/tool_configuration_service.py | Builds display_name_to_index_map during local tool validation for KB tool. |
| backend/services/prompt_service.py | Accepts knowledge_base_display_names, injects into template context, adds DB fallback helper. |
| backend/prompts/utils/prompt_generate_zh.yaml | Adds KB configuration note block for few-shot generation. |
| backend/prompts/utils/prompt_generate_en.yaml | Adds KB configuration note block for few-shot generation. |
| backend/consts/model.py | Extends GeneratePromptRequest with knowledge_base_display_names. |
| backend/apps/prompt_app.py | Wires request field through to prompt generation streaming endpoint. |
| backend/agents/create_agent_info.py | Uses display names in KB summary and builds display_name_to_index_map in tool metadata. |
| frontend/types/agentConfig.ts | Adds display_names on tools and knowledge_base_display_names in prompt params. |
| frontend/app/[locale]/agents/components/agentInfo/AgentGenerateDetail.tsx | Collects KB display names from tools and sends with prompt generation request. |
| frontend/app/[locale]/agents/components/agentConfig/tool/ToolConfigModal.tsx | Persists selected KB display names on the configured tool in local state. |
| test/sdk/core/tools/test_knowledge_base_search_tool.py | Updates tool instantiation and adds extensive unit tests for new behaviors. |
| test/sdk/core/agents/test_nexent_agent.py | Tests that metadata-driven tool creation sets display_name_to_index_map. |
| test/backend/services/test_tool_configuration_service.py | Updates validation tests to account for KB display mapping. |
| test/backend/services/test_prompt_service.py | Adds tests for KB display name injection and DB fallback helper. |
| test/backend/database/test_knowledge_db.py | Adds tests for get_knowledge_name_map_by_index_names(). |
| test/backend/agents/test_create_agent_info.py | Updates tool metadata assertions and adds display-name map tests. |


Comment on lines +483 to +486
if (Array.isArray(editedAgent.tools)) {
  for (const tool of editedAgent.tools) {
    if (typeof tool === "object" && tool.display_names && Array.isArray(tool.display_names)) {
      knowledgeBaseDisplayNames.push(...tool.display_names);

Copilot AI Apr 17, 2026


knowledgeBaseDisplayNames.push(...tool.display_names) can accumulate duplicates (e.g., multiple KB tools or repeated selections), which unnecessarily bloats the payload and prompt. Consider deduplicating while preserving order (e.g., with a Set) before sending knowledge_base_display_names.

Suggested change
- if (Array.isArray(editedAgent.tools)) {
-   for (const tool of editedAgent.tools) {
-     if (typeof tool === "object" && tool.display_names && Array.isArray(tool.display_names)) {
-       knowledgeBaseDisplayNames.push(...tool.display_names);
+ const seenKnowledgeBaseDisplayNames = new Set<string>();
+ if (Array.isArray(editedAgent.tools)) {
+   for (const tool of editedAgent.tools) {
+     if (
+       typeof tool === "object" &&
+       tool.display_names &&
+       Array.isArray(tool.display_names)
+     ) {
+       for (const displayName of tool.display_names) {
+         if (!seenKnowledgeBaseDisplayNames.has(displayName)) {
+           seenKnowledgeBaseDisplayNames.add(displayName);
+           knowledgeBaseDisplayNames.push(displayName);
+         }
+       }

Comment on lines +814 to +818
"""Test that cite_index in results starts from record_ops + index + 1."""
mock_results = create_mock_search_result(2)
knowledge_base_search_tool.vdb_core.hybrid_search.return_value = mock_results

# record_ops starts at 1, so cite_index should be 1+0+1=1, 1+1+1=2

Copilot AI Apr 17, 2026


The comment explaining the expected cite_index math is inconsistent: 1+0+1 equals 2, but the assertions expect the first cite_index to be 1. Update the comment to match the actual intended formula (cite_index = record_ops + index per implementation) to avoid confusing future readers.

Suggested change
- """Test that cite_index in results starts from record_ops + index + 1."""
- mock_results = create_mock_search_result(2)
- knowledge_base_search_tool.vdb_core.hybrid_search.return_value = mock_results
- # record_ops starts at 1, so cite_index should be 1+0+1=1, 1+1+1=2
+ """Test that cite_index in results starts from record_ops + index."""
+ mock_results = create_mock_search_result(2)
+ knowledge_base_search_tool.vdb_core.hybrid_search.return_value = mock_results
+ # record_ops starts at 1, so cite_index should be record_ops + index: 1+0=1, 1+1=2
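
As a quick check of the corrected arithmetic, the snippet below computes the values the comment describes. It assumes the formula stated in the review (cite_index = record_ops + index) and does not call the real tool.

```python
record_ops = 1   # initial value set in the tool's __init__
num_results = 2  # matches create_mock_search_result(2) in the test

# Assumed formula from the review comment: cite_index = record_ops + index
cite_indices = [record_ops + index for index in range(num_results)]
print(cite_indices)  # [1, 2]
```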

Comment on lines +89 to +139
        display_name_to_index_map: dict = Field(
            description="Mapping from display_name (knowledge_name) to index_name",
            default_factory=dict, exclude=True),
    ):
        """Initialize the KBSearchTool.

        Args:
            top_k (int, optional): Number of results to return. Defaults to 3.
            observer (MessageObserver, optional): Message observer instance. Defaults to None.
            display_name_to_index_map (dict, optional): Mapping from display_name to index_name.
                When LLM passes display_name as index_names parameter, it will be converted
                to the actual index_name for ES queries.

        Raises:
            ValueError: If language is not supported
        """
        super().__init__()
        self.top_k = top_k
        self.observer = observer
        self.vdb_core = vdb_core
        self.index_names = [] if index_names is None else index_names
        self.search_mode = search_mode
        self.embedding_model = embedding_model
        self.rerank = rerank
        self.rerank_model_name = rerank_model_name
        self.rerank_model = rerank_model
        self.display_name_to_index_map = display_name_to_index_map

        self.record_ops = 1  # To record serial number
        self.running_prompt_zh = "知识库检索中..."
        self.running_prompt_en = "Searching the knowledge base..."


    def _convert_to_index_names(self, names: List[str]) -> List[str]:
        """Convert display names (knowledge_name) to index names if necessary.

        When LLM passes display_name as the index_names parameter,
        this method converts it to the actual index_name for ES queries.

        Args:
            names: List of names that could be either display_name or index_name

        Returns:
            List of actual index_names for ES queries
        """
        # Handle FieldInfo case (smolagents doesn't expand Field defaults)
        display_map = self.display_name_to_index_map
        if isinstance(display_map, FieldInfo):
            display_map = display_map.default
        if not display_map:
            return names

Copilot AI Apr 17, 2026


display_name_to_index_map uses Field(default_factory=dict) in a plain Python __init__, so when the caller doesn't pass it, the value is a FieldInfo whose .default is typically PydanticUndefined (because the default comes from default_factory). _convert_to_index_names() then sets display_map = display_map.default, which can lead to a non-dict sentinel and cause failures like TypeError: argument of type 'PydanticUndefinedType' is not iterable when doing name in display_map. Consider normalizing in __init__ (e.g., if it's FieldInfo, use {} / call default_factory), and/or make the signature default None and set {} explicitly to avoid storing FieldInfo on the instance.
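
One way to follow this suggestion is to normalize the argument once in `__init__` so that only a real dict is ever stored. The sketch below is a hedged illustration: `KBSearchToolSketch` and `normalize_display_map` are hypothetical names, the class omits every other parameter of the real tool, and it checks for `dict` rather than importing pydantic's `FieldInfo`.

```python
from typing import Any, Dict, List, Optional


def normalize_display_map(value: Any) -> Dict[str, str]:
    """Return a real dict; anything else (None, a FieldInfo, a sentinel) becomes {}."""
    return dict(value) if isinstance(value, dict) else {}


class KBSearchToolSketch:
    """Hypothetical stand-in for the tool, showing only the mapping logic."""

    def __init__(self, display_name_to_index_map: Optional[Dict[str, str]] = None):
        # Normalize once here so a FieldInfo is never stored on the instance.
        self.display_name_to_index_map = normalize_display_map(display_name_to_index_map)

    def convert_to_index_names(self, names: List[str]) -> List[str]:
        display_map = self.display_name_to_index_map
        if not display_map:
            return names
        return [display_map.get(name, name) for name in names]


# Caller omits the map: names pass through untouched instead of raising.
tool = KBSearchToolSketch()
print(tool.convert_to_index_names(["Product FAQ"]))

# A non-dict value (standing in for a leaked FieldInfo) is also reduced to {}.
leaky = KBSearchToolSketch(object())
print(leaky.display_name_to_index_map)
```

With the map normalized at construction time, the `isinstance(display_map, FieldInfo)` branch in the conversion method would no longer be needed.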

Comment on lines +718 to +724
# Build display_name to index_name mapping for LLM parameter conversion
index_names = instantiation_params.get("index_names", [])
display_name_to_index_map = {}
if index_names:
    knowledge_name_map = get_knowledge_name_map_by_index_names(index_names)
    for idx_name, kb_name in knowledge_name_map.items():
        display_name_to_index_map[kb_name] = idx_name

Copilot AI Apr 17, 2026


index_names = instantiation_params.get("index_names", []) can be a JSON string (the codebase already handles this case elsewhere, e.g. in prompt_service.get_knowledge_base_display_names). Passing a string into get_knowledge_name_map_by_index_names() will produce incorrect queries (.in_(index_names) iterates characters) or runtime errors. Normalize index_names to List[str] here (parse JSON string, and/or wrap a single string into a list) before building display_name_to_index_map.
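
A normalization step along these lines might look like the sketch below. `normalize_index_names` is a hypothetical helper, not a function from the codebase, and the exact input shapes the project needs to accept are an assumption.

```python
import json
from typing import Any, List


def normalize_index_names(raw: Any) -> List[str]:
    """Coerce index_names into a list of strings.

    Accepts a list or tuple, a JSON-encoded list, a single plain string, or None.
    """
    if raw is None:
        return []
    if isinstance(raw, str):
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            parsed = None
        if isinstance(parsed, list):
            return [str(item) for item in parsed]
        # Not a JSON list: treat the whole string as one index name.
        return [raw] if raw else []
    if isinstance(raw, (list, tuple)):
        return [str(item) for item in raw]
    return []


print(normalize_index_names('["idx_a", "idx_b"]'))
print(normalize_index_names("idx_a"))
print(normalize_index_names(["idx_a"]))
```

Passing the normalized list to the lookup avoids the case where `.in_()` receives a bare string and iterates over its characters.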

Comment on lines +83 to +93
# Get knowledge base display names for few-shot examples
# Priority: frontend-provided > database query
if knowledge_base_display_names:
    logger.debug(f"Using frontend-provided knowledge base display names: {knowledge_base_display_names}")
else:
    knowledge_base_display_names = get_knowledge_base_display_names(
        tool_info_list=tool_info_list,
        agent_id=agent_id,
        tenant_id=tenant_id
    )
    logger.debug(f"Using database query for knowledge base display names: {knowledge_base_display_names}")

Copilot AI Apr 17, 2026


The check if knowledge_base_display_names: treats an empty list the same as None, so a client that intentionally passes [] (to explicitly override DB values) will unexpectedly fall back to the database lookup. If the intent is “frontend-provided takes precedence even if empty”, use an explicit is not None check (and optionally validate/dedupe the list).
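
The difference between the truthiness check and an explicit `is not None` check can be shown with a small sketch. `resolve_display_names` and `fake_db_lookup` are hypothetical names used only for illustration; the real code calls `get_knowledge_base_display_names(...)` for the fallback.

```python
from typing import Callable, List, Optional


def resolve_display_names(
    frontend_names: Optional[List[str]],
    db_lookup: Callable[[], List[str]],
) -> List[str]:
    """Prefer frontend-provided names, even an empty list; query the DB only for None."""
    if frontend_names is not None:
        return frontend_names
    return db_lookup()


def fake_db_lookup() -> List[str]:
    # Hypothetical stand-in for the database fallback helper.
    return ["HR Handbook"]


print(resolve_display_names([], fake_db_lookup))      # explicit empty override is kept
print(resolve_display_names(None, fake_db_lookup))    # field omitted: fall back to DB
```

Whether an empty list should count as an override is a product decision; this sketch only shows how to express that choice if it is the intended behavior.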


# Add knowledge base display names for few-shot examples if available
if knowledge_base_display_names:
    kb_names_str = ", ".join(f'"{name}"' for name in knowledge_base_display_names)

Copilot AI Apr 17, 2026


kb_names_str = ", ".join(f'"{name}"' for name in knowledge_base_display_names) will generate malformed examples if a display name contains quotes, backslashes, or newlines. Since these values can come from the frontend/database, consider producing a JSON-escaped representation (e.g., via json.dumps) or otherwise escaping/quoting robustly before injecting into the prompt template.

Suggested change
- kb_names_str = ", ".join(f'"{name}"' for name in knowledge_base_display_names)
+ kb_names_str = ", ".join(json.dumps(name, ensure_ascii=False) for name in knowledge_base_display_names)
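
To see why the suggested form is safer, the sketch below runs both variants on a few made-up display names that contain quotes, a newline, and non-ASCII text. The sample names are invented for illustration.

```python
import json

# Hypothetical display names with characters that break naive quoting.
names = ['Sales "Q1" report', "line1\nline2", "产品手册"]

naive = ", ".join(f'"{name}"' for name in names)
escaped = ", ".join(json.dumps(name, ensure_ascii=False) for name in names)

print(naive)    # inner quotes are unescaped and the newline splits the line
print(escaped)  # quotes and newline are escaped; Chinese text stays readable

# The escaped form parses back to the original names when wrapped as a JSON array.
round_tripped = json.loads("[" + escaped + "]")
print(round_tripped == names)
```

`ensure_ascii=False` keeps non-ASCII names such as Chinese titles legible in the prompt instead of emitting `\uXXXX` escapes.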

Comment on lines +379 to +403
def get_knowledge_name_map_by_index_names(index_names: List[str]) -> Dict[str, str]:
    """
    Get a mapping from index_name to knowledge_name (display name) for the given index_names.
    Used to build user-friendly knowledge base summaries in prompts.

    Args:
        index_names: List of internal index names

    Returns:
        Dict[str, str]: Mapping of index_name -> knowledge_name.
            If a knowledge base is not found in the database,
            the index_name itself is used as the fallback value.
    """
    if not index_names:
        return {}

    try:
        with get_db_session() as session:
            result = session.query(
                KnowledgeRecord.index_name,
                KnowledgeRecord.knowledge_name
            ).filter(
                KnowledgeRecord.index_name.in_(index_names),
                KnowledgeRecord.delete_flag != 'Y'
            ).all()

Copilot AI Apr 17, 2026


get_knowledge_name_map_by_index_names() queries by index_name only and does not filter by tenant_id. In a multi-tenant system, this can leak knowledge base names across tenants if index_name is not guaranteed globally unique (or if callers can supply arbitrary index_names). Consider adding a tenant_id parameter and filtering on it in the query, then update callers to pass the tenant context.
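
The tenant-scoped lookup could be sketched as below. To stay self-contained, this example swaps the project's SQLAlchemy session for the standard-library `sqlite3` module; the `knowledge_record` table layout, the `tenant_id` column, and the function name `get_name_map` are assumptions for illustration only.

```python
import sqlite3
from typing import Dict, List


def get_name_map(conn: sqlite3.Connection, index_names: List[str], tenant_id: str) -> Dict[str, str]:
    """Map index_name -> knowledge_name for one tenant, falling back to index_name."""
    if not index_names:
        return {}
    placeholders = ", ".join("?" for _ in index_names)
    rows = conn.execute(
        "SELECT index_name, knowledge_name FROM knowledge_record "
        f"WHERE index_name IN ({placeholders}) "
        "AND tenant_id = ? AND delete_flag != 'Y'",
        [*index_names, tenant_id],
    ).fetchall()
    found = dict(rows)
    # Names not visible to this tenant fall back to the raw index name.
    return {name: found.get(name, name) for name in index_names}


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE knowledge_record "
    "(index_name TEXT, knowledge_name TEXT, tenant_id TEXT, delete_flag TEXT)"
)
conn.executemany(
    "INSERT INTO knowledge_record VALUES (?, ?, ?, ?)",
    [
        ("idx_1", "Tenant A Docs", "tenant_a", "N"),
        ("idx_2", "Tenant B Secrets", "tenant_b", "N"),
    ],
)

name_map = get_name_map(conn, ["idx_1", "idx_2"], "tenant_a")
print(name_map)
```

Here tenant A never sees tenant B's display name, even when it supplies tenant B's index name; it only gets the index name echoed back.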

…into develop_fix_kb_4

# Conflicts:
#	backend/prompts/utils/prompt_generate_zh.yaml
#	test/backend/agents/test_create_agent_info.py
@Zhi-a Zhi-a changed the title ✨ Enhance prompt generation with knowledge base display names ✨ Enhance prompt generation with knowledge base display names part2 Apr 20, 2026
Zhi-a added 3 commits April 21, 2026 14:13
…into develop_fix_kb_3

# Conflicts:
#	test/backend/agents/test_create_agent_info.py
…into develop_fix_kb_4

# Conflicts:
#	frontend/app/[locale]/agents/components/agentInfo/AgentGenerateDetail.tsx
@codecov

codecov Bot commented Apr 21, 2026

Codecov Report

❌ Patch coverage is 87.23404% with 6 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| backend/services/prompt_service.py | 86.95% | 3 Missing and 3 partials ⚠️ |


@Zhi-a Zhi-a changed the title ✨ Enhance prompt generation with knowledge base display names part2 🐛 Bugfix: Enhance prompt generation with knowledge base display names part2 Apr 25, 2026
Zhi-a added 2 commits April 25, 2026 14:42
…into develop_fix_kb_4

# Conflicts:
#	test/backend/agents/test_create_agent_info.py
  tool_ids=prompt_request.tool_ids,
- sub_agent_ids=prompt_request.sub_agent_ids
+ sub_agent_ids=prompt_request.sub_agent_ids,
+ knowledge_base_display_names=prompt_request.knowledge_base_display_names
Collaborator


[image] Why is the alignment here so odd?

@Dallas98 Dallas98 merged commit df61b72 into develop Apr 27, 2026
15 of 16 checks passed
@Zhi-a Zhi-a deleted the develop_fix_kb_4 branch April 28, 2026 07:08

3 participants