🐛 Bugfix: Enhance prompt generation with knowledge base display names part2 #2813
- Introduced `get_knowledge_name_map_by_index_names` function to retrieve a mapping of index names to their corresponding display names.
- Updated `create_agent_config` and `create_tool_config_list` to utilize the new mapping for generating user-friendly summaries.
- Enhanced `KnowledgeBaseSearchTool` to support conversion from display names to index names during queries.
- Added tests to verify the functionality of the new mapping and its integration within the tool configuration process.
- Added `knowledge_base_display_names` to the `GeneratePromptRequest` model to allow frontend-configured names for knowledge bases.
- Updated backend functions to utilize these display names, improving few-shot example generation without requiring database lookups.
- Modified frontend components to capture and pass knowledge base display names during prompt generation.
- Enhanced tests to cover the new functionality and ensure proper integration of knowledge base display names in the prompt generation process.
Pull request overview
This PR adds end-to-end support for using knowledge base display names (frontend-configured) during system prompt generation and tool execution, reducing reliance on database lookups and improving few-shot example correctness.
Changes:
- Extends prompt generation APIs/services to accept `knowledge_base_display_names` (frontend takes precedence over DB-derived names) and inject them into prompt templates.
- Adds `get_knowledge_name_map_by_index_names()` to map internal `index_name -> knowledge_name`, and uses it to build `display_name_to_index_map` for KB tool parameter conversion.
- Updates frontend tool config + prompt generation flow to collect and pass KB display names; expands tests across backend and SDK.
Reviewed changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| sdk/nexent/core/tools/knowledge_base_search_tool.py | Adds display_name_to_index_map and converts display names to index names before querying. |
| sdk/nexent/core/agents/nexent_agent.py | Passes display_name_to_index_map via metadata after tool instantiation (excluded params). |
| backend/database/knowledge_db.py | Adds get_knowledge_name_map_by_index_names() for index-name → display-name lookup. |
| backend/services/tool_configuration_service.py | Builds display_name_to_index_map during local tool validation for KB tool. |
| backend/services/prompt_service.py | Accepts knowledge_base_display_names, injects into template context, adds DB fallback helper. |
| backend/prompts/utils/prompt_generate_zh.yaml | Adds KB configuration note block for few-shot generation. |
| backend/prompts/utils/prompt_generate_en.yaml | Adds KB configuration note block for few-shot generation. |
| backend/consts/model.py | Extends GeneratePromptRequest with knowledge_base_display_names. |
| backend/apps/prompt_app.py | Wires request field through to prompt generation streaming endpoint. |
| backend/agents/create_agent_info.py | Uses display names in KB summary and builds display_name_to_index_map in tool metadata. |
| frontend/types/agentConfig.ts | Adds display_names on tools and knowledge_base_display_names in prompt params. |
| frontend/app/[locale]/agents/components/agentInfo/AgentGenerateDetail.tsx | Collects KB display names from tools and sends with prompt generation request. |
| frontend/app/[locale]/agents/components/agentConfig/tool/ToolConfigModal.tsx | Persists selected KB display names on the configured tool in local state. |
| test/sdk/core/tools/test_knowledge_base_search_tool.py | Updates tool instantiation and adds extensive unit tests for new behaviors. |
| test/sdk/core/agents/test_nexent_agent.py | Tests that metadata-driven tool creation sets display_name_to_index_map. |
| test/backend/services/test_tool_configuration_service.py | Updates validation tests to account for KB display mapping. |
| test/backend/services/test_prompt_service.py | Adds tests for KB display name injection and DB fallback helper. |
| test/backend/database/test_knowledge_db.py | Adds tests for get_knowledge_name_map_by_index_names(). |
| test/backend/agents/test_create_agent_info.py | Updates tool metadata assertions and adds display-name map tests. |
| if (Array.isArray(editedAgent.tools)) { | ||
| for (const tool of editedAgent.tools) { | ||
| if (typeof tool === "object" && tool.display_names && Array.isArray(tool.display_names)) { | ||
| knowledgeBaseDisplayNames.push(...tool.display_names); |
`knowledgeBaseDisplayNames.push(...tool.display_names)` can accumulate duplicates (e.g., multiple KB tools or repeated selections), which unnecessarily bloats the payload and prompt. Consider deduplicating while preserving order (e.g., with a `Set`) before sending `knowledge_base_display_names`.
| if (Array.isArray(editedAgent.tools)) { | |
| for (const tool of editedAgent.tools) { | |
| if (typeof tool === "object" && tool.display_names && Array.isArray(tool.display_names)) { | |
| knowledgeBaseDisplayNames.push(...tool.display_names); | |
| const seenKnowledgeBaseDisplayNames = new Set<string>(); | |
| if (Array.isArray(editedAgent.tools)) { | |
| for (const tool of editedAgent.tools) { | |
| if ( | |
| typeof tool === "object" && | |
| tool.display_names && | |
| Array.isArray(tool.display_names) | |
| ) { | |
| for (const displayName of tool.display_names) { | |
| if (!seenKnowledgeBaseDisplayNames.has(displayName)) { | |
| seenKnowledgeBaseDisplayNames.add(displayName); | |
| knowledgeBaseDisplayNames.push(displayName); | |
| } | |
| } |
| """Test that cite_index in results starts from record_ops + index + 1.""" | ||
| mock_results = create_mock_search_result(2) | ||
| knowledge_base_search_tool.vdb_core.hybrid_search.return_value = mock_results | ||
|
|
||
| # record_ops starts at 1, so cite_index should be 1+0+1=1, 1+1+1=2 |
The comment explaining the expected `cite_index` math is inconsistent: `1+0+1` equals 2, but the assertions expect the first `cite_index` to be 1. Update the comment to match the actual intended formula (`cite_index = record_ops + index` per implementation) to avoid confusing future readers.
| """Test that cite_index in results starts from record_ops + index + 1.""" | |
| mock_results = create_mock_search_result(2) | |
| knowledge_base_search_tool.vdb_core.hybrid_search.return_value = mock_results | |
| # record_ops starts at 1, so cite_index should be 1+0+1=1, 1+1+1=2 | |
| """Test that cite_index in results starts from record_ops + index.""" | |
| mock_results = create_mock_search_result(2) | |
| knowledge_base_search_tool.vdb_core.hybrid_search.return_value = mock_results | |
| # record_ops starts at 1, so cite_index should be record_ops + index: 1+0=1, 1+1=2 |
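The corrected arithmetic can be checked with a small sketch. It assumes, as the review comment states, that the implementation computes `cite_index = record_ops + index`; the helper name `cite_indices` is hypothetical and not part of the PR.

```python
from typing import List


def cite_indices(record_ops: int, result_count: int) -> List[int]:
    """Return the cite_index of each result, assuming cite_index = record_ops + index."""
    return [record_ops + index for index in range(result_count)]


# With record_ops starting at 1 and two results, the indices are 1+0 and 1+1.
print(cite_indices(record_ops=1, result_count=2))
```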
| display_name_to_index_map: dict = Field( | ||
| description="Mapping from display_name (knowledge_name) to index_name", | ||
| default_factory=dict, exclude=True), | ||
| ): | ||
| """Initialize the KBSearchTool. | ||
|
|
||
| Args: | ||
| top_k (int, optional): Number of results to return. Defaults to 3. | ||
| observer (MessageObserver, optional): Message observer instance. Defaults to None. | ||
| display_name_to_index_map (dict, optional): Mapping from display_name to index_name. | ||
| When LLM passes display_name as index_names parameter, it will be converted | ||
| to the actual index_name for ES queries. | ||
|
|
||
| Raises: | ||
| ValueError: If language is not supported | ||
| """ | ||
| super().__init__() | ||
| self.top_k = top_k | ||
| self.observer = observer | ||
| self.vdb_core = vdb_core | ||
| self.index_names = [] if index_names is None else index_names | ||
| self.search_mode = search_mode | ||
| self.embedding_model = embedding_model | ||
| self.rerank = rerank | ||
| self.rerank_model_name = rerank_model_name | ||
| self.rerank_model = rerank_model | ||
| self.display_name_to_index_map = display_name_to_index_map | ||
|
|
||
| self.record_ops = 1 # To record serial number | ||
| self.running_prompt_zh = "知识库检索中..." | ||
| self.running_prompt_en = "Searching the knowledge base..." | ||
|
|
||
|
|
||
| def _convert_to_index_names(self, names: List[str]) -> List[str]: | ||
| """Convert display names (knowledge_name) to index names if necessary. | ||
|
|
||
| When LLM passes display_name as the index_names parameter, | ||
| this method converts it to the actual index_name for ES queries. | ||
|
|
||
| Args: | ||
| names: List of names that could be either display_name or index_name | ||
|
|
||
| Returns: | ||
| List of actual index_names for ES queries | ||
| """ | ||
| # Handle FieldInfo case (smolagents doesn't expand Field defaults) | ||
| display_map = self.display_name_to_index_map | ||
| if isinstance(display_map, FieldInfo): | ||
| display_map = display_map.default | ||
| if not display_map: | ||
| return names |
`display_name_to_index_map` uses `Field(default_factory=dict)` in a plain Python `__init__`, so when the caller doesn't pass it, the value is a `FieldInfo` whose `.default` is typically `PydanticUndefined` (because the default comes from `default_factory`). `_convert_to_index_names()` then sets `display_map = display_map.default`, which can lead to a non-dict sentinel and cause failures like `TypeError: argument of type 'PydanticUndefinedType' is not iterable` when doing `name in display_map`. Consider normalizing in `__init__` (e.g., if it's `FieldInfo`, use `{}` / call `default_factory`), and/or make the signature default `None` and set `{}` explicitly to avoid storing `FieldInfo` on the instance.
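One way to read the second suggestion is sketched below. The class `KBSearchToolSketch` is a hypothetical stand-in that keeps only the mapping logic; it assumes that names absent from the map are already index names, and it treats any non-dict argument (including a stray `FieldInfo`) as "no mapping" rather than importing pydantic.

```python
from typing import Dict, List, Optional


class KBSearchToolSketch:
    """Hypothetical stand-in for the tool; only the mapping logic is shown."""

    def __init__(self, display_name_to_index_map: Optional[Dict[str, str]] = None):
        # Normalize once so the instance always holds a real dict.
        # Anything that is not a dict (None, a FieldInfo sentinel, ...) becomes {}.
        if isinstance(display_name_to_index_map, dict):
            self.display_name_to_index_map = dict(display_name_to_index_map)
        else:
            self.display_name_to_index_map = {}

    def convert_to_index_names(self, names: List[str]) -> List[str]:
        # Names missing from the map are assumed to already be index names.
        return [self.display_name_to_index_map.get(name, name) for name in names]
```

With the map normalized in the constructor, the conversion method no longer needs a `FieldInfo` branch.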
| # Build display_name to index_name mapping for LLM parameter conversion | ||
| index_names = instantiation_params.get("index_names", []) | ||
| display_name_to_index_map = {} | ||
| if index_names: | ||
| knowledge_name_map = get_knowledge_name_map_by_index_names(index_names) | ||
| for idx_name, kb_name in knowledge_name_map.items(): | ||
| display_name_to_index_map[kb_name] = idx_name |
`index_names = instantiation_params.get("index_names", [])` can be a JSON string (the codebase already handles this case elsewhere, e.g. in `prompt_service.get_knowledge_base_display_names`). Passing a string into `get_knowledge_name_map_by_index_names()` will produce incorrect queries (`.in_(index_names)` iterates characters) or runtime errors. Normalize `index_names` to `List[str]` here (parse JSON string, and/or wrap a single string into a list) before building `display_name_to_index_map`.
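The normalization described above could look roughly like this. The helper name `normalize_index_names` is hypothetical, and the policy of treating an unparseable string as a single index name is an assumption rather than the project's documented behavior.

```python
import json
from typing import Any, List


def normalize_index_names(raw: Any) -> List[str]:
    """Coerce a list, a JSON-encoded list, or a single name into a list of strings."""
    if isinstance(raw, str):
        text = raw.strip()
        if not text:
            return []
        try:
            parsed = json.loads(text)
        except json.JSONDecodeError:
            # Not JSON: assume the string is a single index name.
            return [text]
        if isinstance(parsed, str):
            return [parsed]
        if not isinstance(parsed, list):
            # Valid JSON but not a list (number, object, null): keep the raw text.
            return [text]
        raw = parsed
    if isinstance(raw, (list, tuple)):
        # Drop empty or non-string entries instead of passing them to the query.
        return [item for item in raw if isinstance(item, str) and item]
    return []
```

Calling it on the value from `instantiation_params` before the database lookup would keep the `IN` filter from iterating over the characters of a string.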
| # Get knowledge base display names for few-shot examples | ||
| # Priority: frontend-provided > database query | ||
| if knowledge_base_display_names: | ||
| logger.debug(f"Using frontend-provided knowledge base display names: {knowledge_base_display_names}") | ||
| else: | ||
| knowledge_base_display_names = get_knowledge_base_display_names( | ||
| tool_info_list=tool_info_list, | ||
| agent_id=agent_id, | ||
| tenant_id=tenant_id | ||
| ) | ||
| logger.debug(f"Using database query for knowledge base display names: {knowledge_base_display_names}") |
The check `if knowledge_base_display_names:` treats an empty list the same as `None`, so a client that intentionally passes `[]` (to explicitly override DB values) will unexpectedly fall back to the database lookup. If the intent is “frontend-provided takes precedence even if empty”, use an explicit `is not None` check (and optionally validate/dedupe the list).
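A minimal sketch of the `is not None` variant follows, assuming the intent really is that an empty frontend list overrides the database. The function `resolve_display_names` and its `db_lookup` callable are hypothetical; in the PR the fallback is `get_knowledge_base_display_names(...)`.

```python
from typing import Callable, List, Optional


def resolve_display_names(
    frontend_names: Optional[List[str]],
    db_lookup: Callable[[], List[str]],
) -> List[str]:
    """Prefer frontend-provided names, even when the list is empty."""
    # None means "not provided"; an empty list is an explicit override.
    if frontend_names is not None:
        # dict.fromkeys removes duplicates while preserving first-seen order.
        return list(dict.fromkeys(frontend_names))
    return db_lookup()
```

Passing the database lookup as a callable means it only runs when the frontend sent nothing at all.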
|
|
||
| # Add knowledge base display names for few-shot examples if available | ||
| if knowledge_base_display_names: | ||
| kb_names_str = ", ".join(f'"{name}"' for name in knowledge_base_display_names) |
`kb_names_str = ", ".join(f'"{name}"' for name in knowledge_base_display_names)` will generate malformed examples if a display name contains quotes, backslashes, or newlines. Since these values can come from the frontend/database, consider producing a JSON-escaped representation (e.g., via `json.dumps`) or otherwise escaping/quoting robustly before injecting into the prompt template.
| kb_names_str = ", ".join(f'"{name}"' for name in knowledge_base_display_names) | |
| kb_names_str = ", ".join(json.dumps(name, ensure_ascii=False) for name in knowledge_base_display_names) |
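To illustrate the effect of the suggested change, the sketch below wraps it in a hypothetical helper, `format_kb_names`, and feeds it a name containing quotes. The sample names are made up for the example.

```python
import json
from typing import List


def format_kb_names(names: List[str]) -> str:
    """Join display names as JSON string literals so quotes and newlines are escaped."""
    # ensure_ascii=False keeps non-ASCII names (e.g. Chinese) readable in the prompt.
    return ", ".join(json.dumps(name, ensure_ascii=False) for name in names)


# Embedded quotes are escaped instead of terminating the literal early.
print(format_kb_names(['Sales "2024" KB', "产品手册"]))
```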
| def get_knowledge_name_map_by_index_names(index_names: List[str]) -> Dict[str, str]: | ||
| """ | ||
| Get a mapping from index_name to knowledge_name (display name) for the given index_names. | ||
| Used to build user-friendly knowledge base summaries in prompts. | ||
|
|
||
| Args: | ||
| index_names: List of internal index names | ||
|
|
||
| Returns: | ||
| Dict[str, str]: Mapping of index_name -> knowledge_name. | ||
| If a knowledge base is not found in the database, | ||
| the index_name itself is used as the fallback value. | ||
| """ | ||
| if not index_names: | ||
| return {} | ||
|
|
||
| try: | ||
| with get_db_session() as session: | ||
| result = session.query( | ||
| KnowledgeRecord.index_name, | ||
| KnowledgeRecord.knowledge_name | ||
| ).filter( | ||
| KnowledgeRecord.index_name.in_(index_names), | ||
| KnowledgeRecord.delete_flag != 'Y' | ||
| ).all() |
`get_knowledge_name_map_by_index_names()` queries by `index_name` only and does not filter by `tenant_id`. In a multi-tenant system, this can leak knowledge base names across tenants if `index_name` is not guaranteed globally unique (or if callers can supply arbitrary `index_names`). Consider adding a `tenant_id` parameter and filtering on it in the query, then update callers to pass the tenant context.
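The tenant-scoped lookup can be sketched with the standard-library `sqlite3` module standing in for the project's ORM session. Everything here is an assumption made for illustration: the table name `knowledge_record`, the presence of a `tenant_id` column on it, and the function name `get_name_map_for_tenant`. Only the fallback to the index name mirrors the docstring in the diff.

```python
import sqlite3
from typing import Dict, List


def get_name_map_for_tenant(
    conn: sqlite3.Connection, index_names: List[str], tenant_id: str
) -> Dict[str, str]:
    """Map index_name -> knowledge_name, restricted to one tenant's records."""
    if not index_names:
        return {}
    placeholders = ", ".join("?" for _ in index_names)
    rows = conn.execute(
        "SELECT index_name, knowledge_name FROM knowledge_record "
        f"WHERE index_name IN ({placeholders}) "
        "AND tenant_id = ? AND delete_flag != 'Y'",
        [*index_names, tenant_id],
    ).fetchall()
    found = dict(rows)
    # Indexes not visible to this tenant fall back to the index name itself,
    # so another tenant's display name is never returned.
    return {name: found.get(name, name) for name in index_names}
```

Because the tenant filter is part of the query, an index name that belongs to another tenant resolves to itself rather than to that tenant's display name.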
…into develop_fix_kb_4 # Conflicts: # backend/prompts/utils/prompt_generate_zh.yaml # test/backend/agents/test_create_agent_info.py
…into develop_fix_kb_3 # Conflicts: # test/backend/agents/test_create_agent_info.py
…into develop_fix_kb_4 # Conflicts: # frontend/app/[locale]/agents/components/agentInfo/AgentGenerateDetail.tsx
…up/nexent into develop_fix_kb_4 # Conflicts: # test/backend/agents/test_create_agent_info.py
…into develop_fix_kb_4 # Conflicts: # test/backend/agents/test_create_agent_info.py
| tool_ids=prompt_request.tool_ids, | ||
| sub_agent_ids=prompt_request.sub_agent_ids | ||
| sub_agent_ids=prompt_request.sub_agent_ids, | ||
| knowledge_base_display_names=prompt_request.knowledge_base_display_names |

✨ Enhance prompt generation with knowledge base display names
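Agent prompts now use the knowledge base names selected in the tool configuration instead of inventing them.
Added `knowledge_base_display_names` to the `GeneratePromptRequest` model to allow frontend-configured names for knowledge bases.
Before the change:
![image](https://private-user-images.githubusercontent.com/61991848/612137723-4809eaf4-ca17-4eea-b23e-5c76b7d3fd32.png?jwt=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3ODE2OTA3OTksIm5iZiI6MTc4MTY5MDQ5OSwicGF0aCI6Ii82MTk5MTg0OC82MTIxMzc3MjMtNDgwOWVhZjQtY2ExNy00ZWVhLWIyM2UtNWM3NmI3ZDNmZDMyLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNjA2MTclMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjYwNjE3VDEwMDEzOVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWRhN2JiMzcyZjM0ZGEwNDI0MTgwYTkzOTg0NjNmMWI1NjMzZGE5MWNkYTNhOTIxOTRhYjg1OGRkOWQ5N2ViNTImWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JnJlc3BvbnNlLWNvbnRlbnQtdHlwZT1pbWFnZSUyRnBuZyJ9.eQV9iCk3QnJ3HZRSGn68mLHqGAg0SOf_TINnSr1NSKE)
After the change: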
