
Commit 46d1a2b

refactor: add single tool (#39)

1 parent e064582 commit 46d1a2b
25 files changed: +1779 −1098 lines changed

build.sh

Lines changed: 7 additions & 1 deletion

@@ -14,10 +14,12 @@ if ! command -v python3 &> /dev/null; then
     exit 1
 fi

+USE_UV=false
 # 2. Check for uv and install dependencies
 if command -v uv &> /dev/null; then
     echo "--> Using uv to install dependencies..."
     uv sync
+    USE_UV=true
 else
     echo "--> uv not found, using pip to install from pyproject.toml..."
     python3 -m pip install -e .
@@ -26,7 +28,11 @@ fi
 # 3. Install PyInstaller if not present
 if ! python3 -c "import PyInstaller" 2>/dev/null; then
     echo "--> PyInstaller not found. Installing..."
-    python3 -m pip install pyinstaller
+    if [ "$USE_UV" = true ]; then
+        uv pip install pyinstaller
+    else
+        python3 -m pip install pyinstaller
+    fi
 fi

 # 4. Clean up previous builds
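The hunk above introduces a `USE_UV` flag so that PyInstaller is installed with the same tool that installed the other dependencies. The same "prefer uv, fall back to pip" dispatch can be sketched in Python; `pick_installer` is a hypothetical helper, not part of the commit.

```python
# Hypothetical mirror of the build.sh installer-selection logic:
# prefer uv when it is on PATH, otherwise fall back to pip.
import shutil


def pick_installer(which=shutil.which):
    """Return the command prefix for installing a package.

    `which` is injectable so the choice can be tested without
    depending on what is actually installed on this machine.
    """
    if which("uv"):
        # uv is present: use its pip-compatible interface.
        return ["uv", "pip", "install"]
    # Fallback path, matching the script's pip branch.
    return ["python3", "-m", "pip", "install"]
```

Keeping the flag in one variable (rather than re-probing `command -v uv` later) guarantees both install steps agree even if PATH changes mid-build.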

config/prompts_en.yaml

Lines changed: 17 additions & 47 deletions

@@ -76,8 +76,8 @@ chat_workflow:
       Criteria: Query suggests need to access information already stored in the system

       ## Classification Decision Flow
-      1. Determine if it involves historically stored data/memory in the system qa_analysis
-      2. Determine if it is simple social interaction simple_chat
+      1. Determine if it involves historically stored data/memory in the system qa_analysis
+      2. Determine if it is simple social interaction simple_chat
       ## Pattern Recognition Guidance
       - **Time pattern**: Containing time words usually points to qa_analysis
       - **Subject pattern**: First-person queries (I, my) usually involve personal historical data
@@ -312,68 +312,38 @@ chat_workflow:
   # Tool result validation and filtering
   tool_result_validation:
-    system: |
-      You are the tool result validation expert of the OpenContext intelligent context management system, responsible for evaluating the quality and relevance of tool call results.
+    system: |
+      You are the tool result filtering expert of the OpenContext intelligent context management system. Your task is simple: filter results that are relevant to the user's question from tool-returned results.

-      ## Core Tasks
-      Evaluate the tool call results just executed, determining:
-      1. Which results are valuable for answering user questions (relevance)
-      2. Whether tool parameters are correct (validity)
-      3. Whether certain tools need to be retried with different parameters (improvement suggestions)
-
-      ## Evaluation Dimensions
-
-      ### Relevance Judgment
+      ## Relevance Judgment Criteria
       - **High relevance**: Directly contains information needed to answer the question
-      - **Medium relevance**: Contains some useful information, but not sufficient
+      - **Medium relevance**: Contains some useful information, helpful for answering
       - **Low relevance**: Related to the question but not very useful
       - **Not relevant**: Completely unrelated information

       **Only keep high and medium relevance results**

-      ### Parameter Validity
-      - Whether tool parameters are reasonable (query terms, time range, etc.)
-      - Whether returned results meet expectations
-      - Whether results are useless due to incorrect parameters
-
-      ### Improvement Suggestions
-      - If a tool parameter is inappropriate, suggest how to adjust
-      - If information is insufficient, suggest what tool should be called in the next round
-      - Avoid repeatedly calling the same tool with the same parameters
-
-      ## Output Format
-      Strictly output in JSON format:
+      ## Output Format (Strictly Follow)
+      Must strictly output in the following JSON format:
       ```json
       {
-        "relevant_result_ids": ["result_id_1", "result_id_3", "result_id_5"],
-        "feedback": "Brief feedback explanation",
-        "retry_suggestions": [
-          {
-            "tool_name": "Tool name",
-            "reason": "Why retry is needed",
-            "suggested_params": {"param": "Suggested parameter value"}
-          }
-        ]
+        "relevant_result_ids": ["result_id_1", "result_id_2", "result_id_3"]
       }
       ```

-      **Important Principles**:
-      - relevant_result_ids only includes IDs of high and medium relevance results
-      - feedback summarizes overall quality in 1-2 sentences
-      - retry_suggestions are only provided when retry is really needed
-      - If all results are not relevant, return empty relevant_result_ids: []
+      **Important Requirements**:
+      - Field name must be `relevant_result_ids` (not relevant_results)
+      - Value must be a string array, containing only result_id values
+      - Do not add other fields
+      - If all results are not relevant, return empty array: `{"relevant_result_ids": []}`
     user: |
-      Please evaluate the following tool call results:
+      Please filter results that are relevant to the user's question from the following tool results.

       **User Question**: {original_query}
       **Enhanced Query**: {enhanced_query}

-      **Tool Call Situation**:
-      {tool_calls}
-
-      **Tool Return Results**:
+      **Tool Results**:
       {tool_results}
-
-      Please analyze the relevance of each result and return JSON format evaluation results.
+      ```

   sufficiency_evaluation:
     system: |
@@ -890,7 +860,7 @@ generation:
       - Keep general activities concise but complete overview, ensure all activities are reflected
       - Explain user's specific operations and goals
       - Use natural friendly tone, avoid excessive emoji use, maximum 1-2
-      - Reflect activity coherence and logic, description in three layers: Main activity Specific operation Goal result
+      - Reflect activity coherence and logic, description in three layers: Main activity Specific operation Goal result

   3. **Context ID Requirements**:
      - Select at most 5 most valuable context IDs to return
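The rewritten prompt narrows the model's output to a single `relevant_result_ids` string array, which makes the response mechanically checkable. A minimal sketch (not from the commit; `parse_validation` is a hypothetical helper) of enforcing that schema on the consumer side:

```python
# Validate the strict single-field schema the new prompt demands:
# {"relevant_result_ids": ["...", ...]} with string IDs only.
import json


def parse_validation(response: str) -> list[str]:
    """Parse the LLM reply and return the list of relevant result IDs.

    Raises ValueError if the field is missing or not a list of strings,
    so callers can trigger a fallback instead of silently mis-filtering.
    """
    data = json.loads(response)
    ids = data.get("relevant_result_ids")
    if not isinstance(ids, list) or not all(isinstance(i, str) for i in ids):
        raise ValueError("expected relevant_result_ids to be a list of strings")
    return ids
```

Dropping `feedback` and `retry_suggestions` from the schema is what allows this one-field check; a single list is much harder for the model to get structurally wrong than a nested object.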

config/prompts_zh.yaml

Lines changed: 15 additions & 45 deletions

@@ -312,68 +312,38 @@ chat_workflow:
   # 工具结果验证与过滤
   tool_result_validation:
-    system: |
-      你是OpenContext智能上下文管理系统的工具结果验证专家,负责评估工具调用结果的质量和相关性
+    system: |
+      你是OpenContext智能上下文管理系统的工具结果过滤专家。你的任务很简单:从工具返回的结果中,筛选出与用户问题相关的结果

-      ## 核心任务
-      评估刚刚执行的工具调用结果,判断:
-      1. 哪些结果对回答用户问题有价值(相关性)
-      2. 工具参数是否正确(有效性)
-      3. 是否需要用不同参数重试某些工具(改进建议)
-
-      ## 评估维度
-
-      ### 相关性判断
+      ## 相关性判断标准
       - **高相关**: 直接包含回答问题所需的信息
-      - **中相关**: 包含部分有用信息,但不够充分
-      - **低相关**: 与问题相关但不太有用
+      - **中相关**: 包含部分有用信息,对回答有帮助
+      - **低相关**: 与问题相关但用处不大
       - **不相关**: 完全无关的信息

       **只保留高相关和中相关的结果**

-      ### 参数有效性
-      - 工具参数是否合理(查询词、时间范围等)
-      - 返回结果是否符合预期
-      - 是否因为参数错误导致结果无用
-
-      ### 改进建议
-      - 如果某个工具参数不当,建议如何调整
-      - 如果信息不足,建议下轮应该调用什么工具
-      - 避免重复调用相同工具和参数
-
-      ## 输出格式
-      严格输出JSON格式:
+      ## 输出格式(严格遵守)
+      必须严格按照以下JSON格式输出:
       ```json
       {
-        "relevant_result_ids": ["result_id_1", "result_id_3", "result_id_5"],
-        "feedback": "简要反馈说明",
-        "retry_suggestions": [
-          {
-            "tool_name": "工具名称",
-            "reason": "为什么需要重试",
-            "suggested_params": {"param": "建议的参数值"}
-          }
-        ]
+        "relevant_result_ids": ["result_id_1", "result_id_2", "result_id_3"]
       }
       ```

-      **重要原则**:
-      - relevant_result_ids 只包含高相关和中相关结果的ID
-      - feedback 用1-2句话概括整体质量
-      - retry_suggestions 只在确实需要重试时才提供
-      - 如果所有结果都不相关,返回空的 relevant_result_ids: []
+      **重要要求**:
+      - 字段名必须是 `relevant_result_ids`(不是 relevant_results)
+      - 值必须是字符串数组,只包含 result_id 的值
+      - 不要添加其他字段
+      - 如果所有结果都不相关,返回空数组:`{"relevant_result_ids": []}`
     user: |
-      请评估以下工具调用结果:
+      请从以下工具结果中筛选出与用户问题相关的结果。

       **用户问题**: {original_query}
       **增强查询**: {enhanced_query}

-      **工具调用情况**:
-      {tool_calls}
-
-      **工具返回结果**:
+      **工具结果**:
       {tool_results}
-
-      请分析每个结果的相关性,返回JSON格式的评估结果。
+      ```

   sufficiency_evaluation:
     system: |

opencontext/context_consumption/completion/completion_service.py

Lines changed: 10 additions & 11 deletions

@@ -21,7 +21,7 @@
 from opencontext.context_consumption.completion.completion_cache import get_completion_cache
 from opencontext.utils.logging_utils import get_logger
 from opencontext.models.enums import CompletionType
-from opencontext.tools.retrieval_tools.text_search_tool import TextSearchTool
+from opencontext.tools.retrieval_tools.semantic_context_tool import SemanticContextTool

 logger = get_logger(__name__)

@@ -56,7 +56,7 @@ def __init__(self):
         self.chat_client = None
         self.cache = get_completion_cache()  # Use a dedicated cache manager
         self.prompt_manager = None  # Prompt manager
-        self.text_search_tool = None  # TextSearchTool instance
+        self.semantic_search_tool = None  # SemanticContextTool instance

         # Completion configuration
         self.max_context_length = 500  # Maximum context length
@@ -73,9 +73,9 @@ def _initialize(self):
         self.storage = get_storage()

         self.prompt_manager = get_prompt_manager()

-        # Initialize TextSearchTool
-        self.text_search_tool = TextSearchTool()
+        # Initialize SemanticContextTool
+        self.semantic_search_tool = SemanticContextTool()

         logger.info("CompletionService initialized successfully")

@@ -330,18 +330,17 @@ def _get_reference_suggestions(self, context: Dict[str, Any]) -> List[Completion
         suggestions = []

         try:
-            if not self.text_search_tool:
+            if not self.semantic_search_tool:
                 return suggestions

             # Use context for vector search
             search_text = context.get('context_before', '')
             if len(search_text) < 10:
                 return suggestions

-            # Use TextSearchTool for semantic search
-            search_results = self.text_search_tool.execute(
+            # Use SemanticContextTool for semantic search
+            search_results = self.semantic_search_tool.execute(
                 query=search_text,
-                context_type='vault',  # Only search note content
                 top_k=5
             )
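Note that the swap from `TextSearchTool` to `SemanticContextTool` also drops the `context_type='vault'` argument, so the new call is just `execute(query=..., top_k=...)`. The tools interchange cleanly because the service only relies on that narrow duck-typed interface. A hypothetical stand-in (names and corpus invented here, not from the repo) illustrating the shape of the call:

```python
# Hypothetical stand-in for SemanticContextTool, showing the narrower
# execute(query, top_k) signature the completion service now depends on.
class FakeSemanticContextTool:
    def execute(self, query: str, top_k: int = 5) -> list[str]:
        """Return at most top_k documents ranked by naive word overlap."""
        corpus = ["meeting notes", "build script notes", "unrelated text"]
        # Keep documents sharing at least one word with the query.
        hits = [doc for doc in corpus if any(w in doc for w in query.split())]
        return hits[:top_k]
```

Any tool exposing this `execute` signature could be dropped into `_get_reference_suggestions` the same way, which is what makes the one-line import change sufficient.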
opencontext/context_consumption/context_agent/core/llm_context_strategy.py

Lines changed: 30 additions & 20 deletions

@@ -67,7 +67,7 @@ async def analyze_and_plan_tools(
         response = await generate_for_agent_async(
             messages=messages,
             tools=self.all_tools,
-            thinking="disabled",
+            # thinking="disabled",
         )

         # Extract tool calls from the response
@@ -231,7 +231,7 @@ async def execute_tool_calls_parallel(
         for call in tool_calls:
             function_name = call.get("function", {}).get("name")
             function_args = call.get("function", {}).get("arguments", {})
-
+            # self.logger.info(f"Tool call {function_name} args {function_args}")
             if function_name:
                 task = self.tools_executor.run_async(function_name, function_args)
                 tasks.append((function_name, task))
@@ -358,12 +358,12 @@ async def validate_and_filter_tool_results(
         user_template = prompts.get("user", "")

         # Format tool calls summary
-        tool_calls_summary = []
-        for call in tool_calls:
-            tool_calls_summary.append({
-                "tool_name": call.get("function", {}).get("name"),
-                "parameters": call.get("function", {}).get("arguments", {})
-            })
+        # tool_calls_summary = []
+        # for call in tool_calls:
+        #     tool_calls_summary.append({
+        #         "tool_name": call.get("function", {}).get("name"),
+        #         "parameters": call.get("function", {}).get("arguments", {})
+        #     })

         # Format tool results summary
         results_summary = []
@@ -378,7 +378,7 @@ async def validate_and_filter_tool_results(
         user_prompt = user_template.format(
             original_query=intent.original_query,
             enhanced_query=intent.enhanced_query,
-            tool_calls=json.dumps(tool_calls_summary, ensure_ascii=False, indent=2),
+            # tool_calls=json.dumps(tool_calls_summary, ensure_ascii=False, indent=2),
             tool_results=json.dumps(results_summary, ensure_ascii=False, indent=2),
         )

@@ -387,10 +387,10 @@ async def validate_and_filter_tool_results(
         messages = [{"role": "system", "content": system_prompt}]

         # Add user's chat history to give LLM context awareness
-        if existing_context.chat_history:
-            recent_messages = existing_context.chat_history[-10:]  # Last 10 messages
-            for msg in recent_messages:
-                messages.append({"role": msg.role, "content": msg.content})
+        # if existing_context.chat_history:
+        #     recent_messages = existing_context.chat_history[-10:]  # Last 10 messages
+        #     for msg in recent_messages:
+        #         messages.append({"role": msg.role, "content": msg.content})

         messages.append({"role": "user", "content": user_prompt})

@@ -405,17 +405,27 @@ async def validate_and_filter_tool_results(
         validation_result = parse_json_from_response(response)

         # Extract relevant result IDs
-        relevant_ids = set(validation_result.get("relevant_result_ids", []))

-        # Filter relevant context items
-        relevant_items = [
-            item for item in tool_results
-            if item.id in relevant_ids
-        ]
+        # Fallback: if no valid IDs found, return all results to avoid data loss
+        if 'relevant_result_ids' not in validation_result:
+            self.logger.warning(
+                "No relevant_result_ids found in validation response. "
+                "Returning all results as fallback."
+            )
+            relevant_items = tool_results
+        else:
+            relevant_ids = set(validation_result.get("relevant_result_ids", []))
+            # Filter relevant context items
+            relevant_items = [
+                item for item in tool_results
+                if item.id in relevant_ids
+            ]
+            # self.logger.info(f"Filtered to {len(relevant_items)}/{len(tool_results)} relevant items")

         # Build validation message for conversation history
         validation_message = {
             "role": "assistant",
-            "content": f"Tool validation result:\n{json.dumps(validation_result, ensure_ascii=False, indent=2)}"
+            "content": f"Filtered {len(relevant_items)}/{len(tool_results)} relevant results"
         }

         return relevant_items, validation_message
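The key behavioral change in the last hunk is the fallback: when the LLM reply lacks `relevant_result_ids`, every result is kept instead of everything being silently dropped (the old `.get(..., [])` produced an empty set and filtered all items away). A condensed sketch of that logic, with plain string IDs standing in for the repo's context-item objects:

```python
# Condensed sketch of the new filtering logic in
# validate_and_filter_tool_results: missing key => keep everything.
def filter_results(validation_result: dict, tool_results: list[str]) -> list[str]:
    """Filter tool results by LLM-selected IDs, with a lossless fallback."""
    if "relevant_result_ids" not in validation_result:
        # Fallback: no valid IDs found, return all results to avoid data loss.
        return list(tool_results)
    relevant_ids = set(validation_result["relevant_result_ids"])
    # Keep only the results the LLM marked as relevant, in original order.
    return [r for r in tool_results if r in relevant_ids]
```

Note the fallback triggers only on a missing key; an explicitly empty list still filters everything out, matching the prompt's "all results are not relevant" case.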
