feat: add Memory V2 full suite test #1354

Merged
qin-ctx merged 1 commit into volcengine:main from kaisongli:feat/memory_v2_test_suite
Apr 14, 2026

Conversation

@kaisongli
Collaborator

Description

Related Issue

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactoring (no functional changes)
  • Performance improvement
  • Test update

Changes Made

Testing

  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have tested this on the following platforms:
    • Linux
    • macOS
    • Windows

Checklist

  • My code follows the project's coding style
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

Screenshots (if applicable)

Additional Notes

@github-actions

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
🏅 Score: 85
🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 Multiple PR themes

Sub-PR theme: Add Memory V2 Full Suite Test

Relevant files:

  • tests/oc2ov_test/tests/p0/test_memory_v2_full_suite.py

Sub-PR theme: Update Test Utilities for CI and Timeouts

Relevant files:

  • tests/oc2ov_test/utils/test_utils.py
  • tests/oc2ov_test/utils/openclaw_cli_client.py

Sub-PR theme: Increase Workflow Timeouts

Relevant files:

  • .github/workflows/api_test.yml
  • .github/workflows/oc2ov_test.yml

⚡ Recommended focus areas for review

Flaky Test Due to Fixed Sleeps

The test uses fixed time.sleep() calls (2s and 10s) to wait for memory generation. This can cause flakiness if the memory generation takes longer than expected in different environments.

time.sleep(2)

# Step 2: run commit to trigger memory generation
print(f"\n[Step 2/5] Running commit to trigger memory generation")
commit_response = self.commit_memory()
if commit_response.get("status_code") == 200:
    print(f"✓ commit succeeded")
    result["steps"]["compact"] = "success"
else:
    print(f"✗ commit failed: {commit_response}")
    result["steps"]["compact"] = "failed"
print("  Waiting for memory files to be generated...")
time.sleep(10)

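
One possible way to address the fixed sleeps is to poll for the expected condition with an upper bound instead of waiting a fixed interval. The following is a minimal sketch, not code from this PR: `wait_until` and its `timeout` and `interval` parameters are hypothetical names introduced here for illustration.

```python
import time
from typing import Callable


def wait_until(predicate: Callable[[], bool],
               timeout: float = 60.0,
               interval: float = 1.0) -> bool:
    """Poll `predicate` until it returns True or `timeout` seconds elapse.

    Hypothetical helper (not part of the PR). Returns True as soon as the
    predicate succeeds, or False if the deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        time.sleep(min(interval, remaining))
```

In the test, the 10-second sleep could then become something like `wait_until(lambda: self.check_memory_files(scenario['memory_type'])["found"], timeout=60)`, assuming `check_memory_files` is cheap enough to call repeatedly. Fast environments would finish early, and slow ones would get more headroom before failing.
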
Inconsistent Step Numbering

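The step numbering in the test output is inconsistent: the first step is printed as [步骤 1/4] (step 1 of 4), while the remaining steps are printed as [步骤 2/5] through [步骤 5/5] (out of 5). The mismatched totals are confusing for anyone reading the test output.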

print(f"\n[Step 1/4] Sending test message: {scenario['test_message']}")
test_response = self.run_openclaw_command(scenario['test_message'])
print(f"✓ Message sent successfully")
result["steps"]["send_message"] = "success"
time.sleep(2)

# Step 2: run commit to trigger memory generation
print(f"\n[Step 2/5] Running commit to trigger memory generation")
commit_response = self.commit_memory()
if commit_response.get("status_code") == 200:
    print(f"✓ commit succeeded")
    result["steps"]["compact"] = "success"
else:
    print(f"✗ commit failed: {commit_response}")
    result["steps"]["compact"] = "failed"
print("  Waiting for memory files to be generated...")
time.sleep(10)

# Step 3: verify that memory files were generated
print(f"\n[Step 3/5] Verifying memory file generation")
memory_files_result = self.check_memory_files(scenario['memory_type'])
result["memory_files"] = memory_files_result

if memory_files_result["found"]:
    print(f"✓ Memory files verified")
    result["steps"]["memory_files"] = "success"
else:
    print(f"✗ Memory file verification failed")
    result["steps"]["memory_files"] = "failed"

# Step 4: ask the verification message
print(f"\n[Step 4/5] Asking verification message: {scenario['verify_message']}")
verify_response = self.run_openclaw_command(scenario['verify_message'])
print(f"✓ Verification message sent successfully")
result["steps"]["verify"] = "success"

# Step 5: verify keywords
print(f"\n[Step 5/5] Verifying keywords")

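
One way to keep the totals from drifting is to derive both the index and the total from a single list of steps rather than hard-coding them in each `print`. This is a sketch under that assumption, not the PR's implementation: `run_steps` and the `(label, action)` tuple shape are hypothetical.

```python
from typing import Callable, List, Tuple


def run_steps(steps: List[Tuple[str, Callable[[], None]]]) -> List[str]:
    """Run each step in order and print a '[Step i/N]' header for it.

    Hypothetical helper (not part of the PR). Both i and N come from the
    list itself, so adding or removing a step cannot leave a stale total.
    Returns the printed headers so callers can inspect them.
    """
    total = len(steps)
    headers = []
    for index, (label, action) in enumerate(steps, start=1):
        header = f"[Step {index}/{total}] {label}"
        print(f"\n{header}")
        headers.append(header)
        action()
    return headers
```

Each of the five phases in the test (send message, commit, check memory files, verify message, check keywords) would be passed as one entry in `steps`.
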
Inconsistent Pass Rate Requirement

The __main__ entry point requires a 70% pass rate, while the pytest test requires 100%. This inconsistency could lead to different outcomes when running the test directly vs via pytest.

    assert pass_rate == 1.0, f"Test pass rate is not 100%: {pass_rate*100:.1f}%, {results['summary']['failed']} scenario(s) failed"


if __name__ == "__main__":
    """Run the tests directly"""
    tester = MemoryV2TestSuite()
    results = tester.run_full_test_suite()

    # Print the detailed test report
    print("\n" + "="*60)
    print("Detailed test report")
    print("="*60)

    for scenario in results["scenarios"]:
        status_icon = "✓" if scenario["status"] == "passed" else "✗"
        print(f"\n{status_icon} {scenario['scenario']}: {scenario['status']}")
        print(f"  Keywords passed: {scenario.get('pass_count', 0)}/{scenario.get('total_keywords', 0)}")
        if scenario.get('keywords'):
            for kw in scenario['keywords']:
                kw_icon = "✓" if kw['found'] else "✗"
                print(f"    {kw_icon} {kw['keyword']}: {'found' if kw['found'] else 'not found'}")

    exit(0 if results['summary']['passed'] >= results['total_scenarios'] * 0.7 else 1)
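
A possible way to keep the two entry points aligned is to route both through one shared threshold. The sketch below assumes the stricter 100% requirement is the intended one, which the PR does not state. `REQUIRED_PASS_RATE`, `pass_rate`, `meets_threshold`, and `exit_code` are hypothetical names; the `results` keys are the ones used in the snippet above.

```python
from typing import Any, Dict

# Assumption: the pytest threshold (100%) is the intended requirement.
REQUIRED_PASS_RATE = 1.0


def pass_rate(results: Dict[str, Any]) -> float:
    """Fraction of scenarios that passed; 0.0 when there are no scenarios."""
    total = results["total_scenarios"]
    return results["summary"]["passed"] / total if total else 0.0


def meets_threshold(results: Dict[str, Any],
                    required: float = REQUIRED_PASS_RATE) -> bool:
    """True when the pass rate reaches the shared threshold."""
    return pass_rate(results) >= required


def exit_code(results: Dict[str, Any]) -> int:
    """Process exit code for direct runs: 0 on success, 1 otherwise."""
    return 0 if meets_threshold(results) else 1
```

The pytest test could then use `assert meets_threshold(results)` and the `__main__` block could end with `sys.exit(exit_code(results))`, so that a direct run and a pytest run reach the same verdict.
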

@github-actions

PR Code Suggestions ✨

No code suggestions found for the PR.

@kaisongli force-pushed the feat/memory_v2_test_suite branch 8 times, most recently from 7caba93 to bdaa384 (April 14, 2026 07:28)
@kaisongli force-pushed the feat/memory_v2_test_suite branch from bdaa384 to 2107cd8 (April 14, 2026 07:52)
@qin-ctx merged commit f1a8954 into volcengine:main (Apr 14, 2026)
6 checks passed
@github-project-automation bot moved this from Backlog to Done in OpenViking project (Apr 14, 2026)