feat(memory): batch agent experience consolidation by huangruiteng · Pull Request #2201 · volcengine/OpenViking

huangruiteng · 2026-05-22T18:14:31Z

Summary

Adds an opt-in batch mode for agent experience consolidation after trajectory extraction, plus low-cardinality phase telemetry for agent-memory extraction.

Today, agent memory extraction writes trajectories first and then runs one experience-consolidation LLM pass per newly written trajectory. That keeps the flow simple, but corpus preparation becomes slow when a session produces several trajectories. This PR keeps the existing per-trajectory behavior as the default and adds a bounded batch mode:

memory.agent_experience_consolidation_mode = "batch"
memory.agent_experience_batch_max_trajectories = 5

The quality target is parity with the existing experience granularity. Batch mode is not intended to force one experience per source trajectory.
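To make the "bounded" part concrete, here is a minimal sketch of how a session's newly written trajectories could be grouped into consolidation calls. It is an illustration under stated assumptions, not the code in this PR: `plan_consolidation_calls` and its arguments are hypothetical names, and only the two mode values and the batch limit come from the description above.

```python
from typing import List, Sequence


def plan_consolidation_calls(
    trajectory_uris: Sequence[str],
    mode: str = "per_trajectory",
    max_batch: int = 5,
) -> List[List[str]]:
    """Group trajectories into the inputs of individual consolidation calls.

    Hypothetical helper: each inner list is the input of one LLM pass.
    """
    if mode == "per_trajectory":
        # Default behavior: one consolidation pass per trajectory.
        return [[uri] for uri in trajectory_uris]
    if mode != "batch":
        raise ValueError(f"unknown consolidation mode: {mode!r}")
    if max_batch < 1:
        raise ValueError("max_batch must be at least 1")
    # Batch mode: consecutive chunks of at most max_batch trajectories.
    return [
        list(trajectory_uris[i : i + max_batch])
        for i in range(0, len(trajectory_uris), max_batch)
    ]
```

With seven trajectories and `max_batch=5`, this sketch yields one call with five trajectories and one with two, so no single call ever sees more than the configured limit.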

Source Attribution Safety

Batch mode must not reuse the single-trajectory fallback, because that can attach an entire mixed batch to one experience card. This PR adds a temporary source_trajectory_ids field only for the batch extraction schema. The provider resolves it into concrete trajectory URIs before apply and strips the field so it is not persisted.

If a batch output omits source attribution for a written/edited experience, the system skips appending source trajectories instead of attaching the whole batch. A single experience may still cite multiple source trajectories; the attribution field is lineage, not a split instruction.
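The attribution rule above can be sketched as follows. This is a hedged illustration, not the provider's actual implementation: `resolve_sources`, the shape of the experience dict, and the `id_to_uri` mapping are assumptions; only the `source_trajectory_ids` field name and the skip-on-missing behavior come from the PR text.

```python
from typing import Any, Dict, List, Optional


def resolve_sources(
    experience: Dict[str, Any],
    id_to_uri: Dict[str, str],
) -> Optional[List[str]]:
    """Resolve temporary batch IDs into trajectory URIs and strip the field.

    Returns None when attribution is missing, so the caller can skip the
    source append instead of attaching the whole batch.
    """
    # pop() removes the temporary field so it is never persisted.
    ids = experience.pop("source_trajectory_ids", None)
    if not ids:
        return None
    # Keep only IDs that belong to this batch; unknown IDs are ignored
    # (an assumption of this sketch).
    uris = [id_to_uri[i] for i in ids if i in id_to_uri]
    return uris or None


def sources_to_append(
    experience: Dict[str, Any],
    id_to_uri: Dict[str, str],
) -> List[str]:
    """Never fall back to every trajectory in the batch."""
    resolved = resolve_sources(experience, id_to_uri)
    return resolved if resolved is not None else []
```

An experience that cites two IDs gets exactly those two URIs, while one with no attribution gets an empty list rather than every trajectory in the batch.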

Defaults and Compatibility

Default behavior remains unchanged:

- `agent_experience_consolidation_mode` defaults to `per_trajectory`
- existing per-trajectory source fallback remains intact
- batch mode is opt-in and bounded by `agent_experience_batch_max_trajectories`
- telemetry is additive and uses fixed phase buckets rather than URI/task-specific metric keys
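As an illustration of why fixed phase buckets keep metric cardinality low, here is a small sketch of a phase timer. It is not the telemetry code in `openviking/telemetry/operation.py`: the class, the phase names, and the `other` bucket are all hypothetical.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Hypothetical phase names; the real bucket names are defined by the PR.
PHASES = frozenset({"trajectory_extract", "experience_consolidate", "apply"})


class PhaseTimer:
    """Accumulate wall time per fixed phase bucket."""

    def __init__(self) -> None:
        self.totals_ms = defaultdict(float)
        self.counts = defaultdict(int)

    @contextmanager
    def phase(self, name: str):
        # Anything outside the fixed set collapses into one bucket, so a
        # URI or task ID can never become its own metric key.
        key = name if name in PHASES else "other"
        start = time.perf_counter()
        try:
            yield
        finally:
            self.totals_ms[key] += (time.perf_counter() - start) * 1000.0
            self.counts[key] += 1
```

However many trajectories or tasks a run touches, the set of metric keys in this sketch stays bounded by `PHASES` plus `other`.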

Validation Signal

Measured with a small TAU-2 airline corpus-prep rerun on the same cached train transcripts. This is not a benchmark-score claim; it only validates write-time consolidation behavior on realistic multi-step sessions.

Setup: 8 airline train tasks, success-only commit policy; 7 successful sessions committed and 1 failed session skipped. Counts exclude .abstract.md / .overview.md.

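| mode | wall time | trajectories | experiences | consolidation calls |
| --- | --- | --- | --- | --- |
| per-trajectory | 744s | 12 | 11 | 10 single experience consolidation calls, ~165s |
| batch max5 | 661s | 14 | 10 | 4 batch + 3 single consolidation calls, ~135s |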
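For readers who want the relative numbers, the arithmetic on the table above works out as follows. The inputs are copied from the table; the consolidation times are the approximate (~) figures, so the derived percentages are approximate too.

```python
def pct_reduction(before: float, after: float) -> float:
    """Relative reduction from before to after, in percent."""
    return 100.0 * (before - after) / before


# Values copied from the table above.
wall_saving_s = 744 - 661                            # 83 s
wall_saving_pct = pct_reduction(744, 661)            # about 11.2 %
consolidation_saving_s = 165 - 135                   # 30 s
consolidation_saving_pct = pct_reduction(165, 135)   # about 18.2 %
calls_per_trajectory_mode = 10
calls_batch_mode = 4 + 3                             # 7 calls
```

The consolidation phase accounts for roughly 30s of the 83s wall-time difference on this sample; the table does not break down the remainder.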

Quality read: the batch output stayed close in durable experience count on this sample and did not show obvious whole-batch-to-one-card source misattribution. The remaining quality risk is mild granularity drift between related cards, so the feature stays opt-in and bounded while telemetry makes the write bottleneck observable.

TAU-2 benchmark config in this branch defaults corpus preparation to batch mode and records the expected server memory config in run artifacts. --strict-preflight checks OPENVIKING_CONFIG_FILE / ~/.openviking/ov.conf so evidence runs fail fast if the server is still using per-trajectory consolidation.
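A preflight of this kind could look roughly like the sketch below. It is an assumption-heavy illustration: the function names are hypothetical, the precedence of the environment variable over the home-directory file is assumed, and the check operates on an already parsed config mapping with a top-level `memory` section, because the PR text does not state the on-disk format.

```python
from pathlib import Path
from typing import Any, List, Mapping


def resolve_config_path(env: Mapping[str, str], home: Path) -> Path:
    """Prefer OPENVIKING_CONFIG_FILE, else ~/.openviking/ov.conf (assumed order)."""
    override = env.get("OPENVIKING_CONFIG_FILE")
    return Path(override) if override else home / ".openviking" / "ov.conf"


def preflight_errors(
    config: Mapping[str, Any],
    expected_mode: str = "batch",
) -> List[str]:
    """Return human-readable problems; an empty list means the check passed."""
    memory = config.get("memory") or {}
    # A missing key means the documented default, per_trajectory.
    mode = memory.get("agent_experience_consolidation_mode", "per_trajectory")
    if mode != expected_mode:
        return [
            "memory.agent_experience_consolidation_mode is "
            f"{mode!r}, expected {expected_mode!r}"
        ]
    return []
```

A caller would typically pass `os.environ` and `Path.home()` to `resolve_config_path`, parse the file, and abort the evidence run if `preflight_errors` returns anything.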

Tests

uv run ruff format --check openviking/session/compressor_v2.py openviking/session/memory/agent_experience_context_provider.py openviking/session/memory/batch_agent_experience_context_provider.py openviking/telemetry/operation.py openviking_cli/utils/config/memory_config.py tests/session/memory/test_agent_experience_context_provider.py tests/session/memory/test_compressor_v2.py tests/session/test_session_commit.py tests/test_telemetry_runtime.py
uv run ruff check openviking/session/compressor_v2.py openviking/session/memory/agent_experience_context_provider.py openviking/session/memory/batch_agent_experience_context_provider.py openviking/telemetry/operation.py openviking_cli/utils/config/memory_config.py tests/session/memory/test_agent_experience_context_provider.py tests/session/memory/test_compressor_v2.py tests/session/test_session_commit.py tests/test_telemetry_runtime.py
uv run --group dev --with pytest-asyncio --with pytest-cov pytest tests/session/memory/test_agent_experience_context_provider.py tests/session/memory/test_compressor_v2.py::TestCompressorV2::test_extract_phase_runs_post_apply_before_lock_release tests/session/memory/test_compressor_v2.py::TestCompressorV2::test_agent_memory_batch_experience_respects_batch_size tests/session/memory/test_compressor_v2.py::TestCompressorV2::test_agent_memory_batch_experience_keeps_sources_separate_per_experience tests/session/memory/test_compressor_v2.py::TestCompressorV2::test_agent_memory_batch_experience_skips_source_append_without_attribution tests/session/test_session_commit.py::TestCommit::test_commit_extracts_memories tests/test_telemetry_runtime.py::test_telemetry_summary_includes_agent_memory_phase_metrics -q

github-actions · 2026-05-22T18:15:42Z

PR Reviewer Guide 🔍

(Review updated until commit `1ae63ba`)

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🏅 Score: 85
🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 No multiple PR themes
⚡ No major issues detected

github-actions · 2026-05-22T18:16:34Z

PR Code Suggestions ✨

No code suggestions found for the PR.

github-actions · 2026-05-23T17:21:40Z

Persistent review updated to latest commit 1ae63ba

feat: batch agent experience consolidation

f690c88

github-project-automation Bot added this to OpenViking project May 22, 2026

github-project-automation Bot moved this to Backlog in OpenViking project May 22, 2026

huangruiteng added 17 commits May 23, 2026 02:28

style: format batch experience tests

f33643e

test: cover batch experience chunk sizing

309f916

fix: derive batch experience prompt from single provider

bf6acfb

style: format batch experience prompt adapter

098e50b

feat(memory): align experience prompt with atomic intent

f0de40e

fix(memory): match atomic experience prompt archive

f8dacfa

fix: satisfy batch experience lint

76ebbf6

fix: preserve batch experience granularity

d13ccbf

style: format batch experience test

dc03db1

feat: expose agent memory phase telemetry

941fa18

feat: surface commit telemetry in benchmark manifests

abe602b

fix: preserve batch action boundaries

8cce894

feat: audit experience corpus quality

3fe7bc5

chore: trim batch experience diagnostics

223a473

chore: tighten batch prompt adapter

cf6a7c7

chore: default TAU corpus prep to batch mode

beaf4b9

chore: format TAU batch eval config

1ae63ba

huangruiteng marked this pull request as ready for review May 23, 2026 17:20

huangruiteng added 2 commits May 24, 2026 01:35

chore: keep batch PR focused on core memory

9b109ce

Revert "chore: keep batch PR focused on core memory"

397764e

This reverts commit 9b109ce.

qin-ctx requested a review from chenjw May 25, 2026 03:32

huangruiteng wants to merge 20 commits into volcengine:main from huangruiteng:feat/batch-experience-consolidation
