
feat(training-export): overhaul trigger system and message conversion #79703

Draft
wzhgba wants to merge 1 commit into openclaw:main from SenseTime-FVG:wuzehuan/training-export

Conversation


@wzhgba wzhgba commented May 9, 2026

Draft / work-in-progress — this PR is under active development. Feedback welcome on the overall direction.

Summary

Introduce a trajectory-first, trigger-driven training export system that produces episode-level JSONL data from the OpenClaw runtime — no offline reconstruction, no separate pipeline. The system is opt-in (trainingExport.enabled: true) and writes to:

~/.openclaw/training-export/episodes.jsonl

Each line is a self-contained training sample: a task episode (full agent turn with system prompt, messages, tools, metadata) or a compact-summary episode (compression prompt → summary pair for RL compaction training).
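The two episode shapes can be sketched as TypeScript types. Only the fields named in this PR body (system prompt, messages, tools, metadata, and the prompt/summary pair) are grounded; the exact field and discriminator names below are illustrative assumptions, not the module's actual types.

```typescript
// Illustrative sketch of the two episode shapes written to episodes.jsonl.
// Field names beyond those stated in the PR body are assumptions.
type TaskEpisode = {
  kind: "task";
  systemPrompt: string;
  messages: unknown[]; // full agent turn in provider message format
  tools: unknown[];
  metadata: Record<string, unknown>;
};

type SummaryEpisode = {
  kind: "compact-summary";
  prompt: string;  // compression prompt sent to the summarization model
  summary: string; // summary the model produced
  metadata: Record<string, unknown>;
};

type Episode = TaskEpisode | SummaryEpisode;

// Each episode becomes exactly one JSON line (JSONL).
function toJsonlLine(episode: Episode): string {
  return JSON.stringify(episode);
}
```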

Relationship to Existing Systems

/export-trajectory (human-facing debug bundles)

The existing /export-trajectory command (docs at docs/tools/trajectory.md) produces redacted interactive support bundles for human debugging — prompt timelines, tool traces, transcript snapshots, usage metadata. It is triggered manually by users or support staff.

The training export introduced here is complementary and non-overlapping:

| | /export-trajectory | Training Export |
|---|---|---|
| Purpose | Human debugging, support | Machine training data |
| Trigger | Manual command | Automatic (compaction, reset, export command) |
| Output | Redacted bundle directory | JSONL episodes |
| Format | Directory of text/markdown files | One JSON line per episode |
| Privacy | Redacted (best-effort) | Full content (machine-consumed; opt-in) |
| Audience | Developers, support | RL training pipelines |

Both systems read from the same trajectory (trajectory capture / cache-trace). Training export simply produces a different output format for a different consumer, alongside the existing mechanism.

Compaction subsystem

Training export hooks into the Pi SDK compaction lifecycle (session_before_compact + session_compact) to capture:

  • Pre-compaction context (task episode): the full conversation before compression
  • Post-compaction summary (summary episode): the prompt sent to the summarization model + the summary it produced, with compaction metadata (tokensBefore, firstKeptEntryId, fromExtension)

This is the same data the compaction system already computes internally — training export just persists it in a structured training format before it is discarded.
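A minimal sketch of how the two hooks cooperate. The handler names match the Pi SDK events cited above, but the payload shapes and the `write` callback are assumptions for illustration, not the SDK's actual extension API:

```typescript
// Sketch of the unified compaction capture: the before-compact hook stashes
// the pre-compression conversation, and the compact hook persists both
// episodes together. Payload shapes here are assumptions.
type CompactionExtension = {
  session_before_compact: (ctx: { messages: unknown[] }) => void;
  session_compact: (ctx: { prompt: string; summary: string; tokensBefore: number }) => void;
};

function makeTrainingExportExtension(
  write: (episode: Record<string, unknown>) => void,
): CompactionExtension {
  let preCompactMessages: unknown[] = [];
  return {
    // Capture the full conversation before compression (task episode).
    session_before_compact: ({ messages }) => {
      preCompactMessages = messages;
    },
    // Persist the compression prompt and the produced summary (summary episode).
    session_compact: ({ prompt, summary, tokensBefore }) => {
      write({ kind: "task", messages: preCompactMessages });
      write({ kind: "compact-summary", prompt, summary, metadata: { tokensBefore } });
    },
  };
}
```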

Key Design Decisions

1. Trajectory-first

All training fields (system prompt, messages, tools, model metadata) come from runtime trajectory context.compiled events. Message and tool conversion delegates to the Pi SDK / provider layer (convertMessages from @mariozechner/pi-ai/openai-completions).

2. Unified compaction hook

A single Pi SDK extension (session_before_compact + session_compact) handles all compaction modes (default, safeguard, manual, overflow, timeout). No runTrainingExport calls scattered across individual compaction paths.

3. Pair-export guarantee

For compaction-triggered exports, task and summary episodes are built as a batch. If either is filtered by quality checks, the entire batch is discarded — no orphaned episodes.
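The guarantee reduces to an all-or-nothing append, roughly as below; `passesQualityChecks` and `append` are hypothetical stand-ins for the PR's filtering and JSONL-write logic:

```typescript
// All-or-nothing batch append: if any episode in a compaction batch fails
// the quality filter, the whole batch is discarded, so no orphaned task or
// summary episode is ever written. Both callbacks are hypothetical stand-ins.
function exportBatch<E>(
  episodes: E[],
  passesQualityChecks: (episode: E) => boolean,
  append: (episode: E) => void,
): boolean {
  if (!episodes.every(passesQualityChecks)) {
    return false; // discard the entire batch
  }
  for (const episode of episodes) append(episode);
  return true;
}
```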

4. Config-gated at every call site

getTrainingExportConfig(cfg)?.enabled === true is checked at all three entry points (extension registration, session reset, trajectory export command), so reviewers can see the opt-in gating logic without digging into implementation details.
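The gate itself reduces to one strict check. The config shape below mirrors the `trainingExport` block shown in the Configuration section; the surrounding types are simplified sketches:

```typescript
// Sketch of the opt-in gate checked at every entry point. The config shape
// mirrors the PR's trainingExport block; everything else is simplified.
type TrainingExportConfig = { enabled?: boolean; compat?: Record<string, unknown> };
type OpenClawConfig = { trainingExport?: TrainingExportConfig };

function getTrainingExportConfig(cfg: OpenClawConfig): TrainingExportConfig | undefined {
  return cfg.trainingExport;
}

function trainingExportEnabled(cfg: OpenClawConfig): boolean {
  // Strict === true, so a missing or malformed config can never enable export.
  return getTrainingExportConfig(cfg)?.enabled === true;
}
```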

5. compactionSummary bridging

Pi SDK's convertToLlm converts compactionSummary user messages, but the upstream convertMessages from @mariozechner/pi-ai/openai-completions does not handle the compactionSummary role. A pre-processing step (sharing a single map with thinking-block stripping) mirrors Pi SDK's conversion format before handing off to the upstream converter.

6. Training-quality message filtering (all triggers)

Training episodes must end with a complete assistant message — regardless of trigger type. Any snapshot (compaction, reset, or trajectory export) may end mid-turn at a non-assistant message (e.g. toolResult). Trailing non-assistant messages are trimmed from every trigger's output. The trainExampleMessagesAreUsable check requires ≥1 user + ≥1 assistant; if trimming leaves the episode unusable, it is discarded entirely. This is a universal training-data quality requirement, not a compaction-specific behavior.
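The trim and the usability check can be sketched as follows; the message shape is simplified to just a role string:

```typescript
// Sketch of the trailing-message trim and usability check described above.
// Messages are reduced to a bare role for illustration.
type Msg = { role: "user" | "assistant" | "toolResult" | string };

// Trim trailing non-assistant messages so the episode ends on a complete
// assistant turn, regardless of which trigger produced the snapshot.
function trimToLastAssistant(messages: Msg[]): Msg[] {
  let end = messages.length;
  while (end > 0 && messages[end - 1].role !== "assistant") end--;
  return messages.slice(0, end);
}

// Mirrors the PR's requirement: at least one user and one assistant message
// must remain after trimming, otherwise the episode is discarded entirely.
function trainExampleMessagesAreUsable(messages: Msg[]): boolean {
  return (
    messages.some((m) => m.role === "user") &&
    messages.some((m) => m.role === "assistant")
  );
}
```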

7. Reset export is independent of plugin hooks

The before_reset training export call is placed outside emitGatewayBeforeResetPluginHook, so it fires regardless of whether any before_reset plugin hooks are registered.

8. Private file permissions

The export directory (~/.openclaw/training-export) and JSONL file are created with private filesystem modes (0o700 / 0o600) to prevent world-readable access to training data.

Files Changed

| File | Change |
|---|---|
| src/training-export.ts | New: core module (snapshot collection, episode construction, JSONL I/O, prompt constants, compaction extension) |
| src/training-export.test.ts | New: test suite |
| docs/training-export.md | New: formal feature documentation |
| src/config/types.openclaw.ts | Add trainingExport config type (enabled, compat) |
| src/config/zod-schema.ts | Add trainingExport schema |
| src/config/schema.help.ts | Add field help text |
| src/config/schema.labels.ts | Add field labels |
| src/agents/pi-embedded-runner/extensions.ts | Register compaction extension (config-gated, opt-in) |
| src/gateway/session-reset-service.ts | before_reset trigger (config-gated, outside hook function) |
| src/auto-reply/reply/commands-export-trajectory.ts | trajectory_export trigger (config-gated, alongside existing command) |
| src/agents/openai-transport-stream.ts | Minor: export convertResponsesMessages for use in conversion pipeline |

Configuration

```yaml
trainingExport:
  enabled: true            # default: false (opt-in)
  compat: {}               # optional ModelCompatConfig override for export path
```

When enabled is false (the default), the extension is not registered and runTrainingExport is never called — zero overhead.

How to Test

  1. Enable via trainingExport.enabled: true
  2. Trigger a compaction in a session long enough to exceed the context threshold
  3. Check ~/.openclaw/training-export/episodes.jsonl — should contain paired task + summary episodes with compaction metadata
  4. Reset a session — should produce a task episode
  5. Run /export-trajectory — should produce a task episode via the training export path as well
  6. Disable via trainingExport.enabled: false — episodes file should receive no new entries

Open Questions for Review

  1. Default to opt-in (enabled: false) — is this the right default, or should we consider a different approach?
  2. Privacy and retention policy — the training export writes full (non-redacted) session content to disk. Should there be a retention/cleanup mechanism?

@openclaw-barnacle openclaw-barnacle Bot added docs Improvements or additions to documentation gateway Gateway runtime agents Agent runtime and tooling size: XL triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 9, 2026
Contributor

clawsweeper Bot commented May 9, 2026

Codex review: needs real behavior proof before merge.

Summary
The PR adds an opt-in core training export config, JSONL episode writer, compaction/reset/trajectory trigger hooks, documentation, and unit tests.

Reproducibility: Do we have a high-confidence way to reproduce the issue? Not applicable: this is a new feature PR rather than a bug report, and the missing evidence is after-fix real behavior proof from a real setup.

Real behavior proof
No after-fix compaction/reset/trajectory JSONL output, terminal output, recording, linked artifact, or redacted runtime log is attached; the contributor should add redacted proof and update the PR body for re-review.

Next step before merge
Human review is needed because the PR is draft/conflicting, lacks contributor real behavior proof, and needs a maintainer-approved privacy/retention contract before automation should attempt repair.

Security
Needs attention: the patch adds a persistent unredacted session-content export path before the retention, cleanup, sharing, and redaction boundary is settled.

Review findings

  • [P1] Use the current Pi package scope — src/training-export.ts:12-15
  • [P1] Remove the unused pre-compact config read — src/training-export.ts:1082
  • [P2] Define retention before appending training exports — src/training-export.ts:824
Review details

Best possible solution:

Land only after the branch is rebased to current main, uses the current Pi package scope, passes core typechecking, encodes an approved retention/redaction contract, and includes redacted real-runtime proof for each trigger path.

Do we have a high-confidence way to reproduce the issue?

Not applicable: this is a new feature PR rather than a bug report, and the missing evidence is after-fix real behavior proof from a real setup.

Is this the best way to solve the issue?

No for merge as-is; the trajectory-first direction may be viable, but the patch must be updated to the current dependency names and gain an explicit privacy/retention policy before the core append path ships.

Full review comments:

  • [P1] Use the current Pi package scope — src/training-export.ts:12-15
    Current main depends on and imports @earendil-works/pi-*, but this new core file imports @mariozechner/pi-*. Those modules are not declared in the current package contract, so the branch will fail typecheck/runtime resolution after rebasing unless these imports use the current scope.
    Confidence: 0.94
  • [P1] Remove the unused pre-compact config read — src/training-export.ts:1082
    config is assigned inside the session_before_compact handler but never read there. Because tsconfig.core.json enables noUnusedLocals for production src/**/*, this remains a core typecheck failure even after the package imports are fixed.
    Confidence: 0.9
  • [P2] Define retention before appending training exports — src/training-export.ts:824
    This append path persists trajectory-derived prompts, messages, tools, and metadata to a long-lived JSONL file while the PR still leaves retention and cleanup as an open question. The feature needs an agreed policy or cleanup mechanism before enabling users to accumulate unredacted session content indefinitely.
    Confidence: 0.86

Overall correctness: patch is incorrect
Overall confidence: 0.93

Security concerns:

  • [medium] Define training export retention — src/training-export.ts:824
    The new exporter appends trajectory-derived session content to a long-lived JSONL file under the OpenClaw state directory; private file modes help, but the PR still leaves retention and cleanup policy unresolved.
    Confidence: 0.86

What I checked:

  • Live PR state: GitHub reports this PR is open, draft, not maintainer-modifiable, mergeable=CONFLICTING, and at head 6ace302; the Real behavior proof check failed. (6ace302b5313)
  • Current main lacks the feature: No current-main source/docs/package matches were found for trainingExport, runTrainingExport, training-export, episodes.jsonl, or compactionTrainingExport, so the PR is not obsolete on main. (83b8289ee274)
  • PR imports obsolete Pi package scope: The new core module imports @mariozechner/pi-agent-core, @mariozechner/pi-ai, and @mariozechner/pi-coding-agent, but current main uses the @earendil-works scope. (src/training-export.ts:12, 6ace302b5313)
  • Current dependency contract: Current main declares @earendil-works/pi-agent-core, @earendil-works/pi-ai, @earendil-works/pi-coding-agent, and @earendil-works/pi-tui at 0.74.0; package and lock searches only show @mariozechner clipboard packages, not the PR's Pi imports. (package.json:1754, 83b8289ee274)
  • Core typecheck rejects unused locals: The production core TypeScript config enables noUnusedLocals and noUnusedParameters for src/**/*, so unused variables in src/training-export.ts are build blockers. (tsconfig.core.json:4, 83b8289ee274)
  • PR leaves an unused pre-compact variable: The session_before_compact handler assigns config from the runtime registry but never reads it in that handler, which will trip the current noUnusedLocals contract after imports are updated. (src/training-export.ts:1082, 6ace302b5313)

Likely related people:

  • steipete: Current-main blame and recent commits connect steipete to the Pi package-scope state, trajectory runtime/docs, and OpenAI transport surfaces that this PR extends. (role: recent area contributor; confidence: high; commits: 365c986a5b2a, 474bea162b4d, 1888242bd30a; files: package.json, src/trajectory/runtime.ts, docs/tools/trajectory.md)
  • bradhallett: Recent merged work changed Pi auto-compaction behavior in the same embedded extension factory area that this PR modifies. (role: adjacent compaction contributor; confidence: medium; commits: 0bdba47a3e89; files: src/agents/pi-embedded-runner/extensions.ts)
  • vincentkoc: Recent merged work touched provider runtime selection and harness extension boundaries, which are relevant to the core-vs-runtime integration shape of this PR. (role: adjacent runtime/config contributor; confidence: medium; commits: aa27e27f3606, 47f6a98909b5; files: src/agents/pi-embedded-runner/extensions.ts, src/config/types.openclaw.ts)

Remaining risk / open question:

  • The PR persists session-derived training data before the retention, cleanup, sharing, and redaction contract is approved.
  • No contributor-supplied real behavior proof shows compaction, reset, or /export-trajectory JSONL output from a real OpenClaw setup.
  • The branch is draft, conflicting, and not maintainer-modifiable, so normal review and repair paths are limited until the contributor updates it.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 83b8289ee274.

@wzhgba wzhgba force-pushed the wuzehuan/training-export branch from b675a0d to 4f6af34 Compare May 9, 2026 08:10
@openclaw-barnacle openclaw-barnacle Bot added the triage: refactor-only Candidate: refactor/cleanup-only PR without maintainer context. label May 9, 2026
@wzhgba wzhgba force-pushed the wuzehuan/training-export branch 3 times, most recently from 8649e3e to 9f06f41 Compare May 9, 2026 10:46
