feat(training-export): overhaul trigger system and message conversion #79703
wzhgba wants to merge 1 commit
Codex review: needs real behavior proof before merge.
Reproducibility: Not applicable. This is a new feature PR rather than a bug report, and the missing evidence is after-fix real behavior proof from a real setup.
Best possible solution: Land only after the branch is rebased to current main, uses the current Pi package scope, passes core typechecking, encodes an approved retention/redaction contract, and includes redacted real-runtime proof for each trigger path.
Is this the best way to solve the issue? No for merge as-is. The trajectory-first direction may be viable, but the patch must be updated to current dependency names and must encode an explicit privacy/retention policy before the core append path ships.
Overall correctness: patch is incorrect.
Codex review notes: model gpt-5.5, reasoning high; reviewed against 83b8289ee274.
Summary
Introduce a trajectory-first, trigger-driven training export system that produces episode-level JSONL data from the OpenClaw runtime, with no offline reconstruction and no separate pipeline. The system is opt-in (`trainingExport.enabled: true`) and writes to `~/.openclaw/training-export/episodes.jsonl`.
Each line is a self-contained training sample: a task episode (full agent turn with system prompt, messages, tools, metadata) or a compact-summary episode (compression prompt → summary pair for RL compaction training).
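As a rough illustration of the two episode kinds described above, the sketch below models one JSONL line per episode. The field names (`kind`, `systemPrompt`, `prompt`, `summary`, and so on) are assumptions chosen for readability, not the schema this PR actually writes.

```typescript
// Hypothetical episode shapes; field names are illustrative only.
type SketchMessage = { role: "system" | "user" | "assistant" | "tool"; content: string };

interface TaskEpisodeSketch {
  kind: "task";
  systemPrompt: string;
  messages: SketchMessage[];
  tools: unknown[];
  metadata: Record<string, unknown>;
}

interface CompactSummaryEpisodeSketch {
  kind: "compact_summary";
  prompt: string; // the compression prompt
  summary: string; // the summary produced for that prompt
  metadata: Record<string, unknown>;
}

type EpisodeSketch = TaskEpisodeSketch | CompactSummaryEpisodeSketch;

// One episode per line: JSON.stringify never emits raw newlines,
// so each serialized episode stays on a single line.
function toJsonlLine(episode: EpisodeSketch): string {
  return JSON.stringify(episode) + "\n";
}

// Parse a JSONL blob back into episodes, skipping blank lines.
function parseJsonl(text: string): EpisodeSketch[] {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as EpisodeSketch);
}
```

Because every line parses on its own, a consumer can stream the file and skip or sample episodes without loading the whole export.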
Relationship to Existing Systems
`/export-trajectory` (human-facing debug bundles)
The existing `/export-trajectory` command (docs at `docs/tools/trajectory.md`) produces redacted interactive support bundles for human debugging: prompt timelines, tool traces, transcript snapshots, usage metadata. It is triggered manually by users or support staff.
The training export introduced here is complementary and non-overlapping.
Both systems read from the same trajectory (trajectory capture / `cache-trace`). Training export simply produces a different output format for a different consumer, alongside the existing mechanism.
Compaction subsystem
Training export hooks into the Pi SDK compaction lifecycle (`session_before_compact` + `session_compact`) to capture:
- `compaction` metadata (`tokensBefore`, `firstKeptEntryId`, `fromExtension`)
This is the same data the compaction system already computes internally; training export just persists it in a structured training format before it is discarded.
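The two-phase capture can be sketched as follows. This is a minimal stand-in, not the Pi SDK extension API (which this description does not show): the `CompactionCapture` class, its method names, and the event shapes are hypothetical, and only the metadata field names (`tokensBefore`, `firstKeptEntryId`, `fromExtension`) come from the text above.

```typescript
// Hypothetical event shapes; only the metadata field names come from the PR text.
interface CompactionMetaSketch {
  tokensBefore: number;
  firstKeptEntryId: string;
  fromExtension: boolean;
}
type BeforeCompactEventSketch = { sessionId: string; messages: string[] };
type CompactEventSketch = { sessionId: string; summary: string; compaction: CompactionMetaSketch };

interface CapturedCompaction {
  sessionId: string;
  messagesBefore: string[];
  summary: string;
  compaction: CompactionMetaSketch;
}

class CompactionCapture {
  // Snapshots taken in the "before" phase, keyed by session.
  private pending = new Map<string, string[]>();
  readonly captured: CapturedCompaction[] = [];

  // Phase 1 (session_before_compact): copy the state that compaction will discard.
  onBeforeCompact(event: BeforeCompactEventSketch): void {
    this.pending.set(event.sessionId, [...event.messages]);
  }

  // Phase 2 (session_compact): pair the snapshot with the summary and metadata.
  onCompact(event: CompactEventSketch): void {
    const messagesBefore = this.pending.get(event.sessionId);
    this.pending.delete(event.sessionId);
    if (!messagesBefore) return; // no matching snapshot, nothing to export
    this.captured.push({
      sessionId: event.sessionId,
      messagesBefore,
      summary: event.summary,
      compaction: event.compaction,
    });
  }
}
```

Keeping the snapshot in the "before" handler is what lets a single pair of handlers serve every compaction mode: the capture does not need to know which path triggered the compaction.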
Key Design Decisions
1. Trajectory-first
All training fields (system prompt, messages, tools, model metadata) come from runtime trajectory `context.compiled` events. Message and tool conversion delegates to the Pi SDK / provider layer (`convertMessages` from `@mariozechner/pi-ai/openai-completions`).
2. Unified compaction hook
A single Pi SDK extension (`session_before_compact` + `session_compact`) handles all compaction modes (default, safeguard, manual, overflow, timeout). No `runTrainingExport` calls are scattered across individual compaction paths.
3. Pair-export guarantee
For compaction-triggered exports, task and summary episodes are built as a batch. If either is filtered by quality checks, the entire batch is discarded — no orphaned episodes.
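The all-or-nothing rule can be sketched in a few lines. The `isUsableSketch` check below is a placeholder for whatever quality checks the export actually applies, and the type and function names are hypothetical.

```typescript
type BatchMessage = { role: string; content: string };
type BatchEpisode = { kind: "task" | "compact_summary"; messages: BatchMessage[] };

// Placeholder quality gate (an assumption): at least one user and one assistant message.
function isUsableSketch(episode: BatchEpisode): boolean {
  const hasUser = episode.messages.some((m) => m.role === "user");
  const hasAssistant = episode.messages.some((m) => m.role === "assistant");
  return hasUser && hasAssistant;
}

// Return the whole batch if every episode passes, otherwise nothing.
// Filtering episode by episode would allow a task episode to be written
// without its summary episode, or the reverse.
function finalizeBatch(batch: BatchEpisode[]): BatchEpisode[] {
  return batch.every(isUsableSketch) ? batch : [];
}
```

Writing the result of `finalizeBatch` in one step, rather than appending each episode as it is built, is what rules out orphaned episodes in this sketch.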
4. Config-gated at every call site
`getTrainingExportConfig(cfg)?.enabled === true` is checked at all three entry points (extension registration, session reset, trajectory export command), so reviewers can see the opt-in gating logic without digging into implementation details.
5. `compactionSummary` bridging
Pi SDK's `convertToLlm` converts `compactionSummary` → `user` messages, but the upstream `convertMessages` from `@mariozechner/pi-ai/openai-completions` does not handle the `compactionSummary` role. A pre-processing step (sharing a single map pass with thinking-block stripping) mirrors Pi SDK's conversion format before handing off to the upstream converter.
6. Training-quality message filtering (all triggers)
Training episodes must end with a complete `assistant` message, regardless of trigger type. Any snapshot (compaction, reset, or trajectory export) may end mid-turn at a non-`assistant` message (e.g. `toolResult`). Trailing non-assistant messages are trimmed from every trigger's output. The `trainExampleMessagesAreUsable` check requires ≥1 user + ≥1 assistant message; if trimming leaves the episode unusable, it is discarded entirely. This is a universal training-data quality requirement, not a compaction-specific behavior.
7. Reset export is independent of plugin hooks
The `before_reset` training export call is placed outside `emitGatewayBeforeResetPluginHook`, so it fires regardless of whether any `before_reset` plugin hooks are registered.
8. Private file permissions
The export directory (`~/.openclaw/training-export`) and JSONL file are created with private filesystem modes (`0o700` and `0o600`, respectively) to prevent world-readable access to training data.
Files Changed
- `src/training-export.ts`
- `src/training-export.test.ts`
- `docs/training-export.md`
- `src/config/types.openclaw.ts`: `trainingExport` config type (`enabled`, `compat`)
- `src/config/zod-schema.ts`: `trainingExport` schema
- `src/config/schema.help.ts`
- `src/config/schema.labels.ts`
- `src/agents/pi-embedded-runner/extensions.ts`
- `src/gateway/session-reset-service.ts`: `before_reset` trigger (config-gated, outside hook function)
- `src/auto-reply/reply/commands-export-trajectory.ts`: `trajectory_export` trigger (config-gated, alongside existing command)
- `src/agents/openai-transport-stream.ts`: `convertResponsesMessages` for use in conversion pipeline
Configuration
When `enabled` is `false` (the default), the extension is not registered and `runTrainingExport` is never called, so the feature adds zero overhead.
How to Test
1. Set `trainingExport.enabled: true`.
2. After a compaction, inspect `~/.openclaw/training-export/episodes.jsonl`: it should contain paired task + summary episodes with `compaction` metadata.
3. Run `/export-trajectory`: it should produce a task episode via the training export path as well.
4. Set `trainingExport.enabled: false`: the episodes file should receive no new entries.
Open Questions for Review
- Training export is disabled by default (`enabled: false`). Is this the right default, or should we consider a different approach?
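To make the default-off question concrete, here is a small sketch of how a strict `=== true` gate behaves across config shapes. The config types and the two function names are assumptions; only the `getTrainingExportConfig(cfg)?.enabled === true` expression comes from this description.

```typescript
// Hypothetical config shapes; the real types live in src/config/types.openclaw.ts.
interface TrainingExportConfigSketch {
  enabled?: boolean;
  compat?: unknown;
}
interface AppConfigSketch {
  trainingExport?: TrainingExportConfigSketch;
}

// Stand-in for getTrainingExportConfig(cfg).
function getTrainingExportConfigSketch(
  cfg: AppConfigSketch | undefined,
): TrainingExportConfigSketch | undefined {
  return cfg?.trainingExport;
}

// Strict comparison: a missing config, a missing section, or an unset
// `enabled` field all leave the feature off.
function isTrainingExportEnabled(cfg: AppConfigSketch | undefined): boolean {
  return getTrainingExportConfigSketch(cfg)?.enabled === true;
}
```

Under this gate, only an explicit `enabled: true` turns the export on, so an upgrade that introduces the config section does not start writing training data by itself.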