Merged
Replace leftover ~/.claude-memory/ references with ~/.openexp/ so all OpenExp data lives under a single self-contained directory.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
anthroos
added a commit
that referenced
this pull request
Mar 22, 2026
- Q-update: EMA → additive (Q = clamp(Q + α*r, floor, ceiling))
- q_init: 0.5 → 0.0 (memories earn value from zero)
- q_ceiling: 1.0 added
- Outcome resolver: CRM CSV transitions → memory rewards
- client_id tagging on memories
- resolve CLI command
- session-end hook with retrieval reward loop
- 73/73 tests pass
Co-authored-by: Ivan Pasichnyk <ivanpasichnyk@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
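The additive rule named in this commit message, Q = clamp(Q + α*r, floor, ceiling), can be sketched as below. This is a minimal illustration, not the repository's code: the names `QConfig`, `additive_update`, and `ema_update` are hypothetical, and the values of `alpha` and `q_floor` are assumptions because the commit message does not state them. Only `q_init = 0.0` and `q_ceiling = 1.0` come from the message. The EMA function is the textbook form, included only for contrast.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QConfig:
    alpha: float = 0.25     # learning rate; value is an assumption
    q_init: float = 0.0     # from the commit message: memories start at zero
    q_floor: float = -0.5   # assumption: the commit names a floor but not its value
    q_ceiling: float = 1.0  # from the commit message


def clamp(x: float, lo: float, hi: float) -> float:
    """Restrict x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def additive_update(q: float, reward: float, cfg: QConfig = QConfig()) -> float:
    """Additive rule: Q = clamp(Q + alpha * r, floor, ceiling)."""
    return clamp(q + cfg.alpha * reward, cfg.q_floor, cfg.q_ceiling)


def ema_update(q: float, reward: float, alpha: float) -> float:
    """Textbook exponential moving average, shown only for contrast."""
    return (1.0 - alpha) * q + alpha * reward
```

Under the additive rule a memory starting at `q_init` accumulates value with each positive reward until it reaches the ceiling, whereas an EMA pulls the value toward the most recent reward regardless of history.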
anthroos
added a commit
that referenced
this pull request
Apr 27, 2026
…#39)
* feat: minimal prediction/outcome schema + pack instrumentation rule
Prediction/outcome MCP tools were dead code — 2 entries total since
2026-03-23 because no SKILL.md / CLAUDE.md / hook ever specified the
trigger. The seed pack went into production for the first time on
2026-04-27 (Norda case, lead-norda-001) and there was no instrumentation
to prove the pack moved the outcome. Without a baseline, any future
experiment over packs is unfalsifiable.
This commit installs the minimum viable instrumentation.
Schema (new path, required):
log_prediction(pack_id, pack_author, cited_step, case_id,
applied_action, expected_signal, expected_window_days,
prevented_action?, notes?)
log_outcome(prediction_id, actual_signal, days_to_resolve, notes?)
Three fields are explicitly NOT in the schema: confidence,
alternative_action_if_no_pack, predicted_outcome_alternative. Reasoning
in CHANGELOG — Claude-side confidence is uncalibrated until ≥30 outcome
datapoints, and the same Claude that writes the prediction would invent
a biased counterfactual. Real ablation requires a separate pack-blind
run, which is a different track.
Three fields are kept that the audit flagged as not-obvious: cited_step
(sharp trigger — only log when a relative_day was cited), prevented_action
(negative-space prediction — half the pack's value is "don't do X"), and
expected_signal + expected_window_days (resolves later without
interpretation).
Backward compat: old free-text predictions still accepted, old
log_outcome with reward still updates Q-values for memory_ids_used.
schema_version: 2 marks new-path entries.
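The new-path schema and the backward-compatibility rule described above can be sketched as a small routing function. This is an illustration under stated assumptions, not the code in openexp/mcp_server.py: the function name `build_prediction_entry` is hypothetical, and the exact routing rule (any new-path field present selects the new path, which then requires all mandatory fields) is an assumption. The field names and `schema_version: 2` come from the commit message.

```python
from typing import Any, Dict, Mapping

# Required and optional fields, as listed in the commit message.
REQUIRED_V2 = (
    "pack_id",
    "pack_author",
    "cited_step",
    "case_id",
    "applied_action",
    "expected_signal",
    "expected_window_days",
)
OPTIONAL_V2 = ("prevented_action", "notes")


def build_prediction_entry(args: Mapping[str, Any]) -> Dict[str, Any]:
    """Route a log_prediction call by which fields are present."""
    present = [name for name in REQUIRED_V2 if name in args]
    if not present:
        # Legacy path: accept the free-text payload unchanged, with no version marker.
        return dict(args)

    missing = [name for name in REQUIRED_V2 if name not in args]
    if missing:
        raise ValueError("new-path prediction is missing: " + ", ".join(missing))

    entry = {name: args[name] for name in REQUIRED_V2}
    for name in OPTIONAL_V2:
        if args.get(name) is not None:
            entry[name] = args[name]
    entry["schema_version"] = 2  # marks new-path entries
    return entry
```

Rejecting partially filled new-path calls keeps half-specified predictions out of storage, so every version 2 entry can be resolved later against `expected_signal` and `expected_window_days` without interpretation.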
Files:
- openexp/mcp_server.py: new MCP schema, dispatcher routes by which
fields are present
- openexp/reward_tracker.py: log_prediction/log_outcome accept both
paths; new path skips Q-update entirely
- experiences/d49e0997/SKILL.md: new "Instrumentation (mandatory)"
section before "What NOT to do"; trigger criterion is sharp
- templates/SKILL.template.md: same section so future packs inherit
- CHANGELOG.md NEW: explains the schema choices
- scripts/seed_norda_outcome.py NEW: retrospective seed for datapoint #1
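The two outcome paths noted for openexp/reward_tracker.py (new path skips the Q-update entirely, legacy path with a reward still updates Q-values for memory_ids_used) might look roughly like this. It is a hedged sketch: `record_outcome`, the in-memory `prediction` and `q_values` dictionaries, and the default `alpha`, `q_floor`, and `q_ceiling` values are all assumptions made for illustration, and routing on the presence of `actual_signal` is a guess at how the real dispatcher decides.

```python
from typing import Any, Dict, Optional


def record_outcome(
    prediction: Dict[str, Any],
    q_values: Dict[str, float],
    *,
    actual_signal: Optional[str] = None,
    days_to_resolve: Optional[int] = None,
    reward: Optional[float] = None,
    notes: Optional[str] = None,
    alpha: float = 0.25,      # assumption
    q_floor: float = -0.5,    # assumption
    q_ceiling: float = 1.0,
) -> Dict[str, Any]:
    """Attach an outcome to a prediction; only the legacy path touches Q-values."""
    if actual_signal is not None:
        # New path: store the observed signal and leave q_values untouched.
        if days_to_resolve is None:
            raise ValueError("new-path outcome needs days_to_resolve")
        prediction["outcome"] = {
            "actual_signal": actual_signal,
            "days_to_resolve": days_to_resolve,
            "notes": notes,
        }
        return prediction

    if reward is None:
        raise ValueError("legacy outcome needs a reward")
    # Legacy path: additive, clamped update for every memory the prediction used.
    for memory_id in prediction.get("memory_ids_used", []):
        current = q_values.get(memory_id, 0.0)
        q_values[memory_id] = max(q_floor, min(q_ceiling, current + alpha * reward))
    prediction["outcome"] = {"reward": reward, "notes": notes}
    return prediction
```

Keeping the new path free of Q-updates means instrumentation records stay a clean measurement log and do not feed back into memory ranking.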
Datapoint #1 (Norda case) verified in storage:
pred_a751b249 — pack inbound-acquisition-with-free-pilot, day +57,
applied "upload to Vchasno → KEP → invite", prevented "follow-up
nudge during the 6-day pause", resolved as "signed both sides" in
4 days.
300/300 tests pass. No CLAUDE.md global rule yet — first watch whether
the SKILL.md rule is followed naturally on a single pack before
elevating it to user-level.
* docs(readme): document new prediction/outcome schema
Old MCP Tools table described log_prediction/log_outcome with one-liner
generics ("track a prediction"). That is no longer accurate after the
schema change in this PR.
Added a "Prediction / outcome instrumentation" subsection right under
the table:
- Trigger criterion (only fires when a pack's relative_day is cited)
- Two field tables (new-path log_prediction + log_outcome)
- The "what is deliberately NOT in the schema" reasoning (confidence,
fake counterfactual fields)
- Backward compat note (legacy fields still accepted, schema_version: 2
marks new entries)
This is doc-only; the code on this branch already implements both paths.
---------
Co-authored-by: Ivan Pasichnyk <ivanpasichnyk@gmail.com>
anthroos
added a commit
that referenced
this pull request
Apr 27, 2026
Replace leftover ~/.claude-memory/ references with ~/.openexp/ so all OpenExp data lives under a single self-contained directory.
Co-authored-by: Ivan Pasichnyk <ivanpasichnyk@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
anthroos
added a commit
that referenced
this pull request
Apr 27, 2026
…#39)
* feat: minimal prediction/outcome schema + pack instrumentation rule
Prediction/outcome MCP tools were dead code — 2 entries total since
2026-03-23 because no SKILL.md / CLAUDE.md / hook ever specified the
trigger. The seed pack went into production for the first time on
2026-04-27 (first counterparty case) and there was no instrumentation
to prove the pack moved the outcome. Without a baseline, any future
experiment over packs is unfalsifiable.
This commit installs the minimum viable instrumentation.
Schema (new path, required):
log_prediction(pack_id, pack_author, cited_step, case_id,
applied_action, expected_signal, expected_window_days,
prevented_action?, notes?)
log_outcome(prediction_id, actual_signal, days_to_resolve, notes?)
Three fields are explicitly NOT in the schema: confidence,
alternative_action_if_no_pack, predicted_outcome_alternative. Reasoning
in CHANGELOG — Claude-side confidence is uncalibrated until ≥30 outcome
datapoints, and the same Claude that writes the prediction would invent
a biased counterfactual. Real ablation requires a separate pack-blind
run, which is a different track.
Three fields are kept that the audit flagged as not-obvious: cited_step
(sharp trigger — only log when a relative_day was cited), prevented_action
(negative-space prediction — half the pack's value is "don't do X"), and
expected_signal + expected_window_days (resolves later without
interpretation).
Backward compat: old free-text predictions still accepted, old
log_outcome with reward still updates Q-values for memory_ids_used.
schema_version: 2 marks new-path entries.
Files:
- openexp/mcp_server.py: new MCP schema, dispatcher routes by which
fields are present
- openexp/reward_tracker.py: log_prediction/log_outcome accept both
paths; new path skips Q-update entirely
- experiences/d49e0997/SKILL.md: new "Instrumentation (mandatory)"
section before "What NOT to do"; trigger criterion is sharp
- templates/SKILL.template.md: same section so future packs inherit
- CHANGELOG.md NEW: explains the schema choices
- scripts/seed_first_outcome.py NEW: retrospective seed for datapoint #1
Datapoint #1 (first counterparty case) verified in storage:
pred_a751b249 — pack inbound-acquisition-with-free-pilot, day +57,
applied "upload to <E_SIGNING_PLATFORM> → <DIGITAL_KEY> → invite", prevented "follow-up
nudge during the 6-day pause", resolved as "signed both sides" in
4 days.
300/300 tests pass. No CLAUDE.md global rule yet — first watch whether
the SKILL.md rule is followed naturally on a single pack before
elevating it to user-level.
* docs(readme): document new prediction/outcome schema
Old MCP Tools table described log_prediction/log_outcome with one-liner
generics ("track a prediction"). That is no longer accurate after the
schema change in this PR.
Added a "Prediction / outcome instrumentation" subsection right under
the table:
- Trigger criterion (only fires when a pack's relative_day is cited)
- Two field tables (new-path log_prediction + log_outcome)
- The "what is deliberately NOT in the schema" reasoning (confidence,
fake counterfactual fields)
- Backward compat note (legacy fields still accepted, schema_version: 2
marks new entries)
This is doc-only; the code on this branch already implements both paths.
---------
Co-authored-by: Ivan Pasichnyk <ivanpasichnyk@gmail.com>
Summary
- Replace leftover ~/.claude-memory/ references with ~/.openexp/ across 8 files
- All OpenExp data lives under a single ~/.openexp/ directory (observations, sessions, data)
Files changed
- openexp/core/config.py: default paths for OBSERVATIONS_DIR and SESSIONS_DIR
- openexp/hooks/post-tool-use.sh: hardcoded observation write path
- openexp/hooks/session-start.sh: hardcoded sessions read path
- .env.example: commented-out default values
- README.md: configuration table + data flow diagram
- docs/architecture.md: system diagram + data persistence table
- docs/configuration.md: configuration table
- docs/how-it-works.md: observation storage description
Test plan
- grep -r "claude-memory" . returns zero matches
- ./setup.sh on clean machine: verify ~/.openexp/ dirs created
- verify observations are written to ~/.openexp/observations/
🤖 Generated with Claude Code
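The default-path change and the first test-plan check can be illustrated with a short sketch. It is not the contents of openexp/core/config.py: `resolve_dirs` and `find_leftovers` are hypothetical helpers, and treating OBSERVATIONS_DIR and SESSIONS_DIR as environment variable names is an assumption (the PR names them only as configured paths). `find_leftovers` is a rough Python stand-in for the `grep -r "claude-memory" .` check.

```python
import os
from pathlib import Path
from typing import List, Mapping, Optional, Tuple


def resolve_dirs(env: Optional[Mapping[str, str]] = None) -> Tuple[Path, Path]:
    """Return (observations_dir, sessions_dir), defaulting to ~/.openexp/."""
    env = os.environ if env is None else env
    base = Path.home() / ".openexp"
    # Assumed variable names; an explicit setting overrides the default path.
    observations = Path(env.get("OBSERVATIONS_DIR", str(base / "observations"))).expanduser()
    sessions = Path(env.get("SESSIONS_DIR", str(base / "sessions"))).expanduser()
    return observations, sessions


def find_leftovers(root: Path, needle: str = "claude-memory") -> List[Path]:
    """List files under root whose text still contains the old directory name."""
    hits = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and needle in path.read_text(encoding="utf-8", errors="ignore"):
            hits.append(path.relative_to(root))
    return hits
```

An empty list from `find_leftovers(Path("."))` corresponds to the "returns zero matches" expectation in the test plan.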