feat: SessionEnd hook, expert skill restructure, cache invalidation, CLI fix by yilu331 · Pull Request #27 · ReflexioAI/reflexio

yilu331 · 2026-04-20T18:43:45Z

Summary

Comprehensive improvement to the Claude Code integration: fix the session-end hook, restructure the expert skill for reliable triggering, add cache invalidation on mutations, and fix CLI issues.

Changes

SessionEnd hook (handler.js + setup_cmd.py + HOOK.md)

Use SessionEnd instead of Stop — Stop fires per-turn, SessionEnd fires once on exit
Add --force-extraction so extraction always runs at session end
Redirect publish output to ~/.reflexio/logs/stop-hook.log (was silently discarded)
Remove dead stop_hook_active guard (not in Claude Code payload)
Use positional args ($1..$5) in sh -c instead of template literal interpolation (prevents shell injection)
Auto-install hook when expert mode enabled; remove on normal-mode re-install
Shell-quote all hook paths with shlex.quote for paths with spaces

Expert skill restructure (SKILL-expert.md + rules/reflexio.md + references/)

Move triggering logic to rules/reflexio.md (always loaded) — "When to Publish to Reflexio" section with profile/playbook guidance and confirmation model
Simplify skill from 235 → 98 lines — keep only the how-to (payload formats, patterns)
Move heavy content to references/proactive-patterns.md and references/server-management.md
Rewrite description to triggering conditions only per skill creation guidelines
Setup wizard copies references/ in expert mode; removes on normal-mode re-install
Normal-mode re-install cleans up all expert artifacts (SessionEnd hook, references/, reflexio-extract command)

Client cache invalidation (cache.py + client.py)

Add clear() and invalidate(method_name) to InMemoryCache
Call invalidation in all 12 mutation methods (delete, update, publish)
Fixes: delete-all → list now returns empty instead of stale cached data

CLI fix (interactions.py)

reflexio interactions list without --user-id failed with Pydantic validation error
Now uses require_user_id() to fall back to REFLEXIO_USER_ID env var

Test plan

node --check handler.js — syntax valid
ruff check + pyright on all Python files — clean
47 setup_cmd tests pass (9 new: SessionEnd hook, references, expert→normal downgrade)
21 cache tests pass (7 new: clear, invalidate, client integration)
handler.js transcript parsing verified against real Claude Code JSONL
Manual testing of expert skill correction capture

coderabbitai · 2026-04-20T23:14:32Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Adds an expert mode for Claude Code integration that installs a SessionEnd hook and copies expert references; the hook handler now runs reflexio interactions publish --force-extraction and logs output. Documentation and skill references were updated for proactive extraction patterns; tests cover expert install, idempotency, and uninstall cleanup.

Changes

Cohort / File(s)	Summary
CLI Setup & Tests `reflexio/cli/commands/setup_cmd.py`, `tests/cli/test_setup_cmd.py`	Added `expert: bool` parameter to hook merge/install paths; expert install upserts `SessionEnd` hook and copies `skill/references` into installed skill. Uninstall removes `SessionEnd`. Tests added for expert vs normal installs, idempotency, and uninstall cleanup.
Hook Implementation & Docs `reflexio/integrations/claude_code/hook/handler.js`, `reflexio/integrations/claude_code/hook/HOOK.md`	Hook event renamed `Stop` → `SessionEnd`. Handler removed early-exit guard, ensures `~/.reflexio/logs`, invokes `reflexio interactions publish --force-extraction` (fire-and-forget) and appends stdout/stderr to `stop-hook.log`. Docs updated to reflect SessionEnd, logging, and expert-only installation.
Skill Documentation & References `reflexio/integrations/claude_code/skill/SKILL-expert.md`, `reflexio/integrations/claude_code/skill/SKILL.md`, `reflexio/integrations/claude_code/skill/references/*`	Reworked expert skill guidance to a publish-focused workflow; shortened skill description. Added `proactive-patterns.md` and `server-management.md` reference docs.
Rules `reflexio/integrations/claude_code/rules/reflexio.md`	Clarified Reflexio’s cross-session scope, softened wording about mentioning Reflexio, and added explicit when-to-publish guidance contrasting Reflexio vs local memory.
Client Cache & Client Invalidation `reflexio/client/cache.py`, `reflexio/client/client.py`, `tests/client/test_cache.py`	Cache entries now record `method_name`; added `clear()` and `invalidate(method_name)`. Client invalidates or clears relevant caches after mutating operations; tests added for cache clear/invalidate and client-cache interactions.

Sequence Diagram(s)

sequenceDiagram
  participant Claude
  participant HookHandler as Hook Handler (handler.js)
  participant ReflexioCLI as reflexio interactions publish
  participant FS as FileSystem (~/.reflexio/logs)

  Claude->>HookHandler: SessionEnd hook invocation
  HookHandler->>FS: ensure logs dir & write temp payload
  HookHandler->>ReflexioCLI: run publish --force-extraction (fire-and-forget)
  ReflexioCLI-->>HookHandler: stdout/stderr (captured)
  HookHandler->>FS: append output to stop-hook.log and remove temp

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

I’m a rabbit in expert mode, ears alert and bright, 🐇
SessionEnd rings, I publish through the night,
Force-extract the lessons, log each chase and cheer,
Proactive patterns found — the signal’s clear,
I hop, I copy refs, I tidy hooks with delight. ✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Docstring Coverage	✅ Passed	Docstring coverage is 94.12% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title references several major changes (SessionEnd hook, expert skill restructure, cache invalidation, CLI fix) that are all present in the changeset, covering the key areas of work.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/stop-hook-force-extraction-logging

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@reflexio/cli/commands/setup_cmd.py`:
- Around line 723-725: The setup code currently only adds the "Stop" hook when
expert is True via _upsert_hook(hooks, "Stop", ...), but does not remove an
existing "Stop" entry when expert is False; update the setup flow so that when
expert is False it explicitly removes any existing "Stop" hook from the hooks
collection (e.g., call the inverse removal helper or filter the hooks list for
entries not named "Stop") instead of just skipping _upsert_hook, referencing the
same "Stop" hook name and the hooks variable so re-running setup in normal mode
clears the Reflexio Stop hook from settings.json.
- Around line 723-725: The Stop hook is being built with an unquoted path ("node
{handler_js_path}") which will break if handler_js_path contains spaces; before
calling _upsert_hook (and similarly for the other two hook registrations in this
module), shell-quote the path(s) using Python's shlex.quote and interpolate the
quoted value into the command (e.g., quote handler_js_path and any other hook
path variables) so the command string passed to _upsert_hook is safe for spaces
and special characters.

In `@reflexio/integrations/claude_code/hook/handler.js`:
- Around line 112-118: The current spawn call constructs a shell command by
interpolating userId, agentVersion, sessionId and paths into sh -c (symbols:
LOG_DIR, logFile, spawn, payloadFile, userId, agentVersion, sessionId), which
risks shell injection and broken parsing; fix by avoiding sh -c and passing the
executable and arguments as an array to spawn (e.g., use spawn("reflexio",
["interactions","publish","--user-id", userId, "--file", payloadFile,
"--source","claude-code","--agent-version", agentVersion, "--session-id",
sessionId || "unknown","--force-extraction"], { stdio: ... })) and handle
stdout/stderr by opening/redirecting to logFile (or piping streams) and perform
the rm (unlink) of payloadFile only after the child process exits, rather than
embedding both operations in a shell string.

In `@reflexio/integrations/claude_code/hook/HOOK.md`:
- Around line 33-40: The docs claim tool calls are skipped, but the parser in
handler.js preserves tool_use blocks into the tools_used field and retains pure
tool-only assistant turns as placeholder content; update HOOK.md (the stop-hook
description list item 2) to reflect that tool calls are captured as tools_used
and tool-only assistant turns may be included as placeholders rather than being
dropped, and mention that tool_use blocks are preserved by the handler.js parser
so published payloads include those tool usages.

In `@reflexio/integrations/claude_code/skill/SKILL-expert.md`:
- Around line 105-165: Proactive publishes are being duplicated by the expert
Stop hook; prevent double-publishing by adding deduplication or gating. Modify
the publishing flow used in reflexio/cli/commands/interactions.py,
reflexio/client/client.py, and reflexio/server/api.py so that reflexio publish
calls include an idempotency key (e.g., session_id + span_hash) and have the
Stop hook check for an existing published span before force-extraction;
alternatively implement a simple server-side dedupe in the API endpoint that
ignores repeated payloads with the same idempotency key. Ensure the Stop hook
(the expert Stop handler) uses that same key or query to skip already-captured
spans so a proactive publish and the Stop hook cannot both send the same
interaction.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 52da4aae-6ef0-425f-9169-3c4b595b2b1f

📥 Commits

Reviewing files that changed from the base of the PR and between 1ca3585 and 469f20c.

📒 Files selected for processing (5)

reflexio/cli/commands/setup_cmd.py
reflexio/integrations/claude_code/hook/HOOK.md
reflexio/integrations/claude_code/hook/handler.js
reflexio/integrations/claude_code/skill/SKILL-expert.md
tests/cli/test_setup_cmd.py

coderabbitai · 2026-04-20T23:23:57Z

+## Step-by-Step: Proactive Extraction (High-Signal Moments)
+
+Beyond explicit corrections, certain moments during a session carry high learning value — even when the user says nothing. When you detect any of the patterns below, publish to Reflexio using the same mechanism as corrections. Do not wait for user input.
+
+**Payload format** — same as corrections, but the "user" turn is the original request (no correction needed), and the "assistant" turn captures the full failure → recovery arc:
+```bash
+cat > /tmp/reflexio-summary.json << 'SUMMARY_EOF'
+{
+  "agent_version": "claude-code",
+  "source": "claude-code-expert",
+  "interactions": [
+    {"role": "user", "content": "<what the user originally asked>"},
+    {
+      "role": "assistant",
+      "content": "<what you tried, what went wrong, and how you recovered — preserve self-correction phrases verbatim>",
+      "tools_used": [
+        {"tool_name": "<tool>", "tool_data": {"input": "<params> — FAILED: <exact error>"}},
+        {"tool_name": "<tool>", "tool_data": {"input": "<corrected params> — succeeded"}}
+      ]
+    }
+  ]
+}
+SUMMARY_EOF
+reflexio publish --agent-version claude-code --source claude-code-expert --skip-aggregation --force-extraction --file /tmp/reflexio-summary.json && rm -f /tmp/reflexio-summary.json
+```
+
+List every failed attempt in `tools_used` in chronological order, followed by the successful one. Include exact error messages verbatim — they are load-bearing evidence for extracting precise behavioral rules.
+
+### Patterns to detect and publish:
+
+**A. Self-recovered tool failures**
+You tried a tool call, got an error, and fixed it yourself without user input. The error message and your recovery strategy are extractable as a behavioral rule (trigger → pitfall → instruction).
+
+_Example:_ You ran a SQL query with `JOIN channels ON l.channel_id`, got `invalid identifier 'L.CHANNEL_ID'`, discovered the column is actually `l.stream_channel_id` by introspecting the schema, and rewrote the query.
+
+**B. Retry chains (2+ failures on same operation)**
+You attempted the same operation 2+ times with different errors before succeeding. Each retry and the eventual fix form a learning arc. Publish the full chain as one interaction — list all attempts in `tools_used`.
+
+_Example:_ File edit failed (wrong indentation), retried with different context (still wrong), then read the file first and got it right. The pattern "read before editing unfamiliar files" is extractable.
+
+**C. Discovered documentation or behavior gaps**
+Tool documentation or expected behavior said X, but reality was Y. The discrepancy is a reusable rule that prevents future agents from making the same incorrect assumption.
+
+_Example:_ API docs say `--format json` is supported, but the CLI returns `unknown flag`. You discovered `--output-format json` works instead.
+
+**D. Workarounds for limitations**
+An API, tool, or system doesn't support what you needed, so you used an alternative approach. The limitation and workaround are a reusable playbook rule.
+
+_Example:_ The database doesn't support `LATERAL JOIN`, so you rewrote the query using a correlated subquery. Or: the MCP tool doesn't accept wildcards, so you listed the directory first and filtered client-side.
+
+**E. Anomalous or implausible results**
+Results that are unexpected — zeros where you expected values, row counts that don't match documentation, mean/median divergence suggesting data skew, results that contradict what the user described. Publish how you detected the anomaly and what you did about it.
+
+_Example:_ Query returned 0 rows when the user said "we have thousands of records." You discovered the table uses soft deletes and added `WHERE deleted_at IS NULL` — or found the user was looking at a different environment.
+
+### When to publish proactively:
+
+- **Publish immediately** after recovering from the situation — don't batch multiple unrelated patterns into one publish
+- **Don't publish routine successes** — only friction, failures, surprises, and workarounds. If a tool call succeeded on the first try with no surprises, there's nothing to extract
+- **Keep the bar reasonable** — a single typo in a file path that you immediately corrected is not worth publishing. The signal is in non-obvious failures where the recovery required understanding something new about the system
+


⚠️ Potential issue | 🟠 Major

Proactive publishes will be duplicated by the expert Stop hook.

This section tells the agent to publish self-recovered failures immediately, while expert mode now also force-publishes the full transcript on Stop. The publish path shown in reflexio/cli/commands/interactions.py, reflexio/client/client.py, and reflexio/server/api.py doesn't attach an idempotency key or dedupe repeated session content, so the same learning can be extracted twice. Either gate the Stop hook for sessions that already published, or add a way for the session-end extractor to skip already-captured spans.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@reflexio/integrations/claude_code/skill/SKILL-expert.md` around lines 105 - 165, Proactive publishes are being duplicated by the expert Stop hook; prevent double-publishing by adding deduplication or gating. Modify the publishing flow used in reflexio/cli/commands/interactions.py, reflexio/client/client.py, and reflexio/server/api.py so that reflexio publish calls include an idempotency key (e.g., session_id + span_hash) and have the Stop hook check for an existing published span before force-extraction; alternatively implement a simple server-side dedupe in the API endpoint that ignores repeated payloads with the same idempotency key. Ensure the Stop hook (the expert Stop handler) uses that same key or query to skip already-captured spans so a proactive publish and the Stop hook cannot both send the same interaction.

Valid observation — acknowledged. The SessionEnd hook publishes the raw transcript; proactive publishes capture specific corrections with structured context. The server extracts different signals from each. Dedup is a future optimization, not a blocker for this PR.

@yilu331, understood — the two publish paths are architecturally distinct: the SessionEnd hook submits the raw session transcript for bulk extraction, while proactive publishes submit pre-structured, high-signal payloads (with tools_used, exact errors, and recovery arcs) that the server can extract more precisely. Double-publishing the same span would produce different extraction results, not true duplicates. Dedup as a future optimization makes sense.

(̿▀̿ ̿Ĺ̯̿̿▀̿ ̿)̄ ~ᶘ ᵒᴥᵒᶅ

✏️ Learnings added

Learnt from: yilu331 Repo: ReflexioAI/reflexio PR: 27 File: reflexio/integrations/claude_code/skill/SKILL-expert.md:0-0 Timestamp: 2026-04-21T22:16:19.076Z Learning: In the reflexio project, the SessionEnd hook (handler.js) publishes the raw session transcript to Reflexio for bulk extraction, while proactive in-session publishes (from SKILL-expert.md) submit pre-structured payloads with `tools_used`, exact error messages, and failure-to-recovery arcs. The server extracts different signals from each. These two publish paths are intentionally distinct, and deduplication is a future optimization rather than a current requirement.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

tests/cli/test_setup_cmd.py (1)

358-367: Good idempotency test, but consider adding a "downgrade" scenario.

This test correctly validates that repeated expert installs don't duplicate hooks. However, there's no coverage for the case where a user first installs with --expert, then later re-runs setup in normal mode. Based on the current implementation, the SessionEnd hook would remain in settings.json even though the summary would show only "SessionStart + UserPromptSubmit".

Consider adding a test like:

def test_normal_mode_removes_session_end_after_expert(self, tmp_path: Path) -> None:
    """Re-running in normal mode should remove expert-only SessionEnd hook."""
    _install_claude_code_integration(
        tmp_path, expert=True, location=InstallLocation.CURRENT_PROJECT
    )
    _install_claude_code_integration(
        tmp_path, expert=False, location=InstallLocation.CURRENT_PROJECT
    )
    settings = json.loads((tmp_path / ".claude" / "settings.json").read_text())
    assert "SessionEnd" not in settings["hooks"]

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/cli/test_setup_cmd.py` around lines 358 - 367, Add a regression test
that installs expert mode then re-runs setup in normal mode to ensure
expert-only hooks are removed: create a test named
test_normal_mode_removes_session_end_after_expert that calls
_install_claude_code_integration(tmp_path, expert=True,
location=InstallLocation.CURRENT_PROJECT) then calls
_install_claude_code_integration(tmp_path, expert=False,
location=InstallLocation.CURRENT_PROJECT), reads tmp_path / ".claude" /
"settings.json" into settings and asserts that "SessionEnd" is not present in
settings["hooks"]; use the existing helpers and InstallLocation enum so this
covers downgrade behavior.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@reflexio/cli/commands/setup_cmd.py`:
- Around line 723-725: The SessionEnd hook is only added when expert is True but
never removed when expert is False, leaving a stale hook after re-running setup;
modify the setup logic around the expert check (the block that calls
_upsert_hook(hooks, "SessionEnd", f"node {handler_js_path}")) so that when
expert is False you explicitly remove any existing "SessionEnd" entry from the
hooks collection and persist that change (use the existing hook-management
utility or remove/filter the "SessionEnd" key from hooks before saving),
ensuring the hooks state reflects the expert flag.

---

Nitpick comments:
In `@tests/cli/test_setup_cmd.py`:
- Around line 358-367: Add a regression test that installs expert mode then
re-runs setup in normal mode to ensure expert-only hooks are removed: create a
test named test_normal_mode_removes_session_end_after_expert that calls
_install_claude_code_integration(tmp_path, expert=True,
location=InstallLocation.CURRENT_PROJECT) then calls
_install_claude_code_integration(tmp_path, expert=False,
location=InstallLocation.CURRENT_PROJECT), reads tmp_path / ".claude" /
"settings.json" into settings and asserts that "SessionEnd" is not present in
settings["hooks"]; use the existing helpers and InstallLocation enum so this
covers downgrade behavior.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: f2bb6067-c771-4347-af31-c7c77e3ed361

📥 Commits

Reviewing files that changed from the base of the PR and between 469f20c and 62418ba.

📒 Files selected for processing (4)

reflexio/cli/commands/setup_cmd.py
reflexio/integrations/claude_code/hook/HOOK.md
reflexio/integrations/claude_code/hook/handler.js
tests/cli/test_setup_cmd.py

🚧 Files skipped from review as they are similar to previous changes (2)

reflexio/integrations/claude_code/hook/handler.js
reflexio/integrations/claude_code/hook/HOOK.md

coderabbitai

♻️ Duplicate comments (1)

reflexio/integrations/claude_code/skill/SKILL-expert.md (1)
111-171: ⚠️ Potential issue | 🟠 Major

Duplication risk: proactive publishes and Stop hook will process the same interactions twice.

The proactive extraction patterns (A-E) instruct Claude to reflexio publish --force-extraction immediately when detecting self-recovered failures, retry chains, etc. (line 134). However, expert mode also auto-installs the Stop (SessionEnd) hook, which force-publishes the full transcript at session end. Since the relevant code snippets from profile_generation_service_utils.py and playbook_extractor.py show no deduplication logic, the same interaction will be extracted twice—once from the immediate proactive publish and again from the session-end publish.

This can result in duplicate playbooks/profiles and wasted extraction cycles.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@reflexio/integrations/claude_code/skill/SKILL-expert.md` around lines 111 -
171, The proactive publish flow will double-publish interactions because
immediate force-extraction calls (the reflexio publish --force-extraction usage
in SKILL-expert) and the auto-installed Stop/SessionEnd hook both send the same
transcript; update the code to deduplicate by recording a unique extraction key
per interaction (e.g., a stable hash of
{"agent_version","source","interactions"} or a uuid per extraction) and check it
before publishing in the functions that trigger publishes (the proactive publish
path from SKILL-expert and the SessionEnd/Stop hook code in
playbook_extractor.py and profile_generation_service_utils.py); specifically,
implement a fast idempotency store or mark (in memory or persistent store) when
publish succeeds and have SessionEnd skip publishes whose key is present, or
have the proactive publish set a “published_by_proactive” flag the SessionEnd
logic checks before sending.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@reflexio/integrations/claude_code/skill/SKILL-expert.md`:
- Around line 111-171: The proactive publish flow will double-publish
interactions because immediate force-extraction calls (the reflexio publish
--force-extraction usage in SKILL-expert) and the auto-installed Stop/SessionEnd
hook both send the same transcript; update the code to deduplicate by recording
a unique extraction key per interaction (e.g., a stable hash of
{"agent_version","source","interactions"} or a uuid per extraction) and check it
before publishing in the functions that trigger publishes (the proactive publish
path from SKILL-expert and the SessionEnd/Stop hook code in
playbook_extractor.py and profile_generation_service_utils.py); specifically,
implement a fast idempotency store or mark (in memory or persistent store) when
publish succeeds and have SessionEnd skip publishes whose key is present, or
have the proactive publish set a “published_by_proactive” flag the SessionEnd
logic checks before sending.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 8e26fd57-8549-4f6a-9bfb-183a8611f85d

📥 Commits

Reviewing files that changed from the base of the PR and between 62418ba and d5548d9.

📒 Files selected for processing (1)

reflexio/integrations/claude_code/skill/SKILL-expert.md

coderabbitai · 2026-04-21T19:19:02Z

Only repository collaborators, contributors, or members can run CodeRabbit commands.

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (2)

reflexio/cli/commands/setup_cmd.py (2)

715-721: ⚠️ Potential issue | 🟠 Major

Hook commands should shell-quote paths.

Lines 717, 721, and 725 serialize raw paths into shell commands; installs under paths with spaces can break hook execution.

🔐 Suggested fix

 import contextlib
 import json
 import os
+import shlex
 import shutil
 import subprocess
@@
-    _upsert_hook(hooks, "SessionStart", f"bash {session_start_hook_sh}")
+    _upsert_hook(
+        hooks, "SessionStart", f"bash {shlex.quote(str(session_start_hook_sh))}"
+    )
@@
-    _upsert_hook(hooks, "UserPromptSubmit", f"node {search_hook_js}")
+    _upsert_hook(
+        hooks, "UserPromptSubmit", f"node {shlex.quote(str(search_hook_js))}"
+    )
@@
-        _upsert_hook(hooks, "SessionEnd", f"node {handler_js_path}")
+        _upsert_hook(hooks, "SessionEnd", f"node {shlex.quote(str(handler_js_path))}")

Also applies to: 725-725

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@reflexio/cli/commands/setup_cmd.py` around lines 715 - 721, The hook commands
currently embed raw Path objects into shell strings (see session_start_hook_sh,
search_hook_js passed to _upsert_hook), which breaks when paths contain spaces;
update the hook command construction to shell-quote the path strings (e.g., use
shlex.quote(str(path))) before interpolating into the "bash ..." or "node ..."
command so the generated hook commands are safe for paths with spaces; apply the
same quoting wherever handler_js_path-derived paths are used to build hook
commands.

723-726: ⚠️ Potential issue | 🟠 Major

Re-running setup with expert=False leaves a stale SessionEnd hook.

Line 723 only adds SessionEnd when expert is enabled; it does not remove a previously installed Reflexio SessionEnd entry when expert is later disabled.

🧩 Suggested fix

     # SessionEnd hook — captures full session transcript on exit (expert mode only)
     if expert:
         _upsert_hook(hooks, "SessionEnd", f"node {handler_js_path}")
+    else:
+        event_hooks = hooks.get("SessionEnd", [])
+        hooks["SessionEnd"] = [
+            entry
+            for entry in event_hooks
+            if not any("reflexio" in h.get("command", "") for h in entry.get("hooks", []))
+        ]
+        if not hooks["SessionEnd"]:
+            hooks.pop("SessionEnd", None)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@reflexio/cli/commands/setup_cmd.py` around lines 723 - 726, Currently the
code only calls _upsert_hook(hooks, "SessionEnd", f"node {handler_js_path}")
when expert is True, leaving any previously installed "SessionEnd" hook in hooks
when expert is later False; fix by removing the hook when expert is False (e.g.,
call a removal helper like _remove_hook(hooks, "SessionEnd") or filter the hooks
list to drop entries with name "SessionEnd") so that "SessionEnd" is upserted
only when expert is True and removed otherwise.

🧹 Nitpick comments (1)

tests/cli/test_setup_cmd.py (1)

355-386: Add expert→normal transition coverage.

These tests validate fresh installs, but they don’t cover re-running setup from expert mode to normal mode. A transition test would guard against stale expert artifacts persisting across re-installs.

✅ Suggested test addition

+    def test_reinstall_normal_removes_expert_artifacts(self, tmp_path: Path) -> None:
+        """Re-running setup without expert removes expert-only artifacts."""
+        _install_claude_code_integration(
+            tmp_path, expert=True, location=InstallLocation.CURRENT_PROJECT
+        )
+        _install_claude_code_integration(
+            tmp_path, expert=False, location=InstallLocation.CURRENT_PROJECT
+        )
+
+        settings = json.loads((tmp_path / ".claude" / "settings.json").read_text())
+        assert "SessionEnd" not in settings.get("hooks", {})
+        assert not (tmp_path / ".claude" / "commands" / "reflexio-extract").exists()
+        assert not (
+            tmp_path / ".claude" / "skills" / "reflexio" / "references"
+        ).exists()

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/cli/test_setup_cmd.py` around lines 355 - 386, Add a new test in
tests/cli/test_setup_cmd.py that exercises the expert→normal transition: call
_install_claude_code_integration(tmp_path, expert=True,
location=InstallLocation.CURRENT_PROJECT) to simulate expert install, then call
_install_claude_code_integration(tmp_path, expert=False,
location=InstallLocation.CURRENT_PROJECT) (or omit expert flag) to simulate
reverting to normal; read the generated .claude/settings.json and assert that
"SessionEnd" is no longer present in settings["hooks"] while "SessionStart" and
"UserPromptSubmit" remain present (and have length 1 if appropriate) to ensure
expert artifacts are removed on re-install.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@reflexio/cli/commands/setup_cmd.py`:
- Around line 829-842: When setup is run with expert=False you must remove any
prior expert-only artifacts left by a previous run with expert=True; update the
block around the expert flag (the code that currently creates cmd_src/cmd_dest
and refs_src/refs_dest) so that in the else branch (or when not expert) you
explicitly remove claude_dir / "commands" / "reflexio-extract" (delete the
directory or its SKILL.md) and remove claude_dir / "skills" / "reflexio" /
"references" if they exist (use exists() checks and shutil.rmtree or unlink as
appropriate), ensuring no exception is raised if those paths are already absent.

In `@reflexio/integrations/claude_code/skill/references/server-management.md`:
- Around line 16-18: The fenced code block containing the Agent invocation
(Agent(prompt="Start the Reflexio server: nohup reflexio services start --only
backend > ~/.reflexio/logs/server.log 2>&1 & sleep 5 && reflexio status check",
run_in_background=true)) is missing a language identifier; update that fenced
block in server-management.md to include a language tag (e.g., change ``` to
```text) so markdownlint MD040 is satisfied and the block renders with the
specified language.

---

Duplicate comments:
In `@reflexio/cli/commands/setup_cmd.py`:
- Around line 715-721: The hook commands currently embed raw Path objects into
shell strings (see session_start_hook_sh, search_hook_js passed to
_upsert_hook), which breaks when paths contain spaces; update the hook command
construction to shell-quote the path strings (e.g., use shlex.quote(str(path)))
before interpolating into the "bash ..." or "node ..." command so the generated
hook commands are safe for paths with spaces; apply the same quoting wherever
handler_js_path-derived paths are used to build hook commands.
- Around line 723-726: Currently the code only calls _upsert_hook(hooks,
"SessionEnd", f"node {handler_js_path}") when expert is True, leaving any
previously installed "SessionEnd" hook in hooks when expert is later False; fix
by removing the hook when expert is False (e.g., call a removal helper like
_remove_hook(hooks, "SessionEnd") or filter the hooks list to drop entries with
name "SessionEnd") so that "SessionEnd" is upserted only when expert is True and
removed otherwise.

---

Nitpick comments:
In `@tests/cli/test_setup_cmd.py`:
- Around line 355-386: Add a new test in tests/cli/test_setup_cmd.py that
exercises the expert→normal transition: call
_install_claude_code_integration(tmp_path, expert=True,
location=InstallLocation.CURRENT_PROJECT) to simulate expert install, then call
_install_claude_code_integration(tmp_path, expert=False,
location=InstallLocation.CURRENT_PROJECT) (or omit expert flag) to simulate
reverting to normal; read the generated .claude/settings.json and assert that
"SessionEnd" is no longer present in settings["hooks"] while "SessionStart" and
"UserPromptSubmit" remain present (and have length 1 if appropriate) to ensure
expert artifacts are removed on re-install.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: e8bfef1e-d5ed-434b-926f-f9dda8f55bff

📥 Commits

Reviewing files that changed from the base of the PR and between d5548d9 and 9d80a26.

📒 Files selected for processing (7)

reflexio/cli/commands/setup_cmd.py
reflexio/integrations/claude_code/rules/reflexio.md
reflexio/integrations/claude_code/skill/SKILL-expert.md
reflexio/integrations/claude_code/skill/SKILL.md
reflexio/integrations/claude_code/skill/references/proactive-patterns.md
reflexio/integrations/claude_code/skill/references/server-management.md
tests/cli/test_setup_cmd.py

✅ Files skipped from review due to trivial changes (3)

reflexio/integrations/claude_code/skill/SKILL.md
reflexio/integrations/claude_code/skill/references/proactive-patterns.md
reflexio/integrations/claude_code/rules/reflexio.md

🚧 Files skipped from review as they are similar to previous changes (1)

reflexio/integrations/claude_code/skill/SKILL-expert.md

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

reflexio/client/client.py (1)

712-718: ⚠️ Potential issue | 🟠 Major

Move invalidation to post-success path for async deletes.

At Line 712 and Line 905, invalidation happens before the delete request executes. In wait_for_response=False, an immediate read can repopulate stale data before deletion completes, and no second invalidation cleans it up.

Proposed fix

 def _delete_profile_sync(
     self, request: DeleteUserProfileRequest
 ) -> DeleteUserProfileResponse:
     """Internal sync method to delete profile."""
     response = self._make_request(
         "DELETE",
         "/api/delete_profile",
         json=request.model_dump(),
     )
+    self._cache.invalidate("get_profiles")
     return DeleteUserProfileResponse(**response)

 async def _delete_profile_async(
     self, request: DeleteUserProfileRequest
 ) -> DeleteUserProfileResponse:
     """Internal async method to delete profile."""
     response = await self._make_async_request(
         "DELETE",
         "/api/delete_profile",
         json=request.model_dump(),
     )
+    self._cache.invalidate("get_profiles")
     return DeleteUserProfileResponse(**response)

 def delete_profile(...):
@@
-    self._cache.invalidate("get_profiles")
     if wait_for_response:
         return self._delete_profile_sync(request)
     self._fire_and_forget(self._delete_profile_async, request)
     return None


 def _delete_agent_playbook_sync(
     self, request: DeleteAgentPlaybookRequest
 ) -> DeleteAgentPlaybookResponse:
@@
     response = self._make_request(
         "DELETE",
         "/api/delete_agent_playbook",
         json=request.model_dump(),
     )
+    self._cache.invalidate("get_agent_playbooks")
     return DeleteAgentPlaybookResponse(**response)

 async def _delete_agent_playbook_async(
     self, request: DeleteAgentPlaybookRequest
 ) -> DeleteAgentPlaybookResponse:
@@
     response = await self._make_async_request(
         "DELETE",
         "/api/delete_agent_playbook",
         json=request.model_dump(),
     )
+    self._cache.invalidate("get_agent_playbooks")
     return DeleteAgentPlaybookResponse(**response)

 def delete_agent_playbook(...):
@@
-    self._cache.invalidate("get_agent_playbooks")
     if wait_for_response:
         return self._delete_agent_playbook_sync(request)
     self._fire_and_forget(self._delete_agent_playbook_async, request)
     return None

Also applies to: 905-911

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@reflexio/client/client.py` around lines 712 - 718, The cache invalidation for
"get_profiles" is happening before the delete completes and should be moved to
run only after a successful delete; update the code so that in the non-blocking
path you do NOT call self._cache.invalidate("get_profiles") before firing the
async task, but instead perform the invalidation after the delete finishes
(i.e., inside _delete_profile_sync after a successful delete return and inside
_delete_profile_async or its completion callback once the remote delete
succeeds); apply the same change for the similar delete block referenced around
the other delete path (the 905-911 logic) so invalidation happens post-success
for both sync and async flows.

🧹 Nitpick comments (1)

tests/client/test_cache.py (1)

379-449: Add coverage for non-blocking delete cache behavior.

These integration tests cover bulk synchronous delete paths well, but they don’t exercise wait_for_response=False for delete_profile / delete_agent_playbook, where cache consistency can diverge under races. Add a regression test for those fire-and-forget paths.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/client/test_cache.py` around lines 379 - 449, Tests currently cover
synchronous bulk deletes but miss the non-blocking "fire-and-forget" paths; add
a new regression test(s) in tests/client/test_cache.py that patches
requests.Session similarly and exercises delete_profile(wait_for_response=False)
and delete_agent_playbook(wait_for_response=False) to ensure they invalidate the
relevant cache entries. Specifically, create scenarios that populate the
profiles or agent_playbooks cache via get_profiles/get_agent_playbooks, call
delete_profile/delete_agent_playbook with wait_for_response=False, simulate the
HTTP response(s) via mock_response.json.side_effect, and then assert subsequent
get_profiles/get_agent_playbooks trigger API calls (cache miss) by checking
mock_session.request.call_count; reference the methods delete_profile and
delete_agent_playbook and the helpers get_profiles/get_agent_playbooks to locate
where to add the tests.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@reflexio/client/client.py`:
- Around line 712-718: The cache invalidation for "get_profiles" is happening
before the delete completes and should be moved to run only after a successful
delete; update the code so that in the non-blocking path you do NOT call
self._cache.invalidate("get_profiles") before firing the async task, but instead
perform the invalidation after the delete finishes (i.e., inside
_delete_profile_sync after a successful delete return and inside
_delete_profile_async or its completion callback once the remote delete
succeeds); apply the same change for the similar delete block referenced around
the other delete path (the 905-911 logic) so invalidation happens post-success
for both sync and async flows.

---

Nitpick comments:
In `@tests/client/test_cache.py`:
- Around line 379-449: Tests currently cover synchronous bulk deletes but miss
the non-blocking "fire-and-forget" paths; add a new regression test(s) in
tests/client/test_cache.py that patches requests.Session similarly and exercises
delete_profile(wait_for_response=False) and
delete_agent_playbook(wait_for_response=False) to ensure they invalidate the
relevant cache entries. Specifically, create scenarios that populate the
profiles or agent_playbooks cache via get_profiles/get_agent_playbooks, call
delete_profile/delete_agent_playbook with wait_for_response=False, simulate the
HTTP response(s) via mock_response.json.side_effect, and then assert subsequent
get_profiles/get_agent_playbooks trigger API calls (cache miss) by checking
mock_session.request.call_count; reference the methods delete_profile and
delete_agent_playbook and the helpers get_profiles/get_agent_playbooks to locate
where to add the tests.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: bb8b9eb6-b678-433a-96a5-a10cd9d0303a

📥 Commits

Reviewing files that changed from the base of the PR and between 9d80a26 and 15787e7.

📒 Files selected for processing (3)

reflexio/client/cache.py
reflexio/client/client.py
tests/client/test_cache.py

…install - Add --force-extraction to session-end publish so extraction runs regardless of batch-interval throttling - Redirect publish output to ~/.reflexio/logs/stop-hook.log instead of silently discarding errors - Remove dead stop_hook_active guard (field not in Claude Code payload) - Auto-install Stop hook when expert mode is enabled via setup wizard - Update HOOK.md to reflect all changes

Add 5 high-signal patterns that trigger auto-publish without user input: - Self-recovered tool failures (error + recovery arc) - Retry chains (2+ failures on same operation) - Documentation/behavior gaps (docs say X, reality is Y) - Workarounds for limitations (used alternative approach) - Anomalous results (unexpected zeros, row-count mismatches) Also add end-of-session extraction nudge: suggest /reflexio-extract when session had any friction points.

…triggers

- test_normal_mode_no_stop_hook: verifies Stop hook absent without --expert - test_expert_mode_installs_stop_hook: verifies Stop hook present with --expert, checks command points to handler.js - test_expert_mode_stop_hook_idempotent: verifies no duplication on repeated install - test_remove_from_dir_cleans_stop_hook: verifies uninstall removes the Stop hook

Stop fires once per turn (every assistant response), causing repeated publishes of the growing transcript and wasteful --force-extraction on every turn. SessionEnd fires once when the session exits — correct for full-session capture. - _merge_hook_config: Stop → SessionEnd - _remove_hook_config: add SessionEnd to removal list (keep Stop for backwards compat with old installs) - handler.js: update JSDoc comments - HOOK.md: update event name, installation example, frontmatter - tests: rename and update all SessionEnd assertions

Root cause: description led with "Search" which framed Reflexio as a search tool, causing Claude to route corrections to its built-in memory system instead of triggering the Reflexio publish flow. Changes: - Lead description with "Publish user corrections" (the missed trigger) - Add when_to_use field with concrete trigger phrases - Explain profile vs playbook: facts publish immediately, procedures publish after user confirmation (explicit or implicit) - Distinguish from Claude's built-in local memory - Add callout in skill body preventing memory/Reflexio conflation - Reorder trigger list: corrections first, search second

Root cause: skill tried to handle both when-to-trigger and how-to-publish, but Claude decides whether to invoke a skill before reading its content. Triggering guidance needs to be in rules (always loaded). Changes: - rules/reflexio.md: add "When to Publish to Reflexio" section with profile/playbook/procedure guidance and confirmation model; soften existing section titles from commanding to guiding tone - skill/SKILL-expert.md: simplify from 235 to 98 lines; keep only how-to (payload formats, correction patterns, end-of-session nudge) - skill/SKILL.md: update description to triggering conditions only per skill creation guidelines - skill/references/: new directory with proactive-patterns.md and server-management.md (heavy content moved out of main skill) - setup_cmd.py: copy references/ dir in expert mode - tests: add test_expert_mode_installs_references and test_normal_mode_no_references

delete_all_*, delete_*_by_ids, delete_*, update_*, and publish_* methods now clear relevant cache entries so subsequent list/get calls return fresh data instead of stale cached results. Changes: - cache.py: add clear() and invalidate(method_name) methods; store method_name in each entry for prefix-based invalidation - client.py: call cache.invalidate("get_profiles") on profile mutations, cache.invalidate("get_agent_playbooks") on playbook mutations, cache.clear() on nuclear deletes (delete_all_interactions, delete_all_playbooks) - tests: 7 new tests for clear, invalidate, and client integration

reflexio interactions list without --user-id failed with a Pydantic validation error because GetInteractionsRequest requires user_id. Now uses require_user_id() to fall back to REFLEXIO_USER_ID env var, matching the pattern used by other list commands.

- setup_cmd.py: shell-quote all hook paths with shlex.quote for paths with spaces; remove SessionEnd hook + expert artifacts (references/, reflexio-extract command) when re-installing in normal mode; extract _remove_reflexio_hook helper - handler.js: use positional args ($1..$5) instead of template literal interpolation in sh -c to prevent shell injection - HOOK.md: fix inaccurate docs — tool_use blocks ARE preserved as tools_used metadata, not skipped - server-management.md: add language identifier to code fence - tests: add test_normal_reinstall_removes_expert_artifacts

yilu331 changed the title ~~fix: Stop hook force-extraction, error logging, and expert mode auto-install~~ fix: Stop hook force-extraction, logging, expert mode auto-install, and proactive extraction triggers Apr 20, 2026

yilu331 changed the title ~~fix: Stop hook force-extraction, logging, expert mode auto-install, and proactive extraction triggers~~ fix: Stop hook reliability + proactive extraction triggers in expert skill Apr 20, 2026

coderabbitai Bot reviewed Apr 20, 2026

View reviewed changes

coderabbitai Bot reviewed Apr 21, 2026

View reviewed changes

Comment thread reflexio/cli/commands/setup_cmd.py Outdated

coderabbitai Bot reviewed Apr 21, 2026

View reviewed changes

Comment thread reflexio/cli/commands/setup_cmd.py

Comment thread reflexio/integrations/claude_code/skill/references/server-management.md Outdated

coderabbitai Bot reviewed Apr 21, 2026

View reviewed changes

yilu331 changed the title ~~fix: Stop hook reliability + proactive extraction triggers in expert skill~~ feat: SessionEnd hook, expert skill restructure, cache invalidation, CLI fix Apr 21, 2026

yilu331 added 12 commits April 21, 2026 15:14

fix: update expert skill description to reflect proactive extraction …

3dcf148

…triggers

fix: remove behavioral instruction from skill description

324e92d

fix: remove search from skill description — hook handles search

931ebca

yilu331 force-pushed the fix/stop-hook-force-extraction-logging branch from 1a38703 to 14fc142 Compare April 21, 2026 22:14

yilu331 merged commit 3dab8b8 into main Apr 21, 2026
1 check passed

This was referenced Apr 21, 2026

chore(release): reflexio-ai v0.2.14 #29

Closed

feat(openclaw): rewrite federated plugin using Plugin SDK #30

Merged

coderabbitai Bot mentioned this pull request May 8, 2026

feat: cache invalidation + PATCH config + MiniMax e2e fixes + setup-init/QMD hardening #51

Merged

4 tasks

This was referenced May 18, 2026

feat: add shared embedding service provider #79

Merged

feat(openclaw): replace TS integration with claude-smart-style openclaw-smart plugin #80

Open

Conversation

yilu331 commented Apr 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

SessionEnd hook (handler.js + setup_cmd.py + HOOK.md)

Expert skill restructure (SKILL-expert.md + rules/reflexio.md + references/)

Client cache invalidation (cache.py + client.py)

CLI fix (interactions.py)

Test plan

Uh oh!

coderabbitai Bot commented Apr 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot Apr 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

yilu331 Apr 21, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 21, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot commented Apr 21, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

yilu331 commented Apr 20, 2026 •

edited

Loading

coderabbitai Bot commented Apr 20, 2026 •

edited

Loading

coderabbitai Bot Apr 20, 2026 •

edited

Loading