feat: SessionEnd hook, expert skill restructure, cache invalidation, CLI fix#27
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds an expert mode for Claude Code integration that installs a Changes
Sequence Diagram(s)sequenceDiagram
participant Claude
participant HookHandler as Hook Handler (handler.js)
participant ReflexioCLI as reflexio interactions publish
participant FS as FileSystem (~/.reflexio/logs)
Claude->>HookHandler: SessionEnd hook invocation
HookHandler->>FS: ensure logs dir & write temp payload
HookHandler->>ReflexioCLI: run publish --force-extraction (fire-and-forget)
ReflexioCLI-->>HookHandler: stdout/stderr (captured)
HookHandler->>FS: append output to stop-hook.log and remove temp
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@reflexio/cli/commands/setup_cmd.py`:
- Around line 723-725: The setup code currently only adds the "Stop" hook when
expert is True via _upsert_hook(hooks, "Stop", ...), but does not remove an
existing "Stop" entry when expert is False; update the setup flow so that when
expert is False it explicitly removes any existing "Stop" hook from the hooks
collection (e.g., call the inverse removal helper or filter the hooks list for
entries not named "Stop") instead of just skipping _upsert_hook, referencing the
same "Stop" hook name and the hooks variable so re-running setup in normal mode
clears the Reflexio Stop hook from settings.json.
- Around line 723-725: The Stop hook is being built with an unquoted path ("node
{handler_js_path}") which will break if handler_js_path contains spaces; before
calling _upsert_hook (and similarly for the other two hook registrations in this
module), shell-quote the path(s) using Python's shlex.quote and interpolate the
quoted value into the command (e.g., quote handler_js_path and any other hook
path variables) so the command string passed to _upsert_hook is safe for spaces
and special characters.
In `@reflexio/integrations/claude_code/hook/handler.js`:
- Around line 112-118: The current spawn call constructs a shell command by
interpolating userId, agentVersion, sessionId and paths into sh -c (symbols:
LOG_DIR, logFile, spawn, payloadFile, userId, agentVersion, sessionId), which
risks shell injection and broken parsing; fix by avoiding sh -c and passing the
executable and arguments as an array to spawn (e.g., use spawn("reflexio",
["interactions","publish","--user-id", userId, "--file", payloadFile,
"--source","claude-code","--agent-version", agentVersion, "--session-id",
sessionId || "unknown","--force-extraction"], { stdio: ... })) and handle
stdout/stderr by opening/redirecting to logFile (or piping streams) and perform
the rm (unlink) of payloadFile only after the child process exits, rather than
embedding both operations in a shell string.
In `@reflexio/integrations/claude_code/hook/HOOK.md`:
- Around line 33-40: The docs claim tool calls are skipped, but the parser in
handler.js preserves tool_use blocks into the tools_used field and retains pure
tool-only assistant turns as placeholder content; update HOOK.md (the stop-hook
description list item 2) to reflect that tool calls are captured as tools_used
and tool-only assistant turns may be included as placeholders rather than being
dropped, and mention that tool_use blocks are preserved by the handler.js parser
so published payloads include those tool usages.
In `@reflexio/integrations/claude_code/skill/SKILL-expert.md`:
- Around line 105-165: Proactive publishes are being duplicated by the expert
Stop hook; prevent double-publishing by adding deduplication or gating. Modify
the publishing flow used in reflexio/cli/commands/interactions.py,
reflexio/client/client.py, and reflexio/server/api.py so that reflexio publish
calls include an idempotency key (e.g., session_id + span_hash) and have the
Stop hook check for an existing published span before force-extraction;
alternatively implement a simple server-side dedupe in the API endpoint that
ignores repeated payloads with the same idempotency key. Ensure the Stop hook
(the expert Stop handler) uses that same key or query to skip already-captured
spans so a proactive publish and the Stop hook cannot both send the same
interaction.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 52da4aae-6ef0-425f-9169-3c4b595b2b1f
📒 Files selected for processing (5)
reflexio/cli/commands/setup_cmd.pyreflexio/integrations/claude_code/hook/HOOK.mdreflexio/integrations/claude_code/hook/handler.jsreflexio/integrations/claude_code/skill/SKILL-expert.mdtests/cli/test_setup_cmd.py
| ## Step-by-Step: Proactive Extraction (High-Signal Moments) | ||
|
|
||
| Beyond explicit corrections, certain moments during a session carry high learning value — even when the user says nothing. When you detect any of the patterns below, publish to Reflexio using the same mechanism as corrections. Do not wait for user input. | ||
|
|
||
| **Payload format** — same as corrections, but the "user" turn is the original request (no correction needed), and the "assistant" turn captures the full failure → recovery arc: | ||
| ```bash | ||
| cat > /tmp/reflexio-summary.json << 'SUMMARY_EOF' | ||
| { | ||
| "agent_version": "claude-code", | ||
| "source": "claude-code-expert", | ||
| "interactions": [ | ||
| {"role": "user", "content": "<what the user originally asked>"}, | ||
| { | ||
| "role": "assistant", | ||
| "content": "<what you tried, what went wrong, and how you recovered — preserve self-correction phrases verbatim>", | ||
| "tools_used": [ | ||
| {"tool_name": "<tool>", "tool_data": {"input": "<params> — FAILED: <exact error>"}}, | ||
| {"tool_name": "<tool>", "tool_data": {"input": "<corrected params> — succeeded"}} | ||
| ] | ||
| } | ||
| ] | ||
| } | ||
| SUMMARY_EOF | ||
| reflexio publish --agent-version claude-code --source claude-code-expert --skip-aggregation --force-extraction --file /tmp/reflexio-summary.json && rm -f /tmp/reflexio-summary.json | ||
| ``` | ||
|
|
||
| List every failed attempt in `tools_used` in chronological order, followed by the successful one. Include exact error messages verbatim — they are load-bearing evidence for extracting precise behavioral rules. | ||
|
|
||
| ### Patterns to detect and publish: | ||
|
|
||
| **A. Self-recovered tool failures** | ||
| You tried a tool call, got an error, and fixed it yourself without user input. The error message and your recovery strategy are extractable as a behavioral rule (trigger → pitfall → instruction). | ||
|
|
||
| _Example:_ You ran a SQL query with `JOIN channels ON l.channel_id`, got `invalid identifier 'L.CHANNEL_ID'`, discovered the column is actually `l.stream_channel_id` by introspecting the schema, and rewrote the query. | ||
|
|
||
| **B. Retry chains (2+ failures on same operation)** | ||
| You attempted the same operation 2+ times with different errors before succeeding. Each retry and the eventual fix form a learning arc. Publish the full chain as one interaction — list all attempts in `tools_used`. | ||
|
|
||
| _Example:_ File edit failed (wrong indentation), retried with different context (still wrong), then read the file first and got it right. The pattern "read before editing unfamiliar files" is extractable. | ||
|
|
||
| **C. Discovered documentation or behavior gaps** | ||
| Tool documentation or expected behavior said X, but reality was Y. The discrepancy is a reusable rule that prevents future agents from making the same incorrect assumption. | ||
|
|
||
| _Example:_ API docs say `--format json` is supported, but the CLI returns `unknown flag`. You discovered `--output-format json` works instead. | ||
|
|
||
| **D. Workarounds for limitations** | ||
| An API, tool, or system doesn't support what you needed, so you used an alternative approach. The limitation and workaround are a reusable playbook rule. | ||
|
|
||
| _Example:_ The database doesn't support `LATERAL JOIN`, so you rewrote the query using a correlated subquery. Or: the MCP tool doesn't accept wildcards, so you listed the directory first and filtered client-side. | ||
|
|
||
| **E. Anomalous or implausible results** | ||
| Results that are unexpected — zeros where you expected values, row counts that don't match documentation, mean/median divergence suggesting data skew, results that contradict what the user described. Publish how you detected the anomaly and what you did about it. | ||
|
|
||
| _Example:_ Query returned 0 rows when the user said "we have thousands of records." You discovered the table uses soft deletes and added `WHERE deleted_at IS NULL` — or found the user was looking at a different environment. | ||
|
|
||
| ### When to publish proactively: | ||
|
|
||
| - **Publish immediately** after recovering from the situation — don't batch multiple unrelated patterns into one publish | ||
| - **Don't publish routine successes** — only friction, failures, surprises, and workarounds. If a tool call succeeded on the first try with no surprises, there's nothing to extract | ||
| - **Keep the bar reasonable** — a single typo in a file path that you immediately corrected is not worth publishing. The signal is in non-obvious failures where the recovery required understanding something new about the system | ||
|
|
There was a problem hiding this comment.
Proactive publishes will be duplicated by the expert Stop hook.
This section tells the agent to publish self-recovered failures immediately, while expert mode now also force-publishes the full transcript on Stop. The publish path shown in reflexio/cli/commands/interactions.py, reflexio/client/client.py, and reflexio/server/api.py doesn't attach an idempotency key or dedupe repeated session content, so the same learning can be extracted twice. Either gate the Stop hook for sessions that already published, or add a way for the session-end extractor to skip already-captured spans.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@reflexio/integrations/claude_code/skill/SKILL-expert.md` around lines 105 -
165, Proactive publishes are being duplicated by the expert Stop hook; prevent
double-publishing by adding deduplication or gating. Modify the publishing flow
used in reflexio/cli/commands/interactions.py, reflexio/client/client.py, and
reflexio/server/api.py so that reflexio publish calls include an idempotency key
(e.g., session_id + span_hash) and have the Stop hook check for an existing
published span before force-extraction; alternatively implement a simple
server-side dedupe in the API endpoint that ignores repeated payloads with the
same idempotency key. Ensure the Stop hook (the expert Stop handler) uses that
same key or query to skip already-captured spans so a proactive publish and the
Stop hook cannot both send the same interaction.
There was a problem hiding this comment.
Valid observation — acknowledged. The SessionEnd hook publishes the raw transcript; proactive publishes capture specific corrections with structured context. The server extracts different signals from each. Dedup is a future optimization, not a blocker for this PR.
There was a problem hiding this comment.
@yilu331, understood — the two publish paths are architecturally distinct: the SessionEnd hook submits the raw session transcript for bulk extraction, while proactive publishes submit pre-structured, high-signal payloads (with tools_used, exact errors, and recovery arcs) that the server can extract more precisely. Double-publishing the same span would produce different extraction results, not true duplicates. Dedup as a future optimization makes sense.
(̿▀̿ ̿Ĺ̯̿̿▀̿ ̿)̄ ~ᶘ ᵒᴥᵒᶅ
✏️ Learnings added
Learnt from: yilu331
Repo: ReflexioAI/reflexio PR: 27
File: reflexio/integrations/claude_code/skill/SKILL-expert.md:0-0
Timestamp: 2026-04-21T22:16:19.076Z
Learning: In the reflexio project, the SessionEnd hook (handler.js) publishes the raw session transcript to Reflexio for bulk extraction, while proactive in-session publishes (from SKILL-expert.md) submit pre-structured payloads with `tools_used`, exact error messages, and failure-to-recovery arcs. The server extracts different signals from each. These two publish paths are intentionally distinct, and deduplication is a future optimization rather than a current requirement.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/cli/test_setup_cmd.py (1)
358-367: Good idempotency test, but consider adding a "downgrade" scenario.This test correctly validates that repeated expert installs don't duplicate hooks. However, there's no coverage for the case where a user first installs with
--expert, then later re-runs setup in normal mode. Based on the current implementation, the SessionEnd hook would remain insettings.jsoneven though the summary would show only "SessionStart + UserPromptSubmit".Consider adding a test like:
def test_normal_mode_removes_session_end_after_expert(self, tmp_path: Path) -> None: """Re-running in normal mode should remove expert-only SessionEnd hook.""" _install_claude_code_integration( tmp_path, expert=True, location=InstallLocation.CURRENT_PROJECT ) _install_claude_code_integration( tmp_path, expert=False, location=InstallLocation.CURRENT_PROJECT ) settings = json.loads((tmp_path / ".claude" / "settings.json").read_text()) assert "SessionEnd" not in settings["hooks"]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/cli/test_setup_cmd.py` around lines 358 - 367, Add a regression test that installs expert mode then re-runs setup in normal mode to ensure expert-only hooks are removed: create a test named test_normal_mode_removes_session_end_after_expert that calls _install_claude_code_integration(tmp_path, expert=True, location=InstallLocation.CURRENT_PROJECT) then calls _install_claude_code_integration(tmp_path, expert=False, location=InstallLocation.CURRENT_PROJECT), reads tmp_path / ".claude" / "settings.json" into settings and asserts that "SessionEnd" is not present in settings["hooks"]; use the existing helpers and InstallLocation enum so this covers downgrade behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@reflexio/cli/commands/setup_cmd.py`:
- Around line 723-725: The SessionEnd hook is only added when expert is True but
never removed when expert is False, leaving a stale hook after re-running setup;
modify the setup logic around the expert check (the block that calls
_upsert_hook(hooks, "SessionEnd", f"node {handler_js_path}")) so that when
expert is False you explicitly remove any existing "SessionEnd" entry from the
hooks collection and persist that change (use the existing hook-management
utility or remove/filter the "SessionEnd" key from hooks before saving),
ensuring the hooks state reflects the expert flag.
---
Nitpick comments:
In `@tests/cli/test_setup_cmd.py`:
- Around line 358-367: Add a regression test that installs expert mode then
re-runs setup in normal mode to ensure expert-only hooks are removed: create a
test named test_normal_mode_removes_session_end_after_expert that calls
_install_claude_code_integration(tmp_path, expert=True,
location=InstallLocation.CURRENT_PROJECT) then calls
_install_claude_code_integration(tmp_path, expert=False,
location=InstallLocation.CURRENT_PROJECT), reads tmp_path / ".claude" /
"settings.json" into settings and asserts that "SessionEnd" is not present in
settings["hooks"]; use the existing helpers and InstallLocation enum so this
covers downgrade behavior.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: f2bb6067-c771-4347-af31-c7c77e3ed361
📒 Files selected for processing (4)
reflexio/cli/commands/setup_cmd.pyreflexio/integrations/claude_code/hook/HOOK.mdreflexio/integrations/claude_code/hook/handler.jstests/cli/test_setup_cmd.py
🚧 Files skipped from review as they are similar to previous changes (2)
- reflexio/integrations/claude_code/hook/handler.js
- reflexio/integrations/claude_code/hook/HOOK.md
There was a problem hiding this comment.
♻️ Duplicate comments (1)
reflexio/integrations/claude_code/skill/SKILL-expert.md (1)
111-171:⚠️ Potential issue | 🟠 MajorDuplication risk: proactive publishes and Stop hook will process the same interactions twice.
The proactive extraction patterns (A-E) instruct Claude to
reflexio publish --force-extractionimmediately when detecting self-recovered failures, retry chains, etc. (line 134). However, expert mode also auto-installs the Stop (SessionEnd) hook, which force-publishes the full transcript at session end. Since the relevant code snippets fromprofile_generation_service_utils.pyandplaybook_extractor.pyshow no deduplication logic, the same interaction will be extracted twice—once from the immediate proactive publish and again from the session-end publish.This can result in duplicate playbooks/profiles and wasted extraction cycles.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@reflexio/integrations/claude_code/skill/SKILL-expert.md` around lines 111 - 171, The proactive publish flow will double-publish interactions because immediate force-extraction calls (the reflexio publish --force-extraction usage in SKILL-expert) and the auto-installed Stop/SessionEnd hook both send the same transcript; update the code to deduplicate by recording a unique extraction key per interaction (e.g., a stable hash of {"agent_version","source","interactions"} or a uuid per extraction) and check it before publishing in the functions that trigger publishes (the proactive publish path from SKILL-expert and the SessionEnd/Stop hook code in playbook_extractor.py and profile_generation_service_utils.py); specifically, implement a fast idempotency store or mark (in memory or persistent store) when publish succeeds and have SessionEnd skip publishes whose key is present, or have the proactive publish set a “published_by_proactive” flag the SessionEnd logic checks before sending.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@reflexio/integrations/claude_code/skill/SKILL-expert.md`:
- Around line 111-171: The proactive publish flow will double-publish
interactions because immediate force-extraction calls (the reflexio publish
--force-extraction usage in SKILL-expert) and the auto-installed Stop/SessionEnd
hook both send the same transcript; update the code to deduplicate by recording
a unique extraction key per interaction (e.g., a stable hash of
{"agent_version","source","interactions"} or a uuid per extraction) and check it
before publishing in the functions that trigger publishes (the proactive publish
path from SKILL-expert and the SessionEnd/Stop hook code in
playbook_extractor.py and profile_generation_service_utils.py); specifically,
implement a fast idempotency store or mark (in memory or persistent store) when
publish succeeds and have SessionEnd skip publishes whose key is present, or
have the proactive publish set a “published_by_proactive” flag the SessionEnd
logic checks before sending.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 8e26fd57-8549-4f6a-9bfb-183a8611f85d
📒 Files selected for processing (1)
reflexio/integrations/claude_code/skill/SKILL-expert.md
|
Only repository collaborators, contributors, or members can run CodeRabbit commands. |
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (2)
reflexio/cli/commands/setup_cmd.py (2)
715-721:⚠️ Potential issue | 🟠 MajorHook commands should shell-quote paths.
Lines 717, 721, and 725 serialize raw paths into shell commands; installs under paths with spaces can break hook execution.
🔐 Suggested fix
import contextlib import json import os +import shlex import shutil import subprocess @@ - _upsert_hook(hooks, "SessionStart", f"bash {session_start_hook_sh}") + _upsert_hook( + hooks, "SessionStart", f"bash {shlex.quote(str(session_start_hook_sh))}" + ) @@ - _upsert_hook(hooks, "UserPromptSubmit", f"node {search_hook_js}") + _upsert_hook( + hooks, "UserPromptSubmit", f"node {shlex.quote(str(search_hook_js))}" + ) @@ - _upsert_hook(hooks, "SessionEnd", f"node {handler_js_path}") + _upsert_hook(hooks, "SessionEnd", f"node {shlex.quote(str(handler_js_path))}")Also applies to: 725-725
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@reflexio/cli/commands/setup_cmd.py` around lines 715 - 721, The hook commands currently embed raw Path objects into shell strings (see session_start_hook_sh, search_hook_js passed to _upsert_hook), which breaks when paths contain spaces; update the hook command construction to shell-quote the path strings (e.g., use shlex.quote(str(path))) before interpolating into the "bash ..." or "node ..." command so the generated hook commands are safe for paths with spaces; apply the same quoting wherever handler_js_path-derived paths are used to build hook commands.
723-726:⚠️ Potential issue | 🟠 MajorRe-running setup with
expert=Falseleaves a stale SessionEnd hook.Line 723 only adds
SessionEndwhen expert is enabled; it does not remove a previously installed ReflexioSessionEndentry when expert is later disabled.🧩 Suggested fix
# SessionEnd hook — captures full session transcript on exit (expert mode only) if expert: _upsert_hook(hooks, "SessionEnd", f"node {handler_js_path}") + else: + event_hooks = hooks.get("SessionEnd", []) + hooks["SessionEnd"] = [ + entry + for entry in event_hooks + if not any("reflexio" in h.get("command", "") for h in entry.get("hooks", [])) + ] + if not hooks["SessionEnd"]: + hooks.pop("SessionEnd", None)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@reflexio/cli/commands/setup_cmd.py` around lines 723 - 726, Currently the code only calls _upsert_hook(hooks, "SessionEnd", f"node {handler_js_path}") when expert is True, leaving any previously installed "SessionEnd" hook in hooks when expert is later False; fix by removing the hook when expert is False (e.g., call a removal helper like _remove_hook(hooks, "SessionEnd") or filter the hooks list to drop entries with name "SessionEnd") so that "SessionEnd" is upserted only when expert is True and removed otherwise.
🧹 Nitpick comments (1)
tests/cli/test_setup_cmd.py (1)
355-386: Add expert→normal transition coverage.These tests validate fresh installs, but they don’t cover re-running setup from expert mode to normal mode. A transition test would guard against stale expert artifacts persisting across re-installs.
✅ Suggested test addition
+ def test_reinstall_normal_removes_expert_artifacts(self, tmp_path: Path) -> None: + """Re-running setup without expert removes expert-only artifacts.""" + _install_claude_code_integration( + tmp_path, expert=True, location=InstallLocation.CURRENT_PROJECT + ) + _install_claude_code_integration( + tmp_path, expert=False, location=InstallLocation.CURRENT_PROJECT + ) + + settings = json.loads((tmp_path / ".claude" / "settings.json").read_text()) + assert "SessionEnd" not in settings.get("hooks", {}) + assert not (tmp_path / ".claude" / "commands" / "reflexio-extract").exists() + assert not ( + tmp_path / ".claude" / "skills" / "reflexio" / "references" + ).exists()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/cli/test_setup_cmd.py` around lines 355 - 386, Add a new test in tests/cli/test_setup_cmd.py that exercises the expert→normal transition: call _install_claude_code_integration(tmp_path, expert=True, location=InstallLocation.CURRENT_PROJECT) to simulate expert install, then call _install_claude_code_integration(tmp_path, expert=False, location=InstallLocation.CURRENT_PROJECT) (or omit expert flag) to simulate reverting to normal; read the generated .claude/settings.json and assert that "SessionEnd" is no longer present in settings["hooks"] while "SessionStart" and "UserPromptSubmit" remain present (and have length 1 if appropriate) to ensure expert artifacts are removed on re-install.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@reflexio/cli/commands/setup_cmd.py`:
- Around line 829-842: When setup is run with expert=False you must remove any
prior expert-only artifacts left by a previous run with expert=True; update the
block around the expert flag (the code that currently creates cmd_src/cmd_dest
and refs_src/refs_dest) so that in the else branch (or when not expert) you
explicitly remove claude_dir / "commands" / "reflexio-extract" (delete the
directory or its SKILL.md) and remove claude_dir / "skills" / "reflexio" /
"references" if they exist (use exists() checks and shutil.rmtree or unlink as
appropriate), ensuring no exception is raised if those paths are already absent.
In `@reflexio/integrations/claude_code/skill/references/server-management.md`:
- Around line 16-18: The fenced code block containing the Agent invocation
(Agent(prompt="Start the Reflexio server: nohup reflexio services start --only
backend > ~/.reflexio/logs/server.log 2>&1 & sleep 5 && reflexio status check",
run_in_background=true)) is missing a language identifier; update that fenced
block in server-management.md to include a language tag (e.g., change ``` to
```text) so markdownlint MD040 is satisfied and the block renders with the
specified language.
---
Duplicate comments:
In `@reflexio/cli/commands/setup_cmd.py`:
- Around line 715-721: The hook commands currently embed raw Path objects into
shell strings (see session_start_hook_sh, search_hook_js passed to
_upsert_hook), which breaks when paths contain spaces; update the hook command
construction to shell-quote the path strings (e.g., use shlex.quote(str(path)))
before interpolating into the "bash ..." or "node ..." command so the generated
hook commands are safe for paths with spaces; apply the same quoting wherever
handler_js_path-derived paths are used to build hook commands.
- Around line 723-726: Currently the code only calls _upsert_hook(hooks,
"SessionEnd", f"node {handler_js_path}") when expert is True, leaving any
previously installed "SessionEnd" hook in hooks when expert is later False; fix
by removing the hook when expert is False (e.g., call a removal helper like
_remove_hook(hooks, "SessionEnd") or filter the hooks list to drop entries with
name "SessionEnd") so that "SessionEnd" is upserted only when expert is True and
removed otherwise.
---
Nitpick comments:
In `@tests/cli/test_setup_cmd.py`:
- Around line 355-386: Add a new test in tests/cli/test_setup_cmd.py that
exercises the expert→normal transition: call
_install_claude_code_integration(tmp_path, expert=True,
location=InstallLocation.CURRENT_PROJECT) to simulate expert install, then call
_install_claude_code_integration(tmp_path, expert=False,
location=InstallLocation.CURRENT_PROJECT) (or omit expert flag) to simulate
reverting to normal; read the generated .claude/settings.json and assert that
"SessionEnd" is no longer present in settings["hooks"] while "SessionStart" and
"UserPromptSubmit" remain present (and have length 1 if appropriate) to ensure
expert artifacts are removed on re-install.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: e8bfef1e-d5ed-434b-926f-f9dda8f55bff
📒 Files selected for processing (7)
reflexio/cli/commands/setup_cmd.pyreflexio/integrations/claude_code/rules/reflexio.mdreflexio/integrations/claude_code/skill/SKILL-expert.mdreflexio/integrations/claude_code/skill/SKILL.mdreflexio/integrations/claude_code/skill/references/proactive-patterns.mdreflexio/integrations/claude_code/skill/references/server-management.mdtests/cli/test_setup_cmd.py
✅ Files skipped from review due to trivial changes (3)
- reflexio/integrations/claude_code/skill/SKILL.md
- reflexio/integrations/claude_code/skill/references/proactive-patterns.md
- reflexio/integrations/claude_code/rules/reflexio.md
🚧 Files skipped from review as they are similar to previous changes (1)
- reflexio/integrations/claude_code/skill/SKILL-expert.md
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
reflexio/client/client.py (1)
712-718:⚠️ Potential issue | 🟠 MajorMove invalidation to post-success path for async deletes.
At Line 712 and Line 905, invalidation happens before the delete request executes. In
wait_for_response=False, an immediate read can repopulate stale data before deletion completes, and no second invalidation cleans it up.Proposed fix
def _delete_profile_sync( self, request: DeleteUserProfileRequest ) -> DeleteUserProfileResponse: """Internal sync method to delete profile.""" response = self._make_request( "DELETE", "/api/delete_profile", json=request.model_dump(), ) + self._cache.invalidate("get_profiles") return DeleteUserProfileResponse(**response) async def _delete_profile_async( self, request: DeleteUserProfileRequest ) -> DeleteUserProfileResponse: """Internal async method to delete profile.""" response = await self._make_async_request( "DELETE", "/api/delete_profile", json=request.model_dump(), ) + self._cache.invalidate("get_profiles") return DeleteUserProfileResponse(**response) def delete_profile(...): @@ - self._cache.invalidate("get_profiles") if wait_for_response: return self._delete_profile_sync(request) self._fire_and_forget(self._delete_profile_async, request) return None def _delete_agent_playbook_sync( self, request: DeleteAgentPlaybookRequest ) -> DeleteAgentPlaybookResponse: @@ response = self._make_request( "DELETE", "/api/delete_agent_playbook", json=request.model_dump(), ) + self._cache.invalidate("get_agent_playbooks") return DeleteAgentPlaybookResponse(**response) async def _delete_agent_playbook_async( self, request: DeleteAgentPlaybookRequest ) -> DeleteAgentPlaybookResponse: @@ response = await self._make_async_request( "DELETE", "/api/delete_agent_playbook", json=request.model_dump(), ) + self._cache.invalidate("get_agent_playbooks") return DeleteAgentPlaybookResponse(**response) def delete_agent_playbook(...): @@ - self._cache.invalidate("get_agent_playbooks") if wait_for_response: return self._delete_agent_playbook_sync(request) self._fire_and_forget(self._delete_agent_playbook_async, request) return NoneAlso applies to: 905-911
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@reflexio/client/client.py` around lines 712 - 718, The cache invalidation for "get_profiles" is happening before the delete completes and should be moved to run only after a successful delete; update the code so that in the non-blocking path you do NOT call self._cache.invalidate("get_profiles") before firing the async task, but instead perform the invalidation after the delete finishes (i.e., inside _delete_profile_sync after a successful delete return and inside _delete_profile_async or its completion callback once the remote delete succeeds); apply the same change for the similar delete block referenced around the other delete path (the 905-911 logic) so invalidation happens post-success for both sync and async flows.
🧹 Nitpick comments (1)
tests/client/test_cache.py (1)
379-449: Add coverage for non-blocking delete cache behavior.These integration tests cover bulk synchronous delete paths well, but they don’t exercise
wait_for_response=Falsefordelete_profile/delete_agent_playbook, where cache consistency can diverge under races. Add a regression test for those fire-and-forget paths.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/client/test_cache.py` around lines 379 - 449, Tests currently cover synchronous bulk deletes but miss the non-blocking "fire-and-forget" paths; add a new regression test(s) in tests/client/test_cache.py that patches requests.Session similarly and exercises delete_profile(wait_for_response=False) and delete_agent_playbook(wait_for_response=False) to ensure they invalidate the relevant cache entries. Specifically, create scenarios that populate the profiles or agent_playbooks cache via get_profiles/get_agent_playbooks, call delete_profile/delete_agent_playbook with wait_for_response=False, simulate the HTTP response(s) via mock_response.json.side_effect, and then assert subsequent get_profiles/get_agent_playbooks trigger API calls (cache miss) by checking mock_session.request.call_count; reference the methods delete_profile and delete_agent_playbook and the helpers get_profiles/get_agent_playbooks to locate where to add the tests.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@reflexio/client/client.py`:
- Around line 712-718: The cache invalidation for "get_profiles" is happening
before the delete completes and should be moved to run only after a successful
delete; update the code so that in the non-blocking path you do NOT call
self._cache.invalidate("get_profiles") before firing the async task, but instead
perform the invalidation after the delete finishes (i.e., inside
_delete_profile_sync after a successful delete return and inside
_delete_profile_async or its completion callback once the remote delete
succeeds); apply the same change for the similar delete block referenced around
the other delete path (the 905-911 logic) so invalidation happens post-success
for both sync and async flows.
---
Nitpick comments:
In `@tests/client/test_cache.py`:
- Around line 379-449: Tests currently cover synchronous bulk deletes but miss
the non-blocking "fire-and-forget" paths; add a new regression test(s) in
tests/client/test_cache.py that patches requests.Session similarly and exercises
delete_profile(wait_for_response=False) and
delete_agent_playbook(wait_for_response=False) to ensure they invalidate the
relevant cache entries. Specifically, create scenarios that populate the
profiles or agent_playbooks cache via get_profiles/get_agent_playbooks, call
delete_profile/delete_agent_playbook with wait_for_response=False, simulate the
HTTP response(s) via mock_response.json.side_effect, and then assert subsequent
get_profiles/get_agent_playbooks trigger API calls (cache miss) by checking
mock_session.request.call_count; reference the methods delete_profile and
delete_agent_playbook and the helpers get_profiles/get_agent_playbooks to locate
where to add the tests.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: bb8b9eb6-b678-433a-96a5-a10cd9d0303a
📒 Files selected for processing (3)
reflexio/client/cache.pyreflexio/client/client.pytests/client/test_cache.py
…install - Add --force-extraction to session-end publish so extraction runs regardless of batch-interval throttling - Redirect publish output to ~/.reflexio/logs/stop-hook.log instead of silently discarding errors - Remove dead stop_hook_active guard (field not in Claude Code payload) - Auto-install Stop hook when expert mode is enabled via setup wizard - Update HOOK.md to reflect all changes
Add 5 high-signal patterns that trigger auto-publish without user input: - Self-recovered tool failures (error + recovery arc) - Retry chains (2+ failures on same operation) - Documentation/behavior gaps (docs say X, reality is Y) - Workarounds for limitations (used alternative approach) - Anomalous results (unexpected zeros, row-count mismatches) Also add end-of-session extraction nudge: suggest /reflexio-extract when session had any friction points.
- test_normal_mode_no_stop_hook: verifies Stop hook absent without --expert - test_expert_mode_installs_stop_hook: verifies Stop hook present with --expert, checks command points to handler.js - test_expert_mode_stop_hook_idempotent: verifies no duplication on repeated install - test_remove_from_dir_cleans_stop_hook: verifies uninstall removes the Stop hook
Stop fires once per turn (every assistant response), causing repeated publishes of the growing transcript and wasteful --force-extraction on every turn. SessionEnd fires once when the session exits — correct for full-session capture. - _merge_hook_config: Stop → SessionEnd - _remove_hook_config: add SessionEnd to removal list (keep Stop for backwards compat with old installs) - handler.js: update JSDoc comments - HOOK.md: update event name, installation example, frontmatter - tests: rename and update all SessionEnd assertions
Root cause: description led with "Search" which framed Reflexio as a search tool, causing Claude to route corrections to its built-in memory system instead of triggering the Reflexio publish flow. Changes: - Lead description with "Publish user corrections" (the missed trigger) - Add when_to_use field with concrete trigger phrases - Explain profile vs playbook: facts publish immediately, procedures publish after user confirmation (explicit or implicit) - Distinguish from Claude's built-in local memory - Add callout in skill body preventing memory/Reflexio conflation - Reorder trigger list: corrections first, search second
Root cause: skill tried to handle both when-to-trigger and how-to-publish, but Claude decides whether to invoke a skill before reading its content. Triggering guidance needs to be in rules (always loaded). Changes: - rules/reflexio.md: add "When to Publish to Reflexio" section with profile/playbook/procedure guidance and confirmation model; soften existing section titles from commanding to guiding tone - skill/SKILL-expert.md: simplify from 235 to 98 lines; keep only how-to (payload formats, correction patterns, end-of-session nudge) - skill/SKILL.md: update description to triggering conditions only per skill creation guidelines - skill/references/: new directory with proactive-patterns.md and server-management.md (heavy content moved out of main skill) - setup_cmd.py: copy references/ dir in expert mode - tests: add test_expert_mode_installs_references and test_normal_mode_no_references
delete_all_*, delete_*_by_ids, delete_*, update_*, and publish_*
methods now clear relevant cache entries so subsequent list/get
calls return fresh data instead of stale cached results.
Changes:
- cache.py: add clear() and invalidate(method_name) methods;
store method_name in each entry for prefix-based invalidation
- client.py: call cache.invalidate("get_profiles") on profile
mutations, cache.invalidate("get_agent_playbooks") on playbook
mutations, cache.clear() on nuclear deletes (delete_all_interactions,
delete_all_playbooks)
- tests: 7 new tests for clear, invalidate, and client integration
reflexio interactions list without --user-id failed with a Pydantic validation error because GetInteractionsRequest requires user_id. Now uses require_user_id() to fall back to REFLEXIO_USER_ID env var, matching the pattern used by other list commands.
- setup_cmd.py: shell-quote all hook paths with shlex.quote for paths with spaces; remove SessionEnd hook + expert artifacts (references/, reflexio-extract command) when re-installing in normal mode; extract _remove_reflexio_hook helper - handler.js: use positional args ($1..$5) instead of template literal interpolation in sh -c to prevent shell injection - HOOK.md: fix inaccurate docs — tool_use blocks ARE preserved as tools_used metadata, not skipped - server-management.md: add language identifier to code fence - tests: add test_normal_reinstall_removes_expert_artifacts
1a38703 to
14fc142
Compare
Summary
Comprehensive improvement to the Claude Code integration: fix the session-end hook, restructure the expert skill for reliable triggering, add cache invalidation on mutations, and fix CLI issues.
Changes
SessionEnd hook (handler.js + setup_cmd.py + HOOK.md)
SessionEndinstead ofStop—Stopfires per-turn,SessionEndfires once on exit--force-extractionso extraction always runs at session end~/.reflexio/logs/stop-hook.log(was silently discarded)stop_hook_activeguard (not in Claude Code payload)$1..$5) in sh -c instead of template literal interpolation (prevents shell injection)shlex.quotefor paths with spacesExpert skill restructure (SKILL-expert.md + rules/reflexio.md + references/)
rules/reflexio.md(always loaded) — "When to Publish to Reflexio" section with profile/playbook guidance and confirmation modelreferences/proactive-patterns.mdandreferences/server-management.mddescriptionto triggering conditions only per skill creation guidelinesreferences/in expert mode; removes on normal-mode re-installClient cache invalidation (cache.py + client.py)
clear()andinvalidate(method_name)toInMemoryCachedelete-all→listnow returns empty instead of stale cached dataCLI fix (interactions.py)
reflexio interactions listwithout--user-idfailed with Pydantic validation errorrequire_user_id()to fall back toREFLEXIO_USER_IDenv varTest plan
node --check handler.js— syntax validruff check+pyrighton all Python files — clean