Skip to content

feat: SessionEnd hook, expert skill restructure, cache invalidation, CLI fix#27

Merged
yilu331 merged 12 commits into
mainfrom
fix/stop-hook-force-extraction-logging
Apr 21, 2026
Merged

feat: SessionEnd hook, expert skill restructure, cache invalidation, CLI fix#27
yilu331 merged 12 commits into
mainfrom
fix/stop-hook-force-extraction-logging

Conversation

@yilu331
Copy link
Copy Markdown
Collaborator

@yilu331 yilu331 commented Apr 20, 2026

Summary

Comprehensive improvement to the Claude Code integration: fix the session-end hook, restructure the expert skill for reliable triggering, add cache invalidation on mutations, and fix CLI issues.

Changes

SessionEnd hook (handler.js + setup_cmd.py + HOOK.md)

  • Use SessionEnd instead of StopStop fires per-turn, SessionEnd fires once on exit
  • Add --force-extraction so extraction always runs at session end
  • Redirect publish output to ~/.reflexio/logs/stop-hook.log (was silently discarded)
  • Remove dead stop_hook_active guard (not in Claude Code payload)
  • Use positional args ($1..$5) in sh -c instead of template literal interpolation (prevents shell injection)
  • Auto-install hook when expert mode enabled; remove on normal-mode re-install
  • Shell-quote all hook paths with shlex.quote for paths with spaces

Expert skill restructure (SKILL-expert.md + rules/reflexio.md + references/)

  • Move triggering logic to rules/reflexio.md (always loaded) — "When to Publish to Reflexio" section with profile/playbook guidance and confirmation model
  • Simplify skill from 235 → 98 lines — keep only the how-to (payload formats, patterns)
  • Move heavy content to references/proactive-patterns.md and references/server-management.md
  • Rewrite description to triggering conditions only per skill creation guidelines
  • Setup wizard copies references/ in expert mode; removes on normal-mode re-install
  • Normal-mode re-install cleans up all expert artifacts (SessionEnd hook, references/, reflexio-extract command)

Client cache invalidation (cache.py + client.py)

  • Add clear() and invalidate(method_name) to InMemoryCache
  • Call invalidation in all 12 mutation methods (delete, update, publish)
  • Fixes: delete-alllist now returns empty instead of stale cached data

CLI fix (interactions.py)

  • reflexio interactions list without --user-id failed with Pydantic validation error
  • Now uses require_user_id() to fall back to REFLEXIO_USER_ID env var

Test plan

  • node --check handler.js — syntax valid
  • ruff check + pyright on all Python files — clean
  • 47 setup_cmd tests pass (9 new: SessionEnd hook, references, expert→normal downgrade)
  • 21 cache tests pass (7 new: clear, invalidate, client integration)
  • handler.js transcript parsing verified against real Claude Code JSONL
  • Manual testing of expert skill correction capture

@yilu331 yilu331 changed the title fix: Stop hook force-extraction, error logging, and expert mode auto-install fix: Stop hook force-extraction, logging, expert mode auto-install, and proactive extraction triggers Apr 20, 2026
@yilu331 yilu331 changed the title fix: Stop hook force-extraction, logging, expert mode auto-install, and proactive extraction triggers fix: Stop hook reliability + proactive extraction triggers in expert skill Apr 20, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 20, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds an expert mode for Claude Code integration that installs a SessionEnd hook and copies expert references; the hook handler now runs reflexio interactions publish --force-extraction and logs output. Documentation and skill references were updated for proactive extraction patterns; tests cover expert install, idempotency, and uninstall cleanup.

Changes

Cohort / File(s) Summary
CLI Setup & Tests
reflexio/cli/commands/setup_cmd.py, tests/cli/test_setup_cmd.py
Added expert: bool parameter to hook merge/install paths; expert install upserts SessionEnd hook and copies skill/references into installed skill. Uninstall removes SessionEnd. Tests added for expert vs normal installs, idempotency, and uninstall cleanup.
Hook Implementation & Docs
reflexio/integrations/claude_code/hook/handler.js, reflexio/integrations/claude_code/hook/HOOK.md
Hook event renamed StopSessionEnd. Handler removed early-exit guard, ensures ~/.reflexio/logs, invokes reflexio interactions publish --force-extraction (fire-and-forget) and appends stdout/stderr to stop-hook.log. Docs updated to reflect SessionEnd, logging, and expert-only installation.
Skill Documentation & References
reflexio/integrations/claude_code/skill/SKILL-expert.md, reflexio/integrations/claude_code/skill/SKILL.md, reflexio/integrations/claude_code/skill/references/*
Reworked expert skill guidance to a publish-focused workflow; shortened skill description. Added proactive-patterns.md and server-management.md reference docs.
Rules
reflexio/integrations/claude_code/rules/reflexio.md
Clarified Reflexio’s cross-session scope, softened wording about mentioning Reflexio, and added explicit when-to-publish guidance contrasting Reflexio vs local memory.
Client Cache & Client Invalidation
reflexio/client/cache.py, reflexio/client/client.py, tests/client/test_cache.py
Cache entries now record method_name; added clear() and invalidate(method_name). Client invalidates or clears relevant caches after mutating operations; tests added for cache clear/invalidate and client-cache interactions.

Sequence Diagram(s)

sequenceDiagram
  participant Claude
  participant HookHandler as Hook Handler (handler.js)
  participant ReflexioCLI as reflexio interactions publish
  participant FS as FileSystem (~/.reflexio/logs)

  Claude->>HookHandler: SessionEnd hook invocation
  HookHandler->>FS: ensure logs dir & write temp payload
  HookHandler->>ReflexioCLI: run publish --force-extraction (fire-and-forget)
  ReflexioCLI-->>HookHandler: stdout/stderr (captured)
  HookHandler->>FS: append output to stop-hook.log and remove temp
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

I’m a rabbit in expert mode, ears alert and bright, 🐇
SessionEnd rings, I publish through the night,
Force-extract the lessons, log each chase and cheer,
Proactive patterns found — the signal’s clear,
I hop, I copy refs, I tidy hooks with delight. ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Docstring Coverage ✅ Passed Docstring coverage is 94.12% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title references several major changes (SessionEnd hook, expert skill restructure, cache invalidation, CLI fix) that are all present in the changeset, covering the key areas of work.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/stop-hook-force-extraction-logging

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@reflexio/cli/commands/setup_cmd.py`:
- Around line 723-725: The setup code currently only adds the "Stop" hook when
expert is True via _upsert_hook(hooks, "Stop", ...), but does not remove an
existing "Stop" entry when expert is False; update the setup flow so that when
expert is False it explicitly removes any existing "Stop" hook from the hooks
collection (e.g., call the inverse removal helper or filter the hooks list for
entries not named "Stop") instead of just skipping _upsert_hook, referencing the
same "Stop" hook name and the hooks variable so re-running setup in normal mode
clears the Reflexio Stop hook from settings.json.
- Around line 723-725: The Stop hook is being built with an unquoted path ("node
{handler_js_path}") which will break if handler_js_path contains spaces; before
calling _upsert_hook (and similarly for the other two hook registrations in this
module), shell-quote the path(s) using Python's shlex.quote and interpolate the
quoted value into the command (e.g., quote handler_js_path and any other hook
path variables) so the command string passed to _upsert_hook is safe for spaces
and special characters.

In `@reflexio/integrations/claude_code/hook/handler.js`:
- Around line 112-118: The current spawn call constructs a shell command by
interpolating userId, agentVersion, sessionId and paths into sh -c (symbols:
LOG_DIR, logFile, spawn, payloadFile, userId, agentVersion, sessionId), which
risks shell injection and broken parsing; fix by avoiding sh -c and passing the
executable and arguments as an array to spawn (e.g., use spawn("reflexio",
["interactions","publish","--user-id", userId, "--file", payloadFile,
"--source","claude-code","--agent-version", agentVersion, "--session-id",
sessionId || "unknown","--force-extraction"], { stdio: ... })) and handle
stdout/stderr by opening/redirecting to logFile (or piping streams) and perform
the rm (unlink) of payloadFile only after the child process exits, rather than
embedding both operations in a shell string.

In `@reflexio/integrations/claude_code/hook/HOOK.md`:
- Around line 33-40: The docs claim tool calls are skipped, but the parser in
handler.js preserves tool_use blocks into the tools_used field and retains pure
tool-only assistant turns as placeholder content; update HOOK.md (the stop-hook
description list item 2) to reflect that tool calls are captured as tools_used
and tool-only assistant turns may be included as placeholders rather than being
dropped, and mention that tool_use blocks are preserved by the handler.js parser
so published payloads include those tool usages.

In `@reflexio/integrations/claude_code/skill/SKILL-expert.md`:
- Around line 105-165: Proactive publishes are being duplicated by the expert
Stop hook; prevent double-publishing by adding deduplication or gating. Modify
the publishing flow used in reflexio/cli/commands/interactions.py,
reflexio/client/client.py, and reflexio/server/api.py so that reflexio publish
calls include an idempotency key (e.g., session_id + span_hash) and have the
Stop hook check for an existing published span before force-extraction;
alternatively implement a simple server-side dedupe in the API endpoint that
ignores repeated payloads with the same idempotency key. Ensure the Stop hook
(the expert Stop handler) uses that same key or query to skip already-captured
spans so a proactive publish and the Stop hook cannot both send the same
interaction.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 52da4aae-6ef0-425f-9169-3c4b595b2b1f

📥 Commits

Reviewing files that changed from the base of the PR and between 1ca3585 and 469f20c.

📒 Files selected for processing (5)
  • reflexio/cli/commands/setup_cmd.py
  • reflexio/integrations/claude_code/hook/HOOK.md
  • reflexio/integrations/claude_code/hook/handler.js
  • reflexio/integrations/claude_code/skill/SKILL-expert.md
  • tests/cli/test_setup_cmd.py

Comment thread reflexio/cli/commands/setup_cmd.py Outdated
Comment thread reflexio/integrations/claude_code/hook/handler.js Outdated
Comment thread reflexio/integrations/claude_code/hook/HOOK.md
Comment on lines +105 to +165
## Step-by-Step: Proactive Extraction (High-Signal Moments)

Beyond explicit corrections, certain moments during a session carry high learning value — even when the user says nothing. When you detect any of the patterns below, publish to Reflexio using the same mechanism as corrections. Do not wait for user input.

**Payload format** — same as corrections, but the "user" turn is the original request (no correction needed), and the "assistant" turn captures the full failure → recovery arc:
```bash
cat > /tmp/reflexio-summary.json << 'SUMMARY_EOF'
{
"agent_version": "claude-code",
"source": "claude-code-expert",
"interactions": [
{"role": "user", "content": "<what the user originally asked>"},
{
"role": "assistant",
"content": "<what you tried, what went wrong, and how you recovered — preserve self-correction phrases verbatim>",
"tools_used": [
{"tool_name": "<tool>", "tool_data": {"input": "<params> — FAILED: <exact error>"}},
{"tool_name": "<tool>", "tool_data": {"input": "<corrected params> — succeeded"}}
]
}
]
}
SUMMARY_EOF
reflexio publish --agent-version claude-code --source claude-code-expert --skip-aggregation --force-extraction --file /tmp/reflexio-summary.json && rm -f /tmp/reflexio-summary.json
```

List every failed attempt in `tools_used` in chronological order, followed by the successful one. Include exact error messages verbatim — they are load-bearing evidence for extracting precise behavioral rules.

### Patterns to detect and publish:

**A. Self-recovered tool failures**
You tried a tool call, got an error, and fixed it yourself without user input. The error message and your recovery strategy are extractable as a behavioral rule (trigger → pitfall → instruction).

_Example:_ You ran a SQL query with `JOIN channels ON l.channel_id`, got `invalid identifier 'L.CHANNEL_ID'`, discovered the column is actually `l.stream_channel_id` by introspecting the schema, and rewrote the query.

**B. Retry chains (2+ failures on same operation)**
You attempted the same operation 2+ times with different errors before succeeding. Each retry and the eventual fix form a learning arc. Publish the full chain as one interaction — list all attempts in `tools_used`.

_Example:_ File edit failed (wrong indentation), retried with different context (still wrong), then read the file first and got it right. The pattern "read before editing unfamiliar files" is extractable.

**C. Discovered documentation or behavior gaps**
Tool documentation or expected behavior said X, but reality was Y. The discrepancy is a reusable rule that prevents future agents from making the same incorrect assumption.

_Example:_ API docs say `--format json` is supported, but the CLI returns `unknown flag`. You discovered `--output-format json` works instead.

**D. Workarounds for limitations**
An API, tool, or system doesn't support what you needed, so you used an alternative approach. The limitation and workaround are a reusable playbook rule.

_Example:_ The database doesn't support `LATERAL JOIN`, so you rewrote the query using a correlated subquery. Or: the MCP tool doesn't accept wildcards, so you listed the directory first and filtered client-side.

**E. Anomalous or implausible results**
Results that are unexpected — zeros where you expected values, row counts that don't match documentation, mean/median divergence suggesting data skew, results that contradict what the user described. Publish how you detected the anomaly and what you did about it.

_Example:_ Query returned 0 rows when the user said "we have thousands of records." You discovered the table uses soft deletes and added `WHERE deleted_at IS NULL` — or found the user was looking at a different environment.

### When to publish proactively:

- **Publish immediately** after recovering from the situation — don't batch multiple unrelated patterns into one publish
- **Don't publish routine successes** — only friction, failures, surprises, and workarounds. If a tool call succeeded on the first try with no surprises, there's nothing to extract
- **Keep the bar reasonable** — a single typo in a file path that you immediately corrected is not worth publishing. The signal is in non-obvious failures where the recovery required understanding something new about the system

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Proactive publishes will be duplicated by the expert Stop hook.

This section tells the agent to publish self-recovered failures immediately, while expert mode now also force-publishes the full transcript on Stop. The publish path shown in reflexio/cli/commands/interactions.py, reflexio/client/client.py, and reflexio/server/api.py doesn't attach an idempotency key or dedupe repeated session content, so the same learning can be extracted twice. Either gate the Stop hook for sessions that already published, or add a way for the session-end extractor to skip already-captured spans.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@reflexio/integrations/claude_code/skill/SKILL-expert.md` around lines 105 -
165, Proactive publishes are being duplicated by the expert Stop hook; prevent
double-publishing by adding deduplication or gating. Modify the publishing flow
used in reflexio/cli/commands/interactions.py, reflexio/client/client.py, and
reflexio/server/api.py so that reflexio publish calls include an idempotency key
(e.g., session_id + span_hash) and have the Stop hook check for an existing
published span before force-extraction; alternatively implement a simple
server-side dedupe in the API endpoint that ignores repeated payloads with the
same idempotency key. Ensure the Stop hook (the expert Stop handler) uses that
same key or query to skip already-captured spans so a proactive publish and the
Stop hook cannot both send the same interaction.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valid observation — acknowledged. The SessionEnd hook publishes the raw transcript; proactive publishes capture specific corrections with structured context. The server extracts different signals from each. Dedup is a future optimization, not a blocker for this PR.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yilu331, understood — the two publish paths are architecturally distinct: the SessionEnd hook submits the raw session transcript for bulk extraction, while proactive publishes submit pre-structured, high-signal payloads (with tools_used, exact errors, and recovery arcs) that the server can extract more precisely. Double-publishing the same span would produce different extraction results, not true duplicates. Dedup as a future optimization makes sense.

(̿▀̿ ̿Ĺ̯̿̿▀̿ ̿)̄ ~ᶘ ᵒᴥᵒᶅ


✏️ Learnings added
Learnt from: yilu331
Repo: ReflexioAI/reflexio PR: 27
File: reflexio/integrations/claude_code/skill/SKILL-expert.md:0-0
Timestamp: 2026-04-21T22:16:19.076Z
Learning: In the reflexio project, the SessionEnd hook (handler.js) publishes the raw session transcript to Reflexio for bulk extraction, while proactive in-session publishes (from SKILL-expert.md) submit pre-structured payloads with `tools_used`, exact error messages, and failure-to-recovery arcs. The server extracts different signals from each. These two publish paths are intentionally distinct, and deduplication is a future optimization rather than a current requirement.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/cli/test_setup_cmd.py (1)

358-367: Good idempotency test, but consider adding a "downgrade" scenario.

This test correctly validates that repeated expert installs don't duplicate hooks. However, there's no coverage for the case where a user first installs with --expert, then later re-runs setup in normal mode. Based on the current implementation, the SessionEnd hook would remain in settings.json even though the summary would show only "SessionStart + UserPromptSubmit".

Consider adding a test like:

def test_normal_mode_removes_session_end_after_expert(self, tmp_path: Path) -> None:
    """Re-running in normal mode should remove expert-only SessionEnd hook."""
    _install_claude_code_integration(
        tmp_path, expert=True, location=InstallLocation.CURRENT_PROJECT
    )
    _install_claude_code_integration(
        tmp_path, expert=False, location=InstallLocation.CURRENT_PROJECT
    )
    settings = json.loads((tmp_path / ".claude" / "settings.json").read_text())
    assert "SessionEnd" not in settings["hooks"]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/cli/test_setup_cmd.py` around lines 358 - 367, Add a regression test
that installs expert mode then re-runs setup in normal mode to ensure
expert-only hooks are removed: create a test named
test_normal_mode_removes_session_end_after_expert that calls
_install_claude_code_integration(tmp_path, expert=True,
location=InstallLocation.CURRENT_PROJECT) then calls
_install_claude_code_integration(tmp_path, expert=False,
location=InstallLocation.CURRENT_PROJECT), reads tmp_path / ".claude" /
"settings.json" into settings and asserts that "SessionEnd" is not present in
settings["hooks"]; use the existing helpers and InstallLocation enum so this
covers downgrade behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@reflexio/cli/commands/setup_cmd.py`:
- Around line 723-725: The SessionEnd hook is only added when expert is True but
never removed when expert is False, leaving a stale hook after re-running setup;
modify the setup logic around the expert check (the block that calls
_upsert_hook(hooks, "SessionEnd", f"node {handler_js_path}")) so that when
expert is False you explicitly remove any existing "SessionEnd" entry from the
hooks collection and persist that change (use the existing hook-management
utility or remove/filter the "SessionEnd" key from hooks before saving),
ensuring the hooks state reflects the expert flag.

---

Nitpick comments:
In `@tests/cli/test_setup_cmd.py`:
- Around line 358-367: Add a regression test that installs expert mode then
re-runs setup in normal mode to ensure expert-only hooks are removed: create a
test named test_normal_mode_removes_session_end_after_expert that calls
_install_claude_code_integration(tmp_path, expert=True,
location=InstallLocation.CURRENT_PROJECT) then calls
_install_claude_code_integration(tmp_path, expert=False,
location=InstallLocation.CURRENT_PROJECT), reads tmp_path / ".claude" /
"settings.json" into settings and asserts that "SessionEnd" is not present in
settings["hooks"]; use the existing helpers and InstallLocation enum so this
covers downgrade behavior.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: f2bb6067-c771-4347-af31-c7c77e3ed361

📥 Commits

Reviewing files that changed from the base of the PR and between 469f20c and 62418ba.

📒 Files selected for processing (4)
  • reflexio/cli/commands/setup_cmd.py
  • reflexio/integrations/claude_code/hook/HOOK.md
  • reflexio/integrations/claude_code/hook/handler.js
  • tests/cli/test_setup_cmd.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • reflexio/integrations/claude_code/hook/handler.js
  • reflexio/integrations/claude_code/hook/HOOK.md

Comment thread reflexio/cli/commands/setup_cmd.py Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
reflexio/integrations/claude_code/skill/SKILL-expert.md (1)

111-171: ⚠️ Potential issue | 🟠 Major

Duplication risk: proactive publishes and Stop hook will process the same interactions twice.

The proactive extraction patterns (A-E) instruct Claude to reflexio publish --force-extraction immediately when detecting self-recovered failures, retry chains, etc. (line 134). However, expert mode also auto-installs the Stop (SessionEnd) hook, which force-publishes the full transcript at session end. Since the relevant code snippets from profile_generation_service_utils.py and playbook_extractor.py show no deduplication logic, the same interaction will be extracted twice—once from the immediate proactive publish and again from the session-end publish.

This can result in duplicate playbooks/profiles and wasted extraction cycles.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@reflexio/integrations/claude_code/skill/SKILL-expert.md` around lines 111 -
171, The proactive publish flow will double-publish interactions because
immediate force-extraction calls (the reflexio publish --force-extraction usage
in SKILL-expert) and the auto-installed Stop/SessionEnd hook both send the same
transcript; update the code to deduplicate by recording a unique extraction key
per interaction (e.g., a stable hash of
{"agent_version","source","interactions"} or a uuid per extraction) and check it
before publishing in the functions that trigger publishes (the proactive publish
path from SKILL-expert and the SessionEnd/Stop hook code in
playbook_extractor.py and profile_generation_service_utils.py); specifically,
implement a fast idempotency store or mark (in memory or persistent store) when
publish succeeds and have SessionEnd skip publishes whose key is present, or
have the proactive publish set a “published_by_proactive” flag the SessionEnd
logic checks before sending.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@reflexio/integrations/claude_code/skill/SKILL-expert.md`:
- Around line 111-171: The proactive publish flow will double-publish
interactions because immediate force-extraction calls (the reflexio publish
--force-extraction usage in SKILL-expert) and the auto-installed Stop/SessionEnd
hook both send the same transcript; update the code to deduplicate by recording
a unique extraction key per interaction (e.g., a stable hash of
{"agent_version","source","interactions"} or a uuid per extraction) and check it
before publishing in the functions that trigger publishes (the proactive publish
path from SKILL-expert and the SessionEnd/Stop hook code in
playbook_extractor.py and profile_generation_service_utils.py); specifically,
implement a fast idempotency store or mark (in memory or persistent store) when
publish succeeds and have SessionEnd skip publishes whose key is present, or
have the proactive publish set a “published_by_proactive” flag the SessionEnd
logic checks before sending.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 8e26fd57-8549-4f6a-9bfb-183a8611f85d

📥 Commits

Reviewing files that changed from the base of the PR and between 62418ba and d5548d9.

📒 Files selected for processing (1)
  • reflexio/integrations/claude_code/skill/SKILL-expert.md

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 21, 2026

Only repository collaborators, contributors, or members can run CodeRabbit commands.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (2)
reflexio/cli/commands/setup_cmd.py (2)

715-721: ⚠️ Potential issue | 🟠 Major

Hook commands should shell-quote paths.

Lines 717, 721, and 725 serialize raw paths into shell commands; installs under paths with spaces can break hook execution.

🔐 Suggested fix
 import contextlib
 import json
 import os
+import shlex
 import shutil
 import subprocess
@@
-    _upsert_hook(hooks, "SessionStart", f"bash {session_start_hook_sh}")
+    _upsert_hook(
+        hooks, "SessionStart", f"bash {shlex.quote(str(session_start_hook_sh))}"
+    )
@@
-    _upsert_hook(hooks, "UserPromptSubmit", f"node {search_hook_js}")
+    _upsert_hook(
+        hooks, "UserPromptSubmit", f"node {shlex.quote(str(search_hook_js))}"
+    )
@@
-        _upsert_hook(hooks, "SessionEnd", f"node {handler_js_path}")
+        _upsert_hook(hooks, "SessionEnd", f"node {shlex.quote(str(handler_js_path))}")

Also applies to: 725-725

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@reflexio/cli/commands/setup_cmd.py` around lines 715 - 721, The hook commands
currently embed raw Path objects into shell strings (see session_start_hook_sh,
search_hook_js passed to _upsert_hook), which breaks when paths contain spaces;
update the hook command construction to shell-quote the path strings (e.g., use
shlex.quote(str(path))) before interpolating into the "bash ..." or "node ..."
command so the generated hook commands are safe for paths with spaces; apply the
same quoting wherever handler_js_path-derived paths are used to build hook
commands.

723-726: ⚠️ Potential issue | 🟠 Major

Re-running setup with expert=False leaves a stale SessionEnd hook.

Line 723 only adds SessionEnd when expert is enabled; it does not remove a previously installed Reflexio SessionEnd entry when expert is later disabled.

🧩 Suggested fix
     # SessionEnd hook — captures full session transcript on exit (expert mode only)
     if expert:
         _upsert_hook(hooks, "SessionEnd", f"node {handler_js_path}")
+    else:
+        event_hooks = hooks.get("SessionEnd", [])
+        hooks["SessionEnd"] = [
+            entry
+            for entry in event_hooks
+            if not any("reflexio" in h.get("command", "") for h in entry.get("hooks", []))
+        ]
+        if not hooks["SessionEnd"]:
+            hooks.pop("SessionEnd", None)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@reflexio/cli/commands/setup_cmd.py` around lines 723 - 726, Currently the
code only calls _upsert_hook(hooks, "SessionEnd", f"node {handler_js_path}")
when expert is True, leaving any previously installed "SessionEnd" hook in hooks
when expert is later False; fix by removing the hook when expert is False (e.g.,
call a removal helper like _remove_hook(hooks, "SessionEnd") or filter the hooks
list to drop entries with name "SessionEnd") so that "SessionEnd" is upserted
only when expert is True and removed otherwise.
🧹 Nitpick comments (1)
tests/cli/test_setup_cmd.py (1)

355-386: Add expert→normal transition coverage.

These tests validate fresh installs, but they don’t cover re-running setup from expert mode to normal mode. A transition test would guard against stale expert artifacts persisting across re-installs.

✅ Suggested test addition
+    def test_reinstall_normal_removes_expert_artifacts(self, tmp_path: Path) -> None:
+        """Re-running setup without expert removes expert-only artifacts."""
+        _install_claude_code_integration(
+            tmp_path, expert=True, location=InstallLocation.CURRENT_PROJECT
+        )
+        _install_claude_code_integration(
+            tmp_path, expert=False, location=InstallLocation.CURRENT_PROJECT
+        )
+
+        settings = json.loads((tmp_path / ".claude" / "settings.json").read_text())
+        assert "SessionEnd" not in settings.get("hooks", {})
+        assert not (tmp_path / ".claude" / "commands" / "reflexio-extract").exists()
+        assert not (
+            tmp_path / ".claude" / "skills" / "reflexio" / "references"
+        ).exists()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/cli/test_setup_cmd.py` around lines 355 - 386, Add a new test in
tests/cli/test_setup_cmd.py that exercises the expert→normal transition: call
_install_claude_code_integration(tmp_path, expert=True,
location=InstallLocation.CURRENT_PROJECT) to simulate expert install, then call
_install_claude_code_integration(tmp_path, expert=False,
location=InstallLocation.CURRENT_PROJECT) (or omit expert flag) to simulate
reverting to normal; read the generated .claude/settings.json and assert that
"SessionEnd" is no longer present in settings["hooks"] while "SessionStart" and
"UserPromptSubmit" remain present (and have length 1 if appropriate) to ensure
expert artifacts are removed on re-install.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@reflexio/cli/commands/setup_cmd.py`:
- Around line 829-842: When setup is run with expert=False you must remove any
prior expert-only artifacts left by a previous run with expert=True; update the
block around the expert flag (the code that currently creates cmd_src/cmd_dest
and refs_src/refs_dest) so that in the else branch (or when not expert) you
explicitly remove claude_dir / "commands" / "reflexio-extract" (delete the
directory or its SKILL.md) and remove claude_dir / "skills" / "reflexio" /
"references" if they exist (use exists() checks and shutil.rmtree or unlink as
appropriate), ensuring no exception is raised if those paths are already absent.

In `@reflexio/integrations/claude_code/skill/references/server-management.md`:
- Around line 16-18: The fenced code block containing the Agent invocation
(Agent(prompt="Start the Reflexio server: nohup reflexio services start --only
backend > ~/.reflexio/logs/server.log 2>&1 & sleep 5 && reflexio status check",
run_in_background=true)) is missing a language identifier; update that fenced
block in server-management.md to include a language tag (e.g., change ``` to
```text) so markdownlint MD040 is satisfied and the block renders with the
specified language.

---

Duplicate comments:
In `@reflexio/cli/commands/setup_cmd.py`:
- Around line 715-721: The hook commands currently embed raw Path objects into
shell strings (see session_start_hook_sh, search_hook_js passed to
_upsert_hook), which breaks when paths contain spaces; update the hook command
construction to shell-quote the path strings (e.g., use shlex.quote(str(path)))
before interpolating into the "bash ..." or "node ..." command so the generated
hook commands are safe for paths with spaces; apply the same quoting wherever
handler_js_path-derived paths are used to build hook commands.
- Around line 723-726: Currently the code only calls _upsert_hook(hooks,
"SessionEnd", f"node {handler_js_path}") when expert is True, leaving any
previously installed "SessionEnd" hook in hooks when expert is later False; fix
by removing the hook when expert is False (e.g., call a removal helper like
_remove_hook(hooks, "SessionEnd") or filter the hooks list to drop entries with
name "SessionEnd") so that "SessionEnd" is upserted only when expert is True and
removed otherwise.

---

Nitpick comments:
In `@tests/cli/test_setup_cmd.py`:
- Around line 355-386: Add a new test in tests/cli/test_setup_cmd.py that
exercises the expert→normal transition: call
_install_claude_code_integration(tmp_path, expert=True,
location=InstallLocation.CURRENT_PROJECT) to simulate expert install, then call
_install_claude_code_integration(tmp_path, expert=False,
location=InstallLocation.CURRENT_PROJECT) (or omit expert flag) to simulate
reverting to normal; read the generated .claude/settings.json and assert that
"SessionEnd" is no longer present in settings["hooks"] while "SessionStart" and
"UserPromptSubmit" remain present (and have length 1 if appropriate) to ensure
expert artifacts are removed on re-install.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: e8bfef1e-d5ed-434b-926f-f9dda8f55bff

📥 Commits

Reviewing files that changed from the base of the PR and between d5548d9 and 9d80a26.

📒 Files selected for processing (7)
  • reflexio/cli/commands/setup_cmd.py
  • reflexio/integrations/claude_code/rules/reflexio.md
  • reflexio/integrations/claude_code/skill/SKILL-expert.md
  • reflexio/integrations/claude_code/skill/SKILL.md
  • reflexio/integrations/claude_code/skill/references/proactive-patterns.md
  • reflexio/integrations/claude_code/skill/references/server-management.md
  • tests/cli/test_setup_cmd.py
✅ Files skipped from review due to trivial changes (3)
  • reflexio/integrations/claude_code/skill/SKILL.md
  • reflexio/integrations/claude_code/skill/references/proactive-patterns.md
  • reflexio/integrations/claude_code/rules/reflexio.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • reflexio/integrations/claude_code/skill/SKILL-expert.md

Comment thread reflexio/cli/commands/setup_cmd.py
Comment thread reflexio/integrations/claude_code/skill/references/server-management.md Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
reflexio/client/client.py (1)

712-718: ⚠️ Potential issue | 🟠 Major

Move invalidation to post-success path for async deletes.

At Line 712 and Line 905, invalidation happens before the delete request executes. In wait_for_response=False, an immediate read can repopulate stale data before deletion completes, and no second invalidation cleans it up.

Proposed fix
 def _delete_profile_sync(
     self, request: DeleteUserProfileRequest
 ) -> DeleteUserProfileResponse:
     """Internal sync method to delete profile."""
     response = self._make_request(
         "DELETE",
         "/api/delete_profile",
         json=request.model_dump(),
     )
+    self._cache.invalidate("get_profiles")
     return DeleteUserProfileResponse(**response)

 async def _delete_profile_async(
     self, request: DeleteUserProfileRequest
 ) -> DeleteUserProfileResponse:
     """Internal async method to delete profile."""
     response = await self._make_async_request(
         "DELETE",
         "/api/delete_profile",
         json=request.model_dump(),
     )
+    self._cache.invalidate("get_profiles")
     return DeleteUserProfileResponse(**response)

 def delete_profile(...):
@@
-    self._cache.invalidate("get_profiles")
     if wait_for_response:
         return self._delete_profile_sync(request)
     self._fire_and_forget(self._delete_profile_async, request)
     return None


 def _delete_agent_playbook_sync(
     self, request: DeleteAgentPlaybookRequest
 ) -> DeleteAgentPlaybookResponse:
@@
     response = self._make_request(
         "DELETE",
         "/api/delete_agent_playbook",
         json=request.model_dump(),
     )
+    self._cache.invalidate("get_agent_playbooks")
     return DeleteAgentPlaybookResponse(**response)

 async def _delete_agent_playbook_async(
     self, request: DeleteAgentPlaybookRequest
 ) -> DeleteAgentPlaybookResponse:
@@
     response = await self._make_async_request(
         "DELETE",
         "/api/delete_agent_playbook",
         json=request.model_dump(),
     )
+    self._cache.invalidate("get_agent_playbooks")
     return DeleteAgentPlaybookResponse(**response)

 def delete_agent_playbook(...):
@@
-    self._cache.invalidate("get_agent_playbooks")
     if wait_for_response:
         return self._delete_agent_playbook_sync(request)
     self._fire_and_forget(self._delete_agent_playbook_async, request)
     return None

Also applies to: 905-911

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@reflexio/client/client.py` around lines 712 - 718, The cache invalidation for
"get_profiles" is happening before the delete completes and should be moved to
run only after a successful delete; update the code so that in the non-blocking
path you do NOT call self._cache.invalidate("get_profiles") before firing the
async task, but instead perform the invalidation after the delete finishes
(i.e., inside _delete_profile_sync after a successful delete return and inside
_delete_profile_async or its completion callback once the remote delete
succeeds); apply the same change for the similar delete block referenced around
the other delete path (the 905-911 logic) so invalidation happens post-success
for both sync and async flows.
🧹 Nitpick comments (1)
tests/client/test_cache.py (1)

379-449: Add coverage for non-blocking delete cache behavior.

These integration tests cover bulk synchronous delete paths well, but they don’t exercise wait_for_response=False for delete_profile / delete_agent_playbook, where cache consistency can diverge under races. Add a regression test for those fire-and-forget paths.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/client/test_cache.py` around lines 379 - 449, Tests currently cover
synchronous bulk deletes but miss the non-blocking "fire-and-forget" paths; add
a new regression test(s) in tests/client/test_cache.py that patches
requests.Session similarly and exercises delete_profile(wait_for_response=False)
and delete_agent_playbook(wait_for_response=False) to ensure they invalidate the
relevant cache entries. Specifically, create scenarios that populate the
profiles or agent_playbooks cache via get_profiles/get_agent_playbooks, call
delete_profile/delete_agent_playbook with wait_for_response=False, simulate the
HTTP response(s) via mock_response.json.side_effect, and then assert subsequent
get_profiles/get_agent_playbooks trigger API calls (cache miss) by checking
mock_session.request.call_count; reference the methods delete_profile and
delete_agent_playbook and the helpers get_profiles/get_agent_playbooks to locate
where to add the tests.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@reflexio/client/client.py`:
- Around line 712-718: The cache invalidation for "get_profiles" is happening
before the delete completes and should be moved to run only after a successful
delete; update the code so that in the non-blocking path you do NOT call
self._cache.invalidate("get_profiles") before firing the async task, but instead
perform the invalidation after the delete finishes (i.e., inside
_delete_profile_sync after a successful delete return and inside
_delete_profile_async or its completion callback once the remote delete
succeeds); apply the same change for the similar delete block referenced around
the other delete path (the 905-911 logic) so invalidation happens post-success
for both sync and async flows.

---

Nitpick comments:
In `@tests/client/test_cache.py`:
- Around line 379-449: Tests currently cover synchronous bulk deletes but miss
the non-blocking "fire-and-forget" paths; add a new regression test(s) in
tests/client/test_cache.py that patches requests.Session similarly and exercises
delete_profile(wait_for_response=False) and
delete_agent_playbook(wait_for_response=False) to ensure they invalidate the
relevant cache entries. Specifically, create scenarios that populate the
profiles or agent_playbooks cache via get_profiles/get_agent_playbooks, call
delete_profile/delete_agent_playbook with wait_for_response=False, simulate the
HTTP response(s) via mock_response.json.side_effect, and then assert subsequent
get_profiles/get_agent_playbooks trigger API calls (cache miss) by checking
mock_session.request.call_count; reference the methods delete_profile and
delete_agent_playbook and the helpers get_profiles/get_agent_playbooks to locate
where to add the tests.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: bb8b9eb6-b678-433a-96a5-a10cd9d0303a

📥 Commits

Reviewing files that changed from the base of the PR and between 9d80a26 and 15787e7.

📒 Files selected for processing (3)
  • reflexio/client/cache.py
  • reflexio/client/client.py
  • tests/client/test_cache.py

@yilu331 yilu331 changed the title fix: Stop hook reliability + proactive extraction triggers in expert skill feat: SessionEnd hook, expert skill restructure, cache invalidation, CLI fix Apr 21, 2026
yilu331 added 12 commits April 21, 2026 15:14
…install

- Add --force-extraction to session-end publish so extraction runs
  regardless of batch-interval throttling
- Redirect publish output to ~/.reflexio/logs/stop-hook.log instead of
  silently discarding errors
- Remove dead stop_hook_active guard (field not in Claude Code payload)
- Auto-install Stop hook when expert mode is enabled via setup wizard
- Update HOOK.md to reflect all changes
Add 5 high-signal patterns that trigger auto-publish without user input:
- Self-recovered tool failures (error + recovery arc)
- Retry chains (2+ failures on same operation)
- Documentation/behavior gaps (docs say X, reality is Y)
- Workarounds for limitations (used alternative approach)
- Anomalous results (unexpected zeros, row-count mismatches)

Also add end-of-session extraction nudge: suggest /reflexio-extract
when session had any friction points.
- test_normal_mode_no_stop_hook: verifies Stop hook absent without --expert
- test_expert_mode_installs_stop_hook: verifies Stop hook present with --expert,
  checks command points to handler.js
- test_expert_mode_stop_hook_idempotent: verifies no duplication on repeated install
- test_remove_from_dir_cleans_stop_hook: verifies uninstall removes the Stop hook
Stop fires once per turn (every assistant response), causing repeated
publishes of the growing transcript and wasteful --force-extraction on
every turn. SessionEnd fires once when the session exits — correct for
full-session capture.

- _merge_hook_config: Stop → SessionEnd
- _remove_hook_config: add SessionEnd to removal list (keep Stop for
  backwards compat with old installs)
- handler.js: update JSDoc comments
- HOOK.md: update event name, installation example, frontmatter
- tests: rename and update all SessionEnd assertions
Root cause: description led with "Search" which framed Reflexio as a
search tool, causing Claude to route corrections to its built-in memory
system instead of triggering the Reflexio publish flow.

Changes:
- Lead description with "Publish user corrections" (the missed trigger)
- Add when_to_use field with concrete trigger phrases
- Explain profile vs playbook: facts publish immediately, procedures
  publish after user confirmation (explicit or implicit)
- Distinguish from Claude's built-in local memory
- Add callout in skill body preventing memory/Reflexio conflation
- Reorder trigger list: corrections first, search second
Root cause: skill tried to handle both when-to-trigger and how-to-publish,
but Claude decides whether to invoke a skill before reading its content.
Triggering guidance needs to be in rules (always loaded).

Changes:
- rules/reflexio.md: add "When to Publish to Reflexio" section with
  profile/playbook/procedure guidance and confirmation model; soften
  existing section titles from commanding to guiding tone
- skill/SKILL-expert.md: simplify from 235 to 98 lines; keep only
  how-to (payload formats, correction patterns, end-of-session nudge)
- skill/SKILL.md: update description to triggering conditions only
  per skill creation guidelines
- skill/references/: new directory with proactive-patterns.md and
  server-management.md (heavy content moved out of main skill)
- setup_cmd.py: copy references/ dir in expert mode
- tests: add test_expert_mode_installs_references and
  test_normal_mode_no_references
delete_all_*, delete_*_by_ids, delete_*, update_*, and publish_*
methods now clear relevant cache entries so subsequent list/get
calls return fresh data instead of stale cached results.

Changes:
- cache.py: add clear() and invalidate(method_name) methods;
  store method_name in each entry for prefix-based invalidation
- client.py: call cache.invalidate("get_profiles") on profile
  mutations, cache.invalidate("get_agent_playbooks") on playbook
  mutations, cache.clear() on nuclear deletes (delete_all_interactions,
  delete_all_playbooks)
- tests: 7 new tests for clear, invalidate, and client integration
reflexio interactions list without --user-id failed with a Pydantic
validation error because GetInteractionsRequest requires user_id.
Now uses require_user_id() to fall back to REFLEXIO_USER_ID env var,
matching the pattern used by other list commands.
- setup_cmd.py: shell-quote all hook paths with shlex.quote for
  paths with spaces; remove SessionEnd hook + expert artifacts
  (references/, reflexio-extract command) when re-installing in
  normal mode; extract _remove_reflexio_hook helper
- handler.js: use positional args ($1..$5) instead of template
  literal interpolation in sh -c to prevent shell injection
- HOOK.md: fix inaccurate docs — tool_use blocks ARE preserved
  as tools_used metadata, not skipped
- server-management.md: add language identifier to code fence
- tests: add test_normal_reinstall_removes_expert_artifacts
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant