feat: intelligent routing — Phase 1 of AI coding system evolution#49
JordanCoin merged 3 commits into main
…rift detection

Phase 1 of codemap's evolution into a code-aware AI coding system. The prompt-submit hook now performs intent classification (refactor/bugfix/feature/explore/test/docs), tracks a per-session working set, emits machine-readable JSON markers for tool consumption, and detects documentation drift.

New files:
- cmd/intent.go: TaskIntent struct + classifyIntent() with weighted signal scoring
- cmd/drift.go: CheckDrift() resolves docs from routing config, treats stale as drift
- watch/workingset.go: WorkingSet tracks files edited during session with hub awareness

Changes:
- hookPromptSubmit rewritten with intent analysis, risk levels, and suggestions
- Structured output: `<!-- codemap:intent -->` and `<!-- codemap:routes -->` JSON markers
- config.Subsystem gains Instructions field for domain-specific guidance
- watch.State/Graph gain WorkingSet field, persisted in state.json
- New MCP tool: get_working_set
- Docs updated (HOOKS.md, MCP.md)

Reviewed by OpenAI Codex, 3 findings (1 P1, 2 P2), all fixed:
- P1: Drift now resolves doc paths from routing.subsystems config
- P2: Docs outside recent_commits window treated as stale, not missing
- P2: Intent score ties broken deterministically via categoryDefs order

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
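The weighted signal scoring and deterministic tie-breaking described above can be sketched roughly as follows. This is a minimal illustration, not the code in cmd/intent.go: the `signal` and `categoryDef` types, the phrases, and the weights are all assumptions. Only the idea that `categoryDefs` order decides ties comes from the description.

```go
package main

import (
	"fmt"
	"strings"
)

// signal is a hypothetical weighted keyword; the PR's intentSignal may differ.
type signal struct {
	phrase string
	weight int
}

// categoryDef pairs a category with its signals. The slice order of
// categoryDefs doubles as the tie-break order.
type categoryDef struct {
	name    string
	signals []signal
}

// Illustrative data only; the real signal lists are not shown in this PR page.
var categoryDefs = []categoryDef{
	{"refactor", []signal{{"refactor", 3}, {"rename", 2}, {"clean up", 2}}},
	{"bugfix", []signal{{"fix", 3}, {"bug", 3}, {"crash", 2}}},
	{"feature", []signal{{"add", 2}, {"implement", 3}}},
	{"explore", []signal{{"how does", 3}, {"where is", 2}}},
	{"test", []signal{{"test", 3}, {"coverage", 2}}},
	{"docs", []signal{{"document", 3}, {"readme", 2}}},
}

// classify returns the highest-scoring category and its score, or "" and 0
// when no signal matches. Because the comparison is a strict ">" and the
// loop follows slice order, the earliest category wins any tie.
func classify(prompt string) (string, int) {
	p := strings.ToLower(prompt)
	best, bestScore := "", 0
	for _, c := range categoryDefs {
		score := 0
		for _, s := range c.signals {
			if strings.Contains(p, s.phrase) {
				score += s.weight
			}
		}
		if score > bestScore {
			best, bestScore = c.name, score
		}
	}
	return best, bestScore
}

func main() {
	cat, score := classify("Fix the crash in the watcher")
	fmt.Println(cat, score)
}
```

Iterating a slice rather than a map is what makes the result reproducible: Go randomizes map iteration order, so a map-based scorer could pick a different winner on each run when scores tie.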
Pull request overview
Phase 1 of making codemap’s hook intelligence machine-readable and actionable by adding intent classification, session working-set tracking, drift detection, and an MCP tool to expose the working set.
Changes:
- Add prompt-submit intent classification + structured JSON markers for intent and route matches.
- Track and persist a per-session working set in the watch daemon; display it in hooks and expose it via MCP (`get_working_set`).
- Add documentation drift detection driven by routing subsystem config, plus config/docs updates for `subsystems[].instructions`.
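One way a hook could emit such a machine-readable marker is sketched below. The exact marker layout is not visible on this page, so placing the JSON inside the HTML comment is an assumption, as are the `intentPayload` type, its field names, and the `formatMarker` helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// intentPayload is a hypothetical payload; the field names are assumptions.
type intentPayload struct {
	Category   string  `json:"category"`
	Confidence float64 `json:"confidence"`
	Risk       string  `json:"risk"`
}

// formatMarker embeds v as JSON in an HTML comment tagged "codemap:<name>".
// Assumed layout only; the real hook may print the JSON on a separate line.
func formatMarker(name string, v any) (string, error) {
	b, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("<!-- codemap:%s %s -->", name, b), nil
}

func main() {
	m, err := formatMarker("intent", intentPayload{Category: "bugfix", Confidence: 0.8, Risk: "low"})
	if err != nil {
		panic(err)
	}
	fmt.Println(m)
}
```

`json.Marshal` escapes `<`, `>`, and `&` inside strings by default, so a payload value containing `-->` cannot terminate the comment early.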
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| watch/workingset.go | New WorkingSet + WorkingFile types for tracking per-file edit activity. |
| watch/workingset_test.go | Unit tests for working set touch/remove/active/hot/hub counting behaviors. |
| watch/types.go | Add WorkingSet to daemon Graph and exported State JSON. |
| watch/events.go | Update working set on fs events; persist working set into state snapshots. |
| watch/daemon.go | Initialize WorkingSet in NewDaemon. |
| cmd/intent.go | New intent classifier with weighted signals, scope, risk analysis, and suggestions. |
| cmd/intent_test.go | Tests for category detection, confidence, tie-breaking, scope, risk suggestions, and subsystem matching. |
| cmd/drift.go | New drift detection comparing doc/code freshness via git history. |
| cmd/drift_test.go | Tests for drift config edge cases and helper behaviors. |
| cmd/hooks.go | Prompt-submit now emits structured markers, intent suggestions, drift warnings, and working set summary; routes include instructions. |
| config/config.go | Add Subsystem.Instructions field to routing config schema. |
| mcp/main.go | Add MCP tool get_working_set that reads daemon state and formats working set output. |
| docs/MCP.md | Document the new MCP tool. |
| docs/HOOKS.md | Update hook behavior docs; document structured markers, intent categories/risk, drift, working set, and subsystem instructions. |
| CodePath: codePath, | ||
| DocPath: docPath, | ||
| CommitsBehind: recentCommits, | ||
| Reason: fmt.Sprintf("%s changed recently but %s has not been updated in the last %d commits", codePath, docPath, recentCommits), | ||
| }) |
When docCommits < 0, the Reason currently says the doc “has not been updated”, but docCommits can also be -1 because the doc path doesn’t exist at all. Consider checking for file existence (or differentiating git-no-history vs missing file) and emitting a clearer reason like “documentation file is missing” vs “stale within last N commits”.
cmd/hooks.go
Outdated
| // Show subsystem instructions if available | ||
| if match.Instructions != "" { | ||
| fmt.Printf(" 📝 %s\n", match.Instructions) |
match.Instructions is documented as markdown and may contain newlines; printing it with a single %s will break indentation/formatting for multi-line instructions. Consider splitting on newlines and prefixing each line with the same indentation (or rendering as a separate indented block).
| fmt.Printf(" 📝 %s\n", match.Instructions) | |
| lines := strings.Split(match.Instructions, "\n") | |
| for i, line := range lines { | |
| if i == 0 { | |
| fmt.Printf(" 📝 %s\n", line) | |
| } else { | |
| fmt.Printf(" %s\n", line) | |
| } | |
| } |
watch/events.go
Outdated
| Importers: map[string][]string{}, | ||
| Imports: map[string][]string{}, | ||
| RecentEvents: eventsCopy, | ||
| WorkingSet: d.graph.WorkingSet, |
writeState currently persists the entire in-memory WorkingSet into state.json. Since WorkingSet has no pruning/cap, a long-running daemon/session can grow this map unbounded and significantly increase state.json size (and hook ReadState latency). Consider capping/pruning (e.g., keep top-N hot files + recently touched window) before persisting, or store only a summarized view in State.
| WorkingSet: d.graph.WorkingSet, | |
| WorkingSet: nil, // avoid persisting unbounded in-memory WorkingSet |
cmd/intent.go
Outdated
| } | ||
| // intentSignal represents a word/phrase that signals a specific category. | ||
| // Multi-word phrases (e.g. "how does") are matched first and have higher specificity. |
The comment says multi-word phrases are “matched first”, but the implementation just does strings.Contains over the signals in their listed order and doesn’t prioritize longer phrases. Either implement explicit longest-phrase-first matching (e.g., sort signals by phrase length desc per category) or adjust the comment to match the actual behavior.
| // Multi-word phrases (e.g. "how does") are matched first and have higher specificity. | |
| // Multi-word phrases (e.g. "how does") can be given higher weights for greater specificity. |
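If explicit longest-phrase-first matching were preferred over adjusting the comment, it might look like the sketch below. `matchLongestFirst` is a hypothetical helper, and consuming matched text so that shorter phrases cannot re-match it is an extra design choice not mentioned in the review.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// matchLongestFirst returns the phrases found in prompt, trying longer
// phrases first. Matched text is blanked out, so "how" does not also fire
// on a prompt that already matched "how does". Phrases are assumed to be
// lowercase.
func matchLongestFirst(prompt string, phrases []string) []string {
	sorted := append([]string(nil), phrases...) // copy; leave the caller's slice alone
	sort.SliceStable(sorted, func(i, j int) bool {
		return len(sorted[i]) > len(sorted[j])
	})
	p := strings.ToLower(prompt)
	var hits []string
	for _, ph := range sorted {
		if ph != "" && strings.Contains(p, ph) {
			hits = append(hits, ph)
			p = strings.ReplaceAll(p, ph, " ")
		}
	}
	return hits
}

func main() {
	fmt.Println(matchLongestFirst("How does the daemon work", []string{"how", "how does", "work"}))
}
```

`sort.SliceStable` keeps the listed order among phrases of equal length, which preserves any intentional ordering in the signal table.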
cmd/intent.go
Outdated
| Target: file, | ||
| Reason: formatImporterReason(file, importerCount), | ||
| }) | ||
| // For refactors on hubs, suggest checking deps |
This comment says the dependency-check suggestion is “for refactors on hubs”, but the condition also triggers for the "feature" category. Update the comment (or the condition) so it accurately reflects when the suggestion is emitted.
| // For refactors on hubs, suggest checking deps | |
| // For refactors and features on hubs, suggest checking deps |
cmd/drift.go
Outdated
| for _, docPath := range docPaths { | ||
| // Check how many commits ago the doc was last modified. | ||
| // -1 means the doc has no commits in the window — treat as very stale. | ||
| docCommits := lastModifiedCommitsAgo(root, docPath, recentCommits) | ||
| for _, codePath := range codePaths { | ||
| codeCommits := lastModifiedCommitsAgo(root, codePath, recentCommits) |
checkSubsystemDrift calls lastModifiedCommitsAgo inside the nested docPaths×codePaths loop, and lastModifiedCommitsAgo runs two git log commands each time. In prompt-submit hooks this can become a noticeable latency spike; consider caching the full git log -N output once per CheckDrift invocation and/or avoiding the second git call per path (e.g., compute “commits ago” directly from the path-specific log output).
| for _, docPath := range docPaths { | |
| // Check how many commits ago the doc was last modified. | |
| // -1 means the doc has no commits in the window — treat as very stale. | |
| docCommits := lastModifiedCommitsAgo(root, docPath, recentCommits) | |
| for _, codePath := range codePaths { | |
| codeCommits := lastModifiedCommitsAgo(root, codePath, recentCommits) | |
| // Cache "commits ago" per path so we don't run git log repeatedly for the | |
| // same doc/code paths within this check. This keeps semantics identical | |
| // while avoiding redundant expensive calls. | |
| docCommitsCache := make(map[string]int) | |
| codeCommitsCache := make(map[string]int) | |
| for _, docPath := range docPaths { | |
| // Check how many commits ago the doc was last modified. | |
| // -1 means the doc has no commits in the window — treat as very stale. | |
| docCommits, ok := docCommitsCache[docPath] | |
| if !ok { | |
| docCommits = lastModifiedCommitsAgo(root, docPath, recentCommits) | |
| docCommitsCache[docPath] = docCommits | |
| } | |
| for _, codePath := range codePaths { | |
| codeCommits, ok := codeCommitsCache[codePath] | |
| if !ok { | |
| codeCommits = lastModifiedCommitsAgo(root, codePath, recentCommits) | |
| codeCommitsCache[codePath] = codeCommits | |
| } |
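The other option raised in the comment, reading the git log once and deriving "commits ago" for every path from that single output, could be sketched as below. It assumes the log was produced by something like `git log -n N --name-only --pretty=format:@%H`, so commit lines start with `@`. The helper names and the 0-based counting are assumptions and may not match `lastModifiedCommitsAgo`.

```go
package main

import (
	"fmt"
	"strings"
)

// indexCommitsAgo parses log output in which each commit starts with a line
// beginning with "@" followed by the files it touched. It maps each file to
// the index of the newest commit that touched it (0 = most recent commit).
// A file whose name starts with "@" would be misread as a commit line.
func indexCommitsAgo(log string) map[string]int {
	idx := make(map[string]int)
	commit := -1
	for _, line := range strings.Split(log, "\n") {
		line = strings.TrimSpace(line)
		switch {
		case line == "":
			continue
		case strings.HasPrefix(line, "@"):
			commit++
		case commit >= 0:
			if _, seen := idx[line]; !seen {
				idx[line] = commit
			}
		}
	}
	return idx
}

// commitsAgo returns -1 when path was not touched inside the window,
// mirroring the -1 convention described in the diff comment.
func commitsAgo(idx map[string]int, path string) int {
	if n, ok := idx[path]; ok {
		return n
	}
	return -1
}

func main() {
	log := "@c3\ncmd/hooks.go\n\n@c2\ndocs/HOOKS.md\ncmd/hooks.go\n\n@c1\ncmd/drift.go\n"
	idx := indexCommitsAgo(log)
	fmt.Println(commitsAgo(idx, "cmd/hooks.go"), commitsAgo(idx, "docs/HOOKS.md"), commitsAgo(idx, "README.md"))
}
```

With the index built once per CheckDrift call, each doc/code pair costs a map lookup instead of a subprocess. This sketch matches exact file paths only; directory or glob paths would need an extra matching step.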
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 979f8dcd63
cmd/drift.go
Outdated
| codePaths := guessCodePaths(subsystemID) | ||
| w := checkSubsystemDrift(root, subsystemID, docPaths, codePaths, recentCommits) |
Use configured subsystem paths when checking drift
CheckDrift currently derives codePaths only from guessCodePaths(subsystemID), so it ignores routing.subsystems[].paths entirely. In repos where subsystem IDs are logical names (for example id: "cli" with paths: ["cmd/**"]), this makes codeCommits stay -1 and suppresses drift warnings even when code changed and docs are stale, so the drift feature silently fails for valid custom configs.
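To illustrate how configured paths such as `cmd/**` might be matched against changed files, here is a deliberately small sketch. `matchesSubsystem` is a hypothetical helper that handles only a trailing `/**` plus single-segment patterns via `path.Match`; codemap's real routing matcher is not shown on this page and may support richer globs.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// matchesSubsystem reports whether file falls under any configured pattern.
// A trailing "/**" means "everything below this directory"; other patterns
// are passed to path.Match, whose "*" does not cross "/" boundaries.
func matchesSubsystem(file string, patterns []string) bool {
	for _, pat := range patterns {
		if strings.HasSuffix(pat, "/**") {
			dir := strings.TrimSuffix(pat, "/**")
			if strings.HasPrefix(file, dir+"/") {
				return true
			}
			continue
		}
		if ok, err := path.Match(pat, file); err == nil && ok {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(matchesSubsystem("cmd/hooks.go", []string{"cmd/**"}))
	fmt.Println(matchesSubsystem("watch/events.go", []string{"cmd/**"}))
}
```

Checking for `dir + "/"` rather than the bare prefix keeps `cmd/**` from matching a sibling directory such as `cmdx/`.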
| for _, docPath := range docPaths { | ||
| // Check how many commits ago the doc was last modified. |
Treat fallback doc names as alternatives, not required pairs
The fallback doc list (docs/<id>.md and docs/<ID>.md) is iterated as if both files must be fresh, so a missing variant triggers a stale warning even when the other variant exists and is current. This produces persistent false positives (for example docs/HOOKS.md present but docs/hooks.md missing), which can drown out real drift warnings and reduce trust in the hook output.
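Treating the fallback names as alternatives amounts to judging the subsystem by its freshest existing candidate. A minimal sketch, assuming a precomputed map of "commits ago" per path where -1 or absence means "not seen"; `freshestDoc` is a hypothetical helper.

```go
package main

import "fmt"

// freshestDoc returns the candidate with the smallest non-negative
// "commits ago" value, or "" and -1 when no candidate has any history.
// Earlier candidates win ties.
func freshestDoc(commitsAgo map[string]int, candidates []string) (string, int) {
	best, bestN := "", -1
	for _, c := range candidates {
		n, ok := commitsAgo[c]
		if !ok || n < 0 {
			continue // missing variant: skip it instead of warning
		}
		if bestN < 0 || n < bestN {
			best, bestN = c, n
		}
	}
	return best, bestN
}

func main() {
	ago := map[string]int{"docs/HOOKS.md": 2}
	doc, n := freshestDoc(ago, []string{"docs/hooks.md", "docs/HOOKS.md"})
	fmt.Println(doc, n)
}
```

A drift warning would then be raised only when no candidate is fresh, so a missing `docs/hooks.md` no longer produces a false positive while `docs/HOOKS.md` is current.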
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Use configured subsystem paths (routing.subsystems[].paths) in drift detection instead of only guessCodePaths (P1 from Copilot)
- Cache git log results per drift check to avoid redundant calls
- Cap WorkingSet to 50 files before persisting to state.json
- Fix multi-line instructions formatting in route suggestions
- Fix comment accuracy: intentSignal doc and check-deps condition

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
Phase 1 of turning codemap into a code-aware AI coding system — making codemap's existing intelligence machine-readable and actionable.
- … (`<!-- codemap:intent -->`, `<!-- codemap:routes -->`) alongside human-readable text
- `get_working_set` exposes session working set to MCP clients
- `subsystems[].instructions` field for domain-specific guidance injection

What the prompt-submit hook now outputs:

Files changed
- cmd/intent.go, cmd/drift.go, watch/workingset.go + tests
- cmd/hooks.go, config/config.go, mcp/main.go, watch/{daemon,events,types}.go, docs

Test plan
- `go test ./... -race`: all 9 packages pass
- `codemap .` output unchanged (backwards compatible)
- `codemap hook prompt-submit` tested with refactor/bugfix/explore prompts

🤖 Generated with Claude Code