Merged
42 commits
8c99970
feat(learning): phase A — core infrastructure for unified self-learni…
Apr 11, 2026
cb19840
refactor(knowledge): knowledge-persistence SKILL is now a format spec
Apr 11, 2026
4911a1b
refactor(commands): remove retrospective knowledge writers from imple…
Apr 11, 2026
5dbd8e3
chore(plugins): remove knowledge-persistence from write-only plugin m…
Apr 11, 2026
4d22fb3
feat(learning): HUD learning counts row — shows promoted knowledge en…
Apr 11, 2026
941e713
feat(learning): devflow learn --review + --purge-legacy-knowledge com…
Apr 11, 2026
c9ebd54
test(learning): HUD counts + review command + end-to-end integration …
Apr 11, 2026
d7476ae
fix(tests): update shell-hooks tests for per-type thresholds and qual…
Apr 11, 2026
af66e8f
test(learning): add decision and pitfall coverage to learn.test.ts
Apr 11, 2026
f930db1
docs: update CLAUDE.md and README.md for 4-type self-learning
Apr 11, 2026
73d3e07
docs: update self-learning.md to describe unified extractor architecture
Apr 11, 2026
b1134ca
fix: address self-review issues
Apr 11, 2026
b2efaf5
fix(v2): address Evaluator misalignments — D7 migration, threshold do…
Apr 11, 2026
764f11b
feat(learning): clean up orphan PROJECT-PATTERNS.md and extend --purg…
Apr 12, 2026
b8d0ba6
feat(learning): add citation sentence to SKILL.md, coder.md, reviewer…
Apr 12, 2026
d1460fe
feat(learning): add capacity constants and state helpers to json-help…
Apr 12, 2026
625ac7e
feat(learning): enforce hard ceiling at 100 with threshold notificati…
Apr 12, 2026
fedb11e
feat(learning): add citation usage scanner with stop-hook integration…
Apr 12, 2026
c54b86b
feat(learning): add HUD notification component for capacity alerts (D…
Apr 12, 2026
df18363
fix(security): harden knowledge-usage-scan against path traversal (CW…
Apr 12, 2026
53bc0f4
feat(learning): extend --review with capacity mode and add --dismiss-…
Apr 12, 2026
6921b4c
fix: address self-review issues
Apr 12, 2026
06d6557
fix: add missing D25 JSDoc and update D15 for hard-ceiling meaning
Apr 12, 2026
e68650d
fix(learning): include state files in --reset cleanup
Apr 12, 2026
f99588e
feat(migrations): add run-once migration registry for devflow init
Apr 12, 2026
0dd9e24
fix: address self-review issues
Apr 12, 2026
95ecd00
fix(security): harden writeFileAtomic against symlink TOCTOU in legac…
Apr 13, 2026
d5b879f
fix(hud): add runtime type guards for severity, JSON shape, and obser…
Apr 13, 2026
cf593b3
fix(security): harden learn.ts against unsafe JSON.parse and shell in…
Apr 13, 2026
8435914
docs: fix documentation accuracy for self-learning thresholds and rec…
Apr 13, 2026
299dacf
fix(commands): remove stale knowledge-write phases from teams variant…
Apr 13, 2026
ab20b47
fix(security): harden hook scripts against injection and resource abuse
Apr 13, 2026
74166ce
fix(knowledge-persistence): remove stale write-side references post-D…
Apr 13, 2026
cdec1cd
fix(init): harden migration runner and fix install ordering regression
Apr 13, 2026
595d1a9
test: fix three test quality issues (r9-test-improvements)
Apr 13, 2026
6c9cc88
refactor: simplifier polish on resolver fixes
Apr 13, 2026
ed59ce0
fix(hooks): unify lock helper naming and document timeout contracts
Apr 13, 2026
9028ac3
test: add integration seam + knowledge-usage-scan security coverage
Apr 13, 2026
3484a57
fix: extract fs-atomic + notifications-shape helpers, remove D35 coll…
Apr 13, 2026
ba1941b
refactor: extract migration reporter, split isRawObservation, add gua…
Apr 13, 2026
d89552d
refactor: fix stale D34 lock-staleness claims and remove dead test he…
Apr 13, 2026
95b8df2
test: fix EPIPE race in DEVFLOW_BG_UPDATER guard tests
Apr 13, 2026
11 changes: 7 additions & 4 deletions CLAUDE.md
@@ -42,12 +42,14 @@ Commands with Teams Variant ship as `{name}.md` (parallel subagents) and `{name}

**Ambient Mode**: Three-layer architecture for always-on intent classification. SessionStart hook (`session-start-classification`) reads lean classification rules (`~/.claude/skills/devflow:router/references/classification-rules.md`, ~30 lines) and injects as `additionalContext` — once per session, deterministic, zero model overhead. UserPromptSubmit hook (`preamble`) injects a one-sentence prompt per message triggering classification + router loading via Skill tool. Router SKILL.md is a pure skill lookup table (~50 lines) loaded on-demand only for GUIDED/ORCHESTRATED depth — maps intent×depth to domain and orchestration skills. Toggleable via `devflow ambient --enable/--disable/--status` or `devflow init`.

**Self-Learning**: A SessionEnd hook (`session-end-learning`) accumulates session IDs and triggers a background `claude -p --model sonnet` every 3 sessions (5 at 15+ observations) to detect repeated workflows and procedural knowledge from batch transcripts. Observations accumulate in `.memory/learning-log.jsonl` with confidence scores, temporal decay, and daily run caps. When confidence thresholds are met (5 observations with 7-day temporal spread for both workflow and procedural types), artifacts are auto-created as slash commands (`.claude/commands/self-learning/`) or skills (`.claude/skills/{slug}/`). Loaded artifacts are reinforced locally (no LLM) on each session end. Single toggle mechanism: hook presence in `settings.json` IS the enabled state — no `enabled` field in `learning.json`. Toggleable via `devflow learn --enable/--disable/--status` or `devflow init --learn/--no-learn`. Configurable model/throttle/caps/debug via `devflow learn --configure`. Use `devflow learn --reset` to remove all artifacts + log + transient state. Use `devflow learn --purge` to remove invalid observations. Debug logs stored at `~/.devflow/logs/{project-slug}/`.
**Self-Learning**: A SessionEnd hook (`session-end-learning`) accumulates session IDs and triggers a background `claude -p --model sonnet` every 3 sessions (5 at 15+ observations) to detect **4 observation types** — workflow, procedural, decision, and pitfall — from batch transcripts. Transcript content is split into two channels by `scripts/hooks/lib/transcript-filter.cjs`: `USER_SIGNALS` (plain user messages, feeds workflow/procedural detection) and `DIALOG_PAIRS` (prior-assistant + user turns, feeds decision/pitfall detection). Detection uses per-type linguistic markers and quality gates stored in each observation as `quality_ok`. Per-type thresholds govern promotion (workflow: 3 required; procedural: 4 required; decision/pitfall: 2 required), each with independent temporal spread requirements. Observations accumulate in `.memory/learning-log.jsonl`; their lifecycle is `observing → ready → created → deprecated`. When thresholds are met, `json-helper.cjs render-ready` renders deterministically to 4 targets: slash commands (`.claude/commands/self-learning/`), skills (`.claude/skills/{slug}/`), decisions.md ADR entries, and pitfalls.md PF entries. A session-start feedback reconciler (`json-helper.cjs reconcile-manifest`) checks the manifest at `.memory/.learning-manifest.json` against the filesystem to detect deletions (applies 0.3× confidence penalty) and edits (ignored per D13). Loaded artifacts are reinforced locally (no LLM) on each session end. Single toggle mechanism: hook presence in `settings.json` IS the enabled state — no `enabled` field in `learning.json`. Toggleable via `devflow learn --enable/--disable/--status` or `devflow init --learn/--no-learn`. Configurable model/throttle/caps/debug via `devflow learn --configure`. Use `devflow learn --reset` to remove all artifacts + log + transient state. Use `devflow learn --purge` to remove invalid observations. Use `devflow learn --review` to inspect observations needing attention. 
Debug logs stored at `~/.devflow/logs/{project-slug}/`. The `knowledge-persistence` skill is a format specification only; the actual writer is `scripts/hooks/background-learning` via `json-helper.cjs render-ready`.
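
As a rough illustration of the promotion rule described above, the following TypeScript sketch encodes the per-type thresholds and the 0.3× deletion penalty. The `Observation` shape, the `count` field, and both function names are hypothetical; only the threshold values, the `quality_ok` flag, the lifecycle states, and the penalty factor come from the text. Temporal-spread checks are omitted because their values are not stated here.

```typescript
// Illustrative sketch only: the types and function names are assumptions,
// not the actual json-helper.cjs implementation.
type ObservationType = "workflow" | "procedural" | "decision" | "pitfall";
type Status = "observing" | "ready" | "created" | "deprecated";

interface Observation {
  type: ObservationType;
  count: number; // hypothetical field: how many times the pattern was seen
  quality_ok: boolean; // quality-gate result stored on the observation
  confidence: number;
  status: Status;
}

// Required observation counts per type, as listed in the paragraph above.
const REQUIRED_COUNT: Record<ObservationType, number> = {
  workflow: 3,
  procedural: 4,
  decision: 2,
  pitfall: 2,
};

// Assumption: an observation is promotable only while it is still "observing",
// has passed its quality gate, and has met its type's count threshold.
// Temporal-spread requirements are left out because the text gives no values.
function isPromotable(obs: Observation): boolean {
  return (
    obs.status === "observing" &&
    obs.quality_ok &&
    obs.count >= REQUIRED_COUNT[obs.type]
  );
}

// Deleting a rendered artifact multiplies the observation's confidence by 0.3.
function applyDeletionPenalty(obs: Observation): Observation {
  return { ...obs, confidence: obs.confidence * 0.3 };
}
```

Under these assumptions, a decision seen twice with `quality_ok: true` would be promotable, while a procedural observation with the same count would keep accumulating until it reaches 4.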

**Claude Code Flags**: Typed registry (`src/cli/utils/flags.ts`) for managing Claude Code feature flags (env vars and top-level settings). Pure functions `applyFlags`/`stripFlags`/`getDefaultFlags` follow the `applyTeamsConfig`/`stripTeamsConfig` pattern. Initial flags: `tool-search`, `lsp`, `clear-context-on-plan` (default ON), `brief`, `disable-1m-context` (default OFF). Manageable via `devflow flags --enable/--disable/--status/--list`. Stored in manifest `features.flags: string[]`.

**Two-Mode Init**: `devflow init` offers Recommended (sensible defaults, quick setup) or Advanced (full interactive flow) after plugin selection. `--recommended` / `--advanced` CLI flags for non-interactive use. Recommended applies: ambient ON, memory ON, learn ON, HUD ON, teams OFF, default-ON flags, .claudeignore ON, auto-install safe-delete if trash CLI detected, user-mode security deny list.

**Migrations**: Run-once migrations execute automatically on `devflow init`, tracked at `~/.devflow/migrations.json` (scope-independent; single file regardless of user-scope vs local-scope installs). Registry: append an entry to `MIGRATIONS` in `src/cli/utils/migrations.ts`. Scopes: `global` (runs once per machine, no project context) vs `per-project` (sweeps all discovered Claude-enabled projects in parallel). Failures are non-fatal — migrations retry on next init. **D37 edge case**: a project cloned *after* migrations have run won't be swept (the marker is global, not per-project). Recovery: `rm ~/.devflow/migrations.json` forces a re-sweep on next `devflow init`.
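
The run-once behaviour described above can be sketched roughly as follows. This is a minimal in-memory model, not the contents of `src/cli/utils/migrations.ts`: the `Migration` interface, the `runPending` function, and the use of a `Set` in place of `~/.devflow/migrations.json` are all assumptions made for illustration, and the per-project sweep runs sequentially here even though the text says the real one runs in parallel.

```typescript
// Illustrative sketch only: shapes and names are hypothetical.
type Scope = "global" | "per-project";

interface Migration {
  id: string;
  scope: Scope;
  run: (projectDir?: string) => void;
}

// Runs every migration whose id is not yet recorded in `applied`.
// `applied` stands in for the single global marker file.
// Returns the ids that failed; they stay unrecorded and retry on the next call.
function runPending(
  migrations: Migration[],
  applied: Set<string>,
  projects: string[],
): string[] {
  const failed: string[] = [];
  for (const m of migrations) {
    if (applied.has(m.id)) continue; // already ran once on this machine
    try {
      if (m.scope === "global") {
        m.run();
      } else {
        for (const dir of projects) m.run(dir); // sweep known projects
      }
      applied.add(m.id); // mark as done only after success
    } catch {
      failed.push(m.id); // non-fatal: left unmarked so it retries later
    }
  }
  return failed;
}
```

The sketch also reproduces the D37 edge case: because the marker records only the migration id, a project that appears in `projects` after the id has been added to `applied` is never swept unless the marker is cleared.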

## Project Structure

```
@@ -113,11 +115,12 @@ Working memory files live in a dedicated `.memory/` directory:
├── .learning-session-count # Session IDs pending batch (one per line)
├── .learning-batch-ids # Session IDs for current batch run
├── .learning-notified-at # New artifact notification marker (epoch timestamp)
├── .learning-manifest.json # Rendered artifact manifest — reconciled at session-start for feedback loop
├── .pending-turns.jsonl # Queue of captured user/assistant turns (JSONL, ephemeral)
├── .pending-turns.processing # Atomic handoff during background processing (transient)
└── knowledge/
    ├── decisions.md # Architectural decisions (ADR-NNN, append-only)
    └── pitfalls.md # Known pitfalls (PF-NNN, area-specific gotchas)
    ├── decisions.md # Architectural decisions (ADR-NNN, append-only) — written by background-learning extractor via render-ready
    └── pitfalls.md # Known pitfalls (PF-NNN, area-specific gotchas) — written by background-learning extractor via render-ready

~/.devflow/logs/{project-slug}/
├── .learning-update.log # Background learning agent log
@@ -162,7 +165,7 @@ Working memory files live in a dedicated `.memory/` directory:
- 3-tier system: Foundation (shared patterns), Specialized (auto-activate), Domain (language/framework)
- Each skill has one non-negotiable **Iron Law** in its `SKILL.md`
- Target: ~120-150 lines per SKILL.md with progressive disclosure to `references/`
- Skills default to read-only (`allowed-tools: Read, Grep, Glob`); exceptions: git/review skills add `Bash`, interactive skills add `AskUserQuestion`, `knowledge-persistence`/`quality-gates` add `Write` for state persistence, and `router` omits `allowed-tools` entirely (unrestricted, as the main-session orchestrator)
- Skills default to read-only (`allowed-tools: Read, Grep, Glob`); exceptions: git/review skills add `Bash`, interactive skills add `AskUserQuestion`, `quality-gates` adds `Write` for state persistence, and `router` omits `allowed-tools` entirely (unrestricted, as the main-session orchestrator)
- All skills live in `shared/skills/` — add to plugin `plugin.json` `skills` array, then `npm run build`

### Agents
4 changes: 2 additions & 2 deletions README.md
@@ -44,7 +44,7 @@ Devflow: IMPLEMENT/ORCHESTRATED

**Memory that persists.** Session context survives restarts, `/clear`, and context compaction. Your AI picks up exactly where it left off. Architectural decisions and known pitfalls accumulate in `.memory/knowledge/` and inform every future session. No manual bookkeeping.

**It learns how you work.** A self-learning mechanism detects repeated workflows and procedural patterns across sessions, then creates reusable slash commands and skills automatically.
**It learns how you work.** A self-learning mechanism detects 4 observation types across sessions — workflow patterns, procedural knowledge, architectural decisions, and recurring pitfalls. Workflow and procedural observations create reusable slash commands and skills automatically. Decisions and pitfalls are written directly to `.memory/knowledge/decisions.md` and `.memory/knowledge/pitfalls.md` — informing every future review and implementation session.

**18 parallel code reviewers.** Security, architecture, performance, complexity, consistency, regression, testing, and more. Each produces findings with severity, confidence scoring, and concrete fixes. Conditional reviewers activate when relevant (TypeScript for `.ts` files, database for schema changes). Every finding gets validated and resolved automatically.

@@ -108,7 +108,7 @@ npx devflow-kit init # Install (interactive wizard)
npx devflow-kit init --plugin=implement # Install specific plugin
npx devflow-kit list # List available plugins
npx devflow-kit ambient --enable # Toggle ambient mode
npx devflow-kit learn --enable # Toggle self-learning
npx devflow-kit learn --enable # Toggle self-learning (4-type extraction: workflow, procedural, decision, pitfall)
npx devflow-kit uninstall # Remove Devflow
```

7 changes: 6 additions & 1 deletion docs/reference/skills-architecture.md
@@ -20,7 +20,6 @@ Shared patterns used by multiple agents.
| `patterns` | CRUD, API endpoints, events, config, logging | Coder, Resolver |
| `agent-teams` | Agent Teams patterns for peer-to-peer collaboration, debate, consensus | /code-review, /implement, /debug, /plan |
| `router` | Intent classification and proportional skill loading for Devflow mode (unrestricted tools — orchestrator) | Ambient UserPromptSubmit hook |
| `knowledge-persistence` | Record/load architectural decisions and pitfalls to `.memory/knowledge/` | /implement, /code-review, /resolve, /debug, /plan, /self-review |
| `qa` | Scenario-based acceptance testing methodology, evidence collection | Tester |

### Tier 1b: Pattern Skills
@@ -67,6 +66,12 @@ Language and framework patterns. Referenced by agents via frontmatter and condit
| `java` | Records, sealed classes, composition, modern Java | Java codebases |
| `rust` | Ownership, borrowing, error handling, type-driven design | Rust codebases |

### Format-Spec Skills (Not Plugin-Distributed)

Some skills exist in `shared/skills/` but are not distributed to any plugin. They serve as on-disk format specifications consumed by background processes, not by agents or commands.

- **knowledge-persistence** — Format spec for `.memory/knowledge/decisions.md` and `pitfalls.md` (entry format, lock protocol, capacity limits). Consumed by `scripts/hooks/background-learning` via `json-helper.cjs render-ready`. Not distributed to plugins per D9.
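
To make the idea of an on-disk format spec more concrete, here is a small sketch of how a writer might derive the next entry identifier for `decisions.md` or `pitfalls.md`. The helper name `nextEntryId` and the three-digit zero padding are assumptions inferred from the `ADR-NNN` / `PF-NNN` notation; the real entry format, lock protocol, and capacity limits are defined by the skill itself and are not reproduced here.

```typescript
// Illustrative sketch only: helper name and padding width are assumptions.
type Prefix = "ADR" | "PF";

// Scans existing file content for ids such as "ADR-007" and returns the
// next id in sequence, zero-padded to three digits.
function nextEntryId(fileContent: string, prefix: Prefix): string {
  const pattern = new RegExp(`\\b${prefix}-(\\d+)\\b`, "g");
  let max = 0;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(fileContent)) !== null) {
    max = Math.max(max, Number(match[1]));
  }
  return `${prefix}-${String(max + 1).padStart(3, "0")}`;
}
```

Taking the maximum existing number rather than counting entries keeps ids unique even if an earlier entry was removed, which fits the append-only note on `decisions.md`.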

## How Skills Activate

Skills activate through two guaranteed mechanisms: