Merged
31 commits
4d3446f
spike: raw Anthropic SDK for reliable tool execution (D30)
lunelson Apr 2, 2026
c846eaa
feat: replace claude agent sdk with raw anthropic sdk (D30)
lunelson Apr 2, 2026
e0e8e4b
docs: traceability for SDK migration slice (6b, FE-559)
lunelson Apr 2, 2026
3988a12
docs: spec D31 agent loop + plan slice 6c (pi-mono reference)
lunelson Apr 2, 2026
dd086bf
test: oracle coverage for A27 schema sync and observer code-fence str…
lunelson Apr 2, 2026
dbffecb
fix: enable thinking + tool_choice auto, add agent-tail for dev logs
lunelson Apr 2, 2026
0727c39
docs: handoff — SDK migration complete, live rendering regression open
lunelson Apr 2, 2026
92c8de0
dev tooling
lunelson Apr 2, 2026
5222f84
codex refactor of types and boundaries
lunelson Apr 3, 2026
e444b35
spike: core filesystem tools for ToolLoopAgent
lunelson Apr 3, 2026
8bd5356
plan, spec, and readme updates; mcp configs; small import fixes
lunelson Apr 3, 2026
bb8b605
interview workspace test
lunelson Apr 3, 2026
6c2a3b9
delete handoff; scope refactor
lunelson Apr 3, 2026
bfd18b3
igore .tours for codetours
lunelson Apr 3, 2026
db4cb56
skill updates, including assets path
lunelson Apr 3, 2026
cf5c89d
add ln skills comparison note
lunelson Apr 6, 2026
1a0f45c
docs: align ln skill family contracts
lunelson Apr 6, 2026
44d639d
fix augment skills link
lunelson Apr 6, 2026
de41e83
refactor: characterize and isolate client capability boundaries
lunelson Apr 6, 2026
2d75a27
refactor: split rich rendering from text-first transcript path
lunelson Apr 6, 2026
f1effd8
refactor: introduce a workspace data adapter
lunelson Apr 6, 2026
0c0801b
refactor: load workspace project and entities together
lunelson Apr 6, 2026
48df661
refactor: make workspace hydration policy explicit
lunelson Apr 6, 2026
76d3d8f
refactor: surface client mutation failures
lunelson Apr 6, 2026
11ed0bc
refactor: make render-sensitive client primitives pure
lunelson Apr 6, 2026
21803c8
refactor: split workspace controller
lunelson Apr 6, 2026
d1e5887
refactor: deepen workspace client boundaries
lunelson Apr 6, 2026
e105c79
test: deepen client workspace seam oracles
lunelson Apr 6, 2026
42ef3c9
refactor: close client performance boundaries
lunelson Apr 6, 2026
353e6a2
docs: realign interview mode model
lunelson Apr 7, 2026
5014c7c
feat: project live turn card before route invalidation
lunelson Apr 7, 2026
14 changes: 7 additions & 7 deletions .agents/skills/ln-build/SKILL.md
@@ -4,17 +4,17 @@ description: "Implement one scoped slice using TDD red-green-refactor. Use when
argument-hint: "[paste or reference a ln-scope card]"
---

# Dev Build
# Ln Build

Implement **one** slice. Beck's red-green-refactor, one cycle, no scope creep.

## Input

A scope card from `ln-scope`: $ARGUMENTS
A scope card from `ln-scope`, or one commit-sized step from `memory/REFACTOR.md`: $ARGUMENTS

The canonical path is `ln-scope` → `ln-build`. If no scope card exists, suggest `ln-scope` first. Accept a raw behavior description only for trivial changes where scoping would be ceremony.
The canonical path is `ln-scope` → `ln-build`. For refactors, `ln-refactor` may hand off one commit-sized step to implement. If neither a scope card nor a single refactor step exists, suggest `ln-scope` or `ln-refactor` first. Accept a raw behavior description only for trivial changes where scoping would be ceremony.

Extract: target behavior, boundary crossings, acceptance criteria, and **verification approach**.
Extract: target behavior, boundary crossings, acceptance criteria, and **verification approach**. For refactor steps, derive these from the selected commit step and existing tests before writing new code.

## Red

@@ -42,9 +42,9 @@ Run the project's verification harness. All checks must pass. Commit: `feat: [ta

After the slice lands and verification passes, do all of these before presenting routing options:

1. Mark the slice `done` in `memory/PLAN.md`. Check `## Dependencies` — if this slice unblocked multiple downstream slices, note them as newly available (some may be parallelizable)
2. Update assumption confidence in `memory/SPEC.md` §Assumptions — set validated assumptions to `**validated**`, invalidated ones to `**invalidated**` and flag implicated slices in PLAN.md
3. Add new invariants to `memory/SPEC.md` §Invariants — each structural property now protected by tests. Update `memory/PLAN.md` slice with `Invariants established: I#`
1. If working from `memory/PLAN.md`, mark the slice `done`. Check `## Dependencies` — if this slice unblocked multiple downstream slices, note them as newly available (some may be parallelizable). If working from `memory/REFACTOR.md`, mark the commit step complete there instead
2. Update `memory/SPEC.md` §Assumptions — set `Status` to `validated` or `invalidated` as evidence warrants, update `Confidence` if the evidence changed it, and flag implicated slices in PLAN.md
3. Add new invariants to `memory/SPEC.md` §Invariants — each structural property now protected by tests. If working from `memory/PLAN.md`, update the `Invariants established` field on the corresponding slice
4. Add any new decisions to `memory/SPEC.md` §Decisions, new assumptions to §Assumptions
5. Update `memory/SPEC.md` §Verification Design → Current Coverage with new test files and counts

7 changes: 5 additions & 2 deletions .agents/skills/ln-consult/SKILL.md
@@ -1,21 +1,24 @@
---
name: ln-consult
description: "Lightweight triage for the ln-* skill set. Use when unsure which dev skill to use next, starting work on something new, or when the user asks for guidance on their development process."
description: "Lightweight triage for the ln-* skill set. Use when unsure which ln skill to use next, starting work on something new, or when the user asks for guidance on their development process."
---

# Dev Consult
# Ln Consult

Assess where the user is and suggest one `ln-*` skill.

If context is unclear, ask **one** clarifying question — then recommend.

Canonical flow is usually `ln-grill → ln-spec → ln-plan → [ln-design when interface shape is uncertain] → [ln-oracles when verification strategy needs explicit design] → ln-scope → [ln-spike] → ln-build → ln-review → [ln-refactor] → [ln-sync]`.

## Routing table

| Situation | Suggest |
| -------------------------------------------------------- | ------------- |
| Idea is vague, needs fleshing out | `ln-grill` |
| Understanding exists, needs a written spec | `ln-spec` |
| Spec exists, needs a plan with slices | `ln-plan` |
| Plan/spec exists, needs explicit verification strategy | `ln-oracles` |
| Plan exists, next slice needs a scope card | `ln-scope` |
| Module interface needs exploration | `ln-design` |
| Scope card exists (from `ln-scope`), ready to code | `ln-build` |
2 changes: 1 addition & 1 deletion .agents/skills/ln-design/SKILL.md
@@ -4,7 +4,7 @@ description: "Explore radically different module shapes before committing to one
argument-hint: "[module or API boundary to explore]"
---

# Dev Design
# Ln Design

Apply Ousterhout's "Design It Twice": generate **3+ radically different module shapes**, compare on depth, and synthesize. The goal is deep modules — small API surfaces hiding significant complexity. Do not implement; this is purely about the shape of the boundary.

4 changes: 2 additions & 2 deletions .agents/skills/ln-grill/SKILL.md
@@ -3,7 +3,7 @@ name: ln-grill
description: "Interview the user relentlessly about a plan or design until reaching shared understanding. Use when fleshing out an idea, stress-testing a design, or when the user says \"grill me\"."
---

# Dev Grill
# Ln Grill

Walk the design tree branch by branch. Resolve dependencies between intents/desires/decisions, one by one. Be Socratic — question premises, not just requirements. The user's *why* matters as much or more than their *what*: knowing motivation lets you suggest alternatives they haven't considered.

@@ -23,7 +23,7 @@ When understanding is reached, present these options to the user (use `tool-ask-

| # | Label | Target | Why |
| --- | --------------- | ---------- | --------------------------------------- |
| 1 | Write a spec | `ln-spec` | Understanding is sufficient for a PRD |
| 1 | Write a spec | `ln-spec` | Understanding is sufficient for a spec |
| 2 | Plan slices | `ln-plan` | Problem is clear, needs slice breakdown |
| 3 | Scope one slice | `ln-scope` | One slice is already obvious |

9 changes: 6 additions & 3 deletions .agents/skills/ln-handoff/SKILL.md
@@ -4,7 +4,7 @@ description: "Capture volatile session state into a structured handoff document
argument-hint: "[optional: path for handoff file, default HANDOFF.md]"
---

# Dev Handoff
# Ln Handoff

Capture what lives in chat but not on disk. Git can reconstruct file changes. But a half-formed scope card, a spike 60% through its investigation, a plan discussion that hasn't hit `memory/PLAN.md` — those are gone on compaction.

@@ -17,7 +17,7 @@ The handoff must let a new thread act immediately without asking clarifying ques
Which `ln-*` skill was most recently active? Where in the flow is the work?

```
grill → spec → plan → scope → [spike] → build → review → [sync]
grill → spec → plan → [design] → [oracles] → scope → [spike] → build → review → [refactor] → [sync]
```

Be precise about state:
@@ -32,8 +32,11 @@ This is the critical step. Scan the conversation for volatile artifacts — info

- **Scope cards** from `ln-scope` — target behavior, boundary crossings, acceptance criteria
- **Plan drafts** from `ln-plan` — slice lists, ordering decisions, dependency reasoning not yet in `memory/PLAN.md`
- **Design outputs** from `ln-design` — alternative module shapes considered, the chosen shape, and rejected tradeoffs
- **Oracle design outputs** from `ln-oracles` — O/R/C assessment, selected oracle families, per-slice verification approaches, acknowledged blind spots, and whether slice verification design is complete / pending / stale relative to the code
- **Spike state** from `ln-spike` — the question, what was tried, partial findings, verdict if reached
- **Review findings** from `ln-review` — **ALL findings, not just the one being acted on.** Review debt is critical context. Name every finding, its status (addressed / in-progress / deferred), and any remaining implications. A fresh thread that only knows about the active finding will lose track of deferred review debt.
- **Refactor state** from `ln-refactor` — commit sequence, target structure, and any constraints on safe ordering
- **Grill insights** from `ln-grill` — constraints surfaced, decisions reached
- **Decisions and assumptions** discussed but not yet in `memory/SPEC.md`
- **Evidence that informed diagnoses** — concrete proof points (API responses, test output, log lines, specific data) that caused the investigation to shift direction or a hypothesis to be confirmed/rejected. Without this, a new thread inherits conclusions but not the reasoning, and may re-investigate or contradict settled evidence.
@@ -51,7 +54,7 @@ What IS on disk:

### 4. Produce handoff

Write structured markdown following `@resources/handoff-template.md`.
Write structured markdown following `./assets/handoff-template.md`.

Write to the path given as argument, or `HANDOFF.md` at the nearest workspace root. In a monorepo, this is the workspace (package) the session was working in — not the repository root. Determine the workspace from the files touched during the session: look for the nearest `package.json`, `Cargo.toml`, `go.mod`, or similar project marker up from the most-edited files.
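
The workspace lookup described above can be sketched as follows. This is only an illustration under stated assumptions: the function names `nearest_workspace_root` and `handoff_path` are hypothetical, "most-edited" is approximated by counting touched files per workspace, and the marker list covers only the three files the text names.

```python
from collections import Counter
from pathlib import Path
from typing import Iterable, Optional

# Project markers named in the text; "or similar" markers would be added here.
MARKERS = ("package.json", "Cargo.toml", "go.mod")


def nearest_workspace_root(start: Path, markers: Iterable[str] = MARKERS) -> Optional[Path]:
    """Walk upward from `start` and return the first directory holding a marker."""
    markers = tuple(markers)
    current = start if start.is_dir() else start.parent
    for directory in (current, *current.parents):
        if any((directory / marker).is_file() for marker in markers):
            return directory
    return None


def handoff_path(touched_files: Iterable[Path]) -> Optional[Path]:
    """Place HANDOFF.md at the root of the workspace owning the most touched files.

    Assumption: the number of touched files per workspace stands in for
    "most-edited"; a real session might weight by edit count instead.
    """
    roots = Counter()
    for touched in touched_files:
        root = nearest_workspace_root(touched)
        if root is not None:
            roots[root] += 1
    if not roots:
        return None
    return roots.most_common(1)[0][0] / "HANDOFF.md"
```

In a monorepo with a root `package.json` and a nested `packages/app/package.json`, files edited under `packages/app/src/` resolve to `packages/app`, not the repository root, which matches the rule stated above.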

Original file line number Diff line number Diff line change
@@ -13,16 +13,16 @@

- **Last completed skill**: `ln-<skill>` — [what it produced]
- **Current skill**: `ln-handoff` (or other if handoff is mid-skill)
- **Flow position**: `grill → spec → plan → scope → [spike] → build → review → [sync]`
- **Flow position**: `grill → spec → plan → [design] → [oracles] → scope → [spike] → build → review → [refactor] → [sync]`
- **Handoff trigger**: [why the handoff is happening]

## In-flight work

> CRITICAL: These artifacts exist only in the prior conversation, not on disk.
> Reproduce them here with full fidelity.

[Scope cards, spike verdicts, plan drafts, grill insights,
decisions — in their native format, not summarized]
[Scope cards, plan drafts, design alternatives, oracle designs (including verification state), spike verdicts,
refactor plans, grill insights, decisions — in their native format, not summarized]

### Review findings

@@ -75,7 +75,7 @@

Paste this into a new session:

> Read `HANDOFF.md` in the project root. It contains the full state of in-progress work.
> Read `HANDOFF.md` in the workspace root for this work area. It contains the full state of in-progress work.
> The immediate next step is: [first action from Next steps].
> Start by [specific instruction — e.g., "reviewing the scope card in the In-flight section and running ln-build"].
```
14 changes: 8 additions & 6 deletions .agents/skills/ln-oracles/SKILL.md
@@ -1,16 +1,18 @@
---
name: ln-oracles
description: "Design verification strategy: diagnose observability, select oracle families, map to loop tiers, surface blind spots. Use after ln-plan when slices need oracle design, or when verification coverage has drifted."
description: "Design verification strategy: diagnose observability, select oracle families, map to loop tiers, surface blind spots. Use after ln-plan when slices need oracle design — especially for LLM, visual, or compositional work — or when verification coverage has drifted."
argument-hint: "[slices to design oracles for, or 'all' for full reassessment]"
---

# Dev Oracles
# Ln Oracles

Design what proves the system works before choosing how to build it.

The best oracle removes the most bad degrees of freedom per unit time (Regehr). A system without feedback is open-loop -- it cannot correct errors (Wiener). Verification is first-class work, not accessory: second only to building the product itself. A slice without an oracle strategy is not scoped.

Read `@resources/diagnostic-framework.md` and `@resources/oracle-taxonomy.md` before starting.
Not every slice needs a full oracle-design pass. For trivial, purely structural slices, `ln-scope` may name the inner-loop checks directly. Use `ln-oracles` when the verification strategy itself is uncertain or materially shapes implementation order.

Read `./assets/diagnostic-framework.md` and `./assets/oracle-taxonomy.md` before starting.

## Input

@@ -24,7 +26,7 @@ This is an **interactive process** -- each step involves presenting analysis and

### 1. Diagnose

Score **Observability**, **Reproducibility**, and **Controllability** (see `@resources/diagnostic-framework.md`). Present the scoring table to the user with specific notes per dimension. Low scores constrain which oracle families are feasible and must be addressed before oracle selection proceeds.
Score **Observability**, **Reproducibility**, and **Controllability** (see `./assets/diagnostic-framework.md`). Present the scoring table to the user with specific notes per dimension. Low scores constrain which oracle families are feasible and must be addressed before oracle selection proceeds.

**Grill**: For each dimension scored below `high`, ask: is this a deliberate deferral, a blind spot, or something we should address now? What would change the score?

@@ -40,15 +42,15 @@ From SPEC.md invariant bundles, acceptance criteria, and PLAN.md slice definitio

### 3. Select oracle families

Using `@resources/oracle-taxonomy.md`, select families ranked by ROI for this project's verification needs. Apply the combination principle: the best oracle is a pair of independent artifacts. Prefer pairs when they compound; don't force them when a single oracle suffices.
Using `./assets/oracle-taxonomy.md`, select families ranked by ROI for this project's verification needs. Apply the combination principle: the best oracle is a pair of independent artifacts. Prefer pairs when they compound; don't force them when a single oracle suffices.

**Grill**: For each selected family, present: what it proves, what it costs, and what it misses. Ask the user which tradeoffs are acceptable given timeline and confidence levels.

### 4. Map to loop tiers

Assign each selected oracle to inner (ms, agent-autonomous), middle (seconds-minutes, regression/fitness), or outer (slow hardening). Apply verification economics: cheapest checks first, expensive checks less often.

**Boundary with ln-spec**: ln-spec owns the inner loop (verification commands, policy, fast automated checks). ln-oracles owns the middle and outer loops, plus strategic framing (diagnostic, stance, blind spots). When updating, preserve ln-spec's inner loop content and extend with middle/outer strategy.
**Boundary with ln-spec**: ln-spec owns project-wide inner-loop verification commands, policy, and fast automated checks. ln-oracles owns the middle and outer loops, plus strategic framing (diagnostic, stance, blind spots), and may recommend slice-specific inner-loop oracle families when they affect implementation strategy. When updating, preserve ln-spec's command/policy content and extend with middle/outer strategy.

**Grill**: For middle-loop oracles that require external resources (API calls, fixtures), ask: how will fixtures be created? What bootstraps ground truth? Is single-shot measurement sufficient or do we need multi-run variance?

21 changes: 11 additions & 10 deletions .agents/skills/ln-plan/SKILL.md
@@ -4,7 +4,7 @@ description: "Break a feature or project into vertical slices and update memory/
argument-hint: "[feature or project area to plan]"
---

# Dev Plan
# Ln Plan

Break a feature into tracer-bullet slices and spikes (Hunt & Thomas), grouped into temporal phases. Slices are thin end-to-end paths through all integration layers. Order by uncertainty first, dependency second (Reinertsen: retire risk early, not just finish tasks early).
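
One possible reading of "uncertainty first, dependency second" is sketched below: dependencies act as hard constraints, and among slices whose dependencies are already satisfied, the most uncertain one goes first. The function name `order_slices`, the integer uncertainty scores, and the slice names in the usage note are all hypothetical; the skill itself does not prescribe a scoring scheme.

```python
import heapq
from typing import Dict, List, Set


def order_slices(uncertainty: Dict[str, int], depends_on: Dict[str, Set[str]]) -> List[str]:
    """Order slices so dependencies come first and, among ready slices,
    the highest uncertainty score is taken first.

    Assumption: every dependency named in `depends_on` is also a key of
    `uncertainty`; unknown names are not handled here.
    """
    remaining = {name: set(depends_on.get(name, ())) for name in uncertainty}
    dependents: Dict[str, List[str]] = {name: [] for name in uncertainty}
    for name, deps in remaining.items():
        for dep in deps:
            dependents[dep].append(name)

    # heapq is a min-heap, so negate the score to pop the most uncertain slice.
    ready = [(-uncertainty[name], name) for name, deps in remaining.items() if not deps]
    heapq.heapify(ready)

    ordered: List[str] = []
    while ready:
        _, name = heapq.heappop(ready)
        ordered.append(name)
        for follower in dependents[name]:
            remaining[follower].discard(name)
            if not remaining[follower]:
                heapq.heappush(ready, (-uncertainty[follower], follower))

    if len(ordered) != len(uncertainty):
        raise ValueError("dependency cycle between slices")
    return ordered
```

With scores `{"auth": 1, "llm-loop": 3, "ui": 2, "render": 3}` and dependencies `ui -> auth`, `render -> llm-loop`, the risky `llm-loop` path is scheduled ahead of the low-risk `auth` path even though both are unblocked from the start.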

@@ -20,28 +20,29 @@ If context is thin, run a brief interview (not a full `ln-grill`) to fill gaps.

## Plan

**Mode detection.** If the user is inserting or reordering specific slices — not replanning from scratch — this is a **patch**. Read PLAN.md, make the targeted edits, then jump to the post-edit checklist (step 5).
**Mode detection.** If the user is inserting or reordering specific slices — not replanning from scratch — this is a **patch**. Read PLAN.md, make the targeted edits, then jump to the post-edit checklist (step 6).

1. If `memory/PLAN.md` exists, read it first. Retire completed slices (mark `done`). Assess what remains and what's changed.
2. Explore the codebase. Identify architectural constraints the slices must respect (routes, schema, auth, third-party boundaries).
3. Draft or revise phases and slices. Each slice must be independently demoable and independently grabbable where possible. Group into temporal phases. For each, name dependent requirements and assumptions from `memory/SPEC.md`.
4. Confirm with user — adjust granularity, reorder, split or merge.
5. **Post-edit checklist** — after any addition, removal, or reordering:
3. Draft or revise phases and slices. Each slice must be independently demoable and independently grabbable where possible. Group into temporal phases. For each, name dependent requirements and assumptions from `memory/SPEC.md`, plus any candidate invariant goals to establish or existing invariants to respect.
4. Observe and respect local project protocols for mapping slices/spikes to issues or tickets, associated codes, and branch naming conventions, if any. Capture project-specific tracking metadata as optional execution detail — not as the core identity of the slice.
5. Confirm with user — adjust granularity, reorder, split or merge.
6. **Post-edit checklist** — after any addition, removal, or reordering:
- Update the `## Dependencies` ASCII graph to reflect new/changed edges
- Update `### Parallelism opportunities` if new concurrent paths opened
- Verify every new slice names its requirements, assumptions, invariants to establish, and invariants to respect from SPEC.md
- Verify every new slice names its requirements, assumptions, candidate invariant goals, and invariants to respect from SPEC.md

## Output

Write or update `./memory/PLAN.md` following the template at `@resources/plan-template.md`.
Write or update `./memory/PLAN.md` following the template at `./assets/plan-template.md`.

### Traceability

Every slice and spike must name its dependent requirements and assumptions from `memory/SPEC.md`. This is the bridge between the two documents — invalidating an assumption in SPEC surfaces every slice it touches in PLAN.
Every slice and spike must name its dependent requirements and assumptions from `memory/SPEC.md`. Slices should also capture candidate invariant goals to establish or existing invariants to respect, and a verification approach when one is already known. This is the bridge between the two documents — invalidating an assumption in SPEC surfaces every slice it touches in PLAN.
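
The traceability bridge can be pictured as an inverted index from assumptions to the slices that name them. The sketch below is illustrative only: it assumes the plan has already been parsed into a mapping of slice IDs to assumption IDs, and both the function name `slices_by_assumption` and the IDs in the usage note are hypothetical placeholders, not the actual contents of `memory/PLAN.md`.

```python
from collections import defaultdict
from typing import Dict, List


def slices_by_assumption(plan: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Invert a slice -> assumptions mapping into assumption -> slices.

    Assumption: `plan` maps each slice ID to the assumption IDs it names;
    parsing PLAN.md into this shape is out of scope for the sketch.
    """
    index: Dict[str, List[str]] = defaultdict(list)
    for slice_id, assumptions in plan.items():
        for assumption_id in assumptions:
            index[assumption_id].append(slice_id)
    return dict(index)
```

For a plan such as `{"S1": ["A1"], "S2": ["A1", "A2"]}`, looking up `"A1"` in the result lists both `S1` and `S2`, which is the set of slices to flag if `A1` is invalidated in SPEC.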

## Routing

After writing the roadmap, present these options to the user (use `tool-ask-question`):
After writing the plan, present these options to the user (use `tool-ask-question`):

| # | Label | Target | Why |
| --- | ----------------- | ------------ | ----------------------------------------------- |
@@ -52,4 +53,4 @@ After writing the roadmap, present these options to the user (use `tool-ask-ques
Recommended: **1**

---
*Draws from [mattpocock/skills/prd-to-plan](https://github.com/mattpocock/skills/tree/main/prd-to-plan) and [mattpocock/skills/prd-to-issues](https://github.com/mattpocock/skills/tree/main/prd-to-issues).*
*Draws from [mattpocock/skills/prd-to-plan](https://github.com/mattpocock/skills/tree/main/prd-to-plan) and [mattpocock/skills/prd-to-issues](https://github.com/mattpocock/skills/tree/main/prd-to-issues), adapted toward a generic PLAN.md workflow rather than project-specific issue/branch bindings.*