Inject site-agent context into worktrees on creation #45

@chubes4

Summary

When wp datamachine-code workspace worktree add runs from a WordPress site that has Data Machine (DM) installed and an active agent identity, optionally inject that agent's persistent context (memory + identity files) into the new worktree as runtime-readable, local-only files. An agent session cooking inside the worktree then starts with the same context the originating site has.

Why this matters now

This gap surfaced concretely on 2026-04-21 when a fresh agent session was spawned in a data-machine worktree to implement a substrate refactor (Extra-Chill/data-machine#1143). The session did the core work cleanly but hit a homeboy test environment failure it had no documented path to resolve. Reviewing the trace, the underlying issue wasn't the test env — it was that the spawned session had only the prompt + the repo's auto-loaded files for context. None of the originating site's accumulated architectural context (MEMORY.md, USER.md, etc.) traveled to the worktree.

This is structural, not incidental:

  • DM's memory model is per-site-per-agent — files live under wp-content/uploads/datamachine-files/agents/<slug>/
  • A worktree at ~/Developer/<repo>@<branch-slug> is not a WordPress site — no DM installation, no agent resolver, no memory layer
  • Whatever the agent runtime auto-loads from the worktree cwd (CLAUDE.md, AGENTS.md, README) is repo-source contributor docs, not agent context
  • The originating site's context is reachable only via explicit studio wp datamachine agent read calls, and the spawned session has to know to make them

This becomes a sharper gap once Data Machine's repo AGENTS.md is deprecated (planned). With both that file and site memory absent, a worktree session has effectively zero auto-loaded context beyond its prompt.

Proposal

Extend workspace worktree add with a context-injection step:

wp datamachine-code workspace worktree add <repo> <branch> [--no-inject-context]

Default behavior: inject. Flag exists as escape hatch.
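
The default-on rule can be sketched as a small predicate. This is only an illustration in Python, not DMC code: SiteContext and should_inject are hypothetical names, and the sketch assumes "site context" means a resolvable site URL plus an active agent slug.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SiteContext:
    """Hypothetical description of the site the command was invoked from."""
    site_url: str
    agent_slug: str


def should_inject(no_inject_context: bool, site: Optional[SiteContext]) -> bool:
    """Inject by default, but only when a site with an active agent exists."""
    if no_inject_context:
        return False  # explicit escape hatch (--no-inject-context)
    if site is None or not site.agent_slug:
        return False  # outside a site context: no injection, no error
    return True
```

Keeping the check separate from the file writes makes the "no injection, no error" path outside a site context easy to verify.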

What gets written

For a worktree at <workspace>/<repo>@<branch-slug>/:

<worktree>/
├── .claude/
│   └── CLAUDE.local.md       ← Claude Code reads this; gitignored per-checkout
├── .opencode/
│   └── AGENTS.local.md       ← OpenCode reads this; gitignored per-checkout
└── .git/
    └── info/
        └── exclude            ← appended with the two paths above

Both files contain the same payload — runtime-agnostic.
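
A minimal sketch of the write step, assuming the payload has already been rendered to a string. The function name write_injected_files is hypothetical; only the two relative paths come from the layout above.

```python
from pathlib import Path

# Relative paths taken from the layout above.
INJECTED_FILES = (".claude/CLAUDE.local.md", ".opencode/AGENTS.local.md")


def write_injected_files(worktree: Path, payload: str) -> list[Path]:
    """Write the same payload to every runtime-specific location."""
    written = []
    for rel in INJECTED_FILES:
        target = worktree / rel
        # Create .claude/ or .opencode/ if the repo does not ship them.
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(payload, encoding="utf-8")
        written.append(target)
    return written
```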

Payload shape

# Injected context from <site-name>

This worktree was created from the <site-name> WordPress site
on <ISO-timestamp>. The agent that created it has the following
persistent context:

## MEMORY.md
<contents of site's MEMORY.md at injection time>

## USER.md
<contents of site's USER.md>

## RULES.md
<contents of site's RULES.md if non-empty>

## Fetching fresher context
The source site has `studio wp` available. Run:
  studio wp datamachine agent read MEMORY.md
  studio wp datamachine agent search <term>
to pull updates that accumulated after this worktree was created.

## Source site
- Slug: <agent-slug>
- Site URL: <site-url>
- Studio path: <studio-path>

Ordering matters: place the injected context at the top of each file, and do so without conflicting with the semantics of the runtime's standard file locations. CLAUDE.local.md is the convention Claude Code recognizes for local-only context; AGENTS.local.md is the analogous OpenCode pattern.
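
One way to assemble the payload shape above, as a hedged Python sketch. render_payload and its parameters are hypothetical; the section order and wording follow the template, and the only conditional section is RULES.md, which the template marks "if non-empty".

```python
from datetime import datetime, timezone


def render_payload(site_name, agent_slug, site_url, studio_path, files, created_at=None):
    """Build the injected payload; `files` maps a file name to its contents."""
    created_at = created_at or datetime.now(timezone.utc)
    lines = [
        f"# Injected context from {site_name}",
        "",
        f"This worktree was created from the {site_name} WordPress site",
        f"on {created_at.isoformat()}. The agent that created it has the following",
        "persistent context:",
        "",
    ]
    for name in ("MEMORY.md", "USER.md", "RULES.md"):
        body = files.get(name, "").strip()
        if name == "RULES.md" and not body:
            continue  # RULES.md is only included when non-empty
        lines += [f"## {name}", body, ""]
    lines += [
        "## Fetching fresher context",
        "The source site has `studio wp` available. Run:",
        "  studio wp datamachine agent read MEMORY.md",
        "  studio wp datamachine agent search <term>",
        "to pull updates that accumulated after this worktree was created.",
        "",
        "## Source site",
        f"- Slug: {agent_slug}",
        f"- Site URL: {site_url}",
        f"- Studio path: {studio_path}",
        "",
    ]
    return "\n".join(lines)
```

Passing created_at explicitly keeps the output deterministic, which is convenient when a refresh needs to compare a new payload with the existing snapshot.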

Per-checkout gitignore (no repo pollution)

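Use .git/info/exclude to ignore the injected files per-checkout rather than touching the tracked .gitignore. One caveat: in a linked worktree, .git is a file rather than a directory, and Git reads info/exclude from the repository's shared common directory, so the implementation should resolve the path with git rev-parse --git-path info/exclude. The patterns only match the injected file names, so sharing them across checkouts is harmless. This means: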

  • The repo is unaffected — no commit dirties the tracked gitignore
  • Other worktrees + the primary checkout don't see these files at all
  • The exclusion is invisible to anyone cloning fresh
  • Each worktree manages its own injected context independently
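
The idempotent append can be sketched as follows. ensure_excluded is a hypothetical helper; it assumes the caller supplies the path to the exclude file (for example, as resolved by git rev-parse --git-path info/exclude) and compares whole lines, so rerunning it adds nothing.

```python
from pathlib import Path

EXCLUDE_PATTERNS = (".claude/CLAUDE.local.md", ".opencode/AGENTS.local.md")


def ensure_excluded(exclude_file: Path, patterns=EXCLUDE_PATTERNS) -> list[str]:
    """Append any missing patterns to the exclude file; return those added."""
    text = exclude_file.read_text(encoding="utf-8") if exclude_file.exists() else ""
    existing = {line.strip() for line in text.splitlines()}
    missing = [p for p in patterns if p not in existing]
    if not missing:
        return []  # already excluded: rerun is a no-op
    exclude_file.parent.mkdir(parents=True, exist_ok=True)
    # Avoid gluing the first new pattern onto an unterminated last line.
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    with exclude_file.open("a", encoding="utf-8") as fh:
        fh.write(prefix + "\n".join(missing) + "\n")
    return missing
```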

Refresh semantics

Injected context is a snapshot at worktree-creation time. Two ways to keep it current:

  1. Manual refresh: wp datamachine-code workspace worktree refresh-context <handle> re-reads memory files from the source site and rewrites the injected files.
  2. In-session pull: the injected payload tells the agent how to call studio wp datamachine agent read for updates. There is no automation; the agent decides when to refresh.

v1 ships #1 only. #2 is a documentation note in the payload, not infrastructure.

Site identity tracking

Worktree metadata (already stored by DMC) gains a created_from_site field recording the originating site's URL + agent slug. Refresh operations resolve back to it. If the originating site is no longer reachable (for example, the worktree now lives on a different machine, or the site was deleted), refresh fails gracefully and the existing snapshot stays.
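
A sketch of the "fail gracefully, keep the snapshot" behavior, under stated assumptions: refresh_context, SiteUnreachable, and the fetch_payload callback are hypothetical stand-ins for however DMC actually reads from the originating site, and created_from_site is modeled as a plain dict.

```python
from pathlib import Path
from typing import Callable

INJECTED_FILES = (".claude/CLAUDE.local.md", ".opencode/AGENTS.local.md")


class SiteUnreachable(Exception):
    """Hypothetical error raised when the originating site cannot be read."""


def refresh_context(worktree: Path, created_from_site: dict,
                    fetch_payload: Callable[[dict], str]) -> bool:
    """Rewrite injected files from the source site; keep the snapshot on failure."""
    try:
        # Fetch first, so nothing is touched if the site is gone.
        payload = fetch_payload(created_from_site)
    except SiteUnreachable:
        return False  # existing snapshot stays as-is
    for rel in INJECTED_FILES:
        target = worktree / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(payload, encoding="utf-8")
    return True
```

Fetching the full payload before writing anything means a failed refresh never leaves the two files out of sync with each other.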

Why DMC is the right layer

  • DMC already owns worktree creation
  • DMC already has the access path to memory files via DM's agent abilities (agent paths, agent read)
  • DMC already manages workspace metadata (per-binding state, per-worktree paths)
  • The injection is a small composition on top of capabilities that already exist
  • No new permissions surface — DMC reads from DM the same way it does today for other operations

Acceptance criteria

  • workspace worktree add accepts --no-inject-context flag (default: inject when invoked from a site context)
  • On injection, <worktree>/.claude/CLAUDE.local.md and <worktree>/.opencode/AGENTS.local.md are written with the merged payload
  • <worktree>/.git/info/exclude is appended (idempotent — no duplicate entries on rerun)
  • Worktree metadata records created_from_site (URL + agent slug + timestamp)
  • workspace worktree refresh-context <handle> rewrites injected files from current site state
  • When invoked outside a site context (e.g. directly via shell on a non-WP machine), behavior unchanged — no injection, no error
  • Existing worktrees (created before this PR) are unaffected; refresh-context can be called against them to inject retroactively
  • Documentation update describing what gets injected, gitignore semantics, refresh flow

Out of scope

  • Cross-machine memory federation (worktree on machine A, source site on machine B). Defer until there's a real use case.
  • Auto-refresh on a schedule. Manual + in-session pull is enough for v1.
  • Selective injection (for example, injecting MEMORY.md but not USER.md). Ship the full payload; selectivity is a follow-up if needed.
  • Differential injection (only inject memory written since worktree creation). Snapshot-on-create + manual refresh is simpler and sufficient.
  • Two-way sync (edits to injected files flow back to site). Injected files are read-only-by-convention; the source of truth stays site-side.

Related context

  • DM AGENTS.md deprecation (no public issue yet) makes this pattern more important — without the repo's auto-loaded conventions doc, the worktree starts with even less context
  • Extra-Chill/data-machine#1143 (LinkGraph extensibility: make extractor + resolver pluggable via filters): the concrete cook that surfaced this gap during execution
  • Companion: documenting repo-local dev environment requirements (homeboy test prerequisites etc.) lives elsewhere — that's repo-hygiene, this issue is cross-site context plumbing
