An AI staff engineer for your software project. It picks up tickets, writes code, reviews its own work, opens pull requests, responds to reviewer comments, and drives changes all the way up to the point of merge. It stops there and hands the decision to you. GitHub is the only tracker with a working implementation today; the config and bootstrap interview also accept Jira, Linear, and GitLab, but those backends are placeholders that throw NotSupportedError until their real implementations land (see the Current implementation status table below). It reads your project's config once so every action it takes is consistent with how your team already operates.
Sets itself up. On first run it asks you a short set of questions (release cadence, team size, which tracker hosts your tickets, e2e setup, compliance context) and writes a single config file. Everything downstream reads from that file, so the agent behaves predictably across sessions.
Drives the full issue-to-PR loop. You point it at an issue. It creates a branch, implements the change, runs the full local review chain (format, lint, type-check, unit, integration, e2e where applicable), delegates code review on its own work to @ctxr/skill-code-review, opens a PR with reviewers requested and the right template applied, moves the issue to "In review", and stops. You merge.
Iterates on review comments until the PR is green. After the PR is open, it requests an external reviewer (e.g. GitHub Copilot), polls CI and review threads, classifies each unresolved comment as stale, actionable, or suggestion, fixes the actionable ones, commits, pushes, resolves what it addressed, re-requests review, and loops until three conditions hold: local review is still green, every thread is resolved, CI is green on the current commit.
Adapts when your project changes. Tell it in plain English: "we handle PHI now", "we added a Chrome extension target", "we're migrating from Jira to Linear", "dropped the legacy analytics SDK". It produces a preview diff across your config, your label taxonomy, your issue templates, your internal rules, and its own memory. Nothing is written until you approve.
Keeps plan files honest. Plan files under .claude/plans/ get moved through todo → in-progress → in-review → done folders automatically, stay in the right checkbox format, and have their one-line status (the [ ] / [x] marker and the frontmatter) kept in sync with the linked tracker issue. If a plan says [x] but the tracker says "In progress", the local checkbox is rewritten to match the tracker. The plan body stays yours.
Tracks releases. Release umbrella issues have their status computed automatically from their linked dev issues. The umbrella body gets a live summary (2 Done / 1 In review / 3 In progress / 6 Backlog) plus a blocker list. The agent auto-moves the umbrella through Backlog → In progress, but never to Done (that's you).
Triages regressions. When you report a bug, it looks up historical closed issues by commit, file path, area label, or title keyword, and proposes: reopen the original, file a new linked bug, or investigate further. It writes a regression report with its reasoning so you can audit the decision.
Reconciles label taxonomy. When your label set changes (you added a new area/* or renamed priority/*), it computes an add / edit / deprecate plan and applies it after you approve.
These are non-negotiable, baked into every skill:
- It never merges a PR. Period.
- It never sets a dev issue to
Done. Period.
These two decisions are always yours.
- Your tracker is the single source of truth. Configured in
trackers.*. When a plan file's status marker conflicts with the tracker (e.g. plan says[x]but the tracker issue is "In progress"), the plan file's status line is rewritten to match the tracker. Scope of this rule, specifically: plan-file one-liners and frontmatter under.claude/plans/. It does NOT rewrite your source code, your README, yourCLAUDE.md, yourops.config.json, the body of a plan file, your reports under.development/shared/, or anything the tracker itself holds. The rule stops at "status markers in local plan files". - No em or en dashes in anything the agent writes: issue bodies, PR descriptions, comments, plan files. Uses commas, colons, parentheses, or line breaks.
- Never runs destructive git commands. No
reset --hard, no--force, no discarding uncommitted changes. - Never amends commits. Every change is a new commit so history stays reviewable.
- Runs the full local review loop before pushing. If any gate is red, it fixes and re-runs before it pushes.
- Halts and asks when state is ambiguous (weird branch, orphan PR, contradictory status) rather than guessing.
- Never touches your project's
daily/orknowledge/folders. Those belong to your own tooling. - Never writes above the managed block in your
CLAUDE.md. Your hand-written content is preserved byte-for-byte across agent updates.
The agent targets four tracker kinds. GitHub, Linear, GitLab, and Jira all have real implementations for issues and labels; review iteration is full on GitHub, partial on GitLab, and not applicable on Linear / Jira.
| Capability | GitHub | Linear | GitLab | Jira |
|---|---|---|---|---|
| Open PRs, iterate on review comments | works | stub (no native PR) | partial (poll+resolve, no requestReview) | stub (no native PR) |
| Status moves, comments, issue create | works | works | works | works |
| Label taxonomy reconcile | works | works | works | works (advisory: Jira labels are uncontrolled strings) |
| Release umbrella status | works | stub | stub | stub |
| Regression lookups | works | stub | stub | stub |
| Bootstrap interview + config | works | works | works | works |
Linear uses its GraphQL API (direct fetch, no SDK). Auth via LINEAR_API_KEY. GitLab uses REST API v4 (direct fetch). Auth via GITLAB_TOKEN. GitLab's review.* is partially implemented: poll, fetch threads, resolve, and CI state work; requestReview is stubbed because GitLab's approval flow requires admin-configured approval rules. Jira uses REST API v3 (direct fetch) with basic auth; set JIRA_EMAIL + JIRA_API_TOKEN. Rich-text payloads render to Atlassian Document Format (ADF). For projects on Jira whose code lives on GitHub, set workflow.external_review.provider = "github" so pr-iteration dispatches review against GitHub while tracker ops still go to Jira.
The interview's cadence question picks which umbrella rhythm fits your project:
- Per-wave / per-sprint (the default; one umbrella per
intentlabel value likewave-1,wave-2). - Per-version (
initial,v1,v2, ...). Pickper-versionin the cadence question. - Continuous (minimal: just
initialandpost-launch). Pickcontinuousin the cadence question.
Teams that don't coordinate releases with umbrella issues (solo dev on tag-based continuous deploy, milestone-only workflows) answer no to the release-umbrella question in the interview. The agent omits trackers.release from the generated config; release-tracker halts silently and dev-loop skips the link-umbrella step.
For projects where sibling directories have their own git repos and possibly different trackers (e.g., a Jira-tracked library next to a GitHub-tracked app), the interview asks you to declare each member: project-relative path, short name, and its own dev tracker (plus optional release tracker). The runtime dispatcher (pickTrackerForMember / resolveMemberFromPath) routes each operation through the owning member deepest-first: a file under libs/shared/x.ts resolves to the libs/shared member. For files that don't match any nested member, include an explicit member with path . in workspace.members[] to catch them as the root; without one, unmatched files fall back to the top-level trackers.dev instead. Single-repo projects answer no and keep the single-tracker path with zero config overhead.
# 1. Install the agent. Kit offers an interactive menu for the destination:
# .claude/agents/, .agents/agents/, ~/.claude/agents/, or a custom --dir.
npx @ctxr/kit install @ctxr/agent-staff-engineerThen in Claude Code, ask Claude to run the agent, for example:
Run the agent-staff-engineer and help me set it up for this project.
On first run, the agent detects that .claude/ops.config.json is missing and self-bootstraps: it runs its own installer, launches the interactive interview (ten topics covering release cadence, team size and push principals, e2e setup, which tracker hosts dev issues (GitHub / Jira / Linear / GitLab) plus the target coordinates, whether you coordinate releases with umbrella issues (and if so, which tracker hosts them), whether to customise branch naming, whether this is a multi-repo workspace (and if so, each member's path + tracker binding), additional repos to observe, observation depth, compliance context, optional project-specific rules to seed), writes ops.config.json, generates thin wrapper files at the canonical Claude Code locations, and hands control back.
On every later invocation the agent acts on your request directly, guided by the configured rules.
- Claude Code CLI or IDE extension.
- Node.js 20+ (preflight enforces this and offers platform-specific install guidance on mismatch).
- Git (for the bundle repo + dev loop operations).
- A tracker CLI or token matching
trackers.*.kindinops.config.json:- GitHub:
ghCLI authed with scopesrepo,project,read:org,workflow. - Jira:
JIRA_EMAIL+JIRA_API_TOKENenv vars (basic auth per Atlassian's 2025 guidance; the email is the Atlassian account's login address, the token is generated at https://id.atlassian.com/manage-profile/security/api-tokens). Realissues.*andlabels.*implementations ship in this release;review.*andprojects.*are stubbed by design (no native PR surface, instance-specific custom fields). - Linear:
LINEAR_API_KEYenv var. - GitLab:
GITLAB_TOKENenv var (optionallyglabif present).
- GitHub:
The agent requires @ctxr/skill-llm-wiki to be installed before it can apply its initial config. Every document under .development/** (reports, plans, runbooks) is managed as a semantically-routed LLM wiki by that skill, so the agent can navigate its own history without re-reading everything on every session.
What happens when it's missing:
- Interactive install (you're sitting at a terminal): the installer pauses, tells you which skill is missing, prints the exact command to run (
npx @ctxr/kit install @ctxr/skill-llm-wiki), and waits for you. Run the install in another terminal, press Enter, and the agent rechecks and continues. At the prompt you can also typehelp(prints troubleshooting tips for common install failures, lets you ask the agent for help debugging) orabort(cancels cleanly). The installer never silently proceeds without the dependency. - Non-interactive / scripted install (CI,
--yes, piped stdin): the installer prints the same "missing skill" message with the install command and exits non-zero so the pipeline fails fast. Install the skill and re-run.
You can opt out entirely by setting wiki.required: false in ops.config.json, but then you own .development/ yourself.
- Installable via kit. One command places the bundle; one more (or just "run the agent") bootstraps it.
- Wrapper model: canonical skills, rules, and memory seeds stay inside the bundle folder. The installer writes thin wrappers at
.claude/skills/agent-staff-engineer_<name>/SKILL.md,.claude/rules/agent-staff-engineer_<name>.md, and.claude/memory/seed-agent-staff-engineer_<name>.md. The agent-name prefix is derived frompackage.json -> nameso wrappers never collide with files shipped by other agents or skills. Each wrapper points at the canonical file and has a marker line; anything you add below the marker survives every update. - Auto-update via
git pullinside the bundle folder. Wrappers reference stable in-bundle paths, so content updates take effect immediately. Runinstall.mjs --updateonly when the canonical file set or schema changes. - Interactive bootstrap asks the right questions; user input wins over heuristic detections.
- Continuous adaptation: the
adapt-systemskill takes free-form user intent ("we handle PHI now", "we added a Chrome extension target", "dropped the legacy analytics SDK") and produces cascading diffs across config, labels, templates, rules, and memory seeds. Idempotent, diff-previewed, never silent. - Multi-tracker observation: the config's
trackersblock binds a dev tracker, a release tracker, and zero or more read-onlyobservedtrackers (each independently kinded across github / jira / linear / gitlab). Every entry carries its own depth setting (full,umbrella-only,assigned-to-principals,labeled:X,issues-only,read-only). Multi-repo workspaces route per-member via the optional top-levelworkspace.members[]block. - Code review default: the
dev-loopskill delegates self-review to@ctxr/skill-code-review(up to 18 specialist reviewers, GO / CONDITIONAL / NO-GO verdict). Configurable; falls back to an internal template on projects that have not installed the external skill.
git clone https://github.com/ctxr-dev/agent-staff-engineer.git .claude/agents/agent-staff-engineer
node .claude/agents/agent-staff-engineer/scripts/install.mjs --target . --applyThe scripts self-locate via import.meta.url, so they work regardless of where kit or you placed the bundle.
AGENT.md: Claude Code agent entry point with self-bootstrap instructions.bundle-index.md: routing doc; the agent reads this first to jump to the relevant skill / rule rather than re-reading everything.skills/: workflow skills (bootstrap-ops-config, adapt-system, tracker-sync, dev-loop, release-tracker, regression-handler, plan-keeper, pr-iteration).rules/: portable process rules (tracker as source of truth, PR workflow, pr-iteration, ambiguity-halt, no dashes, plan management, review loop, memory hygiene, adaptation, llm-wiki).memory-seeds/: starter memory entries, stack-tag filtered at install time.templates/: issue / PR / report templates with{{ placeholder }}substitution.schemas/ops.config.schema.json: strict JSON Schema validated on every install.scripts/: Node.js ESM installer, bootstrap, adapt, seed installer, validator, preflight, update_self, plus shared helpers underscripts/lib/(includingscripts/lib/trackers/which holds the Tracker interface and per-kind implementations).examples/: fully-populated fictitious example config.tests/:node:testunit + E2E.design/: MASTER-PLAN, DECISIONS, ARCHITECTURE, OPEN-QUESTIONS, RISKS.
See CONTRIBUTING.md to add skills, rules, or seeds. See INSTALL.md for the full install reference including update and uninstall.
Releases are PR-gated. Version bumps land on main through a review gate like any other change; only the tag push is automated.
Enable these on the repo before your first release:
- Repository secret
NPM_TOKENset to an npm access token with publish rights on the@ctxrscope (npm token create). - Settings → Actions → General → Workflow permissions: enable Allow GitHub Actions to create and approve pull requests so
release.ymlcan open its version-bump PR withGITHUB_TOKEN. If the checkbox is greyed out, an organization-level Actions policy is restricting it; ask an org admin to unlock the setting first. - (Optional, recommended) GitHub-managed CodeQL default setup: Security → Code security → enable default setup for
javascript-typescriptandactions. - (Optional) A branch ruleset on
mainrequiring PR review + code scanning. The release flow works without it; gates are strictly stricter when enabled.
- Actions → Release → Run workflow.
- Branch selector:
main(the workflow refuses any other ref). - Version bump:
patch/minor/major. - Click Run workflow.
- Branch selector:
- The workflow bumps
package.json(andnpm-shrinkwrap.jsonwhen present) on a freshrelease/v<version>branch and opens a PR tomaintitledrelease: v<version>. - Review the PR (diff is just version fields). Approve + merge.
- On merge,
tag-on-main.ymlfires automatically:- Detects the version change.
- Creates and pushes the annotated
v<version>tag viaGITHUB_TOKEN.
- Actions → Publish to npm → Run workflow on the
v<version>tag. The workflow runsnpm ci + preflight + validate + lint + test, verifies the tag matchespackage.json, and publishes the package to npm.
Why a manual dispatch for step 5? GitHub's built-in
GITHUB_TOKENcannot trigger further workflows (on: push: tagswon't fire when a workflow pushed the tag). So the tag auto-creation stops at the tag. Publishing is one extra click. To make it fully automatic, swap the push credential intag-on-main.ymlfor a GitHub App token or fine-grained PAT stored as a repo secret, then thepush: tagstrigger onpublish.ymlwill fire and step 5 happens by itself.
From Run workflow on Release to published on npm is one dispatch + one PR merge + one dispatch (or one dispatch + one PR merge, once a PAT/App-token is wired in).
See GitHub Releases for the changelog.
-
Release workflow fails with "Release workflow must be dispatched from main": you selected a feature branch in the Actions UI. Re-dispatch with
main. -
tag-on-mainfails with "Tag vX.Y.Z exists on the remote but points at …": a stale or orphan tag from a prior failed release. Delete and re-run:git push origin --delete vX.Y.Z
Then merge a trivial no-op PR to
main(or revert-and-re-merge the release PR) to retriggertag-on-main. Direct pushes tomainmay be blocked by branch protection, so the PR path is the reliable retrigger. -
publish.ymlfails on "Verify version matches tag": tag andpackage.jsondisagree. Investigate the merge commit; this should not happen under the PR-based flow. -
GitHub Actions is not permitted to create pull requests: org or enterprise policy blocks the
GITHUB_TOKENfrom opening PRs. Enable Allow GitHub Actions to create and approve pull requests at the org level (Settings → Actions → General → Workflow permissions), or ask the enterprise admin to unlock the setting.