Skip to content
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 7 additions & 1 deletion CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,13 @@
## [Unreleased]

### Added
- **New skill `/ghcp-handoff`.** Bounded delegation to the GitHub Copilot CLI (`copilot`). Generates structured prompts with hard boundaries, NOT-in-scope lists, stream-scoped deliverables, and PR contracts, then invokes `copilot -p` non-interactively and verifies the resulting PR stayed within bounds. Three modes: guided (interactive question flow), `--from-plan <path>` (extract a `## The GHCP prompt` section from a plan file and wrap it with worktree + metadata), and `verify <pr>` (diff-vs-NOT-in-scope glob check + dependency deviation check + PR body contract check). Sibling to `/codex`: where `/codex` pulls a second opinion in, `/ghcp-handoff` pushes bounded mechanical work out. Verification writes a `gstack-review-log` JSONL entry so results surface in the Plan Status Footer. Adds `yaml` and `minimatch` as dependencies.
- **New skill `/ghcp-handoff`.** Bounded delegation to the GitHub Copilot CLI (`copilot`). Generates structured prompts with hard boundaries, NOT-in-scope lists, stream-scoped deliverables, and PR contracts, then invokes `copilot -p` non-interactively and verifies the resulting PR stayed within bounds. Three modes: guided (interactive question flow), `--from-plan <path>` (extract a `## The GHCP prompt` section from a plan file and wrap it with worktree + metadata), and `verify <pr>` (diff-vs-NOT-in-scope glob check + dependency deviation check + PR body contract check). Sibling to `/codex`: where `/codex` pulls a second opinion in, `/ghcp-handoff` pushes bounded mechanical work out. Verification writes a `gstack-review-log` JSONL entry so results surface in the Plan Status Footer. Listed in the gstack skill routing rules, README power-tools table, and install-instructions skill list. Adds `yaml` and `minimatch` as dependencies.

### Fixed
- **`/ghcp-handoff verify` no longer greenlights PRs that only mention required sections in prose.** The PR body check now matches Markdown headings and bold labels (e.g. `## Summary`, `**Summary:**`, `**Summary**:`) instead of any substring — so sentences like "this PR includes a summary of changes" no longer satisfy the contract.
- **Dependency-deviation checks stop flagging package.json metadata as new dependencies.** The manifest diff now parses the actual before/after file contents via `git show` instead of regex-matching `+` diff lines. Top-level `package.json` keys (`author`, `repository`, `engines`, `keywords`, etc.), `pyproject.toml` keys under `[project]` / `[tool.poetry]`, and `Cargo.toml` `[package]` keys are no longer misclassified as deps. Only keys under `dependencies` / `devDependencies` / `peerDependencies` / `optionalDependencies` (JSON), `[*dependencies]` / `[tool.poetry.*dependencies]` / `[tool.poetry.group.*.dependencies]` (TOML), `require` blocks (go.mod), and `gem "..."` lines (Gemfile) count.
- **`ghcp-detect.sh` comment matches behavior.** Removed the misleading "try `gh copilot` as fallback" claim — the extension doesn't support the `-p --allow-all-tools --output-format json --no-ask-user` flags the handoff requires, so only the standalone `copilot` CLI is detected.
- **Prompt template no longer leaks placeholders.** Removed unused `{{project_name}}`, `{{repo_abs_path}}`, `{{user_name}}`, and `{{forbidden_branches}}` variables that the guided-mode flow never collected, so they could render literally in the prompt sent to Copilot. The guided-mode Render step now lists the exact placeholder set and adds a "grep for remaining double-brace" pre-flight check.

## [0.18.3.0] - 2026-04-17

Expand Down
1 change: 1 addition & 0 deletions CLAUDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -97,6 +97,7 @@ gstack/
├── benchmark/ # /benchmark skill (performance regression detection)
├── canary/ # /canary skill (post-deploy monitoring loop)
├── codex/ # /codex skill (multi-AI second opinion via OpenAI Codex CLI)
├── ghcp-handoff/ # /ghcp-handoff skill (bounded delegation to GitHub Copilot CLI)
├── land-and-deploy/ # /land-and-deploy skill (merge → deploy → canary verify)
├── office-hours/ # /office-hours skill (YC Office Hours — startup diagnostic + builder brainstorm)
├── investigate/ # /investigate skill (systematic root-cause debugging)
Expand Down
3 changes: 2 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -46,7 +46,7 @@ Fork it. Improve it. Make it yours. And if you want to hate on free open source

Open Claude Code and paste this. Claude does the rest.

> Install gstack: run **`git clone --single-branch --depth 1 https://github.com/garrytan/gstack.git ~/.claude/skills/gstack && cd ~/.claude/skills/gstack && ./setup`** then add a "gstack" section to CLAUDE.md that says to use the /browse skill from gstack for all web browsing, never use mcp\_\_claude-in-chrome\_\_\* tools, and lists the available skills: /office-hours, /plan-ceo-review, /plan-eng-review, /plan-design-review, /design-consultation, /design-shotgun, /design-html, /review, /ship, /land-and-deploy, /canary, /benchmark, /browse, /connect-chrome, /qa, /qa-only, /design-review, /setup-browser-cookies, /setup-deploy, /retro, /investigate, /document-release, /codex, /cso, /autoplan, /plan-devex-review, /devex-review, /careful, /freeze, /guard, /unfreeze, /gstack-upgrade, /learn. Then ask the user if they also want to add gstack to the current project so teammates get it.
> Install gstack: run **`git clone --single-branch --depth 1 https://github.com/garrytan/gstack.git ~/.claude/skills/gstack && cd ~/.claude/skills/gstack && ./setup`** then add a "gstack" section to CLAUDE.md that says to use the /browse skill from gstack for all web browsing, never use mcp\_\_claude-in-chrome\_\_\* tools, and lists the available skills: /office-hours, /plan-ceo-review, /plan-eng-review, /plan-design-review, /design-consultation, /design-shotgun, /design-html, /review, /ship, /land-and-deploy, /canary, /benchmark, /browse, /connect-chrome, /qa, /qa-only, /design-review, /setup-browser-cookies, /setup-deploy, /retro, /investigate, /document-release, /codex, /ghcp-handoff, /cso, /autoplan, /plan-devex-review, /devex-review, /careful, /freeze, /guard, /unfreeze, /gstack-upgrade, /learn. Then ask the user if they also want to add gstack to the current project so teammates get it.

### Step 2: Team mode — auto-update for shared repos (recommended)

Expand Down Expand Up @@ -228,6 +228,7 @@ Each skill feeds into the next. `/office-hours` writes a design doc that `/plan-
| Skill | What it does |
|-------|-------------|
| `/codex` | **Second Opinion** — independent code review from OpenAI Codex CLI. Three modes: review (pass/fail gate), adversarial challenge, and open consultation. Cross-model analysis when both `/review` and `/codex` have run. |
| `/ghcp-handoff` | **Bounded Copilot Handoff** — delegate scaffolds, CI wiring, and mechanical refactors to the GitHub Copilot CLI. Generates structured prompts with hard boundaries, NOT-in-scope paths, stream-scoped deliverables, and a PR contract, then verifies the resulting PR stayed within bounds. Sibling to `/codex`: where `/codex` pulls a second opinion in, `/ghcp-handoff` pushes bounded work out. |
| `/careful` | **Safety Guardrails** — warns before destructive commands (rm -rf, DROP TABLE, force-push). Say "be careful" to activate. Override any warning. |
| `/freeze` | **Edit Lock** — restrict file edits to one directory. Prevents accidental changes outside scope while debugging. |
| `/guard` | **Full Safety** — `/careful` + `/freeze` in one command. Maximum safety for prod work. |
Expand Down
1 change: 1 addition & 0 deletions SKILL.md
Original file line number Diff line number Diff line change
Expand Up @@ -451,6 +451,7 @@ quality gates that produce better results than answering inline.
- User asks to update docs after shipping → invoke `/document-release`
- User asks for a weekly retro, what did we ship → invoke `/retro`
- User asks for a second opinion, codex review → invoke `/codex`
- User asks to hand off to Copilot, delegate scaffold/mechanical work, or says "ghcp-handoff" → invoke `/ghcp-handoff`
- User asks for safety mode, careful mode → invoke `/careful` or `/guard`
- User asks to restrict edits to a directory → invoke `/freeze` or `/unfreeze`
- User asks to upgrade gstack → invoke `/gstack-upgrade`
Expand Down
1 change: 1 addition & 0 deletions SKILL.md.tmpl
Original file line number Diff line number Diff line change
Expand Up @@ -45,6 +45,7 @@ quality gates that produce better results than answering inline.
- User asks to update docs after shipping → invoke `/document-release`
- User asks for a weekly retro, what did we ship → invoke `/retro`
- User asks for a second opinion, codex review → invoke `/codex`
- User asks to hand off to Copilot, delegate scaffold/mechanical work, or says "ghcp-handoff" → invoke `/ghcp-handoff`
- User asks for safety mode, careful mode → invoke `/careful` or `/guard`
- User asks to restrict edits to a directory → invoke `/freeze` or `/unfreeze`
- User asks to upgrade gstack → invoke `/gstack-upgrade`
Expand Down
20 changes: 16 additions & 4 deletions ghcp-handoff/SKILL.md
Original file line number Diff line number Diff line change
Expand Up @@ -763,10 +763,22 @@ AskUserQuestion: "PR description sections — accept defaults or customize?"
### Render
After all answers:
1. Read `${CLAUDE_SKILL_DIR}/templates/prompt-template.md`.
2. Fill all double-brace-style template variables (e.g. `task_summary`, `target_branch`, `stream_blocks`) from collected answers.
3. Write to `.gstack/ghcp-handoff/<slug>.md` + `.meta.json`.
4. Show the rendered prompt in a fenced block.
5. Proceed to **Step 2-exec**.
2. Substitute every double-brace placeholder in that file from the collected
answers. The full placeholder set is:
- `task_summary` — answer to Q1
- `phase_note` + `stub_policy` — answers to Q4
- `target_branch` — answer to Q2
- `context_files_bullets` — rendered as `- <file>` per line from Q3
- `numbered_boundaries` — one numbered line per entry from Q6
- `stream_blocks` — each stream rendered as the `STREAM N — <name>` block (Q5)
- `not_in_scope_paths` / `not_in_scope_reasons` — YAML list items from Q7,
each prefixed with `- ` and indented 4 spaces to sit under the `paths:` / `reasons:` keys
- `forbidden_deps_list` — YAML list items from Q8 indented 2 spaces (or `[]` if none)
3. Before writing, grep the rendered output for any remaining double-brace
placeholders — if found, stop and tell the user which placeholder failed to substitute.
4. Write to `.gstack/ghcp-handoff/<slug>.md` + `.meta.json`.
5. Show the rendered prompt in a fenced block.
6. Proceed to **Step 2-exec**.

---

Expand Down
20 changes: 16 additions & 4 deletions ghcp-handoff/SKILL.md.tmpl
Original file line number Diff line number Diff line change
Expand Up @@ -182,10 +182,22 @@ AskUserQuestion: "PR description sections — accept defaults or customize?"
### Render
After all answers:
1. Read `${CLAUDE_SKILL_DIR}/templates/prompt-template.md`.
2. Fill all double-brace-style template variables (e.g. `task_summary`, `target_branch`, `stream_blocks`) from collected answers.
3. Write to `.gstack/ghcp-handoff/<slug>.md` + `.meta.json`.
4. Show the rendered prompt in a fenced block.
5. Proceed to **Step 2-exec**.
2. Substitute every double-brace placeholder in that file from the collected
answers. The full placeholder set is:
- `task_summary` — answer to Q1
- `phase_note` + `stub_policy` — answers to Q4
- `target_branch` — answer to Q2
- `context_files_bullets` — rendered as `- <file>` per line from Q3
- `numbered_boundaries` — one numbered line per entry from Q6
- `stream_blocks` — each stream rendered as the `STREAM N — <name>` block (Q5)
- `not_in_scope_paths` / `not_in_scope_reasons` — YAML list items from Q7,
each prefixed with `- ` and indented 4 spaces to sit under the `paths:` / `reasons:` keys
- `forbidden_deps_list` — YAML list items from Q8 indented 2 spaces (or `[]` if none)
3. Before writing, grep the rendered output for any remaining double-brace
placeholders — if found, stop and tell the user which placeholder failed to substitute.
4. Write to `.gstack/ghcp-handoff/<slug>.md` + `.meta.json`.
5. Show the rendered prompt in a fenced block.
6. Proceed to **Step 2-exec**.

---

Expand Down
11 changes: 7 additions & 4 deletions ghcp-handoff/bin/ghcp-detect.sh
Original file line number Diff line number Diff line change
@@ -1,15 +1,18 @@
#!/usr/bin/env bash
# ghcp-detect.sh — Detect copilot CLI binary and print version.
# Exit 0 with version on stdout if found; exit 1 with empty stdout if not.
# ghcp-detect.sh — Detect the standalone GitHub Copilot CLI (`copilot`) and
# print its version. Exits 0 with version on stdout if found; exit 1 with
# empty stdout if not.
#
# Note: the `gh copilot` extension is intentionally NOT a fallback. The
# handoff workflow invokes `copilot -p ... --allow-all-tools --output-format
# json --no-ask-user`, flags that only the standalone CLI supports.

set -euo pipefail

# Try 'copilot' first, then 'gh copilot' as fallback
if command -v copilot &>/dev/null; then
VERSION=$(copilot --version 2>/dev/null | head -1 || echo "unknown")
echo "$VERSION"
exit 0
fi

# Not found
exit 1
160 changes: 113 additions & 47 deletions ghcp-handoff/bin/ghcp-verify-boundaries.ts
Original file line number Diff line number Diff line change
Expand Up @@ -107,55 +107,115 @@ export function checkBoundaries(
}

/**
* Simple dep manifest diff parser.
* Returns list of added dependency names from a unified diff of a manifest file.
* Dep-section names for each supported manifest. Keys outside these
* sections (e.g. `name`, `version`, `author`, `repository` in package.json;
* `[tool.poetry]` metadata in pyproject.toml) must NOT be flagged as deps.
*/
export function extractAddedDeps(
diff: string,
manifestName: string
): string[] {
const added: string[] = [];

// package.json: lines like + "foo": "^1.0.0"
if (manifestName === "package.json") {
const matches = diff.matchAll(/^\+\s+"([^"]+)":\s*"/gm);
for (const m of matches) {
// Skip metadata keys
if (
!["name", "version", "private", "scripts", "description"].includes(
m[1]
)
) {
added.push(m[1]);
const PACKAGE_JSON_DEP_SECTIONS = [
"dependencies",
"devDependencies",
"peerDependencies",
"optionalDependencies",
] as const;

const CARGO_DEP_SECTION_RE = /^(?:dependencies|dev-dependencies|build-dependencies|target\..+?\.dependencies)$/;
const PYPROJECT_DEP_SECTION_RE = /^(?:dependencies|tool\.poetry\.dependencies|tool\.poetry\.dev-dependencies|tool\.poetry\.group\..+?\.dependencies)$/;

function parsePackageJsonDeps(content: string): Set<string> {
const deps = new Set<string>();
try {
const parsed = JSON.parse(content);
for (const section of PACKAGE_JSON_DEP_SECTIONS) {
const block = parsed?.[section];
if (block && typeof block === "object") {
for (const name of Object.keys(block)) deps.add(name);
}
}
} catch {
// malformed JSON — nothing we can flag with confidence
}
return deps;
}

// pyproject.toml: lines like +foo = ">=1.0"
if (manifestName === "pyproject.toml") {
const matches = diff.matchAll(/^\+([a-zA-Z0-9_-]+)\s*=/gm);
for (const m of matches) added.push(m[1]);
/**
* Walk TOML-ish content line by line tracking the active `[section]` header
* and collect `key = ...` entries only when the section matches a dep regex.
* This is intentionally minimal (no full TOML parse) — good enough to catch
* the common case without a new dependency.
*/
function parseTomlDeps(content: string, sectionRe: RegExp): Set<string> {
const deps = new Set<string>();
let section = "";
for (const raw of content.split("\n")) {
const line = raw.trim();
if (!line || line.startsWith("#")) continue;
const header = line.match(/^\[(.+?)\]$/);
if (header) {
section = header[1].trim();
continue;
}
if (!sectionRe.test(section)) continue;
const kv = line.match(/^([A-Za-z0-9_-]+)\s*=/);
if (kv) deps.add(kv[1]);
}
return deps;
}

// go.mod: lines like +\trequire foo v1.0.0
if (manifestName === "go.mod") {
const matches = diff.matchAll(/^\+\t([^\s]+)\s+v/gm);
for (const m of matches) added.push(m[1]);
function parseGoModDeps(content: string): Set<string> {
const deps = new Set<string>();
// Matches both `require module v...` (single) and lines inside `require ( ... )` blocks.
for (const raw of content.split("\n")) {
const line = raw.trim();
const m = line.match(/^(?:require\s+)?([^\s()]+)\s+v\d/);
if (m && m[1] !== "require") deps.add(m[1]);
}
return deps;
}

// Cargo.toml: like package.json pattern
if (manifestName === "Cargo.toml") {
const matches = diff.matchAll(/^\+([a-zA-Z0-9_-]+)\s*=/gm);
for (const m of matches) added.push(m[1]);
function parseGemfileDeps(content: string): Set<string> {
const deps = new Set<string>();
for (const m of content.matchAll(/^\s*gem\s+['"]([^'"]+)['"]/gm)) {
deps.add(m[1]);
}
return deps;
}

// Gemfile: lines like +gem "foo"
if (manifestName === "Gemfile") {
const matches = diff.matchAll(/^\+gem\s+"([^"]+)"/gm);
for (const m of matches) added.push(m[1]);
/**
* Parse a manifest's current dependency names. Returns an empty set for
* unsupported manifests (callers should gate on the supported list).
*/
export function parseManifestDeps(
content: string,
manifestName: string
): Set<string> {
switch (manifestName) {
case "package.json":
return parsePackageJsonDeps(content);
case "pyproject.toml":
return parseTomlDeps(content, PYPROJECT_DEP_SECTION_RE);
case "Cargo.toml":
return parseTomlDeps(content, CARGO_DEP_SECTION_RE);
case "go.mod":
return parseGoModDeps(content);
case "Gemfile":
return parseGemfileDeps(content);
default:
return new Set();
}
}

return added;
/**
* Return dependency names present in `after` but not in `before`.
* `before` may be empty string (file added on this branch).
*/
export function diffManifestDeps(
before: string,
after: string,
manifestName: string
): string[] {
const beforeDeps = parseManifestDeps(before, manifestName);
const afterDeps = parseManifestDeps(after, manifestName);
return [...afterDeps].filter((d) => !beforeDeps.has(d));
}

/**
Expand Down Expand Up @@ -235,19 +295,25 @@ if (import.meta.main) {
const allAddedDeps: { name: string; manifest: string }[] = [];

for (const manifest of manifests) {
if (changedFiles.includes(manifest)) {
const mDiff = Bun.spawnSync([
if (!changedFiles.includes(manifest)) continue;

const fetchAt = (ref: string): string => {
const proc = Bun.spawnSync([
"git",
"diff",
`${baseBranch}...${prBranch}`,
"--",
manifest,
"show",
`${ref}:${manifest}`,
]);
const diffText = new TextDecoder().decode(mDiff.stdout);
const added = extractAddedDeps(diffText, manifest);
for (const name of added) {
allAddedDeps.push({ name, manifest });
}
// Exit code is non-zero when the file did not exist at that ref.
return proc.exitCode === 0
? new TextDecoder().decode(proc.stdout)
: "";
};

const before = fetchAt(baseBranch);
const after = fetchAt(prBranch);
const added = diffManifestDeps(before, after, manifest);
for (const name of added) {
allAddedDeps.push({ name, manifest });
}
}

Expand Down
Loading