Merged
1 change: 0 additions & 1 deletion .github/workflows/taskbound.yml
@@ -16,5 +16,4 @@ jobs:

- uses: ./
with:
task: ${{ github.event.pull_request.title }}
fail-on: none
77 changes: 75 additions & 2 deletions README.md
@@ -10,6 +10,9 @@ Post-session scope creep review for AI agent edits.
TaskBound is a free OSS CLI and GitHub Action that compares a **stated task** against a repo diff and flags when an agent wandered outside the goal.

- Out-of-scope file edits inferred from the task text
- Pull request title plus body as scope context in GitHub Actions
- Optional LLM-assisted scope extraction with heuristic fallback
- `.taskbound.yml` severity overrides for per-repo calibration
- Sensitive surface touches such as MCP config, CI workflows, and `package.json`
- New dependencies, env file changes, and capability signals in added lines
- Terminal, Markdown, JSON, and line-level GitHub annotation output
@@ -53,6 +56,27 @@ Compare two git refs:
node dist/index.js review --task "Fix header CSS styling" --repo . --base main --head HEAD --format markdown
```

Use extra scope context, such as a PR description:

```powershell
node dist/index.js review --task "Fix header CSS styling" --scope-context "Only touch styles/header.css." --repo . --base main --head HEAD --format markdown
```

Use a GitHub event payload as the task source. TaskBound reads `pull_request.title`
as the stated task and `pull_request.body` as additional scope context:

```powershell
node dist/index.js review --github-event event.json --repo . --base main --head HEAD --format markdown
```

Use optional LLM-assisted scope extraction. If `OPENAI_API_KEY` is missing, the
network is unavailable, or the model call fails, TaskBound keeps running with the
heuristic inferer:

```powershell
node dist/index.js review --task "Fix header CSS styling" --scope-llm gpt-4o-mini --repo . --base main --head HEAD --format markdown
```

JSON output:

```powershell
@@ -82,11 +106,32 @@ jobs:

- uses: Conalh/TaskBound@v0.1.0
with:
task: ${{ github.event.pull_request.title }}
fail-on: none
```

-The action uploads nothing by default. It reads local git state from the checked-out repository, writes a Markdown report to the GitHub Actions step summary, and emits PR-visible warning annotations for each finding.
+The action uploads nothing by default. It reads local git state from the checked-out repository, writes a Markdown report to the GitHub Actions step summary, and emits PR-visible warning annotations for each finding. On pull requests, it defaults to `github.event.pull_request.title` as the stated task and feeds `github.event.pull_request.body` as additional scope context.

You can still override the task explicitly:

```yaml
- uses: Conalh/TaskBound@v0.1.0
with:
task: Fix header CSS styling
fail-on: none
```

To enable LLM-assisted scope extraction in Actions, pass a model and expose an
API key to the job. If the key is absent or the model call fails, TaskBound falls
back to heuristic scope inference and records `scopeSource: llm_fallback` in JSON.

```yaml
- uses: Conalh/TaskBound@v0.1.0
env:
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
with:
scope-llm: gpt-4o-mini
fail-on: none
```
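
The fallback behaviour described above can be sketched roughly as follows. This is an illustration, not TaskBound's implementation: `inferScope`, `heuristicScope`, `LlmExtractor`, the `ScopeResult` shape, and the `'llm'` and `'heuristic'` labels are hypothetical names. Only the `llm_fallback` value and the `OPENAI_API_KEY` variable come from the text above.

```typescript
// Hypothetical labels; only 'llm_fallback' is named in the README.
type ScopeSource = 'llm' | 'heuristic' | 'llm_fallback';

interface ScopeResult {
  patterns: string[];
  scopeSource: ScopeSource;
}

// Hypothetical signature for whatever performs the model call.
type LlmExtractor = (task: string, model: string, apiKey: string) => Promise<string[]>;

// Stand-in heuristic: keep the lowercased words of the task text as scope hints.
function heuristicScope(task: string): string[] {
  return task.toLowerCase().split(/\W+/).filter((word) => word.length > 2);
}

async function inferScope(
  task: string,
  model: string | undefined,
  apiKey: string | undefined,
  callLlm: LlmExtractor,
): Promise<ScopeResult> {
  if (!model) {
    // No model requested: plain heuristic inference.
    return { patterns: heuristicScope(task), scopeSource: 'heuristic' };
  }
  if (apiKey) {
    try {
      return { patterns: await callLlm(task, model, apiKey), scopeSource: 'llm' };
    } catch {
      // Network or model failure: fall through to the heuristic path.
    }
  }
  // A model was requested but could not be used; record that in the result.
  return { patterns: heuristicScope(task), scopeSource: 'llm_fallback' };
}

// A model is requested but no key is available, so the heuristic takes over.
inferScope('Fix header CSS styling', 'gpt-4o-mini', undefined, async () => [])
  .then((result) => console.log(result.scopeSource)); // llm_fallback
```

Injecting the extractor as a parameter keeps the sketch runnable offline and makes the failure path easy to exercise in tests.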

Start with `fail-on: none` so TaskBound is advisory while you tune policy. Raise it to `high` or `critical` once the findings are trusted.
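
One way a `fail-on` threshold could gate the job result, as a minimal sketch. The severity ordering (`low` < `medium` < `high` < `critical`) and the `shouldFail` helper are assumptions made for illustration; the README does not describe how TaskBound compares severities internally.

```typescript
type Severity = 'low' | 'medium' | 'high' | 'critical';
type FailOn = Severity | 'none';

// Assumed ordering, from least to most severe.
const SEVERITY_RANK: Record<Severity, number> = { low: 0, medium: 1, high: 2, critical: 3 };

// Hypothetical helper: true when any finding is at or above the threshold.
function shouldFail(severities: Severity[], failOn: FailOn): boolean {
  if (failOn === 'none') {
    return false; // advisory mode never fails the job
  }
  const threshold = SEVERITY_RANK[failOn];
  return severities.some((severity) => SEVERITY_RANK[severity] >= threshold);
}

const found: Severity[] = ['low', 'high'];
console.log(shouldFail(found, 'none'));     // false
console.log(shouldFail(found, 'high'));     // true
console.log(shouldFail(found, 'critical')); // false
```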

@@ -99,6 +144,34 @@ Action outputs:
- `changed-file-count`: number of changed files in the diff
- `scope-match-count`: number of changed files that matched the inferred task scope

## Severity Calibration

Add `.taskbound.yml` to tune findings for your repo. Rule keys match JSON
finding `kind` values.

```yaml
severity:
out_of_scope_file: high
script_pipe_to_shell: critical

rules:
external_fetch_added:
severity: low
```

Supported severities are `low`, `medium`, `high`, and `critical`. In git mode,
TaskBound reads `.taskbound.yml` from the base ref when available so a pull
request cannot silently weaken an existing policy.
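
Both forms in the example config (the flat `severity:` map and the nested `rules:` entries) are expected to resolve to one lookup table keyed by finding `kind`. A minimal sketch of applying such a table, assuming a simplified `Finding` shape with fewer fields than the real type and a hypothetical `applyOverrides` helper:

```typescript
type Severity = 'low' | 'medium' | 'high' | 'critical';

// Simplified finding shape for illustration only.
interface Finding {
  kind: string;
  severity: Severity;
  file: string;
}

// What the example config is expected to resolve to once parsed.
const overrides: Record<string, Severity> = {
  out_of_scope_file: 'high',
  script_pipe_to_shell: 'critical',
  external_fetch_added: 'low',
};

// Replace the severity of any finding whose kind has an override;
// findings without an override are returned unchanged.
function applyOverrides(findings: Finding[], table: Record<string, Severity>): Finding[] {
  return findings.map((finding) => {
    const severity = table[finding.kind];
    return severity ? { ...finding, severity } : finding;
  });
}

const calibrated = applyOverrides(
  [
    { kind: 'external_fetch_added', severity: 'medium', file: 'src/api.ts' },
    { kind: 'env_file_changed', severity: 'high', file: '.env' },
  ],
  overrides,
);
console.log(calibrated.map((f) => `${f.kind}=${f.severity}`).join(', '));
// external_fetch_added=low, env_file_changed=high
```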

## Findings Schema

JSON findings include stable routing fields for future package extraction:

- `kind`: machine-readable rule id, such as `external_fetch_added`
- `category`: broad group, such as `scope`, `capability`, or `lifecycle`
- `severity`: `low`, `medium`, `high`, or `critical`
- `file`, `line`, `subject`, `message`, and `recommendation`
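
As a rough sketch, the fields listed above could be consumed like this. The interface is inferred from the list, so which fields are optional is an assumption, and `groupByCategory` plus the sample messages are hypothetical and exist only to show routing on `category`.

```typescript
type Severity = 'low' | 'medium' | 'high' | 'critical';

// Inferred from the field list; optionality of `line` is assumed.
interface Finding {
  kind: string;
  category: string;
  severity: Severity;
  file: string;
  line?: number;
  subject: string;
  message: string;
  recommendation: string;
}

// Hypothetical router: bucket findings by their broad category.
function groupByCategory(findings: Finding[]): Map<string, Finding[]> {
  const groups = new Map<string, Finding[]>();
  for (const finding of findings) {
    const bucket = groups.get(finding.category) ?? [];
    bucket.push(finding);
    groups.set(finding.category, bucket);
  }
  return groups;
}

// Sample data with made-up messages, for illustration only.
const sample: Finding[] = [
  {
    kind: 'external_fetch_added',
    category: 'capability',
    severity: 'medium',
    file: 'src/api.ts',
    line: 12,
    subject: 'fetch',
    message: 'Added line performs a network fetch.',
    recommendation: 'Confirm the task requires network access.',
  },
  {
    kind: 'script_pipe_to_shell',
    category: 'lifecycle',
    severity: 'critical',
    file: 'package.json',
    subject: 'postinstall',
    message: 'Lifecycle script pipes a download into a shell.',
    recommendation: 'Remove the script or pin and verify the download.',
  },
];

const grouped = groupByCategory(sample);
console.log(Array.from(grouped.keys()).join(', ')); // capability, lifecycle
```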

## Current Findings

TaskBound v0 detects:
37 changes: 27 additions & 10 deletions action.yml
@@ -8,8 +8,13 @@ branding:

inputs:
task:
-description: Stated task or goal for the agent session or pull request.
-required: true
+description: Optional stated task or goal. Defaults to the pull request title and uses the pull request body as additional scope context.
+required: false
+default: ''
+scope-llm:
+description: Optional OpenAI model for LLM-assisted scope extraction. Falls back to heuristic scope when offline or unauthenticated.
+required: false
+default: ''
repo:
description: Repository checkout path to inspect. Defaults to GITHUB_WORKSPACE.
required: false
@@ -67,6 +72,8 @@ runs:
BOUND_BASE: ${{ inputs.base }}
BOUND_HEAD: ${{ inputs.head }}
BOUND_FAIL_ON: ${{ inputs.fail-on }}
+BOUND_SCOPE_LLM: ${{ inputs.scope-llm }}
+BOUND_SCOPE_CONTEXT: ${{ github.event.pull_request.body }}
DEFAULT_BASE: ${{ github.event.pull_request.base.sha || github.event.before }}
DEFAULT_HEAD: ${{ github.event.pull_request.head.sha || github.sha }}
run: |
@@ -77,11 +84,8 @@ runs:
base="${BOUND_BASE:-$DEFAULT_BASE}"
head="${BOUND_HEAD:-$DEFAULT_HEAD}"
fail_on="${BOUND_FAIL_ON:-none}"

-if [ -z "$task" ]; then
-echo "::error::TaskBound requires a task input."
-exit 2
-fi
+scope_llm="${BOUND_SCOPE_LLM:-}"
+scope_context="${BOUND_SCOPE_CONTEXT:-}"

if [ -z "$base" ] || [ -z "$head" ]; then
echo "::error::TaskBound needs base and head refs. Pass base/head inputs or run on pull_request with actions/checkout fetch-depth: 0."
@@ -90,10 +94,23 @@ runs:

report_file="${RUNNER_TEMP:-.}/taskbound-report.md"
json_file="${RUNNER_TEMP:-.}/taskbound-report.json"
+taskbound_args=(review --repo "$repo" --base "$base" --head "$head")
+if [ -n "$task" ]; then
+taskbound_args+=(--task "$task")
+fi
+if [ -n "${GITHUB_EVENT_PATH:-}" ]; then
+taskbound_args+=(--github-event "$GITHUB_EVENT_PATH")
+fi
+if [ -n "$scope_context" ]; then
+taskbound_args+=(--scope-context "$scope_context")
+fi
+if [ -n "$scope_llm" ]; then
+taskbound_args+=(--scope-llm "$scope_llm")
+fi

-node "$GITHUB_ACTION_PATH/dist/index.js" review --task "$task" --repo "$repo" --base "$base" --head "$head" --format markdown | tee "$report_file"
-node "$GITHUB_ACTION_PATH/dist/index.js" review --task "$task" --repo "$repo" --base "$base" --head "$head" --format json > "$json_file"
-node "$GITHUB_ACTION_PATH/dist/index.js" review --task "$task" --repo "$repo" --base "$base" --head "$head" --format github
+node "$GITHUB_ACTION_PATH/dist/index.js" "${taskbound_args[@]}" --format markdown | tee "$report_file"
+node "$GITHUB_ACTION_PATH/dist/index.js" "${taskbound_args[@]}" --format json > "$json_file"
+node "$GITHUB_ACTION_PATH/dist/index.js" "${taskbound_args[@]}" --format github

if [ -n "${GITHUB_STEP_SUMMARY:-}" ]; then
cat "$report_file" >> "$GITHUB_STEP_SUMMARY"
100 changes: 100 additions & 0 deletions src/config.ts
@@ -0,0 +1,100 @@
import { readFile } from 'node:fs/promises';
import { join } from 'node:path';
import { readFileAtGitRef } from './git-diff.js';
import type { Finding, Severity } from './types.js';

const SEVERITIES = new Set<Severity>(['low', 'medium', 'high', 'critical']);

export interface TaskBoundConfig {
severityOverrides: Record<string, Severity>;
}

export type ConfigMode =
| { mode: 'directories'; oldRoot: string; newRoot: string }
| { mode: 'git'; repo: string; base: string; head: string };

export async function loadTaskBoundConfig(mode: ConfigMode): Promise<TaskBoundConfig> {
if (mode.mode === 'directories') {
return parseTaskBoundConfig(await readConfigFromDirectory(mode.newRoot));
}

const baseConfig = await readFileAtGitRef(mode.repo, mode.base, '.taskbound.yml');
if (baseConfig !== null) {
return parseTaskBoundConfig(baseConfig);
}

return parseTaskBoundConfig((await readFileAtGitRef(mode.repo, mode.head, '.taskbound.yml')) ?? '');
}

export function applySeverityOverrides(findings: Finding[], config: TaskBoundConfig): Finding[] {
return findings.map((finding) => {
const severity = config.severityOverrides[finding.kind];
return severity ? { ...finding, severity } : finding;
});
}

function parseTaskBoundConfig(text: string): TaskBoundConfig {
const severityOverrides: Record<string, Severity> = {};
let section: 'severity' | 'rules' | undefined;
let currentRule: string | undefined;

for (const rawLine of text.split(/\r?\n/)) {
const lineWithoutComment = rawLine.replace(/\s+#.*$/, '');
const trimmed = lineWithoutComment.trim();
if (!trimmed || trimmed.startsWith('#')) {
continue;
}

const indent = lineWithoutComment.length - lineWithoutComment.trimStart().length;
if (indent === 0) {
currentRule = undefined;
if (trimmed === 'severity:') {
section = 'severity';
} else if (trimmed === 'rules:') {
section = 'rules';
} else {
section = undefined;
}
continue;
}

if (section === 'severity' && indent >= 2) {
const entry = parseKeyValue(trimmed);
if (entry && isSeverity(entry.value)) {
severityOverrides[entry.key] = entry.value;
}
continue;
}

if (section === 'rules' && indent === 2 && trimmed.endsWith(':')) {
currentRule = trimmed.slice(0, -1).trim();
continue;
}

if (section === 'rules' && currentRule && indent >= 4) {
const entry = parseKeyValue(trimmed);
if (entry?.key === 'severity' && isSeverity(entry.value)) {
severityOverrides[currentRule] = entry.value;
}
}
}

return { severityOverrides };
}

async function readConfigFromDirectory(root: string): Promise<string> {
try {
return await readFile(join(root, '.taskbound.yml'), 'utf8');
} catch {
return '';
}
}

function parseKeyValue(line: string): { key: string; value: string } | undefined {
const match = line.match(/^([\w.-]+)\s*:\s*([\w.-]+)\s*$/);
return match ? { key: match[1], value: match[2] } : undefined;
}

function isSeverity(value: string): value is Severity {
return SEVERITIES.has(value as Severity);
}
10 changes: 7 additions & 3 deletions src/detectors/capability-signals.ts
@@ -1,4 +1,4 @@
-import { isCommentLine, isPackageJsonFile } from '../paths.js';
+import { isCommentLine, isPackageJsonFile, isTestFixturePath } from '../paths.js';
import { isRecord, lineOfJsonKey, lineOfJsonStringValue } from '../discovery.js';
import { readFileFromSide } from '../git-diff.js';
import type { AddedLine, ChangedFile, Finding } from '../types.js';
@@ -13,7 +13,7 @@ export function detectCapabilitySignals(addedLines: AddedLine[]): Finding[] {
const findings: Finding[] = [];

for (const added of addedLines) {
-if (isPackageJsonFile(added.file)) {
+if (isPackageJsonFile(added.file) || isTestFixturePath(added.file)) {
continue;
}

@@ -35,7 +35,7 @@ export async function detectLifecycleScriptChanges(
const findings: Finding[] = [];

for (const changed of changedFiles) {
-if (!isPackageJsonFile(changed.file) || changed.status === 'deleted') {
+if (!isPackageJsonFile(changed.file) || isTestFixturePath(changed.file) || changed.status === 'deleted') {
continue;
}

@@ -59,6 +59,7 @@ function detectFetch(added: AddedLine): Finding[] {
return [
{
kind: 'external_fetch_added',
category: 'capability',
severity: 'medium',
file: added.file,
line: added.line,
@@ -77,6 +78,7 @@ function detectSubprocess(added: AddedLine): Finding[] {
return [
{
kind: 'subprocess_spawn_added',
category: 'capability',
severity: 'high',
file: added.file,
line: added.line,
@@ -101,6 +103,7 @@ function compareLifecycleScripts(file: string, oldText: string, newText: string)
const line = lineOfJsonKey(newText, key) ?? lineOfJsonStringValue(newText, newValue);
findings.push({
kind: 'lifecycle_script_changed',
category: 'lifecycle',
severity: 'high',
file,
line,
@@ -112,6 +115,7 @@ function compareLifecycleScripts(file: string, oldText: string, newText: string)
if (/(?:curl[^\n|]*\|\s*(?:ba)?sh|wget[^\n|]*\|\s*sh|Invoke-Expression|iex\s*\()/i.test(newValue)) {
findings.push({
kind: 'script_pipe_to_shell',
category: 'lifecycle',
severity: 'critical',
file,
line,
6 changes: 4 additions & 2 deletions src/detectors/dependency-drift.ts
@@ -1,6 +1,6 @@
import { isRecord, lineOfJsonKey } from '../discovery.js';
import { readFileFromSide } from '../git-diff.js';
-import { isPackageJsonFile } from '../paths.js';
+import { isPackageJsonFile, isTestFixturePath } from '../paths.js';
import type { ChangedFile, Finding } from '../types.js';

type PackageDiffMode =
@@ -14,7 +14,7 @@ export async function detectDependencyDrift(
const findings: Finding[] = [];

for (const changed of changedFiles) {
-if (!isPackageJsonFile(changed.file) || changed.status === 'deleted') {
+if (!isPackageJsonFile(changed.file) || isTestFixturePath(changed.file) || changed.status === 'deleted') {
continue;
}

@@ -35,6 +35,7 @@ function compareDependencies(file: string, oldText: string, newText: string): Fi
if (!(name in oldDeps)) {
findings.push({
kind: 'dependency_added',
category: 'dependency',
severity: 'medium',
file,
line: lineOfJsonKey(newText, name),
@@ -48,6 +49,7 @@ function compareDependencies(file: string, oldText: string, newText: string): Fi
if (oldDeps[name] !== version) {
findings.push({
kind: 'dependency_changed',
category: 'dependency',
severity: 'low',
file,
line: lineOfJsonKey(newText, name),
1 change: 1 addition & 0 deletions src/detectors/env-drift.ts
@@ -11,6 +11,7 @@ export function detectEnvDrift(changedFiles: ChangedFile[]): Finding[] {

findings.push({
kind: 'env_file_changed',
category: 'env',
severity: 'high',
file: changed.file,
subject: changed.file,