A daily exploratory testing session (§24597696067) identified four reproducible bugs in the audit, logs, and compile MCP tools. Testing was performed on 2026-04-18 against 194 workflows.
Bug 1 — logs --count N always fails (CLI bridge passes integers as strings)
Severity: High — count-based pagination is completely broken
The --count parameter is passed as a string "3" by the CLI bridge but the MCP tool's JSON Schema requires an integer. Every call with --count produces a validation error.
Reproduction:
agenticworkflows logs --count 3
Actual output:
validating "arguments": validating root: validating /properties/count:
type: 3 has type "string", want "integer"
Expected: Logs limited to 3 results.
Impact: Users cannot paginate or limit log results. Because every call with --count fails validation, the only working invocation omits it and downloads the full dataset.
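One possible direction for a fix is to have the bridge coerce flag values to the type declared in the tool's input schema before sending the call. The following is a minimal Python sketch of that idea, not the bridge's actual code: the helper name `coerce_integer_args` and the `LOGS_SCHEMA` fragment are hypothetical, and the schema shape is only inferred from the validation error above.

```python
def coerce_integer_args(args, schema):
    """Return a copy of args with integer-typed properties converted from strings.

    Hypothetical helper: `schema` is assumed to be the JSON Schema object
    describing the tool's input.
    """
    properties = schema.get("properties", {})
    coerced = dict(args)
    for name, value in args.items():
        declared = properties.get(name, {}).get("type")
        if declared == "integer" and isinstance(value, str):
            try:
                coerced[name] = int(value)
            except ValueError:
                # Leave the value untouched so schema validation still reports it.
                pass
    return coerced


# Assumed schema fragment for the logs tool (illustrative only).
LOGS_SCHEMA = {"type": "object", "properties": {"count": {"type": "integer"}}}

fixed = coerce_integer_args(
    {"count": "3", "workflow_name": "daily-issues-report"}, LOGS_SCHEMA
)
```

Values that cannot be parsed as integers are passed through unchanged, so the existing validation error remains the single source of truth for bad input.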
Bug 2 — compile --workflows "name" always fails (CLI bridge passes arrays as strings)
Severity: High — targeted single-workflow compilation is completely broken
The --workflows parameter expects a JSON array but the CLI bridge passes the value as a plain string. There is no way to compile a specific workflow without compiling all 194.
Reproduction:
agenticworkflows compile --workflows "daily-issues-report"
Actual output:
validating "arguments": validating root: validating /properties/workflows:
type: daily-issues-report has type "string", want one of "null, array"
Expected: Only daily-issues-report.md is compiled.
Impact: Every compile invocation must process all workflows, making targeted validation slow and expensive. It also blocks the max_tokens efficiency pattern recommended in workflow tooling documentation.
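A similar coercion could wrap a plain string in a list when the schema allows an array. The sketch below is a hedged illustration in Python: `coerce_array_arg` is a hypothetical helper, the `["null", "array"]` type is taken from the error message above, and treating commas as separators between workflow names is an assumption rather than documented CLI behavior.

```python
def coerce_array_arg(value, declared_type):
    """Wrap a plain string in a list when the declared type allows an array.

    Hypothetical helper: `declared_type` is the JSON Schema "type" keyword,
    which may be a single string or a list of alternatives.
    """
    allowed = declared_type if isinstance(declared_type, list) else [declared_type]
    if "array" in allowed and isinstance(value, str):
        # Assumption: a comma-separated string names several workflows.
        return [item.strip() for item in value.split(",") if item.strip()]
    return value


# Type alternatives reported by the validation error for /properties/workflows.
WORKFLOWS_TYPE = ["null", "array"]

single = coerce_array_arg("daily-issues-report", WORKFLOWS_TYPE)
```

With this in place, `--workflows "daily-issues-report"` would reach the tool as a one-element array, while `None` and values for non-array parameters pass through unchanged.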
Bug 3 — audit error responses have isError: false instead of isError: true
Severity: Medium — breaks automated error detection
When audit is called with an invalid or non-existent run ID, it returns error content but sets isError: false in the MCP response envelope. Per the MCP specification, tool errors must set isError: true.
Reproduction:
agenticworkflows audit --run_id_or_url "9999999999"
Actual MCP response:
{
  "result": {
    "isError": false,
    "content": [{"type": "text", "text": "{\"error\":\"failed to audit workflow run: ✗ failed to fetch run metadata\",\"run_id_or_url\":\"9999999999\"}"}]
  }
}
Expected: "isError": true
Impact: Any client checking isError to distinguish success from failure will silently treat error responses as successes.
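Until the flag is fixed on the server side, a client could guard against this case defensively. The Python sketch below is a heuristic workaround, not part of the MCP specification: `looks_like_tool_error` is a hypothetical helper that trusts `isError` when it is true and otherwise checks whether a text content item is a JSON object with an "error" key, which is the shape shown in the response above.

```python
import json


def looks_like_tool_error(result):
    """Return True if an MCP tool result appears to signal failure.

    Heuristic fallback: a text item whose JSON payload has an "error" key
    is treated as an error even when isError is false.
    """
    if result.get("isError") is True:
        return True
    for item in result.get("content", []):
        if item.get("type") != "text":
            continue
        try:
            payload = json.loads(item.get("text", ""))
        except (json.JSONDecodeError, TypeError):
            # Non-JSON or missing text cannot be inspected; skip it.
            continue
        if isinstance(payload, dict) and "error" in payload:
            return True
    return False


# The "result" object observed in this report.
observed = {
    "isError": False,
    "content": [
        {
            "type": "text",
            "text": '{"error":"failed to audit workflow run: ✗ failed to fetch run metadata","run_id_or_url":"9999999999"}',
        }
    ],
}
```

The heuristic can misclassify a successful response that legitimately contains an "error" field, so it is best treated as a temporary guard.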
Bug 4 — logs cache directory not accessible to callers
Severity: Medium — logs output is unreadable
logs writes results to /tmp/gh-aw/logs-cache/<hash>.json and returns that path to the caller, but the directory has permissions that prevent access:
ls: cannot open directory '/tmp/gh-aw/logs-cache/': Permission denied
Unlike /tmp/gh-aw/aw-mcp/logs/ (which is accessible and pre-populated with run directories), the logs-cache/ directory used for inline tool responses is not readable. The logs tool's returned file_path field is therefore unusable.
Workaround: Use the pre-downloaded logs in /tmp/gh-aw/aw-mcp/logs/ directly.
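A caller could apply this workaround automatically by checking whether the returned path is readable before using it. This is a minimal Python sketch under stated assumptions: `resolve_logs_source` is a hypothetical helper, and the fallback directory is the accessible path named in this report.

```python
import os

# Accessible directory named in this report; used when the returned path is unreadable.
FALLBACK_LOGS_DIR = "/tmp/gh-aw/aw-mcp/logs/"


def resolve_logs_source(file_path, fallback_dir=FALLBACK_LOGS_DIR):
    """Return file_path if it is a readable file, otherwise the fallback directory."""
    # os.path.isfile returns False when a parent directory cannot be traversed,
    # which covers the "Permission denied" case described above.
    if file_path and os.path.isfile(file_path) and os.access(file_path, os.R_OK):
        return file_path
    return fallback_dir
```

The check only selects a location; reading and parsing the run directories under the fallback path is left to the caller.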
What Works Correctly
View passing test results
- ✅ status: lists all 194 workflows with engine, trigger, and compile status
- ✅ audit: rich output for valid runs: overview, metrics, tool_usage, safe_output_summary, created_items, key_findings, recommendations, behavior_fingerprint
- ✅ audit safe outputs: correctly reports safe_output_summary (type breakdown) and created_items (with URLs and timestamps) for runs with safe output activity
- ✅ audit failed run: correctly identifies failure, reports high token usage and excessive turn count
- ✅ logs --workflow_name: filters to a named workflow (empty summary returned gracefully for non-existent workflows)
- ✅ logs --start_date / --end_date: date range filtering works; future dates and old dates handled gracefully (no crash)
- ✅ compile (no args): compiles all 194 workflows, 0 errors, 0 warnings
- ✅ compile error detection: correctly reports invalid tool names and invalid YAML with line numbers and context
- ✅ compile YAML parse errors: reports exact line/column with surrounding context
Environment
- Repository: github/gh-aw
- Run ID: §24597696067
- Date: 2026-04-18
- Workflows tested: 194
- Audit runs tested: 4 (success, failure, safe-outputs, invalid ID)
Generated by Daily CLI Tools Exploratory Tester