Threat detection gate is advisory-only by default — `needs.detection.outputs.detection_success` unused in safe_outputs condition
[Content truncated due to length]

## Summary

When `GH_AW_DETECTION_CONTINUE_ON_ERROR=true` (the compiled default in all 32 tested lock files at v0.68.3), the detection parse step calls `core.warning()` instead of `core.setFailed()` on every detection failure path: `threat_detected`, `parse_error`, and `agent_failure`. In that mode the detection job always finishes with result `success`, regardless of what the model returns. The `safe_outputs` job condition gates solely on `needs.detection.result == 'success'` (the job-level result), not on the semantic output `needs.detection.outputs.detection_success`. As a result, safe outputs are never blocked by threat detection in the default configuration, even when the model flags `prompt_injection=true`, `secret_leak=true`, or `malicious_patch=true`. This directly contradicts the published documentation, which describes detection as a hard gate.

## Affected Area

Output trust boundary — threat detection gate (`parse_threat_detection_results.cjs`, `safe_output_handler_manager.cjs`, compiled `safe_outputs` job condition in lock files).

## Reproduction Outline

1. Compile any gh-aw workflow with threat detection enabled (v0.68.3; default: `GH_AW_DETECTION_CONTINUE_ON_ERROR: "true"` in all 32 tested lock files).
2. Observe the generated `safe_outputs` job condition: `if: (!cancelled()) && needs.agent.result != 'skipped' && needs.detection.result == 'success'`.
3. Trigger the workflow and allow the detection model to return `prompt_injection=true` (or simulate a parse error).
4. Observe in `parse_threat_detection_results.cjs`: `setDetectionFailure` calls `core.warning()`, not `core.setFailed()`. Detection job exits 0.
5. Because `needs.detection.result == 'success'` is satisfied, the `safe_outputs` job runs and processes output regardless of detection verdict.
6. Confirm that `needs.detection.outputs.detection_success` (which correctly holds `"false"`) is produced but never referenced in the job condition or in `safe_output_handler_manager.cjs`.
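The steps above can be modeled with a small sketch. This is not the actual `parse_threat_detection_results.cjs` source: the `makeCore` stub, the `runDetectionJob` helper, and the `setDetectionFailure` signature are assumptions made only to illustrate how warn mode decouples the job result from the detection verdict.

```javascript
// Hypothetical model of the failure path described in the steps above.
// `makeCore` is a stand-in for the @actions/core calls the issue names.
function makeCore() {
  return {
    failed: false,
    outputs: {},
    log: [],
    warning(msg) { this.log.push(`warning: ${msg}`); },
    setFailed(msg) { this.failed = true; this.log.push(`error: ${msg}`); },
    setOutput(name, value) { this.outputs[name] = value; },
  };
}

// Assumed shape: the verdict output is always written, but only strict mode
// marks the step (and therefore the job) as failed.
function setDetectionFailure(core, reason, continueOnError) {
  core.setOutput("detection_success", "false");
  if (continueOnError) {
    core.warning(`detection failed: ${reason}`);
  } else {
    core.setFailed(`detection failed: ${reason}`);
  }
}

// Returns what a downstream job would see under `needs.detection`.
function runDetectionJob(reason, continueOnError) {
  const core = makeCore();
  setDetectionFailure(core, reason, continueOnError);
  return { result: core.failed ? "failure" : "success", outputs: core.outputs };
}

console.log(runDetectionJob("threat_detected", true));
console.log(runDetectionJob("threat_detected", false));
```

In this model, warn mode yields `result: "success"` together with `detection_success: "false"`, which is the combination the current `safe_outputs` condition lets through.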

## Observed Behavior

All compiled workflows with threat detection enabled operate in advisory-only mode by default. Detected threats (prompt injection, secret leak, malicious patch) and detection failures are logged as warnings, but safe outputs are processed unconditionally. `GH_AW_DETECTION_CONCLUSION` and `GH_AW_DETECTION_REASON` are passed to the `safe_outputs` job environment but are never consumed by `safe_output_handler_manager.cjs`.

## Expected Behavior

Per published documentation: *"If any threat is detected, the workflow fails and safe outputs are blocked."* When the detection model flags a threat, the `safe_outputs` job should not run. At minimum, the `safe_outputs` job condition should include `&& needs.detection.outputs.detection_success == 'true'`, so the semantic verdict gates output processing independently of warn/strict mode.
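To make the difference concrete, the two job conditions can be written as plain predicates. This is only a sketch: the `needs` object mimics the GitHub Actions context, and the function names `currentCondition` and `proposedCondition` are hypothetical, not part of gh-aw.

```javascript
// Mirrors: (!cancelled()) && needs.agent.result != 'skipped'
//          && needs.detection.result == 'success'
function currentCondition(needs, cancelled = false) {
  return !cancelled &&
    needs.agent.result !== "skipped" &&
    needs.detection.result === "success";
}

// Same condition with the semantic verdict added as an extra clause.
function proposedCondition(needs, cancelled = false) {
  return currentCondition(needs, cancelled) &&
    needs.detection.outputs.detection_success === "true";
}

// Warn-mode run in which the model flagged a threat: the job result is
// still "success", but the verdict output is "false".
const flagged = {
  agent: { result: "success" },
  detection: { result: "success", outputs: { detection_success: "false" } },
};

console.log(currentCondition(flagged));  // true  (safe_outputs runs)
console.log(proposedCondition(flagged)); // false (safe_outputs is skipped)
```

Because the extra clause compares against the string `'true'`, a missing or empty output also blocks the job, which matches a fail-closed reading of the documentation.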

## Security Relevance

The threat detection gate is documented as the primary control preventing AI-generated malicious output (prompt injection results, secret leaks, malicious patches) from reaching GitHub write surfaces via safe outputs. If this gate is advisory-only by default, any payload that triggers a positive detection result is still processed through safe outputs — the documented security control provides no runtime enforcement under the default compiled configuration.

## Suggested Fixes

1. **Smallest direct fix**: Add `&& needs.detection.outputs.detection_success == 'true'` to the generated `safe_outputs` job condition in the compiler (e.g., `safe_jobs_threat_detection.go`). Works regardless of warn/strict mode because `detection_success` is always set.
2. **Change default**: Set `continue-on-error: false` as the default in `ThreatDetectionConfig` so the job-level exit code also reflects the detection outcome.
3. **Defense-in-depth**: Add a detection conclusion check at the top of `safe_output_handler_manager.cjs` that reads `GH_AW_DETECTION_CONCLUSION` (already in env) and fails early if it is `"warning"` or `"failure"`.
4. **Documentation**: Update `docs/src/content/docs/reference/threat-detection.md` to document the `continue-on-error` option and clarify that hard-gate behavior requires explicit configuration.
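As an illustration of fix 3, a guard along the following lines could run before any handler work. It is a minimal sketch under stated assumptions: `assertDetectionPassed` and `BLOCKING_CONCLUSIONS` are hypothetical names, and the only values taken from this report are the `GH_AW_DETECTION_CONCLUSION` and `GH_AW_DETECTION_REASON` variables and the `"warning"` / `"failure"` conclusions.

```javascript
// Hypothetical guard for the top of a safe-outputs handler.
const BLOCKING_CONCLUSIONS = new Set(["warning", "failure"]);

function assertDetectionPassed(env = process.env) {
  const conclusion = (env.GH_AW_DETECTION_CONCLUSION || "").trim().toLowerCase();
  if (BLOCKING_CONCLUSIONS.has(conclusion)) {
    const reason = env.GH_AW_DETECTION_REASON || "unknown";
    throw new Error(
      `safe outputs blocked: detection conclusion=${conclusion}, reason=${reason}`
    );
  }
  // Assumption: an unset conclusion means detection was not configured,
  // so the guard lets the run proceed. A stricter variant could reject it.
}

try {
  assertDetectionPassed({
    GH_AW_DETECTION_CONCLUSION: "warning",
    GH_AW_DETECTION_REASON: "threat_detected",
  });
} catch (err) {
  console.log(err.message);
}
```

A check like this would hold even if the job condition regresses, since it relies only on environment values the `safe_outputs` job already receives.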

## Additional Context

If warn mode is an intentional design decision (e.g., to preserve availability when the detection engine fails unexpectedly), that decision should be documented explicitly. The published documentation currently describes only hard-gate semantics, with no mention of warn mode or `continue-on-error`. A more targeted design would always hard-fail on `threat_detected` and fall back to warn mode only for `agent_failure` and `parse_error`.
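The targeted design could be sketched as a small policy function. The name `detectionAction` and the `"fail"` / `"warn"` return values are hypothetical; the three reason strings are the ones listed in this report, and failing closed on unrecognized reasons is an added assumption.

```javascript
// Hypothetical policy: a positive threat verdict always blocks, while
// infrastructure-style failures may be downgraded in warn mode.
function detectionAction(reason, continueOnError) {
  if (reason === "threat_detected") {
    return "fail"; // never advisory, regardless of continue-on-error
  }
  if (reason === "agent_failure" || reason === "parse_error") {
    return continueOnError ? "warn" : "fail";
  }
  return "fail"; // assumption: unknown reasons fail closed
}

for (const reason of ["threat_detected", "agent_failure", "parse_error"]) {
  console.log(reason, detectionAction(reason, true));
}
```

Under this policy, `continue-on-error` only affects availability when the detector itself breaks, not the outcome of a run in which a threat was actually flagged.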

**gh-aw version**: v0.68.3

Original finding: https://github.com/githubnext/gh-aw-security/issues/2188




> Generated by [File Issue](https://github.com/githubnext/gh-aw-security/actions/runs/25741670719/agentic_workflow) · ● 357.8K · [◷](https://github.com/search?q=repo%3Agithub%2Fgh-aw+is%3Aissue+%22gh-aw-workflow-call-id%3A+githubnext%2Fgh-aw-security%2Ffile-issue%22&type=issues)

**Issue**: #31708