Add test suite for parse_threat_detection_results and harden threat detection failure handling#22828
Add test suite for parse_threat_detection_results and harden threat detection failure handling#22828
Conversation
Extract parseThreatDetectionResult pure function and add 11 tests covering: - default/builtin false values when no result line present - canonical all-false result line parsing - prompt_injection, secret_leak, malicious_patch detection - multiple reasons array parsing - embedded result line in multi-line content - whitespace trimming - first-match semantics Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com> Agent-Logs-Url: https://github.com/github/gh-aw/sessions/e5cd566e-000f-40c8-958f-6deb7f6d5b28
|
…nResult
- parseThreatDetectionResult now returns { verdict, found } instead of the verdict directly
- Logs core.info when result line is found, core.warning when it is missing
- main() calls setFailed when found === false (missing result = security failure)
- Update tests: check found property, assert logging calls, 15 tests total
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/1b924856-4ab9-4a65-b094-036f0841369a
Done in 18e2162:
|
|
@copilot make sure the conclusion job handle the detection failure state |
There was a problem hiding this comment.
Pull request overview
Adds automated test coverage for threat detection result parsing and refactors the parser into a reusable function, while tightening the workflow behavior to fail when the expected result marker is missing.
Changes:
- Extracted and exported
parseThreatDetectionResult(content)frommain()to isolate parsing logic. - Added a new Vitest suite covering default verdict behavior and multiple parsing scenarios.
- Updated
main()to treat a missingTHREAT_DETECTION_RESULT:line as anERR_VALIDATIONfailure.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| actions/setup/js/parse_threat_detection_results.cjs | Extracts parsing into parseThreatDetectionResult, logs when found/missing, and fails the action when the marker line is absent. |
| actions/setup/js/parse_threat_detection_results.test.cjs | Adds unit tests validating default verdict values, parsing behavior, and logging expectations. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if (trimmedLine.startsWith("THREAT_DETECTION_RESULT:")) { | ||
| const jsonPart = trimmedLine.substring("THREAT_DETECTION_RESULT:".length); | ||
| core.info("🔍 Found THREAT_DETECTION_RESULT line, parsing JSON payload"); | ||
| return { verdict: { ...verdict, ...JSON.parse(jsonPart) }, found: true }; | ||
| } |
There was a problem hiding this comment.
parseThreatDetectionResult() calls JSON.parse(jsonPart) without handling parse failures. If the JSON is malformed, this will throw, and main() will fall into the catch block and continue, potentially emitting success=true with the default all-false verdict (i.e., treating an unparseable threat result as safe). Consider catching JSON parse errors here (or in main) and treating them as ERR_VALIDATION failures (set output success=false, call core.setFailed, and return). Also consider logging the "Found..." info message only after the JSON payload has been parsed successfully to avoid misleading logs on failures.
| } | ||
| verdict = parsed; | ||
| } catch (error) { | ||
| core.warning("Failed to parse threat detection results: " + getErrorMessage(error)); |
There was a problem hiding this comment.
main() now fails when no result line is found, but in the catch block it only logs a warning and then proceeds to evaluate/output the default verdict. This means I/O errors or JSON parse errors can still lead to success=true. For threat detection, it seems safer to treat any parsing exception as a validation failure (set output success=false, call core.setFailed(...), and return) rather than continuing with an implicit all-clear.
| core.warning("Failed to parse threat detection results: " + getErrorMessage(error)); | |
| const message = "Failed to parse threat detection results: " + getErrorMessage(error); | |
| core.error(message); | |
| core.setOutput("success", "false"); | |
| core.setFailed(`${ERR_VALIDATION}: ❌ ${message}`); | |
| return; |
| const { parseThreatDetectionResult } = require("./parse_threat_detection_results.cjs"); | ||
|
|
There was a problem hiding this comment.
The new behavior in main() (treating a missing THREAT_DETECTION_RESULT: line as a validation failure) isn’t covered by tests here—this suite only exercises parseThreatDetectionResult(). Since main() now has security-significant control flow (setting outputs + failing the run), consider adding a small set of integration-style tests for main() (e.g., file missing, missing result line, valid result line, malformed JSON) by mocking fs/path and asserting core.setOutput/core.setFailed calls.
|
@copilot review comments |
- notify_comment.go: pass GH_AW_DETECTION_CONCLUSION to agentFailureEnvVars when
threat detection is enabled, so handle_agent_failure receives it alongside the
existing conclusion step
- handle_agent_failure.cjs: read GH_AW_DETECTION_CONCLUSION, log it, and add
hasDetectionFailure to the early-exit guard so detection failures trigger issue
creation even when the agent itself succeeded
- Extract buildDetectionFailureContext(hasDetectionFailure, runUrl) helper (exported
and tested); wire it into both the "update comment" and "create issue" template
contexts as detection_failure_context
- agent_failure_issue.md + agent_failure_comment.md: add {detection_failure_context}
placeholder after {secret_verification_context}
- handle_agent_failure.test.cjs: 4 new tests for buildDetectionFailureContext
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/106d8af4-de83-4d99-ae07-b8c9e7349005
Done in 60031f4. The conclusion job now fully handles the detection failure state:
Warning Firewall rules blocked me from connecting to one or more addresses (expand for details)I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
|
parse_threat_detection_results.cjshad no tests covering its core parsing logic — specifically whether theTHREAT_DETECTION_RESULT:line is parsed correctly and whether the builtin default fields (prompt_injection,secret_leak,malicious_patch) hold strictfalsevalues.Changes
Extracted
parseThreatDetectionResult(content)— function pulled out ofmain()that takes raw file content, scans for theTHREAT_DETECTION_RESULT:line, and returns{ verdict, found }. Exported alongsidemain.core.infowhen the result line is found.core.warningwhen no result line is present.found: false) causesmain()to callcore.setFailed()— absence of a result is treated as a security failure rather than an all-clear.Added
parse_threat_detection_results.test.cjs— 15 tests across two groups:found: falseand all three fields default tofalse,reasonsto[], on empty content or content with no result marker; warning log is asserted.found: truefor canonical result line, info log asserted, each threat field individuallytrue, multiple reasons, result embedded in surrounding output, whitespace trimming, first-match semantics, and strictfalse(not truthy, not null, not undefined) for explicitly-false fields.Conclusion job handles detection failure state — when the detection job fails (threats detected or missing result line), the conclusion job now surfaces it as a first-class failure:
notify_comment.go:GH_AW_DETECTION_CONCLUSIONis now passed toagentFailureEnvVars(in addition to the status-comment step), sohandle_agent_failurereceives it.handle_agent_failure.cjs: readsGH_AW_DETECTION_CONCLUSION, logs it, and addshasDetectionFailureto the early-exit guard — a detection failure triggers failure issue/comment creation even when the agent job itself succeeded.buildDetectionFailureContext(hasDetectionFailure, runUrl)as an exported helper; wired into both the "update comment" and "create issue" template contexts asdetection_failure_context.agent_failure_issue.mdandagent_failure_comment.mdinclude{detection_failure_context}after{secret_verification_context}.buildDetectionFailureContextinhandle_agent_failure.test.cjs.⚡ Quickly spin up Copilot coding agent tasks from anywhere on your macOS or Windows machine with Raycast.