Add test suite for parse_threat_detection_results and harden threat detection failure handling #22828

Closed
Copilot wants to merge 4 commits into main from copilot/add-test-suite-parse-threat-detection

Contributor

Copilot AI commented Mar 25, 2026

parse_threat_detection_results.cjs had no tests covering its core parsing logic — specifically whether the THREAT_DETECTION_RESULT: line is parsed correctly and whether the builtin default fields (prompt_injection, secret_leak, malicious_patch) hold strict false values.

Changes

  • Extracted parseThreatDetectionResult(content) — function pulled out of main() that takes raw file content, scans for the THREAT_DETECTION_RESULT: line, and returns { verdict, found }. Exported alongside main.

    • Emits core.info when the result line is found.
    • Emits core.warning when no result line is present.
    • A missing result line (found: false) causes main() to call core.setFailed() — absence of a result is treated as a security failure rather than an all-clear.
  • Added parse_threat_detection_results.test.cjs — 15 tests across two groups:

    • Default verdict: found: false and all three fields default to false, reasons to [], on empty content or content with no result marker; warning log is asserted.
    • Parsing: found: true for canonical result line, info log asserted, each threat field individually true, multiple reasons, result embedded in surrounding output, whitespace trimming, first-match semantics, and strict false (not truthy, not null, not undefined) for explicitly-false fields.
  • Conclusion job handles detection failure state — when the detection job fails (threats detected or missing result line), the conclusion job now surfaces it as a first-class failure:

    • notify_comment.go: GH_AW_DETECTION_CONCLUSION is now passed to agentFailureEnvVars (in addition to the status-comment step), so handle_agent_failure receives it.
    • handle_agent_failure.cjs: reads GH_AW_DETECTION_CONCLUSION, logs it, and adds hasDetectionFailure to the early-exit guard — a detection failure triggers failure issue/comment creation even when the agent job itself succeeded.
    • Extracted buildDetectionFailureContext(hasDetectionFailure, runUrl) as an exported helper; wired into both the "update comment" and "create issue" template contexts as detection_failure_context.
    • agent_failure_issue.md and agent_failure_comment.md include {detection_failure_context} after {secret_verification_context}.
    • 4 new tests for buildDetectionFailureContext in handle_agent_failure.test.cjs.
// Example: a missing result line is a failure
// (bound to `missing` so `found` is not declared twice in the same scope)
const missing = parseThreatDetectionResult("agent ran successfully\nno result here");
expect(missing.found).toBe(false); // main() will call core.setFailed()

// Example: parsing the expected output format
const { verdict, found } = parseThreatDetectionResult(
  'THREAT_DETECTION_RESULT:{"prompt_injection":true,"secret_leak":false,"malicious_patch":false,"reasons":["Prompt injection detected"]}'
);
expect(found).toBe(true);
expect(verdict.prompt_injection).toBe(true);
expect(verdict.reasons).toEqual(["Prompt injection detected"]);
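
For readers without the diff at hand, the behavior described under Changes can be approximated as follows. This is a minimal sketch based only on the description above and the snippet quoted later in the review, not the merged implementation: the `core` object is a stand-in for the Actions logger, and the `logs` array, the `RESULT_PREFIX` constant, and the exact log wording are assumptions.

```javascript
// Stand-in for the Actions `core` logger so the sketch runs outside a workflow (assumption).
const logs = [];
const core = {
  info: msg => logs.push(["info", msg]),
  warning: msg => logs.push(["warning", msg]),
};

const RESULT_PREFIX = "THREAT_DETECTION_RESULT:";

function parseThreatDetectionResult(content) {
  // Builtin defaults: each threat field is strictly false and reasons is empty.
  const defaults = { prompt_injection: false, secret_leak: false, malicious_patch: false, reasons: [] };

  for (const line of content.split("\n")) {
    const trimmed = line.trim(); // tolerate leading/trailing whitespace and "\r"
    if (trimmed.startsWith(RESULT_PREFIX)) {
      core.info("Found THREAT_DETECTION_RESULT line, parsing JSON payload");
      // JSON.parse throws on malformed payloads; this sketch does not catch that.
      const payload = JSON.parse(trimmed.substring(RESULT_PREFIX.length));
      // First-match semantics: return immediately and ignore any later result lines.
      return { verdict: { ...defaults, ...payload }, found: true };
    }
  }

  core.warning("No THREAT_DETECTION_RESULT line found");
  return { verdict: defaults, found: false };
}
```

Returning `found` separately from `verdict` is what lets the caller distinguish "the detector reported no threats" from "the detector reported nothing at all", which an all-false default verdict alone cannot express.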

Extract parseThreatDetectionResult pure function and add 11 tests covering:
- default/builtin false values when no result line present
- canonical all-false result line parsing
- prompt_injection, secret_leak, malicious_patch detection
- multiple reasons array parsing
- embedded result line in multi-line content
- whitespace trimming
- first-match semantics

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/e5cd566e-000f-40c8-958f-6deb7f6d5b28
Copilot AI changed the title from "[WIP] Add test suite for parsing threat detection results" to "Add test suite for parse_threat_detection_results" on Mar 25, 2026
Copilot AI requested a review from pelikhan March 25, 2026 04:30
@pelikhan
Collaborator

@copilot

  • make sure a lack of results is marked as a form of failure
  • add logging

…nResult

- parseThreatDetectionResult now returns { verdict, found } instead of the verdict directly
- Logs core.info when result line is found, core.warning when it is missing
- main() calls setFailed when found === false (missing result = security failure)
- Update tests: check found property, assert logging calls, 15 tests total

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/1b924856-4ab9-4a65-b094-036f0841369a
Contributor Author

Copilot AI commented Mar 25, 2026

@copilot

  • make sure a lack of results is marked as a form of failure...

Done in 18e2162:

  • Missing result = failure: parseThreatDetectionResult now returns { verdict, found }. When found is false, main() calls core.setFailed() with an ERR_VALIDATION error — agent output without a THREAT_DETECTION_RESULT: line is rejected rather than treated as all-clear.
  • Logging: core.info is emitted when the result line is found; core.warning is emitted when no result line is present.
  • Tests updated to cover both logging calls and the found property (15 tests total).

@pelikhan pelikhan marked this pull request as ready for review March 25, 2026 04:50
Copilot AI review requested due to automatic review settings March 25, 2026 04:50
@pelikhan
Collaborator

@copilot make sure the conclusion job handles the detection failure state

Contributor

Copilot AI left a comment

Pull request overview

Adds automated test coverage for threat detection result parsing and refactors the parser into a reusable function, while tightening the workflow behavior to fail when the expected result marker is missing.

Changes:

  • Extracted and exported parseThreatDetectionResult(content) from main() to isolate parsing logic.
  • Added a new Vitest suite covering default verdict behavior and multiple parsing scenarios.
  • Updated main() to treat a missing THREAT_DETECTION_RESULT: line as an ERR_VALIDATION failure.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| actions/setup/js/parse_threat_detection_results.cjs | Extracts parsing into parseThreatDetectionResult, logs when found/missing, and fails the action when the marker line is absent. |
| actions/setup/js/parse_threat_detection_results.test.cjs | Adds unit tests validating default verdict values, parsing behavior, and logging expectations. |

Comment on lines +32 to +36
if (trimmedLine.startsWith("THREAT_DETECTION_RESULT:")) {
  const jsonPart = trimmedLine.substring("THREAT_DETECTION_RESULT:".length);
  core.info("🔍 Found THREAT_DETECTION_RESULT line, parsing JSON payload");
  return { verdict: { ...verdict, ...JSON.parse(jsonPart) }, found: true };
}
Copilot AI Mar 25, 2026

parseThreatDetectionResult() calls JSON.parse(jsonPart) without handling parse failures. If the JSON is malformed, this will throw, and main() will fall into the catch block and continue, potentially emitting success=true with the default all-false verdict (i.e., treating an unparseable threat result as safe). Consider catching JSON parse errors here (or in main) and treating them as ERR_VALIDATION failures (set output success=false, call core.setFailed, and return). Also consider logging the "Found..." info message only after the JSON payload has been parsed successfully to avoid misleading logs on failures.
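
One way to act on this comment is to make the parser itself non-throwing, so malformed JSON can never be mistaken for an all-clear. The sketch below is an illustration only: `parseResultLineSafely`, the `RESULT_MARKER` constant, and the extra `error` field are hypothetical names that do not appear in the PR.

```javascript
const RESULT_MARKER = "THREAT_DETECTION_RESULT:";

// Hypothetical hardened variant: never throws, and never hands back an
// all-false verdict unless a result line was parsed successfully.
function parseResultLineSafely(content) {
  const defaults = { prompt_injection: false, secret_leak: false, malicious_patch: false, reasons: [] };

  for (const line of content.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed.startsWith(RESULT_MARKER)) continue;

    try {
      const payload = JSON.parse(trimmed.slice(RESULT_MARKER.length));
      return { verdict: { ...defaults, ...payload }, found: true, error: null };
    } catch (err) {
      // The marker was present but unreadable: report it instead of falling through.
      return { verdict: null, found: true, error: `malformed JSON: ${err.message}` };
    }
  }

  return { verdict: null, found: false, error: "no result line" };
}
```

Setting `verdict` to `null` on every failure path is a deliberate choice in this sketch: a caller that forgets to check `error` hits a TypeError instead of silently reading `false` for each threat field.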

}
verdict = parsed;
} catch (error) {
core.warning("Failed to parse threat detection results: " + getErrorMessage(error));
Copilot AI Mar 25, 2026

main() now fails when no result line is found, but in the catch block it only logs a warning and then proceeds to evaluate/output the default verdict. This means I/O errors or JSON parse errors can still lead to success=true. For threat detection, it seems safer to treat any parsing exception as a validation failure (set output success=false, call core.setFailed(...), and return) rather than continuing with an implicit all-clear.

Suggested change
- core.warning("Failed to parse threat detection results: " + getErrorMessage(error));
+ const message = "Failed to parse threat detection results: " + getErrorMessage(error);
+ core.error(message);
+ core.setOutput("success", "false");
+ core.setFailed(`${ERR_VALIDATION}: ❌ ${message}`);
+ return;

Comment on lines +14 to +15
const { parseThreatDetectionResult } = require("./parse_threat_detection_results.cjs");

Copilot AI Mar 25, 2026

The new behavior in main() (treating a missing THREAT_DETECTION_RESULT: line as a validation failure) isn’t covered by tests here—this suite only exercises parseThreatDetectionResult(). Since main() now has security-significant control flow (setting outputs + failing the run), consider adding a small set of integration-style tests for main() (e.g., file missing, missing result line, valid result line, malformed JSON) by mocking fs/path and asserting core.setOutput/core.setFailed calls.
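
The kind of test suggested here becomes easier to write if the decision logic takes its dependencies as arguments. The following is a rough sketch under several assumptions: `reportDetectionOutcome` and `makeFakeCore` are hypothetical names, the literal `"ERR_VALIDATION"` stands in for the repository's constant, and treating any true threat field as a failure is inferred from the PR description, not from the code.

```javascript
// Hypothetical decision step: `core` and `parse` are injected so tests can pass fakes.
function reportDetectionOutcome(content, core, parse) {
  let result;
  try {
    result = parse(content);
  } catch (err) {
    // Fail closed: a parsing exception is never treated as an all-clear.
    core.setOutput("success", "false");
    core.setFailed(`ERR_VALIDATION: failed to parse threat detection results: ${err.message}`);
    return;
  }

  if (!result.found) {
    core.setOutput("success", "false");
    core.setFailed("ERR_VALIDATION: no THREAT_DETECTION_RESULT line in detection output");
    return;
  }

  const { prompt_injection, secret_leak, malicious_patch } = result.verdict;
  const threatDetected = Boolean(prompt_injection || secret_leak || malicious_patch);
  core.setOutput("success", threatDetected ? "false" : "true");
  if (threatDetected) {
    core.setFailed("Threat detected in agent output");
  }
}

// Minimal recording fake for `core`; a Vitest suite would use vi.fn() instead.
function makeFakeCore() {
  const calls = { outputs: {}, failed: [] };
  return {
    calls,
    setOutput: (name, value) => { calls.outputs[name] = value; },
    setFailed: message => { calls.failed.push(message); },
  };
}
```

With this shape, each scenario listed in the comment (missing result line, valid result line, malformed JSON) reduces to calling the function with a stub `parse` and asserting on `calls.outputs` and `calls.failed`, without mocking fs or path.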

@pelikhan
Collaborator

@copilot review comments

- notify_comment.go: pass GH_AW_DETECTION_CONCLUSION to agentFailureEnvVars when
  threat detection is enabled, so handle_agent_failure receives it alongside the
  existing conclusion step
- handle_agent_failure.cjs: read GH_AW_DETECTION_CONCLUSION, log it, and add
  hasDetectionFailure to the early-exit guard so detection failures trigger issue
  creation even when the agent itself succeeded
- Extract buildDetectionFailureContext(hasDetectionFailure, runUrl) helper (exported
  and tested); wire it into both the "update comment" and "create issue" template
  contexts as detection_failure_context
- agent_failure_issue.md + agent_failure_comment.md: add {detection_failure_context}
  placeholder after {secret_verification_context}
- handle_agent_failure.test.cjs: 4 new tests for buildDetectionFailureContext

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/106d8af4-de83-4d99-ae07-b8c9e7349005
Contributor Author

Copilot AI commented Mar 25, 2026

@copilot make sure the conclusion job handles the detection failure state

Done in 60031f4. The conclusion job now fully handles the detection failure state:

  • notify_comment.go: GH_AW_DETECTION_CONCLUSION is now passed to agentFailureEnvVars (in addition to the existing customEnvVars for the status-comment step), so handle_agent_failure receives it.
  • handle_agent_failure.cjs: reads GH_AW_DETECTION_CONCLUSION, logs it, and adds hasDetectionFailure to the early-exit guard — a detection failure triggers the failure issue/comment even when the agent job itself succeeded.
  • Extracted buildDetectionFailureContext(hasDetectionFailure, runUrl) as an exported helper; wired into both template contexts (detection_failure_context).
  • Both agent_failure_issue.md and agent_failure_comment.md include {detection_failure_context} after {secret_verification_context}.
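
As a rough illustration of how these pieces could fit together: only the `buildDetectionFailureContext(hasDetectionFailure, runUrl)` signature and the `{detection_failure_context}` placeholder come from the PR. The message wording, the `shouldReportFailure` and `renderTemplate` helpers, and the assumption that a failed detection job reports the conclusion string `"failure"` are all hypothetical.

```javascript
// Signature taken from the PR; the returned wording is invented for this sketch.
function buildDetectionFailureContext(hasDetectionFailure, runUrl) {
  if (!hasDetectionFailure) return "";
  return `\n**Threat detection failed**: see the [workflow run](${runUrl}) for details.\n`;
}

// Hypothetical, simplified version of the early-exit guard: report when either
// the agent job failed or the detection job failed.
function shouldReportFailure({ agentConclusion, detectionConclusion }) {
  const hasDetectionFailure = detectionConclusion === "failure"; // assumed value
  return { hasDetectionFailure, report: agentConclusion === "failure" || hasDetectionFailure };
}

// Hypothetical {placeholder} substitution; unknown placeholders are left untouched.
function renderTemplate(template, context) {
  return template.replace(/\{(\w+)\}/g, (match, key) =>
    Object.prototype.hasOwnProperty.call(context, key) ? context[key] : match
  );
}
```

In a workflow, `detectionConclusion` would come from `process.env.GH_AW_DETECTION_CONCLUSION`. The case this PR targets is an agent job that succeeded while detection failed, where the guard must still allow the failure issue or comment to be created.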

Warning

Firewall rules blocked me from connecting to one or more addresses

I tried to connect to the following addresses, but was blocked by firewall rules:

  • https://api.github.com/graphql
    • Triggering command: /usr/bin/gh /usr/bin/gh api graphql -f query=query($owner: String!, $name: String!) { repository(owner: $owner, name: $name) { hasDiscussionsEnabled } } -f owner=github -f name=gh-aw (http block)
  • https://api.github.com/repos/actions/ai-inference/git/ref/tags/v1
    • Triggering command: /usr/bin/gh gh api /repos/actions/ai-inference/git/ref/tags/v1 --jq .object.sha -bool -buildtags /opt/hostedtoolcache/go/1.25.0/x64/pkg/tool/linux_amd64/vet -errorsas -ifaceassert -nilfunc /opt/hostedtoolcache/go/1.25.0/x64/pkg/tool/linux_amd64/vet -ato�� -bool -buildtags /home/REDACTED/work/_temp/ghcca-node/node/bin/bash -errorsas -ifaceassert -nilfunc bash (http block)
  • https://api.github.com/repos/actions/checkout/git/ref/tags/v3
    • Triggering command: /usr/bin/gh gh api /repos/actions/checkout/git/ref/tags/v3 --jq .object.sha "prettier" --check '**/*.cjs' '**/*.ts' '**/*.js-errorsas on ache/go/1.25.0/x64/pkg/tool/linux_amd64/vet /../../.prettiergit erignore ache/node/24.14.--show-toplevel ache/go/1.25.0/x64/pkg/tool/linux_amd64/vet (http block)
  • https://api.github.com/repos/actions/checkout/git/ref/tags/v5
    • Triggering command: /usr/bin/gh gh api /repos/actions/checkout/git/ref/tags/v5 --jq .object.sha th .prettierigno-errorsas (http block)
    • Triggering command: /usr/bin/gh gh api /repos/actions/checkout/git/ref/tags/v5 --jq .object.sha --noprofile iptables /usr/bin/git -t security p/bin/bash git rev-�� --show-toplevel bash /usr/bin/git json' --ignore-pgit 53 ache/go/1.25.0/x--show-toplevel git (http block)
    • Triggering command: /usr/bin/gh gh api /repos/actions/checkout/git/ref/tags/v5 --jq .object.sha --show-toplevel x_amd64/vet /usr/bin/git --noprofile (http block)
  • https://api.github.com/repos/actions/checkout/git/ref/tags/v6
    • Triggering command: /usr/bin/gh gh api /repos/actions/checkout/git/ref/tags/v6 --jq .object.sha --noprofile git (http block)
    • Triggering command: /usr/bin/gh gh api /repos/actions/checkout/git/ref/tags/v6 --jq .object.sha --noprofile git /opt/hostedtoolcache/go/1.25.0/x64/pkg/tool/linux_amd64/vet --local commit.gpgsign _modules/.bin/sh--show-toplevel /opt/hostedtoolcache/go/1.25.0/x64/pkg/tool/linux_amd64/vet -uns�� -unreachable=false /tmp/go-build4184831366/b064/vet.cfg /opt/hostedtoolcache/go/1.25.0/x64/pkg/tool/linux_amd64/vet ath ../../../.prgit (http block)
    • Triggering command: /usr/bin/gh gh api /repos/actions/checkout/git/ref/tags/v6 --jq .object.sha --show-toplevel x_amd64/vet /usr/bin/git --noprofile conntrack x_amd64/vet git rev-�� --show-toplevel x_amd64/vet /usr/bin/git --noprofile (http block)
  • https://api.github.com/repos/actions/github-script/git/ref/tags/v8
    • Triggering command: /usr/bin/gh gh api /repos/actions/github-script/git/ref/tags/v8 --jq .object.sha (http block)
    • Triggering command: /usr/bin/gh gh api /repos/actions/github-script/git/ref/tags/v8 --jq .object.sha 4831366/b374/_pkg_.a git 4831366/b374=> h ../../../.pretgit b/gh-aw/pkg/workrev-parse nfig/composer/ve--show-toplevel /opt/hostedtoolcache/go/1.25.0/x64/pkg/tool/linux_amd64/vet -uns�� Wuwn/hpWLpek2A0oO-XO8Wuwn /tmp/go-build4184831366/b066/vet.cfg /opt/hostedtoolcache/go/1.25.0/x64/pkg/tool/linux_amd64/vet ath ../../../.prgit (http block)
    • Triggering command: /usr/bin/gh gh api /repos/actions/github-script/git/ref/tags/v8 --jq .object.sha --noprofile git /opt/hostedtoolcache/go/1.25.0/x64/pkg/tool/linux_amd64/vet linuxagent-logcogit commit.gpgsign sh /opt/hostedtoolcache/go/1.25.0/x64/pkg/tool/linux_amd64/vet -uns�� -unreachable=false /tmp/go-build4184831366/b129/vet.cfg 4831366/b374/vet.cfg ath ../../../.prgit (http block)
  • https://api.github.com/repos/actions/setup-go/git/ref/tags/v4
    • Triggering command: /usr/bin/gh gh api /repos/actions/setup-go/git/ref/tags/v4 --jq .object.sha --noprofile git 4831366/b250/vet.cfg h ../../../.pretgit gpg.program /sh /opt/hostedtoolcache/go/1.25.0/x64/pkg/tool/linux_amd64/vet -uns�� -unreachable=false /tmp/go-build4184831366/b094/vet.cfg 4831366/b368/vet.cfg ath ../../../.prgit (http block)
  • https://api.github.com/repos/actions/setup-node/git/ref/tags/v4
    • Triggering command: /usr/bin/gh gh api /repos/actions/setup-node/git/ref/tags/v4 --jq .object.sha --noprofile git /opt/hostedtoolcache/go/1.25.0/x64/pkg/tool/linux_amd64/vet ignore-path ../.git gpg.program /home/REDACTED/.do--show-toplevel /opt/hostedtoolcache/go/1.25.0/x64/pkg/tool/linux_amd64/vet -uns�� -unreachable=false /tmp/go-build4184831366/b088/vet.cfg 4831366/b372/vet.cfg --noprofile (http block)
  • https://api.github.com/repos/actions/upload-artifact/git/ref/tags/v4
    • Triggering command: /usr/bin/gh gh api /repos/actions/upload-artifact/git/ref/tags/v4 --jq .object.sha te '**/*.cjs' '**/*.ts' '**/*.json' --ignore-pat-errorsas (http block)
  • https://api.github.com/repos/github/gh-aw-actions/git/ref/tags/v1.0.0
    • Triggering command: /usr/bin/gh gh api /repos/github/gh-aw-actions/git/ref/tags/v1.0.0 --jq .object.sha te '**/*.cjs' '**/*.ts' '**/*.json' --ignore-path ../../../.prettierignore .cfg 64/pkg/tool/linux_amd64/vet (http block)
  • https://api.github.com/repos/github/gh-aw-actions/git/ref/tags/v1.2.3
    • Triggering command: /usr/bin/gh gh api /repos/github/gh-aw-actions/git/ref/tags/v1.2.3 --jq .object.sha te '**/*.cjs' '**/*.ts' '**/*.json' --ignore-path ../../../.pret.prettierignore (http block)
  • https://api.github.com/repos/github/gh-aw/contents/.github%2Fworkflows%2Faudit-workflows.md
    • Triggering command: /opt/hostedtoolcache/node/24.14.0/x64/bin/node /opt/hostedtoolcache/node/24.14.0/x64/bin/node --experimental-import-meta-resolve --require /home/REDACTED/work/gh-aw/gh-aw/actions/setup/js/node_modules/vitest/suppress-warnings.cjs --conditions node --conditions development /home/REDACTED/work/gh-aw/gh-aw/actions/setup/js/node_modules/vitest/dist/workers/forks.js npushed-branch p/bin/git /usr/lib/git-cor--stdout main�� run --auto t --detach git k/gh-aw/gh-aw/no--count git (http block)
  • https://api.github.com/repos/github/gh-aw/git/ref/tags/v1.0.0
    • Triggering command: /usr/bin/gh gh api /repos/github/gh-aw/git/ref/tags/v1.0.0 --jq .object.sha --noprofile (http block)
  • https://api.github.com/repos/nonexistent/action/git/ref/tags/v999.999.999
    • Triggering command: /usr/bin/gh gh api /repos/nonexistent/action/git/ref/tags/v999.999.999 --jq .object.sha files..." (http block)
  • invalid.example.invalid
    • Triggering command: /usr/lib/git-core/git-remote-https /usr/lib/git-core/git-remote-https origin https://invalid.example.invalid/nonexistent-repo.git git add mental-zEfAah' mental-zEfAah' 0/x64/lib/node_modules/npm/node_modules/@npmcli/run-script/lib/node-gyp-bin/git /tmp/bare-incremgit gin/feature-brancheckout de/node/bin/git git clon�� /tmp/bare-incremental-JuP3Al . t ranch..feature-bgit Initial commit it git (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Copilot AI changed the title from "Add test suite for parse_threat_detection_results" to "Add test suite for parse_threat_detection_results and harden threat detection failure handling" on Mar 25, 2026
@pelikhan pelikhan closed this Mar 25, 2026
Copilot stopped work on behalf of pelikhan due to an error March 25, 2026 05:20