-
Notifications
You must be signed in to change notification settings - Fork 29
[docs] docs: unbloat threat-detection guide (28% reduction) #4705
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -9,12 +9,7 @@ GitHub Agentic Workflows includes automatic threat detection to analyze agent ou | |||||||||||||||||||
|
|
||||||||||||||||||||
| ## How It Works | ||||||||||||||||||||
|
|
||||||||||||||||||||
| Threat detection provides an additional security layer that: | ||||||||||||||||||||
|
|
||||||||||||||||||||
| 1. **Analyzes Agent Output**: Reviews all safe output items (issues, comments, PRs) for malicious content | ||||||||||||||||||||
| 2. **Scans Code Changes**: Examines git patches for suspicious patterns, backdoors, and vulnerabilities | ||||||||||||||||||||
| 3. **Uses Workflow Context**: Leverages the workflow source to distinguish legitimate actions from threats | ||||||||||||||||||||
| 4. **Runs Automatically**: Executes after the main agentic job completes but before safe outputs are applied | ||||||||||||||||||||
| Threat detection provides an additional security layer by analyzing agent output for malicious content, scanning code changes for suspicious patterns, using workflow context to distinguish legitimate actions from threats, and running automatically after the main job completes but before safe outputs are applied. | ||||||||||||||||||||
|
|
||||||||||||||||||||
| **Security Architecture:** | ||||||||||||||||||||
|
|
||||||||||||||||||||
|
|
@@ -49,11 +44,7 @@ safe-outputs: | |||||||||||||||||||
| create-pull-request: | ||||||||||||||||||||
| ``` | ||||||||||||||||||||
|
|
||||||||||||||||||||
| The default configuration uses AI-powered analysis with the Agentic engine to detect: | ||||||||||||||||||||
|
|
||||||||||||||||||||
| - **Prompt Injection**: Malicious instructions attempting to manipulate AI behavior | ||||||||||||||||||||
| - **Secret Leaks**: Exposed API keys, tokens, passwords, or credentials | ||||||||||||||||||||
| - **Malicious Patches**: Code changes introducing vulnerabilities, backdoors, or suspicious patterns | ||||||||||||||||||||
| The default configuration uses AI-powered analysis to detect prompt injection (malicious instructions manipulating AI behavior), secret leaks (exposed API keys, tokens, passwords, credentials), and malicious patches (code changes introducing vulnerabilities, backdoors, or suspicious patterns). | ||||||||||||||||||||
|
|
||||||||||||||||||||
| ## Configuration Options | ||||||||||||||||||||
|
|
||||||||||||||||||||
|
|
@@ -89,13 +80,12 @@ safe-outputs: | |||||||||||||||||||
|
|
||||||||||||||||||||
| **Configuration Fields:** | ||||||||||||||||||||
|
|
||||||||||||||||||||
| - **`enabled`** (boolean): Enable or disable threat detection. Default: `true` when safe-outputs exist | ||||||||||||||||||||
| - **`prompt`** (string): Additional custom instructions appended to the default threat detection prompt | ||||||||||||||||||||
| - **`engine`** (string | object | false): AI engine configuration for detection | ||||||||||||||||||||
| - String format: `"copilot"` | ||||||||||||||||||||
| - Object format: Full engine configuration (same as main workflow engine) | ||||||||||||||||||||
| - `false`: Disable AI-based detection, run only custom steps | ||||||||||||||||||||
| - **`steps`** (array): Additional GitHub Actions steps to run after AI analysis | ||||||||||||||||||||
| | Field | Type | Description | | ||||||||||||||||||||
| |-------|------|-------------| | ||||||||||||||||||||
| | `enabled` | boolean | Enable or disable detection (default: `true` when safe-outputs exist) | | ||||||||||||||||||||
| | `prompt` | string | Custom instructions appended to default detection prompt | | ||||||||||||||||||||
| | `engine` | string/object/false | AI engine config (`"copilot"`, full config object, or `false` for no AI) | | ||||||||||||||||||||
| | `steps` | array | Additional GitHub Actions steps to run after AI analysis | | ||||||||||||||||||||
|
|
||||||||||||||||||||
| ## AI-Based Detection (Default) | ||||||||||||||||||||
|
|
||||||||||||||||||||
|
|
@@ -116,10 +106,7 @@ safe-outputs: | |||||||||||||||||||
| Analyze code and create pull requests with improvements. | ||||||||||||||||||||
| ``` | ||||||||||||||||||||
|
|
||||||||||||||||||||
| The AI engine receives the workflow source context and analyzes: | ||||||||||||||||||||
| - Agent output items (issues, comments, PRs) | ||||||||||||||||||||
| - Git patch files with code changes | ||||||||||||||||||||
| - Workflow intent and legitimate use cases | ||||||||||||||||||||
| The AI engine receives the workflow source context and analyzes agent output items (issues, comments, PRs), git patch files with code changes, and workflow intent to distinguish legitimate actions. | ||||||||||||||||||||
|
|
||||||||||||||||||||
| **Output Format:** | ||||||||||||||||||||
|
|
||||||||||||||||||||
|
|
@@ -212,20 +199,9 @@ safe-outputs: | |||||||||||||||||||
| path: /tmp/gh-aw/threat-detection/ | ||||||||||||||||||||
| ``` | ||||||||||||||||||||
|
|
||||||||||||||||||||
| **Available Artifacts:** | ||||||||||||||||||||
|
|
||||||||||||||||||||
| Custom steps have access to these downloaded artifacts: | ||||||||||||||||||||
| **Available Artifacts:** Custom steps have access to `/tmp/gh-aw/threat-detection/prompt.txt` (workflow prompt), `agent_output.json` (safe output items), and `aw.patch` (git patch file). | ||||||||||||||||||||
|
|
||||||||||||||||||||
| - `/tmp/gh-aw/threat-detection/prompt.txt` - Workflow prompt | ||||||||||||||||||||
| - `/tmp/gh-aw/threat-detection/agent_output.json` - Safe output items | ||||||||||||||||||||
| - `/tmp/gh-aw/threat-detection/aw.patch` - Git patch file | ||||||||||||||||||||
|
|
||||||||||||||||||||
| **Execution Order:** | ||||||||||||||||||||
|
|
||||||||||||||||||||
| 1. Download artifacts (prompt, output, patch) | ||||||||||||||||||||
| 2. Run AI-based analysis (if engine not disabled) | ||||||||||||||||||||
| 3. Execute custom steps | ||||||||||||||||||||
| 4. Upload detection log artifact | ||||||||||||||||||||
| **Execution Order:** Download artifacts → Run AI analysis (if enabled) → Execute custom steps → Upload detection log. | ||||||||||||||||||||
|
|
||||||||||||||||||||
| ## Example: LlamaGuard Integration | ||||||||||||||||||||
|
|
||||||||||||||||||||
|
|
@@ -243,74 +219,32 @@ safe-outputs: | |||||||||||||||||||
| uses: actions/github-script@v7 | ||||||||||||||||||||
| with: | ||||||||||||||||||||
| script: | | ||||||||||||||||||||
| const fs = require('fs'); | ||||||||||||||||||||
|
|
||||||||||||||||||||
| // Install Ollama | ||||||||||||||||||||
| // Install and start Ollama service | ||||||||||||||||||||
| await exec.exec('curl', ['-fsSL', 'https://ollama.com/install.sh', '-o', '/tmp/install.sh']); | ||||||||||||||||||||
| await exec.exec('sh', ['/tmp/install.sh']); | ||||||||||||||||||||
|
|
||||||||||||||||||||
| // Start Ollama service | ||||||||||||||||||||
| exec.exec('ollama', ['serve'], { detached: true }); | ||||||||||||||||||||
|
|
||||||||||||||||||||
| // Wait for service | ||||||||||||||||||||
| let ready = false; | ||||||||||||||||||||
| for (let i = 0; i < 30; i++) { | ||||||||||||||||||||
| try { | ||||||||||||||||||||
| await exec.exec('curl', ['-f', 'http://localhost:11434/api/version'], { silent: true }); | ||||||||||||||||||||
| ready = true; | ||||||||||||||||||||
| break; | ||||||||||||||||||||
| } catch (e) { | ||||||||||||||||||||
| await new Promise(r => setTimeout(r, 1000)); | ||||||||||||||||||||
| } | ||||||||||||||||||||
| } | ||||||||||||||||||||
|
|
||||||||||||||||||||
| if (!ready) { | ||||||||||||||||||||
| core.setFailed('Ollama service failed to start'); | ||||||||||||||||||||
| return; | ||||||||||||||||||||
| } | ||||||||||||||||||||
|
|
||||||||||||||||||||
| // Pull LlamaGuard model | ||||||||||||||||||||
|
|
||||||||||||||||||||
| // Pull model and scan output | ||||||||||||||||||||
| await exec.exec('ollama', ['pull', 'llama-guard3:1b']); | ||||||||||||||||||||
|
|
||||||||||||||||||||
| // Scan agent output | ||||||||||||||||||||
| const outputPath = '/tmp/gh-aw/threat-detection/agent_output.json'; | ||||||||||||||||||||
| if (fs.existsSync(outputPath)) { | ||||||||||||||||||||
| const content = fs.readFileSync(outputPath, 'utf8'); | ||||||||||||||||||||
|
|
||||||||||||||||||||
| const response = await exec.getExecOutput('curl', [ | ||||||||||||||||||||
| '-X', 'POST', | ||||||||||||||||||||
| 'http://localhost:11434/api/chat', | ||||||||||||||||||||
| '-H', 'Content-Type: application/json', | ||||||||||||||||||||
| '-d', JSON.stringify({ | ||||||||||||||||||||
| model: 'llama-guard3:1b', | ||||||||||||||||||||
| messages: [{ role: 'user', content }], | ||||||||||||||||||||
| stream: false | ||||||||||||||||||||
| }) | ||||||||||||||||||||
| ]); | ||||||||||||||||||||
|
|
||||||||||||||||||||
| const result = JSON.parse(response.stdout); | ||||||||||||||||||||
| const output = result.message?.content || ''; | ||||||||||||||||||||
|
|
||||||||||||||||||||
| // Check if safe | ||||||||||||||||||||
| const isSafe = output.toLowerCase().trim() === 'safe' || output.includes('s8'); | ||||||||||||||||||||
|
|
||||||||||||||||||||
| if (!isSafe) { | ||||||||||||||||||||
| core.setFailed(`LlamaGuard detected threat: ${output}`); | ||||||||||||||||||||
| } else { | ||||||||||||||||||||
| core.info('✅ Content appears safe'); | ||||||||||||||||||||
| } | ||||||||||||||||||||
| } | ||||||||||||||||||||
|
|
||||||||||||||||||||
| timeout-minutes: 20 # Allow time for model download | ||||||||||||||||||||
| const content = require('fs').readFileSync('/tmp/gh-aw/threat-detection/agent_output.json', 'utf8'); | ||||||||||||||||||||
|
||||||||||||||||||||
| const content = require('fs').readFileSync('/tmp/gh-aw/threat-detection/agent_output.json', 'utf8'); | |
| // In production, check file existence before reading. Here, we handle missing file gracefully. | |
| let content; | |
| try { | |
| content = require('fs').readFileSync('/tmp/gh-aw/threat-detection/agent_output.json', 'utf8'); | |
| } catch (err) { | |
| core.setFailed('agent_output.json not found: ' + err.message); | |
| return; | |
| } |
Copilot
AI
Nov 25, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The condition isSafe check on line 237 is overly simplified and may produce incorrect results. The original implementation checked for both output.toLowerCase().trim() === 'safe' and output.includes('s8'), but this only checks if 'safe' appears anywhere in the content (case-insensitive). This could lead to false negatives if the response contains 'safe' as part of a larger warning message. Consider adding a comment noting this simplification or being more explicit about the check (e.g., checking for 'safe' at the start of the response).
| const isSafe = result.message?.content.toLowerCase().includes('safe'); | |
| // Check for exact "safe" response or model-specific code (e.g., "s8"). | |
| const output = result.message?.content?.toLowerCase().trim(); | |
| const isSafe = output === 'safe' || output.includes('s8'); |
Copilot
AI
Nov 25, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message "LlamaGuard detected threat" on line 238 is less informative than the original implementation which included the actual threat output. When a threat is detected, users need to know what the threat was, not just that one exists. Consider including at least a reference to checking the logs or adding ${result.message?.content} to provide actionable information.
| if (!isSafe) core.setFailed('LlamaGuard detected threat'); | |
| if (!isSafe) core.setFailed(`LlamaGuard detected threat: ${result.message?.content}`); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The simplified code example in lines 222-238 has a critical flaw: the Ollama service is started in detached mode without any wait or readiness check. This can lead to race conditions where the subsequent commands (pull model, curl API) fail because the service isn't ready yet.
The original implementation included proper service readiness polling which was essential. While the goal is to simplify the example, removing all error handling and readiness checks makes this code unreliable in practice. Consider keeping at least a minimal wait/retry mechanism or adding a comment warning that service readiness checking is needed for production use.