56 changes: 36 additions & 20 deletions docs/src/content/docs/index.mdx
@@ -57,27 +57,43 @@ Developed by GitHub and Microsoft, workflows run with added guardrails, using sa

## Guardrails Built-In

AI agents can be manipulated into taking unintended actions—through malicious repository content, compromised tools, or prompt injection. GitHub Agentic Workflows addresses this with five security layers that work together to contain the impact of a confused or compromised agent.

### Read-only tokens

The AI agent receives a GitHub token scoped to read-only permissions. Even if the agent attempts to create a pull request, push code, or delete a file, the underlying token simply doesn't allow it. The agent can observe your repository; it cannot change it.

### Zero secrets in the agent

The agent process never receives write tokens, API keys, or other sensitive credentials. Those secrets exist only in separate, isolated jobs that run _after_ the agent has finished and its output has passed review. A compromised agent has nothing to steal and no credentials to misuse.

### Containerized with a network firewall

The agent runs inside an isolated container. A built-in network firewall—the [Agent Workflow Firewall](/gh-aw/introduction/architecture/#agent-workflow-firewall-awf)—routes all outbound traffic through a Squid proxy enforcing an explicit domain allowlist. Traffic to any other destination is dropped at the kernel level, so a compromised agent cannot exfiltrate data or call out to unexpected servers.
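The allowlist decision described above can be illustrated with a minimal Python sketch. This is not the Agent Workflow Firewall's actual implementation (which the text says is enforced through a Squid proxy); the function name `is_allowed`, the example domains, and the choice to let subdomains of an allowed domain through are all assumptions made for illustration.

```python
def is_allowed(host: str, allowlist: set[str]) -> bool:
    """Return True if `host` matches an allowlisted domain.

    Assumption for this sketch: an entry such as "github.com" also
    permits its subdomains (e.g. "api.github.com"), but not lookalike
    hosts that merely end with the same characters.
    """
    host = host.lower().rstrip(".")
    return any(host == domain or host.endswith("." + domain) for domain in allowlist)


# Hypothetical allowlist; a real workflow would declare its own domains.
ALLOWLIST = {"github.com", "pypi.org"}

allowed = is_allowed("api.github.com", ALLOWLIST)      # subdomain of an allowed domain
lookalike = is_allowed("evil-github.com", ALLOWLIST)   # shares a suffix, not a subdomain
unlisted = is_allowed("example.org", ALLOWLIST)        # not on the list at all
```

Matching on a leading dot rather than a bare suffix is what keeps `evil-github.com` from passing as `github.com`; anything that fails the check would be treated as a blocked destination.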

### Safe outputs with strong guardrails

The agent cannot write to GitHub directly. Instead, it produces a structured artifact describing its intended actions—for example, "create an issue with this title and body." A separate job with [scoped write permissions](/gh-aw/reference/safe-outputs/) reads that artifact and applies only what your workflow explicitly permits: hard limits per operation (such as a maximum of one issue per run), required title prefixes, and label constraints. The agent requests; a gated job decides.
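To make the "agent requests; a gated job decides" split concrete, here is a minimal Python sketch of a gate that filters requested issues against a policy. It is an illustration only, not the project's code: `IssuePolicy`, `gate_issue_requests`, the dictionary shape of a request, and the example prefix and label are all hypothetical stand-ins for the structured artifact and the workflow's configured limits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IssuePolicy:
    """Hypothetical policy for a 'create issue' safe output."""
    max_per_run: int = 1                      # hard limit per run
    title_prefix: str = "[ai] "               # required title prefix
    allowed_labels: frozenset = frozenset({"automation"})  # label constraint


def gate_issue_requests(requests: list[dict], policy: IssuePolicy) -> list[dict]:
    """Return only the requested issues that the policy permits.

    Each request is a dict with "title", "body", and optional "labels",
    standing in for one entry of the agent's structured output.
    """
    approved = []
    for request in requests:
        if len(approved) >= policy.max_per_run:
            break  # limit reached; remaining requests are dropped
        if not request["title"].startswith(policy.title_prefix):
            continue  # missing the required title prefix
        if not set(request.get("labels", [])) <= policy.allowed_labels:
            continue  # asks for a label outside the allowed set
        approved.append(request)
    return approved


requests = [
    {"title": "Delete everything", "body": "...", "labels": []},
    {"title": "[ai] Flaky test report", "body": "...", "labels": ["automation"]},
    {"title": "[ai] Second issue", "body": "...", "labels": ["automation"]},
]
approved = gate_issue_requests(requests, IssuePolicy())
```

In this sketch the first request is rejected for lacking the prefix, the second is approved, and the third is dropped because the per-run limit of one has already been reached. The gate never executes anything the agent wrote; it only selects which requests a separate write job may apply.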

### Agentic threat detection
AI agents can be manipulated by prompt injection, malicious repository content, or compromised tools. GitHub Agentic Workflows uses layered controls to keep each run contained: sandboxing limits where code can execute, scoped permissions limit what it can request, and gated outputs ensure only approved actions reach GitHub.

```mermaid
flowchart LR
INPUT["Repository + Prompt Input"] --> TOKENS["Read-only Token"]
TOKENS --> SECRETS["No Secrets in Agent"]
SECRETS --> SANDBOX["Sandbox + Network Firewall"]
SANDBOX --> SAFE["Safe Outputs Gate"]
SAFE --> DETECT["Threat Detection Scan"]
DETECT --> APPLY["Scoped Write Job"]
```

Before any output is applied, a dedicated [threat detection job](/gh-aw/reference/threat-detection/) runs an AI-powered scan of the agent's proposed changes. It checks for prompt injection attacks, leaked credentials, and malicious code patterns. If anything looks suspicious, the workflow fails immediately and nothing is written to your repository.
<CardGrid>
<Card title="Read-only token">
The agent can read repository state, but it cannot
push commits or write to issues directly.
</Card>
<Card title="No secrets in agent runtime">
Sensitive credentials stay in isolated downstream
jobs, not inside the agent process.
</Card>
<Card title="Sandbox + network firewall">
The agent runs in a container behind the
[Agent Workflow Firewall](/gh-aw/introduction/architecture/#agent-workflow-firewall-awf)
and can only reach allowed destinations.
</Card>
<Card title="Safe outputs gate">
Requested actions are validated against your
configured [safe outputs](/gh-aw/reference/safe-outputs/)
policy before anything is applied.
</Card>
<Card title="Threat detection">
A dedicated
[threat detection job](/gh-aw/reference/threat-detection/)
scans proposed outputs and blocks suspicious changes.
</Card>
</CardGrid>

See the [Security Architecture](/gh-aw/introduction/architecture/) for a full breakdown of the layered defense-in-depth model.
