2 changes: 2 additions & 0 deletions .github/workflows/secret-scan.yml
@@ -14,6 +14,8 @@ jobs:
steps:
- name: Checkout
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Run gitleaks
uses: gitleaks/gitleaks-action@v2
env:
76 changes: 44 additions & 32 deletions README.md
@@ -6,15 +6,20 @@ Kubernetes orchestrator that turns GitHub issues into pull requests using AI age

This project automates the **Issue -> Label -> Pull Request** flow: an `ai-pr-*` label on an issue triggers an AI worker that clones the repo, solves the problem, and opens a PR.

It avoids vendor lock-in with 3 built-in providers:
It avoids AI vendor lock-in with 3 built-in worker providers:

| Label | Provider | Backend |
| -------------- | ----------- | ----------------------- |
| `ai-pr-claude` | Claude Code | Anthropic |
| `ai-pr-codex` | Codex | OpenAI |
| `ai-pr-aider` | Aider | OpenRouter (extensible) |

The architecture is designed to easily add more providers (see `CONTRIBUTING.md`).
The source hosting layer is abstracted behind `SourceProvider`; GitHub is the
only built-in source provider today. See
`docs/adr/0001-source-provider-abstraction.md` for the design decision.

The worker architecture is designed to easily add more AI providers (see
`CONTRIBUTING.md`).

Tested on: VPS / 8 GB RAM / 4 vCPU / k3s single-node.

@@ -33,11 +38,12 @@ GitHub Issue (label ai-pr-*)
POST /webhook/github
|
v
+-------------------+
| Orchestrator | Deployment FastAPI
| app/app.py |
+--------+----------+
| creates a K8s Job based on the provider
+-------------------+
| Orchestrator | Deployment FastAPI
| app/app.py |
| providers/source | GitHub webhook + clone credentials
+--------+----------+
| creates a K8s Job based on the AI worker provider
v
+----------------+ +----------------+ +----------------+
| worker-claude | | worker-codex | | worker-aider |
@@ -48,7 +54,9 @@ GitHub Issue (label ai-pr-*)
clone > AI fix > commit > push > PR
```

**GitHub auth flow**: the orchestrator generates an ephemeral installation token (1h) via GitHub App JWT. Workers never receive the PEM key.
**Source auth flow**: `GitHubProvider` generates an ephemeral installation token
(1h) via GitHub App JWT and returns git clone credentials to the orchestrator.
Workers receive only the short-lived token and never receive the PEM key.
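
The two steps of that flow (an App JWT, then an installation-token request) can be sketched as below. This is a minimal illustration, not the project's `GitHubProvider`: the function names are hypothetical, signing the claims with the App's PEM key (RS256) is left to a JWT library, and the request object is only built, never sent.

```python
import time
import urllib.request
from typing import Optional

GITHUB_API = "https://api.github.com"


def build_app_jwt_claims(app_id: str, now: Optional[int] = None) -> dict:
    """Claims for a GitHub App JWT (hypothetical helper).

    GitHub rejects App JWTs that live longer than 10 minutes, so the sketch
    uses 9 minutes and backdates `iat` by 60 seconds to absorb clock drift.
    """
    now = int(time.time()) if now is None else now
    return {"iat": now - 60, "exp": now + 9 * 60, "iss": app_id}


def build_installation_token_request(
    installation_id: int, signed_app_jwt: str
) -> urllib.request.Request:
    """Build (but do not send) the request that exchanges the signed App JWT
    for a short-lived installation token."""
    url = f"{GITHUB_API}/app/installations/{installation_id}/access_tokens"
    return urllib.request.Request(
        url,
        method="POST",
        headers={
            "Authorization": f"Bearer {signed_app_jwt}",
            "Accept": "application/vnd.github+json",
        },
    )
```

Only the orchestrator would run this code, since it is the only component that holds the PEM key; the worker sees just the token returned by the second call.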

---

@@ -57,7 +65,7 @@ GitHub Issue (label ai-pr-*)
### 1. Prerequisites

- A VPS (or machine) with 4 vCPU / 8 GB RAM minimum
- API keys for your desired providers
- API keys for your desired AI worker providers
- **Ansible option**: `ansible` installed locally + SSH root access to the VPS
- **Manual option**: k3s, Docker, and `kubectl` installed on the VPS

@@ -251,22 +259,24 @@ curl -s -X POST http://127.0.0.1:8080/jobs/run -H "Authorization: Bearer <ADMIN_

| Surface | Risk | Mitigation |
| --- | --- | --- |
| **Incoming webhook** | Fake webhook to trigger a job | HMAC-SHA256 signature (`WEBHOOK_SECRET`) verified on every request |
| **Admin endpoints** | Unauthorized access | Bearer token (`ADMIN_TOKEN`), not exposed via Ingress |
| **GitHub App private key** | Theft = full access | PEM in orchestrator pod only, workers receive an ephemeral token (1h) |
| **GitHub token (workers)** | Compromised worker | Token stored in ephemeral K8s Secret (ownerReference to Job), scoped to one installation, expires in 1h, ephemeral container |
| **AI API keys** | Leak | Injected via K8s `secretKeyRef`, one secret per provider |
| **AI code execution** | Malicious code | Workers run as non-root, ephemeral, no persistent volume |
| **Git credentials** | Token in logs | Auth via `GIT_ASKPASS`, no token in URLs |
| **K8s RBAC** | Out-of-scope access | Role limited to `ai-bot` namespace, workers without ServiceAccount |
| **Incoming webhook** | Fake webhook to trigger a job | HMAC-SHA256 signature (`WEBHOOK_SECRET`) verified on every request |
| **Admin endpoints** | Unauthorized access | Bearer token (`ADMIN_TOKEN`), not exposed via Ingress |
| **Source-provider private key** | Theft = source repo access | Secret stays in orchestrator pod only; workers receive short-lived clone credentials |
| **GitHub token (workers)** | Compromised worker | Token stored in ephemeral K8s Secret (ownerReference to Job), scoped to one installation, expires in 1h, ephemeral container |
| **AI API keys** | Leak | Injected via K8s `secretKeyRef`, one secret per AI worker provider |
| **AI code execution** | Malicious code | Workers run as non-root, ephemeral, no persistent volume |
| **Git credentials** | Token in logs | Auth via `GIT_ASKPASS`, no token in URLs |
| **K8s RBAC** | Out-of-scope access | Role limited to `ai-bot` namespace, workers without ServiceAccount |
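
The HMAC-SHA256 check in the first row can be sketched as follows. It assumes GitHub's `X-Hub-Signature-256` header format (`sha256=<hex digest>` computed over the raw request body); the function name is hypothetical and is not taken from `app/app.py`.

```python
import hashlib
import hmac
from typing import Optional


def verify_github_signature(
    secret: bytes, body: bytes, signature_header: Optional[str]
) -> bool:
    """Return True only if the header matches the HMAC-SHA256 of the raw body."""
    if not signature_header or not signature_header.startswith("sha256="):
        return False
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # Compare as bytes in constant time to avoid leaking how many
    # leading characters of the signature were correct.
    return hmac.compare_digest(
        expected.encode("utf-8"), signature_header.encode("utf-8")
    )
```

The digest must be computed over the bytes exactly as received, before any JSON parsing, because re-serializing the payload can change whitespace and invalidate the signature.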

### Production Recommendations

- Use a secrets operator (Sealed Secrets, External Secrets)
- Restrict RBAC access to Secrets and Jobs
- Monitor jobs > 30 min (token expires at 1h)
- Regularly rotate `WEBHOOK_SECRET` and `ADMIN_TOKEN`
- See `SECURITY.md` for vulnerability reporting
- Restrict RBAC access to Secrets and Jobs
- Monitor jobs > 30 min (token expires at 1h)
- Regularly rotate `WEBHOOK_SECRET` and `ADMIN_TOKEN`
- Review any new `SourceProvider` for webhook verification, credential scope,
and logging behavior
- See `SECURITY.md` for vulnerability reporting and provider security rules

---

@@ -292,9 +302,10 @@ sudo systemctl status k3s --no-pager -l

```text
.
|-- app/
| |-- app.py # FastAPI Orchestrator
| `-- requirements.txt
|-- app/
| |-- app.py # FastAPI Orchestrator
| |-- config.py # Runtime env/config
| `-- requirements.txt
|-- images/
| |-- orchestrator/Dockerfile
| |-- worker-claude/ # Dockerfile + run.sh
@@ -307,21 +318,22 @@ sudo systemctl status k3s --no-pager -l
| |-- ai-issue-*.yaml # Manual jobs per provider
| |-- debug-*.yaml # Debug jobs per provider
| `-- secrets/ # Templates (no values)
|-- providers/
| |-- git_workflow.sh # Shared Git logic
| |-- claude_code.sh
| |-- openai.sh
| `-- aider.sh
|-- providers/
| |-- source/ # SourceProvider interface + GitHub implementation
| |-- git_workflow.sh # Shared Git logic
| |-- claude_code.sh
| |-- openai.sh
| `-- aider.sh
|-- ansible/
| |-- playbook.yml # Full VPS deployment
| |-- inventory.ini
| |-- inventory-local.ini
| |-- inventory-prod.ini # gitignored
| |-- requirements.yml # Ansible collections
| `-- group_vars/vps.yml
|-- docs/
| |-- catalog-info.yaml # Backstage service catalog
| `-- workspace.dsl # C4 architecture (Structurizr)
|-- docs/
| |-- adr/ # Architecture decision records
| `-- workspace.dsl # C4 architecture (Structurizr)
|-- .github/
| `-- workflows/secret-scan.yml # CI secret scanning
|-- CONTRIBUTING.md
27 changes: 26 additions & 1 deletion SECURITY.md
@@ -19,13 +19,38 @@ If your report contains secrets, rotate them immediately after sharing.

## Scope Notes

- This project handles sensitive material (GitHub App private key, API keys, webhook secret, admin token).
- This project handles sensitive material: source-provider credentials, AI API keys, webhook secrets, admin tokens, and short-lived git clone tokens.
- GitHub is currently the only built-in source provider. Its App private key must stay in the orchestrator pod only.
- Workers must receive only short-lived source credentials, never long-lived source-provider private keys.
- Never commit real secret values to git history.
- Kubernetes secret manifests under `k8s/secrets/` are templates only.
- Webhook fixture files under `tests/` must be anonymized and must not contain real repository names, users, tokens, signatures, private issue content, or internal URLs.

## Source Provider Security

Source providers live under `providers/source/` and own provider-specific webhook
verification, event parsing, API calls, and git clone credentials.

Provider implementations must:

- Verify webhook authenticity before parsing or acting on the payload.
- Treat webhook bodies, `raw` payloads, comments, issue bodies, and PR bodies as untrusted input.
- Return the shortest-lived and narrowest-scoped git credentials available from `get_clone_credentials(repo)`.
- Avoid logging tokens, webhook signatures, private keys, issue bodies, comments, or full raw payloads.
- Keep provider-specific secrets in the orchestrator, not in worker images or manifests.
- Document any provider that cannot issue short-lived repo-scoped credentials.

For GitHub, `get_clone_credentials(repo)` uses a GitHub App installation token.
Workers receive that token through an ephemeral Kubernetes Secret and do not
receive the GitHub App PEM key.
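
As a rough illustration of the rules above, a provider interface might look like the sketch below. Only `SourceProvider` and `get_clone_credentials(repo)` are named in this document; `CloneCredentials`, `verify_webhook`, and `StaticTokenProvider` are hypothetical names chosen for the example, and the real interface under `providers/source/` may differ.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Mapping


@dataclass(frozen=True)
class CloneCredentials:
    """Hypothetical return type for get_clone_credentials(repo)."""

    username: str
    token: str = field(repr=False)  # excluded from repr() so it cannot leak via logs
    expires_in_seconds: int


class SourceProvider(ABC):
    """Assumed shape of the interface; method names other than
    get_clone_credentials are illustrative."""

    @abstractmethod
    def verify_webhook(self, headers: Mapping[str, str], body: bytes) -> bool:
        """Check authenticity before the payload is parsed or acted on."""

    @abstractmethod
    def get_clone_credentials(self, repo: str) -> CloneCredentials:
        """Return the shortest-lived, narrowest-scoped credentials available."""


class StaticTokenProvider(SourceProvider):
    """In-memory stand-in used only to exercise the interface."""

    def __init__(self, token: str, ttl_seconds: int) -> None:
        self._token = token
        self._ttl_seconds = ttl_seconds

    def verify_webhook(self, headers: Mapping[str, str], body: bytes) -> bool:
        return False  # this stand-in accepts no webhooks at all

    def get_clone_credentials(self, repo: str) -> CloneCredentials:
        # "x-access-token" is the username GitHub expects with installation tokens.
        return CloneCredentials("x-access-token", self._token, self._ttl_seconds)
```

Marking the token field with `repr=False` is one way to honor the logging rule by default: printing or logging the credentials object shows the username and lifetime but not the secret.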

## Hardening Expectations

- Restrict public exposure to `/webhook/github` only.
- Keep admin endpoints (`/secrets/github-app`, `/jobs/run`) private.
- Use strong random `ADMIN_TOKEN` values and rotate credentials regularly.
- Run workers only in isolated, ephemeral environments.
- Keep worker ServiceAccount permissions minimal. Workers should not need access to Kubernetes Secrets.
- Prefer a secrets operator such as Sealed Secrets or External Secrets for production deployments.
- Rotate webhook secrets and source-provider credentials after suspected exposure.
- Review new source-provider implementations for webhook verification, credential lifetime, token scope, and logging behavior before enabling them.
1 change: 1 addition & 0 deletions app/__init__.py
@@ -0,0 +1 @@
