1 change: 1 addition & 0 deletions .claude/worktrees/agent-a24d2354
Submodule agent-a24d2354 added at 4c591c
1 change: 1 addition & 0 deletions .claude/worktrees/agent-ae80961c
Submodule agent-ae80961c added at 4c591c
23 changes: 23 additions & 0 deletions .github/dependabot.yml
@@ -0,0 +1,23 @@
version: 2
updates:
  - package-ecosystem: "pip"
    directory: "/"
    schedule:
      interval: "weekly"
      day: "monday"
      time: "04:00"
    open-pull-requests-limit: 10
    labels: ["dependencies", "security"]
    commit-message:
      prefix: "deps"

  - package-ecosystem: "github-actions"
    directory: "/"
    schedule:
      interval: "weekly"
      day: "monday"
      time: "04:00"
    open-pull-requests-limit: 5
    labels: ["dependencies", "ci"]
    commit-message:
      prefix: "ci"
2 changes: 1 addition & 1 deletion .github/workflows/benchmark.yml
@@ -63,7 +63,7 @@ jobs:

       - name: Open PR with refreshed RESULTS.md
         if: github.event_name != 'release'
-        uses: peter-evans/create-pull-request@v7
+        uses: peter-evans/create-pull-request@v8
         with:
           add-paths: benchmarks/competitive/RESULTS.md
           branch: chore/refresh-benchmarks
69 changes: 69 additions & 0 deletions .github/workflows/security.yml
@@ -0,0 +1,69 @@
name: Security

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
  schedule:
    - cron: "0 4 * * 1"

permissions:
  contents: read
  security-events: write

jobs:
  bandit:
    name: Bandit (SAST)
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.13"
      - run: pip install bandit
      - run: bandit -r src/ -f screen -ll

  semgrep:
    name: Semgrep (OWASP rulesets)
    runs-on: ubuntu-latest
    container:
      image: semgrep/semgrep
    steps:
      - uses: actions/checkout@v4
      - run: semgrep --config=p/python --config=p/security-audit --config=p/owasp-top-ten --error --severity ERROR --severity WARNING src/

  pip-audit:
    name: pip-audit (dependency CVEs)
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.13"
      - run: pip install pip-audit
      - run: pip-audit --strict

  gitleaks:
    name: Gitleaks (secrets)
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: gitleaks/gitleaks-action@v2

  codeql:
    name: CodeQL (semantic SAST)
    runs-on: ubuntu-latest
    permissions:
      security-events: write
      actions: read
      contents: read
    steps:
      - uses: actions/checkout@v4
      - uses: github/codeql-action/init@v3
        with:
          languages: python
          queries: security-extended,security-and-quality
      - uses: github/codeql-action/analyze@v3
23 changes: 23 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,29 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

## [0.1.6] - 2026-05-16

### Security

- **[HIGH · CWE-290] `hawkapi.flags.get_flags` no longer trusts client-supplied identity headers.** The DI helper previously built `EvalContext(user_id=request.headers.get("x-user-id"), tenant_id=request.headers.get("x-tenant-id"))`, letting any attacker claim any user/tenant by setting the header — bypassing flag targeting for admin previews and sensitive feature toggles. `user_id` and `tenant_id` are now always `None` on the default context; operators MUST derive identity from an authenticated dependency and build their own `EvalContext`. The headers are still exposed on `ctx.headers` for non-identity targeting (region, A/B variant).
- **[HIGH · CWE-352] GraphQL `GET` requests can no longer execute mutations or subscriptions via multi-operation documents.** The previous `_is_mutation` check inspected only the first non-comment token, so a document `query A {…} mutation B {…}` with `?operationName=B` slipped a mutation past the GET guard. This was a CSRF vector reachable through image tags, link prefetch, and cache poisoning. The handler now parses every top-level operation and rejects GET whenever the selected operation (or any of them, if `operationName` is omitted) is not a `query`.
- **[HIGH · CWE-770] GraphQL endpoint gained depth and timeout limits.** `make_graphql_handler` now accepts `max_depth: int | None = 15` (selection-set nesting cap, evaluated before executor dispatch) and `timeout_s: float | None = 30.0` (wraps the executor in `asyncio.wait_for`). A single deeply-nested or alias-explosion query can no longer pin a worker indefinitely.
- **[MEDIUM · CWE-200] GraphiQL UI is now opt-in.** `app.mount_graphql(...)` ships with `graphiql=False` by default; the in-browser explorer (and the schema introspection it implies) must be explicitly enabled for dev environments. Deployments that relied on the previous default of `graphiql=True` no longer expose the explorer after upgrading and must pass `graphiql=True` to restore it.
- **[MEDIUM] gRPC server now has a default concurrent-RPC cap.** `app.mount_grpc(...)` accepts `maximum_concurrent_rpcs: int | None = 1000` and passes it to `grpc.aio.server(...)`. Pass `None` to restore the previous unbounded behaviour.

### Added

- `docs/security/threat-model.md` — STRIDE per subsystem for the five 0.1.3–0.1.5 additions (doctor / gRPC / GraphQL / flags / bulkhead).
- `docs/security/code-review-2026-05-16.md` — focused security code review covering `security/**`, security-relevant middleware, and request/response boundaries.
- `docs/security/owasp-api-top10-2023.md` — OWASP API Security Top 10 (2023) compliance map.
- `SECURITY.md` — responsible-disclosure policy + supported-versions table.
- `.github/workflows/security.yml` — Bandit + Semgrep (p/python + p/security-audit + p/owasp-top-ten) + pip-audit + Gitleaks + CodeQL on every push, PR, and weekly cron.
- `.github/dependabot.yml` — weekly pip and github-actions update PRs.

### Changed

- `hawkapi.doctor.rules.deps.DOC050` PyPI fetch now explicit-scheme-checks the hard-coded URL and ships with `# nosemgrep` / `# nosec` markers — satisfies Bandit B310 and Semgrep `dynamic-urllib-use-detected` cleanly while preserving the `--offline` opt-out.

## [0.1.5] - 2026-04-19

### Fixed
81 changes: 81 additions & 0 deletions SECURITY.md
@@ -0,0 +1,81 @@
# Security Policy

## Supported Versions

We patch security issues in the latest minor release. Earlier 0.1.x patch releases receive critical fixes only.

| Version | Supported |
|---------|--------------------|
| 0.1.5+ | ✅ active |
| < 0.1.5 | ⚠️ critical only |

## Reporting a Vulnerability

**Do not open a public issue for security problems.**

Email `hawkapi@users.noreply.github.com` with:

1. A clear description of the issue
2. Steps to reproduce (minimal proof-of-concept)
3. The framework version (`hawkapi --version`) and Python version
4. Your name / handle for credit (optional)

You will receive an acknowledgement within **72 hours**.

### Disclosure timeline

| Phase               | Duration                  |
|---------------------|---------------------------|
| Acknowledgement     | 72 hours                  |
| Triage + fix        | 14 days                   |
| Coordinated release | 7 days after fix is ready |
| Public CVE          | within 30 days of patch   |

If a fix takes longer than 30 days we will keep you updated and credit you in the eventual advisory.

## Scope

In-scope:

- The `hawkapi` package on PyPI and the `ashimov/HawkAPI` repository
- The official plugins `hawkapi-sentry`, `hawkapi-otel`
- All CI workflows in this repository

Out of scope:

- Vulnerabilities in optional dependencies that have not been triggered through HawkAPI APIs (report those upstream)
- Issues that require root / local-machine compromise of the developer's machine
- Best-practice / hardening suggestions without a concrete exploit path — open a regular issue instead

## Security tooling

The repository runs five automated security scans on every push to `main`, on every pull request targeting `main`, and weekly (Mondays at 04:00 UTC):

- **Bandit** — Python AST-level SAST
- **Semgrep** — OWASP Top 10 + python + security-audit rulesets
- **pip-audit** — known CVEs in installed dependencies
- **Gitleaks** — secrets in git history
- **CodeQL** — semantic SAST with security-extended + security-and-quality queries

Run the first four locally (CodeQL runs as a CI job in `.github/workflows/security.yml`):

```bash
bandit -r src/ -ll
semgrep --config=p/python --config=p/security-audit --config=p/owasp-top-ten src/
pip-audit --strict
gitleaks detect --source .
```

## Hardening defaults

HawkAPI ships with secure defaults — `hawkapi doctor app:app` lints for 18 common misconfigurations across security, observability, performance, correctness and dependency hygiene. CI integration:

```bash
hawkapi doctor app:app --severity=warn
```

The command exits non-zero on any warning, so it can gate deploys.

## Acknowledgements

Researchers who responsibly disclose security issues are credited in the [`CHANGELOG.md`](CHANGELOG.md) under the published fix.
77 changes: 77 additions & 0 deletions docs/security/code-review-2026-05-16.md
@@ -0,0 +1,77 @@
# HawkAPI 0.1.5 Focused Security Code Review

Date: 2026-05-16
Scope: `security/**`, selected middleware, request/response boundaries, `staticfiles.py`.
Confidence threshold: HIGH only. Already-fixed items from 0.1.5 not repeated.

## HIGH

### H-1. GraphQL GET can execute mutations via multi-operation document (CWE-352)

- File: `src/hawkapi/graphql/_handler.py:43-50, 76-91`
- Issue: `_is_mutation` inspects only the first non-comment token. A request `GET /graphql?query=query+A+{…}+mutation+B+{…}&operationName=B` passes the GET-mutation guard and runs `mutation B`.
- Impact: CSRF on mutations — image tags, prefetch, cache poisoning can all trigger writes.
- Fix: parse the document with the executor's parser before dispatch; reject GET whenever any `OperationDefinition` whose `name` matches `operationName` (or the only operation, if `operationName` is omitted) has `operation != "query"`.
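
The selection rule in the fix can be sketched in a few lines. This is a minimal illustration, not the handler's code: `Operation` and `get_request_allowed` are hypothetical names, and the document is assumed to have already been parsed into its top-level operations by the executor's parser. When `operationName` is omitted, the sketch takes the stricter reading and inspects every operation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Operation:
    """Hypothetical stand-in for a parsed top-level OperationDefinition."""

    name: Optional[str]
    operation: str  # "query", "mutation" or "subscription"


def get_request_allowed(operations: list, operation_name: Optional[str]) -> bool:
    """Return True only if a GET request may execute this document."""
    if operation_name is not None:
        # Only the operation the client selected will run.
        selected = [op for op in operations if op.name == operation_name]
    else:
        # No operationName: be conservative and inspect every operation.
        selected = list(operations)
    # An empty selection (unknown name, empty document) is rejected as well.
    return bool(selected) and all(op.operation == "query" for op in selected)
```

With the document from the issue (`query A` plus `mutation B`), the sketch allows `operationName=A` and rejects both `operationName=B` and a request that omits `operationName`.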

### H-2. Unauthenticated identity headers feed flag targeting (CWE-290)

- File: `src/hawkapi/flags/_di.py:21-26`
- Issue: `EvalContext(user_id=request.headers.get("x-user-id"), tenant_id=request.headers.get("x-tenant-id"))` trusts client-supplied headers as identity for flag evaluation.
- Impact: any flag gated on `user_id == "alice"` (admin previews, sensitive feature toggles) can be reached by anyone with `X-User-Id: alice`.
- Fix: remove implicit header read; require operator to pass explicit `context_factory`. Default `EvalContext()` must be empty.
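
A rough sketch of the intended split between identity and non-identity targeting follows. The `EvalContext` and `Principal` classes here are hypothetical stand-ins written for illustration (the real `hawkapi.flags.EvalContext` may have a different shape), and `default_context` / `authenticated_context` are assumed names for what a `context_factory` could look like.

```python
from dataclasses import dataclass, field
from typing import Mapping, Optional


@dataclass(frozen=True)
class EvalContext:
    """Hypothetical stand-in; the real EvalContext may differ."""

    user_id: Optional[str] = None
    tenant_id: Optional[str] = None
    headers: Mapping[str, str] = field(default_factory=dict)


@dataclass(frozen=True)
class Principal:
    """Hypothetical output of an authentication dependency (verified token, session)."""

    user_id: str
    tenant_id: str


def default_context(headers: Mapping[str, str]) -> EvalContext:
    # Identity stays empty: X-User-Id / X-Tenant-Id are never read as identity.
    # Headers remain available for non-identity targeting (region, A/B variant).
    return EvalContext(headers=dict(headers))


def authenticated_context(principal: Principal, headers: Mapping[str, str]) -> EvalContext:
    # Identity comes only from the authenticated principal, never from headers.
    return EvalContext(
        user_id=principal.user_id,
        tenant_id=principal.tenant_id,
        headers=dict(headers),
    )
```

A request carrying `X-User-Id: alice` then evaluates flags as an anonymous caller unless an authentication dependency has actually produced a principal for `alice`.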

### H-3. GraphQL endpoint has no depth, complexity or timeout limit (CWE-770)

- File: `src/hawkapi/graphql/_handler.py:119-128`
- Issue: `await executor(...)` runs to completion with no in-band budget; nested-selection / alias-explosion queries pin a worker indefinitely.
- Impact: single unauthenticated request → worker DoS.
- Fix: wrap executor call in `asyncio.wait_for(...)`; add `max_depth` that pre-walks the parsed document and short-circuits with 400.
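
The two budgets can be sketched together as below. This is an illustration under stated assumptions: the selection set is modelled as nested dicts (field name to sub-selection) rather than a real GraphQL AST, `execute_with_budget` is a hypothetical name, and the 504 status on timeout is an assumption (the review only specifies 400 for the depth check).

```python
import asyncio


def selection_depth(selection: dict) -> int:
    """Nesting depth of a selection set modelled as nested dicts.

    Walks level by level instead of recursing, so a hostile, very deep
    document cannot exhaust the Python call stack during the pre-walk.
    """
    depth = 0
    level = [selection]
    while True:
        children = [child for node in level for child in node.values()]
        if not children:
            return depth
        depth += 1
        level = children


async def execute_with_budget(executor, selection, *, max_depth=15, timeout_s=30.0):
    """Reject over-deep documents up front, then bound execution time."""
    if max_depth is not None and selection_depth(selection) > max_depth:
        return 400, "query exceeds max_depth"
    try:
        # timeout=None disables the time budget, mirroring timeout_s=None.
        result = await asyncio.wait_for(executor(selection), timeout=timeout_s)
    except asyncio.TimeoutError:
        return 504, "execution timed out"  # status code is an assumption
    return 200, result
```

The depth check runs before the executor is called, so an over-deep query costs one walk of the parsed document and no resolver work.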

## MEDIUM

### M-1. gRPC server runs with unbounded concurrent RPCs

- File: `src/hawkapi/grpc/_mount.py:70-74`
- Issue: `grpc.aio.server(...)` started without `maximum_concurrent_rpcs`; HTTP rate-limit / bulkhead middleware does not cover the gRPC port.
- Fix: expose `maximum_concurrent_rpcs: int | None = 1000` on `mount_grpc`.

### M-2. GraphiQL UI enabled by default

- File: `src/hawkapi/graphql/_handler.py:57-74`
- Issue: `graphiql: bool = True` default. UI ships in every deployment; combined with no introspection control the schema is publicly browsable in prod.
- Fix: change default to `False`.

### M-3. `RedisBulkheadBackend._try_acquire_once` is racy (CWE-662)

- File: `src/hawkapi/middleware/bulkhead_redis.py:50-67`
- Issue: `HSET` → `HLEN` → conditional `HDEL` pipelined but not transactional. Multiple acquirers may each `HSET` first then read `occupancy ≤ limit` and all stay registered.
- Fix: replace pipeline with Lua script doing `HLEN` first, returning 0 when full, only then `HSET`.
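
One possible shape for that script is sketched below, held as a Python constant. The key layout is an assumption (one hash per bulkhead, one field per holder), as are the names `ACQUIRE_LUA` and `acquire_reference`. Redis runs a script atomically, so the count check and the write cannot interleave with another acquirer. The Python function mirrors the same order of operations for reasoning and tests only; it does not talk to Redis.

```python
# Hypothetical Lua body for Redis EVAL. Assumed arguments:
#   KEYS[1] = bulkhead hash, ARGV[1] = limit,
#   ARGV[2] = holder id,     ARGV[3] = value to store (e.g. a timestamp).
ACQUIRE_LUA = """
if redis.call('HLEN', KEYS[1]) >= tonumber(ARGV[1]) then
  return 0
end
redis.call('HSET', KEYS[1], ARGV[2], ARGV[3])
return 1
"""


def acquire_reference(slots: dict, limit: int, holder: str, value: str) -> int:
    """Pure-Python mirror of ACQUIRE_LUA: check occupancy first, write second."""
    if len(slots) >= limit:  # HLEN first: refuse when the bulkhead is full
        return 0
    slots[holder] = value  # only then HSET
    return 1
```

Because a refused caller never writes, there is no window in which several over-limit acquirers are all registered, which is the race described in the issue.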

### M-4. CSRF middleware never validates the HMAC it generates

- File: `src/hawkapi/middleware/csrf.py:67-78, 197-224`
- Issue: `_generate_token` produces `{raw}.{hmac(raw)}`, but `_verify_token` is dead code — the unsafe-method path only does `hmac.compare_digest(submitted_token, cookie_token)`. `secret=` param is functionally unused.
- Impact: not exploitable today (double-submit equality is enough), but API misleads operators and future changes can regress silently.
- Fix: call `_verify_token` on both before equality check, or drop the dead `_verify_token`/`_secret` API.
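
The first option (verify both tokens, then compare) could look roughly like this. It is a standalone sketch using only the standard library; the function names and the hex encoding of the signature are assumptions, not the middleware's actual `_generate_token` / `_verify_token`.

```python
import hashlib
import hmac
import secrets


def generate_token(secret: bytes) -> str:
    """Produce "{raw}.{hmac(raw)}" with an HMAC-SHA256 signature."""
    raw = secrets.token_urlsafe(32)
    sig = hmac.new(secret, raw.encode(), hashlib.sha256).hexdigest()
    return f"{raw}.{sig}"


def verify_token(secret: bytes, token: str) -> bool:
    """Check that the token's signature matches its raw part."""
    raw, sep, sig = token.rpartition(".")
    if not sep or not raw:
        return False
    expected = hmac.new(secret, raw.encode(), hashlib.sha256).hexdigest()
    # Compare as bytes: compare_digest rejects non-ASCII str arguments.
    return hmac.compare_digest(sig.encode(), expected.encode())


def check_double_submit(secret: bytes, submitted: str, cookie: str) -> bool:
    """Both tokens must carry a valid signature and must be equal."""
    return (
        verify_token(secret, submitted)
        and verify_token(secret, cookie)
        and hmac.compare_digest(submitted.encode(), cookie.encode())
    )
```

With this check, a matching pair of tokens that was not signed with the server's secret is rejected, so the `secret=` parameter does real work.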

## LOW

- **L-1.** Session middleware claims "optionally encrypted" but is sign-only (`middleware/session.py:25-27`). Docstring fix or add AES-GCM.
- **L-2.** CSRF cookie `HttpOnly=False` by design (`middleware/csrf.py:38`); document the trade-off.
- **L-3.** Multipart parser has no `max_parts` cap (`requests/form_data.py:96-146`). Default 1000 recommended.
- **L-4.** Multipart `Content-Type` boundary split breaks on quoted `;` (`requests/request.py:226-234`).
- **L-5.** Response header **names** not CRLF-scrubbed (`responses/response.py:62-69`); raise on `\r`/`\n`/`:` in key.
- **L-6.** `FileResponse` does not constrain `path` to a base dir (`responses/file_response.py:33-37`); add optional `root=`.
- **L-7.** CORS `expose_headers` / `allow_methods` not CRLF-scrubbed before joining (`middleware/cors.py:67-90`).
- **L-8.** `RateLimitMiddleware._default_key_func` uses raw `scope["client"]`; docstring should advise placing `TrustedProxyMiddleware` first.
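
As an illustration of L-6, a base-directory constraint can be written with `pathlib` alone. The helper name `resolve_under_root` is hypothetical and the sketch is independent of how `FileResponse` is implemented; it only shows the containment check an optional `root=` parameter might perform.

```python
from pathlib import Path


def resolve_under_root(root, requested) -> Path:
    """Resolve `requested` relative to `root`, refusing anything that escapes it."""
    base = Path(root).resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    candidate = (base / requested).resolve()
    # Path.is_relative_to requires Python 3.9+.
    if not candidate.is_relative_to(base):
        raise ValueError(f"{requested!r} escapes the configured root")
    return candidate
```

Absolute inputs such as `/etc/passwd` are rejected too, because joining an absolute path onto `base` discards `base` and the result then fails the containment check.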

## Executive Summary

| Severity | Count |
|---|---|
| CRITICAL | 0 |
| HIGH | 3 |
| MEDIUM | 4 |
| LOW | 8 |

Already-fixed items NOT re-reported (per 0.1.5 CHANGELOG): StreamingResponse double-execution, path-param coercion, GraphiQL SRI, FileFlagProvider mtime ordering, `_execute_trivial_route` lazy-import hoisting.
57 changes: 57 additions & 0 deletions docs/security/owasp-api-top10-2023.md
@@ -0,0 +1,57 @@
# OWASP API Security Top 10 — 2023 Compliance Map

Date: 2026-05-16 · HawkAPI version: 0.1.6
Mapping each OWASP API Top 10 (2023) category to the framework's posture and the operator's responsibility.

| API# | Category | HawkAPI provides | Operator must |
|---|---|---|---|
| **API1** | Broken Object-Level Authorization (BOLA) | `@app.get(..., permissions=["..."])` declarative permission scope + `PermissionPolicy` resolver | Implement a resolver that maps the authenticated principal to per-object roles and call it from `permissions=`. The framework does not infer object ownership |
| **API2** | Broken Authentication | `HTTPBasic`, `HTTPBearer`, `APIKeyHeader/Query/Cookie`, `OAuth2PasswordBearer` extractors; `Security(dep, scopes=[…])`; `SecurityScopes` injection; `secrets.compare_digest` documented as the only safe way to compare extracted credentials | Choose a hashing scheme (`argon2id` recommended), implement rate-limit on `/login`, rotate signing keys |
| **API3** | Broken Object-Property-Level Authorization | `response_model=` filters at the framework level + `response_model_exclude_{none,unset,defaults}` for finer redaction; msgspec Struct field omission on serialisation | Define explicit response Structs for every route — never return raw ORM objects |
| **API4** | Unrestricted Resource Consumption | `RequestLimitsMiddleware` (query / header size), body-size cap via `HawkAPI(max_body_size=…)`, `RateLimitMiddleware` (token bucket), `RedisRateLimitMiddleware` (distributed), `Bulkhead` primitive, `request_timeout`, GraphQL `max_depth` + `timeout_s` (since 0.1.6), gRPC `maximum_concurrent_rpcs=1000` default (since 0.1.6) | Tune limits to traffic profile; place `TrustedProxyMiddleware` **before** rate-limit so per-IP buckets see the real client; use bulkheads around external dependencies |
| **API5** | Broken Function-Level Authorization | Same primitives as API1 — `permissions=` per route, `Security(dep, scopes=[…])`, OpenAPI `operation.security` reflection so reviewers can audit the matrix | Document the role / function matrix; add CI test that every route either declares `permissions=` or is explicitly marked public |
| **API6** | Unrestricted Access to Sensitive Business Flows | `Bulkhead` for per-flow concurrency caps, `RateLimitMiddleware` with custom `key_func` for per-tenant / per-flow budgets, `CSRFMiddleware` (double-submit) | Identify high-value flows (signup, refund, withdrawal); apply per-flow rate limits and human-verification (CAPTCHA / webauthn) outside the framework |
| **API7** | Server-Side Request Forgery (SSRF) | Framework itself makes no outbound HTTP calls except `doctor` DOC050 (hard-coded `https://pypi.org` + scheme validation, `--offline` opt-out) | When your handler fetches a URL, validate scheme + resolved IP against allow-list; never pass user input directly to `httpx`/`requests` |
| **API8** | Security Misconfiguration | `hawkapi doctor` ships 18 rules across 5 categories (security, observability, performance, correctness, deps). CSRF/Session use HMAC by default, GraphiQL ships disabled (since 0.1.6), gRPC reflection is opt-in, headers sanitised for CRLF, debug mode flagged by doctor | Run `hawkapi doctor app:app --severity=warn` as a deploy gate; pin actions to SHAs; enable Dependabot |
| **API9** | Improper Inventory Management | OpenAPI 3.1 auto-gen, `/docs` `/redoc` `/scalar` opt-in (set `docs_url=None` for prod), version routing (`@app.get("/users", version="v1")` + `VersionRouter`), deprecation headers (`Deprecation` per RFC 9745, `Sunset` per RFC 8594, plus `Link`), `detect_breaking_changes` for API governance, contract smoke tests | Track every released version in changelog; mark deprecated routes; remove docs URLs in prod or gate behind auth |
| **API10** | Unsafe Consumption of APIs | `hawkapi gen-client` produces typed TS/Python clients with response-shape validation via msgspec; OpenAPI linter enforces `operation-id-required` / response descriptions | Validate downstream API responses; rate-limit + circuit-break upstream calls (use `CircuitBreakerMiddleware` / `RedisCircuitBreakerMiddleware` on the client side) |
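
For the API7 operator responsibility, a handler-side check might be sketched as follows. The function names, the `https`-only scheme set, and the injectable `resolver` parameter are assumptions made for illustration; the allow-list is whatever set of networks the operator chooses.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"https"}  # assumption: only HTTPS fetches are permitted


def resolve_host(host):
    """Resolve a hostname to the IP address strings it currently maps to."""
    return [info[4][0] for info in socket.getaddrinfo(host, None)]


def is_allowed_url(url, allowed_networks, resolver=resolve_host) -> bool:
    """True only if the scheme is allowed and every resolved IP is allow-listed."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False
    if parts.scheme not in ALLOWED_SCHEMES or not parts.hostname:
        return False
    try:
        addresses = [ipaddress.ip_address(a) for a in resolver(parts.hostname)]
    except (OSError, ValueError):
        return False  # resolution failed or returned something unparsable
    return bool(addresses) and all(
        any(addr in net for net in allowed_networks) for addr in addresses
    )
```

A check like this is only part of the defence: if the HTTP client resolves the hostname again when it connects, the answer can differ from the one that was validated, so connecting to the validated address is the safer design.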

## Framework-level controls summary

* **Input validation**: type-driven via msgspec / Pydantic; query / path / header / body / cookie all validated.
* **Output filtering**: `response_model`, `response_model_exclude_*`, explicit Struct return types.
* **Auth primitives**: HTTPBasic / HTTPBearer / APIKey* / OAuth2 + `Security(dep, scopes=[…])`.
* **Headers**: response header values are CRLF-stripped; `SecurityHeadersMiddleware` available; trusted-proxy + IP-allowlist.
* **DoS posture**: body-size, query/header limits, rate-limit (local + Redis), bulkhead, adaptive concurrency, GraphQL depth+timeout, gRPC max concurrent RPCs.
* **Secrets**: CSRF / Session use HMAC-SHA256, `secrets.compare_digest` documented for handler-side comparison.
* **CSRF**: double-submit cookie token, signed with HMAC-SHA256.
* **Logging**: `StructuredLoggingMiddleware`, W3C Trace Context, `request.id` middleware, gRPC observability interceptor.
* **Supply chain**: weekly Bandit + Semgrep (OWASP + python + security-audit rulesets) + pip-audit + Gitleaks + CodeQL via `.github/workflows/security.yml`; Dependabot weekly.
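
The handler-side comparison mentioned under **Secrets** amounts to a single standard-library call. A minimal sketch, with `api_key_matches` as a hypothetical helper name:

```python
import secrets


def api_key_matches(presented: str, expected: str) -> bool:
    """Compare credentials without leaking the position of the first mismatch."""
    # Encode to bytes first: compare_digest raises TypeError on non-ASCII str.
    return secrets.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))
```

A plain `==` comparison can return as soon as it finds a differing character, which is the timing signal that `compare_digest` is designed to avoid.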

## CI gates

Required jobs that fail the build on findings:

| Job | Tool | Severity threshold |
|---|---|---|
| Bandit | bandit | Medium and above |
| Semgrep | semgrep (p/python + p/security-audit + p/owasp-top-ten) | ERROR + WARNING |
| pip-audit | pip-audit | Any CVE (`--strict`) |
| Gitleaks | gitleaks-action | Any leak |
| CodeQL | github/codeql-action | security-extended + security-and-quality queries |

Run the first four locally (CodeQL runs as a CI job only):

```bash
bandit -r src/ -ll
semgrep --config=p/python --config=p/security-audit --config=p/owasp-top-ten src/
pip-audit --strict
gitleaks detect --source .
```

## What HawkAPI deliberately does NOT do

* Authentication / authorization business logic — operator provides via DI.
* Identity from `X-User-Id` / `X-Tenant-Id` headers — never trusted by framework.
* Auto-redaction of secrets in logs — operator opts in via `StructuredLoggingMiddleware` filter.
* WAF / rule-based payload scanning — out of scope (use Cloudflare / ModSecurity in front).