
Security hardening: SSRF, headers, Redis TLS, container lockdown#208

Merged
RafaelPo merged 17 commits into main from feat/security-hardening on Feb 24, 2026

@RafaelPo

Summary

  • Add transport-aware server instructions and SSRF protection (redirect validation, hostname re-check transport)
  • Security headers middleware, rate limiter fail-open with in-memory fallback, body size limits
  • Shell injection prevention via shlex.quote, Redis TLS support, container lockdown (non-root, read-only FS)
  • User isolation: per-user Redis key namespacing, token encryption at rest
  • Fix Docker build (--frozen removal), Redis healthcheck, and .dockerignore

Depends on PR #207 (unified input API).

Test plan

  • All 283 unit tests pass (26 new tests for security features)
  • SSRF protection blocks private IPs, metadata endpoints, unresolvable hosts
  • Rate limiter falls back to in-memory when Redis unavailable
  • Body size middleware rejects oversized uploads
  • Upload handler returns generic error messages (no info leaks)
  • Manual: Docker Compose build and healthcheck verification

🤖 Generated with Claude Code

@RafaelPo RafaelPo force-pushed the feat/security-hardening branch from d20b24f to dc1789b Compare February 24, 2026 19:57
@RafaelPo RafaelPo force-pushed the feat/security-hardening branch from dc1789b to bcd6dab Compare February 24, 2026 20:02
Base automatically changed from feat/unified-input-api-v2 to main February 24, 2026 20:05
RafaelPo and others added 4 commits February 24, 2026 20:05
Instructions:
- Add _INSTRUCTIONS_STDIO and _INSTRUCTIONS_HTTP to app.py
- HTTP instructions guide agent to use request_upload_url for local files
- server.py sets instructions based on transport mode

Security & correctness (from parallel review):
- SSRF protection: block internal IPs in URL fetching
- __Host- cookie prefix for auth state cookie
- Rate limiter: in-memory fallback when Redis unavailable
- Upload endpoint: use caller's API token, limit CSV rows
- Poll token via Authorization header (not just query param)
- Progress URL no longer leaks poll token in URL

Also:
- Update upload_data docstring and error message for HTTP mode
- Sync manifest.json description

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ckdown

- Fix shell injection in upload curl command via shlex.quote()
- Add SecurityHeadersMiddleware (HSTS, X-Content-Type-Options, X-Frame-Options,
  Cache-Control, Referrer-Policy) on all HTTP responses
- Add Redis TLS support (REDIS_SSL setting)
- Stream URL fetch with size limit (max_fetch_size_bytes) to prevent OOM
- Validate UPLOAD_SECRET at startup instead of first request
- Warn on missing REDIS_PASSWORD in HTTP mode at startup
- Enable rate limiting in --no-auth mode; cap in-memory fallback at 50K entries
- Container hardening: cap_drop ALL, no-new-privileges, read-only rootfs,
  CPU limits, REDISCLI_AUTH for healthcheck, pinned Redis image, network isolation
- Add --frozen to Dockerfile uv sync to prevent lockfile drift
- Sanitize SSRF error (no longer leaks resolved IPs)
- Add repr=False on sensitive config fields to prevent accidental logging
- Add Vary: Origin, Access-Control-Max-Age on CORS preflight responses
- Default to 127.0.0.1 in --no-auth mode

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
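To make the header list in this commit concrete, here is a minimal pure-ASGI sketch of a middleware that appends those five headers to every HTTP response. It is not the PR's SecurityHeadersMiddleware: the class name `SecurityHeadersSketch`, the helper functions, and every header value (max-age, DENY, no-store, no-referrer) are assumptions chosen for illustration.

```python
import asyncio

# Header names come from the commit message; the values are assumed defaults.
SECURITY_HEADERS = [
    (b"strict-transport-security", b"max-age=31536000; includeSubDomains"),
    (b"x-content-type-options", b"nosniff"),
    (b"x-frame-options", b"DENY"),
    (b"cache-control", b"no-store"),
    (b"referrer-policy", b"no-referrer"),
]


class SecurityHeadersSketch:
    """Hypothetical ASGI middleware that adds security headers to responses."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            # Pass lifespan/websocket traffic through untouched.
            await self.app(scope, receive, send)
            return

        async def send_with_headers(message):
            if message["type"] == "http.response.start":
                headers = list(message.get("headers", []))
                present = {name.lower() for name, _ in headers}
                # Do not override a header the application already set.
                headers += [(n, v) for n, v in SECURITY_HEADERS if n not in present]
                message = {**message, "headers": headers}
            await send(message)

        await self.app(scope, receive, send_with_headers)


async def demo_app(scope, receive, send):
    """Tiny stand-in application used only to exercise the middleware."""
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": b"ok"})


def collect_response_headers():
    """Run one fake request through the middleware and return its headers."""
    sent = []

    async def send(message):
        sent.append(message)

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    asyncio.run(SecurityHeadersSketch(demo_app)({"type": "http"}, receive, send))
    return dict(sent[0]["headers"])
```

Wrapping the `send` callable rather than building a response object keeps the middleware independent of any web framework, which is the usual reason for writing it at the ASGI level.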
…ockerignore

- Fix SSRF DNS-rebinding TOCTOU: add _SSRFSafeTransport that re-validates
  hostnames at request time; block GKE metadata hostname; cap max_redirects=5
- Add user-scoped data isolation: record task owner (JWT sub) on submission,
  check ownership in everyrow_results_http to prevent cross-user access
- Encrypt tokens at rest in Redis using Fernet (derived from UPLOAD_SECRET):
  task tokens, poll tokens, auth codes, refresh tokens, upload metadata
- Add root .dockerignore with deny-all allowlist to prevent secrets leaking
  into Docker build context

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Remove --frozen from Dockerfile (incompatible with --no-sources)
- Fix Redis healthcheck: pass REDIS_PASSWORD as env var, use -a flag
- Remove cap_drop ALL, read_only, tmpfs from redis service (prevents
  Redis user switching)
- Fix test_auth cookie prefix assertion

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@RafaelPo RafaelPo force-pushed the feat/security-hardening branch from bcd6dab to 5cad02e Compare February 24, 2026 20:06
@RafaelPo

@claude code review

@github-actions

This comment was marked as outdated.

If the api_token field in upload metadata is missing or corrupted,
decrypt_value raises InvalidToken (from Fernet) which would cause an
unhandled 500. Catch the exception and fall through to the existing
empty-token → 403 response.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
RafaelPo and others added 8 commits February 24, 2026 20:12
…n, healthcheck

- Fix __Host-mcp_auth_state cookie: handle_start now sets the correct
  prefixed name, matching handle_callback and delete_cookie
- Block IPv4-mapped IPv6 SSRF bypass (::ffff:127.0.0.1): unwrap mapped
  addresses before checking against blocked networks
- Use HKDF instead of raw SHA-256 for Fernet key derivation
- Use REDISCLI_AUTH env var instead of -a flag in Redis healthcheck to
  avoid exposing password in docker inspect output

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
pd.read_csv(nrows=N) silently drops rows beyond N, returning a 201
success that misleads users into thinking the full file was ingested.

Now: parse the full CSV, then return 413 if it exceeds max_upload_rows.
Also lower the default from 100k to 50k rows.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
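The behaviour change described in this commit can be sketched as follows. This is an illustration only: it swaps pandas for the standard-library `csv` module so it runs without dependencies, and the function name `ingest_csv` and the `(status, rows)` return shape are hypothetical.

```python
import csv
import io

MAX_UPLOAD_ROWS = 50_000  # new default named in the commit message


def ingest_csv(text, max_rows=MAX_UPLOAD_ROWS):
    """Parse the whole CSV, then reject it if it is too long.

    Returns (201, rows) on success and (413, []) when the row count exceeds
    max_rows, so an oversized file is never silently truncated.
    """
    rows = list(csv.DictReader(io.StringIO(text)))
    if len(rows) > max_rows:
        return 413, []
    return 201, rows
```

For example, `ingest_csv("a,b\n1,2\n3,4\n", max_rows=1)` yields a 413 status instead of a one-row success.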
The cross-user ownership guard on everyrow_results_http silently
continued when verification failed (fail-open). Additionally,
everyrow_progress and everyrow_cancel had no ownership checks at all,
allowing any authenticated user to poll or cancel another user's tasks.

- Extract shared _check_task_ownership helper (fail-closed: denies
  access when owner is missing or cannot be verified)
- Add ownership guard to everyrow_progress and everyrow_cancel
- Make store_task_owner in tool_helpers raise on missing auth so tasks
  are never created without an owner record
- Update tests to mock auth context in HTTP mode

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
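A fail-closed ownership check of the kind described here might look like the sketch below. It is not the PR's `_check_task_ownership`; the function name, the arguments, and the idea of passing the stored owner and the caller's JWT `sub` as plain strings are assumptions.

```python
from typing import Optional


def caller_owns_task(recorded_owner: Optional[str], caller_sub: Optional[str]) -> bool:
    """Return True only when both identities are present and equal.

    A missing owner record or an unidentified caller denies access
    (fail-closed) rather than letting the request continue.
    """
    if not recorded_owner or not caller_sub:
        return False
    return recorded_owner == caller_sub
```

The same helper can then guard the results, progress, and cancel handlers, so that a lookup failure in any of them results in denial.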
…iption

- Replace input_csv/left_csv/right_csv with inline data/left_data/right_data
  across all 7 integration tests (screen, rank, dedupe, merge, agent,
  single_agent x2)
- Revert manifest.json upload_data description back to "URL or local file"

All 7 integration tests pass against the live API.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Tests the stdio upload flow: write CSV to disk, upload via
everyrow_upload_data to get an artifact_id, then pass it to
everyrow_screen. Verifies the artifact path label and that results
match the inline-data test.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add trusted_ip_header setting (default X-Forwarded-For) instead of
  hardcoded Cloudflare + XFF header checks in middleware
- Move get_access_token import to module level in tools.py and
  tool_helpers.py; update mock patch paths in tests
- Fix upload_data docstring to match manifest ("URL or local file")

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…or paths

- TestArtifactReuseIntegration: upload once, reuse artifact_id in screen + rank
- TestUrlUploadIntegration: upload CSV from public GitHub URL
- TestCancelIntegration: cancel running task + cancel nonexistent task
- TestErrorPathsIntegration: bad UUID, empty data, both inputs, no input, merge missing side

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Explicit Cloudflare header config so GKE deployments can override
with X-Forwarded-For without risk of header spoofing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
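One way a configurable trusted header could feed the rate limiter's client-IP lookup is sketched below. The function name `resolve_client_ip` and its parameters are hypothetical, and the choice of the right-most comma-separated entry assumes exactly one trusted proxy appends to the header; the PR does not state which entry it uses.

```python
import ipaddress
from typing import Mapping, Optional


def resolve_client_ip(
    headers: Mapping[str, str],
    peer_ip: str,
    trusted_ip_header: Optional[str],
) -> str:
    """Pick the client IP from one explicitly trusted header, else the socket peer."""
    if not trusted_ip_header:
        return peer_ip  # no proxy configured: never trust request headers
    wanted = trusted_ip_header.lower()
    for name, value in headers.items():
        if name.lower() != wanted:
            continue
        # Assumption: a single trusted proxy appends the address it saw last.
        candidate = value.split(",")[-1].strip()
        try:
            ipaddress.ip_address(candidate)
        except ValueError:
            return peer_ip  # malformed value: fall back to the peer address
        return candidate
    return peer_ip
```

Reading only the one configured header means a client cannot influence the result by sending a different forwarding header than the one the deployment's proxy controls.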
@RafaelPo

@claude code review

@github-actions

github-actions bot commented Feb 24, 2026

Claude finished @RafaelPo's task in 2m 17s


Code Review: Security Hardening PR

  • Read CLAUDE.md and repo guidelines
  • Review SSRF protection (utils.py)
  • Review security headers middleware (middleware.py)
  • Review Redis TLS & key namespacing (redis_store.py)
  • Review token encryption at rest (auth.py)
  • Review upload handler changes (uploads.py)
  • Review route changes (routes.py, app.py, http_config.py)
  • Review tool/server changes (tools.py, tool_helpers.py, server.py)
  • Review config changes (config.py, models.py, templates.py)
  • Review Docker/compose changes
  • Review test coverage
  • Post final review summary

This is a well-structured security hardening PR. The defense-in-depth approach is solid, with multiple layers of protection across SSRF, authentication, encryption, and container security. I reviewed all 26 changed files; here is my analysis.

What looks good

SSRF Protection (utils.py) — The multi-layer approach is thorough: pre-flight DNS validation, redirect validation via httpx event hooks, and a custom _SSRFSafeTransport that re-validates hostnames at request time to narrow the TOCTOU window. IPv4-mapped IPv6 unwrapping (::ffff:127.0.0.1 → 127.0.0.1) is correctly handled. Streaming response size enforcement is a nice touch.
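The streaming size enforcement mentioned above can be illustrated without an HTTP client. The sketch below works on any iterable of byte chunks; `read_capped` and `ResponseTooLarge` are hypothetical names, and the PR's own code (which streams through httpx against max_fetch_size_bytes) may be structured differently.

```python
from typing import Iterable


class ResponseTooLarge(ValueError):
    """Raised when a streamed body exceeds the configured limit."""


def read_capped(chunks: Iterable[bytes], max_bytes: int) -> bytes:
    """Accumulate chunks, aborting as soon as the limit would be exceeded.

    Checking before each append means memory use stays bounded by
    max_bytes, and the rest of the stream is never read.
    """
    buffer = bytearray()
    for chunk in chunks:
        if len(buffer) + len(chunk) > max_bytes:
            raise ResponseTooLarge(f"response exceeds {max_bytes} bytes")
        buffer.extend(chunk)
    return bytes(buffer)
```

The cap is enforced on bytes actually received, so it does not depend on a Content-Length header being present or truthful.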

Rate Limiter (middleware.py) — Redis-primary with in-memory fallback is the right pattern. Thread-safe in-memory counters with periodic cleanup and a 50k hard cap prevent unbounded memory growth. The BodySizeLimitMiddleware is a clean pure-ASGI implementation that handles chunked transfers correctly.
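As a rough illustration of the Redis-primary, in-memory-fallback pattern, consider the sketch below. All names are hypothetical, the fixed-window algorithm and the evict-oldest policy at the cap are assumptions, and `redis_check` is a stand-in callable assumed to raise the built-in `ConnectionError`; a real Redis client raises its own exception type.

```python
import threading
import time
from typing import Callable, Dict, Optional, Tuple


class InMemoryWindowLimiter:
    """Thread-safe fixed-window counter with a hard cap on tracked keys."""

    def __init__(self, limit: int, window_seconds: float, max_entries: int = 50_000):
        self.limit = limit
        self.window_seconds = window_seconds
        self.max_entries = max_entries
        self._lock = threading.Lock()
        self._windows: Dict[str, Tuple[float, int]] = {}  # key -> (start, count)

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        with self._lock:
            started, count = self._windows.get(key, (now, 0))
            if now - started >= self.window_seconds:
                started, count = now, 0  # window expired: start a new one
            if key not in self._windows and len(self._windows) >= self.max_entries:
                self._make_room(now)
            self._windows[key] = (started, count + 1)
            return count + 1 <= self.limit

    def _make_room(self, now: float) -> None:
        # Drop expired windows first, then the oldest-inserted keys if needed.
        expired = [k for k, (s, _) in self._windows.items()
                   if now - s >= self.window_seconds]
        for k in expired:
            del self._windows[k]
        while len(self._windows) >= self.max_entries:
            del self._windows[next(iter(self._windows))]


def check_rate_limit(
    key: str,
    redis_check: Callable[[str], bool],
    fallback: InMemoryWindowLimiter,
) -> bool:
    """Prefer the shared Redis counter; use local counters if Redis is down."""
    try:
        return redis_check(key)
    except ConnectionError:
        return fallback.allow(key)
```

The local counters are per-process, so during a Redis outage the effective limit is multiplied by the number of worker processes; the fallback trades accuracy for availability.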

Token Encryption (redis_store.py) — HKDF → Fernet for at-rest encryption of tokens in Redis is solid. The build_key sanitizer prevents Redis key injection. Sentinel support and TLS options are well-configured.

Auth (auth.py) — Atomic consume via GETDEL, client-ID mismatch re-storage, input length caps (256 chars), __Host- cookie prefix, rate limiting on auth endpoints, and JWKS lock for concurrent access are all well-implemented.

Upload System (uploads.py) — shlex.quote for shell injection prevention, hmac.compare_digest for timing-safe comparison, consume-on-use metadata, and generic error messages (no info leaks) are all correct.
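The two standard-library calls named above can be shown in a few lines. This is a sketch under assumptions: the exact curl flags, the function names, and the idea that the compared values are signature strings are illustrative, not taken from uploads.py.

```python
import hmac
import shlex


def build_upload_command(upload_url: str, file_path: str) -> str:
    """Build a curl command that is safe to paste into a POSIX shell.

    shlex.quote wraps each untrusted value so metacharacters such as
    ';', '&', and spaces are treated as literal text by the shell.
    """
    return " ".join(
        ["curl", "-X", "PUT", "-T", shlex.quote(file_path), shlex.quote(upload_url)]
    )


def signatures_match(expected: str, provided: str) -> bool:
    """Compare two secrets in constant time to avoid timing side channels."""
    return hmac.compare_digest(expected.encode(), provided.encode())
```

For instance, a file path of `data.csv; rm -rf ~` comes back as a single quoted argument rather than a second shell command.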

Container Lockdown (Dockerfile/compose) — Non-root user, read-only filesystem, CAP_DROP ALL, no-new-privileges, resource limits, private Redis network, and 127.0.0.1 port binding are all best practices.

Test Coverage — 26 new tests covering SSRF blocking, rate limiter fallback, body size enforcement, upload validation, and generic error messages. Coverage is comprehensive.


Issues found

1. Incomplete cloud metadata hostname blocklist

utils.py:33-38: _BLOCKED_HOSTNAMES only covers Google Cloud. While _BLOCKED_NETWORKS blocks 169.254.0.0/16 (covering metadata IPs for all providers), adding AWS and Azure metadata hostnames provides defense-in-depth against DNS edge cases:

_BLOCKED_HOSTNAMES = frozenset(
    {
        "metadata.google.internal",
        "metadata.google.internal.",
+       "metadata.aws.internal",
+       "metadata.aws.internal.",
    }
)

Severity: Low (IPs are already blocked by _BLOCKED_NETWORKS; this is defense-in-depth only).


2. _decode_trusted_server_jwt should restrict algorithms

auth.py:159-161 — This function disables signature verification (acceptable for trusted server-to-server tokens from Supabase) but hardcodes algorithms=["RS256"]. Since verification is off, PyJWT ignores the algorithms param anyway. Consider adding an "alg" claim check or a comment clarifying that algorithm confusion doesn't apply here because verification is disabled.

def _decode_trusted_server_jwt(token: str) -> dict[str, Any]:
    return pyjwt.decode(
        token, options={"verify_signature": False}, algorithms=["RS256"]
    )

Severity: Low (signature verification is intentionally off; algorithm confusion isn't a risk when you don't verify).

3. Misleading exception type in redirect validation

utils.py:157-160: _check_redirect raises httpx.TooManyRedirects for SSRF blocks. While this works to abort the request, it could confuse debugging. A comment explaining the choice (or a dedicated exception) would improve clarity:

async def _check_redirect(response: httpx.Response) -> None:
    if response.is_redirect:
        location = response.headers.get("location", "")
        if location:
            try:
                _validate_url_target(location)
            except ValueError:
+               # Use TooManyRedirects to abort the redirect chain — httpx
+               # does not expose a more specific "redirect rejected" error.
                raise httpx.TooManyRedirects(
                    f"Redirect to blocked address: {location}",
                    request=response.request,
                )

Severity: Nit.

4. HKDF derivation uses salt=None

redis_store.py:61-66 — The HKDF key derivation uses salt=None. While the info parameter provides domain separation, HKDF's security properties are stronger with a salt. Consider using a static salt:

key = HKDF(
    algorithm=hashes.SHA256(),
    length=32,
-   salt=None,
+   salt=b"everyrow-mcp-fernet-v1",
    info=b"everyrow-mcp-fernet",
).derive(settings.upload_secret.encode())

Severity: Low (current approach is cryptographically acceptable for single-key derivation from a high-entropy secret, but a salt is recommended by RFC 5869).
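For context on what salt=None means, RFC 5869 treats an absent salt as a string of HashLen zero bytes. The sketch below reimplements HKDF-SHA256 with the standard library only, to show the extract-then-expand steps; it is not the PR's code, which uses the HKDF class shown in the diff, and the function names and the example secret are placeholders.

```python
import base64
import hashlib
import hmac
from typing import Optional

HASH_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def hkdf_sha256(ikm: bytes, length: int, salt: Optional[bytes] = None,
                info: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract a PRK, then expand it."""
    if salt is None:
        salt = b"\x00" * HASH_LEN  # RFC 5869 default when no salt is given
    # Extract: PRK = HMAC(salt, IKM)
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    # Expand: T(i) = HMAC(PRK, T(i-1) || info || i), concatenated and truncated
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def fernet_key_from_secret(secret: str, salt: Optional[bytes] = None) -> bytes:
    """Derive 32 bytes and encode them as the URL-safe base64 key Fernet expects."""
    raw = hkdf_sha256(secret.encode(), 32, salt=salt, info=b"everyrow-mcp-fernet")
    return base64.urlsafe_b64encode(raw)
```

Note that switching from salt=None to a static salt changes the derived key, so values already encrypted under the old key would no longer decrypt after such a change.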

5. Redis warning should fire for auth-mode HTTP only

http_config.py:78-82 — The Redis password warning fires for all HTTP mode (settings.is_http), but it's arguably only critical for authenticated deployments. In --no-auth dev mode, Redis on localhost without a password may be fine.

-   if settings.is_http and not settings.redis_password:
+   if settings.is_http and not no_auth and not settings.redis_password:

Severity: Nit (cosmetic; a warning in dev mode is noisy but harmless).


Summary

This is a strong security PR with no high-severity issues. The SSRF protection, middleware stack, encryption-at-rest, and container hardening are all well-implemented with proper test coverage. The issues noted above are low-severity improvements that can be addressed in follow-up if desired.

RafaelPo and others added 3 commits February 24, 2026 21:06
- Add comment explaining why TooManyRedirects is used to abort SSRF
  redirects (httpx has no dedicated redirect-rejected error)
- Only warn about missing REDIS_PASSWORD in authenticated HTTP mode,
  not in --no-auth dev mode where localhost Redis without a password
  is expected

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Some backend rows can take >6 minutes under load; 30 polls (~6.5 min)
was too tight. 60 polls (~13 min) gives enough headroom.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Base compose file is now environment-neutral (proxy headers off by
default, no Cloudflare-specific error messages). Local override exposes
Redis on localhost and sets MCP_SERVER_URL to http://localhost:8000.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Comment on lines 66 to 70

    Raises:
-       ValueError: If the response cannot be parsed as CSV.
+       ValueError: If the response cannot be parsed as CSV, or URL targets a blocked network.
        httpx.HTTPStatusError: On non-2xx responses.
    """

This comment was marked as outdated.

_validate_hostname checked direct IP literals against _BLOCKED_NETWORKS
without unwrapping IPv4-mapped IPv6 addresses. An attacker could bypass
SSRF protection by using ::ffff:127.0.0.1 instead of 127.0.0.1, since
Python's ipaddress silently returns False for IPv6-in-IPv4-network
checks. The DNS-resolution path (_is_blocked_ip) already had the
unwrap; this aligns the direct IP literal path.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
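The bypass and its fix can be reproduced with the standard-library `ipaddress` module. The sketch below is illustrative only: the function names and the short network list are assumptions, not the contents of _BLOCKED_NETWORKS.

```python
import ipaddress

# Abbreviated, assumed blocklist for demonstration purposes.
BLOCKED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("127.0.0.0/8", "10.0.0.0/8", "169.254.0.0/16")
]


def naive_is_blocked(raw: str) -> bool:
    """Vulnerable check: an IPv6 address is never 'in' an IPv4 network."""
    addr = ipaddress.ip_address(raw)
    return any(addr in net for net in BLOCKED_NETWORKS)


def is_blocked(raw: str) -> bool:
    """Unwrap IPv4-mapped IPv6 addresses before testing membership."""
    addr = ipaddress.ip_address(raw)
    if isinstance(addr, ipaddress.IPv6Address) and addr.ipv4_mapped is not None:
        addr = addr.ipv4_mapped  # ::ffff:127.0.0.1 becomes 127.0.0.1
    return any(addr in net for net in BLOCKED_NETWORKS)
```

Here `naive_is_blocked("::ffff:127.0.0.1")` returns False because of the version mismatch, while `is_blocked("::ffff:127.0.0.1")` returns True.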
@RafaelPo RafaelPo merged commit 4000b88 into main Feb 24, 2026
5 checks passed
@RafaelPo RafaelPo deleted the feat/security-hardening branch February 24, 2026 21:26