fix(gateway): security closeout — wildcard rejection, normalization, healthcheck by marcusrbrown · Pull Request #638 · fro-bot/agent

marcusrbrown · 2026-05-17T18:16:47Z

Three security findings from the ce:review pass on PR #635 bundled into one focused follow-up. All deploy/-only, no TypeScript touched.

Changes

`OBJECT_STORE_HOSTS` validation (P1 + P2)

The env var that scopes mitmproxy's S3/R2 allowlist now refuses to start with bad config. Three classes of misconfiguration are blocked at module import time:

Wildcards (P1) — OBJECT_STORE_HOSTS="*.s3.amazonaws.com" re-introduced the over-broad allowlist hole that PR feat(gateway): Unit 4 — gateway daemon skeleton + Docker Compose stack #635 fixed. Same blast radius, different surface. Now raises ValueError with an actionable message that names the entry and explains the security rationale.
Ports — OBJECT_STORE_HOSTS="localhost:9000" never matched because mitmproxy's flow.request.host inclusion of port depends on version and proxy mode. Now raises with a clear bare-hostname-only message.
Invalid hostnames — non-RFC 1123 hostnames (leading/trailing dots, consecutive dots, invalid chars, oversized) raise with the entry name in the message.

Operator-friendly: surviving entries are lowercased so OBJECT_STORE_HOSTS="FOO.S3.AMAZONAWS.COM" works as expected.

Validation order: empty-skip → wildcard-reject → port-reject → hostname-validate → lowercase-normalize → append.

Gateway healthcheck (P2)

Replaced [CMD, node, -e, process.exit(0)] (unconditionally passing) with [CMD-SHELL, "test -s /etc/ssl/certs/mitmproxy-ca-cert.pem"]. If mitmproxy's cert volume becomes unreadable or zero-byte (corrupted write, mount issue), the gateway now transitions to unhealthy and Docker's restart policy can act. The deeper TCP probe option (nc against mitmproxy:8080) is deferred.

Tests

deploy/mitmproxy/test_allowlist.py: 20 tests → 29 tests. New coverage:

Wildcard apex + subdomain rejection (asserts 'Set exact bucket hostnames' guidance)
Invalid hostname rejection (asserts entry name in message)
Valid bucket host acceptance
Uppercase normalization
Port rejection
Hostname validator edge cases: leading dot, trailing dot, consecutive dots

Verification

Python: 29/29 pass (python3 deploy/mitmproxy/test_allowlist.py)
Gateway TS: 72/72 pass
Workspace: 1463+ pass
Lint, types, build: clean
docker compose -f deploy/compose.yaml config: validates

Residual

One P3 follow-up tracked locally: deciding whether to reject IP literals in OBJECT_STORE_HOSTS (currently accepted, which preserves the MinIO use case but lets an attacker with deploy-config access reach metadata-service IPs). Deferred to a future reliability batch — OBJECT_STORE_HOSTS is already inside the deploy-config trust boundary.

…healthcheck 3 ready todos from PR #635's ce:review bundled into one focused PR: todo 016 (P1), todo 015 (P2), todo 010 (P2). All deploy/-only. Todo 016 (P1) — Reject wildcards in OBJECT_STORE_HOSTS: The env var that scopes mitmproxy's S3/R2 allowlist accepted wildcard entries unchecked. An operator (or attacker with deploy-config access) could set OBJECT_STORE_HOSTS="*.s3.amazonaws.com" and silently re- introduce the same over-broad-allowlist hole that PR #635 fixed. Same blast radius, different surface. Added _is_valid_hostname() RFC 1123 helper. Validation now raises ValueError at module import time (mitmproxy fails-fast on bad config) when an entry: - starts with "*." (wildcard) — message names the entry and explains the security rationale - fails hostname validation (1-253 chars, allowed [a-z 0-9 . -], label rules) Todo 015 (P2) — Normalize OBJECT_STORE_HOSTS entries: Two real misconfigurations slipped through the previous parser: 1. Uppercase entries never matched because _is_allowed lowercases the request host but not the allowlist entries. 2. Entries with ports (e.g. "localhost:9000" for MinIO) never matched because flow.request.host inclusion of port depends on mitmproxy version and proxy mode — ambiguous. Hybrid fix: normalize case (operator-friendly, just .lower() the entry), but reject ports at startup (genuine ambiguity — operator must know whether mitmproxy will see the port). Validation order is now: empty-skip → wildcard-reject → port-reject → hostname-validate → lowercase-normalize → append. Todo 010 (P2) — Gateway cert-readability healthcheck: The Docker healthcheck was [CMD, node, -e, process.exit(0)] which passes unconditionally. If mitmproxy's cert volume gets corrupted or unmounted after gateway startup, the gateway has no signal — the operator's only feedback is failed requests. Replaced with [CMD-SHELL, "test -r /etc/ssl/certs/mitmproxy-ca-cert.pem"] which verifies the CA cert is readable. If the cert disappears the gateway transitions to unhealthy and Docker's restart policy can act. The deeper TCP-probe option (nc against mitmproxy:8080) is deferred to a future pass per the todo's scope_decision frontmatter. Tests: - deploy/mitmproxy/test_allowlist.py: +6 tests covering all validation paths (wildcard apex/subdomain, invalid hostname, valid bucket, uppercase normalization, port rejection) Verification: 26/26 Python tests (was 20, +6), 72/72 gateway TS, 1463+ workspace, lint/types/build clean, compose config validates.

3 P2 findings from a 3-reviewer ce:review pass on the security closeout (commit f61a463). All bounded, all in deploy/. Changes: - allowlist.py _is_valid_hostname: explicitly reject leading-dot, trailing-dot, and consecutive-dots BEFORE the per-label loop. Previously rstrip('.') was silently normalizing trailing dots and the empty-label check missed leading dots — `.example.com`, `example.com.`, and `example..com` all incorrectly returned True. - compose.yaml gateway healthcheck: tightened test -r to test -s so a zero-byte CA cert file (mid-write corruption, mount issue) also triggers unhealthy state. Catches the case where the file exists but isn't a valid cert. - test_allowlist.py: tightened error-message assertions on the wildcard and invalid-hostname rejection tests so they pin the operator-actionable guidance text ("Set exact bucket hostnames", the entry name), not just the type or a partial word. A future refactor that swallows the actionable guidance can't slip through. - test_allowlist.py: +3 new tests for the hostname-validator edge cases (leading dot, trailing dot, consecutive dots). Residual: todo 017 (P3, IP literal decision) captures the residual risk that OBJECT_STORE_HOSTS accepts private/loopback/metadata-service IPs. Deferred to PR B (reliability) since deploy-config is already a trust boundary and the MinIO use case wants IP support. Verification: 29/29 Python tests (was 26, +3), 72/72 gateway TS, lint/types/build clean, compose config validates.

fro-bot

Verdict: CONDITIONAL

All three security fixes are correctly implemented and the test suite passes cleanly (29/29). One blocking issue found.

Blocking issues

** accepts IP literals, and the port-reject error message misleads for IPv6**

is caught by the port-reject step (colons trigger it), so IPv6 literals are blocked — but the error message says "contains a port" when the real problem is that it's an IPv6 address. This is a cosmetic mislead for a security-relevant validator. More importantly, bare IPv4 literals (e.g. ) pass because digits and dots satisfy the RFC 1123 label check:

$ python3 -c "
import sys; from unittest.mock import MagicMock
sys.modules['mitmproxy']=MagicMock(); sys.modules['mitmproxy.http']=MagicMock()
import os, importlib.util
os.environ['OBJECT_STORE_HOSTS']='169.254.169.254'
spec=importlib.util.spec_from_file_location('al','deploy/mitmproxy/allowlist.py')
mod=importlib.util.module_from_spec(spec); spec.loader.exec_module(mod)
print('allowed:', mod._is_allowed('169.254.169.254'))
"

(AWS instance metadata) would load and be allowed. The PR description explicitly defers this as a P3 on the basis of deploy-config trust — that rationale is reasonable, but the deferred item should be tracked in code (a or issue reference) rather than just locally. Without it, the next reviewer won't know the decision was deliberate.

Required fix: Add an inline comment in at the point where is called (or in itself) that says IP literals are intentionally accepted — cite the MinIO use case and P3 tracking reference. This is a one-liner comment, not a code change.

Non-blocking concerns

Healthcheck probe gap — checks that the cert file is non-empty, which is a solid improvement over the unconditional . However, it does not verify that the cert is actually parseable or that mitmproxy is listening on 8080. A zero-byte partial write followed by a complete write between health intervals would pass on the second check. Accepted risk at this scope, but worth a comment noting the deeper probe is deferred (the PR description mentions it — just not in the compose.yaml).
Static ALLOWLIST wildcard entries remain — and in the static list use wildcard semantics. This is intentional and was in-scope before this PR, but the wildcard-reject guard only applies to . A future reader might wonder why wildcards are fine in the static list but rejected in the env var — a brief comment near the definition would pre-empt the question.
Port-reject error message for IPv6 — is technically wrong for . Low impact since IPv6 literals are blocked (desired), but the message is confusing. Non-blocking because IPv4 literals are the real P3 concern.

Missing tests

No test covering a bare IPv4 literal in (e.g., or ). Given the explicit P3 deferral, adding a test that documents current behavior ("IPv4 literals are accepted") would make the intentionality explicit and catch an accidental future change.

Risk assessment: LOW

The code changes are deploy/-scoped, additive, and fail-closed. The wildcard-reject and port-reject paths are covered by tests. The healthcheck improvement is strictly better than the previous unconditional pass. The only untracked risk is the IP-literal acceptance, which the author has already reasoned about — it just needs to be visible in the code.

Run Summary

Field	Value
Event	pull_request
Repository	fro-bot/agent
Run ID	25998851366
Cache	hit
Session	ses_1c8d69d7effebdWSGKQpMtcm0O

fro-bot

Verdict: CONDITIONAL

All three security fixes are correctly implemented and the test suite passes cleanly (29/29). One blocking issue found.

Blocking issues

_is_valid_hostname accepts IPv4 literals — intentional but undocumented in code

Bare IPv4 literals (e.g. 169.254.169.254) pass _is_valid_hostname because digits and dots satisfy the RFC 1123 label check. The AWS instance metadata IP would be accepted as a valid OBJECT_STORE_HOSTS entry. The PR description explicitly defers this as a P3 (MinIO use case, deploy-config trust boundary) — that rationale is sound — but the decision is tracked only locally, not in the code.

IPv6 literals (2001:db8::1) are incidentally blocked by the port-reject step (colons trigger it), but with a misleading error message ("contains a port").

Required fix: Add a one-line comment in allowlist.py near the _is_valid_hostname call in the OBJECT_STORE_HOSTS loop documenting the intentional acceptance of IPv4 literals, citing the MinIO use case and the P3 tracking decision. No code change needed — just a comment so the next reviewer knows it was deliberate.

Non-blocking concerns

Healthcheck probe gap — test -s /etc/ssl/certs/mitmproxy-ca-cert.pem checks that the cert file is non-empty, which is a solid improvement over the unconditional node -e process.exit(0). It does not verify the cert is parseable or that mitmproxy is listening on 8080. Accepted risk; the PR description already notes the deeper nc probe is deferred. A comment in compose.yaml noting the deferral would prevent the next contributor from re-litigating it.
Static ALLOWLIST wildcards vs. env-var wildcard rejection — *.discord.com and *.discord.gg exist in the static list, but wildcard-reject only applies to OBJECT_STORE_HOSTS. This is intentional by design, but a brief comment near the ALLOWLIST definition explaining the asymmetry would pre-empt confusion for future contributors.
Port-reject error message for IPv6 — The message "contains a port" is technically wrong for 2001:db8::1. Low impact since IPv6 literals being blocked is the desired outcome, but the message could confuse an operator who genuinely intended to enter an IPv6 address.

Missing tests

No test covering a bare IPv4 literal in OBJECT_STORE_HOSTS (e.g. 192.168.1.1 or 169.254.169.254). Given the explicit P3 deferral, a test that documents current behavior ("IPv4 literals are currently accepted") would make the intentionality explicit and catch an accidental future change to reject them.

Risk assessment (LOW)

The code changes are deploy/-scoped, additive, and fail-closed. The wildcard-reject and port-reject paths are covered by tests. The healthcheck improvement is strictly better than the previous unconditional pass. The only untracked risk is the IPv4-literal acceptance, which the author has already reasoned about — it just needs one comment to be visible in the code.

Run Summary

Field	Value
Event	pull_request
Repository	fro-bot/agent
Run ID	25998851366
Cache	hit
Session	ses_1c8d69d7effebdWSGKQpMtcm0O

… clarity 1 blocking + 3 non-blocking items from Fro Bot's CHANGES_REQUESTED. All documentation/clarity fixes; no behavior changes to validation logic beyond a sharper IPv6 error message. Blocking — IP literal acceptance is now visible in code: The OBJECT_STORE_HOSTS validation accepts IPv4 literals because RFC 1123 labels permit pure digits. That's a deliberate choice (MinIO/self-hosted object-store use case), and the residual risk (metadata-service IPs 169.254.169.254 reachable if deploy-config is compromised) is accepted on deploy-config-trust grounds. Previously this rationale lived only in the local todo 017 — invisible to anyone reading the code. Added an inline comment at the validator call site that cites the MinIO case, names the accepted risk, and points to the todo for the deferred tightening decision. Non-blocking 1 — Wildcard asymmetry comment: The static ALLOWLIST permits *.discord.com / *.discord.gg / etc. The OBJECT_STORE_HOSTS env-var path rejects ANY wildcard entry. Future readers wondering about the inconsistency now see the explanation inline: wildcards in code go through review; wildcards via env var would let operators (or attackers with deploy-config access) silently re-introduce over-broad allowlist holes. Non-blocking 2 — IPv6 port-reject message clarity: OBJECT_STORE_HOSTS=::1 was rejected by the port-reject branch because it contains a colon, but the error message said "contains a port" — wrong for IPv6. Split the port-reject step into two arms: if the entry looks like IPv6 (multiple colons OR leading [), raise an IPv6-specific message; otherwise the original port message. IPv6 is still rejected (not yet supported) but the diagnostic is now accurate. Non-blocking 3 — Test documenting current IPv4 acceptance: Added test_object_store_hosts_accepts_ipv4_literal that captures the deliberate choice to accept 10.0.0.5 etc. and references todo 017. An accidental future refactor that rejects IPv4 will now break a test with a clear docstring explaining why the behavior matters. Also added test_object_store_hosts_rejects_ipv6_literal to confirm the new branch fires and emits an IPv6-specific error message. Verification: 31/31 Python tests (was 29, +2), 72/72 gateway TS, lint/types/build clean, compose config validates.

fro-bot

Verdict: PASS

All blocking issues from my previous review are fully resolved in commit daf9409. 31/31 tests pass.

What changed in the follow-up commit

Blocking issue — IPv4 literal acceptance now documented in code (allowlist.py:133-138): The 7-line comment names the MinIO use case, states the accepted risk (metadata-service IPs), and references the todo tracking the future decision. Exactly what was asked.

Non-blocking — IPv6 error message corrected (allowlist.py:120-125): The port-reject step now distinguishes between host:port (single colon) and IPv6 literals (multiple colons or [ prefix), with a dedicated error message. fe80::1, [::1]:443, and ::1 all raise with "IPv6" in the message; host:80 still says "contains a port".

Non-blocking — static ALLOWLIST wildcard asymmetry documented (allowlist.py:33-38): The comment at the top of ALLOWLIST now explains why wildcards are permitted in the static list (code review gate) but rejected in OBJECT_STORE_HOSTS (operator/attacker surface).

Missing test — IPv4 literal behavior documented (test_allowlist.py:368-377): test_object_store_hosts_accepts_ipv4_literal locks in current behavior with a docstring that cites todo 017.

Missing test — IPv6 literal rejection (test_allowlist.py:357-365): New test asserts "IPv6" appears in the message and "contains a port" does not.

Blocking issues

None.

Non-blocking concerns

None.

Missing tests

None.

Risk assessment (LOW)

No behavior changes since the previous review — only comments, a sharper error message for an already-rejected input class, and two new tests. Blast radius is unchanged.

Run Summary

Field	Value
Event	pull_request
Repository	fro-bot/agent
Run ID	25999003235
Cache	hit
Session	ses_1c8d69d7effebdWSGKQpMtcm0O

marcusrbrown added 2 commits May 17, 2026 10:25

marcusrbrown requested a review from fro-bot as a code owner May 17, 2026 18:16

fro-bot added the agent: working label May 17, 2026

fro-bot requested changes May 17, 2026

View reviewed changes

fro-bot removed the agent: working label May 17, 2026

fro-bot added the agent: working label May 17, 2026

fro-bot approved these changes May 17, 2026

View reviewed changes

fro-bot removed the agent: working label May 17, 2026

marcusrbrown merged commit 7234d25 into main May 17, 2026
10 checks passed

marcusrbrown deleted the fix/gateway-security-closeout branch May 17, 2026 18:27

fro-bot Bot mentioned this pull request May 17, 2026

chore(release): pending release v0.44.0 #637

Merged

marcusrbrown mentioned this pull request May 17, 2026

fix(gateway): reliability and test-coverage follow-ups #639

Merged

fro-bot mentioned this pull request May 18, 2026

Daily Maintenance Report #252

Open

46 tasks

This was referenced May 23, 2026

fix(gateway): detect mitmproxy loss in healthcheck #661

Merged

docs(plan): reconcile gateway v1 status with shipped deploy-readiness work #663

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(gateway): security closeout — wildcard rejection, normalization, healthcheck#638

fix(gateway): security closeout — wildcard rejection, normalization, healthcheck#638
marcusrbrown merged 3 commits into
mainfrom
fix/gateway-security-closeout

marcusrbrown commented May 17, 2026

Uh oh!

fro-bot left a comment

Uh oh!

fro-bot left a comment

Uh oh!

fro-bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

marcusrbrown commented May 17, 2026

Changes

OBJECT_STORE_HOSTS validation (P1 + P2)

Gateway healthcheck (P2)

Tests

Verification

Residual

Uh oh!

fro-bot left a comment

Choose a reason for hiding this comment

Verdict: CONDITIONAL

Blocking issues

Non-blocking concerns

Missing tests

Risk assessment: LOW

Uh oh!

fro-bot left a comment

Choose a reason for hiding this comment

Verdict: CONDITIONAL

Blocking issues

Non-blocking concerns

Missing tests

Risk assessment (LOW)

Uh oh!

fro-bot left a comment

Choose a reason for hiding this comment

Verdict: PASS

What changed in the follow-up commit

Blocking issues

Non-blocking concerns

Missing tests

Risk assessment (LOW)

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

`OBJECT_STORE_HOSTS` validation (P1 + P2)