Skip to content

bug: flaky private/reserved IP detection in policy advisory security notes #1777

@alangou

Description

@alangou

Agent Diagnostic

This issue originated security-architecture code read of the
private-IP detection used when annotating proposed network-policy rules. An
agent then traced the reported function through the codebase:

  • Starting point flagged in review:
    crates/openshell-sandbox/src/mechanistic_mapper.rs
    generate_security_notes() (the host.starts_with("172.") block).
  • Searched the tree for the same pattern and found a second, duplicated copy
    of the naive string check:
    crates/openshell-server/src/grpc/policy.rsgenerate_security_notes()
    ("Re-validate security notes server-side").
  • Key finding: a canonical, RFC-accurate classifier already exists in
    crates/openshell-core/src/net.rs:
    • is_internal_ip(IpAddr) — already covers RFC 1918 (via Ipv4Addr::is_private(),
      i.e. correct 172.16.0.0/12), loopback, link-local/APIPA, unspecified,
      CGNAT 100.64.0.0/10 (RFC 6598), IETF protocol assignments 192.0.0.0/24,
      benchmarking 198.18.0.0/15 (RFC 2544), TEST-NET-2/3 (RFC 5737), and
      IPv6 ULA fc00::/7.
    • is_always_blocked_ip(IpAddr), is_always_blocked_net(IpNet),
      is_link_local_ip(IpAddr), is_known_metadata_hostname(&str).
    • This module already does exactly what the review proposed (hand-pick the
      stable std::net predicates + manually add the ranges that are still
      nightly-only in Ipv4Addr::is_global). It is well unit-tested.
  • The runtime SSRF enforcement path (crates/openshell-sandbox/src/proxy.rs,
    e.g. reject_internal_resolved_addrs() at the is_internal_ip(addr.ip())
    check, and the is_always_blocked_ip checks) already uses these canonical
    helpers and is unaffected.

Scope / severity clarification: the two generate_security_notes()
functions only produce human-readable advisory text attached to proposed
policy rules. They are not an enforcement boundary — a wrong note never
allows or blocks a connection. The actual private-IP/SSRF blocking is done in
the proxy with the correct is_internal_ip / is_always_blocked_ip helpers.
So this is a correctness / tech-debt bug in advisory output, not a security
vulnerability
, which is why it is filed here rather than via SECURITY.md.

Description

Actual behavior — both generate_security_notes() implementations classify
a destination as "internal/private" with naive string prefixes:

if host.starts_with("10.")
    || host.starts_with("172.")
    || host.starts_with("192.168.")
    || host == "localhost"
    || host.starts_with("127.")
    || host.starts_with("169.254.")
{
    notes.push(format!("Destination '{host}' appears to be an internal/private address."));
}

This is flaky in three ways:

  1. Hostname false positives. The host is a string, not a parsed IP, so a
    public DNS name like 10.example.org, 172.foo.com, or 127.acme.io is
    labeled "internal/private".
  2. Wrong CIDR for the 172 block. starts_with("172.") matches the entire
    172.0.0.0/8, but RFC 1918 only reserves 172.16.0.0/12. Public addresses
    in 172.0.0.0172.15.255.255 and 172.32.0.0172.255.255.255 are
    incorrectly flagged as private.
  3. Incomplete coverage. No IPv6 handling at all, and no CGNAT (100.64.0.0/10),
    benchmarking (198.18.0.0/15), IETF (192.0.0.0/24), or TEST-NET ranges —
    even though the canonical helper already covers all of these.

Expected behavior — classification should parse the host as an IpAddr and
delegate to the existing canonical helpers in openshell_core::net instead of
re-implementing (incorrectly) the logic with string prefixes. Non-IP hostnames
should not be classified as private by prefix matching.

Reproduction Steps

These functions are pure, so the defect is reproducible by their inputs.
For each call site (mechanistic_mapper.rs and grpc/policy.rs
generate_security_notes):

  1. host = "10.example.org", port = 443
    → currently appends "appears to be an internal/private address" (WRONG — it
    is a public DNS hostname).
  2. host = "172.15.0.1", port = 443 (public, just below 172.16.0.0/12)
    → currently flagged internal (WRONG).
  3. host = "172.32.0.1", port = 443 (public, just above the RFC 1918 block)
    → currently flagged internal (WRONG).
  4. host = "100.64.0.1", port = 443 (CGNAT, genuinely internal)
    → currently not flagged (missed).
  5. host = "fd12:3456:789a:1::1", port = 443 (IPv6 ULA, genuinely internal)
    → currently not flagged (missed).

Affected source:

  • crates/openshell-sandbox/src/mechanistic_mapper.rsgenerate_security_notes()
    (string-prefix block ~L297–L303).
  • crates/openshell-server/src/grpc/policy.rsgenerate_security_notes()
    (string-prefix block ~L3174–L3184).

Environment

  • N/A — code-level correctness bug (not host/runtime specific).
  • Branch: main (observed at local 97986d90).
  • Affected crates: openshell-sandbox, openshell-server.
  • Canonical helper already present in openshell-core (src/net.rs).

Logs

N/A — advisory-text generation path, reproduced by function inputs above.

Proposed fix (for the implementer)

  1. In both generate_security_notes() functions, replace the starts_with(...)
    block with a parse-then-classify approach using the existing
    openshell_core::net helpers, e.g.:
    • parse host as IpAddr; if it parses, use is_internal_ip(ip) (and/or
      is_always_blocked_ip(ip)) to decide whether to add the internal/private note;
    • for non-IP hosts, only special-case known names (localhost,
      is_known_metadata_hostname) rather than prefix-matching.
  2. De-duplicate: both crates already depend on openshell-core, so there is no
    reason to keep two divergent copies of this logic. Consider a single shared
    helper that returns the advisory note (or at least route both call sites
    through openshell_core::net).
  3. Add unit tests covering the cases in "Reproduction Steps" (hostname false
    positives, the 172.0/8 vs 172.16/12 boundary, CGNAT, and IPv6 ULA).
  4. No change is expected in proxy.rs; it already uses the canonical helpers.
    Confirm mise run test passes.

Agent-First Checklist

  • I pointed my agent at the repo and had it investigate this issue
  • I reviewed the existing IP-classification code (openshell-core/src/net.rs,
    proxy.rs) and confirmed the canonical helper already exists
  • The diagnostic above explains the root cause and the proposed consolidation

Metadata

Metadata

Assignees

No one assigned

    Labels

    state:triage-neededOpened without agent diagnostics and needs triage

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions