feat(orthrus): server-side Docker proxy listener for agent sessions #1027

Merged
Wikid82 merged 7 commits into development from feature/hecate on May 19, 2026

Wikid82 (Owner) commented May 18, 2026

Problem

After switching a RemoteServer from a generic socat socket proxy to an Orthrus agent, all Docker operations fail with:

Cannot connect to Docker. Please ensure Docker is running and the socket is accessible (e.g., /var/run/docker.sock is mounted).

Root cause: The Orthrus agent already handles streamTypeDocker = 0x01: when the server opens a yamux stream with that type byte, the agent bridges it to its local /var/run/docker.sock. However, the server-side half was a stub (the proxyPort field was always 0 and GetProxyAddr() always returned ""). Charon's Docker handler had no way to route requests through an agent and fell back to tcp://host:port, which points at nothing.


What This PR Does

backend/internal/orthrus/session.go

  • StartDockerProxy() — binds an ephemeral 127.0.0.1:N TCP listener when an agent connects; idempotency guard (s.listener != nil) prevents double-allocation
  • runProxyListener() — accept loop that spawns proxyConn goroutines
  • proxyConn() — for each TCP connection: opens a yamux stream, writes type byte 0x01, bidirectionally copies via two io.Copy goroutines + sync.WaitGroup
  • GetProxyAddr() — returns "127.0.0.1:N" once the proxy is live; empty string otherwise
  • Close() — closes the listener under the mutex; in-flight proxyConn goroutines complete asynchronously as yamux stream close propagates
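
The proxyConn flow described above (open a stream, write the type byte, copy in both directions) can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the `streamOpener` interface is a hypothetical stand-in for the yamux session (yamux's `Session.Open` has the same `(net.Conn, error)` shape), and `pipeOpener` and `demo` exist only so the sketch runs without an agent.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"sync"
)

// streamTypeDocker is the type byte named in the PR description.
const streamTypeDocker byte = 0x01

// streamOpener is a hypothetical stand-in for the yamux session.
type streamOpener interface {
	Open() (net.Conn, error)
}

// proxyConn tunnels one accepted connection through a fresh stream.
func proxyConn(conn net.Conn, sess streamOpener) error {
	defer func() { _ = conn.Close() }()

	stream, err := sess.Open()
	if err != nil {
		return fmt.Errorf("open stream: %w", err)
	}
	defer func() { _ = stream.Close() }()

	// The first byte tells the agent which local service to bridge to.
	if _, err := stream.Write([]byte{streamTypeDocker}); err != nil {
		return fmt.Errorf("write stream type: %w", err)
	}

	var wg sync.WaitGroup
	wg.Add(2)
	go func() { // client -> agent
		defer wg.Done()
		_, _ = io.Copy(stream, conn)
		_ = stream.Close() // unblocks the opposite direction
	}()
	go func() { // agent -> client
		defer wg.Done()
		_, _ = io.Copy(conn, stream)
		_ = conn.Close()
	}()
	wg.Wait()
	return nil
}

// pipeOpener fakes a session with in-memory pipes (illustration only).
type pipeOpener struct{ agent chan net.Conn }

func (p pipeOpener) Open() (net.Conn, error) {
	server, agent := net.Pipe()
	p.agent <- agent
	return server, nil
}

// demo proxies one connection and returns the first byte the fake agent sees.
func demo() (byte, error) {
	opener := pipeOpener{agent: make(chan net.Conn, 1)}
	client, local := net.Pipe()
	done := make(chan error, 1)
	go func() { done <- proxyConn(local, opener) }()

	agent := <-opener.agent
	first := make([]byte, 1)
	if _, err := io.ReadFull(agent, first); err != nil {
		return 0, err
	}
	_ = client.Close() // client hangs up, which unwinds both copy loops
	_ = agent.Close()
	return first[0], <-done
}

func main() {
	b, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("stream type byte: 0x%02x\n", b)
}
```

Closing the opposite side when one copy loop ends is one way to make sure both goroutines finish; the real implementation may rely on yamux close propagation instead, as the Close() bullet suggests.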

backend/internal/orthrus/server.go

  • Calls sess.StartDockerProxy() after each agent WebSocket connects; logs but does not abort on failure (Docker is optional)

backend/internal/api/handlers/docker_handler.go

  • Adds orthrusProxyResolver interface + SetOrthrusResolver() setter
  • ListContainers (and other Docker methods) now branch on connection_type == "orthrus":
    • OrthrusAgentUUID == nil → 400
    • orthrusResolver == nil (no encryption key) → 503 with explanation
    • Agent not in sessions map → 502 "Orthrus agent is not currently connected"
    • Agent connected → uses tcp://127.0.0.1:N as Docker host
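
The branching listed above can be summarized as a small pure function. This is a sketch under assumptions: the `remoteServer` struct, the `proxyResolver` method signature (taking an agent UUID), the `resolveDockerHost` helper, and the error texts are all hypothetical simplifications, not the handler code in docker_handler.go.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// proxyResolver is a hypothetical reduction of orthrusProxyResolver.
type proxyResolver interface {
	GetProxyAddr(agentUUID string) string
}

// remoteServer carries only the fields this sketch needs (assumed names).
type remoteServer struct {
	ConnectionType   string
	Host             string
	Port             int
	OrthrusAgentUUID *string
}

// resolveDockerHost returns the Docker host URL, or an HTTP status and error.
func resolveDockerHost(rs remoteServer, r proxyResolver) (string, int, error) {
	if rs.ConnectionType != "orthrus" {
		return fmt.Sprintf("tcp://%s:%d", rs.Host, rs.Port), http.StatusOK, nil
	}
	if rs.OrthrusAgentUUID == nil || *rs.OrthrusAgentUUID == "" {
		return "", http.StatusBadRequest, errors.New("orthrus agent UUID is missing")
	}
	if r == nil {
		return "", http.StatusServiceUnavailable, errors.New("orthrus is not configured")
	}
	addr := r.GetProxyAddr(*rs.OrthrusAgentUUID)
	if addr == "" {
		return "", http.StatusBadGateway, errors.New("orthrus agent is not currently connected")
	}
	return "tcp://" + addr, http.StatusOK, nil
}

// mapResolver is a toy resolver keyed by agent UUID (illustration only).
type mapResolver map[string]string

func (m mapResolver) GetProxyAddr(id string) string { return m[id] }

func main() {
	id := "agent-1"
	rs := remoteServer{ConnectionType: "orthrus", OrthrusAgentUUID: &id}

	host, status, err := resolveDockerHost(rs, mapResolver{"agent-1": "127.0.0.1:40123"})
	fmt.Println(host, status, err)

	_, status, err = resolveDockerHost(rs, mapResolver{})
	fmt.Println(status, err)
}
```

Keeping the decision in one function that returns a status code makes each branch easy to cover with a table-driven test.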

backend/internal/api/routes/routes.go

  • Hoists orthrusServer declaration before the encryption-key block; wires it into dockerHandler after construction (nil is safe — handler returns 503)

Tests

12 new acceptance criteria, all green:

| Test | Covers |
| --- | --- |
| TestAgentSession_StartDockerProxy_ListensOnLoopback | Listener bound to 127.0.0.1 |
| TestAgentSession_StartDockerProxy_GetProxyAddr | Address returned after start |
| TestAgentSession_StartDockerProxy_CalledTwice | Idempotency: second call returns error, address unchanged |
| TestAgentSession_Close_ClosesListener | Listener closed on session close |
| TestAgentSession_ProxyConn_WritesStreamTypeByte | First byte written to yamux stream is 0x01 |
| TestDockerHandler_ListContainers_OrthrusNotConnected | 502 when agent offline |
| TestDockerHandler_ListContainers_OrthrusNoResolver | 503 when unconfigured |
| TestDockerHandler_ListContainers_OrthrusConnected | Docker client uses proxy address |
| TestDockerHandler_ListContainers_OrthrusMissingUUID | 400 for nil agent UUID |
| + 3 integration stubs (//go:build integration) | End-to-end proxy flow |

Coverage: 88.6% lines / 88.5% statements (gate: 87%). Race detector: clean.


Security Notes

  • TCP listener is bound exclusively to 127.0.0.1 (loopback) — never exposed outside the host
  • Ephemeral port (:0) — no fixed port to block or exploit
  • Agent-side muzzle filter already restricts allowed Docker API paths — unchanged
  • No credentials in proxy stream; authentication is at the WebSocket layer

Checklist

  • All pre-commit hooks pass (go-vet, golangci-lint, semgrep)
  • 88.6% backend coverage ≥ 87% gate
  • Race detector clean
  • GORM security scan: no CRITICAL/HIGH findings
  • Trivy FS scan: 0 actionable findings
  • E2E tests: pass in CI (local Playwright 1.60 blocked by Ubuntu 26.04 infra issue)

defer conn.Close() and stream.Close() in proxyConn are cleanup-only
deferred calls where the error is not actionable; use _ = x.Close()
idiom to satisfy golangci-lint errcheck rule
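
For readers unfamiliar with the idiom, a small sketch of the pattern follows. The `noisyCloser` type and `use` function are hypothetical and exist only to show the shape; whether errcheck reports a bare `defer c.Close()` depends on the linter configuration.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// noisyCloser is a hypothetical closer whose Close always returns an error,
// as can happen when the peer has already torn the connection down.
type noisyCloser struct{ closed bool }

func (c *noisyCloser) Close() error {
	c.closed = true
	return errors.New("already closed by peer")
}

// use shows the cleanup-only pattern: the Close error is not actionable here,
// so it is discarded explicitly instead of being dropped silently.
func use(c io.Closer) {
	defer func() { _ = c.Close() }()
	fmt.Println("working with the connection")
}

func main() {
	c := &noisyCloser{}
	use(c)
	fmt.Println("closed:", c.closed)
}
```

The explicit `_ =` makes the decision to ignore the error visible to reviewers and to the linter.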

github-actions Bot (Contributor) commented May 18, 2026

✅ Supply Chain Verification Results

PASSED

📦 SBOM Summary

  • Components: 1487

🔍 Vulnerability Scan

| Severity | Count |
| --- | --- |
| 🔴 Critical | 0 |
| 🟠 High | 0 |
| 🟡 Medium | 4 |
| 🟢 Low | 2 |
| Total | 6 |

📎 Artifacts

  • SBOM (CycloneDX JSON) and Grype results available in workflow artifacts

Generated by Supply Chain Verification workflow

codecov Bot commented May 18, 2026

Codecov Report

❌ Patch coverage is 95.83333% with 1 line in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| backend/internal/api/handlers/docker_handler.go | 95.23% | 1 Missing ⚠️ |


@Wikid82 Wikid82 changed the title from "feat(orthrus): server-side Docker proxy listener for agent sessions (PR 5)" to "feat(orthrus): server-side Docker proxy listener for agent sessions" May 18, 2026
@Wikid82 Wikid82 self-assigned this May 18, 2026
@Wikid82 Wikid82 added this to Charon May 18, 2026
@github-project-automation github-project-automation Bot moved this to Backlog in Charon May 18, 2026
@Wikid82 Wikid82 moved this from Backlog to In Review in Charon May 19, 2026
… Orthrus UUID

- Add TestAgentSession_ProxyConn_OpenFails: exercises the early-return
  path inside proxyConn when the underlying yamux session is already
  closed, causing Open() to return ErrSessionShutdown immediately

- Add TestAgentSession_ProxyConn_WriteFails: exercises the early-return
  path when stream.Write() fails; the test goroutine consumes the 12-byte
  yamux SYN frame from the pipe (allowing Open to succeed) then closes the
  connection so the subsequent DATA frame write returns an error

- Add TestDockerHandler_ListContainers_OrthrusEmptyAgentUUID: exercises
  the right-hand branch of the OrthrusAgentUUID guard (non-nil pointer to
  empty string), covering the 400 response path that was previously only
  half-covered by the nil-pointer test

Together these tests bring Codecov patch coverage for the Orthrus feature
branch from 83% toward the 90% threshold required for merge

Closes coverage gap on PR #1027
@Wikid82 Wikid82 marked this pull request as ready for review May 19, 2026 10:38
Copilot AI review requested due to automatic review settings May 19, 2026 10:38
Copilot AI (Contributor) left a comment

Pull request overview

Implements the server-side half of the Orthrus agent Docker tunnel: when an agent WebSocket connects, the server allocates an ephemeral loopback TCP listener that proxies each TCP connection into a fresh yamux stream prefixed with the 0x01 Docker stream-type byte. The Docker handler now routes connection_type == "orthrus" remote servers through that loopback address instead of tcp://host:port, with explicit 400/502/503 errors for misconfiguration, offline agents, and disabled subsystems.

Changes:

  • Add StartDockerProxy/runProxyListener/proxyConn/listener lifecycle to AgentSession and wire the listener start/stop into the WebSocket connect and heartbeat-watcher paths.
  • Add orthrusProxyResolver interface + SetOrthrusResolver to DockerHandler and branch on ConnectionType when resolving the Docker host; hoist orthrusServer in routes.go to wire it.
  • Add unit tests, an integration-build stub, an extensive feature spec, and a QA/security audit report; minor unrelated test-file whitespace/struct-alignment cleanups.

Reviewed changes

Copilot reviewed 18 out of 21 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| backend/internal/orthrus/session.go | Adds streamTypeDocker, listener field, StartDockerProxy, runProxyListener, proxyConn; updates Close/GetProxyAddr sentinel |
| backend/internal/orthrus/session_test.go | New tests covering loopback bind, idempotency, close-stops-listener, and type-byte write |
| backend/internal/orthrus/session_coverage_test.go | Extra coverage for proxyConn Open/Write failure paths; updates existing test for new sentinel |
| backend/internal/orthrus/server.go | Calls StartDockerProxy on connect; calls sess.Close() in heartbeat-watcher cleanup |
| backend/internal/orthrus/server_test.go | New end-to-end WebSocket test asserting a loopback proxy address is registered |
| backend/internal/orthrus/server_coverage_test.go | Updates existing GetProxyAddr test for new listener sentinel |
| backend/internal/orthrus/proxy_integration_test.go | New //go:build integration skip-stub |
| backend/internal/api/handlers/docker_handler.go | Adds orthrusProxyResolver interface, setter, and ConnectionType switch with 400/502/503 branches |
| backend/internal/api/handlers/docker_handler_test.go | New tests for connected/offline/no-resolver/nil-UUID/empty-UUID and nil setter cases |
| backend/internal/api/routes/routes.go | Hoists orthrusServer declaration and wires SetOrthrusResolver |
| backend/internal/config/config.go | Re-aligns struct fields (adds CertExpiryWarningDays) |
| backend/internal/api/handlers/crowdsec_stop_lapi_test.go | Struct-field alignment formatting |
| backend/internal/api/handlers/crowdsec_coverage_target_test.go | Whitespace/indent cleanup |
| backend/internal/api/handlers/crowdsec_wave5_test.go, security_notifications_single_source_test.go, system_permissions_*_test.go, certificate_validator_extra_coverage_test.go | Blank-line whitespace cleanups only |
| SECURITY.md | Updates Last Updated date |
| docs/reports/qa-security-audit-2026-05-18.md | New QA/security audit report |
| docs/plans/current_spec.md | Replaces archived CI-fix spec with the Orthrus Docker proxy feature spec |

Comment thread backend/internal/api/routes/routes.go Outdated
…nter

Unconditionally passing orthrusServer (a nil *OrthrusServer) to
SetOrthrusResolver stored a non-nil interface with a nil concrete
pointer, defeating the h.orthrusResolver == nil guard and causing
a panic on GetProxyAddr. Two complementary fixes applied:

1. Call site (routes.go): only invoke SetOrthrusResolver when
   orthrusServer is non-nil, consistent with the existing guard on
   SetOrthrusServer.

2. Defense-in-depth (docker_handler.go): SetOrthrusResolver now
   normalizes typed-nil concrete pointers to a proper nil interface
   via reflect, preventing the trap regardless of call site.

Adds a typed-nil regression test that verifies the handler returns
503 rather than panicking when SetOrthrusResolver receives a nil
*orthrus.OrthrusServer.
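
The typed-nil trap described in this commit message is a general Go behavior and can be reproduced in isolation. The sketch below uses hypothetical, simplified types (`proxyResolver`, `orthrusServer`, `dockerHandler`) and shows one possible reflect-based normalization; it is not the code from docker_handler.go.

```go
package main

import (
	"fmt"
	"reflect"
)

// proxyResolver is a hypothetical reduction of orthrusProxyResolver.
type proxyResolver interface {
	GetProxyAddr(agentUUID string) string
}

// orthrusServer is a hypothetical stand-in for the real server type.
type orthrusServer struct{ sessions map[string]string }

// GetProxyAddr dereferences s, so calling it on a nil pointer panics.
func (s *orthrusServer) GetProxyAddr(agentUUID string) string {
	return s.sessions[agentUUID]
}

type dockerHandler struct{ resolver proxyResolver }

// SetOrthrusResolver stores r, collapsing a nil pointer wrapped in a
// non-nil interface into a true nil interface.
func (h *dockerHandler) SetOrthrusResolver(r proxyResolver) {
	if r == nil {
		h.resolver = nil
		return
	}
	if v := reflect.ValueOf(r); v.Kind() == reflect.Ptr && v.IsNil() {
		h.resolver = nil
		return
	}
	h.resolver = r
}

func main() {
	var srv *orthrusServer // nil concrete pointer

	// Assigning it to an interface yields a non-nil interface value,
	// because the interface still carries the concrete type.
	var r proxyResolver = srv
	fmt.Println("interface is nil:", r == nil) // false

	h := &dockerHandler{}
	h.SetOrthrusResolver(srv)
	fmt.Println("stored resolver is nil:", h.resolver == nil) // true
}
```

With the normalization in place, a guard such as `h.resolver == nil` behaves as intended even if a caller passes a nil pointer.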
@Wikid82 Wikid82 merged commit 30d8fe3 into development May 19, 2026
39 checks passed
@github-project-automation github-project-automation Bot moved this from In Review to Done in Charon May 19, 2026