feat(ssh): add SSH agent proxy with host-based filtering by dpup · Pull Request #34 · majorcontext/moat

dpup · 2026-01-19T21:54:05Z

Add SSH agent proxy that enables containers to use SSH keys for git operations without exposing private keys
Keys are filtered by granted hosts—containers only see keys for hosts they're authorized to access
Sign requests are forwarded to the user's real SSH agent; private keys never enter the container
All SSH agent operations are logged to the tamper-proof audit store

How it works

  # Grant SSH access to a host (one time)
  moat grant ssh --host github.com

  # Use in runs
  moat run --grant ssh:github.com -- git clone git@github.com:org/repo.git

The proxy connects to the user's local SSH agent and exposes a filtered socket to the container. Only keys mapped to granted hosts are visible, and sign requests are validated against the grant before forwarding.
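The filtering step described above can be sketched roughly as follows. This is a minimal illustration, not the code in internal/sshagent: the `filter` type, `newFilter`, `visible`, and `checkSign` are hypothetical names, and keys are represented only by their fingerprint strings.

```go
package main

import "fmt"

// filter is a hypothetical stand-in for the proxy's grant check.
type filter struct {
	// allowed maps a key fingerprint to the granted hosts it may sign for.
	allowed map[string]map[string]bool
}

// newFilter builds the allow-list from the stored host-to-key mapping and
// the hosts granted to this run. Hosts without a stored key are skipped.
func newFilter(hostKeys map[string]string, granted []string) *filter {
	f := &filter{allowed: make(map[string]map[string]bool)}
	for _, host := range granted {
		fp, ok := hostKeys[host]
		if !ok {
			continue
		}
		if f.allowed[fp] == nil {
			f.allowed[fp] = make(map[string]bool)
		}
		f.allowed[fp][host] = true
	}
	return f
}

// visible returns the subset of upstream fingerprints the container may list.
func (f *filter) visible(upstream []string) []string {
	var out []string
	for _, fp := range upstream {
		if len(f.allowed[fp]) > 0 {
			out = append(out, fp)
		}
	}
	return out
}

// checkSign reports whether a sign request for fp aimed at host may be
// forwarded to the upstream agent.
func (f *filter) checkSign(fp, host string) error {
	if !f.allowed[fp][host] {
		return fmt.Errorf("key %s is not granted for host %q", fp, host)
	}
	return nil
}

func main() {
	hostKeys := map[string]string{
		"github.com": "SHA256:aaa",
		"gitlab.com": "SHA256:bbb",
	}
	f := newFilter(hostKeys, []string{"github.com"})

	fmt.Println(f.visible([]string{"SHA256:aaa", "SHA256:bbb"})) // [SHA256:aaa]
	fmt.Println(f.checkSign("SHA256:aaa", "github.com"))         // <nil>
	fmt.Println(f.checkSign("SHA256:bbb", "gitlab.com"))         // denied: gitlab.com was not granted
}
```

Keeping the list check and the sign check on the same allow-list means a key the container cannot see can also never be used to sign.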

Runtime modes:

Docker on Linux / Apple containers: Unix socket mounted into container
Docker on macOS/Windows: TCP bridge with socat (Unix sockets can't cross the VM boundary)

Changes

internal/sshagent/ - New package implementing the filtering SSH agent proxy
internal/credential/ssh.go - SSH host-to-key mapping storage
internal/run/manager.go - SSH proxy lifecycle integration
internal/audit/ - SSH event types for audit logging
cmd/moat/cli/grant_ssh.go - moat grant ssh --host command
internal/deps/ - Adds openssh-client and socat to images when SSH grants are present

claude · 2026-01-19T21:58:37Z

PR Review: SSH Agent Proxy Feature

This is an excellent implementation of SSH agent proxying with host-based filtering. The code quality is high, security considerations are well thought out, and the integration is clean. Here's my detailed feedback:

✅ Strengths

Architecture & Design

Clean abstraction layers: The separation between Proxy, Server, and AgentClient is well-designed and testable
Runtime-aware implementation: Smart handling of Docker vs Apple containers and TCP vs Unix socket modes
Minimal interface: AgentClient interface is focused and makes testing straightforward
Proper lifecycle management: SSH server cleanup is handled in all code paths (Create errors, Stop, Destroy)

Security

Defense in depth: Private keys never enter containers - only sign requests are forwarded
Host-based filtering: Keys are filtered per-host, preventing unauthorized access
Audit logging: All SSH operations (list, sign_allowed, sign_denied) are logged to tamper-proof audit store
Socket permissions: Properly set to 0666 for container access while limiting scope via directory isolation
Single-host fallback: Reasonable fallback for non-git SSH when key maps to exactly one host (proxy.go:131-134)

Code Quality

Comprehensive tests: Unit tests cover filtering, signing, host validation, upstream errors, and fallback scenarios
Good error messages: Clear, actionable errors (e.g., "SSH grants require SSH_AUTH_SOCK...")
Proper concurrency: Uses sync.RWMutex correctly, atomic.Value for current host
Cleanup handling: Socket files removed on shutdown, directories cleaned up on error

Documentation

Clear examples: README shows both CLI usage and agent.yaml configuration
Design docs: Detailed planning documents explain architecture decisions
Code comments: Security model documented in server.go:82-87

🔍 Issues & Recommendations

1. Potential Resource Leak (Medium Priority)

Location: internal/run/manager.go:440-446

When connecting to the upstream SSH agent, if a subsequent error occurs, the agent connection may not be closed:

upstreamAgent, err := sshagent.ConnectAgent(upstreamSocket)
if err != nil {
    if proxyServer != nil {
        _ = proxyServer.Stop(context.Background())
    }
    return nil, fmt.Errorf("connecting to SSH agent: %w", err)
}
// If error occurs after this, upstreamAgent.Close() may not be called

Recommendation: Add cleanup for upstreamAgent in all error paths after it's created, or use a defer with error handling:

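// Note: this relies on the enclosing function declaring a named error
// result (e.g. "err error"); otherwise a return such as
// "return nil, fmt.Errorf(...)" never assigns to the err this closure sees.
defer func() {
    if err != nil && upstreamAgent != nil {
        _ = upstreamAgent.Close()
    }
}()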
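A self-contained sketch of that deferred-cleanup pattern, assuming a named error result. `fakeAgent` and `setup` are hypothetical stand-ins for the upstream agent connection and the proxy setup code in manager.go; they are not taken from the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// fakeAgent stands in for the upstream agent connection; closed records
// whether Close was called.
type fakeAgent struct{ closed bool }

func (a *fakeAgent) Close() error {
	a.closed = true
	return nil
}

// setup mimics the shape of the proxy setup. The named result err is what
// lets the deferred function observe the error actually being returned.
func setup(agent *fakeAgent, failLater bool) (result string, err error) {
	defer func() {
		if err != nil {
			_ = agent.Close() // release the connection on any failure path
		}
	}()

	if failLater {
		return "", errors.New("starting proxy server: simulated failure")
	}
	return "ready", nil
}

func main() {
	a := &fakeAgent{}
	_, err := setup(a, true)
	fmt.Println(err != nil, a.closed) // true true
}
```

On the success path the connection stays open, so ownership can pass to whatever later stops the proxy.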

2. Race Condition in Audit Callback (Low Priority)

Location: internal/sshagent/proxy.go:55-62

The audit() method reads auditFunc under RLock, but the function could theoretically be called while being set. While unlikely in practice (it's set once during initialization), consider setting it during construction:

// Option 1: Pass audit func to NewProxy
func NewProxy(upstream AgentClient, auditFunc AuditFunc) *Proxy

// Option 2: Use sync.Once for the first call
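Option 1 might look something like the sketch below. The `AuditFunc` and `Proxy` names come from the review, but the field layout and the two-string callback signature are assumptions, and the upstream `AgentClient` parameter is left out to keep the example short.

```go
package main

import "fmt"

// AuditFunc receives one audit event. The (action, host) signature is an
// assumption made for this sketch.
type AuditFunc func(action, host string)

// Proxy holds the callback as a field that is written once, in NewProxy,
// and only read afterwards, so no lock is needed around it.
type Proxy struct {
	auditFunc AuditFunc
}

// NewProxy substitutes a no-op callback for nil so audit never nil-checks.
func NewProxy(auditFunc AuditFunc) *Proxy {
	if auditFunc == nil {
		auditFunc = func(string, string) {}
	}
	return &Proxy{auditFunc: auditFunc}
}

func (p *Proxy) audit(action, host string) {
	p.auditFunc(action, host)
}

func main() {
	p := NewProxy(func(action, host string) {
		fmt.Printf("audit: %s %s\n", action, host)
	})
	p.audit("sign_allowed", "github.com") // prints "audit: sign_allowed github.com"
}
```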

3. Missing Error Check (Low Priority)

Location: internal/run/manager.go:477

_, tcpPort, _ := parseHostPort(tcpAddr)

The error from parseHostPort is silently ignored. If parsing fails, the proxy may not work correctly.

Recommendation: Check the error and handle appropriately:

_, tcpPort, err := parseHostPort(tcpAddr)
if err != nil {
    // cleanup and return error
}
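For reference, a helper of this shape could be written with the standard library alone. This `parseHostPort` is a hypothetical reimplementation; the project's helper of the same name may behave differently.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// parseHostPort splits "host:port" and converts the port to an int,
// wrapping each failure so the caller can report what was malformed.
func parseHostPort(addr string) (string, int, error) {
	host, portStr, err := net.SplitHostPort(addr)
	if err != nil {
		return "", 0, fmt.Errorf("parsing address %q: %w", addr, err)
	}
	port, err := strconv.Atoi(portStr)
	if err != nil {
		return "", 0, fmt.Errorf("parsing port in %q: %w", addr, err)
	}
	return host, port, nil
}

func main() {
	host, port, err := parseHostPort("127.0.0.1:4022")
	fmt.Println(host, port, err) // 127.0.0.1 4022 <nil>

	_, _, err = parseHostPort("127.0.0.1")
	fmt.Println(err != nil) // true: the port is missing
}
```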

4. Test Permissions Check is Weak (Low Priority)

Location: internal/sshagent/server_test.go:76-81

if mode&0077 != 0 {
    t.Logf("Socket permissions: %o (note: some systems allow different permissions)", mode)
}

This test logs a message but doesn't fail. The comment in server.go:88 states permissions should be 0666, but the test expects 0600 or similar. There's a mismatch between intention and validation.

Recommendation: Either:

Make the test fail if permissions are incorrect
Or update the test to verify the actual requirement (0666)

5. Incomplete Host Parsing Error Handling (Low Priority)

Location: cmd/moat/cli/grant_ssh.go:157-171

The fingerprintFromAuthorizedKey function returns an empty string on error but doesn't indicate what went wrong (bad format, bad base64, etc.). Consider returning an error for better diagnostics.

6. Socket Directory Cleanup (Low Priority)

Location: internal/run/manager.go:494-502

The socket directory is created but only cleaned up on immediate error. If the run fails later, the directory may be left behind. Consider cleanup in Destroy().

📝 Minor Suggestions

Code Style

Consistent error wrapping: Most errors use fmt.Errorf with %w, but a few use %v. Stick with %w for better error chains.
Magic numbers: Consider constants for:
- Socket permissions (0666 in server.go:88)
- TCP bind address ("127.0.0.1:0" in manager.go:462)

Testing

E2E test coverage: The E2E tests verify basic setup but don't test actual SSH operations (e.g., git clone). Consider adding a test that performs a real SSH operation if feasible.
Mock consistency: The mock agent in tests is defined inline. Consider moving it to a test helper file for reuse across test files.

Documentation

Security model: The proxy security model differs between runtimes (localhost vs 0.0.0.0). This is documented in CLAUDE.md but might be worth a comment in the code near the binding logic (manager.go:287-297).
TCP bridge mode: The socat bridge setup for Docker on macOS is clever but could use more explanation in the code comments about why it's needed (VM boundary issue).

🎯 Performance Considerations

No obvious performance issues: The proxy uses efficient Go stdlib SSH libraries
Concurrency: Proper use of RWMutex allows concurrent reads
Memory: No unbounded allocations or leaks detected

🔒 Security Assessment

✅ Security Done Right

✅ Private keys never exposed to containers
✅ Host-based filtering prevents key misuse
✅ Audit logging for compliance/forensics
✅ Proper socket permissions and isolation
✅ No credential storage in containers

⚠️ Security Considerations

Unix socket permissions (0666): While documented as safe due to directory isolation, this is broader than typical SSH agent permissions (0600). Consider if there's any way a malicious process could access the socket directory.
Single-host fallback: The fallback in proxy.go:131-134 is reasonable but could be exploited if a user accidentally grants the same key to multiple hosts. Document this behavior clearly for users.
TCP mode on macOS: TCP binding to 127.0.0.1 is secure, but ensure Docker Desktop's networking doesn't expose this to other containers or the network.

📊 Test Coverage Assessment

Strong Coverage

✅ Proxy filtering logic
✅ Host-based authorization
✅ Upstream error propagation
✅ Server lifecycle
✅ Socket cleanup

Could Be Improved

⚠️ Integration between proxy and actual SSH operations
⚠️ TCP mode end-to-end testing
⚠️ socat bridge setup verification
⚠️ Concurrent access patterns

📚 Conventional Commits

The commit message format is good, but note per CLAUDE.md:

Do not include Co-Authored-By lines for Claude in commit messages

The current commit includes Claude attribution, which should be removed per project conventions.

✨ Overall Assessment

Rating: Excellent (9/10)

This is production-quality code with thoughtful design, good test coverage, and strong security considerations. The issues identified are mostly minor and don't block merging. The architecture is extensible and well-integrated with the existing moat codebase.

Recommended Action

Approve with minor fixes

Address the resource leak (issue #1) and error handling (issue #3) before merging. Other issues can be handled in follow-up PRs if desired.

Great work! 🎉

claude · 2026-01-19T22:15:50Z

Code Review: SSH Agent Proxy Implementation

I've reviewed this PR and found it to be a well-architected implementation of SSH agent forwarding with host-based filtering. Here's my detailed feedback:

✅ Strengths

Security Model

Strong security design: Private keys never enter containers, only signing requests are forwarded
Host-based filtering correctly limits key visibility per grant
Audit logging captures all SSH operations (list, sign_allowed, sign_denied) for tamper-proof tracking
Proper use of cryptographic fingerprints (SHA256) for key identification
Fallback logic for single-host keys is sensible (proxy.go:131-134)

Code Quality

Clean separation of concerns: protocol parsing, proxy filtering, server lifecycle
Proper use of concurrency primitives: atomic.Value for currentHost, sync.RWMutex for allowedKeys
Comprehensive test coverage with unit tests for filtering, signing, and protocol handling
Good error messages with actionable guidance (grant_ssh.go:53-56, 71-74, 105-106)
Proper cleanup in error paths and graceful shutdown

Architecture

Runtime-aware TCP vs Unix socket handling for Docker cross-platform support
Integration with existing audit and credential infrastructure
Deterministic image tagging includes SSH dependencies (builder.go:37-39)

🔍 Issues & Suggestions

1. Race Condition in Host Tracking (Medium Priority)

The currentHost approach assumes SSH connections are sequential per container. If multiple SSH operations occur concurrently, SetCurrentHost could race:

// Thread 1: SetCurrentHost("github.com")
// Thread 2: SetCurrentHost("gitlab.com") 
// Thread 1: Sign() - might use wrong host!

Concern: The PR description mentions host tracking via an SSH wrapper, but I don't see SetCurrentHost being called anywhere in the codebase. How is the current host being communicated to the proxy?

Recommendation:

If relying on an external wrapper calling SetCurrentHost: Add documentation explaining this requirement and the threading assumption
Consider per-connection context instead of shared state, or use connection-scoped proxies
Add integration tests validating concurrent SSH operations to different hosts
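The per-connection alternative could look roughly like this. It is only a sketch: `connProxy` is a hypothetical type, and how each connection learns its target host is exactly the open question raised in this review, so the host is simply passed in here.

```go
package main

import (
	"fmt"
	"sync"
)

// connProxy is a hypothetical per-connection view of the proxy. Each
// accepted connection gets its own value, so the target host is never
// shared between concurrent SSH operations.
type connProxy struct {
	host    string
	allowed map[string]map[string]bool // fingerprint -> granted hosts; read-only
}

// sign checks the request against this connection's own host.
func (c *connProxy) sign(fp string) error {
	if !c.allowed[fp][c.host] {
		return fmt.Errorf("key %s is not granted for host %q", fp, c.host)
	}
	return nil
}

func main() {
	allowed := map[string]map[string]bool{
		"SHA256:aaa": {"github.com": true},
		"SHA256:bbb": {"gitlab.com": true},
	}

	var wg sync.WaitGroup
	for _, host := range []string{"github.com", "gitlab.com"} {
		wg.Add(1)
		go func(host string) {
			defer wg.Done()
			c := &connProxy{host: host, allowed: allowed}
			// Only the github.com connection may use key aaa; output order
			// depends on goroutine scheduling.
			fmt.Println(host, c.sign("SHA256:aaa") == nil)
		}(host)
	}
	wg.Wait()
}
```

Because the shared map is only read after construction, the connections need no locking between them.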

2. Missing Host Tracking Implementation (High Priority)

# This search should show how SetCurrentHost is called:
$ grep -r "SetCurrentHost" internal/
# Only found in tests and the proxy itself

The planning docs mention an ssh-wrapper.sh, but it's not in the changeset. Without this:

Sign requests will only work for single-host keys (fallback on line 131)
Multi-host keys will fail with "cannot determine target" error

Questions:

Is the wrapper script implementation pending?
How does the container communicate the target host to the proxy?
Should there be environment-based host tracking?

3. Socket Permissions Security (Low Priority)

server.go:88 sets socket permissions to 0666 (world-readable/writable):

if err := os.Chmod(s.socketPath, 0666); err != nil {

Analysis: The comment correctly notes security is maintained by:

Per-run socket directories (~/.moat/sockets/<run-id>/)
Proxy-enforced host filtering
Directory only mounted to specific container

Suggestion: Consider adding a reference to the proxy security model doc or expanding the comment to note this is required for cross-UID access.

4. TCP Server Binding (Low - Informational)

For Docker mode, the TCP server binds to 127.0.0.1:0 (manager.go:462). The comment in the PR description mentions this is for macOS/Windows Docker-in-VM scenarios.

Question: On Linux, does Docker Desktop also require TCP mode, or does it use Unix sockets? Might be worth documenting the runtime detection logic.

5. Error Handling in socat Bridge (Low Priority)

The moat-init.sh script starts socat in the background but doesn't validate it stays running:

socat UNIX-LISTEN:/run/moat/ssh/agent.sock,fork TCP:"$MOAT_SSH_TCP_ADDR" &

If socat fails after the health check loop, SSH operations will fail silently. Consider:

Trapping signals and checking socat exit status
More robust health checking beyond socket existence

6. Fingerprint Parsing (Low Priority)

grant_ssh.go:166 decodes base64 from authorized_keys format:

keyBlob, err := base64.StdEncoding.DecodeString(parts[1])

SSH public keys can use StdEncoding or RawStdEncoding. While standard ssh-keygen uses standard encoding, consider using ssh.ParseAuthorizedKey from golang.org/x/crypto/ssh for more robust parsing.

7. Audit Data Consistency (Low Priority)

The SSHData struct in audit/entry.go:58-64 uses omitempty tags. For tamper-proof logs, consider always including fields (even if empty) to ensure consistent hash computation across all operations.
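To illustrate the difference in serialized form, here is a small sketch with encoding/json. The field names are illustrative only; the real SSHData struct may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sshDataOmit drops empty fields from the encoded entry.
type sshDataOmit struct {
	Action      string `json:"action"`
	Host        string `json:"host,omitempty"`
	Fingerprint string `json:"fingerprint,omitempty"`
}

// sshDataFull always emits every field, so all entries share one shape.
type sshDataFull struct {
	Action      string `json:"action"`
	Host        string `json:"host"`
	Fingerprint string `json:"fingerprint"`
}

// encode marshals v and panics on failure, which cannot happen for these
// plain string structs.
func encode(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(encode(sshDataOmit{Action: "list"})) // {"action":"list"}
	fmt.Println(encode(sshDataFull{Action: "list"})) // {"action":"list","host":"","fingerprint":""}
}
```

Either encoding is deterministic for a given value; the second simply makes an absent field indistinguishable from a removed one only if the verifier also expects the fixed shape.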

📝 Documentation

Strengths:

README examples are clear and actionable
Error messages follow project guidelines (tell users exactly what to set and how)
Inline comments explain security trade-offs

Suggestions:

Document the host tracking mechanism once implemented
Add architecture diagram showing proxy flow (user agent → filtering proxy → container)
Consider adding troubleshooting section for common issues (agent not running, key not added, socat failures)

🧪 Testing

Coverage: Good unit test coverage for core logic (filtering, signing, protocols)

Missing:

Integration tests for TCP vs Unix socket modes
Concurrent sign request testing
Error path testing (upstream agent failures, network issues)
E2E test with actual git operations

Performance Considerations

Proxy adds minimal latency (one extra hop for List/Sign operations)
Unix socket mode should have near-zero overhead
TCP + socat mode adds two extra hops but is still acceptable for SSH operations

Final Verdict

Overall Assessment: This is a solid implementation with good security fundamentals. The architecture is sound and integrates well with moat's existing systems.

Blocking Issues:

Missing host tracking implementation (Issue #2 above) - unclear if this is complete or pending

Recommended Next Steps:

Clarify/implement the host tracking mechanism
Add integration tests for concurrent scenarios
Consider the suggestions above for production hardening

The PR already has owner approval, so these are suggestions for follow-up improvements rather than blocking concerns.

Code Quality: ⭐⭐⭐⭐ (4/5)
Security: ⭐⭐⭐⭐ (4/5)
Test Coverage: ⭐⭐⭐⭐ (4/5)
Documentation: ⭐⭐⭐⭐ (4/5)

Adds secure SSH agent forwarding to containers with per-host key access control. Users grant access via `moat grant ssh --host github.com` which maps a specific SSH key to a host. At runtime, the proxy only allows signing requests for authorized host/key combinations.

Key features:

- Host-based key filtering (keys only work for granted hosts)
- Automatic container setup when SSH grants are present
- TCP bridge for Docker on macOS (Unix sockets via socat in container)
- Direct Unix socket mounting for Linux/Apple containers
- Audit logging for all SSH agent operations

Usage:

  moat grant ssh --host github.com  # Grant access (one-time)
  moat run ./agent                  # SSH just works if grants present

The agent.yaml only needs:

  grants:
    - ssh:github.com

claude · 2026-01-19T22:57:35Z

PR Review: SSH Agent Proxy with Host-Based Filtering

Overview

This is a well-architected feature that adds SSH agent forwarding with security-conscious host filtering. The implementation follows moat's security model and integrates cleanly with existing credential and audit systems.

Strengths

Architecture & Design

Clean abstraction layers: The sshagent package properly separates concerns
Security-first approach: Host-based filtering prevents unauthorized SSH access
Runtime-aware: Handles Docker/Apple containers correctly with Unix sockets vs TCP
Audit integration: SSH operations flow into tamper-proof audit store (manager.go:840-847)

Code Quality

Good error messages: Follows CLAUDE.md guidelines with actionable messages
Comprehensive testing: Unit tests cover filtering logic, edge cases, E2E scenarios
Type safety: Strong interfaces make code testable and maintainable

Issues & Concerns

1. Socket Permissions Too Permissive (Security)

Location: internal/sshagent/server.go:88

Setting socket permissions to 0666 (world read/write) is overly permissive. While comments explain security relies on directory isolation, this violates defense-in-depth. Any process discovering the socket path can connect.

Recommendation: Use 0600 or 0660 instead. Investigate ACLs or group permissions for cross-user access.

2. Race Condition in Host Tracking (Bug)

Location: internal/sshagent/proxy.go:103

currentHost is set by external wrapper but validated during Sign(). No guarantee host is set before signing. Could cause incorrect validation, wrong audit logs, or unexpected failures.

Recommendation: Pass host as parameter to Sign() instead of shared state, or document that proxy is not safe for concurrent use.

3. TCP Server Binding Unclear

Location: internal/run/manager.go:462

TCP server binds to 127.0.0.1, but comments suggest Apple containers need 0.0.0.0 (like HTTP proxy). Configuration seems inconsistent.

Recommendation: Document how Docker containers access TCP server. Consider token auth like HTTP proxy.

4. Missing Input Validation (Security)

Location: cmd/moat/cli/grant_ssh.go:44

No validation that host is valid hostname. Could accept empty strings, special characters, IPs.

Recommendation: Add hostname validation.
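A validation helper along these lines might be enough. This is a sketch, not the project's code: `validateHost` is a hypothetical name, and rejecting IP addresses is a policy assumption made here that the PR may not share.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"regexp"
	"strings"
)

// labelRE matches one DNS label: 1-63 letters, digits, or hyphens, with no
// leading or trailing hyphen.
var labelRE = regexp.MustCompile(`^[a-zA-Z0-9]([a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?$`)

// validateHost rejects empty strings, over-long names, IP addresses, and
// names containing characters that are not valid in a hostname.
func validateHost(host string) error {
	if host == "" {
		return errors.New("host must not be empty")
	}
	if len(host) > 253 {
		return fmt.Errorf("host is %d characters long; the limit is 253", len(host))
	}
	if net.ParseIP(host) != nil {
		return fmt.Errorf("host %q is an IP address; expected a hostname", host)
	}
	for _, label := range strings.Split(host, ".") {
		if !labelRE.MatchString(label) {
			return fmt.Errorf("host %q contains an invalid label %q", host, label)
		}
	}
	return nil
}

func main() {
	for _, h := range []string{"github.com", "", "git hub.com", "10.0.0.1", "github.com:22"} {
		fmt.Printf("%q -> %v\n", h, validateHost(h))
	}
}
```

Only the first input passes; the port suffix is rejected because the colon is not a valid label character.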

5. Incomplete Error Handling (Bug)

Location: internal/sshagent/server.go:119-124

Accept errors silently ignored and retried indefinitely. File descriptor exhaustion won't be surfaced.

Recommendation: Log errors or implement backoff.

6. Socat Bridge Lacks Health Check (Reliability)

Location: internal/deps/scripts/moat-init.sh:14

Socat started in background with no health check or retry logic. Could fail silently.

Recommendation: Add socket validation after creation loop.

7. Test Assertion Missing

Location: internal/sshagent/server_test.go:78-80

Test logs but doesn't assert. Name implies validation but doesn't actually test.

Recommendation: Make proper assertion, rename as informational, or remove.

Security Assessment

Strong Controls

Private keys never enter containers
Host-based key filtering at proxy layer
All operations audited to tamper-proof store
Grants stored encrypted

Areas for Improvement

Socket permissions (issue 1)
Race conditions in host tracking (issue 2)
TCP binding model unclear (issue 3)
Input validation gaps (issue 4)

Test Coverage

Excellent overall:

Unit tests for proxy filtering logic
Server lifecycle tests
E2E grant flow tests
Edge cases covered

Missing:

TCP server mode
Socat bridge functionality
Concurrent sign operations
Malformed public key handling

Recommendations

Must Fix Before Merge

Socket permissions (security concern)
Race condition in host tracking (potential bug)
TCP binding clarification (may not work as intended)

Should Fix

Input validation
Error handling in server
Socat health check

Nice to Have

Test assertion fix
Concurrent usage tests
Document TCP mode security model

Summary

This is high-quality work with sound architecture and good security practices. However, socket permissions and race conditions should be addressed before merging. The implementation shows strong understanding of moat's architecture and follows project conventions well.

Recommendation: Request changes for must-fix issues, then approve.

dpup requested a review from andybons January 19, 2026 21:55

dpup force-pushed the ssh branch from 4cba501 to f404da0 Compare January 19, 2026 21:56

andybons approved these changes Jan 19, 2026

dpup force-pushed the ssh branch from f404da0 to 5ebbaec Compare January 19, 2026 22:13

dpup force-pushed the ssh branch from 5ebbaec to a944db6 Compare January 19, 2026 22:53

dpup merged commit 762994b into main Jan 20, 2026
4 checks passed

dpup deleted the ssh branch January 20, 2026 02:58
