
feat(cli): Add interactive modes with attach/detach. Persist state to disk. #29

Merged: dpup merged 6 commits into main from cmd on Jan 19, 2026

@dpup
Collaborator

@dpup dpup commented Jan 19, 2026

  • Add moat attach command to reconnect to running containers
  • Add -it flags for interactive sessions with TTY support
  • Implement escape sequences (Ctrl-/ d/k) for detach/stop in interactive mode
  • Add run persistence across manager restarts via container state reconciliation
  • Add interactive field to agent.yaml for default interactive mode

Details

Attach/Detach Model

Runs now exist independently of your terminal. By default, moat run attaches to show output:

  • Non-interactive mode: Ctrl+C detaches (run continues), Ctrl+C twice within 500ms stops the run
  • Interactive mode (-it): Ctrl-/ d detaches, Ctrl-/ k stops, Ctrl+C passes through to container
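The non-interactive rule can be sketched as a small pure function. This is an illustration only, not the code in attach.go or run.go: `classifyInterrupt` and the `Action` type are hypothetical names, and treating "within 500ms" as strictly less than the window is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// Action is what the CLI does in response to Ctrl+C in non-interactive mode.
type Action int

const (
	Detach Action = iota // disconnect the terminal; the run continues
	Stop                 // stop the run
)

// doublePressWindow matches the 500ms window described above.
const doublePressWindow = 500 * time.Millisecond

// classifyInterrupt is a hypothetical helper. hasPrev reports whether an
// earlier Ctrl+C was seen, and sincePrev is the time elapsed since it.
func classifyInterrupt(sincePrev time.Duration, hasPrev bool) Action {
	if hasPrev && sincePrev < doublePressWindow {
		return Stop
	}
	return Detach
}

func main() {
	fmt.Println(classifyInterrupt(0, false) == Detach)                   // first press
	fmt.Println(classifyInterrupt(200*time.Millisecond, true) == Stop)   // quick second press
	fmt.Println(classifyInterrupt(800*time.Millisecond, true) == Detach) // slow second press
}
```

Keeping the decision separate from the signal channel makes the timing window testable without sending real signals.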

New Commands

  moat attach <run-id>      # Reconnect to see output
  moat attach -it <run-id>  # Reconnect interactively
  moat run -it -- bash      # Start interactive session
  moat run -d ./project     # Start detached (background)

Run Persistence

Runs now survive manager restarts. On startup, the manager reconciles persisted run metadata with actual container state via the new ContainerState() runtime method.

Configuration

Interactive mode can be set in agent.yaml:

  command: ["bash"]
  interactive: true

@majorcontext majorcontext deleted a comment from claude Bot Jan 19, 2026
@claude

claude Bot commented Jan 19, 2026

PR Review: Interactive Attach/Detach Support

Great work on this feature! The attach/detach functionality is well-implemented with clean separation of concerns.

✅ Strengths

  • Clean separation: Interactive vs non-interactive modes well separated
  • State persistence: Run metadata persists across manager restarts with proper reconciliation
  • Escape sequence handling: EscapeProxy implementation is elegant and well-tested
  • Excellent test coverage: escape_test.go has comprehensive edge case coverage
  • Error handling: Proper error propagation and context cancellation throughout

🔍 Critical Issues

1. Terminal State Leak (attach.go:214)

If a goroutine panics between EnableRawMode and RestoreTerminal, the terminal is left in raw mode.
Fix: Add panic recovery in defer.
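One possible shape for that fix, as a sketch under assumptions: `runWithRestore` is a hypothetical wrapper, and `enable`/`restore` stand in for EnableRawMode and RestoreTerminal, whose real signatures are not shown in this thread. The point is that an unrecovered panic in a worker goroutine ends the process without running defers held by other goroutines, so the worker converts the panic into an error and lets the caller return normally.

```go
package main

import "fmt"

// runWithRestore calls enable, runs work in its own goroutine, and makes
// sure restore is called even if work panics.
func runWithRestore(enable, restore, work func() error) (err error) {
	if e := enable(); e != nil {
		return fmt.Errorf("enable raw mode: %w", e)
	}
	// Runs on every return path of this function.
	defer func() {
		if rerr := restore(); rerr != nil && err == nil {
			err = fmt.Errorf("restore terminal: %w", rerr)
		}
	}()

	done := make(chan error, 1)
	go func() {
		// Convert a panic into an error so the process does not exit
		// before the deferred restore above has run.
		defer func() {
			if r := recover(); r != nil {
				done <- fmt.Errorf("panic in attach goroutine: %v", r)
			}
		}()
		done <- work()
	}()
	return <-done
}

func main() {
	restored := false
	err := runWithRestore(
		func() error { return nil },
		func() error { restored = true; return nil },
		func() error { panic("boom") },
	)
	fmt.Println("error:", err)
	fmt.Println("restored:", restored)
}
```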

2. Race Condition: Double Signal Handling (attach.go:251-261, run.go:267-308)

Both attachInteractiveMode and runInteractive set up SIGINT/SIGTERM handlers.
Fix: Document contract or consolidate handlers.

3. Incomplete Error Context (manager.go:100-107)

Transient errors treated same as not-found errors during state reconciliation.
Fix: Add error type checking.

4. Resource Leak: Proxy Server (manager.go:626-642)

Proxy could leak if SaveMetadata panics.
Fix: Move cleanup to defer after creation.

5. Missing Validation (run.go:210)

Interactive without TTY can cause undefined behavior.
Fix: Always set TTY when Interactive is true.

⚠️ Potential Issues

  • 200ms container exit delay is arbitrary
  • 500ms double Ctrl+C window might be too tight
  • No timeout on Attach operations
  • Firewall errors could be more actionable

🎯 Testing Recommendations

  1. Add E2E tests for attach/detach
  2. Test state reconciliation after daemon restart
  3. Test signal handling edge cases
  4. Test terminal restoration after abnormal exits

Final Verdict

Approve with fixes

Excellent work! Address the terminal state leak (critical) and consider the race condition. Other issues can be follow-ups.

Priority: Terminal leak > error handling > signal docs

@dpup
Collaborator Author

dpup commented Jan 19, 2026

@claude fix the lint errors. The code review issues seem minor. If you have permission, file issues for the serious ones otherwise we'll come back to it.

@claude

claude Bot commented Jan 19, 2026

Claude finished @dpup's task in 4m 36s


Lint fixes completed ✓

Summary

Fixed all 6 golangci-lint errors and pushed changes to the cmd branch.

Changes Made

  1. docker.go:502 - Check error from CloseWrite() and propagate if no earlier error
  2. attach.go:214 & run.go:331 - Check RestoreTerminal() errors in defer function
  3. attach.go:157 & run.go:300 - Move lastSigTime assignment to before return to fix ineffectual assignment
  4. escape.go:159 - Remove redundant nil check by adding explicit break statements

Commit: 53f45e9 - "fix(lint): address golangci-lint errors"

Critical Review Issues (Unable to File)

I don't have permission to create GitHub issues. Here are the critical findings from the code review that should be tracked:

  1. Terminal State Leak (attach.go:214, run.go:331) - PARTIALLY ADDRESSED

    • Fixed unchecked error, but goroutine panic could still leave terminal in raw mode
    • Recommend: Add panic recovery in defer
  2. Race Condition: Double Signal Handling (attach.go:251-261, run.go:267-308)

    • Both attachInteractiveMode and runInteractive set up SIGINT/SIGTERM handlers
    • Recommend: Document contract or consolidate handlers
  3. Incomplete Error Context (manager.go:100-107)

    • Transient errors treated same as not-found during state reconciliation
    • Recommend: Add error type checking
  4. Resource Leak: Proxy Server (manager.go:626-642)

    • Proxy could leak if SaveMetadata panics
    • Recommend: Move cleanup to defer after creation
  5. Missing Validation (run.go:210)

    • Interactive without TTY can cause undefined behavior
    • Recommend: Always set TTY when Interactive is true

Branch: cmd

@claude

claude Bot commented Jan 19, 2026

Code Review: PR #29 - Interactive Modes with Attach/Detach

Thanks for this substantial feature addition! The attach/detach functionality and run persistence are great additions to moat. I've reviewed the changes and have feedback across several areas.

Critical Issues 🔴

1. Security: Firewall IP Filtering Disabled

Files: internal/container/docker.go:224-258, internal/container/apple.go:338-376

Both firewall implementations explicitly do not filter by destination IP address (see comments in code). While I understand the reasoning (dynamic IPs for host.docker.internal), this means the firewall allows traffic to ANY destination on the proxy port, not just the proxy.

Impact: If another service runs on the same port on a different host, the firewall won't block it. For Docker on Linux with localhost binding, there's also no authentication token protection.

Recommendations:

  • Consider resolving host.docker.internal inside the container and using that IP in iptables rules
  • Document this limitation clearly in security documentation
  • Consider adding authentication tokens for Docker containers as well (currently only Apple containers have this)

2. Potential Resource Leak in Docker Attach

File: internal/container/docker.go:465-528

The Attach() method starts two goroutines (stdin and stdout) but only waits for one to complete via select. If stdin fails with an error (e.g., escape sequence triggers), the output goroutine may still be running.

Recommendation: Use a sync.WaitGroup or explicitly wait for both goroutines to ensure proper cleanup.
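A minimal sketch of the WaitGroup variant, using in-memory streams rather than a real Docker connection. `attachStreams` and `demo` are hypothetical names and the four-stream signature is an assumption; the actual Attach() method in docker.go is not reproduced here.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
	"sync"
)

// attachStreams copies stdin to the container and container output to
// stdout, and returns only after BOTH copy goroutines have finished.
func attachStreams(stdin io.Reader, toContainer io.Writer, fromContainer io.Reader, stdout io.Writer) error {
	var wg sync.WaitGroup
	errs := make(chan error, 2) // buffered so neither goroutine blocks on send

	wg.Add(2)
	go func() {
		defer wg.Done()
		_, err := io.Copy(toContainer, stdin)
		errs <- err
	}()
	go func() {
		defer wg.Done()
		_, err := io.Copy(stdout, fromContainer)
		errs <- err
	}()

	wg.Wait()
	close(errs)
	for err := range errs {
		if err != nil {
			return err // report the first non-nil error
		}
	}
	return nil
}

// demo runs attachStreams against in-memory buffers.
func demo() (sent, received string, err error) {
	var toContainer, stdout bytes.Buffer
	err = attachStreams(strings.NewReader("ls\n"), &toContainer,
		strings.NewReader("file.txt\n"), &stdout)
	return toContainer.String(), stdout.String(), err
}

func main() {
	sent, received, err := demo()
	fmt.Printf("sent=%q received=%q err=%v\n", sent, received, err)
}
```

With a real terminal, the stdin copy blocks until input ends, so a production version would also need to close the connection (or cancel the read) when one side finishes, otherwise `wg.Wait()` never returns.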

3. Incomplete Container State Handling

File: internal/run/manager.go:99-121

The loadPersistedRuns() method maps container states but doesn't handle all possible states. Docker can return "paused", "removing", "unknown", etc. Line 119 falls back to State(meta.State) which could be invalid.

Recommendation: Add explicit handling for all possible container states and log warnings for unknown states.
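As a sketch of what explicit handling might look like: the container status strings below are the ones Docker reports, but the `State` constants, the `reconcile` function, and the mapping itself (for example treating "paused" as running) are assumptions for illustration, not the logic in loadPersistedRuns().

```go
package main

import (
	"fmt"
	"log"
)

// State is a run state. The names follow the states mentioned in this
// review; the real definitions live in internal/run/run.go.
type State string

const (
	StateCreated  State = "created"
	StateRunning  State = "running"
	StateStopping State = "stopping"
	StateStopped  State = "stopped"
)

// reconcile maps a container status string to a run state. The second
// result is false when the status is not recognized, in which case the
// persisted state is kept and a warning is logged.
func reconcile(containerState string, persisted State) (State, bool) {
	switch containerState {
	case "created":
		return StateCreated, true
	case "running", "paused", "restarting":
		return StateRunning, true
	case "removing":
		return StateStopping, true
	case "exited", "dead":
		return StateStopped, true
	default:
		log.Printf("warning: unrecognized container state %q, keeping %q",
			containerState, persisted)
		return persisted, false
	}
}

func main() {
	fmt.Println(reconcile("exited", StateRunning))
	fmt.Println(reconcile("something-new", StateRunning))
}
```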


Moderate Issues 🟡

4. Missing Error Handling in Attach Command

File: cmd/moat/cli/attach.go:195-201

Recent logs are displayed but errors are silently ignored. If RecentLogs() fails, users get no indication.

Recommendation: Log a debug message when recent logs can't be fetched.

5. Windows Support Incomplete

File: internal/term/raw_windows.go:15-27

Raw mode always returns an error on Windows, which means interactive mode won't work properly. Escape sequences will echo and the UX will be degraded.

Recommendation: Either implement Windows raw mode using the Console API, or prominently document this limitation.

6. Duplicate Logic Between attach.go and run.go

The attachInteractiveMode() (attach.go:193-316) and runInteractive() (run.go:318-441) share nearly identical logic for terminal setup, signal handling, escape sequences, and attach lifecycle.

Issue: Violates DRY principle and makes maintenance harder. Bug fixes might need to be applied twice.

Recommendation: Extract common interactive attach logic into a shared function in internal/run or internal/term.

7. Complex Control Flow in runInteractive

File: cmd/moat/cli/run.go:318-441

This function has deeply nested select statements and handles multiple concerns: signal handling, escape sequences, attach lifecycle, container exit detection, and context cancellation.

Recommendation: Break into smaller functions with clear responsibilities. Consider a state machine pattern for the different modes (attached, detached, exiting, etc.).


Test Coverage 🧪

8. No Tests for Attach Command

File: cmd/moat/cli/attach.go (316 lines, no test file)

The entire attach command has no unit tests despite handling complex state machines and user interaction.

Recommendation: Add unit tests covering:

  • Basic attach scenarios
  • Escape sequence handling
  • Signal handling (SIGINT, SIGTERM)
  • Container exit detection
  • Error cases (run not found, not running, etc.)

9. Missing Edge Cases in Escape Proxy Tests

File: internal/term/escape_test.go

Existing tests are good but missing:

  • Buffer boundary conditions (escape sequence split across buffer boundaries)
  • Very large inputs
  • Rapid succession of escape sequences
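The buffer-boundary case can be illustrated with a stripped-down filter. This is not the EscapeProxy from internal/term/escape.go: `escapeFilter` and `Feed` are hypothetical, and the prefix byte 0x1f for Ctrl-/ is an assumption about what the terminal sends. The sketch only shows why the prefix state has to survive between reads.

```go
package main

import "fmt"

// escapePrefix is the byte assumed to be produced by Ctrl-/.
const escapePrefix byte = 0x1f

type Action int

const (
	ActionNone Action = iota
	ActionDetach
	ActionStop
)

// escapeFilter remembers whether the previous chunk ended with the prefix.
type escapeFilter struct {
	sawPrefix bool
}

// Feed processes one chunk and returns the bytes to forward plus any
// action triggered. Bytes after a triggering sequence are dropped.
func (f *escapeFilter) Feed(chunk []byte) ([]byte, Action) {
	out := make([]byte, 0, len(chunk)+1)
	for _, b := range chunk {
		if !f.sawPrefix {
			if b == escapePrefix {
				f.sawPrefix = true // hold the prefix until the next byte arrives
				continue
			}
			out = append(out, b)
			continue
		}
		f.sawPrefix = false
		switch b {
		case 'd':
			return out, ActionDetach
		case 'k':
			return out, ActionStop
		case escapePrefix:
			out = append(out, escapePrefix) // doubled prefix is a literal prefix
		default:
			out = append(out, escapePrefix, b) // not an escape: forward both bytes
		}
	}
	return out, ActionNone
}

func main() {
	f := &escapeFilter{}
	out, act := f.Feed([]byte{'a', escapePrefix}) // prefix at end of first chunk
	fmt.Printf("%q %d\n", out, act)
	out, act = f.Feed([]byte{'d'}) // command byte arrives in the next chunk
	fmt.Printf("%q %d\n", out, act)
}
```

A test for the real proxy could follow the same pattern: split the sequence at every possible offset and assert the same action each time.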

10. No E2E Tests for Interactive Mode

Recommendation: Add E2E tests that:

  • Start a run in interactive mode
  • Send escape sequences
  • Verify detach/stop behavior
  • Test attach/reattach flows

Security Notes 🔒

11. Proxy Authentication Token in Environment Variable

File: internal/run/manager.go:281-292, 349-357

The proxy auth token is passed via HTTP_PROXY environment variable, making it visible to all processes in the container via /proc/<pid>/environ.

Severity: Medium - This is standard practice for HTTP proxy auth, but worth noting as an attack vector.

Recommendation: Document this in security docs. Consider certificate-based auth for higher security scenarios.

12. Escape Sequence Injection

File: internal/term/escape.go:52-59

If a malicious container could inject control characters matching the escape sequence, it could trigger detach/stop actions.

Severity: Low - Requires attacker to control container output, which is already a compromised scenario.

Recommendation: Document that escape sequences should not be used in untrusted environments.


Performance 🚀

13. Polling for Container Exit in Apple Containers

File: internal/container/apple.go:177-217

waitByPolling() polls every 500ms to check if container has exited, adding up to 500ms latency.

Recommendation: Investigate if Apple's container CLI has a native blocking wait command, or consider file system watches on container state files.

14. Double Log Processing

File: internal/run/manager.go:733-753, 838-846

Logs are streamed to stdout during execution, then read again after container exits. For long-running containers with lots of output, this doubles I/O.

Recommendation: Consider buffering logs during streaming so they can be written to storage without re-reading.


Documentation 📚

15. State Transitions Undocumented

File: internal/run/run.go:15-25

Run states are defined but valid transitions aren't documented. Can a run go from StateStopping to StateRunning? What happens if attach is called during StateStarting?

Recommendation: Add state transition diagram in comments or docs.

16. Escape Sequence Choice

File: internal/term/escape.go:52-59

Ctrl-/ is chosen as the escape prefix but:

  • May conflict with some terminal emulators
  • Not all keyboards have easy access to / key
  • No documentation on why Ctrl-/ was chosen

Recommendation: Document the choice and consider making it configurable.


Minor Issues

17. Silent Error Ignoring

Multiple locations use _ = to ignore errors without logging:

  • internal/run/manager.go:105, 628, 730

Silent failures could lead to inconsistent state on disk.

Recommendation: At minimum, log errors at debug level. For critical operations like initial metadata save, consider returning the error.

18. Magic Numbers

  • Port binding retry: internal/run/manager.go:693-702 (5 retries, 50ms delay)
  • lastNLines: internal/run/manager.go:1029-1048 (n+1 check)

Recommendation: Extract to named constants with explanatory comments.
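For the retry values, a sketch of what named constants might look like. The constant names, `retry`, and `errTransient` are hypothetical; only the values (5 attempts, 50ms) come from the review above.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

const (
	// portBindingRetries is how many times to ask the runtime for port
	// bindings before giving up.
	portBindingRetries = 5
	// portBindingRetryDelay is the pause between attempts.
	portBindingRetryDelay = 50 * time.Millisecond
)

var errTransient = errors.New("port bindings not ready")

// retry calls fn up to attempts times, sleeping delay between failures.
func retry(attempts int, delay time.Duration, fn func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = fn(); err == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(delay)
		}
	}
	return fmt.Errorf("after %d attempts: %w", attempts, err)
}

func main() {
	calls := 0
	err := retry(portBindingRetries, portBindingRetryDelay, func() error {
		calls++
		if calls < 3 {
			return errTransient
		}
		return nil
	})
	fmt.Println(calls, err)
}
```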


Positive Aspects ✅

  • Excellent test coverage for escape proxy functionality
  • Good separation of concerns with new internal/term package
  • Comprehensive comments explaining complex logic (especially in escape.go)
  • Proper use of atomic operations for concurrent store access
  • Graceful degradation when raw mode can't be enabled
  • Consistent use of context for cancellation
  • Well-designed run persistence with proper reconciliation
  • Security-conscious proxy authentication for Apple containers

Summary

This PR adds valuable functionality, but several areas need attention before merge:

Must Fix:

  1. Security issue with firewall IP filtering (#1)
  2. Resource leak potential in Docker attach (#2)
  3. Windows support documentation (#5)
  4. Inconsistent container state handling (#3)

Should Fix:

  5. Duplicate logic between attach.go and run.go (#6)
  6. Missing tests for attach command (#8)
  7. Complex control flow extraction (#7)

Nice to Have:

  8. Document state transitions (#15)
  9. Add retry constant explanations (#18)
  10. Edge case test coverage (#9)

The core functionality appears sound, but the complexity and lack of tests for critical paths (especially attach) is concerning for production use. Great work on this substantial feature!

@dpup
Collaborator Author

dpup commented Jan 19, 2026

@claude rebase this branch against main

dpup added 2 commits January 19, 2026 09:42
Allow specifying a default command in agent.yaml that runs when no
command is provided on the CLI. CLI commands take precedence.
- Validate that command[0] is not empty when command is specified
- Document default behavior (no command = container exits immediately)
- Add security note about shell command patterns and variable interpolation
- Consolidate command precedence documentation in Configuration section
@majorcontext majorcontext deleted a comment from claude Bot Jan 19, 2026
dpup and others added 3 commits January 19, 2026 17:45
Implement attach/detach functionality allowing users to connect to and
disconnect from running containers without stopping them.

Key features:
- `moat attach <run-id>` to reconnect to running containers
- `moat run -it` for interactive sessions with TTY
- Escape sequences (Ctrl-/ d/k) for detach/stop in interactive mode
- Double Ctrl+C (within 500ms) to stop in non-interactive mode
- Run persistence across manager restarts via container state reconciliation
- Recent logs displayed when re-attaching to interactive sessions
- Interactive mode configurable via agent.yaml

New files:
- cmd/moat/cli/attach.go: Attach command implementation
- internal/term/escape.go: Escape sequence detection for detach/stop
- internal/term/raw.go: Terminal raw mode for clean escape handling
- docs/plans/2026-01-19-interactive-attach-model.md: Design document

Runtime interface extended with:
- ContainerState(): Query container status for state reconciliation
- Attach(): Connect stdin/stdout/stderr to running containers

🤖 Generated with [Claude Code](https://claude.com/claude-code)
When attaching to a running container, `moat attach` now defaults to
the same interactive mode that was used when the run was started:
- If started with `moat run -it`, attach uses interactive mode
- If started without `-it`, attach uses output-only mode

Use `-i=false` to force output-only mode even if the run was started
interactively. The `-it` flags still work to force interactive mode.

Changes:
- Add `Interactive` field to run metadata (persisted to disk)
- Restore interactive mode when loading persisted runs
- Update attach command to check run's interactive flag
- Use Cobra's `Changed()` to detect explicit flag usage

🤖 Generated with [Claude Code](https://claude.com/claude-code)
- Check CloseWrite() error in docker.go
- Check RestoreTerminal() error in attach.go and run.go
- Fix ineffectual assignment warnings by moving lastSigTime update
- Fix redundant nil check in escape.go by breaking immediately

Co-authored-by: Daniel Pupius <dpup@users.noreply.github.com>
@claude

claude Bot commented Jan 19, 2026

Pull Request Review: Interactive Attach/Detach Model

This is a significant feature addition that adds interactive mode support, attach/detach capabilities, and run persistence. Overall the implementation is solid with good attention to detail. Below are my findings organized by category:

✅ Strengths

  1. Well-architected state management: The reconciliation logic in manager.go:loadPersistedRuns() properly handles manager restarts by checking actual container state against persisted metadata
  2. Comprehensive escape sequence handling: The EscapeProxy implementation is thorough with excellent test coverage (escape_test.go)
  3. Clear separation of concerns: Interactive vs non-interactive modes are cleanly separated in attach.go
  4. Security considerations: Proxy authentication tokens for Apple containers (line 288-295 in manager.go)
  5. Good error handling: Proper cleanup of resources (proxy servers, containers) when operations fail
  6. Thoughtful UX: Shows recent logs when re-attaching (attach.go:195-201), clear escape sequence instructions

🐛 Potential Bugs

  1. Race condition in escape detection (internal/term/escape.go:186-193)

    • When sawPrefix=true and output exists, the prefix state is preserved but output is returned
    • A subsequent Read() might not properly handle the dangling prefix if the underlying reader has no more data
    • Consider adding test coverage for this edge case
  2. Container state string comparison (internal/run/manager.go:110-120)

    • Hardcoded string comparisons ("running", "exited", "dead", etc.) are fragile
    • If container runtimes return slightly different strings, reconciliation could fail
    • Recommendation: Define constants for container states and document the expected values from each runtime
  3. Non-atomic metadata updates (multiple locations)

    • Metadata is saved with "best-effort; non-fatal if it fails" comments
    • If metadata save fails, the in-memory state diverges from disk state
    • On restart, stale metadata could be loaded
    • Recommendation: Either make saves reliable or add validation/repair on load
  4. Missing context propagation (internal/run/manager.go:873, 907)

    • Uses context.Background() when capturing logs after container exits
    • If the original context is cancelled, log capture still proceeds
    • Could cause delays in shutdown scenarios
    • Recommendation: Use a short timeout context instead of Background
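The timeout recommendation in the last item could look roughly like this. It is a sketch: `captureLogs`, `stuckFetch`, and the 5 second value are made up for illustration, and the fetch callback stands in for whatever call reads the container logs.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// logCaptureTimeout is a hypothetical bound on post-exit log capture.
const logCaptureTimeout = 5 * time.Second

// demoTimeout keeps the example fast.
const demoTimeout = 10 * time.Millisecond

// captureLogs runs fetch with a context that is independent of the
// (possibly already cancelled) run context but still bounded in time.
func captureLogs(timeout time.Duration, fetch func(context.Context) error) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return fetch(ctx)
}

// stuckFetch simulates a log read that never completes on its own.
func stuckFetch(ctx context.Context) error {
	<-ctx.Done()
	return ctx.Err()
}

// timedOut reports whether err is a deadline error.
func timedOut(err error) bool {
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	err := captureLogs(demoTimeout, stuckFetch)
	fmt.Println("timed out:", timedOut(err))
}
```

This keeps log capture working after the run's own context is cancelled while ensuring shutdown cannot hang on it indefinitely.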

⚠️ Potential Issues

  1. Attach modes confusion (cmd/moat/cli/attach.go:86-91)

    • The flag precedence logic is clear in code but the UX could be confusing
    • -i defaults to false, but attach uses run's original mode if flag not changed
    • Users might expect -i to mean "use interactive" not "override to interactive"
    • Recommendation: Consider renaming or adding clearer help text
  2. Double Ctrl+C timing (cmd/moat/cli/attach.go:141-157)

    • 500ms window is reasonable but not configurable
    • On slow terminals or under load, users might accidentally stop when they meant to detach
    • Recommendation: Document this behavior prominently, consider making it configurable
  3. Log streaming race (internal/run/manager.go:870-878)

    • After Wait returns, ContainerLogsAll is called to ensure complete logs
    • But logs are written to storage via LogWriter which could fail silently
    • No verification that logs were actually persisted
    • Recommendation: Add warning if log write fails
  4. Apple container DNS hardcoding (internal/container/apple.go:97-100)

    • Uses Google DNS (8.8.8.8) as fallback without user control
    • Could be problematic in restricted networks
    • Recommendation: Make DNS configurable via agent.yaml
  5. Container cleanup on failure paths

    • Some error paths properly clean up containers (e.g., line 614 in manager.go)
    • Others might leave containers running (e.g., if SaveMetadata after successful start fails)
    • Recommendation: Audit all error paths for resource leaks

See next comment for Security, Performance, Test Coverage, and Recommendations...

@claude

claude Bot commented Jan 19, 2026

🔒 Security Concerns

  1. Proxy authentication token generation (internal/run/manager.go:290-294)

    • Uses crypto/rand which is correct
    • Token is 32 bytes (64 hex chars) which is sufficient
    • ✅ Well implemented
  2. Credential injection security model

    • Properly documented in internal/proxy/proxy.go (mentioned in CLAUDE.md)
    • Apple containers use token auth when binding to 0.0.0.0
    • Docker uses localhost binding
    • ✅ Security model is sound
  3. Raw terminal mode (cmd/moat/cli/attach.go:204-215)

    • Enables raw mode but has proper cleanup in defer
    • Handles errors gracefully
    • ✅ Well implemented

📊 Performance Considerations

  1. Polling for container state (internal/run/manager.go:726-734)

    • Retries GetPortBindings up to 5 times with 50ms sleep
    • Could add up to 250ms delay on startup
    • This is acceptable for startup, but document why it's needed
  2. Log capture on exit (internal/run/manager.go:870-878)

    • Calls ContainerLogsAll which reads entire log buffer
    • For long-running containers with verbose logs, this could be slow
    • Consider streaming to disk instead of reading all into memory
  3. Escape sequence processing (internal/term/escape.go)

    • Processes byte-by-byte which is necessary for correctness
    • Additional buffer copies (line 115, 239) could add overhead
    • For typical terminal I/O this is fine, but worth noting

🧪 Test Coverage

Excellent test coverage for escape handling:

  • escape_test.go has comprehensive tests for all escape sequences
  • Tests edge cases like EOF, small reads, mixed content
  • ✅ Well tested

Missing tests:

  • No tests for attach.go detach/attach logic
  • No tests for manager.go reconciliation logic
  • No tests for raw terminal mode handling
  • Recommendation: Add integration tests for attach/detach flows

📝 Code Quality

  1. Good code organization: Clear separation of concerns across packages
  2. Proper use of Go idioms: Defer for cleanup, error wrapping, context usage
  3. Documentation: Inline comments are helpful, especially for complex logic
  4. Constants: Good use of named constants (doublePressWindow, containerExitCheckDelay)

Minor suggestions:

  • Consider extracting the "last N lines" logic (manager.go:1061-1079) to a utility function
  • The workspaceToClaudeDir function (manager.go:1098) could use more documentation about why this transformation is needed

@claude

claude Bot commented Jan 19, 2026

🎯 Recommendations Summary

High Priority:

  1. Fix container state string comparison fragility (use constants)
  2. Add tests for attach/detach flows
  3. Audit error paths for container/resource leaks
  4. Fix context propagation in Wait method

Medium Priority:

  1. Make DNS configurable for Apple containers
  2. Add validation for metadata persistence failures
  3. Document the 500ms double-Ctrl+C window more prominently
  4. Add warning if log write fails

Low Priority:

  1. Optimize log capture for large log buffers
  2. Make doublePressWindow configurable
  3. Extract utility functions for better reusability

📋 Conventional Commits Compliance

✅ Commit message follows the conventional commits format specified in CLAUDE.md:

  • Format: feat(cli): Add interactive modes with attach/detach. Persist state to disk.
  • Type feat is appropriate for this feature addition
  • No Co-Authored-By line (as specified in CLAUDE.md)

Overall Assessment: This is a well-implemented feature with good attention to edge cases and security. The main concerns are around state persistence reliability and test coverage for the attach flows. The code quality is high and follows Go best practices.

Recommendation: Approve with suggested improvements to be addressed in follow-up PRs or before merge, depending on severity assessment.

@claude

claude Bot commented Jan 19, 2026

Pull Request Review - Interactive Mode & Attach/Detach

This is a significant and well-implemented feature that adds interactive modes, attach/detach capabilities, and run persistence. Overall, the code quality is high and follows the project conventions. Here's my detailed feedback:


Strengths

Architecture & Design

  • Clean separation of concerns: The new internal/term package is well-isolated with clear responsibilities
  • Excellent escape sequence handling: The EscapeProxy implementation is robust with proper buffering and edge case handling
  • Good runtime abstraction: The ContainerState() method cleanly extends the Runtime interface for state reconciliation
  • Smart mode detection: Auto-detecting interactive mode from agent.yaml or command name is user-friendly

Code Quality

  • Comprehensive tests: The term/escape_test.go covers edge cases (EOF handling, small reads, literal prefix, etc.)
  • Error handling: Proper error propagation with context-aware cleanup (proxy servers, containers)
  • Documentation: Good inline comments explaining timing constants, security model, and complex flows
  • Consistent patterns: Signal handling follows similar patterns across attach.go and run.go

User Experience

  • Helpful hints: Warning when running bash without -i flag
  • Clear escape help text: Users know how to detach/stop in each mode
  • Graceful degradation: Falls back when raw mode isn't available

🔍 Issues & Concerns

1. Race Condition in Proxy Logger Setup (Medium Priority)

Location: internal/run/manager.go:299-330

The atomic store reference pattern is used to avoid data races, but there's a window where network requests can be lost:

var storeRef atomic.Value // holds *storage.RunStore
p.SetLogger(func(data proxy.RequestLogData) {
    store, _ := storeRef.Load().(*storage.RunStore)
    if store == nil {
        // Store not yet initialized - early request during container startup.
        // This is expected and non-fatal; the request won't be logged.
        log.Debug("skipping network log: store not yet initialized", ...)
        return
    }
    ...
})

Problem: Network requests made before line 644 (r.storeRef.Store(store)) are silently dropped. For containers that make network requests immediately on startup, this could lose important audit data.

Suggestion:

  • Buffer early requests in memory until store is ready, or
  • Initialize store earlier in the flow (before proxy starts), or
  • Document this limitation explicitly in comments and ensure audit system accounts for it
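The first option (buffering early requests) might be sketched as follows. `bufferedLogger` and `entry` are hypothetical types, not part of the proxy package, and the unbounded queue is a simplification that a real implementation would probably cap.

```go
package main

import (
	"fmt"
	"sync"
)

// entry is a simplified stand-in for the proxy's request log data.
type entry struct {
	Method string
	URL    string
}

// bufferedLogger queues entries until a writer is installed, then
// flushes the queue in order and writes directly from then on.
type bufferedLogger struct {
	mu      sync.Mutex
	pending []entry
	write   func(entry) // nil until the store is ready
}

// Log records e, or queues it if the store is not ready yet.
func (l *bufferedLogger) Log(e entry) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.write == nil {
		l.pending = append(l.pending, e)
		return
	}
	l.write(e)
}

// SetWriter installs the writer and flushes anything queued so far.
func (l *bufferedLogger) SetWriter(w func(entry)) {
	l.mu.Lock()
	defer l.mu.Unlock()
	for _, e := range l.pending {
		w(e)
	}
	l.pending = nil
	l.write = w
}

func main() {
	l := &bufferedLogger{}
	l.Log(entry{Method: "GET", URL: "https://early.example"}) // before the store exists
	l.SetWriter(func(e entry) { fmt.Println(e.Method, e.URL) })
	l.Log(entry{Method: "GET", URL: "https://later.example"})
}
```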

2. Timing-Dependent Container Exit Detection (Low-Medium Priority)

Location: cmd/moat/cli/attach.go:159-178, 286-304

Multiple places use a 200ms delay to detect if a container exited:

case <-logsDone:
    select {
    case err := <-waitDone:
        // Container exited
    case <-time.After(containerExitCheckDelay):
        // Check run state manually

Problem: This is timing-dependent and could race on slow systems. A fast container exit might not be detected within 200ms, or the state check might race with actual container shutdown.

Suggestion:

  • Consider using container events/notifications instead of polling with delays
  • If timing is unavoidable, document why 200ms was chosen and what happens if it's too short

3. Metadata Save Errors Are Silent (Low Priority)

Location: internal/run/manager.go:660, 762, 830, 903

Metadata persistence failures are ignored:

_ = r.SaveMetadata()

Problem: If disk is full or permissions are wrong, state won't persist across restarts. This could lead to confusion when moat attach can't find a run that was just created.

Suggestion:

  • At minimum, log these errors at WARN level
  • Consider failing the operation if metadata save fails on initial creation (line 660)
  • Document that runs without persisted metadata won't survive manager restarts

4. Double Ctrl+C Window May Be Too Short (UX)

Location: cmd/moat/cli/attach.go:19

doublePressWindow = 500 * time.Millisecond

Issue: 500ms might be too fast for users, especially with network latency or when typing over SSH. Common practice is 1-2 seconds for double-press detection.

Suggestion:

  • Consider increasing to 1 second
  • Make it configurable via environment variable or config

5. Signal Handling Inconsistency (Minor)

Location: cmd/moat/cli/attach.go:250-260 vs cmd/moat/cli/run.go:370-383

In attachInteractiveMode:

if sig == syscall.SIGTERM {
    fmt.Printf("\nStopping run %s...\n", r.ID)
    // stops the run
}
// SIGINT is forwarded to container

runInteractive applies the same logic. The behavior is correct and consistent, but it is duplicated and would benefit from a shared helper function.

Suggestion: Extract signal handling into a shared function to ensure consistency and reduce duplication.


6. Potential Resource Leak on Error Paths (Low)

Location: internal/run/manager.go:433-436

if err := secrets.ResolveAll(ctx, opts.Config.Secrets); err != nil {
    if proxyServer != nil {
        _ = proxyServer.Stop(context.Background())
    }
    return nil, err
}

Issue: Container was created at line 587, but on error at line 434, the container isn't cleaned up. Similar issue with CA directory creation.

Suggestion:

  • Use defer pattern with cleanup flag, or
  • Ensure all error paths after container creation include RemoveContainer call
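A sketch of the deferred cleanup pattern, keyed on the returned error instead of a separate flag. `startRun` and its event strings are invented for illustration and do not mirror the real order of operations in manager.go.

```go
package main

import (
	"errors"
	"fmt"
)

// startRun outlines a start sequence. Every step that acquires a resource
// registers its cleanup, and the deferred function runs the cleanups in
// reverse order only when startRun returns an error.
func startRun(failAt string) (events []string, err error) {
	var cleanups []func()
	defer func() {
		if err == nil {
			return
		}
		for i := len(cleanups) - 1; i >= 0; i-- {
			cleanups[i]()
		}
	}()

	events = append(events, "start proxy")
	cleanups = append(cleanups, func() { events = append(events, "stop proxy") })

	events = append(events, "create container")
	cleanups = append(cleanups, func() { events = append(events, "remove container") })

	if failAt == "secrets" {
		return events, errors.New("resolve secrets: simulated failure")
	}

	events = append(events, "started")
	return events, nil
}

func main() {
	events, err := startRun("secrets")
	fmt.Println(events, err)
	events, err = startRun("")
	fmt.Println(events, err)
}
```

Because cleanup is registered next to each acquisition, a new early return added later cannot forget to release what was already created.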

7. Test Coverage Gaps

What's Missing:

  • E2E tests for attach/detach flows
  • Tests for run persistence and state reconciliation on manager restart
  • Tests for the double Ctrl+C timing window
  • Tests for interactive mode with actual containers

Suggestion: Add integration tests that:

  • Start a run, detach, reattach
  • Restart manager and verify runs are recovered
  • Test escape sequences end-to-end with real terminals (may need pty)

🎯 Performance Considerations

Storage Writes

The atomic store reference pattern is good, but consider:

  • Network request logging is synchronous in the proxy callback (line 316)
  • JSON marshaling on every request could be expensive for high-traffic agents
  • Consider batching or async writes if this becomes a bottleneck

Container State Polling

loadPersistedRuns calls ContainerState for every run on startup (line 100). For many runs:

  • Could be slow (N Docker API calls)
  • Consider caching or lazy loading
  • Document expected startup time with many persisted runs

🔒 Security Review

Escape Sequence Handling

✅ Well-designed: Ctrl-/ is unlikely to conflict with application input
✅ Escape escaping: Ctrl-/ Ctrl-/ sends literal Ctrl-/
✅ No injection risk: Binary comparison, not string parsing

Raw Mode Terminal

✅ Proper cleanup with defer
⚠️ A raw mode setup error is logged and execution continues; document that escape sequences may echo to the user in that case

Authentication Token for Apple Containers

✅ Good use of 32-byte cryptographically random tokens (line 290)
✅ Token passed in URL for compatibility
✅ Documentation clearly explains why binding to 0.0.0.0 is safe


📝 Documentation & Style

Follows CLAUDE.md Guidelines

✅ Conventional commit format: feat(cli): Add interactive modes...
✅ No Co-Authored-By line (as required)
✅ Error messages are actionable (e.g., "run %s is not running (state: %s)")
✅ Examples use generic placeholder names

Comments & Docs

✅ Good architectural documentation in docs/plans/2026-01-19-interactive-attach-model.md
✅ Clear comments explaining timing constants
✅ README updates are comprehensive with examples

Suggestions

  • Add godoc comments to exported types in internal/run/run.go
  • Document the state machine for run states (Created → Starting → Running → Stopping → Stopped)
  • Add architecture diagram for attach/detach flow to docs
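One way to document the state machine is to encode it as a table next to the type. The sketch below uses the five states named above; the edges beyond the straight Created → Starting → Running → Stopping → Stopped path (for example Starting → Stopped on a failed start) are assumptions, and `validTransitions` and `canTransition` are hypothetical names.

```go
package main

import "fmt"

type State string

const (
	StateCreated  State = "created"
	StateStarting State = "starting"
	StateRunning  State = "running"
	StateStopping State = "stopping"
	StateStopped  State = "stopped"
)

// validTransitions lists, for each state, the states it may move to.
var validTransitions = map[State][]State{
	StateCreated:  {StateStarting},
	StateStarting: {StateRunning, StateStopped}, // assumed: a failed start goes straight to stopped
	StateRunning:  {StateStopping, StateStopped}, // assumed: the container may exit on its own
	StateStopping: {StateStopped},
	StateStopped:  {},
}

// canTransition reports whether moving from one state to another is allowed.
func canTransition(from, to State) bool {
	for _, next := range validTransitions[from] {
		if next == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(canTransition(StateCreated, StateStarting))
	fmt.Println(canTransition(StateStopping, StateRunning))
}
```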

🧪 Testing Recommendations

# Scenarios to test manually:
# 1. Detach and reattach multiple times
moat run -i -- bash
# Press Ctrl-/ d to detach
moat attach <run-id>

# 2. Manager restart persistence
moat run -d ./project
# Kill moat manager process
# Restart and verify run is still tracked

# 3. Fast-exiting containers
moat run -- sh -c 'echo hello'
# Verify logs are captured

# 4. Network requests during startup
moat run --grant github -- sh -c 'curl https://api.github.com'
# Verify network.jsonl captures the request

Nice Touches

  1. Recent logs on reattach (attach.go:194-200): Showing last 50 lines gives context
  2. Helpful hint for bare shells (run.go:158-169): Warns about missing -i flag
  3. Graceful proxy cleanup: Proxy servers stop on all exit paths
  4. Platform-specific implementations: Windows raw mode stub shows forethought

📊 Summary

| Category | Rating | Notes |
|---|---|---|
| Code Quality | 8.5/10 | Well-structured, clean separation of concerns |
| Test Coverage | 7/10 | Good unit tests, needs more integration tests |
| Security | 9/10 | Proper auth token handling, escape sequence safety |
| Performance | 8/10 | Minor concerns with startup scaling |
| Documentation | 8.5/10 | Good inline docs, could use more architecture docs |
| User Experience | 9/10 | Intuitive modes, helpful hints, clear messages |

Overall: 8.3/10 - Strong implementation with minor issues that should be addressed.


🎬 Recommendation

Approve with minor changes

Must fix before merge:

  1. Add logging for metadata save failures (at minimum)
  2. Document the network request loss window during proxy initialization

Should fix soon after merge:

  3. Address race condition in proxy logger setup
  4. Increase double Ctrl+C window to 1 second
  5. Add integration tests for attach/detach

Nice to have:

  6. Extract shared signal handling helper
  7. Add state machine documentation
  8. Container cleanup on error paths


Great work on this PR! The attach/detach model is well-designed and the implementation quality is high. The issues I've raised are relatively minor and don't block the core functionality.

@dpup dpup merged commit 744ca85 into main Jan 19, 2026
4 checks passed
@dpup dpup deleted the cmd branch January 19, 2026 18:22
2 participants