feat(tui): add TTY trace system and fix footer rendering#108

Merged
dpup merged 1 commit into main from feat/tty-trace-and-footer-fix
Feb 1, 2026

@dpup
Collaborator

@dpup dpup commented Feb 1, 2026

Add terminal I/O tracing infrastructure for debugging TUI rendering issues. Fix Claude Code banner garbling by implementing debounced footer redraws.

Problem:

Claude Code's banner rendered with garbled/overlapping text when running inside moat. The issue was caused by moat redrawing the status bar footer after every stdout write, which interrupted Claude's multi-step rendering sequences wrapped in ESC[?2026h/l brackets.

Solution:

  1. TTY Trace System: Added infrastructure to capture, analyze, and decode terminal I/O with nanosecond timing for debugging TUI issues
  2. Footer Debouncing: Changed from immediate footer redraw to debounced redraw with 50ms delay, allowing child process to complete rendering frames without interruption
  3. Scroll Region Clearing: Changed from full screen clear (ESC[2J) to scroll region clear (ESC[J) to avoid clearing status bar area

New trace system features:

  • Capture all terminal I/O (stdin/stdout/stderr) with timing
  • Record resize events and terminal environment
  • Decode ANSI/VT100 control sequences
  • Find patterns like screen clears and resize timing issues
  • CLI: moat tty-trace analyze <file> --decode/--find-clears

Footer rendering improvements:

  • Debounce timer resets on each write, redraw only after 50ms quiet
  • Timer stopped when entering compositor mode
  • Alternative approaches documented in comments for future consideration
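
The debounce behaviour described above can be sketched in isolation. This is a minimal illustration, not moat's Writer: the `debouncer` type and the `countRedraws` helper are hypothetical names, and the callback only increments a counter where the real code would redraw the footer.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// debouncer runs fn once delay has elapsed with no further trigger calls.
// Hypothetical stand-in for the footer redraw logic.
type debouncer struct {
	mu    sync.Mutex
	delay time.Duration
	timer *time.Timer
	fn    func()
}

// trigger restarts the quiet-period countdown.
func (d *debouncer) trigger() {
	d.mu.Lock()
	defer d.mu.Unlock()
	if d.timer != nil {
		d.timer.Stop()
	}
	d.timer = time.AfterFunc(d.delay, d.fn)
}

// countRedraws simulates rapid writes spaced gapMs apart and reports
// how many times the debounced callback ran once output went quiet.
func countRedraws(writes, gapMs, delayMs int) int {
	var mu sync.Mutex
	redraws := 0
	d := &debouncer{
		delay: time.Duration(delayMs) * time.Millisecond,
		fn: func() {
			mu.Lock()
			redraws++
			mu.Unlock()
		},
	}
	for i := 0; i < writes; i++ {
		d.trigger()
		time.Sleep(time.Duration(gapMs) * time.Millisecond)
	}
	time.Sleep(3 * d.delay) // let the final timer fire
	mu.Lock()
	defer mu.Unlock()
	return redraws
}

func main() {
	fmt.Println("redraws:", countRedraws(20, 1, 50))
}
```

Because every write pushes the deadline back, a burst of writes that arrive closer together than the delay collapses into a single redraw after the burst ends.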

@dpup dpup force-pushed the feat/tty-trace-and-footer-fix branch from 419b9af to c8eae32 Compare February 1, 2026 04:42
@claude

claude Bot commented Feb 1, 2026

📦 Previous Review (superseded by newer review)

Pull Request Review: TTY Trace System and Footer Rendering Fix

This PR adds excellent debugging infrastructure and fixes a real rendering issue. The implementation is well-structured and the problem/solution are clearly documented. Here are my findings:

Code Quality ✅

Strengths:

  • Clean separation of concerns: tracing infrastructure (internal/trace/) is independent from TUI changes (internal/tui/writer.go)
  • Well-structured trace package with clear types (Trace, Recorder, Event)
  • Good use of Go idioms: io.Reader/Writer wrappers, mutex protection for concurrent access
  • Comprehensive ANSI/VT100 sequence decoder in analyze.go with proper parsing
  • Helpful CLI commands with sensible defaults

Minor suggestions:

  • The ttyTracer.save() method is called with defer but handles nil internally - consider making this pattern more explicit in documentation
  • In cmd/moat/cli/exec.go:244-246, there's redundant code where stdin is assigned escapeProxy in both branches - could simplify:
    stdin := io.Reader(escapeProxy)
    if tracer != nil {
        stdin = trace.NewRecordingReader(escapeProxy, tracer.recorder, trace.EventStdin)
    }

Potential Bugs 🐛

1. Race condition in footer debouncing (internal/tui/writer.go:545-550)
The scheduleFooterRedrawLocked() creates a timer with a callback that acquires the mutex:

w.footerTimer = time.AfterFunc(w.footerDebounceDelay, func() {
    w.mu.Lock()
    defer w.mu.Unlock()
    w.redrawFooterLocked()
})

This is generally safe, but there's a potential deadlock if Cleanup() is called while holding the lock and the timer fires. Consider checking if cleanup is in progress or add a stopped flag.

2. Timer leak potential (internal/tui/writer.go:316-323)
When entering compositor mode, the footer timer is stopped and set to nil. However, if Write() is called concurrently (after the check but before entering compositor), a new timer could be created. This is unlikely but worth documenting or adding a state check.

3. File permissions (internal/trace/format.go:192)

return os.WriteFile(path, data, 0644)

Trace files may contain sensitive data (as documented). Consider using 0600 to restrict access to owner only, or make this configurable.
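
As a rough sketch of the owner-only suggestion, assuming a Unix-like system: `saveTrace` and `savedMode` are hypothetical helpers, not functions from the PR.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// saveTrace writes data so that only the owning user can read or write it.
func saveTrace(path string, data []byte) error {
	return os.WriteFile(path, data, 0o600)
}

// savedMode writes a throwaway trace into a temp dir and returns the
// permission bits the file ended up with.
func savedMode() (os.FileMode, error) {
	dir, err := os.MkdirTemp("", "trace-demo")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "trace.json")
	if err := saveTrace(path, []byte(`{"events":[]}`)); err != nil {
		return 0, err
	}
	info, err := os.Stat(path)
	if err != nil {
		return 0, err
	}
	return info.Mode().Perm(), nil
}

func main() {
	mode, err := savedMode()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("trace file mode: %o\n", mode)
}
```

Note that os.WriteFile applies the permission argument only when it creates the file; an existing file keeps its current mode, so overwriting an old 0644 trace would not tighten it.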

Performance Considerations ⚡

1. Memory growth in long sessions
The Recorder appends all events to an in-memory slice. For long Claude Code sessions with lots of I/O, this could consume significant memory. Consider:

  • Adding a warning in the docs about memory usage
  • Implementing streaming to disk (as noted in "Future Enhancements")
  • Adding a --max-events or --max-size limit

2. Data copying overhead (internal/trace/recorder.go:43-45)
Every write allocates and copies the buffer:

dataCopy := make([]byte, len(data))
copy(dataCopy, data)

This is correct (necessary to avoid buffer reuse issues), but adds overhead. The performance impact is acceptable for debugging, but worth documenting that tracing should not be used in production.
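
To illustrate why the copy matters, here is a small sketch with a hypothetical `recorder` type (not the PR's Recorder). Callers such as io.Copy reuse their buffers, so storing the slice without copying would let later writes overwrite recorded events.

```go
package main

import (
	"fmt"
	"sync"
)

// recorder keeps a private copy of every payload it is given.
type recorder struct {
	mu     sync.Mutex
	events [][]byte
}

// record copies data before storing it, so the caller may reuse its buffer.
func (r *recorder) record(data []byte) {
	if len(data) == 0 {
		return
	}
	dataCopy := make([]byte, len(data))
	copy(dataCopy, data)

	r.mu.Lock()
	r.events = append(r.events, dataCopy)
	r.mu.Unlock()
}

// first returns the first recorded payload as a string, or "" if none.
func (r *recorder) first() string {
	r.mu.Lock()
	defer r.mu.Unlock()
	if len(r.events) == 0 {
		return ""
	}
	return string(r.events[0])
}

func main() {
	r := &recorder{}
	buf := []byte("hello")
	r.record(buf)
	copy(buf, "XXXXX") // the caller reuses its buffer for the next read
	fmt.Println(r.first())
}
```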

3. Footer debouncing efficiency
The 50ms debounce is well-chosen based on the problem description. Nice work identifying this was a multi-step rendering issue!

Security Concerns 🔒

1. Sensitive data in traces ✅ Well-documented
The docs correctly warn about API keys, passwords, etc. in traces. The planned sanitize command is a good idea.

Recommendations:

  • Add a --redact flag to the trace command that performs basic redaction (common patterns like "api_key": "...", Authorization: Bearer ...)
  • Consider adding a hash/signature to trace files to detect tampering (similar to audit logs)
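
A basic redaction pass along the lines of the --redact suggestion might look like the sketch below. The two patterns and the `redact` function are illustrative assumptions; they cover only the two examples named above and would miss many other secret formats.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Matches "Authorization: Bearer <token>" and keeps the prefix.
	bearerRe = regexp.MustCompile(`(?i)(authorization:\s*bearer\s+)\S+`)
	// Matches "api_key": "<value>" and keeps the key and quotes.
	apiKeyRe = regexp.MustCompile(`(?i)("api_key"\s*:\s*")[^"]*(")`)
)

// redact replaces the secret part of each match with a fixed marker.
func redact(s string) string {
	s = bearerRe.ReplaceAllString(s, "${1}[REDACTED]")
	s = apiKeyRe.ReplaceAllString(s, "${1}[REDACTED]${2}")
	return s
}

func main() {
	fmt.Println(redact("Authorization: Bearer abc123"))
	fmt.Println(redact(`{"api_key": "sk-123"}`))
}
```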

2. Path traversal in trace save
trace.Save(path) accepts arbitrary paths. While this is a debugging tool run by authorized users, consider:

  • Validating the path doesn't escape intended directories
  • Warning if writing to sensitive locations

Test Coverage 📊

Strengths:

  • analyze_test.go has good coverage of decoder functions
  • Tests for clear screen detection and resize issue finding
  • Round-trip save/load test

Gaps:

  • No tests for RecordingWriter/RecordingReader wrappers
  • No tests for concurrent access to Recorder
  • No integration test for the CLI commands
  • Missing tests for edge cases:
    • Empty data handling (tested in code but not in tests)
    • Very large events
    • Incomplete ANSI sequences

Recommendation: Add tests for the recording wrappers and concurrent access patterns.

TUI Writer Changes 🎨

The debouncing fix is elegant
Changing from immediate redraw to 50ms debounced redraw solves the Claude Code banner garbling without being too aggressive. The comment explaining alternative approaches (bracketed paste mode detection, whitelist) is helpful for future maintainers.

Screen clear changes (internal/tui/writer.go:123)
Changing from ESC[2J (clear entire screen) to ESC[J (clear from cursor) is the right fix. This preserves the status bar during setup. The test updates correctly reflect this change.

Documentation 📚

Excellent documentation

  • docs/tty-trace-debugging.md is comprehensive and practical
  • Good examples of diagnosing common issues
  • Clear warning about sensitive data
  • Honest about limitations and future enhancements

Minor suggestions:

  • Add a "Performance Impact" section noting the overhead of tracing
  • Document the trace file format version (for future compatibility)
  • Add an example of what a "normal" trace looks like vs a "problematic" one

Overall Assessment ✨

This is a high-quality PR that solves a real problem and adds valuable debugging infrastructure. The TTY trace system is well-designed and will be useful for diagnosing TUI issues beyond just this one case.

Recommended actions before merge:

  1. Fix the redundant stdin assignment in exec.go
  2. Consider changing trace file permissions to 0600
  3. Add a brief comment about timer lifecycle in scheduleFooterRedrawLocked()
  4. Add tests for RecordingWriter/RecordingReader

Nice to have:

  • Add memory usage warning to docs
  • Implement basic redaction for sensitive patterns
  • Add concurrent access tests

The core implementation is solid and the footer debouncing fix is well-reasoned. Great work! 🚀

@dpup dpup force-pushed the feat/tty-trace-and-footer-fix branch from c8eae32 to f864d24 Compare February 1, 2026 04:54
@claude

claude Bot commented Feb 1, 2026

📦 Previous Review (superseded by newer review)

Pull Request Review: TTY Trace System and Footer Rendering Fix

This PR adds a comprehensive terminal I/O tracing system and fixes TUI rendering issues through debounced footer redraws. The implementation is well-structured and addresses a real problem.

✅ Strengths

Architecture & Design:

  • Clean separation of concerns with dedicated internal/trace package
  • Well-defined API with Recorder, RecordingWriter, and RecordingReader wrappers
  • Structured trace format with metadata and timestamped events
  • Comprehensive ANSI/VT100 sequence decoding in analyze.go

Documentation:

  • Excellent documentation in docs/tty-trace-debugging.md with practical examples
  • Clear inline comments explaining the debouncing strategy and alternatives
  • Security warning about sensitive data in traces (good practice)

Testing:

  • Unit tests cover core functionality (analyze_test.go)
  • Tests for save/load, decoding, and issue detection
  • Updated existing tests to match new behavior (ESC[J vs ESC[2J)

Code Quality:

  • Follows Go conventions and project style
  • Thread-safe with proper mutex usage in Recorder
  • Proper resource cleanup with defer patterns

🔍 Issues & Suggestions

1. Potential Nil Pointer Dereference (Medium severity)

cmd/moat/cli/exec.go:224 and cmd/moat/cli/attach.go:60

tracer := setupTTYTracer(tracePath, r, command)
defer tracer.save()  // ← tracer can be nil!

When tracing is disabled, setupTTYTracer returns nil, but defer tracer.save() is called unconditionally. While save() has a nil check, this pattern is fragile.

Fix: Add nil check before defer:

if tracer != nil {
    defer tracer.save()
}
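
For context, calling a method on a nil pointer is well-defined in Go as long as the method checks the receiver before dereferencing it. The sketch below uses hypothetical versions of `ttyTracer`, `save`, and a `setup` helper to show that the unconditional defer works, which is why the issue is one of fragility rather than an immediate crash.

```go
package main

import "fmt"

// ttyTracer is a hypothetical stand-in for the PR's tracer type.
type ttyTracer struct{ path string }

// save reports whether a trace would be written; a nil tracer is a no-op.
func (t *ttyTracer) save() bool {
	if t == nil {
		return false
	}
	// A real implementation would write the trace to t.path here.
	return true
}

// setup returns nil when tracing is disabled (empty path).
func setup(path string) *ttyTracer {
	if path == "" {
		return nil
	}
	return &ttyTracer{path: path}
}

func main() {
	fmt.Println(setup("").save())           // tracing disabled
	fmt.Println(setup("trace.json").save()) // tracing enabled
}
```

The explicit `if tracer != nil` guard at the call site stays correct even if someone later removes the nil check inside save().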

2. Data Race in Footer Timer (Low-Medium severity)

internal/tui/writer.go:559-577

The scheduleFooterRedrawLocked() spawns a goroutine via time.AfterFunc that acquires the mutex. If Cleanup() is called concurrently, there's a small window where the timer could fire after cleanup, potentially writing to a closed output.

Mitigation: Add a closed bool field to check in the timer callback:

w.footerTimer = time.AfterFunc(w.footerDebounceDelay, func() {
    w.mu.Lock()
    defer w.mu.Unlock()
    if w.closed {
        return
    }
    w.redrawFooterLocked()
})

3. Redundant else Clause (Low severity)

cmd/moat/cli/attach.go:78-84

var stdin io.Reader = escapeProxy
if tracer != nil {
    stdin = trace.NewRecordingReader(escapeProxy, tracer.recorder, trace.EventStdin)
} else {
    stdin = escapeProxy  // ← Already assigned on line 78
}

Fix: Remove the else clause.

4. Large Trace Files (Low severity, acknowledged in docs)

Consider documenting typical file sizes and adding compression support for long sessions.

5. Test Coverage Gaps

  • No tests for the debouncing behavior itself
  • No tests for concurrent footer timer + cleanup
  • No integration test showing the Claude Code banner fix working

🔒 Security Considerations

✅ Good:

  • File permissions set to 0600 (user-only)
  • Clear warning about sensitive data in documentation

Consider:

  • Add moat tty-trace sanitize command mentioned in TODO
  • Add .moat-trace to .gitignore examples

📊 Performance Impact

Overhead should be minimal. Main concern is memory usage for very long sessions since trace events accumulate in RAM until save. Consider streaming to disk.

📝 Documentation Follow-ups

Per CLAUDE.md guidelines, update:

  • docs/content/reference/01-cli.md - Add --tty-trace flag documentation
  • docs/content/guides/ - Add debugging guide reference

Minor Nits

  1. internal/trace/analyze.go:734 - "37..." truncation is a magic number
  2. internal/tui/writer.go:1367 - Comment shows "(2)" and "(3)" but "(1)" is missing
  3. Inconsistent naming: tty-trace-debugging.md vs ttytrace.go

Verdict

Approve with minor fixes recommended. This is a solid addition that solves a real problem and adds valuable debugging infrastructure.

Recommended fixes before merge:

  1. Fix nil pointer dereference issue (item 1 above) - blocking
  2. Add closed check to timer callback (item 2 above) - blocking for safety
  3. Remove redundant else clause (item 3 above) - non-blocking cleanup

Excellent work overall!

@dpup dpup force-pushed the feat/tty-trace-and-footer-fix branch from f864d24 to 3eb780e Compare February 1, 2026 05:00
@claude

claude Bot commented Feb 1, 2026

📦 Previous Review (superseded by newer review)

Pull Request Review: TTY Trace System and Footer Rendering Fix

This is a well-engineered solution to a tricky TUI rendering problem. The PR adds comprehensive terminal I/O tracing infrastructure and fixes the Claude Code banner garbling issue through debounced footer redraws.

Overall Assessment

Strengths:

  • Excellent problem identification and root cause analysis
  • Clean, well-tested implementation with good separation of concerns
  • Comprehensive documentation with practical examples
  • Proper concurrency handling with mutexes
  • Thoughtful debounce approach to avoid interrupting rendering sequences

Rating: ✅ Approve with minor suggestions


Code Quality & Best Practices

1. Trace Package (internal/trace/)

Strengths:

  • Clean API design with recorder/writer/reader pattern
  • Proper data copying to avoid buffer reuse issues (recorder.go:44-45, 128)
  • Good test coverage including edge cases (data isolation tests)
  • Comprehensive ANSI/VT100 sequence decoding

Suggestions:

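  • analyze.go:67: Control character decoding with ^%c only makes sense for bytes below 32; DEL (127) and bytes above 127 need their own handling. Consider guarding with data[i] < 32 (a byte is unsigned, so an extra >= 0 check adds nothing)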
  • recorder.go:36-37: Early return for empty data is good, but consider documenting this behavior (e.g., EOF events won't be recorded)
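
A caret-notation decoder that handles the edge cases explicitly could look like this sketch. The `caret` function is hypothetical and is not the code in analyze.go; it maps C0 control bytes to ^@ through ^_, DEL to ^?, and bytes above 127 to a hex escape.

```go
package main

import "fmt"

// caret renders a single byte in caret notation where that applies.
func caret(b byte) string {
	switch {
	case b < 32:
		// 0x00..0x1f map to '@'..'_' by adding 64.
		return "^" + string(rune(b+64))
	case b == 127:
		return "^?"
	case b < 127:
		return string(rune(b)) // printable ASCII
	default:
		return fmt.Sprintf("\\x%02x", b) // not ASCII, show as hex
	}
}

func main() {
	for _, b := range []byte{0x03, 0x1b, 'A', 0x7f, 0xff} {
		fmt.Println(caret(b))
	}
}
```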

2. TUI Writer Updates (internal/tui/writer.go)

Strengths:

  • Well-documented debounce rationale with alternative approaches (lines 72-77)
  • Proper timer cleanup in multiple exit paths (Cleanup, enterCompositorLocked)
  • Changed from ESC[2J (full clear) to ESC[J (scroll region clear) - smart optimization

Concerns:

  • writer.go:573-577: The debounce timer callback captures w and locks the mutex. If Cleanup() is called while the timer is pending but hasn't fired, there's a potential race:

    • Goroutine A: Cleanup() acquires lock, stops timer
    • Goroutine B: Timer fires just before Stop(), callback blocks waiting for lock
    • Goroutine A: Releases lock, cleans up state
    • Goroutine B: Acquires lock, calls redrawFooterLocked() on cleaned-up state

    Mitigation: The code checks !w.altScreen (line 187) before scheduling, and the callback should be safe even if called after cleanup since it just writes to w.out. However, consider adding a nil check in redrawFooterLocked() or setting a "closed" flag to be more defensive.

  • writer.go:97: 50ms debounce delay is hardcoded. Consider making this configurable or at least document why 50ms was chosen (testing showed it works for Claude Code's rendering?).

3. CLI Integration (cmd/moat/cli/)

Strengths:

  • Consistent integration pattern across attach.go and exec.go
  • Proper defer cleanup with tracer.save()
  • Resize events captured for tracing

Minor Issues:

  • attach.go:77-82: Redundant stdin assignment:
    var stdin io.Reader = escapeProxy
    if tracer != nil {
        stdin = trace.NewRecordingReader(escapeProxy, tracer.recorder, trace.EventStdin)
    } else {
        stdin = escapeProxy  // redundant: already assigned above
    }
    Simplify to:
    stdin := io.Reader(escapeProxy)
    if tracer != nil {
        stdin = trace.NewRecordingReader(escapeProxy, tracer.recorder, trace.EventStdin)
    }

Performance Considerations

  1. Trace File Size: Recording every byte with nanosecond timestamps will create large files for long sessions. Consider:

    • Adding size limits or rotation
    • Documenting expected trace sizes (e.g., "~1MB per minute of active terminal usage")
  2. Memory Allocation: recorder.go:44-45 copies data on every Write(). For high-throughput scenarios (e.g., streaming logs), this could cause GC pressure. Current implementation is fine for typical TUI usage.

  3. Footer Debounce Timer: Creates a new timer on every stdout write (writer.go:573). For busy output, this could create many short-lived timers. Consider:

    • Reusing the timer with Reset() instead of creating new ones
    • The current implementation is simpler and correct though, so this is optional
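
The Reset() variant mentioned in item 3 can be sketched as follows. The `footer` type and `redrawsWithReset` helper are hypothetical; the sketch relies on the documented behaviour that Reset on a timer created by time.AfterFunc reschedules the same callback.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// footer keeps one timer for its whole lifetime and reschedules it.
type footer struct {
	mu      sync.Mutex
	delay   time.Duration
	timer   *time.Timer
	redraws int
}

// schedule creates the timer on first use and resets it afterwards.
func (f *footer) schedule() {
	f.mu.Lock()
	defer f.mu.Unlock()
	if f.timer == nil {
		f.timer = time.AfterFunc(f.delay, f.redraw)
		return
	}
	f.timer.Reset(f.delay)
}

// redraw runs on the timer's goroutine; here it only counts invocations.
func (f *footer) redraw() {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.redraws++
}

// redrawsWithReset simulates a burst of writes and returns the redraw count.
func redrawsWithReset(writes, gapMs, delayMs int) int {
	f := &footer{delay: time.Duration(delayMs) * time.Millisecond}
	for i := 0; i < writes; i++ {
		f.schedule()
		time.Sleep(time.Duration(gapMs) * time.Millisecond)
	}
	time.Sleep(3 * f.delay)
	f.mu.Lock()
	defer f.mu.Unlock()
	return f.redraws
}

func main() {
	fmt.Println("redraws:", redrawsWithReset(20, 1, 50))
}
```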

Security Considerations

Excellent awareness:

  • Documentation warns about sensitive data in traces (tty-trace-debugging.md:185-193)
  • File permissions set to 0600 for trace files (format.go:1249)

Suggestions:

  • Consider adding runtime warnings when trace files are created (currently only logged, not shown to user unless they check logs)
  • The setupTTYTracer function prints the path (exec.go:179) - good!

Test Coverage

Strengths:

  • Comprehensive unit tests for recorder, analyze, and writer
  • Data isolation tests ensure buffer safety
  • Edge cases covered (EOF, empty writes, buffer reuse)

Missing Tests:

  • No integration test verifying actual Claude Code rendering works
  • No test for the debounce behavior (hard to unit test, but could mock time)
  • No test for concurrent access to Recorder (though mutex is present)

Recommendation: Add a comment documenting that the 50ms debounce was validated manually with Claude Code, since it's hard to automate.


Documentation

Excellent work:

  • tty-trace-debugging.md is thorough and practical
  • Good examples for common debugging scenarios
  • Architecture comments in code explain the "why"

Minor Suggestions:

  • Document the 50ms debounce choice in comments (was it empirically tested?)
  • Add a note about trace file sizes in the docs
  • Consider adding a "Known Limitations" section about what types of TUI issues the trace system won't help with

Specific Code Comments

internal/trace/analyze.go:263-264

func containsClearScreen(data []byte) bool {
    return bytes.Contains(data, []byte("\x1b[2J")) || bytes.Contains(data, []byte("\x1b[3J"))
}

Good, but this only detects full clears. The scroll region clear (ESC[J) used in the fix won't be caught. Consider expanding this or adding FindScrollClear() if you want to trace the new behavior.
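
One way to widen the detection is to match the whole Erase in Display family and report which variant was seen. This is a sketch with a hypothetical `classifyClears` function; parameter 0 (or none) erases from the cursor to the end of the screen, 1 from the start to the cursor, 2 the entire screen, and 3 the scrollback on terminals that support it.

```go
package main

import (
	"fmt"
	"regexp"
)

// eraseDisplay matches ESC [ <n> J with n omitted or in 0..3.
var eraseDisplay = regexp.MustCompile(`\x1b\[([0-3]?)J`)

// classifyClears returns the parameter of each erase sequence in data,
// normalising an omitted parameter to "0".
func classifyClears(data []byte) []string {
	var out []string
	for _, m := range eraseDisplay.FindAllSubmatch(data, -1) {
		p := string(m[1])
		if p == "" {
			p = "0"
		}
		out = append(out, p)
	}
	return out
}

func main() {
	data := []byte("\x1b[H\x1b[Jhello\x1b[2J\x1b[3J")
	fmt.Println(classifyClears(data))
}
```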

internal/tui/writer.go:128

buf.WriteString("\x1b[H\x1b[J")

Nice fix! This is much better than the previous full screen clear. The comment explains it well.

cmd/moat/cli/exec.go:156-159

command := []string{"(attach to " + r.Name + ")"}

Placeholder command for attach is fine, but consider storing the original command in the Run metadata if available.


Potential Bugs

  1. Timer race in writer.go (mentioned above) - low probability but worth addressing
  2. analyze.go:67 - potential byte overflow for DEL character (127)

Suggestions for Future PRs

  1. Make debounce delay configurable via environment variable or flag
  2. Add trace file rotation/size limits
  3. Implement the moat tty-trace sanitize command mentioned in TODO
  4. Consider adding --tty-trace-buffer-size flag to limit memory usage
  5. Add integration test that spawns Claude Code and verifies no garbling

Testing Recommendations

Since tests require approval, please verify:

  1. go test ./internal/trace/... passes
  2. go test ./internal/tui/... passes
  3. Manual test: moat claude --tty-trace=test.json produces clean banner
  4. Manual test: trace file is created and analyzable

Conclusion

This is high-quality work that solves a real problem with a well-designed solution. The trace infrastructure is valuable for future debugging, and the debounce fix is a clever approach to the rendering issue.

The code follows Go best practices, has good test coverage, and includes excellent documentation. My suggestions are mostly minor improvements and defensive programming practices.

Recommendation: Merge after addressing the redundant stdin assignment and considering the timer race condition mitigation.

Great job! 🎉

@dpup dpup force-pushed the feat/tty-trace-and-footer-fix branch from 3eb780e to bdfac9e Compare February 1, 2026 05:04
@claude

claude Bot commented Feb 1, 2026

📦 Previous Review (superseded by newer review)

Pull Request Review: TTY Trace System and Footer Rendering Fix

This PR adds valuable debugging infrastructure and fixes a concrete TUI rendering issue. The implementation is well-designed with good separation of concerns. Here's my detailed feedback:


Strengths

1. Excellent Problem Definition
The PR clearly identifies the root cause (immediate footer redraws interrupting Claude Code's multi-step rendering) and provides a well-reasoned solution.

2. Strong Test Coverage

  • The trace package has comprehensive unit tests (recorder_test.go: 283 lines, analyze_test.go: 171 lines)
  • Tests cover edge cases like buffer reuse, data isolation, EOF handling, and concurrent access
  • Good use of table-driven tests for decode functionality

3. Clean Architecture

  • Trace system is properly isolated in internal/trace/ with clear separation between recording, storage, and analysis
  • Recording wrappers (RecordingWriter, RecordingReader) use composition cleanly
  • Mutex protection is correctly applied throughout the recorder

4. Good Documentation

  • docs/tty-trace-debugging.md provides practical examples and troubleshooting guides
  • Code comments explain the "why" behind the debouncing approach
  • Alternative approaches are documented inline for future consideration

Issues & Concerns

Critical: Potential Timer Leak in scheduleFooterRedrawLocked

Location: internal/tui/writer.go:566-578

The debouncing implementation has a subtle resource leak:

func (w *Writer) scheduleFooterRedrawLocked() {
    if w.footerTimer != nil {
        w.footerTimer.Stop()  // ⚠️ Doesn't wait for a callback that has already started
    }
    
    w.footerTimer = time.AfterFunc(w.footerDebounceDelay, func() {
        w.mu.Lock()
        defer w.mu.Unlock()
        w.redrawFooterLocked()
    })
}

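Problem: According to the Go documentation, Timer.Stop() returns false if the timer has already expired or been stopped. For a timer created with time.AfterFunc, that means the callback may already be running in its own goroutine, and Stop() does not wait for it. In high-frequency write scenarios, several such callbacks can be pending at once, and each will acquire the mutex and redraw the footer in turn.

Recommended Fix: Timers created with time.AfterFunc have a nil C channel, so there is nothing to drain. One option is a generation counter (footerGen below is an illustrative field, not part of the PR) so that callbacks from superseded timers return without redrawing:

func (w *Writer) scheduleFooterRedrawLocked() {
    if w.footerTimer != nil {
        w.footerTimer.Stop() // may return false if the callback already started
    }

    // Each schedule bumps the generation; stale callbacks see a mismatch.
    w.footerGen++
    gen := w.footerGen
    w.footerTimer = time.AfterFunc(w.footerDebounceDelay, func() {
        w.mu.Lock()
        defer w.mu.Unlock()
        if gen != w.footerGen {
            return // superseded by a newer write
        }
        w.redrawFooterLocked()
    })
}

Alternatively, keep a single timer and call Reset() on it instead of creating a new one for each write.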


Medium: Unbounded Memory Growth in Trace Events

Location: internal/trace/recorder.go:47

r.trace.Events = append(r.trace.Events, Event{...})

Problem: For long-running sessions with high I/O volume, the events slice can grow unbounded. There's no cap, rotation, or memory limit.

Impact: A 10-minute Claude Code session with 1KB/sec output generates ~600KB of trace data, which is acceptable. But a full development session could hit memory limits.

Recommendations:

  1. Document expected memory usage in the trace system
  2. Consider adding a max events limit with rotation (keep last N events)
  3. Or implement streaming writes to disk instead of in-memory accumulation
  4. Add a warning if trace size exceeds a threshold (e.g., 10MB)
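
Recommendation 2 (keep the last N events) could be sketched with a fixed-size ring buffer. The `ring` type below is hypothetical and stores strings for brevity; a real recorder would store its event struct and guard access with the existing mutex.

```go
package main

import "fmt"

// ring keeps at most len(buf) events, overwriting the oldest when full.
type ring struct {
	buf     []string
	head    int // index of the oldest event
	n       int // number of events currently stored
	dropped int // events overwritten so far
}

// newRing allocates a ring holding up to max events; max must be positive.
func newRing(max int) *ring {
	return &ring{buf: make([]string, max)}
}

// add appends an event, evicting the oldest one if the ring is full.
func (r *ring) add(e string) {
	if r.n == len(r.buf) {
		r.buf[r.head] = e
		r.head = (r.head + 1) % len(r.buf)
		r.dropped++
		return
	}
	r.buf[(r.head+r.n)%len(r.buf)] = e
	r.n++
}

// snapshot returns the stored events from oldest to newest.
func (r *ring) snapshot() []string {
	out := make([]string, 0, r.n)
	for i := 0; i < r.n; i++ {
		out = append(out, r.buf[(r.head+i)%len(r.buf)])
	}
	return out
}

func main() {
	r := newRing(3)
	for _, e := range []string{"a", "b", "c", "d", "e"} {
		r.add(e)
	}
	fmt.Println(r.snapshot(), "dropped:", r.dropped)
}
```

Recording the dropped count lets the saved trace state that it is incomplete, which matters when diagnosing a problem that occurred early in a long session.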

Medium: Data Race Risk in ttyTracer.save()

Location: cmd/moat/cli/exec.go:186

defer tracer.save()

Problem: The save() method is called via defer, but there's no guarantee that all goroutines writing to the recorder have finished. If resize events or I/O is still being recorded in background goroutines, you could have a data race.

Current Mitigation: The mutex in Recorder prevents corruption, but you might miss late events.

Recommendation: Add a Close() or Flush() method to the recorder that signals "no more events" and ensures all recording wrappers have finished before saving.


Low: Hardcoded 50ms Debounce Delay

Location: internal/tui/writer.go:720

footerDebounceDelay: 50 * time.Millisecond,

Observation: The 50ms delay works for Claude Code's banner rendering, but might be too aggressive for slower terminal emulators or high-latency SSH sessions.

Recommendation: Consider making this configurable via an environment variable (MOAT_FOOTER_DEBOUNCE_MS) or exposing it as a flag for advanced users. The default of 50ms is reasonable for most cases.
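
A sketch of the environment-variable approach, assuming the MOAT_FOOTER_DEBOUNCE_MS name proposed above (it does not exist in the PR). The hypothetical `debounceFromEnv` helper takes the raw value as a parameter so the fallback logic is easy to test.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// debounceFromEnv parses a millisecond value and falls back to 50ms
// when the value is empty, not a number, or not positive.
func debounceFromEnv(raw string) time.Duration {
	const defaultDelay = 50 * time.Millisecond
	ms, err := strconv.Atoi(raw)
	if err != nil || ms <= 0 {
		return defaultDelay
	}
	return time.Duration(ms) * time.Millisecond
}

func main() {
	delay := debounceFromEnv(os.Getenv("MOAT_FOOTER_DEBOUNCE_MS"))
	fmt.Println("footer debounce delay:", delay)
}
```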


Low: Missing Error Handling in redrawFooterLocked

Location: internal/tui/writer.go:592

w.out.Write(buf.Bytes())  //nolint:errcheck

Problem: Write errors are silently ignored. If stdout is broken (pipe closed, terminal disconnected), this could mask issues.

Recommendation: Either log the error or propagate it. At minimum, consider:

if _, err := w.out.Write(buf.Bytes()); err != nil {
    // Terminal disconnected - stop trying to redraw footer
    if w.footerTimer != nil {
        w.footerTimer.Stop()
        w.footerTimer = nil
    }
}

Code Quality Observations

Good Practices:

  • ✅ Data copying in AddEvent prevents buffer reuse bugs (recorder.go:44-45)
  • ✅ Nil checks before timer operations (writer.go:568, writer.go:785)
  • ✅ Proper use of defer for mutex unlocks
  • ✅ Base64 encoding of binary data in JSON (implicit via []byte marshaling)
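
The last point follows from how encoding/json treats []byte: it is marshaled as a base64 string and decoded back on unmarshal, so raw escape sequences survive a JSON round trip. The `event` struct below is a hypothetical shape, not the PR's actual trace format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// event is an illustrative trace record with a binary payload.
type event struct {
	Type string `json:"type"`
	Data []byte `json:"data"`
}

// encode marshals an event to JSON, returning "" on error.
func encode(e event) string {
	b, err := json.Marshal(e)
	if err != nil {
		return ""
	}
	return string(b)
}

// decode parses JSON produced by encode.
func decode(s string) (event, error) {
	var e event
	err := json.Unmarshal([]byte(s), &e)
	return e, err
}

func main() {
	s := encode(event{Type: "stdout", Data: []byte("\x1b[2J")})
	fmt.Println(s)

	e, err := decode(s)
	fmt.Printf("%q %v\n", e.Data, err)
}
```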

Minor Improvements:

  1. CLI Help Text: The --tty-trace flag description could mention the warning about sensitive data capture
  2. Test Coverage: Add integration test for the debouncing behavior (verify footer isn't redrawn during rapid writes)
  3. Documentation: Add a section in tty-trace-debugging.md about performance impact of tracing

Security Considerations

Sensitive Data Exposure:
The trace captures ALL terminal I/O, including:

  • API keys typed or echoed
  • Passwords (even if not echoed, they're in stdin)
  • Environment variables printed to stdout
  • File contents displayed

Mitigations:

  • Good: Documentation warns about this (docs/tty-trace-debugging.md:593-602)
  • Good: File permissions are restrictive (0600 in format.go:1247)
  • Future: The mentioned moat tty-trace sanitize command would be very valuable

Recommendation: Add a prominent warning when --tty-trace is used:

fmt.Printf("⚠️  WARNING: Trace will capture ALL terminal I/O including sensitive data.\n")
fmt.Printf("   Do not share traces without sanitizing them first.\n")

Performance Considerations

  1. I/O Overhead: Each write incurs a data copy (recorder.go:44). For high-throughput applications this could add latency. Consider benchmarking with/without tracing.

  2. Mutex Contention: Every write acquires the recorder mutex. For applications with concurrent I/O from multiple goroutines, this could become a bottleneck.

  3. Timer Churn: In high-frequency write scenarios, you're creating/stopping many timers. This is generally fine for Go's runtime, but worth profiling if you see CPU overhead.


Testing Recommendations

Integration Tests Needed:

  1. Test the full flow: enable tracing → run interactive session → verify trace contents
  2. Test debouncing behavior: rapid writes should only trigger one footer redraw after quiet period
  3. Test cleanup: verify trace files are written even if session crashes

Suggested Test:

func TestFooterDebouncing(t *testing.T) {
    // bar is assumed to be the status bar fixture the existing writer tests construct
    var buf bytes.Buffer

    // Create writer with short debounce delay for testing
    w := NewWriter(&buf, bar, "docker")
    defer w.Cleanup()
    w.footerDebounceDelay = 10 * time.Millisecond
    
    // Simulate rapid writes
    for i := 0; i < 100; i++ {
        w.Write([]byte("data\n"))
        time.Sleep(1 * time.Millisecond)
    }
    
    // Wait for debounce to settle
    time.Sleep(20 * time.Millisecond)
    
    // Verify footer was redrawn only once (not 100 times)
    // Implementation would require tracking redraw count
}

Documentation Accuracy

I verified several claims in the documentation against the code:

Correct: Trace format matches documented JSON structure (format.go matches docs/tty-trace-debugging.md:467)
Correct: Event types are accurately listed (EventStdout, EventStderr, EventStdin, EventResize, EventSignal)
Correct: File permissions are 0600 as implied by security section
Correct: CLI flags match documented examples


Breaking Changes / Compatibility

None. This is purely additive:

  • New --tty-trace flag is optional
  • New moat tty-trace subcommand doesn't affect existing commands
  • Footer debouncing is internal behavior change (fix, not breaking)

Overall Assessment

Recommendation: Approve with minor fixes

This is a well-designed feature that solves a real problem (Claude Code banner garbling) while adding valuable debugging capabilities. The trace system is properly architected and well-tested.

Must Fix:

  • Timer leak in scheduleFooterRedrawLocked (see Critical issue above)

Should Fix:

  • Add memory bounds or documentation for trace size limits
  • Add warning message when enabling trace capture
  • Consider adding a Close() method to ensure trace completeness

Nice to Have:

  • Make debounce delay configurable
  • Add integration test for debouncing behavior
  • Handle write errors in redrawFooterLocked

The footer debouncing fix is elegant and the trace system will be invaluable for debugging TUI issues in the future. Great work! 🎉


References

Add comprehensive terminal I/O tracing infrastructure for debugging TUI
rendering issues. Fix Claude Code banner garbling by implementing
debounced footer redraws.

Problem:
Claude Code's banner rendered with garbled/overlapping text when running
inside moat. The issue was caused by moat redrawing the status bar footer
after every stdout write, which interrupted Claude's multi-step rendering
sequences wrapped in ESC[?2026h/l brackets.

Solution:
1. TTY Trace System: Added infrastructure to capture, analyze, and decode
   terminal I/O with nanosecond timing for debugging TUI issues
2. Footer Debouncing: Changed from immediate footer redraw to debounced
   redraw with 50ms delay, allowing child process to complete rendering
   frames without interruption
3. Scroll Region Clearing: Changed from full screen clear (ESC[2J) to
   scroll region clear (ESC[J) to avoid clearing status bar area

New trace system features:
- Capture all terminal I/O (stdin/stdout/stderr) with timing
- Record resize events and terminal environment
- Decode ANSI/VT100 control sequences
- Find patterns like screen clears and resize timing issues
- CLI: moat tty-trace analyze <file> --decode/--find-clears

Footer rendering improvements:
- Debounce timer resets on each write, redraw only after 50ms quiet
- Timer stopped when entering compositor mode
- Alternative approaches documented in comments for future consideration

Testing:
- All existing tests pass with updated expectations
- New trace analysis tests verify control sequence decoding
- Verified fix with actual Claude Code session

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@dpup dpup force-pushed the feat/tty-trace-and-footer-fix branch from bdfac9e to 7912ee0 Compare February 1, 2026 05:16
@claude

claude Bot commented Feb 1, 2026

Code Review: TTY Trace System and Footer Rendering Fix

This PR adds comprehensive terminal I/O tracing infrastructure and fixes Claude Code banner garbling through debounced footer redraws. The implementation is well-designed with strong test coverage.


Overall Assessment

Strengths:

  • Well-architected solution that addresses a real rendering problem
  • Comprehensive test coverage for new trace infrastructure
  • Good documentation with practical examples
  • Clean separation of concerns between recording, analysis, and CLI

Areas for improvement:

  • Minor concurrency and resource cleanup issues
  • Some edge cases in the debouncing logic
  • Documentation could be more precise about security implications

Code Quality & Best Practices

Excellent Structure

The trace package is well-organized with clear separation:

  • recorder.go - I/O capture
  • analyze.go - Analysis and decoding
  • format.go - Data structures and serialization

Comprehensive Testing

Strong test coverage including:

  • Unit tests for all major functionality
  • Edge cases (empty writes, EOF, buffer reuse)
  • Data isolation tests to catch buffer aliasing bugs

⚠️ Test Cleanup Pattern

All TUI writer tests now call w.Cleanup(), which is good, but the pattern could be more robust:

// Consider using defer for guaranteed cleanup:
defer w.Cleanup()

This ensures cleanup happens even if tests fail mid-execution.


Potential Bugs & Issues

🐛 Race Condition in Footer Debouncing (internal/tui/writer.go:1807)

w.footerTimer = time.AfterFunc(w.footerDebounceDelay, func() {
    w.mu.Lock()
    defer w.mu.Unlock()
    w.redrawFooterLocked()
})

Issue: If Cleanup() is called concurrently with the timer firing, there's a potential race:

  1. The timer fires and its callback blocks waiting for the lock
  2. Cleanup() stops the timer, but Stop() does not wait for a callback that has already started
  3. The callback acquires the lock after cleanup and may write to closed output

Recommendation:
Add a flag to track cleanup state:

type Writer struct {
    // ...
    closed bool
}

// In scheduleFooterRedrawLocked:
w.footerTimer = time.AfterFunc(w.footerDebounceDelay, func() {
    w.mu.Lock()
    defer w.mu.Unlock()
    if !w.closed {
        w.redrawFooterLocked()
    }
})

// In Cleanup:
w.closed = true

🐛 Nil Pointer Check Pattern (cmd/moat/cli/exec.go:185)

func (t *ttyTracer) save() {
    if t == nil || t.recorder == nil {
        return
    }
    // ...
}

While this defensive nil check works, it's called via defer tracer.save() where tracer could be nil. This is safe but unconventional. Consider:

// More idiomatic Go pattern:
if tracer := setupTTYTracer(...); tracer != nil {
    defer tracer.save()
}

This makes the nil case explicit at the call site.

⚠️ Potential Memory Growth (internal/trace/recorder.go:47)

r.trace.Events = append(r.trace.Events, Event{...})

For long-running interactive sessions, unbounded event collection could consume significant memory. Consider:

  • Adding a max events limit
  • Streaming to disk periodically
  • Documenting memory implications in the guide
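
A rough sketch of the first option, assuming a simplified Event type and a hypothetical boundedLog wrapper (neither is the actual type in internal/trace). It drops the oldest event once the cap is reached and counts how many were discarded:

```go
package main

import "fmt"

// Event is a simplified, hypothetical stand-in for the trace event type.
type Event struct {
	Seq  int
	Data []byte
}

// boundedLog keeps at most max events, discarding the oldest first.
// A max of zero means unbounded.
type boundedLog struct {
	max     int
	events  []Event
	dropped int
}

// add appends e, evicting the oldest event if the cap has been reached.
func (b *boundedLog) add(e Event) {
	if b.max > 0 && len(b.events) >= b.max {
		// Shift left by one to drop the oldest entry. This is O(n) per
		// eviction; a ring buffer would avoid the copy.
		copy(b.events, b.events[1:])
		b.events = b.events[:len(b.events)-1]
		b.dropped++
	}
	b.events = append(b.events, e)
}

func main() {
	buf := &boundedLog{max: 3}
	for i := 1; i <= 5; i++ {
		buf.add(Event{Seq: i})
	}
	// Prints: 3 2 3 (three events kept, two dropped, oldest kept is Seq 3)
	fmt.Println(len(buf.events), buf.dropped, buf.events[0].Seq)
}
```

Recording the dropped count in the saved trace would let the analyzer flag that the start of the session is missing.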

Performance Considerations

Good: Data Copying

dataCopy := make([]byte, len(data))
copy(dataCopy, data)

Correctly copies data to avoid aliasing issues. Tests verify this works.

Good: Empty Data Check

if len(data) == 0 {
    return
}

Avoids recording empty events.

⚠️ Debounce Delay Choice

The 50ms debounce delay is reasonable but:

  • Could be configurable for different use cases
  • May feel sluggish on very fast terminals
  • Documentation should explain the tradeoff

Security Concerns

🔒 Sensitive Data in Traces (docs/tty-trace-debugging.md:593)

Good: Documentation warns about sensitive data:

"Warning: Traces contain all terminal I/O, including potentially sensitive data"

Concern:

  • Traces capture all I/O including passwords, API keys, etc.
  • File permissions set to 0600 (user-only read/write) which is good
  • But no built-in redaction or sanitization

Recommendations:

  1. Add more prominent warnings when trace is enabled
  2. Consider auto-detecting common secret patterns (API keys, tokens)
  3. Implement the TODO for moat tty-trace sanitize as high priority
  4. Add example showing how to check trace contents safely
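
As an illustration of recommendation 2, a minimal redaction sketch. The redact function and the pattern list are assumptions for illustration only; the two patterns match the commonly documented shapes of AWS access key IDs and GitHub personal access tokens and are nowhere near exhaustive:

```go
package main

import (
	"fmt"
	"regexp"
)

// secretPatterns is an illustrative, deliberately incomplete list.
var secretPatterns = []*regexp.Regexp{
	regexp.MustCompile(`AKIA[0-9A-Z]{16}`),    // AWS access key ID shape
	regexp.MustCompile(`ghp_[A-Za-z0-9]{36}`), // GitHub personal access token shape
}

// redact returns a copy of data with every match replaced by a marker,
// along with the number of replacements made.
func redact(data []byte) ([]byte, int) {
	n := 0
	out := data
	for _, re := range secretPatterns {
		out = re.ReplaceAllFunc(out, func(_ []byte) []byte {
			n++
			return []byte("[REDACTED]")
		})
	}
	return out, n
}

func main() {
	in := []byte("export AWS_KEY=AKIAABCDEFGHIJKLMNOP\n")
	out, n := redact(in)
	fmt.Printf("%d match(es): %s", n, out)
}
```

Pattern matching only catches secrets with a recognizable shape. Typed passwords would pass through untouched, so the warning in the guide still matters.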

🔒 Trace File Permissions (internal/trace/format.go:1247)

return os.WriteFile(path, data, 0600)

✅ Correct - restricts access to owner only.


ANSI Decoder Quality

Comprehensive Coverage

The ANSI/VT100 decoder handles:

  • CSI sequences (cursor movement, clear, SGR)
  • OSC sequences (operating system commands)
  • Character set selection
  • Save/restore cursor

Good Error Handling

if i >= len(data) {
    return "ESC[ (incomplete CSI)", len(data)
}

Gracefully handles incomplete sequences.

💡 Enhancement Opportunity

The decoder is quite complete, but consider adding:

  • DCS (Device Control String) sequences
  • APC (Application Program Command) sequences
  • More detailed OSC parsing (window title changes, etc.)

These aren't critical but would make the tool more comprehensive.


Documentation Quality

Excellent User Guide

The tty-trace-debugging.md is well-written with:

  • Clear examples
  • Practical troubleshooting workflows
  • Common issue patterns

Good Code Comments

// Footer redraw debouncing for scroll mode
// Redraws the footer only after a quiet period to avoid interrupting
// multi-step rendering sequences from the child process.

💡 Minor Improvements

  1. CLI help text (cmd/moat/cli/ttytrace.go:294): Add example output formats
  2. Package documentation: Add package-level godoc for internal/trace
  3. Security section: Expand the security implications section in the guide

Testing Coverage

Strong Test Suite

  • recorder_test.go: 283 lines, covers I/O wrapping thoroughly
  • analyze_test.go: 171 lines, tests decoding and analysis
  • writer_test.go: Updated tests all include cleanup

⚠️ Missing Test Cases

  1. Concurrent access: Test multiple goroutines calling AddEvent simultaneously
  2. Timer cleanup: Test that footer timer is properly stopped on Cleanup
  3. Large traces: Test behavior with very large event streams
  4. Integration test: End-to-end test capturing and analyzing a real trace

Specific File Reviews

internal/tui/writer.go

Line 97: Good choice of 50ms debounce delay. Consider making it configurable or at least documenting the reasoning.

Line 316-318: Timer cleanup on compositor mode entry is correct.

Line 1807-1819: The debouncing implementation is clean but has the race condition mentioned above.

internal/trace/recorder.go

Line 44-45: Excellent - copying data prevents aliasing bugs. Tests verify this.

Line 79-83: Mutex protection for Save is good, but consider whether holding lock during I/O is necessary (could block event recording).

internal/trace/analyze.go

Line 777-781: Text truncation at 40 chars is reasonable. Consider making this configurable for debugging.

Line 949-966: FindClearScreen only checks for ESC[2J and ESC[3J. Consider adding ESC[1J (clear to beginning) for completeness.
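
A sketch of a scanner that reports every Erase in Display variant rather than a fixed pair. The findEraseDisplay function and clearMatch type are hypothetical and do not reflect the actual FindClearScreen signature:

```go
package main

import "fmt"

// clearMatch records where an Erase in Display (ED) sequence was found.
type clearMatch struct {
	Offset int
	// Mode is the ED parameter: 0 cursor to end of screen, 1 start of
	// screen to cursor, 2 whole screen, 3 whole screen plus scrollback
	// (an xterm extension). An omitted parameter means 0.
	Mode int
}

// findEraseDisplay scans data for ESC [ <digits> J sequences.
func findEraseDisplay(data []byte) []clearMatch {
	var out []clearMatch
	for i := 0; i+2 < len(data); i++ {
		if data[i] != 0x1b || data[i+1] != '[' {
			continue
		}
		j := i + 2
		mode := 0
		for j < len(data) && data[j] >= '0' && data[j] <= '9' {
			mode = mode*10 + int(data[j]-'0')
			j++
		}
		if j < len(data) && data[j] == 'J' {
			out = append(out, clearMatch{Offset: i, Mode: mode})
			i = j // resume scanning after the final byte
		}
	}
	return out
}

func main() {
	data := []byte("a\x1b[2Jb\x1b[J\x1b[1J\x1b[31m")
	for _, m := range findEraseDisplay(data) {
		fmt.Printf("offset %d mode %d\n", m.Offset, m.Mode)
	}
}
```

Reporting the mode also makes it easy to tell the ESC[J clears introduced by this PR apart from full-screen ESC[2J clears in a trace.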

cmd/moat/cli/exec.go

Line 154-183: setupTTYTracer is well-structured. The nil return on empty path is clean.

Line 185-199: save() method correctly handles nil receiver, but see note above about call site pattern.


Recommendations Summary

Critical (Should Fix Before Merge)

  1. ✅ Fix potential race condition in footer timer cleanup
  2. ✅ Add cleanup state flag to prevent post-cleanup timer callbacks

Important (Should Address Soon)

  1. Document memory implications of long traces
  2. Add more prominent security warnings when trace is enabled
  3. Consider trace size limits or streaming for production use

Nice to Have

  1. Make debounce delay configurable
  2. Add integration tests for trace capture/analysis
  3. Implement trace sanitization (already in TODO)
  4. Add package-level documentation

Conclusion

This is a well-crafted PR that solves a real problem with a thoughtful approach. The trace infrastructure is production-ready with minor fixes, and the footer debouncing fix is elegant. The code quality is high, tests are comprehensive, and documentation is excellent.

Recommendation: Approve with minor fixes for the race condition. The other suggestions are enhancements rather than blockers.

The trace system will be very valuable for debugging TUI issues, and the footer fix should eliminate the Claude Code banner garbling problem.


Questions for PR Author

  1. Have you tested this with actual Claude Code sessions to verify the banner rendering is fixed?
  2. What's the typical trace file size for a 5-10 minute session?
  3. Should the debounce delay be runtime-configurable, or is 50ms universally appropriate?
  4. Any plans for the trace replay functionality mentioned in the docs?

Great work on this feature! 🎉

@dpup dpup merged commit 77847f9 into main Feb 1, 2026
4 checks passed
@dpup dpup deleted the feat/tty-trace-and-footer-fix branch February 1, 2026 05:35