Skip to content

test(e2e): wire format integration test with TypeScript client#87

Merged
adnaan merged 2 commits intomainfrom
wire-format-test
Feb 12, 2026
Merged

test(e2e): wire format integration test with TypeScript client#87
adnaan merged 2 commits intomainfrom
wire-format-test

Conversation

@adnaan
Copy link
Copy Markdown
Contributor

@adnaan adnaan commented Feb 11, 2026

Summary

  • Adds TestWireFormat e2e test that validates the complete wire format path: Go server tree generation → JSON over WebSocket → TypeScript client parsing → DOM update
  • First real consumer of WSMessageLogger infrastructure (built in refactor(testing): add Docker Chrome helpers for reliable CI testing #85 but never used in a test)
  • Tests 7 wire format scenarios sequentially, validating both wire JSON structure and resulting DOM

Test Scenarios

# Scenario Wire Format Validated
1 Initial render Tree contains "s" (statics) keys
2 Scalar update Update tree omits statics
3 Range append ["a", ...] operation
4 Range remove ["r", "item-id"] operation
5 Range update ["u", "item-id", {...}] operation
6 Range reorder ["o", [...ids]] operation
7 Conditional toggle DOM element removed, no statics in update

Test plan

  • Test passes consistently (-count=3 verified)
  • Uses existing patterns (Controller+State, SetupDockerChrome, ServeClientLibrary)
  • Diagnostic output on failure (WS messages, console logs, rendered HTML)
  • No regressions in existing e2e tests

Closes #96

🤖 Generated with Claude Code

Validates the complete wire format path: Go server generates tree →
JSON over WebSocket → TypeScript client applies to DOM. First real
consumer of WSMessageLogger infrastructure.

Tests 7 wire format scenarios: initial render statics, update statics
omission, range append/insert, range remove, range update, range
reorder, and conditional toggle.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings February 11, 2026 07:13
@claude
Copy link
Copy Markdown

claude bot commented Feb 11, 2026

Code Review: Wire Format Integration Test

I've reviewed PR #87 and here's my feedback:

✅ Strengths

Excellent Test Design:

  • Comprehensive coverage of 7 wire format scenarios with clear documentation
  • First real consumer of WSMessageLogger infrastructure (built in refactor(testing): add Docker Chrome helpers for reliable CI testing #85)
  • Validates complete wire format path: Go server → JSON over WebSocket → TypeScript client → DOM
  • Well-structured subtests with clear naming convention (1_Initial_Render_Has_Statics, etc.)
  • Diagnostic output on failure with WebSocket messages, console logs, and rendered HTML

Code Quality:

  • Follows established testing patterns from modal_test.go and other e2e tests
  • Good use of helper functions (hasStaticsAnywhere, findRangeOps, getReceivedMessagesWithTree)
  • Proper error handling with detailed error messages
  • Uses condition-based waiting (WaitFor) instead of arbitrary sleeps

Test Coverage:
The test validates:

  • Initial render with statics ("s" keys present)
  • Update messages omit statics correctly
  • Range operations: append/insert, remove, update, reorder
  • Conditional toggle (DOM element removal)
  • Both wire format JSON structure AND resulting DOM state

🔍 Potential Issues

1. Race Condition in waitForActionResponse (e2e/wire_format_test.go:206-220)

func waitForActionResponse(wsLogger *e2etest.WSMessageLogger, action string, timeout time.Duration) (map[string]interface{}, error) {
    deadline := time.Now().Add(timeout)
    for time.Now().Before(deadline) {
        msgs := wsLogger.GetReceived()  // ⚠️ Reads without proper synchronization
        // ... loop through messages ...
        time.Sleep(100 * time.Millisecond)
    }
}

Issue: While WSMessageLogger.GetReceived() uses RLock, there's a potential race between checking messages and the logger receiving new messages. The function might miss messages that arrive just after the check.

Recommendation: Consider using WSMessageLogger.WaitForMessage() from testing/websocket.go:224 which already handles this pattern, or ensure the calling code has appropriate synchronization.

2. Brittle Range Operation Detection (e2e/wire_format_test.go:143-163)

func findRangeOps(v interface{}) [][]interface{} {
    // ...
    if opStr, ok := val[0].(string); ok {
        switch opStr {
        case "a", "i", "p", "r", "u", "o":  // ⚠️ Hardcoded operation types
            ops = append(ops, val)
            return ops
        }
    }
}

Issue: This function makes assumptions about the wire format structure. If the core library adds new range operation types or changes the format, this test will silently pass when it shouldn't, or fail mysteriously.

Recommendation:

  • Add a comment documenting what each operation code means ("a"=append, "r"=remove, etc.)
  • Consider validating against a defined constant set from the core library if available

3. Tolerance in Range Update Test May Hide Bugs (e2e/wire_format_test.go:449-470)

// The update_item response may contain the ["u", ...] operation,
// OR the tree may be empty {} if the diff engine already batched
// the update into a prior response (e.g., remove_item).
// Either way is valid — check both the wire format and the DOM.

Issue: This test accepts two different behaviors as valid. While this might reflect the actual system behavior, it makes the test less effective at catching regressions. If the batching logic changes, the test won't detect it.

Recommendation: Document why batching can occur and whether this is guaranteed behavior. If it's an implementation detail, consider making the test stricter to catch changes in this behavior.

4. Missing Test Case: Error Conditions

Issue: The test only validates happy paths. It doesn't test:

  • What happens when WebSocket connection fails
  • Malformed wire format messages
  • Out-of-order updates
  • Concurrent rapid updates

Recommendation: Consider adding negative test cases in a follow-up PR to improve robustness.

🎯 Performance Considerations

Polling Interval: The 100ms polling interval in helper functions is reasonable but could be tuned:

  • waitForNewTreeMessage: 100ms polling is good
  • Consider exponential backoff for longer waits to reduce CPU usage

🔒 Security Concerns

✅ No security issues identified. The test doesn't:

  • Execute arbitrary code
  • Access sensitive data
  • Expose network services
  • Have injection vulnerabilities

🧪 Test Coverage

Excellent coverage of the wire format protocol. The test validates:

  • ✅ Initial render statics presence
  • ✅ Update statics omission
  • ✅ All range operation types (a, r, u, o)
  • ✅ Conditional rendering updates
  • ✅ DOM state after each operation

Missing coverage:

  • ⚠️ Error recovery paths
  • ⚠️ Edge cases (empty lists, rapid sequential operations)
  • ⚠️ Performance under load

📝 Minor Suggestions

  1. Line 235: Consider extracting the initial state into a table-driven test structure for easier modification

  2. Line 295-302: Server setup is repeated in many tests - could be extracted to a shared helper

  3. Line 449-470: The fallback logic in range update test could be simplified with a helper like findOperationInMessages()

  4. Documentation: Consider adding a comment at the top explaining the wire format protocol being tested (what are "statics", what does the "s" key mean, etc.)

✅ Final Verdict

This is high-quality test code that follows the project's conventions and provides excellent coverage of the wire format protocol. The identified issues are minor and don't block merging.

Recommendation: APPROVE with minor suggestions for follow-up

The test is production-ready and provides valuable regression protection for the wire format implementation. The issues identified are opportunities for incremental improvement rather than blocking problems.


Test Plan Verified:

Great work! 🎉

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new browser E2E test that exercises LiveTemplate’s end-to-end “wire format” update path (Go server → WS JSON → TS client → DOM), intended to validate statics behavior and range-diff operations across several scenarios.

Changes:

  • Introduces TestWireFormat covering initial render + 6 incremental update scenarios (scalar, range ops, conditional toggle).
  • Adds local helper functions to inspect received WebSocket payloads for statics/range operation markers.
  • Integrates existing Docker Chrome + client library serving utilities to run the test in CI-like conditions.

Comment on lines +184 to +197
// waitForNewTreeMessage waits for a new received WS message with a "tree" field
// after the given count of previously observed tree messages.
func waitForNewTreeMessage(wsLogger *e2etest.WSMessageLogger, prevCount int, timeout time.Duration) (map[string]interface{}, error) {
deadline := time.Now().Add(timeout)
for time.Now().Before(deadline) {
msgs := getReceivedMessagesWithTree(wsLogger)
if len(msgs) > prevCount {
return msgs[prevCount].Parsed, nil
}
time.Sleep(100 * time.Millisecond)
}
return nil, fmt.Errorf("timeout waiting for new tree message (had %d, still %d)", prevCount, len(getReceivedMessagesWithTree(wsLogger)))
}

Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

waitForNewTreeMessage is defined but never used in this test file, which adds dead code and makes the test harder to maintain. Consider removing it, or using it consistently (e.g., to wait for the next tree message after each UI action).

Suggested change
// waitForNewTreeMessage waits for a new received WS message with a "tree" field
// after the given count of previously observed tree messages.
func waitForNewTreeMessage(wsLogger *e2etest.WSMessageLogger, prevCount int, timeout time.Duration) (map[string]interface{}, error) {
deadline := time.Now().Add(timeout)
for time.Now().Before(deadline) {
msgs := getReceivedMessagesWithTree(wsLogger)
if len(msgs) > prevCount {
return msgs[prevCount].Parsed, nil
}
time.Sleep(100 * time.Millisecond)
}
return nil, fmt.Errorf("timeout waiting for new tree message (had %d, still %d)", prevCount, len(getReceivedMessagesWithTree(wsLogger)))
}

Copilot uses AI. Check for mistakes.
Comment on lines +199 to +203
func waitForActionResponse(wsLogger *e2etest.WSMessageLogger, action string, timeout time.Duration) (map[string]interface{}, error) {
deadline := time.Now().Add(timeout)
for time.Now().Before(deadline) {
msgs := wsLogger.GetReceived()
for _, msg := range msgs {
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

waitForActionResponse scans all received messages and returns the first one whose meta.action matches. If multiple messages exist for the same action (e.g., reconnect/resend, retries, or multiple clicks), this can return a stale earlier response and make the assertions flaky. Consider tracking the current received-message count (or a since timestamp) before triggering the click and only matching messages observed after that point.

Suggested change
func waitForActionResponse(wsLogger *e2etest.WSMessageLogger, action string, timeout time.Duration) (map[string]interface{}, error) {
deadline := time.Now().Add(timeout)
for time.Now().Before(deadline) {
msgs := wsLogger.GetReceived()
for _, msg := range msgs {
// It only considers messages received after the function starts, to avoid matching
// stale responses from previous actions.
func waitForActionResponse(wsLogger *e2etest.WSMessageLogger, action string, timeout time.Duration) (map[string]interface{}, error) {
// Snapshot the current number of received messages so we only look at newer ones.
initial := wsLogger.GetReceived()
startIndex := len(initial)
deadline := time.Now().Add(timeout)
for time.Now().Before(deadline) {
msgs := wsLogger.GetReceived()
if startIndex > len(msgs) {
startIndex = len(msgs)
}
for _, msg := range msgs[startIndex:] {

Copilot uses AI. Check for mistakes.
Comment on lines +284 to +291
// Listen for console errors
chromedp.ListenTarget(ctx, func(ev interface{}) {
if ev, ok := ev.(*runtime.EventConsoleAPICalled); ok {
for _, arg := range ev.Args {
log.Printf("Console: %s", string(arg.Value))
}
}
})
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This chromedp.ListenTarget console hook logs every console call via log.Printf even when the test passes, but you already capture console output with consoleLogger and print it on failure. Keeping both is redundant and can produce very noisy CI logs; consider removing this listener (or limiting it to exceptions only) and relying on consoleLogger.Print() in the failure defer.

Copilot uses AI. Check for mistakes.
- Remove unused waitForNewTreeMessage function (dead code)
- Fix waitForActionResponse to accept startFrom parameter, preventing
  stale message matches from previous actions
- Remove redundant chromedp.ListenTarget console hook (already captured
  by ConsoleLogger)
- Search all historical messages for batched remove_item operations
  instead of using forward-looking waitForActionResponse

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@claude
Copy link
Copy Markdown

claude bot commented Feb 12, 2026

Code Review: Wire Format Integration Test

Overview

This PR adds a comprehensive e2e test that validates the complete wire format path from Go server tree generation through WebSocket to TypeScript client parsing and DOM updates. This is excellent coverage for a critical part of the LiveTemplate system.


✅ Strengths

  1. Comprehensive Coverage: Tests 7 wire format scenarios covering all major operations (initial render, scalar updates, range operations: append, remove, update, reorder, and conditionals)

  2. Well-Structured: Clear separation of concerns with helper functions, good test organization using subtests, and excellent diagnostic output on failures

  3. Good Use of Infrastructure: Properly leverages WSMessageLogger from refactor(testing): add Docker Chrome helpers for reliable CI testing #85, making this a real consumer of that infrastructure

  4. Excellent Documentation: The PR description clearly outlines all 7 test scenarios with a helpful table

  5. Diagnostic Output: Comprehensive debugging helpers (wsLogger.Print(), consoleLogger.Print(), HTML dump) that will be invaluable when tests fail

  6. Follows Patterns: Consistent with existing e2e test patterns (Controller+State, SetupDockerChrome, ServeClientLibrary)


🔍 Code Quality Observations

Positive

  • Clean helper functions (hasStaticsAnywhere, findRangeOps, waitForActionResponse)
  • Good error messages with pretty-printed JSON
  • Proper use of t.Helper() in helper functions
  • Sequential test execution ensures deterministic state
  • Proper cleanup with deferred functions

Areas for Improvement

1. Test Fragility - Batched Operations (e2e/wire_format_test.go:478-506)

The comment in subtest 5 acknowledges that the diff engine may batch operations:

// The update_item response may contain the ["u", ...] operation,
// OR the tree may be empty {} if the diff engine already batched
// the update into a prior response (e.g., remove_item).

This is a code smell. The test is adapting to non-deterministic behavior, searching historical messages from remove_item to find the update operation. This could cause:

  • Flaky tests if batching behavior changes
  • False positives if the test finds operations from unrelated actions
  • Maintenance burden when debugging failures

Recommendation: Either:

  • Make the diff engine batching deterministic and testable
  • Simplify the test to only verify DOM state (ground truth) without checking wire format details for batched scenarios
  • Document the batching rules clearly and test them separately

2. Magic Numbers for Timeouts

Multiple 5-second timeouts scattered throughout:

waitForActionResponse(wsLogger, "increment", msgCount, 5*time.Second)

Recommendation: Extract to a constant:

const actionResponseTimeout = 5 * time.Second

3. Sequential Message Scanning

The waitForActionResponse function linearly scans messages. With startFrom parameter (good fix from the second commit!), this is now safer, but could still be slow if many messages are sent.

Minor optimization: Consider indexing messages by action name if this becomes a bottleneck.

4. Error Handling in Deferred Cleanup

defer func() {
    shutdownCtx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
    defer cancel()
    if err := server.Shutdown(shutdownCtx); err != nil {
        t.Logf("Server shutdown warning: %v", err)  // Only logs, doesn't fail
    }
}()

This is fine for cleanup, but be aware that server shutdown errors are silently swallowed.

5. Template String Formatting

The template string is clean and readable. Good use of explicit IDs for testing.


🐛 Potential Issues

1. Race Condition Risk (Low)
The test relies on message ordering and uses startFrom to avoid stale matches. This is good, but there's still a theoretical race if:

  • An action dispatches multiple responses
  • WebSocket messages arrive out of order (though TCP should prevent this)

Mitigation: The current implementation with startFrom is solid. No changes needed unless flakiness appears.

2. Test Data Coupling
The test hardcodes data like item-1, item-2, Alpha, Beta, etc. This is fine for a focused test, but consider:

  • What happens if item IDs change format?
  • Are there edge cases with special characters in names?

Recommendation: This is acceptable for the current scope. Future tests could add edge cases.

3. No Verification of Wire Format Schema

The test checks for presence of operations ("a", "r", "u", "o"), but doesn't validate:

  • The complete structure of these operations
  • That invalid operations are rejected
  • Edge cases like empty arrays, null values, etc.

Recommendation: This is beyond scope for this PR, but consider schema validation tests in the future.


🔒 Security Considerations

✅ No security issues identified. The test:

  • Uses localhost-only servers
  • Doesn't expose sensitive data
  • Properly cleans up resources

⚡ Performance Considerations

  1. Test Duration: E2E tests with Chrome are inherently slow. This test has 7 sequential subtests with multiple WebSocket round-trips.

    • Measured impact: Likely 10-30 seconds total
    • Recommendation: This is acceptable for e2e tests. Ensure it runs with -tags=browser so it's skipped in quick CI runs.
  2. Resource Cleanup: Properly defers cleanup, preventing resource leaks ✅

  3. Parallel Execution: Uses SetupDockerChrome which may or may not support parallel execution (depends on Chrome pool implementation)

    • Consider adding t.Parallel() if the Chrome pool supports it (seen in modal_test.go:22)

📋 Test Coverage

What's Covered: ✅

  • Initial render with statics
  • Updates without statics
  • All range operations (append, remove, update, reorder)
  • Conditional rendering toggle
  • WebSocket message structure
  • DOM updates (ground truth verification)

What's Missing: ℹ️

  • Error cases: What if the WebSocket connection fails? What if an action fails?
  • Edge cases: Empty lists, very long lists, rapid-fire actions
  • Statics optimization: Verify statics are truly omitted in all update scenarios (partially covered)
  • Concurrent actions: What if multiple actions fire simultaneously?

Recommendation: These are nice-to-haves, not blockers. File follow-up issues if desired.


📝 Documentation & Maintainability

Excellent:

  • Clear comments explaining what each subtest validates
  • Good variable naming
  • Helpful diagnostic output
  • PR description documents all scenarios

Minor improvement:

  • Add a comment at the top of TestWireFormat explaining the overall test strategy and wire format contract being validated

🎯 Recommendations Summary

Must Fix: None - code is production-ready

Should Consider:

  1. Extract timeout constants
  2. Add top-level test comment explaining wire format contract
  3. Consider adding t.Parallel() if Chrome pool supports it
  4. Document the batching behavior that caused the special case in subtest 5

Nice to Have:

  1. Follow-up issue for error case testing
  2. Schema validation for wire format operations
  3. Performance benchmarks if this becomes a bottleneck

✅ Approval

This is excellent work. The test is comprehensive, well-structured, and provides valuable coverage for wire format integration. The second commit addressing PR review comments shows good iteration.

Status: ✅ Approved with minor suggestions

The code follows Go best practices, integrates well with existing patterns, and the test infrastructure changes in the second commit (waitForActionResponse fix, dead code removal) show attention to detail.

Great job! 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants