Skip to content

test: add 57 TCP client robustness tests#538

Merged
malpern merged 1 commit into
masterfrom
test/530-tcp-client-robustness
May 22, 2026
Merged

test: add 57 TCP client robustness tests#538
malpern merged 1 commit into
masterfrom
test/530-tcp-client-robustness

Conversation

@malpern
Copy link
Copy Markdown
Owner

@malpern malpern commented May 22, 2026

Summary

Closes #530

  • Read buffer edge cases (5 tests): bare newline, multiple empty lines, 50KB payload, Windows line endings, binary data
  • Broadcast filtering robustness (7 tests): extra fields, empty data, null bytes, truncated JSON, numeric status, multiple/mixed response keys
  • Request ID extraction (8 tests): top-level, nested, string values, missing, zero, UInt64.max, invalid JSON, null
  • Request ID generation (2 tests): monotonicity without server, starts at 1
  • TCPReloadResult (4 tests): success/failure/networkError properties, cancellation detection
  • ReloadResult protocol parsing (6 tests): older (ready/timeout_ms) and newer (ok/duration_ms) protocol variants, request_id, empty object
  • HelloOk parsing (3 tests): full payload, minimal server form, capability checking
  • StatusInfo parsing (2 tests): with/without last_reload
  • Error classification (4 tests): timeout, connectionFailed, invalidResponse, non-KeyPathError
  • Error message extraction (5 tests): status error JSON, generic error field, missing field, non-JSON, multi-line
  • extractMessage (3 tests): multi-line extraction, not found, malformed payload
  • Server unavailable (2 tests): checkServerStatus and reloadConfig with no server
  • EngineReloadSingleFlight (2 tests): concurrent coalescing, sequential execution
  • Layer/FakeKey names (3 tests): parsing with request_id, empty list
  • Result type coverage (3 tests): FakeKeyResult, ChangeLayerResult, FakeKeyAction raw values

Test plan

  • All 57 new tests pass (59 total with filter including existing TCPClient tests)
  • No server required — all tests are unit-testable
  • Build passes

Cover read buffer edge cases, broadcast filtering robustness,
request ID extraction and generation, TCPReloadResult/ReloadResult
protocol parsing, HelloOk/StatusInfo decoding variants, error
classification, error message extraction, server-unavailable
scenarios, EngineReloadSingleFlight coalescing, and result type
coverage.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@malpern malpern merged commit 730fa69 into master May 22, 2026
3 checks passed
@malpern malpern deleted the test/530-tcp-client-robustness branch May 22, 2026 15:25
@claude
Copy link
Copy Markdown

claude Bot commented May 22, 2026

Code Review — PR #538: test: add 57 TCP client robustness tests

Note: This PR is already merged. Feedback is for future reference / follow-up.

Overview

Solid test coverage for the KanataTCPClient layer. 57 tests across 15 categories, all unit-testable without a live server — exactly the right approach per the project's architecture. The breadth of edge cases (binary data, UInt64.max, Windows line endings, null JSON fields, concurrent coalescing) is impressive.


Issues

🔴 XCTestCase used instead of KeyPathTestCase

Per CLAUDE.md: "Never call real pgrep in tests → deadlock. Use KeyPathTestCase base class."

// Current (potentially unsafe)
final class TCPClientRobustnessTests: XCTestCase {

// Should be
final class TCPClientRobustnessTests: KeyPathTestCase {

If KanataTCPClient or any downstream call reaches pgrep, this could deadlock in CI. Worth fixing in a follow-up even though it hasn't manifested yet.


🟡 testSingleFlight_CoalescesConcurrentRequests — assertion can never fail

XCTAssertLessThanOrEqual(execCount, 2, "Concurrent requests should coalesce (1) or both run (2)")

With 2 tasks, execCount is always ≤ 2, so this assertion is vacuous. If the goal is to verify coalescing happened, assert == 1. If the semantics allow both outcomes (which the comment suggests), consider either removing the test or asserting >= 1 && <= 2 and adding a note about why. As written it provides no signal.


🟡 testExtractMessage_MalformedPayload — no assertion

_ = result  // no assertion

The comment explains the ambiguity ("may succeed or fail depending on the decoder") but that means the test exercises zero correctness. Either pick a concrete expected behavior and assert it, or delete the test. A crash-guard test only has value if it can actually trigger a crash on regression.


🟡 testExtractFirstLine_WindowsLineEndings — incomplete assertion

XCTAssertNotNil(result, "Should handle \\r\\n by splitting on \\n")

This only checks the result isn't nil but doesn't verify that \r was handled correctly — i.e., that the extracted line doesn't contain a trailing \r. The "Windows line endings" premise is never actually validated.


🟢 Minor: var data should be let

// testExtractFirstLine_BinaryFollowedByNewline
var data = Data([0xFF, 0xFE, 0x00, 0x0A])  // never mutated → should be `let`

Strengths

  • Excellent use of XCTUnwrap instead of force-unwraps throughout
  • async let pattern for concurrency tests is idiomatic
  • 0.1s timeout on the "no server" tests keeps the suite fast
  • @preconcurrency import XCTest addresses Swift Concurrency warnings cleanly
  • Protocol variant tests (older ready/timeout_ms vs newer ok/duration_ms) cover a real compatibility surface
  • testRequestIdStartsAtOne + testRequestIdMonotonicity_NoServer together form a good behavioral contract for the ID generator

Suggested follow-ups

  1. Inherit from KeyPathTestCase to get pgrep protection
  2. Fix or delete testSingleFlight_CoalescesConcurrentRequests (assertion is always true)
  3. Add \r content assertion to the Windows line endings test
  4. Replace _ = result in testExtractMessage_MalformedPayload with a concrete assertion or remove the test

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c73bdf1624

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

XCTAssertTrue(results.0.isSuccess)
XCTAssertTrue(results.1.isSuccess)
let execCount = await counter.get()
XCTAssertLessThanOrEqual(execCount, 2, "Concurrent requests should coalesce (1) or both run (2)")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Assert coalescing rather than permitting both runs

testSingleFlight_CoalescesConcurrentRequests currently passes even when coalescing is broken, because it accepts execCount == 2 and does not force overlap between the two calls. If EngineReloadSingleFlight.run starts executing each concurrent request independently, this test still succeeds, so a regression in the single-flight guarantee can slip through. Tighten this to require one execution (with a gate/barrier to ensure true concurrency) so the test actually validates the behavior named in the test.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

test: add TCP client robustness tests

1 participant