
Add comprehensive testing infrastructure#16

Closed
rsned wants to merge 2 commits into SpaceMolt:main from rsned:testing-infrastructure



@rsned rsned commented Mar 23, 2026

This PR adds automated testing tools to compare client output against direct API calls for all commands.

New Testing Tool

test-commands.sh (Bash)

  • Compare client binary output vs direct API calls
  • Support for safe-only testing (no mutations)
  • --show-safe option to list all safe commands
  • --resume option to continue after session timeout
  • Automatic exclusion of problematic commands (e.g., logout)
  • Detailed logging and JSON result summaries
  • Full JSON comparison with dynamic field filtering
  • Console summary with pass/fail status
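To illustrate what "JSON comparison with dynamic field filtering" can look like, here is a minimal TypeScript sketch. It is not the logic from test-commands.sh, which is a Bash script whose filtering rules are not shown here. The field names in `DYNAMIC_FIELDS` and the helper names `normalize` and `sameAfterFiltering` are assumptions chosen for illustration.

```typescript
// Hypothetical field names whose values change between calls.
// The real list used by test-commands.sh is not shown in this PR text.
const DYNAMIC_FIELDS = new Set<string>(["timestamp", "request_id", "session_id"]);

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Recursively drop dynamic fields and rebuild objects with sorted keys,
// so that key order does not affect the comparison.
function normalize(value: Json, dynamic: Set<string> = DYNAMIC_FIELDS): Json {
  if (Array.isArray(value)) {
    return value.map((item) => normalize(item, dynamic));
  }
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const key of Object.keys(value).sort()) {
      if (!dynamic.has(key)) {
        out[key] = normalize(value[key], dynamic);
      }
    }
    return out;
  }
  return value;
}

// Two payloads match if they serialize identically once normalized.
function sameAfterFiltering(a: Json, b: Json): boolean {
  return JSON.stringify(normalize(a)) === JSON.stringify(normalize(b));
}
```

Normalizing both sides before comparing keeps the check strict on stable fields while ignoring values that are expected to differ on every request.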

Documentation

tools/README.md

  • Complete testing documentation (all-in-one guide)
  • Quick start guide
  • Usage examples for all test modes
  • Safe command categories (20 commands)
  • Session timeout handling with --resume
  • Excluded commands documentation
  • Status codes reference table
  • Common issues and troubleshooting
  • Debugging guide with examples
  • CI/CD integration example with GitHub Actions

Features

  • Safe Mode: Test 20 query commands that don't modify game state
  • Resume: Continue testing from specific command after timeout
  • Exclusions: Automatically exclude commands that break testing (logout)
  • Logging: Detailed logs with timestamps and full output
  • Results: JSON summaries for programmatic analysis
  • Status Tracking: Identical, different, client_error, curl_error, both_error
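The five status values above can be read as a small decision table. The sketch below shows one way to assign them, under the assumption that a run counts as failed when its exit code is non-zero; the actual rules in test-commands.sh are not shown in this PR text, and `RunResult` and `classify` are hypothetical names.

```typescript
type Status = "identical" | "different" | "client_error" | "curl_error" | "both_error";

// Hypothetical shape for one side of a comparison (client binary or curl call).
interface RunResult {
  exitCode: number;
  output: string; // already normalized output, e.g. with dynamic fields removed
}

// Assumption: a non-zero exit code means that side failed.
function classify(client: RunResult, curl: RunResult): Status {
  const clientFailed = client.exitCode !== 0;
  const curlFailed = curl.exitCode !== 0;
  if (clientFailed && curlFailed) return "both_error";
  if (clientFailed) return "client_error";
  if (curlFailed) return "curl_error";
  // Only compare outputs when both sides succeeded.
  return client.output === curl.output ? "identical" : "different";
}
```

Checking for errors before comparing outputs avoids reporting a failed call as a plain "different" result.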

Co-Authored-By: Claude Sonnet <noreply@anthropic.com>

@rsned rsned force-pushed the testing-infrastructure branch from 943f0a6 to 802e04a Compare March 23, 2026 16:51
This PR adds automated testing tools to compare client output against
direct API calls for all commands.

## New Testing Tool

### test-commands.sh (Bash)
- Compare client binary output vs direct API calls
- Support for safe-only testing (no mutations)
- --show-safe option to list all safe commands
- --resume option to continue after session timeout
- Automatic exclusion of problematic commands (e.g., logout)
- Detailed logging and JSON result summaries
- Full JSON comparison with dynamic field filtering

## Documentation

### tools/README.md
- Quick start guide
- Usage examples for all test modes
- Safe command categories (20 commands)
- Session timeout handling
- Excluded commands documentation

### tools/TESTING.md
- Complete testing documentation
- Interpreting test results
- Debugging guide
- CI/CD integration examples

## Features

- **Safe Mode**: Test 20 query commands that don't modify game state
- **Resume**: Continue testing from specific command after timeout
- **Exclusions**: Automatically exclude commands that break testing (logout)
- **Logging**: Detailed logs with timestamps and full output
- **Results**: JSON summaries for programmatic analysis
- **Status Tracking**: Identical, different, errors

Co-Authored-By: Claude Sonnet <noreply@anthropic.com>
@rsned rsned force-pushed the testing-infrastructure branch from 802e04a to 60ba843 Compare March 23, 2026 16:55
@cahaseler
Contributor

Thanks for the idea here @rsned — comparing client behavior against the live API is exactly the right instinct. We took a similar approach in PR #18, but shifted the comparison point: instead of diffing CLI output against raw API responses (which breaks because the client intentionally formats/transforms data), we compare the client's command list directly against the OpenAPI spec.

The result is a src/api-sync.test.ts that hard-fails if client.ts has stale commands the server no longer exposes, or is missing commands the server has added. Running it immediately found several more stale commands beyond what you'd caught in PR #15, plus three commands that were in the spec but missing from the client entirely.
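For readers who have not opened PR #18, the core of such a check is a two-way set difference. The sketch below is not the contents of src/api-sync.test.ts; it assumes the client's command names and the spec's command names have already been extracted into plain string lists, and `diffCommands` and `SyncReport` are hypothetical names.

```typescript
// Hypothetical report shape: commands to remove from or add to the client.
interface SyncReport {
  stale: string[];   // in the client, but no longer in the spec
  missing: string[]; // in the spec, but not in the client
}

// Compare two lists of command names in both directions.
function diffCommands(
  clientCommands: Iterable<string>,
  specCommands: Iterable<string>,
): SyncReport {
  const client = new Set(clientCommands);
  const spec = new Set(specCommands);
  const stale = Array.from(client).filter((name) => !spec.has(name)).sort();
  const missing = Array.from(spec).filter((name) => !client.has(name)).sort();
  return { stale, missing };
}
```

A test built on this would fail whenever either list in the report is non-empty, which is what makes it a hard failure instead of a warning.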

We also added a GitHub Actions CI workflow so lint and tests run on every PR — that was a gap worth closing regardless. Appreciate the push in this direction!

@cahaseler cahaseler closed this Mar 23, 2026

2 participants