This repository was archived by the owner on Apr 14, 2026. It is now read-only.
Summary
The current integration tests (M4-M7) are working but have grown organically. We need a formal, progressive integration test architecture inspired by dart_monty's integration ladder pattern — fixture-driven, tiered, and extensible for scripting (0008) and future features.
Current State
29 integration tests across 2 files, all passing
run_orchestrator_integration_test.dart — 9 tests (M4/M5/M6)
m7_room_integration_test.dart — 20 tests (19 groups)
Helpers duplicated between files
No fixture files — all test data inline
Server gets overwhelmed under heavy concurrent load (thread deletion warnings)
Test 10 has a latent unicode dash bug (en-dash vs ASCII hyphen)
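To illustrate the dash mismatch noted in the last item, here is a minimal sketch of how an en dash in a response defeats a plain `contains` check, and how normalizing both sides avoids it. The names `normalizePunctuation` and `containsNormalized` are hypothetical, not existing helpers, and the set of characters mapped is an assumption about what the backend may emit.

```dart
/// Hypothetical helper: maps common Unicode dashes and curly quotes to ASCII.
String normalizePunctuation(String input) {
  const replacements = <String, String>{
    '\u2010': '-', // hyphen
    '\u2011': '-', // non-breaking hyphen
    '\u2012': '-', // figure dash
    '\u2013': '-', // en dash
    '\u2014': '-', // em dash
    '\u2212': '-', // minus sign
    '\u2018': "'", // left single quote
    '\u2019': "'", // right single quote
    '\u201C': '"', // left double quote
    '\u201D': '"', // right double quote
  };
  final buffer = StringBuffer();
  // Iterate by rune so characters outside the BMP pass through intact.
  for (final rune in input.runes) {
    final char = String.fromCharCode(rune);
    buffer.write(replacements[char] ?? char);
  }
  return buffer.toString();
}

/// True when [actual] contains [expected] after both sides are normalized.
bool containsNormalized(String actual, String expected) =>
    normalizePunctuation(actual).contains(normalizePunctuation(expected));
```

With this, `'tiers 1\u20133'.contains('tiers 1-3')` is false, while `containsNormalized('tiers 1\u20133', 'tiers 1-3')` is true.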
Proposal: Integration Ladder
Adopt dart_monty's tier-based fixture pattern with progressive complexity.
Tier Structure
| Tier | Layer | Focus | Example Tests |
| --- | --- | --- | --- |
| tier_01_lifecycle | L0 | Basic room lifecycle | Idle -> Running -> Completed, error rooms, 404 |
| tier_02_tools | L1 | Tool yielding and resume | Single tool, multi-tool, tool failure recovery |
| tier_03_conversation | L0+ | Multi-turn history | Accumulation, context depth, thread reuse |
| tier_04_runtime | L2 | AgentRuntime patterns | spawn, waitAll, waitAny, cancel, introspection |
| tier_05_pipelines | L2+ | Multi-agent orchestration | Fan-out/fan-in, write-review-revise, cascading |
| tier_06_advanced | L2++ | Complex compositions | Debate, consensus, MapReduce, speculative exec |
| tier_07_scripting | L3 | Monty bridge integration | HostFunctionWiring, MontyToolExecutor, script rooms |
Fixture Format
JSON fixtures per tier, matching dart_monty's pattern:
{
  "id": 1,
  "tier": 1,
  "name": "echo room basic lifecycle",
  "room": "echo",
  "prompt": "Say hello",
  "expectedState": "CompletedState",
  "responseContains": null,
  "tools": null,
  "toolResponses": null,
  "turns": 1,
  "concurrency": 1,
  "xfail": null
}
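A fixture of this shape could be loaded into a small typed model before the runner sees it. The sketch below is only one possible reading: `LadderFixture` and `parseFixtures` are hypothetical names, and it assumes each tier file is a JSON array of such objects, that `responseContains` and `xfail` are strings when present, and that `turns` and `concurrency` default to 1. The `tools` and `toolResponses` fields are left out because the example only shows them as null.

```dart
import 'dart:convert';

/// Hypothetical model for one ladder fixture; field names mirror the JSON.
class LadderFixture {
  LadderFixture({
    required this.id,
    required this.tier,
    required this.name,
    required this.room,
    required this.prompt,
    required this.expectedState,
    this.responseContains,
    this.turns = 1,
    this.concurrency = 1,
    this.xfail,
  });

  /// Builds a fixture from one decoded JSON object.
  factory LadderFixture.fromJson(Map<String, dynamic> json) {
    return LadderFixture(
      id: json['id'] as int,
      tier: json['tier'] as int,
      name: json['name'] as String,
      room: json['room'] as String,
      prompt: json['prompt'] as String,
      expectedState: json['expectedState'] as String,
      // Assumption: a plain substring when present.
      responseContains: json['responseContains'] as String?,
      turns: (json['turns'] as int?) ?? 1,
      concurrency: (json['concurrency'] as int?) ?? 1,
      // Assumption: a human-readable reason when present.
      xfail: json['xfail'] as String?,
    );
  }

  final int id;
  final int tier;
  final String name;
  final String room;
  final String prompt;
  final String expectedState;
  final String? responseContains;
  final int turns;
  final int concurrency;
  final String? xfail;

  /// True when the fixture is marked as a known failure.
  bool get isExpectedFailure => xfail != null;
}

/// Decodes a tier file, assumed here to be a JSON array of fixture objects.
List<LadderFixture> parseFixtures(String source) {
  final decoded = jsonDecode(source) as List<dynamic>;
  return [
    for (final entry in decoded)
      LadderFixture.fromJson(entry as Map<String, dynamic>),
  ];
}
```

Failing fast on a missing required field (the `as int` / `as String` casts throw) seems preferable to silently running a half-specified fixture, but that is a design choice open to discussion.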
Shared Test Infrastructure
Extract duplicated helpers into a shared module:
packages/soliplex_agent/test/
  integration/
    fixtures/
      tier_01_lifecycle.json
      tier_02_tools.json
      tier_03_conversation.json
      tier_04_runtime.json
      tier_05_pipelines.json
      tier_06_advanced.json
      tier_07_scripting.json
    helpers/
      integration_harness.dart   # Shared setup/teardown, HTTP clients
      state_waiters.dart         # _waitForTerminalState, _waitForYieldOrTerminal
      fixture_runner.dart        # registerLadderTests() equivalent
      assertions.dart            # Response matchers (unicode-safe)
    tier_01_lifecycle_test.dart
    tier_02_tools_test.dart
    tier_03_conversation_test.dart
    tier_04_runtime_test.dart
    tier_05_pipelines_test.dart
    tier_06_advanced_test.dart
    tier_07_scripting_test.dart
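As a rough idea of what a shared waiter in state_waiters.dart might look like once it is no longer duplicated, the sketch below waits on a stream of states with a hard timeout. It is not the existing `_waitForTerminalState`; the name `waitForState`, the generic state type, and the 30-second default are all assumptions made for illustration.

```dart
import 'dart:async';

/// Hypothetical shared waiter: completes with the first state accepted by
/// [isTerminal]. Fails with a TimeoutException after [timeout], or with a
/// StateError if the stream closes before any state matches.
Future<S> waitForState<S>(
  Stream<S> states,
  bool Function(S state) isTerminal, {
  Duration timeout = const Duration(seconds: 30),
}) {
  return states.firstWhere(isTerminal).timeout(timeout);
}
```

A yield-or-terminal variant would then only differ in the predicate it passes, for example `(s) => s is Yielded || s is Completed` for whatever state classes the package actually defines.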
Key Improvements
Fixture-driven tests — JSON fixtures enable:
Easy addition of new test cases without code changes
Parity comparison across environments (native vs WASM)
xfail markers for known issues (like WASM concurrency limits)
Shared harness — Single IntegrationHarness class:
Creates/disposes HTTP clients, API, AgUiClient
Manages backend health checks before test runs
Handles thread cleanup with bounded retries (not infinite)
Unicode-safe assertions — Normalize dashes/quotes before string comparison
Sequential tier execution — Run tiers in order to avoid server overload:
Tiers 1-3: Sequential (single-session tests)
Tiers 4-6: Sequential between tiers, parallel within a tier where appropriate
Server health check between tiers
Scripting tier (tier_07) — New tests for 0008-soliplex-scripting:
HostFunctionWiring binds correctly
MontyToolExecutor dispatches tool calls through bridge
ScriptingToolRegistryResolver resolves tools from script context
Script room runs Python code and returns results
Script room with external function calls (yield/resume through Monty bridge)
Error propagation from Python runtime
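The "bounded retries (not infinite)" point for thread cleanup could be sketched as a small generic helper that the harness wraps around each delete call. `retryBounded`, its default attempt count, and its default delay are hypothetical; the real harness may need to retry only on specific HTTP errors rather than on every `Exception`.

```dart
import 'dart:async';

/// Hypothetical bounded retry: runs [action] up to [maxAttempts] times,
/// waiting [delay] between attempts, and rethrows the last failure.
Future<T> retryBounded<T>(
  Future<T> Function() action, {
  int maxAttempts = 3,
  Duration delay = const Duration(milliseconds: 500),
}) async {
  if (maxAttempts < 1) {
    throw ArgumentError.value(maxAttempts, 'maxAttempts', 'must be >= 1');
  }
  var attempt = 0;
  while (true) {
    attempt++;
    try {
      return await action();
    } on Exception {
      // Give up once the attempt budget is spent instead of looping forever.
      if (attempt >= maxAttempts) rethrow;
      await Future<void>.delayed(delay);
    }
  }
}
```

The same helper could plausibly back the health check between tiers, with a longer delay, so that an overloaded server gets time to recover before the next tier starts.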
Known Issues to Address
Migration Path
Extract shared helpers from existing test files
Create fixture JSON files from existing inline test data
Build fixture_runner.dart (registerLadderTests equivalent)
Migrate existing 29 tests to tier files
Add tier_07 scripting tests
Verify all 29+ tests still pass
Remove old test files
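When building the runner in step 3, one decision is how `xfail` fixtures are reported. A minimal sketch, assuming `xfail` holds a reason string and that an unexpectedly passing fixture should be surfaced rather than silently accepted, is shown below. `Outcome` and `runWithXfail` are hypothetical names and do not describe how dart_monty's runner behaves.

```dart
/// Hypothetical result categories for a single fixture run.
enum Outcome { passed, failed, expectedFailure, unexpectedPass }

/// Runs [body] and classifies the result against the fixture's [xfail] marker.
/// A non-null [xfail] means the fixture is expected to fail.
Future<Outcome> runWithXfail(
  Future<void> Function() body, {
  String? xfail,
}) async {
  try {
    await body();
    return xfail == null ? Outcome.passed : Outcome.unexpectedPass;
  } catch (_) {
    // Catches both Exceptions and Errors so assertion failures count too.
    return xfail == null ? Outcome.failed : Outcome.expectedFailure;
  }
}
```

Reporting `unexpectedPass` separately makes it visible when a known issue has been fixed and its marker can be removed from the fixture file.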
Related
packages/dart_monty_platform_interface/lib/src/testing/ladder_runner.dart
packages/soliplex_interpreter_monty/test/integration/room_fixture.dart