Description
Add Go/Ginkgo E2E tests in test/e2e/vmcp_optimizer_test.go covering all optimizer tiers and the standalone regression path: FTS5-only mode (--optimizer) exposes only find_tool/call_tool with BM25 keyword search, managed TEI mode (--optimizer-embedding) auto-starts a TEI container and enables semantic search, fail-fast behaviour when TEI fails to start, idempotent TEI container reuse across two successive invocations with the same model, and a regression test confirming that the standalone vmcp serve binary works identically after the Phase 1 extraction refactor. These tests close out the optimizer implementation chain and provide the definitive end-to-end validation gate for RFC THV-0059 Phase 4.
Context
#4887 wired --optimizer, --optimizer-embedding, --embedding-model, and --embedding-image flags into cmd/thv/app/vmcp.go and integrated EmbeddingServiceManager from #4884 into the serve lifecycle in pkg/vmcp/cli/serve.go. #4888 established the Go/Ginkgo E2E test file test/e2e/vmcp_cli_test.go with the StartLongRunningTHVCommand, WaitForMCPServerReady, and group/workload setup patterns that this item extends. The TEI container is named thv-embedding-<model-short-hash>, idempotent across invocations with the same model, and its start is a hard failure when --optimizer-embedding is explicitly given — no silent FTS5 fallback.
The regression test for standalone vmcp serve is critical because the Phase 1 extraction in #4879 moved ~350 lines of runServe business logic out of cmd/vmcp/app/commands.go into pkg/vmcp/cli/serve.go, making the standalone binary a thin wrapper. An E2E test guards against any accidental behaviour change for Kubernetes/operator deployments.
Dependencies: Depends on #4887 (optimizer flags wired in) and #4888 (basic E2E infrastructure established in test/e2e/vmcp_cli_test.go)
Blocks: none (final item in the optimizer chain)
Acceptance Criteria
A new file test/e2e/vmcp_optimizer_test.go exists with SPDX header, package e2e_test, and a Ginkgo Describe("vMCP CLI optimizer", ...) block with Label("vmcp", "optimizer", "e2e")
FTS5 optimizer test: thv vmcp serve --group <name> --optimizer starts successfully, the MCP client connects, ListTools returns exactly the two tools find_tool and call_tool (and no direct backend tool names), and calling find_tool with a keyword query returns a non-error result
TEI managed optimizer test (requires Docker and the TEI CPU image): thv vmcp serve --group <name> --optimizer-embedding starts a thv-embedding-<hash> container, the MCP client connects, ListTools returns find_tool and call_tool, and calling find_tool with a semantic query returns a non-error result; the TEI container is stopped when thv vmcp serve exits
Fail-fast test: invoke thv vmcp serve --group <name> --optimizer-embedding --embedding-image ghcr.io/invalid/nonexistent:latest (or equivalent bad image); assert the process exits non-zero within a reasonable timeout and stderr/stdout contains a clear error referencing the TEI failure — no silent fallback to FTS5
Idempotent TEI reuse test: start thv vmcp serve --optimizer-embedding (instance A), note the container name (thv-embedding-<hash>), stop instance A, start a second thv vmcp serve --optimizer-embedding with the same model (instance B), assert the container count for prefix thv-embedding- does not increase (i.e., instance B reuses the same container), stop instance B
Standalone vmcp serve regression test: start the vmcp binary (not thv) via exec.Command with a valid YAML config and a populated backend group, assert it starts successfully, connect an MCP client, assert ListTools returns at least one backend tool, and stop the process cleanly; the test must pass identically before and after the Phase 1 refactor
All background thv vmcp serve and vmcp processes are stopped via DeferCleanup / AfterEach even on test failure; no process leaks
Group workloads and groups created in BeforeEach are removed in AfterEach via existing StopAndRemoveMCPServer and RemoveGroup helpers
The TEI E2E tests are tagged with an additional Label("requires-docker") and are skipped when the SKIP_DOCKER_TESTS environment variable is set to true (or when Docker is unavailable), so CI without Docker does not fail
Random ports for thv vmcp serve --port <n> and vmcp serve --port <n> are allocated via net.Listen("tcp", "127.0.0.1:0") and closed before passing the port number to the command
WaitForMCPServerReady with a 90-second timeout (TEI model download may take 30–60 s on first pull) is used for the TEI managed optimizer test
All existing tests pass (no regressions)
Code reviewed and approved
Technical Approach
Recommended Implementation
Create test/e2e/vmcp_optimizer_test.go. The file contains a single top-level Describe("vMCP CLI optimizer", ...) block with four Context blocks: FTS5, TEI managed, error cases, and standalone regression.
Shared setup in BeforeEach: create a unique group, run a backend workload (thv run fetch --group <name>), wait for the workload to be ready via WaitForMCPServer, and track it for cleanup. Allocate a free port with allocateFreePort() (the local helper from vmcp_cli_test.go or a duplicate in this file).
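The port allocation step can be sketched with the standard library alone. This is a minimal illustration of the net.Listen("tcp", "127.0.0.1:0") technique named in the acceptance criteria; the actual allocateFreePort() in vmcp_cli_test.go may differ in signature and error handling, so treat the function below as an assumption.

```go
package main

import (
	"fmt"
	"net"
)

// allocateFreePort asks the kernel for an unused TCP port on loopback,
// closes the listener, and returns the port number so it can be passed
// to a subprocess via --port.
func allocateFreePort() (int, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer l.Close() // release the port before the caller hands it to the command

	addr, ok := l.Addr().(*net.TCPAddr)
	if !ok {
		return 0, fmt.Errorf("unexpected listener address type %T", l.Addr())
	}
	return addr.Port, nil
}

func main() {
	port, err := allocateFreePort()
	if err != nil {
		panic(err)
	}
	fmt.Printf("--port %d\n", port)
}
```

There is a small window between closing the listener and the subprocess binding the port in which another process could claim it; for E2E tests this is usually an acceptable trade-off.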
FTS5 optimizer context (Context("FTS5 optimizer (--optimizer)", ...)):
Start thv vmcp serve --group <name> --optimizer --port <port> as a background process with StartLongRunningTHVCommand. Poll readiness with WaitForMCPServerReady(config, "http://127.0.0.1:<port>/sse", "sse", 60*time.Second). Connect with NewMCPClientForSSE, initialize, and call ListTools. Assert:
len(tools.Tools) == 2
tool names are exactly {"find_tool", "call_tool"} (no backend tool names exposed directly)
calling find_tool with {"query": "fetch"} returns a result with no error
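The exact-set assertion on tool names can be expressed as a small pure function. This is only a sketch: exposesOnlyOptimizerTools is a hypothetical helper name, and it operates on plain strings rather than on the MCP client's tool type.

```go
package main

import (
	"fmt"
	"sort"
)

// exposesOnlyOptimizerTools reports whether names is exactly the set
// {"find_tool", "call_tool"}, in any order, with no extras or duplicates.
func exposesOnlyOptimizerTools(names []string) bool {
	if len(names) != 2 {
		return false
	}
	sorted := append([]string(nil), names...) // copy so the caller's slice is not reordered
	sort.Strings(sorted)
	return sorted[0] == "call_tool" && sorted[1] == "find_tool"
}

func main() {
	fmt.Println(exposesOnlyOptimizerTools([]string{"find_tool", "call_tool"})) // true
	fmt.Println(exposesOnlyOptimizerTools([]string{"find_tool", "fetch"}))     // false
}
```

Inside the Ginkgo test the same check is more commonly written with Gomega's ConsistOf matcher over the collected names.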
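TEI managed optimizer context (Context("TEI managed optimizer (--optimizer-embedding)", Label("requires-docker"), ...)):
Use a 90-second readiness timeout for WaitForMCPServerReady to accommodate the model download. After the server is ready:
assert ListTools returns find_tool and call_tool
call find_tool with {"query": "a semantic query"} and assert no error
assert docker ps --filter name=thv-embedding- shows exactly one matching container (Docker CLI invoked via exec.Command("docker", "ps", "--filter", "name=thv-embedding-", "--format", "{{.Names}}"))
stop thv vmcp serve and assert the thv-embedding-* container is stopped/removed (poll with docker ps for up to 15 seconds)
Fail-fast context (Context("fail-fast when TEI fails to start", ...)):
Run thv vmcp serve --group <name> --optimizer-embedding --embedding-image ghcr.io/invalid/nonexistent:latest synchronously with RunWithTimeout(120 * time.Second). Assert the command returns a non-nil error. Assert stderr contains a substring indicating TEI failure (e.g., "TEI", "embedding", or "failed to start").
Idempotent TEI reuse context (Context("idempotent TEI container reuse", Label("requires-docker"), ...)):
Start instance A with --optimizer-embedding and wait for readiness.
Record containersBefore = count of docker ps output lines matching thv-embedding-.
Stop instance A, then start instance B with --optimizer-embedding and the default model.
Assert the docker ps output count matching thv-embedding- equals containersBefore (no new container created).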
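The container-count comparison used by the idempotent reuse test can be reduced to a pure function over docker ps output. A minimal sketch, assuming the output comes from docker ps --format "{{.Names}}" (one container name per line); countEmbeddingContainers is a hypothetical helper name, not an existing function in test/e2e.

```go
package main

import (
	"fmt"
	"strings"
)

// embeddingPrefix is the TEI container name prefix described in this issue.
const embeddingPrefix = "thv-embedding-"

// countEmbeddingContainers counts the lines of `docker ps --format {{.Names}}`
// output whose container name starts with the TEI prefix.
func countEmbeddingContainers(dockerPSOutput string) int {
	count := 0
	for _, line := range strings.Split(dockerPSOutput, "\n") {
		if strings.HasPrefix(strings.TrimSpace(line), embeddingPrefix) {
			count++
		}
	}
	return count
}

func main() {
	out := "thv-embedding-ab12cd\nfetch-backend\n"
	fmt.Println(countEmbeddingContainers(out)) // 1
}
```

Checking the prefix explicitly is slightly stricter than relying on the --filter name= flag alone, which matches on a substring of the name rather than on its start.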
Standalone vmcp regression context (Context("standalone vmcp serve regression", ...)):
Locate the vmcp binary via os.Getenv("VMCP_BINARY"), falling back to searching PATH.
Generate a minimal YAML config file in a directory created with os.MkdirTemp, referencing the test group's workloads via static backend URLs obtained from GetMCPServerURL.
Start vmcp serve --config <tmpfile> --port <port> with exec.Command (not THVCommand), capturing stdout/stderr to GinkgoWriter.
Poll readiness with WaitForMCPServerReady.
Connect with NewMCPClientForSSE and assert ListTools returns at least one tool.
Send SIGINT to the process and wait for it to exit.
Patterns & Frameworks
Go/Ginkgo v2: Describe / Context / It / BeforeEach / AfterEach / DeferCleanup / Eventually / By — consistent with group_test.go and api_workload_lifecycle_test.go
DeferCleanup before starting each background process: register the SIGINT + Wait cleanup before the process starts, so cleanup always runs even if the It body panics mid-way
StartLongRunningTHVCommand from helpers.go: use for all background thv vmcp serve invocations; it pipes stdout/stderr to GinkgoWriter for CI debugging
WaitForMCPServerReady from mcp_client_helpers.go: pass "sse" as the mode and a 90-second timeout for TEI-backed tests
MCPClientHelper.ListTools + MCPClientHelper.ExpectToolExists from mcp_client_helpers.go: assert optimizer tool exposure
No gomock in E2E: real subprocesses and Docker commands only
SPDX header: // SPDX-FileCopyrightText: Copyright 2025 Stacklok, Inc. and // SPDX-License-Identifier: Apache-2.0 at the top of the file
Label("requires-docker") on TEI tests + Skip(...) guard on SKIP_DOCKER_TESTS=true
Code Pointers
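test/e2e/helpers.go: StartLongRunningTHVCommand, WaitForMCPServer, StopAndRemoveMCPServer, RemoveGroup, CreateAndTrackGroup, GenerateUniqueServerName, CheckTHVBinaryAvailable; all reusable from the new test file
test/e2e/mcp_client_helpers.go: NewMCPClientForSSE, MCPClientHelper.Initialize, MCPClientHelper.ListTools, MCPClientHelper.ExpectToolExists, MCPClientHelper.CallTool, WaitForMCPServerReady; MCP connectivity helpers
test/e2e/vmcp_cli_test.go (#4888, E2E tests: quick mode and config-file mode): allocateFreePort() helper and the Describe("vMCP CLI", ...) block structure to match; AfterEach cleanup pattern for background processes
test/e2e/e2e_suite_test.go: Ginkgo suite infrastructure; the new test file joins the same suite automatically
test/e2e/inspector_test.go: pattern for starting a long-running subcommand, SIGINT cleanup in AfterEach
test/e2e/group_test.go: group + workload setup/teardown with BeforeEach/AfterEach
test/e2e/api_workload_lifecycle_test.go: Eventually(..., 60*time.Second, 2*time.Second) polling pattern and By(...) step annotations
pkg/vmcp/optimizer/optimizer.go: find_tool and call_tool tool names to assert in FTS5/TEI tests
cmd/vmcp/app/commands.go: standalone vmcp binary entrypoint; --config and --port flag names for the regression test
.claude/rules/testing.md: E2E test strategy, Ginkgo patterns, DeferCleanup usage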
Component Interfaces
// SPDX-FileCopyrightText: Copyright 2025 Stacklok, Inc.
// SPDX-License-Identifier: Apache-2.0

// test/e2e/vmcp_optimizer_test.go: top-level structure
package e2e_test

import (
	"os"
	"os/exec"
	"time"

	. "github.com/onsi/ginkgo/v2"
	. "github.com/onsi/gomega"

	// Assumed import path for the shared E2E helpers package.
	"github.com/stacklok/toolhive/test/e2e"
)

var _ = Describe("vMCP CLI optimizer", Label("vmcp", "optimizer", "e2e"), func() {
	var (
		config           *e2e.TestConfig
		groupName        string
		backendName      string
		createdWorkloads []string
	)

	BeforeEach(func() {
		config = e2e.NewTestConfig()
		groupName = e2e.GenerateUniqueServerName("vmcp-opt-group")
		backendName = e2e.GenerateUniqueServerName("vmcp-opt-backend")
		createdWorkloads = nil

		Expect(e2e.CheckTHVBinaryAvailable(config)).To(Succeed())

		// Create group and start backend workload
		e2e.CreateAndTrackGroup(config, groupName, &[]string{})
		e2e.NewTHVCommand(config, "run", "fetch", "--name", backendName,
			"--group", groupName).ExpectSuccess()
		createdWorkloads = append(createdWorkloads, backendName)
		Expect(e2e.WaitForMCPServer(config, backendName, 60*time.Second)).To(Succeed())
	})

	AfterEach(func() {
		// Best-effort teardown: errors are ignored so every resource is attempted.
		for _, w := range createdWorkloads {
			_ = e2e.StopAndRemoveMCPServer(config, w)
		}
		_ = e2e.RemoveGroup(config, groupName)
	})

	Context("FTS5 optimizer (--optimizer)", func() {
		It("exposes only find_tool and call_tool with BM25 search", func() { /* ... */ })
	})

	Context("TEI managed optimizer (--optimizer-embedding)",
		Label("requires-docker"), func() {
			BeforeEach(skipIfNoDocker)

			It("auto-starts TEI container and enables semantic search", func() { /* ... */ })
			It("stops the TEI container on serve exit", func() { /* ... */ })
			It("reuses the same TEI container across two invocations", func() { /* ... */ })
		})

	Context("fail-fast behaviour", func() {
		It("exits non-zero with a clear error when TEI fails to start", func() { /* ... */ })
	})

	Context("standalone vmcp serve regression", func() {
		It("works identically after Phase 1 extraction refactor", func() { /* ... */ })
	})
})

// skipIfNoDocker skips the current test if Docker is unavailable
// or SKIP_DOCKER_TESTS is set to "true".
func skipIfNoDocker() {
	if os.Getenv("SKIP_DOCKER_TESTS") == "true" {
		Skip("Skipping Docker-dependent test (SKIP_DOCKER_TESTS=true)")
	}
	if _, err := exec.LookPath("docker"); err != nil {
		Skip("Skipping Docker-dependent test (docker CLI not found on PATH)")
	}
}
Testing Strategy
Unit Tests
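Not applicable for this item: the E2E tests exercise real thv and vmcp subprocesses, and unit test coverage for the optimizer wiring logic is scoped to #4887 (Wire optimizer flags into thv vmcp serve).
Integration Tests (E2E: the primary deliverable)
FTS5 optimizer: thv vmcp serve --group <name> --optimizer → ListTools returns exactly [find_tool, call_tool] → find_tool call with keyword returns non-error
TEI managed optimizer: thv vmcp serve --group <name> --optimizer-embedding → TEI container visible in docker ps → ListTools returns [find_tool, call_tool] → find_tool semantic call returns non-error
TEI cleanup: after stopping thv vmcp serve --optimizer-embedding, docker ps no longer shows the thv-embedding-* container within 15 seconds
Idempotent reuse: a second invocation with the same model reuses the existing thv-embedding-* container
Fail-fast: thv vmcp serve --optimizer-embedding --embedding-image <bad-image> exits non-zero with TEI error message; no vMCP server starts
Standalone vmcp serve regression: vmcp serve --config <valid.yaml> starts, MCP client connects, ListTools returns backend tools
Edge Cases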
TEI readiness timeout: model download can take 30–60 s; use 90-second WaitForMCPServerReady timeout for TEI tests to avoid spurious CI failures
Cleanup on panic: DeferCleanup(func() { vMCPCmd.Process.Signal(syscall.SIGINT); vMCPCmd.Wait() }) registered before the It body starts the serve process — ensures cleanup even if assertions panic
Parallel test safety: each test invocation uses GenerateUniqueServerName for group and backend names, and allocateFreePort() for port numbers, preventing cross-test interference
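Out of Scope
Unit tests for the optimizer wiring logic (pkg/vmcp/cli/serve.go): those belong to #4887 (Wire optimizer flags into thv vmcp serve)
Tier 3 (config-file based) optimizer where the user specifies optimizer.embeddingService directly in YAML: the existing config-load path already handles this; no new E2E coverage needed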
Kubernetes/chainsaw vMCP E2E tests — those remain unchanged under test/e2e/chainsaw/operator/
thv vmcp status subcommand — explicitly deferred per RFC open questions
MCP protocol conformance testing beyond basic ListTools and CallTool connectivity
ARM64 / Apple Silicon platform-specific CI adjustments — CI runs on amd64 Linux; TEI CPU image works there without Rosetta emulation