Collaborator

@JAORMX JAORMX commented Nov 27, 2025

Summary

  • Adds two new e2e test files for VirtualMCPServer using yardstick as the deterministic backend
  • virtualmcp_yardstick_base_test.go: Tests basic multi-backend aggregation with prefix conflict resolution
  • virtualmcp_aggregation_filtering_test.go: Tests tool filtering per backend workload

Why Yardstick?

Yardstick is a deterministic MCP server designed specifically for testing:

  • Provides a simple echo tool that returns the input
  • No external dependencies or network calls
  • Deterministic responses for reliable testing
  • Available as a container image: ghcr.io/stackloklabs/yardstick/yardstick-server:0.0.2

Test Coverage

Base Test (virtualmcp_yardstick_base_test.go)

  • Creates 2 yardstick backends in an MCPGroup
  • Tests VirtualMCPServer with prefix conflict resolution
  • Verifies tools from both backends are aggregated
  • Tests calling echo tool through vMCP proxy

Filtering Test (virtualmcp_aggregation_filtering_test.go)

  • Tests aggregation.tools[].filter configuration
  • Verifies only filtered tools are exposed
  • Verifies empty filter excludes all tools from a backend
  • Tests that filtered tools can still be called

Test plan

  • Run task thv-operator-e2e-test to execute these tests (requires Kind cluster)
  • Verify linter passes: task lint

🤖 Generated with Claude Code

This adds two new e2e test files for VirtualMCPServer:

1. virtualmcp_yardstick_base_test.go:
   - Tests basic multi-backend aggregation using yardstick
   - Verifies prefix conflict resolution
   - Tests tool listing and tool calls through vMCP
   - Uses deterministic yardstick echo tool for reliable testing

2. virtualmcp_aggregation_filtering_test.go:
   - Tests tool filtering per backend workload
   - Verifies that filtered tools from one backend appear
   - Verifies that empty filter excludes all tools from a backend
   - Tests that filtered tools can still be called

Yardstick is a deterministic MCP server designed for testing that
provides an "echo" tool returning the input - perfect for verifying
data flow through vMCP aggregation and composite tools.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@github-actions github-actions bot added the size/L Large PR: 600-999 lines changed label Nov 27, 2025

codecov bot commented Nov 27, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 56.36%. Comparing base (4e36464) to head (9679e2f).
⚠️ Report is 6 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2778      +/-   ##
==========================================
+ Coverage   56.34%   56.36%   +0.02%     
==========================================
  Files         319      319              
  Lines       30887    30887              
==========================================
+ Hits        17402    17409       +7     
+ Misses      11999    11986      -13     
- Partials     1486     1492       +6     

The previous test incorrectly assumed that an empty filter `[]string{}`
would exclude all tools from a backend. In reality, empty/nil filter
means "allow all tools" (no filtering applied).

This fix uses a non-matching filter `["nonexistent_tool"]` as a
workaround to exclude all tools from backend2, since yardstick only
exposes the "echo" tool.

Added TODO comments referencing issue #2779 which proposes adding
an `excludeAll` option for proper tool exclusion.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@github-actions github-actions bot added size/L Large PR: 600-999 lines changed and removed size/L Large PR: 600-999 lines changed labels Nov 27, 2025
@jhrozek jhrozek merged commit 419d7c3 into main Nov 28, 2025
35 of 36 checks passed
@jhrozek jhrozek deleted the add-vmcp-yardstick-e2e-tests branch November 28, 2025 11:18
Contributor

slyt3 commented Dec 5, 2025

Hello, I have some concerns about this code.

The anonymous auth setting (type: "anonymous") is effectively no authentication. Can anyone connect?
There is also no overall timeout limit. Some contexts have a timeout, but the loops that process external data don't check sizes.
tools.Tools could be huge, and there are no size checks before iterating over it.

_ = k8sClient.Delete() ignores cleanup failures?
Also, testInput := "filtered123" goes straight to an external service without validation?

Labels

size/L Large PR: 600-999 lines changed

5 participants