[Reflect] RF-3755: Release new MCP tools for creating and editing tests. Add new SAP-specific test creation skill.#369

Merged
sazap10 merged 3 commits into SmartBear:main from tmcneal:RF-3755-2
Mar 18, 2026

@tmcneal
Contributor

@tmcneal tmcneal commented Mar 13, 2026

Goal

This adds MCP tools and associated instructions that enable Agents to assist Testers in creating and modifying Reflect tests. It also exposes a new "skill", in the form of a Prompt definition, that provides extra instructions for when a user wants to test an SAP application.

Design

The list_segments tool uses the Reflect API just like the existing tools. The other tools interact with Reflect via a persistent WebSocket connection. This connection is made directly to the test session that's running in our Cloud infrastructure. This works the same way for Web tests and Mobile tests. After the WebSocket connection is established via connect_to_session, tools can be invoked which issue commands on the WebSocket connection.

The business logic for each of these WebSocket commands lies within Reflect's infrastructure. This means that the logic in the MCP server is pretty simple - its main responsibilities are maintaining the socket connection, emitting messages on the socket, waiting for messages on the socket, and error handling.
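The responsibilities listed above (emit a message, wait for the matching message, surface errors) can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the message shapes, CommandChannel, Transport, and EchoTransport are all hypothetical names, and the in-memory echo transport stands in for the real session socket so the sketch runs on its own.

```typescript
// Hypothetical message shapes; the real Reflect wire format is not shown in this PR.
interface OutgoingCommand { id: string; command: string; payload?: unknown }
interface IncomingMessage { id: string; result?: unknown; error?: string }

// Minimal transport abstraction so the sketch runs without a real WebSocket.
interface Transport {
  send(data: string): void;
  onMessage(handler: (data: string) => void): void;
}

type Waiter = { resolve: (value: unknown) => void; reject: (reason: Error) => void };

class CommandChannel {
  private nextId = 0;
  private pending = new Map<string, Waiter>();
  private transport: Transport;

  constructor(transport: Transport) {
    this.transport = transport;
    transport.onMessage((data) => this.handleMessage(data));
  }

  // Emit a command and return a promise settled by the reply carrying the same id.
  sendCommand(command: string, payload?: unknown): Promise<unknown> {
    const id = String(++this.nextId);
    const reply = new Promise<unknown>((resolve, reject) => {
      this.pending.set(id, { resolve, reject });
    });
    const message: OutgoingCommand = { id, command, payload };
    this.transport.send(JSON.stringify(message));
    return reply;
  }

  private handleMessage(data: string): void {
    const message = JSON.parse(data) as IncomingMessage;
    const waiter = this.pending.get(message.id);
    if (!waiter) return; // unsolicited or already-settled message
    this.pending.delete(message.id);
    if (message.error !== undefined) waiter.reject(new Error(message.error));
    else waiter.resolve(message.result);
  }
}

// In-memory stand-in for the session socket: replies "ok:<command>" asynchronously.
class EchoTransport implements Transport {
  private handler: (data: string) => void = () => {};

  send(data: string): void {
    const { id, command } = JSON.parse(data) as OutgoingCommand;
    Promise.resolve().then(() => this.handler(JSON.stringify({ id, result: `ok:${command}` })));
  }

  onMessage(handler: (data: string) => void): void {
    this.handler = handler;
  }
}
```

Because replies are matched by id rather than by arrival order, several commands can be in flight on the same connection at once.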

Changeset

The following tools are added:

  • list_segments: Lists the reusable test steps in the account, and includes some relevant metadata about each segment to help the Agent understand when a segment could be used as part of accomplishing a task.
  • connect_to_session: Establishes a WebSocket connection to a live Reflect test recording session.
  • add_prompt_step: Adds a natural language instruction to the test. This could be a single action, assertion, or query.
  • get_screenshot: Returns a screenshot of the current browser window (Web) or device (Mobile). The Agent is instructed to analyze the screenshot to determine the current state of the page and to decide what to do next to complete its assigned task.
  • delete_previous_step: Performs an "undo" on a step or segment that was previously added. The Agent is instructed to use this tool when a step or segment that it has added has failed or did not perform the task as intended.
  • add_segment: Adds an existing set of reusable test steps to the test.
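The intended interplay of these tools can be sketched as a simple loop. Only the tool names come from this PR. The CallTool signature, the ToolResult shape, and the argument names (sessionId, prompt) are assumptions made for illustration, and a real agent would inspect the screenshot rather than ignore it.

```typescript
// Hypothetical tool-calling interface; a real MCP client would supply this.
type ToolResult = { ok: boolean; detail?: string };
type CallTool = (name: string, args: Record<string, unknown>) => Promise<ToolResult>;

// Connect once, then for each instruction: look at the screen, add the step,
// and undo it if it failed. Returns the prompts that were kept in the test.
async function buildTest(
  callTool: CallTool,
  sessionId: string, // assumed argument name
  prompts: string[],
): Promise<string[]> {
  const connected = await callTool("connect_to_session", { sessionId });
  if (!connected.ok) {
    throw new Error(`could not connect: ${connected.detail ?? "unknown error"}`);
  }

  const kept: string[] = [];
  for (const prompt of prompts) {
    // The agent would analyze this screenshot to decide what to do next.
    await callTool("get_screenshot", {});
    const added = await callTool("add_prompt_step", { prompt }); // assumed argument name
    if (added.ok) {
      kept.push(prompt);
    } else {
      // Undo the step that failed or did not do what was intended.
      await callTool("delete_previous_step", {});
    }
  }
  return kept;
}
```

Passing callTool in as a parameter keeps the sketch independent of any particular MCP client library.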

Additionally, the instructions property is now updated with guidance to the Agent on how to properly build a Reflect test.

Finally, a new skill is added that's specific to testing SAP apps.

Testing

  • Ran each of the tools and verified they are working correctly and responses are as expected
  • Created a Web test
  • Created a Mobile test
  • Created an SAP test using the reflect-sap-test skill

@tmcneal tmcneal requested review from a team as code owners March 13, 2026 17:58
Comment on lines +53 to +54
const response = (await responsePromise) as Record<string, unknown>;
const result = response.result as Record<string, unknown> | undefined;
Contributor

You should just cast as the correct type

Suggested change
- const response = (await responsePromise) as Record<string, unknown>;
- const result = response.result as Record<string, unknown> | undefined;
+ const response = (await responsePromise) as { result?: { typ?: string; response?: string | boolean } };
+ const result = response.result;
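A cast only tells the compiler what to expect. If the socket payload should also be checked at runtime, a type guard is one alternative. The sketch below reuses the shape from the suggested change above. The names PromptStepResponse and isPromptStepResponse are hypothetical and do not appear in the PR.

```typescript
// Shape taken from the suggested change above; the interface name is hypothetical.
interface PromptStepResponse {
  result?: { typ?: string; response?: string | boolean };
}

// Narrow an unknown socket message to PromptStepResponse by checking each field.
function isPromptStepResponse(value: unknown): value is PromptStepResponse {
  if (typeof value !== "object" || value === null) return false;
  const result = (value as { result?: unknown }).result;
  if (result === undefined) return true; // result is optional
  if (typeof result !== "object" || result === null) return false;
  const { typ, response } = result as { typ?: unknown; response?: unknown };
  const typOk = typ === undefined || typeof typ === "string";
  const responseOk =
    response === undefined || typeof response === "string" || typeof response === "boolean";
  return typOk && responseOk;
}
```

After `if (isPromptStepResponse(message))`, `message.result` is typed without any cast, and malformed payloads can be rejected explicitly.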

import type { PromptParams } from "../common/types";

export const PROMPTS: PromptParams[] = [
{
Contributor

IMO this should follow the same one-file-per-entry layout as the tools, rather than starting a non-scaling pattern again and refactoring later.

Comment on lines +39 to +43
private activeConnections = new Map<string, WebSocketManager>();
private sessionStates = new Map<
  string,
  { platform: TestPlatform; test: { name: string } }
>();

Do we need to store activeConnections and sessionStates on the shared ReflectClient instance here?

I’m asking because ReflectClient keeps long-lived mutable state (apiToken, activeConnections, sessionStates) on the client instance, while the clients are registered as singletons and the same instance is reused across HTTP MCP sessions. This means one HTTP session could overwrite another session’s Reflect auth and potentially inherit or leak live recording session sockets.

Also, when a session closes, it only removes the transport, it doesn’t clear those Reflect-held connections. Let me know if I misunderstood completely.

Contributor Author

Updated server.ts to expose a "cleanup" function and am now using it to clean up the websocket connections. I think this only matters when running the MCP server in HTTP mode, since in stdio mode you can only ever have one session.
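One way to picture the per-session cleanup discussed in this thread is a registry keyed by MCP session, so that closing one session can only ever touch its own sockets. This is a sketch under assumptions, not the change that was made to server.ts: SessionConnections, Closable, and the id parameter names are hypothetical.

```typescript
// Hypothetical stand-in for the PR's WebSocketManager; only close() matters here.
interface Closable {
  close(): void;
}

class SessionConnections<T extends Closable> {
  // Outer key: MCP session id. Inner key: Reflect recording session id.
  private bySession = new Map<string, Map<string, T>>();

  add(mcpSessionId: string, recordingId: string, connection: T): void {
    let connections = this.bySession.get(mcpSessionId);
    if (!connections) {
      connections = new Map<string, T>();
      this.bySession.set(mcpSessionId, connections);
    }
    connections.set(recordingId, connection);
  }

  get(mcpSessionId: string, recordingId: string): T | undefined {
    return this.bySession.get(mcpSessionId)?.get(recordingId);
  }

  // Close and forget everything owned by one MCP session.
  // Returns how many connections were closed.
  cleanup(mcpSessionId: string): number {
    const connections = this.bySession.get(mcpSessionId);
    if (!connections) return 0;
    connections.forEach((connection) => connection.close());
    this.bySession.delete(mcpSessionId);
    return connections.size;
  }
}
```

In stdio mode there would only ever be one outer key, so the extra level costs little while keeping HTTP sessions isolated from each other.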

Comment on lines +98 to +102
waitForResponse(id: string): Promise<unknown> {
  return new Promise((resolve, reject) => {
    this.pendingResponses.set(id, { resolve, reject });
  });
}

What should happen here if the websocket disconnects or Reflect never sends a response for this request? Should we add timeout/cleanup logic so the tool call doesn't hang forever and its entry doesn't stay in pendingResponses? I think adding a close handler that rejects all pending promises and cleans up state would prevent the call from hanging.
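The close-handler idea above could look roughly like this. It is a standalone sketch, not the PR's WebSocketManager: PendingResponses, resolveResponse, and rejectAll are hypothetical names, and only waitForResponse mirrors the method in the diff.

```typescript
type Settler = { resolve: (value: unknown) => void; reject: (reason: Error) => void };

class PendingResponses {
  private pending = new Map<string, Settler>();

  // Same shape as waitForResponse in the diff: park a promise under the request id.
  waitForResponse(id: string): Promise<unknown> {
    return new Promise((resolve, reject) => {
      this.pending.set(id, { resolve, reject });
    });
  }

  // Settle one request when its reply arrives; returns false for unknown ids.
  resolveResponse(id: string, value: unknown): boolean {
    const entry = this.pending.get(id);
    if (!entry) return false;
    this.pending.delete(id);
    entry.resolve(value);
    return true;
  }

  // Intended to be called from the socket's close or error handler so that
  // no caller waits forever. Returns how many requests were rejected.
  rejectAll(reason: string): number {
    const count = this.pending.size;
    this.pending.forEach((entry) => entry.reject(new Error(reason)));
    this.pending.clear();
    return count;
  }

  get size(): number {
    return this.pending.size;
  }
}
```

This addresses the disconnect case without reintroducing a timeout, so slow operations that eventually succeed are not reported as failures.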

Contributor Author

I originally had timeout logic here but removed it: some operations took longer than the timeout yet succeeded in Reflect, so the MCP server ended up treating them as failed. We could potentially add reconnect logic to make it resilient to websocket disconnects, but there might be value in keeping the code simple for now and adding it once we see evidence that reconnect logic would benefit users in practice. For now, given that websockets are only valid for the length of a recording session, a user can recover by ending the recording session and restarting it...

Comment on lines +53 to +54
const response = (await responsePromise) as Record<string, unknown>;
const result = response.result as Record<string, unknown> | undefined;

[nit] Cast the response to MCPAddPromptStepSuccessResponse (which you already have defined!)

@tmcneal
Contributor Author

tmcneal commented Mar 16, 2026

@sazap10 This is ready for final review and approval

@tmcneal
Contributor Author

tmcneal commented Mar 18, 2026

@domarmstrong Are you able to approve and merge this?

@tmcneal tmcneal force-pushed the RF-3755-2 branch 2 times, most recently from cea58d9 to 318b30c Compare March 18, 2026 13:39
tmcneal added 2 commits March 18, 2026 11:28
This adds a new set of MCP tools that allows external agents to interact with
a live Web or Mobile recording session. This enables an external agent to assist
in creating and editing tests.

Also includes an update to the base server.ts to support cleaning up MCP
session-related items when a session ends. This is required so that we can clean up
and close WebSockets associated with a given session when we're running in HTTP mode.
…tests

This populates the MCP server "instructions" property with guidance for how the agent
should approach creating and editing tests in Reflect.

@mahavir1166 mahavir1166 left a comment

🚢

@mahavir1166 mahavir1166 enabled auto-merge (squash) March 18, 2026 16:10
@mahavir1166 mahavir1166 removed the request for review from a team March 18, 2026 16:26
auto-merge was automatically disabled March 18, 2026 16:50

Head branch was pushed to by a user without write access

Via the "prompts" property, we expose a skill that gives the agent additional guidance
when creating Reflect tests for SAP BTP apps and SAP S/4HANA.
@mahavir1166 mahavir1166 enabled auto-merge (squash) March 18, 2026 17:16
@sazap10 sazap10 disabled auto-merge March 18, 2026 18:05
@sazap10 sazap10 merged commit f72b401 into SmartBear:main Mar 18, 2026
39 of 41 checks passed
@tmcneal tmcneal mentioned this pull request Mar 25, 2026

4 participants