Development and Testing of REST MVC‐enabled Agent

WaitFor Pattern with Spring MVC Integration

Overview

This document explains how to integrate Embabel agents with Spring MVC to create Human-in-the-Loop (HITL) workflows. The pattern allows agents to pause execution, wait for user input via HTTP REST APIs, and then resume processing with the user's response.

Reference

https://github.com/embabel/embabel-agent/blob/main/embabel-agent-api/src/test/kotlin/com/embabel/agent/api/hitl/WaitForMvcIntegrationTest.kt

1. Agent Behavior with waitFor()

What is waitFor()?

waitFor() is a function that interrupts agent execution to request user input. When an agent action calls waitFor(), it:

Throws an exception (AwaitableResponseException) instead of returning normally
The exception carries an Awaitable object containing the question/form for the user
The agent framework catches this exception and transitions the process to WAITING state
The Awaitable is stored in the blackboard for later retrieval
Execution stops - the exception unwinds the current action, so nothing after the waitFor() call runs
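
The steps above can be modelled without the framework. The following is a toy sketch, not Embabel's implementation: ToyAwaitable, ToyAwaitableResponseException, and ToyProcess are hypothetical stand-ins, and the blackboard is reduced to a plain list. It only shows how an exception thrown by an action can be translated into a WAITING status.

```kotlin
// Toy model of the waitFor() control flow. All names are hypothetical
// stand-ins; this is not Embabel's implementation.

enum class Status { RUNNING, WAITING, COMPLETED }

// Carries the question that should be shown to the user.
data class ToyAwaitable(val id: String, val prompt: String)

// Thrown instead of returning, so the current action is abandoned.
class ToyAwaitableResponseException(val awaitable: ToyAwaitable) :
    RuntimeException("Awaiting user response: ${awaitable.prompt}")

class ToyProcess {
    var status: Status = Status.RUNNING
        private set

    // Simplified blackboard: a list of arbitrary objects.
    val blackboard: MutableList<Any> = mutableListOf()

    // Runs one action and translates the exception into a state change.
    fun run(action: () -> Any) {
        try {
            blackboard.add(action())
            status = Status.COMPLETED
        } catch (e: ToyAwaitableResponseException) {
            blackboard.add(e.awaitable) // kept for later retrieval
            status = Status.WAITING
        }
    }
}
```

Running an action that throws ToyAwaitableResponseException leaves the process in WAITING with the awaitable on the blackboard. An action that returns normally leaves it in COMPLETED.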

The Paradox: Return Type vs Exception

A waitFor() call looks like it returns a value:

val userChoice: UserChoice = waitFor(choiceAwaitable)

But waitFor() always throws an exception and never actually returns! The return type serves a different purpose:

It's a type contract telling the framework what type to expect in the blackboard after the user responds
It enables action chaining - the next action can receive this type as a parameter
It allows the planner to formulate a plan - it can see that this action "produces" UserChoice

Think of it like this: The action promises to eventually produce a UserChoice, but not right now - only after the user responds.
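
The typing trick can be reproduced in plain Kotlin. This is a minimal sketch under assumed names (toyWaitFor, PendingInput, and PendingInputException are hypothetical, not Embabel signatures): a generic function may declare a return type P while its body always throws, because a throw expression has type Nothing, which is a subtype of every type.

```kotlin
// Hypothetical signatures, chosen to illustrate the typing trick only.

data class UserChoice(val choice: String)

// P is the payload type the caller is promised "eventually".
class PendingInput<P>(val prompt: String)

class PendingInputException(val pending: PendingInput<*>) :
    RuntimeException(pending.prompt)

// Declared to return P, but the body always throws, so no P is produced here.
// The declared type exists for the compiler and for anything that inspects
// the signature.
fun <P> toyWaitFor(pending: PendingInput<P>): P = throw PendingInputException(pending)

// Compiles as if a UserChoice came back; at run time the assignment never completes.
fun getChoice(): UserChoice {
    val userChoice: UserChoice = toyWaitFor(PendingInput<UserChoice>("Castle or Forest?"))
    return userChoice // never reached
}
```

Calling getChoice() always ends in PendingInputException, yet its signature still advertises UserChoice to anything that reads it.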

Action Chaining Pattern

Agents work by chaining actions where Action A's output becomes Action B's input parameter:

Action 1: getChoice()
  - Declares return type: UserChoice
  - Actually: throws exception, enters WAITING
  - Framework records: "This action produces UserChoice"

[User responds via HTTP]

Action 2: processChoice(choice: UserChoice)
  - Needs UserChoice as input
  - Framework injects UserChoice from blackboard
  - Executes and produces final result

The framework's planner examines available actions and creates a plan:

"getChoice produces UserChoice"
"processChoice needs UserChoice"
"So I'll execute getChoice, then processChoice"

Without the correct return type, the planner cannot formulate a plan and the agent gets STUCK.
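
The planner's reasoning can be illustrated with a deliberately naive forward planner. This is a sketch for intuition only: ActionSpec and plan are hypothetical names, and the real Embabel planner is more sophisticated than this greedy loop. It shows why a mismatched "produces" type leaves no path to the goal.

```kotlin
import kotlin.reflect.KClass

// Hypothetical description of an action: what it needs and what it produces.
data class ActionSpec(val name: String, val needs: Set<KClass<*>>, val produces: KClass<*>)

// Greedy forward planner: repeatedly pick any action whose inputs are available.
// Returns the ordered action names, or null when the goal is unreachable (STUCK).
fun plan(actions: List<ActionSpec>, available: Set<KClass<*>>, goal: KClass<*>): List<String>? {
    val have = available.toMutableSet()
    val remaining = actions.toMutableList()
    val steps = mutableListOf<String>()
    while (goal !in have) {
        val next = remaining.firstOrNull { have.containsAll(it.needs) } ?: return null
        remaining.remove(next)
        have.add(next.produces)
        steps.add(next.name)
    }
    return steps
}

// Marker types standing in for the domain classes used in this document.
class UserInput
class UserChoice
class ChoiceRequest
class AdventureResult
```

With getChoice declared as producing UserChoice, the plan is getChoice followed by processChoice. If getChoice is declared as producing ChoiceRequest instead, plan returns null, which corresponds to the STUCK outcome.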

The Awaitable's Role

An Awaitable has two key responsibilities:

1. Payload (what the next action needs)

The awaitable's generic payload type MUST match what the next action expects
This is NOT what you show the user, but what you'll put in the blackboard after they respond
Example: AbstractAwaitable<UserChoice, UserChoiceResponse> - payload is UserChoice

2. onResponse (transformation logic)

When the user responds via HTTP, onResponse() is called
It transforms the HTTP response into a domain object
It adds that domain object to the blackboard
This is what makes the "return value" of waitFor magically appear

Think of the Awaitable as a suspended function call - it captures what needs to happen when the answer arrives.
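
The two responsibilities can be sketched as follows. The shapes are loosely modelled on the description above and are assumptions: ToyAwaitable, ToyBlackboard, and ChoiceAwaitable are hypothetical, and the real AbstractAwaitable signature may differ (for example in how onResponse receives the blackboard).

```kotlin
// Hypothetical shapes; the real Embabel types and signatures may differ.

data class UserChoice(val choice: String)                                  // payload / domain object
data class UserChoiceResponse(val awaitableId: String, val choice: String) // what HTTP delivers

class ToyBlackboard {
    private val objects = mutableListOf<Any>()

    fun addObject(value: Any) { objects.add(value) }

    // Most recent object of the requested type, or null if none was added.
    fun <T : Any> last(type: Class<T>): T? = objects.filterIsInstance(type).lastOrNull()
}

// P = payload type promised to the next action, R = response type from the user.
abstract class ToyAwaitable<P : Any, R : Any>(val id: String, val prompt: String) {
    abstract fun onResponse(response: R, blackboard: ToyBlackboard)
}

class ChoiceAwaitable(id: String, prompt: String, val options: List<String>) :
    ToyAwaitable<UserChoice, UserChoiceResponse>(id, prompt) {

    override fun onResponse(response: UserChoiceResponse, blackboard: ToyBlackboard) {
        require(response.awaitableId == id) { "Response is for a different awaitable" }
        require(response.choice in options) { "Unknown option: ${response.choice}" }
        // Transform the transport-level response into the domain object and publish it.
        blackboard.addObject(UserChoice(response.choice))
    }
}
```

After onResponse runs, the blackboard holds a UserChoice, which is exactly the type the action that called waitFor() declared as its return type.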

Why Actions Need ActionContext

Every action should accept ActionContext as a parameter:

fun getChoice(input: UserInput, context: ActionContext): UserChoice

The framework uses ActionContext to:

Provide access to the current process state
Allow actions to set values for later use
Enable logging and debugging
Pass configuration and services

Without ActionContext, the action cannot properly integrate with the framework.

2. Controller Behavior

The REST controller manages the lifecycle of waiting agent processes. It has two main responsibilities:

POST /start - Initiating the Agent

What it does:

Creates an agent process - instantiates the agent, creates a blackboard, sets up initial state
Runs the agent - calls agentProcess.run() which executes actions until reaching a terminal state
Detects the state - checks if the process is WAITING, COMPLETED, FAILED, or STUCK
Saves the process - persists to repository so it can be resumed later
Returns appropriate response:
- If WAITING: extracts the Awaitable from blackboard, returns the question/choices to user
- If COMPLETED: extracts the final result from blackboard, returns it to user

Why it might enter WAITING: When the first action calls waitFor(), the framework catches the exception and sets status to WAITING instead of continuing to the next action.

Why it might complete immediately: If the agent doesn't call waitFor() in the first action, it continues executing all actions and completes.

POST /continue - Resuming the Agent

What it does:

Loads the waiting process - retrieves from repository using processId
Validates it's waiting - confirms status is WAITING (not already completed)
Finds the awaitable - retrieves the specific Awaitable from the blackboard
Calls onResponse - transforms user's HTTP input into domain object, adds to blackboard
Resumes execution - calls agentProcess.run() again to continue from where it paused
Detects the new state - checks if still WAITING or now COMPLETED
Saves the updated process - persists the new state
Returns appropriate response:
- If WAITING again: there's another waitFor() call, return the next question
- If COMPLETED: return the final result

Why it might enter WAITING again: The agent can have multiple waitFor() calls in sequence (e.g., ask for username, then password). Each requires a separate HTTP call.

The critical detail: onResponse() puts the user's choice into the blackboard with the exact type the next action expects, so the framework can inject it as a parameter when execution resumes.
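
The two endpoint flows can be sketched without Spring or Embabel. Everything below is a hypothetical stand-in (ToyProcess, ToyController, Question, and the method names start and resume are assumptions). A real controller would be a Spring @RestController delegating to the framework's process and repository types. The sketch only mirrors the order of steps: create or load, run, save, respond.

```kotlin
// Framework-free sketch of the two controller operations. All types are hypothetical.

enum class Status { RUNNING, WAITING, COMPLETED }

data class UserChoice(val choice: String)
data class AdventureResult(val message: String)
data class Question(val awaitableId: String, val prompt: String, val options: List<String>)

class WaitingForUser(val question: Question) : RuntimeException(question.prompt)

class ToyProcess(val id: String, val playerName: String) {
    val blackboard = mutableListOf<Any>()
    var status: Status = Status.RUNNING
        private set

    // Runs until an action asks for input or the final result is produced.
    fun run() {
        try {
            val choice = blackboard.filterIsInstance<UserChoice>().lastOrNull()
                ?: throw WaitingForUser(
                    Question("q-$id", "Where to, $playerName?", listOf("Castle", "Forest"))
                )
            blackboard.add(AdventureResult("You chose: ${choice.choice}"))
            status = Status.COMPLETED
        } catch (e: WaitingForUser) {
            blackboard.add(e.question)
            status = Status.WAITING
        }
    }
}

class ToyController {
    private val repository = mutableMapOf<String, ToyProcess>()

    // POST /start: create, run, save, then report whatever state was reached.
    fun start(processId: String, playerName: String): Any {
        val process = ToyProcess(processId, playerName)
        process.run()
        repository[processId] = process
        return respond(process)
    }

    // POST /continue: load, validate, apply the user's answer, run again, save.
    fun resume(processId: String, choice: String): Any {
        val process = repository[processId] ?: error("No process $processId")
        check(process.status == Status.WAITING) { "Process $processId is not waiting" }
        val question = process.blackboard.filterIsInstance<Question>().last()
        require(choice in question.options) { "Unknown option: $choice" }
        process.blackboard.add(UserChoice(choice)) // stands in for onResponse()
        process.run()
        repository[processId] = process
        return respond(process)
    }

    private fun respond(process: ToyProcess): Any = when (process.status) {
        Status.WAITING -> process.blackboard.filterIsInstance<Question>().last()
        Status.COMPLETED -> process.blackboard.filterIsInstance<AdventureResult>().last()
        Status.RUNNING -> error("run() has not finished")
    }
}
```

In this sketch, start("p1", "Player1") returns a Question, and a following resume("p1", "Castle") returns the AdventureResult. A workflow with several waitFor() calls would simply return another Question from resume.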

Flexible Return Types with ResponseEntity

Both endpoints return ResponseEntity<Any> because they can return different types:

Awaitable response when status = WAITING
- Contains: processId, awaitableId, prompt, options
- Tells user: "Here are your choices, call /continue with your answer"
Final result when status = COMPLETED
- Contains: the actual result (AdventureResult, LoginResult, etc.)
- Tells user: "Done! Here's your answer"

This flexibility enables multi-step workflows where you don't know in advance how many user interactions are needed.
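
The two response shapes might look like the following. This is a sketch under assumptions: AwaitableResponse uses the field names listed above, while the AdventureResult field and both function names are hypothetical. The body is typed Any because the two shapes share no useful supertype, which mirrors the choice of ResponseEntity<Any>.

```kotlin
// Hypothetical response shapes using the fields listed above.
data class AwaitableResponse(
    val processId: String,
    val awaitableId: String,
    val prompt: String,
    val options: List<String>,
)

data class AdventureResult(val message: String)

enum class Status { WAITING, COMPLETED }

// Picks the body for the HTTP response depending on the process status.
fun responseBody(status: Status, pending: AwaitableResponse?, result: AdventureResult?): Any =
    when (status) {
        Status.WAITING -> requireNotNull(pending) { "WAITING process has no awaitable" }
        Status.COMPLETED -> requireNotNull(result) { "COMPLETED process has no result" }
    }

// What a caller does with the body: decide whether another round trip is needed.
fun nextStep(body: Any): String = when (body) {
    is AwaitableResponse -> "ask user: ${body.prompt} ${body.options}"
    is AdventureResult -> "done: ${body.message}"
    else -> "unexpected body: $body"
}
```

A client loop would keep calling /continue for as long as nextStep reports a question, and stop once it reports a final result.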

Repository Pattern

The controller uses InMemoryAgentProcessRepository to:

Persist agent state between HTTP calls (otherwise, state is lost)
Enable resumption - load the exact same process instance with all its history
Support multiple concurrent processes - each user can have their own process ID

In production, you'd use a real database (PostgreSQL, MongoDB, etc.) instead of in-memory storage.
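
To make the trade-off concrete, here is a minimal sketch of the repository idea. ProcessStore and InMemoryProcessStore are hypothetical names with an assumed three-method contract. Embabel's own repository interface may expose different methods.

```kotlin
import java.util.concurrent.ConcurrentHashMap

// Hypothetical minimal contract for storing processes between HTTP calls.
interface ProcessStore<P : Any> {
    fun save(id: String, process: P)
    fun findById(id: String): P?
    fun delete(id: String)
}

// In-memory variant: safe for concurrent requests, but everything is lost on
// restart and nothing is shared across application instances.
class InMemoryProcessStore<P : Any> : ProcessStore<P> {
    private val processes = ConcurrentHashMap<String, P>()

    override fun save(id: String, process: P) { processes[id] = process }

    override fun findById(id: String): P? = processes[id]

    override fun delete(id: String) { processes.remove(id) }
}
```

If the controller depends only on the interface, a database-backed implementation can replace the in-memory one without touching the endpoint code.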

3. Test Behavior

The MockMvc integration test validates the complete workflow end-to-end.

Test Setup

The test creates a minimal Spring Boot application to avoid scanning unnecessary components:

WaitForTestApplication - minimal @SpringBootApplication that only scans the test package
WaitForTestConfig - provides InMemoryAgentProcessRepository as a bean
Avoids AgentApiTestApplication which scans too much and causes dependency issues

This approach ensures:

Fast test startup (no unnecessary beans)
No mocking needed (except MockMvc itself)
Real component integration (controller, repository, agent)

Test 1: Complete Flow - Start to Completion

What it validates:

Phase 1 - Starting:

POST /start with playerName
Expects HTTP 200 with awaitable response (processId, awaitableId, prompt, options)
Verifies process exists in repository with status = WAITING
Verifies awaitable exists in blackboard

Phase 2 - Continuing:

POST /continue with processId and user's choice
Expects HTTP 200 with final result
Verifies process status changed to COMPLETED
Verifies final result exists in blackboard with correct value

What it proves:

Agent successfully pauses at waitFor()
State persists between HTTP calls
User input correctly flows through onResponse → blackboard → next action
Agent resumes and completes successfully
Action chaining works (getChoice output → processChoice input)

Test 2: Different User Choice

What it validates:

Same flow as Test 1, but with a different choice ("Forest" instead of "Castle").

Why it matters:

Proves the system isn't hardcoded to one specific choice
Validates that user input truly drives the outcome
Ensures onResponse correctly handles different values

Test 3: Error Handling - 404 Not Found

What it validates:

POST /continue with non-existent processId
Expects HTTP 404 with error message

Why it matters:

Validates error handling for invalid process IDs
Ensures controller fails gracefully (no crashes)
Proves repository lookup works correctly
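
The behaviour this test checks boils down to a lookup with an explicit "not found" branch. The sketch below is framework-free and hypothetical throughout: HttpResult stands in for ResponseEntity, a Map stands in for the repository, and the message text is an assumption rather than the controller's actual wording.

```kotlin
// Framework-free sketch: a (status, body) pair stands in for ResponseEntity.
data class HttpResult(val status: Int, val body: String)

// Returns 404 when the process ID is unknown instead of letting a null
// propagate into the resume logic.
fun continueProcess(store: Map<String, String>, processId: String): HttpResult {
    val state = store[processId]
        ?: return HttpResult(404, "Process not found: $processId")
    return HttpResult(200, "Resumed process $processId from state $state")
}
```

The early return keeps the failure path separate from the resume logic, so an unknown ID can never reach onResponse or run().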

Logging for Observability

The test includes extensive logging at key points:

Controller entry points - when HTTP requests arrive
Agent action start/complete - when actions execute
Status transitions - when process enters WAITING or COMPLETED
onResponse calls - when user input is processed

This logging serves two purposes:

During development - see exactly what's happening step by step
During debugging - diagnose issues when tests fail

Example log sequence for successful flow:

=== POST /adventure/start - player: Player1 ===
=== getChoice action starting for player: Player1 ===
Action method entering wait state: Awaitable response exception
=== Agent process status after run: WAITING ===

=== POST /adventure/{id}/continue - choice: Castle ===
=== onResponse called, resuming agent process ===
=== processChoice action starting with choice: Castle ===
=== processChoice action completed with result: You chose: Castle ===
=== Agent process status after resume: COMPLETED ===

This trace shows the complete journey from start to finish.

Key Insights

Why the Payload Type Matters

The awaitable's payload type determines what the planner thinks the action produces:

AbstractAwaitable<ChoiceRequest, ...> → planner thinks action produces ChoiceRequest
AbstractAwaitable<UserChoice, ...> → planner thinks action produces UserChoice

If the next action expects UserChoice but the awaitable payload is ChoiceRequest, the planner cannot connect them and the agent gets STUCK.

The solution: Always make the payload type match what the next action's parameter expects.

The Flow of Data

Understanding how data flows through the system:

User submits choice via HTTP → controller receives JSON
Controller creates UserChoiceResponse → contains awaitableId + choice
Controller calls awaitable.onResponse() → receives UserChoiceResponse
onResponse creates UserChoice → domain object with just the choice value
onResponse adds to blackboard → blackboard.addObject(UserChoice("Castle"))
Agent resumes → planner looks for next action
Framework injects parameter → finds UserChoice in blackboard, passes to processChoice
processChoice executes → receives UserChoice("Castle"), produces result

Each step transforms the data:

HTTP JSON → UserChoiceResponse (REST layer)
UserChoiceResponse → UserChoice (domain layer)
UserChoice → AdventureResult (business logic layer)

This separation of concerns keeps the agent code clean and focused on business logic.
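
The three transformations can be written as three small functions, one per layer. This is a sketch with assumed shapes: the data class fields and function names are hypothetical, and a Map stands in for the parsed JSON body so that no JSON library is needed.

```kotlin
// Hypothetical layer types following the three transformations listed above.
data class UserChoiceResponse(val awaitableId: String, val choice: String) // REST layer
data class UserChoice(val choice: String)                                  // domain layer
data class AdventureResult(val message: String)                            // business result

// HTTP JSON -> UserChoiceResponse (the Map stands in for the parsed JSON body).
fun fromJson(body: Map<String, String>): UserChoiceResponse =
    UserChoiceResponse(
        awaitableId = requireNotNull(body["awaitableId"]) { "awaitableId is required" },
        choice = requireNotNull(body["choice"]) { "choice is required" },
    )

// UserChoiceResponse -> UserChoice (what onResponse would add to the blackboard).
fun toDomain(response: UserChoiceResponse): UserChoice = UserChoice(response.choice)

// UserChoice -> AdventureResult (the agent's business logic).
fun processChoice(choice: UserChoice): AdventureResult =
    AdventureResult("You chose: ${choice.choice}")
```

Note that processChoice never sees awaitableId or any other transport detail. That is the practical effect of the layering.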

The Suspended Execution Model

Think of the agent as having a "pause button":

Agent executes action getChoice
Hits waitFor() - PAUSE ⏸️
State saved, HTTP returns to user
(Time passes... seconds, minutes, hours)
User responds via HTTP
State loaded from repository
onResponse injects user's answer
Agent resumes - PLAY ▶️
Continues with processChoice

The agent code contains no pause or resume logic. The getChoice action is not re-entered after the user responds: the planner finds a UserChoice on the blackboard and moves on to processChoice, just as if getChoice had returned it. The framework handles all the complexity of pausing, saving, loading, and resuming.

Summary

The WaitFor pattern with Spring MVC enables interactive, conversational agents through five cooperating pieces:

Agents pause execution when calling waitFor(), entering WAITING state
Controllers manage lifecycle - start processes, save state, resume on user input
Awaitables bridge HTTP and domain - transform user responses into domain objects
Action chaining connects steps - output of one action becomes input of next
Tests validate end-to-end - prove the complete flow works with real components

This pattern unlocks powerful use cases:

Multi-step forms and wizards
Approval workflows
Interactive troubleshooting
Conversational interfaces
Human-in-the-loop AI systems

The key innovation is treating user input as just another data source - the agent doesn't care if data comes from a database, API, or human. It just waits for the framework to provide it.

(c) Embabel Software Inc 2024-2025.
