OpenHands · xingyaoww · Nov 13, 2025
@@ -0,0 +1,214 @@
+---
+title: Theory of Mind (ToM) Agent Integration
+description: Enable personalized user understanding and guidance through ToM agent integration for better handling of vague or ambiguous tasks.
+---
+
+## Overview
+
+The ToM (Theory of Mind) agent integration lets your agent infer user intent and preferences through user modeling. When a task is vague or ambiguous, the agent can consult the ToM agent for personalized guidance based on conversation history and user patterns.
+
+This feature is useful when:
+- User instructions are unclear or under-specified
+- You need help understanding what the user actually wants
+- You want guidance on the best approach for the current task
+- You want to build up user preferences and patterns from conversation history
+
+## Quick Start
+
+<Note>
+This example is available on GitHub: [examples/01_standalone_sdk/25_tom_agent.py](https://github.com/OpenHands/software-agent-sdk/blob/main/examples/01_standalone_sdk/25_tom_agent.py)
+</Note>
+
+```python icon="python" expandable examples/01_standalone_sdk/25_tom_agent.py
+"""Example demonstrating Tom agent with Theory of Mind capabilities.
+
+This example shows how to set up an agent with Tom tools for getting
+personalized guidance based on user modeling. Tom tools include:
+- TomConsultTool: Get guidance for vague or unclear tasks
+- SleeptimeComputeTool: Index conversations for user modeling
+"""
+
+import os
+
+from pydantic import SecretStr
+
+from openhands.sdk import LLM, Agent, Conversation
+from openhands.sdk.tool import Tool, register_tool
+from openhands.tools.preset.default import get_default_tools, register_default_tools
+from openhands.tools.tom_consult import SleeptimeComputeTool, TomConsultTool
+from openhands.tools.tom_consult.action import SleeptimeComputeAction
+
+
+# Configure LLM
+api_key: str | None = os.getenv("LLM_API_KEY")
+assert api_key is not None, "LLM_API_KEY environment variable is not set."
+
+llm: LLM = LLM(
+    model="openhands/claude-sonnet-4-5-20250929",
+    api_key=SecretStr(api_key),
+    usage_id="agent",
+    drop_params=True,
+)
+
+# Register tools (default tools + Tom tools)
+register_default_tools(enable_browser=False)  # CLI mode, no browser
+register_tool("TomConsultTool", TomConsultTool)
+register_tool("SleeptimeComputeTool", SleeptimeComputeTool)
+
+# Build tools list with Tom tools
+tools = get_default_tools(enable_browser=False)
+
+# Configure Tom tools with parameters
+tom_params: dict[str, bool | str] = {
+    "enable_rag": True,  # Enable RAG in Tom agent
+}
+
+# Add LLM configuration for Tom tools (uses same LLM as main agent)
+tom_params["llm_model"] = llm.model
+if llm.api_key:
+    if isinstance(llm.api_key, SecretStr):
+        tom_params["api_key"] = llm.api_key.get_secret_value()
+    else:
+        tom_params["api_key"] = llm.api_key
+if llm.base_url:
+    tom_params["api_base"] = llm.base_url
+
+# Add both Tom tools to the agent
+tools.append(Tool(name="TomConsultTool", params=tom_params))
+tools.append(Tool(name="SleeptimeComputeTool", params=tom_params))
+
+# Create agent with Tom capabilities
+# This agent can consult Tom for personalized guidance
+# Note: Tom's user modeling data will be stored in ~/.openhands/
+agent: Agent = Agent(llm=llm, tools=tools)
+
+# Start conversation
+cwd: str = os.getcwd()
+PERSISTENCE_DIR = os.path.expanduser("~/.openhands")
+CONVERSATIONS_DIR = os.path.join(PERSISTENCE_DIR, "conversations")
+conversation = Conversation(
+    agent=agent, workspace=cwd, persistence_dir=CONVERSATIONS_DIR
+)
+
+# Optionally run sleeptime compute to index existing conversations
+# This builds user preferences and patterns from conversation history
+sleeptime_compute_tool = conversation.agent.tools_map.get("sleeptime_compute")
+if sleeptime_compute_tool and sleeptime_compute_tool.executor:
+    print("\nRunning sleeptime compute to index conversations...")
+    sleeptime_result = sleeptime_compute_tool.executor(
+        SleeptimeComputeAction(), conversation
+    )
+    print(f"Result: {sleeptime_result.message}")
+    print(f"Sessions processed: {sleeptime_result.sessions_processed}")
+
+# Send a potentially vague message where Tom consultation might help
+conversation.send_message(
+    "I need to debug some code but I'm not sure where to start. "
+    + "Can you help me figure out the best approach?"
+)
+conversation.run()
+
+print("\n" + "=" * 80)
+print("Tom agent consultation example completed!")
+print("=" * 80)
+
+
+# Optional: Index this conversation for Tom's user modeling
+# This builds user preferences and patterns from conversation history
+# Uncomment the lines below to index the conversation:
+#
+# conversation.send_message("Please index this conversation using sleeptime_compute")
+# conversation.run()
+# print("\nConversation indexed for user modeling!")
+
+```
+
+## Key Concepts
+
+### TomConsultTool
+
+The `TomConsultTool` allows your agent to consult the ToM agent for guidance. It analyzes:
+- The current user message
+- Conversation history and context
+- User patterns from previous interactions
+
+The tool returns personalized suggestions on how to approach the task.
+
+### SleeptimeComputeTool
+
+The `SleeptimeComputeTool` processes conversation history to build and update the user model. This tool:
+- Indexes completed conversations
+- Extracts user preferences and patterns
+- Updates the ToM agent's understanding of the user
+
+This tool is typically run at the end of a conversation or when indexing is explicitly requested.
+
+## Configuration
+
+### Required Dependencies
+
+The ToM agent integration requires the `tom-swe` package, which is included as an optional dependency:
+
+```bash
+pip install "openhands-tools[tom]"  # Once the tom extra is available (quoted so shells such as zsh do not expand the brackets)
+# or install directly:
+pip install tom-swe
+```
+
+### Tool Parameters
+
+Both tools accept the following parameters:
+
+- `enable_rag` (bool): Enable RAG capabilities in the ToM agent (default: True)
+- `llm_model` (str): LLM model to use for ToM agent
+- `api_key` (str): API key for the ToM agent's LLM
+- `api_base` (str): Base URL for the ToM agent's LLM API
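
A minimal sketch of assembling these parameters in one place. `build_tom_params` is a hypothetical helper, not part of the SDK; the keys are the four listed above, and leaving out unset optional keys mirrors the Quick Start example.

```python
from __future__ import annotations


def build_tom_params(
    llm_model: str,
    api_key: str | None = None,
    api_base: str | None = None,
    enable_rag: bool = True,
) -> dict[str, bool | str]:
    """Build the params dict shared by both Tom tools (hypothetical helper)."""
    params: dict[str, bool | str] = {
        "enable_rag": enable_rag,
        "llm_model": llm_model,
    }
    # Only include optional keys that were actually provided.
    if api_key:
        params["api_key"] = api_key
    if api_base:
        params["api_base"] = api_base
    return params


if __name__ == "__main__":
    # "some-model" is a placeholder model name.
    print(build_tom_params("some-model", api_base="https://example.invalid/v1"))
```

The resulting dict could then be passed as `params=` to both `Tool(name="TomConsultTool", ...)` and `Tool(name="SleeptimeComputeTool", ...)`, as the Quick Start does.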
+
+### Data Storage
+
+User modeling data is stored in `~/.openhands/` by default. This includes:
+- User preferences and patterns
+- Processed conversation history
+- RAG indices for efficient retrieval
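
For reference, the locations used by the Quick Start example can be computed as below. This is a sketch only: `openhands_paths` is a hypothetical helper, and the `conversations` subdirectory is taken from the example script rather than from a documented layout.

```python
from __future__ import annotations

from pathlib import Path


def openhands_paths(home: Path | None = None) -> dict[str, Path]:
    """Return the default storage root and the conversations directory."""
    base = (home or Path.home()) / ".openhands"
    return {
        "root": base,  # where Tom's user modeling data lives by default
        "conversations": base / "conversations",  # persistence_dir in the Quick Start
    }


if __name__ == "__main__":
    for name, path in openhands_paths().items():
        print(f"{name}: {path} (exists: {path.exists()})")
```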
+
+## Best Practices
+
+### When to Use ToM Consultation
+
+Use ToM consultation when:
+- User messages are vague or ambiguous
+- Multiple valid approaches exist and you need guidance
+- You want to personalize responses based on user history
+- Task requirements are under-specified
+
+### Conversation Indexing
+
+For best results:
+- Index conversations after they're complete
+- Run sleeptime compute periodically to update the user model
+- Ensure sufficient conversation history exists before expecting personalized guidance
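
One way to check the last point before relying on personalized guidance is to count persisted conversations. The sketch below assumes each persisted conversation appears as one subdirectory of the conversations directory, which this guide does not confirm; `MIN_CONVERSATIONS` is an arbitrary illustrative threshold, and both function names are hypothetical.

```python
from __future__ import annotations

from pathlib import Path

MIN_CONVERSATIONS = 3  # hypothetical threshold, tune for your own usage


def count_conversations(conversations_dir: Path) -> int:
    """Count subdirectories, assuming one directory per persisted conversation."""
    if not conversations_dir.is_dir():
        return 0
    return sum(1 for entry in conversations_dir.iterdir() if entry.is_dir())


def has_enough_history(
    conversations_dir: Path, minimum: int = MIN_CONVERSATIONS
) -> bool:
    """Return True when at least `minimum` conversations are on disk."""
    return count_conversations(conversations_dir) >= minimum


if __name__ == "__main__":
    default_dir = Path.home() / ".openhands" / "conversations"
    print(f"Conversations found: {count_conversations(default_dir)}")
    print(f"Enough history: {has_enough_history(default_dir)}")
```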
+
+## Troubleshooting
+
+### Import Errors
+
+If you encounter import errors with `tom-swe`:
+
+```bash
+# The imports are lazy-loaded, so they only fail when actually used
+# Make sure tom-swe is installed:
+pip install tom-swe
+```
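
Because the imports are lazy, a missing package may only surface mid-conversation. A quick pre-flight check, sketched here with the standard library's `importlib.metadata`, can confirm that the `tom-swe` distribution is installed before the agent starts. `installed_version` is a hypothetical helper name.

```python
from __future__ import annotations

from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution: str) -> str | None:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("tom-swe")
    if found is None:
        print("tom-swe is not installed; run: pip install tom-swe")
    else:
        print(f"tom-swe {found} is installed")
```

This looks up the distribution name as published on the package index, so it does not need to know the importable module name.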
+
+### Heavy Dependencies
+
+`tom-swe` depends on scientific Python packages (numpy, scipy, pandas). These packages are:
+- Required for running the ToM agent
+- Excluded from the binary build of openhands-agent-server
+- Only needed if you're using ToM features
+
+## Related
+
+- [Agent Delegation](/sdk/guides/agent-delegation) - Delegate tasks to sub-agents
+- [Custom Tools](/sdk/guides/custom-tools) - Create your own agent tools
+- [Conversation Persistence](/sdk/guides/convo-persistence) - Persist conversation history