Add framework adapter system for LangGraph, Claude Agent SDK, and other agent frameworks

## Summary

**Revised scope (Feb 2026):** This issue covers Phase 1 of the framework adapter system — the adapter foundation and one working framework starter (Anthropic/Claude). LangGraph starter, query tools, and additional frameworks are tracked in separate follow-up issues.

Add a framework adapter system that lets developers use popular agent frameworks (Claude tool_use, LangGraph, etc.) to build agents that play Agent Arena scenarios. The first adapter and starter will use the Anthropic Python SDK with native tool_use.

**Key framing:** This is NOT "bring your existing agent and test it here." This IS **"Want to learn Claude tool_use? Build an AI agent that plays a game."** Each framework starter is a tutorial, not just a template.

## Architecture: Three Categories

Game information falls into three categories with different costs:

### 1. Context (injected into prompt — free, always present)

The observation is formatted into the LLM prompt automatically. No tool call needed:

```
Position: (5.2, 0, 8.1)  |  Health: 80  |  Energy: 100
Nearby: berry at (10,0,5) dist=4.2, fire at (2,0,1) dist=3.0
Inventory: wood x2, stone x1  |  Explored: 45%
Objective: collect 10 resources (current: 4/10)
```

### 2. Action Tools (sent to Godot — costs a tick, pick exactly ONE)

The agent's final decision. Ends the turn:

```python
move_to(target_position=[10, 0, 5])
collect(target="berry")
craft_item(recipe="torch")
explore(direction="north")
idle()
```

### 3. Query Tools (future — see follow-up issues)

Optional tools the agent can call to dig deeper before deciding (e.g., `query_spatial_memory`, `recall_location`). **Not in scope for this issue** — tracked in #85 and #86.

### Per-Tick Flow (Phase 1)

```
Godot sends observation
        |
        v
  CONTEXT: Observation formatted into prompt (free)
        |
        v
  ACTION: LLM calls exactly one action tool (costs a tick)
    -> move_to([10, 0, 5])
        |
        v
  Adapter captures action, returns as Decision to Godot
```

## How It Runs

The framework runs **inside** the Python process alongside the SDK:

```
+------------------------------------------+
|  PYTHON PROCESS (same as today)          |
|                                          |
|  Framework Agent (Anthropic SDK)         |
|       |                                  |
|       | reads context (observation)      |
|       | returns one action tool call     |
|       v                                  |
|  Agent Arena SDK (AgentArena.run())      |
|       |                                  |
+-------+----------------------------------+
        | HTTP IPC (unchanged)
+-------+----------------------------------+
|  GODOT (unchanged)                       |
|  Scenes, physics, tool execution         |
+------------------------------------------+
```

## Deliverables

### 1. Adapter ABC (`python/sdk/agent_arena_sdk/adapters/base.py`)

```python
class FrameworkAdapter(ABC):
    @abstractmethod
    def decide(self, observation: Observation) -> Decision:
        """Run framework agent loop, return one action."""
        ...

    def format_observation(self, observation: Observation) -> str:
        """Convert observation to prompt text. Shared default implementation."""
        ...

    def get_action_tools(self) -> list[ToolSchema]:
        """Canonical action tool definitions. Shared."""
        ...
```

- `format_observation()` — extracted from existing `starters/llm/agent.py:_build_prompt()`
- `get_action_tools()` — canonical set: move_to, collect, craft_item, idle, explore
- Support both sync `decide()` and async adapters

### 2. AgentArena integration

`AgentArena.run()` accepts either a bare `Callable[[Observation], Decision]` (existing) or a `FrameworkAdapter` instance (new):

```python
# Existing (still works)
arena.run(my_decide_function)

# New
adapter = AnthropicAdapter(model="claude-sonnet-4-20250514")
arena.run(adapter)
```

### 3. Anthropic/Claude starter (`starters/claude/`)

Tutorial-quality starter teaching Anthropic tool_use by building a foraging agent:

| File | Purpose |
|------|---------|
| `agent.py` | Anthropic adapter + agent logic with tutorial comments |
| `run.py` | One-command startup |
| `requirements.txt` | `agent-arena-sdk`, `anthropic` |
| `README.md` | Learning goals, setup, how to modify, debugging |

**Teaches:** Anthropic Messages API, tool definitions, tool_use responses, tool_result messages, multi-turn tool loops, system prompts.

### 4. Canonical action tool definitions

Standard tool schemas in the SDK/adapter base (not duplicated per-starter):

- `move_to` — target_position: [x, y, z]
- `collect` — target_name: str
- `craft_item` — recipe: str
- `idle` — no params
- `explore` — direction: str (optional)

### 5. Tests

- Unit tests for adapter base (observation formatting, tool schema generation)
- Unit tests for Anthropic adapter (mock API responses)
- Integration test with hand-built Observations (no Godot needed)

## What's NOT in scope (follow-up issues)

| Item | Issue |
|------|-------|
| LangGraph starter | #84 |
| Query tools (spatial memory, episode memory) | #86 |
| Migrate SpatialMemory to SDK | #85 |
| OpenAI / CrewAI starters | Future |
| Inspector refactor (game-side only) | #75 |

## Design Decisions

**Adapter base is thin.** It provides observation formatting and tool schemas. Each starter owns its prompt engineering, error handling, and fallback logic — that's the learning content. The base class shouldn't hide the interesting parts.

**"Claude SDK" → "Anthropic" naming.** The starter uses the `anthropic` Python package directly with `client.messages.create(tools=...)`. There is no separate "Claude Agent SDK" product. The starter folder is `starters/claude/`.

**No query tools in v1.** SpatialMemory and EpisodeMemory haven't been migrated to the SDK yet. Context + action tools is sufficient for a working foraging agent. Query tools add depth but aren't required.

## Dependencies

- #71 (tool completion callbacks) — **DONE** ✅
- #78 (consolidated schemas) — **DONE** ✅
- #80 (remove deprecated code) — **DONE** ✅

Soft dependencies (nice to have, don't block):
- #72 (mock observations) — helps testing but we can use hand-built Observations
- #73 (intermediate starter) — independent work

## Success Criteria

- [ ] A developer can `pip install anthropic`, set `ANTHROPIC_API_KEY`, and have a Claude-powered agent playing foraging in under 15 minutes
- [ ] The starter teaches Anthropic tool_use concepts through commented, tutorial-quality code
- [ ] The adapter base class is reusable for LangGraph and future frameworks
- [ ] Existing `AgentArena.run(callback)` still works unchanged
- [ ] Observation formatting is shared (not duplicated per starter)
- [ ] All tests pass

## Context

See `docs/framework_integration_strategy.md` for the full strategic discussion. Original issue scope was trimmed after analysis revealed query tools need SpatialMemory migration (#85) and the scope was too large for a single PR.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add framework adapter system for LangGraph, Claude Agent SDK, and other agent frameworks #74

Summary

Architecture: Three Categories

1. Context (injected into prompt — free, always present)

2. Action Tools (sent to Godot — costs a tick, pick exactly ONE)

3. Query Tools (future — see follow-up issues)

Per-Tick Flow (Phase 1)

How It Runs

Deliverables

1. Adapter ABC (`python/sdk/agent_arena_sdk/adapters/base.py`)

2. AgentArena integration

3. Anthropic/Claude starter (`starters/claude/`)

4. Canonical action tool definitions

5. Tests

What's NOT in scope (follow-up issues)

Design Decisions

Dependencies

Success Criteria

Context

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

File	Purpose
`agent.py`	Anthropic adapter + agent logic with tutorial comments
`run.py`	One-command startup
`requirements.txt`	`agent-arena-sdk`, `anthropic`
`README.md`	Learning goals, setup, how to modify, debugging

Item	Issue
LangGraph starter	#84
Query tools (spatial memory, episode memory)	#86
Migrate SpatialMemory to SDK	#85
OpenAI / CrewAI starters	Future
Inspector refactor (game-side only)	#75

Add framework adapter system for LangGraph, Claude Agent SDK, and other agent frameworks #74

Description

Summary

Architecture: Three Categories

1. Context (injected into prompt — free, always present)

2. Action Tools (sent to Godot — costs a tick, pick exactly ONE)

3. Query Tools (future — see follow-up issues)

Per-Tick Flow (Phase 1)

How It Runs

Deliverables

1. Adapter ABC (python/sdk/agent_arena_sdk/adapters/base.py)

2. AgentArena integration

3. Anthropic/Claude starter (starters/claude/)

4. Canonical action tool definitions

5. Tests

What's NOT in scope (follow-up issues)

Design Decisions

Dependencies

Success Criteria

Context

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

1. Adapter ABC (`python/sdk/agent_arena_sdk/adapters/base.py`)

3. Anthropic/Claude starter (`starters/claude/`)