# Lesson 06 — Wrapping Agents as A2A Servers

Transform the standalone QAAgent into a fully **A2A-compliant server**.
Supports **GitHub Models** (free, cloud) and **AI Toolkit LocalFoundry** (local, no token required).

## What You'll Learn

- Implement the `AgentExecutor` interface from the A2A Python SDK
- Define an Agent Card with skills and capabilities
- Wire up `DefaultRequestHandler` + `A2AStarletteApplication`
- Test with curl commands

## Prerequisites

- Lesson 05 completed (QAAgent class)

**GitHub Models** (default):

- `GITHUB_TOKEN` environment variable set in `.env`

**AI Toolkit LocalFoundry** (alternative — no token needed):

- VS Code AI Toolkit extension with a model running on port 5272

> **A2A SDK docs:** [pypi.org/project/a2a-sdk](https://pypi.org/project/a2a-sdk/)
>
> **A2A Samples:** [github.com/a2aproject/a2a-samples](https://github.com/a2aproject/a2a-samples)


## Step 1 — Install the A2A SDK


In [1]:
# ── Dependencies ──────────────────────────────────────────────────────────────
# a2a-sdk[http-server] → A2A Python SDK + Starlette ASGI server + uvicorn
# openai               → OpenAI-compatible SDK (GitHub Models / LocalFoundry)
# python-dotenv        → Loads GITHUB_TOKEN from the .env file automatically
#
# If you have the venv active (a2a-examples kernel selected) these are already
# installed. Run this cell anyway to ensure the kernel is up to date.
# ─────────────────────────────────────────────────────────────────────────────
%pip install "a2a-sdk[http-server]" openai python-dotenv --quiet
print("Dependencies ready.")


Note: you may need to restart the kernel to use updated packages.
Dependencies ready.
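
Some details later in this lesson, such as the Agent Card path, vary between SDK releases, so it can help to record which versions the kernel picked up. The following is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name, not part of the A2A SDK.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report the three packages installed by the cell above.
for name in ("a2a-sdk", "openai", "python-dotenv"):
    print(f"{name:<14} {installed_version(name) or 'not installed'}")
```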





In [2]:
import os
from dotenv import find_dotenv, load_dotenv

# ── Load secrets from .env ─────────────────────────────────────────────────
# find_dotenv() searches upward from this notebook's working directory.
# It will find _examples/.env which contains GITHUB_TOKEN and optionally
# LOCALFOUNDRY_ENDPOINT / LOCALFOUNDRY_MODEL.
#
# To set up your .env:
#   cp _examples/.env.example _examples/.env
#   # then edit .env and add:  GITHUB_TOKEN=ghp_your_token_here
# ─────────────────────────────────────────────────────────────────────────────
env_path = find_dotenv(raise_error_if_not_found=False)
if env_path:
    load_dotenv(env_path)
    print(f"Loaded .env from: {env_path}")
else:
    print("NOTE: .env not found — set GITHUB_TOKEN manually if needed")
    # os.environ["GITHUB_TOKEN"] = "ghp_your_token_here"

# ── Model provider ──────────────────────────────────────────────────────────
# "github"       — GitHub Models  (free, needs GITHUB_TOKEN in .env)
#                  https://github.com/settings/tokens
# "localfoundry" — AI Toolkit LocalFoundry  (local, no token needed)
#                  VS Code AI Toolkit → Models → Load a model → Run
#                  Override endpoint/model via LOCALFOUNDRY_ENDPOINT /
#                  LOCALFOUNDRY_MODEL in .env if you changed the port or model.
PROVIDER = "github"  # ← change to "localfoundry" to use a local model

if PROVIDER == "github":
    token = os.environ.get("GITHUB_TOKEN", "")
    if token:
        print(f"GITHUB_TOKEN is set ({token[:8]}...)")
    else:
        print("ERROR: GITHUB_TOKEN is NOT set — cells below will fail until you set it")
    ENDPOINT = "https://models.inference.ai.azure.com"
    API_KEY = token
    MODEL = "Phi-4"

elif PROVIDER == "localfoundry":
    ENDPOINT = os.environ.get("LOCALFOUNDRY_ENDPOINT", "http://localhost:5272/v1/")
    API_KEY = "unused"  # LocalFoundry ignores the key
    MODEL = os.environ.get("LOCALFOUNDRY_MODEL", "qwen2.5-0.5b-instruct-generic-gpu:4")
    print(f"NOTE: AI Toolkit LocalFoundry — ensure a model is running at {ENDPOINT}")

else:
    raise ValueError(f"Unknown PROVIDER: {PROVIDER!r}")

print(f"Provider  : {PROVIDER}")
print(f"Endpoint  : {ENDPOINT}")
print(f"Model     : {MODEL}")

Loaded .env from: y:\.sources\localm-tuts\a2a\_examples\.env
GITHUB_TOKEN is set (github_p...)
Provider  : github
Endpoint  : https://models.inference.ai.azure.com
Model     : Phi-4
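
The provider branching in the cell above could also be factored into a function that fails fast when the token is missing, instead of printing an error and continuing. This is a sketch under that assumption; `ProviderConfig` and `resolve_provider` are hypothetical names, while the endpoints, model names, and environment variables are the ones used above.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ProviderConfig:
    endpoint: str
    api_key: str
    model: str


def resolve_provider(provider: str, env: Mapping[str, str] = os.environ) -> ProviderConfig:
    """Map a provider name to endpoint, key, and model (hypothetical helper)."""
    if provider == "github":
        token = env.get("GITHUB_TOKEN", "")
        if not token:
            # Fail here rather than in a later cell.
            raise RuntimeError("GITHUB_TOKEN is not set")
        return ProviderConfig("https://models.inference.ai.azure.com", token, "Phi-4")
    if provider == "localfoundry":
        return ProviderConfig(
            env.get("LOCALFOUNDRY_ENDPOINT", "http://localhost:5272/v1/"),
            "unused",  # LocalFoundry ignores the key
            env.get("LOCALFOUNDRY_MODEL", "qwen2.5-0.5b-instruct-generic-gpu:4"),
        )
    raise ValueError(f"Unknown provider: {provider!r}")


# An empty mapping stands in for an environment with no overrides.
cfg = resolve_provider("localfoundry", env={})
print(cfg)
```

Passing the environment as a parameter keeps the function easy to test without touching the real `os.environ`.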


## Step 2 — Recreate the QAAgent

We redefine `QAAgent` inline so this notebook is self-contained.
In production, you'd import it from a shared module.

> This is the same class from Lesson 05. The `server.py` file imports
> it from the Lesson 05 `qa_agent.py` module.


In [3]:
import os
from pathlib import Path
from openai import AsyncOpenAI

SYSTEM_PROMPT = """\
You are a helpful insurance policy assistant.
Use the following policy document to answer questions accurately.
If the answer is not in the document, say so clearly.

--- POLICY DOCUMENT ---
{policy_text}
--- END DOCUMENT ---
"""


def load_knowledge(path: str) -> str:
    return Path(path).read_text(encoding="utf-8")


class QAAgent:
    def __init__(self, knowledge_path: str):
        self.client = AsyncOpenAI(
            base_url=ENDPOINT,
            api_key=API_KEY,
        )
        self.model = MODEL
        self.knowledge = load_knowledge(knowledge_path)
        self.system_prompt = SYSTEM_PROMPT.format(policy_text=self.knowledge)

    async def query(self, question: str) -> str:
        response = await self.client.chat.completions.create(
            model=self.model,
            messages=[
                {"role": "system", "content": self.system_prompt},
                {"role": "user", "content": question},
            ],
            temperature=0.2,
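            max_tokens=2048,  # keeps replies within the GitHub Models free-tier output limit (4,000 tokens)
        )
        # message.content is typed as Optional in the OpenAI SDK; fall back to ""
        return response.choices[0].message.content or ""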


print("QAAgent defined.")

QAAgent defined.
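
The prompt assembly inside `QAAgent` can be checked without calling a model. The sketch below repeats the template from the cell above and builds the same two-message list; `build_messages` is a hypothetical helper and the policy text is made up for illustration.

```python
SYSTEM_PROMPT = """\
You are a helpful insurance policy assistant.
Use the following policy document to answer questions accurately.
If the answer is not in the document, say so clearly.

--- POLICY DOCUMENT ---
{policy_text}
--- END DOCUMENT ---
"""


def build_messages(policy_text: str, question: str) -> list:
    """Build the chat messages that QAAgent.query() sends (hypothetical helper)."""
    # str.format only parses the template, so braces inside policy_text are safe.
    system_prompt = SYSTEM_PROMPT.format(policy_text=policy_text)
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question},
    ]


# Illustrative policy text, not taken from the course data file.
messages = build_messages("Deductible: $500 {per year}", "What is the deductible?")
print(messages[0]["content"])
```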


## Step 3 — Implement the AgentExecutor

The `AgentExecutor` is the bridge between the A2A protocol and your agent logic:

1. `context.get_user_input()` — extracts text from the incoming message
2. Call your agent's `query()` method
3. `event_queue.enqueue_event(new_agent_text_message(...))` — emit the response

**Key imports:**

```python
from a2a.server.agent_execution import AgentExecutor, RequestContext
from a2a.server.events import EventQueue
from a2a.utils import new_agent_text_message
```


In [4]:
from a2a.server.agent_execution import AgentExecutor, RequestContext
from a2a.server.events import EventQueue
from a2a.utils import new_agent_text_message


class QAAgentExecutor(AgentExecutor):
    """Wraps QAAgent with the A2A AgentExecutor interface."""

    def __init__(self, knowledge_path: str = "data/insurance_policy.txt"):
        self.agent = QAAgent(knowledge_path)

    async def execute(
        self,
        context: RequestContext,
        event_queue: EventQueue,
    ) -> None:
        # 1. Extract user message text
        question = context.get_user_input()

        # 2. Call the QA agent
        answer = await self.agent.query(question)

        # 3. Emit the response as an A2A event
        await event_queue.enqueue_event(new_agent_text_message(answer))

    async def cancel(
        self,
        context: RequestContext,
        event_queue: EventQueue,
    ) -> None:
        raise Exception("cancel not supported")


print("QAAgentExecutor defined.")

QAAgentExecutor defined.
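
The extract, call, emit flow can be exercised without the SDK or a model by substituting small stand-ins. Everything in this sketch is hypothetical: `StubContext` and `StubQueue` only mimic the two methods used above, and the emitted event is a plain dict rather than a real A2A message.

```python
import asyncio


class StubContext:
    """Stand-in for RequestContext: exposes only get_user_input()."""

    def __init__(self, text: str):
        self._text = text

    def get_user_input(self) -> str:
        return self._text


class StubQueue:
    """Stand-in for EventQueue: records events instead of delivering them."""

    def __init__(self):
        self.events = []

    async def enqueue_event(self, event) -> None:
        self.events.append(event)


class EchoExecutor:
    """Same three-step shape as QAAgentExecutor.execute(), without the SDK."""

    def __init__(self, answer_fn):
        self._answer_fn = answer_fn

    async def execute(self, context, event_queue) -> None:
        question = context.get_user_input()       # 1. extract
        answer = await self._answer_fn(question)  # 2. call the agent
        await event_queue.enqueue_event({"role": "agent", "text": answer})  # 3. emit


async def fake_query(question: str) -> str:
    return f"(stub answer to: {question})"


async def main():
    queue = StubQueue()
    await EchoExecutor(fake_query).execute(StubContext("How do I file a claim?"), queue)
    return queue.events


# In a notebook cell, use `events = await main()` instead of asyncio.run().
events = asyncio.run(main())
print(events)
```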


## Step 4 — Define the Agent Card

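The Agent Card is served at `/.well-known/agent-card.json` under protocol version 0.3.0 (the path requested in the server log in Step 6); earlier SDK releases used `/.well-known/agent.json`. It tells clients: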

- What the agent can do (skills)
- What capabilities it supports (streaming, push notifications)
- What content types it accepts/produces

Think of it as a **capability manifest** for your agent.


In [5]:
from a2a.types import AgentCapabilities, AgentCard, AgentSkill

agent_card = AgentCard(
    name="QAAgent",
    description=f"Answers questions about insurance policies using {MODEL} via {PROVIDER}",
    url="http://localhost:10001/",
    version="1.0.0",
    capabilities=AgentCapabilities(streaming=True),
    default_input_modes=["text"],
    default_output_modes=["text"],
    skills=[
        AgentSkill(
            id="policy-qa",
            name="Policy Question Answering",
            description="Answer questions about insurance policy documents",
            tags=["qa", "insurance", "policy"],
            examples=[
                "What is the deductible for the Standard plan?",
                "Are cosmetic procedures covered?",
                "How do I file a claim?",
            ],
        )
    ],
)

# Preview the Agent Card as JSON
print(agent_card.model_dump_json(indent=2, exclude_none=True))

{
  "capabilities": {
    "streaming": true
  },
  "defaultInputModes": [
    "text"
  ],
  "defaultOutputModes": [
    "text"
  ],
  "description": "Answers questions about insurance policies using Phi-4 via github",
  "name": "QAAgent",
  "preferredTransport": "JSONRPC",
  "protocolVersion": "0.3.0",
  "skills": [
    {
      "description": "Answer questions about insurance policy documents",
      "examples": [
        "What is the deductible for the Standard plan?",
        "Are cosmetic procedures covered?",
        "How do I file a claim?"
      ],
      "id": "policy-qa",
      "name": "Policy Question Answering",
      "tags": [
        "qa",
        "insurance",
        "policy"
      ]
    }
  ],
  "url": "http://localhost:10001/",
  "version": "1.0.0"
}
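
A client can read the same JSON to decide how to talk to the agent. The sketch below parses a trimmed copy of the preview above with the standard library; `summarize_card` is a hypothetical helper, not an SDK function.

```python
import json

# Trimmed copy of the Agent Card preview printed above.
CARD_JSON = """
{
  "capabilities": {"streaming": true},
  "defaultInputModes": ["text"],
  "defaultOutputModes": ["text"],
  "name": "QAAgent",
  "protocolVersion": "0.3.0",
  "skills": [
    {"id": "policy-qa", "name": "Policy Question Answering",
     "tags": ["qa", "insurance", "policy"]}
  ],
  "url": "http://localhost:10001/",
  "version": "1.0.0"
}
"""


def summarize_card(card: dict) -> dict:
    """Pick out the fields a client typically checks first (hypothetical helper)."""
    return {
        "name": card["name"],
        "url": card["url"],
        "streaming": card.get("capabilities", {}).get("streaming", False),
        "skills": [skill["id"] for skill in card.get("skills", [])],
    }


summary = summarize_card(json.loads(CARD_JSON))
print(summary)
```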


## Step 5 — Wire Up the Server

Three components work together:

1. **`DefaultRequestHandler`** — routes requests to the executor, manages task state
2. **`A2AStarletteApplication`** — builds the ASGI app (Starlette-based)
3. **`uvicorn`** — serves the app

```python
from a2a.server.apps import A2AStarletteApplication
from a2a.server.request_handlers import DefaultRequestHandler
from a2a.server.tasks import InMemoryTaskStore
```


In [6]:
from a2a.server.apps import A2AStarletteApplication
from a2a.server.request_handlers import DefaultRequestHandler
from a2a.server.tasks import InMemoryTaskStore

request_handler = DefaultRequestHandler(
    agent_executor=QAAgentExecutor(),
    task_store=InMemoryTaskStore(),
)

server = A2AStarletteApplication(
    agent_card=agent_card,
    http_handler=request_handler,
)

app = server.build()
print(f"ASGI app built — type: {type(app).__name__}")
print(f"Agent Card URL: {agent_card.url}.well-known/agent-card.json")

ASGI app built — type: Starlette
Agent Card URL: http://localhost:10001/.well-known/agent-card.json
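
The cell above builds the card URL by string concatenation, which only works because `agent_card.url` ends with a slash. A small sketch that tolerates either form is shown below; `agent_card_url` is a hypothetical helper, and the default path is an assumption based on the path requested in the server log in Step 6.

```python
from urllib.parse import urljoin


def agent_card_url(base_url: str, path: str = ".well-known/agent-card.json") -> str:
    """Join a base URL and the card path, with or without a trailing slash."""
    # urljoin replaces the last path segment unless the base ends with "/".
    base = base_url.rstrip("/") + "/"
    return urljoin(base, path.lstrip("/"))


print(agent_card_url("http://localhost:10001/"))
print(agent_card_url("http://localhost:10001"))
```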


## Step 6 — Run the Server

> **Note:** Jupyter already runs an asyncio event loop, so `uvicorn.run()` raises
> `RuntimeError: Runner.run() cannot be called from a running event loop`.
> Create a `uvicorn.Server` and `await` its `serve()` method instead; Jupyter supports top-level `await`.

Running the server **blocks this notebook**. To keep the notebook interactive,
run the server from a terminal instead:

```bash
cd src
python server.py
```
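
The event-loop conflict described in the note above can be reproduced with the standard library alone. In this sketch `serve_stub` is a hypothetical stand-in for an async server entry point, and `notebook_like_cell` plays the role of a cell that runs while a loop is already active.

```python
import asyncio


async def serve_stub() -> str:
    """Stand-in for an async server entry point."""
    await asyncio.sleep(0)
    return "served"


async def notebook_like_cell():
    # A loop is already running inside this coroutine, as in a Jupyter cell.
    coro = serve_stub()
    try:
        asyncio.run(coro)  # what a blocking run() helper does internally
    except RuntimeError as exc:
        coro.close()  # avoid a "coroutine was never awaited" warning
        error = str(exc)
    else:
        error = ""
    # Awaiting directly works because it reuses the running loop.
    result = await serve_stub()
    return error, result


# In a notebook cell, use `error, result = await notebook_like_cell()` instead.
error, result = asyncio.run(notebook_like_cell())
print(error)
print(result)
```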

Or run the cell below (it will block until you interrupt the kernel):


In [None]:
# Run this cell to start the server (it blocks until you interrupt the kernel):

import uvicorn

config = uvicorn.Config(app, host="0.0.0.0", port=10001)
server_instance = uvicorn.Server(config)

print("Starting QAAgent A2A Server on http://localhost:10001")
print("Agent Card: http://localhost:10001/.well-known/agent-card.json")
print("Press Ctrl+C or interrupt kernel to stop")
await server_instance.serve()

INFO:     Started server process [30540]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:10001 (Press CTRL+C to quit)


Starting QAAgent A2A Server on http://localhost:10001
Agent Card: http://localhost:10001/.well-known/agent-card.json
Press Ctrl+C or interrupt kernel to stop
INFO:     127.0.0.1:63192 - "GET /.well-known/agent-card.json HTTP/1.1" 200 OK
INFO:     127.0.0.1:63194 - "POST / HTTP/1.1" 200 OK
INFO:     127.0.0.1:63194 - "POST / HTTP/1.1" 200 OK
INFO:     127.0.0.1:63194 - "POST / HTTP/1.1" 200 OK
INFO:     127.0.0.1:63194 - "POST / HTTP/1.1" 200 OK
INFO:     127.0.0.1:63194 - "POST / HTTP/1.1" 200 OK
INFO:     127.0.0.1:63194 - "POST / HTTP/1.1" 200 OK
INFO:     127.0.0.1:63194 - "POST / HTTP/1.1" 200 OK
INFO:     127.0.0.1:63194 - "POST / HTTP/1.1" 200 OK


## Step 7 — Test with curl

With the server running in a terminal, test it from a second shell:

### Fetch Agent Card

```bash
curl http://localhost:10001/.well-known/agent-card.json | python -m json.tool
```

### Send a Test Question

```bash
curl -X POST http://localhost:10001 \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "id": "test-1",
    "method": "message/send",
    "params": {
      "message": {
        "role": "user",
        "parts": [{"kind": "text", "text": "What is the deductible?"}],
        "messageId": "msg-001"
      }
    }
  }'
```

You should see a JSON-RPC response whose `result` is a Message containing the agent's answer. The executor in Step 3 emits a plain text message rather than creating a Task.
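
The same request can be built and sent from Python. The sketch below uses only the standard library; the helper names are hypothetical, and `SAMPLE_RESPONSE` is an illustrative shape that assumes `result` carries `parts` directly, as it would for a plain message. A Task-shaped result would need different handling.

```python
import json
import urllib.request
import uuid


def build_message_send(text: str, request_id: str = "test-1") -> dict:
    """Build the JSON-RPC body used in the curl example above."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "message/send",
        "params": {
            "message": {
                "role": "user",
                "parts": [{"kind": "text", "text": text}],
                "messageId": uuid.uuid4().hex,
            }
        },
    }


def send(payload: dict, url: str = "http://localhost:10001") -> dict:
    """POST the payload to a running server (not called in this sketch)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)


def extract_text(response: dict) -> str:
    """Concatenate the text parts of a message-shaped result."""
    if "error" in response:
        raise RuntimeError(response["error"].get("message", "JSON-RPC error"))
    parts = response["result"].get("parts", [])
    return "".join(p["text"] for p in parts if p.get("kind") == "text")


# Illustrative response only; real answers come from the model.
SAMPLE_RESPONSE = {
    "jsonrpc": "2.0",
    "id": "test-1",
    "result": {
        "kind": "message",
        "role": "agent",
        "parts": [{"kind": "text", "text": "Example answer text."}],
    },
}

payload = build_message_send("What is the deductible?")
answer = extract_text(SAMPLE_RESPONSE)
print(json.dumps(payload, indent=2))
print(answer)
```

With the server running, `extract_text(send(payload))` would return the live answer.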


## Port Convention

Each agent in this course runs on a dedicated port:

| Port  | Agent             | Lesson |
| ----- | ----------------- | ------ |
| 10001 | QAAgent           | 5-7    |
| 10002 | ResearchAgent     | 9      |
| 10003 | CodeAgent         | 10     |
| 10004 | PlannerAgent      | 11     |
| 10005 | TaskAgent         | 12     |
| 10006 | AssistantAgent    | 13     |
| 10007 | CopilotAgent      | 14     |
| 10008 | OrchestratorAgent | 15     |
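
If several lessons share one client script, the table above could be mirrored in a small registry so that ports are not repeated as literals. This is only a sketch; `AGENT_PORTS` and `base_url` are hypothetical names, and `localhost` is assumed as the host.

```python
# Mirrors the port table above.
AGENT_PORTS = {
    "QAAgent": 10001,
    "ResearchAgent": 10002,
    "CodeAgent": 10003,
    "PlannerAgent": 10004,
    "TaskAgent": 10005,
    "AssistantAgent": 10006,
    "CopilotAgent": 10007,
    "OrchestratorAgent": 10008,
}


def base_url(agent: str, host: str = "localhost") -> str:
    """Return the base URL for a named agent; raises KeyError if unknown."""
    return f"http://{host}:{AGENT_PORTS[agent]}/"


print(base_url("QAAgent"))
```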


## Next Steps

**Keep the server running!** In Lesson 7, you'll build a client that:

- Discovers this agent via its Agent Card
- Sends blocking and streaming requests
- Parses Tasks, Artifacts, and Messages

→ Continue to [Lesson 07 — A2A Client Fundamentals](../07-a2a-client/)
