Add agora-moss: Moss semantic search for Agora Conversational AI via MCP #174
ashvathsureshkumar merged 29 commits into main
Conversation
…_uids
- greeting_message so the agent speaks first when users join
- system_messages for concise voice-assistant behavior
- LLM_MODEL env var (optional) so providers that require a model field work
- remote_rtc_uids from AGORA_REMOTE_RTC_UIDS (defaults to '2001'), which fixes the 'remote_rtc_uids must not be empty' rejection from ConvoAI (see the sketch below)
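For orientation, here is a hedged sketch of how the env-driven values above might be gathered before `start_agent.py` places them into the ConvoAI join body. The exact nesting of the real payload follows Agora's ConvoAI schema and is not reproduced here, and the message shape used for `system_messages` is an assumption, not the PR's actual code.

```python
import os

# Values described in the commit above; shapes are illustrative, not the actual start_agent.py.
remote_rtc_uids = os.environ.get("AGORA_REMOTE_RTC_UIDS", "2001").split(",")
llm_model = os.environ.get("LLM_MODEL")  # optional; only forwarded when the provider needs it
greeting_message = "Hi! Ask me anything about the docs."  # agent speaks first when users join
system_messages = [
    {"role": "system", "content": "You are a concise voice assistant. Keep answers short."}
]
```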
… tunnels
- MCP_ALLOW_ALL_HOSTS=1 disables FastMCP's DNS-rebinding protection so
a public tunnel hostname (ngrok/cloudflared) can reach /mcp
- llm_proxy mounted at /llm/chat/completions when enabled; cleans Agora's
ConvoAI request body before forwarding to any OpenAI-compatible upstream:
* injects top-level 'model' if LLM_MODEL is set (required by some
providers whose chat/completions endpoint expects it in the body)
* strips non-spec fields Agora adds (turn_id, timestamp, interruptable,
metadata; 'strict' at tools[0])
* auto-decodes gzipped upstream responses
This makes the demo work with OpenAI-compat upstreams that are strict
about the request schema, without the agora-moss package owning any
provider-specific logic. (A rough sketch of this body-cleaning step follows below.)
- fastapi added as a dev dep for the proxy
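A minimal sketch of the request-body cleaning described above, assuming the incoming payload is already parsed into a dict. The field names come from this PR; the helper name and exact handling are illustrative rather than the actual llm_proxy.py, and the gzip auto-decoding happens on the response side, which is not shown here.

```python
import os

NON_SPEC_FIELDS = ("turn_id", "timestamp", "interruptable", "metadata")

def clean_convoai_body(body: dict) -> dict:
    """Strip Agora-added non-spec fields and optionally inject a top-level model."""
    cleaned = {k: v for k, v in body.items() if k not in NON_SPEC_FIELDS}

    # Some OpenAI-compatible providers require 'model' in the request body.
    model = os.environ.get("LLM_MODEL")
    if model and "model" not in cleaned:
        cleaned["model"] = model

    # Drop the non-spec 'strict' flag from the first tool definition, if present.
    tools = cleaned.get("tools")
    if tools and isinstance(tools[0], dict):
        tools[0].pop("strict", None)

    return cleaned
```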
- test_mcp_client.py calls search_knowledge_base against the local MCP server; useful for verifying the tool works without any Agora setup (see the client sketch below)
- test_mcp_client_remote.py does the same over the public tunnel URL
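A local smoke test along these lines might look like the sketch below. It is a hedged example, not the actual test_mcp_client.py: it assumes the official `mcp` Python SDK's streamable-HTTP client, a server at http://localhost:8000/mcp, and a `query` tool argument, none of which are confirmed by the PR text.

```python
import asyncio

from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

async def main() -> None:
    # URL and the "query" argument name are assumptions for this sketch.
    async with streamablehttp_client("http://localhost:8000/mcp") as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool(
                "search_knowledge_base", {"query": "How do I create a Moss index?"}
            )
            print(result.content)

if __name__ == "__main__":
    asyncio.run(main())
```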
Pull request overview
Adds a new Moss→Agora Conversational AI integration (agora-moss) plus a runnable demo app that exposes Moss semantic search as a single MCP tool over streamable HTTP.
Changes:
- Introduces packages/agora-moss: `MossAgoraSearch` adapter + `create_mcp_app()` FastMCP server exposing `search_knowledge_base`.
- Adds apps/agora-moss: demo MCP server, index seeder, ConvoAI join-body starter, and an optional OpenAI-compat LLM proxy.
- Updates repo documentation to list the new Agora integration.
Reviewed changes
Copilot reviewed 19 out of 23 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| packages/agora-moss/uv.lock | Locks Python dependencies for the new agora-moss package. |
| packages/agora-moss/src/agora_moss/search.py | Implements the Moss adapter and FastMCP tool exposure. |
| packages/agora-moss/src/agora_moss/__init__.py | Exposes the package’s public API via `__all__`. |
| packages/agora-moss/pyproject.toml | Defines package metadata, dependencies, and tooling config. |
| packages/agora-moss/README.md | Package-level install/quickstart and API documentation. |
| packages/agora-moss/LICENSE | Adds BSD-2-Clause license file for the package. |
| packages/agora-moss/tests/__init__.py | Initializes the package test module. |
| packages/agora-moss/tests/test_search.py | Unit + env-gated integration test coverage for search + MCP tool. |
| apps/agora-moss/pyproject.toml | Demo app dependencies and tooling config. |
| apps/agora-moss/server.py | ASGI entrypoint wiring FastMCP server (and proxy mount). |
| apps/agora-moss/llm_proxy.py | Optional OpenAI-compat proxy for strict upstreams. |
| apps/agora-moss/create_index.py | Seeds a Moss index with sample docs for the demo. |
| apps/agora-moss/start_agent.py | Mints RTC token and posts ConvoAI join body wiring MCP/ASR/TTS/LLM. |
| apps/agora-moss/env.example | Documents required environment variables for demo usage. |
| apps/agora-moss/moss_docs.json | Sample documents used by the seeding script. |
| apps/agora-moss/README.md | End-to-end demo walkthrough and Docker usage instructions. |
| apps/agora-moss/Dockerfile | Builds a container image for serving the MCP server demo. |
| apps/agora-moss/test_mcp_client.py | Local script to call the MCP tool via streamable HTTP. |
| apps/agora-moss/test_mcp_client_remote.py | Remote script to call a deployed MCP endpoint. |
| apps/agora-moss/tests/__init__.py | Initializes the demo app test module. |
| apps/agora-moss/tests/test_start_agent.py | Tests join-body construction and MCP server naming constraints. |
| README.md | Adds the Agora demo and integration row to the repo’s top-level docs. |
```python
try:
    return JSONResponse(r.json(), status_code=r.status_code)
except Exception:
    return JSONResponse({"error": "non-json upstream"}, status_code=500)
```
If the upstream response isn’t JSON, the proxy returns a generic 500 and discards the upstream status/body. This makes debugging upstream schema errors difficult. Prefer returning the upstream status code and raw body (and ideally content-type) instead of forcing a 500.
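A minimal sketch of that fallback, assuming `r` is the upstream `httpx.Response`; the helper name is illustrative, not the PR's code.

```python
import httpx
from starlette.responses import JSONResponse, Response

def forward_upstream(r: httpx.Response):
    """Return upstream JSON when possible, otherwise pass the raw body/status through."""
    try:
        return JSONResponse(r.json(), status_code=r.status_code)
    except Exception:
        # Keep the upstream status, body, and content-type so schema errors stay debuggable.
        return Response(
            content=r.content,
            status_code=r.status_code,
            media_type=r.headers.get("content-type", "text/plain"),
        )
```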
```python
async def gen():
    client = httpx.AsyncClient(timeout=60)
    try:
```
The streaming path creates an httpx.AsyncClient inside the async generator without a context manager. On client disconnect/cancellation, cleanup can be fragile and may leak connections under load. Prefer using async with httpx.AsyncClient(...) (with appropriate cancellation shielding) or a shared client managed by app lifespan.
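One way to address this, sketched under the assumption that the proxy is a plain Starlette app: own a single `httpx.AsyncClient` in the application lifespan and let handlers reuse it instead of constructing one inside the generator. The `proxy_app` name here is illustrative.

```python
import contextlib

import httpx
from starlette.applications import Starlette

@contextlib.asynccontextmanager
async def lifespan(app: Starlette):
    # One shared client for the whole process; closed cleanly on shutdown.
    async with httpx.AsyncClient(timeout=60) as client:
        app.state.http = client
        yield

proxy_app = Starlette(lifespan=lifespan)
# Handlers then use request.app.state.http instead of creating a client per stream.
```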
```python
URL = "https://attributable-marni-electrosurgical.ngrok-free.dev/mcp"
```
This script hard-codes a specific ngrok URL. That will quickly go stale and can unintentionally leak an endpoint in the repo. Consider reading the URL from an env var/CLI arg (with a placeholder default) or moving this to a clearly labeled local-only example file.
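A possible shape for that, with `MCP_REMOTE_URL` as a hypothetical variable name (it is not defined anywhere in this PR):

```python
import os
import sys

# Prefer an explicit CLI argument, fall back to an env var, and use an obvious placeholder default.
URL = (
    sys.argv[1]
    if len(sys.argv) > 1
    else os.environ.get("MCP_REMOTE_URL", "https://your-tunnel-host.example/mcp")
)
```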
```python
from llm_proxy import app as proxy_app  # noqa: E402
from starlette.routing import Mount  # noqa: E402

app.router.routes.insert(0, Mount("/llm", app=proxy_app))
```
llm_proxy is imported and mounted unconditionally. Because apps/agora-moss/llm_proxy.py reads LLM_PROXY_UPSTREAM at import time, the MCP server will crash on startup when that env var is unset (even if the proxy isn’t needed). Consider only importing/mounting the proxy when LLM_PROXY_UPSTREAM is present (or making llm_proxy safe to import when unset).
Suggested change:

```diff
-from llm_proxy import app as proxy_app  # noqa: E402
-from starlette.routing import Mount  # noqa: E402
-app.router.routes.insert(0, Mount("/llm", app=proxy_app))
+if os.environ.get("LLM_PROXY_UPSTREAM"):
+    from llm_proxy import app as proxy_app  # noqa: E402
+    from starlette.routing import Mount  # noqa: E402
+    app.router.routes.insert(0, Mount("/llm", app=proxy_app))
```
```dockerfile
COPY apps/agora-moss/pyproject.toml apps/agora-moss/
COPY apps/agora-moss/README.md apps/agora-moss/
COPY apps/agora-moss/server.py apps/agora-moss/
```
The image copies server.py but not apps/agora-moss/llm_proxy.py. Since server.py imports llm_proxy on startup, the container will crash with ModuleNotFoundError. Copy llm_proxy.py into the image or make the proxy import/mount optional.
Suggested change:

```diff
 COPY apps/agora-moss/server.py apps/agora-moss/
+COPY apps/agora-moss/llm_proxy.py apps/agora-moss/
```
```python
UPSTREAM = os.environ["LLM_PROXY_UPSTREAM"]
UPSTREAM_KEY = os.environ.get("LLM_API_KEY", "")
MODEL = os.environ.get("LLM_MODEL")
```
UPSTREAM = os.environ["LLM_PROXY_UPSTREAM"] is evaluated at import time, which makes importing this module crash when the env var is unset (even if the proxy route isn’t mounted/used). Use a lazy lookup (e.g., inside the handler) or os.environ.get with a clear runtime error/disabled behavior.
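A sketch of the deferred lookup this comment suggests, with an illustrative handler name; the follow-up commit further down describes the behavior the PR actually adopted (a 503 when the upstream is unset).

```python
import os

from starlette.requests import Request
from starlette.responses import JSONResponse

async def chat_completions(request: Request):
    # Read configuration at request time so importing the module never raises.
    upstream = os.environ.get("LLM_PROXY_UPSTREAM")
    if not upstream:
        return JSONResponse({"error": "LLM_PROXY_UPSTREAM is not set"}, status_code=503)
    upstream_key = os.environ.get("LLM_API_KEY", "")  # looked up the same lazy way
    model = os.environ.get("LLM_MODEL")
    # ... clean the body and forward it to `upstream` (omitted in this sketch) ...
    return JSONResponse({"detail": "forwarding omitted in this sketch"}, status_code=501)
```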
```python
if r.status_code >= 400:
    err = b"".join([chunk async for chunk in r.aiter_bytes()])
    print(f"UPSTREAM err body: {err.decode('utf-8', 'replace')[:2000]}", flush=True)
    yield err
    return
```
When the upstream returns an error (>=400) during streaming, the generator yields the error body but StreamingResponse still returns HTTP 200 with text/event-stream. Consider checking r.status_code before returning a streaming response and, on error, returning a non-streaming response that preserves the upstream status/body.
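One way to implement that, sketched with httpx's `send(..., stream=True)` plus Starlette's `StreamingResponse` and a `BackgroundTask` to close the upstream stream. The parameter names (`client`, `upstream_url`, `body`, `headers`) are assumed from context rather than taken from llm_proxy.py.

```python
import httpx
from starlette.background import BackgroundTask
from starlette.responses import Response, StreamingResponse

async def stream_or_error(client: httpx.AsyncClient, upstream_url: str, body: dict, headers: dict):
    """Open the upstream stream first; only wrap it in a StreamingResponse on success."""
    req = client.build_request("POST", upstream_url, json=body, headers=headers)
    r = await client.send(req, stream=True)

    if r.status_code >= 400:
        # Surface the upstream error as-is instead of hiding it behind a 200 SSE stream.
        err = await r.aread()
        await r.aclose()
        return Response(
            content=err,
            status_code=r.status_code,
            media_type=r.headers.get("content-type", "application/json"),
        )

    return StreamingResponse(
        r.aiter_bytes(),
        media_type="text/event-stream",
        background=BackgroundTask(r.aclose),
    )
```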
```markdown
## Prerequisites

- Python 3.10+ and [uv](https://docs.astral.sh/uv/).
- Moss project credentials: `MOSS_PROJECT_ID`, `MOSS_PROJECT_KEY`.
```
…ling

Addresses review feedback on PR #174:
- server.py: only import/mount llm_proxy when LLM_PROXY_UPSTREAM is set, so the server starts cleanly out of the box (previously server.py crashed at startup with KeyError because llm_proxy.py read the env var at import time).
- llm_proxy.py: look up env vars inside the handler instead of at module load, so importing the module never crashes and a missing upstream produces a 503 JSON body instead.
- llm_proxy.py: one module-level httpx.AsyncClient reused across requests (previously a new client was created inside the streaming generator and could leak on cancellation).
- llm_proxy.py: propagate upstream status and body in all paths. A streaming error now returns a non-streaming response with the upstream status code instead of HTTP 200 text/event-stream wrapping a 4xx body. Non-streaming success passes through the upstream content-type.
- Dockerfile: COPY llm_proxy.py into the image alongside server.py so the conditional mount works inside the container too.
…nel URL)

The script hardcoded a developer-specific ephemeral ngrok URL that wouldn't work for anyone else. test_mcp_client.py (localhost) covers the same smoke-test use case; remote testing can trivially be done by pointing that script at any public URL.
Addresses @HarshaNalluru's review comment on PR #174.
…MCP (usemoss#174)
Summary
- **New `packages/agora-moss`**: Python library exposing a single MCP tool (`search_knowledge_base`) backed by a Moss index over streamable HTTP. Plugs into Agora ConvoAI's `llm.mcp_servers` join-body field with zero LLM-side plumbing. Sibling to `vapi-moss` / `elevenlabs-moss`.
- **New `apps/agora-moss`**: runnable end-to-end demo: FastMCP streamable-HTTP server (`server.py`), `create_index.py` seeder, `start_agent.py` that mints an RTC token and POSTs a full ConvoAI `join` body wiring Moss MCP + Deepgram ASR + Cartesia TTS + any OpenAI-compatible LLM. Dockerfile targeting `ghcr.io/usemoss/agora-moss`.
- **Docs**: library README with quickstart + API table, full walkthrough README in the demo app, and a row for `agora-moss` in the integrations table.
- **LLM-compat proxy (optional)**: `apps/agora-moss/llm_proxy.py`, mounted at `/llm/chat/completions` when `LLM_PROXY_UPSTREAM` is set. Strips non-OpenAI-spec fields Agora injects into the chat/completions body (`turn_id`, `timestamp`, `interruptable`, `strict` on tool defs), optionally injects top-level `model`, auto-decompresses gzipped upstream responses. Lets the demo work against upstreams that strictly enforce the OpenAI schema.
- **Dev flag `MCP_ALLOW_ALL_HOSTS=1`**: disables FastMCP's DNS-rebinding protection so a public tunnel host (ngrok / cloudflared) can reach `/mcp` during development.

Verified end-to-end on a live ConvoAI voice agent: browser mic → Deepgram ASR → OpenAI-compat LLM (via the proxy) → MCP tool call → in-memory Moss query → LLM final answer → Cartesia TTS → audio published back into the channel.
Test plan
- [x] `packages/agora-moss`: `uv run pytest` passes (12 unit tests + 1 env-gated integration test)
- [x] `apps/agora-moss`: `uv run pytest` passes (6 unit tests)
- [x] Local MCP smoke test: direct client call to `search_knowledge_base` returns ranked docs
- [x] Public-tunnel MCP smoke test: same call over ngrok
- [x] Full ConvoAI voice roundtrip including `search_knowledge_base` tool call and spoken answer
- [ ] Follow-up: `LLM_PROXY_UPSTREAM` read at import time crashes the server if unset, even when the proxy is unused (minor; guard the import or defer the lookup into the handler)
- [ ] Follow-up: streaming `httpx.AsyncClient` created inside an async generator; teardown on abnormal disconnect relies on `aclose()` being awaited; consider a module-level client to avoid connection-leak risk under load