Demo video: `demo.mp4`
- Project structure
- High-level architecture
- The MCP approach — deep dive
- The LangGraph approach — deep dive
- Frontend options: Streamlit & Chainlit
- Setup & installation (local dev)
- Environment variables
- How to run (quick start)
- API surface & sample calls
- Prompts & prompt engineering notes
- Troubleshooting & common errors
- Testing & QA suggestions
- Design tradeoffs & recommendation
- Docs images (placeholders) — where to place them
- Contributing & License
Command used to print the structure: `python3 structure.py` — current output:

```text
├── app.py
├── chainlit.md
├── client.py
├── langgraph
│   ├── app copy.py
│   └── main copy.py
├── main.py
├── pyproject.toml
├── requirements.txt
├── src
└── structure.py
```
Important files and folders:

- `main.py` — MCP server using `FastMCP` (tools: `new_travel_request`, `travel_data_collected`, `out_of_domain_tool`).
- `client.py` — FastAPI wrapper / MCP client that spawns the MCP server stdio subprocess using `mcp-python-client`/`ClientSession` and exposes the `/chat` endpoint consumed by frontends.
- `app.py` — Streamlit frontend (older) or Chainlit frontend (migrated version), depending on which `app.py` you run (you may have multiple copies).
- `langgraph/main copy.py`, `langgraph/app copy.py` — LangGraph implementation (custom orchestrator + Chainlit frontend).
- `requirements.txt` / `pyproject.toml` — dependency list.
- `chainlit.md` — notes for Chainlit usage or configuration.
Tip: Keep `main.py` and `client.py` as the canonical MCP implementation and keep `langgraph/` as an alternate (experimental) approach. Keep `app.py` for the Chainlit UI (rename other copies, e.g. to `app_streamlit.py`, to avoid confusion).
MCP approach (recommended)

- `main.py` runs as an MCP server: it registers callable tools (Python functions) with `FastMCP`.
- A client (in `client.py`) uses `mcp-python-client`/`ClientSession` or `stdio_client` to connect to the server process over stdio (or run `mcp.run(transport="http")` for HTTP/SSE).
- A FastAPI wrapper (`client.py` as server) exposes `/chat`, selects which tool to invoke (router prompt), and calls that tool via MCP.
- The frontend (Chainlit) calls `POST /chat` and displays the assistant response.
LangGraph approach (alternate)

- `langgraph/main copy.py` defines a `StateGraph` workflow with `select_tool` → `call_tool` nodes. The graph calls the LLM for tool selection and then calls local functions directly.
- `langgraph/app copy.py` (Chainlit) calls a FastAPI endpoint `/agent` that executes the graph and returns the result.
Both approaches use Azure OpenAI (via `langchain_openai.AzureChatOpenAI`) in this repo. Both store simple in-memory data (`employee_id` and `travel_data`) for session-less demo purposes. For production, persist per-user sessions to a database.
Model Context Protocol (MCP) provides a standardized way to:
- Expose "tools" (functions) a server can run,
- Allow LLMs and clients to discover available tools (`list_tools`),
- Call tools in a structured manner (`call_tool`),
- Use multiple transports (stdio, HTTP/SSE, WebSocket), and
- Enable orchestration across systems.
`main.py` registers functions with `@mcp.tool()`:

- `new_travel_request(user_query: str) -> str` — captures the employee ID (expects 8 digits).
- `travel_data_collected(messages: list[dict]) -> dict | str` — collects travel details and returns JSON or a plain `response`.
- `out_of_domain_tool(messages: list[dict]) -> str` — polite redirect for non-travel input.
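The ID check can be handled by a local heuristic before involving the LLM. A minimal sketch of that step (the function name, regex, and reply wording are illustrative assumptions, not the repo's exact code):

```python
import re

def capture_employee_id(user_query: str) -> str:
    """Hypothetical sketch of the ID-capture step in new_travel_request."""
    candidate = user_query.strip()
    # Expect exactly 8 digits, nothing more.
    if re.fullmatch(r"\d{8}", candidate):
        return f"Got your ID ({candidate}). What's your travel purpose and destination?"
    return "Please provide your 8-digit Employee ID."
```

The same regex can be reused in the router to skip the LLM entirely when the input is obviously an employee ID.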
`client.py` runs a small FastAPI app that:

- Spawns the `main.py` subprocess (stdio) using `StdioServerParameters` + `stdio_client`.
- Keeps a `ClientSession` to call `list_tools()` and `call_tool(...)`.
- Provides the `/chat` endpoint, which:
  - builds a compact selector prompt (available tools + conversation),
  - asks an LLM which tool to call (or uses heuristics, e.g. calls the travel tool directly for an 8-digit ID), and
  - calls the chosen tool via MCP and returns a cleaned assistant response.
Benefits of the MCP approach:

- Tools are discoverable: any client can ask the server what tools it provides.
- Decouples LLM orchestration from tool implementation.
- Interchangeable transports.
- Easier to integrate multiple servers (HR server, payments server, travel server).
LangGraph models conversation logic with nodes and edges. You define a state type (`AgentState`) and create nodes (functions) that manipulate state and call the LLM. It's a custom finite-state graph plus LLM orchestration.
- `main copy.py` uses `StateGraph`: `select_tool(state)` — the LLM returns one tool name (string); `call_selected_tool(state)` — calls the selected function from `TOOL_REGISTRY`.
- `app copy.py` uses Chainlit to call the FastAPI `/agent` endpoint, which drives the graph.
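To see the shape of this flow without the library, here is a library-free sketch of the same `select_tool` → `call_tool` idea. The registry contents and the keyword heuristic standing in for the LLM are assumptions; the real code uses `StateGraph` and an actual LLM call for selection:

```python
from typing import Callable

# Hypothetical registry mirroring TOOL_REGISTRY in the repo.
TOOL_REGISTRY: dict[str, Callable[[str], str]] = {
    "new_travel_request": lambda text: "Please provide your 8-digit Employee ID.",
    "out_of_domain_tool": lambda text: "Sorry, I can only help with travel requests.",
}

def select_tool(state: dict) -> dict:
    """Stand-in for the LLM call: pick a tool name from the user input."""
    text = state["input"]
    tool = "new_travel_request" if "trip" in text.lower() else "out_of_domain_tool"
    return {**state, "tool": tool}

def call_selected_tool(state: dict) -> dict:
    """Invoke the chosen function from the registry, like the call_tool node."""
    reply = TOOL_REGISTRY[state["tool"]](state["input"])
    return {**state, "reply": reply}

def run_graph(user_input: str) -> dict:
    # Fixed edges: select_tool → call_selected_tool.
    state = {"input": user_input}
    for node in (select_tool, call_selected_tool):
        state = node(state)
    return state
```

Each node takes the state dict and returns an updated copy, which is exactly the contract `StateGraph` nodes follow.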
Pros:

- Very deterministic flow — you control node transitions and function-calling order.
- Easier to reason about complex workflows with branching and retries.
- Tighter coupling between graph and tools (good for domain-specific controlled flows).
- Good when the workflow must be tightly controlled and deterministic.

Cons:

- Not easily discoverable by external clients — custom architecture.
- Harder to plug in new services dynamically compared to MCP.
Streamlit:

- Quick to spin up, familiar for dashboards.
- Chat UI is possible but requires custom styling for good UX (we implemented a polished `app.py` with CSS).
- Not optimized for streaming assistant typing or other chat-centric features.
Chainlit:

- Built for chatbots: avatars, typing indicators, markdown rendering, steps, streaming messages, and developer tools.
- Simplifies frontend code: you send assistant messages and Chainlit handles rendering.
- Use Chainlit when you want a production chat-like UX fast.
- Create & activate a Python environment:

  ```bash
  python -m venv .venv
  source .venv/bin/activate
  pip install -r requirements.txt
  ```

  (If you use conda: `conda create -n api_agent python=3.10 && conda activate api_agent`, then `pip install -r requirements.txt`.)
- Install the tools used by the repo: `fastmcp` (server), `mcp-python-client` (client), `fastapi`, `uvicorn`, `langchain-openai` (Azure client), `chainlit` (frontend), `loguru`, `python-dotenv`, etc.
Example `requirements.txt` snippet:

```text
fastmcp
mcp-python-client
fastapi
uvicorn
langchain-openai
chainlit
loguru
python-dotenv
requests
```
- Put your Azure OpenAI keys in a `.env` file (see next section).
Create `.env` at the project root:

```env
AZURE_OPENAI_API_KEY=your_api_key_here
AZURE_OPENAI_ENDPOINT=https://<your-resource-name>.openai.azure.com
AZURE_OPENAI_API_VERSION=2024-02-15-preview
AZURE_OPENAI_DEPLOYMENT=gpt-4o-mini
```
If your code uses different environment variable names (e.g., for the deployment), make sure they match your `.env`.
- Start the FastAPI wrapper (it spawns the MCP server as a stdio subprocess):

  ```bash
  uvicorn client:app --reload
  ```

  This will:

  - Start a FastAPI server at `http://127.0.0.1:8000`.
  - On startup, spawn `python main.py` as a subprocess and create an MCP `ClientSession` over stdio.
  - Expose the `/chat` endpoint for frontends.

- Start the Chainlit frontend:

  ```bash
  chainlit run app.py -w
  ```

- Interact in the Chainlit UI (it will call `POST http://127.0.0.1:8000/chat`).
If you modify `main.py` to use the HTTP transport:

```python
# in main.py
mcp.run(transport="http", host="127.0.0.1", port=8888)
```

Then call tools over HTTP/SSE (the client needs to support it). This avoids the stdio subprocess and is more suitable for production.
- Body: plain text (user message).
- Header: `X-History` — optional conversation history (string).
- Returns: the assistant reply (plain text). On errors it returns a short error message.
Sample curl:

```bash
curl -X POST "http://127.0.0.1:8000/chat" \
  -H "Content-Type: text/plain" \
  -d "I want to book a trip from Mumbai to Pune on 2025-09-15"
```

Expected routing:

- If the message is an 8-digit number → `new_travel_request` captures the employee ID.
- If it contains travel details → `travel_data_collected` is called and returns a confirmation message.
- If it is off-topic → `out_of_domain_tool` politely declines.
- Keep selector prompts concise and rule-driven: give the LLM enough context to choose a single tool.
- Prefer structured JSON output from LLM when you want machine-readability (tools), but unwrap to human-readable text for UI.
- Use local heuristics before calling the LLM to reduce cost (e.g., exact 8-digit detect).
- Use robust parsing: always `json.loads` safely and accept double-encoded JSON strings.
- Keep explicit constraints (date format, example outputs) but avoid long enumerations that can be ignored by the model.
Example selector prompt:

```text
You are a tool selector. Tools: new_travel_request, travel_data_collected, out_of_domain_tool.
Conversation (last messages): ...
User now says: "<input>"
Rules:
- If input is exactly 8 digits → "new_travel_request".
- If input contains travel details and employee_id exists → "travel_data_collected".
- If unrelated → "out_of_domain_tool".
Return JSON only: {"tool":"<tool-name>", "arguments":{...}} OR plain answer if no tool is needed.
```
- Fix: use `FastMCP.run()` directly; `serve_stdio` was removed/deprecated.
- Fix: use the `mcp-python-client` package (e.g., `from mcp_python_client import MCPClient`) or the low-level `ClientSession` with `stdio_client`.
- Fix: unwrap `{"response": ...}` into plain text in the backend, or sanitize in the frontend (Chainlit) by parsing and extracting `response`. The recommended approach is to fix the backend so the UI always gets plain text for simple replies.
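The backend-side unwrap can be as small as this (the helper name and pass-through fallback are assumptions):

```python
import json

def unwrap_response(raw: str) -> str:
    """If a tool returned JSON like {"response": "..."}, extract the plain text;
    otherwise pass the string through unchanged."""
    try:
        parsed = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return raw
    if isinstance(parsed, dict) and isinstance(parsed.get("response"), str):
        return parsed["response"]
    return raw
```

Calling this once before returning from `/chat` keeps the UI free of raw JSON for simple replies.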
- Solution:
  - Add explicit example outputs in prompts that produce JSON (show exactly the expected structure).
  - Use `json.loads()` inside `try:` with a fallback.
  - Accept double-encoded JSON strings and parse twice.
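A possible shape for a `safe_json_load` helper covering the double-encoded case (the name matches the one referenced in the testing section below; the fallback behavior is an assumption):

```python
import json
from typing import Any

def safe_json_load(raw: str) -> Any:
    """Parse model output as JSON, tolerating double-encoded strings.
    Returns the original string if parsing fails entirely."""
    try:
        parsed = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return raw
    # Double-encoded case: json.loads produced *another* JSON string.
    if isinstance(parsed, str):
        try:
            return json.loads(parsed)
        except json.JSONDecodeError:
            return parsed
    return parsed
```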
- Unit tests for prompt parsing: feed model-like strings and test the `safe_json_load` logic.
- Integration tests:
  - Start the server in CI (or as a subprocess) and run example flows via an HTTP client.
  - Mock LLM responses in tests to validate state transitions.
- Manual test cases:
  - 8-digit ID only → should capture the employee ID.
  - Purpose only, after the ID → should call the travel-collection tool.
  - Full details in a single message → should parse and respond with a summary + `confirm`.
  - `confirm` → should result in a booking confirmation message (if implemented).
- Load testing: measure the frequency of client `list_tools` calls and cache the results to avoid repeated calls.
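A time-based cache for tool discovery might look like this (a sketch; `fetch_tools` stands in for the real `session.list_tools()` call):

```python
import time
from typing import Callable

class ToolCache:
    """Cache the tool list for `ttl` seconds to avoid re-fetching on every /chat call."""

    def __init__(self, fetch_tools: Callable[[], list[str]], ttl: float = 60.0):
        self._fetch = fetch_tools
        self._ttl = ttl
        self._tools: list[str] | None = None
        self._stamp = 0.0

    def get(self) -> list[str]:
        now = time.monotonic()
        if self._tools is None or now - self._stamp > self._ttl:
            self._tools = self._fetch()
            self._stamp = now
        return self._tools

# Demo with a fake fetcher that counts how often it is actually called.
calls = []
def fake_fetch() -> list[str]:
    calls.append(1)
    return ["new_travel_request", "travel_data_collected", "out_of_domain_tool"]

cache = ToolCache(fake_fetch, ttl=60.0)
cache.get()
cache.get()  # served from cache; fake_fetch runs only once
```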
- If you want modularity, future integrations, or multi-service orchestration, use MCP (recommended).
- If you need a single, deterministic flow (complex business rules), LangGraph is good.
- For clean frontend UX use Chainlit (chat-focused); use Streamlit for dashboards.
Recommendation: Use MCP as the core pattern + Chainlit as UI. Keep LangGraph as a specialized experiment for flows that require deterministic branching.
Create `docs/` (or `images/`) and add three images (PNG/SVG). Reference them in the README with these names and captions.
- `docs/architecture.png` — Architecture Diagram (MCP vs LangGraph)
  - Shows: Client (Chainlit/Streamlit) → FastAPI wrapper → MCP `ClientSession` → MCP server (`main.py`) with tools; plus the separate LangGraph flow.
  - Caption: "High-level architecture showing the MCP pipeline and the LangGraph alternative."
- `docs/ui_preview.png` — UI screenshot (Chainlit)
  - Shows: a conversation with avatars, sample messages, and a final booking summary formatted in markdown.
  - Caption: "Chainlit chat UI with avatars and a formatted booking summary."
Tip: for each image, include short alt text for accessibility, and keep the files in `docs/` to avoid clutter.
Contributions welcome. Please:

- Open an issue describing the changes.
- Open PRs against `main` with tests and documentation.
- Keep changes backward compatible.
Suggested `LICENSE`: MIT (change as desired).
Example conversation flow:

- User: `I want to book a trip` → the selector decides: `new_travel_request` (or asks the LLM to choose). Server responds: `Please provide your 8-digit Employee ID.`
- User: `23456789` → `new_travel_request` captures the ID and returns: `Got your ID. What's your travel purpose and destination?`
- User: `R&D Project, Mumbai to Pune, 15th Sept 2025 9am to 20th Sept 2025 10pm, Round trip, Train, Sleeper, Self Booked` → `travel_data_collected` returns the parsed travel details and asks for `cost_center`/`project_wbs`.
- User: `607402 and ADRG.25IT.DG.GE.A01` → `travel_data_collected` returns a formatted summary and asks: `reply 'confirm' to book or 'no' to edit`.
- User: `confirm` → server returns: `✅ Your trip is booked!` (in production you would persist this to a DB).
- Add persistent session storage (per-user) — the current in-memory data is demo-only.
- Add authentication to the FastAPI endpoints for security.
- Add unit tests for prompt parsing and JSON extraction.
- Add monitoring and logging around LLM calls to watch cost & failures.
- Consider running MCP server over HTTP/SSE in production for simpler scaling (no subprocess).
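For the persistent-session item above, a minimal sketch using the stdlib `sqlite3` module (table name, schema, and class name are assumptions; swap in your real database):

```python
import json
import sqlite3

class SessionStore:
    """Persist per-user employee_id and travel_data instead of module-level dicts."""

    def __init__(self, path: str = ":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS sessions (user_id TEXT PRIMARY KEY, data TEXT)"
        )

    def save(self, user_id: str, data: dict) -> None:
        # Upsert: replace the stored blob for this user.
        self.conn.execute(
            "INSERT INTO sessions (user_id, data) VALUES (?, ?) "
            "ON CONFLICT(user_id) DO UPDATE SET data = excluded.data",
            (user_id, json.dumps(data)),
        )
        self.conn.commit()

    def load(self, user_id: str) -> dict:
        row = self.conn.execute(
            "SELECT data FROM sessions WHERE user_id = ?", (user_id,)
        ).fetchone()
        return json.loads(row[0]) if row else {}

store = SessionStore()
store.save("23456789", {"employee_id": "23456789", "travel_data": {"from": "Mumbai"}})
```

Each `/chat` call would then `load` the session at the start and `save` it after the tool runs, keyed by the authenticated user.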
If you want, I can:

- Generate the three diagrams (architecture, sequence, UI preview) as PNGs and place them in `docs/`.
- Create a `README.md` file and save it into the repo for you.
- Add an example Postman collection or OpenAPI docs for `/chat`.
Tell me which of those you’d like me to do next and I’ll produce them immediately.

