This project simulates a real-time voice call between a human user and an OpenAI-powered conversational agent. It supports bidirectional audio streaming, dynamic prompt configuration, and tool-calling via a separate FastAPI server exposing both OpenAPI and MCP-compatible endpoints.
- 🔁 Full-duplex (bidirectional) audio using WebSockets and `pyaudio`
- 🎙️ Realtime transcription & TTS using OpenAI's Whisper + Speech APIs
- 🧠 Custom toolset integration served by a separate server (`tools_server.py`)
- 🔌 Tool use via both OpenAPI and MCP endpoints
- 🌐 Web interface via FastAPI (`app.py`) to simulate an interactive call experience
```
.
├── app.py           # Web server (FastAPI) handling audio exchange and user interaction
├── main.py          # CLI runner to test mic/speaker streaming with the OpenAI API
├── prompts.py       # System prompt and tool configuration logic
├── tools_server.py  # Standalone API server exposing OpenAPI + MCP endpoints
├── requirements.txt # Dependency list
├── .env.example     # Sample environment variables
├── static/          # Web frontend assets (JS, CSS, etc.)
└── templates/       # Web frontend templates (HTML)
```
We recommend using `uv` for fast dependency resolution.

```bash
uv venv
source .venv/bin/activate  # or `.venv\Scripts\activate` on Windows
uv pip install -r requirements.txt
```

Create a `.env` file from the provided `.env.example`:
```
API_KEY=your_openai_key_here
WS_URL=wss://api.openai.com/v1/audio/ws  # or your realtime endpoint
```

Start this first to expose tool endpoints to the assistant:
```bash
python tools_server.py
```

Access:

- http://127.0.0.1:8888/docs – OpenAPI docs
- http://127.0.0.1:8888/.well-known/ai-plugin.json – Plugin manifest for tool calling
- MCP endpoint mounted via `fastapi-mcp`
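The plugin manifest served at `/.well-known/ai-plugin.json` is a small JSON document that points callers at the server's OpenAPI schema. A minimal sketch of how it might be built — the `build_manifest` helper and the field values are illustrative assumptions, not the project's actual code:

```python
import json

def build_manifest(base_url: str) -> dict:
    """Build a plugin manifest pointing at the server's OpenAPI schema.
    Field names follow the OpenAI plugin-manifest convention; the values
    here are placeholders, not the project's real configuration."""
    return {
        "schema_version": "v1",
        "name_for_model": "bank_tools",
        "description_for_model": "Tools exposed to the voice assistant.",
        "api": {"type": "openapi", "url": f"{base_url}/openapi.json"},
    }

# Serialize the manifest the way the endpoint would return it
print(json.dumps(build_manifest("http://127.0.0.1:8888"), indent=2))
```

The assistant only needs the `api.url` field to fetch the OpenAPI schema and discover the available operations.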
In another terminal:

```bash
python app.py
```

Access the interface at: http://127.0.0.1:8000
You can simulate mic-to-model calls via:

```bash
python main.py
```

This connects to the OpenAI realtime API, transmits microphone input, and plays the AI's audio response live.
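On the wire, realtime-style APIs typically carry raw PCM as base64 inside JSON events. A small sketch of that framing — the `input_audio_buffer.append` event name follows OpenAI's realtime convention, but the exact protocol depends on your `WS_URL` endpoint:

```python
import base64
import json

def frame_audio_chunk(pcm_bytes: bytes) -> str:
    """Wrap a raw PCM chunk in a JSON event for the realtime WebSocket.
    The event name is an assumption modeled on OpenAI's realtime protocol."""
    return json.dumps({
        "type": "input_audio_buffer.append",
        "audio": base64.b64encode(pcm_bytes).decode("ascii"),
    })

def unframe_audio_chunk(message: str) -> bytes:
    """Inverse: recover the raw PCM bytes from a received audio event."""
    return base64.b64decode(json.loads(message)["audio"])
```

In `main.py`, each buffer read from `pyaudio` would be passed through a framer like this before being sent over the WebSocket.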
Tools are defined in `tools_server.py` and registered via:

```json
{
  "tools": [
    {
      "type": "openapi",
      "url": "https://xyz123.ngrok.io/.well-known/ai-plugin.json"
    }
  ]
}
```

To switch to an MCP-compatible client (like `npx @modelcontextprotocol/inspector`), ensure your client config is similar to:
```json
{
  "mcp": {
    "servers": {
      "bank-mcp-server": {
        "type": "sse",
        "url": "http://127.0.0.1:8888/mcp"
      }
    }
  }
}
```

- User speaks or sends audio (Web UI or `main.py`).
- Audio is converted, encoded, and sent via WebSocket to OpenAI.
- OpenAI replies with TTS audio and optionally calls a tool.
- The result is streamed back and played to the user.
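The receive side of that flow amounts to a small event dispatcher: audio deltas go to the speaker, tool calls are executed and their result returned. A sketch, where the event names (`audio.delta`, `tool.call`) and the callback signatures are illustrative assumptions:

```python
def handle_event(event: dict, play_audio, call_tool):
    """Dispatch one server event.

    play_audio(data) -> None plays an audio chunk; call_tool(name, args)
    executes a tool and returns its result. Event names are assumptions
    for illustration, not the wire protocol's actual names."""
    if event.get("type") == "audio.delta":
        play_audio(event["audio"])
        return None
    if event.get("type") == "tool.call":
        return call_tool(event["name"], event.get("arguments", {}))
    return None  # ignore event types we don't handle
```

A real client would run this inside the WebSocket receive loop, feeding each decoded JSON message through the dispatcher.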
Key libraries:

- `fastapi`, `uvicorn`, `fastapi-mcp` – Web & MCP API
- `pyaudio`, `pydub` – Audio handling
- `websocket-client`, `socks`, `nest_asyncio` – Real-time communication
- `python-dotenv`, `pydantic` – Configuration and validation
Install and run the inspector:

```bash
npx @modelcontextprotocol/inspector
```

Paste the MCP server config, then issue tool calls via the UI to test interaction with your `tools_server.py`.
This is a local-first prototype meant to test real-time capabilities. For production, you may need a proper deployment setup (e.g., Docker, HTTPS certificates, scalable ASGI workers).
Audio support on the frontend assumes WebM and WAV interoperability between the browser and the backend.