A complete MCP (Model Context Protocol) implementation featuring a web search server and chat client powered by local Ollama models.
┌─────────────────┐
│ Streamlit │ (Port 8501)
│ Frontend │ User Interface
└────────┬────────┘
│
↓ HTTP
┌─────────────────┐
│ FastAPI │ (Port 8001)
│ MCP Client │ Orchestration Layer
└────┬────────┬───┘
│ │
↓ ↓
┌─────────┐ ┌──────────────┐
│ Ollama │ │ MCP Server │ (Port 8000)
│ (Local) │ │ (FastMCP 3.0)│ Web Search
└─────────┘ └──────────────┘
│
↓
┌─────────┐
│ SerpAPI │
└─────────┘
- MCP Server: Web search capability using SerpAPI
- MCP Client: FastAPI-based orchestration layer
- Local LLM: Integration with Ollama models (qwen2.5:7b-instruct, llama3.2:3b, etc.)
- Web Interface: Clean Streamlit UI for chatting
- Intelligent Search: Automatic detection of queries requiring web search
- Streamable HTTP Transport: Modern MCP transport protocol
-
Python 3.8+
-
Ollama installed and running locally
- Download from: https://ollama.ai
- Pull models:
ollama pull qwen2.5:7b-instruct ollama pull llama3.2:3b ollama pull qwen2.5-coder:3b
-
SerpAPI Key
- Sign up at: https://serpapi.com
- Get your API key from the dashboard
cd /path/to/projectpython -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activatepip install -r requirements.txtCreate a .env file (or set environment variables):
cp .env.example .envEdit .env and add your SerpAPI key:
SERPAPI_API_KEY=your_actual_serpapi_key_here
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=qwen2.5:7b-instruct
MCP_SERVER_URL=http://localhost:8000You need to run three components in separate terminal windows:
# Make sure SERPAPI_API_KEY is set
export SERPAPI_API_KEY='your-api-key-here'
# Run the MCP server
python search_mcp_server.pyExpected output:
Starting Web Search MCP Server...
Transport: Streamable HTTP
Port: 8000
Available tool: web_search
python mcp_client_api.pyExpected output:
Starting FastAPI MCP Client...
Ollama URL: http://localhost:11434
MCP Server URL: http://localhost:8000
Default Model: qwen2.5:7b-instruct
INFO: Uvicorn running on http://0.0.0.0:8001
streamlit run streamlit_app.pyExpected output:
You can now view your Streamlit app in your browser.
Local URL: http://localhost:8501
-
Open your browser to
http://localhost:8501 -
Configure settings in the sidebar:
- Select your Ollama model
- Enable/disable web search
- Refresh model list if needed
-
Start chatting:
- Type your question in the chat input
- The system automatically detects if web search is needed
- View search results in expandable sections
Queries that trigger web search:
- "What's the latest news on AI?"
- "Search for Python async programming tutorials"
- "What is the current weather in New York?"
- "Find information about FastMCP 3.0"
Queries that don't need search:
- "Explain how async/await works in Python"
- "Write a function to calculate fibonacci numbers"
- "What is object-oriented programming?"
POST /chat- Send chat messageGET /tools- List available MCP toolsGET /models- List available Ollama modelsGET /health- Health check
POST /mcp/v1/call_tool- Call MCP toolPOST /mcp/v1/list_tools- List available tools
.
├── search_mcp_server.py # MCP server with web search tool
├── mcp_client_api.py # FastAPI client connecting Ollama + MCP
├── streamlit_app.py # Streamlit frontend
├── requirements.txt # Python dependencies
├── .env.example # Environment variable template
└── README.md # This file
- User submits a query via Streamlit UI
- FastAPI client receives the request
- Query analysis determines if web search is needed
- If search needed:
- MCP client calls
web_searchtool on MCP server - MCP server queries SerpAPI and returns formatted results
- Search results are added to the conversation context
- MCP client calls
- Ollama LLM generates a response with search context
- Response is displayed to the user with search attribution
You can test the MCP server independently:
import httpx
import json
async def test_search():
client = httpx.AsyncClient()
payload = {
"method": "tools/call",
"params": {
"name": "web_search",
"arguments": {
"query": "FastMCP 3.0 documentation",
"num_results": 3
}
}
}
response = await client.post(
"http://localhost:8000/mcp/v1/call_tool",
json=payload
)
print(json.dumps(response.json(), indent=2))
await client.aclose()
# Run with: python -m asyncio test_script.py- Check if port 8000 is already in use
- Verify SERPAPI_API_KEY is set correctly
- Ensure FastMCP 3.0+ is installed:
pip install --upgrade fastmcp
- Ensure Ollama is running:
ollama list - Check if MCP server is running on port 8000
- Verify model exists:
ollama pull qwen2.5:7b-instruct
- Ensure FastAPI is running on port 8001
- Check browser console for CORS errors
- Try refreshing the page
- Verify SERPAPI_API_KEY is valid
- Check SerpAPI quota/limits
- Look at MCP server logs for errors
- FastMCP Documentation
- MCP Protocol Specification
- Ollama Documentation
- SerpAPI Documentation
- FastAPI Documentation
- Streamlit Documentation
This project is provided as-is for educational and demonstration purposes.
- FastMCP team for the excellent MCP implementation
- Anthropic for the MCP protocol
- Ollama team for local LLM capabilities
- SerpAPI for web search functionality