A production-ready, Microsoft Teams–integrated Retrieval-Augmented Generation (RAG) application built with FastAPI and LangChain, with a pluggable LLM/embeddings strategy switched entirely via environment variables.
- **Pluggable LLM Backend**: `MODEL_TYPE` env switch: `ollama | azure | gemini`
- **Flexible Embeddings**: override via `EMBED_BACKEND` (`sentence_transformers` / `ollama` / `azure`)
- **Persistent Vector Store**: ChromaDB with configurable directory (`CHROMA_PERSIST_DIR`)
- **Multi-format Ingestion**: CSV, PDF, and raw text ingestion (admin only) with versioning & checksum dedupe
- **RAG Query Endpoint**: simple RAG query with citations and configurable retrieval
- **Teams Integration**: Bot Framework SDK with identity mapping and role verification
- **Production Ready**: uv for dependency management, proper logging, CORS, health checks
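The env-driven backend switch can be pictured with a minimal sketch. The factory return values below are placeholders, not the app's real API (the actual implementation would return configured LangChain chat models), but the dispatch pattern is the point:

```python
import os

# Placeholder factories; the real app would return configured LangChain clients.
BACKENDS = {
    "ollama": lambda: "OllamaChat(model=llama3.1)",
    "azure": lambda: "AzureOpenAIChat(deployment=gpt-4)",
    "gemini": lambda: "GeminiChat(model=gemini-1.5-pro)",
}

def get_model_strategy():
    """Pick the LLM backend purely from the MODEL_TYPE env var."""
    model_type = os.getenv("MODEL_TYPE", "ollama").lower()
    try:
        return BACKENDS[model_type]()
    except KeyError:
        raise ValueError(f"Unsupported MODEL_TYPE: {model_type!r}") from None
```

Because selection happens at a single seam, swapping providers never touches the RAG pipeline itself.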
1. Install uv (Python package manager):

   ```shell
   # Windows (PowerShell)
   powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
   # Alternative: pip install uv
   ```

2. Install Ollama (if using local LLM):

   ```shell
   # Download from https://ollama.ai/ or use package manager
   winget install Ollama.Ollama
   # Pull required models
   ollama pull llama3.1
   ollama pull mxbai-embed-large
   ```
```shell
# Clone and navigate to project
cd best-bot

# Create virtual environment and install dependencies
uv sync

# Copy environment template and configure
copy .env.example .env
# Edit .env - set MODEL_TYPE and other required values

# Initialize database and seed users
uv run python -m app.scripts.migrate
uv run python -m app.scripts.seed_users

# Start development server (with hot reload)
uv run python scripts/dev.py
```
The API will be available at: http://localhost:8000
- OpenAPI Docs: http://localhost:8000/docs
- Health Check: http://localhost:8000/health
Query the RAG system:
```shell
# Test query endpoint
curl -X POST "http://localhost:8000/query/" \
  -H "Content-Type: application/json" \
  -d '{"question": "What data is available?"}'
```
Ingest sample data (admin only - currently uses first seeded user):
```shell
# Ingest CSV
curl -F "file=@tests/data/sample.csv" -F "source_name=sample_csv" http://localhost:8000/ingest/
```
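How versioning and checksum dedupe might behave during ingestion, as a toy in-memory sketch (the real app persists this metadata in its database; the class and return strings here are illustrative):

```python
import hashlib

class IngestLedger:
    """Toy ledger illustrating checksum dedupe with per-source versioning."""

    def __init__(self):
        self.seen = {}       # sha256 checksum -> (source_name, version)
        self.versions = {}   # source_name -> latest version number

    def ingest(self, source_name: str, data: bytes) -> str:
        checksum = hashlib.sha256(data).hexdigest()
        if checksum in self.seen:
            return "skipped: duplicate content"
        version = self.versions.get(source_name, 0) + 1
        self.versions[source_name] = version
        self.seen[checksum] = (source_name, version)
        return f"ingested {source_name} v{version}"
```

Re-uploading identical bytes is a no-op, while changed content under the same `source_name` bumps the version.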
```shell
# Install dependencies
uv sync --no-dev

# Set production environment variables
export MODEL_TYPE=azure               # or ollama/gemini
export LOG_LEVEL=WARNING
export DATABASE_URL=postgresql://...  # for production DB

# Run with production settings
uv run python scripts/prod.py
```
```shell
# Build image
docker build -t teams-rag-bot .

# Run container
docker run -p 8000:8000 --env-file .env teams-rag-bot
```
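A minimal Dockerfile sketch that would support the commands above. This is an assumption about the project's layout (uv-managed `pyproject.toml`/`uv.lock`, `scripts/prod.py` entrypoint), not the repository's actual Dockerfile:

```dockerfile
FROM python:3.12-slim

# Install uv inside the image
RUN pip install --no-cache-dir uv

WORKDIR /app

# Install locked dependencies first for better layer caching
COPY pyproject.toml uv.lock ./
RUN uv sync --no-dev --frozen

# Copy application code
COPY . .

EXPOSE 8000
CMD ["uv", "run", "python", "scripts/prod.py"]
```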
Core environment variables (see `.env.example` for the complete list):
```shell
MODEL_TYPE=ollama                 # one of: ollama | azure | gemini

# Ollama
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3.1
OLLAMA_EMBED=mxbai-embed-large

# Azure OpenAI
AOAI_ENDPOINT=https://your-resource.openai.azure.com/
AOAI_API_KEY=your-api-key
AOAI_DEPLOYMENT_CHAT=gpt-4
AOAI_DEPLOYMENT_EMBED=text-embedding-3-small

# Gemini
GEMINI_API_KEY=your-api-key
GEMINI_MODEL=gemini-1.5-pro
```
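These variables are typically read once at startup. A stdlib-only sketch of how the app might snapshot them (the field names mirror the variables above, but this `Settings` class and its defaults are an assumption, not the app's real config class):

```python
import os
from dataclasses import dataclass

@dataclass
class Settings:
    """Illustrative snapshot of backend-related env vars with sensible defaults."""
    model_type: str
    ollama_base_url: str
    ollama_model: str

    @classmethod
    def from_env(cls) -> "Settings":
        # Defaults match the example values in the .env listing above.
        return cls(
            model_type=os.getenv("MODEL_TYPE", "ollama"),
            ollama_base_url=os.getenv("OLLAMA_BASE_URL", "http://localhost:11434"),
            ollama_model=os.getenv("OLLAMA_MODEL", "llama3.1"),
        )
```

Centralizing env access like this keeps the provider switch a pure configuration concern.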
Switch between LLM providers without code changes by updating environment variables:
```shell
# Switch to Ollama (local)
echo "MODEL_TYPE=ollama" > .env

# Switch to Azure OpenAI (cloud)
# (note: '>' overwrites .env - edit the file instead if you need to keep other values)
echo "MODEL_TYPE=azure" > .env
echo "AOAI_ENDPOINT=https://..." >> .env
echo "AOAI_API_KEY=..." >> .env

# Restart server - no code changes needed!
```
1. Create Azure Bot Registration:
   - Go to Azure Portal > Bot Services
   - Create a new Bot Registration
   - Note the App ID and generate an App Password

2. Configure Environment:

   ```shell
   BOT_APP_ID=your-app-id
   BOT_APP_PASSWORD=your-app-password
   BOT_TENANT_ID=your-tenant-id
   ```

3. Expose Local Server:

   ```shell
   # Install ngrok or use dev tunnels
   ngrok http 8000
   # Note the https URL: https://abc123.ngrok.io
   ```

4. Update Bot Registration:
   - Set messaging endpoint: `https://abc123.ngrok.io/api/messages`

5. Install in Teams:
   - Create a Teams app manifest
   - Upload to Teams for testing
- Natural Language Queries: Users can ask questions directly in Teams
- Admin Commands: Admins can trigger ingestion workflows via Teams
- Identity Mapping: Teams user identity mapped to internal user roles
- Rich Responses: Formatted answers with source citations
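Identity mapping can be pictured as a lookup from the Teams (AAD) user id to an internal role, checked before admin-only actions. This is a toy sketch with invented ids; the real app stores the mapping in its user table:

```python
# Toy mapping from Teams AAD object ids to internal roles (illustrative ids).
USER_ROLES = {
    "aad-object-id-alice": "admin",
    "aad-object-id-bob": "user",
}

def can_ingest(teams_user_id: str) -> bool:
    """Only admins may trigger ingestion workflows from Teams; unknown users are guests."""
    return USER_ROLES.get(teams_user_id, "guest") == "admin"
```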
```
┌─────────────────┐      ┌──────────────────┐      ┌─────────────────┐
│  Teams Client   │      │   FastAPI App    │      │  Vector Store   │
│                 │◄────►│                  │◄────►│   (ChromaDB)    │
│ - Chat Interface│      │ - Auth & Routing │      │ - Embeddings    │
│ - File Upload   │      │ - RAG Pipeline   │      │ - Similarity    │
└─────────────────┘      └──────────────────┘      └─────────────────┘
                                  │
                                  ▼
                         ┌──────────────────┐
                         │   LLM Backend    │
                         │ ┌──────────────┐ │
                         │ │    Ollama    │ │
                         │ │   Azure AI   │ │
                         │ │    Gemini    │ │
                         │ └──────────────┘ │
                         └──────────────────┘
```
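The query flow through this architecture, stripped to its essentials: a toy keyword retriever stands in for ChromaDB similarity search and a stub string stands in for the LLM call, so only the shape of the pipeline (retrieve → build context → generate → cite sources) is meant to match the real app:

```python
def retrieve(question: str, corpus: dict[str, str], k: int = 2) -> list[tuple[str, str]]:
    """Rank documents by naive keyword overlap (ChromaDB would use embeddings)."""
    words = set(question.lower().split())
    scored = sorted(
        corpus.items(),
        key=lambda item: len(words & set(item[1].lower().split())),
        reverse=True,
    )
    return scored[:k]

def answer(question: str, corpus: dict[str, str]) -> dict:
    """Assemble context, call the LLM (stubbed here), and attach source citations."""
    hits = retrieve(question, corpus)
    context = "\n".join(text for _, text in hits)  # would be interpolated into the prompt
    llm_reply = f"(stub answer grounded in {len(hits)} chunks)"  # real app: LLM call
    return {"answer": llm_reply, "citations": [source for source, _ in hits]}
```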
```shell
# Development server with hot reload
uv run python scripts/dev.py

# Production server
uv run python scripts/prod.py

# Run tests
uv run pytest

# Database migrations
uv run python -m app.scripts.migrate

# Seed test users
uv run python -m app.scripts.seed_users

# Check dependencies
uv tree

# Update dependencies
uv sync --upgrade
```
Import errors:
- Ensure `PYTHONPATH` includes the `src` directory
- Use the provided scripts: `uv run python scripts/dev.py`

ChromaDB issues:
- Check `.chroma` directory permissions
- Delete the `.chroma` directory to reset

Ollama connection:
- Ensure Ollama is running: `ollama serve`
- Check the base URL: `curl http://localhost:11434/api/tags`
- Pull required models: `ollama pull llama3.1`

Environment variables:
- Check `.env` file format
- Ensure no spaces around `=` in environment variables
- Use commas without spaces for lists: `item1,item2,item3`
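The comma-separated list convention can be parsed with a small helper. Both the helper and the `CORS_ORIGINS` variable name used below are illustrative assumptions, not the app's actual code:

```python
import os

def env_list(name: str, default: str = "") -> list[str]:
    """Split a comma-separated env var into a clean list, tolerating stray spaces."""
    raw = os.getenv(name, default)
    return [item.strip() for item in raw.split(",") if item.strip()]
```

Stripping each item makes the app forgiving even when the "no spaces" rule is violated.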
```shell
# View logs with structured output
uv run python scripts/dev.py | tee app.log

# Debug specific module
uv run python -c "from app.rag.models import get_model_strategy; print(get_model_strategy())"
```
1. Setup Development Environment:

   ```shell
   uv sync
   uv run pre-commit install  # Coming soon
   ```

2. Run Tests:

   ```shell
   uv run pytest tests/
   ```

3. Code Style:

   ```shell
   uv run black src/
   uv run ruff check src/
   ```
MIT License - see LICENSE file for details.