Your Personal LLM Laboratory on a Laptop
One-command setup to run your own ChatGPT-style interface with complete control. 100% open source, runs locally, experiment fast with multiple models.
meGPT is a local LLM lab that lets you:
- 🎨 Experiment Fast - ChatGPT-style UI for testing multiple models
- 🔒 Full Control - Everything runs on your machine, no data leaves
- 🔄 Hot-swap Models - Switch models at runtime, no restarts needed
- 🧪 Compare & Test - Built-in tools for model comparison, evals, RAG
- 🛡️ Red Team - Test safety and security of your deployments
- 📦 One Command - `./bootstrap.sh` and you're running
```
┌─────────────┐
│ Open WebUI  │  (ChatGPT-style interface)
│    :3000    │
└──────┬──────┘
       │
       ▼
┌─────────────┐
│   Gateway   │  (FastAPI - multi-model routing)
│    :8001    │
└──────┬──────┘
       │
       ▼
┌─────────────┐
│   Ollama    │  (LLM runtime)
│   :11434    │
└──────┬──────┘
       │
       ▼
   [Models]      (Llama2, Mistral, CodeLlama, etc.)
```
Key Design Principles:
- UI → Gateway → Ollama → Models - Clean separation of concerns
- Models as Data - Swappable at runtime, not baked into containers
- Python Everywhere - Easy to hack, extend, and understand
- Weird Ports - 3000, 8001, 11434 (easy to remember, avoid conflicts)
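The separation of concerns above boils down to the gateway translating OpenAI-style requests into Ollama calls and back. A minimal sketch of that translation step (the function and field handling here are illustrative, not the actual `gateway/main.py` code):

```python
# Illustrative sketch of the gateway's translation job: map an
# OpenAI-style chat completion body onto Ollama's /api/chat schema.
# Not the real gateway/main.py implementation.

def to_ollama_chat(openai_request: dict) -> dict:
    """Convert an OpenAI-format chat request into an Ollama /api/chat payload."""
    return {
        "model": openai_request["model"],
        # Both APIs use the same role/content message shape, so this passes through.
        "messages": openai_request["messages"],
        "stream": openai_request.get("stream", False),
    }

request = {
    "model": "llama2",
    "messages": [{"role": "user", "content": "Hello!"}],
}
payload = to_ollama_chat(request)
print(payload["model"])  # llama2
```

Because the message shapes line up, the gateway stays thin: mostly routing, logging, and small schema adjustments rather than heavy translation.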
Prerequisites:
- Docker & Docker Compose
- 8GB+ RAM recommended
- 10GB+ free disk space for models
```bash
git clone https://github.com/William0Friend/megpt.git
cd megpt
./bootstrap.sh
```

That's it! The script will:
- ✅ Check dependencies
- 🐳 Start all services
- 📦 Pull default models
- 🎉 Open your browser to http://localhost:3000
```bash
# Start services
docker compose up -d

# Pull a model
docker exec megpt-ollama ollama pull llama2

# Access the UI
open http://localhost:3000
```

| Service | URL | Purpose |
|---|---|---|
| Open WebUI | http://localhost:3000 | ChatGPT-style interface |
| Gateway API | http://localhost:8001 | FastAPI routing layer |
| Ollama API | http://localhost:11434 | LLM runtime |
- ChatGPT-style UI: Clean, familiar interface via Open WebUI
- Multi-model Support: Run Llama2, Mistral, CodeLlama, and 50+ models
- Model Swapping: Change models without restarting anything
- OpenAI-Compatible API: Use existing OpenAI SDK code
- FastAPI Gateway: Extensible routing and middleware layer
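Because the gateway speaks the OpenAI wire format, existing SDK code only needs a different `base_url`. A sketch (assumes `pip install openai`; the placeholder API key reflects an assumption that the local gateway does not validate keys):

```python
def local_client():
    # Requires: pip install openai
    from openai import OpenAI

    # Point the SDK at the local gateway instead of api.openai.com.
    # "unused" is a placeholder key; assumption: no key is checked locally.
    return OpenAI(base_url="http://localhost:8001/v1", api_key="unused")

def ask(prompt: str, model: str = "llama2") -> str:
    """Send one chat turn through the gateway and return the reply text."""
    client = local_client()
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
```

Swapping `model="llama2"` for any pulled model is all it takes to redirect the same code at a different backend.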
- Promptfoo Evals (`./evals/`): Automated prompt testing and scoring
- RAG Support (`./rag/`): Retrieval-augmented generation examples
- Model Comparison (`./model-comparison/`): Side-by-side testing tools
- Red Team Testing (`./red-team/`): Safety and security testing
- Prompt Library (`./prompts/`): Reusable prompt templates
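The core idea behind the RAG support can be sketched in a few lines: score documents against the query, then prepend the best match as context. This toy word-overlap scorer is illustrative only, not the repo's `simple_rag.py`:

```python
# Toy retrieval step behind RAG. Illustrative only; see rag/simple_rag.py
# in the repo for the real example.

def score(query: str, doc: str) -> int:
    """Naive relevance: count lowercase words shared by query and document."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, docs: list[str]) -> str:
    """Return the highest-scoring document for the query."""
    return max(docs, key=lambda d: score(query, d))

def augment(query: str, docs: list[str]) -> str:
    """Build the augmented prompt the LLM actually sees."""
    context = retrieve(query, docs)
    return f"Context: {context}\n\nQuestion: {query}"

docs = ["Ollama runs models locally", "FastAPI routes HTTP requests"]
print(augment("how does ollama run models?", docs))
```

A real pipeline replaces the word-overlap score with embedding similarity, but the retrieve-then-augment shape stays the same.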
Chat via the gateway API:

```bash
curl -X POST http://localhost:8001/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama2",
    "messages": [
      {"role": "user", "content": "Explain quantum computing in simple terms"}
    ]
  }'
```

Compare models side by side:

```bash
cd model-comparison
python compare_models.py \
  --models llama2,mistral,codellama \
  --question "Write a Python function to reverse a string"
```

Run Promptfoo evals:

```bash
cd evals
npm install -g promptfoo
promptfoo eval
promptfoo view
```

Try the RAG example:

```bash
cd rag
python simple_rag.py
```

Run red-team tests:

```bash
cd red-team
python red_team_tests.py --category safety --model llama2
```

Manage models:

```bash
# List installed models
docker exec megpt-ollama ollama list

# Pull more models
docker exec megpt-ollama ollama pull mistral
docker exec megpt-ollama ollama pull codellama
docker exec megpt-ollama ollama pull neural-chat

# Remove a model
docker exec megpt-ollama ollama rm llama2
```

Browse all available models: https://ollama.ai/library
Popular choices:
- llama2 - General purpose, balanced
- mistral - Strong reasoning, faster
- codellama - Code generation and analysis
- neural-chat - Conversational AI
- vicuna - Creative and detailed
- orca-mini - Lightweight, fast
Copy `.env.example` to `.env` and customize:
```bash
# Ollama
OLLAMA_HOST=0.0.0.0
OLLAMA_BASE_URL=http://localhost:11434

# Gateway
GATEWAY_PORT=8001
LOG_LEVEL=info

# WebUI
WEBUI_SECRET_KEY=change-me-in-production
ENABLE_SIGNUP=true

# Default models to pull on bootstrap
DEFAULT_MODELS=llama2,mistral,codellama
```

Edit `gateway/main.py` to:
- Add custom routing logic
- Implement model selection strategies
- Add logging and monitoring
- Integrate with external services
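A model selection strategy can be as small as a pure routing function. A sketch (the heuristic, hint list, and function name are made up for illustration; model names are from this README):

```python
# Illustrative model-selection strategy for the gateway: route
# code-looking prompts to codellama, everything else to a default.
# Not actual gateway/main.py code.

CODE_HINTS = ("def ", "class ", "function", "```", "bug", "refactor")

def pick_model(prompt: str, default: str = "llama2") -> str:
    """Choose a backend model based on simple keyword hints in the prompt."""
    lowered = prompt.lower()
    if any(hint in lowered for hint in CODE_HINTS):
        return "codellama"
    return default

print(pick_model("Refactor this function for me"))  # codellama
print(pick_model("Tell me a story"))                # llama2
```

Keeping the strategy a pure function makes it trivial to unit-test before wiring it into the request path.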
```
megpt/
├── docker-compose.yml     # Service orchestration
├── bootstrap.sh           # One-command setup
├── .env.example           # Environment template
├── gateway/               # FastAPI gateway service
│   ├── main.py            # Gateway implementation
│   ├── Dockerfile         # Gateway container
│   └── requirements.txt   # Python dependencies
├── prompts/               # Prompt templates
│   ├── system/            # System prompts
│   ├── user/              # User prompt templates
│   └── examples/          # Complete examples
├── evals/                 # Promptfoo evaluations
│   ├── promptfooconfig.yaml
│   └── README.md
├── rag/                   # RAG examples
│   ├── simple_rag.py
│   └── README.md
├── model-comparison/      # Model testing tools
│   ├── compare_models.py
│   └── README.md
└── red-team/              # Security testing
    ├── red_team_tests.py
    └── README.md
```
Add a new route to the gateway:
```python
# gateway/main.py
@app.post("/custom/endpoint")
async def custom_handler(request: Request):
    # Your logic here
    pass
```

Add custom middleware:

```python
from fastapi import Request

@app.middleware("http")
async def log_requests(request: Request, call_next):
    # Log or modify requests
    response = await call_next(request)
    return response
```

Common operations:

```bash
# View logs
docker compose logs -f

# Restart services
docker compose restart

# Stop everything
docker compose down

# Stop and remove volumes (clears data)
docker compose down -v

# Rebuild gateway after code changes
docker compose up -d --build gateway

# Check service health
curl http://localhost:8001/health

# List running containers
docker compose ps
```

Port conflicts:

```bash
# Check if ports are in use
lsof -i :3000
lsof -i :8001
lsof -i :11434

# View service logs
docker compose logs ollama
docker compose logs gateway
docker compose logs webui
```

Models won't pull:

```bash
# Check Ollama connectivity
curl http://localhost:11434/api/tags

# Try pulling manually
docker exec -it megpt-ollama bash
ollama pull llama2
```

Out of memory:

```bash
# Use smaller models
docker exec megpt-ollama ollama pull orca-mini
docker exec megpt-ollama ollama pull tinyllama

# Or increase Docker memory limit in Docker Desktop settings
```

Gateway errors:

```bash
# Verify gateway is running
curl http://localhost:8001/health

# Check gateway logs
docker compose logs gateway

# Restart gateway
docker compose restart gateway
```

vs. ChatGPT:
- ✅ Runs locally, no API costs
- ✅ Complete privacy, no data sent out
- ✅ Customizable, hackable, extendable
- ❌ Requires your own hardware
- ❌ Local models are smaller and less capable than GPT-4
vs. Raw Ollama:
- ✅ Better UI (Open WebUI vs CLI)
- ✅ Gateway layer for routing/logging
- ✅ Built-in eval & testing tools
- ✅ Example prompts and workflows
- ✅ Reproducible setup
vs. Other Local Solutions:
- ✅ Simpler architecture
- ✅ Faster iteration
- ✅ Better documentation
- ✅ Python-focused (easy to hack)
- ✅ Production-ready patterns
- Multi-model routing (A/B testing)
- Built-in vector database for RAG
- Observability dashboard
- Fine-tuning integration
- Cloud deployment templates
- API key management
- Rate limiting
- Cost tracking (even for local!)
- Prompt versioning
- Model performance analytics
Contributions welcome! Whether it's:
- 🐛 Bug fixes
- ✨ New features
- 📚 Documentation improvements
- 🧪 More eval examples
- 🎨 UI enhancements
See issues for ideas, or open a PR!
MIT License - do whatever you want with it!
- Ollama - LLM runtime
- Open WebUI - ChatGPT-style interface
- FastAPI - Python API framework
- Promptfoo - LLM testing and evals
- OWASP LLM Top 10 - LLM security
Built with ❤️ using:
- Ollama for LLM runtime
- Open WebUI for the interface
- FastAPI for the gateway
- Docker for packaging
Happy experimenting! 🚀
Questions? Issues? Ideas? Open an issue!