Installation

PUMA ships as a Docker Compose stack that bundles a Python runner, an Ollama inference server, and an optional Streamlit dashboard. The recommended path is Docker; a native install is also possible for advanced users who want to run PUMA directly on their host Python interpreter.

Prerequisites

Docker 24+ with the docker compose plugin.
RAM: ~16 GB minimum for small models (1.5B–3B parameters); 32 GB or more is comfortable when running 7B–8B models or many concurrent runs.
GPU (optional): NVIDIA with a recent CUDA driver, or Apple Silicon. PUMA also runs entirely on CPU; performance scales accordingly.
Disk: ~10 GB free for the base stack plus ~2–8 GB per model image.

Install (recommended path: Docker)

git clone https://github.com/pumacp/puma.git
cd puma
docker compose up -d

This starts the Ollama service and prepares the PUMA runner container. To benchmark anything you need at least one model pulled into Ollama:

docker compose exec puma_ollama ollama pull qwen2.5:3b

Verify everything is wired up correctly:

docker compose run --rm puma_runner puma preflight

preflight inspects your host, picks a hardware profile, and reports whether your environment is ready to run benchmarks.

Install (native, advanced)

A native install requires Python 3.11+ and a separately running Ollama instance on http://localhost:11434. After cloning the repo:

pip install -e .

Then run puma directly without docker compose run. See CONTRIBUTING.md for the full development setup (test runner, linters, pre-commit hooks).

Hardware detection

puma preflight selects one of fifteen hardware profiles automatically based on your detected RAM and GPU: the five baseline tiers cpu-lite, cpu-standard, gpu-entry, gpu-mid, gpu-high, plus ten Apple-Silicon variants. Each profile sets sensible defaults for concurrency, batch size, and memory budget. See Models and Datasets for the full profile matrix.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Installation

Installation

Prerequisites

Install (recommended path: Docker)

Install (native, advanced)

Hardware detection

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Clone this wiki locally