A minimal, production-ready Docker setup for Ultralytics YOLO with Python 3.13, NVIDIA GPU support, and UV package manager.
- Python 3.13: Latest Python version
- Ultralytics: State-of-the-art YOLO models for object detection
- NVIDIA GPU Support: CUDA 12.6 runtime for GPU acceleration
- UV Package Manager: Fast, modern Python package manager
- Multi-stage Build: Optimized for minimal image size
- Security: Non-root user execution
.
├── Dockerfile # Multi-stage Docker build with NVIDIA support
├── pyproject.toml # Project dependencies managed by UV
├── main.py # Example application
├── .dockerignore # Docker build optimization
└── README.md # This file
- Docker (with BuildKit support)
- NVIDIA Docker Runtime (for GPU support)
- NVIDIA GPU with CUDA support (optional, will fall back to CPU)
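
The CPU fallback mentioned in the last item can also be made explicit in application code. The following is a minimal sketch rather than this project's actual `main.py`: the helper name `pick_device` is hypothetical, and it assumes PyTorch may or may not be importable in the environment.

```python
def pick_device() -> str:
    """Return "cuda:0" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the helper also works without PyTorch
    except ImportError:
        return "cpu"
    return "cuda:0" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Selected device: {pick_device()}")
```

The returned string can be passed as the `device` argument of `model.predict(...)`, so the same code runs with or without `--gpus all`.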
# Ubuntu/Debian
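# apt-key is deprecated; store the repository key in a dedicated keyring instead
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
  sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
# Register the NVIDIA runtime with Docker
sudo nvidia-ctk runtime configure --runtime=docker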
sudo systemctl restart docker

# Build the image
docker build -t ultralytics-app:latest .
# Build with build arguments (optional)
docker build \
  --build-arg CUDA_VERSION=12.6.0 \
  -t ultralytics-app:latest .

With GPU support:
docker run --gpus all --rm ultralytics-app:latest

CPU only:
docker run --rm ultralytics-app:latest

Interactive mode:
docker run --gpus all -it --rm ultralytics-app:latest bash

Mount volumes for data/models:
docker run --gpus all --rm \
  -v $(pwd)/data:/app/data \
  -v $(pwd)/models:/app/models \
  ultralytics-app:latest

Install UV package manager:
curl -LsSf https://astral.sh/uv/install.sh | sh

Create virtual environment and install dependencies:
# Create virtual environment
uv venv --python 3.13
# Activate virtual environment
source .venv/bin/activate # Linux/Mac
# .venv\Scripts\activate # Windows
# Install dependencies
uv pip install -r pyproject.toml

Run the application locally:
python main.py

Edit pyproject.toml and rebuild:
[project]
dependencies = [
    "ultralytics>=8.3.0",
    "torch>=2.0.0",
    "your-package>=1.0.0",  # Add here
]

This setup achieves minimal image size through:
- Multi-stage builds: Separates build and runtime stages
- UV package manager: Faster dependency resolution and installation
- Minimal base image: Uses CUDA runtime (not devel) image
- Efficient caching: Optimized layer ordering
- .dockerignore: Excludes unnecessary files from build context
- No cache storage: Removes package manager caches
- Runtime-only deps: Only includes necessary runtime libraries
Typical image sizes:
- With optimization: ~4-5 GB (CUDA runtime + Python + dependencies)
- Without optimization: ~8-10 GB
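
To compare a local build against these figures, the image size can be read from the Docker CLI. This is a rough sketch under stated assumptions: the helper names `format_size` and `image_size_bytes` are hypothetical, and it assumes the `docker` CLI is on PATH and the image has already been built.

```python
import subprocess


def format_size(num_bytes: int) -> str:
    """Format a byte count as decimal gigabytes with two decimals."""
    return f"{num_bytes / 1e9:.2f} GB"


def image_size_bytes(image: str) -> int:
    """Ask the Docker CLI for the size of a local image, in bytes."""
    out = subprocess.run(
        ["docker", "image", "inspect", "--format", "{{.Size}}", image],
        check=True,
        capture_output=True,
        text=True,
    )
    return int(out.stdout.strip())


# Example (requires Docker and a built image):
#   print(format_size(image_size_bytes("ultralytics-app:latest")))
```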
To verify GPU is working inside the container:
docker run --gpus all --rm ultralytics-app:latest python -c "
import torch
print(f'CUDA Available: {torch.cuda.is_available()}')
print(f'CUDA Device: {torch.cuda.get_device_name(0) if torch.cuda.is_available() else \"N/A\"}')
"

from ultralytics import YOLO

# Load model
model = YOLO('yolov8n.pt')

# Run inference
results = model.predict('image.jpg', save=True)

# Process results
for result in results:
    boxes = result.boxes
    for box in boxes:
        print(f"Class: {box.cls}, Confidence: {box.conf}")

from ultralytics import YOLO

# Load a model
model = YOLO('yolov8n.pt')

# Train the model
results = model.train(data='coco128.yaml', epochs=100, imgsz=640)

Configure the application using environment variables:
docker run --gpus all --rm \
  -e YOLO_CONFIG_DIR=/app/.config/ultralytics \
  ultralytics-app:latest

- Verify NVIDIA drivers:
  nvidia-smi
- Check Docker GPU support:
  docker run --gpus all nvidia/cuda:12.6.0-base-ubuntu22.04 nvidia-smi
- Ensure nvidia-container-toolkit is installed
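
The first pass over these checks can be scripted on the host. The sketch below only tests whether the relevant command-line tools are on PATH; it does not confirm that the driver or the Docker runtime is configured correctly. The function names `check_tools` and `report` are hypothetical, and `nvidia-ctk` is assumed to be the CLI shipped with nvidia-container-toolkit.

```python
import shutil


def check_tools(tools=("docker", "nvidia-smi", "nvidia-ctk")) -> dict:
    """Map each tool name to True if it is found on PATH, else False."""
    return {tool: shutil.which(tool) is not None for tool in tools}


def report(status: dict) -> str:
    """Render one 'name: found' or 'name: missing' line per tool."""
    return "\n".join(
        f"{name}: {'found' if ok else 'missing'}" for name, ok in status.items()
    )


if __name__ == "__main__":
    print(report(check_tools()))
```

A tool reported as missing points to the step that needs attention before the `docker run --gpus all` test is retried.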
If inference runs out of memory, reduce the batch size or image size:
results = model.predict('image.jpg', imgsz=320)  # Smaller image size

The first run downloads the YOLO model weights (~6-400 MB depending on model). Subsequent runs use the cached weights.
- Use GPU: Always use the --gpus all flag for significant speedup
- Batch Processing: Process multiple images in batches
- Model Selection: Choose appropriate model size (n/s/m/l/x)
- Image Size: Use smaller image sizes for faster inference
- FP16 Inference: Enable half-precision (half=True) for up to roughly 2x speedup on GPUs that support it
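
Several of these tips can be combined in one helper. This is a sketch under stated assumptions rather than a tuned implementation: the names `chunked` and `predict_in_batches` are hypothetical, the default batch size of 8 is arbitrary, and it assumes `model.predict` is given a list of image paths together with its `imgsz`, `half`, and `device` arguments.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def predict_in_batches(weights: str, image_paths: Sequence[str],
                       batch_size: int = 8, imgsz: int = 640,
                       use_gpu: bool = True) -> list:
    """Run inference batch by batch and return a flat list of results."""
    from ultralytics import YOLO  # lazy import: only needed when predicting

    model = YOLO(weights)
    results = []
    for batch in chunked(image_paths, batch_size):
        # half=True requests FP16 inference, which is only useful on a GPU
        results.extend(
            model.predict(batch, imgsz=imgsz, half=use_gpu,
                          device=0 if use_gpu else "cpu")
        )
    return results
```

Lowering `batch_size` or `imgsz` trades throughput for memory, which is the same adjustment suggested for out-of-memory errors.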
This project configuration is provided as-is. Ultralytics YOLO is licensed under AGPL-3.0.