Feature/ollama service#59
Reviewer's Guide

This PR integrates an Ollama service into the development workflow by adding it to Docker Compose with GPU support and persistent storage, exposing Makefile commands for Ollama operations, and configuring VSCode to start Ollama automatically before debugging.

Flow diagram for Makefile Ollama commands

```mermaid
flowchart TD
    A["Developer runs Makefile command"] --> B{"Which command?"}
    B -->|ollama-up| C["Start Ollama container"]
    B -->|ollama-stop| D["Stop Ollama container"]
    B -->|ollama-restart| E["Restart Ollama container"]
    B -->|ollama-pull| F["Pull model in Ollama container"]
    B -->|ollama-run| G["Run model in Ollama container"]
    B -->|ollama-list| H["List models in Ollama container"]
    B -->|ollama-rm| I["Remove model from Ollama container"]
```
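The Makefile itself is not reproduced in full in this thread. As a rough sketch of what the seven targets above might look like, assuming an `ollama` service defined in `docker-compose.yml` (target names follow the diagram; the exact recipes and flags are illustrative, not copied from the PR):

```makefile
# Illustrative sketch only; the PR's actual Makefile may differ.
# Recipe lines must be indented with tabs.
.PHONY: ollama-up ollama-stop ollama-restart ollama-pull ollama-run ollama-list ollama-rm

ollama-up:        ## Start the Ollama container (reuses an existing one)
	docker compose up -d --no-recreate ollama

ollama-stop:      ## Stop the Ollama container
	docker compose stop ollama

ollama-restart:   ## Restart the Ollama container
	docker compose restart ollama

ollama-pull:      ## Pull a model, e.g. make ollama-pull MODEL=llama3
ifndef MODEL
	$(error MODEL variable is required, e.g. make ollama-pull MODEL=llama3)
endif
	docker compose exec ollama ollama pull $(MODEL)

ollama-run:       ## Run a model interactively inside the container
ifndef MODEL
	$(error MODEL variable is required, e.g. make ollama-run MODEL=llama3)
endif
	docker compose up -d --no-recreate ollama
	docker compose exec -it ollama ollama run $(MODEL)

ollama-list:      ## List models available in the container
	docker compose exec ollama ollama list

ollama-rm:        ## Remove a model from the container
ifndef MODEL
	$(error MODEL variable is required, e.g. make ollama-rm MODEL=llama3)
endif
	docker compose exec ollama ollama rm $(MODEL)
```

Usage would then be, for example, `make ollama-pull MODEL=llama3`.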
Pull Request Overview
This PR integrates Ollama as a containerized service into the development and deployment workflow. The changes add Docker Compose configuration for running Ollama with GPU support, provide developer-friendly Makefile commands for managing the service and models, and configure VSCode to automatically start Ollama before debugging.
- Added Ollama service to docker-compose.yml with GPU support and persistent storage
- Created Makefile targets for common Ollama operations (start, stop, pull, list models, etc.)
- Integrated Ollama startup into VSCode debugging workflow via preLaunchTask (see the launch configuration sketch below)
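For reference, the wiring between the debugger and the task typically looks like the excerpt below. This is a sketch, not the PR's file: the task label "Start Ollama service" and the `envFile`/`python` paths come from this review, while the configuration name, `"type": "debugpy"`, and the `app.main` module are placeholders.

```jsonc
// .vscode/launch.json (illustrative excerpt)
{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python: API",
            "type": "debugpy",
            "request": "launch",
            "module": "app.main",                  // placeholder entry point
            "envFile": "${workspaceFolder}/.env",
            "python": "${workspaceFolder}/.venv/bin/python",
            // Must match the "label" of the task defined in .vscode/tasks.json
            "preLaunchTask": "Start Ollama service"
        }
    ]
}
```

The key point is that `preLaunchTask` refers to the task by its label, so VSCode runs `make ollama-up` through that task before attaching the debugger.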
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| docker-compose.yml | Adds the Ollama service with GPU configuration and a persistent volume, and makes the API services depend on it |
| Makefile | Introduces 7 new targets for managing Ollama service lifecycle and model operations |
| .vscode/tasks.json | Defines "Start Ollama service" task that executes the Makefile ollama-up command |
| .vscode/launch.json | Configures Python debugger to automatically start Ollama service before launching |
docker-compose.yml (quoted diff):

```yaml
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```
The Ollama service configuration assumes GPU availability via NVIDIA drivers, but there's no fallback configuration for CPU-only environments. This will cause the service to fail in environments without NVIDIA GPUs or proper drivers.
Consider adding a CPU-only profile or making GPU configuration optional:
```yaml
  ollama:
    image: ollama/ollama
    container_name: ollama
    profiles:
      - gpu
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    # ... rest of config

  ollama-cpu:
    image: ollama/ollama
    container_name: ollama
    profiles:
      - cpu
    # ... config without GPU requirements
```

.vscode/launch.json (quoted diff):

```diff
             "subProcess": false,
             "envFile": "${workspaceFolder}/.env",
-            "python": "${workspaceFolder}/.venv/bin/python"
+            "python": "${workspaceFolder}/.venv/bin/python",
```
[nitpick] The preLaunchTask "Start Ollama service" will run every time you start debugging in VSCode, but the task doesn't check if Ollama is already running. The --no-recreate flag in the Makefile will prevent recreating existing containers, but the task will still execute and potentially cause delays or confusion if it fails.
Consider adding task properties to handle this better:
```jsonc
{
    "label": "Start Ollama service",
    "type": "shell",
    "command": "make",
    "args": ["ollama-up"],
    "options": {
        "cwd": "${workspaceFolder}"
    },
    "problemMatcher": [],
    "isBackground": false,
    "presentation": {
        "reveal": "silent",
        "panel": "shared"
    }
}
```

This makes the task output less intrusive when Ollama is already running.
| "python": "${workspaceFolder}/.venv/bin/python", | |
| "python": "${workspaceFolder}/.venv/bin/python", | |
| // NOTE: To avoid intrusive output and redundant execution, ensure the "Start Ollama service" task in .vscode/tasks.json uses: | |
| // "presentation": { "reveal": "silent", "panel": "shared" } |
docker-compose.yml (quoted diff):

```yaml
  ollama:
    image: ollama/ollama
    container_name: ollama
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    ports:
      - "11434:11434"
    volumes:
      - ollama:/root/.ollama
    restart: always

volumes:
  ollama:
```
The Ollama integration is missing documentation. The README should be updated to explain:
- That Ollama is now a dependency and what it's used for
- How to set up Ollama for development (including GPU requirements)
- Available Makefile commands for Ollama management
- How to configure which models to use
- How to run the API without Ollama (if applicable)
This is especially important since the Quick Start section shows docker run commands that won't include the Ollama service, potentially causing confusion for users.
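One way to reduce that confusion, offered here as a hedged suggestion rather than something present in this PR, is for the Quick Start to use Compose so both services come up together. The `aymurai-api` service name is taken from the compose snippet quoted later in this review:

```sh
# Illustrative alternative to a bare `docker run`: start Ollama and the API together,
# letting Compose handle the startup order between the two services.
docker compose up -d ollama aymurai-api
```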
docker-compose.yml (quoted diff):

```yaml
    depends_on:
      - ollama
```
The depends_on directive only ensures startup order, not readiness. The API services will start as soon as the Ollama container starts, but Ollama may not be ready to accept requests yet. Consider using a health check or wait-for script to ensure Ollama is actually ready before the API services attempt to connect to it.
Example with healthcheck:
```yaml
  ollama:
    image: ollama/ollama
    healthcheck:
      test: ["CMD", "ollama", "list"]
      interval: 10s
      timeout: 5s
      retries: 5
    # ... rest of config

  aymurai-api:
    depends_on:
      ollama:
        condition: service_healthy
```

Makefile (quoted diff):

```makefile
ifndef MODEL
	$(error MODEL variable is required, e.g. make ollama-run MODEL=llama3)
endif
	docker compose up -d --no-recreate ollama
```
The -it flag in docker compose exec -it requires an interactive terminal, which will cause this Makefile target to fail when run in non-interactive environments (CI/CD pipelines, automated scripts, VSCode tasks, etc.). Consider removing the -it flag or providing separate targets for interactive and non-interactive use.
Recommended change:
```makefile
ollama-run:
ifndef MODEL
	$(error MODEL variable is required, e.g. make ollama-run MODEL=llama3)
endif
	docker compose up -d --no-recreate ollama
	docker compose exec ollama ollama run $(MODEL)
```

Or add a separate interactive target:

```makefile
ollama-run-interactive:
ifndef MODEL
	$(error MODEL variable is required, e.g. make ollama-run-interactive MODEL=llama3)
endif
	docker compose up -d --no-recreate ollama
	docker compose exec -it ollama ollama run $(MODEL)
```

Suggested change (Makefile):

```diff
 	docker compose up -d --no-recreate ollama
+	docker compose exec ollama ollama run $(MODEL)
+
+ollama-run-interactive:
+ifndef MODEL
+	$(error MODEL variable is required, e.g. make ollama-run-interactive MODEL=llama3)
+endif
+	docker compose up -d --no-recreate ollama
```
docker-compose.yml (quoted diff):

```yaml
    depends_on:
      - ollama
```
The API services have a hard dependency on Ollama being available, but there's no evidence in the codebase that the API actually uses Ollama (no imports found, no Ollama-related configuration). This creates an unnecessary runtime dependency that will prevent the API from starting if Ollama fails.
If Ollama is optional or only used for specific features, consider:
- Making the depends_on conditional or removing it
- Implementing graceful degradation in the API code when Ollama is unavailable
- Adding connection retry logic with appropriate timeouts (see the sketch below)
If Ollama is truly required, ensure the API code properly handles connection failures and documents this dependency.
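As a rough sketch of the retry and graceful-degradation idea from the list above, assuming the API reaches Ollama at the Compose service name `ollama` on port 11434; none of the module or function names below come from this repository:

```python
"""Hypothetical readiness check for the Ollama dependency.

Names and structure are illustrative; nothing here is taken from the repository.
"""
import time
import urllib.error
import urllib.request

# Compose service name and port as defined in this PR's docker-compose.yml.
OLLAMA_URL = "http://ollama:11434"


def wait_for_ollama(timeout: float = 60.0, interval: float = 2.0) -> bool:
    """Poll Ollama's HTTP endpoint until it responds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # The server's root endpoint responds once Ollama is up and listening.
            with urllib.request.urlopen(OLLAMA_URL, timeout=5) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # not reachable yet; retry after a short pause
        time.sleep(interval)
    return False


if __name__ == "__main__":
    if wait_for_ollama():
        print("Ollama reachable: enabling LLM-backed features.")
    else:
        # Graceful degradation: the API can still start, with LLM features disabled.
        print("Ollama unreachable: continuing without LLM-backed features.")
```

A check like this could run once at startup (or per request with a cached result), letting the API come up even when the Ollama container is absent or still warming up.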
This pull request adds support for managing an Ollama service as part of the development and deployment workflow. The changes introduce a new Ollama container to the Docker Compose setup, provide Makefile commands for common Ollama operations, and integrate Ollama startup into the VSCode development environment.
Docker and service integration:
- Added an `ollama` service to `docker-compose.yml` with GPU support, persistent storage, and automatic restart, and made API services depend on it to ensure proper startup order. [1] [2] [3]

Developer tooling and automation:

- Added a VSCode task (`tasks.json`) to start the Ollama service using the new Makefile command, and configured the Python launch configuration to run this task before debugging. [1] [2]

Summary by Sourcery
Integrate an Ollama service into the development workflow by adding it to Docker Compose, providing Makefile commands for model management, and automating its startup in VSCode.
New Features: