AI-powered feline dental health assessment using consumer-grade language models
VetVision is a web application that uses a three-stage AI pipeline (VLM → VLM → LLM) to assess feline gingivitis from photographs, grading inflammation on a standardised 0–4 scale. It runs entirely on local hardware via Ollama, with no cloud API keys required.
This is a proof of concept. VetVision was born out of personal curiosity about whether consumer-grade vision language models could meaningfully assess feline oral health from photographs. While there is published research exploring AI-assisted dental grading in humans, I could not find any evidence of this approach being attempted for cats. The author is not a veterinarian. The validity of the AI's findings is subjective — my evaluation involved comparing the system's outputs against grading provided by qualified vets, but those reference assessments did not include the level of clinical detail that this tool produces. This project should not be used as a substitute for professional veterinary diagnosis. If it proves to be of value to the veterinary community, I would welcome the opportunity to collaborate with qualified individuals to develop it further.
- Upload a photo of a cat's mouth (along with optional clinical notes)
- Stage A — a vision model examines the image and assigns a gingivitis grade (0–4)
- Stage B — a second vision model pass generates a detailed clinical explanation, with the grade locked from Stage A to ensure consistency
- Stage C — a text model normalises the output into structured JSON and clean markdown
Clinical notes can influence grading — for example, noting "bleeding on manipulation" will escalate the grade to at least 3, and "spontaneous bleeding" or "ulceration" will set it to 4.
| Grade | Severity | Description |
|---|---|---|
| 0 | Healthy | No inflammation |
| 1 | Mild | Erythema at gingival margin only |
| 2 | Moderate | Erythema extending beyond margin |
| 3 | Marked | Significant inflammation or bleeding on manipulation |
| 4 | Severe | Spontaneous bleeding or ulceration |
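The clinical-note rules can be read as minimum grades applied on top of whatever the vision model returns. The following is a minimal sketch of that idea, not VetVision's actual implementation: the names `NOTE_FLOORS` and `apply_note_floors` are hypothetical, and plain substring matching is an assumption made to keep the example short.

```python
# Severity labels taken from the grading table above.
SEVERITY = {0: "Healthy", 1: "Mild", 2: "Moderate", 3: "Marked", 4: "Severe"}

# Hypothetical lookup: phrase found in the clinical notes -> minimum grade.
NOTE_FLOORS = (
    ("spontaneous bleeding", 4),
    ("ulceration", 4),
    ("bleeding on manipulation", 3),
)


def apply_note_floors(model_grade: int, notes: str = "") -> int:
    """Raise the model's grade to the highest floor implied by the notes."""
    if not 0 <= model_grade <= 4:
        raise ValueError(f"grade must be between 0 and 4, got {model_grade}")
    text = notes.lower()
    # The floor is 0 when no phrase matches, so the model's grade is kept.
    floor = max((grade for phrase, grade in NOTE_FLOORS if phrase in text), default=0)
    return max(model_grade, floor)


grade = apply_note_floors(1, "Bleeding on manipulation near upper premolars")
print(grade, SEVERITY[grade])
```

Substring matching is deliberately naive: a note such as "no ulceration" would still trigger the floor, so a real implementation would need negation handling or a structured notes field.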
You need one thing installed on your machine:
- Docker Desktop (includes Docker Compose)
That's it. Ollama (the AI inference engine) runs inside Docker alongside everything else — there is nothing else to install.
Note: The first run will download the AI model (~2 GB for ministral-3:3b). You'll need a machine with at least 8 GB of RAM, though 16 GB is recommended for comfortable performance.
# Clone the repository
git clone https://github.com/Scratchydisk/vet-vision.git
cd vet-vision
# Build everything, pull the AI model, migrate the database, seed test data, and start
make setup
The setup will take a few minutes on first run (building containers and downloading the model). Once it completes, open your browser:
| Service | URL | Description |
|---|---|---|
| Application | http://localhost:3001 | Main web interface |
| API Documentation | http://localhost:8000/docs | Interactive Swagger UI |
| Task Monitor | http://localhost:5555 | Celery Flower dashboard |
The seed script creates three accounts with different access levels:
| Email | Password | Role |
|---|---|---|
| admin@vetvision.example | admin123 | Full system access |
| analyst@vetvision.example | analyst123 | Manage cats, images, and run analyses |
| viewer@vetvision.example | viewer123 | Read-only access |
make dev # Start all services
make stop # Stop all services
make dev-logs # Follow live logs
make clean # Stop and remove all data (fresh start)
Ollama runs inside Docker and is configured automatically. The default model is ministral-3:3b, which handles both vision and text across all pipeline stages. You can swap models by editing your .env file (copied from .env.example during setup):
# Model for all pipeline stages (ministral handles both vision and text)
OLLAMA_VLM_MODEL=ministral-3:3b
OLLAMA_LLM_MODEL=ministral-3:3b
# Model parameters
AI_TEMPERATURE=0.1
AI_MAX_TOKENS=1024
OLLAMA_TIMEOUT=120.0
If you change the model, pull it into the Ollama container:
make pull-models
# or manually:
docker exec vetvision-ollama ollama pull ministral-3:8b
The Ministral 3 family delivered the best results of the models tested. Unlike the other options, Ministral handles both vision and text, so it can be used across all three pipeline stages.
| Model | Stages | Notes |
|---|---|---|
| ministral-3:8b | A, B, C | Best overall — worked very well across all stages |
| ministral-3:14b | A, B, C | Similar to the 8b, perhaps marginally better |
| ministral-3:3b | A, B, C | Adequate — a good choice if RAM is limited |
| llava | A, B | Vision only — the default if Ministral is unavailable |
| bakllava | A, B | Vision only |
| qwen2-vl | A, B | Vision only |
| llama3.2 | C | Text only — default for Stage C normalisation |
| mistral | C | Text only |
| qwen2.5 | C | Text only |
Recommendation: The default ministral-3:3b is a good starting point. For the best results, upgrade to ministral-3:8b: set both OLLAMA_VLM_MODEL and OLLAMA_LLM_MODEL to ministral-3:8b in your .env file, then run docker exec vetvision-ollama ollama pull ministral-3:8b.
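To show how the model name and the parameters in .env might reach Ollama, here is a minimal sketch that builds a request body for Ollama's `/api/generate` endpoint. How VetVision's own client constructs requests is not shown in this README, so the function name `build_generate_payload` is hypothetical, and mapping AI_MAX_TOKENS to Ollama's `num_predict` option is an assumption.

```python
import base64
from typing import Optional


def build_generate_payload(
    model: str,
    prompt: str,
    image_bytes: Optional[bytes] = None,
    temperature: float = 0.1,
    max_tokens: int = 1024,
) -> dict:
    """Build a JSON-serialisable body for POST /api/generate."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for one complete response instead of chunks
        "options": {
            "temperature": temperature,
            "num_predict": max_tokens,  # assumed mapping from AI_MAX_TOKENS
        },
    }
    if image_bytes is not None:
        # Ollama expects images as a list of base64-encoded strings.
        payload["images"] = [base64.b64encode(image_bytes).decode("ascii")]
    return payload


body = build_generate_payload("ministral-3:8b", "Grade the gingivitis in this photo.", b"\x89PNG")
print(sorted(body))
```

Text-only stages would simply omit `image_bytes`, which leaves the `images` key out of the request.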
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Stage A   │────▶│   Stage B   │────▶│   Stage C   │
│    (VLM)    │     │    (VLM)    │     │    (LLM)    │
│             │     │             │     │             │
│    Grade    │     │   Explain   │     │  Normalise  │
│  Decision   │     │  (Locked)   │     │   & Clean   │
└─────────────┘     └─────────────┘     └─────────────┘
       │                   │                   │
       ▼                   ▼                   ▼
     JSON              Narrative             Final
     Grade               Text              JSON + MD
The key design decision is separating grading from explanation. Stage A decides the grade without needing to justify it; Stage B then explains a locked grade, which prevents the explanation from inadvertently shifting the assessment.
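One way to picture the locked grade is as a value that Stage B and Stage C receive as input but cannot return. The sketch below is an illustration under that assumption, not the project's pipeline code: `run_pipeline`, `Assessment`, and the three stand-in stage functions are hypothetical, and the stand-ins return canned values so the example runs without a model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Assessment:
    grade: int
    explanation: str
    markdown: str


def run_pipeline(
    image: bytes,
    notes: str,
    grade_fn: Callable[[bytes, str], int],
    explain_fn: Callable[[bytes, str, int], str],
    normalise_fn: Callable[[int, str], str],
) -> Assessment:
    grade = grade_fn(image, notes)                 # Stage A: decide the grade
    explanation = explain_fn(image, notes, grade)  # Stage B: explain it (text only)
    markdown = normalise_fn(grade, explanation)    # Stage C: clean up the output
    # The grade stored here is always Stage A's; later stages only return text.
    return Assessment(grade=grade, explanation=explanation, markdown=markdown)


# Stand-in stages (hypothetical) in place of real model calls.
def fake_grade(image: bytes, notes: str) -> int:
    return 2


def fake_explain(image: bytes, notes: str, grade: int) -> str:
    return f"Erythema extends beyond the margin, consistent with grade {grade}. "


def fake_normalise(grade: int, explanation: str) -> str:
    return f"**Grade {grade}**\n\n{explanation.strip()}"


result = run_pipeline(b"image-bytes", "", fake_grade, fake_explain, fake_normalise)
print(result.grade)
```

Because the explanation and normalisation steps have no way to hand back a number, any wording in Stage B that hints at a different grade cannot change the stored assessment.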
- Backend: Python 3.11+, FastAPI, SQLAlchemy 2.0 (async), PostgreSQL, Celery, Redis
- Frontend: Nuxt 3, Vue 3, TypeScript, Tailwind CSS, Pinia
- AI: Ollama (local inference — no cloud APIs required)
- Infrastructure: Docker Compose for all services
vet-vision/
├── backend/
│ ├── app/
│ │ ├── main.py # FastAPI entry point
│ │ ├── config.py # Environment configuration
│ │ ├── database.py # SQLAlchemy async setup
│ │ ├── models/ # SQLAlchemy ORM models
│ │ ├── routes/ # API endpoints
│ │ ├── schemas/ # Pydantic request/response models
│ │ ├── services/ # Business logic
│ │ ├── dependencies/ # FastAPI dependencies (auth, permissions)
│ │ ├── clients/ # Ollama client integration
│ │ ├── pipeline/ # AI pipeline stages (A, B, C)
│ │ └── tasks/ # Celery async tasks
│ ├── alembic/ # Database migrations
│ └── scripts/
│ └── seed.py # Database seed script
├── frontend/
│ ├── pages/ # Route pages (file-based routing)
│ ├── layouts/ # App layouts
│ ├── components/ # Vue components
│ ├── composables/ # Vue composables
│ ├── stores/ # Pinia stores
│ └── plugins/ # Nuxt plugins
├── docker/
│ └── docker-compose.yml # Development services
├── .env.example # Environment template
└── Makefile # Development commands
Long-running operations are handled by Celery workers on dedicated queues:
| Queue | Purpose |
|---|---|
| pipeline | AI inference tasks |
| meta | Session aggregation, trend analysis |
| exports | PDF, CSV, and Parquet generation |
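Dedicated queues are usually wired up through a routing table. The sketch below uses the dictionary shape that Celery's `task_routes` setting accepts, with a small standalone resolver so it runs without Celery installed. Only the three queue names come from this README; the task-name patterns and the `queue_for` helper are hypothetical, and in a real deployment Celery does the matching itself.

```python
from fnmatch import fnmatch

# Hypothetical task-name patterns mapped to the queues listed above.
TASK_ROUTES = {
    "app.tasks.pipeline.*": {"queue": "pipeline"},
    "app.tasks.meta.*": {"queue": "meta"},
    "app.tasks.exports.*": {"queue": "exports"},
}


def queue_for(task_name: str, default: str = "celery") -> str:
    """Return the queue a task name would be routed to (illustration only)."""
    for pattern, route in TASK_ROUTES.items():
        if fnmatch(task_name, pattern):
            return route["queue"]
    # "celery" is Celery's default queue name when no route matches.
    return default


print(queue_for("app.tasks.pipeline.run_stage_a"))
```

Keeping inference on its own queue means a slow model call does not hold up exports or aggregation, since each queue can be served by separate workers.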
make migrate # Run pending migrations
make migrate-new # Create a new migration (prompts for a message)
make seed # Re-seed test data
make db-shell # Open a PostgreSQL shell
make test # Run backend tests (pytest)
make test-cov # Tests with coverage report
make lint # Ruff + MyPy (backend), ESLint (frontend)
make format # Auto-format code
If you want to work on the frontend independently:
cd frontend
npm run dev # Start Nuxt dev server
npm run lint # ESLint
npm run lint:fix # ESLint with auto-fix
npm run typecheck # Vue type checking
Authentication is JWT-based, with access tokens (60 min) and refresh tokens (30 days). There are three roles:
- Admin — full system access, user management
- Analyst — manage cats, images, collections; run analyses and exports
- Viewer — read-only access
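The three roles form a natural ordering, which suggests a simple minimum-role check. The following is a sketch under that assumption rather than the project's actual dependency code: the action names in `REQUIRED_ROLE` and the helpers `is_allowed` and `expires_at` are hypothetical, while the token lifetimes are the ones stated above.

```python
from datetime import datetime, timedelta, timezone
from enum import IntEnum


class Role(IntEnum):
    VIEWER = 1
    ANALYST = 2
    ADMIN = 3


# Hypothetical action names, each with the lowest role allowed to perform it.
REQUIRED_ROLE = {
    "read": Role.VIEWER,
    "manage_cats": Role.ANALYST,
    "run_analysis": Role.ANALYST,
    "export": Role.ANALYST,
    "manage_users": Role.ADMIN,
}

# Token lifetimes as stated in this README.
TOKEN_LIFETIME = {"access": timedelta(minutes=60), "refresh": timedelta(days=30)}


def is_allowed(role: Role, action: str) -> bool:
    """Allow an action when the role meets its minimum; deny unknown actions."""
    required = REQUIRED_ROLE.get(action)
    return required is not None and role >= required


def expires_at(kind: str, issued_at: datetime) -> datetime:
    """Compute the expiry time for an 'access' or 'refresh' token."""
    return issued_at + TOKEN_LIFETIME[kind]


print(is_allowed(Role.ANALYST, "export"), is_allowed(Role.VIEWER, "export"))
```

Denying unknown actions by default keeps a newly added endpoint closed until someone assigns it a minimum role.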
Copy .env.example to .env and adjust as needed. Key variables:
| Variable | Description | Default |
|---|---|---|
| DATABASE_URL | PostgreSQL connection string | postgresql+asyncpg://postgres:postgres@localhost:5432/vetvision |
| REDIS_URL | Redis connection string | redis://localhost:6379/0 |
| JWT_SECRET_KEY | JWT signing key | Change for production |
| SECRET_KEY | Application secret | Change for production |
| OLLAMA_VLM_MODEL | Vision model name | ministral-3:3b |
| OLLAMA_LLM_MODEL | Text model name | ministral-3:3b |
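For readers who want to see how these variables could be consumed, here is a minimal standard-library sketch that reads them with the defaults listed in this README. It is not the contents of backend/app/config.py, which may well use a different approach: the `Settings` class and `load_settings` function are hypothetical, and the two secret keys are left out because no usable default is documented for them.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    database_url: str
    redis_url: str
    vlm_model: str
    llm_model: str
    temperature: float
    max_tokens: int
    ollama_timeout: float


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping, falling back to the documented defaults."""
    return Settings(
        database_url=env.get(
            "DATABASE_URL",
            "postgresql+asyncpg://postgres:postgres@localhost:5432/vetvision",
        ),
        redis_url=env.get("REDIS_URL", "redis://localhost:6379/0"),
        vlm_model=env.get("OLLAMA_VLM_MODEL", "ministral-3:3b"),
        llm_model=env.get("OLLAMA_LLM_MODEL", "ministral-3:3b"),
        # Environment values are strings, so numeric settings are converted.
        temperature=float(env.get("AI_TEMPERATURE", "0.1")),
        max_tokens=int(env.get("AI_MAX_TOKENS", "1024")),
        ollama_timeout=float(env.get("OLLAMA_TIMEOUT", "120.0")),
    )


settings = load_settings({"OLLAMA_VLM_MODEL": "ministral-3:8b"})
print(settings.vlm_model, settings.llm_model)
```

Passing the environment in as a mapping makes the loader easy to exercise in tests without touching the real process environment.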
Ollama runs inside Docker as the vetvision-ollama container. Check it's running:
docker ps | grep ollama
If it's not running, restart the services:
make dev
Check which models are available in the Ollama container:
docker exec vetvision-ollama ollama list
Pull a model if it's missing:
docker exec vetvision-ollama ollama pull ministral-3:3b
# Full reset: removes all containers and data
make clean
# Rebuild from scratch
make setup
Performance depends on your hardware. The default ministral-3:3b model is the lightest option and should run on most machines. If you have more RAM and want better results, try ministral-3:8b.
For GPU acceleration (NVIDIA only), uncomment the deploy section in docker/docker-compose.yml and ensure you have the NVIDIA Container Toolkit installed.
This is an early-stage project and contributions are welcome. If you're a veterinary professional, researcher, or developer interested in feline dental health assessment, please open an issue or get in touch.
MIT — see LICENCE for details.