VetVision

AI-powered feline dental health assessment using consumer-grade language models

VetVision is a web application that uses a three-stage AI pipeline (VLM → VLM → LLM) to assess feline gingivitis from photographs, grading inflammation on a standardised 0–4 scale. It runs entirely on local hardware via Ollama, with no cloud API keys required.

Important Disclaimer

This is a proof of concept. VetVision grew out of personal curiosity about whether consumer-grade vision language models could meaningfully assess feline oral health from photographs. There is published research on AI-assisted dental grading in humans, but I could not find any evidence of this approach being attempted for cats.

The author is not a veterinarian, and the validity of the AI's findings remains subjective. My evaluation compared the system's outputs against grades provided by qualified vets, but those reference assessments did not include the level of clinical detail that this tool produces.

This project should not be used as a substitute for professional veterinary diagnosis. If it proves to be of value to the veterinary community, I would welcome the opportunity to collaborate with qualified individuals to develop it further.

What It Does

Upload a photo of a cat's mouth (along with optional clinical notes)
Stage A — a vision model examines the image and assigns a gingivitis grade (0–4)
Stage B — a second vision model pass generates a detailed clinical explanation, with the grade locked from Stage A to ensure consistency
Stage C — a text model normalises the output into structured JSON and clean markdown

Clinical notes can influence grading — for example, noting "bleeding on manipulation" will escalate the grade to at least 3, and "spontaneous bleeding" or "ulceration" will set it to 4.

Grading Scale

Grade	Severity	Description
0	Healthy	No inflammation
1	Mild	Erythema at gingival margin only
2	Moderate	Erythema extending beyond margin
3	Marked	Significant inflammation or bleeding on manipulation
4	Severe	Spontaneous bleeding or ulceration

Getting Started

Prerequisites

You need one thing installed on your machine:

Docker Desktop (includes Docker Compose)

That's it. Ollama (the AI inference engine) runs inside Docker alongside everything else — there is nothing else to install.

Note: The first run will download the AI model (~2 GB for ministral-3:3b). You'll need a machine with at least 8 GB of RAM, though 16 GB is recommended for comfortable performance.

One-command setup

# Clone the repository
git clone https://github.com/Scratchydisk/vet-vision.git
cd vet-vision

# Build everything, pull the AI model, migrate the database, seed test data, and start
make setup

The setup will take a few minutes on first run (building containers and downloading the model). Once it completes, open your browser:

Service	URL	Description
Application	http://localhost:3001	Main web interface
API Documentation	http://localhost:8000/docs	Interactive Swagger UI
Task Monitor	http://localhost:5555	Celery Flower dashboard

Test Accounts

The seed script creates three accounts with different access levels:

Email	Password	Role	Access
`admin@vetvision.example`	`admin123`	Admin	Full system access
`analyst@vetvision.example`	`analyst123`	Analyst	Manage cats, images, and run analyses
`viewer@vetvision.example`	`viewer123`	Viewer	Read-only access

Everyday commands

make dev          # Start all services
make stop         # Stop all services
make dev-logs     # Follow live logs
make clean        # Stop and remove all data (fresh start)

Configuring AI Models

Ollama runs inside Docker and is configured automatically. The default model is ministral-3:3b, which handles both vision and text across all pipeline stages. You can swap models by editing your .env file (copied from .env.example during setup):

# Model for all pipeline stages (ministral handles both vision and text)
OLLAMA_VLM_MODEL=ministral-3:3b
OLLAMA_LLM_MODEL=ministral-3:3b

# Model parameters
AI_TEMPERATURE=0.1
AI_MAX_TOKENS=1024
OLLAMA_TIMEOUT=120.0
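
As a rough illustration of how these variables might be consumed on the Python side, the sketch below reads them into a typed settings object with the defaults shown above. The `AISettings` class and `load_ai_settings` function are hypothetical names; the real `backend/app/config.py` may be organised quite differently.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class AISettings:
    """Hypothetical container for the model-related environment variables."""
    vlm_model: str
    llm_model: str
    temperature: float
    max_tokens: int
    timeout_seconds: float


def load_ai_settings(env: Mapping[str, str] = os.environ) -> AISettings:
    """Read the variables, falling back to the defaults listed in .env.example."""
    return AISettings(
        vlm_model=env.get("OLLAMA_VLM_MODEL", "ministral-3:3b"),
        llm_model=env.get("OLLAMA_LLM_MODEL", "ministral-3:3b"),
        temperature=float(env.get("AI_TEMPERATURE", "0.1")),
        max_tokens=int(env.get("AI_MAX_TOKENS", "1024")),
        timeout_seconds=float(env.get("OLLAMA_TIMEOUT", "120.0")),
    )
```

Passing a plain dict instead of `os.environ` makes it easy to check overrides without touching the real environment.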

If you change the model, pull it into the Ollama container:

make pull-models
# or manually:
docker exec vetvision-ollama ollama pull ministral-3:8b

Tested Models

The Ministral 3 family delivered the best results of the models tested. Unlike the other options, Ministral handles both vision and text, so it can be used across all three pipeline stages.

Model	Stages	Notes
ministral-3:8b	A, B, C	Recommended: worked very well across all stages
ministral-3:14b	A, B, C	Similar to the 8b, perhaps marginally better
ministral-3:3b	A, B, C	Adequate — a good choice if RAM is limited
llava	A, B	Vision only — the default if Ministral is unavailable
bakllava	A, B	Vision only
qwen2-vl	A, B	Vision only
llama3.2	C	Text only — default for Stage C normalisation
mistral	C	Text only
qwen2.5	C	Text only

Recommendation: The default ministral-3:3b is a good starting point. For the best results, upgrade to ministral-3:8b — set both OLLAMA_VLM_MODEL and OLLAMA_LLM_MODEL to ministral-3:8b in your .env file, then run docker exec vetvision-ollama ollama pull ministral-3:8b.

Architecture

Pipeline

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Stage A   │────▶│   Stage B   │────▶│   Stage C   │
│   (VLM)     │     │   (VLM)     │     │   (LLM)     │
│             │     │             │     │             │
│  Grade      │     │  Explain    │     │  Normalise  │
│  Decision   │     │  (Locked)   │     │  & Clean    │
└─────────────┘     └─────────────┘     └─────────────┘
      │                   │                   │
      ▼                   ▼                   ▼
    JSON              Narrative            Final
    Grade              Text              JSON + MD

The key design decision is separating grading from explanation. Stage A decides the grade without needing to justify it; Stage B then explains a locked grade, which prevents the explanation from inadvertently shifting the assessment.

Tech Stack

Backend: Python 3.11+, FastAPI, SQLAlchemy 2.0 (async), PostgreSQL, Celery, Redis
Frontend: Nuxt 3, Vue 3, TypeScript, Tailwind CSS, Pinia
AI: Ollama (local inference — no cloud APIs required)
Infrastructure: Docker Compose for all services

Project Structure

vet-vision/
├── backend/
│   ├── app/
│   │   ├── main.py              # FastAPI entry point
│   │   ├── config.py            # Environment configuration
│   │   ├── database.py          # SQLAlchemy async setup
│   │   ├── models/              # SQLAlchemy ORM models
│   │   ├── routes/              # API endpoints
│   │   ├── schemas/             # Pydantic request/response models
│   │   ├── services/            # Business logic
│   │   ├── dependencies/        # FastAPI dependencies (auth, permissions)
│   │   ├── clients/             # Ollama client integration
│   │   ├── pipeline/            # AI pipeline stages (A, B, C)
│   │   └── tasks/               # Celery async tasks
│   ├── alembic/                 # Database migrations
│   └── scripts/
│       └── seed.py              # Database seed script
├── frontend/
│   ├── pages/                   # Route pages (file-based routing)
│   ├── layouts/                 # App layouts
│   ├── components/              # Vue components
│   ├── composables/             # Vue composables
│   ├── stores/                  # Pinia stores
│   └── plugins/                 # Nuxt plugins
├── docker/
│   └── docker-compose.yml       # Development services
├── .env.example                 # Environment template
└── Makefile                     # Development commands

Background Task Queues

Long-running operations are handled by Celery workers on dedicated queues:

Queue	Purpose
`pipeline`	AI inference tasks
`meta`	Session aggregation, trend analysis
`exports`	PDF, CSV, and Parquet generation
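
Celery's `task_routes` setting accepts a mapping from task-name glob patterns to queues, so the table above could plausibly be expressed as follows. The task module paths are hypothetical, and `queue_for` is only a simplified stand-in for Celery's own router, included to show which queue a given task name would land on.

```python
from fnmatch import fnmatchcase

# Shape accepted by Celery's `task_routes` setting. The module paths
# (app.tasks.pipeline, ...) are assumptions, not the project's real names.
TASK_ROUTES = {
    "app.tasks.pipeline.*": {"queue": "pipeline"},
    "app.tasks.meta.*": {"queue": "meta"},
    "app.tasks.exports.*": {"queue": "exports"},
}


def queue_for(task_name: str, routes: dict = TASK_ROUTES,
              default: str = "celery") -> str:
    """Return the queue for a task name; 'celery' is Celery's default queue."""
    for pattern, route in routes.items():
        if fnmatchcase(task_name, pattern):
            return route["queue"]
    return default
```

Keeping inference on its own queue means slow model calls do not delay export or aggregation jobs, provided separate workers consume each queue.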

Development

Database

make migrate          # Run pending migrations
make migrate-new      # Create a new migration (prompts for a message)
make seed             # Re-seed test data
make db-shell         # Open a PostgreSQL shell

Testing and Code Quality

make test             # Run backend tests (pytest)
make test-cov         # Tests with coverage report
make lint             # Ruff + MyPy (backend), ESLint (frontend)
make format           # Auto-format code

Frontend-only Development

If you want to work on the frontend independently:

cd frontend
npm run dev           # Start Nuxt dev server
npm run lint          # ESLint
npm run lint:fix      # ESLint with auto-fix
npm run typecheck     # Vue type checking

Authentication

Authentication is JWT-based, with access tokens valid for 60 minutes and refresh tokens valid for 30 days. There are three roles:

Admin — full system access, user management
Analyst — manage cats, images, collections; run analyses and exports
Viewer — read-only access

Environment Variables

Copy .env.example to .env and adjust as needed. Key variables:

Variable	Description	Default
`DATABASE_URL`	PostgreSQL connection string	`postgresql+asyncpg://postgres:postgres@localhost:5432/vetvision`
`REDIS_URL`	Redis connection string	`redis://localhost:6379/0`
`JWT_SECRET_KEY`	JWT signing key	Change for production
`SECRET_KEY`	Application secret	Change for production
`OLLAMA_VLM_MODEL`	Vision model name	`ministral-3:3b`
`OLLAMA_LLM_MODEL`	Text model name	`ministral-3:3b`

Troubleshooting

Ollama connection errors

Ollama runs inside Docker as the vetvision-ollama container. Check it's running:

docker ps | grep ollama

If it's not running, restart the services:

make dev

Models not found

Check which models are available in the Ollama container:

docker exec vetvision-ollama ollama list

Pull a model if it's missing:

docker exec vetvision-ollama ollama pull ministral-3:3b
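
The same check can be scripted against Ollama's HTTP API, whose `/api/tags` endpoint lists locally available models. The sketch below assumes the container's port 11434 is published to the host, which may not match this project's Compose file; the helper names are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_tags(base_url: str = "http://localhost:11434",
               timeout: float = 5.0) -> dict:
    """Fetch the model list from Ollama (assumes the port is reachable)."""
    with urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return json.load(resp)


def installed_models(tags_payload: dict) -> set[str]:
    """Extract model names such as 'ministral-3:3b' from an /api/tags payload."""
    return {model["name"] for model in tags_payload.get("models", [])}


def missing_models(required: list[str], tags_payload: dict) -> list[str]:
    """Return the required models that Ollama does not have yet, in order."""
    have = installed_models(tags_payload)
    return [name for name in required if name not in have]
```

For example, `missing_models(["ministral-3:3b"], fetch_tags())` returns an empty list once the default model has been pulled. Names are compared exactly, tag included.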

Docker issues

# Full reset — removes all containers and data
make clean

# Rebuild from scratch
make setup

Slow inference

Performance depends on your hardware. The default ministral-3:3b model is the lightest option and should run on most machines. If you have more RAM and want better results, try ministral-3:8b.

For GPU acceleration (NVIDIA only), uncomment the deploy section in docker/docker-compose.yml and ensure you have the NVIDIA Container Toolkit installed.

Contributing

This is an early-stage project and contributions are welcome. If you're a veterinary professional, researcher, or developer interested in feline dental health assessment, please open an issue or get in touch.

Licence

MIT — see LICENCE for details.
