VetVision

AI-powered feline dental health assessment using consumer-grade language models

VetVision is a web application that uses a three-stage AI pipeline (VLM → VLM → LLM) to assess feline gingivitis from photographs, grading inflammation on a standardised 0–4 scale. It runs entirely on local hardware via Ollama, with no cloud API keys required.


Important Disclaimer

This is a proof of concept. VetVision was born out of personal curiosity about whether consumer-grade vision language models could meaningfully assess feline oral health from photographs. While there is published research exploring AI-assisted dental grading in humans, I could not find any evidence of this approach being attempted for cats.

The author is not a veterinarian. The validity of the AI's findings is subjective: my evaluation involved comparing the system's outputs against grading provided by qualified vets, but those reference assessments did not include the level of clinical detail that this tool produces.

This project should not be used as a substitute for professional veterinary diagnosis. If it proves to be of value to the veterinary community, I would welcome the opportunity to collaborate with qualified individuals to develop it further.


What It Does

  1. Upload a photo of a cat's mouth (along with optional clinical notes)
  2. Stage A — a vision model examines the image and assigns a gingivitis grade (0–4)
  3. Stage B — a second vision model pass generates a detailed clinical explanation, with the grade locked from Stage A to ensure consistency
  4. Stage C — a text model normalises the output into structured JSON and clean markdown

Clinical notes can influence grading — for example, noting "bleeding on manipulation" will escalate the grade to at least 3, and "spontaneous bleeding" or "ulceration" will set it to 4.
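
The escalation rules above can be sketched as a small post-processing step. This is a minimal illustration, assuming simple case-insensitive phrase matching; the names `NOTE_MINIMUM_GRADES` and `apply_note_overrides` are hypothetical and not taken from the VetVision source, which may apply these rules differently (for example, inside the Stage A prompt).

```python
# Hypothetical sketch: each phrase maps to the minimum grade it implies,
# following the rules described in the text above.
NOTE_MINIMUM_GRADES = {
    "spontaneous bleeding": 4,
    "ulceration": 4,
    "bleeding on manipulation": 3,
}


def apply_note_overrides(model_grade: int, clinical_notes: str = "") -> int:
    """Raise the model's grade to the highest minimum implied by the notes."""
    if not 0 <= model_grade <= 4:
        raise ValueError("grade must be between 0 and 4")
    notes = clinical_notes.lower()
    floor = max(
        (grade for phrase, grade in NOTE_MINIMUM_GRADES.items() if phrase in notes),
        default=0,
    )
    # Notes can only escalate a grade, never lower it.
    return max(model_grade, floor)
```

Plain substring matching does not handle negation ("no bleeding on manipulation" would still match), so a real implementation would need something more careful.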

Grading Scale

| Grade | Severity | Description |
|-------|----------|-------------|
| 0 | Healthy | No inflammation |
| 1 | Mild | Erythema at gingival margin only |
| 2 | Moderate | Erythema extending beyond margin |
| 3 | Marked | Significant inflammation or bleeding on manipulation |
| 4 | Severe | Spontaneous bleeding or ulceration |

Getting Started

Prerequisites

You need one thing installed on your machine: Docker, including Docker Compose.

That's it. Ollama (the AI inference engine) runs inside Docker alongside everything else, so there is nothing else to install.

Note: The first run will download the AI model (~2 GB for ministral-3:3b). You'll need a machine with at least 8 GB of RAM, though 16 GB is recommended for comfortable performance.

One-command setup

# Clone the repository
git clone https://github.com/Scratchydisk/vet-vision.git
cd vet-vision

# Build everything, pull the AI model, migrate the database, seed test data, and start
make setup

The setup will take a few minutes on first run (building containers and downloading the model). Once it completes, open your browser:

| Service | URL | Description |
|---------|-----|-------------|
| Application | http://localhost:3001 | Main web interface |
| API Documentation | http://localhost:8000/docs | Interactive Swagger UI |
| Task Monitor | http://localhost:5555 | Celery Flower dashboard |

Test Accounts

The seed script creates three accounts with different access levels:

| Email | Password | Role |
|-------|----------|------|
| admin@vetvision.example | admin123 | Full system access |
| analyst@vetvision.example | analyst123 | Manage cats, images, and run analyses |
| viewer@vetvision.example | viewer123 | Read-only access |

Everyday commands

make dev          # Start all services
make stop         # Stop all services
make dev-logs     # Follow live logs
make clean        # Stop and remove all data (fresh start)

Configuring AI Models

Ollama runs inside Docker and is configured automatically. The default model is ministral-3:3b, which handles both vision and text across all pipeline stages. You can swap models by editing your .env file (copied from .env.example during setup):

# Model for all pipeline stages (ministral handles both vision and text)
OLLAMA_VLM_MODEL=ministral-3:3b
OLLAMA_LLM_MODEL=ministral-3:3b

# Model parameters
AI_TEMPERATURE=0.1
AI_MAX_TOKENS=1024
OLLAMA_TIMEOUT=120.0
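
As an illustration of how these variables might be consumed, here is a minimal sketch that reads them into a typed settings object using only the standard library. The `AISettings` class and `load_ai_settings` function are hypothetical; the project's actual `config.py` may be structured differently. The fallback values mirror the ones shown above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AISettings:
    vlm_model: str
    llm_model: str
    temperature: float
    max_tokens: int
    timeout_seconds: float


def load_ai_settings(env=None) -> AISettings:
    """Read the AI settings from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    return AISettings(
        vlm_model=env.get("OLLAMA_VLM_MODEL", "ministral-3:3b"),
        llm_model=env.get("OLLAMA_LLM_MODEL", "ministral-3:3b"),
        # Environment values are strings, so numeric settings are converted here.
        temperature=float(env.get("AI_TEMPERATURE", "0.1")),
        max_tokens=int(env.get("AI_MAX_TOKENS", "1024")),
        timeout_seconds=float(env.get("OLLAMA_TIMEOUT", "120.0")),
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests.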

If you change the model, pull it into the Ollama container:

make pull-models
# or manually:
docker exec vetvision-ollama ollama pull ministral-3:8b

Tested Models

The Ministral 3 family delivered the best results of the models tested. Unlike the other options, Ministral handles both vision and text, so it can be used across all three pipeline stages.

| Model | Stages | Notes |
|-------|--------|-------|
| ministral-3:8b | A, B, C | Best overall; worked very well across all stages |
| ministral-3:14b | A, B, C | Similar to the 8b, perhaps marginally better |
| ministral-3:3b | A, B, C | Adequate; a good choice if RAM is limited |
| llava | A, B | Vision only; the default if Ministral is unavailable |
| bakllava | A, B | Vision only |
| qwen2-vl | A, B | Vision only |
| llama3.2 | C | Text only; default for Stage C normalisation |
| mistral | C | Text only |
| qwen2.5 | C | Text only |

Recommendation: The default ministral-3:3b is a good starting point. For the best results, upgrade to ministral-3:8b — set both OLLAMA_VLM_MODEL and OLLAMA_LLM_MODEL to ministral-3:8b in your .env file, then run docker exec vetvision-ollama ollama pull ministral-3:8b.


Architecture

Pipeline

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Stage A   │────▶│   Stage B   │────▶│   Stage C   │
│   (VLM)     │     │   (VLM)     │     │   (LLM)     │
│             │     │             │     │             │
│  Grade      │     │  Explain    │     │  Normalise  │
│  Decision   │     │  (Locked)   │     │  & Clean    │
└─────────────┘     └─────────────┘     └─────────────┘
      │                   │                   │
      ▼                   ▼                   ▼
    JSON              Narrative            Final
    Grade              Text              JSON + MD

The key design decision is separating grading from explanation. Stage A decides the grade without needing to justify it; Stage B then explains a locked grade, which prevents the explanation from inadvertently shifting the assessment.
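
The grade-locking idea can be sketched as follows. This is not the project's pipeline code: the stages are injected as plain callables (all names here are hypothetical) so the control flow can be run without a model, and the lock is enforced by overwriting whatever grade the final stage reports.

```python
from typing import Callable

# Each stage is a plain callable so the flow can be exercised with stubs.
GradeFn = Callable[[bytes, str], int]         # Stage A: image + notes -> grade
ExplainFn = Callable[[bytes, str, int], str]  # Stage B: image + notes + locked grade -> narrative
NormaliseFn = Callable[[int, str], dict]      # Stage C: grade + narrative -> structured result


def run_pipeline(
    image: bytes,
    notes: str,
    stage_a: GradeFn,
    stage_b: ExplainFn,
    stage_c: NormaliseFn,
) -> dict:
    grade = stage_a(image, notes)
    if grade not in range(5):
        raise ValueError(f"Stage A returned an invalid grade: {grade!r}")
    # Stage B receives the grade as an input and is only asked to explain it.
    narrative = stage_b(image, notes, grade)
    result = dict(stage_c(grade, narrative))
    # Enforce the lock: the final grade is always the one Stage A decided.
    result["grade"] = grade
    return result
```

With this shape, a later stage that drifts to a different grade cannot change the outcome, which is the property the design aims for.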

Tech Stack

  • Backend: Python 3.11+, FastAPI, SQLAlchemy 2.0 (async), PostgreSQL, Celery, Redis
  • Frontend: Nuxt 3, Vue 3, TypeScript, Tailwind CSS, Pinia
  • AI: Ollama (local inference — no cloud APIs required)
  • Infrastructure: Docker Compose for all services

Project Structure

vet-vision/
├── backend/
│   ├── app/
│   │   ├── main.py              # FastAPI entry point
│   │   ├── config.py            # Environment configuration
│   │   ├── database.py          # SQLAlchemy async setup
│   │   ├── models/              # SQLAlchemy ORM models
│   │   ├── routes/              # API endpoints
│   │   ├── schemas/             # Pydantic request/response models
│   │   ├── services/            # Business logic
│   │   ├── dependencies/        # FastAPI dependencies (auth, permissions)
│   │   ├── clients/             # Ollama client integration
│   │   ├── pipeline/            # AI pipeline stages (A, B, C)
│   │   └── tasks/               # Celery async tasks
│   ├── alembic/                 # Database migrations
│   └── scripts/
│       └── seed.py              # Database seed script
├── frontend/
│   ├── pages/                   # Route pages (file-based routing)
│   ├── layouts/                 # App layouts
│   ├── components/              # Vue components
│   ├── composables/             # Vue composables
│   ├── stores/                  # Pinia stores
│   └── plugins/                 # Nuxt plugins
├── docker/
│   └── docker-compose.yml       # Development services
├── .env.example                 # Environment template
└── Makefile                     # Development commands

Background Task Queues

Long-running operations are handled by Celery workers on dedicated queues:

| Queue | Purpose |
|-------|---------|
| pipeline | AI inference tasks |
| meta | Session aggregation, trend analysis |
| exports | PDF, CSV, and Parquet generation |
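
One way to map tasks onto these queues is a routing table keyed by task-name patterns. The sketch below uses only the standard library; the task module paths are hypothetical, and only the three queue names come from the table above. The dictionary has the same shape as Celery's `task_routes` setting, which accepts glob patterns as keys.

```python
from fnmatch import fnmatch

# Hypothetical task-name patterns; only the queue names come from the README.
TASK_ROUTES = {
    "app.tasks.pipeline.*": {"queue": "pipeline"},
    "app.tasks.meta.*": {"queue": "meta"},
    "app.tasks.exports.*": {"queue": "exports"},
}


def queue_for(task_name: str, default: str = "celery") -> str:
    """Return the queue for a task name using glob-style pattern matching."""
    for pattern, route in TASK_ROUTES.items():
        if fnmatch(task_name, pattern):
            return route["queue"]
    # "celery" is Celery's default queue name when no route matches.
    return default
```

Keeping inference on its own queue means slow model calls cannot starve export or aggregation jobs, since each queue can be served by separate workers.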

Development

Database

make migrate          # Run pending migrations
make migrate-new      # Create a new migration (prompts for a message)
make seed             # Re-seed test data
make db-shell         # Open a PostgreSQL shell

Testing and Code Quality

make test             # Run backend tests (pytest)
make test-cov         # Tests with coverage report
make lint             # Ruff + MyPy (backend), ESLint (frontend)
make format           # Auto-format code

Frontend-only Development

If you want to work on the frontend independently:

cd frontend
npm run dev           # Start Nuxt dev server
npm run lint          # ESLint
npm run lint:fix      # ESLint with auto-fix
npm run typecheck     # Vue type checking

Authentication

Authentication is JWT-based, with access tokens valid for 60 minutes and refresh tokens valid for 30 days. There are three roles:

  • Admin — full system access, user management
  • Analyst — manage cats, images, collections; run analyses and exports
  • Viewer — read-only access
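
The token lifetimes and role levels above can be modelled in a few lines. This is a sketch under stated assumptions: it treats the roles as a strict hierarchy (admin above analyst above viewer), which the README implies but does not state, and the names `ROLE_RANK`, `token_expiry`, and `has_role` are hypothetical. It does not sign or verify JWTs.

```python
from datetime import datetime, timedelta, timezone

ACCESS_TOKEN_LIFETIME = timedelta(minutes=60)
REFRESH_TOKEN_LIFETIME = timedelta(days=30)

# Assumed ordering: each role includes the permissions of the ones below it.
ROLE_RANK = {"viewer": 0, "analyst": 1, "admin": 2}


def token_expiry(kind: str, issued_at: datetime) -> datetime:
    """Return the expiry time for an 'access' or 'refresh' token."""
    lifetimes = {"access": ACCESS_TOKEN_LIFETIME, "refresh": REFRESH_TOKEN_LIFETIME}
    if kind not in lifetimes:
        raise ValueError(f"unknown token kind: {kind!r}")
    return issued_at + lifetimes[kind]


def has_role(user_role: str, required_role: str) -> bool:
    """True if the user's role is at least as privileged as the required one."""
    return ROLE_RANK[user_role] >= ROLE_RANK[required_role]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print("access token expires:", token_expiry("access", now).isoformat())
    print("analyst may view:", has_role("analyst", "viewer"))
```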

Environment Variables

Copy .env.example to .env and adjust as needed. Key variables:

| Variable | Description | Default |
|----------|-------------|---------|
| DATABASE_URL | PostgreSQL connection string | postgresql+asyncpg://postgres:postgres@localhost:5432/vetvision |
| REDIS_URL | Redis connection string | redis://localhost:6379/0 |
| JWT_SECRET_KEY | JWT signing key | Change for production |
| SECRET_KEY | Application secret | Change for production |
| OLLAMA_VLM_MODEL | Vision model name | ministral-3:3b |
| OLLAMA_LLM_MODEL | Text model name | ministral-3:3b |

Troubleshooting

Ollama connection errors

Ollama runs inside Docker as the vetvision-ollama container. Check it's running:

docker ps | grep ollama

If it's not running, restart the services:

make dev

Models not found

Check which models are available in the Ollama container:

docker exec vetvision-ollama ollama list

Pull a model if it's missing:

docker exec vetvision-ollama ollama pull ministral-3:3b
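
The same check can be scripted against Ollama's HTTP API, whose `GET /api/tags` endpoint lists locally available models. The sketch below assumes the Ollama port (11434 by default) is published to the host, which this README does not confirm; the helper names are hypothetical.

```python
import json
import urllib.request


def installed_models(tags_payload: dict) -> set[str]:
    """Extract model names from the JSON returned by Ollama's GET /api/tags."""
    return {m["name"] for m in tags_payload.get("models", [])}


def missing_models(tags_payload: dict, required: list[str]) -> list[str]:
    """Return the required model names that are not installed, in order."""
    have = installed_models(tags_payload)
    return [name for name in required if name not in have]


def fetch_tags(base_url: str = "http://localhost:11434") -> dict:
    # Assumes the Ollama port is reachable from where this script runs.
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)


if __name__ == "__main__":
    missing = missing_models(fetch_tags(), ["ministral-3:3b"])
    print("Missing models:", missing or "none")
```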

Docker issues

# Full reset — removes all containers and data
make clean

# Rebuild from scratch
make setup

Slow inference

Performance depends on your hardware. The default ministral-3:3b model is the lightest option and should run on most machines. If you have more RAM and want better results, try ministral-3:8b.

For GPU acceleration (NVIDIA only), uncomment the deploy section in docker/docker-compose.yml and ensure you have the NVIDIA Container Toolkit installed.


Contributing

This is an early-stage project and contributions are welcome. If you're a veterinary professional, researcher, or developer interested in feline dental health assessment, please open an issue or get in touch.


Licence

MIT — see LICENCE for details.
