-
Notifications
You must be signed in to change notification settings - Fork 0
08. Inference & AI
DIRD+ runs two kinds of AI entirely on-device:
- Computer vision — ONNX detection + segmentation models (WebAssembly, in the WebView).
- Language — an optional local LLM (llama.cpp, in the Rust backend) that only polishes report prose.
Neither sends data off the device. Models are downloaded once and stored locally.
Implemented in src/lib/ai/ (InferenceService, ONNXModelManager, ModelDownloader) and src/lib/analysis/.
Fundus image (any camera)
│ 1. Preprocess resize to 640×640, letterbox, normalize (÷255), RGB
│ 2. ONNX inference detection → boxes (class + confidence); segmentation → masks
│ (ONNX Runtime Web, SIMD + multi-thread, Intel/AMD/ARM profiles)
│ 3. Post-process Non-Maximum Suppression (IoU 0.45), confidence threshold
│ 4. Spatial quadrant distribution (4 zones + center), macular-edema detection,
│ analysis cup/disc ratio (OpenCV.js), spatial calibration from the optic disc
│ 5. Classify apply the active clinical guideline (see page 10)
▼ 6. Persist store detections, segmentations, classification, measurements
Detectable classes (reference models): optic_disc, fovea, hard_exudate, hemorrhage, cotton_wool_spot, microhemorrhages, edema, microaneurysm, neovascularization, venous_beading, IRMA.
Models live in the Debaq/dird_models repository (AGPL-3.0) and are downloaded on demand from Settings → AI Models. You can also plug in your own ONNX model — see 09. Model Interface.
Per-inference timings (preprocess / inference / post-process / NMS / total) are recorded locally, persist across sessions, and can be exported to JSON for benchmarking across devices.
Implemented in src-tauri/src/llm.rs (Rust, via the llama-cpp-2 crate) and surfaced in Settings → AI Models → Local assistant (src/components/settings/LocalLLMSection.tsx).
- Curated catalog of small open-weight GGUF models (SmolLM2, TinyLlama, Llama-3.2, Qwen2.5, Gemma-2, Phi-3.5; ~230 MB–2.4 GB, Q4_K_M).
-
User-driven download with resumable progress events (
llm:download_progress). No weights are bundled with the app. - In-process inference — generation runs inside the Tauri process; per-family chat templating (TinyLlama / ChatML / Llama-3 / Phi-3 / Gemma).
- Only used for report prose. See 07. Report Pipeline. It never classifies or diagnoses.
100% local: once a model is downloaded, no clinical text ever leaves the device.