Troubleshooting
Check logs:
docker compose logs winnow
Common causes:
- Missing required env var: IMMICH_URL or API_KEY not set
- Cannot reach Immich: check the URL and that Immich is running; use http:// not https:// unless you have TLS set up
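The two causes above can be checked before starting the container. The following is a minimal sketch, not part of winnow: it assumes a plain KEY=VALUE .env file, and the helper names parse_env and check_env are hypothetical.

```python
from urllib.parse import urlparse

# The two variables named as required in the list above.
REQUIRED = ("IMMICH_URL", "API_KEY")


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_env(values):
    """Return a list of problems; an empty list means the basics look fine."""
    problems = [f"{name} is not set" for name in REQUIRED if not values.get(name)]
    url = values.get("IMMICH_URL", "")
    if url:
        parsed = urlparse(url)
        # A missing scheme is a common reason Immich cannot be reached.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append("IMMICH_URL should look like http://host:port")
    return problems
```

To use it, read your .env file into a string and print whatever check_env(parse_env(text)) returns. It only validates the shape of the values; it does not contact Immich.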
Your API key has been revoked or expired. Generate a new one in Immich → Account Settings → API Keys and update API_KEY in your .env.
Set VERBOSE=true in .env (or pass -e VERBOSE=true) to enable DEBUG-level output on the console. Useful for tracing exactly which images are being fetched, filtered, and selected.
The log file always captures DEBUG regardless of VERBOSE. It is written to the output volume:
docker exec winnow cat /app/frigate_train/winnow.log
- Make sure Immich has completed face recognition and you have named people in your library
- YEARS_FILTER defaults to 10 years; increase it if your tagged photos are older
- MIN_FACE_COUNT skips people with few photos; lower or remove it
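- Confirm FRIGATE_URL is reachable from inside the container (the single quotes make the variable expand inside the container rather than on the host):
  docker exec winnow sh -c 'curl "$FRIGATE_URL/api/stats"'
- Check that you are running Frigate v0.16 or later; older versions don't have the face training API
- Set DRY_RUN=true to verify selection without uploading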
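If curl is not available, the same reachability check can be sketched with the Python standard library. This is an illustration only: stats_url and probe are hypothetical helper names, and the only endpoint assumed is the /api/stats path used above.

```python
import urllib.error
import urllib.request


def stats_url(base_url):
    """Join the base URL and /api/stats without doubling the slash."""
    return base_url.rstrip("/") + "/api/stats"


def probe(base_url, timeout=5.0):
    """Return (ok, detail) for a GET on the stats endpoint."""
    url = stats_url(base_url)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200, f"HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        return False, f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError, ValueError) as exc:
        # DNS failure, refused connection, timeout, or a malformed URL.
        return False, f"unreachable: {exc}"
```

Run it with the container's Python and pass the value of FRIGATE_URL, for example probe(os.environ["FRIGATE_URL"]) after importing os. An "unreachable" result points at networking or the URL itself rather than at the Frigate version.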
winnow downloads InsightFace and HuggingFace (SigLIP) models on first run.
- Ensure the container has internet access
- Confirm the model volume is mounted and writable
- If behind a proxy, set HTTP_PROXY / HTTPS_PROXY env vars
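One way to confirm that a mounted volume is writable is to try creating a temporary file in it. The sketch below assumes nothing about winnow's internals; is_writable_dir is a hypothetical helper, and the path you pass is whatever directory your model volume is mounted at.

```python
import os
import tempfile


def is_writable_dir(path):
    """Return True if path is an existing directory we can create files in."""
    if not os.path.isdir(path):
        return False
    try:
        # The temporary file is removed automatically when the block exits.
        with tempfile.TemporaryFile(dir=path):
            return True
    except OSError:
        return False
```

A False result for a path that exists usually means a read-only mount or a mismatch between the container user and the ownership of the host directory.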
Use the :cpu image tag (ghcr.io/sudolulo/winnow:cpu) which omits the GPU base entirely. Or set FORCE_CPU=true with any other tag to disable GPU at runtime.
Everything works on CPU but embedding computation is slower — typically tens of seconds per person instead of under a second on GPU.
Memory: Set a minimum container memory limit of 2 GB (mem_limit: 2g). If you have a large library — hundreds of images per person — allow more.
- Confirm the GPU is visible inside the container:
  docker exec winnow nvidia-smi
  If this fails, the container doesn't have GPU access. Check the deploy.resources.reservations.devices block in compose.yml and that the NVIDIA Container Toolkit is installed on the host.
- Check which ONNX providers are available:
  docker exec winnow /app/.venv/bin/python3 -c \
    "import onnxruntime as ort; ort.preload_dlls(cuda=True, cudnn=True); print(ort.get_available_providers())"
  Expected (GPU working): ['CUDAExecutionProvider', 'CPUExecutionProvider', ...]
  GPU not working: only ['CPUExecutionProvider'] listed.
- If only CPU providers appear even though nvidia-smi works, you may be running an image older than 0.2.11, in which a packaging bug caused onnxruntime-gpu to be silently overwritten. Update to fix it:
  docker compose pull && docker compose up -d
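The provider list printed above can be interpreted mechanically. This is a small sketch of that reading, not winnow's own selection logic; diagnose is a hypothetical helper that takes the list returned by onnxruntime's get_available_providers().

```python
def diagnose(providers):
    """Classify an ONNX Runtime provider list for the NVIDIA case."""
    if "CUDAExecutionProvider" in providers:
        return "gpu"        # CUDA is available to ONNX Runtime
    if "CPUExecutionProvider" in providers:
        return "cpu-only"   # ONNX Runtime loaded, but without CUDA
    return "unknown"        # empty or unexpected list


# Sample lists matching the two outcomes described above.
print(diagnose(["CUDAExecutionProvider", "CPUExecutionProvider"]))
print(diagnose(["CPUExecutionProvider"]))
```

Inside the container you could pass the live list instead, e.g. diagnose(ort.get_available_providers()) after import onnxruntime as ort. A "cpu-only" result while nvidia-smi works matches the packaging issue described in the last item.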
- Confirm devices are passed through. Your compose.yml should have:
  devices:
    - /dev/kfd
    - /dev/dri
  group_add:
    - video
    - render
- Verify ROCm sees the GPU:
  docker exec winnow /app/.venv/bin/python3 -c "import torch; print(torch.cuda.is_available(), torch.cuda.get_device_name(0))"
  ROCm exposes itself as CUDA in PyTorch, so True here means ROCm is working.
- Confirm the DRI device is passed through:
  devices:
    - /dev/dri
  group_add:
    - render
- Ensure OPENVINO_DEVICE=GPU is set. Without it, OpenVINO defaults to CPU even with device passthrough.
- Verify OpenVINO sees the device:
  docker exec winnow /app/.venv/bin/python3 -c \
    "from openvino.runtime import Core; print(Core().available_devices)"
  Expected: ['CPU', 'GPU']. If only ['CPU'] appears, the Level Zero / OpenCL runtime can't see the device; check group membership and device permissions.
winnow logs ONNX providers available: [...] and the selected execution provider at DEBUG level on every start. Enable verbose mode and inspect the log:
docker exec winnow cat /app/frigate_train/winnow.log | grep -i "provider\|execution\|cuda\|rocm\|openvino"
The upload tracker is stored in CACHE_DIR (/app/.if_cache by default). If this volume isn't persisted between runs, the tracker resets and images are re-uploaded.
Make sure /app/.if_cache is mounted to a persistent host path.
To clear the upload history for one person and start fresh:
RESET_PERSON=John
Remove this after one run. It deletes winnow-managed Frigate training files for that person, wipes their upload history, then processes normally. Manually-added Frigate files are never touched.
- Selected images are too blurry: Raise BLUR_THRESHOLD (default 120.0); e.g. 200 requires sharper images
- Too many images rejected for blur: Lower BLUR_THRESHOLD; e.g. 80 accepts blurrier images
- Too many images rejected for small face size: Lower MIN_FACE_WIDTH (default 90); e.g. 50 accepts smaller face crops
- Selected images have poor-quality face crops: Raise MIN_FACE_WIDTH; e.g. 150 requires larger face crops
- Too many images rejected for low detection confidence: Lower MIN_CONFIDENCE (default 0.7); e.g. 0.5 accepts lower-confidence detections
- Training set includes marginal detections: Raise MIN_CONFIDENCE; e.g. 0.85 requires more confident detections
- Rejected images being re-tried: Set RETRY_REJECTED=true for one run
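To see how the three thresholds interact, it can help to model them as independent minimums that a candidate image must meet. The sketch below is an illustration under that assumption, not winnow's actual filtering code: the Candidate fields and the rejection_reasons helper are hypothetical, and the exact comparison winnow uses at the boundary is not documented here. Only the default values are taken from the list above.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    blur_score: float   # hypothetical sharpness metric; higher means sharper
    face_width: int     # face crop width in pixels
    confidence: float   # detector confidence between 0 and 1


def rejection_reasons(c, blur_threshold=120.0, min_face_width=90, min_confidence=0.7):
    """Return which thresholds a candidate fails; an empty list means it passes."""
    reasons = []
    if c.blur_score < blur_threshold:
        reasons.append("blur")
    if c.face_width < min_face_width:
        reasons.append("face_width")
    if c.confidence < min_confidence:
        reasons.append("confidence")
    return reasons


sharp = Candidate(blur_score=150.0, face_width=100, confidence=0.8)
print(rejection_reasons(sharp))                      # passes the defaults
print(rejection_reasons(sharp, blur_threshold=200))  # stricter blur setting
```

Under this model, raising a threshold can only shrink the set of accepted images and lowering it can only grow it, which is why each symptom in the list maps to moving a single variable in one direction.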