Troubleshooting

sudolulo edited this page Jun 12, 2026 · 11 revisions

Container exits immediately

Check logs:

docker compose logs winnow

Common causes:

  • Missing required env var: IMMICH_URL or API_KEY not set
  • Cannot reach Immich — check the URL and that Immich is running; use http:// not https:// unless you have TLS set up
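
The first cause can be ruled out before the container starts. The sketch below is a hypothetical helper, not part of winnow. It assumes a plain KEY=value .env file and only checks that each required variable has some text after the equals sign.

```shell
# Hypothetical helper: print the required variables that are missing
# (or empty) in the given env file. Prints nothing when both are set.
missing_env_vars() {
  env_file="$1"
  for name in IMMICH_URL API_KEY; do
    # A variable counts as set only if something follows the "=".
    if ! grep -Eq "^${name}=.+" "$env_file"; then
      echo "$name"
    fi
  done
}
```

Run it as missing_env_vars .env; any name it prints still needs a value.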

"Immich API key is invalid or expired (401 Unauthorized)"

Your API key has been revoked or expired. Generate a new one in Immich → Account Settings → API Keys and update API_KEY in your .env.
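
A new key can be tested outside winnow before restarting the container. This is a minimal sketch with a hypothetical function name. It assumes your Immich release accepts the x-api-key header and exposes /api/users/me; older releases may use a different path.

```shell
# Hypothetical helper: print the HTTP status Immich returns for a key.
# Assumption: the /api/users/me endpoint and x-api-key header apply to
# the Immich version you are running.
immich_key_status() {
  immich_url="$1"
  api_key="$2"
  # -o /dev/null discards the body; -w prints only the status code.
  curl -s -o /dev/null -w '%{http_code}' \
    -H "x-api-key: $api_key" \
    "$immich_url/api/users/me"
}
```

Call it with the same values as in .env, for example immich_key_status "$IMMICH_URL" "$API_KEY". A 200 suggests the key is accepted, while a 401 matches the error above.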


Getting more log detail

Set VERBOSE=true in .env (or pass -e VERBOSE=true) to enable DEBUG-level output on the console. Useful for tracing exactly which images are being fetched, filtered, and selected.

The log file at /app/winnow.log always captures DEBUG regardless of VERBOSE:

docker exec winnow cat /app/winnow.log

"No people found" / nothing processed

  • Make sure Immich has completed face recognition and you have named people in your library
  • YEARS_FILTER defaults to 10 years — increase it if your tagged photos are older
  • MIN_FACE_COUNT skips people with few photos — lower or remove it

Frigate upload fails

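  • Confirm FRIGATE_URL is reachable from inside the container: docker exec winnow sh -c 'curl -sS "$FRIGATE_URL/api/stats"' (the single quotes make $FRIGATE_URL expand inside the container rather than in your host shell)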
  • Check that you are running Frigate v0.16 or later; older versions don't have the face training API
  • Set DRY_RUN=true to verify selection without uploading
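
The version requirement can be checked from a script. The helper below is a hypothetical sketch that compares a version string against 0.16. It assumes Frigate reports its version as plain text at /api/version, in a form such as major.minor.patch with an optional suffix.

```shell
# Hypothetical helper: succeed if a Frigate version string is 0.16 or newer.
frigate_version_ok() {
  version="$1"
  major="${version%%.*}"      # text before the first dot
  rest="${version#*.}"        # text after the first dot
  minor="${rest%%[!0-9]*}"    # leading digits of the remainder
  [ "$major" -gt 0 ] 2>/dev/null || [ "${minor:-0}" -ge 16 ] 2>/dev/null
}

# Example (assumes curl is available inside the container):
# v=$(docker exec winnow sh -c 'curl -s "$FRIGATE_URL/api/version"')
# frigate_version_ok "$v" && echo "face training API expected" || echo "upgrade Frigate"
```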

Models fail to download

winnow downloads InsightFace and HuggingFace (SigLIP) models on first run.

  • Ensure the container has internet access
  • Confirm the model volume is mounted and writable
  • If behind a proxy, set HTTP_PROXY / HTTPS_PROXY env vars

Running on CPU (no GPU)

Use the :cpu image tag (ghcr.io/sudolulo/winnow:cpu), which omits the CUDA base entirely, or set FORCE_CPU=true with :latest to disable the GPU at runtime.

Everything works on CPU but embedding computation is slower — typically tens of seconds per person instead of under a second on GPU.

Memory: CPU mode requires more host RAM than GPU mode. Set the container memory limit to at least 2 GB (mem_limit: 2g in compose), and allow more if you have a large library with hundreds of images per person. Thumbnails are processed in bounded batches of 32, so peak RAM stays flat regardless of library size, but the embedding models themselves use roughly 1–1.5 GB.
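
To see how close a run gets to the limit, you can sample the container's memory with docker stats. This is a small sketch. The function name is hypothetical and the container name winnow is taken from the commands elsewhere on this page.

```shell
# Hypothetical helper: print current memory usage and limit for the
# container (defaults to "winnow"), using docker stats in one-shot mode.
winnow_mem_usage() {
  docker stats --no-stream --format '{{.MemUsage}}' "${1:-winnow}"
}
```

Run winnow_mem_usage while embeddings are being computed. If usage sits near the limit, raise mem_limit.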


GPU not being used / "No GPU execution provider found"

If you have a GPU and the container is running but winnow reports no GPU provider:

  1. Confirm the GPU device is visible inside the container:

    docker exec winnow nvidia-smi

    If this fails, the container doesn't have GPU access — check the deploy.resources.reservations.devices block in compose.yml and that the NVIDIA container toolkit is installed on the host.

  2. Check which ONNX providers are available:

    docker exec winnow /app/.venv/bin/python3 -c \
      "import onnxruntime as ort; ort.preload_dlls(cuda=True, cudnn=True); print(ort.get_available_providers())"

    Expected (GPU working): ['CUDAExecutionProvider', 'CPUExecutionProvider', ...]
    GPU not working: ['AzureExecutionProvider', 'CPUExecutionProvider'] — only CPU providers listed.

  3. If only CPU providers appear even though nvidia-smi works, you may be running an image older than 0.2.11. A packaging bug in 0.2.10 caused the CPU onnxruntime package to silently overwrite onnxruntime-gpu, removing GPU support without any error. Update to :latest to fix it:

    docker compose pull && docker compose up -d

  4. Set VERBOSE=true and check the startup logs: winnow logs "ONNX providers available: [...]" at DEBUG level on every start.
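
The provider check in step 2 can be turned into a pass/fail test for scripts. The helper name below is hypothetical. It only looks for CUDAExecutionProvider in whatever text it is given.

```shell
# Hypothetical helper: succeed if the provider list mentions the CUDA provider.
has_cuda_provider() {
  case "$1" in
    *CUDAExecutionProvider*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example, reusing the command from step 2:
# providers=$(docker exec winnow /app/.venv/bin/python3 -c \
#   "import onnxruntime as ort; ort.preload_dlls(cuda=True, cudnn=True); print(ort.get_available_providers())")
# has_cuda_provider "$providers" && echo "GPU provider present" || echo "CPU only"
```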


Same images uploaded every run

The upload tracker is stored in CACHE_DIR (/app/.if_cache by default). If this volume isn't persisted between runs, the tracker resets and images are re-uploaded.

Make sure /app/.if_cache is mounted to a persistent host path.
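
You can confirm the mount from the host with docker inspect. This sketch uses a hypothetical function name and assumes the default CACHE_DIR of /app/.if_cache. It checks only that something is mounted at that path, not that the host side survives container re-creation.

```shell
# Hypothetical helper: succeed if the container has a mount at /app/.if_cache.
cache_dir_is_mounted() {
  container="${1:-winnow}"
  # Print one mount destination per line, then look for an exact match.
  docker inspect --format '{{range .Mounts}}{{println .Destination}}{{end}}' "$container" \
    | grep -Fqx '/app/.if_cache'
}
```

For example: cache_dir_is_mounted winnow || echo "cache directory is not mounted".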


Re-uploading a specific person

To clear the upload history for one person and start fresh:

RESET_PERSON=John

Remove this after one run — it clears the history and then processes normally.
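
If you would rather not edit .env at all, you can pass the variable to a single one-off run. This is a sketch under assumptions: the function name is hypothetical, the compose service is called winnow (as in the logs command at the top of this page), and a one-off docker compose run suits your deployment.

```shell
# Hypothetical helper: run winnow once with RESET_PERSON set for that run only.
reset_person_once() {
  person="$1"
  # -e applies to this invocation only; --rm removes the one-off container.
  docker compose run --rm -e "RESET_PERSON=$person" winnow
}
```

Because the variable is never written to .env, there is nothing to remove afterwards: reset_person_once John.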


Image quality issues

  • Images rejected as too blurry: lower BLUR_THRESHOLD (default 100); for example, 50 accepts more blur
  • Images rejected because the face is too small: lower MIN_FACE_WIDTH (default 50px)
  • Low-confidence detections included: raise MIN_CONFIDENCE (default 0.7)
  • Previously rejected images should be re-tried: set RETRY_REJECTED=true for one run
