[0.2.0] 2026-05-02
Added
- Image.FaceDetection: fast face detection with bounding boxes, confidence scores, and the five canonical facial landmarks (right eye, left eye, nose tip, right mouth corner, left mouth corner). The default model is YuNet 2023-March, hosted at opencv/face_detection_yunet (MIT licensed, ~340 KB on disk, real-time on CPU). Functions: detect/2, boxes/2, crop_largest/2, draw_boxes/3. The crop_largest/2 helper is the wire-in point for the face-aware crop bias used by the sibling image_plug (gravity: :face, ImageKit z-, Cloudflare face-zoom).
- Image.Background: class-agnostic foreground/background separation. remove/2 returns the input image with the background made transparent (alpha mask applied); mask/2 returns the foreground mask alone for custom compositing. The default model is BiRefNet lite (MIT, ~210 MB), powered by Ortex.
- Image.Captioning: natural-language description of an image. caption/2 returns a string like "a man riding a horse with a bird of prey". The default model is BLIP base (BSD-3-Clause, ~990 MB), powered by Bumblebee. It is heavy enough that it is not autostarted by default; set autostart: true, or add the child spec to your supervisor.
- Image.ZeroShot: classify an image against arbitrary labels you supply at call time, with no retraining. classify/3 returns [%{label, score}] sorted descending; label/3 returns just the best label; similarity/3 computes CLIP-space cosine similarity between two images. The default model is OpenAI CLIP ViT-B/32 (MIT, ~600 MB), powered by Bumblebee. The default prompt template "a photo of {label}" boosts accuracy on bare-noun labels; override or disable it as needed.
- New flags --background, --caption, and --zero-shot for mix image_vision.download_models to pre-fetch the new defaults.
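
To make the idea behind crop_largest/2 concrete, here is a minimal sketch of how a "largest face" crop region could be chosen from a list of detections. It does not call the library. The module name LargestFaceSketch, the detection shape %{box: {x, y, width, height}, score: float}, and the padding factor are all assumptions made for illustration, and the real result shape and cropping behaviour may differ.

```elixir
defmodule LargestFaceSketch do
  # Hypothetical detection shape: %{box: {x, y, width, height}, score: float}.
  # This is an assumption for illustration, not the library's documented format.

  # Returns the detection whose box covers the largest area, or nil if none.
  def largest([]), do: nil
  def largest(detections), do: Enum.max_by(detections, &area/1)

  # Grows the box by `padding` (a fraction of its size) on every side and
  # clamps the result to the image bounds. Returns {left, top, width, height}.
  # The padding value is illustrative only.
  def crop_region(%{box: {x, y, w, h}}, {image_w, image_h}, padding \\ 0.25) do
    pad_x = round(w * padding)
    pad_y = round(h * padding)

    left = max(x - pad_x, 0)
    top = max(y - pad_y, 0)
    right = min(x + w + pad_x, image_w)
    bottom = min(y + h + pad_y, image_h)

    {left, top, right - left, bottom - top}
  end

  defp area(%{box: {_x, _y, w, h}}), do: w * h
end

detections = [
  %{box: {10, 20, 40, 50}, score: 0.91},
  %{box: {200, 80, 120, 140}, score: 0.88}
]

face = LargestFaceSketch.largest(detections)
IO.inspect(LargestFaceSketch.crop_region(face, {640, 480}))
```

Selecting by area rather than by confidence score is one plausible reading of "largest"; a caller that prefers the most confident face could swap the key passed to Enum.max_by/2.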
Changed
- The :files list in mix.exs now ships logo.jpg so the docs render the project logo on hexdocs.pm.
See the README for the full feature list and the background, captioning, and zero-shot guides.
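
As a rough illustration of the zero-shot mechanics described for Image.ZeroShot, the sketch below expands labels through the "a photo of {label}" template, computes cosine similarity between embedding vectors, and returns label/score maps sorted descending. It is plain Elixir and does not load CLIP. The module name ZeroShotSketch, its function names, and the toy two-dimensional embeddings are assumptions made for illustration, and the scores here are raw cosine values, which may not match how the library normalises its scores.

```elixir
defmodule ZeroShotSketch do
  @default_template "a photo of {label}"

  # Expands each label into a text prompt. Passing nil disables the template
  # and returns the labels unchanged.
  def prompts(labels, template \\ @default_template)
  def prompts(labels, nil), do: labels

  def prompts(labels, template) do
    Enum.map(labels, &String.replace(template, "{label}", &1))
  end

  # Cosine similarity between two equal-length lists of floats.
  def cosine(a, b) do
    dot = a |> Enum.zip(b) |> Enum.reduce(0.0, fn {x, y}, acc -> acc + x * y end)
    dot / (norm(a) * norm(b))
  end

  # Scores each {label, text_embedding} pair against the image embedding and
  # sorts the results from highest to lowest score.
  def rank(image_embedding, label_embeddings) do
    label_embeddings
    |> Enum.map(fn {label, emb} -> %{label: label, score: cosine(image_embedding, emb)} end)
    |> Enum.sort_by(& &1.score, :desc)
  end

  defp norm(v), do: v |> Enum.reduce(0.0, fn x, acc -> acc + x * x end) |> :math.sqrt()
end

IO.inspect(ZeroShotSketch.prompts(["cat", "dog"]))

# Toy embeddings stand in for real CLIP vectors.
ranked = ZeroShotSketch.rank([1.0, 0.0], [{"cat", [0.0, 1.0]}, {"dog", [1.0, 0.0]}])
IO.inspect(hd(ranked).label)
```

Taking the head of the ranked list corresponds to the "best label" behaviour described for label/3, and the same cosine function applied to two image embeddings mirrors what similarity/3 is described as computing.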