Release v1.0.5 — Two-phase captioning (caption only viable photos) · akalavol/LoRA-Dataset-Coach

Your idea, made even smarter

You suggested: in "All" mode, do WD14 → Florence → JoyCaption in order.
That's exactly the new pipeline — but with a key improvement: the heavy
captions run last AND only on photos worth keeping.

The problem with the old behavior

In "All" mode, each image went through face → CLIP → WD14 → Florence →
JoyCaption before the pipeline moved on to the next image. JoyCaption
(~30-120s/image) therefore ran on every photo, including the ones about to
be rejected (blurry, wrong person, duplicates).

New two-phase pipeline

Phase 1 — fast analysis on all images: face detection, CLIP, quality,
WD14 tags, AI detection, artifacts → computes the viability verdict.
Phase 2 — Florence-2 / JoyCaption run only on viable / borderline
photos. Rejects are skipped entirely.
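
As a rough illustration of the two phases above, here is a minimal Python sketch. It is not the project's actual code: the names Result, run_two_phase, analyze and caption are hypothetical, and the verdict strings "viable", "borderline" and "reject" are assumed labels.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional

# Assumed verdict labels; the real app may name them differently.
CAPTION_WORTHY = ("viable", "borderline")


@dataclass
class Result:
    path: str
    verdict: str                   # "viable", "borderline", or "reject" (assumed)
    caption: Optional[str] = None  # stays None for rejects


def run_two_phase(
    paths: Iterable[str],
    analyze: Callable[[str], str],   # hypothetical fast step: returns a verdict
    caption: Callable[[str], str],   # hypothetical slow step: returns a caption
) -> List[Result]:
    # Phase 1: fast analysis on every image, producing a verdict per image.
    results = [Result(path=p, verdict=analyze(p)) for p in paths]

    # Phase 2: slow captioning only on images worth keeping.
    for r in results:
        if r.verdict in CAPTION_WORTHY:
            r.caption = caption(r.path)
    return results
```

Because phase 1 finishes for the whole dataset before any slow captioning starts, the verdicts are all available up front, which is what lets the rejects be skipped.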

Impact

On a 200-photo dataset with ~80 viable, JoyCaption now runs on 80 images
instead of 200: 120 fewer runs, or roughly 60% less time on the slow step.
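
The arithmetic behind that figure can be checked with a short Python sketch. It assumes the ~30-120s/image JoyCaption range quoted earlier applies uniformly to every image; the function name captioning_savings is made up for this example.

```python
def captioning_savings(total, captioned, sec_min=30, sec_max=120):
    """Estimate what is saved by skipping the slow captioner on rejects.

    Assumes a uniform per-image cost between sec_min and sec_max seconds.
    """
    skipped = total - captioned
    fraction = skipped / total
    minutes_saved = (skipped * sec_min / 60, skipped * sec_max / 60)
    return skipped, fraction, minutes_saved


skipped, fraction, minutes_saved = captioning_savings(total=200, captioned=80)
print(skipped, fraction, minutes_saved)  # 120 0.6 (60.0, 240.0)
```

Under those assumptions, the 200-photo example saves somewhere between one and four hours of JoyCaption time.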

Bonus:

Phase 2 has its own progress bar + live preview, so you see the fast
analysis results (and the dataset verdict) before the slow captioning.
The cache stores phase-2 captions, so if you later keep more photos, only
the newly-kept ones get captioned.
Per-target scores correctly reflect the captions.
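
The caching behavior described above (only newly-kept photos get captioned) could look roughly like the following Python sketch. The cache here is a plain dict keyed by file path, which is an assumption; the real app may key on something else, such as a file hash. The names caption_missing and caption are hypothetical.

```python
from typing import Callable, Dict, Iterable, List


def caption_missing(
    kept_paths: Iterable[str],
    cache: Dict[str, str],           # assumed shape: path -> stored caption
    caption: Callable[[str], str],   # hypothetical slow captioner
) -> List[str]:
    """Caption only the kept photos that have no cached caption yet.

    Returns the paths that were actually captioned on this call.
    """
    newly_captioned = []
    for path in kept_paths:
        if path not in cache:
            cache[path] = caption(path)
            newly_captioned.append(path)
    return newly_captioned
```

Calling it a second time with a larger kept set only runs the captioner on the additions, since earlier captions are already in the cache.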

Updating

Git: git pull
In-app: ⚙ Config → 🔄 Check now → ⬇ Install update
