Releases · mrbizarro/phosphene

11 Jun 11:13

v3.1.0

e25e433

v3.1.0 — Ideogram 4 (local text-rendering image model) Latest


Ideogram 4 — text that lands where you put it

Phosphene now runs Ideogram 4 locally on Apple Silicon. Ideogram 4 is an open-weight 9.3B image model that's exceptional at rendering real, legible text, and it comes with a visual text-placement canvas: position each line of text with a box and the model renders it there.

Generated locally in Phosphene. Every text element — title, subtitle, the four specimen labels, the footer — was placed by its caption bounding box.

Using it

Pick Ideogram 4 in Image Studio.
One-time setup: add a Hugging Face Read token in Settings → API tokens, and accept the (free, non-commercial research) license on the model page. Your first render downloads the ~26 GB weights.
Place text with the Simple or Layout canvas, or paste a raw caption.
Speed presets: Fast (12 steps) and Quality (20 steps).

Under the hood

mflux 0.18 (mflux-generate-ideogram4), gated ideogram-ai/ideogram-4-fp8 (9.3B DiT + Qwen3-VL text encoder + FLUX.2 VAE).
Fixed a gated-download auth bug so your configured token is always used for the download — it previously fell back to a stale cached hf auth login and failed with a 401 for anyone whose cache held a different account.

Update via Pinokio. Personal/research use is free; commercial use needs a license from Ideogram.


11 Jun 09:45

mrbizarro

v3.0.13

814dc2d

v3.0.13 — character LoRA fixes

Two bug fixes in the character / LoRA path — both reported by @claude3d in #5 and validated end-to-end before shipping.

Fixed

Character LoRA with a space in its filename (e.g. Annie Phosphene_v2.safetensors) was misclassified as a style-only LoRA, and the Character tab showed "No trained characters yet" — even when the sidecar was correct. The id-validation regex rejected spaces and gated the character scan before the sidecar was read. Spaces are now allowed end-to-end — classification, selection, and the /characters/<id>/… routes — so multi-word character names work with no renaming.
Character tab silently rendering plain text-to-video. Switching away from the Character tab and back left the avatar's selection ring lit while the underlying selection had been cleared — so Generate shipped no character and you got a plain T2V clip with the trigger doing nothing. The selection indicator now stays in lockstep with the actual selection, so a character that looks selected really is. (For reference: the Params panel showing "Text" is by design — the character_id drives the LoRA stack, not the mode label.)
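To illustrate the first fix, here is a minimal sketch of an id check that rejects spaces next to one that allows them. Both patterns are hypothetical (the notes do not show the real regex), as is the `is_valid_id` helper; the point is only that a strict pattern drops the example filename before any sidecar could be read, and that an id with a space still needs URL-encoding when it appears in a route.

```python
import re
from urllib.parse import quote

# Hypothetical "before" pattern: no spaces allowed anywhere in the id.
STRICT_ID = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")
# Hypothetical "after" pattern: single spaces allowed between words.
SPACED_ID = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*(?: [A-Za-z0-9._-]+)*")


def is_valid_id(name: str, pattern: re.Pattern) -> bool:
    """Return True only if the whole name matches the pattern."""
    return pattern.fullmatch(name) is not None


stem = "Annie Phosphene_v2"  # filename stem from the example above

rejected_before = not is_valid_id(stem, STRICT_ID)  # True: the space fails the check
accepted_after = is_valid_id(stem, SPACED_ID)       # True: multi-word names pass

# A space must be percent-encoded when the id is placed in a URL path segment.
route_segment = quote(stem)  # "Annie%20Phosphene_v2"
```

Path separators are still rejected by both patterns, so allowing spaces does not by itself widen what a route can address.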

Also

The prompt Enhance feature now correctly preserves the active character trigger token.

Update via Pinokio. (Ideogram 4 support is still cooking on a separate branch.)

Contributors

claude3d


31 May 15:29

mrbizarro

v3.0.7

74d2df4

v3.0.7 — GPU-race fix + version visibility

Follow-up patch closing items found in an external code review of the v3.0.6 hardening batch.

Fixes

No more image-vs-video GPU collision. The previous guard only stopped you starting an image during a video render — not the reverse. Now a single process-wide GPU gate covers both directions: if a video, image, or training render is running, a new image request waits politely instead of starting a second heavy job that could OOM a memory-constrained Mac.
The panel now reports its LTX engine version. /status carries the installed ltx-2-mlx version, so a bug report always says what's actually running (and a version mismatch is surfaced, not just buried in a log).
Internal correctness: the canonical contributor guide now matches the real ltx-2-mlx pin; the (dormant) Scenema reference server URL-encodes filenames; the VERSION stamp is corrected to 3.0.7 (the v3.0.6 tag had carried 3.0.0).
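A process-wide gate of the kind described in the first fix can be sketched with a single lock shared by every heavy job. This is an illustration under assumptions, not the panel's code: `_GPU_GATE`, `gpu_job`, and `run_jobs` are made-up names, and the real gate may queue or report waiting jobs differently.

```python
import threading
from contextlib import contextmanager

# One lock for the whole process: whoever holds it owns the GPU.
_GPU_GATE = threading.Lock()


@contextmanager
def gpu_job(kind, log):
    """Hold the gate for the duration of one heavy job (video, image, training)."""
    with _GPU_GATE:  # blocks until any job already running has finished
        log.append(f"start {kind}")
        try:
            yield
        finally:
            log.append(f"end {kind}")


def run_jobs(kinds):
    """Start one thread per job and return the order in which they ran."""
    log = []

    def worker(kind):
        with gpu_job(kind, log):
            pass  # the actual render would run here

    threads = [threading.Thread(target=worker, args=(k,)) for k in kinds]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

Because every job type takes the same lock, the guard is symmetric: it does not matter whether the image or the video request arrives first, the second one waits.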

Update

Update once from the Phosphene tile. If you're on anything older than v3.0.1, Update twice.


31 May 12:45

mrbizarro

v3.0.6

2c4a0f2

v3.0.6 — security + stability hardening

A hardening release from a full code-review pass. Every fix was adversarially verified and validated against live T2V + character-HQ renders before shipping.

Security

CivitAI token leak fixed. The LoRA-download host check (endswith("civitai.com")) also matched lookalike domains, and your CivitAI API token rode through HTTP redirects to any host. Now an exact-host allowlist plus a redirect handler that strips the auth header the moment a redirect leaves civitai.com. If you've used the CivitAI browser, this one matters.
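The difference between a suffix check and an exact-host allowlist, plus header stripping on redirect, can be sketched with the standard library. This is a minimal illustration, not the shipped code: `ALLOWED_HOSTS` (including the `www.` entry), `is_allowed_host`, and `StripAuthOnLeave` are assumed names, and the real downloader may use a different HTTP client.

```python
import urllib.request
from urllib.parse import urlsplit

# Assumed allowlist; the exact set of hosts in the real fix is not stated.
ALLOWED_HOSTS = {"civitai.com", "www.civitai.com"}


def is_allowed_host(url: str) -> bool:
    """Exact hostname match, unlike endswith(), which also matches lookalikes."""
    host = (urlsplit(url).hostname or "").lower()
    return host in ALLOWED_HOSTS


class StripAuthOnLeave(urllib.request.HTTPRedirectHandler):
    """Drop the Authorization header as soon as a redirect leaves the allowlist."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        new_req = super().redirect_request(req, fp, code, msg, headers, newurl)
        if new_req is not None and not is_allowed_host(newurl):
            new_req.remove_header("Authorization")
        return new_req


# The old suffix check accepts a lookalike domain; the exact check does not.
lookalike = "https://evilcivitai.com/file"
suffix_check_passes = urlsplit(lookalike).hostname.endswith("civitai.com")  # True
exact_check_passes = is_allowed_host(lookalike)                             # False
```

An opener built with `urllib.request.build_opener(StripAuthOnLeave())` would apply the handler to every request it makes.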

Fixes

HDR / IC-LoRA mode revived. It referenced an undefined variable and crashed at startup on every run — it had never actually worked. One-line fix.
Image-during-video OOM guard. Generating an image while a video render was in flight could run two heavy GPU jobs at once and kill the video on a memory-constrained Mac. The image request now waits politely until the render finishes.
HiDream renders can no longer hang the queue forever (watchdog deadline), and HiDream is now covered by the RAM pre-flight so a too-small Mac gets a clear message instead of an OOM.
Stats history is now crash-safe (atomic write — a crash mid-save no longer wipes accumulated data).
Smaller: orphaned temp-file cleanup at boot, tighter crash-report file permissions, a Qwen-LoRA family guard, and a CivitAI cert-bundle pin so HTTPS can't break after an update.
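The crash-safe stats write mentioned above follows a common pattern: write to a temporary file in the same directory, flush it to disk, then rename it over the target. A minimal sketch, assuming JSON storage and a hypothetical `atomic_write_json` helper (the actual file format and function names are not given in these notes):

```python
import json
import os
import tempfile


def atomic_write_json(path, data):
    """Replace `path` with `data` so readers see the old or the new file, never half of one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f)
            f.flush()
            os.fsync(f.fileno())  # make sure the bytes are on disk before renaming
        os.replace(tmp_path, path)  # atomic swap of the directory entry
    except BaseException:
        # Leave the previous file untouched and clean up the partial temp file.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

If the process dies before `os.replace`, the previous stats file is still intact, which is the property the fix describes.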

Under the hood

The helper now reports the exact ltx-2-mlx version it loaded, and warns loudly on a version mismatch — the root cause behind several recent hard-to-diagnose render failures.
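A version report of this kind can be built on `importlib.metadata`. The sketch below is an assumption about the approach, not the helper's actual code: `installed_version` and `version_warning` are hypothetical names, and the pinned value is passed in by the caller rather than hard-coded.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def version_warning(installed, pinned):
    """Return a loud message when the loaded version differs from the pin, else None."""
    if installed is None:
        return "ltx-2-mlx is not installed"
    if installed != pinned:
        return f"ltx-2-mlx version mismatch: running {installed}, expected {pinned}"
    return None
```

A caller would pass `installed_version("ltx-2-mlx")` together with whatever pin the installer records, and include the result in its status output.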

Update

Update once from the Phosphene tile. If you're on anything older than v3.0.1, Update twice.


31 May 07:30

mrbizarro

v3.0.5

32f0c46

v3.0.5 — A2V kwarg signature shim

Targeted fix for the M5 user (#5).

Fixed

`combined_image_conditionings() got an unexpected keyword argument 'frame_rate'` on installs whose upstream `ltx-2-mlx` is past v0.14.0. Some v0.14.x point release dropped the `frame_rate` kwarg; our 2026-05-13 patch was still forwarding it, so any I2V / A2V job crashed at the conditioning step.

The wrapper now probes the live signature once and routes accordingly — forwards `frame_rate` iff the upstream function accepts it, strips otherwise. Works for both pre- and post-removal upstreams.
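The probe-once approach can be sketched with `inspect.signature`. This is an illustration under assumptions: `accepts_kwarg`, `make_shim`, and the two stand-in upstream functions are hypothetical, and the real wrapper patches `combined_image_conditionings` rather than these toy functions.

```python
import functools
import inspect


def accepts_kwarg(func, name):
    """True if `func` takes `name` as a parameter or accepts arbitrary **kwargs."""
    params = inspect.signature(func).parameters
    if name in params:
        return True
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def make_shim(func, optional_kwarg="frame_rate"):
    """Wrap `func` so `optional_kwarg` is forwarded only when it is supported."""
    forward = accepts_kwarg(func, optional_kwarg)  # probed once, at wrap time

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if not forward:
            kwargs.pop(optional_kwarg, None)  # strip the kwarg the upstream dropped
        return func(*args, **kwargs)

    return wrapper


# Stand-ins for the pre- and post-removal upstream signatures.
def old_upstream(images, frame_rate=24.0):
    return ("old", len(images), frame_rate)


def new_upstream(images):
    return ("new", len(images))
```

Calling `make_shim(new_upstream)(images, frame_rate=30.0)` succeeds because the kwarg is removed before the call, while the old signature still receives it.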

Update

Update once from the Phosphene tile. If you're on anything older than v3.0.1, Update twice.


31 May 06:44

mrbizarro

v3.0.4

f59a43c

v3.0.4 — CivitAI SSL fix

Small patch.

Fixed

CivitAI search in the Image LoRA panel was failing with `SSL: CERTIFICATE_VERIFY_FAILED` on every fresh macOS install (#16). Pinokio installs Python via uv, which doesn't inherit macOS system certs. Added `SSL_CERT_FILE` and `REQUESTS_CA_BUNDLE` env vars to start.js pointing at certifi's bundle. Diagnosed and patch-recommended by @PiotrAstroCamp — thanks.
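The fix itself lives in start.js, but the idea is easy to show in Python: both variables point at one CA bundle before the process that makes HTTPS requests is spawned. The `with_ca_bundle` helper below is hypothetical, and the bundle path is passed in rather than looked up.

```python
import os


def with_ca_bundle(env, bundle_path):
    """Return a copy of `env` with both certificate variables set to one bundle."""
    out = dict(env)
    out["SSL_CERT_FILE"] = bundle_path       # read by Python's ssl module via OpenSSL
    out["REQUESTS_CA_BUNDLE"] = bundle_path  # read by the requests library
    return out


# Build the environment a child process would be started with.
child_env = with_ca_bundle(os.environ, "/path/to/cacert.pem")
```

In a real setup the path would come from `certifi.where()`, and `child_env` would be passed as the `env` argument when spawning the panel process. Because the variables are read when the process starts, a restart is needed for them to take effect.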

Update

In Pinokio: Stop the Phosphene app, then Start again. A regular Update won't be enough — Pinokio reads start.js at process spawn, not at panel boot.

Carry-over

v3.0.3: HiDream options hidden pending lab repo (#15)
v3.0.2: Boost/Turbo accel speed restored
v3.0.1: FFLF crash fix

Contributors

PiotrAstroCamp


28 May 12:13

mrbizarro

v3.0.3

0608ed1

v3.0.3 — HiDream hidden + speed fix

Small follow-up to v3.0.2.

Fixed

HiDream-O1 Image Studio engines hidden from the dropdown (#15). The engines expected a standalone clone of the HiDream-O1-MLX-LAB repo at `~/HIDREAM-O1-MLX-LAB-active/`, which hasn't been published yet — so selecting any of them crashed for everyone outside the dev machine. Hidden until the lab repo is public.

Carry-over from v3.0.2

Boost/Turbo accel speed restored. Standard T2V at turbo: 503s → ~310s (~38% faster).
FFLF crash fix from v3.0.1.

Update

Update once from the Phosphene tile. If you're on anything older than v3.0.1, Update twice.

What still works for image generation

Qwen-Image-Edit (Fast / Standard / Quality) — full Image Studio coverage. Fast (Lightning 4-step) is the default.


28 May 12:08

mrbizarro

v3.0.2

2d888ea

v3.0.2 — speed restored

Since May 9, nearly three weeks ago, Boost and Turbo have been silently doing nothing, or doing the wrong thing. Two unrelated bugs landed within hours of each other that day and cancelled each other out, so the only symptom was "renders feel slow." This release fixes both.

What was broken

The acceleration monkey-patch in mlx_warm_helper.py targeted only TextToVideoPipeline.denoise_loop. After ltx-2-mlx renamed that to DistilledPipeline (~May 9), the patch hit 3 of 9 import sites. The active Standard T2V path through _base.BasePipeline.generate was never patched — every Boost/Turbo run since May 9 silently went full denoise.
Same evening, commit 3e4bfd8 bumped protected_tail from ceil(N/3) → ceil(N/2) to fix Turbo eye/skin artifacts. On the 8-step schedule this dropped cacheable slots from 3 to 2 — meaning Turbo became identical to Boost.
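The `protected_tail` change is simple arithmetic on the step count. The sketch below only reproduces the two formulas quoted above; the `protected_tail` function signature is an assumption, and how unprotected steps map to the "cacheable slots" counts (3 and 2) is not described in these notes, so it is not modeled here.

```python
import math


def protected_tail(n_steps, divisor):
    """Size of the protected tail: ceil(N / divisor)."""
    return math.ceil(n_steps / divisor)


N = 8  # the 8-step schedule mentioned above

tail_before = protected_tail(N, 3)  # ceil(8/3) = 3
tail_after = protected_tail(N, 2)   # ceil(8/2) = 4

# Steps left outside the protected tail shrink from 5 to 4.
unprotected_before = N - tail_before
unprotected_after = N - tail_after
```

On short schedules a one-step change in the protected tail removes a large share of the remaining steps, which is consistent with Turbo losing its edge over Boost.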

What you'll see

Standard T2V, accel=turbo: 503s → ~310s (~38% faster). Boost: 503s → 364s. Compound env+turbo: ~251s (~50% faster). All v3.0.1 fixes (FFLF crash, update auto-detect) carry forward.
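The percentages above follow directly from the wall times. A short check, using only the figures quoted in this release (the `percent_faster` helper is just for illustration):

```python
def percent_faster(before_s, after_s):
    """Reduction in wall time as a percentage of the original time."""
    return (before_s - after_s) / before_s * 100


turbo = round(percent_faster(503, 310))     # 38
boost = round(percent_faster(503, 364))     # 28
compound = round(percent_faster(503, 251))  # 50
```

The Boost figure, which the notes give only as raw seconds, works out to roughly 28% faster.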

Update

Update once from the Phosphene tile. If you're on anything older than v3.0.1, Update twice.

Issue #9 reply (oo2music)

hi @oo2music — v3.0.2 just shipped. The FFLF crash fix from v3.0.1 is in there (the decode_and_stream kwarg shim into get_kf_pipe), plus a separate pair of fixes that restore Boost/Turbo speed — both had been silently no-op'ing since May 9.

You're already on v3.0, so one Update from the Phosphene tile should get you to v3.0.2. If FFLF works for you now, feel free to close this — and if anything's still off, paste a fresh trace and I'll reopen.

Issue #14 reply (ronyeoh)

hi @ronyeoh — the Update-twice path from the earlier reply should have walked you from 2.0.0 through 3.0.1; v3.0.2 just shipped on top of that.

Worth flagging: v3.0.2 also fixes two speed bugs that had been cancelling each other out since May 9 — Boost and Turbo were silently running at full denoise (or worse, doing the same thing as each other). Standard T2V at turbo goes from ~503s → ~310s after the fix. If renders felt slow on your end, that's likely why.

Update from the Phosphene tile — twice if you're still on 2.0.0, once if you made it to 3.x. Paste a fresh trace if it still won't move and I'll dig in.

Contributors

oo2music and ronyeoh


26 May 09:24

mrbizarro

v3.0.1

6305a8a

v3.0.1 — FFLF crash fix

Patch release. Two small fixes on top of v3.0.0.

Fixes

FFLF (First Frame / Last Frame keyframe interpolation) crashed with VideoDecoder.decode_and_stream() got an unexpected keyword argument 'frame_rate' on machines that pulled a post-rename upstream ltx-2-mlx. The runtime kwarg shim was being installed by every render path EXCEPT FFLF — fixed. Reported by @oo2music in #9.
Update flow now auto-detects the configured upstream so the same update.js works for both stable and internal dev clones. No user-facing change for normal Pinokio installs.

How to update

Click Update in Pinokio. If you're still on v2.x, click Update twice — first click runs the old v2 script, second click runs v3.

Contributors

oo2music


23 May 06:10

mrbizarro

v3.0.0

9b4cd98

Phosphene 3.0 — characters, voice, Image Studio, A2V

⚠️ Updating from v2.x? Click Update TWICE.

Pinokio runs your existing v2 update.js on the first click — it pulls the new code but doesn't install 3.0's new Python deps (ltx-trainer, mlx-vlm for Gemma auto-caption, mflux 0.17.5, plus a handful of transitive packages). The panel will boot to errors after the first click. Click Update again and Pinokio runs the new 3.0 script, installs everything, and the panel boots clean. Fresh installs are unaffected.

What's in 3.0

This is the biggest release since the project started. 341 commits since v2.0.5.

Character training (in-panel)

Drop 30 to 80 photos, click Train, get a face LoRA back
Add a voice clip and get a voice LoRA stacked with it
Gemma 3 12B auto-captions the dataset locally
Letterbox crop preserves wide and portrait sources
~3 hours per character on M4 Max 64 GB
Validated recipe: rank 32, alpha 32, 100 epochs, lr 1e-4, 512² square

Image Studio

Three native-MLX engines: Qwen-Image-Edit-2511, our own MLX port of HiDream-O1, and the FLUX.1 family via mflux
Multi-reference composition up to 3 subjects
HiDream-O1 ported in 5 days after upstream release. ~67 s per 1024² on 64 GB
Family-install gate at the panel level — refuses to submit if the engine binary isn't installed

Audio-to-Video

New A2V mode drives video motion from an audio reference
Works with character LoRAs

Joint audio+video stays the differentiator

LTX-2 emits a synced audio track alongside the frames in the same model pass
Most local video models (Wan, Hunyuan, Mochi, CogVideoX) are silent
Ambient/diegetic audio is where it shines; dialogue is hit-or-miss

Full panel redesign

Three top-level tabs: Video, Images, Train Character
Capability tier auto-detection. Low-RAM Macs see a clean limited surface; 64 GB+ sees everything
Character is a first-class mode pill on the Video tab, not a buried chip
Round avatars, click-to-switch, voice badge, rename/delete in one click
Vertical-player chrome moved outside the right edge so 9:16 clips aren't occluded

Performance

Q8 HQ default for character renders. ~6 min for a 7-second clip at 1024×576 on M4 Max 64 GB (was ~15 min in 2.x)
Codex skip-step optimization on Q8 HQ (~12% faster)
Adaptive wall-time estimates that learn from your machine after two renders
TeaCache wired through both Extend and A2V stage 1
Server-side panel watchdog rescues stuck MLX deallocators (was a 10+ min freeze post-decode on Balanced)

Stability + UX

Eleven UX contradictions audited and fixed (Codex C+ pass)
Load Params on any clip restores the actual seed used and reopens in the right mode (Character / Video / Image)
Mac memory pressure shown as actual pressure %, not sticky swap
50+ small fixes across the panel, helper, and installer

Engineering

341 commits since v2.0.5
ltx-2-mlx pinned to v0.14.0 (dgrauet's request — upstream breaking changes deferred)
Server-side enforcement that Q4 + character_id is rejected (was producing identity-degraded output)
New post-decode SIGKILL rescue in the panel for stuck helper processes
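Server-side enforcement of the Q4 + character rule amounts to validating the request before it reaches the render queue. A minimal sketch, assuming a dict-shaped request: the `quant` field name, the `RequestRejected` exception, and the error text are all hypothetical; only `character_id` appears in these notes.

```python
class RequestRejected(ValueError):
    """Raised when a render request combines options that must not be mixed."""


def validate_render_request(params):
    """Reject Q4 renders that also carry a character; pass everything else through."""
    quant = str(params.get("quant", "")).lower()
    if quant == "q4" and params.get("character_id"):
        raise RequestRejected("Q4 cannot be combined with a character LoRA")
    return params
```

Checking on the server rather than only in the UI means a stale or scripted client cannot submit the combination either.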

Hardware

Apple Silicon only. No Intel Mac, no Linux, no Windows.

16 / 24 GB: 512 px video, image gen works
32 GB: 768 px
64 GB+: 1024×576 video, full HD image, character training
96 / 128 GB: same caps. Model is the bottleneck, not RAM.
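The tiers above can be read as a simple lookup on installed RAM. This sketch is an interpretation, not the panel's detection code: the `capability_tier` function and its return shape are hypothetical, and treating each listed size as a lower bound (so that, for example, 48 GB falls into the 32 GB tier) is an assumption.

```python
def capability_tier(ram_gb):
    """Map installed RAM to the video cap and training availability listed above."""
    if ram_gb >= 64:
        # 96 / 128 GB get the same caps as 64 GB.
        return {"video": "1024x576", "character_training": True}
    if ram_gb >= 32:
        return {"video": "768 px", "character_training": False}
    if ram_gb >= 16:
        return {"video": "512 px", "character_training": False}
    return None  # below the smallest listed tier; the notes do not say what happens
```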

Install

One-click via Pinokio (search Phosphene). Or clone the repo and follow the README.

Credits

Lightricks for LTX-Video 2.3 (their license applies to the weights)
dgrauet/ltx-2-mlx for the MLX port that makes this possible
HiDream AI for HiDream-O1
Qwen team + mflux for Qwen-Image-Edit
Apple for MLX
The Phosphene panel itself is MIT-licensed
