Releases: mrbizarro/phosphene

v3.1.0 — Ideogram 4 (local text-rendering image model)

11 Jun 11:13

Ideogram 4 — text that lands where you put it

Phosphene now runs Ideogram 4 locally on Apple Silicon — an open-weight 9.3B image model that's exceptional at rendering real, legible text — with a visual text-placement canvas: position each line of text with a box and the model renders it there.

[Image: Ideogram 4 sample, a field-guide poster with precisely placed text]

Generated locally in Phosphene. Every text element — title, subtitle, the four specimen labels, the footer — was placed by its caption bounding box.

Using it

  • Pick Ideogram 4 in Image Studio.
  • One-time setup: add a Hugging Face Read token in Settings → API tokens, and accept the (free, non-commercial research) license on the model page. Your first render downloads the ~26 GB weights.
  • Place text with the Simple or Layout canvas, or paste a raw caption.
  • Speed presets: Fast (12 steps) and Quality (20 steps).

Under the hood

  • mflux 0.18 (mflux-generate-ideogram4), gated ideogram-ai/ideogram-4-fp8 (9.3B DiT + Qwen3-VL text encoder + FLUX.2 VAE).
  • Fixed a gated-download auth bug so your configured token is always used for the download — it previously fell back to a stale cached hf auth login and failed with a 401 for anyone whose cache held a different account.

Update via Pinokio. Personal/research use is free; commercial use needs a license from Ideogram.

v3.0.13 — character LoRA fixes

11 Jun 09:45

Two bug fixes in the character / LoRA path — both reported by @claude3d in #5 and validated end-to-end before shipping.

Fixed

  • Character LoRA with a space in its filename (e.g. Annie Phosphene_v2.safetensors) was misclassified as a style-only LoRA, and the Character tab showed "No trained characters yet" — even when the sidecar was correct. The id-validation regex rejected spaces and gated the character scan before the sidecar was read. Spaces are now allowed end-to-end — classification, selection, and the /characters/<id>/… routes — so multi-word character names work with no renaming.
  • Character tab silently rendering plain text-to-video. Switching away from the Character tab and back left the avatar's selection ring lit while the underlying selection had been cleared — so Generate shipped no character and you got a plain T2V clip with the trigger doing nothing. The selection indicator now stays in lockstep with the actual selection, so a character that looks selected really is. (For reference: the Params panel showing "Text" is by design — the character_id drives the LoRA stack, not the mode label.)
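
The space-in-filename fix can be illustrated with a small sketch. The patterns and helper names below are hypothetical and are not Phosphene's actual code; only the /characters/<id>/… route shape comes from the note above, and the tail after the id is left as a parameter.

```python
import re
from urllib.parse import quote

# Hypothetical patterns: the old one rejects spaces, the new one allows
# single interior spaces but never leading or trailing ones.
OLD_ID_RE = re.compile(r"[A-Za-z0-9_-]+")
NEW_ID_RE = re.compile(r"[A-Za-z0-9_-]+(?: [A-Za-z0-9_-]+)*")


def is_valid_character_id(char_id: str, pattern: re.Pattern = NEW_ID_RE) -> bool:
    """Return True when the whole id matches the pattern."""
    return pattern.fullmatch(char_id) is not None


def character_route(char_id: str, tail: str) -> str:
    """Build a /characters/<id>/<tail> path with the id percent-encoded."""
    if not is_valid_character_id(char_id):
        raise ValueError(f"invalid character id: {char_id!r}")
    # safe="" encodes every reserved character, so a space becomes %20.
    return f"/characters/{quote(char_id, safe='')}/{tail}"
```

Validating first and percent-encoding second keeps path separators and dots out of the id while still letting multi-word names through.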

Also

  • The prompt Enhance feature now correctly preserves the active character trigger token.

Update via Pinokio. (Ideogram 4 support is still cooking on a separate branch.)

v3.0.7 — GPU-race fix + version visibility

31 May 15:29

Follow-up patch closing items found in an external code review of the v3.0.6 hardening batch.

Fixes

  • No more image-vs-video GPU collision. The previous guard only stopped you starting an image during a video render — not the reverse. Now a single process-wide GPU gate covers both directions: if a video, image, or training render is running, a new image request waits politely instead of starting a second heavy job that could OOM a memory-constrained Mac.
  • The panel now reports its LTX engine version. /status carries the installed ltx-2-mlx version, so a bug report always says what's actually running (and a version mismatch is surfaced, not just buried in a log).
  • Internal correctness: the canonical contributor guide now matches the real ltx-2-mlx pin; the (dormant) Scenema reference server URL-encodes filenames; the VERSION stamp is corrected to 3.0.7 (the v3.0.6 tag had carried 3.0.0).

Update

Update once from the Phosphene tile. If you're on anything older than v3.0.1, Update twice.

v3.0.6 — security + stability hardening

31 May 12:45

A hardening release from a full code-review pass. Every fix was adversarially verified and validated against live T2V + character-HQ renders before shipping.

Security

  • CivitAI token leak fixed. The LoRA-download host check (endswith("civitai.com")) also matched lookalike domains, and your CivitAI API token rode through HTTP redirects to any host. Downloads now use an exact-host allowlist plus a redirect handler that strips the auth header the moment a redirect leaves civitai.com. If you've used the CivitAI browser, this one matters.
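
To make the difference between a suffix check and an exact-host check concrete, here is a minimal sketch using only the standard library. It assumes a urllib-based client and a one-entry allowlist; the actual panel code may use a different HTTP library and a longer list, and the class and function names are hypothetical.

```python
import urllib.request
from urllib.parse import urlsplit

# Assumed allowlist: the real list may hold more hosts.
ALLOWED_HOSTS = frozenset({"civitai.com"})


def is_allowed_host(url: str) -> bool:
    """Exact hostname match; a suffix test would also accept lookalikes."""
    host = urlsplit(url).hostname
    return host is not None and host in ALLOWED_HOSTS


class StripAuthOnRedirect(urllib.request.HTTPRedirectHandler):
    """Drop the Authorization header when a redirect leaves the allowlist."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        new_req = super().redirect_request(req, fp, code, msg, headers, newurl)
        if new_req is not None and not is_allowed_host(newurl):
            # Request stores header names capitalized, e.g. "Authorization".
            new_req.remove_header("Authorization")
        return new_req
```

An opener built with `urllib.request.build_opener(StripAuthOnRedirect())` would then follow redirects as usual but forward the token only to allowlisted hosts. Note that `"evilcivitai.com".endswith("civitai.com")` is true, which is the lookalike problem the exact match avoids.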

Fixes

  • HDR / IC-LoRA mode revived. It referenced an undefined variable and crashed at startup on every run — it had never actually worked. One-line fix.
  • Image-during-video OOM guard. Generating an image while a video render was in flight could run two heavy GPU jobs at once and kill the video on a memory-constrained Mac. The image request now waits politely until the render finishes.
  • HiDream renders can no longer hang the queue forever (watchdog deadline), and HiDream is now covered by the RAM pre-flight so a too-small Mac gets a clear message instead of an OOM.
  • Stats history is now crash-safe (atomic write — a crash mid-save no longer wipes accumulated data).
  • Smaller: orphaned temp-file cleanup at boot, tighter crash-report file permissions, a Qwen-LoRA family guard, and a CivitAI cert-bundle pin so HTTPS can't break after an update.
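
The crash-safe stats write follows the usual write-then-rename pattern. The sketch below assumes the history is stored as JSON; the function name is illustrative and not taken from the codebase.

```python
import json
import os
import tempfile


def atomic_write_json(path: str, data) -> None:
    """Write JSON so readers see either the old file or the new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem for os.replace
    # to be an atomic rename.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(data, handle)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        # Leave the previous file untouched and clean up the partial one.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If the process dies before `os.replace`, the previous file is still intact and only a stray temp file is left behind, which ties in with the boot-time temp-file cleanup mentioned in the same list.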

Under the hood

  • The helper now reports the exact ltx-2-mlx version it loaded, and warns loudly on a version mismatch — the root cause behind several recent hard-to-diagnose render failures.
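
One way to surface a version mismatch is to compare the installed distribution against the expected pin and put the result in the status payload. This is a sketch, not the helper's real code: the field names are made up, and it assumes the package is installed under the distribution name passed in.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed distribution version, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def version_status(package: str, expected: str) -> dict:
    """Build a status fragment that surfaces a pin mismatch."""
    found = installed_version(package)
    return {
        "package": package,
        "installed": found,
        "expected": expected,
        "mismatch": found != expected,
    }
```

A call such as `version_status("ltx-2-mlx", "0.14.0")` would use the pin named in the 3.0 notes; the exact string format of the installed version is an assumption here.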

Update

Update once from the Phosphene tile. If you're on anything older than v3.0.1, Update twice.

v3.0.5 — A2V kwarg signature shim

31 May 07:30

Targeted fix for the M5 user (#5).

Fixed

  • `combined_image_conditionings() got an unexpected keyword argument 'frame_rate'` on installs whose upstream `ltx-2-mlx` is past v0.14.0. Some v0.14.x point release dropped the `frame_rate` kwarg; our 2026-05-13 patch was still forwarding it, so any I2V / A2V job crashed at the conditioning step.

The wrapper now probes the live signature once and routes accordingly — forwards `frame_rate` iff the upstream function accepts it, strips otherwise. Works for both pre- and post-removal upstreams.
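
The probe-once approach can be sketched with `inspect.signature`. The helper names below are hypothetical, and the stand-in functions in the usage note are not the real `ltx-2-mlx` API.

```python
import inspect
from functools import lru_cache


@lru_cache(maxsize=None)
def accepts_kwarg(func, name: str) -> bool:
    """Probe the live signature once and cache the answer."""
    params = inspect.signature(func).parameters
    if name in params:
        return True
    # A **kwargs catch-all also accepts the keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def call_with_optional_kwarg(func, *args, frame_rate=None, **kwargs):
    """Forward frame_rate only when the target function accepts it."""
    if frame_rate is not None and accepts_kwarg(func, "frame_rate"):
        kwargs["frame_rate"] = frame_rate
    return func(*args, **kwargs)
```

Given a stand-in `def old_api(latents, frame_rate=24)` and a stand-in `def new_api(latents)`, the wrapper passes `frame_rate` to the first and silently drops it for the second, so the same call site works against both upstream versions.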

Update

Update once from the Phosphene tile. If you're on anything older than v3.0.1, Update twice.

v3.0.4 — CivitAI SSL fix

31 May 06:44

Small patch.

Fixed

  • CivitAI search in the Image LoRA panel was failing with `SSL: CERTIFICATE_VERIFY_FAILED` on every fresh macOS install (#16). Pinokio installs Python via uv, which doesn't inherit macOS system certs. Added `SSL_CERT_FILE` and `REQUESTS_CA_BUNDLE` env vars to start.js pointing at certifi's bundle. Diagnosed and patch-recommended by @PiotrAstroCamp — thanks.
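
The actual fix lives in start.js, but the same idea can be sketched in Python for anyone launching the helper by hand. The function name is hypothetical, and the bundle path is passed in rather than looked up.

```python
import os
from typing import Mapping, Optional


def with_ca_bundle(bundle_path: str,
                   base_env: Optional[Mapping[str, str]] = None) -> dict:
    """Return a copy of the environment with both CA variables set."""
    env = dict(os.environ if base_env is None else base_env)
    # SSL_CERT_FILE is read by OpenSSL-backed clients, including
    # Python's default SSL context.
    env["SSL_CERT_FILE"] = bundle_path
    # REQUESTS_CA_BUNDLE is read by the requests library.
    env["REQUESTS_CA_BUNDLE"] = bundle_path
    return env
```

In practice the path would typically come from `certifi.where()`, and the returned mapping would be passed as `env=` when spawning the child process. Because the variables are read at process start, they only take effect after a full restart, which matches the Stop/Start instruction below.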

Update

In Pinokio: Stop the Phosphene app, then Start again. A regular Update won't be enough — Pinokio reads start.js at process spawn, not at panel boot.

Carry-over

  • v3.0.3: HiDream options hidden pending lab repo (#15)
  • v3.0.2: Boost/Turbo accel speed restored
  • v3.0.1: FFLF crash fix

v3.0.3 — HiDream hidden + speed fix

28 May 12:13

Small follow-up to v3.0.2.

Fixed

  • HiDream-O1 Image Studio engines hidden from the dropdown (#15). The engines expected a standalone clone of the HiDream-O1-MLX-LAB repo at `~/HIDREAM-O1-MLX-LAB-active/`, which hasn't been published yet — so selecting any of them crashed for everyone outside the dev machine. Hidden until the lab repo is public.

Carry-over from v3.0.2

  • Boost/Turbo accel speed restored. Standard T2V at turbo: 503s → ~302s (−40%).
  • FFLF crash fix from v3.0.1.

Update

Update once from the Phosphene tile. If you're on anything older than v3.0.1, Update twice.

What still works for image generation

  • Qwen-Image-Edit (Fast / Standard / Quality) — full Image Studio coverage. Fast (Lightning 4-step) is the default.

v3.0.2 — speed restored

28 May 12:08

Since May 9, Boost and Turbo have been silently doing nothing, or doing the wrong thing. Two unrelated bugs landed within hours of each other that day and cancelled each other out, so the only symptom was "renders feel slow." This release fixes both.

What was broken

  • The acceleration monkey-patch in mlx_warm_helper.py targeted only TextToVideoPipeline.denoise_loop. After ltx-2-mlx renamed that to DistilledPipeline (~May 9), the patch hit 3 of 9 import sites. The active Standard T2V path through _base.BasePipeline.generate was never patched — every Boost/Turbo run since May 9 silently went full denoise.
  • Same evening, commit 3e4bfd8 bumped protected_tail from ceil(N/3) to ceil(N/2) to fix Turbo eye/skin artifacts. On the 8-step schedule this dropped cacheable slots from 3 to 2, which made Turbo identical to Boost.

What you'll see

Standard T2V, accel=turbo: 503s → ~310s (~38% faster). Boost: 503s → 364s. Compound env+turbo: ~251s (~50% faster). All v3.0.1 fixes (FFLF crash, update auto-detect) carry forward.
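
The percentages above are reductions in wall time relative to the 503 s baseline. A quick check of the arithmetic, using only the timings quoted in this section:

```python
def time_saved_pct(before_s: float, after_s: float) -> float:
    """Percentage of wall time removed, rounded to one decimal."""
    return round((before_s - after_s) / before_s * 100, 1)


baseline = 503
for label, after in [("turbo", 310), ("boost", 364), ("env+turbo", 251)]:
    print(f"{label}: {time_saved_pct(baseline, after)}% less wall time")
```

Turbo and the compound setting come out at roughly 38% and 50%, as stated, and Boost works out to about 28%.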

Update

Update once from the Phosphene tile. If you're on anything older than v3.0.1, Update twice.


Issue #9 reply (oo2music)

hi @oo2music — v3.0.2 just shipped. The FFLF crash fix from v3.0.1 is in there (the decode_and_stream kwarg shim into get_kf_pipe), plus a separate pair of fixes that restore Boost/Turbo speed — both had been silently no-op'ing since May 9.

You're already on v3.0, so one Update from the Phosphene tile should get you to v3.0.2. If FFLF works for you now, feel free to close this — and if anything's still off, paste a fresh trace and I'll reopen.


Issue #14 reply (ronyeoh)

hi @ronyeoh — the Update-twice path from the earlier reply should have walked you from 2.0.0 through 3.0.1; v3.0.2 just shipped on top of that.

Worth flagging: v3.0.2 also fixes two speed bugs that had been cancelling each other out since May 9 — Boost and Turbo were silently running at full denoise (or worse, doing the same thing as each other). Standard T2V at turbo goes from ~503s → ~310s after the fix. If renders felt slow on your end, that's likely why.

Update from the Phosphene tile — twice if you're still on 2.0.0, once if you made it to 3.x. Paste a fresh trace if it still won't move and I'll dig in.

v3.0.1 — FFLF crash fix

26 May 09:24

Patch release. Two small fixes on top of v3.0.0.

Fixes

  • FFLF (First Frame / Last Frame keyframe interpolation) crashed with VideoDecoder.decode_and_stream() got an unexpected keyword argument 'frame_rate' on machines that pulled a post-rename upstream ltx-2-mlx. The runtime kwarg shim was being installed by every render path EXCEPT FFLF — fixed. Reported by @oo2music in #9.

  • Update flow now auto-detects the configured upstream so the same update.js works for both stable and internal dev clones. No user-facing change for normal Pinokio installs.

How to update

Click Update in Pinokio. If you're still on v2.x, click Update twice — first click runs the old v2 script, second click runs v3.

Phosphene 3.0 — characters, voice, Image Studio, A2V

23 May 06:10

⚠️ Updating from v2.x? Click Update TWICE.

Pinokio runs your existing v2 update.js on the first click — it pulls the new code but doesn't install 3.0's new Python deps (ltx-trainer, mlx-vlm for Gemma auto-caption, mflux 0.17.5, plus a handful of transitive packages). The panel will boot to errors after the first click. Click Update again and Pinokio runs the new 3.0 script, installs everything, and the panel boots clean. Fresh installs are unaffected.

What's in 3.0

This is the biggest release since the project started. 341 commits since v2.0.5.

Character training (in-panel)

  • Drop 30 to 80 photos, click Train, get a face LoRA back
  • Add a voice clip and get a voice LoRA stacked with it
  • Gemma 3 12B auto-captions the dataset locally
  • Letterbox crop preserves wide and portrait sources
  • ~3 hours per character on M4 Max 64 GB
  • Validated recipe: rank 32, alpha 32, 100 epochs, lr 1e-4, 512² square

Image Studio

  • Three native-MLX engines: Qwen-Image-Edit-2511, our own MLX port of HiDream-O1, and the FLUX.1 family via mflux
  • Multi-reference composition up to 3 subjects
  • HiDream-O1 ported in 5 days after upstream release. ~67 s per 1024² on 64 GB
  • Family-install gate at the panel level — refuses to submit if the engine binary isn't installed

Audio-to-Video

  • New A2V mode drives video motion from an audio reference
  • Works with character LoRAs

Joint audio+video stays the differentiator

  • LTX-2 emits a synced audio track alongside the frames in the same model pass
  • Most local video models (Wan, Hunyuan, Mochi, CogVideoX) are silent
  • Ambient/diegetic audio is where it shines; dialogue is hit-or-miss

Full panel redesign

  • Three top-level tabs: Video, Images, Train Character
  • Capability tier auto-detection. Low-RAM Macs see a clean limited surface; 64 GB+ sees everything
  • Character is a first-class mode pill on the Video tab, not a buried chip
  • Round avatars, click-to-switch, voice badge, rename/delete in one click
  • Vertical-player chrome moved outside the right edge so 9:16 clips aren't occluded

Performance

  • Q8 HQ default for character renders. ~6 min for a 7-second clip at 1024×576 on M4 Max 64 GB (was ~15 min in 2.x)
  • Codex skip-step optimization on Q8 HQ (~12% faster)
  • Adaptive wall-time estimates that learn from your machine after two renders
  • TeaCache wired through both Extend and A2V stage 1
  • Server-side panel watchdog rescues stuck MLX deallocators (was a 10+ min freeze post-decode on Balanced)

Stability + UX

  • Eleven UX contradictions audited and fixed (Codex C+ pass)
  • Load Params on any clip restores the actual seed used and reopens in the right mode (Character / Video / Image)
  • Mac memory pressure shown as actual pressure %, not sticky swap
  • 50+ small fixes across the panel, helper, and installer

Engineering

  • 341 commits since v2.0.5
  • ltx-2-mlx pinned to v0.14.0 (dgrauet's request — upstream breaking changes deferred)
  • Server-side enforcement that Q4 + character_id is rejected (was producing identity-degraded output)
  • New post-decode SIGKILL rescue in the panel for stuck helper processes

Hardware

Apple Silicon only. No Intel Mac, no Linux, no Windows.

  • 16 / 24 GB: 512 px video, image gen works
  • 32 GB: 768 px
  • 64 GB+: 1024×576 video, full HD image, character training
  • 96 / 128 GB: same caps. Model is the bottleneck, not RAM.
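
The tiers above amount to a threshold lookup. The sketch below encodes only the video caps from the list. It assumes that RAM sizes between the listed tiers fall to the tier below and that anything under 16 GB is outside the list; the names are illustrative and this is not the panel's actual detection code.

```python
from typing import Optional

# (minimum RAM in GB, video cap) taken from the hardware list, highest first.
_VIDEO_TIERS = [
    (64, "1024x576"),
    (32, "768 px"),
    (16, "512 px"),
]


def video_cap_for_ram(ram_gb: int) -> Optional[str]:
    """Return the video cap for a RAM size, or None below the listed tiers."""
    for min_gb, cap in _VIDEO_TIERS:
        if ram_gb >= min_gb:
            return cap
    return None
```

With this table, 96 GB and 128 GB machines resolve to the same cap as 64 GB, which mirrors the point that the model, not RAM, is the bottleneck.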

Install

One-click via Pinokio (search Phosphene). Or clone the repo and follow the README.

Credits

  • Lightricks for LTX-Video 2.3 (their license applies to the weights)
  • dgrauet/ltx-2-mlx for the MLX port that makes this possible
  • HiDream AI for HiDream-O1
  • Qwen team + mflux for Qwen-Image-Edit
  • Apple for MLX
  • Phosphene (the panel) is MIT-licensed