13 changes: 13 additions & 0 deletions .changelog/NEXT.md
@@ -0,0 +1,13 @@
+# Unreleased Changes
+
+## Added
+
+## Changed
+
+- Bumped the bundled slashdo submodule (`lib/slashdo`) to latest `main` (`11cb89c`).
+
+## Fixed
+
+- **[ltx2-fflf-skips-last-image-resize-when-both-frames-set] ltx2 FFLF now resizes both anchor frames.** The two-keyframe ltx2 FFLF path passes both `--image` and `--last-image` into `scripts/generate_ltx2.py`, but Video Gen only resized the start image when both anchors were present. The end frame could therefore reach `KeyframeInterpolationPipeline.generate_and_save()` at its original dimensions. `videoGen/local.js` now treats ltx2 true-FFLF as a real last-image consumer and runs the same ffmpeg resize/crop pass used for multi-keyframes, with a regression test asserting the helper receives resized start and end paths.
+
+## Removed
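
The gating rule described in the Fixed entry above can be sketched as a standalone predicate. This is a minimal illustration, not the code in `videoGen/local.js`, which computes the value inline: the function form, the parameter names, and the `'mlx_video'` runtime string used in the usage note are assumptions made for readability.

```javascript
// Hypothetical helper: mirrors the inline `lastImageWillBeUsed` expression
// from the diff, extracted into a function purely for illustration.
function lastImageWillBeUsed({ lastImagePath, sourceImagePath, mode, runtime, isWindows }) {
  // No end frame, a Windows host, or a non-FFLF mode: nothing reads the
  // last image, so resizing it would be wasted ffmpeg work.
  if (!lastImagePath || isWindows || mode !== 'fflf') return false;
  // ltx2 true-FFLF reads both --image and --last-image, so the end frame
  // needs resizing even when a start frame is present. Other runtimes only
  // read the last image when it is the sole conditioning frame.
  return runtime === 'ltx2' || !sourceImagePath;
}
```

Under these assumptions, an ltx2 call with both anchors returns `true`, while the same call with a non-ltx2 runtime (for example a hypothetical `'mlx_video'` value) returns `false` because the start frame is also set.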
11 changes: 5 additions & 6 deletions PLAN.md
@@ -4,12 +4,11 @@ For project goals, see [GOALS.md](./GOALS.md). For completed work, see [.changel

## Next Up

-1. **[ltx2-fflf-skips-last-image-resize-when-both-frames-set]** **ltx2 FFLF skips last-image resize when both start and end frames are provided — CONFIRMED bug (verified 2026-05-30 this replan).** `server/services/videoGen/local.js:573` gates `lastImageWillBeUsed` on `mode === 'fflf' && !sourceImagePath`, so the ltx2 true-FFLF flow (both `sourceImagePath` AND `lastImagePath` set, passed via `--image` + `--last-image` at `buildLtx2Args` ~line 328) skips the `resizeImage` ffmpeg pass for the end frame; `scripts/generate_ltx2.py` `_resolve_keyframes` (~line 313) then passes both paths straight through with no resize/letterbox/pad. The end-frame image reaches `KeyframeInterpolationPipeline.generate_and_save()` at its ORIGINAL dimensions → dimension mismatch / degraded output (the multi-keyframe path already resizes each keyframe at `local.js` ~line 606; the two-keyframe legacy path was missed). Fix: resize the last image in the ltx2 two-keyframe FFLF path the same way multi-keyframes are resized, and correct the stale comment at `local.js` ~line 560 that wrongly assumes only mlx_video/Windows consume `--last-image`. (Was filed UNVERIFIED under blocked; verification this replan confirmed it real and moved it up.)
-2. **[media-job-store-progress-on-job-record]** **mediaJobQueue: persist progress/statusMsg on the job record.** `runJob`'s progress handler in `server/services/mediaJobQueue/index.js` (~line 575) only broadcasts to SSE (`broadcastSse`) — it doesn't mutate the job object. So `GET /api/media-jobs/:id` (used by `MediaJobsQueue.jsx` and any other non-SSE consumer) always reports 0 progress for in-flight jobs. SSE-attached pages don't see this (lastPayload replay), but list-view hydration does. Add `job.progress` + `job.statusMsg` updates inside the dispatcher's `progress` handler. Cross-cutting across image/video/codex paths.
-3. **[shared-bounded-concurrency-mapper]** **Extract a generic `mapWithConcurrency(items, n, fn)` to `server/lib/` and reuse it.** The worker-pool idiom (cursor + N workers draining a shared array, order-preserving result) is now hand-rolled in THREE places (verified 2026-05-30): `embedBatch` in `server/services/embeddings.js` (~line 158), the mapper in `server/services/catalogExtraction.js` (~line 317), and another in `server/routes/imageVideoModels.js`. A pure `server/lib/` helper would collapse all three onto one tested mechanism (barrel + README row per the maintenance rule).
-4. **[video-resume-restore-keyframes]** **VideoGen resume: restore keyframes from the active job.** `ACTIVE_JOB_PARAM_FIELDS` in `server/routes/videoGen.js` (~line 739) intentionally omits `keyframes` (gallery filename + frame-index pairs for multi-keyframe FFLF) because the v1 resume effect doesn't repopulate the picker UI for them. A user who reloads mid-render on a keyframe job loses the keyframe-picker state even though every other form field is restored. Whitelist `keyframes` in the route AND wire a `setKeyframes()`-style setter in `client/src/pages/VideoGen.jsx`'s resume effect (~line 507). Independent of the SSE re-attach.
-5. **[chrome-canary-followups]** **Hardening for the custom-Chrome-binary feature — remaining low/medium items.** _High-severity (a)–(f) are now DONE (verified 2026-05-30): (a) `browser/server.js` (~line 46) derives `macAppBundle` from `chromePath` on macOS; (b) both `spawn()` calls (~lines 221, 236) have `.on('error', …)` listeners; (d) both `loadConfig`s try/catch `JSON.parse`; (e) `setup-browser.js#loadConfig` warns + returns null and `applyCanaryToConfig` skips the save on unreadable config; (f) top-level `runCanarySetup()` is `.catch()`-wrapped. (c) headless-flag flip is tracked in `[setup-browser-canary-headless]` (now in Backlog with a decision)._ **Remaining low/medium:** (g) idempotency guard only checks `chromePath` — decliners + `macAppBundle`-only users get re-prompted every update; (h) `PORTOS_USE_CANARY` only matches literal `0/false/1/true` — `no/off/yes/on/True` fall through both branches; (i) `spawnSync(install.cmd, …, { stdio: 'inherit' })` for brew/winget can hang on a sudo prompt under non-TTY + `PORTOS_USE_CANARY=1`; (j) `cachedConfig` in `browserService.js` is stale relative to setup-browser's direct write (GET /api/browser/config returns pre-update values until process restart); (k) `saveConfig` uses bare `writeFileSync` — switch to the canonical `atomicWrite` (`server/lib/fileUtils.js`); (l) `spawnSync` status check treats `status: null` (spawn-failure/signal) like a non-zero exit and never logs `result.error`; (m) `optionalPath` Zod schema has no `.app`/`.exe` sanity check; (n) `launchBrowser`'s reuse-existing-Chrome early-return fires BEFORE `headlessMode` is set (wrong /health mode after a PM2 restart that reuses Chrome); (o) `detectCanary` on macOS only checks `/Applications/...` and misses per-user `~/Applications` installs.
-6. **[ref-watch-phosphene-teacache-extend-a2v-denoise]** **TeaCache through Extend + A2V Stage-1 denoise loops.** From `reference-watch` review of phosphene (commits `ea98aad8` + `17be2a79`, 2026-05-28). PortOS's `scripts/generate_ltx2.py` invokes `ExtendPipeline.extend_from_video(...)` (~line 378) and `AudioToVideoPipeline.generate_and_save(...)` (~line 417) directly without passing a TeaCache controller, so the slowest two modes in the panel run full denoise. Phosphene reuses the existing Stage-1 calibration (`ti2vid_two_stages._build_teacache_controller(n_steps, thresh=0.5)`) by monkey-patching `guided_denoise_loop` on the `ltx_pipelines_mlx.retake` and `ltx_pipelines_mlx.a2vid_two_stage` modules, gating activation on a per-call module-level config dict, set before the pipeline call and cleared in `finally`. Predicted ~1.2× on extend at default threshold (up to ~3× at 1.5); similar for A2V Stage 1. Fix: install both monkey-patches at the top of `generate_ltx2.py`, wire `_EXTEND_TC_CONFIG` / `_A2V_TC_CONFIG` from the helper's `run_extend`/`run_a2v` paths, and add a `--no-teacache` CLI flag (default-on). **Decision (resolved 2026-05-30): land independently** against today's pin — the calibration helper `_build_teacache_controller` already exists at the current pin, so there's no need to couple this to the risky rename in `[ref-watch-phosphene-bump-ltx2-pin-v0148]`. Scope: small/medium.
+1. **[media-job-store-progress-on-job-record]** **mediaJobQueue: persist progress/statusMsg on the job record.** `runJob`'s progress handler in `server/services/mediaJobQueue/index.js` (~line 575) only broadcasts to SSE (`broadcastSse`) — it doesn't mutate the job object. So `GET /api/media-jobs/:id` (used by `MediaJobsQueue.jsx` and any other non-SSE consumer) always reports 0 progress for in-flight jobs. SSE-attached pages don't see this (lastPayload replay), but list-view hydration does. Add `job.progress` + `job.statusMsg` updates inside the dispatcher's `progress` handler. Cross-cutting across image/video/codex paths.
+2. **[shared-bounded-concurrency-mapper]** **Extract a generic `mapWithConcurrency(items, n, fn)` to `server/lib/` and reuse it.** The worker-pool idiom (cursor + N workers draining a shared array, order-preserving result) is now hand-rolled in THREE places (verified 2026-05-30): `embedBatch` in `server/services/embeddings.js` (~line 158), the mapper in `server/services/catalogExtraction.js` (~line 317), and another in `server/routes/imageVideoModels.js`. A pure `server/lib/` helper would collapse all three onto one tested mechanism (barrel + README row per the maintenance rule).
+3. **[video-resume-restore-keyframes]** **VideoGen resume: restore keyframes from the active job.** `ACTIVE_JOB_PARAM_FIELDS` in `server/routes/videoGen.js` (~line 739) intentionally omits `keyframes` (gallery filename + frame-index pairs for multi-keyframe FFLF) because the v1 resume effect doesn't repopulate the picker UI for them. A user who reloads mid-render on a keyframe job loses the keyframe-picker state even though every other form field is restored. Whitelist `keyframes` in the route AND wire a `setKeyframes()`-style setter in `client/src/pages/VideoGen.jsx`'s resume effect (~line 507). Independent of the SSE re-attach.
+4. **[chrome-canary-followups]** **Hardening for the custom-Chrome-binary feature — remaining low/medium items.** _High-severity (a)–(f) are now DONE (verified 2026-05-30): (a) `browser/server.js` (~line 46) derives `macAppBundle` from `chromePath` on macOS; (b) both `spawn()` calls (~lines 221, 236) have `.on('error', …)` listeners; (d) both `loadConfig`s try/catch `JSON.parse`; (e) `setup-browser.js#loadConfig` warns + returns null and `applyCanaryToConfig` skips the save on unreadable config; (f) top-level `runCanarySetup()` is `.catch()`-wrapped. (c) headless-flag flip is tracked in `[setup-browser-canary-headless]` (now in Backlog with a decision)._ **Remaining low/medium:** (g) idempotency guard only checks `chromePath` — decliners + `macAppBundle`-only users get re-prompted every update; (h) `PORTOS_USE_CANARY` only matches literal `0/false/1/true` — `no/off/yes/on/True` fall through both branches; (i) `spawnSync(install.cmd, …, { stdio: 'inherit' })` for brew/winget can hang on a sudo prompt under non-TTY + `PORTOS_USE_CANARY=1`; (j) `cachedConfig` in `browserService.js` is stale relative to setup-browser's direct write (GET /api/browser/config returns pre-update values until process restart); (k) `saveConfig` uses bare `writeFileSync` — switch to the canonical `atomicWrite` (`server/lib/fileUtils.js`); (l) `spawnSync` status check treats `status: null` (spawn-failure/signal) like a non-zero exit and never logs `result.error`; (m) `optionalPath` Zod schema has no `.app`/`.exe` sanity check; (n) `launchBrowser`'s reuse-existing-Chrome early-return fires BEFORE `headlessMode` is set (wrong /health mode after a PM2 restart that reuses Chrome); (o) `detectCanary` on macOS only checks `/Applications/...` and misses per-user `~/Applications` installs.
+5. **[ref-watch-phosphene-teacache-extend-a2v-denoise]** **TeaCache through Extend + A2V Stage-1 denoise loops.** From `reference-watch` review of phosphene (commits `ea98aad8` + `17be2a79`, 2026-05-28). PortOS's `scripts/generate_ltx2.py` invokes `ExtendPipeline.extend_from_video(...)` (~line 378) and `AudioToVideoPipeline.generate_and_save(...)` (~line 417) directly without passing a TeaCache controller, so the slowest two modes in the panel run full denoise. Phosphene reuses the existing Stage-1 calibration (`ti2vid_two_stages._build_teacache_controller(n_steps, thresh=0.5)`) by monkey-patching `guided_denoise_loop` on the `ltx_pipelines_mlx.retake` and `ltx_pipelines_mlx.a2vid_two_stage` modules, gating activation on a per-call module-level config dict, set before the pipeline call and cleared in `finally`. Predicted ~1.2× on extend at default threshold (up to ~3× at 1.5); similar for A2V Stage 1. Fix: install both monkey-patches at the top of `generate_ltx2.py`, wire `_EXTEND_TC_CONFIG` / `_A2V_TC_CONFIG` from the helper's `run_extend`/`run_a2v` paths, and add a `--no-teacache` CLI flag (default-on). **Decision (resolved 2026-05-30): land independently** against today's pin — the calibration helper `_build_teacache_controller` already exists at the current pin, so there's no need to couple this to the risky rename in `[ref-watch-phosphene-bump-ltx2-pin-v0148]`. Scope: small/medium.
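
The worker-pool idiom named in `[shared-bounded-concurrency-mapper]` (a shared cursor, N workers draining one array, results kept in input order) could look roughly like the sketch below. Only the `mapWithConcurrency(items, n, fn)` signature comes from the plan item. The body, the `(item, index)` callback shape, and the error behaviour are assumptions, not the existing code in `embeddings.js` or `catalogExtraction.js`.

```javascript
// Sketch only: the signature follows the plan item, the body is an assumption.
async function mapWithConcurrency(items, n, fn) {
  const results = new Array(items.length);
  let cursor = 0; // next unclaimed index, shared by every worker

  async function worker() {
    while (cursor < items.length) {
      // Claim the index synchronously, before any await, so two workers
      // can never pick up the same item.
      const index = cursor++;
      results[index] = await fn(items[index], index);
    }
  }

  // Never start more workers than there are items, and always at least one
  // so an empty input still resolves to an empty array.
  const workerCount = Math.max(1, Math.min(n, items.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

In this sketch the first rejection from `fn` rejects the returned promise, while workers that are already mid-call keep running until their current item settles. Whether the shared helper should instead collect per-item errors is a design choice the plan item leaves open.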

## Backlog

19 changes: 11 additions & 8 deletions server/services/videoGen/local.js
@@ -557,20 +557,23 @@ export async function generateVideo({ pythonPath, prompt, negativePrompt = '', m
   const parsedNumFrames = Number(numFrames);
   const parsedFps = Number(fps);

-  // Resize source image to match the model resolution. mlx_video requires
-  // exact dimensions (it doesn't auto-pad), and pixie-forge learned the
-  // hard way that letting the model upscale a portrait reference makes
-  // garbled output.
+  // Resize conditioning images to match the model resolution. mlx_video and
+  // ltx2 both require exact dimensions (they don't auto-pad), and pixie-forge
+  // learned the hard way that letting the model upscale a portrait reference
+  // makes garbled output.
   //
   // Skip the last-image resize when buildArgs / the Python child won't
   // actually consume it:
-  //   - On macOS/mlx_video the FFLF fallback only triggers in `fflf` mode
-  //     AND when no source image is also provided (single conditioning frame
-  //     only). Anything else is a no-op, so resizing is wasted ffmpeg work.
+  //   - ltx2 true-FFLF consumes both --image and --last-image, so resize the
+  //     last frame even when a source image is also present.
+  //   - On macOS/mlx_video the FFLF fallback only consumes the last image when
+  //     no source image is also provided (single conditioning frame only).
+  //     Anything else is a no-op, so resizing is wasted ffmpeg work.
   //   - On Windows we forward --last-image to generate_win.py so it can log
   //     status, but the diffusers pipeline only reads --image — the script
   //     never opens the last-frame file, so no resize is needed there either.
-  const lastImageWillBeUsed = !!lastImagePath && !IS_WIN && mode === 'fflf' && !sourceImagePath;
+  const lastImageWillBeUsed = !!lastImagePath && !IS_WIN && mode === 'fflf'
+    && (model.runtime === 'ltx2' || !sourceImagePath);
   // A non-null `keyframes` that ISN'T a length-≥2 array is malformed —
   // fail fast instead of silently dropping it (which would produce an
   // unexpected text/i2v render with the user's anchors ignored). The
50 changes: 49 additions & 1 deletion server/services/videoGen/local.test.js
@@ -7,6 +7,7 @@
  */
 import { describe, it, expect, vi, beforeEach, afterEach } from 'vitest';
 import { join } from 'path';
+import { tmpdir } from 'os';
 import { randomUUID } from 'crypto';

 // ─── dep mocks (must be declared before the module import) ───────────────────
@@ -101,12 +102,13 @@ vi.mock('child_process', () => {
 // ─── module under test ───────────────────────────────────────────────────────
 // Import AFTER all vi.mock calls so the hoisted mocks are in place.
 let generateChainedVideo;
+let generateVideo;
 let videoGenEvents;

 beforeEach(async () => {
   vi.resetModules();
   // Re-import fresh copies so mock reset above applies cleanly
-  ({ generateChainedVideo } = await import('./local.js'));
+  ({ generateChainedVideo, generateVideo } = await import('./local.js'));
   ({ videoGenEvents } = await import('./events.js'));
 });

@@ -339,3 +341,49 @@ describe('generateChainedVideo — extend chain arg routing', () => {
     expect(innerJobIds).toHaveLength(2);
   });
 });
+
+describe('generateVideo — ltx2 FFLF image resizing', () => {
+  it('resizes both start and end frames before passing them to the ltx2 helper', async () => {
+    const { execFile, spawn } = await import('child_process');
+    const execFileMock = vi.mocked(execFile);
+    const spawnMock = vi.mocked(spawn);
+    execFileMock.mockClear();
+    spawnMock.mockClear();
+
+    const jobId = 'fflf-two-frame-resize-test';
+    const sourceImagePath = '/mock/uploads/start.png';
+    const lastImagePath = '/mock/uploads/end.png';
+
+    await generateVideo({
+      jobId,
+      pythonPath: '/usr/bin/python3',
+      modelId: 'ltx2_unified',
+      prompt: 'interpolate the two anchors',
+      width: 512,
+      height: 512,
+      numFrames: 25,
+      fps: 24,
+      mode: 'fflf',
+      sourceImagePath,
+      lastImagePath,
+    });
+
+    expect(execFileMock).toHaveBeenCalledTimes(2);
+    expect(execFileMock.mock.calls.map((call) => call[1][1])).toEqual([
+      sourceImagePath,
+      lastImagePath,
+    ]);
+
+    const renderCall = spawnMock.mock.calls.find(
+      ([bin, args]) => String(bin).includes('.portos/ltx-2-mlx/.venv/bin/python3')
+        && Array.isArray(args)
+        && args.includes('--mode')
+        && args.includes('fflf'),
+    );
+    expect(renderCall).toBeTruthy();
+
+    const args = renderCall[1];
+    expect(args[args.indexOf('--image') + 1]).toBe(join(tmpdir(), `resized-src-${jobId}.png`));
+    expect(args[args.indexOf('--last-image') + 1]).toBe(join(tmpdir(), `resized-last-${jobId}.png`));
+  });
+});
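
One note on the final two assertions: `args[args.indexOf('--image') + 1]` reads `args[0]` when the flag is absent, because `indexOf` returns `-1`. The test would still fail in that case, but the failure message would compare the expected path against an unrelated argument. A small lookup helper makes a missing flag fail explicitly. The sketch below is a suggestion only: `flagValue` is a hypothetical name and is not part of this PR.

```javascript
// Hypothetical test helper, not part of the PR: returns the value that
// follows `flag` in a CLI argument array, or throws if it cannot.
function flagValue(args, flag) {
  const index = args.indexOf(flag);
  if (index === -1) {
    throw new Error(`flag ${flag} not found in args`);
  }
  if (index === args.length - 1) {
    throw new Error(`flag ${flag} has no value`);
  }
  return args[index + 1];
}
```

With such a helper the assertions could read `expect(flagValue(args, '--image')).toBe(...)`, and a dropped `--last-image` flag would be reported by name.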