
feat(producer): add pngDecodeBlitWorkerPool (hf#732 PR 2/5) #757

Merged
vanceingalls merged 1 commit into main from vai/677-2-decode-pool on May 13, 2026

@vanceingalls
Collaborator

Summary

PR 2 of 5 in the hf#732 decomposition stack. Adds a fixed-size worker_threads pool that offloads PNG decode + alpha-blit from the main thread. No production wiring yet: the pool stands alone and is wired up by a later PR in the stack.

New files

  • packages/producer/src/services/pngDecodeBlitWorker.ts — worker entry. Imports from @hyperframes/engine/alpha-blit (zero-import TS source, survives the new Worker(<path>) loader boundary).
  • packages/producer/src/services/pngDecodeBlitWorkerPool.ts — fixed-size pool with run() API. Uses transferList for buffer ownership transfer (no 16bpc HDR buffer copies).
  • packages/producer/src/services/pngDecodeBlitWorkerPool.test.ts — 6 vitest tests pinning byte-equivalence with inline path, transferList correctness, concurrent dispatch, termination semantics. All pass.
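
A minimal sketch of the ownership semantics that transferList provides. It uses structuredClone with a transfer option as a synchronous stand-in for worker.postMessage, which is an assumption made for illustration; the pool itself posts to worker threads. The variable names and buffer size are hypothetical.

```typescript
// Stand-in for a 16bpc frame buffer; size and contents are illustrative only.
const source = new ArrayBuffer(16);
new Uint8Array(source).fill(7);

// Transfer, rather than copy, the backing store. worker.postMessage(msg, [source])
// applies the same rule to every ArrayBuffer named in its transferList.
const received: ArrayBuffer = structuredClone(source, { transfer: [source] });

// The sender's handle is now detached and reports zero bytes...
const senderBytesAfter = source.byteLength;
// ...while the receiver sees the original contents.
const receiverView = new Uint8Array(received);

console.log(senderBytesAfter, receiverView.length);
```

This is why a caller has to keep working with the buffer handed back in the result rather than the one it passed in: the original handle is empty after the transfer.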

Build wiring

  • packages/cli/tsup.config.ts: second tsup entry emits dist/pngDecodeBlitWorker.js next to dist/cli.js. Without this entry the pool's new Worker(<path>) would fail at runtime in the shipped CLI.
  • packages/producer/build.mjs: third esbuild entry mirrors the wiring for direct producer consumers.
  • packages/engine/package.json: adds ./alpha-blit subpath export pointing at src/utils/alphaBlit.ts.

Stack

Stacked on top of #756 (PR 1: worker-count cap). No behavior change in any render.

Test plan

  • 6 pool tests pass
  • Producer + engine typecheck clean
  • oxlint clean

— Vai

jrusso1020 previously approved these changes May 12, 2026
Collaborator

@jrusso1020 jrusso1020 left a comment

Verdict: APPROVE — clean standalone pool, no production wiring yet

Per Rule 5: pulled check_runs at 978f5103 — all required CI green (Lint, Build, Test, Typecheck, regression, regression-shards, preview-regression, player-perf, etc.). Graphite mergeability_check in_progress (normal for stack base).

Audited: pngDecodeBlitWorkerPool.ts end-to-end (~350 lines).
Trusting: the 6 tests (byte-equivalence + transferList + concurrent dispatch + termination), the worker entry pngDecodeBlitWorker.ts, and the 3-place build wiring (cli/tsup, producer/build.mjs, engine/package.json subpath export) — verified the files exist + are properly registered, didn't audit internals.

What I checked carefully

The pool is well-engineered. Specific load-bearing details that hold up:

  1. Node <8KB Buffer pool defensive copy (:165-200): Node's pooled allocators (Buffer.allocUnsafe(N), Buffer.from(...)) can return a slice over a shared 8KB pool ArrayBuffer for small buffers; Buffer.alloc(N) itself is never pooled. postMessage with transferList rejects shared-pool ArrayBuffers with DataCloneError. The pool handles this by copying into a dedicated Uint8Array(N).buffer before transfer. PNG inputs are the realistic small-buffer source (sparse DOM layers, low-bytes screenshots). The same defensive copy, with a logged warning, is applied on the dest side too. Right call.

  2. Lifecycle correctness (:284-309) — terminate() rejects queued tasks first, then in-flight per-slot tasks with "terminated mid-task", then Promise.all over worker.terminate() with .catch(() => undefined) to swallow individual worker shutdown errors. Idempotent (if (terminated) return).

  3. onWorkerError / onWorkerExit (:222-241) — reject the in-flight task with informative message (crashed mid-task: ${err.message}; dest buffer lost — useful for debugging). The "dest buffer lost" framing correctly conveys that the transferred dest can't be recovered after a worker crash.

  4. transferList contract is documented at both the API JSDoc (:33-41) and inline (:127-152). The caller-must-use-result.dest invariant is the most error-prone aspect of worker_threads pools; this docstring spells it out clearly.

  5. buildExecArgv (:101-114) — handles the vitest tsx-loader-not-in-execArgv case correctly. Best-effort tsx/esm resolution with silent fallback for prod. Matches the existing shaderTransitionWorkerPool pattern (per docstring claim).
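
The terminate() ordering described in item 2 can be sketched as follows. This is a simplified model, not the pool's code: the PendingTask and Slot shapes, the class name, and the first error message are hypothetical, and fake workers replace real threads so the order of events can be observed.

```typescript
// Hypothetical minimal shapes; the real pool's types are not shown in this thread.
interface PendingTask { reject: (err: Error) => void }
interface Slot {
  worker: { terminate(): Promise<number> };
  inFlight: PendingTask | null;
}

class PoolLifecycle {
  private terminated = false;
  private readonly queue: PendingTask[];
  private readonly slots: Slot[];

  constructor(queue: PendingTask[], slots: Slot[]) {
    this.queue = queue;
    this.slots = slots;
  }

  async terminate(): Promise<void> {
    if (this.terminated) return; // idempotent: later calls are no-ops
    this.terminated = true;

    // 1. Reject tasks that never reached a worker.
    for (const task of this.queue.splice(0)) {
      task.reject(new Error("terminated before dispatch"));
    }
    // 2. Reject tasks currently running on a worker.
    for (const slot of this.slots) {
      slot.inFlight?.reject(new Error("terminated mid-task"));
      slot.inFlight = null;
    }
    // 3. Shut workers down, swallowing individual shutdown errors.
    await Promise.all(
      this.slots.map((s) => s.worker.terminate().catch(() => undefined)),
    );
  }
}

// Usage with a fake worker that records when it is asked to shut down.
const events: string[] = [];
const fakeWorker = {
  terminate: async () => { events.push("worker.terminate"); return 0; },
};
const lifecycle = new PoolLifecycle(
  [{ reject: (e) => events.push(`queued: ${e.message}`) }],
  [{ worker: fakeWorker, inFlight: { reject: (e) => events.push(`in-flight: ${e.message}`) } }],
);
void lifecycle.terminate();
void lifecycle.terminate(); // second call returns immediately
```

Rejecting queued work before in-flight work means no queued task can be picked up by a slot that is about to be freed during shutdown.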

Concerns

None blocking. One small observation: the traceEnabled env (HF_PNG_DECODE_BLIT_POOL_TRACE=1) is logged at info level via the optional logger. If the operator sets the trace flag but doesn't pass a logger, the trace messages silently drop (logger is ??= {}). That's fine for opt-in tracing, but worth a one-line comment noting "trace requires passing a logger to surface output."

Praise

  • "No production wiring yet" is honest scope. Standalone pool that's testable on its own; PR 4 wires it. Clean stack discipline.
  • The "Why a SEPARATE pool from shaderTransitionWorkerPool" docstring (:14-22) is exactly the kind of design rationale that ages well — a future reader will know why these aren't a single shared pool.
  • ArrayBuffer-pool-defense is a real-world bug class that bites Node worker_threads integrations regularly. Codifying the workaround here saves the next pool author from learning it the hard way.

Review by Rames Jusso (pr-review)

miguel-heygen previously approved these changes May 12, 2026
Collaborator

@miguel-heygen miguel-heygen left a comment

Quick review additive to Rames.

Clean standalone pool. Two small notes:

  1. queue.unshift(task) in run() when an idle slot exists (:887). This puts the task at the front of the queue, then dispatchNext immediately shift()s it back off. Functionally correct but the round-trip through the array is unnecessary — could dispatch directly to the idle slot without touching the queue. Micro-optimization, not blocking.

  2. Worker entry path resolution falls through to .ts without checking existsSync on the TS path (:661-663). If neither .js nor .ts exists (e.g., corrupted install), the error surfaces at new Worker(entry) with a generic module-not-found error. A pre-flight existsSync check with a descriptive error message would save debugging time. Non-blocking.
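
A sketch of the direct-dispatch alternative from note 1, under the assumption that a slot only becomes idle after the queue has been drained. The Dispatcher class and its method names are hypothetical, and a log array stands in for posting to a worker.

```typescript
interface Slot { id: number; busy: boolean }
interface Task { label: string }

class Dispatcher {
  readonly queue: Task[] = [];
  readonly log: string[] = [];
  private readonly slots: Slot[];

  constructor(size: number) {
    this.slots = Array.from({ length: size }, (_, id) => ({ id, busy: false }));
  }

  run(task: Task): void {
    const idle = this.slots.find((s) => !s.busy);
    if (idle) {
      this.assign(idle, task); // direct dispatch: the queue is never touched
    } else {
      this.queue.push(task);   // every slot is busy: wait in FIFO order
    }
  }

  // Called when a slot finishes; hands it the oldest queued task, if any.
  complete(slotId: number): void {
    const slot = this.slots[slotId];
    slot.busy = false;
    const next = this.queue.shift();
    if (next) this.assign(slot, next);
  }

  private assign(slot: Slot, task: Task): void {
    slot.busy = true;
    this.log.push(`${task.label}->slot${slot.id}`);
  }
}

const dispatcher = new Dispatcher(1);
dispatcher.run({ label: "a" }); // idle slot available, dispatched directly
dispatcher.run({ label: "b" }); // slot busy, queued
dispatcher.complete(0);         // slot frees up and takes "b"
```

Direct dispatch keeps FIFO order only because complete() drains the queue before a slot is left idle; if a slot could sit idle while tasks are queued, the unshift-based version would be the safer one.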

Tests are thorough — the 8KB pool threshold test and the pipelining proof are exactly what this pool needs.

— Magi

@vanceingalls vanceingalls force-pushed the vai/677-1-worker-cap branch from 327a47b to f108f1e Compare May 13, 2026 20:36
@vanceingalls vanceingalls force-pushed the vai/677-2-decode-pool branch from 978f510 to 2dac19a Compare May 13, 2026 20:36
@vanceingalls vanceingalls changed the base branch from vai/677-1-worker-cap to graphite-base/757 May 13, 2026 21:04
@vanceingalls vanceingalls force-pushed the vai/677-2-decode-pool branch from 2dac19a to 703a6f8 Compare May 13, 2026 21:05
@graphite-app graphite-app Bot changed the base branch from graphite-base/757 to main May 13, 2026 21:05
@graphite-app graphite-app Bot dismissed stale reviews from miguel-heygen and jrusso1020 May 13, 2026 21:05

The base branch was changed.

hf#732 lever-4. Adds a `worker_threads`-based pool that offloads PNG
decode + alpha-blit onto a fixed-size pool, mirroring the existing
shader-blend worker pool. No production wiring yet — the pool stands
alone and ships behind a later PR in the stack.

Why a separate pool: PNG decode (zlib inflate) plus 16bpc alpha-blit
both cost real wall-time on every layered frame. Doing them inline on
the main event loop forces all DOM workers to converge on one Node
thread for compositing, which is the next bottleneck after the
shader-blend dispatch was fixed in this stack's PR-3.

Pieces:

* `pngDecodeBlitWorker.ts` — the worker entry. Imports `decodePng` +
  `blitRgba8OverRgb48le` from `@hyperframes/engine/alpha-blit` (the
  subpath added here on the engine package). The worker file has zero
  internal cross-package imports, so it survives the `new Worker(<path>)`
  loader boundary without dragging in the producer's module graph.

* `pngDecodeBlitWorkerPool.ts`: fixed-size pool with `run()` API
  returning a Promise. Uses `transferList` for transfer-of-ownership
  semantics so we never serialize the 16bpc HDR frame buffer; the
  worker owns the input while decoding, and the main thread owns the
  blitted output afterwards.

* `pngDecodeBlitWorkerPool.test.ts` — 6 vitest tests pinning byte-
  equivalence with the inline path, transferList correctness across
  the 8KB Node pool threshold, and concurrent dispatch / termination
  semantics. All pass.

Build wiring:

* `packages/cli/tsup.config.ts`: second tsup entry emits
  `dist/pngDecodeBlitWorker.js` alongside `dist/cli.js`. The pool's
  resolver probes for that file next to its loaded module. Without
  this entry the pool would crash or silently fall back to inline
  decode/blit at runtime in the shipped CLI, killing the perf gain.

* `packages/producer/build.mjs`: third esbuild entry emits the worker
  as `dist/services/pngDecodeBlitWorker.js` for direct producer
  consumers. Adds the `@hyperframes/engine/alpha-blit` workspace
  alias to the existing `workspaceAliasPlugin` so both builds resolve
  the import the same way.

* `packages/engine/package.json`: adds `./alpha-blit` subpath export
  pointing at `src/utils/alphaBlit.ts`. The file is already
  import-free (only `zlib`) so the worker survives the loader boundary
  directly via this TS source.

No behavior change. PR 2 of 5 in the hf#732 decomposition stack;
stacked on top of #PR1 (worker-count cap bump).

-- Vai

Co-Authored-By: Vai <vai@heygen.com>
@vanceingalls vanceingalls force-pushed the vai/677-2-decode-pool branch from 703a6f8 to ede5cde Compare May 13, 2026 21:05
Collaborator

@miguel-heygen miguel-heygen left a comment

Re-review after stack rebase dismissed previous approvals. Same commit ede5cde37e — no code changes since the last round.

Previous review notes from me (queue.unshift round-trip, missing existsSync on .ts fallback) were non-blocking and I stand by them as future cleanup candidates, but neither is worth holding the stack for.

Fresh pass focusing on the areas I previously trusted to Rames:

Worker lifecycle — solid. Workers are spawned in a try/catch that tears down partial spawns on failure (lines 253-268). onWorkerError rejects the in-flight task and marks the slot idle. onWorkerExit rejects mid-task if exit happens outside terminate(). terminate() drains the queue first, then rejects in-flight tasks, then calls worker.terminate() on all slots with .catch(() => undefined). Idempotent via terminated flag.

One observation: after onWorkerExit fires for an unexpected exit, the dead worker's slot stays in slots with busy: false and a terminated worker. If new tasks arrive they could be dispatched to the dead worker's slot, which would postMessage to a terminated worker and throw. In practice this PR has no production wiring yet (PR 4 does that), and the pool is created/terminated within a single render lifecycle, so an unexpected worker exit during active use would already be a fatal condition. But worth noting for when production wiring lands — a respawn-or-remove-slot strategy in onWorkerExit would make the pool more resilient.
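
One possible shape for the remove-slot half of that strategy, as a sketch only. SlotTable, acquire(), and the error message are hypothetical names; the real pool tracks workers and in-flight tasks, which are omitted here.

```typescript
interface Slot { id: number; busy: boolean }

class SlotTable {
  private slots: Slot[];
  private terminated = false;

  constructor(size: number) {
    this.slots = Array.from({ length: size }, (_, id) => ({ id, busy: false }));
  }

  markTerminated(): void {
    this.terminated = true;
  }

  // Hypothetical exit handler: drop the slot so dispatch can never select it.
  onWorkerExit(slotId: number): void {
    if (this.terminated) return; // exits during terminate() are expected
    this.slots = this.slots.filter((s) => s.id !== slotId);
  }

  // Returns the id of the slot that took the task, or null when every live
  // slot is busy (the caller would queue). Throws when no live slot remains.
  acquire(): number | null {
    if (this.slots.length === 0) throw new Error("no live workers remain");
    const idle = this.slots.find((s) => !s.busy);
    if (!idle) return null;
    idle.busy = true;
    return idle.id;
  }
}

const table = new SlotTable(2);
table.onWorkerExit(0);           // worker 0 dies unexpectedly
const picked = table.acquire();  // only slot 1 is eligible
const second = table.acquire();  // slot 1 is busy, so the caller would queue
```

Failing loudly once the last slot is gone avoids queuing tasks that no worker will ever pick up; a respawn strategy would replace the filter with code that creates a fresh worker for the slot.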

Buffer transfer correctness — the 8KB shared-pool defense is thorough. The two-stage check (backing fits exactly, then Uint8Array.slice fallback) handles all the edge cases. The worker side correctly uses Buffer.from(arrayBuffer, offset, length) to re-wrap without copying.
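
The two-stage check and the worker-side re-wrap can be sketched like this. The helper name toTransferable is hypothetical and the logic is a reading of the description above, not the pool's actual code.

```typescript
import { Buffer } from "node:buffer";

// Returns an ArrayBuffer that is safe to name in a transferList.
function toTransferable(buf: Buffer): ArrayBuffer {
  const backing = buf.buffer;
  // Stage 1: the view covers its whole backing store, so nothing else
  // shares that ArrayBuffer and it can be transferred as-is.
  const fitsExactly = buf.byteOffset === 0 && buf.byteLength === backing.byteLength;
  if (fitsExactly && backing instanceof ArrayBuffer) return backing;
  // Stage 2: Uint8Array.prototype.slice copies the bytes into a fresh,
  // dedicated ArrayBuffer. (Buffer's own slice() returns a view, so wrap first.)
  return new Uint8Array(backing, buf.byteOffset, buf.byteLength).slice().buffer;
}

// A small Buffer like this is the kind that may be carved out of Node's shared pool.
const small = Buffer.from("png");
const dedicated = toTransferable(small);

// Worker side: re-wrap the received ArrayBuffer without copying.
const rewrapped = Buffer.from(dedicated, 0, dedicated.byteLength);
rewrapped[0] = 0x50; // writes through to `dedicated`, proving shared memory
```

The copy in stage 2 is paid only by buffers that do not own their backing store, so large dedicated frame buffers still move without a copy.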

Tests — the 6 tests cover the right properties. The pipelining test (frame N+1 capture overlaps frame N decode) is a good proof that the pool actually provides the intended concurrency benefit. The termination test correctly handles the race between task completion and pool shutdown.

Build wiring — the three-place wiring (cli/tsup, producer/build.mjs, engine/package.json subpath export) is consistent. The @hyperframes/engine/alpha-blit alias in both build configs resolves to the same source file.

Clean standalone pool with no production wiring. Approving.

— Magi

Collaborator

@jrusso1020 jrusso1020 left a comment

Re-approved at ede5cde3 — carrying forward my prior APPROVE with focused notes on what changed since 978f5103.

Stack root resolution

hf#756 (the worker-count cap, stack root) is now merged ✓ — the stack-blocker concern from yesterday is closed. This PR can land as soon as it's stamped.

What's new since my prior review (focused on the delta)

The substantive addition is the explicit workerEntryPath factory option + matching test + bundled-CLI build wiring. I read those carefully:

  1. workerEntryPath option (pngDecodeBlitWorkerPool.ts:142-153) is the right defensive shape for the bundled-CLI case. resolveWorkerEntry() now has a 4-tier order: explicit option → HF_*_WORKER_ENTRY env → sibling .js → sibling .ts. The docstring names why explicit-path is needed (in a bundled CLI, import.meta.url resolves to the bundle path — cli.js — not the worker's emitted path, so the sibling probe lands in the wrong directory). Without this option, the bundled-CLI case would silently fail or crash. The hf#677 reference in the test comment makes the regression target explicit. ✓

  2. tsup.config.ts:7-22 emits BOTH cli and pngDecodeBlitWorker entries with an @hyperframes/engine/alpha-blit alias pointing at the import-free alphaBlit.ts source. This is exactly what's needed so the worker survives the new Worker() filesystem load boundary — the worker doesn't pull in the full engine graph, just the alpha-blit utility. The inline comment ("alphaBlit.ts is import-free (only zlib) so the worker survives the worker_thread loader boundary directly via this TS source") names the load-bearing constraint.

  3. producer/build.mjs:61-76 mirrors the wiring with a separate esbuild entry. Symmetric — both producer-direct consumers and CLI-bundled consumers get the worker emitted at the expected sibling path.

  4. engine/package.json subpath export "./alpha-blit": "./src/utils/alphaBlit.ts" is a clean subpath export with no import.meta.url indirection. ✓

  5. New test spawns from an explicit workerEntryPath, bypassing the import.meta.url resolver — explicitly references the hf#677 bundled-CLI bug as the regression target, runs the explicit-path spawn against an inline-blit reference for byte-equivalence proof (not just spawn-success). This is the right Rule-7-shape proof: "the explicit-path spawn actually ran real work, not just spawned and crashed silently."
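
The 4-tier resolution order from item 1 can be sketched as below, with the pre-flight existsSync check that the earlier review asked for added on the final tier. This is an illustration under assumptions: the function signature, the env var name HF_EXAMPLE_WORKER_ENTRY (the thread only gives the pattern HF_*_WORKER_ENTRY), and the error text are all hypothetical.

```typescript
import { existsSync, mkdtempSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical name; the real variable is only given as HF_*_WORKER_ENTRY.
const ENV_KEY = "HF_EXAMPLE_WORKER_ENTRY";

function resolveWorkerEntry(
  moduleDir: string,
  workerEntryPath?: string,
  env: Record<string, string | undefined> = process.env,
): string {
  if (workerEntryPath) return workerEntryPath;           // tier 1: explicit option
  const fromEnv = env[ENV_KEY];
  if (fromEnv) return fromEnv;                            // tier 2: env override
  const js = join(moduleDir, "pngDecodeBlitWorker.js");
  if (existsSync(js)) return js;                          // tier 3: sibling .js
  const ts = join(moduleDir, "pngDecodeBlitWorker.ts");
  if (existsSync(ts)) return ts;                          // tier 4: sibling .ts
  // Fail here with a descriptive message instead of at new Worker(entry).
  throw new Error(`worker entry not found; probed ${js} and ${ts}`);
}

// Usage: a scratch directory containing only the .ts source.
const scratchDir = mkdtempSync(join(tmpdir(), "worker-entry-"));
writeFileSync(join(scratchDir, "pngDecodeBlitWorker.ts"), "// placeholder\n");

const resolved = resolveWorkerEntry(scratchDir, undefined, {});
const explicitWins = resolveWorkerEntry(scratchDir, "/opt/app/dist/pngDecodeBlitWorker.js", {});
```

Passing the explicit path skips the sibling probes entirely, which matters in the bundled-CLI case where the probing directory is the bundle's location rather than the worker's.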

Prior advisory items at HEAD

Neither was addressed; both remain non-blocking nits:

  • Magi's queue unshift/shift round-trip (pngDecodeBlitWorkerPool.ts:393-398): same pattern as before. Micro-optimization; doesn't affect correctness.
  • Magi's existsSync on .ts fallback (resolveWorkerEntry line 158) — still falls through to .ts without checking. With the new explicit workerEntryPath option in play, this matters less — the bundled-CLI case (which is where the resolver was most likely to land in a corrupted state) now uses the explicit path. For dev/test paths where the heuristic still runs, the worst case is a new Worker() module-not-found error with a generic message. Cosmetic, not gating.
  • My prior trace-without-logger observation is unchanged. Same disposition.

CI

In_progress on the new SHA. Completed and green: Lint, Format, File size check, player-perf, Preview parity, CodeQL, Detect changes. Test, Typecheck, Build, CLI smoke, all regression-shards, both windows jobs, preview-regression still running. No failures.

Carry-forward praise

The praise from my prior review still applies to the unchanged surfaces:

  • 8KB Node Buffer pool defensive copy with Uint8Array(N).buffer for transfer
  • Lifecycle correctness in terminate() (queued → in-flight → Promise.all over worker.terminate())
  • onWorkerError / onWorkerExit with informative dest buffer lost framing
  • Documented transferList contract at both JSDoc and inline
  • "Why a SEPARATE pool from shaderTransitionWorkerPool" rationale ages well

Verdict

APPROVE re-affirmed. The new workerEntryPath + build wiring fills the bundled-CLI hole correctly. Ready to merge — stack root is in, CI green so far. Ship 🚀

Review by Rames Jusso (pr-review)

@vanceingalls vanceingalls merged commit 92bccfd into main May 13, 2026
39 checks passed
@vanceingalls vanceingalls deleted the vai/677-2-decode-pool branch May 13, 2026 21:52
vanceingalls added a commit that referenced this pull request May 13, 2026
## Summary

PR 3 of 5 in the hf#732 decomposition stack. Adds a `worker_threads`-based pool that runs the shader-transition blend (one of 15 transition shaders) on a fixed-size worker pool. **No production wiring yet** — the pool stands alone; PR 4 wires it.

The shader blend is a hot inner loop over every pixel of every transition frame at 16bpc. Moving it off the main event loop removes the JS-event-loop ceiling that capped throughput in earlier hf#732 iterations.

### New files

- `packages/producer/src/services/shaderTransitionWorker.ts` — worker entry. Imports from `@hyperframes/engine/shader-transitions` (zero-import TS source).
- `packages/producer/src/services/shaderTransitionWorkerPool.ts` — fixed-size pool. Uses `transferList` so the 16bpc HDR `from`/`to`/`out` buffers move by ownership.
- `packages/producer/src/services/shaderTransitionWorkerPool.test.ts` — 6 vitest tests pinning byte-equivalence across all 15 shaders, transferList correctness, pool lifecycle. All pass.

### Build wiring

- `packages/cli/tsup.config.ts`: third tsup entry emits `dist/shaderTransitionWorker.js`.
- `packages/producer/build.mjs`: fourth esbuild entry for direct producer consumers.
- `packages/engine/package.json`: adds `./shader-transitions` subpath export.

## Stack

Stacked on top of #757 (PR 2: pngDecodeBlit pool). No behavior change in any render.

## Test plan

- [x] 6 pool tests pass
- [x] Producer + engine typecheck clean
- [x] oxlint clean

— Vai
@jrusso1020 jrusso1020 mentioned this pull request May 13, 2026