CRT / post-process shader pipeline (libretro .glsl + .slang) with shader downloader #184

Merged
JRickey merged 47 commits into main from claude/implement-crt-shader-0caJ1 on May 17, 2026

@JRickey JRickey commented May 16, 2026

Summary

Adds a post-process shader pipeline to the SSB64 PC port: libretro
single-file .glsl, .glslp presets, and .slang/.slangp shaders
are normalized, transpiled (GLSL → SPIR-V → GLSL/HLSL/MSL via glslang +
SPIRV-Cross), and run as a multi-pass chain across all three Fast3D
backends (OpenGL, Metal, D3D11). Includes the in-game shader picker,
the two-phase fetch+install shader downloader, runtime diagnostics, and
the bundled scanlines / crt-lottes shaders.

D3D11 bring-up fixes (this round)

Every shader worked on OpenGL and Metal but failed on Windows/D3D11.
Root-caused via staged runtime bisection and fixed:

  1. VS↔PS signature register-order mismatch — the root cause of the
    black screen for every shader that compiled. kStockHlslVertex
    declared SV_Position before TEXCOORD0; SPIRV-Cross emits the
    fragment input struct with the user varying first and SV_Position
    last. D3D11 links VS-output→PS-input by hardware register index
    (declaration order; SV_Position is handled separately by the
    rasterizer), so TEXCOORD0 mismatched and the PS sampled garbage
    UVs → black. Reordered VOut to mirror the SPIRV-Cross signature.
    GL/Metal were immune (SPIRV-Cross generates both stages there).
  2. Overlapping HLSL sampler registers — non-schema samplers (e.g.
    a libretro shader's bezel/LUT) all defaulted to t0/s0,
    colliding with Source; FXC hard-errored (X4500) and rejected the
    shader. They now get unique registers above the schema range.
  3. Stale scissor — the D3D11 post-process draw inherited the game
    path's ScissorEnable=TRUE + game-FB scissor rect, clipping the
    pass on the window-sized mDstFb. Now sets a full-destination
    scissor.
  4. MSAA resolve target: mGameFbMsaaResolved was allocated under
    a narrower condition than ComposeFinalFrame consumes it; with
    MSAA on + a shader active at matched resolution it was never
    created. Allocation now mirrors the consume condition.
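
Fix 2 can be pictured as a small register allocator. The sketch below is an illustration under stated assumptions, not the libultraship code: `AssignSamplerRegisters`, its parameters, and the example names `bezel` / `lut` are hypothetical, and the first free register is passed in because the PR does not say where the schema range ends.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: maps each sampler name to one HLSL register index
// (used for both tN and sN). Schema samplers keep their fixed slots; every
// other sampler gets the next unused index at or above firstFreeRegister,
// so no two samplers share a register.
std::map<std::string, int> AssignSamplerRegisters(
    const std::vector<std::string>& samplerNames,
    const std::map<std::string, int>& schemaRegisters,
    int firstFreeRegister) {
    std::map<std::string, int> assigned;
    int next = firstFreeRegister;
    for (const std::string& name : samplerNames) {
        if (assigned.count(name) != 0) {
            continue; // duplicate declaration, already placed
        }
        const auto schemaIt = schemaRegisters.find(name);
        if (schemaIt != schemaRegisters.end()) {
            assigned[name] = schemaIt->second; // fixed schema slot
        } else {
            assigned[name] = next++;           // unique slot above the schema range
        }
    }
    return assigned;
}
```

Called with `Source` and `Original` as schema samplers at 0 and 1 and a first free register of 2, a shader that also declares a bezel and a LUT sampler would no longer collide with `Source` at t0/s0.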

Verified on Windows/D3D11 (NVIDIA, MSAA 4x): bundled scanlines /
crt-lottes and libretro shaders (crt-Cyclon, etc.) render
correctly; OpenGL and Metal paths unchanged.

Clean-room

Authored against public libretro shader-spec documentation, the
glslang/SPIRV-Cross APIs, and public D3D11/HLSL signature-linkage
docs. No RetroArch or GPL-licensed shader-runtime code was referenced
or copied.

Submodule

Rides a libultraship pointer bump (claude/postprocess-shaders on
the fork) carrying the four D3D11 fixes above plus the transpiler /
chain implementation.

🤖 Generated with Claude Code

claude and others added 30 commits May 11, 2026 20:30
The plan from claude/plan-crt-shader-dJzz9 is now alongside the
implementation work for easy reference. Implementation rules in §6 and
§8 are ironclad.

https://claude.ai/code/session_017QqToFcBhCpxcAqauuY1LZ
Phase 1 of the CRT/post-process shader plan
(docs/crt_shader_plan_2026-05-11.md) implemented for libultraship +
OpenGL backend. The 13-file LUS-side change set is captured as a
git-am-able patch file so it survives even though the session that
produced it could not push to the LUS fork directly.

docs/crt_shader_phase1_apply.md walks through applying the patch,
pushing the resulting LUS branch, and bumping the submodule pointer.

The patch adds:
 - GfxRenderingAPI virtuals (SupportsPostProcess /
   CreatePostProcessProgram / DestroyPostProcessProgram /
   RunPostProcess), all with no-op defaults so D3D11+Metal backends
   keep building.
 - PostProcessChain helper + PostProcessSourceLoader.
 - Interpreter::ComposeFinalFrame() (factored MSAA-resolve block)
   and per-frame CVar reconciliation in StartFrame().
 - Full OpenGL backend implementation (~240 LOC).
 - gPostProcessEnabled / gPostProcessShader CVars.
 - Bundled scanlines.glsl (original MIT work) demonstrating the
   standard Source/SourceSize/OutputSize/InputSize/FrameCount/
   FrameDirection uniform schema.

libultraship builds clean against the patch on Linux/OpenGL; full
ssb64 link is blocked only by a pre-existing GCC-vs-Clang
preprocessor issue in port/port_watchdog.cpp:448 (commit 03099c3,
unrelated to this work).

Implemented from the libretro/Mednafen public shader-uniform spec.
No code copied from RetroArch or any GPL-licensed shader runtime
(plan §8.2 clean-room rules).

https://claude.ai/code/session_017QqToFcBhCpxcAqauuY1LZ
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Patch is now committed as JRickey/libultraship@2191d4c (branch
claude/postprocess-shaders); the outer submodule pointer bump in aa3a2f1
makes the workaround patch file redundant.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Outer-tree changes consuming the libultraship Phase-1 post-process feature:

- port/gui/PortMenu.cpp: add a "Post-Process Shader" combobox in
  Settings → Graphics. Selections drive the gPostProcessEnabled flag and
  the gPostProcessShader string CVar that LUS reads each frame.
- libultraship bump: pulls the Metal backend, the .msl/.hlsl sibling
  loader, and the bundled MIT crt-lottes shader.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
User-supplied .glsl files dropped into ./shaders/ now work on all three
backends without authoring per-backend siblings. The transpiler runs in
the core layer (glslang + SPIRV-Cross) at shader load time; hand-written
.hlsl / .msl siblings, when present, retain priority. See
libultraship@c749725.

Completes the Phase 1 follow-up tracked in docs/crt_shader_plan_2026-05-11.md
§3.2 and docs/crt_shader_phase1_apply.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Drop a raw `.glsl` grabbed from libretro/glsl-shaders into ./shaders/
and it just loads. The new PostProcessGlslNormalizer rewrites libretro
identifier conventions (Texture / TextureSize / TEX0 / FragColor) into
our schema, selects the FRAGMENT half of combined VS+FS files, and
neutralizes the legacy `texture2D` / `PRECISION` idioms before the
shader hits the OpenGL backend or the SPIRV-Cross transpile path.

See libultraship@6ff940d.
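
A minimal sketch of the identifier-rewrite step, assuming the normalizer maps `Texture` to `Source` and `TextureSize` to `SourceSize`. The commit names the libretro identifiers and the target schema but not the exact pairs, so the table and the function name `RenameLibretroIdentifiers` are assumptions made for illustration.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical rename pass; the real normalizer's table may differ.
std::string RenameLibretroIdentifiers(std::string source) {
    static const std::vector<std::pair<std::string, std::string>> kRenames = {
        {"TextureSize", "SourceSize"},
        {"Texture", "Source"},
    };
    for (const auto& [from, to] : kRenames) {
        // \b anchors restrict the match to whole identifiers, so "Texture"
        // never matches inside "TextureSize" or "MyTexture".
        const std::regex wholeWord("\\b" + from + "\\b");
        source = std::regex_replace(source, wholeWord, to);
    }
    return source;
}
```

Whole-word matching is the point of the sketch: a plain substring replace would corrupt longer identifiers that happen to contain a libretro name.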

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
LUS side (libultraship@9eafc93) removes the LUS_FORCE_POSTPROCESS_TRANSPILE
dev env var, re-frames the bundled .hlsl/.msl companion files from
"Phase 1 stand-in" to "optional hand-tuned override", and updates two
"until SPIRV-Cross ships" error messages now that it ships.

docs/crt_shader_phase1_apply.md: move SPIRV-Cross transpiler + libretro
normalizer into the "shipped follow-ups" section and document the
real-world usage path (drop .glsl in ./shaders/, switch via CVar).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
libultraship@30fa5cc distinguishes source vs input video size in
PostProcessParams so per-source-row CRT shaders (Hyllian and friends)
get a proper TextureSize/InputSize ratio. Fixes the concentric-ring
moire on the freshly-added crt-hyllian-glow drop-in.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Captures the user-facing polish queued behind the multi-pass work:
move the shader dir to the LUS user-data location, replace the 3-item
ComboBox with a tree picker that walks the on-disk shader tree, add
per-shader compat=any/native sidecar metadata + Low-Res-Mode warning
modal, and a "Download libretro shader pack" button re-using the
new Updater.cpp HTTP/zip plumbing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
LUS Phase 2 of the CRT shader plan now ships:

- DestroyFramebuffer added to GfxRenderingAPI so PostProcessChain
  can free intermediate FBOs on shader-switch instead of leaking
  full-resolution render targets in VRAM.
- .glslp INI preset parser (Phase 2A).
- Multipass PostProcessChain (Phase 2B): each pass owns a sized
  FBO, intermediates ping through into the chain's final mDstFb.
  Programs are staged before commit so a broken preset doesn't
  black-screen.
- Loader integration (Phase 2C): single `LoadPostProcessShader`
  returns a bundle (N sources + N configs); .glslp takes priority
  over .glsl on name lookup.
- Normalizer fix: strip ALL user in/varying/out declarations so
  libretro shaders that alias TEX0 via `#define Coord TEX0` +
  redeclare `IN vec2 Coord;` don't trigger redefinition errors.
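
The "staged before commit" behaviour in Phase 2B can be sketched as follows. This is an illustration only: `LoadChainStaged`, the `CompileFn` / `DestroyFn` callbacks, and the use of an `int` handle with -1 meaning "compile failed" are assumptions, not the PostProcessChain API.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the backend's compile and destroy calls.
using CompileFn = std::function<int(const std::string&)>;
using DestroyFn = std::function<void(int)>;

// Compiles every pass into a staging list first. The active list is replaced
// only if all passes compile; on any failure the staged programs are
// destroyed and the previously active chain is left untouched.
bool LoadChainStaged(const std::vector<std::string>& passSources,
                     const CompileFn& compile,
                     const DestroyFn& destroy,
                     std::vector<int>& activePrograms) {
    std::vector<int> staged;
    for (const std::string& src : passSources) {
        const int program = compile(src);
        if (program < 0) {
            for (int p : staged) {
                destroy(p); // roll back the partial load
            }
            return false;
        }
        staged.push_back(program);
    }
    for (int p : activePrograms) {
        destroy(p); // commit: retire the old chain
    }
    activePrograms = std::move(staged);
    return true;
}
```

Because the old programs are destroyed only after the last new pass compiles, a preset with one broken pass leaves the previous shader on screen instead of a black frame.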

What works:
- Single-pass .glsl shaders (unchanged from Phase 1).
- Single-pass .glslp wrappers (e.g. crt-aperture.glslp).
- Multi-pass linear chains (Source-to-Source).

Known limitation (Phase 2D follow-up, not yet shipped):
- The `Original` sampler binding isn't plumbed. Halation/glow
  presets whose final pass combines bloom with the original game
  FB (crt-easymode-halation.glslp pass 4, crt-hyllian-glow's
  resolve2.glsl) will compile but render with Original missing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Halation/glow multipass presets now have access to the un-modified
game framebuffer alongside the rotating pass-output Source.

- PostProcessParams + RunPostProcess plumb originalFb + Original-
  Size through every pass.
- All three backends bind Original to the second texture/sampler slot.
- Normalizer + transpiler emit shaders with Source at slot 0 and
  Original at slot 1 (HLSL t0/t1, MSL texture(0)/texture(1)).
- 13/13 postprocess gtests pass.

See libultraship@35aee7c.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Unblocks the linearize/blur_horiz/blur_vert passes of
crt-easymode-halation and any other libretro preset that uses
the COMPAT_VARYING / COMPAT_ATTRIBUTE GLSL-version portability
macros for its varying declarations.

See libultraship@34f83d3.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Multipass-shader load failures rolled back staged intermediate
FBOs via the new DestroyFramebuffer, leaving tombstoned slots in
the mFramebuffers vector. Metal::EndFrame's state-clear loop
walked all slots and dereferenced the nulled mViewport /
mScissorRect pointers — SIGSEGV the next frame.

Fix in libultraship@aa1f19e skips tombstoned slots in EndFrame.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A post-process load failure used to retry every frame because
mPostProcessName was always updated and !IsActive() stayed true.
Now we latch a `failed` flag and only retry when the user changes
name/enabled. Reduces log spam from ~180 lines/sec to one batch
per user change.

libultraship@9cc4259.
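
A minimal sketch of the failure latch described above. The struct and member names are hypothetical; only the behaviour (latch on failure, clear when the name or enabled flag changes) comes from the commit message.

```cpp
#include <cassert>
#include <string>

// Hypothetical per-frame reconciliation state.
struct PostProcessLoadLatch {
    std::string lastName;
    bool lastEnabled = false;
    bool failed = false;

    // Returns true when a load attempt should be made this frame.
    bool ShouldAttemptLoad(const std::string& name, bool enabled, bool isActive) {
        const bool changed = (name != lastName) || (enabled != lastEnabled);
        if (changed) {
            lastName = name;
            lastEnabled = enabled;
            failed = false; // the user changed something, so retry is allowed
        }
        return enabled && !isActive && !failed;
    }

    // Called once when a load attempt fails; suppresses per-frame retries.
    void OnLoadFailed() { failed = true; }
};
```

With the latch set, repeated frames with the same name and enabled state skip the load, which is what turns a per-frame error stream into one batch per user change.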

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Plumbs libretro `.glslp` `filter_linearN` / `wrap_modeN` end-to-end so
multi-pass CRT shaders that need point-sampled masks or repeat-wrapped
tile patterns render correctly. Adds PostProcessWrapMode enum,
backend (filter, wrap) sampler caches on D3D11 / Metal, params-driven
glTexParameteri on OpenGL. Original (slot 1) stays pinned to linear /
clamp-to-edge on every pass — .glslp has no per-pass override for it.

9/9 PostProcessPreset gtests pass (two new wrap_mode cases included).

See libultraship@e9ce408.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 2E left the (filter, wrap)-keyed Metal sampler cache un-released
on backend destruction. Bounded at 8 entries so it didn't grow, but
shutdown leaked. Override ~GfxRenderingAPIMetal to walk the map.

See libultraship@92bc586.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes one of the parsed-but-ignored .glslp fields from the Phase 2B
limitations list. PostProcessChain::Run now modulates pp.frameCount
when a pass declares frame_count_modN > 0, so libretro scanline-crawl
/ bayer-dither shaders that index into a fixed-length pattern cycle
in their intended period instead of drifting toward UINT32_MAX.

See libultraship@eac30c9.
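
The modulation itself is a one-liner. The helper below is a sketch with a hypothetical name (`EffectiveFrameCount`), and it assumes a value of 0 means the pass did not declare `frame_count_modN`.

```cpp
#include <cassert>
#include <cstdint>

// Wraps the frame counter into [0, frameCountMod) when the pass declares a
// modulus; otherwise passes the raw counter through unchanged.
uint32_t EffectiveFrameCount(uint32_t frameCount, uint32_t frameCountMod) {
    return frameCountMod > 0 ? frameCount % frameCountMod : frameCount;
}
```

A pass with a modulus of 4 therefore sees 0, 1, 2, 3, 0, 1, ... regardless of how long the game has been running.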

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PostProcessChain now plumbs per-pass `srgb_framebufferN` /
`float_framebufferN` through to the backend FBO allocation. OpenGL
allocates GL_SRGB8_ALPHA8 / GL_RGBA16F and toggles
GL_FRAMEBUFFER_SRGB during the pass; D3D11 picks the corresponding
DXGI format; Metal carries lazy per-format pipeline variants
because the color-attachment format is pinned into the pipeline
state. Final-pass mDstFb always stays Default-format so ImGui
keeps consuming RGBA8 LDR.

See libultraship@3624b4b.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
libretro `.glslp` `aliasN` now binds an earlier pass's output FBO to
a named sampler slot in every later pass. The chain tracks the alias
list, the normalizer / transpiler reserve a sampler slot per alias
(GLSL binding 3+i, HLSL t2+i, MSL texture(2+i)), and all three
backends bind extra textures+samplers at slots 2..N+1 with per-
producer filter / wrap state.

`<alias>Size` uniforms are NOT injected in this phase; shaders can
use textureSize(<alias>, 0). External-LUT `textures = "..."` is
still deferred to Phase 2I.
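
The per-alias slot numbering quoted above can be written down directly. `AliasSlots` and `SlotsForAlias` are hypothetical names; the offsets (3, 2, 2) are taken from the commit message, which does not explain why the GLSL binding is one higher than the HLSL/MSL slot.

```cpp
#include <cassert>

// Slot assigned to the i-th alias in each target language.
struct AliasSlots {
    unsigned glslBinding;  // layout(binding = N)
    unsigned hlslRegister; // tN
    unsigned mslTexture;   // texture(N)
};

constexpr AliasSlots SlotsForAlias(unsigned aliasIndex) {
    return AliasSlots{3u + aliasIndex, 2u + aliasIndex, 2u + aliasIndex};
}
```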

74/75 PostProcess gtests pass (new BindsAliasSamplersToHigherSlots
covers the t2/texture(2) emit).

See libultraship@8f20c5f.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`.glslp` `textures = "..."` declarations are now parsed, the PNGs
loaded via stb_image, and uploaded to the backend through a new
CreatePostProcessStaticTexture rendering-API virtual. Backends bind
them at the same sampler-slot space as aliases (slots 2..N+M+1).
PostProcessExtraBinding grows staticTextureId; backends prefer it
over sourceFb.

75/76 PostProcess gtests pass (one new ParsesExternalTexturesWithAttributes
case exercises the parser).

See libultraship@f0273ef.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Audit follow-up: replaces a verbatim-reproduced libretro preset in
the parser tests with a synthetic equivalent (the redistribution
posture violation — implementation code was already clean), and
reframes a normalizer comment that named a corpus author to credit
the libretro portability spec conventions generically.

No implementation behaviour change. 75/76 PostProcess gtests still
pass.

See libultraship@4e38970.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
UX plan §1 + §2 (port side). Replaces the hardcoded 3-item combo
(Off / Scanlines / CRT — Lottes) with a WIDGET_CUSTOM ImGui combo
that opens onto:

- "Off" disable affordance.
- "Bundled" group, pulled from Fast::ListBuiltinPostProcessShaders
  rather than hardcoded — adding a builtin to f3d.o2r surfaces it
  here without a port change.
- One node per folder under <user-data>/shaders/, walked via the
  new Fast::ListUserPostProcessShaders LUS API.
- Filter input at the top of the popup.

Empty-state hint shows the per-user shaders path so a fresh install
discovers where to drop files.

Bumps libultraship to dbcfeb8 (user-data path + picker walker).

Also drops the orphan gPostProcessShaderSelect cvar; we now drive
gPostProcessEnabled + gPostProcessShader directly from the picker.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bumps libultraship to a28e48c which closes the remaining Phase 2
items from docs/crt_shader_plan_2026-05-11.md §4 S2:

- <alias>Size / <externalTexture>Size UBO members so libretro
  multipass shaders that filter sampler reads against the named-
  pass size see real values rather than zero.
- mipmap_input per-pass support, plumbed through all three
  backends (intermediate FBOs allocated with mip storage when a
  downstream pass mip-samples them; chain regens the mip chain
  before each mipmap_input pass; samplers switch to LINEAR_MIPMAP_
  LINEAR).
- #pragma parameter declarations parsed out of the user GLSL and
  exposed as live floats on the chain; the port menu now renders
  ImGui sliders per declared parameter below the shader picker.

Port-side this commit adds the RenderPostProcessShaderParameters
WIDGET_CUSTOM beneath the picker — reaches into the active
Fast3dWindow's Interpreter, pulls each pass's parameter descriptors
from the chain, and binds an ImGui::SliderFloat per entry with
descriptor.minValue / maxValue / step. "Reset to Defaults" snaps
every parameter back to its pragma default.
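
For reference, a libretro parameter line has the shape `#pragma parameter NAME "Label" default min max step`. The parser below is a minimal sketch of extracting one such descriptor, not the libultraship implementation: `ShaderParameter` and `ParsePragmaParameter` are hypothetical names, with fields chosen to mirror the minValue / maxValue / step descriptor mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical descriptor for one declared parameter.
struct ShaderParameter {
    std::string name;
    std::string label;
    float defaultValue = 0.0f;
    float minValue = 0.0f;
    float maxValue = 0.0f;
    float step = 0.0f;
};

// Parses a single "#pragma parameter" line; returns nullopt for any other line.
std::optional<ShaderParameter> ParsePragmaParameter(const std::string& line) {
    std::istringstream head(line);
    std::string pragma, keyword;
    ShaderParameter p;
    if (!(head >> pragma >> keyword >> p.name) ||
        pragma != "#pragma" || keyword != "parameter") {
        return std::nullopt;
    }
    // The label is the first double-quoted span on the line.
    const std::size_t open = line.find('"');
    const std::size_t close =
        (open == std::string::npos) ? open : line.find('"', open + 1);
    if (close == std::string::npos) {
        return std::nullopt;
    }
    p.label = line.substr(open + 1, close - open - 1);
    // Numeric fields follow the closing quote.
    std::istringstream numbers(line.substr(close + 1));
    if (!(numbers >> p.defaultValue >> p.minValue >> p.maxValue)) {
        return std::nullopt;
    }
    numbers >> p.step; // treated as optional here; stays 0 when absent
    return p;
}
```

Each parsed descriptor carries everything a slider needs: a range, a step, and the default that "Reset to Defaults" would restore.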

Slider state lives only in the chain — no cvar persistence yet,
so switching shaders or reloading the chain resets to defaults.
Adding cvar-backed persistence is a follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
JRickey and others added 17 commits May 12, 2026 20:49
ParseSlangPreset() lands as a parallel entry point next to the
existing .glslp parser. INI surface is identical; .slangp tags the
flavor and captures numeric unknown-key entries into a
parameterOverrides map for later #pragma parameter matching.

User testing criteria
=====================
No observable game behavior change. The .slangp path is reachable
only through the parser API; PostProcessSourceLoader still recognizes
only .glsl / .glslp, so the existing CRT shader picker is unchanged.
Sanity: load any existing .glslp preset through the menu — should
behave identically to pre-bump.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds ParseSlangSource() — takes a .slang shader text and splits it
into per-stage GLSL buffers plus extracted #pragma metadata
(name / format / parameter declarations). Parser-only; no loader
wiring yet.

User testing criteria
=====================
No observable game behavior change. The new parser is reachable only
through its own API; the source loader still picks .glsl / .glslp.
Sanity: load any existing .glslp preset — should behave identically
to pre-bump.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PostProcessSlangTranspiler::Compile() takes a parsed .slang source
and produces per-backend (GLSL 330 / HLSL SM5 / MSL 2.2) vertex +
fragment text plus UBO + sampler reflection. Glslang-gated; falls
back gracefully when LUS_POSTPROCESS_TRANSPILER is off.

User testing criteria
=====================
No observable game behavior change. The transpiler is reachable only
through its own API; chain integration is Phase 3D. Sanity: any
existing .glslp preset still loads and runs identically.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
GfxRenderingAPI gains CreatePostProcessSlangProgram /
DestroyPostProcessSlangProgram with default `-1` / no-op fallbacks.
OpenGL implements both. PostProcessChain gains LoadSlangPasses() that
compiles a parallel set of slang passes; UnloadShader cleans them up.

Phase 3D-1 leaves the slang Run path untouched: programs compile and
sit dormant. Default-`-1` on D3D11 / Metal means slang loads will
reject cleanly on those backends until 3D-3 lands.

User testing criteria
=====================
No observable game behavior change. The slang path is reachable only
via PostProcessChain::LoadSlangPasses, which nothing in the port
currently calls (Phase 3F wires the source loader). Sanity:

  1. Load any existing .glslp preset — renders identically.
  2. Toggle post-process off / cycle shaders — no regressions, no
     spurious log lines from the new slang path.
  3. Game launches and exits cleanly on OpenGL and (Apple) Metal.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
GfxRenderingAPI gains RunPostProcessSlang (default no-op);
PostProcessChain::Run() dispatches into a slang-specific path when
mSlangPasses is populated. GL backend implements the draw: UBO
memcpy + sampler bind + fullscreen-triangle draw via a slang-flavor
VAO with {vec4 Position, vec2 TexCoord} attributes. UBO is packed
from artifact reflection with standard libretro semantics; user
parameters default to zero.

User testing criteria
=====================
The slang Run path is reachable but unwired — PostProcessSourceLoader
still only knows .glsl / .glslp, so the picker exposes no slang
shaders. Phase 3F wires that. Sanity:

  1. Load any existing .glslp preset — renders identically.
  2. Toggle post-process / cycle shaders — no regressions.
  3. Game launches on OpenGL (and Metal, where slang still rejects
     cleanly) without log spam.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The slang load path is now reachable from the CRT shader picker.
LoadPostProcessSlangShader probes `.slangp` then `.slang` (filesystem
then archive); the interpreter tries slang first and falls back to
the legacy .glsl / .glslp probe. New bundled `slang-scanlines.slang`
acts as the picker canary — distinguishable from legacy `scanlines`
by a subtle blue cast on dark bands.
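
The lookup order described above (slang first, then legacy, presets before single files) can be sketched as a simple probe loop. `ResolveShaderFile` and the injected `exists` callback are hypothetical, and the filesystem-then-archive split is not modelled here.

```cpp
#include <cassert>
#include <functional>
#include <initializer_list>
#include <optional>
#include <string>

// Returns the first candidate file that exists for a shader stem, or nullopt.
// Order: slang preset, slang source, then the legacy preset and source.
std::optional<std::string> ResolveShaderFile(
    const std::string& stem,
    const std::function<bool(const std::string&)>& exists) {
    for (const char* ext : {".slangp", ".slang", ".glslp", ".glsl"}) {
        const std::string candidate = stem + ext;
        if (exists(candidate)) {
            return candidate;
        }
    }
    return std::nullopt;
}
```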

User testing criteria
=====================
1. Launch on OpenGL. Open the CRT shader picker — `slang-scanlines`
   appears under Bundled. Select it; expect scanlines with a faint
   cool-blue cast on the darker bands.
2. Cycle to `scanlines` (legacy) and back — both load cleanly, no
   log spam.
3. On Metal (and D3D11 if built): `slang-scanlines` should fail to
   load (CreatePostProcessSlangProgram returns -1 on those backends
   until Phase 3D-3). One-time error log expected; no crashes.
4. Existing .glsl / .glslp shaders render identically.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Both remaining backends now implement the slang program path:
compile authored VS+FS, allocate a per-program UBO, route a shared
{vec4 Position, vec2 TexCoord} vertex buffer through SPIRV-Cross's
emitted attribute layout. With this commit `slang-scanlines` loads
and renders on macOS Metal and Windows D3D11 in addition to OpenGL.

User testing criteria
=====================
1. macOS Metal: pick `slang-scanlines` in the CRT picker. Expect
   scanlines with the faint blue cast. No fallback log line. Cycle
   shaders + toggle the gate, quit cleanly.
2. Windows D3D11: same flow. The "loaded but backend rejected it;
   falling back to legacy" message should NOT appear for
   slang-scanlines anymore.
3. Existing .glsl / .glslp shaders render identically.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two comment-only tweaks: drop a corpus-derived example path and
rephrase a slang-transpiler comment to cite the public .slang format
rather than reference-implementation behavior. No runtime change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Chain now allocates an `OriginalHistory<N>` capture ring + per-pass
`PassFeedback<N>` ping-pong FBOs at LoadSlangPasses time, captures
the game FB into the ring each frame, and routes the right slot to
each sampler at draw time. New bundled `slang-persistence.slang`
canary exercises `OriginalHistory1`.
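
A capture ring of this kind can be sketched with plain framebuffer ids. The class below is illustrative only: `OriginalHistoryRing` is a hypothetical name, -1 stands for "not captured yet", and it assumes `OriginalHistory<N>` means the game frame from N frames ago, with 0 being the most recent capture.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-depth ring of captured game-framebuffer ids.
// depth must be greater than zero.
class OriginalHistoryRing {
  public:
    explicit OriginalHistoryRing(std::size_t depth) : mSlots(depth, -1) {}

    // Records this frame's capture, overwriting the oldest slot.
    void Capture(int framebufferId) {
        mHead = (mHead + 1) % mSlots.size();
        mSlots[mHead] = framebufferId;
    }

    // Returns the capture from framesAgo frames back (0 = most recent),
    // or -1 if that slot is out of range or has not been filled yet.
    int History(std::size_t framesAgo) const {
        if (framesAgo >= mSlots.size()) {
            return -1;
        }
        const std::size_t index = (mHead + mSlots.size() - framesAgo) % mSlots.size();
        return mSlots[index];
    }

  private:
    std::vector<int> mSlots;
    std::size_t mHead = 0;
};
```

A persistence-style shader would then sample `History(1)` alongside the current frame to blend in the previous image.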

User testing criteria
=====================
1. Pick `slang-persistence` in the CRT picker. Expect a mild motion
   trail behind anything moving on screen.
2. Cycle to other shaders and back — no log spam, no regressions.
3. Existing .glsl / .glslp shaders unaffected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Outer-tree counterpart to the libultraship submodule bump on this
branch:
- port/enhancements/ShaderDownloader.cpp: libretro/glsl-shaders master.zip
  fetcher (curl + unzip/tar fallback), single-pass .glsl filter,
  per-shader .lus.json sidecar emit.
- port/gui/PortMenu.cpp: tree picker + Bundled / user-shader folders,
  Low Resolution Mode warn modal for compat=native shaders, runtime
  diagnostics panel, Reload-active-shader button, downloader button.

Saved on agent/shader-ux-wip before rolling the parent branch back to
da8e6a2 for a clean Phase 2 baseline; reapply selectively when the
slang + normalizer issues are addressed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
LUS submodule bump (53c4b42 -> d2af0d6) carries three normalizer
fixes that drive the libretro single-file `.glsl` corpus accept rate
from 48% to 67% (85 -> 117 of 176 shaders), plus 25 new unit tests
across the post-process subsystems.

ShaderDownloader.cpp now invokes the same Fast::NormalizeUserGlsl +
PostProcessTranspiler::SynthesizeMissing pipeline the picker uses
against each candidate before copying it into
<user-data>/shaders/libretro/. Anything that won't transpile under
the current LUS build is skipped at install time with a counter
bump; the completion status reads "Installed N shaders (skipped M
unsupported by this build)". Self-maintaining — when the normalizer
learns to handle a new shader shape, the next download
automatically picks up the additional files without code changes
in the downloader. Refactored LooksLikeSingleFileShader to take a
buffer rather than a path so the file is read once and reused for
the cheap pre-filter and the authoritative transpile gate.

docs/crt_shader_status_2026-05-14.md: status sheet covering what
shipped since Phase 2I, current test coverage, the bug write-ups,
the install-gate, and the §8 clean-room rules recap.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the one-shot "Download libretro shader pack" button with a
modal-driven flow that lets the user pick which shaders to install
from the validated subset.

Downloader (ShaderDownloader.cpp + enhancements.h):
 - New ShaderPackPhase state machine: Idle -> DownloadingCatalog ->
   ExtractingCatalog -> EnumeratingCatalog -> AwaitingSelection ->
   InstallingSelected -> Done. Error and Cancel return to Idle and
   tear down the tempdir.
 - FetchShaderPackCatalogAsync() runs phases 1-3 on a worker thread
   and publishes the validated candidate list to s_candidates.
 - InstallSelectedShaderPackAsync(stems) copies just the user-picked
   files from the held-over tempdir, writes sidecars, then cleans up.
 - ReferencesUnsupportedHistoryBinding skips libretro shaders that
   need PassPrev<N>Texture / Pass<N>Texture (multipass-history
   bindings the Phase 1 single-pass runtime can't populate). Caught
   crt-easymode-halation and 7 other shaders that compiled but
   sampled empty textures at runtime.
 - Per-phase SPDLOG_INFO so the in-game log can be tailed for
   diagnosis.
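
The phase progression can be sketched as a pure transition function. The phase names come from the list above; `PhaseEvent`, `NextPhase`, and the choice that advancing from Done returns to Idle are assumptions for illustration, and tempdir teardown is not modelled.

```cpp
#include <cassert>

enum class ShaderPackPhase {
    Idle,
    DownloadingCatalog,
    ExtractingCatalog,
    EnumeratingCatalog,
    AwaitingSelection,
    InstallingSelected,
    Done,
};

// Hypothetical event type driving the machine.
enum class PhaseEvent { Advance, Error, Cancel };

// Error and Cancel return to Idle from any phase; Advance steps forward.
ShaderPackPhase NextPhase(ShaderPackPhase current, PhaseEvent event) {
    if (event != PhaseEvent::Advance) {
        return ShaderPackPhase::Idle;
    }
    switch (current) {
        case ShaderPackPhase::Idle:               return ShaderPackPhase::DownloadingCatalog;
        case ShaderPackPhase::DownloadingCatalog: return ShaderPackPhase::ExtractingCatalog;
        case ShaderPackPhase::ExtractingCatalog:  return ShaderPackPhase::EnumeratingCatalog;
        case ShaderPackPhase::EnumeratingCatalog: return ShaderPackPhase::AwaitingSelection;
        case ShaderPackPhase::AwaitingSelection:  return ShaderPackPhase::InstallingSelected;
        case ShaderPackPhase::InstallingSelected: return ShaderPackPhase::Done;
        case ShaderPackPhase::Done:               return ShaderPackPhase::Idle; // assumed
    }
    return ShaderPackPhase::Idle;
}
```

Keeping the transitions in one function makes it easy for the modal to render purely from the current phase.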

Modal (PortMenu.cpp:RenderShaderPackModal):
 - Drawn from PortMenu::DrawElement so it survives the menu collapsing.
 - 6 conceptual states share one window scaffold: Loading (3 sub-
   states with status string + Cancel), AwaitingSelection (checkbox
   list + Select All/None + name filter + "Install N shaders"
   button), Installing (status only, no cancel — copy is fast), Done
   (status + OK), Error (status + Retry/Close).
 - All checkboxes start ticked so the first interaction is to
   de-select rather than to re-tick.
 - Filter is purely visual — checkboxes outside the filter retain
   state.
 - ImGuiCond_Always position/size + ImGuiWindowFlags_NoSavedSettings
   so a stale imgui.ini can't park the modal off-screen at zero size.
 - Unsupported shaders never appear in the list — the install gate
   filters them out before the catalog publishes, so the user sees
   only the runnable subset.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…lter

Driven by user testing against the libretro pack on macOS:

ShaderDownloader.cpp:
 - DeriveLabel uses ASCII " - " instead of "—" (U+2014). The bundled
   ImGui font is Latin-1 only, so non-ASCII separators rendered as `?`
   in the picker (e.g. "Crt ? Hyllian Glow").
 - WritesPartialFragColor filter rejects shaders that write
   `FragColor.rgb = ...` without ever setting `.a`. Backend treats
   unwritten alpha as 0 and blending multiplies the screen to black.
   Caught zfast_crt_composite, the only file in the corpus with this
   one-off authorial bug.
 - Install is now ADDITIVE: dropped the `remove_all(outRoot)` so
   "Get more shaders" accumulates across rounds instead of replacing
   the previous round's picks. Same-stem re-picks still refresh
   bytes + sidecar via `copy_options::overwrite_existing`.
 - WipeTempDir helper closes leaks on every catalog/install error
   path. Tempdir would previously leak (potentially hundreds of MB
   after extraction) when curl/extract/enumeration failed.
 - Per-step SPDLOG_INFO + RunSilent's exit-code logging makes
   missing-tool failures (exit 127 = "command not found") visible
   in ssb64.log instead of just surfacing as "Error: extract
   failed".
 - Doc block enumerates the cross-platform tool requirements
   (curl + unzip-or-tar) and how the per-OS path resolution works.
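
One way a partial-alpha check like WritesPartialFragColor could work is a pair of regex searches. The version below is a heuristic sketch, not the downloader's actual logic (which the commit does not show): it ignores comments, compound assignments, and preprocessor branches.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns true when the source assigns FragColor.rgb (or .xyz) but never
// assigns the whole vector or its alpha component.
bool WritesPartialFragColor(const std::string& source) {
    // "=[^=]" accepts a plain assignment and rejects the "==" comparison.
    static const std::regex kRgbWrite(R"(\bFragColor\.(rgb|xyz)\s*=[^=])");
    static const std::regex kAlphaOrFullWrite(
        R"(\bFragColor(\.(a|w|rgba|xyzw))?\s*=[^=])");
    return std::regex_search(source, kRgbWrite) &&
           !std::regex_search(source, kAlphaOrFullWrite);
}
```

A shader flagged by a check like this would be dropped from the catalog rather than installed, since its output alpha is left unwritten.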

PortMenu.cpp:
 - Replaced raw ImGui::Button/Checkbox/InputText in the modal and
   the "Get more shaders..." button with UIWidgets::Button/
   Checkbox/InputString so the styling matches the rest of the
   menu. Action coloring: LightBlue primary, Green for
   "Install N", Gray for cancel/secondary.
 - Hardcoded "CRT — Lottes" bundled label and "Filter…"
   placeholder switched to ASCII for the same font reason.
 - Fixed sizes on the action-row buttons so the layout doesn't
   reflow as labels change between phases.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Feature has shipped through Phase 3E; planning/status docs are no
longer needed in-tree. History remains in git.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fixes the bundled crt-lottes and scanlines shaders rendering on the
OpenGL backend on Linux. Two issues in libultraship's post-process
runtime: ComposeFinalFrame left mDstFb bound when handing off to
ImGui, and the FlipY uniform was inverting an already-inverted V
axis. See LUS 96ae1833 for the deep dive.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…fixes

Brings in the four D3D11-only post-process fixes so the bundled and
libretro CRT shaders render on Windows/D3D11 (previously black or
backend-rejected; GL/Metal were already fine):

- VS<->PS signature register-order fix (root cause of the black screen)
- unique HLSL sampler registers (libretro multi-sampler shaders no
  longer rejected with X4500)
- full-destination scissor in the D3D11 post-process draw
- MSAA resolve-target allocation matched to its consume condition

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@JRickey JRickey merged commit 4c141ef into main May 17, 2026