
v1.3 — Round v11 alias state cache (huge demo3 wins across G4 + Intel)


@matthewdeaves matthewdeaves released this 11 May 18:25

One fat universal binary, six retro Macs spanning 20 years (1999 G3 → 2019 i5 iMac).

Drop-in install

  1. Download quakespasm-fat-app-v0.97.0.zip
  2. Unzip; you get Quakespasm.app + id1/ (per-machine autoexec configs) + quakespasm.pak
  3. Copy PAK0.PAK and PAK1.PAK from the id1/ folder of your registered Quake install into the id1/ folder next to the app
  4. Double-click Quakespasm.app

The fat binary holds three slices (ppc_750 G3, ppc_7400 G4 + AltiVec, x86_64 Lion+). dyld picks the right one at launch. host.c reads sysctl hw.model and chains the right per-machine autoexec on top of the per-arch baseline, so the same bundle delivers a hand-tuned visual stack on each machine without a per-target build.
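The per-machine selection described above can be sketched roughly as follows. This is a minimal illustration, not the code in host.c: the model identifiers are taken from the bench fleet, but the config file names, the table, and both function names are hypothetical. The sysctlbyname call is the standard macOS way to read hw.model and is compiled only on Apple hosts.

```c
#include <stddef.h>
#include <string.h>
#ifdef __APPLE__
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

/* Hypothetical lookup table: hw.model string -> per-machine config file.
   Model identifiers are from the bench fleet; file names are invented. */
typedef struct {
    const char *model;
    const char *autoexec;
} machine_cfg_t;

static const machine_cfg_t machine_cfgs[] = {
    { "PowerMac1,1",  "autoexec_yosemite.cfg"    },
    { "PowerMac3,1",  "autoexec_sawtooth.cfg"    },
    { "PowerMac3,5",  "autoexec_quicksilver.cfg" },
    { "PowerMac10,1", "autoexec_mini_g4.cfg"     },
    { "Macmini2,1",   "autoexec_mini_intel.cfg"  },
    { "iMac19,1",     "autoexec_imac_2019.cfg"   }
};

/* Returns the per-machine config for a model string, or NULL when the
   machine is unknown and only the per-arch baseline should apply. */
const char *autoexec_for_model(const char *model)
{
    size_t i;
    if (model == NULL)
        return NULL;
    for (i = 0; i < sizeof(machine_cfgs) / sizeof(machine_cfgs[0]); i++) {
        if (strcmp(machine_cfgs[i].model, model) == 0)
            return machine_cfgs[i].autoexec;
    }
    return NULL;
}

/* Reads hw.model into buf. Returns 0 on success, -1 on failure
   (always -1 on non-Apple hosts, where the sysctl does not exist). */
int read_hw_model(char *buf, size_t buflen)
{
#ifdef __APPLE__
    size_t len = buflen;
    if (buflen == 0)
        return -1;
    if (sysctlbyname("hw.model", buf, &len, NULL, 0) != 0)
        return -1;
    buf[buflen - 1] = '\0'; /* defensive termination */
    return 0;
#else
    (void)buf;
    (void)buflen;
    return -1;
#endif
}
```

A caller would execute the per-arch baseline first, then the file returned by autoexec_for_model if it is non-NULL, so that unknown machines still get sensible defaults.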

Bench fleet (Round v11.1, d64427d)

| Machine | OS | GPU | demo1 1024 (fps) | demo3 1024, alias-heavy (fps) |
|---|---|---|---|---|
| yosemite (PowerMac1,1, 1999, G3 449 MHz) | Panther 10.3.9 | Rage 128 16 MB | 16.95 | 19.90 |
| sawtooth (PowerMac3,1, 1999, G4 500 MHz) | Tiger 10.4.11 | GeForce2 MX 32 MB | 40.25 | 46.90 |
| quicksilver (PowerMac3,5, 2001, G4 733 MHz) | Tiger 10.4.11 | Radeon 9000 Pro 64 MB | 62.75 | 84.05 |
| mini-g4 (PowerMac10,1, 2005, G4 1.25 GHz) | Tiger 10.4.11 | Radeon 9200 32 MB | 48.45 | 65.60 |
| mini-intel (Macmini2,1, 2007, C2D 2.33 GHz) | Lion 10.7.5 | GMA 950 64 MB | 72.85 | 44.60 |
| imac-2019 (iMac19,1, 2019, i5-9600K) | Sequoia 15.7.5 | Radeon Pro 580X 8 GB | 1610.95 | 1575.15 |

What's new vs v1.2

Round v11 — per-frame GL state cache in R_DrawAliasModel (b300d7b, d64427d):

R_DrawAliasModel is the hottest state-change site in the engine: about 50 GL calls per alias entity in the default config, most of them defensive resets that re-assert state already in effect. The cache intercepts glTexEnvf(GL_TEXTURE_ENV_MODE, ...), glDepthMask, and glEnable/Disable(GL_BLEND), and turns any call whose value matches the cached one into a no-op. TexEnvMode is tracked per active TMU, so multitexture paths cache correctly. The cache is reset once per frame from R_RenderScene.
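As an illustration of the pattern, here is a minimal sketch of a redundant-state filter with a per-frame reset. It is not the engine's code: every name is hypothetical, MAX_TMUS is an assumed limit, and the raw_* functions are counting stand-ins for the real GL entry points so the sketch compiles without OpenGL headers.

```c
#define MAX_TMUS 4        /* assumed number of texture units */
#define STATE_UNKNOWN (-1)

/* Hypothetical cache layout: one slot per tracked piece of GL state. */
typedef struct {
    int depth_mask;
    int blend;
    int texenv_mode[MAX_TMUS]; /* tracked separately for each TMU */
} alias_state_cache_t;

static alias_state_cache_t cache;
static int active_tmu = 0;
static int raw_calls = 0;      /* calls that actually reach the "driver" */

/* Stand-ins for glDepthMask, glEnable/glDisable(GL_BLEND), glTexEnvf. */
static void raw_depth_mask(int flag)  { (void)flag; raw_calls++; }
static void raw_set_blend(int on)     { (void)on;   raw_calls++; }
static void raw_texenv_mode(int mode) { (void)mode; raw_calls++; }

/* Called once per frame: forget everything, so the first call of each
   kind in the new frame always reaches the driver. */
void cache_frame_reset(void)
{
    int i;
    cache.depth_mask = STATE_UNKNOWN;
    cache.blend = STATE_UNKNOWN;
    for (i = 0; i < MAX_TMUS; i++)
        cache.texenv_mode[i] = STATE_UNKNOWN;
}

void select_tmu(int tmu)
{
    if (tmu >= 0 && tmu < MAX_TMUS)
        active_tmu = tmu;
}

void cached_depth_mask(int flag)
{
    if (cache.depth_mask == flag)
        return;                 /* redundant: skip the driver call */
    cache.depth_mask = flag;
    raw_depth_mask(flag);
}

void cached_set_blend(int on)
{
    if (cache.blend == on)
        return;
    cache.blend = on;
    raw_set_blend(on);
}

void cached_texenv_mode(int mode)
{
    if (cache.texenv_mode[active_tmu] == mode)
        return;
    cache.texenv_mode[active_tmu] = mode;
    raw_texenv_mode(mode);
}

int raw_call_count(void) { return raw_calls; }
```

Resetting to an "unknown" value each frame keeps the cache honest even if code outside the alias path changes GL state between frames, at the cost of one real call per state per frame.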

Same-session bench vs v8 wrap baseline (84d3597), demo3 1024×768:

  • sawtooth +31.2 % (35.75 → 46.90)
  • quicksilver +37.7 % (61.05 → 84.05)
  • mini-g4 +44.8 % (45.30 → 65.60)
  • mini-intel +21.0 % (36.85 → 44.60)
  • imac-2019 +19.8 % (1314.90 → 1575.15)
  • yosemite −3.9 % (within noise band; cache compiled out via QS_DISABLE_ALIAS_STATE_CACHE)
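The percentages in the list above are relative gains over the baseline frame rate. A small helper, with hypothetical names, shows the arithmetic and can be used to recompute any row from its before and after values:

```c
#include <stdio.h>

/* Relative change from a baseline frame rate, in percent. */
double percent_gain(double before_fps, double after_fps)
{
    return (after_fps - before_fps) / before_fps * 100.0;
}

/* Prints one bench row in roughly the style used in the list above. */
void print_gain(const char *machine, double before_fps, double after_fps)
{
    printf("%-12s %+.1f %% (%.2f -> %.2f)\n",
           machine, percent_gain(before_fps, after_fps),
           before_fps, after_fps);
}
```

For example, print_gain("sawtooth", 35.75, 46.90) reports the sawtooth row from its two measured frame rates.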

v11.1 tightening (d64427d): always_inline, struct-coalesced cache state, and __builtin_expect hints. Together these recovered most of the v11.0 demo1/demo2 regression on cached machines while preserving the demo3 wins. On the iMac, the demo3 gain rose by a further 8.8 percentage points from the inlining alone (+11.0 % → +19.8 %).

G3 ppc750 slice excludes the cache entirely. Compile-time QS_DISABLE_ALIAS_STATE_CACHE macro (defined when __ppc__ && !__VEC__ && !__ALTIVEC__) collapses every cache helper to its raw glXxx call. The cache code, cvar, and R_AliasStateCache_FrameReset body are all stripped from the G3 ppc750 slice of the fat binary — nm on build/quakespasm-g3 confirms zero cache symbols. Per-machine gating in action: Rage 128's driver hates the cache pattern (small negative on yosemite), so the slice that doesn't benefit ships without it.
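One way such a compile-time collapse could be arranged is sketched below. The macro name and the predicate come from the text above; the exact #if form, the AliasState_* helper names, and the counting driver_depth_mask stand-in for glDepthMask are assumptions made for illustration.

```c
/* Assumed form of the gate: PowerPC without AltiVec means the G3 slice. */
#if defined(__ppc__) && !defined(__VEC__) && !defined(__ALTIVEC__)
#define QS_DISABLE_ALIAS_STATE_CACHE 1
#endif

static int driver_calls = 0;

/* Stand-in for the raw glDepthMask entry point. */
static void driver_depth_mask(int flag) { (void)flag; driver_calls++; }

#ifdef QS_DISABLE_ALIAS_STATE_CACHE
/* Cache compiled out: helpers collapse to the raw call and the reset
   becomes a no-op, so no cache symbols end up in the binary. */
#define AliasState_DepthMask(flag) driver_depth_mask(flag)
#define AliasState_FrameReset()    ((void)0)
#else
static int cached_depth_mask_value = -1; /* -1 = unknown */

static void AliasState_DepthMask(int flag)
{
    if (cached_depth_mask_value == flag)
        return;                          /* redundant call filtered out */
    cached_depth_mask_value = flag;
    driver_depth_mask(flag);
}

static void AliasState_FrameReset(void)
{
    cached_depth_mask_value = -1;
}
#endif

/* Issues the same state twice and reports how many calls reached the
   driver: 1 with the cache compiled in, 2 with it compiled out. */
int depth_mask_calls_for_two_identical_sets(void)
{
    driver_calls = 0;
    AliasState_FrameReset();
    AliasState_DepthMask(1);
    AliasState_DepthMask(1);
    return driver_calls;
}
```

Because call sites use the same helper names in both configurations, the rendering code itself needs no #ifdef blocks; only the helper definitions differ per slice.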

Toggleability: gl_aliasstate_cache 1 (CVAR_ARCHIVE, default on for G4 / Lion / iMac) lets you A/B the contribution at runtime — defensively re-asserted in each per-machine autoexec where the cache is wired in.

Icon refresh: new Q1 cyborg / Strogg + Q1 logo composition (hand-cleaned in Photoshop). The legacy-only ICNS pipeline from v1.2 still applies — Panther / Tiger Finder display the icon correctly without needing per-OS swaps.

Tooling: scripts/build-icon.py + scripts/rebuild-icon.sh consolidated into a single scripts/make-icon.py with --keep-bg for hand-cleaned PNGs (the canonical workflow).

Docs: new mini-intel multi-tenancy section documents the cross-build host's shared use with the Q2 sister project. Both projects use isolated rsync upload dirs (mini-intel:quakespasm/ vs mini-intel:quake2/) and workstation-local build artifacts, so concurrent compiles are safe modulo CPU contention on the 2-core Core 2 Duo.

Hardware reach

| Era | First model | Last model | Span |
|---|---|---|---|
| PowerPC | PowerMac1,1 (1999) | PowerMac10,1 (2005) | 6 years |
| Intel | Macmini2,1 (2007) | iMac19,1 (2019) | 12 years |

Six bench machines, 20 years of Apple hardware, one fat universal binary, one source tree.

Known constraints

  • G3 yosemite ships at 800×600 default; 1024×768 is also playable on demo3 (19.9 fps) but interactive play feels sharper at 800×600. Override with vid_width N; vid_height M; vid_restart in console.
  • Sawtooth (GeForce2 MX) is the only G4 without a fragment-shader water path; it uses the classic-warp r_oldwater 1 fallback, like the G3.
  • The G4 trio + mini-intel show a persistent −2 to −4 % on demo1 / demo2 1024 vs v8 wrap (brush-heavy, low cache hit rate). The demo3 wins dominate in absolute fps.

Source

🤖 Built with Claude Code