feat(abi): plug-in ABI major v2 — struct_size on DP vtables + loader reject + #342 re-scan (ADR-020 / v1.6.0)#351
Merged
…reject + #342 re-scan (ADR-020)

Ships ADR-020 rules 1–3 as the coordinated runtime v1.6.0 major-version bump, preventing the class of bug that broke standalone-VK weaving (leia plugin pinned runtime headers v1.4.1 while runtime was v1.5.2 → DP-vtable offset skew → runtime's set_chroma_key call hit the plug-in's destroy).

ABI v2 (struct_size + append-only + loader enforcement):

* xrt_display_processor.h: add `uint32_t struct_size; uint32_t reserved_0;` as the 8-byte first-field header. Rewrites the #348 tripwire asserts to anchor at XRT_DP_BASE_OFF = offsetof(.., process_atlas) (asserted == 8 on both 64-bit and 32-bit/Android) plus i*sizeof(void*). Adds XRT_DP_HAS_SLOT(xdp, field): bounds the field against the plug-in's reported struct_size, so appending a new method at the END of the vtable is now backward- AND forward-compatible within a major. Gates all 12 optional inline wrappers on HAS_SLOT in addition to the per-pointer NULL check; mandatory process_atlas (slot 0) + destroy (last) stay un-gated.
* xrt_display_processor_{d3d11,d3d12,gl,metal}.h: same 8-byte header + a per-API tripwire block (XRT_DP_{D3D11,D3D12,GL,METAL}_BASE_OFF) + HAS_SLOT gating on every optional wrapper. d3d12 has the extra set_output_format slot (index 1) the other APIs omit. The XRT_DP_ABI_ASSERT / XRT_DP_ABI_MSG / XRT_DP_HAS_SLOT macros are #ifndef-guarded in both the base and per-API headers so any include order is safe (no MSVC C4005 redefinitions).
* xrt_plugin.h: XRT_PLUGIN_API_VERSION_2 = 2; XRT_PLUGIN_API_VERSION_CURRENT points at it. v1 → v2 is the one-time break introducing the struct_size header on the DP vtables and turning the loader's version log into an enforced reject. ABI-v1 plug-ins (≤ leia v1.0.5) are rejected and must rebuild against v2 headers.
* target_plugin_loader.c: rule 3 enforcement: after a successful xrtPluginNegotiate but BEFORE iface->probe, reject any plug-in whose reported plugin_api_version != XRT_PLUGIN_API_VERSION_CURRENT in all three try_load_one variants (Windows registry, Android, POSIX/JSON). The DLL is unloaded and the loader falls back to the next plug-in (sim_display).
* sim_display_processor{,_d3d11,_d3d12,_gl,_metal}.{c,cpp,m}: every factory sets base.struct_size = sizeof(struct xrt_display_processor[_<api>]) before assigning the vtable. calloc already zeroes reserved_0.
* oxr_plugin_stub.c: static_assert pin moved from XRT_PLUGIN_API_VERSION_CURRENT == _1 to == _2.

Folded #342, durable DP re-scan at per-client compositor create:

The bundle finalize service-restart (shipped in Track A / v0.5.0) covers the fresh-bundle-install ordering, but a service started mid-install outside the bundle path (or a standalone service) still bakes the already-discovered factories into xrt_system_compositor_info forever. This change picks up a vendor plug-in registered AFTER the service started on the next app launch, without requiring a service restart.

Mechanism (callback-on-syscomp-info, layering-clean; confirmed with user):

* xrt_compositor.h: add `void (*refresh_display_processors)(...)` function-pointer field to xrt_system_compositor_info.
* target_plugin_loader.{c,h}: new public target_plugin_refresh_active(). Tracks the winning ProbeOrder of the active plug-in in a new static g_active_probe_order. discover_active_plugin now takes a uint32_t max_probe_order filter: the first-call path passes UINT32_MAX, the refresh path passes the active ProbeOrder so it only attempts STRICTLY-better candidates (never re-probes the active plug-in). Mutex-guarded via os_mutex (not C11 atomics; MinGW caveat per CLAUDE.md). The previous DLL is intentionally leaked on swap, consistent with the existing load-path leak.
* target_instance.c: factor xsysc->info.dp_factory_* assignment block into fill_dp_factories_from_plugin() and install refresh_display_processors_cb (calls target_plugin_refresh_active + re-derives factories) on xsysc->info.refresh_display_processors. The in-process / handle path leaves the field NULL; fresh-instance-per-launch already covers it.
* comp_multi_compositor.c (VK service path) and comp_d3d11_service.cpp (D3D11 service path) invoke the callback at the top of multi_compositor_create / system_create_native_compositor, i.e. once per IPC client at compositor create, before any DP-factory read.

Cleanups bundled with the v1.6.0 bump:

* comp_vk_native_compositor.c: the dp_vtable_looks_sane degrade-log now names "a plug-in built against a different runtime ABI major (ADR-020)" alongside the heap-collision possibility (the misleading old wording was flagged in project_plugin_abi_policy_and_release).
* comp_d3d11_window.cpp (#345, dev-only): the dev-manifest auto-resolution at WM_WORKSPACE_LAUNCH_APP is now gated on `getenv("DISPLAYXR_DEV") != NULL && getenv("XR_RUNTIME_JSON") == NULL`, so installs (with no build/Release/ sibling) no longer force a stale XR_RUNTIME_JSON onto child apps.

Docs:

* docs/adr/ADR-020: status Proposed → Accepted; Status/rollout section reflects rules 1–3 done in v1.6.0, the per-API DP structs that gained struct_size, and leia v1.0.6's rule-5 pin self-check.
* docs/specs/runtime/plugin-discovery.md: §6 rewritten for v2: describes XRT_DP_HAS_SLOT, the strict major-match reject (rule 3), and the XRT_PLUGIN_API_VERSION_3 reservation for the next break.
* CMakeLists.txt: VERSION 1.5.2 → 1.6.0.

Validation:

scripts\build_windows.bat build → exit 0; all _Static_asserts (base + 4 per-API blocks) hold; DisplayXRClient.dll + displayxr-service.exe + DisplayXR-SimDisplay.dll all relink. Pre-existing C4090 + cmake_install \d warning unchanged.

Hardware testing requires the matching leia v1.0.6 (Track B2). With v1.6.0 + the currently-shipped leia v1.0.5 (ABI v1), the loader will log "ABI major mismatch — plugin_api=1, runtime expects 2 ... skipping (ADR-020 rule 3)" and fall back to sim_display SBS (intentional).

Memory: project_plugin_abi_policy_and_release.
Plan: ~/.claude/plans/task-ship-the-displayxr-ticklish-quokka.md (Track B).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
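The struct_size gating described in the commit message can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in: `struct demo_dp`, `DEMO_DP_HAS_SLOT`, and the `set_extra` slot are invented for illustration, and the macro body is only one plausible way to bound a field against a reported size. The real XRT_DP_HAS_SLOT definition is not shown in this PR text and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical vtable: an 8-byte header followed by function-pointer slots. */
struct demo_dp
{
	uint32_t struct_size; /* plug-in reports sizeof(its own struct) */
	uint32_t reserved_0;  /* zero; pads the header to 8 bytes */
	int (*process_atlas)(struct demo_dp *dp);                /* mandatory, slot 0 */
	int (*set_chroma_key)(struct demo_dp *dp, uint32_t key); /* optional */
	int (*set_extra)(struct demo_dp *dp); /* optional, appended at the end later */
};

/* Tripwire in the spirit of the BASE_OFF assert: first slot sits at offset 8. */
_Static_assert(offsetof(struct demo_dp, process_atlas) == 8, "DP header must be 8 bytes");

/* Assumed shape: the slot exists only if it lies fully inside the reported size. */
#define DEMO_DP_HAS_SLOT(xdp, field) \
	((xdp)->struct_size >= offsetof(struct demo_dp, field) + sizeof((xdp)->field))

/* Optional wrapper: gate on the slot bound first, then on the NULL check. */
static inline int
demo_dp_set_extra(struct demo_dp *dp)
{
	if (!DEMO_DP_HAS_SLOT(dp, set_extra) || dp->set_extra == NULL) {
		return -1; /* slot absent in this plug-in, or not implemented */
	}
	return dp->set_extra(dp);
}

static int
demo_extra_impl(struct demo_dp *dp)
{
	(void)dp;
	return 42;
}

/* Older plug-in: reports a size that ends before set_extra was appended. */
static int
demo_call_old_plugin(void)
{
	struct demo_dp dp = {0};
	dp.struct_size = (uint32_t)offsetof(struct demo_dp, set_extra);
	dp.set_extra = demo_extra_impl; /* bytes past the reported size must be ignored */
	return demo_dp_set_extra(&dp);
}

/* Current plug-in: reports the full struct, so the appended slot is callable. */
static int
demo_call_new_plugin(void)
{
	struct demo_dp dp = {0};
	dp.struct_size = (uint32_t)sizeof(dp);
	dp.set_extra = demo_extra_impl;
	return demo_dp_set_extra(&dp);
}
```

In this sketch the older plug-in's call degrades to the "not supported" path instead of jumping through whatever lies past its smaller struct, which is the property that makes end-of-vtable appends safe within a major.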
This was referenced May 28, 2026
Track B1: runtime v1.6.0, plug-in ABI major v2

Ships ADR-020 rules 1–3 as the coordinated runtime major bump that prevents the class of bug that broke standalone-VK weaving (leia v1.4.1 headers vs runtime v1.5.2 → DP vtable offset skew → runtime's `set_chroma_key` call hit the plug-in's `destroy`).

Track A (bundle v0.5.0 with the leia v1.0.5 pin fix + #346 version gate + #342 finalize restart) shipped 2026-05-27 and was validated end-to-end on the Leia box (fresh install + upgrade, no reboot). With Track A out, this PR is unblocked.

ABI v2: `struct_size` + append-only + loader enforcement

* `xrt_display_processor.h`: 8-byte `struct_size` / `reserved_0` header at the top of the vtable. The #348 (guard(abi): compile-time tripwire on the plug-in DP-vtable ABI + ADR-020) asserts are rewritten to anchor at `XRT_DP_BASE_OFF` (= `offsetof(.., process_atlas)`, asserted `== 8` on both 64- and 32-bit) plus `i*sizeof(void*)`. New `XRT_DP_HAS_SLOT(xdp, field)` macro bounds the field against the plug-in's reported `struct_size`, so appending a new method at the END of the vtable is now backward AND forward compatible within a major. All 12 optional inline wrappers gate on `HAS_SLOT` in addition to the NULL check; mandatory `process_atlas` (slot 0) + `destroy` (last) stay un-gated.
* `xrt_display_processor_{d3d11,d3d12,gl,metal}.h`: same 8-byte header + a per-API tripwire block (`XRT_DP_{D3D11,D3D12,GL,METAL}_BASE_OFF`) + `HAS_SLOT` gating on every optional wrapper. `d3d12` has the extra `set_output_format` slot the other APIs omit. The `XRT_DP_ABI_ASSERT` / `XRT_DP_ABI_MSG` / `XRT_DP_HAS_SLOT` macros are `#ifndef`-guarded in both base and per-API headers so any include order is safe (no MSVC C4005).
* `xrt_plugin.h`: `XRT_PLUGIN_API_VERSION_2 = 2`; `XRT_PLUGIN_API_VERSION_CURRENT` now points at it. The v1 → v2 break is the one-time introduction of `struct_size` on the DP vtables + the flip from loader-log to loader-reject. ABI-v1 plug-ins (≤ `leia v1.0.5`) are rejected.
* `target_plugin_loader.c`: rule 3 enforcement: after a successful `xrtPluginNegotiate` but before `iface->probe`, reject any plug-in whose reported `plugin_api_version != XRT_PLUGIN_API_VERSION_CURRENT` in all three `try_load_one` variants (Windows registry, Android, POSIX/JSON). DLL unloaded; loader falls back to the next plug-in / `sim_display`.
* `sim_display_processor{,_d3d11,_d3d12,_gl,_metal}.{c,cpp,m}`: every factory sets `base.struct_size = sizeof(struct xrt_display_processor[_<api>])` before assigning the vtable.
* `oxr_plugin_stub.c`: `_Static_assert` pin moved from `XRT_PLUGIN_API_VERSION_CURRENT == _1` to `== _2`.

Folded #342: durable DP re-scan at per-client compositor create
Track A's bundle finalize service-restart deterministically covers the fresh-bundle-install ordering, but a service started mid-install outside the bundle path (or a standalone service) still bakes the already-discovered factories into `xrt_system_compositor_info` forever. This change picks up a vendor plug-in registered AFTER the service started on the next app launch, without requiring a service restart.

Layering-clean mechanism (callback-on-syscomp-info, confirmed with @dfattal):

* `xrt_compositor.h`: add `void (*refresh_display_processors)(...)` fn-ptr to `xrt_system_compositor_info`.
* `target_plugin_loader.{c,h}`: new public `target_plugin_refresh_active()`. Tracks the winning ProbeOrder in a new static `g_active_probe_order`; `discover_active_plugin` now takes a `uint32_t max_probe_order` filter: the first-call path passes `UINT32_MAX`, the refresh path passes the active ProbeOrder so it only attempts strictly-better candidates (never re-probes the active plug-in). `os_mutex` guarded (not C11 atomics; MinGW caveat). Previous DLL intentionally leaked on swap, consistent with the existing load-path leak.
* `target_instance.c`: factor the `xsysc->info.dp_factory_*` assignment block into `fill_dp_factories_from_plugin()`; install `refresh_display_processors_cb` (calls `target_plugin_refresh_active` + re-derives factories) on `xsysc->info`. In-process / handle path leaves the field NULL; fresh-instance-per-launch already covers it.
* `comp_multi_compositor.c` (VK service) and `comp_d3d11_service.cpp` (D3D11 service) invoke the callback at the top of `multi_compositor_create` / `system_create_native_compositor`: once per IPC client, before any DP-factory read.

Cleanups bundled with the bump
* `comp_vk_native_compositor.c`: the `dp_vtable_looks_sane` degrade-log now names "a plug-in built against a different runtime ABI major (ADR-020)" alongside the heap-collision possibility.
* `comp_d3d11_window.cpp` (#345, "Dev-manifest auto-resolution forces stale XR_RUNTIME_JSON onto child apps on dev boxes", dev-only): dev-manifest auto-resolution at `WM_WORKSPACE_LAUNCH_APP` is now gated on `getenv("DISPLAYXR_DEV") != NULL && getenv("XR_RUNTIME_JSON") == NULL`. End-user installs (no `build/Release/` sibling) no longer have a stale `XR_RUNTIME_JSON` forced onto child apps.
* `docs/adr/ADR-020`: status `Proposed → Accepted`; rules 1–3 marked Done (v1.6.0, ABI major 2); per-API DP structs called out.
* `docs/specs/runtime/plugin-discovery.md`: §6 rewritten for v2.
* `CMakeLists.txt`: `VERSION 1.5.2 → 1.6.0`.

Validation
* `scripts\build_windows.bat build` → exit 0.
* `_Static_assert`s hold (base + 4 per-API blocks) on MSVC x64.
* `DisplayXRClient.dll` (1.81 MB), `displayxr-service.exe` (1.62 MB), `DisplayXR-SimDisplay.dll` (60 KB) all relink.
* Pre-existing warnings unchanged (`oxr_session.c:2867` C4090 + the unrelated `cmake_install.cmake:101` `\d` escape).

Sequencing: what ships next
* `/release v1.6.0` cuts the tag.
* B2 (`displayxr-leia-plugin`): `struct_size` in 4 leia factories, re-pin `DXR_RUNTIME_GIT_TAG "v1.5.2" → "v1.6.0"`, workflow `ref: 'v1.5.2' → 'v1.6.0'`, rule-5 CI pin self-check, `VERSION 1.0.5 → 1.0.6`. Mechanical (~10 min) once the v1.6.0 tag exists.
* `versions.json` re-pin to runtime v1.6.0 + leia v1.0.6.

Expected behavior post-merge (before B2 ships)
On the Leia box with the currently-shipped `leia v1.0.5` (ABI v1), the new loader will log "ABI major mismatch — plugin_api=1, runtime expects 2 ... skipping (ADR-020 rule 3)" and fall back to `sim_display` SBS. That's the intentional Track-B cliff: `leia v1.0.6` (B2) restores the weave.

Plan: `~/.claude/plans/task-ship-the-displayxr-ticklish-quokka.md` (Track B).

Memory: `project_plugin_abi_policy_and_release`.

🤖 Generated with Claude Code
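The reject-and-fall-back behavior above, together with the `max_probe_order` filter used by the refresh path, can be sketched roughly as follows. This is an illustration under stated assumptions, not the loader's code: `struct demo_candidate`, `demo_discover`, and the candidate table are hypothetical, the list is assumed to be pre-sorted by probe order, and a lower ProbeOrder is assumed to mean "better". The real `discover_active_plugin` loads DLLs and negotiates with them, which this sketch skips.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define DEMO_PLUGIN_API_VERSION_CURRENT 2u /* stand-in for the runtime's ABI major */

/* Hypothetical record of what a plug-in reported during negotiation. */
struct demo_candidate
{
	const char *name;
	uint32_t probe_order;        /* assumption: lower value wins */
	uint32_t plugin_api_version; /* ABI major the plug-in was built against */
};

/* Hypothetical candidate table, sorted by probe order. */
static const struct demo_candidate demo_cands[] = {
    {"vendor_abi_v1", 10, 1}, /* built against ABI v1: must be skipped */
    {"vendor_abi_v2", 20, 2},
    {"sim_display", 100, 2},  /* built-in fallback */
};

/*
 * Returns the index of the first acceptable candidate, or -1 if none.
 * First-call path: pass UINT32_MAX. Refresh path: pass the active plug-in's
 * probe order so only strictly better candidates are attempted.
 */
static int
demo_discover(const struct demo_candidate *cands, size_t count, uint32_t max_probe_order)
{
	for (size_t i = 0; i < count; i++) {
		if (cands[i].probe_order >= max_probe_order) {
			continue; /* not strictly better than the active plug-in */
		}
		if (cands[i].plugin_api_version != DEMO_PLUGIN_API_VERSION_CURRENT) {
			/* Strict major match: skip before any probe would run. */
			fprintf(stderr, "%s: plugin_api=%u, runtime expects %u, skipping\n", cands[i].name,
			        (unsigned)cands[i].plugin_api_version, (unsigned)DEMO_PLUGIN_API_VERSION_CURRENT);
			continue;
		}
		return (int)i;
	}
	return -1;
}
```

With the table above, a first-call scan (`UINT32_MAX`) skips the ABI-v1 entry and lands on the next acceptable plug-in, while a refresh scan bounded by the active plug-in's own probe order finds nothing better and leaves the active plug-in in place.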