Skip to content

Fix UBSan undefined-behavior hits in bgfx_p.h (cherry-pick of bkaradzic/bgfx#3688)#60

Merged
bghgary merged 1 commit intoBabylonJS:update-bgfxfrom
bghgary:fix-sortkey-default-init
Apr 24, 2026
Merged

Fix UBSan undefined-behavior hits in bgfx_p.h (cherry-pick of bkaradzic/bgfx#3688)#60
bghgary merged 1 commit intoBabylonJS:update-bgfxfrom
bghgary:fix-sortkey-default-init

Conversation

@bghgary
Copy link
Copy Markdown

@bghgary bghgary commented Apr 24, 2026

[Created by Copilot on behalf of @bghgary]

Cherry-picks the upstream UBSan fix from bkaradzic/bgfx#3688 (merged commit ac54f48d) onto BabylonJS/bgfx:update-bgfx.

What this fixes

Two distinct UB hits in src/bgfx_p.h that surface only under -fno-sanitize-recover=all UBSan (i.e. when sanitizer findings are made fatal):

  1. SortKey uninitialized bool read. SortKey had no default initializers — initialization happened only via reset(). EncoderImpl constructed an embedded m_key whose m_hasAlphaRef was then read by encodeDraw() before reset() ran, producing 'load of value 190, which is not a valid value for type bool'. Fix: call m_key.reset() as the first thing in EncoderImpl(). Narrower than default-initializing the type because all other SortKey consumers always decode(...) before reading.

  2. BitMaskToIndexIteratorT shift-by-width. bx::countTrailingZeros(0) returns the type width, which was then used as a shift exponent — UB for the zero-mask case. Fix: branchless ternary mask = (0 == _mask) ? 0 : _mask >> ntz; in both the constructor and next(). Behavior preserved (isDone() returns true on the same iterations as before).

Validation

Verified end-to-end via temporary repoint of BN #1676 (MacOS_Sanitizers job runs UBSan with halt_on_error=1):

  • Pre-fix: MacOS_Sanitizers failed with both findings.
  • Post-fix (this commit): full BN pipeline green, including MacOS_Sanitizers (28✅).

Cherry-pick provenance

Single squash commit copied via git cherry-pick -x from upstream merge ac54f48d. Diff is identical to the upstream squash. Standard (cherry picked from commit …) footer included.

Follow-ups (separate PRs)

After this merges:

  1. Bump the bgfx submodule pointer on BabylonJS/bgfx.cmake to the merge commit of this PR.
  2. Bump bgfx.cmake GIT_TAG in BabylonNative CMakeLists.txt to that bgfx.cmake bump SHA (replaces the temporary bghgary/bgfx.cmake validation pin currently on BN External renderbuffers/drawables and external flipping (OpenGL, Metal, Direct3D) bkaradzic/bgfx#1676).

@bghgary bghgary changed the title Default-initialize SortKey members to eliminate UBSan bool hit Fix UBSan undefined-behavior hits in bgfx_p.h Apr 24, 2026
@bghgary bghgary force-pushed the fix-sortkey-default-init branch from 2d37a62 to ee97801 Compare April 24, 2026 19:52
* Default-initialize SortKey members to eliminate UBSan bool hit

SortKey data members had no initializers; the only initialization path
was an explicit reset() call. Reading m_hasAlphaRef (bool) before
reset() ran surfaced under BabylonNative's -fno-sanitize-recover=all
UBSan as 'load of value 190, which is not a valid value for type bool'.
Integer siblings had the same hazard, but UBSan only flags bools.

Add in-class default initializers matching reset()'s values. Zero
runtime cost where reset() immediately follows. No API/ABI change.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Avoid shift-by-width UB in BitMaskToIndexIteratorT

UBSan on macOS flags the shifts at bgfx_p.h:854 and :862. bx::countTrailingZeros returns the type width (e.g. 32 for uint32_t) when its argument is zero, and that value was fed directly into the shift operator, producing shift-by-width UB.

Guard each shift against a zero mask so the width-valued ntz is never used as a shift exponent. Behavior is preserved for every input that previously avoided UB, and callers see isDone() == true in exactly the same iterations as before.

* Address review feedback: tighter BitMask ternary + SortKey reset() in EncoderImpl ctor

Per review on bkaradzic#3688:

- BitMask: use branchless ternary (0 == _mask ? 0 : _mask >> ntz) in place of early-return guard.

- SortKey: drop in-class default initializers; initialize via m_key.reset() in EncoderImpl() ctor, which is the narrower location where the UB read originates.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
(cherry picked from commit ac54f48)
@bghgary bghgary force-pushed the fix-sortkey-default-init branch from ee97801 to b91e4c0 Compare April 24, 2026 21:55
@bghgary bghgary changed the title Fix UBSan undefined-behavior hits in bgfx_p.h Fix UBSan undefined-behavior hits in bgfx_p.h (cherry-pick of bkaradzic/bgfx#3688) Apr 24, 2026
@bghgary bghgary marked this pull request as ready for review April 24, 2026 21:56
@bghgary bghgary merged commit 7e4749e into BabylonJS:update-bgfx Apr 24, 2026
bghgary added a commit to BabylonJS/BabylonNative that referenced this pull request Apr 25, 2026
## Context

Our sanitized macOS/Linux jobs enable
`-fsanitize=address,undefined[,vptr]`, so the UB checks are active — but
**UBSan is recoverable by default**. On a hit it prints a diagnostic to
stderr and keeps running. The process doesn't abort, the test doesn't
fail, CI stays green, and the diagnostic is buried in test output nobody
reads.

Adding `-fno-sanitize-recover=all` is what converts each UBSan hit into
a `SIGABRT` → test failure. We were missing it.

## Change

Three commits:

1. **Make UBSan findings fatal in sanitized builds.** Add
`-fno-sanitize-recover=all` to the Clang sanitizer `add_compile_options`
line in `CMakeLists.txt`, and set
`UBSAN_OPTIONS=halt_on_error=1:print_stacktrace=1:symbolize=1` and
`ASAN_OPTIONS=abort_on_error=1` in `build-macos.yml` / `build-linux.yml`
as runtime belt-and-suspenders. (MSVC only supports ASan, which already
aborts on error.)
2. **Bump glslang to merged PR #5 SHA** — picks up
[BabylonJS/glslang#5](BabylonJS/glslang#5)
(suppress UBSan enum check on SPIR-V mask bit-combining operators).
3. **Bump bgfx.cmake to merged #115 (UBSan fix)** — picks up
[BabylonJS/bgfx.cmake#115](BabylonJS/bgfx.cmake#115),
which bumps the bgfx submodule to the cherry-pick of
[bkaradzic/bgfx#3688](bkaradzic/bgfx#3688)
(default-init the SortKey ints/bools so we don't load undefined values
into bitfields) via
[BabylonJS/bgfx#60](BabylonJS/bgfx#60).

The dependency bumps in commits 2 and 3 fix every UBSan finding
currently being printed (and ignored) by the sanitized jobs on `master`,
so this PR should be green on first run.

## Out of scope

- **metal-cpp's intentional "nil-messaging" pattern.** Apple documents
sending messages to `nil` as well-defined in Obj-C, but C++ UBSan
doesn't know that. Not hit on the current pin; if a future metal-cpp
bump trips it, the standard fix is a scoped `-fno-sanitize=null` on
metal-cpp only.
- `-fsanitize=nullability` (Apple `_Nonnull` attribute checks, separate
from `null`) — worth considering as a follow-up once this lands and the
baseline is clean.
- Windows ASan already aborts on error; no change needed there.

[Created by Copilot on behalf of @bghgary]

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant