
Fix: enable EBO for simple_ilist on MSVC via LLVM_DECLARE_EMPTY_BASES #2012

Open
YevheniiKotyrlo wants to merge 3 commits into facebook:static_h from YevheniiKotyrlo:fix/msvc-simple-ilist-empty-bases


@YevheniiKotyrlo

YevheniiKotyrlo commented May 8, 2026

Summary

Add a portable LLVM_DECLARE_EMPTY_BASES macro to llvh's Compiler.h and apply it to simple_ilist so MSVC actually performs Empty Base Optimization across the class's two empty bases (list_base_type and SpecificNodeAccess). Without this attribute, MSVC pads each empty base with one byte and pointer-aligns the result, shifting the embedded Sentinel member from offset 0 to offset 8 of the simple_ilist.

The macro mirrors the existing HERMES_EMPTY_BASES one architectural layer up at include/hermes/Support/Compiler.h. llvh cannot depend on Hermes, so the same idiom is needed in both layers — that's the whole reason this surfaced inside llvh after Meta had already worked around it for PointerBase under HERMESVM_CONTIGUOUS_HEAP.

This change only affects class layout on MSVC. On every other compiler the macro expands to nothing, so codegen, ABI, and behavior are unchanged; on MSVC it produces the same collapsed layout (Sentinel at offset 0) that GCC and Clang already produce.
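As a rough illustration of the layout this PR targets, here is a minimal standalone sketch. DEMO_EMPTY_BASES, ListBase, NodeAccess, Sentinel, and TaggedList are hypothetical stand-ins, not the llvh definitions, and the real macro body in Compiler.h may differ. The sketch assumes simple_ilist's shape reduces to two empty bases plus one embedded sentinel.

```cpp
#include <cstddef>

// DEMO_EMPTY_BASES is a hypothetical stand-in for LLVM_DECLARE_EMPTY_BASES.
#if defined(_MSC_VER)
#define DEMO_EMPTY_BASES __declspec(empty_bases)
#else
#define DEMO_EMPTY_BASES
#endif

struct ListBase {};   // stands in for list_base_type (no data members)
struct NodeAccess {}; // stands in for SpecificNodeAccess (no data members)

// Two links, like the sentinel embedded in an intrusive list.
struct Sentinel {
  Sentinel *Prev = nullptr;
  Sentinel *Next = nullptr;
};

// Same shape as the class being patched: two empty bases, one sentinel member.
struct DEMO_EMPTY_BASES TaggedList : ListBase, NodeAccess {
  Sentinel S;
};

// Byte offset of the sentinel inside the list object. TaggedList is a
// standard-layout type, so offsetof is well defined here.
constexpr std::size_t kSentinelOffset = offsetof(TaggedList, S);

// With the empty bases collapsed, the list is exactly as large as its sentinel.
static_assert(sizeof(TaggedList) == sizeof(Sentinel),
              "empty bases should not add any size");
```

On GCC and Clang the offset is 0 with or without the attribute. Per the description above, MSVC needs the attribute to reach the same result.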

Diff shape

| File | Change |
| --- | --- |
| external/llvh/include/llvh/Support/Compiler.h | +12 lines: defines LLVM_DECLARE_EMPTY_BASES next to the other class-decoration macros (between LLVM_LIBRARY_VISIBILITY and LLVM_PREFETCH). Resolves to __declspec(empty_bases) on MSVC, empty elsewhere. |
| external/llvh/include/llvh/ADT/simple_ilist.h | -1/+8 lines on the class declaration: applies the macro and tightens the explanatory doc comment. +17 lines for a static_assert invariant in namespace ilist_detail that pins sizeof(simple_ilist<T>) == sizeof(ilist_sentinel<OptionsT>), locking the EBO collapse as a compile-time guarantee (catches macro removal, MSVC ABI changes, build-flag drift). |
| external/llvh/patches/simple-ilist-msvc-empty-bases.patch | New file: records the divergence per the directory's documented convention. Two prior simple_ilist tweaks already follow this pattern: ilist-erase-between.patch and ilist-nondefault-destructor.patch. |

When the shift is observable

Through ordinary C++ list usage the shift is invisible: iterators hand out node pointers, and list.end() already accounts for the sentinel offset. It becomes load-bearing only for code that takes the address of a simple_ilist and treats it as the address of the embedded sentinel:

  • FFI consumers that wrap simple_ilist from another language and walk the linked list manually.
  • C++ code that compares an iterator's stored node pointer against &list rather than against an iterator obtained from list.end().

A real-world manifestation: jbroma/fast-flow-transform's Rust FFI (crates/hermes/src/parser/node.rs) takes a *const NodeList from hermes_get_FunctionDeclaration_params(...) and iterates by reading next pointers, comparing each against the head pointer it was given. On Linux/Clang and macOS the comparison succeeds — Sentinel is at offset 0, so &list == &list.Sentinel — and iteration terminates. On Windows MSVC, the C++ side stores &list.Sentinel (offset 8) in the linked-list pointers while Rust received &list (offset 0), so the comparison never matches; iteration walks past the end of the list, dereferences the sentinel as if it were a real Node, and trips a STATUS_ACCESS_VIOLATION 0xC0000005 on the first field load.
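The failed comparison can be modelled in plain C++ without any FFI. The sketch below is a simplified model, not the fast-flow-transform code: ShiftedList fakes the unpatched MSVC layout with an explicit pointer-sized Pad member, and pushBack, walk, and walkTwoNodeList are hypothetical helpers. The step budget exists only so the mismatch shows up as a return value instead of an unbounded walk.

```cpp
#include <cassert>

// Simplified node base: two links, as in an intrusive doubly linked list.
struct NodeBase {
  NodeBase *Prev = nullptr;
  NodeBase *Next = nullptr;
};

// Models the shifted layout: one pointer-sized pad precedes the sentinel,
// so the address of the list differs from the address of its sentinel.
struct ShiftedList {
  void *Pad = nullptr;
  NodeBase Sentinel;
};

// Append N to the circular list whose sentinel is S.
inline void pushBack(NodeBase &S, NodeBase &N) {
  N.Prev = S.Prev;
  N.Next = &S;
  S.Prev->Next = &N;
  S.Prev = &N;
}

// Follow Next links from First until the current pointer equals End.
// Returns the number of nodes visited, or -1 if End is not reached
// within MaxSteps.
inline int walk(const NodeBase *First, const void *End, int MaxSteps) {
  assert(MaxSteps > 0 && "walker needs a positive step budget");
  int Count = 0;
  for (const NodeBase *Cur = First; Cur != End; Cur = Cur->Next) {
    if (Count == MaxSteps)
      return -1;
    ++Count;
  }
  return Count;
}

// Build a two-node list and walk it with the chosen end marker.
inline int walkTwoNodeList(bool CompareAgainstListAddress) {
  ShiftedList L;
  L.Sentinel.Prev = L.Sentinel.Next = &L.Sentinel; // empty circular list
  NodeBase A, B;
  pushBack(L.Sentinel, A);
  pushBack(L.Sentinel, B);
  const void *End = CompareAgainstListAddress
                        ? static_cast<const void *>(&L)
                        : static_cast<const void *>(&L.Sentinel);
  return walk(L.Sentinel.Next, End, 16);
}
```

walkTwoNodeList(false) ends the walk at the sentinel after two nodes, while walkTwoNodeList(true) exhausts its budget because no Next link ever equals the list's own address. In this model the sentinel is a harmless NodeBase; in the real FFI case it is read as a full Node, which is where the access violation comes from.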

Repro and validation

Quickest repro (downstream, but isolates the bug to this exact attribute):

git clone --recursive https://github.com/jbroma/fast-flow-transform
cd fast-flow-transform
cargo test --release -p fft --lib hparser::tests::test1
# On Windows MSVC: STATUS_ACCESS_VIOLATION 0xC0000005
# Apply this PR's patch (or jbroma/fast-flow-transform#18 build-time hook):
cargo test --release -p fft --lib hparser::tests::test1
# Now passes.

The trivial input parse("function foo(p1) { var x = (10 + p1); }") is enough — it forces the FFI to iterate the function's params simple_ilist. Layout traces (eprintln! in NodeIterator::next) on the failing build show the off-by-8 signature directly:

NodeIterator::next: head=0x4e5fe8 cur(was)=0x4e5d18 read next=0x4e5ff0
                                                              ^^^^^^^^
                                                  head + 8 (offset of `Sentinel`)

After the patch the same trace shows read next == head and iteration terminates correctly.

Test plan

  • cargo build --release on Windows MSVC: clean build of fast-flow-transform's full Hermes static-lib link, 36.13s from clean state (build.rs applies the patch via git apply, cmake compiles all 10 Hermes static libs).
  • cargo test --release --workspace on Windows MSVC: 136 tests pass across fft, fft_ast, fft_pass, fft_node, fft_support, hermes parser/FFI crates. The FFI iteration tests (parser::hermes_parser::tests::good_parse, comments, magic_comments, parse_error) all green; before the patch they crashed with STATUS_ACCESS_VIOLATION.
  • static_assert (Tier B): the forced instantiation in namespace ilist_detail is evaluated in every translation unit that includes simple_ilist.h. It compiles cleanly under MSVC with the macro in place and would fail at compile time without it, because sizeof grows by 8 bytes.
  • Linux C++ side: same patch builds cleanly under WSL Ubuntu 22.04 + clang (verified all 10 Hermes static libs produced — libLLVHSupport.a, libhermesParser.a, libhermesAST.a, etc.). Macro resolves to empty on non-MSVC; layout is identical to before the patch.
  • OneJS Vite-native end-to-end on Windows: with this patch + the fast-flow-transform#18 build-time hook, curl http://localhost:8081/index.bundle?platform=ios returns a complete 5.01 MB bundle in 1.31 s (and 5.03 MB android bundle in 1.04 s). Before the patch the bundler segfaulted on the first request.
  • Linux full cargo test --release --workspace: not re-run end-to-end against the macro form. rustc 1.93.1 hits an internal compiler error in fft_ast's lint pass (pre-existing rustc bug; fft_ast is a Rust-only crate that doesn't include the C++ headers I changed). The C++ Linux build (above) covers the relevant correctness; full Rust workspace verification would need a rustc version that doesn't ICE.

Alternatives considered

  • Inline #if defined(_MSC_VER) block on simple_ilist only. This was the first cut. Replaced with the portable macro per reviewer feedback — the macro version aligns with three precedents already in the codebase (the LLVM_* convention in llvh's own Compiler.h; the HERMES_EMPTY_BASES mirror one layer up; the external/llvh/patches/ recording rule).
  • Apply the attribute to every multi-empty-base class in llvh. I swept the ADT headers and found only one additional candidate (ilist_node_traits at ilist.h:84, which inherits from ilist_alloc_traits and ilist_callback_traits, both empty). The sweep is reported in this thread; the Tier C patch is pre-built locally on a wip branch (c6dbd850) and ready to fold in here or split into a follow-up PR, per the maintainer's preference.
  • Fix the consumer (Rust FFI side). Possible — Rust could call list.end() and use the returned pointer instead of &list. But that requires either exposing a public accessor for the sentinel pointer (a Hermes-side change) or pulling the sentinel address out of an iterator's private field (fragile). Tagging simple_ilist with the EBO macro is the actual structural fix and matches the LLVM-upstream convention for layout-sensitive containers (see D146190, which introduced __declspec(empty_bases) on std::optional/std::variant for the same MSVC ABI quirk).

Notes

  • Other ilist types in this directory (iplist, iplist_impl) are not affected because they single-inherit from simple_ilist (one empty + one non-empty base; the multi-empty-base failure mode doesn't apply). The Tier C sweep result above documents ilist_node_traits separately.
  • The static_assert costs one template instantiation per translation unit that includes simple_ilist.h. The trigger node and its sentinel type exist only at compile time and add no runtime cost.
  • Inline doc comment on simple_ilist explains the motivation in-source so future maintainers understand why the attribute is required.

@meta-cla meta-cla Bot added the CLA Signed Do not delete this pull request or issue due to inactivity. label May 8, 2026
YevheniiKotyrlo added a commit to YevheniiKotyrlo/fast-flow-transform that referenced this pull request May 8, 2026
…t via build-time patch file

Mirror the same change being upstreamed at facebook/hermes#2012:
`__declspec(empty_bases)` on `simple_ilist` so MSVC actually applies
Empty Base Optimization across the class's two empty bases. Without
the attribute, MSVC pads each empty base with one byte and
pointer-aligns the result, shifting the embedded `Sentinel` member
from offset 0 to offset 8.

The Rust FFI in `crates/hermes/src/parser/node.rs` is one of the
callers that depends on the coincident layout: it wraps a
`*const NodeList` (`= *const simple_ilist<Node>`) and iterates by
reading `next` pointers, comparing each one against the head pointer
it was given. On Linux/Clang and macOS the comparison succeeds —
`Sentinel` is at offset 0, so `&list == &list.Sentinel` — and
iteration terminates. On Windows MSVC the C++ side stores
`&list.Sentinel` (offset 8) in the linked-list pointers while Rust
received `&list` (offset 0); the comparison never matches, the
iterator walks past the end of the list, dereferences the sentinel
as if it were a real `Node`, and trips a `STATUS_ACCESS_VIOLATION
0xC0000005` on the first field load. Reproducible in
`cargo test --release -p fft --lib hparser::tests::test1` on Windows
(35-character JS suffices because it forces the FFI to iterate the
function's params `simple_ilist`).

Approach: a unified-diff patch file lives under `patches/` and the
two cmake-driven build scripts apply it to the bundled Hermes
submodule via `git apply` before configuring. The hook is:

  - **Target-gated.** Skipped entirely when `is_msvc_target()` is
    false. GCC/Clang already collapse the empty bases without the
    attribute, so Linux and macOS builds are unaffected.

  - **Idempotent.** Returns early when `__declspec(empty_bases)
    simple_ilist` is already present in the file — true after the
    first run in a single build, and also true once the Hermes
    submodule is bumped past facebook/hermes#2012 so the patch
    becomes unnecessary and can be removed at the maintainer's
    leisure.

  - **Loud on conflict.** If `git apply` fails (Hermes refactored
    `simple_ilist.h` and the patch's context lines no longer match),
    the build script panics with a message that names the patch file,
    surfaces git's stderr, and points at the two resolution paths:
    drop the patch when upstream lands, or regenerate it. Silently
    skipping would produce an unpatched binary that AVs at runtime —
    worse than a build error.

Validated end-to-end on Windows (host) and on Linux WSL Ubuntu 22.04
with Rust 1.88.0:

  - `cargo test --release -p fft --lib hparser::tests::test1`
    passes on Windows (was `STATUS_ACCESS_VIOLATION` before).
  - `cargo test --workspace --release` clean on Windows.
  - Full workspace test clean on Linux (the patcher is gated and
    correctly skipped, verified by grepping the submodule file
    after the build).
  - `cargo fmt --all --check` clean.
  - Re-running the build after the patch already landed exits the
    hook in <1ms (idempotency check).
  - Simulated Hermes-bump-with-conflict (inserted a comment ahead of
    the `simple_ilist` template) panics with the documented message
    and exit code 101, never producing a binary.

Closes-by: should make jbroma#17's `cvt_smloc` graceful fallback unnecessary
once this lands; the original `assert!(loc.is_valid(), ...)` is
correct because Hermes does not produce null `SMLoc`s in this code
path. Resolves the runtime AV reported in jbroma#16.
YevheniiKotyrlo added a commit to YevheniiKotyrlo/fast-flow-transform that referenced this pull request May 8, 2026
…t via build-time patch file

Mirror the same change being upstreamed at facebook/hermes#2012:
`__declspec(empty_bases)` on `simple_ilist` so MSVC actually applies
Empty Base Optimization across the class's two empty bases. Without
the attribute, MSVC pads each empty base with one byte and
pointer-aligns the result, shifting the embedded `Sentinel` member
from offset 0 to offset 8.

The Rust FFI in `crates/hermes/src/parser/node.rs` is one of the
callers that depends on the coincident layout: it wraps a
`*const NodeList` (`= *const simple_ilist<Node>`) and iterates by
reading `next` pointers, comparing each one against the head pointer
it was given. On Linux/Clang and macOS the comparison succeeds —
`Sentinel` is at offset 0, so `&list == &list.Sentinel` — and
iteration terminates. On Windows MSVC the C++ side stores
`&list.Sentinel` (offset 8) in the linked-list pointers while Rust
received `&list` (offset 0); the comparison never matches, the
iterator walks past the end of the list, dereferences the sentinel
as if it were a real `Node`, and trips a `STATUS_ACCESS_VIOLATION
0xC0000005` on the first field load. Reproducible in
`cargo test --release -p fft --lib hparser::tests::test1` on Windows
(35-character JS suffices because it forces the FFI to iterate the
function's params `simple_ilist`).

Approach: a unified-diff patch file lives under `patches/` and the
two cmake-driven build scripts apply it to the bundled Hermes
submodule via `git apply` before configuring. The hook is:

  - **Target-gated.** Skipped entirely when `is_msvc_target()` is
    false. GCC/Clang already collapse the empty bases without the
    attribute, so Linux and macOS builds are unaffected.

  - **Idempotent.** Returns early when `__declspec(empty_bases)
    simple_ilist` is already present in the file — true after the
    first run in a single build, and also true once the Hermes
    submodule is bumped past facebook/hermes#2012 so the patch
    becomes unnecessary and can be removed at the maintainer's
    leisure.

  - **Race-safe.** Cargo runs `crates/hermes/build.rs` and
    `crates/fft_support/build.rs` in parallel, so both can pass the
    idempotency check while a sibling is still mid-`git apply`. If
    our own `git apply` then fails, we re-read the file: if the
    attribute is now present, we skip silently (sibling won the
    race); only a real conflict still panics.

  - **Loud on conflict.** If `git apply` fails for a non-race reason
    (Hermes refactored `simple_ilist.h` and the patch's context lines
    no longer match), the build script panics with a message that
    names the patch file, surfaces git's stderr, and points at the
    two resolution paths: drop the patch when upstream lands, or
    regenerate it. Silently skipping would produce an unpatched
    binary that AVs at runtime — worse than a build error.

Validated end-to-end on Windows MSVC (host) and Linux WSL Ubuntu
22.04 with Rust 1.88.0, plus eight edge-case scenarios:

  - Clean apply on Windows: passes.
  - Idempotent re-run: passes in <10s, no re-application.
  - Linux: workspace passes; patcher correctly skips (verified by
    inspecting the submodule file post-build).
  - Hermes-bump-with-upstream-fix scenario (pre-applied attribute):
    builds clean, no re-application.
  - Patch file missing: clean panic with "was patches/ pruned?".
  - Submodule target file missing: clean panic with "reading ...
    failed".
  - Patch file corrupt (not a valid diff): clean panic surfacing
    git's "No valid patches in input".
  - Partial pre-application (comment present but attribute absent):
    clean panic with "patch does not apply".
  - Three back-to-back parallel clean builds: all pass; race-safety
    holds under the parallel invocation.

`cargo fmt --all --check` clean. End-to-end: rebuilt napi binding
plugged into onejs/one's Vite-native bundler returns a complete ~5 MB
RN bundle in ~1.2 s on Windows where it previously segfaulted on the
first request.

Closes-by: should make jbroma#17's `cvt_smloc` graceful fallback unnecessary
once this lands; the original `assert!(loc.is_valid(), ...)` is
correct because Hermes does not produce null `SMLoc`s in this code
path. Resolves the runtime AV reported in jbroma#16.
@tmikov
Contributor

tmikov commented May 8, 2026

Hi, this is a nice catch.
Hermes already has a macro for this, but llvh can't depend on Hermes:

#define HERMES_EMPTY_BASES __declspec(empty_bases)

I think it would be slightly cleaner to add LLVM_DECLARE_EMPTY_BASES to llvh's Compiler.h and then use the macro.

LLVH is a permanent partial vendored fork, so we have no plans to ever upgrade it en-masse. This gives us some freedom to tweak it (although we do try to keep the tweaks to a minimum and record them in https://github.com/facebook/hermes/tree/bcbb8c040abcf315fa284e798f52ae8252d0b1f1/external/llvh/patches).

@YevheniiKotyrlo YevheniiKotyrlo force-pushed the fix/msvc-simple-ilist-empty-bases branch from 1e89d1c to ba1a251 Compare May 8, 2026 17:22
@YevheniiKotyrlo
Author

Thanks for the pointer to HERMES_EMPTY_BASES — that finding actually changes how I'd describe the bug's history. Hermes had already worked around MSVC's multi-empty-base EBO failure for PointerBase under HERMESVM_CONTIGUOUS_HEAP; the same idiom just hadn't propagated down into llvh yet. So this isn't a new class of bug for the codebase, it's simple_ilist joining a club PointerBase was already in.

I've force-pushed a revised commit that adopts your suggestion exactly:

  • external/llvh/include/llvh/Support/Compiler.h — added LLVM_DECLARE_EMPTY_BASES, sitting between LLVM_LIBRARY_VISIBILITY and LLVM_PREFETCH to group with the other class-decoration macros. Modelled on HERMES_EMPTY_BASES style.
  • external/llvh/include/llvh/ADT/simple_ilist.h — replaced the inline #if defined(_MSC_VER) block with LLVM_DECLARE_EMPTY_BASES applied to the template, and tightened the explanatory doc comment.
  • external/llvh/patches/simple-ilist-msvc-empty-bases.patch — added the patch file recording the divergence per the directory's convention. The two prior simple_ilist tweaks (ilist-erase-between.patch, ilist-nondefault-destructor.patch) gave the naming style.

Patch reverses cleanly via git apply --check --reverse; downstream consumer fast-flow-transform's napi binding rebuilds in 39.83s on Windows MSVC and its 36-test workspace passes (FFI iteration tests parser::hermes_parser::tests::good_parse, comments, magic_comments, parse_error — the ones that AV without the attribute — all green).

Two optional follow-ups I considered but kept out of this PR per your "minimum tweaks" preference; happy to do either if you'd prefer them in:

  1. Compile-time invariant. A static_assert near simple_ilist pinning sizeof(simple_ilist<T>) to the size of its embedded sentinel, locking the EBO collapse as a type-level guarantee. Catches MSVC flag changes / ABI swaps / refactors automatically. ~3 lines.
  2. Codebase sweep. A grep across llvh ADT for other multi-empty-base classes that might benefit from the same macro. iplist/ilist are transitively repaired since they single-inherit from simple_ilist, so the visible ilist family is already covered, but other ADT containers might not be.

Either or both, or neither — happy to defer to your call.

@tmikov
Contributor

tmikov commented May 8, 2026

  1. Compile-time invariant. A static_assert near simple_ilist pinning sizeof(simple_ilist<T>) to the size of its embedded sentinel, locking the EBO collapse as a type-level guarantee. Catches MSVC flag changes / ABI swaps / refactors automatically. ~3 lines.

Makes sense. Please add it to this PR.

  2. Codebase sweep. A grep across llvh ADT for other multi-empty-base classes that might benefit from the same macro. iplist/ilist are transitively repaired since they single-inherit from simple_ilist, so the visible ilist family is already covered, but other ADT containers might not be.

Seems like a good idea. Depending on the results, we may split it into a separate PR or keep it in this one.

@YevheniiKotyrlo
Author

YevheniiKotyrlo commented May 8, 2026

Tier B done

Pushed as a fast-forward commit (a9d17068) on top of the existing PR commit:

// external/llvh/include/llvh/ADT/simple_ilist.h, namespace ilist_detail:
struct ebo_check_node : ilist_node<ebo_check_node> {};
static_assert(
    sizeof(simple_ilist<ebo_check_node>) ==
        sizeof(ilist_sentinel<compute_node_options<ebo_check_node>::type>),
    "simple_ilist must use Empty Base Optimization. If this fires on MSVC, "
    "ensure LLVM_DECLARE_EMPTY_BASES is applied to the class template "
    "(see external/llvh/patches/simple-ilist-msvc-empty-bases.patch).");
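For readers without the llvh headers at hand, the same guard pattern can be tried in isolation. This is a sketch under stated assumptions: demo::MiniList, demo::SentinelNode, demo::detail::EboCheckNode, and DEMO_EMPTY_BASES are hypothetical stand-ins that only mimic the shape of simple_ilist, ilist_sentinel, ebo_check_node, and LLVM_DECLARE_EMPTY_BASES.

```cpp
// DEMO_EMPTY_BASES is a hypothetical stand-in for LLVM_DECLARE_EMPTY_BASES.
#if defined(_MSC_VER)
#define DEMO_EMPTY_BASES __declspec(empty_bases)
#else
#define DEMO_EMPTY_BASES
#endif

namespace demo {

template <typename T> struct ListBase {};   // empty base, no data members
template <typename T> struct NodeAccess {}; // empty base, no data members

template <typename T> struct SentinelNode {
  SentinelNode *Prev = nullptr;
  SentinelNode *Next = nullptr;
};

// Two empty bases plus an embedded sentinel, like the class being patched.
template <typename T>
class DEMO_EMPTY_BASES MiniList : ListBase<T>, NodeAccess<T> {
  SentinelNode<T> Sentinel;

public:
  MiniList() { Sentinel.Prev = Sentinel.Next = &Sentinel; }
  bool empty() const { return Sentinel.Next == &Sentinel; }
};

namespace detail {
// A throwaway element type whose only job is to force one instantiation of
// MiniList so the size check runs in every translation unit.
struct EboCheckNode {};
static_assert(sizeof(MiniList<EboCheckNode>) ==
                  sizeof(SentinelNode<EboCheckNode>),
              "MiniList must collapse its empty bases; on MSVC check that "
              "DEMO_EMPTY_BASES is still applied to the class template.");
} // namespace detail

} // namespace demo
```

The assertion needs no runtime support: if either empty base starts occupying storage, the header itself stops compiling.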

patches/simple-ilist-msvc-empty-bases.patch refreshed to keep the recorded divergence byte-aligned with the in-tree change. Builds and tests pass on Windows MSVC (cargo build --release clean, 36/36 fast-flow-transform workspace tests green).

Tier C sweep results

Searched the 75 headers under external/llvh/include/llvh/ADT/ for class declarations with multi-base inheritance, then filtered to ones where 2+ bases are empty.

One additional candidate found:

ilist_node_traits<NodeTy> at external/llvh/include/llvh/ADT/ilist.h:84:

template <typename NodeTy>
struct ilist_node_traits : ilist_alloc_traits<NodeTy>,
                           ilist_callback_traits<NodeTy> {};

Both bases are empty by the same logic as simple_ilist's — ilist_alloc_traits<NodeTy> (line 53) and ilist_callback_traits<NodeTy> (line 65) carry only static methods and template member functions, no data members. So ilist_node_traits itself sits in the same multi-empty-base shape: sizeof == 1 with EBO collapse, sizeof == 2 without on MSVC.

The downstream effect is subtler than simple_ilist's. ilist_node_traits is used as the trait template parameter for iplist_impl (ilist.h:168, class iplist_impl : public TraitsT, IntrusiveListT); iplist_impl itself has only one empty base plus one non-empty base, so the standard MSVC EBO works fine on iplist_impl directly. But the inner multi-empty-base failure inflates sizeof(ilist_node_traits) from 1 to 2, which then prevents EBO from kicking in when iplist_impl inherits from it (it's no longer "empty" from MSVC's POV), so iplist/ilist carry an unnecessary 8 bytes of trait padding on Windows.

Fix is the same one-line decoration:

template <typename NodeTy>
struct LLVM_DECLARE_EMPTY_BASES ilist_node_traits
    : ilist_alloc_traits<NodeTy>, ilist_callback_traits<NodeTy> {};

(ilist.h already transitively includes llvh/Support/Compiler.h via simple_ilist.h, so no new #include needed.)
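The propagation can be sketched with stand-in types. Everything here is hypothetical and only mirrors the shape of the llvh classes: AllocTraits, CallbackTraits, NodeTraits, Storage, ListImpl, and DEMO_EMPTY_BASES are illustration names, not the in-tree definitions.

```cpp
// DEMO_EMPTY_BASES is a hypothetical stand-in for LLVM_DECLARE_EMPTY_BASES.
#if defined(_MSC_VER)
#define DEMO_EMPTY_BASES __declspec(empty_bases)
#else
#define DEMO_EMPTY_BASES
#endif

// Two trait bases that carry only functions, no data members.
template <typename N> struct AllocTraits {
  static void deleteNode(N *P) { delete P; }
};
template <typename N> struct CallbackTraits {
  void addNodeToList(N *) {}
  void removeNodeFromList(N *) {}
};

// Inner class with two empty bases: the shape of ilist_node_traits.
template <typename N>
struct DEMO_EMPTY_BASES NodeTraits : AllocTraits<N>, CallbackTraits<N> {};

// Non-empty base standing in for the underlying list storage.
struct Storage {
  void *Prev = nullptr;
  void *Next = nullptr;
};

// Outer class: one trait base plus one non-empty base, the shape of
// iplist_impl.
template <typename TraitsT, typename StorageT>
class ListImpl : public TraitsT, StorageT {};

struct Item {};

// The inner traits class must stay at the minimum object size ...
static_assert(sizeof(NodeTraits<Item>) == 1,
              "traits class should stay empty-sized");
// ... so that the outer class is no larger than its storage base.
static_assert(sizeof(ListImpl<NodeTraits<Item>, Storage>) == sizeof(Storage),
              "traits base should add no padding to the list");
```

On GCC and Clang both assertions hold with or without the macro. On MSVC they are expected to hold only with it, going by the 1-versus-2 sizes described above.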

Cleared (false positives from the grep):

| Class | File | Reason |
| --- | --- | --- |
| iplist_impl | ilist.h:168 | 1 empty base + 1 non-empty (simple_ilist); single empty base, no MSVC EBO failure |
| SmallVector | SmallVector.h:851 | Both bases (SmallVectorImpl, SmallVectorStorage) carry data |
| SmallPtrSetIterator | SmallPtrSet.h:267 | Both bases (SmallPtrSetIteratorImpl with bucket pointers, DebugEpochBase::HandleBase with epoch tracking) carry data |
| All iterator_facade_base<...> derivations (SCCIterator, SmallSetIterator, value_sequence_iterator, StringMapIterBase, ImutAVLValueIterator) | various | Single base; comma in declaration is a template arg, not a sibling base |
| DenseMap, SmallDenseMap, DenseSet, SmallDenseSet, SetVector, SmallSetVector, PriorityWorklist, SmallPriorityWorklist, MapVector, SmallMapVector, PackedVector, SmallString, df_iterator, po_iterator and their _ext variants | various | Single-base inheritance |

No other classes with two or more empty bases turned up outside the ilist family in external/llvh/include/llvh/ADT/. The pattern is genuinely rare.

Disposition

Two paths, your call:

  1. Fold into this PR. ~3-line addition to ilist.h + a corresponding external/llvh/patches/ilist-node-traits-msvc-empty-bases.patch (matching the convention). Keeps the "MSVC EBO multi-empty-base hardening" all in one reviewable unit.
  2. Split into a follow-up PR. Lets this PR stay narrow ("simple_ilist + macro + patch file"). A second PR closes the symmetric ilist_node_traits case.

Both work for me; happy to go with whichever you prefer.

simple_ilist inherits from two empty bases (the configured
list_base_type and SpecificNodeAccess). MSVC's ABI does not collapse
multiple empty bases by default — it pads each empty base with one
byte and pointer-aligns the result, shifting the embedded Sentinel
from offset 0 to offset 8. The shift is invisible through pure C++
list usage (iterators hand out node pointers and account for the
sentinel offset), but it breaks any caller that compares an iterator's
stored node pointer against `&list` rather than against an iterator
obtained from `list.end()` — most visibly FFI consumers walking the
linked-list `next` pointers from another language.

A real-world manifestation: fast-flow-transform's Rust FFI in
crates/hermes/src/parser/node.rs wraps a *const NodeList and iterates
by reading the embedded `next` pointer, comparing each one against
the head pointer it was given. On Linux/Clang and macOS the comparison
succeeds (Sentinel at offset 0). On Windows MSVC it never matches
because C++ stores `&list.Sentinel` (offset 8) in the linked-list
pointers while Rust received `&list` (offset 0); the iterator walks
past the end, dereferences the sentinel as a real Node, and trips
STATUS_ACCESS_VIOLATION 0xC0000005 during AST traversal.

This patch adds an LLVM_DECLARE_EMPTY_BASES macro to llvh's Compiler.h
mirroring the existing HERMES_EMPTY_BASES one layer up at
include/hermes/Support/Compiler.h:44 (llvh cannot depend on Hermes,
so the same idiom is needed in both layers), and tags simple_ilist
with the macro. Per the external/llvh/patches/ directory convention
the divergence is also recorded as a patch file.

GCC and Clang on Linux/macOS resolve the macro to nothing — their
Itanium ABI collapses empty bases unconditionally, so there is nothing
to opt into. MSVC and clang-cl on Windows (which emulates MSVC's ABI
for binary compat) resolve it to __declspec(empty_bases) and produce
the correct layout.

Test plan:
- Builds cleanly on MSVC and on GCC/Clang (the macro is gated on
  _MSC_VER and is a no-op elsewhere).
- fast-flow-transform's Rust workspace tests (36 across fft, fft_support,
  hermes parser/FFI crates) pass on Windows MSVC with this patch
  applied; before the fix the FFI iteration tests crashed with
  STATUS_ACCESS_VIOLATION.
- Linux/macOS: unchanged, no measurable layout difference.

Per review feedback on this PR, add a `static_assert` at namespace
scope that pins `sizeof(simple_ilist<ebo_check_node>)` to
`sizeof(ilist_sentinel<OptionsT>)`. The assertion fires at compile
time if the two empty bases stop collapsing — for example if
LLVM_DECLARE_EMPTY_BASES is removed from the class template, MSVC's
ABI changes, or build-flag drift disables the attribute.

Also refresh external/llvh/patches/simple-ilist-msvc-empty-bases.patch
so the recorded divergence stays byte-aligned with the in-tree change
(the directory's existing convention — every llvh tweak is recorded
there).

Backstops the runtime FFI bug: simple_ilist's two empty bases
(list_base_type, SpecificNodeAccess) must collapse so the embedded
Sentinel sits at offset 0; without that, FFI consumers like
fast-flow-transform that compare an iterator's stored next pointer
against `&list` walk past end-of-list and dereference Sentinel as if
it were a real Node.
@YevheniiKotyrlo YevheniiKotyrlo force-pushed the fix/msvc-simple-ilist-empty-bases branch from e14e37f to a9d1706 Compare May 8, 2026 21:01
@YevheniiKotyrlo YevheniiKotyrlo changed the title from "Fix: apply __declspec(empty_bases) to simple_ilist for MSVC" to "Fix: enable EBO for simple_ilist on MSVC via LLVM_DECLARE_EMPTY_BASES" May 8, 2026
@tmikov
Contributor

tmikov commented May 9, 2026

I think it makes sense to combine all fixes in a single PR. They are all related after all. Thanks!

YevheniiKotyrlo added a commit to YevheniiKotyrlo/fast-flow-transform that referenced this pull request May 9, 2026
Refresh the build-time patch and idempotency markers to byte-match the
post-review form of facebook/hermes#2012.

After tmikov's review, the upstream PR was revised from an inline
`#if defined(_MSC_VER)` block to a portable LLVM_DECLARE_EMPTY_BASES
macro defined in llvh's own Compiler.h (mirroring the existing
HERMES_EMPTY_BASES one layer up). The fft patch is updated to match
exactly so the post-merge idempotency check still no-ops cleanly when
fft eventually bumps its Hermes pin past the merged commit.

The idempotency check now accepts both markers:
- `LLVM_DECLARE_EMPTY_BASES simple_ilist` (the new upstream form)
- `__declspec(empty_bases) simple_ilist` (the legacy inline form)

Recognizing both keeps already-patched checkouts building without a
forced submodule re-init: developers who applied the legacy form from
a previous build see the legacy marker and skip; new checkouts apply
the macro form and see the new marker.

Validation:
- `cargo build --release` on Windows MSVC: 39.83s incremental, clean
- `cargo test --release --workspace` on Windows MSVC: 36/36 pass across
  fft, fft_support, and hermes parser/FFI crates (the FFI iteration
  tests that AV without the attribute all green)
- `git apply --check --reverse` on the refreshed patch: clean
@meta-codesync

meta-codesync Bot commented May 9, 2026

@tmikov has imported this pull request. If you are a Meta employee, you can view this in D104498746.

ilist_node_traits inherits from ilist_alloc_traits and
ilist_callback_traits — both empty. On MSVC without
LLVM_DECLARE_EMPTY_BASES sizeof grows from 1 to 2, which then
prevents EBO when iplist_impl uses ilist_node_traits as its TraitsT
base — iplist/ilist end up carrying needless trait padding on
Windows.

Symmetric fix to the simple_ilist change. Recorded in
external/llvh/patches/ilist-node-traits-msvc-empty-bases.patch per
the directory's convention.
@facebook-github-tools

@YevheniiKotyrlo has updated the pull request. You must reimport the pull request before landing.

@YevheniiKotyrlo
Author

Pushed Tier C as a fast-forward commit (91ae770f). PR now has 3 commits — the macro + simple_ilist application + patch file (a69e3c75), the static_assert invariant (a9d17068), and the ilist_node_traits decoration + recorded patch (91ae770f).

Both compile-time invariants are now empirically verified, not just analytically asserted. Two isolated MSVC test programs (vcvars64.bat env, cl /std:c++17, hermes' llvh include paths):

Static_assert (Tier B) — fires when the macro is removed. Test stubs LLVM_DECLARE_EMPTY_BASES to empty before including simple_ilist.h:

=== POSITIVE (macro intact) ===
test-static-assert-positive.cpp
POSITIVE_EXIT=0

=== NEGATIVE (macro stubbed empty) ===
…\external\llvh\include\llvh/ADT/simple_ilist.h(366):
  error C2338: static_assert failed: 'simple_ilist must use Empty Base
  Optimization. If this fires on MSVC, ensure LLVM_DECLARE_EMPTY_BASES
  is applied to the class template (see external/llvh/patches/
  simple-ilist-msvc-empty-bases.patch).'

Tier C ilist_node_traits — exactly 8 bytes saved on MSVC. Two test programs measure sizeof(iplist_impl<simple_ilist<N>, my_traits>), with and without LLVM_DECLARE_EMPTY_BASES on my_traits (mirrors the in-tree state before/after this commit):

=== WITH macro (Tier C applied) ===
sizeof(simple_ilist<N>) = 16
sizeof(iplist_impl<IL, my_traits[WITH macro]>) = 16
savings vs no-macro = 0 bytes

=== NO macro (current in-tree state of llvh) ===
sizeof(simple_ilist<N>) = 16
sizeof(iplist_impl<IL, my_traits[NO macro]>) = 24
padding overhead = 8 bytes

Confirms the analytical claim: without the Tier C decoration, iplist/ilist carry an extra 8 bytes of trait padding per instance on Windows MSVC.

cargo build --release against fast-flow-transform (which links Hermes statically) is also clean on Windows MSVC after the fold (Finished release profile [optimized] target(s) in 34.36s).
