[WASM] Add READYTORUN_FIXUP_InjectStringThunks and R2R thunk infrastructure #127483

Open
davidwrighton wants to merge 29 commits into dotnet:main from davidwrighton:R2RThunksForWasm

Conversation

@davidwrighton
Member

Note

This PR was authored with assistance from GitHub Copilot.

Summary

Adds end-to-end infrastructure for mapping strings to pregenerated code thunks in ReadyToRun images, targeting WebAssembly platforms. This enables R2R-compiled images to embed interpreter-to-native and native-to-interpreter transition thunks that are discovered by string key at module load time.

Key Changes

New R2R Fixup: READYTORUN_FIXUP_InjectStringThunks (0x39)

  • Encodes a series of (null-terminated UTF8 string, 4-byte table index) pairs
  • Processed at module load time via eager fixups
  • At most one per R2R compilation unit
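
The payload layout listed above can be illustrated with a small decoder. This is a minimal sketch, not the runtime's `ProcessInjectStringThunksFixup`: the name `ParseInjectStringThunks` and the `ThunkEntry` alias are hypothetical, the 4-byte index is assumed to be little-endian, and the series is assumed to end at an empty string with no index after it, as the format notes for this fixup state.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Hypothetical decoded entry: lookup string plus its 4-byte table index.
using ThunkEntry = std::pair<std::string, uint32_t>;

// Decodes (null-terminated UTF-8 string, 4-byte index) pairs until the empty
// string terminator. Returns false if the payload is truncated.
// Assumption: the index is stored little-endian.
bool ParseInjectStringThunks(const uint8_t* data, size_t size, std::vector<ThunkEntry>& entries)
{
    assert(data != nullptr || size == 0);
    size_t pos = 0;
    while (pos < size)
    {
        // Locate the NUL that ends the next string.
        const void* nul = std::memchr(data + pos, 0, size - pos);
        if (nul == nullptr)
            return false; // string is not terminated
        size_t len = static_cast<size_t>(static_cast<const uint8_t*>(nul) - (data + pos));
        if (len == 0)
            return true; // empty string ends the series; no index follows it
        std::string key(reinterpret_cast<const char*>(data + pos), len);
        pos += len + 1;
        if (size - pos < 4)
            return false; // index is truncated
        uint32_t index = static_cast<uint32_t>(data[pos]) |
                         (static_cast<uint32_t>(data[pos + 1]) << 8) |
                         (static_cast<uint32_t>(data[pos + 2]) << 16) |
                         (static_cast<uint32_t>(data[pos + 3]) << 24);
        pos += 4;
        entries.emplace_back(std::move(key), index);
    }
    return false; // ran out of bytes before reaching the terminator
}
```

A caller would pass the raw fixup bytes and then feed the decoded pairs into whatever table maps strings to thunks.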

Crossgen2 (Compiler Side)

  • StringDiscoverableAssemblyStubNode — abstract base class for stubs discoverable by string key
  • InjectStringThunksSignature — emits the fixup signature into the R2R image
  • WasmR2RToInterpreterThunkNode / WasmInterpreterToR2RThunkNode — concrete thunk nodes
  • WasmLowering — signature encoding for WASM calling conventions
  • Thunk nodes are demand-driven via stub dependencies from call sites

Runtime (CoreCLR Side)

  • Global CAS-based copy-on-write hash table (StringToThunkHash) for lock-free reads
  • ProcessInjectStringThunksFixup — processes the fixup, merges into global table
  • LookupPregeneratedThunkByString — lock-free lookup API
  • Deferred thunk resolution: when a PortableEntryPoint is initialized before its R2R module is loaded, the method is tracked per-LoaderAllocator and resolved later
  • Single global lock protects both the LA registry and per-LA pending arrays
  • CAS-based TrySetInterpreterThunk ensures thunks are never overwritten once set
  • Cleanup in LoaderAllocator::Destroy for collectible assemblies
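
The copy-on-write CAS pattern behind the lock-free reads can be sketched roughly as follows. This is an illustration under stated assumptions rather than the code in `pregeneratedstringthunks.cpp`: `ThunkMap`, `g_thunks`, `LookupThunk`, and `MergeThunks` are hypothetical names, `std::unordered_map` stands in for the runtime's `SHash`-based table, and replaced tables are deliberately leaked because a reader may still hold a pointer to them.

```cpp
#include <atomic>
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using ThunkMap = std::unordered_map<std::string, void*>;

// Readers load this pointer and never take a lock; published tables are immutable.
static std::atomic<const ThunkMap*> g_thunks{new ThunkMap()};

// Lock-free lookup: returns nullptr when the key is unknown.
void* LookupThunk(const std::string& key)
{
    const ThunkMap* snapshot = g_thunks.load(std::memory_order_acquire);
    auto it = snapshot->find(key);
    return it == snapshot->end() ? nullptr : it->second;
}

// Copies the current table, adds the new entries, and publishes the copy with a CAS.
// Keys already present keep their existing value.
void MergeThunks(const std::vector<std::pair<std::string, void*>>& entries)
{
    const ThunkMap* current = g_thunks.load(std::memory_order_acquire);
    while (true)
    {
        assert(current != nullptr);
        ThunkMap* updated = new ThunkMap(*current);
        for (const auto& entry : entries)
            updated->emplace(entry.first, entry.second); // emplace never overwrites
        if (g_thunks.compare_exchange_strong(current, updated,
                                             std::memory_order_acq_rel,
                                             std::memory_order_acquire))
        {
            return; // the old table is leaked in this sketch; readers may still use it
        }
        delete updated; // another writer won; 'current' now holds the newer table
    }
}
```

The trade-off is that each merge costs a full copy of the table, which is acceptable when merges happen once per module load and lookups dominate.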

Signature Encoding (ABI)

  • Three aligned implementations: WasmLowering.cs (crossgen2), SignatureMapper.cs (WasmAppBuilder), helpers.cpp (runtime)
  • Format: <prefix><return>[T][hidden-params]<explicit-params>[p]
  • Documented in docs/design/coreclr/botr/clr-abi.md
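
Because struct tokens span several characters, a consumer of this format has to tokenize the string rather than read it character by character. The sketch below is a hypothetical C++ rendering of that step (the PR's tokenizers live in C# and in `helpers.cpp`); it assumes a struct token is the letter `S` followed by decimal digits, as in `S8` or `S64`, and that every other token is a single character.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Splits a signature string into tokens. "S" followed by decimal digits is
// one token; every other character is a token on its own.
std::vector<std::string> TokenizeSignature(const std::string& sig)
{
    std::vector<std::string> tokens;
    size_t i = 0;
    while (i < sig.size())
    {
        size_t start = i++;
        if (sig[start] == 'S')
        {
            // Consume the byte size that follows the 'S'.
            while (i < sig.size() && std::isdigit(static_cast<unsigned char>(sig[i])))
                ++i;
        }
        assert(i > start); // every iteration consumes at least one character
        tokens.push_back(sig.substr(start, i - start));
    }
    return tokens;
}
```

For instance, `TokenizeSignature("IS16TilS8p")` yields `I`, `S16`, `T`, `i`, `l`, `S8`, `p`: a prefix, a struct return, the `this` marker, three parameters, and the managed-call suffix.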

Documentation

  • Updated docs/design/coreclr/botr/readytorun-format.md with fixup 0x39 specification
  • Added WASM calling convention section to docs/design/coreclr/botr/clr-abi.md

davidwrighton and others added 22 commits April 23, 2026 15:03
…nk mappings

This adds a new ReadyToRun fixup that enables mapping UTF-8 strings to
pregenerated code thunks embedded in R2R images. The fixup is placed in
the eager imports section and processed at module load time.

Changes across all layers:

Format definition:
- Add READYTORUN_FIXUP_InjectStringThunks = 0x39 to readytorun.h and
  ReadyToRunConstants.cs
- Bump R2R minor version from 5 to 6 in all three locations

Runtime (CoreCLR VM):
- Refactor StringThunkSHashTraits from wasm/helpers.cpp into shared
  stringthunkhash.h header, available to all platforms
- Add pregeneratedstringthunks.cpp/.h with global hash table using
  copy-on-write CAS pattern for lock-free concurrent reads
- InitializePregeneratedStringThunkHash() called at EE startup
- LookupPregeneratedThunkByString() API returns PCODE or NULL
- ProcessInjectStringThunksFixup() handles the fixup in
  LoadDynamicInfoEntry, merging new entries with existing ones

Crossgen2 compiler:
- Add abstract StringDiscoverableAssemblyStubNode (derives from
  AssemblyStubNode) with LookupString property; instances register
  themselves via OnMarked
- Add InjectStringThunksSignature that collects all registered stubs
  at emission time and encodes them as (UTF8 string, RVA) pairs
- Root the InjectStringThunks import eagerly in NodeFactory
- Sort stubs by LookupString for deterministic compilation

Tooling and documentation:
- Add r2rdump parser case for InjectStringThunks signatures
- Update readytorun-format.md with fixup table entry and format spec

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Instead of unconditionally rooting the InjectStringThunks import, store
it on the NodeFactory and have each StringDiscoverableAssemblyStubNode
declare a dependency on it via ComputeNonRelocationBasedDependencies.
The import is only pulled into the graph when at least one stub is marked.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Change GetSignature to return (WasmFuncType, string) where the string
is a compact serialization of the signature:

Return type: 'v' (void), 'i'/'l'/'f'/'d'/'V' (primitives), 'S<N>'
(struct by ref with N bytes).

Hidden params (this, retbuf, generic context, async continuation):
'i' or 'l' based on pointer size.

Explicit params: 'i'/'l'/'f'/'d'/'V' (by value), 'S<N>' (by ref),
'e' (empty struct, not emitted to WasmFuncType).

Suffix 'p' indicates SP and PE params are generated (managed calls).

Add IsEmptyStruct helper (stub returning false) for detecting empty
structs by field count per the BasicCABI spec. Handle empty structs
for both parameters ('e' encoding) and returns (treated as void).
See dotnet#127361.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Introduce WasmSignature readonly struct implementing IEquatable and
IComparable. Equality and comparison are based on the signature string
(with Debug.Assert that FuncType agrees when strings match). This
enables sorting and deduplication of signatures by string alone.

Update WasmLowering.GetSignature to return WasmSignature and update
callers in WasmObjectWriter and ReadyToRunCodegenNodeFactory.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- WasmImportThunk now takes WasmSignature and uses it for mangled name
  and comparison operations
- WasmImportThunkPortableEntrypoint uses static WasmSignature values
- RaiseSignature rewritten to parse signature string instead of WasmFuncType
- Added CompilerTypeSystemContext.Wasm.cs with GetValueTupleStructOfSize
  cache using tree-based ValueTuple construction
- Unmanaged calling convention flag set when 'p' suffix is absent
- Roundtrip assert: raised signature re-lowered must equal original
- Cache first empty struct found during lowering for 'e' roundtrip

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Instead of iterating the wasm-level _typeNode params, iterate the
raised MethodSignature. This enables:
- Indirect struct args: zero-fill the transition block slot on store,
  and pass the original byref local directly on restore
- Empty struct args: skip entirely (no wasm local exists)
- Made WasmLowering.IsEmptyStruct public for cross-file access

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…Site api.

Co-authored-by: Copilot <copilot@github.com>
The 'this' parameter is now encoded with a distinct 'T' character
instead of 'i'/'l'. On raise, 'T' sets HasThis on the MethodSignature
rather than adding an explicit parameter. This enables proper
roundtripping and allows ArgIterator to correctly compute offsets
(e.g. GetRetBuffArgOffset with hasThis).

Also fix build errors in CorInfoImpl.ReadyToRun.cs: qualify
LoweringFlags and cast getCallConv() to int.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add WasmR2RToInterpreterThunkNode, a StringDiscoverableAssemblyStubNode
that captures arguments into a transition block and dispatches to the
interpreter via READYTORUN_HELPER_InitInstClass.

Key details:
- Thunk keyed by WasmSignature, discoverable by 'I'-prefixed signature string
- Arguments area is 16-byte aligned; TransitionBlock is 8-byte aligned
- Indirect struct args copied with memory.copy + memory.fill padding
- Stack pointer global saved/restored around helper call
- V128 return uses 16-byte aligned buffer; others use 8-byte i64 store

Also adds memory.copy, memory.fill, and i64.const WASM instructions,
and updates WasmImportThunk to use memory.fill for indirect struct
zero-filling.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…all site dependency

- Add WasmInterpreterToR2RThunkNode: a StringDiscoverableAssemblyStubNode that
  bridges from interpreter calling convention to R2R compiled functions. Uses
  ArgIterator offsets (minus TransitionBlock size) to locate args in the
  interpreter buffer, sets up a TERMINATE_R2R_STACK_WALK frame, and dispatches
  via call_indirect.
- Fix retbuf detection in both WasmR2RToInterpreterThunkNode and
  WasmInterpreterToR2RThunkNode to check SignatureString[0] == 'S' instead of
  using ArgIterator.HasRetBuffArg/GetRetBuffArgOffset. The R2R-to-interpreter
  thunk now passes the retbuf wasm local directly.
- Add factory cache and accessor for WasmInterpreterToR2RThunk on
  ReadyToRunCodegenNodeFactory.
- Fix recordCallSite TODO: wire up WasmR2RToInterpreterThunk dependency.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add AddAdditionalDependency helper for lazily adding to _additionalDependencies
- Move WasmR2RToInterpreterThunk from AddPrecodeFixup to AddAdditionalDependency
  in recordCallSite
- Add WasmInterpreterToR2RThunk dependency for every compiled managed
  non-UnmanagedCallersOnly method on Wasm, using GetSignature(MethodDesc)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <copilot@github.com>
- Replace ValueTuple-based struct size construction with a cache of real
  struct types encountered during GetSignature. ValueTuples have auto
  layout which causes padding, making roundtrip size assertions fail.
  The cache uses a locked Dictionary for thread safety.

- Fix RaiseSignature to skip the hidden retbuf pointer parameter when
  the return type is a struct (S<N> encoding). Previously it was included
  in the raised MethodSignature parameters, causing GetSignature to emit
  a duplicate retbuf pointer on re-encoding.

- Fix WasmImportThunk to handle 'this' pointer correctly: store/restore
  it separately before the explicit parameter loop, and start
  wasmLocalIndex past both 'this' and retbuf locals.

- Fix WasmImportThunkPortableEntrypoint to strip IsUnmanagedCallersOnly
  flag when computing thunk signatures, since thunks always use managed
  calling convention.

- Fix DelayLoadHelperImport to skip creating WasmImportThunk for
  GenericLookupSignature on WASM, as these are eager fixups that don't
  need import thunks.

- Fix WasmR2RToInterpreterThunkNode to skip 'this' and retbuf wasm
  locals before iterating explicit parameters.

- Skip creating R2R-to-interpreter thunks for unmanaged call sites,
  as they don't go through interpreter transitions.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The ForceSigWalk method had two bugs in its Wasm-specific path for
accounting unnamed arguments (this, retbuf, generic context, etc.)
when no named arguments are present:

1. The check 'maxOffset == 0' could never be true because maxOffset
   is initialized to OffsetOfArgs (8 on Wasm32). Changed to compare
   against OffsetOfArgs.

2. The fallback 'maxOffset = _wasmOfsStack' was incorrect because
   _wasmOfsStack is relative to OffsetOfArgs, but maxOffset is an
   absolute offset. Changed to 'OffsetOfArgs + _wasmOfsStack'.

These bugs caused GCRefMapBuilder to allocate a zero-length fake
stack for methods with only unnamed arguments (e.g. parameterless
instance methods), leading to IndexOutOfRangeException when writing
the 'this' pointer GC ref at ThisOffset.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
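
The two fixes in the commit above come down to a few lines of offset arithmetic. The following sketch is an assumption-laden reduction, not the actual `ForceSigWalk` body: `ComputeMaxOffset` and its parameters are hypothetical, named arguments are represented only by their absolute end offsets, and `OffsetOfArgs` uses the Wasm32 value of 8 quoted in the commit message.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

constexpr int OffsetOfArgs = 8; // Wasm32 value, per the commit message

// namedArgEnds: absolute end offsets of the named arguments (hypothetical input).
// wasmOfsStack: bytes used by unnamed arguments, relative to OffsetOfArgs.
int ComputeMaxOffset(const std::vector<int>& namedArgEnds, int wasmOfsStack)
{
    assert(wasmOfsStack >= 0);
    int maxOffset = OffsetOfArgs;
    for (int end : namedArgEnds)
        maxOffset = std::max(maxOffset, end);

    // Fix 1: compare against OffsetOfArgs; the old check against 0 could never fire.
    if (maxOffset == OffsetOfArgs)
    {
        // Fix 2: convert the relative stack offset into an absolute one.
        maxOffset = OffsetOfArgs + wasmOfsStack;
    }
    return maxOffset;
}
```

With no named arguments and 4 bytes of unnamed arguments, the result is 12 rather than the initial 8, so space for the `this` pointer is accounted for.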
- Replace 'n' encoding with 'S' for multi-field structs passed by ref
- Add hardcoded struct sizes for QCallModule (8), QCallAssembly (8),
  GCHeapHardLimitInfo (64) so signatures produce S<N> format
- Add ParseSignatureTokens tokenizer to handle multi-char S<N> tokens
- Add Token-based API (TokenToNativeType/TokenToNameType/TokenToArgType)
- Update InterpToNativeGenerator to use token-based parsing
- Unknown struct types log a diagnostic at High importance

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- TokenToNameType returns the full S<N> token (e.g. S8, S64) so
  generated function names encode the struct size
- ArgsWithSlotOffsets computes running slot indices: structs consume
  max(size/8, 1) slots instead of always 1
- Add TokenToSlotCount helper
- Remove IsBlittable gate from TypeToChar — multi-field structs are
  always passed by pointer, matching crossgen2 WasmLowering behavior

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
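
The running slot computation from the commit above can be sketched as follows. The names `TokenToSlotCount` and `ArgsWithSlotOffsets` come from the commit message, but this C++ version is only an illustration of the stated rule (structs take `max(size/8, 1)` slots); it assumes every non-struct token occupies exactly one slot and that the input is already tokenized.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <string>
#include <utility>
#include <vector>

// Number of 8-byte slots a token occupies: max(size / 8, 1) for S<N> tokens.
// Assumption: every other token takes exactly one slot.
int TokenToSlotCount(const std::string& token)
{
    if (token.size() > 1 && token[0] == 'S')
    {
        int size = std::atoi(token.c_str() + 1);
        assert(size >= 0);
        return std::max(size / 8, 1);
    }
    return 1;
}

// Pairs each argument token with the index of the first slot it occupies.
std::vector<std::pair<std::string, int>> ArgsWithSlotOffsets(const std::vector<std::string>& args)
{
    std::vector<std::pair<std::string, int>> result;
    int slot = 0;
    for (const std::string& token : args)
    {
        result.emplace_back(token, slot);
        slot += TokenToSlotCount(token);
    }
    return result;
}
```

For the argument tokens `i`, `S24`, `S4`, `d`, the slot indices are 0, 1, 4, and 5: the 24-byte struct takes three slots and the 4-byte struct still takes one.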
- helpers.cpp: Refactor GetSignatureKey to support S<N> struct tokens,
  LowerTypeHandle for recursive single-field unwrapping, caller prefix
  parameter ('M' for calli, 'I' for PE-to-interpreter)
- helpers.cpp: Use 'T' for this pointer encoding (was 'i')
- WasmLowering.cs: Remove redundant hidden retbuf pointer from signature
  string (implied by S<N> return type)
- RaiseSignature: Remove hasReturnBuffer skip logic (no longer in string)
- SignatureMapper.cs: Use 'T' for this pointer, add T to token maps
- InterpToNativeGenerator.cs: Add 'M' prefix to g_wasmThunks entries
- clr-abi.md: Document Type Lowering and Signature String Encoding spec

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…tion

Replace dynamic alloca-based initial buffer sizing with a fixed 64-byte
stack buffer. Fall back to alloca only when S<N> tokens make the key
exceed the initial buffer.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…EntryPointThunk

Both functions now check the process-startup thunk cache first, then
fall back to LookupPregeneratedThunkByString for thunks injected via
READYTORUN_FIXUP_InjectStringThunks.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When a MethodDesc's PortableEntryPoint is initialized before the R2R module
containing its thunk is loaded, the method is tracked on a per-LoaderAllocator
SArray and resolved later when new R2R thunks are injected.

- Add TrySetInterpreterThunk CAS-based thunk installation on PortableEntryPoint
- Track pending methods per-LoaderAllocator using SArray<MethodDesc*> with
  NULL-compaction on resolve
- Single global lock (s_pendingThunkResolutionLock) protects both the LA
  registry and per-LA pending arrays, keeping LAs alive during scans
- Registration flag on LoaderAllocator avoids duplicate list scans
- Unregistration in LoaderAllocator::Destroy for correct collectible cleanup
- LookupThunk/LookupPortableEntryPointThunk now also check R2R thunk hash
- Remove stale WASM-TODO comments

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
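
The deferred-resolution flow in the commit above can be sketched with a set-once CAS and a pending list that is compacted on resolve. This is a simplified illustration: `EntryPoint` is a hypothetical stand-in for `PortableEntryPoint`, a single global list replaces the per-LoaderAllocator `SArray`s, and the lookup callback stands in for `LookupPregeneratedThunkByString`.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <vector>

struct EntryPoint // hypothetical stand-in for PortableEntryPoint
{
    std::string signatureKey;
    std::atomic<void*> interpreterThunk{nullptr};

    // Installs the thunk only if none is set yet; an existing thunk is never overwritten.
    bool TrySetInterpreterThunk(void* thunk)
    {
        void* expected = nullptr;
        return interpreterThunk.compare_exchange_strong(expected, thunk);
    }
};

static std::mutex g_pendingLock;           // one lock guards the whole pending list
static std::vector<EntryPoint*> g_pending; // simplification: one list, not one per LoaderAllocator

void RegisterPending(EntryPoint* ep)
{
    assert(ep != nullptr);
    std::lock_guard<std::mutex> hold(g_pendingLock);
    g_pending.push_back(ep);
}

size_t PendingCount()
{
    std::lock_guard<std::mutex> hold(g_pendingLock);
    return g_pending.size();
}

// Called after new thunks are injected. 'lookup' returns nullptr for unknown keys.
void ResolvePending(const std::function<void*(const std::string&)>& lookup)
{
    std::lock_guard<std::mutex> hold(g_pendingLock);
    auto resolved = [&lookup](EntryPoint* ep) {
        void* thunk = lookup(ep->signatureKey);
        if (thunk == nullptr)
            return false; // still unknown; keep it pending
        ep->TrySetInterpreterThunk(thunk);
        return true; // drop it from the list
    };
    g_pending.erase(std::remove_if(g_pending.begin(), g_pending.end(), resolved), g_pending.end());
}
```

Because installation goes through the CAS, a thunk set by another path between registration and resolution stays in place; the resolver simply drops the entry.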
Copilot AI review requested due to automatic review settings April 28, 2026 00:01
@github-actions github-actions Bot added the area-crossgen2-coreclr only use for closed issues label Apr 28, 2026
Contributor

Copilot AI left a comment

Pull request overview

This PR adds end-to-end infrastructure for string-keyed discovery of pregenerated ReadyToRun (R2R) thunks on WebAssembly. It introduces a new eager R2R fixup (READYTORUN_FIXUP_InjectStringThunks) to inject string→code mappings at module load time, unifies signature-string encoding across compiler/task/runtime, and wires up generation + runtime lookup for interpreter↔native transition thunks.

Changes:

  • Add READYTORUN_FIXUP_InjectStringThunks (0x39) and bump the R2R version to 18.6 (minor version 5 to 6) across components.
  • Implement compiler-side thunk nodes + fixup signature emission, and runtime-side global string→thunk lookup + deferred PortableEntryPoint resolution.
  • Extend WASM signature encoding to support T (this) and S<N> (struct-byref / retbuf) and update thunk generators and runtime signature computation accordingly.

Reviewed changes

Copilot reviewed 52 out of 52 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| src/tasks/WasmAppBuilder/coreclr/SignatureMapper.cs | Adds tokenized signature encoding (S<N>, T) and parsing utilities for thunk generation. |
| src/tasks/WasmAppBuilder/coreclr/ManagedToNativeGenerator.cs | Removes currently-unused pregenerated signature list (now empty). |
| src/tasks/WasmAppBuilder/coreclr/InterpToNativeGenerator.cs | Updates generator to token-aware signatures and prefixes thunk keys with M. |
| src/coreclr/vm/wasm/helpers.hpp | Declares API to lookup portable-entrypoint→interpreter thunk by MethodDesc. |
| src/coreclr/vm/wasm/helpers.cpp | Unifies runtime signature key format (prefix + T + S<N>), and falls back to pregenerated string thunk table. |
| src/coreclr/vm/wasm/callhelpers-pinvoke.cpp | Adds CompressionNative_CompressBound pinvoke entry and updates module entry count. |
| src/coreclr/vm/wasm/callhelpers-interp-to-managed.cpp | Updates generated thunk table keys to include M prefix and adds new struct-token variants. |
| src/coreclr/vm/stringthunkhash.h | Introduces shared StringToThunkHash shash traits for string→pointer maps. |
| src/coreclr/vm/pregeneratedstringthunks.h | Declares global pregenerated string thunk table + pending PE thunk-resolution APIs. |
| src/coreclr/vm/pregeneratedstringthunks.cpp | Implements CAS-based copy-on-write global string→thunk hash and pending PE thunk resolution tracking. |
| src/coreclr/vm/precode_portable.hpp | Adds PortableEntryPoint::TrySetInterpreterThunk CAS setter. |
| src/coreclr/vm/method.cpp | Registers methods for deferred PE thunk resolution when thunk isn’t yet available. |
| src/coreclr/vm/loaderallocator.hpp | Adds per-LoaderAllocator pending PE thunk list and registration bit. |
| src/coreclr/vm/loaderallocator.cpp | Initializes pending-resolution state and unregisters on LoaderAllocator::Destroy; adds LA helper method. |
| src/coreclr/vm/jitinterface.cpp | Processes new READYTORUN_FIXUP_InjectStringThunks during eager fixups. |
| src/coreclr/vm/CMakeLists.txt | Adds new VM sources/headers for pregenerated string thunk support. |
| src/coreclr/vm/ceemain.cpp | Initializes pregenerated thunk hash + pending-resolution lock during EE startup. |
| src/coreclr/tools/Common/JitInterface/WasmLowering.cs | Adds signature-string generation/roundtrip (WasmSignature) and S<N>/T/p encoding logic. |
| src/coreclr/tools/Common/JitInterface/CorInfoTypes.cs | Makes getCallConv and isAsyncCall accessible to R2R codegen for callsite reporting. |
| src/coreclr/tools/Common/JitInterface/CorInfoImpl.cs | Removes placeholder recordCallSite (moved to RyuJit/ReadyToRun implementations). |
| src/coreclr/tools/Common/Internal/Runtime/ReadyToRunConstants.cs | Adds managed enum value for InjectStringThunks fixup kind. |
| src/coreclr/tools/Common/Internal/Runtime/ModuleHeaders.cs | Bumps managed R2R header minor version to 18.6. |
| src/coreclr/tools/Common/Compiler/ObjectWriter/WasmObjectWriter.cs | Adapts to WasmLowering.GetSignature(...).FuncType. |
| src/coreclr/tools/Common/Compiler/ObjectWriter/WasmInstructions.cs | Adds memory.copy, memory.fill, and i64.const helpers for thunk emission. |
| src/coreclr/tools/Common/Compiler/DependencyAnalysis/Target_Wasm/WasmTypes.cs | Introduces WasmSignature value type carrying both WasmFuncType and signature string. |
| src/coreclr/tools/Common/Compiler/CompilerTypeSystemContext.Wasm.cs | Adds caching for struct-by-size and empty-struct roundtripping during signature raising. |
| src/coreclr/tools/aot/ILCompiler.RyuJit/JitInterface/CorInfoImpl.RyuJit.cs | Provides stub recordCallSite to satisfy interface surface. |
| src/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/ReadyToRunSignature.cs | Adds pretty-printer/parser support for InjectStringThunks fixup payload. |
| src/coreclr/tools/aot/ILCompiler.ReadyToRun/JitInterface/CorInfoImpl.ReadyToRun.cs | Records WASM callsites to demand-drive thunk dependencies; adds per-method interpreter→R2R thunks. |
| src/coreclr/tools/aot/ILCompiler.ReadyToRun/ILCompiler.ReadyToRun.csproj | Includes new dependency-analysis nodes and shared Wasm type-system context file. |
| src/coreclr/tools/aot/ILCompiler.ReadyToRun/Compiler/DependencyAnalysis/ReadyToRunCodegenNodeFactory.cs | Adds registration/collection for string-discoverable stubs and thunk node caches; introduces InjectStringThunks import. |
| src/coreclr/tools/aot/ILCompiler.ReadyToRun/Compiler/DependencyAnalysis/ReadyToRun/WasmR2RToInterpreterThunkNode.cs | New string-discoverable thunk node for R2R→interpreter transitions. |
| src/coreclr/tools/aot/ILCompiler.ReadyToRun/Compiler/DependencyAnalysis/ReadyToRun/WasmInterpreterToR2RThunkNode.cs | New string-discoverable thunk node for interpreter→R2R transitions. |
| src/coreclr/tools/aot/ILCompiler.ReadyToRun/Compiler/DependencyAnalysis/ReadyToRun/WasmImportThunkPortableEntrypoint.cs | Switches import thunk identity to use WasmSignature + managed-call flags handling. |
| src/coreclr/tools/aot/ILCompiler.ReadyToRun/Compiler/DependencyAnalysis/ReadyToRun/WasmImportThunk.cs | Updates import thunk to be keyed by WasmSignature and to handle indirect/empty structs consistently. |
| src/coreclr/tools/aot/ILCompiler.ReadyToRun/Compiler/DependencyAnalysis/ReadyToRun/StringDiscoverableAssemblyStubNode.cs | Adds base class for stubs discoverable by string key; registers into fixup emission. |
| src/coreclr/tools/aot/ILCompiler.ReadyToRun/Compiler/DependencyAnalysis/ReadyToRun/InjectStringThunksSignature.cs | Emits fixup payload of (utf8-string, reloc) pairs with terminator. |
| src/coreclr/tools/aot/ILCompiler.ReadyToRun/Compiler/DependencyAnalysis/ReadyToRun/DelayLoadHelperImport.cs | Avoids generating delay-load helper thunk for generic lookups on WASM; emits null pointer in table. |
| src/coreclr/tools/aot/ILCompiler.ReadyToRun/Compiler/DependencyAnalysis/ReadyToRun/ArgIterator.cs | Fixes Wasm32 maxOffset accounting when only unnamed args exist. |
| src/coreclr/tools/aot/ILCompiler.Compiler/ILCompiler.Compiler.csproj | Includes shared Wasm type-system context file. |
| src/coreclr/nativeaot/Runtime/inc/ModuleHeaders.h | Bumps NativeAOT R2R header minor version to 18.6. |
| src/coreclr/jit/jit.h | Adds INDEBUG_OR_WASM macro to keep callsite signature plumbing in WASM non-debug builds. |
| src/coreclr/jit/importercalls.cpp | Allocates/stashes CORINFO_SIG_INFO for WASM so emitter can report callsites. |
| src/coreclr/jit/gentree.h | Keeps GenTreeCall::callSig field for WASM non-debug builds. |
| src/coreclr/jit/gentree.cpp | Initializes/clones callSig under INDEBUG_OR_WASM. |
| src/coreclr/jit/emitwasm.cpp | Always records callsite signature for WASM; tags unmanaged calls for filtering in R2R compiler. |
| src/coreclr/jit/emit.h | Extends call emission params/debug info to carry unmanaged-call bit on WASM. |
| src/coreclr/jit/emit.cpp | Keeps callsite recording enabled on WASM even in non-debug builds. |
| src/coreclr/jit/codegenwasm.cpp | Passes call signature info through to emitter on WASM and tracks unmanaged calls. |
| src/coreclr/inc/readytorun.h | Bumps R2R minor version and adds READYTORUN_FIXUP_InjectStringThunks native enum value. |
| docs/design/coreclr/botr/readytorun-format.md | Documents fixup 0x39 payload format and semantics. |
| docs/design/coreclr/botr/clr-abi.md | Documents WASM type lowering + signature-string encoding (T, S<N>, e, p) and prefixes. |

Comment thread src/tasks/WasmAppBuilder/coreclr/SignatureMapper.cs
Comment thread src/tasks/WasmAppBuilder/coreclr/SignatureMapper.cs
Comment thread src/coreclr/vm/pregeneratedstringthunks.cpp

The series terminates when the null-terminated string is the empty string (a single `0x00` byte). There is no trailing RVA after the terminal empty string.

At runtime, the entries are merged into a global hash table. Strings already present in the table from previously loaded modules take precedence over new entries. The table can be queried via `LookupPregeneratedThunkByString`.
Member

This seems inefficient for release configurations with a single R2R module. Should we have a mode where this is a prebuilt R2R hash table?

(This can be a follow up.)

Member Author

I'd like to handle this as a follow-up. The value of a pregenerated hash table depends on how many entries there really are and on how we end up using R2R in the field.

Comment thread src/coreclr/vm/pregeneratedstringthunks.cpp
Comment thread src/coreclr/vm/method.cpp Outdated
davidwrighton and others added 7 commits April 28, 2026 11:20
…bleBase, not by the image base.

Co-authored-by: Copilot <copilot@github.com>
Add FlagPendingThunkResolution on DynamicMethodDesc to track whether the
method is already in the pending thunk resolution list. The flag is set/cleared
using interlocked operations under s_pendingThunkResolutionLock, preventing
unbounded growth from re-used LCG methods.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
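
The flag-based deduplication in the commit above can be sketched as follows. This is an illustration only: `DynamicMethod` is a hypothetical stand-in for `DynamicMethodDesc`, the flag's bit value is an assumption, and `std::atomic` operations stand in for the runtime's interlocked flag helpers. As in the commit, the flag is changed while holding the pending-list lock.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

struct DynamicMethod // hypothetical stand-in for DynamicMethodDesc
{
    enum : uint32_t { FlagPendingThunkResolution = 0x1 }; // bit value is an assumption
    std::atomic<uint32_t> flags{0};
};

static std::mutex g_dynamicPendingLock;
static std::vector<DynamicMethod*> g_pendingDynamic;

// Adds the method to the pending list at most once; returns true if it was added.
bool AddPendingOnce(DynamicMethod* md)
{
    assert(md != nullptr);
    std::lock_guard<std::mutex> hold(g_dynamicPendingLock);
    uint32_t old = md->flags.fetch_or(DynamicMethod::FlagPendingThunkResolution);
    if ((old & DynamicMethod::FlagPendingThunkResolution) != 0)
        return false; // already tracked; avoids unbounded growth on reuse
    g_pendingDynamic.push_back(md);
    return true;
}

// Removes the method and clears the flag so a reused method can be tracked again.
void RemovePending(DynamicMethod* md)
{
    assert(md != nullptr);
    std::lock_guard<std::mutex> hold(g_dynamicPendingLock);
    g_pendingDynamic.erase(std::remove(g_pendingDynamic.begin(), g_pendingDynamic.end(), md),
                           g_pendingDynamic.end());
    md->flags.fetch_and(~static_cast<uint32_t>(DynamicMethod::FlagPendingThunkResolution));
}

size_t PendingDynamicCount()
{
    std::lock_guard<std::mutex> hold(g_dynamicPendingLock);
    return g_pendingDynamic.size();
}
```

The sketch does not address what happens if a method is being reused while its flag is cleared; that depends on lifetime guarantees outside this snippet.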
The pregenerated string thunk hash table, lookup, and pending resolution
are only used on WASM. Guard them with TARGET_WASM, providing no-op stubs
for InitializePregeneratedStringThunkHash and ProcessInjectStringThunksFixup
on other platforms so callers remain unchanged.

Also adds FlagPendingThunkResolution on DynamicMethodDesc with interlocked
set/clear to prevent duplicate pending entries from reused LCG methods.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Instead of logging a message and producing an invalid signature, emit
WASM0067 error and return null so the build fails with a clear diagnostic
pointing at the missing entry in s_knownStructSizes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
### Type Lowering

Managed types are lowered to WebAssembly value types according to the following rules
(implemented in `WasmLowering.LowerToAbiType` and `WasmLowering.LowerType`):
Member

Does this match the lowering done by the wasm native toolchain as of right now, or are there differences?

| Value type (struct) — single field with padding, or multiple fields | Passed by reference (`i32` pointer) |
| Empty struct (zero instance fields) | Elided from the signature entirely |

**Struct unwrapping** is recursive: a struct containing a single struct field, where the inner struct
Member

De-duplicate this with the existing content in this doc (look for "Structs are generally passed by-reference, unless they happen to exactly contain a single primitive field" above)

stack and passes a pointer. For return values, the caller provides a hidden return buffer pointer
(see Signature Encoding below).

### Signature String Encoding
Member

This signature encoding is not part of the ABI. This doc is meant to be only about ABI details that the codegen needs to be aware of.

This can be a code comment somewhere, or be in the R2R format.


if (pMD->IsDynamicMethod())
{
    pMD->AsDynamicMethodDesc()->InterlockedClearFlags(DynamicMethodDesc::FlagPendingThunkResolution);
Member

What guarantees that the dynamic method is not in the process of being reused when this is running?

