Add logic to load webcil as r2r#126388
Conversation
davidwrighton
commented
Apr 1, 2026
- Update webcil spec to have a spot to store the tableBase in the header during webcil loading
- Tweak corerun build to make the stack pointer available as an export and make the table size growable
- Compute payloadSize by reading the wasm bytes directly instead of calling the helper function
- import "stackPointer", "tableBase", "table" and "imageBase" from "webcil" instead of "env". [TODO! Update the webcil.md to specify that these must be specified when importing a v1 webcil image.]
- Adjust cDAC handling of webcil images. Notably, the webcil file format is a specification of its own, and its data structures are specified, and shouldn't be drawn from the target, but instead hard-coded.
There was a problem hiding this comment.
Pull request overview
Updates Webcil (WASM-hosted IL) loading to support ReadyToRun by introducing a v1 Webcil header that can carry TableBase, shifting cDAC parsing to use the Webcil spec directly (rather than target type info), and updating the WASM toolchain/host plumbing to provide the required imports and table behavior.
Changes:
- Add Webcil v1 header handling (extra
TableBasefield) and adjust Webcil decoding/validation for versioned headers. - Update WASM object writer and corerun host to import
stackPointer,imageBase,tableBase, andtablefrom"webcil", and enable growable tables. - Update cDAC Loader contract/data parsing and tests to use spec-defined offsets/sizes (hard-coded layouts).
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 11 comments.
Show a summary per file
| File | Description |
|---|---|
| src/native/managed/cdac/tests/LoaderTests.cs | Updates mock Webcil header layout used in Loader contract tests. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Data/WebcilSectionHeader.cs | Switches Webcil section header reads to spec-defined fixed offsets. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Data/WebcilHeader.cs | Reads version/section count at fixed offsets and introduces version-aware header sizing. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/Loader_1.cs | Uses spec-based header/section sizes to resolve Webcil RVA→file offset. |
| src/coreclr/vm/datadescriptor/datadescriptor.inc | Removes cDAC type descriptors for Webcil header/section structs (no longer drawn from target). |
| src/coreclr/utilcode/webcildecoder.cpp | Adds v1 header support and version-aware section-table base handling for Webcil decoding/DAC enumeration. |
| src/coreclr/inc/webcildecoder.h | Introduces Webcil v1 header struct and tracks section header base pointer in decoder. |
| src/coreclr/tools/Common/Wasm/Webcil.cs | Bumps Webcil major version constant to 1. |
| src/coreclr/tools/Common/Compiler/ObjectWriter/WebcilEncoder.cs | Makes header encoding size version-aware and introduces TableBase offset constant. |
| src/coreclr/tools/Common/Compiler/ObjectWriter/WasmObjectWriter.cs | Imports globals/table from "webcil", emits passive element segment, and populates TableBase during payload copy. |
| src/coreclr/tools/Common/Compiler/ObjectWriter/WasmNative.cs | Adds a table import type encoding. |
| src/coreclr/tools/Common/Compiler/ObjectWriter/WasmInstructions.cs | Adds WASM instruction encoders for ref.null, table.grow, and table.init. |
| src/coreclr/hosts/corerun/wasm/libCorerun.js | Parses payload size from wasm bytes, instantiates Webcil module with "webcil" imports, and extracts payload. |
| src/coreclr/hosts/corerun/CMakeLists.txt | Enables table growth and exports the stack pointer global for the host. |
| docs/design/mono/webcil.md | Documents v1 header’s TableBase field semantics. |
| docs/design/datacontracts/Loader.md | Updates Loader contract documentation to reference spec-defined Webcil header/section sizes. |
Comments suppressed due to low confidence (2)
src/coreclr/tools/Common/Compiler/ObjectWriter/WebcilEncoder.cs:54
HeaderEncodeSizecan return 32 for v1+, butEmitHeaderonly writes fields up to offset 24..27. The final 4 bytes (TableBase) are left uninitialized in thestackallocbuffer, so emitted Webcil headers may contain non-deterministic garbage.
Ensure the buffer is initialized (e.g., headerBuffer.Clear()) and explicitly write a value for TableBase (likely 0 in the serialized image, since it is patched during getWebcilPayload).
public static int HeaderEncodeSize(WebcilHeader header)
{
int size = sizeof(uint) + // Id: 4
sizeof(ushort) + // VersionMajor: 6
sizeof(ushort) + // VersionMinor: 8
sizeof(ushort) + // CoffSections: 10
sizeof(ushort) + // Reserved0: 12
sizeof(uint) + // PeCliHeaderRva: 16
sizeof(uint) + // PeCliHeaderSize: 20
sizeof(uint) + // PeDebugRva: 24
sizeof(uint); // PeDebugSize: 28
Debug.Assert(size == 28);
if (header.VersionMajor >= 1)
{
size += sizeof(uint); // TableBase: 32
}
return size;
}
public static int TableBaseOffset => 28;
public static int EmitHeader(in WebcilHeader header, Stream outputStream)
{
int encodeSize = HeaderEncodeSize(header);
Span<byte> headerBuffer = stackalloc byte[encodeSize];
BinaryPrimitives.WriteUInt32LittleEndian(headerBuffer.Slice(0, 4), header.Id);
BinaryPrimitives.WriteUInt16LittleEndian(headerBuffer.Slice(4, 2), header.VersionMajor);
BinaryPrimitives.WriteUInt16LittleEndian(headerBuffer.Slice(6, 2), header.VersionMinor);
BinaryPrimitives.WriteUInt16LittleEndian(headerBuffer.Slice(8, 2), header.CoffSections);
BinaryPrimitives.WriteUInt16LittleEndian(headerBuffer.Slice(10, 2), header.Reserved0);
BinaryPrimitives.WriteUInt32LittleEndian(headerBuffer.Slice(12, 4), header.PeCliHeaderRva);
BinaryPrimitives.WriteUInt32LittleEndian(headerBuffer.Slice(16, 4), header.PeCliHeaderSize);
BinaryPrimitives.WriteUInt32LittleEndian(headerBuffer.Slice(20, 4), header.PeDebugRva);
BinaryPrimitives.WriteUInt32LittleEndian(headerBuffer.Slice(24, 4), header.PeDebugSize);
outputStream.Write(headerBuffer);
return encodeSize;
}
src/coreclr/tools/Common/Wasm/Webcil.cs:16
WC_VERSION_MAJORis now 1, butWebcilHeader(in this file) is still a 28-byte struct and does not model the v1TableBasefield described in docs/design/mono/webcil.md. This makes the managed representation inconsistent with the spec and with the encoder logic that can emit a 32-byte header.
Consider introducing an explicit v1 header type (or extending WebcilHeader/adding a trailing field) so the versioned layout is representable and can be encoded/decoded deterministically.
internal static class WebcilConstants
{
public const int WC_VERSION_MAJOR = 1;
public const int WC_VERSION_MINOR = 0;
/// <summary>
/// 'WbIL' magic bytes interpreted as a little-endian uint32.
/// </summary>
public const uint WEBCIL_MAGIC = 0x4c496257;
| using System.ComponentModel.DataAnnotations; | ||
| using System.IO; | ||
| using System.Runtime.InteropServices; | ||
| using System.Text; |
There was a problem hiding this comment.
System.ComponentModel.DataAnnotations is imported but never used in this test file. This will produce an unused-using warning (often treated as an error in this repo). Remove the using directive.
| using System.ComponentModel.DataAnnotations; | |
| using System.IO; | |
| using System.Runtime.InteropServices; | |
| using System.Text; | |
| using System.IO; | |
| using System.Runtime.InteropServices; |
| m_size = size; | ||
| m_hasContents = (size > 0); | ||
| m_pHeader = (m_hasContents && size >= sizeof(WebcilHeader)) ? (const WebcilHeader *)flatBase : NULL; | ||
| if (m_pHeader->VersionMajor >= 1) | ||
| { | ||
| // For version 1 and above, the section headers start after the larger header | ||
| if (size < sizeof(WebcilHeader_1)) | ||
| { | ||
| m_pHeader = NULL; // Not enough data for even the header | ||
| } | ||
| } |
There was a problem hiding this comment.
Init dereferences m_pHeader unconditionally (m_pHeader->VersionMajor) even when size < sizeof(WebcilHeader) (in which case m_pHeader is set to NULL). This is a null-dereference when Init is called on small/empty buffers.
Guard the access (e.g., if (m_pHeader != NULL && m_pHeader->VersionMajor >= 1)), and ensure m_sections is set consistently for all paths (also consider initializing/clearing m_sections in the ctor/Reset() since it is a new member).
| RETURN FALSE; | ||
|
|
||
| COUNT_T headerEnd = sizeof(WebcilHeader) + (COUNT_T)pHeader->CoffSections * sizeof(WebcilSectionHeader); | ||
| COUNT_T headerEnd = static_cast<COUNT_T>(((uint8_t*)m_sections - (uint8_t*)m_base)) + (COUNT_T)pHeader->CoffSections * sizeof(WebcilSectionHeader); |
There was a problem hiding this comment.
HasWebcilHeaders computes headerEnd using m_sections, but m_sections can be NULL (e.g., when VersionMajor==1 but size < sizeof(WebcilHeader_1) and Init nulled m_pHeader). This leads to invalid pointer arithmetic/underflow and can make the bounds check unreliable.
Compute the section table start from pHeader->VersionMajor after validating it (e.g., use sizeof(WebcilHeader_1) vs sizeof(WebcilHeader) directly) and return FALSE early if the buffer is too small for the chosen header size.
| COUNT_T headerEnd = static_cast<COUNT_T>(((uint8_t*)m_sections - (uint8_t*)m_base)) + (COUNT_T)pHeader->CoffSections * sizeof(WebcilSectionHeader); | |
| COUNT_T headerSize; | |
| if (pHeader->VersionMajor == WEBCIL_VERSION_MAJOR_0) | |
| { | |
| headerSize = sizeof(WebcilHeader); | |
| } | |
| else | |
| { | |
| // WEBCIL_VERSION_MAJOR_1 | |
| headerSize = sizeof(WebcilHeader_1); | |
| } | |
| if (m_size < headerSize) | |
| RETURN FALSE; | |
| const uint8_t *sectionTable = reinterpret_cast<const uint8_t *>(m_base) + headerSize; | |
| COUNT_T headerEnd = static_cast<COUNT_T>(sectionTable - reinterpret_cast<const uint8_t *>(m_base)) + | |
| static_cast<COUNT_T>(pHeader->CoffSections) * sizeof(WebcilSectionHeader); |
| if (enumThis) | ||
| { | ||
| DacEnumMemoryRegion(m_base, sizeof(WebcilHeader)); | ||
| const WebcilHeader *pHeader = (const WebcilHeader *)m_base; | ||
| DacEnumMemoryRegion(m_base, dac_cast<TADDR>(m_sections) - dac_cast<TADDR>(m_base)); | ||
| } | ||
|
|
||
| if (HasWebcilHeaders()) | ||
| { | ||
| // Enumerate section headers | ||
| const WebcilHeader *pHeader = (const WebcilHeader *)m_base; | ||
| DacEnumMemoryRegion(m_base + sizeof(WebcilHeader), | ||
| sizeof(WebcilSectionHeader) * pHeader->CoffSections); | ||
| DacEnumMemoryRegion(m_sections, sizeof(WebcilSectionHeader) * pHeader->CoffSections); | ||
|
|
There was a problem hiding this comment.
EnumMemoryRegions enumerates dac_cast<TADDR>(m_sections) - dac_cast<TADDR>(m_base) even when m_sections is NULL (or not initialized). That can underflow and cause DAC to try to enumerate an enormous memory region.
Only use m_sections when it is non-null; otherwise enumerate a safe minimal range (e.g., min(m_size, sizeof(WebcilHeader))) or compute the header size from validated version data before enumerating.
src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Data/WebcilHeader.cs
Show resolved
Hide resolved
|
Tagging subscribers to 'arch-wasm': @lewing, @pavelsavara |
src/coreclr/tools/Common/Compiler/ObjectWriter/WasmObjectWriter.cs
Outdated
Show resolved
Hide resolved
… more friendly for a future multithreaded scenario