Skip to content

Add logic to load webcil as r2r#126388

Draft
davidwrighton wants to merge 4 commits intodotnet:mainfrom
davidwrighton:add_logic_to_load_webcil_as_r2r
Draft

Add logic to load webcil as r2r#126388
davidwrighton wants to merge 4 commits intodotnet:mainfrom
davidwrighton:add_logic_to_load_webcil_as_r2r

Conversation

@davidwrighton
Copy link
Copy Markdown
Member

  • Update webcil spec to have a spot to store the tableBase in the header during webcil loading
  • Tweak corerun build to make the stack pointer available as an export and make the table size growable
  • Compute payloadSize by reading the wasm bytes directly instead of calling the helper function
  • import "stackPointer", "tableBase", "table" and "imageBase" from "webcil" instead of "env". [TODO! Update the webcil.md to specify that these must be specified when importing a v1 webcil image.]
  • Adjust cDAC handling of webcil images. Notably, the webcil file format is a specification of its own, and its data structures are specified, and shouldn't be drawn from the target, but instead hard-coded.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates Webcil (WASM-hosted IL) loading to support ReadyToRun by introducing a v1 Webcil header that can carry TableBase, shifting cDAC parsing to use the Webcil spec directly (rather than target type info), and updating the WASM toolchain/host plumbing to provide the required imports and table behavior.

Changes:

  • Add Webcil v1 header handling (extra TableBase field) and adjust Webcil decoding/validation for versioned headers.
  • Update WASM object writer and corerun host to import stackPointer, imageBase, tableBase, and table from "webcil", and enable growable tables.
  • Update cDAC Loader contract/data parsing and tests to use spec-defined offsets/sizes (hard-coded layouts).

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 11 comments.

Show a summary per file
File Description
src/native/managed/cdac/tests/LoaderTests.cs Updates mock Webcil header layout used in Loader contract tests.
src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Data/WebcilSectionHeader.cs Switches Webcil section header reads to spec-defined fixed offsets.
src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Data/WebcilHeader.cs Reads version/section count at fixed offsets and introduces version-aware header sizing.
src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/Loader_1.cs Uses spec-based header/section sizes to resolve Webcil RVA→file offset.
src/coreclr/vm/datadescriptor/datadescriptor.inc Removes cDAC type descriptors for Webcil header/section structs (no longer drawn from target).
src/coreclr/utilcode/webcildecoder.cpp Adds v1 header support and version-aware section-table base handling for Webcil decoding/DAC enumeration.
src/coreclr/inc/webcildecoder.h Introduces Webcil v1 header struct and tracks section header base pointer in decoder.
src/coreclr/tools/Common/Wasm/Webcil.cs Bumps Webcil major version constant to 1.
src/coreclr/tools/Common/Compiler/ObjectWriter/WebcilEncoder.cs Makes header encoding size version-aware and introduces TableBase offset constant.
src/coreclr/tools/Common/Compiler/ObjectWriter/WasmObjectWriter.cs Imports globals/table from "webcil", emits passive element segment, and populates TableBase during payload copy.
src/coreclr/tools/Common/Compiler/ObjectWriter/WasmNative.cs Adds a table import type encoding.
src/coreclr/tools/Common/Compiler/ObjectWriter/WasmInstructions.cs Adds WASM instruction encoders for ref.null, table.grow, and table.init.
src/coreclr/hosts/corerun/wasm/libCorerun.js Parses payload size from wasm bytes, instantiates Webcil module with "webcil" imports, and extracts payload.
src/coreclr/hosts/corerun/CMakeLists.txt Enables table growth and exports the stack pointer global for the host.
docs/design/mono/webcil.md Documents v1 header’s TableBase field semantics.
docs/design/datacontracts/Loader.md Updates Loader contract documentation to reference spec-defined Webcil header/section sizes.
Comments suppressed due to low confidence (2)

src/coreclr/tools/Common/Compiler/ObjectWriter/WebcilEncoder.cs:54

  • HeaderEncodeSize can return 32 for v1+, but EmitHeader only writes fields up to offset 24..27. The final 4 bytes (TableBase) are left uninitialized in the stackalloc buffer, so emitted Webcil headers may contain non-deterministic garbage.

Ensure the buffer is initialized (e.g., headerBuffer.Clear()) and explicitly write a value for TableBase (likely 0 in the serialized image, since it is patched during getWebcilPayload).

        public static int HeaderEncodeSize(WebcilHeader header)
        {
            int size = sizeof(uint) +    // Id: 4
                    sizeof(ushort) +     // VersionMajor: 6
                    sizeof(ushort) +     // VersionMinor: 8
                    sizeof(ushort) +     // CoffSections: 10
                    sizeof(ushort) +     // Reserved0: 12
                    sizeof(uint) +       // PeCliHeaderRva: 16
                    sizeof(uint) +       // PeCliHeaderSize: 20
                    sizeof(uint) +       // PeDebugRva: 24
                    sizeof(uint);        // PeDebugSize: 28
            Debug.Assert(size == 28);
            if (header.VersionMajor >= 1)
            {
                size += sizeof(uint); // TableBase: 32
            }
            return size;
        }

        public static int TableBaseOffset => 28;

        public static int EmitHeader(in WebcilHeader header, Stream outputStream)
        {
            int encodeSize = HeaderEncodeSize(header);
            Span<byte> headerBuffer = stackalloc byte[encodeSize];
            BinaryPrimitives.WriteUInt32LittleEndian(headerBuffer.Slice(0, 4), header.Id);
            BinaryPrimitives.WriteUInt16LittleEndian(headerBuffer.Slice(4, 2), header.VersionMajor);
            BinaryPrimitives.WriteUInt16LittleEndian(headerBuffer.Slice(6, 2), header.VersionMinor);
            BinaryPrimitives.WriteUInt16LittleEndian(headerBuffer.Slice(8, 2), header.CoffSections);
            BinaryPrimitives.WriteUInt16LittleEndian(headerBuffer.Slice(10, 2), header.Reserved0);
            BinaryPrimitives.WriteUInt32LittleEndian(headerBuffer.Slice(12, 4), header.PeCliHeaderRva);
            BinaryPrimitives.WriteUInt32LittleEndian(headerBuffer.Slice(16, 4), header.PeCliHeaderSize);
            BinaryPrimitives.WriteUInt32LittleEndian(headerBuffer.Slice(20, 4), header.PeDebugRva);
            BinaryPrimitives.WriteUInt32LittleEndian(headerBuffer.Slice(24, 4), header.PeDebugSize);

            outputStream.Write(headerBuffer);
            return encodeSize;
        }

src/coreclr/tools/Common/Wasm/Webcil.cs:16

  • WC_VERSION_MAJOR is now 1, but WebcilHeader (in this file) is still a 28-byte struct and does not model the v1 TableBase field described in docs/design/mono/webcil.md. This makes the managed representation inconsistent with the spec and with the encoder logic that can emit a 32-byte header.

Consider introducing an explicit v1 header type (or extending WebcilHeader/adding a trailing field) so the versioned layout is representable and can be encoded/decoded deterministically.

internal static class WebcilConstants
{
    public const int WC_VERSION_MAJOR = 1;
    public const int WC_VERSION_MINOR = 0;

    /// <summary>
    /// 'WbIL' magic bytes interpreted as a little-endian uint32.
    /// </summary>
    public const uint WEBCIL_MAGIC = 0x4c496257;

Comment on lines +6 to 9
using System.ComponentModel.DataAnnotations;
using System.IO;
using System.Runtime.InteropServices;
using System.Text;
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

System.ComponentModel.DataAnnotations is imported but never used in this test file. This will produce an unused-using warning (often treated as an error in this repo). Remove the using directive.

Suggested change
using System.ComponentModel.DataAnnotations;
using System.IO;
using System.Runtime.InteropServices;
using System.Text;
using System.IO;
using System.Runtime.InteropServices;

Copilot uses AI. Check for mistakes.
Comment on lines 56 to +66
m_size = size;
m_hasContents = (size > 0);
m_pHeader = (m_hasContents && size >= sizeof(WebcilHeader)) ? (const WebcilHeader *)flatBase : NULL;
if (m_pHeader->VersionMajor >= 1)
{
// For version 1 and above, the section headers start after the larger header
if (size < sizeof(WebcilHeader_1))
{
m_pHeader = NULL; // Not enough data for even the header
}
}
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Init dereferences m_pHeader unconditionally (m_pHeader->VersionMajor) even when size < sizeof(WebcilHeader) (in which case m_pHeader is set to NULL). This is a null-dereference when Init is called on small/empty buffers.

Guard the access (e.g., if (m_pHeader != NULL && m_pHeader->VersionMajor >= 1)), and ensure m_sections is set consistently for all paths (also consider initializing/clearing m_sections in the ctor/Reset() since it is a new member).

Copilot uses AI. Check for mistakes.
RETURN FALSE;

COUNT_T headerEnd = sizeof(WebcilHeader) + (COUNT_T)pHeader->CoffSections * sizeof(WebcilSectionHeader);
COUNT_T headerEnd = static_cast<COUNT_T>(((uint8_t*)m_sections - (uint8_t*)m_base)) + (COUNT_T)pHeader->CoffSections * sizeof(WebcilSectionHeader);
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HasWebcilHeaders computes headerEnd using m_sections, but m_sections can be NULL (e.g., when VersionMajor==1 but size < sizeof(WebcilHeader_1) and Init nulled m_pHeader). This leads to invalid pointer arithmetic/underflow and can make the bounds check unreliable.

Compute the section table start from pHeader->VersionMajor after validating it (e.g., use sizeof(WebcilHeader_1) vs sizeof(WebcilHeader) directly) and return FALSE early if the buffer is too small for the chosen header size.

Suggested change
COUNT_T headerEnd = static_cast<COUNT_T>(((uint8_t*)m_sections - (uint8_t*)m_base)) + (COUNT_T)pHeader->CoffSections * sizeof(WebcilSectionHeader);
COUNT_T headerSize;
if (pHeader->VersionMajor == WEBCIL_VERSION_MAJOR_0)
{
headerSize = sizeof(WebcilHeader);
}
else
{
// WEBCIL_VERSION_MAJOR_1
headerSize = sizeof(WebcilHeader_1);
}
if (m_size < headerSize)
RETURN FALSE;
const uint8_t *sectionTable = reinterpret_cast<const uint8_t *>(m_base) + headerSize;
COUNT_T headerEnd = static_cast<COUNT_T>(sectionTable - reinterpret_cast<const uint8_t *>(m_base)) +
static_cast<COUNT_T>(pHeader->CoffSections) * sizeof(WebcilSectionHeader);

Copilot uses AI. Check for mistakes.
Comment on lines 855 to 866
if (enumThis)
{
DacEnumMemoryRegion(m_base, sizeof(WebcilHeader));
const WebcilHeader *pHeader = (const WebcilHeader *)m_base;
DacEnumMemoryRegion(m_base, dac_cast<TADDR>(m_sections) - dac_cast<TADDR>(m_base));
}

if (HasWebcilHeaders())
{
// Enumerate section headers
const WebcilHeader *pHeader = (const WebcilHeader *)m_base;
DacEnumMemoryRegion(m_base + sizeof(WebcilHeader),
sizeof(WebcilSectionHeader) * pHeader->CoffSections);
DacEnumMemoryRegion(m_sections, sizeof(WebcilSectionHeader) * pHeader->CoffSections);

Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EnumMemoryRegions enumerates dac_cast<TADDR>(m_sections) - dac_cast<TADDR>(m_base) even when m_sections is NULL (or not initialized). That can underflow and cause DAC to try to enumerate an enormous memory region.

Only use m_sections when it is non-null; otherwise enumerate a safe minimal range (e.g., min(m_size, sizeof(WebcilHeader))) or compute the header size from validated version data before enumerating.

Copilot uses AI. Check for mistakes.
@dotnet-policy-service
Copy link
Copy Markdown
Contributor

Tagging subscribers to 'arch-wasm': @lewing, @pavelsavara
See info in area-owners.md if you want to be subscribed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

arch-wasm WebAssembly architecture area-crossgen2-coreclr

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants