Add Webcil support to R2RDump #127885

Open
davidwrighton wants to merge 6 commits into dotnet:main from davidwrighton:wasmR2RDump

@davidwrighton
Member

Note

This PR was created with the assistance of GitHub Copilot.

Summary

Adds support for reading and dumping Webcil files in R2RDump, including a full WebAssembly bytecode disassembler.

Changes

  • Webcil support in R2RDump: Enable R2RDump to open and process Webcil (.wasm) assemblies.
  • IAssemblyMetadata.GetSectionData: Add GetSectionData to the metadata interface and implement across all types.
  • WASM bytecode disassembler: New WasmDisassembler class that decodes WASM binary instructions into WAT text format, covering:
    • All core MVP instructions (control flow, memory, numeric, reference, GC)
    • Complete SIMD instruction set (0xFD prefix, all 256 sub-opcodes)
    • Exception handling try_table instruction with all catch clause kinds
    • Bulk memory, table, and saturating truncation instructions (0xFC prefix)
    • GC/struct/array instructions (0xFB prefix)
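
The prefixed groups listed above (0xFC, 0xFD, 0xFB) share one encoding detail in the WebAssembly binary format: the prefix byte is followed by a sub-opcode stored as an unsigned LEB128 integer. A minimal sketch of that decoding step is shown here. The names `Leb128Sketch`, `ReadULeb128`, and `ReadPrefixedOpcode` are illustrative assumptions and are not taken from the PR's WasmDisassembler.

```csharp
using System;

// Illustrative sketch only; these names are hypothetical and not the PR's API.
internal static class Leb128Sketch
{
    // Decodes an unsigned LEB128 value (at most 32 bits) starting at 'pos'
    // and advances 'pos' past the bytes that were consumed.
    public static uint ReadULeb128(ReadOnlySpan<byte> data, ref int pos)
    {
        uint result = 0;
        int shift = 0;
        while (true)
        {
            // A u32 needs at most five 7-bit groups (shifts 0, 7, 14, 21, 28).
            if (shift >= 35)
                throw new FormatException("LEB128 value is too long for a u32.");
            byte b = data[pos++];
            result |= (uint)(b & 0x7F) << shift;
            if ((b & 0x80) == 0)
                return result;
            shift += 7;
        }
    }

    // Reads a prefix byte (0xFB, 0xFC, or 0xFD) and the LEB128 sub-opcode after it.
    public static (byte Prefix, uint SubOpcode) ReadPrefixedOpcode(ReadOnlySpan<byte> data, ref int pos)
    {
        byte prefix = data[pos++];
        if (prefix != 0xFB && prefix != 0xFC && prefix != 0xFD)
            throw new FormatException($"0x{prefix:X2} is not a prefix byte.");
        return (prefix, ReadULeb128(data, ref pos));
    }
}
```

For example, the bytes FC 0A decode to prefix 0xFC with sub-opcode 10 (memory.copy), while FD 80 01 decodes to sub-opcode 128 because the sub-opcode spans two bytes. This is why a SIMD decoder cannot treat the byte after 0xFD as the whole sub-opcode.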

davidwrighton and others added 6 commits May 6, 2026 12:59
R2RDump previously could not read Webcil files (the format used for
managed assemblies in WebAssembly environments). This adds a
WebcilImageReader that implements IBinaryImageReader for the Webcil
format, enabling R2RDump to dump headers, methods, and section
contents from Webcil-format R2R images.

Changes:
- New WebcilImageReader.cs implementing IBinaryImageReader
- ReadyToRunReader detects Webcil format (after MachO, before PE)
- DumpModel handles Webcil in reference assembly loading
- Program.cs maps OperatingSystem.Unknown to TargetOS.Linux for Webcil
- ReadyToRunMethod gracefully handles null PEReader (Webcil has no PE)
- ILCompiler.Reflection.ReadyToRun.csproj includes shared Webcil.cs

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Replace the PEReader ImageReader property with a GetSectionData(int rva)
method that returns a BlobReader. This decouples the interface from
PEReader, enabling non-PE formats (Webcil) to provide section data.

Implementations:
- StandaloneAssemblyMetadata: delegates to PEReader.GetSectionData
- ManifestAssemblyMetadata: same with null-guard
- WebcilAssemblyMetadata: resolves RVA via WebcilImageReader sections
- SimpleAssemblyMetadata (tests): delegates to PEReader.GetSectionData

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
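
The RVA resolution that a non-PE implementation of GetSectionData has to perform can be sketched as follows. This is a simplified illustration under stated assumptions: `SectionSketch` and `RvaSketch` are hypothetical names, each section's raw size is assumed to equal its virtual size, and the sketch returns a span rather than the BlobReader that the PR's method returns.

```csharp
using System;
using System.Collections.Generic;

// Illustrative sketch only; the type and member names are hypothetical, not the PR's.
internal readonly record struct SectionSketch(int VirtualAddress, int VirtualSize, int FileOffset);

internal static class RvaSketch
{
    // Returns the bytes from 'rva' to the end of its containing section,
    // or an empty span if no section covers the RVA.
    // Assumes each section's raw size equals its virtual size.
    public static ReadOnlySpan<byte> GetSectionData(byte[] image, IReadOnlyList<SectionSketch> sections, int rva)
    {
        foreach (SectionSketch s in sections)
        {
            if (rva >= s.VirtualAddress && rva < s.VirtualAddress + s.VirtualSize)
            {
                // Offset of the RVA within the section, applied to the file offset.
                int delta = rva - s.VirtualAddress;
                return image.AsSpan(s.FileOffset + delta, s.VirtualSize - delta);
            }
        }
        return ReadOnlySpan<byte>.Empty;
    }
}
```

With a single section at virtual address 0x1000, size 8, and file offset 4, an RVA of 0x1002 maps to file offset 6 and yields the remaining 6 bytes of the section.
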
Implement a full WASM instruction disassembler that decodes WebAssembly
binary format into WAT-style text output. This enables the --disasm flag
in R2RDump to work with Webcil/WASM R2R images.

- Add WasmDisassembler.cs with complete opcode tables for all standard
  WASM instructions (control, parametric, variable, table, memory,
  numeric, conversion, sign-extension, reference types) plus 0xFC
  (bulk memory/saturating truncation), 0xFB (GC), and 0xFD (SIMD)
  prefixed opcodes
- Add WebcilImageReader.GetWasmFunctionBody() to parse the WASM module's
  type, function, and code sections to extract function info including
  type signature and local declarations
- Integrate into TextDumper.DumpWasmDisasm() to print parameters and
  locals with their local indices, result types, and disassembled
  instructions

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

WebcilAssemblyMetadata was not retaining a reference to the pinned
metadata byte array passed to its constructor. After
GetStandaloneAssemblyMetadata returned, the array could be collected
by the GC despite being allocated on the Pinned Object Heap, since
no live reference existed. This caused an AccessViolationException
when MetadataReader accessed the freed memory on larger files like
system.private.corelib.wasm.

Fix: store the metadata byte array in a field to keep it rooted for
the lifetime of the MetadataReader.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Replace the stub DecodeFDPrefixed() method with a complete implementation
of all WebAssembly SIMD instructions (0xFD prefix, sub-opcodes 0-255)
per the WebAssembly spec. This includes memory operations, lane
load/store, shuffle, splat, extract/replace lane, comparisons, bitwise
operations, arithmetic, and conversion instructions.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Implement opcode 0x1F (try_table) per the WebAssembly exception handling
spec. Decodes the block type and vector of catch clauses, supporting all
four catch clause kinds: catch, catch_ref, catch_all, catch_all_ref.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
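
The catch-clause vector described in the try_table commit can be sketched as below. In the exception handling proposal's binary format, each clause starts with a kind byte: 0x00 (catch) and 0x01 (catch_ref) carry a tag index and a label index, while 0x02 (catch_all) and 0x03 (catch_all_ref) carry only a label index. The names `TryTableSketch` and `ReadCatchClauses` are hypothetical, and the output formatting is an assumption rather than the PR's exact text.

```csharp
using System;
using System.Collections.Generic;

// Illustrative sketch only; names and output format are hypothetical, not the PR's.
internal static class TryTableSketch
{
    // Minimal unsigned LEB128 reader for u32 immediates.
    private static uint ReadU32(ReadOnlySpan<byte> data, ref int pos)
    {
        uint result = 0;
        for (int shift = 0; shift < 35; shift += 7)
        {
            byte b = data[pos++];
            result |= (uint)(b & 0x7F) << shift;
            if ((b & 0x80) == 0)
                return result;
        }
        throw new FormatException("LEB128 value is too long for a u32.");
    }

    // Decodes the catch-clause vector that follows try_table's block type.
    // 'pos' must point at the clause count.
    public static List<string> ReadCatchClauses(ReadOnlySpan<byte> data, ref int pos)
    {
        uint count = ReadU32(data, ref pos);
        var clauses = new List<string>();
        for (uint i = 0; i < count; i++)
        {
            byte kind = data[pos++];
            switch (kind)
            {
                case 0x00: // catch tagidx labelidx
                case 0x01: // catch_ref tagidx labelidx
                {
                    string name = kind == 0x00 ? "catch" : "catch_ref";
                    uint tag = ReadU32(data, ref pos);
                    uint label = ReadU32(data, ref pos);
                    clauses.Add($"({name} {tag} {label})");
                    break;
                }
                case 0x02: // catch_all labelidx
                case 0x03: // catch_all_ref labelidx
                {
                    string name = kind == 0x02 ? "catch_all" : "catch_all_ref";
                    uint label = ReadU32(data, ref pos);
                    clauses.Add($"({name} {label})");
                    break;
                }
                default:
                    throw new FormatException($"Unknown catch clause kind 0x{kind:X2}.");
            }
        }
        return clauses;
    }
}
```

Rejecting unknown kind bytes, rather than skipping them, keeps the decoder from misreading the instruction stream that follows the clause vector.
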
Copilot AI review requested due to automatic review settings May 6, 2026 20:44
@github-actions (bot) added the area-crossgen2-coreclr label (label description: "only use for closed issues") on May 6, 2026
@davidwrighton davidwrighton requested a review from adamperlin May 6, 2026 20:53
Contributor

Copilot AI left a comment

Pull request overview

This PR extends the CoreCLR R2RDump toolchain to recognize Webcil inputs (including WASM-wrapped Webcil) and adds a WebAssembly bytecode disassembler for dumping function bodies in a WAT-like textual form. It also evolves the metadata abstraction so method-body bytes can be retrieved without assuming a PE-backed PEReader.

Changes:

  • Add WebcilImageReader support to ReadyToRunReader initialization and R2RDump’s metadata-opening path.
  • Replace IAssemblyMetadata.ImageReader with IAssemblyMetadata.GetSectionData(int rva) and update method-body local signature decoding accordingly.
  • Add WasmDisassembler and integrate WASM disassembly printing into TextDumper for Webcil/WASM scenarios.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 10 comments.

File | Description
--- | ---
src/coreclr/tools/r2rdump/WasmDisassembler.cs | New WASM bytecode decoder/disassembler for dumping instructions.
src/coreclr/tools/r2rdump/TextDumper.cs | Emits WASM-specific disassembly and metadata (params/locals/results) for Webcil inputs.
src/coreclr/tools/r2rdump/Program.cs | Adds fallback handling for OperatingSystem.Unknown when producing TargetDetails.
src/coreclr/tools/r2rdump/DumpModel.cs | Detects Webcil inputs when opening reference assemblies for metadata resolution.
src/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/WebcilImageReader.cs | New reader that parses Webcil (and WASM-wrapped Webcil) and exposes metadata/sections/function bodies.
src/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/StandaloneAssemblyMetadata.cs | Implements GetSectionData via PEReader section access.
src/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/ReadyToRunReader.cs | Detects Webcil images and uses WebcilImageReader as the CompositeReader.
src/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/ReadyToRunMethod.cs | Switches local-signature decoding to use GetSectionData + MethodBodyBlock.Create.
src/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/ManifestAssemblyMetadata.cs | Implements GetSectionData when backed by a PEReader.
src/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/ILCompiler.Reflection.ReadyToRun.csproj | Links in shared Webcil definitions (Webcil.cs).
src/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/IAssemblyMetadata.cs | Replaces PEReader exposure with GetSectionData(int rva).
src/coreclr/tools/aot/ILCompiler.ReadyToRun.Tests/TestCasesRunner/R2RResultChecker.cs | Updates test metadata wrapper to implement GetSectionData.

Comment on lines +63 to +67
case 0x02:
{
    string bt = ReadBlockType();
    indent++;
    return $"block{bt}";
}

private string ReadHeapType()
{
Comment thread src/coreclr/tools/r2rdump/DumpModel.cs
Comment thread src/coreclr/tools/r2rdump/Program.cs
Comment on lines +504 to +507
else if (WebcilImageReader.IsWebcilImage(image))
{
    CompositeReader = new WebcilImageReader(image);
}
Member Author

This is a future problem.

@jkotas added the area-ReadyToRun and arch-wasm (WebAssembly architecture) labels and removed the area-crossgen2-coreclr label on May 6, 2026
@dotnet-policy-service
Contributor

Tagging subscribers to 'arch-wasm': @lewing, @pavelsavara
See info in area-owners.md if you want to be subscribed.

}

public ImmutableArray<byte> GetEntireImage()
    => Unsafe.As<byte[], ImmutableArray<byte>>(ref Unsafe.AsRef(in _image));
Contributor

Is there any way to avoid the cast from a mutable byte[] to ImmutableArray here?

return result;
}

public int GetOffset(int rva)
Contributor

Suggested change
- public int GetOffset(int rva)
+ public int GetSectionRelativeOffset(int rva)


unsafe
{
fixed (byte* p = &image[(int)offset])
Contributor

Can we use a Span here and read the fields individually, so that we don't need the unsafe block wrapping the memcpy?
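
A minimal sketch of what span-based field reads could look like, using BinaryPrimitives from System.Buffers.Binary. The header layout here (a 4-byte magic followed by two little-endian 16-bit version fields) and the names `HeaderSketch` and `HeaderReaderSketch` are assumptions made for the example, not the PR's actual Webcil header type.

```csharp
using System;
using System.Buffers.Binary;

// Illustrative sketch only. The layout (4-byte magic, then two little-endian
// u16 version fields) is an assumption for the example, not the PR's header type.
internal readonly record struct HeaderSketch(uint Magic, ushort VersionMajor, ushort VersionMinor);

internal static class HeaderReaderSketch
{
    // Reads each field individually from a bounds-checked slice; no unsafe code
    // or pinning is needed, and a short buffer throws instead of overreading.
    public static HeaderSketch Read(ReadOnlySpan<byte> image, int offset)
    {
        ReadOnlySpan<byte> h = image.Slice(offset, 8);
        return new HeaderSketch(
            BinaryPrimitives.ReadUInt32LittleEndian(h),
            BinaryPrimitives.ReadUInt16LittleEndian(h.Slice(4)),
            BinaryPrimitives.ReadUInt16LittleEndian(h.Slice(6)));
    }
}
```

Explicit little-endian reads also make the parser independent of the host's byte order, which a raw struct copy is not.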

webcilOffset = 0;

// Simple scan: look for the Webcil magic in the WASM module
// The Webcil payload is embedded as a custom section in the WASM module
Member

Are we searching the whole file here? I think that if we are looking only at the data section, the payload should be at fixed offset 0. Maybe that's fine for a helper tool?

/// Decodes WASM binary instructions into WAT (WebAssembly Text) format.
/// Based on the WebAssembly specification: https://webassembly.github.io/spec/core/
/// </summary>
internal sealed class WasmDisassembler
Member

It would be good to add an e2e integration test and use wat2wasm to test a round-trip.

Does this also parse non-R2R modules?

Labels

arch-wasm (WebAssembly architecture), area-ReadyToRun

5 participants