Description
Object Writer
To generate usable Wasm from RyuJIT we need an object writer. The object writer has multiple responsibilities, not limited to:
- Record the Wasm code associated with methods into function bodies in the output Wasm object
 - Record function signatures for each function body for the type section
 - Build the imports/exports sections
 - Build the function pointer table (the linker generates this automatically based on relocations)
 - Build the data segment(s) containing constants, indirection cells, etc
 - Build the names section containing function name information for diagnostics
 - Embed managed modules into a wasm wrapper, aka WebCIL ([tracking] Publish .NET assemblies in Webcil files #80807)
 - (Optional for browser, mandatory for WASI) Record various types of relocations and encode them into the output following https://github.com/WebAssembly/tool-conventions/blob/main/Linking.md so we can link all R2R outputs into the runtime
 - Generate and embed DWARF debug information
   - https://github.com/WebAssembly/tool-conventions/blob/main/Debugging.md
   - https://yurydelendik.github.io/webassembly-dwarf/
NativeAOT-LLVM has an existing object writer which can be found here: https://github.com/dotnet/runtimelab/blob/feature/NativeAOT-LLVM/src/coreclr/tools/aot/ILCompiler.LLVM/CodeGen/WasmObjectWriter.cs
Object model
There are two models we can adopt for RyuJIT Wasm, and the object writer should be able to support both:
- Single unified module. In this case, we link all of the Wasm modules generated by the build together (using wasm-ld) into a single module alongside the runtime itself, so there's just a final yourapp.wasm to be loaded by the host. This requires robust relocation information and has a mixture of upsides and downsides.
- Array of modules. In this case, the runtime is loaded by itself as a free-standing module - the 'main' module - and then we load a module for each managed assembly, where 1-N of these modules also contain Wasm code that was generated by RyuJIT. The runtime is responsible for orchestrating this. This model is not compatible with hosts like wasmtime, which expect a single module to AOT-compile and run.
 
Object composition
Function signatures
In Wasm, each module has a type section that enumerates all the different function signatures (function types) used by the module. This section precedes the actual code section, and the function section (along with the import section) refers to function types by index.
We need to generate entries in this section for all of our compiled methods and any method signatures that our compiled code needs to invoke, otherwise the module will be invalid. We want to avoid generating duplicate entries.
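A minimal sketch of how deduplication could work, assuming signatures are interned in a table keyed by their parameter and result types. The `TypeTable` class and its method names are hypothetical and not taken from any existing writer; only the byte values (section id 1, the `0x60` function type marker, the value type codes) come from the Wasm binary format.

```python
# Value type codes from the Wasm binary format.
VALTYPE = {"i32": 0x7F, "i64": 0x7E, "f32": 0x7D, "f64": 0x7C}

def uleb128(value):
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

class TypeTable:
    """Hypothetical helper: interns signatures so each distinct one gets one index."""

    def __init__(self):
        self._index = {}  # (params, results) -> type index
        self._order = []  # signatures in index order

    def intern(self, params, results):
        key = (tuple(params), tuple(results))
        if key not in self._index:
            self._index[key] = len(self._order)
            self._order.append(key)
        return self._index[key]

    def encode_section(self):
        body = bytearray(uleb128(len(self._order)))
        for params, results in self._order:
            body.append(0x60)  # function type marker
            body += uleb128(len(params)) + bytes(VALTYPE[t] for t in params)
            body += uleb128(len(results)) + bytes(VALTYPE[t] for t in results)
        # Section id 1 is the type section, followed by the body size.
        return bytes([0x01]) + uleb128(len(body)) + bytes(body)

types = TypeTable()
a = types.intern(["i32", "i32"], ["i32"])
b = types.intern([], [])
c = types.intern(["i32", "i32"], ["i32"])  # duplicate, reuses the index of `a`
section = types.encode_section()
```

Both compiled methods and call targets would go through the same `intern` call, so a signature that is only ever invoked still ends up in the section exactly once.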
Imports
In Wasm, each module explicitly specifies its external dependencies; in our case most of these would be PAL APIs or helpers. We have to generate a table of these at build time and associate each import with a name and a type (from the function signatures table). This section comes after the function signatures and before functions because imports and declared functions all share the same index space.
We likely also want to import a function pointer table from outside, the one being used by the CoreCLR runtime.
It is probably fine to have unused imports or have a standard set of imports for 'all the PAL APIs and CLR helpers' instead of trying to specifically only import what we use.
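As a rough illustration of the shared index space, the sketch below encodes a fixed set of function imports and derives the index of a locally defined function from the import count. The module name `env` and the helper names are placeholders, not the actual PAL or CLR helper names, and `encode_import_section` is a hypothetical function.

```python
def uleb128(value):
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def name(text):
    """Encode a Wasm name: length-prefixed UTF-8 bytes."""
    data = text.encode("utf-8")
    return uleb128(len(data)) + data

def encode_import_section(func_imports):
    """func_imports: list of (module, field, type_index) for function imports."""
    body = bytearray(uleb128(len(func_imports)))
    for module, field, type_index in func_imports:
        body += name(module) + name(field)
        body.append(0x00)  # import kind 0x00 = function
        body += uleb128(type_index)
    # Section id 2 is the import section, followed by the body size.
    return bytes([0x02]) + uleb128(len(body)) + bytes(body)

# Placeholder import set; real names would come from the PAL / helper list.
imports = [("env", "helper_a", 0), ("env", "helper_b", 1)]
import_section = encode_import_section(imports)

def defined_function_index(local_ordinal):
    """Imported functions take indices 0..n-1; defined functions start at n."""
    return len(imports) + local_ordinal
```

With a standard import set, the offset added in `defined_function_index` is a constant for every module, which keeps function index assignment simple.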
Functions
The function section is a sequence of function type indices (pointing into the type section) where each one corresponds with an actual function body in the code section, coming later in the module.
Table section
We want to either declare a function pointer table or import one from outside so that it can be used for call_indirect.
Memories
We want to declare one linear memory with no maximum size and a reasonable minimum size. The linker will merge it with the memories of other modules.
Globals
We want to declare some common globals like the stack top/bottom, etc, matching what emscripten clang generates so that we can link correctly with the CoreCLR runtime module.
Exports
If we plan to export global variables or functions by name we'll need to generate an exports section which maps function indices (from the functions section + imports section) and global indices to names.
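A small sketch of what the exports section encoding could look like. The export names and indices are made up for illustration and `encode_export_section` is a hypothetical helper; the kind codes and section id come from the Wasm binary format.

```python
def uleb128(value):
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

# Export kind codes from the Wasm binary format.
EXPORT_KIND = {"func": 0x00, "table": 0x01, "memory": 0x02, "global": 0x03}

def encode_export_section(exports):
    """exports: list of (name, kind, index) where index is in that kind's index space."""
    body = bytearray(uleb128(len(exports)))
    for export_name, kind, index in exports:
        data = export_name.encode("utf-8")
        body += uleb128(len(data)) + data
        body.append(EXPORT_KIND[kind])
        body += uleb128(index)
    # Section id 7 is the export section, followed by the body size.
    return bytes([0x07]) + uleb128(len(body)) + bytes(body)

# Function index 2 here would be the first defined function after two imports.
export_section = encode_export_section([("entry", "func", 2), ("g", "global", 0)])
```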
Element section
Any function pointer data we want to load into the function pointer table needs to be defined in the element section as a vector of function indices.
If we're using the linker, it will automatically synthesize the contents of the table for us so we won't need to generate any elements in that scenario, just make sure that we have appropriate relocations for any function that has its address taken.
Code section
Actual method bodies live in the code section, where there is one function body for every entry in the function section above (which is where the signature of the function was specified).
Each function body specifies its locals in groups where all the locals in a given group have the same type, for example 10 i32s, 5 f32s, 3 f64s, and the locals are numbered sequentially.
After the locals the actual code follows.
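The local grouping can be sketched as a run-length encoding over the list of local types in index order. `encode_locals` is a hypothetical helper; it assumes the code generator has already placed same-typed locals next to each other, since only consecutive runs can share a group.

```python
from itertools import groupby

# Value type codes from the Wasm binary format.
VALTYPE = {"i32": 0x7F, "i64": 0x7E, "f32": 0x7D, "f64": 0x7C}

def uleb128(value):
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def encode_locals(local_types):
    """Collapse consecutive locals of the same type into (count, type) groups."""
    groups = [(len(list(run)), t) for t, run in groupby(local_types)]
    out = bytearray(uleb128(len(groups)))
    for count, t in groups:
        out += uleb128(count)
        out.append(VALTYPE[t])
    return bytes(out)

# The example from the text: 10 i32s, 5 f32s, 3 f64s.
encoded = encode_locals(["i32"] * 10 + ["f32"] * 5 + ["f64"] * 3)
```

If the locals were interleaved by type instead, the same function would still produce a valid encoding, just with more groups.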
We need to track any relocatable values that end up in code, like function pointers or memory addresses, so we can emit relocs for them.
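One way to track relocatable values in code, sketched under the assumption that we follow the Linking.md convention of writing relocatable LEB operands padded to 5 bytes so the linker can patch them in place. `CodeEmitter` and its fields are hypothetical; the relocation type value is the one Linking.md assigns to `R_WASM_FUNCTION_INDEX_LEB`. Offsets here are relative to this emitter's buffer, and a real writer would need to rebase them onto the section contents.

```python
R_WASM_FUNCTION_INDEX_LEB = 0  # relocation type value from Linking.md

def padded_uleb128(value, width=5):
    """Unsigned LEB128 padded to a fixed width so it can be patched in place."""
    out = bytearray()
    for i in range(width):
        byte = value & 0x7F
        value >>= 7
        if i < width - 1:
            byte |= 0x80  # continuation bit on every byte except the last
        out.append(byte)
    if value:
        raise ValueError("value does not fit in the padded width")
    return bytes(out)

class CodeEmitter:
    """Hypothetical emitter that records a relocation for each direct call."""

    def __init__(self):
        self.code = bytearray()
        self.relocs = []  # (reloc_type, offset_in_code, symbol_index)

    def emit_call(self, symbol_index, provisional_func_index=0):
        self.code.append(0x10)  # `call` opcode
        # The relocation points at the first byte of the padded operand.
        self.relocs.append((R_WASM_FUNCTION_INDEX_LEB, len(self.code), symbol_index))
        self.code += padded_uleb128(provisional_func_index)

emitter = CodeEmitter()
emitter.emit_call(symbol_index=7, provisional_func_index=3)
```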
Data section
String literals and other compile time information, along with reserved locations for things like indirection cells, are all defined by 'data segments' in the data section. Each segment specifies an offset and a size along with a vector of bytes to fill that segment at load time.
We need to track any relocatable values that end up in data segments, like function pointers or memory addresses, so we can emit relocs for them.
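For reference, a sketch of encoding a single active data segment for memory 0, where the offset is an `i32.const` initializer expression. The function names, the offset and the payload are illustrative assumptions; relocation tracking for pointers stored inside the payload is left out of this sketch.

```python
def uleb128(value):
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def sleb128(value):
    """Encode an integer as signed LEB128 (used by i32.const immediates)."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        sign_bit = bool(byte & 0x40)
        done = (value == 0 and not sign_bit) or (value == -1 and sign_bit)
        out.append(byte if done else byte | 0x80)
        if done:
            return bytes(out)

def encode_data_segment(offset, payload):
    """Active segment in memory 0: flag, offset expression, then the bytes."""
    expr = bytes([0x41]) + sleb128(offset) + bytes([0x0B])  # i32.const offset; end
    return bytes([0x00]) + expr + uleb128(len(payload)) + payload

def encode_data_section(segments):
    """segments: list of (offset, payload_bytes)."""
    body = bytearray(uleb128(len(segments)))
    for offset, payload in segments:
        body += encode_data_segment(offset, payload)
    # Section id 11 is the data section, followed by the body size.
    return bytes([0x0B]) + uleb128(len(body)) + bytes(body)

data_section = encode_data_section([(1024, b"hello")])
```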
Names (custom) section
Contains names for each function in the module, used for debugging and profiling.
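A sketch of the function names subsection, assuming we only emit function names (subsection id 1) and skip the module and local name subsections. The example names are placeholders and `encode_name_section` is a hypothetical helper.

```python
def uleb128(value):
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def name(text):
    """Encode a Wasm name: length-prefixed UTF-8 bytes."""
    data = text.encode("utf-8")
    return uleb128(len(data)) + data

def encode_name_section(function_names):
    """function_names: dict mapping function index -> display name."""
    entries = bytearray(uleb128(len(function_names)))
    for index in sorted(function_names):  # entries are written in index order
        entries += uleb128(index) + name(function_names[index])
    # Subsection id 1 holds function names, followed by its size.
    subsection = bytes([0x01]) + uleb128(len(entries)) + bytes(entries)
    payload = name("name") + subsection
    # Section id 0 marks a custom section; its payload starts with the section name.
    return bytes([0x00]) + uleb128(len(payload)) + payload

name_section = encode_name_section({1: "bar", 0: "foo"})
```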
Target features (custom) section
Specifies which Wasm target features we're using. Various tooling, including wasm-ld, wants to see this in order to process our code correctly. We'll want to make sure we don't use non-MVP features without specifying them here.
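A sketch of the `target_features` custom section as described in Linking.md, assuming we only emit the `+` (used) prefix. The feature list below is an example, not a statement of which features RyuJIT would actually rely on, and `encode_target_features` is a hypothetical helper.

```python
def uleb128(value):
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def name(text):
    """Encode a Wasm name: length-prefixed UTF-8 bytes."""
    data = text.encode("utf-8")
    return uleb128(len(data)) + data

def encode_target_features(used_features):
    """Emit one '+' entry per feature the object file uses."""
    payload = bytearray(name("target_features"))
    payload += uleb128(len(used_features))
    for feature in used_features:
        payload.append(ord("+"))  # 0x2B: this object uses the feature
        payload += name(feature)
    # Section id 0 marks a custom section, followed by the payload size.
    return bytes([0x00]) + uleb128(len(payload)) + bytes(payload)

# Example feature names only; the real list depends on what codegen emits.
features_section = encode_target_features(["mutable-globals", "sign-ext"])
```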
(incomplete - work in progress)