GC (Garbage Collection) AOT support #2144

Open
wenyongh opened this issue Apr 23, 2023 · 2 comments
Labels
new feature New feature request

Comments

@wenyongh
Contributor

wenyongh commented Apr 23, 2023

GC (Garbage Collection) AOT support

Motivation

We recently refactored the WAMR GC interpreter to support most features of the latest GC MVP spec proposal. It also supports the bytecode generated by the binaryen toolchain, whose bytecode definition differs slightly from the GC MVP definition, so that it can run workloads compiled by the ts2wasm compiler, which uses binaryen as its backend. To improve performance and reduce the footprint, we have started to enable GC AOT support.

Design Goals

  • Support the linux, macos, windows, android, linux-sgx and nuttx platforms, among others
  • Support the x86-64, x86-32, aarch64, arm32, thumb, riscv64 and riscv32 targets
  • Support the same functionality as the interpreter
  • Support the bytecode definition of both GC MVP and GC binaryen
  • Pass the spec cases and the cases compiled by ts2wasm compiler
  • Support GC multi-threading
  • Export necessary GC APIs
  • Standardize the interface between runtime and GC allocator

Non-goals for now

  • Improve the performance of the GC allocator and the GC reclaim process, and resolve the fragmentation issue of the GC heap
    The current ems heap allocator may have some shortcomings, and we hope to replace it with another GC heap allocator in the future. The interfaces between the runtime and the GC allocator will be standardized so that a new GC allocator can be integrated into the runtime easily.

Changes of the AOT file format

There will be many changes in the AOT file format. After discussion, the following decisions were made:

  • The AOT file version will be upgraded from 1 to 2
  • The new version won’t be compatible with the old version, since maintaining compatibility would be complex and difficult. In other words, an AOT file generated by the old wamrc can’t be run by the new AOT runtime.
  • Remove the existing compatibility-handling code from the AOT runtime, so as to make the code clear and concise
  • Consider the support for multiple memories, exception handling and component model
  • Strings in the AOT file are always terminated by ‘\0’, so runtime can directly use them and doesn’t need to create a copy

Some detailed changes

  • Enable feature flags, where each bit denotes whether a feature is supported or not
  • Extend the structures that have a value type field: add an extra heap type field if the value type is a reference type that requires heap type info
  • Extend the function type to function/struct/array/sub/sub_final/recursive types, and add a flag to specify which defined type it is
  • Extend the initializer expression to support more const expressions (like struct.new_xxx, array.new_xxx) and support a piece of bytecode to do the initialization
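
To illustrate the feature-flags item above, here is a minimal sketch of how a loader might check a per-file bitmask against what the runtime was built with. The `EX_`-prefixed names, the bit positions and the 64-bit width are assumptions made for illustration; they are not taken from the actual AOT format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only; the real AOT
 * format may use different names, positions and width. */
#define EX_FEATURE_GC            (UINT64_C(1) << 0)
#define EX_FEATURE_MULTI_MEMORY  (UINT64_C(1) << 1)
#define EX_FEATURE_EXCE_HANDLING (UINT64_C(1) << 2)

/* Test a single feature bit in a flags word. */
static bool
ex_has_feature(uint64_t flags, uint64_t feature)
{
    /* The caller must pass exactly one bit. */
    assert(feature != 0 && (feature & (feature - 1)) == 0);
    return (flags & feature) != 0;
}

/* A file is loadable only if every feature it requires is also
 * enabled in the runtime, i.e. no required bit is left over. */
static bool
ex_features_compatible(uint64_t file_flags, uint64_t runtime_flags)
{
    return (file_flags & ~runtime_flags) == 0;
}
```

With a check like this, the loader can reject a file that needs a feature the runtime lacks before it parses any feature-specific section.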

Changes of the AOT compiler

  • Add --enable-gc option for wamrc
  • Support building wamrc for either the GC MVP version or the GC binaryen version. The GC MVP version is built by default; the GC binaryen version is built when cmake -DWAMR_BUILD_GC_BINARYEN=1 is used.

Changes of AOT module instance layout

  • The value of an element in the table instance is changed from a function index of uint32 type to a function object pointer of WASMFunctionObjRef type. This impacts both the instantiation of the table instance and how it is accessed.
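
The effect on table access can be sketched as follows. `ExFuncObj` and `ExTableInst` are hypothetical stand-ins, not WAMR's real `WASMFunctionObjRef` or table instance definitions; the point is only that a lookup now dereferences an object pointer and must handle a null element.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the runtime's function object. */
typedef struct ExFuncObj {
    uint32_t func_idx;
} ExFuncObj;

/* Hypothetical table instance; before the change the element array
 * would have been `uint32_t *elems` holding function indexes. */
typedef struct ExTableInst {
    uint32_t cur_size;
    ExFuncObj **elems;
} ExTableInst;

/* Resolve a table element to a function index.
 * Returns 0 on success, -1 if elem_idx is out of bounds,
 * -2 if the element is null (uninitialized). */
static int
ex_table_get_func_idx(const ExTableInst *table, uint32_t elem_idx,
                      uint32_t *out_func_idx)
{
    assert(table != NULL && out_func_idx != NULL);
    if (elem_idx >= table->cur_size)
        return -1;
    if (table->elems[elem_idx] == NULL)
        return -2;
    *out_func_idx = table->elems[elem_idx]->func_idx;
    return 0;
}
```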

AOT stack frame process

Currently, AOT code doesn’t operate on the stack frame except for the call stack dump, memory profiling and performance profiling features, because the LLVM codegen optimizes operations on function locals and stack operands into operations on registers and the native stack. Since the GC needs to add live GC objects to the rootset during garbage collection, we need to enable stack frame operations for GC AOT:

  • Create a frame before calling a function, and release it after returning from the function
  • Commit operands from registers into the stack frame when needed
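
The two steps above can be sketched as a linked chain of frames that a collector walks to find roots. All types and function names here are hypothetical and much simpler than a real AOT frame; the per-slot `is_ref` marker is an assumed way of telling references apart from plain values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_MAX_SLOTS 8

/* Hypothetical frame: one slot per local or stack operand, plus a
 * marker telling the collector which slots hold GC references. */
typedef struct ExFrame {
    struct ExFrame *prev;
    uint32_t slot_count;
    uintptr_t slots[EX_MAX_SLOTS];
    uint8_t is_ref[EX_MAX_SLOTS];
} ExFrame;

typedef struct ExExecEnv {
    ExFrame *cur_frame;
} ExExecEnv;

/* Create a frame before calling a function. */
static void
ex_push_frame(ExExecEnv *env, ExFrame *frame, uint32_t slot_count)
{
    uint32_t i;
    assert(slot_count <= EX_MAX_SLOTS);
    frame->prev = env->cur_frame;
    frame->slot_count = slot_count;
    for (i = 0; i < slot_count; i++) {
        frame->slots[i] = 0;
        frame->is_ref[i] = 0;
    }
    env->cur_frame = frame;
}

/* Release the frame after the function returns. */
static void
ex_pop_frame(ExExecEnv *env)
{
    assert(env->cur_frame != NULL);
    env->cur_frame = env->cur_frame->prev;
}

/* Commit a reference that lives in a register into the frame, so
 * that it is visible to the collector. */
static void
ex_commit_ref(ExFrame *frame, uint32_t slot, void *obj)
{
    assert(slot < frame->slot_count);
    frame->slots[slot] = (uintptr_t)obj;
    frame->is_ref[slot] = 1;
}

/* Walk the frame chain and count the non-null references, i.e. the
 * objects a collector would add to the rootset. */
static uint32_t
ex_count_roots(const ExExecEnv *env)
{
    uint32_t count = 0, i;
    const ExFrame *f;
    for (f = env->cur_frame; f != NULL; f = f->prev)
        for (i = 0; i < f->slot_count; i++)
            if (f->is_ref[i] && f->slots[i] != 0)
                count++;
    return count;
}
```

Committing only "when needed" matters for performance: values can stay in registers between points where a collection may happen, and are written back just before such points.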

Export GC runtime APIs

See PR #2143, the runtime APIs are exported in core/iwasm/include/gc_export.h

GC multi-threading support

We need to enhance the thread suspend/resume mechanism to support the GC reclaim process in multi-threaded programs: a thread can request to suspend the world when needed, that is, suspend all other threads and wait until they are suspended, and then resume them after its job is finished. This mechanism is also helpful for synchronizing linear memory info when the memory.grow opcode is executed, see the discussion in #2078. Another scenario may be the source debugger for multi-threading: when a thread hits a breakpoint, it may ask the other threads to suspend, and resume them when it continues to run.
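
One possible shape of such a mechanism is a cooperative safepoint protocol built on a mutex and a condition variable. This is a sketch under stated assumptions, not WAMR's implementation: the `ExWorld` type and all function names are hypothetical, only one requester at a time is handled, and every other thread is assumed to call `ex_safepoint` regularly.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical shared state for a cooperative stop-the-world. */
typedef struct ExWorld {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    int thread_count;       /* registered threads, requester included */
    int parked_count;       /* threads currently parked at a safepoint */
    bool suspend_requested;
} ExWorld;

static void
ex_world_init(ExWorld *w, int thread_count)
{
    pthread_mutex_init(&w->lock, NULL);
    pthread_cond_init(&w->cond, NULL);
    w->thread_count = thread_count;
    w->parked_count = 0;
    w->suspend_requested = false;
}

/* Called by the thread that needs exclusive access, e.g. to run a
 * GC cycle. Returns once all other threads are parked. */
static void
ex_suspend_world(ExWorld *w)
{
    pthread_mutex_lock(&w->lock);
    assert(!w->suspend_requested); /* sketch: one requester at a time */
    w->suspend_requested = true;
    while (w->parked_count < w->thread_count - 1)
        pthread_cond_wait(&w->cond, &w->lock);
    pthread_mutex_unlock(&w->lock);
}

/* Called by the requester when its job is finished. */
static void
ex_resume_world(ExWorld *w)
{
    pthread_mutex_lock(&w->lock);
    w->suspend_requested = false;
    pthread_cond_broadcast(&w->cond);
    pthread_mutex_unlock(&w->lock);
}

/* Polled by every other thread at points where it is safe to stop. */
static void
ex_safepoint(ExWorld *w)
{
    pthread_mutex_lock(&w->lock);
    if (w->suspend_requested) {
        w->parked_count++;
        pthread_cond_broadcast(&w->cond); /* wake the requester */
        while (w->suspend_requested)
            pthread_cond_wait(&w->cond, &w->lock);
        w->parked_count--;
    }
    pthread_mutex_unlock(&w->lock);
}
```

A real runtime would also have to deal with threads blocked in native calls and with two threads requesting suspension at the same time; both are left out of this sketch.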

Standardize the interface between runtime and GC heap allocator

Better define the interface between runtime and GC heap allocator so that we can easily integrate a new GC heap allocator into runtime.
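
A standardized interface of this kind is often expressed as a table of function pointers. The sketch below is an assumption about what it could look like, not the interface the project defined: `ExGCAllocator` and its members are hypothetical, and the bump allocator exists only to show a second allocator plugging into the same table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical allocator interface: the runtime only sees this table. */
typedef struct ExGCAllocator {
    void *(*alloc)(void *heap, size_t size);
    void (*reclaim)(void *heap);     /* run a collection cycle */
    size_t (*free_size)(void *heap); /* bytes still available */
    void *heap;                      /* allocator-private state */
} ExGCAllocator;

/* Toy bump allocator used to demonstrate plugging into the table.
 * `base` is assumed to be 8-byte aligned. */
typedef struct ExBumpHeap {
    uint8_t *base;
    size_t size;
    size_t used;
} ExBumpHeap;

static void *
ex_bump_alloc(void *heap, size_t size)
{
    ExBumpHeap *h = heap;
    size_t aligned = (size + 7u) & ~(size_t)7u; /* round up to 8 */
    void *p;
    if (aligned < size || aligned > h->size - h->used)
        return NULL; /* overflow or out of space */
    p = h->base + h->used;
    h->used += aligned;
    return p;
}

static void
ex_bump_reclaim(void *heap)
{
    (void)heap; /* a bump allocator never frees; no-op in this sketch */
}

static size_t
ex_bump_free_size(void *heap)
{
    const ExBumpHeap *h = heap;
    return h->size - h->used;
}

/* Runtime-side helper: try to allocate, and on failure reclaim once
 * and retry. It depends only on the interface, not on the allocator. */
static void *
ex_runtime_alloc_obj(const ExGCAllocator *a, size_t size)
{
    void *p;
    assert(a != NULL && a->alloc != NULL && a->reclaim != NULL);
    p = a->alloc(a->heap, size);
    if (p == NULL) {
        a->reclaim(a->heap);
        p = a->alloc(a->heap, size);
    }
    return p;
}
```

Because `ex_runtime_alloc_obj` never touches `ExBumpHeap` directly, swapping in a different allocator means filling in another `ExGCAllocator` table and nothing else.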

References

@wenyongh wenyongh added the new feature New feature request label Apr 23, 2023
@no1wudi
Collaborator

no1wudi commented May 4, 2023

@TianlongLiang
Contributor

TianlongLiang commented Jul 18, 2023

Use this issue to track and update the AOT GC opcodes compilation development progress.

SubTasks

  • Modify existing opcodes and implement simple new opcodes (non-struct, non-array, non-branching)
  • Implement new struct and array related opcodes
  • Implement new branching opcodes

Progress

Most of subtask 1, part 1 (modifying existing opcodes) is finished, including:

  1. WASM_OP_CALL_INDIRECT
  2. WASM_OP_SELECT_T
  3. WASM_OP_TABLE_GET
  4. WASM_OP_TABLE_SET
  5. WASM_OP_REF_NULL
  6. WASM_OP_REF_IS_NULL
  7. WASM_OP_REF_FUNC
  8. WASM_OP_TABLE_INIT
  9. WASM_OP_TABLE_GROW
  10. WASM_OP_TABLE_FILL
  11. WASM_OP_GET_LOCAL
  12. WASM_OP_SET_LOCAL
  13. WASM_OP_TEE_LOCAL

PR #2376 merged

Subtask 1, part 2 (implementing simple new opcodes), along with subtask 3 (branching opcodes):

  1. WASM_OP_REF_EQ
  2. WASM_OP_CALL_REF
  3. WASM_OP_RETURN_CALL_REF
  4. WASM_OP_REF_AS_NON_NULL
  5. WASM_OP_BR_ON_NULL
  6. WASM_OP_BR_ON_NON_NULL
  7. WASM_OP_I31_NEW
  8. WASM_OP_I31_GET_S
  9. WASM_OP_I31_GET_U
  10. WASM_OP_REF_TEST
  11. WASM_OP_REF_CAST
  12. WASM_OP_REF_TEST_NULLABLE
  13. WASM_OP_REF_CAST_NULLABLE
  14. WASM_OP_BR_ON_CAST
  15. WASM_OP_BR_ON_CAST_FAIL
  16. WASM_OP_BR_ON_CAST_NULLABLE
  17. WASM_OP_BR_ON_CAST_FAIL_NULLABLE
  18. WASM_OP_EXTERN_INTERNALIZE
  19. WASM_OP_EXTERN_EXTERNALIZE

PR #2486 merged

Subtask 2, array-related and struct-related opcodes:

  1. WASM_OP_STRUCT_NEW_CANON
  2. WASM_OP_STRUCT_NEW_CANON_DEFAULT
  3. WASM_OP_STRUCT_GET
  4. WASM_OP_STRUCT_GET_S
  5. WASM_OP_STRUCT_GET_U
  6. WASM_OP_STRUCT_SET
  7. WASM_OP_ARRAY_NEW_CANON
  8. WASM_OP_ARRAY_NEW_CANON_DEFAULT
  9. WASM_OP_ARRAY_NEW_CANON_FIXED
  10. WASM_OP_ARRAY_GET
  11. WASM_OP_ARRAY_GET_S
  12. WASM_OP_ARRAY_GET_U
  13. WASM_OP_ARRAY_SET
  14. WASM_OP_ARRAY_LEN
  15. WASM_OP_ARRAY_NEW_CANON_DATA
  16. WASM_OP_ARRAY_COPY (to be compatible with binaryen GC)

PR #2487 merged

wenyongh added a commit that referenced this issue Aug 28, 2023
Implement a full LLVM AOT/JIT stack frame dump:
commit the function arguments, locals, stack operands from LLVM values to the stack frame,
which is required by the GC AOT/JIT feature, and may be required by the AOT debugger,
AOT snapshot and other features.

Refer to:
#2144
#2333
#2506