Initial embedded ELF module loader. #5504

Merged
merged 1 commit into from
Apr 20, 2021


benvanik
Collaborator

@benvanik benvanik commented Apr 18, 2021

Enabled with -iree-llvm-link-embedded and -iree-llvm-target-triple={any}-pc-linux-elf.
This is still considered experimental and not enabled by default. More verification is needed across platforms and we need to make progress on #4717 before more models can run.

~1000x faster than the system loader on Windows (60-100ms -> 50us), with memory usage of 64b + the ELF itself. Most of the time spent on Windows/x64 is in applying page protections.

Though this is a massive performance win, the real benefit is dramatically simplified compiler configuration and cross-platform retargetability: there's no need for the Android NDK/etc, cross-compilation (Linux->Windows->Mac->etc) is trivial, and the ELF files we generate will run anywhere C will run and we can mark memory PROT_EXEC. Users just need lld (ld may work, but eh) built with the appropriate target architectures, and we only need to embed one binary per architecture (vs one per architecture crossed with every platform it may be used on). This is how we get portable AOT execution on the cheap.
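
To illustrate the page-protection step mentioned above, here is a minimal sketch of how a loader might translate the permission bits of an ELF segment into POSIX protection flags. The `SKETCH_PF_*` values come from the ELF specification; the helper name `sketch_segment_prot` is hypothetical and is not taken from this PR, and a Windows build would need a different set of constants.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/mman.h>  // POSIX only: PROT_NONE / PROT_READ / PROT_WRITE / PROT_EXEC

// Segment permission bits from the ELF specification (program header p_flags).
#define SKETCH_PF_X 0x1u
#define SKETCH_PF_W 0x2u
#define SKETCH_PF_R 0x4u

// Hypothetical helper (an assumption, not the PR's code): translates ELF
// segment flags into the protection bits that mprotect() expects.
static int sketch_segment_prot(uint32_t p_flags) {
  int prot = PROT_NONE;
  if (p_flags & SKETCH_PF_R) prot |= PROT_READ;
  if (p_flags & SKETCH_PF_W) prot |= PROT_WRITE;
  if (p_flags & SKETCH_PF_X) prot |= PROT_EXEC;
  return prot;
}
```

A code segment flagged `SKETCH_PF_R | SKETCH_PF_X` maps to `PROT_READ | PROT_EXEC`, which is the combination a platform has to permit for the loaded ELF to run.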

Imports are not supported, so loading fails on any executable that ends up using -lm (floorf, etc).
The iree-translate invocation I'm using to produce the executables attempts to work around this in some cases with fmaf:

    -iree-llvm-target-triple=x86_64-pc-linux-elf \
    -iree-codegen-linalg-to-llvm-use-unfused-fma \
    -iree-llvm-link-embedded=true \
    -iree-llvm-debug-symbols=false
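
For context on what an "import" looks like at the ELF level: a symbol the linker left unresolved, such as `floorf` from libm, typically appears in the symbol table with section index `SHN_UNDEF`. The sketch below counts such entries. The struct layout follows the ELF specification for 64-bit symbol entries; the names `sketch_elf64_sym_t` and `sketch_count_imports` are hypothetical, and this is not how the PR itself detects imports.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Layout of a 64-bit symbol table entry per the ELF specification.
typedef struct {
  uint32_t st_name;   // Offset of the name in the string table.
  uint8_t st_info;    // Binding and type.
  uint8_t st_other;   // Visibility.
  uint16_t st_shndx;  // Index of the defining section.
  uint64_t st_value;
  uint64_t st_size;
} sketch_elf64_sym_t;

// Section index marking a symbol that is not defined in this file.
#define SKETCH_SHN_UNDEF 0u

// Hypothetical helper (an assumption, not the PR's code): counts symbols
// that would have to be resolved against another library at load time.
static size_t sketch_count_imports(const sketch_elf64_sym_t* syms,
                                   size_t count) {
  size_t imports = 0;
  // Entry 0 is the reserved null symbol, so scanning starts at 1.
  for (size_t i = 1; i < count; ++i) {
    if (syms[i].st_shndx == SKETCH_SHN_UNDEF) ++imports;
  }
  return imports;
}
```

A loader without import support could reject any module where this count is nonzero rather than failing later at call time.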

Tentative support is in place for bare-metal targets via the new IREE_PLATFORM_GENERIC selector; it's untested but likely to be just a short iteration from working. This path requires no syscalls (file IO, dlopen, mmap, etc) and just C11's aligned_alloc/free.
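
As a rough sketch of what "just C11's aligned_alloc/free" can look like, the helpers below reserve zeroed, aligned memory for a loaded image without any syscalls made directly by the caller. The names `sketch_image_alloc` and `sketch_image_free` are hypothetical and the PR's generic platform code may differ; the sketch also assumes a target where heap memory is executable, which is something a bare-metal port would have to confirm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical helper (an assumption, not the PR's code): allocates zeroed
// memory for a loaded image. `alignment` must be a power of two.
static void* sketch_image_alloc(size_t size, size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  if (size == 0 || size > SIZE_MAX - (alignment - 1)) return NULL;
  // C11 aligned_alloc expects the size to be a multiple of the alignment,
  // so the request is rounded up before allocating.
  size_t padded = (size + alignment - 1) & ~(alignment - 1);
  void* memory = aligned_alloc(alignment, padded);
  if (memory) memset(memory, 0, padded);
  return memory;
}

// Releases memory obtained from sketch_image_alloc.
static void sketch_image_free(void* memory) { free(memory); }
```

Rounding the size up keeps the call valid under the stricter C11 wording of `aligned_alloc`, and zeroing the block gives uninitialized data (such as .bss) a defined starting state.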

Fixes #5351.

@benvanik added labels Apr 18, 2021: compiler/dialects (Relating to the IREE compiler dialects: flow, hal, vm); performance ⚡ (Performance/optimization related work across the compiler and runtime); hal/cpu (Runtime Host/CPU-based HAL backend)
@google-cla google-cla bot added the cla: yes label Apr 18, 2021
@benvanik added label Apr 18, 2021: platform/generic 🔩 (Bare metal/generic target build, execution, benchmarking, and deployment)
@benvanik benvanik force-pushed the benvanik-elf branch 2 times, most recently from fbf71b9 to 136e655 Compare April 18, 2021 19:31
@benvanik benvanik marked this pull request as ready for review April 18, 2021 22:51
@benvanik benvanik requested a review from ScottTodd April 18, 2021 22:51
Collaborator

@stellaraccident stellaraccident left a comment

This is really awesome (and a good walk down memory lane for me). Thanks for the side chat with clarifications on specifics.

iree/hal/local/elf/arch/riscv.c (outdated, resolved)
iree/hal/local/elf/arch/x86_64.c (resolved)
iree/hal/local/elf/arch/x86_64.c (outdated, resolved)
iree/hal/local/elf/arch/x86_64.c (outdated, resolved)
iree/hal/local/elf/arch/x86_64_msvc.asm (resolved)
iree/hal/local/elf/elf_module.c (resolved)
iree/hal/local/elf/platform/generic.c (outdated, resolved)
iree/hal/local/elf/platform/linux.c (outdated, resolved)
Member

@ScottTodd ScottTodd left a comment

I just scanned through, did not review all files in detail. Looks good :D

typedef uint64_t iree_elf64_addr_t;
typedef uint16_t iree_elf64_half_t;
typedef uint64_t iree_elf64_off_t;
typedef int32_t iree_elf64_sword_t;
Member

⚔️ sword :O
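
For readers unfamiliar with the naming: the typedefs quoted above mirror the ELF specification's `Elf64_Addr`, `Elf64_Half`, `Elf64_Off`, and `Elf64_Sword` (a signed 32-bit word). Before a loader can trust any of those 64-bit fields it usually checks the identification bytes at the start of the file. A minimal sketch of that check follows; the constant values come from the ELF specification, while the name `sketch_is_elf64_le` is hypothetical and not taken from this PR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Indices into, and values of, the e_ident bytes per the ELF specification.
#define SKETCH_EI_CLASS 4u     // Byte holding the file class.
#define SKETCH_EI_DATA 5u      // Byte holding the data encoding.
#define SKETCH_EI_NIDENT 16u   // Total size of e_ident.
#define SKETCH_ELFCLASS64 2u   // 64-bit objects.
#define SKETCH_ELFDATA2LSB 1u  // Little-endian encoding.

// Hypothetical helper (an assumption, not the PR's code): returns true if
// `data` starts with the identification bytes of a little-endian ELF64 file.
static bool sketch_is_elf64_le(const uint8_t* data, size_t length) {
  if (!data || length < SKETCH_EI_NIDENT) return false;
  if (data[0] != 0x7F || data[1] != 'E' || data[2] != 'L' || data[3] != 'F') {
    return false;
  }
  return data[SKETCH_EI_CLASS] == SKETCH_ELFCLASS64 &&
         data[SKETCH_EI_DATA] == SKETCH_ELFDATA2LSB;
}
```

Only after a check like this passes does it make sense to read the rest of the header through `iree_elf64_*`-style types.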

Development

Successfully merging this pull request may close these issues.

Implement simple cross-platform/arch ELF loader for executables