Initial embedded ELF module loader. #5504
This is really awesome (and a good walk down memory lane for me). Thanks for the side chat with clarifications on specifics.
I just scanned through, did not review all files in detail. Looks good :D
typedef uint64_t iree_elf64_addr_t;
typedef uint16_t iree_elf64_half_t;
typedef uint64_t iree_elf64_off_t;
typedef int32_t iree_elf64_sword_t;
⚔️ sword
:O
Enabled with `-iree-llvm-link-embedded` and `-iree-llvm-target-triple={any}-pc-linux-elf`. ~1000x faster than the system loader on Windows (60-100ms -> 50us) and 64b + ELF memory usage. Imports are not supported so it fails on any executable that ends up using -lm (floorf, etc).
Enabled with `-iree-llvm-link-embedded` and `-iree-llvm-target-triple={any}-pc-linux-elf`.
This is still considered experimental and not enabled by default. More verification is needed across platforms and we need to make progress on #4717 before more models can run.
~1000x faster than the system loader on Windows (60-100ms -> 50us) and 64b + ELF memory usage. Most of the time spent on Windows/x64 is in applying page protections.
Though this is a massive performance win, the real benefit is dramatically simplified compiler configuration and cross-platform retargetability: there's no need for the Android NDK/etc, cross-compilation (Linux->Windows->Mac->etc) is trivial, and the ELF files we generate will run anywhere C will run and we can PROT_EXEC. Users just need lld (ld may work, but eh) with the appropriate target architectures built into it, and we will only need to embed one binary per architecture (vs one per architecture crossed with all platforms it may be used on). This is how we get portable AOT execution on the cheap.
Imports are not supported, so loading fails for any executable that ends up using -lm (floorf, etc).
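One way to see why such executables fail: an import shows up in the ELF symbol table as a symbol with no defining section. The sketch below counts those entries. It assumes the standard ELF64 symbol layout and `SHN_UNDEF == 0` from the ELF specification; the `sketch_*` type and function names are hypothetical and not taken from this PR.

```c
#include <stddef.h>
#include <stdint.h>

// Standard ELF64 symbol table entry layout (24 bytes).
typedef struct {
  uint32_t st_name;   // Offset of the symbol name in the string table.
  uint8_t st_info;    // Symbol type and binding.
  uint8_t st_other;   // Symbol visibility.
  uint16_t st_shndx;  // Index of the section that defines the symbol.
  uint64_t st_value;  // Symbol value (address).
  uint64_t st_size;   // Size of the associated object.
} sketch_elf64_sym_t;

// Section index meaning "not defined in this file", i.e. an import.
#define SKETCH_SHN_UNDEF 0

// Counts symbols that must be resolved from outside the module.
// Entry 0 is the reserved null symbol and is always skipped.
static size_t sketch_count_imports(const sketch_elf64_sym_t* syms,
                                   size_t count) {
  size_t imports = 0;
  for (size_t i = 1; i < count; ++i) {
    if (syms[i].st_shndx == SKETCH_SHN_UNDEF) ++imports;
  }
  return imports;
}
```

A loader without import support could reject any module for which this count is nonzero, which is the situation a reference to `floorf` would produce.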
The `iree-translate` invocation I'm using to produce the executables attempts to work around this in some cases with `fmaf`.
Tentative support is in place for bare-metal targets via the new `IREE_PLATFORM_GENERIC` selector; it's untested but likely to be just a short iteration from working. This path requires no syscalls (file IO, dlopen, mmap, etc) and just C11's aligned_alloc/free.
Fixes #5351.
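As a rough illustration of the allocation side of the bare-metal path described above, here is a minimal sketch that uses only C11's `aligned_alloc`/`free`. It is not the code behind `IREE_PLATFORM_GENERIC`; the `sketch_*` helpers are hypothetical. C11 requires the requested size to be a multiple of the alignment, so the size is rounded up first.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Rounds |value| up to the next multiple of |alignment|.
// |alignment| must be a nonzero power of two.
static size_t sketch_align_up(size_t value, size_t alignment) {
  return (value + alignment - 1) & ~(alignment - 1);
}

// Allocates zeroed, |alignment|-aligned storage large enough to hold an
// image of |size| bytes. Returns NULL on invalid arguments or allocation
// failure. Release the result with free().
static void* sketch_alloc_image(size_t size, size_t alignment) {
  // Reject zero sizes and alignments that are not a power of two.
  if (size == 0 || alignment == 0 || (alignment & (alignment - 1)) != 0) {
    return NULL;
  }
  size_t padded = sketch_align_up(size, alignment);
  if (padded < size) return NULL;  // Rounding up overflowed size_t.
  void* ptr = aligned_alloc(alignment, padded);
  if (ptr) memset(ptr, 0, padded);
  return ptr;
}
```

For example, `sketch_alloc_image(5000, 4096)` would request 8192 bytes aligned to 4096. Whether such memory is executable depends entirely on the target, which this sketch does not address.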