Skip to content

Extend CustomCodeMemory interface allowing for using previously-mapped images #10865

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

Milek7
Copy link
Contributor

@Milek7 Milek7 commented May 29, 2025

Module::deserialize_raw and CustomCodeMemory were introduced with aim of supporting no_std platforms. With small tweaks they can also be used for using Wasmtime on platforms that have full-blown virtual memory capabilities, but doesn't allow for directly mapping executable pages from user code instead limiting that capability for system loader. Adding features necessary for such platforms was previously attempted by #8245.

There are currently two issues with using Module::deserialize_raw for using images either statically linked into embedder executable or dynamically loaded shared objects:

  • CodeMemory::publish will initially make entire image read-only, destroying executable permissions that cannot be restored by user code. This will happen even if CustomCodeMemory is provided.
  • CodeMemory::publish will attempt to unconditionally register unwind information. As these are already properly registered by the system loader this is superfluous at best, or could fail module loading if system decides to return error on attempt for double-registration.

This commit solves these issues by:

  • Moving responsibility for making image RO to CustomCodeMemory hook.
  • Making CustomCodeMemory publishing hook return enum that tells what steps (only mapping, or mapping with registration) were performed, using default implementation only for actions not reported by the hook.
  • Additionally CustomCodeMemory hooks also receive pointers to the entire image, not only executable section. This allows the embedder to keep track of the images and unload them in case they were loaded as dynamic shared object.

@Milek7 Milek7 requested a review from a team as a code owner May 29, 2025 19:34
@Milek7 Milek7 requested review from dicej and removed request for a team May 29, 2025 19:34
@Milek7 Milek7 changed the title Extend CustomCodeMemory interface allowing for using previously-mappeddimages Extend CustomCodeMemory interface allowing for using previously-mapped images May 29, 2025
…d images

`Module::deserialize_raw` and `CustomCodeMemory` were introduced with aim of
supporting `no_std` platforms. With small tweaks they can also be used for using
Wasmtime on platforms that have full-blown virtual memory capabilities, but
doesn't allow for directly mapping executable pages from user code instead
limiting that capability for system loader.
Adding features necessary for such platforms was previously attempted by bytecodealliance#8245.

There are currently two issues with using `Module::deserialize_raw` for
using images either statically linked into embedder executable or dynamically
loaded shared objects:

- `CodeMemory::publish` will initially make entire image read-only,
  destroying executable permissions that cannot be restored by user code.
  This will happen even if `CustomCodeMemory` is provided.
- `CodeMemory::publish` will attempt to unconditionally register
  unwind information. As these are already properly registered by the system
  loader this is superfluous at best, or could fail module loading if system
  decides to return error on attempt for double-registration.

This commit solves these issues by:
- Moving responsibility for making image RO to `CustomCodeMemory` hook.
- Making `CustomCodeMemory` publishing hook return enum that tells
  what steps (only mapping, or mapping with registration) were
  performed, using default implementation only for actions not reported by the hook.
- Additionally `CustomCodeMemory` hooks also receive pointers to the
  entire image, not only executable section. This allows the embedder
  to keep track of the images and unload them in case they were
  loaded as dynamic shared object.
@Milek7 Milek7 force-pushed the customcodememory-dso branch from 00d5875 to 180ddad Compare May 29, 2025 19:47
@Milek7 Milek7 marked this pull request as draft May 29, 2025 20:18
@Milek7
Copy link
Contributor Author

Milek7 commented May 29, 2025

Okay, I got a little ahead of myself.

make_readonly don't actually get called for modules created by Module::deserialize_raw. So first issue is not true.
However double-registration is still a concern.
Providing address of entire image to hooks is still useful for keeping track for unloading purposes (and maybe for no virtual memory target if they actually want to make that RO mapping).

So I still think this makes sense.

@Milek7 Milek7 marked this pull request as ready for review May 29, 2025 20:44
@github-actions github-actions bot added the wasmtime:api Related to the API of the `wasmtime` crate itself label May 29, 2025
@alexcrichton
Copy link
Member

This seems reasonable to me to support, but what do you think of modeling this a bit differently? This is a pretty sensitive piece of the codebase which we want to ensure doesn't accidentally regress and having a lot of intertwined "if this else that" style logic can be relatively confusing to follow. Do you think it would be possible to internally model this as "assume there's always a custom memory publisher"? That way adding more hooks would mean adding more trait methods, and the "driver" would still be relatively straightforward of "do this, then that, then that".

A custom trait could then override any one particular step to a noop and that wouldn't involve a need for custom enums/etc which are pretty tightly bound to the exact structure of what we have today as well.

@Milek7
Copy link
Contributor Author

Milek7 commented Jun 2, 2025

Do you think about trait methods that would return bool indicating whether default implementation should be called, or calling only trait methods with default implementation provided as default CustomCodeMemory? In second case should MmapVec become part of public API?

@alexcrichton
Copy link
Member

Ideally we'd model everything with the trait to allow embedders to customize all the hooks necessary, but that's also not easy as there's internal types, like MmapVec, which we don't want to expose. This came up in #10740 where we have an internal trait with private types and a public trait with less functionality and that became a problem as well. Unfortunately I'm not sure how best to thread this needle...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
wasmtime:api Related to the API of the `wasmtime` crate itself
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants