Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove support for userfaultfd #4040

Merged
merged 1 commit into from
Apr 18, 2022

Conversation

alexcrichton
Copy link
Member

This commit removes support for the userfaultfd or "uffd" syscall on
Linux. This support was originally added for users migrating from Lucet
to Wasmtime, but the recent developments of kernel-supported
copy-on-write support for memory initialization wound up being more
appropriate for these use cases than usefaultfd. The main reason for
moving to copy-on-write initialization are:

  • The userfaultfd feature was never necessarily intended for this
    style of use case with wasm and was susceptible to subtle and rare
    bugs that were extremely difficult to track down. We were never 100%
    certain that there were kernel bugs related to userfaultfd but the
    suspicion never went away.

  • Handling faults with userfaultfd was always slow and single-threaded.
    Only one thread could handle faults and traveling to user-space to
    handle faults is inherently slower than handling them all in the
    kernel. The single-threaded aspect in particular presented a
    significant scaling bottleneck for embeddings that want to run many
    wasm instances in parallel.

  • One of the major benefits of userfaultfd was lazy initialization of
    wasm linear memory which is also achieved with the copy-on-write
    initialization support we have right now.

  • One of the suspected benefits of userfaultfd was less frobbing of the
    kernel vma structures when wasm modules are instantiated. Currently
    the copy-on-write support has a mitigation where we attempt to reuse
    the memory images where possible to avoid changing vma structures.
    When comparing this to userfaultfd's performance it was found that
    kernel modifications of vmas aren't a worrisome bottleneck so
    copy-on-write is suitable for this as well.

Overall there are no remaining benefits that userfaultfd gives that
copy-on-write doesn't, and copy-on-write solves a major downsides of
userfaultfd, the scaling issue with a single faulting thread.
Additionally copy-on-write support seems much more robust in terms of
kernel implementation since it's only using standard memory-management
syscalls which are heavily exercised. Finally copy-on-write support
provides a new bonus where read-only memory in WebAssembly can be mapped
directly to the same kernel cache page, even amongst many wasm instances
of the same module, which was never possible with userfaultfd.

In light of all this it's expected that all users of userfaultfd should
migrate to the copy-on-write initialization of Wasmtime (which is
enabled by default).

This commit removes support for the `userfaultfd` or "uffd" syscall on
Linux. This support was originally added for users migrating from Lucet
to Wasmtime, but the recent developments of kernel-supported
copy-on-write support for memory initialization wound up being more
appropriate for these use cases than usefaultfd. The main reason for
moving to copy-on-write initialization are:

* The `userfaultfd` feature was never necessarily intended for this
  style of use case with wasm and was susceptible to subtle and rare
  bugs that were extremely difficult to track down. We were never 100%
  certain that there were kernel bugs related to userfaultfd but the
  suspicion never went away.

* Handling faults with userfaultfd was always slow and single-threaded.
  Only one thread could handle faults and traveling to user-space to
  handle faults is inherently slower than handling them all in the
  kernel. The single-threaded aspect in particular presented a
  significant scaling bottleneck for embeddings that want to run many
  wasm instances in parallel.

* One of the major benefits of userfaultfd was lazy initialization of
  wasm linear memory which is also achieved with the copy-on-write
  initialization support we have right now.

* One of the suspected benefits of userfaultfd was less frobbing of the
  kernel vma structures when wasm modules are instantiated. Currently
  the copy-on-write support has a mitigation where we attempt to reuse
  the memory images where possible to avoid changing vma structures.
  When comparing this to userfaultfd's performance it was found that
  kernel modifications of vmas aren't a worrisome bottleneck so
  copy-on-write is suitable for this as well.

Overall there are no remaining benefits that userfaultfd gives that
copy-on-write doesn't, and copy-on-write solves a major downsides of
userfaultfd, the scaling issue with a single faulting thread.
Additionally copy-on-write support seems much more robust in terms of
kernel implementation since it's only using standard memory-management
syscalls which are heavily exercised. Finally copy-on-write support
provides a new bonus where read-only memory in WebAssembly can be mapped
directly to the same kernel cache page, even amongst many wasm instances
of the same module, which was never possible with userfaultfd.

In light of all this it's expected that all users of userfaultfd should
migrate to the copy-on-write initialization of Wasmtime (which is
enabled by default).
@alexcrichton
Copy link
Member Author

It's worth pointing out that this doesn't yet remove the support for paged memory initialization internally because that initialization is a step towards the static memory initialization that copy-on-write uses. I'll file an issue about simplifying that if we move forward with this.

@github-actions github-actions bot added wasmtime:api Related to the API of the `wasmtime` crate itself wasmtime:config Issues related to the configuration of Wasmtime labels Apr 15, 2022
@github-actions
Copy link

Subscribe to Label Action

cc @peterhuene

This issue or pull request has been labeled: "wasmtime:api", "wasmtime:config"

Thus the following users have been cc'd because of the following labels:

  • peterhuene: wasmtime:api

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

@github-actions
Copy link

Label Messager: wasmtime:config

It looks like you are changing Wasmtime's configuration options. Make sure to
complete this check list:

  • If you added a new Config method, you wrote extensive documentation for
    it.

    Our documentation should be of the following form:

    Short, simple summary sentence.
    
    More details. These details can be multiple paragraphs. There should be
    information about not just the method, but its parameters and results as
    well.
    
    Is this method fallible? If so, when can it return an error?
    
    Can this method panic? If so, when does it panic?
    
    # Example
    
    Optional example here.
    
  • If you added a new Config method, or modified an existing one, you
    ensured that this configuration is exercised by the fuzz targets.

    For example, if you expose a new strategy for allocating the next instance
    slot inside the pooling allocator, you should ensure that at least one of our
    fuzz targets exercises that new strategy.

    Often, all that is required of you is to ensure that there is a knob for this
    configuration option in wasmtime_fuzzing::Config (or one
    of its nested structs).

    Rarely, this may require authoring a new fuzz target to specifically test this
    configuration. See our docs on fuzzing for more details.

  • If you are enabling a configuration option by default, make sure that it
    has been fuzzed for at least two weeks before turning it on by default.


To modify this label's message, edit the .github/label-messager/wasmtime-config.md file.

To add new label messages or remove existing label messages, edit the
.github/label-messager.json configuration file.

Learn more.

Copy link
Member

@peterhuene peterhuene left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 With these changes I think we can finally restructure the pooling allocator layout so we can have a single madvise call per instance deallocation (it was originally designed to avoid unnecessary fault handling since they were so expensive from userspace).

@alexcrichton alexcrichton merged commit 3f3afb4 into bytecodealliance:main Apr 18, 2022
@alexcrichton alexcrichton deleted the rm-uffd branch April 18, 2022 17:42
@alexcrichton
Copy link
Member Author

FWIW I do think we'll want to measure that before landing it, I think historically I measured that madvise got slower as you gave it more memory to madvise, even if all the memory was PROT_NONE'd memory. I wasn't measuring precisely that though so it may be worth re-measuring to confirm.

alexcrichton added a commit to alexcrichton/wasmtime that referenced this pull request Apr 18, 2022
This commit simplifies the `MemoryInitialization` enum by removing the
`Paged` variant. The `Paged` variant was originally added for uffd, but
that support has now been removed in bytecodealliance#4040. This is no longer necessary
but is still used as an intermediate step of becoming a `Static` variant
of initialized memory (which copy-on-write uses). As a result this
commit largely modifies the static initialization of memory steps and
folds the two methods together.
alexcrichton added a commit that referenced this pull request May 5, 2022
* Remove the `Paged` memory initialization variant

This commit simplifies the `MemoryInitialization` enum by removing the
`Paged` variant. The `Paged` variant was originally added for uffd, but
that support has now been removed in #4040. This is no longer necessary
but is still used as an intermediate step of becoming a `Static` variant
of initialized memory (which copy-on-write uses). As a result this
commit largely modifies the static initialization of memory steps and
folds the two methods together.

* Apply suggestions from code review

Co-authored-by: Peter Huene <peter@huene.dev>

Co-authored-by: Peter Huene <peter@huene.dev>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
wasmtime:api Related to the API of the `wasmtime` crate itself wasmtime:config Issues related to the configuration of Wasmtime
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants