fast_memcpy: add memcpy implementation for openhcl_vmm #2297
openhcl_vmm has a lot of code that depends on memcpy being fast, but musl's memcpy on x86_64 is slow. Write a generic memcpy in Rust and rely on LLVM to do a good job optimizing it.
Pull Request Overview
This PR introduces an optimized memcpy and memmove implementation to address performance issues with musl's implementation on x86_64. The implementation handles various copy sizes with specialized strategies and correctly handles overlapping memory regions.
- Adds a new fast_memcpy crate with optimized memory copy functions
- Integrates fast_memcpy into OpenHCL's underhill_entry for x86_64 targets
- Configures dev profile optimizations to ensure fast_memcpy is always optimized
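To illustrate what "correctly handles overlapping memory regions" can mean for a memmove, here is a minimal sketch, not the code from this PR. The function name `move_bytes` and the byte-at-a-time loops are assumptions made for illustration; the actual crate uses size-specific strategies that are not shown here.

```rust
/// Hypothetical byte-wise memmove sketch (not the PR's implementation).
///
/// # Safety
/// `src` must be valid for reads of `len` bytes and `dest` must be valid
/// for writes of `len` bytes. The two regions may overlap.
pub unsafe fn move_bytes(dest: *mut u8, src: *const u8, len: usize) {
    // `dest` lies inside [src, src + len) exactly when the wrapping
    // difference is smaller than `len`. In that case a forward copy would
    // overwrite source bytes before they are read, so copy backward.
    let dest_inside_src = (dest as usize).wrapping_sub(src as usize) < len;
    unsafe {
        if dest_inside_src {
            for i in (0..len).rev() {
                dest.add(i).write(src.add(i).read());
            }
        } else {
            // Either the regions are disjoint or `dest` precedes `src`;
            // a forward copy reads each byte before it can be overwritten.
            for i in 0..len {
                dest.add(i).write(src.add(i).read());
            }
        }
    }
}
```

The single wrapping comparison covers both the disjoint case and the case where `dest` precedes `src`, so only one branch is needed to pick the copy direction.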
Reviewed Changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| support/fast_memcpy/src/lib.rs | Core implementation of optimized memcpy/memmove with size-specific strategies and overlap handling |
| support/fast_memcpy/Cargo.toml | Package manifest for the new fast_memcpy crate |
| openhcl/underhill_entry/src/lib.rs | Imports fast_memcpy for x86_64 to replace musl's memcpy |
| openhcl/underhill_entry/Cargo.toml | Adds fast_memcpy dependency |
| Cargo.toml | Registers fast_memcpy workspace member and configures dev profile optimizations |
| Cargo.lock | Lock file updates for new dependency |
This PR modifies files containing For more on why we check whole files, instead of just diffs, check out the Rustonomicon.
    0 => {}
    1 => copy_one::<u8>(dest, src),
    2 => copy_one::<u16>(dest.cast(), src.cast()),
    3 => copy_one::<U8x3>(dest.cast(), src.cast()),
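For readers without the full diff, the following is a rough sketch of how helpers like these might be defined. The names `copy_one` and `U8x3` come from the excerpt above, but their definitions here are assumptions, as are the wrapper `copy_small`, the `4` arm, and the fallback arm; the PR's actual code may differ.

```rust
use core::ptr;

/// Assumed definition: a 3-byte, align-1 element so that a 3-byte copy can
/// be expressed as a single typed load and store.
#[derive(Clone, Copy)]
#[repr(C)]
pub struct U8x3([u8; 3]);

/// Copies one `T` using an unaligned load followed by an unaligned store.
///
/// # Safety
/// `src` must be valid for reads and `dest` for writes of
/// `size_of::<T>()` bytes.
#[inline(always)]
pub unsafe fn copy_one<T: Copy>(dest: *mut T, src: *const T) {
    // The whole value is loaded before it is stored, so this is correct
    // even when the two regions overlap.
    unsafe { dest.write_unaligned(src.read_unaligned()) }
}

/// Hypothetical wrapper that dispatches on small lengths.
///
/// # Safety
/// `src` must be valid for reads and `dest` for writes of `len` bytes.
pub unsafe fn copy_small(dest: *mut u8, src: *const u8, len: usize) {
    unsafe {
        match len {
            0 => {}
            1 => copy_one::<u8>(dest, src),
            2 => copy_one::<u16>(dest.cast(), src.cast()),
            3 => copy_one::<U8x3>(dest.cast(), src.cast()),
            4 => copy_one::<u32>(dest.cast(), src.cast()),
            // Placeholder for the larger-size strategies, which are not
            // part of the excerpt. A real memcpy replacement could not
            // delegate to `ptr::copy`, since that may call memmove itself.
            _ => ptr::copy(src, dest, len),
        }
    }
}
```

Because each arm has a compile-time-constant size, the compiler can lower it to a fixed handful of move instructions instead of a loop, which is presumably the motivation for giving the 3-byte case its own type.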
How frequently does 3 come up?
I dunno.
Is it worth having this extra type and wiring? It seems like an odd len to come in. How often is memcpy called with such small sizes in general? I'd expect small sizes to just be optimized out.
openhcl_vmm has a lot of code that depends on memcpy being fast, but musl's memcpy on x86_64 is often slow. Write a generic memcpy in Rust and rely on LLVM to do a good job optimizing it.