
@olivergillespie olivergillespie commented Aug 29, 2025

Add a new MIR pass which changes the source filename for statements with move operands to include information about the types being moved.

This is just a proof of concept to play with; we'll need a better approach if we want to make this a feature.

Background

When profiling my applications to improve throughput, I've often found a lot of CPU time spent in memcpy. Profiling tools show it being called directly from my code, e.g. copy::main -> memcpy, but it's not always obvious where the memcpy comes from, which I need to know in order to fix it (e.g. by moving some data onto the heap). Even with a line number, the copy could come from values moved into a closure, and in complex code that could be several values of different types and sizes, so how do I know which are the expensive ones?
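To make the problem concrete, here is a minimal sketch of the kind of code I mean. `Payload` and both helper functions are made up for illustration and are not part of this PR. A closure that captures a large value by move is itself that large, so every time the closure is moved the whole payload may be copied; capturing a `Box` instead keeps the closure pointer-sized.

```rust
use std::mem::size_of_val;

/// Hypothetical payload type: big enough that moving it by value is a real copy.
pub struct Payload {
    pub header: u32,
    pub body: [u8; 1234],
}

/// Size in bytes of a closure that captures the whole `Payload` by value.
pub fn inline_closure_size(payload: Payload) -> usize {
    let closure = move || {
        // Borrowing the whole variable makes the closure capture all of it.
        let p = &payload;
        p.body.len() + p.header as usize
    };
    size_of_val(&closure)
}

/// Size in bytes of a closure that captures a `Box<Payload>` instead.
pub fn boxed_closure_size(payload: Box<Payload>) -> usize {
    let closure = move || {
        // Only the box (a pointer) is captured; the payload stays on the heap.
        let p = &payload;
        p.body.len() + p.header as usize
    };
    size_of_val(&closure)
}
```

Moving the first closure around can show up as memcpy in a profile, while moving the second only copies a pointer. Nothing in the profile says which of the two situations a given call site is in.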

I would like some way to trace an expensive memcpy seen at runtime back to the values being moved, and why they are moved. As a proof of concept, I've looked for move operands and added the moved type to the source filename. This is obviously a hack, but it lets profiling tools show the type alongside the runtime cost of the memcpy.

Profile before:
[image: before]

Profile after:
The new filename/line number looks like this: /tmp/rust/copy.rs Moved:[[u8; 1234](1234 bytes)] Line::12
[image: after]

The new profile shows us the types that were moved for each memcpy operation.
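For illustration only, the label format above can be reproduced in ordinary Rust. This is a rough sketch, not the MIR pass itself: `annotated_filename` is a hypothetical helper, it handles a single moved type, and it relies on `std::any::type_name`, whose exact output is documented as best-effort.

```rust
use std::any::type_name;
use std::mem::size_of;

/// Builds a label in the shape shown above for a single moved type `T`.
/// Hypothetical helper for illustration; the PR does this inside the compiler.
pub fn annotated_filename<T>(file: &str, line: u32) -> String {
    format!(
        "{} Moved:[{}({} bytes)] Line::{}",
        file,
        type_name::<T>(), // best-effort type name, not a stable format
        size_of::<T>(),   // size of the moved value in bytes
        line
    )
}
```

Calling `annotated_filename::<[u8; 1234]>("/tmp/rust/copy.rs", 12)` produces a string with the same pieces as the example: the original path, the moved type with its size, and the line.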

I think static analysis is not enough to solve this satisfactorily, because we really only care about cases which are actually hot.

Real implementation

As a first step, simply emitting a mapping from source location to moved types would allow manual lookup by the developer. They can profile their code, then for each memcpy of interest, look up the mapping and see which type(s) were moved there (could this data be exposed somehow by rust-analyzer?). It would be nice to avoid the manual lookup step, though, and have the information available directly in the profile. Profiling tools could be modified to read the mapping, but that's a pain (I assume). Creative hacks include inserting wrapper method calls around memcpy which carry this information in the method name: copy::main -> memcpy becomes copy::main -> fake_methods::rustc_move_[[u8; 1234](1234 bytes)] -> memcpy, so profilers see it automatically. This could be limited to moves over a size threshold to reduce overhead.
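The manual lookup could be as simple as the sketch below. Everything here is an assumption for illustration: the `MoveMap` and `MovedType` names, the idea that the compiler would emit (file, line, type, size) records, and the reuse of the size threshold as a lookup filter. No such output exists today.

```rust
use std::collections::HashMap;

/// One moved type recorded at a source location (hypothetical record shape).
#[derive(Debug, Clone, PartialEq)]
pub struct MovedType {
    pub type_name: String,
    pub size_bytes: u64,
}

/// Mapping from (file, line) to the types moved there.
#[derive(Default)]
pub struct MoveMap {
    by_location: HashMap<(String, u32), Vec<MovedType>>,
}

impl MoveMap {
    /// Records that a value of `type_name` was moved at `file:line`.
    pub fn record(&mut self, file: &str, line: u32, type_name: &str, size_bytes: u64) {
        self.by_location
            .entry((file.to_string(), line))
            .or_default()
            .push(MovedType {
                type_name: type_name.to_string(),
                size_bytes,
            });
    }

    /// Returns the moves at `file:line` that are at least `min_bytes` large.
    pub fn lookup(&self, file: &str, line: u32, min_bytes: u64) -> Vec<&MovedType> {
        self.by_location
            .get(&(file.to_string(), line))
            .map(|moves| moves.iter().filter(|m| m.size_bytes >= min_bytes).collect())
            .unwrap_or_default()
    }
}
```

Given a hot memcpy at copy.rs line 12 in the profile, `lookup("/tmp/rust/copy.rs", 12, 128)` would return only the moves large enough to plausibly matter, such as the `[u8; 1234]` from the example.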

Additional features/questions

  • Do we have access to the variable names? That would be useful to include.
  • Is this a legitimate way to find the relevant moves? Did I miss any cases?
  • Is there already some easy way to solve this problem (finding the cause of expensive move memcpys) that I just don't know about?

@rustbot added the labels S-waiting-on-author (Status: This is awaiting some action (such as code changes or more information) from the author.) and T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.) on Aug 29, 2025
@rust-log-analyzer
Collaborator

The job aarch64-gnu-llvm-19-1 failed! Check out the build log: (web) (plain enhanced) (plain)

