
Overhead

Clark Gaebel edited this page Apr 23, 2022 · 2 revisions

In our experience, magic-trace tends to make applications 2%-10% slower, with most applications landing closer to 2% than 10%. Additionally, the application stalls for ~10us each time magic-trace takes a snapshot. All in, I think it's fair to say that magic-trace has less overhead than perf record -g, and more overhead than perf record --call-graph lbr.
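To put those two numbers side by side, here is a rough back-of-the-envelope sketch. The function name and parameters are hypothetical, and the model (a flat percentage slowdown plus a fixed stall per snapshot) is a simplifying assumption rather than anything magic-trace reports.

```c
/* Hypothetical estimate of wall-clock time while tracing.
 * Assumes a flat fractional slowdown plus a fixed stall per snapshot.
 *
 *   base_seconds      runtime without magic-trace
 *   slowdown_fraction steady-state overhead, e.g. 0.02 for 2%
 *   snapshots         number of snapshots taken during the run
 *   stall_us          stall per snapshot in microseconds (~10 per the text)
 */
double traced_runtime_seconds(double base_seconds,
                              double slowdown_fraction,
                              unsigned long snapshots,
                              double stall_us)
{
    double steady_state = base_seconds * (1.0 + slowdown_fraction);
    double snapshot_stalls = (double)snapshots * stall_us * 1e-6;
    return steady_state + snapshot_stalls;
}
```

For example, traced_runtime_seconds(100.0, 0.02, 1000, 10.0) comes to about 102.01 seconds: the steady-state slowdown dominates, and even a thousand snapshots add only around 10ms in total.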

Memory bandwidth

The "2%-10%" overhead mostly (maybe entirely?) comes from Intel PT's memory bandwidth usage.

In our experience, Intel PT (and therefore magic-trace) uses hundreds of Mbps of memory bandwidth to construct its traces. This is usually fine; Intel PT pauses tracing if it notices that the trace would saturate memory bandwidth. Those pauses leave gaps in the recorded trace, and momentarily saturating memory bandwidth is the number one reason people see "Decode Errors" in their traces.

You can decrease magic-trace's memory bandwidth consumption by decreasing the timing resolution, since a coarser resolution should mean less timing data written into the trace.

Breakpoint

When magic-trace takes its snapshot, it interrupts the application for ~10us. To keep that stall from affecting users of your application, we recommend triggering snapshots from a function that is called only after a user's request has been completely serviced.
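A minimal sketch of that pattern in C follows. Everything named here (struct request, handle_request, request_done_trigger, trigger_calls) is hypothetical and only for illustration; the noinline attribute is GCC/Clang-specific. You would still need to tell magic-trace to snapshot on this symbol, typically through its trigger option (check magic-trace's help output for the exact flag on your version).

```c
#include <stddef.h>

/* Hypothetical request type, for illustration only. */
struct request {
    const char *payload;
    size_t len;
};

/* Side effect that keeps the compiler from discarding the trigger call. */
volatile unsigned long trigger_calls = 0;

/* Hypothetical trigger symbol. It does almost nothing; it exists so that
 * a snapshot can be taken when it is called. noinline keeps it a real
 * call to a real symbol even in optimized builds. */
__attribute__((noinline)) void request_done_trigger(void)
{
    trigger_calls++;
}

/* Services one request, then calls the trigger. */
size_t handle_request(const struct request *req)
{
    /* Stand-in for the real work: pretend the payload was echoed back. */
    size_t bytes_sent = req->len;

    /* The response is already out the door at this point, so a ~10us
     * stall here is not on the user's critical path. */
    request_done_trigger();

    return bytes_sent;
}
```

The design choice is simply ordering: the trigger call comes after the last step the user waits on, so the snapshot stall lands in time the user never observes.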