Skip to content

AsmPrinter puts NamedRegionTimer around every instruction, causing big overhead in clang -ftime-report #39650

@llvmbot

Description

@llvmbot
Bugzilla Link 40303
Version trunk
OS All
Reporter LLVM Bugzilla Contributor
CC @francisvm,@RKSimon,@arsenm

Extended Description

Clang -ftime-report seems to have quite a lot of overhead, compared to when the flag is not used. In the time reported, curiously passes like "X86 Assembly Printer" take up a lot of of time.

Overhead of having -ftime-report on is 30-60% which sounds like a lot. Some tests I did locally (actual files don't matter; basically any non-trivial .cpp file compilation will do):

            regular   -ftime-report

catch.cpp 1.337 1.950
catch.cpp -O2 4.616 6.586
stl.cpp 0.882 1.269
unityformat.cpp 7.195 7.312
range-compr.cpp 5.352 6.129
shader.cpp 6.000 9.555
shader.cpp -O2 12.635 20.061

My guess is that's because lib/CodeGen/AsmPrinter/AsmPrinter.cpp basically has two timer samples (NamedRegionTimer) for every instruction, and for every "Handler" that it invokes.

EmitFunctionBody basically looks like:

for (auto &MBB : *MF) {
for (auto &MI : MBB) {
for (const HandlerInfo &HI : Handlers) {
NamedRegionTimer T(...);
beginInstruction();
}
// ...
for (const HandlerInfo &HI : Handlers) {
NamedRegionTimer T(...);
endInstruction();
}
}
}

And then every timer sample is of course involves getting elapsed time, process times, and memory usage samples twice (for beginning and end of region).

I'm testing this with a couple days old clang trunk (8.0.0), but seemingly the issue has been there for a while. Haven't tracked down how far back it exists.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions