cmd/link,cmd/compile: linktime InlMark #77093
Closed as duplicate of #29571
Labels: NeedsInvestigation, Performance, ToolProposal, compiler/runtime
In CL 733845, these simple lines of code:
compile to (this is after pruning the inline marks):
This is a somewhat ridiculous level of inline marks. NOPs are cheap, although:
- `InlMark` into the regular execution stream, so they consume decoder slots at a `floor(N_inlmarks / 2)` ratio.
- `mov rax, rbx`, `add rax, rcx` → `add rax, rbx, rcx` (decoded as a 3-operand instruction, which doesn't exist on amd64, rather than going through the register renaming unit).

I have other examples where removing the inline marks speeds things up, although in most of them the inline mark is just the last straw that overflows the instructions into the next cache line, or the code gets unlucky with loop alignment.
AFAIK inline marks exist because runtime and tracing internals need a placeholder PC to use for the inlined function in backtraces.
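To make the backtrace requirement concrete, here is a minimal sketch of how inlined frames surface through the public `runtime` API. The function names `leaf`, `middle`, `callerNames`, and `hasFrame` are made up for illustration, and whether `leaf` actually gets inlined depends on the compiler version and flags. The point is only that `runtime.CallersFrames` reports a logical frame for `leaf` either way, which is the kind of lookup that needs a PC attributable to the inlined function.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// callerNames returns the function names on the current call stack,
// innermost first, starting at the caller of callerNames.
func callerNames() []string {
	pcs := make([]uintptr, 16)
	n := runtime.Callers(2, pcs) // skip runtime.Callers and callerNames
	frames := runtime.CallersFrames(pcs[:n])
	var names []string
	for {
		f, more := frames.Next()
		names = append(names, f.Function)
		if !more {
			break
		}
	}
	return names
}

// hasFrame reports whether any frame name ends with suffix.
func hasFrame(names []string, suffix string) bool {
	for _, name := range names {
		if strings.HasSuffix(name, suffix) {
			return true
		}
	}
	return false
}

// leaf is small enough that the compiler will usually inline it into middle.
func leaf() []string { return callerNames() }

// middle is the physical frame that hosts leaf's code when leaf is inlined.
func middle() []string { return leaf() }

func main() {
	names := middle()
	for _, name := range names {
		fmt.Println(name)
	}
	fmt.Println("leaf reported:", hasFrame(names, ".leaf"))
}
```

Building with `go build -gcflags=-m` should show whether `leaf` was inlined in a given toolchain; the reported frames stay the same in both cases.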
I think it would make more sense if the linker generated inlmarks (with cooperation from the compiler).
There would be one inline mark symbol per function definition (so all functions inlining the same function would point to the same symbol).
They would be PC-quantum-sized objects in a zero-tripped `ro-^x` segment to minimize memory usage (there would still be significant debug info / pclntab overhead ...).
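As a rough illustration of the bookkeeping this proposal implies, the sketch below assigns one mark slot per distinct inlined callee instead of one per inlined call site. Everything here is hypothetical: `inlineSite`, `perSiteMarks`, and `sharedMarks` are names invented for this example, not linker or compiler APIs, and the PC quantum is passed in as a plain parameter.

```go
package main

import "fmt"

// inlineSite records one place where callee was inlined into caller.
// Hypothetical type, used only for this illustration.
type inlineSite struct {
	caller, callee string
}

// perSiteMarks models the upper bound of the current scheme:
// at most one mark in the instruction stream per inlined call site.
func perSiteMarks(sites []inlineSite) int {
	return len(sites)
}

// sharedMarks models the proposed scheme: one mark per distinct callee,
// laid out back to back in a separate segment, pcQuantum bytes apart.
// It returns the offset assigned to each callee, in first-seen order.
func sharedMarks(sites []inlineSite, pcQuantum int) map[string]int {
	offsets := make(map[string]int)
	next := 0
	for _, s := range sites {
		if _, seen := offsets[s.callee]; seen {
			continue // every caller of this callee shares one symbol
		}
		offsets[s.callee] = next
		next += pcQuantum
	}
	return offsets
}

func main() {
	sites := []inlineSite{
		{"main.a", "pkg.min"},
		{"main.b", "pkg.min"},
		{"main.b", "pkg.abs"},
	}
	shared := sharedMarks(sites, 1)
	fmt.Println("per-site marks:", perSiteMarks(sites))
	fmt.Println("shared marks:  ", len(shared))
}
```

The trade-off modeled here is that the marks leave the hot instruction stream entirely, at the cost of a small out-of-line table plus whatever extra metadata is needed to map each call site to its shared symbol.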