testing: b.Loop still causes extra allocations after 1.26 b.Loop inlining behavior changes #77339

@thepudds

Background

The first version of b.Loop as implemented in Go 1.24 stopped the inlining of functions called within the b.Loop body.

Shortly after 1.24 was released, I filed #73137, which suggested improving on that strategy. One reason was that it caused heap allocations under a b.Loop benchmark that would not occur in normal usage or in an older b.N-style benchmark, especially in cases where inlining would allow something to be stack allocated that would otherwise be heap allocated.
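To make the inlining point concrete, here is a minimal sketch of how inlining can decide between stack and heap allocation. The `point` type, both constructors, and the helper names are hypothetical and are not taken from #73137; the only library call used is `testing.AllocsPerRun`.

```go
package main

import "testing"

// point is a hypothetical small struct used only for illustration.
type point struct{ x, y int }

// newPoint is small enough for the compiler to inline. Once inlined,
// a caller that does not let the result escape can usually keep the
// value on the stack.
func newPoint(x, y int) *point { return &point{x: x, y: y} }

// newPointNoInline has the same body, but the directive below prevents
// inlining, so the returned pointer has to refer to a heap object.
//
//go:noinline
func newPointNoInline(x, y int) *point { return &point{x: x, y: y} }

// sink keeps the computed values observable.
var sink int

// allocsInlined reports the average allocations per call when the
// constructor is eligible for inlining into the measured closure.
func allocsInlined() float64 {
	return testing.AllocsPerRun(1000, func() {
		p := newPoint(1, 2)
		sink += p.x + p.y
	})
}

// allocsNotInlined reports the same measurement for the constructor
// that is never inlined.
func allocsNotInlined() float64 {
	return testing.AllocsPerRun(1000, func() {
		p := newPointNoInline(1, 2)
		sink += p.x + p.y
	})
}
```

With a default optimized build, I would expect `allocsNotInlined` to report one allocation per call and `allocsInlined` to report fewer, but the exact numbers depend on the toolchain and build flags.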

Austin (comment) and others agreed it made sense to improve the implementation.

Go 1.26 did change the implementation, and as a result the issue I had filed is now closed.

Problem

The 1.26 implementation no longer prevents inlining.

However, from what I can tell, the 1.26 implementation still causes allocations under b.Loop that do not occur under an older-style b.N benchmark. I suspect that in the common case, scenarios that triggered extra allocations under b.Loop in 1.24/1.25 still trigger extra allocations in go1.26rc2.

I have a CL at https://go.dev/cl/738822 that makes a small change to the Go 1.26 implementation that I think resolves this issue.

Additional details

Inlining most of my comment from a few days ago in #73137 (comment), which includes a sample benchmark that illustrates the problem:

I wanted to understand the new 1.26 b.Loop behavior better, so I poked around a bit at how it's implemented.

One thing I noticed is that with the new behavior, b.Loop still seems to cause extra allocations compared to the older b.N-style benchmarks, including in the example I built for this issue, in particular my "b.Loop-basic" benchmark from the playground link in the opening comment above.

The function being benchmarked can now be inlined in Go 1.26, which is an improvement over the Go 1.24/1.25 behavior, but the undesirable allocation still happens with b.Loop in go1.26rc2, whereas it does not happen in an otherwise equivalent b.N benchmark.
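For readers without the playground link at hand, the shape of the comparison can be sketched roughly as below. This is not the original "b.Loop-basic" benchmark: `buf`, `newBuf`, and `allocsPerOp` are hypothetical names chosen for illustration, and `b.Loop` requires Go 1.24 or later.

```go
package main

import "testing"

// buf is a hypothetical value that is cheap to keep on the stack.
type buf struct{ data [64]byte }

// newBuf is a hypothetical stand-in for the function under test. It
// returns a pointer, so the result is heap allocated unless the call
// is inlined and the result is shown not to escape.
func newBuf(seed byte) *buf {
	v := &buf{}
	v.data[0] = seed
	return v
}

// sink keeps the results observable.
var sink byte

// BenchmarkN is the older b.N-style loop.
func BenchmarkN(b *testing.B) {
	for i := 0; i < b.N; i++ {
		v := newBuf(1)
		sink += v.data[0]
	}
}

// BenchmarkLoop is the same body driven by b.Loop.
func BenchmarkLoop(b *testing.B) {
	for b.Loop() {
		v := newBuf(1)
		sink += v.data[0]
	}
}

// allocsPerOp runs one benchmark function programmatically and
// returns its allocations per operation.
func allocsPerOp(f func(*testing.B)) int64 {
	return testing.Benchmark(f).AllocsPerOp()
}
```

Saved in a `_test.go` file, the two benchmarks can be compared with `go test -bench=. -benchmem`. The report above is that the b.Loop variant shows an allocation that the b.N variant does not; whether this particular sketch reproduces that depends on the toolchain version.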

I suspect that is due to how the 1.26 b.Loop compiler changes handle the temporary variables they create. It looks like the compiler declares the autotmp variables outside the loop body, so escape analysis treats the values stored in them as escaping, which results in a heap allocation.
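As a hand-written analogy only (this is not the compiler's actual rewrite, and `item`, `sumTempOutside`, `sumTempInside`, and `allocsOf` are hypothetical names), the following sketch shows the general effect I have in mind: as I understand the gc escape analysis, storing the address of a per-iteration value in a variable declared outside the loop forces that value to the heap, while the same code with the temporary declared inside the loop body does not.

```go
package main

import "testing"

// item is a hypothetical small struct used only for illustration.
type item struct{ a, b int }

// sumTempOutside declares the pointer temporary before the loop, so
// the address of each per-iteration value is stored in a variable
// that outlives the iteration.
func sumTempOutside(n int) int {
	total := 0
	var tmp *item // declared outside the loop body
	for i := 0; i < n; i++ {
		v := item{a: i, b: i}
		tmp = &v
		total += tmp.a + tmp.b
	}
	return total
}

// sumTempInside is identical except that the temporary is declared
// inside the loop body, so it never outlives the value it points to.
func sumTempInside(n int) int {
	total := 0
	for i := 0; i < n; i++ {
		v := item{a: i, b: i}
		tmp := &v // declared inside the loop body
		total += tmp.a + tmp.b
	}
	return total
}

// sink keeps the results observable.
var sink int

// allocsOf reports the average allocations per call of f(8).
func allocsOf(f func(int) int) float64 {
	return testing.AllocsPerRun(100, func() { sink += f(8) })
}
```

Comparing `allocsOf(sumTempOutside)` with `allocsOf(sumTempInside)`, or building with `-gcflags=-m` and looking for a "moved to heap" diagnostic, should show whether the outer declaration is what costs the allocation on a given toolchain.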

I sent WIP https://go.dev/cl/738822 with a candidate fix.
