Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: generate same code regardless of whether return values are named #20859

Open
bradfitz opened this issue Jun 30, 2017 · 18 comments
Open
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@bradfitz
Copy link
Contributor

There's an article at the top of Hacker News right now,

https://blog.minio.io/golang-internals-part-2-nice-benefits-of-named-return-values-1e95305c8687
https://news.ycombinator.com/item?id=14668323

... which advocates for naming return values for the benefit of better generated code.

It concludes:

So we will be gradually adopting named return values more and more, both for new code as well as for existing code.

In fact we are also investigating if we can develop a little utility to help or automate this process.

This is totally counter to:

https://golang.org/s/style#named-result-parameters

... which says that naming result parameters should only be for godoc cleanliness.

We should fix the compiler to generate the same code regardless.

/cc @randall77 @mdempsky @josharian @brtzsnr @cherrymui

@bradfitz bradfitz added the NeedsFix The path to resolution is known, but the work has not been done. label Jun 30, 2017
@bradfitz bradfitz added this to the Go1.10 milestone Jun 30, 2017
@robpike
Copy link
Contributor

robpike commented Jun 30, 2017

I like named return values. I wish I had permission to use them.

-rob

@bradfitz
Copy link
Contributor Author

I like named return values. I wish I had permission to use them.

I interpret this as sadness about the "style" URL above, another term I know you don't like. I guess you value anti-stutter in godoc differently than some of us do. I agree there are cases where it'd be nice to use named return values (especially to reference them from defer) without cluttering godoc. Or to eliminate a few lines in other cases.

It'd be neat if we could influence godoc rendering separate from whether we chose to use named return values in code. But those proposals never go anywhere quickly (#16666, #18342, #7873), so instead we control documentation formatting by how we write the code.

@dmitshur
Copy link
Contributor

dmitshur commented Jun 30, 2017

Is this issue about the following two styles generating same code?

func foo() (ch credentialHeader) {
    // ...
    if cond {
        return ch
    }
    // ...
    return ch
}
func foo() credentialHeader {
    var ch credentialHeader
    // ...
    if cond {
        return ch
    }
    // ...
    return ch
}

Or is it about handling these cases too?

func foo() credentialHeader {
    // ...
    if cond {
        return credentialHeader{/* ... */}
    }
    // ...
    return credentialHeader{/* ... */}
}

@fwessels
Copy link

It would be great if the compiler can take care of this and always generated the most optimal code.

Even so, for all of the examples as posted above by @shurcooL, you can still argue that the first is the most readable (or at the very least it is the briefest version). But of course this also has to do with personal style.

@robpike
Copy link
Contributor

robpike commented Jun 30, 2017

Sometimes, such as when there are several return values , code can be cleaner and simpler using named returns. One instance is when there are multiple return values followed by an error: you can use Go's handling of zero values on function entry to avoid messy return statements. But the style guys win, they always win, and much as I like consistency in the code base I get sad when following the style guide makes the code uglier.

The style tyrants made illegal the single most interesting language property of Sawzall because to use it required judgment. The language was made much clumsier in practice as a result, and it depressed me. Style tyrants want to eliminate judgment, plus it seems sometimes that the result of style guides is to steer programmers away from all the interesting properties of the language, towards a mainstream but dull uniformity free of joy.

I follow the style guide anyway, just to be clear.

@fwessels
Copy link

Maybe, if someone has poor judgment, it is better to stay away from writing code altogether (or doing other stuff for that matter...)

And besides, hey, where would be the world be without some rebellions... 😄

@hanwen
Copy link
Contributor

hanwen commented Jun 30, 2017

@robpike - pray tell, which feature was that?

@cherrymui
Copy link
Member

It seems there are a few things for the binary difference. In the first example in the blog,

type objectInfo struct {
	arg1 int64
	arg2 uint64
	arg3 string
	arg4 []int
}
func NoNamedReturnParams(i int) (objectInfo) {

	if i == 1 {
		// Do one thing
		return objectInfo{}
	}

	if i == 2 {
		// Do another thing
		return objectInfo{}
	}

	if i == 3 {
		// Do one more thing still
		return objectInfo{}
	}

	// Normal return
	return objectInfo{}
}
"".NoNamedReturnParams t=1 size=243 args=0x40 locals=0x0
0x0000 	TEXT	"".NoNamedReturnParams(SB), $0-64
0x0000 	MOVQ	$0, "".~r1+16(FP)
0x0009 	LEAQ	"".~r1+24(FP), DI
0x000e 	XORPS	X0, X0
0x0011 	ADDQ	$-16, DI
0x0015 	DUFFZERO	$288
0x0028 	MOVQ	"".i+8(FP), AX
0x002d 	CMPQ	AX, $1
0x0031 	JEQ	$0, 199
0x0037 	CMPQ	AX, $2
0x003b 	JEQ	$0, 155
0x003d 	CMPQ	AX, $3
0x0041 	JNE	111
0x0043 	MOVQ	"".statictmp_2(SB), AX
0x004a 	MOVQ	AX, "".~r1+16(FP)
0x004f 	LEAQ	"".~r1+24(FP), DI
0x0054 	LEAQ	"".statictmp_2+8(SB), SI
0x005b 	DUFFCOPY	$854
0x006e 	RET
0x006f 	MOVQ	"".statictmp_3(SB), AX
0x0076 	MOVQ	AX, "".~r1+16(FP)
0x007b 	LEAQ	"".~r1+24(FP), DI
0x0080 	LEAQ	"".statictmp_3+8(SB), SI
0x0087 	DUFFCOPY	$854
0x009a 	RET
0x009b 	MOVQ	"".statictmp_1(SB), AX
0x00a2 	MOVQ	AX, "".~r1+16(FP)
0x00a7 	LEAQ	"".~r1+24(FP), DI
0x00ac 	LEAQ	"".statictmp_1+8(SB), SI
0x00b3 	DUFFCOPY	$854
0x00c6 	RET
0x00c7 	MOVQ	"".statictmp_0(SB), AX
0x00ce 	MOVQ	AX, "".~r1+16(FP)
0x00d3 	LEAQ	"".~r1+24(FP), DI
0x00d8 	LEAQ	"".statictmp_0+8(SB), SI
0x00df 	DUFFCOPY	$854
0x00f2 	RET

The compiler generates four separate statictmps for objectInfo{}. It could realize that they are the same value so just generate one statictmp and reuse it. Better, it could realize it is zero value therefore no need to generate statictmp.

If I rewrite it to something like (like @shurcooL's example)

func NoNamedReturnParams2(i int) (objectInfo) {
        var o objectInfo
        if i == 1 {
                // Do one thing
                return o
        }

        if i == 2 {
                // Do another thing
                return o
        }

        if i == 3 {
                // Do one more thing still
                return o
        }

        // Normal return
        return o
}

It generates

"".NoNamedReturnParams2 STEXT nosplit size=309 args=0x40 locals=0x40
	0x0000 00000 (r.go:29)	TEXT	"".NoNamedReturnParams2(SB), NOSPLIT, $64-64
	0x0000 00000 (r.go:29)	SUBQ	$64, SP
	0x0004 00004 (r.go:29)	MOVQ	BP, 56(SP)
	0x0009 00009 (r.go:29)	LEAQ	56(SP), BP
	0x000e 00014 (r.go:29)	FUNCDATA	$0, gclocals·1d0ed49f611d7e40a62328b5976a2ede(SB)
	0x000e 00014 (r.go:29)	FUNCDATA	$1, gclocals·98abd00825523461856b7cae0585bf34(SB)
	0x000e 00014 (r.go:29)	MOVQ	$0, "".~r1+80(SP)
	0x0017 00023 (r.go:29)	LEAQ	"".~r1+88(SP), DI
	0x001c 00028 (r.go:29)	XORPS	X0, X0
	0x001f 00031 (r.go:29)	ADDQ	$-16, DI
	0x0023 00035 (r.go:29)	DUFFZERO	$288
	0x0036 00054 (r.go:30)	MOVQ	$0, "".o(SP)
	0x003e 00062 (r.go:30)	LEAQ	"".o+8(SP), DI
	0x0043 00067 (r.go:30)	ADDQ	$-16, DI
	0x0047 00071 (r.go:30)	DUFFZERO	$288
	0x005a 00090 (r.go:29)	MOVQ	"".i+72(SP), AX
	0x005f 00095 (r.go:31)	CMPQ	AX, $1
	0x0063 00099 (r.go:31)	JEQ	261
	0x0069 00105 (r.go:36)	CMPQ	AX, $2
	0x006d 00109 (r.go:36)	JEQ	213
	0x006f 00111 (r.go:41)	CMPQ	AX, $3
	0x0073 00115 (r.go:41)	JNE	165
	0x0075 00117 (r.go:43)	MOVQ	"".o(SP), AX
	0x0079 00121 (r.go:43)	MOVQ	AX, "".~r1+80(SP)
	0x007e 00126 (r.go:43)	LEAQ	"".~r1+88(SP), DI
	0x0083 00131 (r.go:43)	LEAQ	"".o+8(SP), SI
	0x0088 00136 (r.go:43)	DUFFCOPY	$854
	0x009b 00155 (r.go:43)	MOVQ	56(SP), BP
	0x00a0 00160 (r.go:43)	ADDQ	$64, SP
	0x00a4 00164 (r.go:43)	RET
	0x00a5 00165 (r.go:47)	MOVQ	"".o(SP), AX
	0x00a9 00169 (r.go:47)	MOVQ	AX, "".~r1+80(SP)
	0x00ae 00174 (r.go:47)	LEAQ	"".~r1+88(SP), DI
	0x00b3 00179 (r.go:47)	LEAQ	"".o+8(SP), SI
	0x00b8 00184 (r.go:47)	DUFFCOPY	$854
	0x00cb 00203 (r.go:47)	MOVQ	56(SP), BP
	0x00d0 00208 (r.go:47)	ADDQ	$64, SP
	0x00d4 00212 (r.go:47)	RET
	0x00d5 00213 (r.go:38)	MOVQ	"".o(SP), AX
	0x00d9 00217 (r.go:38)	MOVQ	AX, "".~r1+80(SP)
	0x00de 00222 (r.go:38)	LEAQ	"".~r1+88(SP), DI
	0x00e3 00227 (r.go:38)	LEAQ	"".o+8(SP), SI
	0x00e8 00232 (r.go:38)	DUFFCOPY	$854
	0x00fb 00251 (r.go:38)	MOVQ	56(SP), BP
	0x0100 00256 (r.go:38)	ADDQ	$64, SP
	0x0104 00260 (r.go:38)	RET
	0x0105 00261 (r.go:33)	MOVQ	"".o(SP), AX
	0x0109 00265 (r.go:33)	MOVQ	AX, "".~r1+80(SP)
	0x010e 00270 (r.go:33)	LEAQ	"".~r1+88(SP), DI
	0x0113 00275 (r.go:33)	LEAQ	"".o+8(SP), SI
	0x0118 00280 (r.go:33)	DUFFCOPY	$854
	0x012b 00299 (r.go:33)	MOVQ	56(SP), BP
	0x0130 00304 (r.go:33)	ADDQ	$64, SP
	0x0134 00308 (r.go:33)	RET

So it makes a zero-valued local variable o, and copied to the return value in each branch. It could lift the copy up out of the branches. Better if it could use the return value slot for o. I guess this issue focuses on this problem, not the statictmp ones.

@4ad
Copy link
Member

4ad commented Jun 30, 2017

@robpike I use named return values, even though I lack permission.

@josharian
Copy link
Contributor

Some minor observations.

  • If the returned types are small (in particular, small enough to be registerized), the generated code appears to already be identical.
  • If the returned types are large, and naked returns are only used in unlikely cases, then it is worse performance-wise to use a named return value, since it will add unnecessary zeroing in the common case.
  • Zeroing the return early (as triggered by named returns) could be worse for performance in some cases due to the cache impact of the zeroing.

So while the compiler doesn't do well on the example given, and should be improved, it's not obvious to me that named and unnamed returns should always generate the same code.

And in fact, I'd think using unnamed returns should always be a pareto performance improvement on named returns, since using named returns ties the compiler's hands in a way that unnamed returns do not. But, with @robpike, the primary consideration should be a judgment call about the readability of the code anyway. Except where critical to performance, there is significant danger of overfitting the compiler.

It could realize that they are the same value so just generate one statictmp and reuse it. Better, it could realize it is zero value therefore no need to generate statictmp.

Agreed on both counts. I'm surprised that there is a statictmp for this; I thought sinit was pretty good about recognizing zero values.

copied to the return value in each branch. It could lift the copy up out of the branches.

Yes, although why should it copy instead of just zeroing?

Better if it could use the return value slot for o.

Seems to me like that is the crux of this issue, although I'm not sure how feasible that is to implement.

@fwessels
Copy link

fwessels commented Jul 7, 2017

Just a few remarks:

Zeroing the return early (as triggered by named returns) could be worse for performance in some cases due to the cache impact of the zeroing.

Looking at NoNamedReturnParams it is calling DUFFZERO

"".NoNamedReturnParams t=1 size=243 args=0x40 locals=0x0
0x0000 	TEXT	"".NoNamedReturnParams(SB), $0-64
0x0000 	MOVQ	$0, "".~r1+16(FP)
0x0009 	LEAQ	"".~r1+24(FP), DI
0x000e 	XORPS	X0, X0
0x0011 	ADDQ	$-16, DI
0x0015 	DUFFZERO	$288

just like in NamedReturnParams, so it seems to be zeroing out the return value either way.:

"".NamedReturnParams t=1 size=67 args=0x40 locals=0x0
0x0000 	TEXT	"".NamedReturnParams(SB), $0-64
0x0000 	MOVQ	$0, "".oi+16(FP)
0x0009 	LEAQ	"".oi+24(FP), DI
0x000e 	XORPS	X0, X0
0x0011 	ADDQ	$-16, DI
0x0015 	DUFFZERO	$288

(and looking at NoNamedReturnParams2 as listed above, it is calling DUFFZERO twice:)

"".NoNamedReturnParams2 STEXT nosplit size=309 args=0x40 locals=0x40
0x000e 00014 (r.go:29)	MOVQ	$0, "".~r1+80(SP)
0x0017 00023 (r.go:29)	LEAQ	"".~r1+88(SP), DI
0x001c 00028 (r.go:29)	XORPS	X0, X0
0x001f 00031 (r.go:29)	ADDQ	$-16, DI
0x0023 00035 (r.go:29)	DUFFZERO	$288
0x0036 00054 (r.go:30)	MOVQ	$0, "".o(SP)
0x003e 00062 (r.go:30)	LEAQ	"".o+8(SP), DI
0x0043 00067 (r.go:30)	ADDQ	$-16, DI
0x0047 00071 (r.go:30)	DUFFZERO	$288

So while the compiler doesn't do well on the example given, and should be improved, it's not obvious to me that named and unnamed returns should always generate the same code.

Agreed, when using a named return value its content can be modified, so it is not possible to rely on the fact that it is not modified, so for a return objectInfo{} you may need to copy/zero anyway.

... since using named returns ties the compiler's hands in a way that unnamed returns do not.

Would think that this actually makes it easier, especially in combination with naked returns where, by definition, the (named) return values are already setup, so you can just RET out.

@dr2chase
Copy link
Contributor

Not likely for 1.11, moved to 1.12.

@odeke-em
Copy link
Member

odeke-em commented Feb 4, 2019

Nor is it likely for Go1.12, I am moving this to Go1.13 /cc @randall77

@odeke-em odeke-em modified the milestones: Go1.12, Go1.13 Feb 4, 2019
@andybons andybons modified the milestones: Go1.13, Go1.14 Jul 8, 2019
@rsc rsc removed this from the Go1.14 milestone Oct 9, 2019
@bradfitz
Copy link
Contributor Author

I just hit this again when I sent https://go-review.googlesource.com/c/go/+/322329 to clean up some inconsistent godoc and it instead increased the size of stacks.

@mdempsky
Copy link
Contributor

@bradfitz I think CL 322329 is a slightly different issue? I understand this issue to be that if a function has a named result parameter, but the name isn't used, then it should compile identically to if it was anonymous or blank instead. However, that CL not only removed the result parameter names, but also added new local variables to replace them.

In that particular case, we could optimize the code to directly allocate the ret variable in the result parameter slot on the stack. But in general with panic/defer, there can be differences. So I think the details of addressing this would be more complex.

@josharian
Copy link
Contributor

I wonder whether we could use space reserved for unnamed return parameters as scratch space (assuming compatible types). That'd probably fix this mismatch and be a bonus optimization in other cases. cc @dr2chase

@mdempsky
Copy link
Contributor

@josharian I think as long as panic and defer are handled correctly, that seems reasonable. I'd suggest as a starting point to not attempt that optimization in any function that contains a defer statement.

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/361674 mentions this issue: net/netip: optimize As4 and As16

gopherbot pushed a commit that referenced this issue Nov 5, 2021
name    old time/op  new time/op  delta
As16-8  2.88ns ± 3%  2.16ns ± 3%  -25.19%  (p=0.000 n=15+15)

Fixes #49379
Updates #20859

Change-Id: If4cf58d19ed0e2ac0f179da5c132ed37061e4cb7
Reviewed-on: https://go-review.googlesource.com/c/go/+/361674
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 13, 2022
@mknyszek mknyszek moved this to Triage Backlog in Go Compiler / Runtime Jul 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsFix The path to resolution is known, but the work has not been done.
Projects
Status: Triage Backlog
Development

No branches or pull requests