New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: recognize append loop #14325

Open
bradfitz opened this Issue Feb 14, 2016 · 5 comments

Comments

Projects
None yet
6 participants
@bradfitz
Member

bradfitz commented Feb 14, 2016

Loops with appends are common in Go. And often the loops don't terminate or have any conditionals.

cmd/compile could recognize something of the form:

        for ... := range x {
            s = append(s, v)
        }

And compile it as:

        s = runtime_ensurecapacity(s, len(x))
        for ... := range x {
            s = append(s, v)
        }

That would remove the need for the (often premature) optimization of:

      s := make([]T, 0, len(x))

And let people just write instead:

    var s []T
    for k := range m {
          s = append(s, k)
    }

This would also help eliminating garbage when the number of iterations is very large. (e.g. @campoy's https://news.ycombinator.com/item?id=11098802)

/cc @randall77 @dr2chase @rsc

@bradfitz bradfitz added this to the Unplanned milestone Feb 14, 2016

@cespare

This comment has been minimized.

Contributor

cespare commented Feb 14, 2016

In the specific case of

s := make([]T, 0, len(x)) // or just var s []T, to avoid the possibly-premature optimization
for _, v := range x {
  s = append(s, v)
}

IMO it's better style anyway to write

s := make([]T, len(x))
for i, v := range x {
  s[i] = v
}

because it's about the same complexity (make is needed, but append is not) and because it's more obviously performant without the need for compiler optimizations. I think that this style preference is borne out by stdlib code as well.

But I suppose your proposal still helps in cases where s was previously initialized with some other elements.

By the way, I find your last code snippet confusing (why are the variable names different from the preceding snippets, and why are we appending slice indexes?) Should it instead read

var s []T
for _, v := range x {
  s = append(s, v)
}
@mrkaspa

This comment has been minimized.

mrkaspa commented Mar 17, 2016

@cespare I agree, this is a most logical way to proceed if you define a fixed size array

s := make([]T, len(x))
for i, v := range x {
  s[i] = v
}
@mdempsky

This comment has been minimized.

Member

mdempsky commented Mar 18, 2016

There are some subtleties to when this is a safe transformation. E.g., in

var a [2]int
p := a[:0]

for _, v := range [...]int{1,2,3,4} {
    p = append(p, v)
}

the first two calls to append(p, x) need to still have the side effect of assigning a[0] = 1 and a[1] = 2, because the Go spec guarantees append will reuse the underlying array if there's available capacity.

Instead of up front calling a hypothetical runtime.ensurecapacity function, we could instead be smarter about calculating the desired length argument for runtime.growslice. Currently, for append(p, v), we just pass len(p)+1 each time.

However, the Go spec makes no guarantees about how large the newly allocated slice will be. That gives us freedom to try to predict the target size better and let that influence what size slice to allocate.

@rsc

This comment has been minimized.

Contributor

rsc commented Mar 18, 2016

I am skeptical. What's missing from this report is any kind of motivation from performance of a real program. Even Francesc's post points out that this kind of microbenchmark is likely misleading. Any real loop is going to have possible calculations that can cause it to panic, aborting the loop early. You don't want to be left with an enormous piece of unnecessary garbage in that case. And to the extent that you might underallocate during an append loop, you shoot yourself in the foot by breaking the guarantees of append, for example when that loop is enclosed in another loop. I suspect we should not change anything here, certainly not until there are real programs to worry about instead of toys.

@tilinna

This comment has been minimized.

tilinna commented Aug 16, 2016

Whenever I'm coding in languages like Python or Javascript, I'm much less concerned about memory allocation. With Go, I somehow switch into C-mode and see mallocs everywhere. And because I know how the append() works, it's really hard to not add that make() before every for-range -loop.

I'm sure I'm not the only one having this struggle so I hope the compiler is fixed to do it automatically. Also the lint tools should be aligned with it.

A recent example of litter here: 77e68ea

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment