-
Notifications
You must be signed in to change notification settings - Fork 17.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
slices: add Repeat function #65238
Comments
At least some of the examples you shared in other languages are of repeating (something equivalent to) The hypothetical Go func Repeat[T any](count int, vals ...T) []T {
l := count * len(vals)
xs := make([]T, l)
for i := 0; i < l; i += len(vals) {
copy(xs[i:], vals)
}
return xs
} This new signature still supports repeating just one item, at the expense of the program having to quietly construct a single-element slice for that case. |
If we repeat slices and not elements then it should work in-place if the underlying array has enough capacity. |
I would find it quite surprising if I asked this function to repeat a sub-slice of a larger slice and it reacted by overwriting other elements outside of the sub-slice I provided. I think writing outside of the bounds of a slice into the excess capacity -- particularly if it only happens under specific circumstances like only when there is enough excess capacity -- is likely to lead to bugs where callers don't fully understand the implications. In particular, I imagine an author writing unit tests where the source slice always has no excess capacity because that's the easiest thing to write as a literal, not considering that a caller might provide a sub-slice of a larger backing array in more practical usage If it's desirable to repeat while avoiding a new allocation, then I would suggest changing the signature to pass a separate slice to write the result into, and document that it's acceptable for the target slice to overlap the source slice only if the two slices share a common prefix, so that the function doesn't have to jump through hoops to avoid overwriting its own input during its work. However, I would also tend to think that repeating into excess capacity of the backing array of the input slice would be a rare enough need that I would hand-write it that way if I needed it, rather than expecting the standard library to provide it. YMMV, of course! Possible alternate signature meeting that additional requirement: func RepeatCopy[S ~[]E, E any](dst []E, src S) The function would keep copying If If If
func Repeat[T any](count int, vals ...T) []T {
l := count * len(vals)
xs := make([]T, l)
RepeatCopy(xs, vals)
return xs
} |
My original idea is that you want to do that for anything that adds new elements else you risk quadratic time complexity when the operation is repeated on the same slice. Things would be much better if we had type system support for slices that are only up to length long. |
To add to
Python has this builtin on all sequence types (including lists):
|
Looking again at the motivating examples for other languages, I notice that the Scala one would probably translate into Go more like this: func RepeatFunc[T any](count int, makeElem func () T) []T {
xs := make([]T, count)
for i := 0; i < count; i++ {
xs[i] = makeElem()
}
return xs
} I'm not sure if this additional variant is actually useful enough to justify inclusion, but just exploring the different design tradeoffs that each of the other languages made in their similar library features. |
Another option: make it return an |
FWIW, Python uses >>> ["a"] * 5
['a', 'a', 'a', 'a', 'a'] JavaScript is weirder. You first make an Array with capacity and no items then you fill it. > new Array(5).fill("a")
< ["a", "a", "a", "a", "a"] It seems like a pretty common slice utility across languages. |
@apparentlymart: You're quite right. The "related" in "prior, related art" is bearing more load than I made clear. I'm sorry about that! Thanks for all of the illuminating comments. So far, I'd categorise things into two mirrored families of three variants.
We might think of 1c and 2c as "RepeatFunc", as suggested. I think there are two (current) weaknesses of 2a and 2b. Firstly, the iter package is crystallising, though it looks like it'll be accepted imminently. Secondly, the slices package operates on and returns slices. I'm not sure what the implications of new ideas like range-over-func and iter will be for existing packages, but I expect there'd be a general preference to keep the standard library's signatures frozen (to avoid breaking changes) and to keep each package's signatures internally "consistent" (and so precluding a mixture of slices and sequences in the slices package). I think 1b is strictly preferable to 1a. It's simple, it's predictable in its behaviour and performance (with a single natural implementation), and it directly extends 1a without contortion. I think it'd slot easily into the existing slices package. 1c and 2c are interesting. Repeatedly calling a function is a natural way to generate a sequence. As well as the Scala example, there's a loosely-related (lazy, infinite) version of this in Elixir called Stream.unfold/2 which has a kind of beauty to it. But I'm not sure exactly how 1c and 2c would fit with Go just yet; I'll confess that I've not played around with the iter (and xiter, etc.) experiments. It all seems in flux. |
1c strikes me as not any easier than just writing a loop. n := -2
s := slices.Repeat(5, func() int {
n += 2
return n
})
// vs
s := make([]int, 5)
n := 0
for i := range s {
s[i] = n
n += 2
} |
I think there's a case for 1c, but that it's marginal, with the advantages probably being small enough to not warrant extending the slices package. Go's syntax for closures is a bit noisy, but things can be visually compressed to package main
import "fmt"
func main() {
n := -2
fmt.Printf("%v\n", repeat(10, func() int { n += 2; return n }))
}
func repeat[T any](count int, fn func() T) []T {
xs := make([]T, count)
for i := 0; i < count; i++ {
xs[i] = fn()
}
return xs
} which does turn the 6 lines to 2 (but at what cost?). I think it'd be a better fit if Go had syntax like func main() {
n := -2
fmt.Printf("%v\n", repeat(10, () => n += 2))
} but Go's not such a language—and it probably shouldn't become such a language. |
I agree that slices.Repeat should repeat slices (not elements), just as strings.Repeat repeats strings. The implementation needs to check for overflow. It should panic with a message containing the values of both n and len(values), instead of unhelpfully calling make with a negative argument, or, worse, spuriously returning an empty slice when the product is a multiple of the word size. Also, there's a nice optimization you can do here which is to make fewer calls to copy by working in powers of two: |
@adonovan: Thanks for the suggestions and guidance. The log improvement's pleasant and something I'm embarrassed to have missed with the first cut. An updated implementation follows. My check for wrapping around is likely suboptimal. (It's at least not obviously wrong.) Below that, in a // Repeat returns a new slice that repeats the provided slice the given number of times.
func Repeat[E any](count int, xs ...E) []E {
if count < 0 {
panic(fmt.Sprintf("the count (%d) must be non-negative", count))
}
if count > 1 && len(xs) > math.MaxInt/count {
panic(fmt.Sprintf("wraparound: the count (%d) and the given input slice (len: %d) are mutually too large", count, len(xs)))
}
out := make([]E, count*len(xs))
n := copy(out, xs)
for n < len(out) {
copy(out[n:], out[:n])
n *= 2
}
return out
} An attempt at a test suite.package main
import (
"slices"
"testing"
"unsafe"
)
func TestRepeat(t *testing.T) {
for _, tc := range []struct {
count int
xs []int
exp []int
}{
{count: 0, xs: []int{}, exp: []int{}},
{count: 0, xs: []int{1}, exp: []int{}},
{count: 1, xs: []int{}, exp: []int{}},
{count: 1, xs: []int{0}, exp: []int{0}},
{count: 2, xs: []int{0}, exp: []int{0, 0}},
{count: 2, xs: []int{1}, exp: []int{1, 1}},
{count: 3, xs: []int{1, 2}, exp: []int{1, 2, 1, 2, 1, 2}},
} {
if out := Repeat(tc.count, tc.xs...); !slices.Equal(tc.exp, out) {
t.Errorf("Repeat(%d, %v...): expected %v, given %v", tc.count, tc.xs, tc.exp, out)
}
}
}
func TestRepeat_panics(t *testing.T) {
var x int
wordSize := 8 * unsafe.Sizeof(x)
for _, tc := range []struct {
name string
count int
xs []struct{}
shouldPanic bool
}{
{name: "negative_count", count: -1, xs: []struct{}{}, shouldPanic: true},
{name: "wraparound/1", count: 1 << (wordSize - 2), xs: make([]struct{}, 2), shouldPanic: true},
{name: "wraparound/2", count: 2, xs: make([]struct{}, 1<<(wordSize-2)), shouldPanic: true},
{name: "wraparound/3", count: 1 << (wordSize/2 - 1), xs: make([]struct{}, 1<<(wordSize/2)), shouldPanic: true},
{name: "wraparound/4", count: 1 << (wordSize / 2), xs: make([]struct{}, 1<<((wordSize/2)-1)), shouldPanic: true},
{name: "wraparound/5", count: 1 << (wordSize/2 - 1), xs: make([]struct{}, 1<<(wordSize/2-1)), shouldPanic: false},
} {
var r any
t.Run(tc.name, func(t *testing.T) {
func() {
defer func() { r = recover() }()
_ = Repeat(tc.count, tc.xs...)
}()
if panics := r != nil; panics != tc.shouldPanic {
if panics {
t.Errorf("Repeat: unexpectedly panicked")
} else {
t.Errorf("Repeat: unexpectedly failed to panic")
}
}
})
}
} |
No reason to be embarrassed. I wasn't aware of it either until @josharian suggested it while reviewing the linked Starlark code. You might want to add this comment at the overflow check: // See Hacker's Delight 2-12, detecting overflow of signed multiplication for nonnegative arguments. |
I've found a copy of it (the 2nd edition) and the relevant section and table. This book is amazing—thanks for the suggestion! I've added the comment below. I'll also update the original proposal to link to this comment. // Repeat returns a new slice that repeats the provided slice the given number of times.
func Repeat[E any](count int, xs ...E) []E {
if count < 0 {
panic(fmt.Sprintf("the count (%d) must be non-negative", count))
}
// See Section 2-13 of Hacker's Delight (2nd ed.) on detecting overflow of signed multiplication
// for non-negative arguments, in particular Table 2-2 (p. 32).
if count > 1 && len(xs) > math.MaxInt/count {
panic(fmt.Sprintf("overflow: the count (%d) and the given input slice (len: %d) are mutually too large", count, len(xs)))
}
out := make([]E, count*len(xs))
n := copy(out, xs)
for n < len(out) {
copy(out[n:], out[:n])
n *= 2
}
return out
} An attempt at a test suite.package main
import (
"slices"
"testing"
"unsafe"
)
func TestRepeat(t *testing.T) {
for _, tc := range []struct {
count int
xs []int
exp []int
}{
{count: 0, xs: []int{}, exp: []int{}},
{count: 0, xs: []int{1}, exp: []int{}},
{count: 1, xs: []int{}, exp: []int{}},
{count: 1, xs: []int{0}, exp: []int{0}},
{count: 2, xs: []int{0}, exp: []int{0, 0}},
{count: 2, xs: []int{1}, exp: []int{1, 1}},
{count: 3, xs: []int{1, 2}, exp: []int{1, 2, 1, 2, 1, 2}},
} {
if out := Repeat(tc.count, tc.xs...); !slices.Equal(tc.exp, out) {
t.Errorf("Repeat(%d, %v...): expected %v, given %v", tc.count, tc.xs, tc.exp, out)
}
}
}
func TestRepeat_panics(t *testing.T) {
var x int
wordSize := 8 * unsafe.Sizeof(x)
for _, tc := range []struct {
name string
count int
xs []struct{}
shouldPanic bool
}{
{name: "negative_count", count: -1, xs: []struct{}{}, shouldPanic: true},
{name: "wraparound/1", count: 1 << (wordSize - 2), xs: make([]struct{}, 2), shouldPanic: true},
{name: "wraparound/2", count: 2, xs: make([]struct{}, 1<<(wordSize-2)), shouldPanic: true},
{name: "wraparound/3", count: 1 << (wordSize/2 - 1), xs: make([]struct{}, 1<<(wordSize/2)), shouldPanic: true},
{name: "wraparound/4", count: 1 << (wordSize / 2), xs: make([]struct{}, 1<<((wordSize/2)-1)), shouldPanic: true},
{name: "wraparound/5", count: 1 << (wordSize/2 - 1), xs: make([]struct{}, 1<<(wordSize/2-1)), shouldPanic: false},
} {
var r any
t.Run(tc.name, func(t *testing.T) {
func() {
defer func() { r = recover() }()
_ = Repeat(tc.count, tc.xs...)
}()
if panics := r != nil; panics != tc.shouldPanic {
if panics {
t.Errorf("Repeat: unexpectedly panicked")
} else {
t.Errorf("Repeat: unexpectedly failed to panic")
}
}
})
}
} |
You can also use |
@randall77: Thanks for directing me to math/bits and the sharpened approach. My check for wrapping around was indeed "likely suboptimal". (It's also shown me that math/bits.UintSize exists, so the tests no longer import unsafe.) Things are now as follows. package main
import (
"fmt"
"math"
"math/bits"
)
// Repeat returns a new slice that repeats the provided slice the given number of times.
func Repeat[E any](count int, xs ...E) []E {
if count < 0 {
panic(fmt.Sprintf("the count (%d) must be non-negative", count))
}
if hi, lo := bits.Mul(uint(count), uint(len(xs))); hi > 0 || lo > math.MaxInt {
panic(fmt.Sprintf("overflow: the count (%d) and the given input slice (len: %d) are mutually too large", count, len(xs)))
}
out := make([]E, count*len(xs))
n := copy(out, xs)
for n < len(out) {
copy(out[n:], out[:n])
n *= 2
}
return out
} An attempt at a test suite.package main
import (
"math/bits"
"slices"
"testing"
)
func TestRepeat(t *testing.T) {
for _, tc := range []struct {
count int
xs []int
exp []int
}{
{count: 0, xs: []int{}, exp: []int{}},
{count: 0, xs: []int{1}, exp: []int{}},
{count: 1, xs: []int{}, exp: []int{}},
{count: 1, xs: []int{0}, exp: []int{0}},
{count: 2, xs: []int{0}, exp: []int{0, 0}},
{count: 2, xs: []int{1}, exp: []int{1, 1}},
{count: 3, xs: []int{1, 2}, exp: []int{1, 2, 1, 2, 1, 2}},
} {
if out := Repeat(tc.count, tc.xs...); !slices.Equal(tc.exp, out) {
t.Errorf("Repeat(%d, %v...): expected %v, given %v", tc.count, tc.xs, tc.exp, out)
}
}
}
func TestRepeat_panics(t *testing.T) {
for _, tc := range []struct {
name string
count int
xs []struct{}
shouldPanic bool
}{
{name: "negative_count", count: -1, xs: []struct{}{}, shouldPanic: true},
{name: "overflow/1", count: 1 << (bits.UintSize - 2), xs: make([]struct{}, 2), shouldPanic: true},
{name: "overflow/2", count: 2, xs: make([]struct{}, 1<<(bits.UintSize-2)), shouldPanic: true},
{name: "overflow/3", count: 1 << (bits.UintSize/2 - 1), xs: make([]struct{}, 1<<(bits.UintSize/2)), shouldPanic: true},
{name: "overflow/4", count: 1 << (bits.UintSize / 2), xs: make([]struct{}, 1<<((bits.UintSize/2)-1)), shouldPanic: true},
{name: "overflow/5", count: 1 << (bits.UintSize/2 - 1), xs: make([]struct{}, 1<<(bits.UintSize/2-1)), shouldPanic: false},
} {
var r any
t.Run(tc.name, func(t *testing.T) {
func() {
defer func() { r = recover() }()
_ = Repeat(tc.count, tc.xs...)
}()
if panics := r != nil; panics != tc.shouldPanic {
if panics {
t.Errorf("Repeat: unexpectedly panicked")
} else {
t.Errorf("Repeat: unexpectedly failed to panic")
}
}
})
}
} |
This proposal has been added to the active column of the proposals project |
The rationale for adding this is that it is already in strings and bytes, so having it in slices fits with those. Let's not worry about sequences here; the ones in strings and bytes do not take sequences. Also typically Repeat is of a short amount of data, so having a sequence form is not too helpful. |
Also, for consistency with strings and bytes it seems like the signature should be func Repeat[E any](x []E, count int) []E This does mean we can't finesse the question of one element vs a slice argument, but a slice argument is consistent with bytes and strings. |
That's a good point—I hadn't given enough thought to consistency across Go's packages. It'd mean that (what I've seen as) the common case would be a bit unfortunate-looking, namely slices.Repeat([]T{t}, count) for some t in T, but I think consistency trumps a touch of beauty. |
Have all remaining concerns about this proposal been addressed? The proposal is to add to package slices:
|
// Repeat returns a new slice that repeats the provided slice the given number of times.
-// The result has length and capacity len(x) * count.
+// The result has length len(x) * count.
// The result is never nil.
// Repeat panics if count is negative or if the result of (len(x) * count)
// overflows.
func Repeat[E any](x []E, count int) []E I don't think capacity should be specified, if the underlying implementation can get some free capacity (like by rounding up by classes) it should be allowed to do so. |
If you have a global like
then it matters a lot whether the capacity is equal to the len. Multiple goroutines can safely append to x simultaneously to produce different variants. If x has extra capacity, that stops being safe. We've been prioritizing safety over the potential efficiency of leaving a tiny bit of extra capacity in some cases depending on the memory allocator. That only leads to portability problems. bytes.Repeat also returns a trimmed slice. |
Based on the discussion above, this proposal seems like a likely accept. The proposal is to add to package slices:
|
No change in consensus, so accepted. 🎉 The proposal is to add to package slices:
|
Change https://go.dev/cl/571895 mentions this issue: |
Sorry for not noticing this earlier, but every other function in package slices that takes a slice as an argument takes the type of the slice as a type argument. For consistency the signature here should be func Repeat[S ~[]E, E any](x S, count int) S |
Proposal Details
Note
The current version of this proposal is here.
The signature in that version of the proposal is
func Repeat[T any](count int, xs ...T) []T
.I propose a small, trivial addition to the slices package.
There's prior, related art in other languages with list-like types, including the following. (The Python and JavaScript examples were provided by @Jorropo and @earthboundkid; see below.) Not all of the below are identical in their behaviour, but they're thematically similar.
*
for sequences.I've personally found it a useful helper to have—particularly when combined with Concat, which looks like it'll make its way into Go 1.22.
I appreciate that slices oughtn't become a dumping ground for everyone's ideas about what's useful, but this feels like a natural addition to me.
I'm happy to contribute a pull request with tests if this is considered a good idea.
The text was updated successfully, but these errors were encountered: