proposal: slices: ChunkFunc to determine chunks of a slice using a function #69123

@DeedleFake

Proposal Details

I propose adding a new function to the slices package that would return an iterator yielding variable-length subslices of a given slice based on a provided function. There are two possible implementations of this.

Version 1

The first is something I have personally used several times before, and it is similar to a function in the Elixir standard library. It looks like

func ChunkFunc[T any, C comparable, S ~[]T](s S, chunker func(T) C) iter.Seq[[]T]

This function returns an iterator that yields subslices of s composed of consecutive runs of elements for which chunker returns the same value. In other words,

ChunkFunc([]int{-1, -2, -3, 1, 2, -1, 3, 2, 1}, func(v int) bool { return v > 0 })

would yield []int{-1, -2, -3}, []int{1, 2}, []int{-1}, and then []int{3, 2, 1}. This is useful for a number of different things. For example, let's say that you have a slice of lines of output from something, some of which are to stdout and some to stderr. If you want to output those with a header to indicate which is which, being able to group the consecutive runs of lines that were to each is very useful, and a function like this can do it quite efficiently.

groups := slices.ChunkFunc(outputs, func(output Output) string { return output.Destination })
for group := range groups {
  fmt.Printf("%v:\n", group[0].Destination) // Like Chunk(), never yields an empty slice.
  for _, output := range group {
    fmt.Println(output.Text)
  }
}

Version 2

The other possible implementation is to simply make the function always return a bool, and then define that each time it returns true is the start of a new chunk.

While the bool version is not something I've personally had a use for, I'm not strongly attached to either design, because I'm fairly sure that each can be implemented in terms of the other. I think it's probably easier to implement the comparable one using the bool version, though, as it would be something like the following untested function, where ChunkFunc is the bool version:

func ChunkBy[T any, C comparable, S ~[]T](s S, chunker func(T) C) iter.Seq[[]T] {
  return func(yield func([]T) bool) {
    first := true
    var prev C
    chunks := ChunkFunc(s, func(v T) bool {
      check := chunker(v)
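      if first {
        // The first element always starts a chunk; clear the flag so that
        // later elements are compared against prev.
        first = false
        prev = check
        return true
      }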
      start := check != prev
      prev = check
      return start
    })
    chunks(yield)
  }
}

Version 3

Another alternative is to simply provide both as, say, ChunkFunc() for the bool version and ChunkBy() for the comparable one.
