Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

groupedWithin emits empty groups #1404

Closed
Sebruck opened this issue Jan 28, 2019 · 4 comments
Closed

groupedWithin emits empty groups #1404

Sebruck opened this issue Jan 28, 2019 · 4 comments
Labels

Comments

@Sebruck
Copy link
Contributor

Sebruck commented Jan 28, 2019

Hi

I use fs2 version 1.0.2 and the groupedWithin function seems not to behave as documented.

/**
    * Divide this streams into groups of elements received within a time window,
    * or limited by the number of the elements, whichever happens first.
    * Empty groups, which can occur if no elements can be pulled from upstream
    * in a given time window, **will not be emitted.**
    *
    * Note: a time window starts each time downstream pulls.
    */
  def groupWithin[F2[x] >: F[x]](n: Int, d: FiniteDuration)(

The documentation says that empty groups will not be emitted. But when numberOfElements % n == 0 then the last group is still empty.

Reproducer:

import cats.effect.{ContextShift, IO, Timer}
import fs2.Stream

import scala.concurrent.duration._

object Fs2Test extends App {

  implicit val cs: ContextShift[IO] = IO.contextShift(concurrent.ExecutionContext.global)
  implicit val timer: Timer[IO]     = IO.timer(concurrent.ExecutionContext.global)

  val numberStrings = (1 to 30).map(_.toString).toList
  Stream
    .emits(numberStrings)
    .covary[IO]
    .groupWithin(30, 1.milli)
    .map(chunk => println(chunk.size))
    .compile
    .drain
    .unsafeRunSync()

  // Output is:
  // 30
  // 0
}

Is this a bug or is the documentation miss-leading?

@SystemFw SystemFw added the bug label Jan 28, 2019
@Sebruck
Copy link
Contributor Author

Sebruck commented Jan 28, 2019

I am currently looking into this.

@SystemFw
Copy link
Collaborator

Nice, this specific issue should be in the algorithmic (non concurrent) part of the code so hopefully not too bad

@yannick-cw
Copy link

One thing I was wondering, when the groupedWithin always emits nonempty batches, why not make it something like a NonEmptyList?

@SystemFw
Copy link
Collaborator

Mostly because Chunk is optimised and very often you want to keep working with it. NonEmptyList forces a very specific access pattern. So, it's a tradeoff :) (we don't have NonEmptyChunk anymore)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants