Something like Future.sequentialRun/chunkRun #71

@exoego

In parallel processing, I think it is a relatively common use case to limit the number of tasks running concurrently.

For example, suppose we have 10,000 files and want to upload them somewhere 5 at a time, so that uploading all 10,000 at once does not bring the service down.

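The Scala standard library currently offers Future.sequence, whose name sounds suitable for this purpose, but unfortunately it is not: it only collects the results of futures that have already been started, and it does not limit how many of them run at once.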
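To illustrate why Future.sequence does not help here, the following is a minimal sketch, not part of the original proposal. The object name `SequenceDemo` and the method `allRunAtOnce` are made up for this example. It assumes a dedicated thread pool with one thread per task, and uses a latch that only opens if every task is in flight at the same moment.

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Hypothetical demo object; the names here are illustrative only.
object SequenceDemo {
  // Returns true if all n tasks were running at the same moment.
  def allRunAtOnce(n: Int): Boolean = {
    // Assumption: one thread per task, so nothing but the caller could throttle.
    val pool = Executors.newFixedThreadPool(n)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)
    val latch = new CountDownLatch(n)
    try {
      // Each Future starts as soon as it is created, before Future.sequence is called.
      val started: Seq[Future[Boolean]] = (1 to n).map { _ =>
        Future {
          latch.countDown()
          // Returns true only if the other n - 1 tasks have also started.
          latch.await(5, TimeUnit.SECONDS)
        }
      }
      // Future.sequence only gathers the results; it does not schedule or throttle.
      Await.result(Future.sequence(started), 10.seconds).forall(identity)
    } finally pool.shutdown()
  }
}
```

If `SequenceDemo.allRunAtOnce(8)` returns true, all eight tasks were running simultaneously, which is the behavior the upload example wants to avoid.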

For this use case, my colleagues and I often find ourselves defining utilities like the ones below:

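import scala.concurrent.{ExecutionContext, Future}

// Runs op on each item one at a time: the next op starts only after
// the previous Future has completed.
def sequentialRun[T, U](items: IterableOnce[T])
                       (op: T => Future[U])
                       (implicit ec: ExecutionContext): Future[Seq[U]] = {
  items.iterator
    .foldLeft(Future.successful(Vector.empty[U])) { (f, item) =>
      f.flatMap { acc =>
        op(item).map(acc :+ _)
      }
    }
}

// Splits items into chunks of chunkSize. Items within a chunk run
// concurrently; the chunks themselves run one after another.
def chunkRun[T, U](chunkSize: Int, items: Seq[T])
                  (op: T => Future[U])
                  (implicit ec: ExecutionContext): Future[Seq[U]] = {
  sequentialRun(items.grouped(chunkSize)) { chunk =>
    Future.traverse(chunk)(op)
  }.map(_.flatten)
}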


// use-site
chunkRun(5, files) { file =>
  asynchronouslyUploadToSomewhere(file)
}
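As a rough check that the chunked approach really bounds concurrency, here is a self-contained sketch that repeats the two helpers and counts how many tasks are in flight at once. `ChunkRunDemo`, `measure`, and `fakeUpload` are hypothetical names introduced for this illustration, and the 16-thread pool and 20 ms sleep are arbitrary assumptions standing in for a real upload.

```scala
import java.util.concurrent.Executors
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Hypothetical demo object; only sequentialRun/chunkRun come from the proposal.
object ChunkRunDemo {
  def sequentialRun[T, U](items: IterableOnce[T])(op: T => Future[U])
                         (implicit ec: ExecutionContext): Future[Seq[U]] =
    items.iterator.foldLeft(Future.successful(Vector.empty[U])) { (f, item) =>
      f.flatMap(acc => op(item).map(acc :+ _))
    }

  def chunkRun[T, U](chunkSize: Int, items: Seq[T])(op: T => Future[U])
                    (implicit ec: ExecutionContext): Future[Seq[U]] =
    sequentialRun(items.grouped(chunkSize))(chunk => Future.traverse(chunk)(op))
      .map(_.flatten)

  // Runs `total` simulated uploads and returns (results, peak concurrency observed).
  def measure(total: Int, chunkSize: Int): (Seq[Int], Int) = {
    // Assumption: the pool is larger than chunkSize, so the pool is not the limit.
    val pool = Executors.newFixedThreadPool(16)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)
    val running = new AtomicInteger(0)
    val peak = new AtomicInteger(0)

    // Stand-in for an asynchronous upload; it tracks how many calls overlap.
    def fakeUpload(id: Int): Future[Int] = Future {
      val now = running.incrementAndGet()
      peak.accumulateAndGet(now, (a, b) => math.max(a, b))
      Thread.sleep(20)
      running.decrementAndGet()
      id * 2
    }

    try {
      val results = Await.result(chunkRun(chunkSize, 1 to total)(fakeUpload), 30.seconds)
      (results, peak.get)
    } finally pool.shutdown()
  }
}
```

Calling `ChunkRunDemo.measure(12, 5)` should report a peak of at most 5, with results in input order. One trade-off of this design is that each chunk waits for its slowest task, so the number of tasks in flight can drop below chunkSize toward the end of a chunk.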

I think it would be helpful if Scala library next offered a similar feature.
