# GroupIntoBatches

Batches the values of each key in a keyed `PCollection` into lists of at most the desired batch size.

## Setup

To run a code cell, you can click the **Run cell** button at the top left of the cell,
or select it and press **`Shift+Enter`**.
Try modifying a code cell and re-running it to see what happens.

First, let's install the `apache-beam` module.

In [None]:
!pip install --quiet -U apache-beam

## Examples

In the following example, we create a pipeline with a `PCollection` of produce by season.

We use `GroupIntoBatches` to group the values of each key into batches of up to three elements. Each output is a `(key, list)` pair, and the last batch for a key can hold fewer elements than the batch size.

In [1]:
import apache_beam as beam

with beam.Pipeline() as pipeline:
  batches_with_keys = (
      pipeline
      | 'Create produce' >> beam.Create([
          ('spring', '🍓'),
          ('spring', '🥕'),
          ('spring', '🍆'),
          ('spring', '🍅'),
          ('summer', '🥕'),
          ('summer', '🍅'),
          ('summer', '🌽'),
          ('fall', '🥕'),
          ('fall', '🍅'),
          ('winter', '🍆'),
      ])
      | 'Group into batches' >> beam.GroupIntoBatches(3)
      | beam.Map(print))



('spring', ['🍓', '🥕', '🍆'])
('summer', ['🥕', '🍅', '🌽'])
('spring', ['🍅'])
('fall', ['🥕', '🍅'])
('winter', ['🍆'])
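
To make the batching rule concrete, here is a minimal plain-Python sketch of per-key buffering. It is an illustration only, not how Beam implements the transform: the helper name `group_into_batches` is hypothetical, it runs on a single in-memory list, and it ignores windowing and parallel execution. A real pipeline does not guarantee the order in which batches are emitted.

```python
from collections import defaultdict


def group_into_batches(pairs, batch_size):
    """Yield (key, values) pairs holding at most batch_size values each.

    Hypothetical helper for illustration; not part of apache_beam.
    """
    buffers = defaultdict(list)
    for key, value in pairs:
        buffers[key].append(value)
        # Emit a batch as soon as a key's buffer is full.
        if len(buffers[key]) == batch_size:
            yield key, buffers.pop(key)
    # Flush the leftover, partially filled batches.
    for key, values in buffers.items():
        if values:
            yield key, values


produce = [
    ('spring', '🍓'),
    ('spring', '🥕'),
    ('spring', '🍆'),
    ('spring', '🍅'),
    ('summer', '🥕'),
    ('summer', '🍅'),
    ('summer', '🌽'),
    ('fall', '🥕'),
    ('fall', '🍅'),
    ('winter', '🍆'),
]

batches = list(group_into_batches(produce, 3))
for batch in batches:
    print(batch)
```

Full batches are emitted as soon as a key collects three values, and the remaining partial batches are flushed at the end, which is why `'spring'` appears twice.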


## Related transforms

For unkeyed data and dynamic batch sizes, consider using
[BatchElements](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.util.html#apache_beam.transforms.util.BatchElements).