<a href="https://colab.research.google.com/github/apache/beam/blob/master/examples/notebooks/documentation/transforms/python/elementwise/map-py.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open in Colab"/></a>

<table align="left"><td><a target="_blank" href="https://beam.apache.org/documentation/transforms/python/aggregation/cogroupbykey"><img src="https://beam.apache.org/images/logos/full-color/name-bottom/beam-logo-full-color-name-bottom-100.png" width="32" height="32" />View the docs</a></td></table>

In [None]:
#@title Licensed under the Apache License, Version 2.0 (the "License")
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

# CoGroupByKey

<table align="left" style="margin-right:1em">
  <td>
    <a class="button" target="_blank" href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.util.html#apache_beam.transforms.util.CoGroupByKey"><img src="https://beam.apache.org/images/logos/sdks/python.png" width="32px" height="32px" alt="Pydoc"/> Pydoc</a>
  </td>
</table>

<br/><br/><br/>

`CoGroupByKey` aggregates all input elements by their key and lets downstream
processing consume all values associated with each key. While `GroupByKey`
performs this operation over a single input collection, and thus a single type
of input values, `CoGroupByKey` operates over multiple input collections. The
output for each key therefore holds the values associated with that key in each
input collection, kept separate per input: a tuple for positional inputs, or a
dictionary when the inputs are passed by name.
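
To make the grouping described above concrete, here is a minimal plain-Python sketch of the same idea. It is not Beam's implementation: `cogroup_by_key` is a hypothetical helper written for illustration, it works on in-memory lists rather than `PCollection`s, and the sample data is made up.

```python
from collections import defaultdict


def cogroup_by_key(tagged_inputs):
    """Hypothetical helper: group several lists of (key, value) pairs by key.

    `tagged_inputs` maps an input name to a list of (key, value) pairs.
    Every key seen in any input gets one list of values per input name.
    """
    tags = list(tagged_inputs)
    # Each new key starts with an empty list for every input name.
    grouped = defaultdict(lambda: {tag: [] for tag in tags})
    for tag, pairs in tagged_inputs.items():
        for key, value in pairs:
            grouped[key][tag].append(value)
    return dict(grouped)


result = cogroup_by_key({
    'icons': [('Apple', '🍎'), ('Tomato', '🍅')],
    'durations': [('Apple', 'perennial'), ('Carrot', 'biennial')],
})

for key, groups in result.items():
    print(key, groups)
```

A key that is missing from one input, such as `Carrot` in `icons` here, still appears in the result with an empty list for that input.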

## Setup

To run a code cell, you can click the **Run cell** button at the top left of the cell,
or select it and press **`Shift+Enter`**.
Try modifying a code cell and re-running it to see what happens.

> To learn more about Colab, see
> [Welcome to Colaboratory!](https://colab.sandbox.google.com/notebooks/welcome.ipynb).

First, let's install the `apache-beam` module.

In [None]:
!pip install --quiet -U apache-beam

## Examples

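In the following example, we create a pipeline with two `PCollection`s of key-value pairs. Then, we pass both to `CoGroupByKey` in a dictionary, which groups them by key into a single `PCollection`. Each output element pairs a key with the values found for it under `icons` and under `durations`; a key that is missing from one input gets an empty list for that input.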

In [None]:
import apache_beam as beam

with beam.Pipeline() as pipeline:
    icon_pairs = pipeline | 'Create icons' >> beam.Create([
        ('Apple', '🍎'),
        ('Apple', '🍏'),
        ('Eggplant', '🍆'),
        ('Tomato', '🍅'),
    ])

    duration_pairs = pipeline | 'Create durations' >> beam.Create([
        ('Apple', 'perennial'),
        ('Carrot', 'biennial'),
        ('Tomato', 'perennial'),
        ('Tomato', 'annual'),
    ])

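    # Name each input; CoGroupByKey emits one element per key, holding
    # the values from every named input.
    plants = (
        {'icons': icon_pairs, 'durations': duration_pairs}
        | 'Merge' >> beam.CoGroupByKey()
        | beam.Map(print))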

<table align="left" style="margin-right:1em">
  <td>
    <a class="button" target="_blank" href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/aggregation/cogroupbykey.py"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" width="32px" height="32px" alt="View source code"/> View source code</a>
  </td>
</table>

<br/><br/><br/>
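
As a sketch of how downstream processing might consume the grouped values, the function below formats one output element of the example. It assumes each element has the shape `(key, {'icons': [...], 'durations': [...]})`. `describe_plant` and its placeholder strings are hypothetical names chosen for illustration; the sample elements are written by hand rather than produced by a pipeline run.

```python
def describe_plant(element):
    """Hypothetical formatter for one (key, grouped values) element."""
    name, groups = element
    # Fall back to placeholders when a key is missing from one input.
    icons = ''.join(groups['icons']) or '?'
    durations = ', '.join(sorted(groups['durations'])) or 'unknown'
    return f'{name} {icons}: {durations}'


# Hand-written elements in the assumed output shape.
apple = ('Apple', {'icons': ['🍎', '🍏'], 'durations': ['perennial']})
carrot = ('Carrot', {'icons': [], 'durations': ['biennial']})

apple_line = describe_plant(apple)
carrot_line = describe_plant(carrot)
print(apple_line)
print(carrot_line)
```

In the pipeline, a function like this could be applied with `beam.Map(describe_plant)` before the final `beam.Map(print)`.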

## Related transforms

* [CombineGlobally](/documentation/transforms/python/aggregation/combineglobally) combines all elements of a collection.
* [GroupByKey](/documentation/transforms/python/aggregation/groupbykey) groups the values of a single input collection by key.
