# Apache Beam - Partition

`Partition` takes an input `PCollection` and splits it into a fixed number of output `PCollection`s. A partitioning function is applied to each element and returns the index of the output `PCollection` that the element is sent to.


* [JavaDoc: Class Partition](https://beam.apache.org/releases/javadoc/2.42.0/org/apache/beam/sdk/transforms/Partition.html)
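
To make the routing idea concrete before involving Beam, here is a minimal plain-Java sketch that applies a selection function to an in-memory list. It is only an illustration of the concept and not how Beam implements `Partition`. The names `PartitionSketch`, `SelectionFn`, and `partition` are hypothetical; `SelectionFn` merely mirrors the shape of Beam's `PartitionFn` (an element and the number of partitions in, a partition index out).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PartitionSketch {

  // Hypothetical stand-in with the same shape as Beam's PartitionFn.
  interface SelectionFn<T> {
    int partitionFor(T elem, int numPartitions);
  }

  // Routes every element of the input into exactly one of numPartitions lists.
  static <T> List<List<T>> partition(List<T> input, int numPartitions, SelectionFn<T> fn) {
    List<List<T>> outputs = new ArrayList<>();
    for (int i = 0; i < numPartitions; i++) {
      outputs.add(new ArrayList<>());
    }
    for (T elem : input) {
      int index = fn.partitionFor(elem, numPartitions);
      // The selection function must return an index in [0, numPartitions).
      if (index < 0 || index >= numPartitions) {
        throw new IndexOutOfBoundsException("Partition index out of range: " + index);
      }
      outputs.get(index).add(elem);
    }
    return outputs;
  }

  public static void main(String[] args) {
    List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9);
    // Even numbers go to partition 0, odd numbers to partition 1.
    List<List<Integer>> parts = partition(numbers, 2, (elem, n) -> elem % 2 == 0 ? 0 : 1);
    System.out.println("even: " + parts.get(0));
    System.out.println("odd:  " + parts.get(1));
  }
}
```

Unlike this in-memory list version, a `PCollection` is unordered, so the elements of a Beam partition may be printed in any order.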


First, we define the dependencies that we wish to load from the Maven repositories.

In [1]:
%%loadFromPOM

<dependency>
  <groupId>org.apache.beam</groupId>
  <artifactId>beam-sdks-java-core</artifactId>
  <version>2.40.0</version>
</dependency>

<dependency>
  <groupId>org.apache.beam</groupId>
  <artifactId>beam-runners-direct-java</artifactId>
  <version>2.40.0</version>
  <scope>runtime</scope>
</dependency>

<dependency>
  <groupId>org.slf4j</groupId>
  <artifactId>slf4j-api</artifactId>
  <version>2.0.6</version>
</dependency>

Next, we define the imports required for execution and create the default pipeline options.

In [2]:
import java.util.Arrays;
import java.util.List;

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.Default;
import org.apache.beam.sdk.options.Description;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.StreamingOptions;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.GroupByKey;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.coders.KvCoder;
import org.apache.beam.sdk.coders.StringUtf8Coder;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PDone;
import org.apache.beam.sdk.values.TupleTag;
import org.apache.beam.sdk.transforms.join.CoGbkResult;
import org.apache.beam.sdk.transforms.join.KeyedPCollectionTuple;
import org.apache.beam.sdk.transforms.join.CoGroupByKey;
import org.apache.beam.sdk.transforms.Combine.CombineFn;
import org.apache.beam.sdk.transforms.Combine;
import org.apache.beam.sdk.transforms.SerializableFunction;
import org.apache.beam.sdk.transforms.Sum;
import org.apache.beam.sdk.transforms.Flatten;
import org.apache.beam.sdk.values.PCollectionList;
import org.apache.beam.sdk.transforms.Partition;
import org.apache.beam.sdk.transforms.Partition.PartitionFn;

// No command-line arguments are available in the notebook, so use the default options.
String[] args = new String[] {};
var options = PipelineOptionsFactory.fromArgs(args).withValidation().create();

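Here we partition a single `PCollection` into two `PCollection`s according to whether each number is even (partition 0) or odd (partition 1). The result is a `PCollectionList` from which we can get each partition by index. The example prints only partition 1, the odd numbers; the order in which the elements are printed is not guaranteed.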

In [3]:
public class LoggingDoFn<T> extends DoFn<T, T>  {
  @ProcessElement
  public void processElement(@Element T element, OutputReceiver<T> out) {
    System.out.println(element);
    out.output(element);
  }
} // End of LoggingDoFn

// Sends even numbers to partition 0 and odd numbers to partition 1.
class MyPartitionFn implements PartitionFn<Integer> {
  @Override
  public int partitionFor(Integer elem, int numPartitions) {
    if (elem % 2 == 0) {
      return 0;
    }
    return 1;
  }
} // End of MyPartitionFn

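var pipeline = Pipeline.create(options);
// Create the input PCollection of integers.
var single = pipeline
  .apply("Create elements", Create.of(1,2,3,4,5,6,7,8,9));
// Split into 2 partitions: index 0 holds even numbers, index 1 holds odd numbers.
var pCollectionList = single.apply("Partition elements", Partition.of(2, new MyPartitionFn()));
// Print only partition 1 (the odd numbers).
pCollectionList.get(1).apply("Print elements", ParDo.of(new LoggingDoFn<>()));

pipeline.run().waitUntilFinish();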

7
5
1
3
9


DONE