# Interactive Beam Example

self link: go/interactive-beam-example

## Run the notebook kernel with Blaze
```
google3$ blaze run pipeline/dataflow/python/interactive:beam_notebook.par
```

## Running on local machine (Direct Runner)

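This is a minimal example of using the Interactive Runner on top of the Direct Runner: build a small pipeline, run it, fetch the results, then extend the pipeline and run it again.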


In [None]:
import uuid

import apache_beam as beam
from apache_beam.runners.direct import direct_runner
from apache_beam.runners.interactive import interactive_runner

In [None]:
%matplotlib inline

In [None]:
try:
    %load_ext autoreload
    %autoreload 2
except ModuleNotFoundError:
    print("autoreload not found.")

### The initial run

In [None]:
runner = interactive_runner.InteractiveRunner(
    underlying_runner=direct_runner.BundleBasedDirectRunner(),
    render_option="graph",
    cache_format="tfrecord",
)
p = beam.Pipeline(runner=runner)

In [None]:
pcoll_init = p | beam.Create(range(10))
squares = pcoll_init | 'Square' >> beam.Map(lambda x: x*x)
cubes = pcoll_init | 'Cube' >> beam.Map(lambda x: x**3)
result = p.run()

### Fetching a PCollection
You can fetch a PCollection from the result as a list.

In [None]:
init_list = list(range(10))
squares_list = result.get(squares)
cubes_list = result.get(cubes)

squares_list.sort()
cubes_list.sort()

from matplotlib import pyplot as plt
plt.scatter(init_list, squares_list, label='squares', color='red')
plt.scatter(init_list, cubes_list, label='cubes', color='blue')
plt.legend(loc='upper left')
plt.show()
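
The lists are sorted before plotting because a PCollection has no guaranteed element order, so the fetched values need not line up with `init_list`. As a sanity check, the values the plot should show can be computed in plain Python without Beam. The names `inputs`, `expected_squares`, and `expected_cubes` below are illustrative and are not part of the notebook.

```python
# Plain-Python reference values for the 'Square' and 'Cube' steps.
inputs = list(range(10))
expected_squares = sorted(x * x for x in inputs)
expected_cubes = sorted(x ** 3 for x in inputs)

print(expected_squares)
print(expected_cubes)
```

Assuming `result.get` returns a plain list of integers as described above, the sorted `squares_list` and `cubes_list` should match these two lists.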

### Hack on the pipeline and run round 2

In [None]:
class AverageFn(beam.CombineFn):
    """Computes the mean using a (running total, count) accumulator."""

    def create_accumulator(self):
        return (0.0, 0)

    def add_input(self, accumulator, element):
        # Names chosen to avoid shadowing the built-ins `sum` and `input`.
        total, count = accumulator
        return total + element, count + 1

    def merge_accumulators(self, accumulators):
        totals, counts = zip(*accumulators)
        return sum(totals), sum(counts)

    def extract_output(self, accumulator):
        total, count = accumulator
        return total / count if count else float("NaN")
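
To see how the four methods fit together, here is a plain-Python sketch that drives the same logic by hand, with no Beam dependency. `PlainAverage` and `combine_by_hand` are hypothetical names, and the three-way split into bundles is an assumption made for illustration; a real runner decides how elements are grouped.

```python
class PlainAverage:
    """Same four methods as AverageFn above, but without subclassing beam.CombineFn."""

    def create_accumulator(self):
        return (0.0, 0)

    def add_input(self, accumulator, element):
        total, count = accumulator
        return total + element, count + 1

    def merge_accumulators(self, accumulators):
        totals, counts = zip(*accumulators)
        return sum(totals), sum(counts)

    def extract_output(self, accumulator):
        total, count = accumulator
        return total / count if count else float("NaN")


def combine_by_hand(fn, bundles):
    """Fold each bundle into its own accumulator, merge them, then extract the result."""
    partials = []
    for bundle in bundles:
        accumulator = fn.create_accumulator()
        for element in bundle:
            accumulator = fn.add_input(accumulator, element)
        partials.append(accumulator)
    return fn.extract_output(fn.merge_accumulators(partials))


square_values = [x * x for x in range(10)]
# Hypothetical bundle split; any partition of the values gives the same mean.
bundles = [square_values[:3], square_values[3:7], square_values[7:]]
average = combine_by_hand(PlainAverage(), bundles)
print(average)  # 28.5
```

Because the accumulator carries both the total and the count, partial results can be merged in any grouping, which is what lets the combine be split across bundles.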

In [None]:
average_squares = squares | "AverageSquares {}".format(uuid.uuid4().hex[:4]) >> beam.CombineGlobally(AverageFn())
result = p.run()

In [None]:
average_cubes = cubes | "AverageCubes {}".format(uuid.uuid4().hex[:4]) >> beam.CombineGlobally(AverageFn())
result = p.run()
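
The random suffix in the step labels above is presumably there because Beam requires transform labels to be unique within a pipeline, so re-running a cell with a fixed label against the same `p` would fail. The sketch below isolates the label construction in plain Python; `unique_label` is a hypothetical helper, not part of the notebook or of Beam.

```python
import uuid


def unique_label(prefix):
    """Return the prefix plus a 4-hex-digit random suffix, as the cells above do inline."""
    return "{} {}".format(prefix, uuid.uuid4().hex[:4])


first = unique_label("AverageSquares")
second = unique_label("AverageSquares")
print(first)
print(second)
```

Four hex digits give 65,536 possible suffixes, so a collision between reruns is unlikely but not impossible; a longer slice of the UUID would reduce the odds further.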