# Hello World
This notebook presents a simple workflow that writes a few strings to a file. It shows how to run Python Dataflow workflows from a Jupyter notebook.

## Prerequisites

The steps required to use this notebook are:
1. Install the tools for creating conda environments: http://conda.pydata.org/miniconda.html
2. Create and activate a virtual environment: `conda create -n NAME jupyter; source activate NAME`
3. Install the Python Dataflow package (current latest version is v0.2.3): `pip install https://github.com/GoogleCloudPlatform/DataflowPythonSDK/archive/v0.2.3.tar.gz`
4. Start a Jupyter notebook: `jupyter notebook`
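
Once the notebook is running, it can be worth confirming that the kernel sees the package installed in step 3. The following is a minimal sketch that uses only the standard library; the helper name `is_importable` is hypothetical and is not part of the SDK.

```python
import importlib


def is_importable(module_name):
    """Return True if `module_name` can be imported in the current kernel."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


# The module name below is the one imported later in this notebook.
# This prints False if the package is not installed in the active environment.
print(is_importable('google.cloud.dataflow'))
```

If this prints `False`, the notebook was most likely started from a different environment than the one in which the package was installed.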

## Workflow

Start with an idiomatic import statement. Most objects live in the `df` or `df.io` namespace.

In [1]:
import google.cloud.dataflow as df

There are three steps involved in creating and running a pipeline:
* Create the `Pipeline` object
* Create the graph of data transforms
* Run the pipeline graph

The graph of transforms is sometimes called a workflow, so running a workflow is the same thing as running a pipeline.
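
The three steps above can be illustrated without the SDK. The sketch below is a toy model, not the SDK's implementation: `ToyPipeline`, `create`, and `to_upper` are hypothetical names invented for this example. It only shows the general idea that the `|` operator records transforms and that nothing executes until `run()` is called.

```python
class ToyPipeline(object):
    """Hypothetical stand-in for a pipeline: records steps, runs them later."""

    def __init__(self):
        self.steps = []

    def __or__(self, step):
        # `pipeline | step` only records the step; nothing executes yet.
        self.steps.append(step)
        return self

    def run(self):
        # Feed the output of each recorded step into the next one.
        data = None
        for step in self.steps:
            data = step(data)
        return data


def create(values):
    """Toy source: ignores its input and emits the given values."""
    return lambda _: list(values)


def to_upper(elements):
    """Toy transform: upper-cases every element."""
    return [e.upper() for e in elements]


toy = ToyPipeline()                                # 1. create the pipeline object
toy = toy | create(['hello', 'world']) | to_upper  # 2. build the graph
result = toy.run()                                 # 3. run the graph
print(result)
```

Separating graph construction from execution is what lets the same graph be handed to different runners, such as a local one for testing.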

## Create a Pipeline object
The code below attaches a runner to the pipeline that executes the workflow locally, which is useful for testing your code. To run at scale, the same code can be executed in Google Cloud (https://github.com/GoogleCloudPlatform/DataflowPythonSDK/#signing-up-for-alpha-batch-cloud-execution).

In [2]:
p = df.Pipeline('DirectPipelineRunner')

## Build the graph of transforms
A `Create` transform produces a PCollection from an in-memory list of strings. The resulting PCollection is then written to a text file sink using a `Write` transform.

In [3]:
p | df.Create(['hello', 'world']) | df.io.Write(df.io.TextFileSink('./test.txt'))

<PValue transform=<_NativeWrite(PTransform) label=[native_write]> at 0x7f2b375c2790>

## Run the pipeline

In [4]:
p.run()

<google.cloud.dataflow.runners.direct_runner.DirectPipelineResult at 0x7f2b4ca1d950>

The output file contains the written elements. A transform (including a `Write`) is not guaranteed to output elements in the order in which it received them, so the lines may appear in a different order than shown here.

In [5]:
!more ./test.txt

hello
world
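
Because element order is not guaranteed, any check of the output should compare contents without regard to order. The sketch below shows one way to do that with the standard library. The helpers `read_elements` and `same_elements` are hypothetical, and the file written here is a stand-in created for the demonstration rather than the pipeline's actual output.

```python
import os
import tempfile


def read_elements(path):
    """Read a text file and return its lines without trailing newlines."""
    with open(path) as f:
        return [line.rstrip('\n') for line in f]


def same_elements(actual, expected):
    """Compare two lists of strings while ignoring element order."""
    return sorted(actual) == sorted(expected)


# Stand-in output file, written in "reversed" order on purpose.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, 'test.txt')
with open(demo_path, 'w') as f:
    f.write('world\nhello\n')

print(same_elements(read_elements(demo_path), ['hello', 'world']))
```

Sorting both sides keeps duplicates significant, which a set comparison would not.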
