This represents the entirety of a Pipeline, divided into executable stages, each of which is executed either in the runner or within a user container. The representation must include the ports over which the SDK harness communicates with the runner.
The construction of this graph likely includes most of the nodes present within the Runner API graph (PTransform and PCollection), but injects additional nodes to represent a remote read or write between harnesses.
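As a rough illustration of injecting remote read/write nodes, the sketch below splits every edge that crosses an environment boundary. It is not Beam code: the function name `insert_remote_nodes`, the `RemoteWrite[...]`/`RemoteRead[...]` node names, and the plain dict/list graph encoding are all hypothetical, and PCollections are collapsed into direct producer-to-consumer edges for brevity.

```python
def insert_remote_nodes(environments, edges):
    """Split each cross-environment edge with a remote write/read pair.

    environments: dict mapping transform name -> environment id (hypothetical encoding).
    edges: list of (producer, consumer) transform-name pairs.
    Returns a new (environments, edges) pair; the inputs are not modified.
    """
    new_envs = dict(environments)
    new_edges = []
    for producer, consumer in edges:
        if environments[producer] == environments[consumer]:
            # Same environment: the edge stays local.
            new_edges.append((producer, consumer))
            continue
        # Hypothetical node names; the write lives with the producer,
        # the read lives with the consumer.
        write = f"RemoteWrite[{producer}->{consumer}]"
        read = f"RemoteRead[{producer}->{consumer}]"
        new_envs[write] = environments[producer]
        new_envs[read] = environments[consumer]
        new_edges.extend([(producer, write), (write, read), (read, consumer)])
    return new_envs, new_edges


envs, edges = insert_remote_nodes(
    {"Read": "runner", "Parse": "sdk", "Filter": "sdk"},
    [("Read", "Parse"), ("Parse", "Filter")],
)
```

In this encoding the only edge left between environments is the one from a remote write to its matching remote read, which is where a port would be attached.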
Simple fusion (naive producer/consumer and sibling fusion) should also be performed here.
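A minimal sketch of what naive producer/consumer and sibling fusion could look like, assuming a graph given as a dict of transform name to environment id plus a list of (producer, consumer) edges. The function name `fuse` and this encoding are hypothetical. The sketch only groups by environment; it does not check that the fused stages stay acyclic, and it ignores transforms that must break fusion, both of which a real implementation would need to handle.

```python
from collections import defaultdict


def fuse(transforms, edges):
    """Group transforms into stages using a union-find over transform names.

    transforms: dict mapping transform name -> environment id.
    edges: list of (producer, consumer) pairs.
    Returns a sorted list of stages, each a sorted list of transform names.
    """
    parent = {t: t for t in transforms}

    def find(t):
        # Follow parent links to the stage representative, halving the path.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    def union(a, b):
        parent[find(b)] = find(a)

    consumers = defaultdict(list)
    for producer, consumer in edges:
        consumers[producer].append(consumer)
        # Producer/consumer fusion: both ends run in the same environment.
        if transforms[producer] == transforms[consumer]:
            union(producer, consumer)

    # Sibling fusion: consumers of the same producer that share an environment.
    for siblings in consumers.values():
        by_env = defaultdict(list)
        for s in siblings:
            by_env[transforms[s]].append(s)
        for group in by_env.values():
            for other in group[1:]:
                union(group[0], other)

    stages = defaultdict(list)
    for t in transforms:
        stages[find(t)].append(t)
    return sorted(sorted(members) for members in stages.values())


stages = fuse(
    {"Read": "runner", "A": "sdk", "B": "sdk", "Write": "runner"},
    [("Read", "A"), ("Read", "B"), ("A", "Write")],
)
# stages == [["A", "B"], ["Read"], ["Write"]]
```

Here `A` and `B` are fused only because they are siblings reading the same output of `Read`; no producer/consumer edge connects them.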
This will also require rewriting some boundary coders (e.g. runner -> SDK Harness and vice versa) to be agnostic to the language of the runner harness, likely by converting into length-prefixed bytes.
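To make the length-prefixing idea concrete, here is a small sketch that wraps already-encoded element bytes with a length prefix, so the receiving side can split a stream into elements without understanding the inner coder. The base-128 varint prefix is an assumption made for this example (the issue only says "length-prefixed bytes"), and the function names are hypothetical.

```python
import io


def write_varint(n, out):
    """Write a non-negative int as a base-128 varint (assumed prefix format)."""
    while True:
        bits = n & 0x7F
        n >>= 7
        if n:
            out.write(bytes([bits | 0x80]))  # more bytes follow
        else:
            out.write(bytes([bits]))
            return


def read_varint(stream):
    """Read a base-128 varint from a binary stream."""
    shift = 0
    result = 0
    while True:
        b = stream.read(1)
        if not b:
            raise EOFError("stream ended inside a varint")
        result |= (b[0] & 0x7F) << shift
        if not b[0] & 0x80:
            return result
        shift += 7


def encode_length_prefixed(payloads):
    """Concatenate opaque payloads, each preceded by its byte length."""
    out = io.BytesIO()
    for payload in payloads:
        write_varint(len(payload), out)
        out.write(payload)
    return out.getvalue()


def decode_length_prefixed(data):
    """Split a length-prefixed stream back into its opaque payloads."""
    stream = io.BytesIO(data)
    payloads = []
    while stream.tell() < len(data):
        n = read_varint(stream)
        payload = stream.read(n)
        if len(payload) != n:
            raise EOFError("stream ended inside a payload")
        payloads.append(payload)
    return payloads
```

The decoder never interprets the payload bytes, which is the property that lets the runner pass elements through without knowing the SDK's language-specific coder.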
Imported from Jira BEAM-3337. Original Jira may contain additional context.
Reported by: tgroh.