The Dataflow runner collects a large number of useful stats and publishes them to the Dataflow service. While these can be polled from the Dataflow service via its API, this approach has a few downsides:
- it requires another process to poll and collect the stats
- the stats are aggregated across all workers, so per-worker stats are lost
It would be simple to provide a hook that lets users receive these stats updates as well and then do whatever they want with them.
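
To make the request concrete, here is a minimal sketch of what such a hook could look like. It is not existing Beam or Dataflow API: `StatsUpdate`, `StatsPublisher`, `add_listener`, and the counter name are all hypothetical names chosen for illustration, and the sketch is in Python only for brevity. The idea is that the worker code that already publishes stats to the service would also pass each per-worker update to any registered listeners.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class StatsUpdate:
    """Hypothetical per-worker stats snapshot (not a real Beam/Dataflow type)."""
    worker_id: str
    counters: Dict[str, int]


# A listener is any callable that accepts one StatsUpdate.
StatsListener = Callable[[StatsUpdate], None]


class StatsPublisher:
    """Hypothetical stand-in for the worker code that reports stats to the service."""

    def __init__(self) -> None:
        self._listeners: List[StatsListener] = []

    def add_listener(self, listener: StatsListener) -> None:
        self._listeners.append(listener)

    def publish(self, update: StatsUpdate) -> None:
        # The real runner would send the update to the Dataflow service here.
        for listener in self._listeners:
            try:
                listener(update)
            except Exception:
                # Assumption: a failing user hook should not break stats reporting.
                pass


# Usage: collect per-worker updates in-process, with no separate polling process.
received: List[StatsUpdate] = []
publisher = StatsPublisher()
publisher.add_listener(received.append)
publisher.publish(StatsUpdate(worker_id="worker-0", counters={"elements_read": 42}))
```

Because the listener runs on the worker and receives the update before any aggregation, this shape would address both downsides listed above. Whether listener errors should be swallowed, logged, or propagated is a design choice left open here.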
Imported from Jira BEAM-7605. Original Jira may contain additional context.
Reported by: SteveNiemitz.