[Task]: Add warnings and documentation about using SpannerIO.ReadAll in a streaming pipeline #29583

@nielm

Description

What needs to happen?

Using SpannerIO.ReadAll in a streaming pipeline has negative effects and is probably not what the customer wants to do:

  • Streaming pipelines run effectively forever (unless manually stopped).
  • SpannerIO.ReadAll creates a session and a ReadOnlyTransaction on pipeline startup, and uses them for the rest of the pipeline's duration.
  • As a result, all data read from Spanner is stale: it reflects the timestamp at which the pipeline first started.
  • If no reads occur for more than an hour, the Spanner server auto-closes the session and transaction, causing the pipeline to fail.
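
To make the two failure modes above concrete, here is a small toy model. It does not use the Spanner client or SpannerIO: `VersionedStore`, `PinnedReadOnlyTransaction`, and the abstract time ticks are all hypothetical names invented for illustration. It only assumes the behaviour described in the list, namely that the read timestamp is fixed when the transaction is created and that an idle transaction is eventually closed.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model only: this is NOT the Spanner client API. All class and method
// names here are hypothetical and exist just to illustrate the behaviour.
class PinnedSnapshotDemo {

    /** A tiny multi-version store: each write is recorded under its commit timestamp. */
    static class VersionedStore {
        private final TreeMap<Long, String> versions = new TreeMap<>();

        void write(long commitTimestamp, String value) {
            versions.put(commitTimestamp, value);
        }

        /** Returns the newest value committed at or before readTimestamp, or null. */
        String readAt(long readTimestamp) {
            Map.Entry<Long, String> entry = versions.floorEntry(readTimestamp);
            return entry == null ? null : entry.getValue();
        }
    }

    /** A read-only transaction whose read timestamp is fixed when it is created. */
    static class PinnedReadOnlyTransaction {
        private final VersionedStore store;
        private final long readTimestamp;
        private final long idleTimeout;
        private long lastUsed;

        PinnedReadOnlyTransaction(VersionedStore store, long now, long idleTimeout) {
            this.store = store;
            this.readTimestamp = now;
            this.idleTimeout = idleTimeout;
            this.lastUsed = now;
        }

        /** Reads at the pinned timestamp; fails if the transaction sat idle too long. */
        String read(long now) {
            if (now - lastUsed > idleTimeout) {
                throw new IllegalStateException("transaction closed after being idle");
            }
            lastUsed = now;
            return store.readAt(readTimestamp);
        }
    }

    public static void main(String[] args) {
        VersionedStore store = new VersionedStore();
        store.write(10, "v1");

        // The "pipeline" starts at t=20 and keeps this transaction for its whole lifetime.
        PinnedReadOnlyTransaction txn = new PinnedReadOnlyTransaction(store, 20, 60);

        store.write(30, "v2"); // a later update that the pinned transaction never sees

        System.out.println(txn.read(40));     // value as of t=20
        System.out.println(store.readAt(40)); // value a fresh read would return

        try {
            txn.read(200); // idle for longer than the timeout
        } catch (IllegalStateException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

In this model the pinned transaction keeps returning the value that was current at startup, and the first read after a long idle gap throws. A bounded (batch) pipeline finishes quickly enough that neither effect usually matters, which is why the problem is specific to streaming.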

Add warnings and documentation to SpannerIO.ReadAll about using it in a streaming pipeline and the negative side effects described above.
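
One possible shape for such a warning, as a sketch rather than a proposed patch: in the Beam Java SDK a transform can inspect its input with `PCollection.isBounded()`, which returns `PCollection.IsBounded.BOUNDED` or `UNBOUNDED`. The snippet below uses a stand-in `Boundedness` enum instead of the Beam type so it compiles on its own; `ReadAllStreamingCheck`, `warningFor`, and the message text are hypothetical.

```java
import java.util.Optional;

// Sketch only: Boundedness is a stand-in for Beam's PCollection.IsBounded, and
// ReadAllStreamingCheck is a hypothetical helper, not part of SpannerIO.
class ReadAllStreamingCheck {

    enum Boundedness { BOUNDED, UNBOUNDED }

    // Hypothetical wording; the real message would be chosen by the maintainers.
    static final String WARNING =
        "SpannerIO.ReadAll is applied to an unbounded input. All reads will use the "
            + "timestamp taken at pipeline startup, so results will be stale, and the "
            + "pipeline may fail if no reads occur for more than an hour.";

    /** Returns a warning message for unbounded inputs, or empty for bounded ones. */
    static Optional<String> warningFor(Boundedness inputBoundedness) {
        return inputBoundedness == Boundedness.UNBOUNDED
            ? Optional.of(WARNING)
            : Optional.empty();
    }

    public static void main(String[] args) {
        // A real transform would pass the message to its logger instead of printing it.
        warningFor(Boundedness.UNBOUNDED).ifPresent(System.out::println);
        System.out.println(warningFor(Boundedness.BOUNDED).isPresent());
    }
}
```

Inside an actual transform the equivalent condition would be `input.isBounded() == PCollection.IsBounded.UNBOUNDED`, evaluated at pipeline construction time so the warning appears before any data is read.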

Issue Priority

Priority: 2 (default / most normal work should be filed as P2)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
