[Bug]: cassandraIO ReadAll does not let a pipeline handle or retry exceptions #34160

Closed
@VardhanThigle

Description

What happened?

When CassandraIO is used to read all rows from a fairly large Cassandra cluster (~50 nodes, > 2 TB)
and a timeout exception occurs, a set of rows is never read: CassandraIO only logs the error and proceeds.

Root Cause

CassandraIO ReadAll logs read exceptions and continues, so the pipeline never gets a chance to handle or retry them.

JdbcIO, by contrast, throws the exception, which the Dataflow runner then retries on other nodes.
Ideally, there would be a way to plug in an exception handler to deal with such corner cases in production.
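
To make the difference concrete, here is a minimal sketch in plain Java of the two behaviours described above: logging and continuing versus rethrowing so the caller can retry. It deliberately uses no Beam or Cassandra classes. `RangeReadSketch`, `ReadErrorHandler`, `LOG_AND_CONTINUE`, `RETHROW`, and `readAll` are hypothetical names invented for illustration and are not part of CassandraIO; the `query` function stands in for whatever reads one token range.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch: not Beam's or CassandraIO's actual API.
final class RangeReadSketch {

  /** Hypothetical hook that decides what happens when reading one range fails. */
  public interface ReadErrorHandler {
    void onFailure(String range, RuntimeException error);
  }

  /** Log and continue: rows of the failed range are silently missing from the result. */
  public static final ReadErrorHandler LOG_AND_CONTINUE =
      (range, error) ->
          System.err.println("Failed to read " + range + ": " + error.getMessage());

  /** Rethrow: the failure propagates, so the caller (for example a runner) can retry. */
  public static final ReadErrorHandler RETHROW =
      (range, error) -> {
        throw error;
      };

  /** Reads every range with the given query and delegates failures to the handler. */
  public static List<String> readAll(
      List<String> ranges,
      Function<String, List<String>> query,
      ReadErrorHandler handler) {
    List<String> rows = new ArrayList<>();
    for (String range : ranges) {
      try {
        rows.addAll(query.apply(range));
      } catch (RuntimeException e) {
        // The handler either swallows the error or throws to abort the whole read.
        handler.onFailure(range, e);
      }
    }
    return rows;
  }

  private RangeReadSketch() {}
}
```

With `LOG_AND_CONTINUE`, a timeout on one range yields a result that looks successful but is missing rows, which matches the behaviour reported here. With `RETHROW`, the same timeout surfaces as an exception that the caller can retry or route elsewhere. A pluggable handler along these lines would let each pipeline choose.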

Ref in Code

Issue Priority

Priority: 2 (default / most bugs should be filed as P2)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Infrastructure
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
