[Bug]: HBase read could stuck at HBaseReader.advance indefinitely #28025

@Abacn

What happened?

There is a report that HBaseIO reads intermittently get stuck, possibly because a failed connection never recovers.

The read hangs here:

https://github.com/apache/beam/blob/16e7af2ea4148c65e3e1045e6c05f0d40f498db2/sdks/java/io/hbase/src/main/java/org/apache/beam/sdk/io/hbase/HBaseIO.java#L535C21-L535C21

Operation ongoing in bundle process_bundle-8636262876883182297-1613 for PTransform{id=Read-from-HBase-Read-HBaseSource--ParDo-BoundedSourceAsSDFWrapper--ParMultiDo-BoundedSourceAsSDFWrap/ProcessElementAndRestrictionWithSizing-ptransform-66, name=Read-from-HBase-Read-HBaseSource--ParDo-BoundedSourceAsSDFWrapper--ParMultiDo-BoundedSourceAsSDFWrap/ProcessElementAndRestrictionWithSizing-ptransform-66, state=process} for at least 01h15m05s without outputting or completing:
  at java.lang.Object.wait(Native Method)
  at java.lang.Object.wait(Object.java:460)
  at java.util.concurrent.TimeUnit.timedWait(TimeUnit.java:348)
  at org.apache.hadoop.hbase.client.ResultBoundedCompletionService.poll(ResultBoundedCompletionService.java:159)
  at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:198)
  at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:60)
  at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
  at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:320)
  at org.apache.hadoop.hbase.client.ClientScanner.loadCache(ClientScanner.java:403)
  at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:364)
  at org.apache.hadoop.hbase.client.AbstractClientScanner$1.hasNext(AbstractClientScanner.java:94)
  at org.apache.beam.sdk.io.hbase.HBaseIO$HBaseReader.advance(HBaseIO.java:514)
  at org.apache.beam.sdk.io.Read$BoundedSourceAsSDFWrapperFn$BoundedSourceAsSDFRestrictionTracker.tryClaim(Read.java:366)
  ...
  at java.lang.Thread.run(Thread.java:750)

Shall we add a timeout to advance?
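
To make the question concrete, here is a minimal sketch of one way a timeout could be put around the blocking `hasNext()` call that `advance` ends up in. It is not taken from HBaseIO and is not a proposed patch: `TimedAdvance` and `hasNextWithTimeout` are hypothetical names, and the sketch uses only `java.util.concurrent`, with a plain `Iterator` standing in for the HBase scanner iterator.

```java
import java.io.IOException;
import java.util.Iterator;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper (not part of Beam or HBase): bounds how long hasNext() may block. */
final class TimedAdvance {
  private TimedAdvance() {}

  /**
   * Runs iterator.hasNext() on the given executor and waits at most the given time.
   * Throws IOException if the call times out, fails, or the waiting thread is interrupted.
   */
  static <T> boolean hasNextWithTimeout(
      Iterator<T> iterator, ExecutorService executor, long timeout, TimeUnit unit)
      throws IOException {
    Callable<Boolean> call = iterator::hasNext;
    Future<Boolean> pending = executor.submit(call);
    try {
      return pending.get(timeout, unit);
    } catch (TimeoutException e) {
      // Interrupt the worker; this only helps if the blocked call is interruptible.
      pending.cancel(true);
      throw new IOException("hasNext() did not return within " + timeout + " " + unit, e);
    } catch (InterruptedException e) {
      pending.cancel(true);
      Thread.currentThread().interrupt(); // preserve the interrupt for the caller
      throw new IOException("Interrupted while waiting for hasNext()", e);
    } catch (ExecutionException e) {
      throw new IOException("hasNext() failed", e.getCause());
    }
  }
}
```

With a wrapper like this, a stuck scan would surface as an `IOException` instead of hanging, so the bundle could fail and be retried by the runner. Whether the reader should instead reopen the scanner from the last row it returned, or whether client-side HBase timeouts are the better place to fix this, is left open here.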

Issue Priority

Priority: 2 (default / most bugs should be filed as P2)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner