[Feature Request]: BigQueryIO - table "Not found" when using BigQueryIO.write() with CREATE_NEVER and WRITE_APPEND #24001

@rajkgupt

What would you like to happen?

Problem Encountered:
For a BigQueryIO.Write configured as in [1], if the target table doesn't exist, the pipeline throws a 404 table-not-found exception and continuously retries the work item [2].

By contrast, insert errors (broken JSON or schema errors) are caught and exposed via getFailedInsertsWithErr.
The problem was most recently reproduced on Apache Beam SDK for Java 2.39.0.

What you expected to happen:
Table-not-found errors should be caught by getFailedInsertsWithErr so that the affected records can be handled separately (for example, written to a dead-letter queue or to GCS).
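Until something like that exists, one possible workaround is to split rows before they reach the write step, so that rows destined for a missing table never hit insertAll. The sketch below is plain Java rather than Beam code, and everything in it is hypothetical: the `DeadLetterRouter` class, the `route` method, and the assumption that a reasonably fresh set of existing table names is available. It only illustrates the routing decision; in a real pipeline the same check would presumably live in a step placed ahead of BigQueryIO.write().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: splits rows by whether their destination table is known to exist. */
class DeadLetterRouter {

    /** Holds the two outputs of the split. */
    public static final class Routed {
        public final List<Map<String, Object>> writable = new ArrayList<>();
        public final List<Map<String, Object>> deadLetter = new ArrayList<>();
    }

    /**
     * Uses the row's "event_type" field as the table name, as the dynamic
     * destination in [1] does. Rows with a missing field or an unknown table
     * go to the dead-letter list instead of the write path.
     */
    public static Routed route(List<Map<String, Object>> rows, Set<String> existingTables) {
        Routed out = new Routed();
        for (Map<String, Object> row : rows) {
            Object eventType = row.get("event_type");
            if (eventType != null && existingTables.contains(eventType.toString())) {
                out.writable.add(row);
            } else {
                out.deadLetter.add(row);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed snapshot of the dataset's tables; a real pipeline would need to refresh it.
        Set<String> existing = new HashSet<>(Arrays.asList("click", "purchase"));

        Map<String, Object> ok = new HashMap<>();
        ok.put("event_type", "click");
        Map<String, Object> orphan = new HashMap<>();
        orphan.put("event_type", "dddddddd");

        Routed routed = route(Arrays.asList(ok, orphan), existing);
        System.out.println("writable=" + routed.writable.size()
            + " deadLetter=" + routed.deadLetter.size());
    }
}
```

The obvious weakness is staleness: a table created or deleted after the snapshot was taken is misrouted, which is part of why handling the 404 inside the write transform would be preferable.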

[1]
WriteResult writeResult = results.get(SUCCESS_TAG).apply("WriteSuccessfulRecordsToBQ",
    BigQueryIO.writeTableRows()
        .withMethod(BigQueryIO.Write.Method.STREAMING_INSERTS)
        // Retry all failures except for known persistent errors.
        .withFailedInsertRetryPolicy(InsertRetryPolicy.retryTransientErrors())
        .withWriteDisposition(WRITE_APPEND)
        .withCreateDisposition(CREATE_NEVER)
        // Needed for getFailedInsertsWithErr.
        .withExtendedErrorInfo()
        .ignoreUnknownValues()
        .skipInvalidRows()
        .withoutValidation()
        // Pick the destination table per row from its "event_type" field.
        .to((row) -> {
            String tableName = Objects.requireNonNull(row.getValue()).get("event_type").toString();
            return new TableDestination(
                String.format("%s:%s.%s", BQ_PROJECT, BQ_DATASET, tableName), "Some destination");
        }));

[2]
Error message from worker: java.lang.RuntimeException: com.google.api.client.googleapis.json.GoogleJsonResponseException: 404 Not Found
POST https://bigquery.googleapis.com/bigquery/v2/projects/dfdfdfdfdfd/datasets/sdfsdfdsfsfs/tables/dddddddd/insertAll?prettyPrint=false
{
  "code" : 404,
  "errors" : [ {
    "domain" : "global",
    "message" : "Not found: Table dfdfdfdfdfd:sdfsdfdsfsfs.dddddddd",
    "reason" : "notFound"
  } ],
  "message" : "Not found: Table dfdfdfdfdfd:sdfsdfdsfsfs.dddddddd",
  "status" : "NOT_FOUND"
}
    org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.insertAll(BigQueryServicesImpl.java:1108)
    org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.insertAll(BigQueryServicesImpl.java:1161)
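The response in [2] already carries the two signals a handler would need to tell a missing table apart from a transient failure: the HTTP code 404 and the reason "notFound". A minimal sketch of the requested classification follows. The `InsertFailureClassifier` class and its `Action` enum are hypothetical and not part of Beam, and treating every other failure as retryable is a simplification for illustration only.

```java
/** Hypothetical classifier for request-level insertAll failures; not a Beam API. */
class InsertFailureClassifier {

    /** What the write step should do with the rows of a failed request. */
    public enum Action { RETRY, DEAD_LETTER }

    /**
     * Treats "404 + notFound" (the combination shown in [2]) as persistent,
     * so the rows are handed to the failed-insert output instead of retried.
     * Everything else is retried, which is an assumption of this sketch.
     */
    public static Action classify(int httpCode, String reason) {
        if (httpCode == 404 && "notFound".equals(reason)) {
            return Action.DEAD_LETTER;
        }
        return Action.RETRY;
    }

    public static void main(String[] args) {
        System.out.println(classify(404, "notFound"));
        System.out.println(classify(503, "backendError"));
    }
}
```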

Issue Priority

Priority: 2

Issue Component

Component: sdk-java-core
