Description
What would you like to happen?
Problem Encountered:
For a BigQueryIO.Write configured as in [1], if the target table doesn't exist, the pipeline throws a 404 Table Not Found exception and continuously retries the work item [2].
By contrast, insert errors (malformed JSON or a schema error) are caught and surfaced via getFailedInsertsWithErr.
It was most recently reproduced on Apache Beam SDK for Java 2.39.0.
What you expected to happen:
Table-not-found errors should be caught by getFailedInsertsWithErr so that those records can be handled separately (for example, written to a dead-letter queue or to GCS).
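To illustrate what "handled separately" could look like until this is supported, here is a minimal plain-Java sketch of a possible stopgap: resolve the destination up front and send rows whose target table is not in a known set to a dead-letter table. This is not Beam API and not a fix for the issue. The class `DeadLetterRouting`, the method `resolveTableSpec`, the `knownTables` set, the `dead_letter` table name, and the placeholder project and dataset values are all hypothetical; it also assumes the set of existing tables is known when the pipeline is built.

```java
import java.util.Map;
import java.util.Set;

public class DeadLetterRouting {
    // Hypothetical placeholders standing in for BQ_PROJECT / BQ_DATASET in [1].
    static final String BQ_PROJECT = "my-project";
    static final String BQ_DATASET = "my_dataset";
    // Hypothetical table that is assumed to exist and collects unroutable rows.
    static final String DEAD_LETTER_TABLE = "dead_letter";

    /**
     * Returns a "project:dataset.table" spec for the row. Rows with a missing
     * event_type, or one that does not match a known table, are sent to the
     * dead-letter table instead of a table that may not exist.
     */
    static String resolveTableSpec(Map<String, ?> row, Set<String> knownTables) {
        Object eventType = row.get("event_type");
        String tableName =
            (eventType != null && knownTables.contains(eventType.toString()))
                ? eventType.toString()
                : DEAD_LETTER_TABLE;
        return String.format("%s:%s.%s", BQ_PROJECT, BQ_DATASET, tableName);
    }

    public static void main(String[] args) {
        Set<String> known = Set.of("click", "purchase");
        System.out.println(resolveTableSpec(Map.of("event_type", "click"), known));
        System.out.println(resolveTableSpec(Map.of("event_type", "refund"), known));
    }
}
```

The same check could be placed inside the `.to((row) -> ...)` lambda in [1], but it only helps when the list of tables is known ahead of time, which is why catching the 404 in getFailedInsertsWithErr would still be preferable.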
[1]
WriteResult writeResult = results.get(SUCCESS_TAG).apply("WriteSuccessfulRecordsToBQ",
    BigQueryIO.writeTableRows()
        .withMethod(BigQueryIO.Write.Method.STREAMING_INSERTS)
        // Retry all failures except for known persistent errors.
        .withFailedInsertRetryPolicy(InsertRetryPolicy.retryTransientErrors())
        .withWriteDisposition(WRITE_APPEND)
        .withCreateDisposition(CREATE_NEVER)
        // Needed for getFailedInsertsWithErr.
        .withExtendedErrorInfo()
        .ignoreUnknownValues()
        .skipInvalidRows()
        .withoutValidation()
        // Route each row to a table named after its event_type.
        .to((row) -> {
            String tableName = Objects.requireNonNull(row.getValue()).get("event_type").toString();
            return new TableDestination(
                String.format("%s:%s.%s", BQ_PROJECT, BQ_DATASET, tableName), "Some destination");
        }));
[2]
Error message from worker: java.lang.RuntimeException: com.google.api.client.googleapis.json.GoogleJsonResponseException: 404 Not Found
POST https://bigquery.googleapis.com/bigquery/v2/projects/dfdfdfdfdfd/datasets/sdfsdfdsfsfs/tables/dddddddd/insertAll?prettyPrint=false
{
  "code" : 404,
  "errors" : [ {
    "domain" : "global",
    "message" : "Not found: Table dfdfdfdfdfd:sdfsdfdsfsfs.dddddddd",
    "reason" : "notFound"
  } ],
  "message" : "Not found: Table dfdfdfdfdfd:sdfsdfdsfsfs.dddddddd",
  "status" : "NOT_FOUND"
}
org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.insertAll(BigQueryServicesImpl.java:1108)
org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.insertAll(BigQueryServicesImpl.java:1161)
Issue Priority
Priority: 2
Issue Component
Component: sdk-java-core