PLUGIN-557: BigQuerySink fix for upsert operation in existing table but in different project #555
lgtm
  private void configureTable(Schema schema) {
    AbstractBigQuerySinkConfig config = getConfig();
-   Table table = BigQueryUtil.getBigQueryTable(config.getProject(), config.getDataset(),
+   Table table = BigQueryUtil.getBigQueryTable(config.getDatasetProject(), config.getDataset(),
What if config.getProject() is provided? Will we still rely on config.getDatasetProject() in that case?
If both config.getProject() and config.getDatasetProject() are provided, datasetProject takes priority.
If datasetProject is not defined, the value of config.getProject() is used instead.
That is how it is defined in AbstractBigQuerySinkConfig.
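For readers following the thread, the priority rule described above can be sketched roughly as follows. This is an illustration only, not the code in AbstractBigQuerySinkConfig: the class name `DatasetProjectSketch`, the method `resolveDatasetProject`, and the choice to treat a null or empty string as "not defined" are all assumptions made for the example.

```java
// Hypothetical sketch of the fallback described in this thread; the real
// logic lives in the plugin's config classes and may differ in detail.
final class DatasetProjectSketch {

  private DatasetProjectSketch() {
    // Utility class, not meant to be instantiated.
  }

  /**
   * Returns the project that owns the dataset.
   * datasetProject takes priority when it is defined; otherwise the
   * pipeline's own project is used.
   * Assumption: "not defined" means null or an empty string.
   */
  static String resolveDatasetProject(String project, String datasetProject) {
    if (datasetProject == null || datasetProject.isEmpty()) {
      return project;
    }
    return datasetProject;
  }

  public static void main(String[] args) {
    // Both provided: the dataset project wins.
    System.out.println(resolveDatasetProject("billing-project", "data-project"));
    // Dataset project missing: fall back to the pipeline project.
    System.out.println(resolveDatasetProject("billing-project", null));
  }
}
```

Under this rule, looking the table up with the resolved dataset project covers both cases: when the table lives in a different project, that project is used, and when no dataset project is set, behavior is the same as before the change.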
CuriousVini left a comment
Is there any unit test or integration test that can be added for this bug?
We created the JIRA ticket https://cdap.atlassian.net/browse/PLUGIN-606 to add an integration test in the future. Adding one now would be hard because it requires a second project for the integration test.
/gcbrun
Has this fix been tested with the dataset ID being a macro?
Yes, this fix has been tested with the dataset ID being a macro. This is the block of code that prevents a validation exception if any of the fields contains a macro:
google-cloud/src/main/java/io/cdap/plugin/gcp/bigquery/sink/BigQuerySink.java, lines 251 to 253 in 86f4b91
And this is the code of shouldConnect():
/gcbrun |
The BigQuery sink pipeline fails when performing an upsert into an existing table in a different project. The table is partitioned by ingestion time (day), and the partition field is left empty.
JIRA: https://cdap.atlassian.net/browse/PLUGIN-557