Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🐛 Destination BigQuery: Added more logs and updated the JobId made filled with locations and projectID #5049

Conversation

etsybaev
Copy link
Contributor

@etsybaev etsybaev commented Jul 28, 2021

What

It's not clear the exact fail reason for #3549 and jobs IDs were filled without location and project's ID. So presumably at some moment, we may try to update it by the wrong location and get 404.

How

Added more logs and made the JobId filled with location and projectID.
Performed a smoke test: migrated some data from Postres to BigQuery sandbox
Selection_277
Selection_279

Pre-merge Checklist

Expand the checklist which is relevant for this PR.

Connector checklist

  • Issue acceptance criteria met
  • PR name follows PR naming conventions
  • Secrets are annotated with airbyte_secret in the connector's spec
  • Credentials added to Github CI if needed and not already present. instructions for injecting secrets into CI.
  • Unit & integration tests added as appropriate (and are passing)
    • Community members: please provide proof of this succeeding locally e.g: screenshot or copy-paste acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • /test connector=connectors/<name> command as documented here is passing.
    • Community members can skip this, Airbyters will run this for you.
  • Code reviews completed
  • Documentation updated
    • README.md
    • docs/SUMMARY.md if it's a new connector
    • Created or updated reference docs in docs/integrations/<source or destination>/<name>.
    • Changelog in the appropriate page in docs/integrations/.... See changelog example
    • docs/integrations/README.md contains a reference to the new connector
    • Build status added to build page
  • Build is successful
  • Connector version bumped like described here
  • New Connector version released on Dockerhub by running the /publish command described here
  • No major blockers
  • PR merged into master branch
  • Follow up tickets have been created
  • Associated tickets have been closed & stakeholders notified

Connector Generator checklist

  • Issue acceptance criteria met
  • PR name follows PR naming conventions
  • If adding a new generator, add it to the list of scaffold modules being tested
  • The generator test modules (all connectors with -scaffold in their name) have been updated with the latest scaffold by running ./gradlew :airbyte-integrations:connector-templates:generator:testScaffoldTemplates then checking in your changes
  • Documentation which references the generator is updated as needed.

@github-actions github-actions bot added the area/connectors Connector related issues label Jul 28, 2021
@etsybaev etsybaev changed the title [DRAFT_PR][3549] Destination:BigQuery: Added more logs and updated the way BigQ… [DRAFT_PR][3549] Destination:BigQuery: Added more logs and updated the made JobId filled with locations and projectID Jul 28, 2021
@etsybaev etsybaev force-pushed the etsybaev/3549-destination-bigquery-added-more-logs-and-proper-job-creation branch from 951fcc5 to 1e05a67 Compare July 28, 2021 13:26
@etsybaev
Copy link
Contributor Author

etsybaev commented Jul 28, 2021

/test connector=connectors/destination-bigquery

🕑 connectors/destination-bigquery https://github.com/airbytehq/airbyte/actions/runs/1075314464
✅ connectors/destination-bigquery https://github.com/airbytehq/airbyte/actions/runs/1075314464

@etsybaev etsybaev changed the title [DRAFT_PR][3549] Destination:BigQuery: Added more logs and updated the made JobId filled with locations and projectID Destination:BigQuery: Added more logs and updated the made JobId filled with locations and projectID Jul 28, 2021
@etsybaev etsybaev changed the title Destination:BigQuery: Added more logs and updated the made JobId filled with locations and projectID 🐛 Destination BigQuery: Added more logs and updated the made JobId filled with locations and projectID Jul 28, 2021
@etsybaev etsybaev force-pushed the etsybaev/3549-destination-bigquery-added-more-logs-and-proper-job-creation branch 2 times, most recently from d72dfc0 to 1ccfaba Compare July 28, 2021 16:01
writer.getWriter().write(ByteBuffer.wrap((Jsons.serialize(formatRecord(writer.getSchema(), recordMessage)) + "\n").getBytes(Charsets.UTF_8)));
} catch (IOException e) {
} catch (IOException | RetryHelperException e) {
LOGGER.error("Got an error while writing message:" + e.getMessage());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's log as much as we possibly can here if the status is 404:

  • region
  • number of input streams
  • all job information
  • any other info that would be helpful

Copy link
Contributor Author

@etsybaev etsybaev Jul 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @sherifnada.
Many thanks for your comment.
I've updated the PR with adding more logs. But the point here is that at the moment when we get an exception - we get it all wrapped as a String. So the only way to check if 404 is "string.contains()" method that looks a little bit ugly for this case as for me. But here is the thing, if we fail at some moment for other reason we anyway would like to know all the details and the whole "sync attempt" would be marked as failed. So I added full logs without additional verification for reason. Please let me know if you don't agree.
By the way, this PR is in "draft" status for now and waiting for GL internal code review.

P.S. Here is an example what we would get from Job object in logs:

Job{job=JobId{project=dataline-integration-testing, job=72c84b79-4549-442c-aa74-3ad323761f21, location=EU}, status=JobStatus{state=RUNNING, error=null, executionErrors=null}, statistics=LoadStatistics{creationTime=1627492909725, endTime=null, startTime=1627492910001, numChildJobs=null, parentJobId=null, scriptStatistics=null, inputBytes=null, inputFiles=null, outputBytes=null, outputRows=null, badRecords=null}, userEmail=test-bigquery-user@dataline-integration-testing.iam.gserviceaccount.com, etag=IhA80VDWAmmB2tZ5LSocCQ==, generatedId=dataline-integration-testing:EU.72c84b79-4549-442c-aa74-3ad323761f21, selfLink=https://www.googleapis.com/bigquery/v2/projects/dataline-integration-testing/jobs/72c84b79-4549-442c-aa74-3ad323761f21?location=EU, configuration=LoadJobConfiguration{type=LOAD, destinationTable=GenericData{classInfo=[datasetId, projectId, tableId], {datasetId=airbyte_tests_kxudixzk, projectId=dataline-integration-testing, tableId=_airbyte_tmp_mqh_users}}, destinationEncryptionConfiguration=null, createDisposition=CREATE_IF_NEEDED, writeDisposition=null, formatOptions=FormatOptions{format=NEWLINE_DELIMITED_JSON}, nullMarker=null, maxBadRecords=null, schema=Schema{fields=[Field{name=_airbyte_ab_id, type=STRING, mode=null, description=null, policyTags=null}, Field{name=_airbyte_data, type=STRING, mode=null, description=null, policyTags=null}, Field{name=_airbyte_emitted_at, type=TIMESTAMP, mode=null, description=null, policyTags=null}]}, ignoreUnknownValue=null, sourceUris=[gs://bigquery-prod-upload-eu/prod-scotty-bbe62546-332e-4956-94c1-8df38458b14e], schemaUpdateOptions=null, autodetect=null, timePartitioning=null, clustering=null, useAvroLogicalTypes=null, labels=null, jobTimeoutMs=null, rangePartitioning=null, hivePartitioningOptions=null}}

@etsybaev etsybaev force-pushed the etsybaev/3549-destination-bigquery-added-more-logs-and-proper-job-creation branch from 1ccfaba to 666fb83 Compare July 28, 2021 19:08
@github-actions github-actions bot added the area/documentation Improvements or additions to documentation label Jul 28, 2021
@etsybaev
Copy link
Contributor Author

etsybaev commented Jul 28, 2021

/test connector=connectors/destination-bigquery

🕑 connectors/destination-bigquery https://github.com/airbytehq/airbyte/actions/runs/1076162540
✅ connectors/destination-bigquery https://github.com/airbytehq/airbyte/actions/runs/1076162540

@@ -114,6 +114,7 @@ Therefore, Airbyte BigQuery destination will convert any invalid characters into
| :--- | :--- | :--- | :--- |
| 0.3.6 | 2021-06-18 | [#3947](https://github.com/airbytehq/airbyte/issues/3947) | Service account credentials are now optional. |
| 0.3.4 | 2021-06-07 | [#3277](https://github.com/airbytehq/airbyte/issues/3277) | Add dataset location option |
| 0.3.9 | 2021-07-28 | [#3549](https://github.com/airbytehq/airbyte/issues/3549) | Add extended logs and made JobId filled with region and projectId |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be in reverse chrono order

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm... how didn't I notice it. It was a hard day. Updated, sorry for that

@etsybaev etsybaev linked an issue Jul 29, 2021 that may be closed by this pull request
@etsybaev etsybaev changed the title 🐛 Destination BigQuery: Added more logs and updated the made JobId filled with locations and projectID 🐛 Destination BigQuery: Added more logs and updated the JobId made filled with locations and projectID Jul 29, 2021
@etsybaev etsybaev marked this pull request as ready for review July 29, 2021 12:10
ievgeniit added 3 commits July 29, 2021 15:13
…ore-logs-and-proper-job-creation

# Conflicts:
#	docs/integrations/destinations/bigquery.md
Copy link
Contributor

@sherifnada sherifnada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shipit

@etsybaev
Copy link
Contributor Author

etsybaev commented Jul 29, 2021

/test connector=connectors/destination-bigquery

🕑 connectors/destination-bigquery https://github.com/airbytehq/airbyte/actions/runs/1079646437
✅ connectors/destination-bigquery https://github.com/airbytehq/airbyte/actions/runs/1079646437

@etsybaev
Copy link
Contributor Author

etsybaev commented Jul 29, 2021

/publish connector=connectors/destination-bigquery

🕑 connectors/destination-bigquery https://github.com/airbytehq/airbyte/actions/runs/1079705932
✅ connectors/destination-bigquery https://github.com/airbytehq/airbyte/actions/runs/1079705932

@etsybaev etsybaev merged commit 0c03361 into master Jul 29, 2021
@etsybaev etsybaev deleted the etsybaev/3549-destination-bigquery-added-more-logs-and-proper-job-creation branch July 29, 2021 19:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Destination Bigquery returns 404 when uploading data + resumable
4 participants