Bigquery normalization failing with Bad int64 value: 1.0 #21413

Closed
rodireich opened this issue Jan 13, 2023 · 2 comments · Fixed by #21844

@rodireich
Contributor

Environment

  • Deployment: cloud
  • Source Connector and version: postgres 1.0.36
  • Destination Connector and version: bigquery 1.2.9
  • Step where error happened: Sync job

Current Behavior

In the affected connection, BigQuery normalization fails on a bad int64 value with the following error:

2023-01-12 23:04:09 normalization > Completed with 1 error and 0 warnings:
2023-01-12 23:04:09 normalization > Database Error in model orchestrator_segments (models/generated/airbyte_tables/analytics/orchestrator_segments.sql)
2023-01-12 23:04:09 normalization >   Bad int64 value: 1.0
2023-01-12 23:04:09 normalization >   compiled SQL at ../build/run/airbyte_utils/models/generated/airbyte_tables/analytics/orchestrator_segments.sql
2023-01-12 23:04:09 normalization > Done. PASS=15 WARN=0 ERROR=1 SKIP=0 TOTAL=16

This may be related to #oncall/1228, for which we made a fix in destination-bigquery-denormalized but not in destination-bigquery.

Expected Behavior

Successful sync

@bleonard
Contributor

@rodireich, do you think this is related to #20466?

VitaliiMaltsev self-assigned this Jan 23, 2023
@VitaliiMaltsev
Contributor

VitaliiMaltsev commented Jan 25, 2023

@bleonard @rodireich this issue is not related to #20466

The root cause of this issue is that the user manually changed the schema of the table orchestrator_segments in Postgres.
In the initial sync, column 2023-01-02 was an integer:
stream=io.airbyte.protocol.models.AirbyteStream@4d6bef1[name=orchestrator_segments,jsonSchema={"type":"object","properties":{"segment":{"type":"string"},"2023-01-02":{"type":"number","airbyte_type":"integer"},"updated_at":{"type":"string","format":"date-time","airbyte_type":"timestamp_without_timezone"},"_ab_cdc_lsn":{"type":"number"}

The user then modified the table, and column 2023-01-02 became a float:

2023-01-16 22:48:21 source > Table orchestrator_segments column 2023-01-02 (type float8[17], nullable false) -> JsonSchemaType({type=number})

Currently we have no way to automatically update the JSON schema stored in the airbyte-db connection table, so the saved catalog still declares the column as an integer. The user just needs to click "Refresh source schema" in the UI to bring it up to date, and the sync will then succeed.
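
For illustration only, here is a rough Python analogy of why a stale integer declaration breaks on this data. It assumes that normalization casts the extracted value to the type recorded in the catalog; `strict_int64` is a hypothetical helper, not Airbyte or BigQuery code, and it simply reuses the error text from the log above.

```python
def strict_int64(raw: str) -> int:
    """Parse a string as a signed 64-bit integer, rejecting float-formatted input."""
    try:
        value = int(raw)  # int("1.0") raises ValueError in Python
    except ValueError:
        raise ValueError(f"Bad int64 value: {raw}") from None
    if not -(2**63) <= value < 2**63:
        raise ValueError(f"Bad int64 value: {raw}")
    return value


def try_cast(raw: str) -> str:
    """Return the parsed value as text, or the error message if the cast fails."""
    try:
        return str(strict_int64(raw))
    except ValueError as err:
        return str(err)
```

Under these assumptions, `try_cast("1")` succeeds, while `try_cast("1.0")` yields the same kind of message the sync reported, because the source now emits float-formatted values for a column the catalog still treats as an integer.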

I created a PR that validates the actual source schema against the catalog schema so that such cases are detected quickly in the future.
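
A minimal sketch of what such a check could look like, assuming both schemas are available as JSON-schema `properties` dictionaries like the one in the log above. The names `effective_type` and `find_schema_drift` are hypothetical and are not taken from the PR; the sample data only mirrors the single column discussed in this issue.

```python
from typing import Any, Dict, Tuple


def effective_type(prop: Dict[str, Any]) -> str:
    """Prefer "airbyte_type" when present, since it refines the JSON "type"."""
    return prop.get("airbyte_type", prop.get("type", "unknown"))


def find_schema_drift(
    catalog_props: Dict[str, Dict[str, Any]],
    source_props: Dict[str, Dict[str, Any]],
) -> Dict[str, Tuple[str, str]]:
    """Map each changed column to (type in saved catalog, type discovered in source)."""
    drift = {}
    for name in sorted(catalog_props.keys() | source_props.keys()):
        old = effective_type(catalog_props[name]) if name in catalog_props else "<missing>"
        new = effective_type(source_props[name]) if name in source_props else "<missing>"
        if old != new:
            drift[name] = (old, new)
    return drift


# Saved catalog from the initial sync vs. the schema discovered after the change.
catalog = {
    "segment": {"type": "string"},
    "2023-01-02": {"type": "number", "airbyte_type": "integer"},
}
discovered = {
    "segment": {"type": "string"},
    "2023-01-02": {"type": "number"},
}
drift = find_schema_drift(catalog, discovered)
```

Here `drift` contains only the `2023-01-02` column, reported as changing from `integer` to `number`, which is the point at which a sync could fail early with a clear message instead of failing later in normalization.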
