Source MSSQL to S3 Failed to convert JSON to Avro #5609

Closed
Tracked by #6994
marcosmarxm opened this issue Aug 25, 2021 · 6 comments · Fixed by #7386

Comments

@marcosmarxm
Member

Environment

  • Is this your first time deploying Airbyte: no
  • OS Version / Instance: Linux EC2 t3.xlarge
  • Deployment: Docker
  • Airbyte Version: 0.29.11-alpha
  • Source name: MSSQL 0.3.4 (latest)
  • destination: S3 0.1.3 (latest)
  • Description: I tried to ingest data from an MSSQL database into an S3 bucket using CDC mode. The job was failing with: Exception in thread "main" tech.allegro.schema.json2avro.converter.AvroConversionException: Failed to convert JSON to Avro: Could not evaluate union, field dtm_Creation_Date is expected to be one of these: NULL, STRING. If this is a complex type, check if offending field: dtm_Creation_Date adheres to schema

Current Behavior

Tell us what happens.

Expected Behavior

Tell us what should happen.

Logs

If applicable, please upload the logs from the failing operation.
For sync jobs, you can download the full logs from the UI by going to the sync attempt page and
clicking the download logs button at the top right of the logs display window.

2021-08-23 04:37:55 ERROR () LineGobbler(voidCall):85 - Exception in thread "main" tech.allegro.schema.json2avro.converter.AvroConversionException: Failed to convert JSON to Avro: Could not evaluate union, field dtm_Creation_Date is expected to be one of these: NULL, STRING. If this is a complex type, check if offending field: dtm_Creation_Date adheres to schema.
2021-08-23 04:37:55 ERROR () LineGobbler(voidCall):85 -    at tech.allegro.schema.json2avro.converter.JsonGenericRecordReader.read(JsonGenericRecordReader.java:60)
2021-08-23 04:37:55 ERROR () LineGobbler(voidCall):85 -    at tech.allegro.schema.json2avro.converter.JsonGenericRecordReader.read(JsonGenericRecordReader.java:49)
2021-08-23 04:37:55 ERROR () LineGobbler(voidCall):85 -    at tech.allegro.schema.json2avro.converter.JsonAvroConverter.convertToGenericDataRecord(JsonAvroConverter.java:58)
2021-08-23 04:37:55 ERROR () LineGobbler(voidCall):85 -    at io.airbyte.integrations.destination.s3.avro.AvroRecordFactory.getAvroRecord(AvroRecordFactory.java:63)
2021-08-23 04:37:55 ERROR () LineGobbler(voidCall):85 -    at io.airbyte.integrations.destination.s3.parquet.S3ParquetWriter.write(S3ParquetWriter.java:110)
2021-08-23 04:37:55 ERROR () LineGobbler(voidCall):85 -    at io.airbyte.integrations.destination.s3.S3Consumer.acceptTracked(S3Consumer.java:133)
2021-08-23 04:37:55 ERROR () LineGobbler(voidCall):85 -    at io.airbyte.integrations.base.FailureTrackingAirbyteMessageConsumer.accept(FailureTrackingAirbyteMessageConsumer.java:66)
2021-08-23 04:37:55 ERROR () LineGobbler(voidCall):85 -    at io.airbyte.integrations.base.IntegrationRunner.consumeWriteStream(IntegrationRunner.java:167)
2021-08-23 04:37:55 ERROR () LineGobbler(voidCall):85 -    at io.airbyte.integrations.base.IntegrationRunner.run(IntegrationRunner.java:148)
2021-08-23 04:37:55 ERROR () LineGobbler(voidCall):85 -    at io.airbyte.integrations.destination.s3.S3Destination.main(S3Destination.java:49)
2021-08-23 04:37:55 ERROR () LineGobbler(voidCall):85 - Caused by: org.apache.avro.AvroTypeException: Could not evaluate union, field dtm_Creation_Date is expected to be one of these: NULL, STRING. If this is a complex type, check if offending field: dtm_Creation_Date adheres to schema.
2021-08-23 04:37:55 ERROR () LineGobbler(voidCall):85 -    at tech.allegro.schema.json2avro.converter.AvroTypeExceptions.unionException(AvroTypeExceptions.java:28)
2021-08-23 04:37:55 ERROR () LineGobbler(voidCall):85 -    at tech.allegro.schema.json2avro.converter.JsonGenericRecordReader.readUnion(JsonGenericRecordReader.java:162)
2021-08-23 04:37:55 ERROR () LineGobbler(voidCall):85 -    at tech.allegro.schema.json2avro.converter.JsonGenericRecordReader.read(JsonGenericRecordReader.java:98)
2021-08-23 04:37:55 ERROR () LineGobbler(voidCall):85 -    at tech.allegro.schema.json2avro.converter.JsonGenericRecordReader.lambda$readRecord$0(JsonGenericRecordReader.java:71)
2021-08-23 04:37:55 ERROR () LineGobbler(voidCall):85 -    at java.base/java.util.LinkedHashMap$LinkedEntrySet.forEach(LinkedHashMap.java:710)
2021-08-23 04:37:55 ERROR () LineGobbler(voidCall):85 -    at tech.allegro.schema.json2avro.converter.JsonGenericRecordReader.readRecord(JsonGenericRecordReader.java:68)
2021-08-23 04:37:55 ERROR () LineGobbler(voidCall):85 -    at tech.allegro.schema.json2avro.converter.JsonGenericRecordReader.read(JsonGenericRecordReader.java:58)
2021-08-23 04:37:55 ERROR () LineGobbler(voidCall):85 -    ... 9 more

Steps to Reproduce

Are you willing to submit a PR?

Remove this with your answer.

@marcosmarxm added the type/bug, area/connectors, and lang/java labels Aug 25, 2021
@sherifnada
Contributor

Slack thread: https://airbytehq.slack.com/archives/C01MFR03D5W/p1629693918005800
the SQL type is smalldatetime

@teslahenry

More details: it only fails when I choose the Parquet format for the S3 destination. (It works with the Avro file format.)

@tuliren tuliren self-assigned this Aug 27, 2021
@tuliren
Contributor

tuliren commented Aug 27, 2021

It only fails when I choose the Parquet format for the S3 destination. (It works with the Avro file format.)

Hmm, this is very interesting. We need to check what the smalldatetime value looks like coming from MSSQL. It could even be a bug in the AvroParquetWriter.

@tuliren
Contributor

tuliren commented Aug 29, 2021

From Luat Nguyen Cong:

I debugged this and found that the cause is a mismatch between the Avro schema and the actual CDC event. For example, MSSQL's smalldatetime is mapped to the ["null", "string"] type in the Avro schema, but the value arrives as a double in the CDC event.

MSSQL's money type also causes the same error.
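
To make the mismatch described above concrete, here is a minimal sketch of how a union check can reject a numeric value when the schema only declares ["null", "string"]. This is not the json2avro implementation; the class name, the method names, and the sample values are all hypothetical and chosen only for illustration.

```java
import java.util.List;
import java.util.Optional;

public class UnionMismatchSketch {

    // Returns true when a decoded JSON value fits the named primitive branch.
    // Only the three branches relevant to this issue are modelled (assumption).
    public static boolean fits(String branch, Object value) {
        switch (branch) {
            case "null":   return value == null;
            case "string": return value instanceof String;
            case "double": return value instanceof Double;
            default:       return false; // other branch types are out of scope here
        }
    }

    // Returns the first union branch that accepts the value, or empty if none does.
    public static Optional<String> resolve(List<String> union, Object value) {
        return union.stream().filter(branch -> fits(branch, value)).findFirst();
    }

    public static void main(String[] args) {
        // Union declared in the schema for the smalldatetime column.
        List<String> declared = List.of("null", "string");

        // Made-up sample values: a string matches, a double does not.
        System.out.println(resolve(declared, "2021-08-23 04:37:00").isPresent());
        System.out.println(resolve(declared, 1629693420000.0).isPresent());
    }
}
```

In this sketch the double finds no matching branch, which corresponds to the "Could not evaluate union" situation reported in the logs: the schema and the event disagree on the field's type.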

@yurii-bidiuk
Contributor

#7386

yurii-bidiuk added a commit that referenced this issue Nov 5, 2021
…from mssql source (#5609) (#7386)

* Fix data type (smalldatetime, smallmoney) conversion from mssql source (#5609)

* Fixed code format

* Bumb new version

* Update documentation (mssql.md)

* formating

* fixed converter properties

* aligned converter utils with #7339

Co-authored-by: Andrii Leonets <30464745+DoNotPanicUA@users.noreply.github.com>
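
For readers wondering what "fixing the conversion" can mean in practice, the sketch below shows one way a numeric temporal value could be turned into a string so that it agrees with a ["null", "string"] schema. It is not the change made in #7386. The class and method names are hypothetical, and the sketch assumes, purely for illustration, that the numeric value is epoch milliseconds in UTC; this thread does not confirm what the double actually encodes.

```java
import java.time.Instant;

public class EpochToStringSketch {

    // Coerces a value so it fits a ["null", "string"] union.
    // ASSUMPTION: numeric inputs are epoch milliseconds in UTC.
    public static String toNullableString(Object value) {
        if (value == null) {
            return null; // fits the "null" branch as-is
        }
        if (value instanceof String) {
            return (String) value; // already fits the "string" branch
        }
        if (value instanceof Number) {
            long epochMillis = ((Number) value).longValue();
            return Instant.ofEpochMilli(epochMillis).toString(); // ISO-8601 text
        }
        throw new IllegalArgumentException("Unsupported value type: " + value.getClass());
    }

    public static void main(String[] args) {
        System.out.println(toNullableString(0.0));  // prints 1970-01-01T00:00:00Z
        System.out.println(toNullableString(null)); // prints null
    }
}
```

Whichever side performs the conversion, the point is the same: the value written to the record has to match the type the schema declares for that field.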
schlattk pushed a commit to schlattk/airbyte that referenced this issue Jan 4, 2022
…from mssql source (airbytehq#5609) (airbytehq#7386)

@mp-pinheiro

#12949
