BE: Serde: Impl Avro type for timestamp-nanos and `local-timestamp-… #877

Open: wernerdv wants to merge 3 commits into main

Conversation

wernerdv
Contributor

…nanos` logical types

  • Breaking change? (if so, please describe the impact and migration path for existing application instances)

What changes did you make? (Give an overview)
Resolve #872

Is there anything you'd like reviewers to focus on?

How Has This Been Tested? (put an "x" (case-sensitive!) next to an item)

  • No need to
  • Manually (please, describe, if necessary)
  • Unit checks
  • Integration checks
  • Covered by existing automation

Checklist (put an "x" (case-sensitive!) next to all the items, otherwise the build will fail)

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation (e.g. ENVIRONMENT VARIABLES)
  • My changes generate no new warnings (e.g. Sonar is happy)
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged

Check out Contributing and Code of Conduct

A picture of a cute animal (not mandatory but encouraged)

@wernerdv wernerdv requested a review from a team as a code owner February 28, 2025 19:55
@kapybro kapybro bot added the labels status/triage (Issues pending maintainers triage), area/serde (Serialization & Deserialization (plugins)), status/triage/manual (Manual triage in progress), and status/triage/completed (Automatic triage completed), and removed the label status/triage (Issues pending maintainers triage), Feb 28, 2025
@wernerdv
Contributor Author

@Haarolean I encountered an error in the test SchemaRegistrySerdeTest#avroLogicalTypesRepresentationIsConsistentForSerializationAndDeserialization:

java.lang.ClassCastException: value 2007-12-03T10:15:30.123456789Z (a java.time.Instant) cannot be cast to expected type long at TestAvroRecord.lt_timestamp_nanos

I don't know how to fix this yet.
Do you have any ideas on how to solve this problem?

@Haarolean
Member

@wernerdv
According to org.apache.avro.data.TimeConversions there is a method to convert micros/millis (see TimestampMicrosConversion) but none for nanos.
That's why we have to explicitly put a long here instead of relying on non-existent auto-conversion for nanos:
inputRecord.put("lt_timestamp_nanos", Instant.parse("2007-12-13T10:15:30.123456789Z"));
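For illustration, here is a minimal sketch of what "putting a long" could look like. It assumes the Avro specification's definition of timestamp-nanos as a long counting nanoseconds since the Unix epoch; the class and method names (TimestampNanosSketch, toEpochNanos, fromEpochNanos) are hypothetical and not part of this PR.

```java
import java.time.Instant;

// Hypothetical helper, not part of the PR: converts between Instant and the
// raw long that a timestamp-nanos field stores (nanoseconds since the epoch).
final class TimestampNanosSketch {
    private static final long NANOS_PER_SECOND = 1_000_000_000L;

    // Throws ArithmeticException if the instant does not fit into a long.
    static long toEpochNanos(Instant instant) {
        long wholeSeconds = Math.multiplyExact(instant.getEpochSecond(), NANOS_PER_SECOND);
        return Math.addExact(wholeSeconds, instant.getNano());
    }

    // floorDiv/floorMod keep the nano part non-negative for pre-epoch values.
    static Instant fromEpochNanos(long epochNanos) {
        long seconds = Math.floorDiv(epochNanos, NANOS_PER_SECOND);
        long nanos = Math.floorMod(epochNanos, NANOS_PER_SECOND);
        return Instant.ofEpochSecond(seconds, nanos);
    }

    public static void main(String[] args) {
        Instant ts = Instant.parse("2007-12-03T10:15:30.123456789Z");
        long raw = toEpochNanos(ts);
        System.out.println(raw);
        System.out.println(fromEpochNanos(raw));
    }
}
```

With such a helper, the test line could become something like inputRecord.put("lt_timestamp_nanos", TimestampNanosSketch.toEpochNanos(Instant.parse("2007-12-03T10:15:30.123456789Z"))), so the record carries the underlying long rather than an Instant.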

Member

@Haarolean Haarolean left a comment

test fixes needed :)

@Haarolean Haarolean added the labels type/enhancement (An enhancement/improvement to an already existing feature) and scope/backend (Related to backend changes), and removed the label status/triage/manual (Manual triage in progress), Mar 10, 2025
@wernerdv wernerdv requested a review from a team as a code owner March 10, 2025 19:31
@wernerdv
Contributor Author

wernerdv commented Mar 11, 2025

Avro 1.12.0 does ship conversions for nanos: TimeConversions.TimestampNanosConversion and TimeConversions.LocalTimestampNanosConversion.

It seems the issue is that Serialize#serializeAvro uses AvroSchemaUtils.getDatumWriter from the Confluent library, which does not register the TimeConversions.TimestampNanosConversion and TimeConversions.LocalTimestampNanosConversion logical type conversions.
That is why the SchemaRegistrySerdeTest#avroLogicalTypesRepresentationIsConsistentForSerializationAndDeserialization test fails.

We could copy the AvroSchemaUtils class and add the nanos conversions to its static addLogicalTypeConversion method.
However, this doesn't seem like the best solution.
What do you think?
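To make the suspected failure mode concrete, here is a toy model in plain Java. It is not Avro or Confluent code: ConversionLookupSketch and its methods are hypothetical stand-ins for a writer that looks up a conversion by logical type name and falls back to expecting the raw long when none is registered.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Toy model only (hypothetical names): mimics a writer with a table of
// logical type conversions keyed by logical type name.
final class ConversionLookupSketch {
    private final Map<String, Function<Instant, Long>> conversions = new HashMap<>();

    void addConversion(String logicalTypeName, Function<Instant, Long> toLong) {
        conversions.put(logicalTypeName, toLong);
    }

    // Raw longs pass through; an Instant needs a registered conversion.
    long write(String logicalTypeName, Object datum) {
        if (datum instanceof Long) {
            return (Long) datum;
        }
        Function<Instant, Long> conversion = conversions.get(logicalTypeName);
        if (conversion == null || !(datum instanceof Instant)) {
            throw new ClassCastException("value " + datum + " (a "
                    + datum.getClass().getName() + ") cannot be cast to expected type long");
        }
        return conversion.apply((Instant) datum);
    }

    public static void main(String[] args) {
        ConversionLookupSketch writer = new ConversionLookupSketch();
        // Only micros is registered at first, as in the situation described above.
        writer.addConversion("timestamp-micros", i -> ChronoUnit.MICROS.between(Instant.EPOCH, i));
        Instant ts = Instant.parse("2007-12-03T10:15:30.123456789Z");
        try {
            writer.write("timestamp-nanos", ts);
        } catch (ClassCastException e) {
            System.out.println("without nanos conversion: " + e.getMessage());
        }
        // Registering the missing conversion makes the same call succeed.
        writer.addConversion("timestamp-nanos", i -> ChronoUnit.NANOS.between(Instant.EPOCH, i));
        System.out.println("with nanos conversion: " + writer.write("timestamp-nanos", ts));
    }
}
```

In the real code the equivalent step would be registering the two nanos conversions on whatever data model the datum writer uses; whether that is done by copying AvroSchemaUtils or by an upstream fix is the open question in this thread.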

@Haarolean
Member

Could you please check Confluent's issues on GitHub (or wherever they track them) to find out whether they plan to support these conversions? If yes, we could wait for a new library version. If not, a temporary solution like the one you suggested could be implemented instead.

@wernerdv
Contributor Author

wernerdv commented Mar 11, 2025

I found a closed issue.
There is a regression in Avro 1.12.0 (see the comment: confluentinc/schema-registry#3318 (comment)).
I think the developers at Confluent will update Avro to version 1.12.1 as soon as it becomes available.
Given that the regression was detected in Avro 1.12.0, does it make sense to upgrade to that version now as part of this task?

@Haarolean
Member

sure, let's try!

Labels
  • area/serde: Serialization & Deserialization (plugins)
  • scope/backend: Related to backend changes
  • status/triage/completed: Automatic triage completed
  • type/enhancement: An enhancement/improvement to an already existing feature
Development

Successfully merging this pull request may close these issues.

Serde: Impl Avro type for timestamp-nanos logical type
2 participants