-
Notifications
You must be signed in to change notification settings - Fork 553
feat(protobuf): enhance Protobuf data handling with LogicalMap support and enhance test coverage #3020
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR enhances Protobuf-to-Avro data handling by introducing LogicalMap support for proper MAP type preservation and improving union schema handling. The changes include:
- A new
LogicalMapProtobufDataclass that annotates Protobuf map fields with Iceberg's LogicalMap logical type - Refactored
ProtoToAvroConverterwith improved union unwrapping and timestamp conversion - Enhanced
RecordBinderwith better union validation and modularized binder precomputation - Comprehensive test coverage additions for Protobuf conversion scenarios
Key Changes
- LogicalMap Support: Protobuf map fields now preserve MAP semantics instead of defaulting to ARRAY<record<key,value>>
- Union Handling: Improved validation to reject non-optional unions with multiple non-NULL types, with clear error messages
- Test Coverage: Added ~900 lines of new tests covering edge cases, optional fields, oneofs, and complex nested structures
Reviewed Changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 13 comments.
Show a summary per file
| File | Description |
|---|---|
LogicalMapProtobufData.java |
New class extending ProtobufData to annotate map fields with LogicalMap logical type |
ProtoToAvroConverter.java |
Refactored conversion logic with consistent union unwrapping and improved timestamp handling |
RecordBinder.java |
Enhanced union resolution with validation and modularized binder creation for nested types |
ProtobufRegistryConverter.java |
Updated to use LogicalMapProtobufData singleton |
CodecSetup.java |
Added public accessor for LogicalMap instance |
ProtobufRegistryConverterUnitTest.java |
New comprehensive test suite (719 lines) |
ProtobufRegistryConverterTest.java |
Added map field test case |
ProtoToAvroConverterTest.java |
New unit tests for converter edge cases |
AvroRecordBinderTypeTest.java |
New comprehensive type conversion tests (1000 lines) |
AvroRecordBinderTest.java |
Refactored to focus on specific binding scenarios |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
core/src/main/java/kafka/automq/table/process/convert/LogicalMapProtobufData.java
Outdated
Show resolved
Hide resolved
core/src/main/java/kafka/automq/table/process/convert/ProtoToAvroConverter.java
Show resolved
Hide resolved
core/src/test/java/kafka/automq/table/process/convert/ProtobufRegistryConverterUnitTest.java
Show resolved
Hide resolved
core/src/test/java/kafka/automq/table/process/convert/ProtoToAvroConverterTest.java
Outdated
Show resolved
Hide resolved
core/src/test/java/kafka/automq/table/binder/AvroRecordBinderTest.java
Outdated
Show resolved
Hide resolved
core/src/test/java/kafka/automq/table/binder/AvroRecordBinderTypeTest.java
Show resolved
Hide resolved
core/src/test/java/kafka/automq/table/binder/AvroRecordBinderTypeTest.java
Show resolved
Hide resolved
0fe2f1e to
d358ff1
Compare
…t and enhance test coverage
d358ff1 to
ebe6e3a
Compare
…t and enhance test coverage (#3020)
Protobuf to Avro Conversion Enhancements
LogicalMapProtobufData: This new class ensures that Protobuf map fields are annotated with Iceberg's LogicalMap logical type, allowing downstream conversion to preserve MAP semantics instead of defaulting to ARRAY<record<key,value>>.ProtoToAvroConverter: Now uses the newLogicalMapProtobufDatasingleton, applies logical type conversions (e.g., for timestamps), and consistently unwraps union schemas to their non-NULL element for conversion. Also improves handling of optional and oneof fields, ensuring correct Avro nullability and presence semantics. [1] [2]Avro Schema Union Handling and Binder Precomputation
RecordBinder: Now unions are only unwrapped if they contain exactly one non-NULL type, with clear error handling for unsupported multi-type unions. This prevents incorrect assumptions about union schemas and improves error messages. [1] [2]