forked from apache/arrow
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
apacheGH-35352: [Java] Fix issues with "semi complex" types. (apache#…
…35353) ### Rationale for this change "semi complex" types like `TimeStamp*TZ`, `Duration`, and `FixedSizeBinary` were missing implementations in `UnionListWriter`, `UnionVector`, `UnionReader` and other associated classes. This patch adds these missing methods so that these types can now be written to things like `ListVector`s, whereas before it would throw an exception because the methods were just not implemented. For example, without this patch, one of the new tests added would fail: ``` TestListVector.testWriterGetTimestampMilliTZField:913 ? IllegalArgument You tried to write a TimeStampMilliTZ type when you are using a ValueWriter of type UnionListWriter. ``` There are also fixes for get and set methods for holders for the respective `*Vector`s classes for these types: - The get methods did not set fields like `TimeStampMilliTZHolder.timezone`, `DurationHolder.unit`, `FixedSizeBinaryHolder.byteWidth`. - The set methods did not all validate that those fields matched what the vector's `ArrowType` was set to. For example `TimeStampMilliTZHolder.timezone` should match `ArrowType.Timestamp.timezone` on the vector and should throw if it doesn't. Otherwise users would never get a signal that there is anything wrong with their code writing these holders with mismatching values. This patch additionally marks some of the existing interfaces for writing these "semi complex" types as deprecated, because they do not actually provide enough context to properly write these parameterized `ArrowTypes`. Instead, users should use the write methods that take `*Holders` that do provide enough context for the writers to properly write these types. Also note that the equivalent write methods for `Decimal` have already also been marked as deprecated, for the same reasoning. ### What changes are included in this PR? - Various template changes. See this gist for a diff of the generated files: https://gist.github.com/henrymai/03b0f5a4165cd9822aa797c89bbd74e9 - Note that the diff for the ComplexWriter.java generated ones is not included since that change is easy to see from just the template. - Additional tests to verify that this is fixed. ### Are these changes tested? Yes, added unit tests and they pass. ### Are there any user-facing changes? Yes, users will now be able to write to types like TimeStamp*TZ, Duration, and FixedSizeBinary on ListVector, MapVector, StructVector due to added implementations for these types on UnionListVector, UnionWriter. This also marks the non-holder write interfaces for these types as `@ Deprecated`. The interfaces themselves have not been removed, so this part is not a breaking change. Also as mentioned above, this follows in the footsteps of what `Decimal*` has already done, where they marked the non holder write interfaces as `@ Deprecated`. Additionally another user visible change is that when reading these types into a holder all of the fields are now populated. For example `TimeStamp*TZHolder.timezone` is now populated when it was not before. **This PR includes breaking changes to public APIs.** We now throw an exception when some of the holder fields that correspond to the Vector's ArrowType parameters do not match. This is a breaking change because previously this would silently accept the writes but was actually incorrect, since the element written does not correspond to the right ArrowType variant due to mismatching parameters (like ArrowType.Timestamp.unit, ArrowType.Duration.unit, and ArrowType.FixedSizeBinary.byteWidth). * Closes: apache#35352 Authored-by: Henry Mai <henrymai@users.noreply.github.com> Signed-off-by: David Li <li.davidm96@gmail.com>
- Loading branch information
Showing
19 changed files
with
909 additions
and
149 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.