Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Time32 and Time64 for Type Coercion #4104

Conversation

andre-cc-natzka
Copy link
Contributor

This PR aims at extending the Type Coercion optimization rule from Datafusion to support time types (Time32 and Time64), which is required for where clauses involving fields of these types.

The main changes can be found at datafusion/expr/src/type_coercion/binary.rs, in the method temporal_coercion, which is adapted to ensure that, when getting a pair of types with a string and a time type, the later is taken as coerced type. Given that both Time32 and Time64 have an argument to specify the TimeUnit, one must be careful. In particular, the implementation must be consistent with the TimeUnits supported by Arrow, which include Second and Millisecond for Time32, and Microsecond and Nanosecond for Time64. These four cases are taken into account in temporal_coercion by calling a function check_time_unit that checks if the TimeUnit is consistent with the type and returns an option to it. The changes in binary.rs are checked in new tests, which are all passing.

These modifications required a bunch of adaptations to be implemented in the datafusion code so as to consistently support Time32(TimeUnit::Second), Time32(TimeUnit::Millisecond), Time64(TimeUnit::Microsecond) and Time64(TimeUnit::Nanosecond). The changes are straightforward in general, except for the ones in the proto crate. I chose not to change the datafusion.proto file, which only supports a generic Time64 type, with no unit specification, and does not support Time32 at all. This is not a problem in itself, however the datafusion/proto/src/to_proto.rs and datafusion/proto/src/from_proto.rs files have to be adapted. Right now, the proto Time64 translates into a Time64Nanosecond, as specified in the datafusion/proto/src/from_proto.rs file. On the other hand, a proto Time64 can be obtained from all four Time64Nanosecond, Time64Microsecond, Time32Millisecond and Time32Second scalar value types, as implemented in datafusion/proto/src/to_proto.rs. This makes sense to me, but I am not sure it is the best solution. Perhaps we could change the datafusion.proto?

@andre-cc-natzka andre-cc-natzka marked this pull request as draft November 4, 2022 08:33
@github-actions github-actions bot added logical-expr Logical plan and expressions physical-expr Physical Expressions labels Nov 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
logical-expr Logical plan and expressions physical-expr Physical Expressions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant