fix: cast Binary/String dictionary to view #8912
base: main
    // TODO: handle LargeUtf8/LargeBinary -> View (need to check offsets can fit)
    // TODO: handle cross types (String -> BinaryView, Binary -> StringView)
    // (need to validate utf8?)
    (Utf8, Utf8View) => view_from_dict_values::<K, Utf8Type, StringViewType>(
        array.keys(),
        array.values().as_string::<i32>(),
    ),
    (Binary, BinaryView) => view_from_dict_values::<K, BinaryType, BinaryViewType>(
        array.keys(),
        array.values().as_binary::<i32>(),
    ),
This is the main change: we now check the values beforehand to see whether they are valid for the fast path. That check was missing before, and the unchecked assumption led to an error. I've intentionally kept the behaviour as originally intended (limiting the fast path to Dictionary<Utf8> -> Utf8View and Dictionary<Binary> -> BinaryView), since I was mainly interested in just fixing the cast bug. I left comments about potentially extending this fast path to more combinations (I can raise issues for this).
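As a rough illustration of the check described above, here is a minimal, std-only sketch of the dispatch. `ValueType`, `CastPath`, and `choose_path` are hypothetical names invented for this example and are not part of arrow-rs. The sketch assumes that every combination outside the two supported pairs falls back to the general unpacking path.

```rust
// Hypothetical, simplified stand-in for the relevant data types.
// This is NOT arrow's real `DataType` enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ValueType {
    Utf8,
    LargeUtf8,
    Binary,
    LargeBinary,
    Utf8View,
    BinaryView,
}

// Which route a Dictionary<_, value_type> -> to_type cast takes in this sketch.
#[derive(Debug, PartialEq, Eq)]
enum CastPath {
    // Build views directly from the dictionary keys and values.
    ViewFastPath,
    // Assumed fallback: unpack the dictionary into a flat array.
    Unpack,
}

// Only the two same-family, 32-bit-offset combinations take the fast path.
// Large* and cross-type (string <-> binary) combinations are left to the
// fallback, mirroring the TODO comments in the diff.
fn choose_path(value_type: ValueType, to_type: ValueType) -> CastPath {
    use ValueType::*;
    match (value_type, to_type) {
        (Utf8, Utf8View) | (Binary, BinaryView) => CastPath::ViewFastPath,
        _ => CastPath::Unpack,
    }
}
```

Matching on the pair of source value type and target type keeps the fast path opt-in, so a new combination has to be added explicitly rather than being assumed valid.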
    match to_type {
        Dictionary(to_index_type, to_value_type) => {
            let dict_array = array
I moved this inner logic into a separate function to be similar to how the other arms delegate to functions for the full logic.
    array: &dyn Array,
    // Unpack a dictionary into a flattened array of type to_type
    pub(crate) fn unpack_dictionary<K: ArrowDictionaryKeyType>(
        array: &DictionaryArray<K>,
Moved the cast to dictionary array to the parent function (dictionary_cast()) so we don't need to always cast in each of the specialized functions.
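The shape of this refactor can be sketched with the standard library alone. `ToyDictionary` and the bodies below are hypothetical stand-ins, not the arrow-rs implementation. Only the names `dictionary_cast` and `unpack_dictionary` come from the PR, and their real signatures differ. The point is that the parent function downcasts once and the specialized function receives the concrete type.

```rust
use std::any::Any;

// Hypothetical toy dictionary array: nullable keys indexing into `values`.
struct ToyDictionary {
    keys: Vec<Option<usize>>,
    values: Vec<String>,
}

// Parent function: performs the downcast exactly once, then delegates.
fn dictionary_cast(array: &dyn Any) -> Result<Vec<Option<String>>, String> {
    let dict = array
        .downcast_ref::<ToyDictionary>()
        .ok_or_else(|| "expected a dictionary array".to_string())?;
    unpack_dictionary(dict)
}

// Specialized function: takes the concrete type, so it needs no downcast.
// It flattens the dictionary by looking up each key in `values`.
fn unpack_dictionary(dict: &ToyDictionary) -> Result<Vec<Option<String>>, String> {
    dict.keys
        .iter()
        .map(|key| match key {
            // A null key stays null in the flattened output.
            None => Ok(None),
            Some(k) => dict
                .values
                .get(*k)
                .cloned()
                .map(Some)
                .ok_or_else(|| format!("key {} out of bounds", k)),
        })
        .collect()
}
```

With this arrangement, any further specialized function added alongside `unpack_dictionary` can accept the already-typed dictionary instead of repeating the downcast and its error handling.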
Which issue does this PR close?
Rationale for this change
Be able to successfully cast from Dictionary type to View types.
What changes are included in this PR?
Add checks on which array types can use the fast path that was previously erroring.
Also do a little refactoring in surrounding code.
Are these changes tested?
Added new tests.
Are there any user-facing changes?
No.