Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add freeze_with_dictionary API to MutableArrayData #2915

Closed
wants to merge 1 commit into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
12 changes: 12 additions & 0 deletions arrow-data/src/transform/mod.rs
Expand Up @@ -637,6 +637,18 @@ impl<'a> MutableArrayData<'a> {
unsafe { self.data.freeze(self.dictionary).build_unchecked() }
}

/// Creates a [ArrayData] from the pushed regions up to this point, consuming `self`,
/// with the provided dictionary values `ArrayData`, without any validation
///
/// # Safety
///
/// As this doesn't validate the provided dictionary `ArrayData` values, the input
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not entirely sure this will work as expected, MutableArrayData has some pretty hard assumptions when it comes to dictionaries - it concatenates the dictionary arrays, and computes new keys based on this assumption

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for the values of dictionaries. What MutableArrayData does is simply concatenates the value arrays with another MutableArrayData. The caller must ensure it provides correct concatenated dictionary value array.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm? That is also where I looked at. For dictionaries, it uses another MutableArrayData to extend from their value arrays. Isn't it?

/// dictionary values *must* form a valid Arrow dictionary array, or undefined behavior
/// can results.
pub unsafe fn freeze_with_dictionary(self, dictionary: ArrayData) -> ArrayData {
self.data.freeze(Some(dictionary)).build_unchecked()
}

/// Creates a [ArrayDataBuilder] from the pushed regions up to this point, consuming `self`.
/// This is useful for extending the default behavior of MutableArrayData.
pub fn into_builder(self) -> ArrayDataBuilder {
Expand Down