Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add finish_cloned to ArrayBuilder #3158

Merged
merged 5 commits into from
Nov 23, 2022
Merged

Add finish_cloned to ArrayBuilder #3158

merged 5 commits into from
Nov 23, 2022

Conversation

askoa
Copy link
Contributor

@askoa askoa commented Nov 22, 2022

Which issue does this PR close?

Closes #3154

Are there any user-facing changes?

Yes, users will get a new function finish_cloned to get snapshot of data in ArrayBuilder without resetting the builder.

@github-actions github-actions bot added the arrow Changes to the arrow crate label Nov 22, 2022
@tustvold
Copy link
Contributor

tustvold commented Nov 22, 2022

I wonder if instead of MutableBuffer: Clone and adding finish_cloned to BufferBuilder, etc... we could just make use of https://docs.rs/arrow-buffer/latest/arrow_buffer/struct.Buffer.html#method.from_slice_ref?

e.g.

let buffer = Buffer::from_slice_ref(self.values_builder.as_slice());

What do you think?

@askoa
Copy link
Contributor Author

askoa commented Nov 22, 2022

I wonder if instead of MutableBuffer: Clone and adding finish_cloned to BufferBuilder, etc... we could just make use of https://docs.rs/arrow-buffer/latest/arrow_buffer/struct.Buffer.html#method.from_slice_ref?

e.g.

let buffer = Buffer::from_slice_ref(self.values_builder.as_slice());

What do you think?

I made the change locally and getting below error

Error for BooleanBufferBuilder:

"the size for values of type `[u8]` cannot be known at compilation time\nthe trait `Sized` is not implemented for `[u8]`"

Error for BufferBuilder:

"the size for values of type `[T]` cannot be known at compilation time\nthe trait `Sized` is not implemented for `[T]`"

@tustvold
Copy link
Contributor

Aah yeah, the signature for Buffer::from_slice_ref is a little strange... Buffer::from_slice_ref(&self.values_builder.as_slice()); should work

@askoa
Copy link
Contributor Author

askoa commented Nov 22, 2022

Aah yeah, the signature for Buffer::from_slice_ref is a little strange... Buffer::from_slice_ref(&self.values_builder.as_slice()); should work

The suggestion works. I'll send complete PR.

Edit: I think finish_cloned function needs to be added to null_buffer_builder as null_buffer_builder.bitmap_builder might not be accessible outside the struct.

@tustvold
Copy link
Contributor

Edit: I think finish_cloned function needs to be added to null_buffer_builder as null_buffer_builder.bitmap_builder might not be accessible outside the struct.

I'd suggest adding something like

    /// Returns the packed bits
    pub fn as_slice(&self) -> Option<&[u8]> {
        Some(self.bitmap_builder?.as_slice())
    }

@askoa askoa changed the title <WIP> Add finish_cloned to ArrayBuilder Add finish_cloned to ArrayBuilder Nov 22, 2022
@askoa askoa marked this pull request as ready for review November 22, 2022 20:00
Copy link
Contributor

@tustvold tustvold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some minor nits

arrow-array/src/builder/map_builder.rs Outdated Show resolved Hide resolved
arrow-array/src/builder/generic_list_builder.rs Outdated Show resolved Hide resolved
arrow-array/src/builder/fixed_size_list_builder.rs Outdated Show resolved Hide resolved
arrow-array/src/builder/fixed_size_list_builder.rs Outdated Show resolved Hide resolved
@tustvold tustvold merged commit 6c466af into apache:master Nov 23, 2022
@ursabot
Copy link

ursabot commented Nov 23, 2022

Benchmark runs are scheduled for baseline = 12a67b9 and contender = 6c466af. 6c466af is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@askoa askoa deleted the finish-cloned branch December 1, 2022 15:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add ArrayBuilder::finish_cloned()
3 participants