Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow primitive array creation from iterators of PrimitiveTypes (as well as Option) #1367

Merged
merged 4 commits into from
Mar 3, 2022

Conversation

viirya
Copy link
Member

@viirya viirya commented Feb 26, 2022

Which issue does this PR close?

Closes #1298.

Rationale for this change

What changes are included in this PR?

Are there any user-facing changes?

@github-actions github-actions bot added the arrow Changes to the arrow crate label Feb 26, 2022
@codecov-commenter
Copy link

codecov-commenter commented Feb 26, 2022

Codecov Report

Merging #1367 (53404ee) into master (4d82e24) will increase coverage by 0.05%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1367      +/-   ##
==========================================
+ Coverage   82.98%   83.04%   +0.05%     
==========================================
  Files         181      181              
  Lines       52932    53025      +93     
==========================================
+ Hits        43926    44033     +107     
+ Misses       9006     8992      -14     
Impacted Files Coverage Δ
arrow/src/array/array_primitive.rs 94.83% <100.00%> (+0.14%) ⬆️
arrow/src/array/transform/mod.rs 84.39% <0.00%> (-0.14%) ⬇️
arrow/src/compute/kernels/temporal.rs 97.22% <0.00%> (-0.13%) ⬇️
parquet_derive/src/parquet_field.rs 66.21% <0.00%> (+0.22%) ⬆️
arrow/src/compute/kernels/comparison.rs 92.45% <0.00%> (+0.27%) ⬆️
arrow/src/array/array_binary.rs 93.49% <0.00%> (+0.28%) ⬆️
arrow/src/datatypes/datatype.rs 66.80% <0.00%> (+0.39%) ⬆️
parquet/src/file/metadata.rs 94.42% <0.00%> (+0.54%) ⬆️
arrow/src/array/cast.rs 93.10% <0.00%> (+1.43%) ⬆️
arrow/src/array/array_boolean.rs 94.61% <0.00%> (+1.53%) ⬆️
... and 1 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4d82e24...53404ee. Read the comment docs.

@viirya
Copy link
Member Author

viirya commented Feb 26, 2022

cc @alamb

@alamb
Copy link
Contributor

alamb commented Feb 28, 2022

Thanks @viirya -- Sorry for the delay in review; I was out last week so I am am still catching up -- I'll add this one to my review queue and hope to get to it shortly.

@viirya
Copy link
Member Author

viirya commented Feb 28, 2022

Thank you @alamb !

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very clever @viirya -- thank you very much.

I have a concern (but no specific suggestions just yet)

Today if people impl Borrow<i8> for MyAwesomeType then I think (untested) they can create arrays from an iterator of MyAwesomeType which they might not be able to do after this PR 🤔 Though perhaps they just need to implement ArrowPrimitiveTypeNative

I tried a little to figure out a clever way to implement the conversion using a trait that is a superset of Borrow to make it backwards compatible and more general. Thus instead of

impl<T: ArrowPrimitiveType, Ptr: Borrow<Option<<T as ArrowPrimitiveType>::Native>>>
    FromIterator<Ptr> for PrimitiveArray<T>

It would look something like

impl<T: ArrowPrimitiveType, Ptr: BorrowAsOption<<T as ArrowPrimitiveType>::Native>>
    FromIterator<Ptr> for PrimitiveArray<T>
..

pub trait BorrowAsOption<T: ArrowPrimitiveType> {
    fn as_native(&self) -> Option<&<T as ArrowPrimitiveType>::Native>;
}

But my mind was being blown by type errors.

impl<T: ArrowPrimitiveType, Ptr: Borrow<Option<<T as ArrowPrimitiveType>::Native>>>
FromIterator<Ptr> for PrimitiveArray<T>
#[derive(Debug)]
pub struct ArrowPrimitiveTypeNative<T: ArrowPrimitiveType> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pub struct ArrowPrimitiveTypeNative<T: ArrowPrimitiveType> {
struct ArrowPrimitiveTypeNative<T: ArrowPrimitiveType> {

I wonder if this needs to be pub?

If it does need to be pub I think we should probably document it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If people want to use their types to create arrays from an iterator, they need to implement From for ArrowPrimitiveTypeNative. So I guess we need it to be pub? I will document it.

@viirya
Copy link
Member Author

viirya commented Mar 1, 2022

Today if people impl Borrow<i8> for MyAwesomeType then I think (untested) they can create arrays from an iterator of MyAwesomeType which they might not be able to do after this PR 🤔 Though perhaps they just need to implement ArrowPrimitiveTypeNative

I guess you mean impl Borrow<Option<i8>> for MyAwesomeType? Yes, this is a concern. People need to implement impl From<MyAwesomeType> for ArrowPrimitiveTypeNative<Int8Type>.

I tried a little to figure out a clever way to implement the conversion using a trait that is a superset of Borrow to make it backwards compatible and more general. Thus instead of

impl<T: ArrowPrimitiveType, Ptr: Borrow<Option<<T as ArrowPrimitiveType>::Native>>>
    FromIterator<Ptr> for PrimitiveArray<T>

It would look something like

impl<T: ArrowPrimitiveType, Ptr: BorrowAsOption<<T as ArrowPrimitiveType>::Native>>
    FromIterator<Ptr> for PrimitiveArray<T>
..

pub trait BorrowAsOption<T: ArrowPrimitiveType> {
    fn as_native(&self) -> Option<&<T as ArrowPrimitiveType>::Native>;
}

But my mind was being blown by type errors.

Not sure if I understand above way. Seems people still need to implement BorrowAsOption, so it's still not backwards compatible.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if I understand above way. Seems people still need to implement BorrowAsOption, so it's still not backwards compatible.

Yeah, after more thought here I can't come up with anything better

Some documentation comments on ArrowPrimitiveTypeNative would definitely be useful. I wonder if changing its name to something like NativeAdapter would make its purpose clearer.

I wonder if we need something similar for str and bool to make StringArrays and BooleanArrays (can do as a follow on PR)

maybe @shepmaster or @carols10cents has some alternate suggestions

@viirya
Copy link
Member Author

viirya commented Mar 3, 2022

Thanks @alamb ! Just renamed ArrowPrimitiveTypeNative to NativeAdapter and added some documentation.

@alamb alamb changed the title More idiomatic primitive array creation Allow primitive array creation from iterators of PrimitiveTypes (as well as Option) Mar 3, 2022
@alamb
Copy link
Contributor

alamb commented Mar 3, 2022

Thanks @viirya -- this looks great

@alamb alamb merged commit 258f828 into apache:master Mar 3, 2022
@viirya
Copy link
Member Author

viirya commented Mar 3, 2022

Thanks @alamb for your review and suggestion!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

More ergonomic / idiomatic primitive array creation from iterators
3 participants