Skip to content
This repository has been archived by the owner on Feb 18, 2024. It is now read-only.

Fixed nested boolean offset #1404

Merged
merged 2 commits into from Feb 14, 2023

Conversation

ritchie46
Copy link
Collaborator

Simply ensure we do the same as on primitive arrays. This fixed this panic:

df = pl.Series([[None, True], [False, False], [True, True]]).slice(2, 2).to_frame()
f = io.BytesIO()
df.write_parquet(f)
thread '<unnamed>' panicked at 'the offset of the new Buffer cannot exceed the existing length', /home/ritchie46/.cargo/git/checkouts/arrow2-945af624853845da/f510012/src/array/boolean/mod.rs:174:5
stack backtrace:
   0: rust_begin_unwind
             at /rustc/333ee6c466972185973d5097f8b5fb0f9fb13fa5/library/std/src/panicking.rs:575:5
   1: core::panicking::panic_fmt
             at /rustc/333ee6c466972185973d5097f8b5fb0f9fb13fa5/library/core/src/panicking.rs:64:14
   2: arrow2::array::boolean::BooleanArray::sliced
             at /home/ritchie46/.cargo/git/checkouts/arrow2-945af624853845da/f510012/src/array/mod.rs:410:13
   3: arrow2::io::parquet::write::boolean::nested::array_to_page
             at /home/ritchie46/.cargo/git/checkouts/arrow2-945af624853845da/f510012/src/io/parquet/write/boolean/nested.rs:27:17
   4: arrow2::io::parquet::write::array_to_page_nested
             at /home/ritchie46/.cargo/git/checkouts/arrow2-945af624853845da/f510012/src/io/parquet/write/mod.rs:541:13
   5: arrow2::io::parquet::write::array_to_page
             at /home/ritchie46/.cargo/git/checkouts/arrow2-945af624853845da/f510012/src/io/parquet/write/mod.rs:293:5
   6: arrow2::io::parquet::write::array_to_pages::{{closure}}
             at /home/ritchie46/.cargo/git/checkouts/arrow2-945af624853845da/f510012/src/io/parquet/write/mod.rs:269:13
   7: core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &mut F>::call_once
             at /rustc/333ee6c466972185973d5097f8b5fb0f9fb13fa5/library/core/src/ops/function.rs:310:13
   8: core::option::Option<T>::map
             at /rustc/333ee6c466972185973d5097f8b5fb0f9fb13fa5/library/core/src/option.rs:970:29
   9: <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::next
             at /rustc/333ee6c466972185973d5097f8b5fb0f9fb13fa5/library/core/src/iter/adapters/map.rs:103:9
  10: <alloc::boxed::Box<I,A> as core::iter::traits::iterator::Iterator>::next
             at /rustc/333ee6c466972185973d5097f8b5fb0f9fb13fa5/library/alloc/src/boxed.rs:1922:9
  11: <parquet2::write::dyn_iter::DynIter<V> as core::iter::traits::iterator::Iterator>::next
             at /home/ritchie46/.cargo/registry/src/github.com-1ecc6299db9ec823/parquet2-0.17.1/src/write/dyn_iter.rs:13:9
  12: <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::next
             at /rustc/333ee6c466972185973d5097f8b5fb0f9fb13fa5/library/core/src/iter/adapters/map.rs:103:9
  13: <parquet2::write::compression::Compressor<I> as fallible_streaming_iterator::FallibleStreamingIterator>::advance
             at /home/ritchie46/.cargo/registry/src/github.com-1ecc6299db9ec823/parquet2-0.17.1/src/write/compression.rs:149:20
  14: <fallible_streaming_iterator::MapErr<I,F> as fallible_streaming_iterator::FallibleStreamingIterator>::advance
             at /home/ritchie46/.cargo/registry/src/github.com-1ecc6299db9ec823/fallible-streaming-iterator-0.1.9/src/lib.rs:684:9
  15: <parquet2::write::dyn_iter::DynStreamingIterator<V,E> as fallible_streaming_iterator::FallibleStreamingIterator>::advance
             at /home/ritchie46/.cargo/registry/src/github.com-1ecc6299db9ec823/parquet2-0.17.1/src/write/dyn_iter.rs:43:9
  16: fallible_streaming_iterator::FallibleStreamingIterator::next
             at /home/ritchie46/.cargo/registry/src/github.com-1ecc6299db9ec823/fallible-streaming-iterator-0.1.9/src/lib.rs:52:9
  17: parquet2::write::column_chunk::write_column_chunk
             at /home/ritchie46/.cargo/registry/src/github.com-1ecc6299db9ec823/parquet2-0.17.1/src/write/column_chunk.rs:45:39
  18: parquet2::write::row_group::write_row_group::{{closure}}
             at /home/ritchie46/.cargo/registry/src/github.com-1ecc6299db9ec823/parquet2-0.17.1/src/write/row_group.rs:100:17
  19: core::iter::adapters::map::map_try_fold::{{closure}}
             at /rustc/333ee6c466972185973d5097f8b5fb0f9fb13fa5/library/core/src/iter/adapters/map.rs:91:28
  20: core::iter::traits::iterator::Iterator::try_fold
             at /rustc/333ee6c466972185973d5097f8b5fb0f9fb13fa5/library/core/src/iter/traits/iterator.rs:2262:21
  21: <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::try

So I assume we slice_nested_leaf already on dyn Array?

However I am not sure. So feel free to dismiss this as I can be entirely wrong here.

@codecov
Copy link

codecov bot commented Feb 14, 2023

Codecov Report

Base: 83.63% // Head: 83.59% // Decreases project coverage by -0.04% ⚠️

Coverage data is based on head (eae6698) compared to base (3a8da98).
Patch coverage: 78.76% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1404      +/-   ##
==========================================
- Coverage   83.63%   83.59%   -0.04%     
==========================================
  Files         373      373              
  Lines       40284    40278       -6     
==========================================
- Hits        33692    33672      -20     
- Misses       6592     6606      +14     
Impacted Files Coverage Δ
src/io/parquet/read/statistics/mod.rs 87.31% <14.28%> (-1.30%) ⬇️
src/io/parquet/read/deserialize/simple.rs 83.03% <82.35%> (-2.07%) ⬇️
src/io/parquet/write/boolean/nested.rs 96.42% <100.00%> (+1.55%) ⬆️
src/io/parquet/write/mod.rs 87.83% <0.00%> (-0.25%) ⬇️
src/io/ipc/read/file.rs 96.87% <0.00%> (+0.44%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@jorgecarleitao jorgecarleitao changed the title fix: fix nested boolean offset Fixed nested boolean offset Feb 14, 2023
@jorgecarleitao jorgecarleitao merged commit c12b600 into jorgecarleitao:main Feb 14, 2023
@jorgecarleitao
Copy link
Owner

Oops, yes, that should not be there - leftover from the previous iteration. Thanks!

jorgecarleitao pushed a commit that referenced this pull request Feb 14, 2023
jgmartin pushed a commit to jgmartin/arrow2 that referenced this pull request Feb 22, 2023
ritchie46 added a commit to ritchie46/arrow2 that referenced this pull request Mar 29, 2023
ritchie46 added a commit to ritchie46/arrow2 that referenced this pull request Apr 5, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants