Hi Vortex! I recently played around a bit with Vortex <-> Arrow conversion using the FastLanes encoding.
However, it seems that the current implementation automatically converts signed integers to their unsigned variant, as shown in the following code: https://github.com/spiraldb/vortex/blob/develop/encodings/fastlanes/src/bitpacking/compress.rs#L33-L40
Here's a minimal reproducible example. It panics because the Int64Array comes back as a UInt64Array:
use arrow::{
    array::{AsArray, Int64Array, UInt64Array},
    datatypes::{Int64Type, UInt64Type},
};
use vortex::{arrow::FromArrowArray, IntoCanonical};
use vortex_fastlanes::{find_best_bit_width, BitPackedArray};

fn main() {
    // Build a signed Arrow array and import it into Vortex.
    let vortex_array =
        vortex::Array::from_arrow(&Int64Array::from_iter(-500..500), false).as_primitive();

    // Bit-pack it with the width Vortex considers best.
    let bitpacked_array = BitPackedArray::encode(
        vortex_array.as_ref(),
        find_best_bit_width(&vortex_array).unwrap(),
    )
    .unwrap();
    println!("bitpacked_array: {:?}", bitpacked_array);

    // Decode and export back to Arrow.
    let canonical_array = bitpacked_array
        .into_canonical()
        .unwrap()
        .into_arrow()
        .unwrap();

    // Panics here: the exported array is a UInt64Array, not an Int64Array.
    let arrow_array_again = canonical_array.as_primitive::<Int64Type>();
    println!("arrow_array_again: {:?}", arrow_array_again);
}
I tried commenting out the .to_unsigned() calls, and with that change the round trip seems to work again.
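To make the question concrete, here is a small std-only sketch of what I assume a signed-to-unsigned conversion amounts to: a bit-level reinterpretation that can be undone by casting back. It does not use any Vortex API, and the helper names (`to_unsigned_bits`, `to_signed_bits`, `max_bit_width`) are hypothetical, made up for illustration only.

```rust
/// Reinterpret signed values as unsigned without changing their bits
/// (hypothetical helper, not part of Vortex).
fn to_unsigned_bits(values: &[i64]) -> Vec<u64> {
    values.iter().map(|&v| v as u64).collect()
}

/// Reverse of `to_unsigned_bits`: reinterpret the same bits as signed.
fn to_signed_bits(values: &[u64]) -> Vec<i64> {
    values.iter().map(|&v| v as i64).collect()
}

/// Number of bits needed to hold the largest value in the slice.
fn max_bit_width(values: &[u64]) -> u32 {
    values
        .iter()
        .map(|&v| 64 - v.leading_zeros())
        .max()
        .unwrap_or(0)
}

fn main() {
    let signed: Vec<i64> = (-500..500).collect();
    let unsigned = to_unsigned_bits(&signed);

    // Negative inputs have their top bit set once viewed as u64,
    // so the widest value needs all 64 bits.
    println!("bit width after reinterpretation: {}", max_bit_width(&unsigned));

    // Casting back recovers the original values exactly.
    let restored = to_signed_bits(&unsigned);
    assert_eq!(signed, restored);
}
```

If that assumption holds, no data is lost in the conversion itself; what gets lost on the way back to Arrow is only the information that the array was originally signed.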
I haven't read the FastLanes paper carefully. Could you give some hints on why the values are converted to unsigned?