[bug] FastLanes signed integer array round trip failed #929

@XiangpengHao

Hi Vortex! I recently played around a bit with Vortex <-> Arrow conversion using the FastLanes encoding.

However, it seems that the current implementation automatically converts signed integers to their unsigned variant, as shown in the following code: https://github.com/spiraldb/vortex/blob/develop/encodings/fastlanes/src/bitpacking/compress.rs#L33-L40

Here's a minimal reproducible example. It panics because the Int64Array comes back as a UInt64Array:

use arrow::{
    array::{AsArray, Int64Array},
    datatypes::Int64Type,
};
use vortex::{arrow::FromArrowArray, IntoCanonical};
use vortex_fastlanes::{find_best_bit_width, BitPackedArray};

fn main() {
    let vortex_array =
        vortex::Array::from_arrow(&Int64Array::from_iter(-500..500), false).as_primitive();

    let bitpacked_array = BitPackedArray::encode(
        vortex_array.as_ref(),
        find_best_bit_width(&vortex_array).unwrap(),
    )
    .unwrap();

    println!("bitpacked_array: {:?}", bitpacked_array);

    let canonical_array = bitpacked_array
        .into_canonical()
        .unwrap()
        .into_arrow()
        .unwrap();
    // Panics here: the round-tripped array is a UInt64Array, not an Int64Array.
    let arrow_array_again = canonical_array.as_primitive::<Int64Type>();
    println!("arrow_array_again: {:?}", arrow_array_again);
}

I tried commenting out the .to_unsigned() call (and the related code), and the round trip seems to work again.
I haven't read the FastLanes paper carefully. Can you provide some hints on why the values are converted to unsigned?
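
To make the question concrete, here is a std-only sketch of the round trip I would expect. It assumes the signed-to-unsigned step is a plain two's-complement reinterpretation of the bits, which I have not confirmed is what .to_unsigned() does. The helper names (to_unsigned_bits, to_signed_bits, max_bit_width) are hypothetical and are not part of vortex or vortex_fastlanes.

```rust
/// Reinterpret signed values as unsigned, bit for bit (two's complement).
/// Hypothetical helper, not the vortex API.
fn to_unsigned_bits(values: &[i64]) -> Vec<u64> {
    values.iter().map(|&v| v as u64).collect()
}

/// Reinterpret unsigned values back as signed, bit for bit.
/// Hypothetical helper, not the vortex API.
fn to_signed_bits(values: &[u64]) -> Vec<i64> {
    values.iter().map(|&v| v as i64).collect()
}

/// Smallest bit width that can hold every value in the slice.
/// Hypothetical helper; this is NOT vortex's find_best_bit_width.
fn max_bit_width(values: &[u64]) -> u32 {
    values
        .iter()
        .map(|&v| 64 - v.leading_zeros())
        .max()
        .unwrap_or(0)
}

fn main() {
    let signed: Vec<i64> = (-500..500).collect();

    // Negative values keep their bit pattern, so -1 becomes u64::MAX.
    let unsigned = to_unsigned_bits(&signed);

    // The reinterpretation itself is lossless in both directions.
    let round_tripped = to_signed_bits(&unsigned);
    assert_eq!(round_tripped, signed);

    // Viewed as unsigned, the negative values have their top bit set.
    println!("bit width as unsigned: {}", max_bit_width(&unsigned));
}
```

If the conversion really is this kind of reinterpretation, the values themselves are not lost, and only the signedness of the type is dropped on the way back. That would match what I see: the data comes back, but as a UInt64Array.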
