unshred_variant returns Result<VariantArray, ArrowError>, and its docs say it errors on spec violations.
However, it internally calls VariantMetadata::new, Variant::new_with_metadata, and VariantObject::iter, all of which panic on bad input.
This is a problem because unshred_variant is meant to consume bytes that came off disk, which can be malformed. The fix should be fairly mechanical, though, since the panicking call sites have try_ counterparts.
/// Removes all (nested) typed_value columns from a VariantArray by converting them back to binary
/// variant and merging the resulting values back into the value column.
///
/// This function efficiently converts a shredded VariantArray back to an unshredded form where all
/// data resides in the value column.
///
/// # Arguments
/// * `array` - The VariantArray to unshred
///
/// # Returns
/// A new VariantArray with all data in the value column and no typed_value column
///
/// # Errors
/// - If the shredded data contains spec violations (e.g., field name conflicts)
/// - If unsupported data types are encountered in typed_value columns
pub fn unshred_variant(array: &VariantArray) -> Result<VariantArray> {
arrow-rs/parquet-variant-compute/src/unshred_variant.rs
Lines 46 to 61 in 4676c06
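
To illustrate the kind of mechanical change meant above, here is a minimal, self-contained sketch. Metadata, DecodeError, unshred_panicking, and unshred_checked are hypothetical stand-ins invented for this example. They are not the parquet-variant API, and the validation rule is made up. The sketch only shows how swapping a panicking constructor for its fallible counterpart lets malformed bytes surface as an Err instead of a panic.

```rust
/// Hypothetical stand-in for an error type such as ArrowError.
#[derive(Debug, PartialEq)]
enum DecodeError {
    InvalidArgument(String),
}

/// Hypothetical stand-in for a header parsed from untrusted bytes.
#[derive(Debug)]
struct Metadata<'a> {
    bytes: &'a [u8],
}

impl<'a> Metadata<'a> {
    /// Fallible constructor: malformed input is reported as an error.
    /// The rule "first byte must be 1" is invented for this sketch.
    fn try_new(bytes: &'a [u8]) -> Result<Self, DecodeError> {
        match bytes.first().copied() {
            Some(1) => Ok(Self { bytes }),
            Some(v) => Err(DecodeError::InvalidArgument(format!(
                "unsupported version byte {v}"
            ))),
            None => Err(DecodeError::InvalidArgument("empty input".to_string())),
        }
    }

    /// Panicking constructor: convenient for trusted input only.
    fn new(bytes: &'a [u8]) -> Self {
        Self::try_new(bytes).expect("invalid metadata")
    }
}

/// Before: the signature promises a Result, but bad input panics inside.
fn unshred_panicking(bytes: &[u8]) -> Result<usize, DecodeError> {
    let metadata = Metadata::new(bytes);
    Ok(metadata.bytes.len())
}

/// After: the fallible constructor plus `?` propagates the error to the caller.
fn unshred_checked(bytes: &[u8]) -> Result<usize, DecodeError> {
    let metadata = Metadata::try_new(bytes)?;
    Ok(metadata.bytes.len())
}

fn main() {
    // Well-formed input behaves the same in both versions.
    println!("{:?}", unshred_checked(&[1, 2, 3]));
    // Malformed input is returned as an error by the checked version...
    println!("{:?}", unshred_checked(&[9]));
    // ...while the panicking version unwinds despite its Result return type.
    let outcome = std::panic::catch_unwind(|| unshred_panicking(&[9]));
    println!("panicked: {}", outcome.is_err());
}
```

The only difference between the two functions is the constructor call and the `?`, which is why the change at each call site should be small.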