"operator does not support primitive Int128" when using Decimal from arrow_array #16111

Open
2 tasks done
cdgleber opened this issue May 8, 2024 · 0 comments
Labels
A-dtype-decimal Area: decimal data type bug Something isn't working needs triage Awaiting prioritization by a maintainer rust Related to Rust Polars

Comments

cdgleber commented May 8, 2024

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

use std::sync::Arc;
use arrow_array::Decimal128Array;
use arrow_schema::{ DataType, Field, Schema };
use polars::{ frame::DataFrame, series::Series };
use anyhow::Error;

fn main() -> Result<(), Error> {
    let arr = Decimal128Array::from(vec![8272993, 2901082, 94298476]).with_precision_and_scale(
        18,
        0
    )?;

    let batch = arrow::record_batch::RecordBatch
        ::try_new(
            Arc::new(Schema::new(vec![Field::new("id", DataType::Decimal128(18, 0), false)])),
            vec![Arc::new(arr)]
        )
        .unwrap();

    let schema = batch.schema();
    let mut columns = Vec::with_capacity(batch.num_columns());
    for (i, column) in batch.columns().iter().enumerate() {
        let name = schema.fields().get(i).unwrap().name();
        let pl_arrow = Box::<dyn polars_arrow::array::Array>::from(&**column);
        columns.push(Series::from_arrow(name, pl_arrow)?);
    }

    let _ = dbg!(DataFrame::from_iter(columns));

    Ok(())
}

This line causes the panic:

let pl_arrow = Box::<dyn polars_arrow::array::Array>::from(&**column);

Log output

Finished dev [unoptimized + debuginfo] target(s) in 5.23s
     Running `target\debug\polars_issue.exe`
thread 'main' panicked at C:\Users\user\.cargo\registry\src\index.crates.io-6f17d22bba15001f\polars-arrow-0.39.2\src\array\mod.rs:442:33:
operator does not support primitive `Int128`
stack backtrace:
   0: std::panicking::begin_panic_handler
             at /rustc/7cf61ebde7b22796c69757901dd346d0fe70bd97/library\std\src\panicking.rs:647
   1: core::panicking::panic_fmt
             at /rustc/7cf61ebde7b22796c69757901dd346d0fe70bd97/library\core\src\panicking.rs:72
   2: polars_arrow::array::from_data
             at C:\Users\user\.cargo\registry\src\index.crates.io-6f17d22bba15001f\polars-arrow-0.39.2\src\array\mod.rs:442
   3: polars_arrow::array::impl$5::from
             at C:\Users\user\.cargo\registry\src\index.crates.io-6f17d22bba15001f\polars-arrow-0.39.2\src\array\mod.rs:400
   4: polars_issue::main
             at .\src\main.rs:28
   5: core::ops::function::FnOnce::call_once<enum2$<core::result::Result<tuple$<>,anyhow::Error> > (*)(),tuple$<> >
             at /rustc/7cf61ebde7b22796c69757901dd346d0fe70bd97\library\core\src\ops\function.rs:250
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
error: process didn't exit successfully: `target\debug\polars_issue.exe` (exit code: 101)

Issue description

My use case might be somewhat unusual or convoluted. I am building an ETL application for a data pipeline: I use arrow-odbc to pull data from a database, Polars to do the transforms, and then arrow-odbc again to insert the DataFrame into another database.

I ran into this problem with one of my datasets. It looks similar to an earlier Python issue, #12393, which was fixed by #12413, but as far as I can tell it has not been reported for Rust.

Following a similar fix, I updated the functions in polars-arrow-0.39.2\src\array\mod.rs that convert to and from arrow_data::ArrayData so that they use the macro with_match_primitive_type_full!, and it is working for me.
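To illustrate why switching macros would help, here is a minimal, self-contained sketch. It assumes the panic comes from a type-dispatch macro that covers only a subset of primitive types; Arrow stores Decimal128 values as 128-bit integers, which is presumably why the message names `Int128`. The enum, both macros, and the helper functions below are hypothetical stand-ins, not the actual polars-arrow code.

```rust
// Hypothetical, simplified stand-in for a physical primitive type tag.
// These names are illustrative and are NOT the polars-arrow definitions.
#[derive(Debug, Clone, Copy, PartialEq)]
enum PrimitiveType {
    Int32,
    Int64,
    Int128,
    Float64,
}

// Partial dispatch (hypothetical): Int128 has no arm, so it reaches the panic.
macro_rules! match_primitive_partial {
    ($key:expr, |$T:ident| $body:expr) => {
        match $key {
            PrimitiveType::Int32 => { type $T = i32; $body }
            PrimitiveType::Int64 => { type $T = i64; $body }
            PrimitiveType::Float64 => { type $T = f64; $body }
            other => panic!("operator does not support primitive `{:?}`", other),
        }
    };
}

// Full dispatch (hypothetical): every variant, including Int128, is handled.
macro_rules! match_primitive_full {
    ($key:expr, |$T:ident| $body:expr) => {
        match $key {
            PrimitiveType::Int32 => { type $T = i32; $body }
            PrimitiveType::Int64 => { type $T = i64; $body }
            PrimitiveType::Int128 => { type $T = i128; $body }
            PrimitiveType::Float64 => { type $T = f64; $body }
        }
    };
}

// Byte width of the native type, resolved through the partial macro.
fn width_partial(ty: PrimitiveType) -> usize {
    match_primitive_partial!(ty, |T| std::mem::size_of::<T>())
}

// Byte width of the native type, resolved through the full macro.
fn width_full(ty: PrimitiveType) -> usize {
    match_primitive_full!(ty, |T| std::mem::size_of::<T>())
}

fn main() {
    // The full macro resolves Int128 to i128.
    println!("Int128 width via full dispatch: {}", width_full(PrimitiveType::Int128));

    // The partial macro panics on the same input; catch it to show the outcome.
    let result = std::panic::catch_unwind(|| width_partial(PrimitiveType::Int128));
    println!("partial dispatch panicked: {}", result.is_err());
}
```

In this sketch both macros behave identically for the types they share; only the missing `Int128` arm separates a panic from a successful conversion.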

(I'm new to open source and felt intimidated going straight to a pull request for this fix, so I started with an issue even though I found a fix that works for my purposes. In my opinion, this fix is likely not exhaustive and may not be appropriate for most situations. I defer to the maintainers.)

Expected behavior

The above code should print the contents of the DataFrame rather than panic. With my local fix applied, the output is:

    Finished dev [unoptimized + debuginfo] target(s) in 1m 08s
     Running `target\debug\polars_issue.exe`
Polars does not support decimal types so the 'Series' are read as Float64
[src\main.rs:33:14] DataFrame::from_iter(columns) = shape: (3, 1)
┌─────────────┐
│ id          │
│ ---         │
│ f64         │
╞═════════════╡
│ 8.272993e6  │
│ 2.901082e6  │
│ 9.4298476e7 │
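
For reference, the Float64 values in that output follow from the decimal representation: a Decimal128 value is an unscaled 128-bit integer plus a scale, and the column here uses scale 0. The sketch below shows that arithmetic only; `decimal128_to_f64` is a hypothetical helper, not a Polars or arrow function, and I am not claiming this is how Polars performs the cast internally.

```rust
/// Hypothetical helper: interprets an unscaled decimal integer with the given
/// scale as an f64, i.e. unscaled / 10^scale. Large values may lose precision.
fn decimal128_to_f64(unscaled: i128, scale: u32) -> f64 {
    unscaled as f64 / 10f64.powi(scale as i32)
}

fn main() {
    // Same values and scale (0) as the reproducible example.
    for v in [8272993_i128, 2901082, 94298476] {
        println!("{:e}", decimal128_to_f64(v, 0));
    }
}
```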

Installed versions

[dependencies]
anyhow = "1.0.83"
arrow = "51.0.0"
arrow-array = "51.0.0"
arrow-schema = "51.0.0"
polars = "0.39.2"
polars-arrow = { version = "0.39.2", features = ["arrow_rs"] }

@cdgleber cdgleber added bug Something isn't working needs triage Awaiting prioritization by a maintainer rust Related to Rust Polars labels May 8, 2024
@alexander-beedie alexander-beedie added the A-dtype-decimal Area: decimal data type label May 8, 2024