
implement Arrow2's odbc reader and writers #2994

Open
ritchie46 opened this issue Mar 28, 2022 · 6 comments
Labels: enhancement, good first issue

Comments

@ritchie46
Member

We now have native ODBC support upstream. This has to be exposed in polars similarly to existing IO readers and writers.
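For context, the existing polars readers are exposed as builders that are configured and then finish()-ed, and presumably an ODBC reader would mirror that shape. A rough sketch of the existing pattern (CsvReader as it looked around the time of this issue; nothing ODBC-specific is implied here):

use polars::prelude::*;

// Existing polars reader pattern: construct, configure, finish.
// An ODBC reader would presumably expose a similar builder.
fn read_csv_example() -> PolarsResult<DataFrame> {
    CsvReader::from_path("data.csv")?
        .has_header(true)
        .finish()
}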

@trickster

trickster commented Jun 28, 2022

Here is a preliminary version I came up with.

It only works with string columns (only UTF8 is implemented).

use arrow2::error::Result;
use arrow2::io::odbc::api::Cursor;
use arrow2::io::odbc::{api, read};
use polars::prelude::*;
use std::sync::Arc;

const QUERY: &str = include_str!("../query.sql");

fn main() -> Result<()> {
    let connector = "ODBC_STRING";

    let env = api::Environment::new()?;
    let connection = env.connect_with_connection_string(connector)?;
    let mut prep = connection.prepare(QUERY)?;

    let fields = read::infer_schema(&prep)?;

    let mut df = fields
        .iter()
        .map(|s| match s.data_type {
            ArrowDataType::Utf8 => Series::new_empty(&s.name, &DataType::Utf8),
            _ => unimplemented!(),
        })
        .collect::<Vec<_>>();

    let max_batch_size = 100;
    let buffer = read::buffer_from_metadata(&prep, max_batch_size)?;

    let cursor = prep.execute(())?.unwrap();
    let mut cursor = cursor.bind_buffer(buffer)?;

    while let Some(batch) = cursor.fetch()? {
        // Deserialize each fetched column and append it to its Series.
        for ((idx, field), df_elem) in (0..batch.num_cols()).zip(fields.iter()).zip(df.iter_mut()) {
            let column_view = batch.column(idx);
            let arr = Arc::from(read::deserialize(column_view, field.data_type.clone()));
            let series = Series::try_from((field.name.as_str(), vec![arr])).unwrap();
            df_elem.append(&series).unwrap();
        }
    }

    let dataframe = DataFrame::new(df).unwrap();
    dbg!(dataframe);
    Ok(())
}

Ideally we would use this function, although it operates on all chunks together rather than on individual ones.

Edit: Series::try_from would be enough
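Extending the snippet beyond strings should mostly be a matter of adding match arms where the empty Series are pre-allocated; the fetch loop is already dtype-agnostic because read::deserialize returns a type-erased array. A hedged sketch (the non-Utf8 arms are untested assumptions):

// Hypothetical extension of the match above; the extra arms follow the
// obvious arrow2 -> polars dtype mapping but are untested.
let mut df = fields
    .iter()
    .map(|s| match s.data_type {
        ArrowDataType::Utf8 => Series::new_empty(&s.name, &DataType::Utf8),
        ArrowDataType::Boolean => Series::new_empty(&s.name, &DataType::Boolean),
        ArrowDataType::Int32 => Series::new_empty(&s.name, &DataType::Int32),
        ArrowDataType::Int64 => Series::new_empty(&s.name, &DataType::Int64),
        ArrowDataType::Float64 => Series::new_empty(&s.name, &DataType::Float64),
        _ => unimplemented!(),
    })
    .collect::<Vec<_>>();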

@cnphil

cnphil commented Aug 19, 2022

I'm tempted to work on this. I'll draft a PR over the weekend.

@trickster

I got a working version (it can infer the schema) here.

@stinodego added the enhancement label and removed the feature label on Jul 14, 2023
@cnpryer
Contributor

cnpryer commented Jul 16, 2023

I see this is still open. Is there interest in this?

@sportfloh
Contributor

Hey @cnpryer, yes :-)

I have a similar requirement.
Currently I use odbc-api to get data from a DB2 (IBM i) database.
I tried to use arrow-odbc, but I didn't find a way to create a polars DataFrame from an arrow RecordBatch.
It would be really nice if something like the Python from_arrow function could be implemented in the Rust API.
Or maybe I just missed a simple way to do it?
Thanks and Cheers!

@sportfloh
Contributor

Hi,
here is my current solution with arrow_odbc, arrow, and polars_arrow:
pacman82/odbc-api#536 (comment)

I found the arrow RecordBatch to DataFrame code here:
https://stackoverflow.com/questions/78084066/arrow-recordbatch-as-polars-dataframe
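For anyone who wants a simple, copy-incurring alternative to the zero-copy FFI conversion in that answer: the batch can be round-tripped through the Arrow IPC stream format, which polars reads natively. A minimal sketch, assuming arrow-rs plus a polars build with the ipc_streaming feature (record_batch_to_df is a made-up helper name):

use std::io::Cursor;

use arrow::ipc::writer::StreamWriter;
use arrow::record_batch::RecordBatch;
use polars::prelude::*;

// Hypothetical helper: serialize the batch into an in-memory Arrow IPC
// stream, then let polars deserialize it. This copies the data once,
// unlike the zero-copy FFI route from the linked answer.
fn record_batch_to_df(batch: &RecordBatch) -> PolarsResult<DataFrame> {
    let to_polars = |e: arrow::error::ArrowError| PolarsError::ComputeError(e.to_string().into());

    let mut buf = Vec::new();
    let mut writer = StreamWriter::try_new(&mut buf, &batch.schema()).map_err(to_polars)?;
    writer.write(batch).map_err(to_polars)?;
    writer.finish().map_err(to_polars)?;
    drop(writer); // release the mutable borrow on buf

    IpcStreamReader::new(Cursor::new(buf)).finish()
}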
