Description
What happens?
While looking at [#98], I noticed that Polars 1.34.0 added support for UINT128, which led me to three problems:
- polars_io.py is missing UINT128 and INT128*
- external/duckdb - arrow_converter.cpp is missing a case for UHUGEINT, which results in a panic (in both 1.33.1 and 1.34.0).
- Interval types panic in 1.34.0 but not in 1.33.1.
Fix for 1
Add UINT128 and INT128 to the list of types in polars_io.py.
* I didn't come up with a test case / failure here.
Fix for 2
The fix for 2 is to insert this before the HUGEINT case:
case LogicalTypeId::UHUGEINT:
Generally, there seems to be a lack of "dynamic" tests that exercise all types. See the reproduction below for an idea of a test case that'll smoke-test every type.
Fix for 3
The offending commit seems to be - pola-rs/polars@0ed3499
Ironically, the related issue says this is meant to fix a DuckDB issue: pola-rs/polars#15969
The workaround is to set os.environ["POLARS_IMPORT_INTERVAL_AS_STRUCT"]="1".
Unsure of what the correct fix is.
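A minimal sketch of applying the workaround from Python. The variable name comes from the hint in the panic message. Setting it before `polars` is imported is an assumption about ordering that was not verified here, and the helper names `interval_as_struct_enabled` and `interval_to_polars` are hypothetical:

```python
import os

# Assumption: set the flag before polars is imported, in case it is read early.
os.environ["POLARS_IMPORT_INTERVAL_AS_STRUCT"] = "1"


def interval_as_struct_enabled() -> bool:
    """Report whether the workaround flag is set in this process."""
    return os.environ.get("POLARS_IMPORT_INTERVAL_AS_STRUCT") == "1"


def interval_to_polars():
    """Convert a one-row INTERVAL result to Polars (hypothetical helper)."""
    import duckdb  # imported lazily so the flag above is already in place

    with duckdb.connect() as con:
        rel = con.sql("select interval 1 day as a")
        # Collect while the connection is still open.
        return rel.pl(lazy=True).collect()


print(interval_as_struct_enabled())
```

As the hint itself says, this relies on unstable functionality, so it is a stopgap and not a fix.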
1.33.1 behavior - panics on UHUGEINT only
Polars version: 1.33.1
duckdb version: 1.5.0.dev73
Trying type='INTERVAL'
INTERVAL: Failed - The datatype "tin" is still not supported in Rust implementation
Trying type='UHUGEINT'
thread '' (1586047) panicked at crates/polars-core/src/datatypes/field.rs:280:19:
Arrow datatype Extension(ExtensionType { name: "arrow.opaque", inner: FixedSizeBinary(16), metadata: Some("{"type_name":"uhugeint","vendor_name":"DuckDB"}") }) not supported by Polars. You probably need to activate that data-type feature.
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
1.34.0 behavior: panics on both INTERVAL and UHUGEINT
Trying type='INTERVAL'
thread '' (1591205) panicked at crates/polars-core/src/datatypes/field.rs:283:85:
called `Result::unwrap()` on an `Err` value: ComputeError(ErrString("could not import from `month_day_nano_interval` type. Hint: This can be imported by setting POLARS_IMPORT_INTERVAL_AS_STRUCT=1 in the environment. Note however that this is unstable functionality that may change at any time."))
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Trying type='UHUGEINT'
thread '' (1591205) panicked at crates/polars-core/src/datatypes/field.rs:290:19:
Arrow datatype Extension(ExtensionType { name: "arrow.opaque", inner: FixedSizeBinary(16), metadata: Some("{"type_name":"uhugeint","vendor_name":"DuckDB"}") }) not supported by Polars. You probably need to activate that data-type feature.
To Reproduce
* I'd suggest adding a variation of the test below so that any new types are exercised
import duckdb.typing
import duckdb
import polars as pl

print(f"Polars version: {pl.__version__}")
print(f"duckdb version: {duckdb.__version__}")

types = [c for c in filter(lambda x: str(x)[0]!="_", dir(duckdb.typing))]
for type in types:
    with duckdb.connect() as con:
        print(f"Trying {type=}")
        try:
            rel = con.sql(f"select 1::{type} as a")
            comparison_literal = pl.lit(1)
            lazy_df = rel.pl(lazy=True)
            print(f"{type}: Passed")
        except Exception as e:
            print(f"{type}: Failed - {e}")
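One way the reproduction could be turned into a reusable smoke test is sketched below. The helper names `public_type_names` and `smoke_test_types` are hypothetical. The extra `isupper()` filter is an assumption that type constants in `duckdb.typing` are upper-case, which keeps class names such as `DuckDBPyType` out of the list:

```python
from types import SimpleNamespace


def public_type_names(namespace):
    """Return upper-case, non-underscore attribute names of a module-like object."""
    return sorted(
        name
        for name in dir(namespace)
        if not name.startswith("_") and name.isupper()
    )


def smoke_test_types(type_names):
    """Try a DuckDB -> Polars conversion per type; map each name to an error or None."""
    import duckdb  # imported lazily so the name filter works without DuckDB installed

    results = {}
    for type_name in type_names:
        with duckdb.connect() as con:
            try:
                con.sql(f"select 1::{type_name} as a").pl(lazy=True)
                results[type_name] = None
            except Exception as exc:
                results[type_name] = str(exc)
    return results


# Stand-in namespace, used only to demonstrate the name filter.
fake_typing = SimpleNamespace(INTERVAL=1, UHUGEINT=2, _private=3, DuckDBPyType=4)
print(public_type_names(fake_typing))
```

In a real test, `smoke_test_types(public_type_names(duckdb.typing))` could be asserted against a known allow-list of failing types, so that any newly added type that breaks the conversion shows up as a test failure.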
OS:
Linux
DuckDB Package Version:
1.5.x (source build)
Python Version:
3.13
Full Name:
Paul T
Affiliation:
Iqmo
What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.
I have not tested with any build
Did you include all relevant data sets for reproducing the issue?
No - Other reason (please specify in the issue body)
Did you include all code required to reproduce the issue?
- Yes, I have
Did you include all relevant configuration to reproduce the issue?
- Yes, I have