When calling map_batches in a grouped context without providing a return_dtype, the resulting dtype is inferred from the first value of the batch; until that inference happens, the dtype is Unknown. When we ask for collect_schema(), however, the grouped column's dtype is List(Unknown), and the List dtype constructor on the Python side calls parse_into_dtype, which (by default) raises a TypeError for Unknown values.
On the Rust conversion side, unwrap is called on this error result, which produces a (difficult to catch) PanicException from pyo3:
thread '<unnamed>' panicked at py-polars/src/conversion/mod.rs:241:39:
called `Result::unwrap()` on an `Err` value: PyErr { type: <class 'TypeError'>, value: TypeError("cannot parse input of type 'Unknown' into Polars data type: Unknown"), traceback: Some(<traceback object at 0x7209e5244640>) }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Expected behavior
No panic, and a re-raised TypeError.
Alternatively, it may be acceptable to return a schema that has List(Unknown) as the dtype for the column.
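The difference between the reported and the expected behavior can be sketched in plain Python. This is only a model of the mechanism described above, not Polars code: `UnknownSketch`, `parse_into_dtype_sketch`, `ListSketch`, `SimulatedPanic`, and both `convert_*` functions are hypothetical names introduced for illustration. The one fact it relies on is that pyo3's PanicException derives from BaseException rather than Exception, which is why it slips past an ordinary `except Exception` handler.

```python
class UnknownSketch:
    """Hypothetical stand-in for a dtype that has not been inferred yet."""


def parse_into_dtype_sketch(value):
    """Hypothetical stand-in for the Python-side dtype parser."""
    if value is UnknownSketch:
        raise TypeError("cannot parse input of type 'Unknown' into Polars data type")
    return value


class ListSketch:
    """Hypothetical stand-in for the List dtype constructor."""

    def __init__(self, inner):
        # The constructor parses its inner dtype, so an unknown inner dtype raises.
        self.inner = parse_into_dtype_sketch(inner)


class SimulatedPanic(BaseException):
    """Models pyo3's PanicException, which derives from BaseException."""


def convert_with_unwrap(inner):
    # Models calling unwrap on the error result: the TypeError becomes a panic.
    try:
        return ListSketch(inner)
    except TypeError as exc:
        raise SimulatedPanic(str(exc)) from exc


def convert_with_propagation(inner):
    # Models the expected behavior: the TypeError reaches the caller unchanged.
    return ListSketch(inner)


def outcome(convert):
    """Report which handler, if any, catches the failure."""
    try:
        convert(UnknownSketch)
    except Exception as exc:
        return f"caught by 'except Exception': {type(exc).__name__}"
    except BaseException as exc:
        return f"escaped 'except Exception': {type(exc).__name__}"
    return "no error"


print(outcome(convert_with_unwrap))       # escaped 'except Exception': SimulatedPanic
print(outcome(convert_with_propagation))  # caught by 'except Exception': TypeError
```

In this model, user code that guards collect_schema() with `except Exception` only recovers in the second case, which is the motivation for re-raising the TypeError instead of unwrapping it.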
Checks
Reproducible example
Log output
No response