`map` with struct `return_dtype` errors if return is single dict with correct keys & None
#10398
Comments
@orlp, @ritchie46, a quick coffee-break triage for you: I think the comparison check here needs to be slightly more forgiving than
...which is true, but should still pass the check as all-null data is acceptable for any dtype in this context. Perhaps we already have such a comparison function hiding somewhere? If not, guess we need one :)
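To make the "more forgiving" rule concrete, here is a rough sketch of what such a comparison could look like. It is only a toy model, not Polars code: dtypes are represented as plain Python values (a string for a primitive, a `dict` for a struct, a one-element `list` for a list), and the helper name `dtype_matches` is made up for illustration. The real check lives elsewhere and may need to handle more cases.

```python
# Toy model (assumption, not the Polars API): a dtype is either
#   - a string such as "Float64" or "Null" (primitive),
#   - a dict mapping field name -> dtype (struct),
#   - a one-element list holding the inner dtype (list).

def dtype_matches(actual, expected):
    """Return True if `actual` is acceptable where `expected` was declared.

    An all-null dtype is accepted for any expected dtype, at any nesting level.
    """
    # All-null data is compatible with every dtype.
    if actual == "Null":
        return True
    # Structs: same field names in the same order, fields compared recursively.
    if isinstance(actual, dict) and isinstance(expected, dict):
        return list(actual) == list(expected) and all(
            dtype_matches(actual[name], expected[name]) for name in actual
        )
    # Lists: compare the inner dtypes recursively.
    if isinstance(actual, list) and isinstance(expected, list):
        return len(actual) == len(expected) == 1 and dtype_matches(
            actual[0], expected[0]
        )
    # Everything else must match exactly.
    return actual == expected


# The case from this issue: inferred Struct({'x': Null}) vs declared Struct({'x': Float64}).
print(dtype_matches({"x": "Null"}, {"x": "Float64"}))
```

Under this rule a mismatched non-null field (for example `Int64` where `Float64` was declared) or a different field name would still be rejected, so the check stays strict everywhere except for nulls.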
I encountered this bug. What would it take to get it fixed?
I would prefer to treat this as correct behavior since [...]. The solution is to tell the Series the [...]
Tbh, I have lost track of what this error means. I would appreciate it if someone could provide a minimal reproducible example. I would be happy to take a stab at fixing it.
When using a flat/scalar null, there is no error and it is "upcast":

    import polars as pl

    df = pl.DataFrame({"a": 1})
    df.with_columns(
        pl.all().map_elements(lambda x: None, return_dtype=pl.Float64)
    )
    # shape: (1, 1)
    # ┌──────┐
    # │ a    │
    # │ ---  │
    # │ f64  │  # <- dtype `null` "upcast" to `f64` as per return_dtype
    # ╞══════╡
    # │ null │
    # └──────┘

But with a list/dict it errors instead:

    df.with_columns(
        pl.all().map_elements(
            lambda x: {"x": None},
            return_dtype=pl.Struct({"x": pl.Float64}),
        )
    )
    # SchemaError: expected output type ...

The expectation appears to be that the inner null would upcast:

    s = pl.Series([{"x": None}])
    s.dtype
    # Struct({'x': Null})
    s.cast(pl.Struct({"x": pl.Float64})).dtype
    # Struct({'x': Float64})

Side note: I've also just noticed that on a Series there is no error, but the inner dtype remains null:

    pl.Series(["a"]).map_elements(
        lambda x: {"x": None},
        return_dtype=pl.Struct({"x": pl.Float64}),
    ).dtype
    # Struct({'x': Null})
Any idea where to start looking in the codebase?
Should be fixed by #15699
Checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of Polars.
Reproducible example
Issue description
I believe both `map` calls should work since there is an explicit `return_dtype` and the shape and order of the `dict`s match that `dtype`. Changing the return to just `None` without the enclosing `dict` still results in a `SchemaError`.
Expected behavior
No `SchemaError`.
Installed versions