Handle pd.NaT values in lists passed to DataFrame constructor #16481

Closed
2 tasks done
d-reynol opened this issue May 24, 2024 · 5 comments · Fixed by #16957
Labels: A-input-parsing (Area: parsing input arguments), A-interop-pandas (Area: interoperability with pandas), A-panic (Area: code that results in panic exceptions), accepted (Ready for implementation), enhancement (New feature or an improvement of an existing feature), P-low (Priority: low), python (Related to Python Polars)

Comments

@d-reynol

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

import pandas as pd
import polars as pl

print(f'Polars Version: {pl.__version__}')

try:
    df = pl.DataFrame({'x': [pd.NaT]})
except:
    print('This failed')

try:
    df = pl.DataFrame({'x': [pd.NaT]}, schema={'x': pl.Datetime})
except:
    print('Specifying datetime failed')

try:
    df = pl.DataFrame({'x': [pd.NaT]}, schema={'x': pl.Float64})
except:
    print('This failed too')

df = pl.from_pandas(data=pd.DataFrame({'x': [pd.NaT]}))
print(df)

Log output

Polars Version: 0.20.29
This failed
Specifying datetime failed
This failed too
shape: (1, 1)
┌──────────────┐
│ x            │
│ ---          │
│ datetime[ns] │
╞══════════════╡
│ null         │
└──────────────┘
thread '<unnamed>' panicked at py-polars\src\conversion\any_value.rs:207:43:
called `Result::unwrap()` on an `Err` value: PyErr { type: <class 'TypeError'>, value: TypeError("'float' object cannot be interpreted as an integer"), traceback: None }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread '<unnamed>' panicked at py-polars\src\conversion\any_value.rs:207:43:
called `Result::unwrap()` on an `Err` value: PyErr { type: <class 'TypeError'>, value: TypeError("'float' object cannot be interpreted as an integer"), traceback: None }

Issue description

This issue is similar to #15518.

I can create a polars dataframe from a pandas dataframe that includes NaT values, but polars panics when creating the same dataframe directly.

Expected behavior

I'd expect the NaT to be treated as a null value.

Installed versions

--------Version info---------
Polars:               0.20.29
Index type:           UInt32
Platform:             Windows-10-10.0.19045-SP0
Python:               3.12.0 (tags/v3.12.0:0fb18b0, Oct  2 2023, 13:03:39) [MSC v.1935 64 bit (AMD64)]
----Optional dependencies----
adbc_driver_manager:  <not installed>
cloudpickle:          <not installed>
connectorx:           <not installed>
deltalake:            <not installed>
fastexcel:            <not installed>
fsspec:               <not installed>
gevent:               <not installed>
hvplot:               <not installed>
matplotlib:           <not installed>
nest_asyncio:         <not installed>
numpy:                1.26.4
openpyxl:             <not installed>
pandas:               2.2.2
pyarrow:              <not installed>
pydantic:             <not installed>
pyiceberg:            <not installed>
pyxlsb:               <not installed>
sqlalchemy:           <not installed>
torch:                <not installed>
xlsx2csv:             <not installed>
xlsxwriter:           <not installed>
@d-reynol added the labels bug (Something isn't working), needs triage (Awaiting prioritization by a maintainer), and python (Related to Python Polars) on May 24, 2024
@ritchie46
Member

@stinodego or @MarcoGorelli could you take a look? I am not sure which recent PRs influenced this one.

@MarcoGorelli
Collaborator

Thanks @d-reynol for the report.

Did this ever work in previous versions? I just tried it out in 0.20.3 and it also raises there.

@ritchie46
Member

Ok, then we never supported this properly. Sorry for the ping.

@stinodego added the labels A-interop-pandas (Area: interoperability with pandas), P-low (Priority: low), and A-panic (Area: code that results in panic exceptions) and removed the label needs triage (Awaiting prioritization by a maintainer) on May 25, 2024
@stinodego changed the title from "Inconsistent handling of pd.NaT values in DataFrame creation" to "Handle pd.NaT values in lists passed to DataFrame constructor" on May 25, 2024
@stinodego
Member

We shouldn't panic here, but I'm not sure we should even support this.

You are passing a Python list with some values in it. We need to determine which type those values are, because a Python list isn't strongly typed (unlike a NumPy ndarray or a pandas Series).

We don't know what type those values are, so we do some checks for various possible values we expect. Looks like we don't currently check for pd.NaT. I guess we could. But you can imagine that checking for every possible type of object someone could throw in a list is not feasible.
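
To illustrate the point above, here is a minimal pure-Python sketch of inferring a column type from an untyped list by checking each value against a fixed set of known types. This is not the actual Polars implementation; `infer_dtype`, `_KNOWN_TYPES`, and the dtype names as plain strings are hypothetical and exist only for illustration. Any object outside the known set (as pd.NaT is here) falls through every check.

```python
from datetime import date, datetime

# Hypothetical lookup table, checked in order. bool must come before int and
# datetime before date, because each is a subclass of the one after it.
_KNOWN_TYPES = [
    (bool, "Boolean"),
    (int, "Int64"),
    (float, "Float64"),
    (str, "String"),
    (datetime, "Datetime"),
    (date, "Date"),
]


def infer_dtype(values):
    """Guess a dtype name from the first non-None value in a list."""
    for v in values:
        if v is None:
            continue  # None carries no type information; keep looking
        for py_type, name in _KNOWN_TYPES:
            if isinstance(v, py_type):
                return name
        # An object we never planned for: fail with a clear error.
        raise TypeError(f"cannot infer a dtype from {type(v).__name__!r} object")
    return "Null"  # the list was empty or held only None
```

Every additional supported object means another entry in a table like this, which is why covering every possible type does not scale.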

My advice is to just use None instead, or create a pandas Series with your pd.NaT values and then convert that to Polars.
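
As a rough sketch of the first suggestion, the helper below swaps NaN-like sentinels for None before the list reaches the constructor. `replace_missing_with_none` is a hypothetical name, not part of Polars or pandas. It assumes that pd.NaT, like float('nan'), compares unequal to itself, so it needs no pandas import; note that it also turns float NaN into None, which may not be what you want for float columns.

```python
from datetime import datetime


def replace_missing_with_none(values):
    """Return a copy of `values` with NaN-like sentinels replaced by None.

    Hypothetical helper: relies on NaN-like objects (float('nan'), and by
    assumption pd.NaT) comparing unequal to themselves.
    """
    cleaned = []
    for v in values:
        try:
            is_missing = bool(v != v)
        except (TypeError, ValueError):
            # Objects with unusual comparison semantics are left untouched.
            is_missing = False
        cleaned.append(None if is_missing else v)
    return cleaned


values = [datetime(2024, 5, 24), float("nan"), None]
cleaned = replace_missing_with_none(values)
```

The cleaned list can then be passed on, for example as `pl.DataFrame({'x': replace_missing_with_none([pd.NaT])}, schema={'x': pl.Datetime})`.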

I will take a look at our constructors to see if we can support this.

@stinodego added the labels enhancement (New feature or an improvement of an existing feature) and A-input-parsing (Area: parsing input arguments) and removed the label bug (Something isn't working) on May 25, 2024
@ritchie46
Member

We should not check for pd.NaT in conversion to AnyValue.
