.backward_fill() does not consider np.nan to be invalid #15494

Closed
2 tasks done
marc-at-brightnight opened this issue Apr 5, 2024 · 2 comments
Labels
bug (Something isn't working), needs triage (Awaiting prioritization by a maintainer), python (Related to Python Polars)

Comments

@marc-at-brightnight

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

import numpy as np
import polars as pl

df = pl.DataFrame({
    "f": [1.0, None, 1.0],
})

bfilled_df = df.select(pl.all().backward_fill())

assert bfilled_df[1, 'f'] == 1 # works

df = pl.DataFrame({
    "f": [1.0, np.nan, 1.0],
})

bfilled_df = df.select(pl.all().backward_fill())

assert bfilled_df[1, 'f'] == 1 # does not work

Log output

Traceback (most recent call last):
  File "/Users/mnhmbp/PycharmProjects/poweralpha/venv/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3577, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-4-cc1558aaef03>", line 17, in <module>
    assert bfilled_df[1, 'f'] == 1 # does not work
AssertionError

Issue description

.backward_fill() fills None values as expected, but it leaves np.nan values unchanged. This seems strange to me.

Expected behavior

.backward_fill() should fill np.nan values the same way it fills None.

Installed versions

--------Version info---------
Polars:               0.20.18
Index type:           UInt32
Platform:             macOS-14.2-arm64-arm-64bit
Python:               3.10.13 (main, Aug 24 2023, 12:59:26) [Clang 15.0.0 (clang-1500.0.40.1)]
----Optional dependencies----
adbc_driver_manager:  <not installed>
cloudpickle:          3.0.0
connectorx:           <not installed>
deltalake:            <not installed>
fastexcel:            <not installed>
fsspec:               2024.2.0
gevent:               <not installed>
hvplot:               <not installed>
matplotlib:           3.8.3
nest_asyncio:         <not installed>
numpy:                1.23.0
openpyxl:             3.1.2
pandas:               1.5.3
pyarrow:              15.0.1
pydantic:             2.6.3
pyiceberg:            <not installed>
pyxlsb:               <not installed>
sqlalchemy:           1.4.52
xlsx2csv:             <not installed>
xlsxwriter:           <not installed>
@marc-at-brightnight added the bug, needs triage, and python labels on Apr 5, 2024
@cmdlineluser
Contributor

cmdlineluser commented Apr 5, 2024

I think this is by-design:

These NaN values are considered to be a type of floating point data rather than missing data.

@orlp
Collaborator

orlp commented Apr 8, 2024

This is by design. Use .fill_nan(None) first if you want to also fill NaNs.

@orlp closed this as not planned on Apr 8, 2024