SchemaFieldNotFoundError when chaining select and collect #17739

Closed
2 tasks done
philiporlando opened this issue Jul 19, 2024 · 0 comments
Labels: bug (Something isn't working), needs triage (Awaiting prioritization by a maintainer), python (Related to Python Polars)

philiporlando commented Jul 19, 2024

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

import polars as pl

data = {
    "column1": [None, "x", "y"],
    "column2": ["a", None, "c"]
}

df = pl.LazyFrame(data)

df.select(pl.when(pl.col("column1").is_null()).then(False).otherwise(True))
# <LazyFrame at 0x13538E450>

df.select(pl.when(pl.col("column1").is_null()).then(False).otherwise(True)).collect()
# shape: (3, 1)
# ┌─────────┐
# │ literal │
# │ ---     │
# │ bool    │
# ╞═════════╡
# │ false   │
# │ true    │
# │ true    │
# └─────────┘

df.select(pl.when(pl.col("column1").is_null()).then(False).otherwise(True)).select("literal")
# <LazyFrame at 0x13FF16990>

df.select(pl.when(pl.col("column1").is_null()).then(False).otherwise(True)).select("literal").collect()
# Traceback (most recent call last):
#   File "c:\local\projects\polars-bug-17739\test.py", line 28, in <module>
#     df.select(pl.when(pl.col("column1").is_null()).then(False).otherwise(True)).select("literal").collect()      
#   File "C:\local\projects\polars-bug-17739\.venv\Lib\site-packages\polars\lazyframe\frame.py", line 1942, in collect
#     return wrap_df(ldf.collect(callback))
#                    ^^^^^^^^^^^^^^^^^^^^^
# polars.exceptions.SchemaFieldNotFoundError: literal

Log output

Traceback (most recent call last):
  File "C:\local\projects\polars-bug-17739\test.py", line 30, in <module>
    df.select(pl.when(pl.col("column1").is_null()).then(False).otherwise(True)).select("literal").collect()
  File "C:\local\projects\polars-bug-17739\.venv\Lib\site-packages\polars\lazyframe\frame.py", line 1942, in collect
    return wrap_df(ldf.collect(callback))
                   ^^^^^^^^^^^^^^^^^^^^^
polars.exceptions.SchemaFieldNotFoundError: literal

Issue description

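Chaining two select calls on a LazyFrame and then calling collect raises a SchemaFieldNotFoundError. In the example above, the first select produces a single column named literal from the unaliased when/then/otherwise expression, and collecting at that point works. Adding a second select("literal") still builds a LazyFrame without complaint, but collecting it fails because the literal field cannot be found.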

This bug was originally reported in pandera/issues#1657.

Expected behavior

The final chained select and collect call should execute without errors and return a DataFrame containing the single bool column literal, just as collecting after the first select does.

Installed versions

--------Version info---------
Polars:               1.2.1
Index type:           UInt32
Platform:             Windows-10-10.0.19045-SP0
Python:               3.12.1 (tags/v3.12.1:2305ca5, Dec  7 2023, 22:03:25) [MSC v.1937 64 bit (AMD64)]

----Optional dependencies----
adbc_driver_manager:  <not installed>
cloudpickle:          <not installed>
connectorx:           <not installed>
deltalake:            <not installed>
fastexcel:            <not installed>
fsspec:               <not installed>
gevent:               <not installed>
great_tables:         <not installed>
hvplot:               <not installed>
matplotlib:           <not installed>
nest_asyncio:         <not installed>
numpy:                <not installed>
openpyxl:             <not installed>
pandas:               <not installed>
pyarrow:              <not installed>
pydantic:             <not installed>
pyiceberg:            <not installed>
sqlalchemy:           <not installed>
torch:                <not installed>
xlsx2csv:             <not installed>
xlsxwriter:           <not installed>