polars.exceptions.ComputeError: could not append value: "2024-03-05T17:39:39Z" of type: str to the builder; make sure that all rows have the same schema or consider increasing infer_schema_length #16529

Closed
2 tasks done
rodrigogonegit opened this issue May 27, 2024 · 1 comment
Labels: bug, needs triage, python

rodrigogonegit commented May 27, 2024

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

import json
import polars as pl

schema_override = {
    "created_at": pl.Datetime,
}
# df = pl.read_json("example.json", schema_overrides=schema_override) # Loading the file directly works, for some reason

# This does not work
with open("example.json", 'r') as f:
    content = json.load(f)
    print(content)
    df = pl.DataFrame(content, schema_overrides=schema_override)


print(df.schema)

Content of example.json

[
    {
        "created_at": "2024-03-05T17:39:39Z"
    }
]

Log output

Traceback (most recent call last):
  File "/Users/ad_hoc_testing.py", line 30, in <module>
    df = pl.DataFrame(content, schema_overrides=schema_override)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rodrigo/Library/Caches/pypoetry/virtualenvs/integration-test-cli-MEwVG7Wo-py3.11/lib/python3.11/site-packages/polars/dataframe/frame.py", line 376, in __init__
    self._df = sequence_to_pydf(
               ^^^^^^^^^^^^^^^^^
  File "/Users/rodrigo/Library/Caches/pypoetry/virtualenvs/integration-test-cli-MEwVG7Wo-py3.11/lib/python3.11/site-packages/polars/_utils/construction/dataframe.py", line 433, in sequence_to_pydf
    return _sequence_to_pydf_dispatcher(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rodrigo/.pyenv/versions/3.11.9/lib/python3.11/functools.py", line 909, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rodrigo/Library/Caches/pypoetry/virtualenvs/integration-test-cli-MEwVG7Wo-py3.11/lib/python3.11/site-packages/polars/_utils/construction/dataframe.py", line 674, in _sequence_of_dict_to_pydf
    pydf = PyDataFrame.from_dicts(
           ^^^^^^^^^^^^^^^^^^^^^^^
polars.exceptions.ComputeError: could not append value: "2024-03-05T17:39:39Z" of type: str to the builder; make sure that all rows have the same schema or consider increasing `infer_schema_length`

it might also be that a value overflows the data-type's capacity

Issue description

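`pl.DataFrame` does not seem able to parse the ISO 8601 string into a `pl.Datetime` column when `schema_overrides` is passed, even though `pl.read_json` with the same override loads the file without error. Issue with similar output: #8902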

Expected behavior

I would expect the timestamp string to be parsed into a `pl.Datetime` column.
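A possible workaround, sketched here under assumptions rather than taken from the Polars docs, is to parse the timestamp strings with the standard library before building the frame. The helper name `parse_timestamps` and its `key` and `fmt` parameters are hypothetical and exist only for this illustration.

```python
from datetime import datetime


def parse_timestamps(records, key="created_at", fmt="%Y-%m-%dT%H:%M:%S%z"):
    """Return copies of the records with the string under `key` parsed to datetime.

    Hypothetical helper for illustration. Since Python 3.7, `%z` in
    `strptime` accepts a trailing "Z" and treats it as UTC.
    """
    parsed = []
    for record in records:
        record = dict(record)  # shallow copy so the input is left untouched
        record[key] = datetime.strptime(record[key], fmt)
        parsed.append(record)
    return parsed


# Same content as example.json in the report
content = [{"created_at": "2024-03-05T17:39:39Z"}]
parsed = parse_timestamps(content)
print(parsed[0]["created_at"].isoformat())
```

Passing `parsed` to `pl.DataFrame` without `schema_overrides` should then let Polars infer a datetime column from the Python `datetime` objects, although I have not verified the exact resulting dtype on 0.20.19.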

Installed versions

--------Version info---------
Polars:               0.20.19
Index type:           UInt32
Platform:             macOS-14.5-arm64-arm-64bit
Python:               3.11.9 (main, May  2 2024, 12:16:02) [Clang 15.0.0 (clang-1500.3.9.4)]

----Optional dependencies----
adbc_driver_manager:  <not installed>
cloudpickle:          <not installed>
connectorx:           <not installed>
deltalake:            <not installed>
fastexcel:            <not installed>
fsspec:               <not installed>
gevent:               <not installed>
hvplot:               <not installed>
matplotlib:           <not installed>
nest_asyncio:         <not installed>
numpy:                1.26.4
openpyxl:             <not installed>
pandas:               2.2.2
pyarrow:              <not installed>
pydantic:             <not installed>
pyiceberg:            <not installed>
pyxlsb:               <not installed>
sqlalchemy:           <not installed>
xlsx2csv:             <not installed>
xlsxwriter:           <not installed>

@rodrigogonegit added the bug, needs triage, and python labels May 27, 2024
@stinodego (Member)

Thanks for the report. This is a duplicate of #15882 - I am closing in favor of that one.

@stinodego closed this as not planned May 27, 2024