Description
Is your feature request related to a problem?
We’re using the native ILP Python client (sender.dataframe()) to ingest Polars DataFrames after converting them to Pandas. When we convert with use_pyarrow_extension_array=True (i.e. DataFrame.to_pandas(use_pyarrow_extension_array=True)), QuestDB rejects the batch with errors such as:

```
Unsupported dtype large_string[pyarrow]
Unsupported dtype double[pyarrow]
```
To work around this we currently rerun the conversion with use_pyarrow_extension_array=False, which copies every column back to NumPy/Python dtypes. That re-conversion adds ~30–40% CPU and memory overhead for large batches, and prevents us from using the zero-copy Arrow path that Polars/Pandas now offer.
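For concreteness, here is a condensed sketch of the current behavior and the workaround (the connection string, table, and column names are illustrative, not from our production pipeline):

```python
import polars as pl
from questdb.ingress import Sender, TimestampNanos

pl_frame = pl.DataFrame({
    "symbol": ["AAPL", "MSFT"],
    "price": [187.5, 412.3],
})

# Arrow-backed conversion: dtypes come back as
# large_string[pyarrow] and double[pyarrow].
df_arrow = pl_frame.to_pandas(use_pyarrow_extension_array=True)

with Sender.from_conf("http::addr=localhost:9000;") as sender:
    # Rejected today with "Unsupported dtype large_string[pyarrow]":
    # sender.dataframe(df_arrow, table_name="quotes", at=TimestampNanos.now())

    # Workaround: re-materialize every column as NumPy-backed dtypes,
    # paying the extra copy described above.
    df_numpy = pl_frame.to_pandas(use_pyarrow_extension_array=False)
    sender.dataframe(df_numpy, table_name="quotes", at=TimestampNanos.now())
```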
Feature request
- Allow sender.dataframe() to accept Pandas columns backed by pyarrow dtypes (e.g., string[pyarrow], float64[pyarrow], timestamp[pyarrow]).
- Alternatively, provide an option that lets the client detect pyarrow-backed columns and convert them server-side, without forcing us to re-materialize the entire batch in Python.
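Pending native support, a hypothetical client-side shim illustrates the detection half of the second option (downcast_arrow_columns is our name for it, not an existing API): it re-materializes only the Arrow-backed columns rather than re-converting the whole frame.

```python
import pandas as pd

def downcast_arrow_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical shim: copy only pyarrow-backed columns to NumPy
    dtypes, leaving already-compatible columns untouched."""
    out = df.copy(deep=False)  # shallow copy; untouched columns are shared
    for name, dtype in df.dtypes.items():
        if isinstance(dtype, pd.ArrowDtype):
            # Materialize just this column instead of the whole frame.
            out[name] = df[name].to_numpy()
    return out
```

Even this per-column copy is only a stopgap; accepting Arrow-backed columns directly would avoid the copy entirely.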
Why it matters
- Newer Pandas/Polars pipelines default to Arrow-backed storage for performance; QuestDB ingestion currently forces an extra copy step.
- Importing tens of millions of rows per batch becomes CPU-bound on the client simply because we have to downgrade data types.
Happy to provide sample code or traces if needed. Thanks for considering!
Describe the solution you'd like.
No response
Describe alternatives you've considered.
No response
Full Name:
Lixiang Cao
Affiliation:
I am a freelancer
Additional context
No response