Open
Summary
Several upstream DataFusion v53 SessionContext methods for reading data sources, writing query results, and registering tables are not yet exposed in datafusion-python.
Missing Methods
Read methods:
- read_arrow: read an Arrow IPC file into a DataFrame
- read_batch: read a single RecordBatch into a DataFrame
- read_batches: read multiple RecordBatches into a DataFrame
- read_empty: create an empty DataFrame with a given schema
Write methods:
- write_csv: write query results to CSV directly from context
- write_json: write query results to JSON directly from context
- write_parquet: write query results to Parquet directly from context
Registration:
- register_arrow: register an Arrow IPC file as a table
- register_batch: register a single RecordBatch as a table
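To confirm which of these are still missing in a given build, a quick check along the following lines may help. This is only a minimal sketch: `missing_methods` and `StandInContext` are hypothetical names introduced here, and the stand-in class merely mimics a context that exposes one of the listed methods. With datafusion-python installed, `datafusion.SessionContext` could presumably be passed in its place.

```python
# Method names copied from the lists above.
UPSTREAM_METHODS = [
    "read_arrow", "read_batch", "read_batches", "read_empty",
    "write_csv", "write_json", "write_parquet",
    "register_arrow", "register_batch",
]


def missing_methods(cls, names=UPSTREAM_METHODS):
    """Return the names in `names` that `cls` does not expose as callables."""
    return [name for name in names if not callable(getattr(cls, name, None))]


class StandInContext:
    """Hypothetical stand-in for a context class (not the real SessionContext)."""

    def register_batch(self, name, batch):
        raise NotImplementedError


if __name__ == "__main__":
    # Prints every listed name that StandInContext lacks.
    print(missing_methods(StandInContext))
```

The check only looks at attribute names, so it cannot tell whether an equivalent method exists under a different name.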
Upstream Reference
Implementation
- Rust bindings: crates/core/src/context.rs
- Python wrappers: python/datafusion/context.py
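As a rough illustration of how one of these methods might be wired through the two layers, the sketch below shows a Python wrapper delegating to an inner binding object. It assumes the wrapper keeps the Rust-backed context in an attribute and wraps whatever the binding returns in a Python-level DataFrame class. Every class name here, and the `read_batches` signature, is a hypothetical stand-in so the example runs without the compiled extension; plain lists stand in for RecordBatches. It is not the project's actual code.

```python
class StubInternalDataFrame:
    """Hypothetical stand-in for the object a Rust binding would return."""

    def __init__(self, batches):
        self.batches = list(batches)


class StubInternalContext:
    """Hypothetical stand-in for the Rust-backed context."""

    def read_batches(self, batches):
        return StubInternalDataFrame(batches)


class SketchDataFrame:
    """Python-level wrapper around the internal DataFrame (assumed pattern)."""

    def __init__(self, df):
        self.df = df

    def batch_count(self):
        return len(self.df.batches)


class SketchSessionContext:
    """Python-level wrapper that delegates to the internal context."""

    def __init__(self, internal=None):
        # Fall back to the stub when no binding object is supplied.
        self.ctx = internal if internal is not None else StubInternalContext()

    def read_batches(self, batches):
        # Delegate to the binding, then wrap the result for Python callers.
        return SketchDataFrame(self.ctx.read_batches(batches))


if __name__ == "__main__":
    df = SketchSessionContext().read_batches([[1, 2], [3]])
    print(df.batch_count())
```

Under this assumed pattern, each missing method would need a binding in the Rust file and a thin delegating wrapper in the Python file.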
Note: This gap analysis was performed using an AI agent comparing upstream DataFusion v53 documentation against the current datafusion-python codebase.