Add missing SessionContext read/write and registration methods #1458

@timsaucer

Summary

Several SessionContext methods available in upstream DataFusion v53 for reading data sources, writing query results, and registering tables are not yet exposed in datafusion-python.

Missing Methods

Read methods:

  • read_arrow — read an Arrow IPC file into a DataFrame
  • read_batch — read a single RecordBatch into a DataFrame
  • read_batches — read multiple RecordBatches into a DataFrame
  • read_empty — create an empty DataFrame with a given schema

Write methods:

  • write_csv — write query results to CSV directly from context
  • write_json — write query results to JSON directly from context
  • write_parquet — write query results to Parquet directly from context

Registration:

  • register_arrow — register an Arrow IPC file as a table
  • register_batch — register a single RecordBatch as a table

Upstream Reference

Implementation

  • Rust bindings: crates/core/src/context.rs
  • Python wrappers: python/datafusion/context.py
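As a rough illustration of the wrapper side, the sketch below shows how thin Python methods could delegate to the Rust-backed object. Everything here is an assumption for illustration only: `FakeInternalContext` is a hypothetical stand-in for the Rust binding, `SessionContextSketch` is not the real class, and the signatures simply mirror the method names listed above rather than any confirmed datafusion-python API. Plain Python objects stand in for RecordBatches so the example runs without pyarrow.

```python
from typing import Any, Dict, Iterable, List


class FakeInternalContext:
    """Hypothetical stand-in for the Rust-backed context (not the real binding)."""

    def __init__(self) -> None:
        # Maps table name -> list of batches registered under that name.
        self.tables: Dict[str, List[Any]] = {}

    def register_batch(self, name: str, batch: Any) -> None:
        if name in self.tables:
            raise ValueError(f"table {name!r} is already registered")
        self.tables[name] = [batch]

    def read_batches(self, batches: List[Any]) -> List[Any]:
        # The real binding would return a DataFrame; a list stands in here.
        return list(batches)


class SessionContextSketch:
    """Illustrative Python wrapper; assumes the internal object lives on self.ctx."""

    def __init__(self, internal: Any = None) -> None:
        self.ctx = internal if internal is not None else FakeInternalContext()

    def register_batch(self, name: str, batch: Any) -> None:
        """Register a single batch as a table (delegates to the internal context)."""
        self.ctx.register_batch(name, batch)

    def read_batches(self, batches: Iterable[Any]) -> List[Any]:
        """Read multiple batches (delegates to the internal context)."""
        return self.ctx.read_batches(list(batches))

    def read_batch(self, batch: Any) -> List[Any]:
        """Read a single batch by reusing the multi-batch path."""
        return self.read_batches([batch])


sketch = SessionContextSketch()
sketch.register_batch("t", {"a": [1, 2]})
single = sketch.read_batch({"a": [1]})
```

Routing `read_batch` through `read_batches` is one possible design choice that keeps the single-batch method a one-liner; the actual implementation may instead bind each upstream method separately.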

Note: This gap analysis was performed using an AI agent comparing upstream DataFusion v53 documentation against the current datafusion-python codebase.
