timsaucer
Member

Which issue does this PR close?

Closes #1005

Rationale for this change

In addition to closing #1005, this exposes an important DataFrame operation: writing to tables. This functionality exists in the upstream DataFusion project but has not previously been exposed to Python. Now that we have external table support and external catalogs, we should make this function accessible to users.

What changes are included in this PR?

  • Adds dataframe writer Python wrapper.
  • Adds insert operation enum Python wrapper.
  • Exposes DataFrame.write_table.
  • Adds unit test coverage.
  • Adds dataframe writer options to write_csv, write_json, and write_parquet.

Are there any user-facing changes?

There are no breaking changes. The existing methods (write_csv, write_json, and write_parquet) have a new optional parameter. If it is not provided, their behavior is unchanged.

@timsaucer timsaucer requested a review from Copilot October 7, 2025 17:26

@Copilot Copilot AI left a comment

Pull Request Overview

This PR exposes the DataFrame.write_table functionality from DataFusion to Python, along with supporting dataframe writer options. It enables users to write DataFrame results directly to registered tables with configurable write operations and formatting options.

  • Adds Python wrappers for DataFrameWriteOptions and InsertOp enum
  • Introduces DataFrame.write_table method for writing to registered tables
  • Enhances existing write methods (write_csv, write_json, write_parquet) with optional write options

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

File | Description
--- | ---
src/lib.rs | Registers new Python classes for insert operations and write options
src/dataframe.rs | Implements Rust bindings for write options and insert operations, updates write methods
python/datafusion/dataframe.py | Adds Python wrapper classes and updates DataFrame write methods with new options
python/datafusion/__init__.py | Exports new classes in public API
python/tests/test_dataframe.py | Adds comprehensive test coverage for new functionality

Comments suppressed due to low confidence (1)

python/tests/test_dataframe.py:1

  • The parameter name write_options is inconsistent with the Rust function signature which expects a positional parameter, not a keyword argument.
# Licensed to the Apache Software Foundation (ASF) under one

timsaucer and others added 3 commits October 7, 2025 13:32
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@timsaucer timsaucer marked this pull request as ready for review October 7, 2025 20:16
@timsaucer timsaucer self-assigned this Oct 7, 2025
@timsaucer
Member Author

Would anyone be able to review? Maybe @kosiew, @crystalxyz, @mesejo, or @kevinjqliu? This is really just exposing options that already exist upstream.

Contributor

@kosiew kosiew left a comment

Found something that you'll want to change.

Comment on lines +1310 to +1312
self._raw_write_options = DataFrameWriteOptionsInternal(
    insert_operation, single_file_output, partition_by, sort_by_raw
)
Contributor

@kosiew kosiew Oct 12, 2025

I think you can't pass insert_operation directly here.

E.g., this test will fail:

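import pytest

from datafusion import DataFrameWriteOptions, InsertOp


def test_dataframe_write_options_accepts_insert_op() -> None:
    """DataFrameWriteOptions should accept InsertOp enums."""

    try:
        DataFrameWriteOptions(insert_operation=InsertOp.REPLACE)
    except TypeError as exc:
        # A TypeError here means the wrapper enum reached the internal class unconverted.
        pytest.fail(f"DataFrameWriteOptions rejected InsertOp: {exc}")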
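
One way the wrapper could address this is to unwrap the Python-side enum before calling the internal constructor. The sketch below is only an illustration under stated assumptions: `InsertOpInternal` and the pure-Python `DataFrameWriteOptionsInternal` are hypothetical stand-ins for the Rust-backed classes, and it assumes the Python `InsertOp` enum stores the internal member as its `.value`. The real bindings may be structured differently.

```python
from __future__ import annotations

from enum import Enum


class InsertOpInternal(Enum):
    """Hypothetical stand-in for the Rust-backed enum (not the real binding)."""

    APPEND = "append"
    REPLACE = "replace"
    OVERWRITE = "overwrite"


class DataFrameWriteOptionsInternal:
    """Hypothetical stand-in that, like a strict native constructor,
    rejects anything other than the internal enum type."""

    def __init__(self, insert_operation, single_file_output, partition_by, sort_by):
        if insert_operation is not None and not isinstance(
            insert_operation, InsertOpInternal
        ):
            msg = f"expected InsertOpInternal, got {type(insert_operation).__name__}"
            raise TypeError(msg)
        self.insert_operation = insert_operation
        self.single_file_output = single_file_output
        self.partition_by = partition_by
        self.sort_by = sort_by


class InsertOp(Enum):
    """Python-facing enum; each value holds the matching internal member (assumed)."""

    APPEND = InsertOpInternal.APPEND
    REPLACE = InsertOpInternal.REPLACE
    OVERWRITE = InsertOpInternal.OVERWRITE


class DataFrameWriteOptions:
    """Simplified wrapper that converts arguments before crossing the boundary."""

    def __init__(
        self,
        insert_operation: InsertOp | None = None,
        single_file_output: bool = False,
        partition_by: list[str] | None = None,
        sort_by: list[str] | None = None,
    ) -> None:
        # Unwrap the Python enum instead of passing it through directly.
        raw_op = insert_operation.value if insert_operation is not None else None
        self._raw_write_options = DataFrameWriteOptionsInternal(
            raw_op, single_file_output, partition_by, sort_by
        )


# With the unwrap in place, constructing the options no longer raises TypeError.
options = DataFrameWriteOptions(insert_operation=InsertOp.REPLACE)
```

Passing `InsertOp.REPLACE` straight to the stand-in internal class raises `TypeError`, which mirrors the failure described in the comment above. Unwrapping with `.value` keeps the conversion in one place on the Python side.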

Development

Successfully merging this pull request may close these issues.

Expose all write options