ENH: Implement to_iceberg #61507

Open. Wants to merge 14 commits into base: main.

Conversation

datapythonista (Member)

@datapythonista datapythonista added the IO Data IO issues that don't fit into a more specific label label May 27, 2025
"""
Write a DataFrame to an Apache Iceberg table.

.. versionadded:: 3.0.0
Member:

Could you add an experimental tag to this API as well like we did with read_iceberg?

Member Author


Absolutely, I forgot about that. Added it now. I also expanded the Iceberg user guide docs with to_iceberg, which I had also forgotten. Thanks for the review and the feedback!

*,
catalog_properties: dict[str, Any] | None = None,
location: str | None = None,
snapshot_properties: dict[str, str] | None = None,

@IsaacWarren IsaacWarren Jun 3, 2025


Any thoughts on adding an `append` parameter to match `to_parquet`? Something like:

append: bool = False

Then this could default to `table.overwrite` instead of `table.append`. I think it might be confusing if this doesn't match the other `to_*` functions.
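A minimal sketch of how such a flag might dispatch between PyIceberg's two write methods. The helper name and the `Protocol` stand-in (used so the sketch runs without a configured Iceberg catalog) are assumptions for illustration, not the PR's actual code:

```python
from typing import Any, Protocol


class IcebergTableLike(Protocol):
    """Minimal stand-in for the two pyiceberg Table methods discussed here."""

    def append(self, df: Any) -> None: ...
    def overwrite(self, df: Any) -> None: ...


def write_arrow(table: IcebergTableLike, arrow_table: Any, append: bool = False) -> None:
    # Proposed semantics: the default overwrites the table contents
    # (matching how to_parquet replaces an existing file), while
    # append=True adds the rows on top of the existing data.
    if append:
        table.append(arrow_table)
    else:
        table.overwrite(arrow_table)
```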

Member Author


How does PyIceberg support it?


With the table.overwrite method

Member Author


Of course, I didn't think about it. I'll add it, thanks for the feedback.

identifier=table_identifier,
schema=arrow_table.schema,
location=location,
# we could add `partition_spec`, `sort_order` and `properties` in the future


I definitely think these would be great to have, but I don't really have any ideas on how to do it without just using PyIceberg objects.

Member Author


Adding them later is easy if we think of a good signature. That's why I didn't worry too much about adding them.

@datapythonista (Member Author)

Added the append parameter. I think it's a great addition, thanks for the feedback @IsaacWarren.

I was thinking that, for the parameters that receive PyIceberg objects, one option is to use a generic `**kwargs` like `to_parquet` does, forwarded to the engine (only PyIceberg so far). This wouldn't directly expose PyIceberg details in our API, and those parameters could still be supported. It would be very simple if only one PyIceberg method received extra parameters, but there are a couple in `overwrite` where the same could be done, and that makes it a bit trickier. I think it's still best to leave this to a follow-up PR, so it can be analyzed and discussed in greater detail. And even if it's done after this is released in 3.0, it's no big deal, since there are no backward compatibility problems.
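A sketch of why a single flat `**kwargs` gets tricky once more than one engine method accepts extras. The key names and the routing below are purely illustrative assumptions, not a proposed design:

```python
from typing import Any

# Hypothetical: which engine kwargs would belong to table creation vs. the
# write call. These names are assumptions for illustration only.
_CREATE_KWARGS = {"partition_spec", "sort_order", "properties"}


def split_engine_kwargs(kwargs: dict[str, Any]) -> tuple[dict[str, Any], dict[str, Any]]:
    """Route a flat **kwargs between the create step and the write step.

    If only one engine method took extra options, forwarding **kwargs as-is
    would suffice; with several, each option must be routed explicitly,
    which is the ambiguity that makes this worth a follow-up PR.
    """
    create = {k: v for k, v in kwargs.items() if k in _CREATE_KWARGS}
    write = {k: v for k, v in kwargs.items() if k not in _CREATE_KWARGS}
    return create, write
```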

Labels: IO
3 participants