feat: Implement rewrite data files functionality #2

Draft
EnyMan wants to merge 1 commit into upsert-optimization from rewrite-data-files

EnyMan (Owner) commented Jan 22, 2026

This pull request adds data file compaction to the MaintenanceTable class in pyiceberg. The main change is a new rewrite_data_files method, which lets users compact many small data files into fewer, larger files close to a target size. This improves storage efficiency and read performance.

New compaction feature:

  • Added a rewrite_data_files method to the MaintenanceTable class. It provides a RewriteDataFiles builder for configuring and executing data file compaction, and it is documented with usage examples.
  • Updated the type-checking imports to include RewriteDataFiles from pyiceberg.table.update.snapshot.
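
The description does not show how RewriteDataFiles selects or groups files, so the following is only a rough, library-independent sketch of the general idea behind compaction planning: pack files smaller than a target size into groups that can each be rewritten as one larger file. The names `FileGroup`, `plan_compaction`, and `target_bytes` are hypothetical and are not part of pyiceberg or of this PR; the grouping strategy (first-fit over sizes sorted in descending order) is likewise an assumption chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FileGroup:
    """A set of small files to be rewritten together (hypothetical helper)."""
    paths: list = field(default_factory=list)
    total_bytes: int = 0


def plan_compaction(file_sizes, target_bytes):
    """Group files smaller than target_bytes into bins of at most target_bytes.

    file_sizes maps a file path to its size in bytes. Files at or above the
    target are left alone. Groups that end up with a single file are dropped,
    since rewriting one file on its own would not reduce the file count.
    """
    # Largest-first ordering tends to give tighter packing with first-fit.
    small = sorted(
        ((size, path) for path, size in file_sizes.items() if size < target_bytes),
        reverse=True,
    )
    groups = []
    for size, path in small:
        for group in groups:
            if group.total_bytes + size <= target_bytes:
                group.paths.append(path)
                group.total_bytes += size
                break
        else:
            # No existing group has room, so open a new one.
            groups.append(FileGroup(paths=[path], total_bytes=size))
    return [g for g in groups if len(g.paths) > 1]


if __name__ == "__main__":
    # Sizes are in arbitrary units for readability.
    sizes = {"a": 60, "b": 50, "c": 40, "d": 30, "e": 200}
    for group in plan_compaction(sizes, target_bytes=100):
        print(group.paths, group.total_bytes)
```

In this sketch, file "e" is already above the target and is skipped, while the four small files are combined into two groups. A real implementation would also need to respect partition boundaries and delete files, which this example ignores.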

EnyMan force-pushed the upsert-optimization branch from 5d1a4dd to d13e3d5 on January 22, 2026 20:07
EnyMan force-pushed the rewrite-data-files branch from dbb3d32 to e4610c6 on January 22, 2026 20:10
EnyMan force-pushed the upsert-optimization branch from 9b80939 to e36d994 on January 22, 2026 20:20