Parquet per column compression #16090

@mengna-lin

Feature Request / Improvement

Enable per-column compression for Parquet files written by Iceberg, based on apache/parquet-java#3526 and apache/parquet-java#3396.

Example: an Iceberg table could be created with a table-wide default codec plus a per-column override:

      spark.sql(
          "CREATE TABLE local.default.test_per_col ("
              + "  int_col int,"
              + "  string_col string"
              + ") USING iceberg"
              + " TBLPROPERTIES ("
              + "  'write.parquet.compression-codec' = 'zstd',"
              + "  'write.parquet.compression-codec.column.int_col' = 'snappy'"
              + ")");
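To make the intended lookup order concrete, here is a minimal sketch of how a writer might resolve the codec for a column from the table properties: a per-column override wins, otherwise the table-wide codec applies. The class `PerColumnCodecResolver`, its methods, and the `fallback` parameter are hypothetical names for illustration only. They are not existing Iceberg or parquet-java APIs, and the `.column.` prefix is the one proposed in this issue, not a released property.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Iceberg or parquet-java.
final class PerColumnCodecResolver {
  // Existing table-wide property, as used in the example above.
  static final String DEFAULT_CODEC_KEY = "write.parquet.compression-codec";
  // Per-column prefix as proposed in this issue (an assumption, not a released property).
  static final String COLUMN_CODEC_PREFIX = DEFAULT_CODEC_KEY + ".column.";

  private PerColumnCodecResolver() {}

  // Returns the column override if present, else the table-wide codec,
  // else the caller-supplied fallback.
  static String codecFor(Map<String, String> props, String column, String fallback) {
    String override = props.get(COLUMN_CODEC_PREFIX + column);
    if (override != null) {
      return override;
    }
    return props.getOrDefault(DEFAULT_CODEC_KEY, fallback);
  }

  // Collects all per-column overrides as a map of column name to codec name.
  static Map<String, String> columnOverrides(Map<String, String> props) {
    Map<String, String> overrides = new HashMap<>();
    for (Map.Entry<String, String> entry : props.entrySet()) {
      if (entry.getKey().startsWith(COLUMN_CODEC_PREFIX)) {
        String column = entry.getKey().substring(COLUMN_CODEC_PREFIX.length());
        overrides.put(column, entry.getValue());
      }
    }
    return overrides;
  }
}
```

With the properties from the example table, `codecFor(props, "int_col", ...)` would resolve to `snappy` and `codecFor(props, "string_col", ...)` to `zstd`. How nested or dotted column names should map onto the property key is left open here.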

Query engine

None

Willingness to contribute

  • I can contribute this improvement/feature independently
  • I would be willing to contribute this improvement/feature with guidance from the Iceberg community
  • I cannot contribute this improvement/feature at this time

Labels

improvement (PR that improves existing functionality)