Apache Iceberg version 1.11.0 (org.apache.iceberg:iceberg-spark-runtime-4.0_2.13:1.11.0) #16567

@soumilshah1995

Apache Iceberg version

1.11.0 (latest release)

Query engine

Spark

Please describe the bug 🐞

Bug report

Apache Iceberg version

1.11.0 (org.apache.iceberg:iceberg-spark-runtime-4.0_2.13:1.11.0)

Spark version

Spark 4.0.2-amzn-0 (AWS EMR), PySpark, writeTo(...).append()

Catalog

Iceberg REST catalog → Amazon S3 Tables (SparkCatalog, S3FileIO)

What happened

Append to an Iceberg v3 table with VARIANT columns fails during Parquet file close when variant shredding is enabled and default column metrics are collected.

  • Read from S3 Parquet: OK
  • CREATE TABLE: OK
  • Append: fails in stage 3 at ParquetWriter.metrics() / DataWriter.close()

Dataset: ~669M rows (~216 GB), with parse_json(PROPERTIES) and parse_json(V).

Error message

java.util.NoSuchElementException
    at org.apache.iceberg.relocated.com.google.common.collect.Iterables.getOnlyElement(Iterables.java:262)
    at org.apache.iceberg.parquet.ParquetMetrics$MetricsVisitor$MetricsVariantVisitor.value(ParquetMetrics.java:480)
    at org.apache.iceberg.parquet.ParquetMetrics$MetricsVisitor$MetricsVariantVisitor.value(ParquetMetrics.java:410)
    at org.apache.iceberg.parquet.ParquetVariantVisitor.visitValue(ParquetVariantVisitor.java:215)
    at org.apache.iceberg.parquet.ParquetVariantVisitor.visitObjectFields(ParquetVariantVisitor.java:265)
    at org.apache.iceberg.parquet.ParquetVariantVisitor.visitValue(ParquetVariantVisitor.java:226)
    at org.apache.iceberg.parquet.ParquetVariantVisitor.visit(ParquetVariantVisitor.java:183)
    at org.apache.iceberg.parquet.ParquetMetrics$MetricsVisitor.variant(ParquetMetrics.java:362)
    at org.apache.iceberg.parquet.TypeWithSchemaVisitor.visitVariant(TypeWithSchemaVisitor.java:221)
    at org.apache.iceberg.parquet.ParquetMetrics.metrics(ParquetMetrics.java:101)
    at org.apache.iceberg.parquet.ParquetWriter.metrics(ParquetWriter.java:173)
    at org.apache.iceberg.io.DataWriter.close(DataWriter.java:90)
    at org.apache.iceberg.io.RollingFileWriter.closeCurrentWriter(RollingFileWriter.java:126)
    at org.apache.iceberg.spark.source.SparkWrite$PartitionedDataWriter.commit(SparkWrite.java:838)


Spark config:

spark.sql.iceberg.shred-variants=true
spark.sql.catalog.<catalog>.table-default.write.parquet.shred-variants=true
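For anyone trying to reproduce this from PySpark, here is a minimal sketch of applying the two settings above when building the session. The helper names, the app name, and the `my_catalog` catalog name are hypothetical; the rest of the catalog setup (REST endpoint, S3FileIO, and so on) is omitted and would need to be added for a real run.

```python
def shredding_conf(catalog: str) -> dict:
    """Return the two shredding settings from this report for one catalog name."""
    if not catalog:
        raise ValueError("catalog name must be non-empty")
    return {
        "spark.sql.iceberg.shred-variants": "true",
        f"spark.sql.catalog.{catalog}.table-default.write.parquet.shred-variants": "true",
    }


def build_session(catalog: str, app_name: str = "variant-shred-repro"):
    """Create a SparkSession with the settings applied (requires pyspark)."""
    # Imported lazily so the config helper can be used without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in shredding_conf(catalog).items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

Usage would be something like `spark = build_session("my_catalog")`, with `my_catalog` replaced by the actual catalog name.
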

Expected behavior

Append completes; Iceberg can compute metrics for shredded VARIANT columns without error.

Actual behavior

NoSuchElementException in MetricsVariantVisitor.value() when closing Parquet writers. It looks like the metrics code expects exactly one sub-value when visiting variant object fields and fails when there is none.
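
To illustrate the suspected failure mode: Guava's `Iterables.getOnlyElement` throws `NoSuchElementException` when the iterable is empty and `IllegalArgumentException` when it holds more than one element. The Python sketch below only mimics that contract. The exception class and the sample inputs are hypothetical stand-ins, not Iceberg's data structures, and whether the visitor really receives an empty collection for a shredded field is a guess from the stack trace.

```python
class NoSuchElementError(LookupError):
    """Stand-in for java.util.NoSuchElementException."""


def get_only_element(items):
    """Return the single element of items, mirroring Guava's getOnlyElement."""
    iterator = iter(items)
    missing = object()

    first = next(iterator, missing)
    if first is missing:
        # Empty input: this is the case that matches the reported exception.
        raise NoSuchElementError("expected exactly one element, found none")

    if next(iterator, missing) is not missing:
        # Guava raises IllegalArgumentException here; ValueError is the analog.
        raise ValueError("expected exactly one element, found several")

    return first


def describe(items):
    """Summarize what get_only_element does for a given input."""
    try:
        return f"ok: {get_only_element(items)}"
    except NoSuchElementError:
        return "NoSuchElementError"
    except ValueError:
        return "ValueError"
```

With this sketch, a one-element input returns that element, while an empty input raises the stand-in for the exception at the top of the trace.
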

Workaround

Disable metrics on the VARIANT columns, here `properties` and `v` (shredding still works):

ALTER TABLE catalog.test.events SET TBLPROPERTIES (
  'write.metadata.metrics.column.properties' = 'none',
  'write.metadata.metrics.column.v' = 'none'
);

After this, the append of ~669M rows succeeds with shredding enabled.
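
For tables with other VARIANT column names, the same statement can be generated rather than typed by hand. This is a small sketch: the helper name is made up, and it only builds the SQL text using the `write.metadata.metrics.column.<name>` property pattern from the workaround above.

```python
def disable_metrics_sql(table: str, columns: list) -> str:
    """Build an ALTER TABLE statement that sets metrics mode 'none' per column."""
    if not columns:
        raise ValueError("at least one column name is required")
    props = ",\n".join(
        f"  'write.metadata.metrics.column.{name}' = 'none'" for name in columns
    )
    # No trailing semicolon, so the text can be passed straight to spark.sql().
    return f"ALTER TABLE {table} SET TBLPROPERTIES (\n{props}\n)"
```

For example, `spark.sql(disable_metrics_sql("catalog.test.events", ["properties", "v"]))` would issue the same properties as the statement above, assuming an existing `spark` session.
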

Willingness to contribute

- [ ] I can contribute a fix for this bug independently
- [ ] I would be willing to contribute a fix for this bug with guidance from the Iceberg community
- [x] I cannot contribute a fix for this bug at this time
