SPJ with Bucket Partition Key: Error: Wrong class, expected java.lang.CharSequence, but was java.lang.Integer #15349

@ammarchalifah

Apache Iceberg version

Tried both:

  • iceberg-spark-runtime-3.5_2.12-1.10.1.jar
  • /usr/share/aws/iceberg/lib/iceberg-spark3-runtime.jar (from AWS EMR)

Query engine

Spark 3.5.6

Please describe the bug 🐞

I have a target table and an input table, both written in Iceberg. Based on the manifest, the partition spec of the target table is:

"partition-specs":[{"spec-id":0,"fields":[{"name":"collected_date","transform":"identity","source-id":29,"field-id":1000},{"name":"user_id_bucket","transform":"bucket[32]","source-id":2,"field-id":1001}]}]

And the partition spec of the input table is:

"partition-specs":[{"spec-id":0,"fields":[{"name":"collected_date","transform":"identity","source-id":29,"field-id":1000},{"name":"user_id_bucket","transform":"bucket[32]","source-id":2,"field-id":1001}]}]
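For context, here is a minimal sketch of how a bucket[32] transform maps a string to an integer bucket ID. It follows the hashing rule described in the Iceberg table spec (32-bit Murmur3 of the UTF-8 bytes with seed 0, then `(hash & Integer.MAX_VALUE) % N`). The function names `murmur3_32` and `iceberg_bucket` are mine and only illustrative; this is not the library's code.

```python
def murmur3_32(data: bytes, seed: int = 0) -> int:
    """32-bit Murmur3 (x86 variant), returned as an unsigned integer."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & 0xFFFFFFFF
    n = len(data)
    rounded = n - (n % 4)

    def mix(k: int) -> int:
        k = (k * c1) & 0xFFFFFFFF
        k = ((k << 15) | (k >> 17)) & 0xFFFFFFFF  # rotate left by 15
        return (k * c2) & 0xFFFFFFFF

    # Body: process 4-byte little-endian blocks.
    for i in range(0, rounded, 4):
        h ^= mix(int.from_bytes(data[i:i + 4], "little"))
        h = ((h << 13) | (h >> 19)) & 0xFFFFFFFF  # rotate left by 13
        h = (h * 5 + 0xE6546B64) & 0xFFFFFFFF

    # Tail: the remaining 1 to 3 bytes, if any.
    tail = data[rounded:]
    if tail:
        h ^= mix(int.from_bytes(tail, "little"))

    # Finalization (avalanche).
    h ^= n
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & 0xFFFFFFFF
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & 0xFFFFFFFF
    h ^= h >> 16
    return h


def iceberg_bucket(value: str, num_buckets: int) -> int:
    """Bucket ID for a string: an integer, regardless of the source type."""
    h = murmur3_32(value.encode("utf-8"))
    return (h & 0x7FFFFFFF) % num_buckets


# The source column is a string, but the partition value is an int.
bucket_id = iceberg_bucket("iceberg", 32)
print(type(bucket_id).__name__, bucket_id)
```

So the partition value stored for user_id_bucket is always an integer in the range 0 to 31, even though user_id itself is a string.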

The user_id field is a string column, but the bucket transform produces an integer partition value. I'm running a storage-partitioned join (SPJ) through a MERGE INTO that joins on user_id:

MERGE INTO target AS t
USING source AS s
ON t.post_id = s.post_id AND t.user_id = s.user_id
WHEN MATCHED AND (t.collected_at IS NULL OR t.collected_at <= s.collected_at)
    THEN UPDATE SET *
WHEN NOT MATCHED
    THEN INSERT *
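The exact session settings are not listed in this report. As an assumption, SPJ on Spark 3.5 with Iceberg is usually enabled with settings along the following lines. The dictionary name `SPJ_CONF` and the helper `apply_spj_conf` are hypothetical, and the values are a typical starting point rather than the configuration actually used here.

```python
# Settings commonly used to enable storage-partitioned joins.
# Assumed for illustration; not copied from the failing job.
SPJ_CONF = {
    "spark.sql.sources.v2.bucketing.enabled": "true",
    "spark.sql.sources.v2.bucketing.pushPartValues.enabled": "true",
    "spark.sql.sources.v2.bucketing.partiallyClusteredDistribution.enabled": "true",
    "spark.sql.requireAllClusterKeysForCoPartition": "false",
    "spark.sql.iceberg.planning.preserve-data-grouping": "true",
}


def apply_spj_conf(spark, conf=None):
    """Apply each setting to an existing SparkSession-like object."""
    for key, value in (conf or SPJ_CONF).items():
        spark.conf.set(key, value)
```

Calling `apply_spj_conf(spark)` before running the MERGE INTO would apply these settings to the current session.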

The MERGE INTO fails with this error:

pyspark.errors.exceptions.captured.IllegalArgumentException: Wrong class, expected java.lang.CharSequence, but was java.lang.Integer, for object: 1

Here's the trace:

2026-02-17 15:06:12,666 - job - ERROR - Error: Wrong class, expected java.lang.CharSequence, but was java.lang.Integer, for object: 1
Traceback (most recent call last):
  File "/mnt/yarn/usercache/hadoop/appcache/application_1771340623606_0001/container_1771340623606_0001_01_000001/pyspark.zip/pyspark/sql/session.py", line 1631, in sql
  File "/mnt/yarn/usercache/hadoop/appcache/application_1771340623606_0001/container_1771340623606_0001_01_000001/py4j-0.10.9.7-src.zip/py4j/java_gateway.py", line 1322, in __call__
  File "/mnt/yarn/usercache/hadoop/appcache/application_1771340623606_0001/container_1771340623606_0001_01_000001/pyspark.zip/pyspark/errors/exceptions/captured.py", line 185, in deco
pyspark.errors.exceptions.captured.IllegalArgumentException: Wrong class, expected java.lang.CharSequence, but was java.lang.Integer, for object: 1
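The trace above contains only Python frames, so it does not show where in the JVM the exception is raised. A minimal sketch for surfacing the JVM frames, assuming the Spark SQL setting `spark.sql.pyspark.jvmStacktrace.enabled` is available in this build; the helper name `run_with_jvm_trace` is hypothetical.

```python
def run_with_jvm_trace(spark, sql_text):
    """Run a SQL statement with JVM stack traces included in PySpark errors."""
    # With this setting on, captured exceptions append the JVM stack trace
    # to the Python-side error message.
    spark.conf.set("spark.sql.pyspark.jvmStacktrace.enabled", "true")
    return spark.sql(sql_text)
```

Re-running the MERGE INTO through this helper should show the Java frames that raise the IllegalArgumentException.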

Both the target and source tables are stored with format-version=2.

Willingness to contribute

  • I can contribute a fix for this bug independently
  • I would be willing to contribute a fix for this bug with guidance from the Iceberg community
  • I cannot contribute a fix for this bug at this time

Labels: bug
