Apache Iceberg version
1.6.1
Query engine
Spark
Please describe the bug 🐞
- Run
mkdir warehouse/default to create dir
- Download attachment for table data
- Run
tar -xzf iceberg.tar.gz && mv tmp/* /tmp/ to put table data into dir
- Start the spark-sql shell
/home/ubuntu/Apps/spark-3.5.4-bin-hadoop3/bin/spark-sql \
--packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.9.2 \
--conf spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkCatalog \
--conf spark.sql.catalog.spark_catalog.type=hadoop \
--conf spark.sql.catalog.spark_catalog.warehouse=/tmp/spark-warehouse-28428
- Run count(*); it shows 217 rows.
spark-sql (default)> select count(*) from tmp_table_gw0_754291685_0;
217
Time taken: 2.33 seconds, Fetched 1 row(s)
- Run
select * from tmp_table_gw0_754291685_0; it reports fetching 327 rows, not 750 rows.
spark-sql (default)> select * from tmp_table_gw0_754291685_0;
...
Time taken: 0.213 seconds, Fetched 327 row(s)
The oddest part is that the difference only appears the first time you run select * after select count(*). After that, every subsequent select * is back to normal.
I tested against 1.6.1, 1.7.2, and 1.9.2, and all of them failed.
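To make the mismatch easier to check across versions, here is a minimal sketch that parses the "Fetched N row(s)" summary line from captured spark-sql output and compares it with the count(*) value. The helper names `fetched_rows` and `count_matches_fetch` are hypothetical and not part of Spark or Iceberg; the sketch only assumes the summary-line format shown in the output above.

```python
import re

# Matches the summary line spark-sql prints after each statement,
# e.g. "Time taken: 0.213 seconds, Fetched 327 row(s)".
FETCHED_RE = re.compile(r"Fetched (\d+) row\(s\)")


def fetched_rows(output: str) -> list:
    """Return the fetched-row count of every statement found in the output."""
    return [int(n) for n in FETCHED_RE.findall(output)]


def count_matches_fetch(count_value: int, select_output: str) -> bool:
    """True if the last 'Fetched N row(s)' in select_output equals count_value."""
    rows = fetched_rows(select_output)
    if not rows:
        raise ValueError("no 'Fetched N row(s)' line found in output")
    return rows[-1] == count_value


if __name__ == "__main__":
    # Output copied from the select * run in this report.
    select_output = "...\nTime taken: 0.213 seconds, Fetched 327 row(s)\n"
    print(count_matches_fetch(217, select_output))  # prints False
```

Feeding it the output of the first and the second select * run after count(*) would show whether only the first run disagrees with the count.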
iceberg.tar.gz
Willingness to contribute