chore: Add microbenchmark for IcebergScan operator serde roundtrip #3296
Conversation
This benchmark measures the serialization/deserialization performance of Iceberg FileScanTask objects to protobuf, starting from actual Iceberg Java objects rather than pre-constructed protobuf messages. The benchmark:
- Creates a real Iceberg table with a configurable number of partitions
- Extracts FileScanTask objects through query planning
- Benchmarks conversion from FileScanTask to Protobuf
- Benchmarks serialization to bytes and deserialization

Usage: make benchmark-org.apache.spark.sql.benchmark.CometOperatorSerdeBenchmark -- 30000

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
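For orientation, here is a minimal sketch of what such a roundtrip benchmark can look like, built on Spark's Benchmark utility and Iceberg's scan-planning API. The toProto and fromBytes parameters are stand-ins for the PR's actual conversion code (the CometIcebergNativeScan.convert() path); they are assumptions for illustration, not the code added here.

```scala
import scala.collection.JavaConverters._

import com.google.protobuf.Message
import org.apache.iceberg.{FileScanTask, Table}
import org.apache.spark.benchmark.Benchmark

object IcebergScanSerdeSketch {
  // `toProto` and `fromBytes` are placeholders for the real FileScanTask <->
  // protobuf conversion, not Comet's actual API.
  def run(
      table: Table,
      numPartitions: Int,
      toProto: FileScanTask => Message,
      fromBytes: Array[Byte] => Message): Unit = {
    // Plan the scan to obtain the FileScanTask objects the benchmark starts from.
    val tasks: Seq[FileScanTask] = table.newScan().planFiles().asScala.toSeq

    val benchmark = new Benchmark(
      s"IcebergScan serde ($numPartitions partitions, ${tasks.size} tasks)",
      tasks.size.toLong)

    // Phase 1: Iceberg object -> protobuf message.
    benchmark.addCase("FileScanTask -> protobuf") { _ =>
      tasks.foreach(toProto)
    }

    // Phase 2: protobuf message -> bytes -> protobuf message.
    benchmark.addCase("protobuf -> bytes -> protobuf") { _ =>
      tasks.foreach { task =>
        val bytes = toProto(task).toByteArray
        fromBytes(bytes)
      }
    }

    benchmark.run()
  }
}
```

Measuring the two phases separately is what lets the results attribute cost to object conversion versus raw protobuf serialization.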
}

/**
 * Creates an Iceberg table with the specified number of partitions. Each partition contains one
Is this true? Often when I write in batches to Iceberg tables I get different files for iterations of inserts, and then need to do a compaction after.
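For context on that point, a typical pattern looks something like this, assuming a SparkSession named spark with the Iceberg Spark SQL extensions enabled; the catalog and table names are illustrative, and rewrite_data_files is Iceberg's standard compaction procedure:

```scala
// Repeated batch inserts typically add new data files to each affected partition.
spark.sql("INSERT INTO demo.db.events SELECT * FROM staging_batch_1")
spark.sql("INSERT INTO demo.db.events SELECT * FROM staging_batch_2")

// Compaction merges the small files back down, so "one file per partition"
// only holds after a rewrite like this (or when each partition is written exactly once).
spark.sql("CALL demo.system.rewrite_data_files(table => 'db.events')")
```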
// Benchmark the serialization
val iterations = 100
val benchmark = new Benchmark(
  s"IcebergScan serde ($numPartitions partitions, ${tasks.size()} tasks)",
The table has numPartitions partitions, right? But does the query run with that many? I figured Spark would use however many shuffle partitions it's configured for, and if num shuffle partitions < num table partitions, table partitions get grouped together. Could you confirm what you mean here? For serde it might be interesting to measure Spark partitions, not table partitions. I just want to make sure we're measuring what we expect here.
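One quick way to see the Spark-side numbers for a given run (the table name and spark session here are illustrative, not from this PR) would be:

```scala
// Compare the table's partition count against what Spark actually produces for
// the scan; for serde cost, the number of FileScanTasks is what ultimately matters.
val df = spark.table("demo.db.benchmark_table")
println(s"Spark scan partitions:        ${df.rdd.getNumPartitions}")
println(s"spark.sql.shuffle.partitions: ${spark.conf.get("spark.sql.shuffle.partitions")}")
```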
mbutrovich left a comment
Thanks @andygrove. Let's merge this so we can quantify the serialization efforts we're doing. This is a good foundation.
Summary
This PR adds a microbenchmark for measuring the serialization/deserialization performance of Iceberg FileScanTask objects to protobuf. The benchmark:
- Creates a real Iceberg table with a configurable number of partitions
- Extracts FileScanTask objects through query planning
- Benchmarks conversion from FileScanTask to Protobuf via CometIcebergNativeScan.convert()
- Benchmarks serialization to bytes and deserialization

Usage

make benchmark-org.apache.spark.sql.benchmark.CometOperatorSerdeBenchmark -- 30000
Sample Results (1000 partitions)
Key insight: The conversion from FileScanTask to protobuf dominates (~99% of time). Protobuf parsing is extremely fast.

Serialized size: 178.7 KB for 1000 tasks (~179 bytes/task)
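The per-task figure is just total serialized bytes divided by task count. A sketch of how such a number can be computed, where toProto again stands in for the actual conversion code:

```scala
import com.google.protobuf.Message
import org.apache.iceberg.FileScanTask

// Average serialized size per task, e.g. ~178.7 KB / 1000 tasks ≈ 179 bytes/task.
def avgSerializedBytes(tasks: Seq[FileScanTask], toProto: FileScanTask => Message): Double = {
  val totalBytes = tasks.map(t => toProto(t).toByteArray.length.toLong).sum
  totalBytes.toDouble / tasks.size
}
```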
Test plan
🤖 Generated with Claude Code