Background
The current implementation relies on hard-coded class name strings to detect Iceberg scan types:
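```scala
// Not in Iceberg's Spark source package: not an Iceberg scan.
if (!scanClassName.startsWith("org.apache.iceberg.spark.source.")) {
  return None
}
// Changelog scans are explicitly excluded.
if (scanClassName == "org.apache.iceberg.spark.source.SparkChangelogScan") {
  return None
}
// Only Iceberg's SparkInputPartition is accepted.
if (className != "org.apache.iceberg.spark.source.SparkInputPartition") {
  return None
}
```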
This approach couples the code tightly to Iceberg's internal class naming and has several drawbacks:
- Fragile to upstream refactoring (class or package renames)
- Lacks type safety
- Hard to maintain and extend
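
To make the fragility concrete, here is a minimal sketch that restates the first two checks as a predicate and feeds it two renamed class names. The object `RenameFragilityDemo`, the helper `looksSupported`, and both renamed class names are hypothetical and exist only for illustration; they are not taken from Auron or Iceberg.

```scala
object RenameFragilityDemo {
  // Same string checks as the scan-level conditions above, as a predicate.
  def looksSupported(scanClassName: String): Boolean =
    scanClassName.startsWith("org.apache.iceberg.spark.source.") &&
      scanClassName != "org.apache.iceberg.spark.source.SparkChangelogScan"

  // Hypothetical names, invented to simulate an upstream refactoring.
  val renamedChangelog = "org.apache.iceberg.spark.source.SparkCdcScan"
  val movedBatchScan = "org.apache.iceberg.spark.scan.SparkBatchQueryScan"

  def main(args: Array[String]): Unit = {
    // A renamed changelog scan slips past the exclusion.
    println(looksSupported(renamedChangelog)) // true
    // A batch scan moved to another package is silently rejected.
    println(looksSupported(movedBatchScan)) // false
  }
}
```

In both cases the code still compiles and nothing fails loudly, so a rename upstream would likely surface only as changed runtime behavior.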
Problem
Auron needs to:
✅ Handle Iceberg batch scans
❌ Exclude changelog scans (row-level CDC not supported)
However, the current logic:
- Uses string matching for package detection
- Explicitly hard-codes `SparkChangelogScan`

This makes the code:
- Non-robust across Iceberg versions
- Semantically unclear: the checks do not state which scans are supported or why
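
One possible direction, sketched under assumptions rather than proposed as the fix: keep the class names in a single place and return a named classification so the intent (batch scans supported, changelog scans excluded) is explicit at call sites. `IcebergScanSupport`, `ScanKind`, `classify`, and `isSupported` are hypothetical names, not existing Auron or Iceberg APIs. The sketch still matches on strings, so it improves clarity and maintainability but does not by itself remove the dependency on Iceberg's naming.

```scala
object IcebergScanSupport {
  // Explicit outcomes instead of bare `return None` branches.
  sealed trait ScanKind
  case object NotIceberg extends ScanKind
  case object BatchScan extends ScanKind     // candidate for native handling
  case object ChangelogScan extends ScanKind // row-level CDC, not supported

  // The only place that knows Iceberg's internal names.
  private val SourcePackage = "org.apache.iceberg.spark.source."
  private val ChangelogScanClass = SourcePackage + "SparkChangelogScan"

  // Mirrors the existing checks: package prefix first, then the changelog exclusion.
  def classify(scanClassName: String): ScanKind =
    if (!scanClassName.startsWith(SourcePackage)) NotIceberg
    else if (scanClassName == ChangelogScanClass) ChangelogScan
    else BatchScan

  def isSupported(scanClassName: String): Boolean =
    classify(scanClassName) == BatchScan
}
```

A caller could then write `if (!IcebergScanSupport.isSupported(scanClassName)) return None`, and an Iceberg upgrade would require touching only the two constants.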