[AURON #2273] Fallback Hudi incremental queries from native scan conversion #2274
weimingdiit wants to merge 1 commit into
assert(
  !HudiScanSupport.isSupported(
    fileFormatName,
    cowOptions + ("Hoodie.DataSource.Read.Begin.InstantTime" -> "20240101010101")))
Consider adding a case for the legacy hoodie.datasource.view.type=incremental key — queryTypeFromOptions checks both keys, so it already works, but an explicit assertion would document the behavior:
assert(!HudiScanSupport.isSupported(fileFormatName, cowOptions + ("hoodie.datasource.view.type" -> "incremental")))
Thanks for the suggestion. I added an explicit assertion for the legacy hoodie.datasource.view.type=incremental option to document that it also falls back from native scan conversion.
Pull request overview
This PR prevents Hudi incremental queries from being converted to native file scans, preserving Hudi timeline and instant filtering semantics until native support exists.
Changes:
- Adds detection for Hudi incremental query type and begin/end instant options.
- Falls back from native scan support when incremental options are present.
- Adds unit coverage for incremental fallback, including case-insensitive option keys.
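For readers who have not opened the diff, the detection described in the list above could look roughly like the sketch below. This is only an illustration under stated assumptions: the object name `IncrementalFallbackSketch`, the method `isIncremental`, and the exact rule (fall back if either query-type key equals `incremental`, or if a begin/end instant option is present) are hypothetical and inferred from this conversation, not copied from `HudiScanSupport`.

```scala
// Hypothetical sketch: names and structure are illustrative, not the PR's actual code.
object IncrementalFallbackSketch {
  // Current and legacy option keys that carry the Hudi query type.
  private val QueryTypeKeys = Seq(
    "hoodie.datasource.query.type",
    "hoodie.datasource.view.type")

  // Instant-range options that only make sense for incremental reads.
  private val InstantKeys = Seq(
    "hoodie.datasource.read.begin.instanttime",
    "hoodie.datasource.read.end.instanttime")

  // Lower-case every key so that lookups are case-insensitive.
  private def normalize(options: Map[String, String]): Map[String, String] =
    options.map { case (k, v) => k.toLowerCase(java.util.Locale.ROOT) -> v }

  // True when the options describe an incremental read (assumed rule, see lead-in).
  def isIncremental(options: Map[String, String]): Boolean = {
    val opts = normalize(options)
    val incrementalType = QueryTypeKeys.exists { key =>
      opts.get(key).exists(_.trim.equalsIgnoreCase("incremental"))
    }
    val hasInstantOption = InstantKeys.exists(opts.contains)
    incrementalType || hasInstantOption
  }
}
```

A caller in the style of `isSupported` would then return `false` whenever `isIncremental(options)` is true, leaving the read on the Spark/Hudi path.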
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| thirdparty/auron-hudi/src/main/scala/org/apache/spark/sql/auron/hudi/HudiScanSupport.scala | Adds incremental query option detection and rejects native scan support for those reads. |
| thirdparty/auron-hudi/src/test/scala/org/apache/spark/sql/auron/hudi/HudiScanSupportSuite.scala | Adds assertions covering incremental query and instant option fallback behavior. |
…n conversion Signed-off-by: weimingdiit <weimingdiit@gmail.com>
52b175d to 8893f6b
Which issue does this PR close?
Closes #2273
Rationale for this change
Hudi incremental queries depend on Hudi timeline and instant filtering semantics. The native Hudi scan currently converts supported Hudi file scans to native Parquet/ORC scans, but the native scan does not implement incremental query semantics. These queries should fall back to Spark/Hudi until incremental scan support is explicitly added.
What changes are included in this PR?
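HudiScanSupport now detects incremental reads and falls back from native scan conversion when any of the following options is present:
- hoodie.datasource.query.type=incremental
- hoodie.datasource.read.begin.instanttime
- hoodie.datasource.read.end.instanttime
Are there any user-facing changes?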
Yes. Hudi incremental queries will stay on Spark/Hudi scan instead of being converted to native scan.
How was this patch tested?
Unit tests in HudiScanSupportSuite.