[spark] Support filter pushdown for log tables#3116
Conversation
@Yohahaha @YannByron @luoyuxia PTAL 🙏
Yohahaha
left a comment
thank you! left some comments.
Force-pushed from ee3d458 to 71a79c5
@Yohahaha @YannByron Ty for the review 👍 Redesigned to use the newer API, PTAL 🙏
LGTM. thanks @fresh-borzoni
@luoyuxia Can you take a look, pls?
Pull request overview
This PR adds Spark-side filter pushdown for Fluss log (append) tables by converting Spark V2 predicates into Fluss predicates and applying them as server-side record-batch filters, while still letting Spark re-apply predicates for row-exact correctness.
Changes:
- Introduce `SparkPredicateConverter` to translate Spark V2 `Predicate` expressions into Fluss `Predicate`s.
- Wire Spark V2 filter pushdown into `FlussAppendScanBuilder`/`FlussAppendScan` and apply the pushed predicate in `FlussAppendPartitionReader` via `table.newScan().filter(...)`.
- Add unit/integration tests validating predicate conversion and verifying that pushdown shows up in Spark scan plans.
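The conversion step above can be sketched with a simplified model. The type names below are hypothetical stand-ins for illustration only, not the actual Spark V2 or Fluss predicate classes; the real converter works on `org.apache.spark.sql.connector.expressions.filter.Predicate` and Fluss's own `Predicate` type:

```scala
// Hypothetical stand-ins for Spark V2 predicates (not the real Spark API).
sealed trait SparkPred
case class EqualTo(field: String, value: Any) extends SparkPred
case class GreaterThan(field: String, value: Any) extends SparkPred
case class And(left: SparkPred, right: SparkPred) extends SparkPred
case class Unsupported(desc: String) extends SparkPred

// Hypothetical stand-ins for Fluss predicates.
sealed trait FlussPred
case class FEqual(field: String, value: Any) extends FlussPred
case class FGreater(field: String, value: Any) extends FlussPred
case class FAnd(left: FlussPred, right: FlussPred) extends FlussPred

object Converter {
  // Returns None for predicates that cannot be pushed down; since Spark
  // re-applies every predicate anyway, a None simply means no pruning help.
  def convert(p: SparkPred): Option[FlussPred] = p match {
    case EqualTo(f, v)     => Some(FEqual(f, v))
    case GreaterThan(f, v) => Some(FGreater(f, v))
    case And(l, r) =>
      // An AND is only pushable if both sides convert.
      for { cl <- convert(l); cr <- convert(r) } yield FAnd(cl, cr)
    case Unsupported(_)    => None
  }
}
```

The partial-conversion behavior on `And` mirrors the usual converter design: an unconvertible child poisons the whole conjunction rather than silently dropping a condition.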
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| fluss-spark/fluss-spark-common/src/main/scala/org/apache/fluss/spark/utils/SparkPredicateConverter.scala | New converter from Spark predicates to Fluss predicates for pushdown. |
| fluss-spark/fluss-spark-common/src/main/scala/org/apache/fluss/spark/read/FlussScanBuilder.scala | Add SupportsPushDownV2Filters mixin and pushdown plumbing for append scans. |
| fluss-spark/fluss-spark-common/src/main/scala/org/apache/fluss/spark/read/FlussScan.scala | Extend FlussAppendScan to carry pushed predicates and include them in scan description. |
| fluss-spark/fluss-spark-common/src/main/scala/org/apache/fluss/spark/read/FlussBatch.scala | Thread pushed predicate into append batch reader factory creation. |
| fluss-spark/fluss-spark-common/src/main/scala/org/apache/fluss/spark/read/FlussPartitionReaderFactory.scala | Extend append reader factory to accept an optional pushed predicate. |
| fluss-spark/fluss-spark-common/src/main/scala/org/apache/fluss/spark/read/FlussAppendPartitionReader.scala | Apply server-side batch filter via TableScan.filter(...) when reading. |
| fluss-spark/fluss-spark-common/src/main/scala/org/apache/fluss/spark/read/FlussMicroBatchStream.scala | Update append micro-batch reader factory call signature (currently still passes None). |
| fluss-spark/fluss-spark-common/src/main/scala/org/apache/fluss/spark/read/lake/FlussLakeAppendBatch.scala | Update factory construction signature for fallback path. |
| fluss-spark/fluss-spark-common/src/main/scala/org/apache/fluss/spark/read/lake/FlussLakePartitionReaderFactory.scala | Update append partition reader construction signature for log splits. |
| fluss-spark/fluss-spark-ut/src/test/scala/org/apache/fluss/spark/utils/SparkPredicateConverterTest.scala | New unit tests covering predicate conversion semantics. |
| fluss-spark/fluss-spark-ut/src/test/scala/org/apache/fluss/spark/SparkLogTableReadTest.scala | New tests asserting Spark plans show pushed predicates for log-table reads. |
luoyuxia
left a comment
@fresh-borzoni Thanks for the PR. Only one minor comment. Also, will the pushdown for the lake reader be done in a following PR?
@luoyuxia Ty for the review, addressed the comments, PTAL 🙏 Yes, pushdown for the lake reader is planned as a follow-up; this PR is primarily the converter introduction for further use cases.
closes #3117
Adds SupportsPushDownFilters to FlussAppendScanBuilder and a SparkPredicateConverter mirroring Flink's PredicateConverter.
Record-batch pushdown uses the server-side filter from #2951; Spark re-applies every filter as a safety net, making pushdown a pure optimization.
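Why re-applying every filter makes pushdown a pure optimization: the server-side filter prunes at record-batch granularity, so a surviving batch may still contain rows that do not match, and exactness is restored on the Spark side. A minimal sketch of that contract, using a hypothetical batch model rather than the Fluss API:

```scala
object LogScanModel {
  case class Row(values: Map[String, Int])

  case class Batch(rows: Seq[Row]) {
    // Coarse batch-level filter: keep the batch if ANY row might match.
    // This is the granularity at which the server-side filter operates.
    def mightMatch(pred: Row => Boolean): Boolean = rows.exists(pred)
  }

  def scan(batches: Seq[Batch], pred: Row => Boolean): Seq[Row] = {
    val pruned = batches.filter(_.mightMatch(pred)) // server side, coarse
    pruned.flatMap(_.rows).filter(pred)             // Spark side, row-exact
  }
}
```

Dropping a batch never loses a matching row, and the final row-level pass removes the false positives, so the result is identical with or without pruning.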