Skip to content

Equality Delete File Scanning via Join in DataFusion #1530

@ZENOTME

Description

@ZENOTME

Is your feature request related to a problem or challenge?

The current handling of equality deletes is done inside the file reader layer #630, where the delete keys are loaded and filtered row-by-row. However, this approach doesn’t support spilling to disk. While it’s possible to implement spill logic directly in the reader, but another direction is to use semi join in datafusion and we can utilize the spill disk ability of it.

Another potential benefit may be:

  • Unified resource control (e.g. compute thread), as we move the filter compute to a join operator, the resource of this part will be control by datafusion compute engine
  • Potential for physical plan optimization: By expressing equality deletes as part of the physical plan (e.g., via a semi join), it opens the door for cost-based optimizations, join reordering, and operator fusion.

I plan to conduct more performance experiments to evaluate whether this approach is worthwhile in practice, especially under large datasets and memory pressure and welcome any suggestion for this idea.

Describe the solution you'd like

Image

When we call scan for TableProvider(convert the logical plan to physical plan), we return a execution plan like above: a semi hash join connect:

  • equality delete scan
  • data file with position delete file scan

The semi-hash join will only return the batches from the right side that do not have matches in the left side. We partition the files based on the parallelism level set by DataFusion. The rule is to first divide the file scan tasks according to the configured parallelism, and then assign the equality delete files to each partition based on the corresponding file scan tasks. This ensures that all relevant equality delete files are included in the same partition as the data they affect.

Willingness to contribute

Yes

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions