Skip to content

fix: preserve None projection semantics across FFI boundary in ForeignTableProvider::scan#20393

Open
Kontinuation wants to merge 1 commit intoapache:mainfrom
Kontinuation:fix/ffi-scan-none-projection
Open

fix: preserve None projection semantics across FFI boundary in ForeignTableProvider::scan#20393
Kontinuation wants to merge 1 commit intoapache:mainfrom
Kontinuation:fix/ffi-scan-none-projection

Conversation

@Kontinuation
Copy link
Member

@Kontinuation Kontinuation commented Feb 17, 2026

Which issue does this PR close?

N/A (newly discovered bug)

This is originally found in apache/sedona-db when working on a custom plan node: apache/sedona-db#611 (comment)

Rationale for this change

ForeignTableProvider::scan() converts a None projection (meaning "return all columns") into an empty RVec<usize> before passing it across the FFI boundary. On the receiving side, scan_fn_wrapper always wraps the received RVec in Some(...), passing Some(&vec![]) to the inner TableProvider::scan(). This means "project zero columns" — the exact opposite of the intended "project all columns."

The root cause is that the FFI_TableProvider::scan function signature uses RVec<usize> for the projections parameter. Since RVec<usize> cannot represent None, the None vs. empty-vec distinction is lost at the FFI boundary.

What changes are included in this PR?

Three coordinated changes in datafusion/ffi/src/table_provider.rs:

  1. FFI struct definition: Changed scan function pointer signature from RVec<usize> to ROption<RVec<usize>> for the projections parameter, matching how limit already uses ROption<usize> for the same None-vs-value distinction.

  2. Receiver side (scan_fn_wrapper): Converts ROption<RVec<usize>> via .into_option().map(...) and passes projections.as_ref() to the inner provider, preserving None semantics.

  3. Sender side (ForeignTableProvider::scan): Converts Option<&Vec<usize>> to ROption<RVec<usize>> via .into() instead of using unwrap_or_default().

Plus a new unit test test_scan_with_none_projection_returns_all_columns that directly exercises the FFI round-trip with projection=None and verifies all 3 columns are returned.

Also fixed the existing test_aggregation test to set library_marker_id = mock_foreign_marker_id so it actually exercises the FFI path instead of taking the local bypass.

How are these changes tested?

  • New test test_scan_with_none_projection_returns_all_columns: creates a 3-column MemTable, wraps it through FFI with the foreign marker set, calls scan(None), and asserts 3 columns are returned (previously returned 0).

Are these changes safe?

This is a breaking FFI ABI change to the FFI_TableProvider::scan function pointer signature. However:

  • The abi_stable crate's #[derive(StableAbi)] generates layout checks at dylib load time, so mismatched dylibs will be caught at load rather than causing silent corruption.
  • All existing providers construct FFI_TableProvider via ::new() or ::new_with_ffi_codec(), which internally wire up scan_fn_wrapper — nobody constructs the scan function pointer manually.

@github-actions github-actions bot added the ffi Changes to the ffi crate label Feb 17, 2026
…nTableProvider::scan

ForeignTableProvider::scan() converted projection: None (meaning 'all
columns') into an empty RVec<usize> via unwrap_or_default() before
passing it across the FFI boundary. On the receiving side,
scan_fn_wrapper always wrapped the received RVec in Some(...), passing
Some(&vec![]) to the inner TableProvider::scan(). This means 'project
zero columns' -- the opposite of the intended 'all columns'.

Fix by changing the FFI function signature's projections parameter from
RVec<usize> to ROption<RVec<usize>>, matching how limit already uses
ROption<usize> for the same None-vs-value distinction. Update both the
sender (ForeignTableProvider::scan) and receiver (scan_fn_wrapper) to
properly convert between Option and ROption.

Add test that verifies scan(projection=None) returns all columns.
@paleolimbot
Copy link
Member

cc @timsaucer !

Copy link
Member

@timsaucer timsaucer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for a very clear and concise solution!

One alternative we could do that would not be breaking for the FFI boundary would be on the Foreign side to check for Some([]) and return an empty batch executor and then on the local side (wrapped fn) change an empty projections into None.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ffi Changes to the ffi crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants