@timsaucer
Member

@timsaucer timsaucer commented Nov 13, 2025

Which issue does this PR close?

Note: This is based on top of #18657. I will rebase after that one merges to remove the parts of the code that are not relevant to this PR.

Rationale for this change

This PR adds the concept of a library_marker_id to the FFI crate. The rationale is described in the README text included in this PR. In our current use of the FFI library, we run into cases where FFI structs make a round trip back to the library that created them and end up wrapped in FFI wrappers that are no longer needed.

What changes are included in this PR?

  • Adds a method that returns the memory address of a static constant in the library.
  • Adds a check in the methods that create Foreign FFI structs to determine whether the struct originated in the local library or is actually foreign.
  • Replaces the From<> implementations for the Foreign* types with From<> for Arc<dyn ...>. In practice, these structs are essentially always used through the trait they implement.
  • Adds unit tests and a method to mock being in foreign code.
  • Removes an unused function call in FFI_ScalarUDF.
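
The first two items above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the names `get_library_marker_id` and `mock_foreign_marker_id` appear elsewhere in this PR, but their signatures and bodies here are assumptions, and `is_local` is a hypothetical helper introduced only for this example.

```rust
/// Returns the address of a static that lives inside this compiled library.
/// A separately compiled copy of the same code would hold its own static,
/// so the address would differ across a library boundary.
/// (Assumed signature; the real function may differ.)
pub extern "C" fn get_library_marker_id() -> usize {
    static LIBRARY_MARKER: u8 = 0;
    &LIBRARY_MARKER as *const u8 as usize
}

/// Stand-in for a marker produced by some other library. Any value that
/// differs from the local marker works for testing. (Assumed body.)
pub extern "C" fn mock_foreign_marker_id() -> usize {
    get_library_marker_id().wrapping_add(1)
}

/// Hypothetical helper: an FFI struct carrying `marker` was created locally
/// if calling its marker function yields this library's own address.
pub fn is_local(marker: extern "C" fn() -> usize) -> bool {
    marker() == get_library_marker_id()
}
```

An FFI struct would store the function pointer at construction time, and the receiving side would call it and compare the result against its own marker before deciding whether to wrap.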

Are these changes tested?

  • Code is tested against existing unit tests and with datafusion-python.
  • Added unit tests
  • Coverage report compared to main (image: coverage-report)

Are there any user-facing changes?

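Yes. This changes the API of the FFI crate: conversions from the FFI structs now produce Arc<dyn ...> trait objects instead of the Foreign* wrapper types.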

@timsaucer timsaucer added the api change and ffi labels Nov 13, 2025
@github-actions github-actions bot added the documentation label Nov 13, 2025
@timsaucer timsaucer force-pushed the feat/ffi-add-library-marker branch from adff895 to 27d983c Compare November 13, 2025 16:28
@timsaucer timsaucer force-pushed the feat/ffi-add-library-marker branch from 27d983c to bc0968b Compare November 20, 2025 18:04
@timsaucer timsaucer requested a review from Copilot November 20, 2025 21:13
@timsaucer timsaucer marked this pull request as ready for review November 20, 2025 21:13
Contributor

Copilot AI left a comment

Pull Request Overview

This PR improves FFI (Foreign Function Interface) handling by introducing a library_marker_id mechanism to detect when FFI objects originate from the same library, avoiding unnecessary wrapper overhead during round trips. This optimization allows the system to return original implementations directly when FFI structures are passed back to their originating library.

Key changes:

  • Added library_marker_id field to all FFI structs to identify the originating library
  • Modified From implementations to return Arc<dyn Trait> instead of Foreign* wrapper types, bypassing wrappers when detecting local library origin
  • Updated all test code and examples to use the new direct trait conversion pattern
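
A rough sketch of the conversion pattern summarized above, under stated assumptions: `Provider`, `LocalProvider`, `FfiProvider`, and `ForeignProvider` are hypothetical stand-ins for the crate's real traits and structs, and the field layout is invented for illustration. Only the idea (compare markers, then either unwrap the original `Arc` or fall back to a wrapper) comes from the PR.

```rust
use std::ffi::c_void;
use std::sync::Arc;

/// Address of a static owned by this compiled library (signature assumed).
pub extern "C" fn get_library_marker_id() -> usize {
    static LIBRARY_MARKER: u8 = 0;
    &LIBRARY_MARKER as *const u8 as usize
}

/// Test-only stand-in for a marker from another library (body assumed).
pub extern "C" fn mock_foreign_marker_id() -> usize {
    get_library_marker_id().wrapping_add(1)
}

/// Hypothetical trait standing in for the real provider and UDF traits.
pub trait Provider: Send + Sync {
    fn id(&self) -> u64;
    /// True only for the FFI wrapper; lets the example observe the bypass.
    fn is_wrapper(&self) -> bool {
        false
    }
}

pub struct LocalProvider {
    pub id: u64,
}

impl Provider for LocalProvider {
    fn id(&self) -> u64 {
        self.id
    }
}

/// Hypothetical FFI struct; `private_data` owns a boxed `Arc<dyn Provider>`.
#[repr(C)]
pub struct FfiProvider {
    pub id: unsafe extern "C" fn(provider: &FfiProvider) -> u64,
    pub release: unsafe extern "C" fn(provider: &mut FfiProvider),
    pub private_data: *mut c_void,
    pub library_marker_id: extern "C" fn() -> usize,
}

unsafe extern "C" fn id_fn(provider: &FfiProvider) -> u64 {
    let inner = unsafe { &*(provider.private_data as *const Arc<dyn Provider>) };
    inner.id()
}

unsafe extern "C" fn release_fn(provider: &mut FfiProvider) {
    if !provider.private_data.is_null() {
        drop(unsafe { Box::from_raw(provider.private_data as *mut Arc<dyn Provider>) });
        provider.private_data = std::ptr::null_mut();
    }
}

impl Drop for FfiProvider {
    fn drop(&mut self) {
        unsafe { (self.release)(self) }
    }
}

impl From<Arc<dyn Provider>> for FfiProvider {
    fn from(provider: Arc<dyn Provider>) -> Self {
        FfiProvider {
            id: id_fn,
            release: release_fn,
            private_data: Box::into_raw(Box::new(provider)) as *mut c_void,
            library_marker_id: get_library_marker_id,
        }
    }
}

/// Wrapper used only when the struct really came from another library.
struct ForeignProvider(FfiProvider);
unsafe impl Send for ForeignProvider {}
unsafe impl Sync for ForeignProvider {}

impl Provider for ForeignProvider {
    fn id(&self) -> u64 {
        unsafe { (self.0.id)(&self.0) }
    }
    fn is_wrapper(&self) -> bool {
        true
    }
}

impl From<FfiProvider> for Arc<dyn Provider> {
    fn from(ffi: FfiProvider) -> Self {
        if (ffi.library_marker_id)() == get_library_marker_id() {
            // Same library: private_data is our own boxed Arc, so clone it.
            // `ffi` is dropped on return, which releases the box.
            let inner = unsafe { &*(ffi.private_data as *const Arc<dyn Provider>) };
            Arc::clone(inner)
        } else {
            // Different library: all calls must go through the function pointers.
            Arc::new(ForeignProvider(ffi)) as Arc<dyn Provider>
        }
    }
}
```

In this sketch a local round trip returns the very same allocation that was passed in, while overriding the marker with the mock forces the wrapper path even though the data is local, which is how the foreign branch can be exercised in a unit test.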

Reviewed Changes

Copilot reviewed 25 out of 26 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| datafusion/ffi/src/lib.rs | Adds get_library_marker_id() function and test mock to identify library origin |
| datafusion/ffi/README.md | Documents the library marker ID concept and its purpose |
| docs/source/library-user-guide/upgrading.md | Provides migration guide for API changes from Foreign* to Arc<dyn Trait> |
| datafusion/ffi/src/execution_plan.rs | Adds library marker support and bypass logic to execution plans |
| datafusion/ffi/src/table_provider.rs | Updates table provider with marker checks and helper methods |
| datafusion/ffi/src/catalog_provider*.rs | Implements library marker checks for catalog providers |
| datafusion/ffi/src/schema_provider.rs | Adds marker-based bypass for schema providers |
| datafusion/ffi/src/ud*.rs | Updates all user-defined function types with library marker logic |
| datafusion/ffi/src/udaf/accumulator*.rs | Implements bypass for accumulator types with null pointer safety |
| datafusion/ffi/src/udwf/partition_evaluator.rs | Adds marker checks to partition evaluators |
| datafusion/ffi/tests/*.rs | Updates test code to use new Arc<dyn Trait> conversion pattern |
| datafusion-examples/examples/ffi/ffi_module_loader/* | Updates example to use new API pattern |


## Library Marker ID

When reviewing the code, many of the structs in this crate contain a call to
a `library_maker_id`. The purpose of this call is to determine if a library is
Copilot AI Nov 20, 2025

Corrected spelling of 'library_maker_id' to 'library_marker_id'.

Suggested change
- a `library_maker_id`. The purpose of this call is to determine if a library is
+ a `library_marker_id`. The purpose of this call is to determine if a library is

Comment on lines +350 to +356
assert_eq!(format!("{foreign_plan:?}"), format!("{:?}", foreign_plan));

// Verify different library markers still can produce identical properties
let mut ffi_plan = FFI_PlanProperties::from(&props);
ffi_plan.library_marker_id = crate::mock_foreign_marker_id;
let foreign_plan: PlanProperties = ffi_plan.try_into()?;
assert_eq!(format!("{foreign_plan:?}"), format!("{:?}", foreign_plan));
Copilot AI Nov 20, 2025

The assertion compares foreign_plan to itself, which will always pass. This should likely compare against props to verify that the properties are preserved.

Suggested change
- assert_eq!(format!("{foreign_plan:?}"), format!("{:?}", foreign_plan));
- // Verify different library markers still can produce identical properties
- let mut ffi_plan = FFI_PlanProperties::from(&props);
- ffi_plan.library_marker_id = crate::mock_foreign_marker_id;
- let foreign_plan: PlanProperties = ffi_plan.try_into()?;
- assert_eq!(format!("{foreign_plan:?}"), format!("{:?}", foreign_plan));
+ assert_eq!(format!("{foreign_plan:?}"), format!("{props:?}"));
+ // Verify different library markers still can produce identical properties
+ let mut ffi_plan = FFI_PlanProperties::from(&props);
+ ffi_plan.library_marker_id = crate::mock_foreign_marker_id;
+ let foreign_plan: PlanProperties = ffi_plan.try_into()?;
+ assert_eq!(format!("{foreign_plan:?}"), format!("{props:?}"));
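
To illustrate why the self-comparison is vacuous, here is a small standalone sketch. The `Props` struct and both round-trip functions are hypothetical and unrelated to the real `PlanProperties`; the point is only that comparing a value's debug output to itself cannot detect a lossy conversion, while comparing against the original can.

```rust
/// Hypothetical stand-in for a properties struct with a Debug representation.
#[derive(Debug, Clone, PartialEq)]
pub struct Props {
    pub partitions: usize,
    pub bounded: bool,
}

/// Hypothetical conversion with a deliberate bug: it loses `partitions`.
pub fn buggy_round_trip(p: &Props) -> Props {
    Props { partitions: 1, bounded: p.bounded }
}

/// Hypothetical conversion that preserves every field.
pub fn faithful_round_trip(p: &Props) -> Props {
    p.clone()
}

/// The shape of the original assertion: both sides format the same value,
/// so the result is true regardless of what the conversion did.
pub fn self_comparison_passes(result: &Props) -> bool {
    format!("{result:?}") == format!("{:?}", result)
}

/// The shape of the suggested assertion: the converted value is compared
/// against the value it was derived from.
pub fn matches_original(result: &Props, original: &Props) -> bool {
    format!("{result:?}") == format!("{original:?}")
}
```

With the buggy conversion, the first check still reports success and only the second one fails, which is the behavior a round-trip test needs.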
