Reduce FFI wrappers when round tripping code #18672
adff895 to 27d983c
27d983c to bc0968b
Pull Request Overview
This PR improves FFI (Foreign Function Interface) handling by introducing a library_marker_id mechanism to detect when FFI objects originate from the same library, avoiding unnecessary wrapper overhead during round trips. This optimization allows the system to return original implementations directly when FFI structures are passed back to their originating library.
Key changes:
- Added `library_marker_id` field to all FFI structs to identify the originating library
- Modified `From` implementations to return `Arc<dyn Trait>` instead of `Foreign*` wrapper types, bypassing wrappers when detecting local library origin
- Updated all test code and examples to use the new direct trait conversion pattern
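The bypass described in this overview can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `Greeter`, `LocalGreeter`, `FfiGreeter`, and `ForeignGreeter` are hypothetical stand-ins for the crate's traits and `FFI_*` / `Foreign*` types, and deriving the marker from the address of a static is an assumption about how such an ID could be made unique per compiled library. Only the names `get_library_marker_id`, `mock_foreign_marker_id`, and the `library_marker_id` field appear on this page.

```rust
use std::sync::Arc;

/// Hypothetical trait standing in for a provider or plan trait.
trait Greeter {
    fn greet(&self) -> String;
}

struct LocalGreeter;

impl Greeter for LocalGreeter {
    fn greet(&self) -> String {
        "local".to_string()
    }
}

/// Assumed approach: the address of a static differs per compiled library.
extern "C" fn get_library_marker_id() -> usize {
    static LIBRARY_MARKER: u8 = 0;
    &LIBRARY_MARKER as *const u8 as usize
}

/// Test stand-in that pretends to be a different library.
extern "C" fn mock_foreign_marker_id() -> usize {
    get_library_marker_id() + 1
}

/// Hypothetical FFI struct; a real one would hold opaque pointers and
/// function pointers rather than an `Arc`.
struct FfiGreeter {
    inner: Arc<dyn Greeter>,
    library_marker_id: extern "C" fn() -> usize,
}

impl From<Arc<dyn Greeter>> for FfiGreeter {
    fn from(inner: Arc<dyn Greeter>) -> Self {
        Self { inner, library_marker_id: get_library_marker_id }
    }
}

/// Wrapper used only when the struct came from another library.
struct ForeignGreeter(FfiGreeter);

impl Greeter for ForeignGreeter {
    fn greet(&self) -> String {
        format!("foreign({})", self.0.inner.greet())
    }
}

impl From<FfiGreeter> for Arc<dyn Greeter> {
    fn from(ffi: FfiGreeter) -> Self {
        if (ffi.library_marker_id)() == get_library_marker_id() {
            // Same library: hand back the original implementation.
            ffi.inner
        } else {
            // Different library: keep going through the wrapper.
            Arc::new(ForeignGreeter(ffi))
        }
    }
}

fn main() {
    let local: Arc<dyn Greeter> = Arc::new(LocalGreeter);

    // Round trip within the same library returns the original object.
    let back: Arc<dyn Greeter> = FfiGreeter::from(Arc::clone(&local)).into();
    assert!(Arc::ptr_eq(&local, &back));

    // A mismatched marker forces the wrapper path.
    let mut ffi = FfiGreeter::from(Arc::clone(&local));
    ffi.library_marker_id = mock_foreign_marker_id;
    let wrapped: Arc<dyn Greeter> = ffi.into();
    assert_eq!(wrapped.greet(), "foreign(local)");
}
```

Storing the marker as a function pointer rather than a plain integer mirrors the test excerpt later in this review, where `library_marker_id` is assigned `crate::mock_foreign_marker_id` to simulate a foreign origin.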
Reviewed Changes
Copilot reviewed 25 out of 26 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| datafusion/ffi/src/lib.rs | Adds get_library_marker_id() function and test mock to identify library origin |
| datafusion/ffi/README.md | Documents the library marker ID concept and its purpose |
| docs/source/library-user-guide/upgrading.md | Provides migration guide for API changes from Foreign* to Arc<dyn Trait> |
| datafusion/ffi/src/execution_plan.rs | Adds library marker support and bypass logic to execution plans |
| datafusion/ffi/src/table_provider.rs | Updates table provider with marker checks and helper methods |
| datafusion/ffi/src/catalog_provider*.rs | Implements library marker checks for catalog providers |
| datafusion/ffi/src/schema_provider.rs | Adds marker-based bypass for schema providers |
| datafusion/ffi/src/ud*.rs | Updates all user-defined function types with library marker logic |
| datafusion/ffi/src/udaf/accumulator*.rs | Implements bypass for accumulator types with null pointer safety |
| datafusion/ffi/src/udwf/partition_evaluator.rs | Adds marker checks to partition evaluators |
| datafusion/ffi/tests/*.rs | Updates test code to use new Arc<dyn Trait> conversion pattern |
| datafusion-examples/examples/ffi/ffi_module_loader/* | Updates example to use new API pattern |
    ## Library Marker ID

    When reviewing the code, many of the structs in this crate contain a call to
    a `library_maker_id`. The purpose of this call is to determine if a library is
Copilot AI, Nov 20, 2025
Corrected spelling of 'library_maker_id' to 'library_marker_id'.
Suggested change:
    - a `library_maker_id`. The purpose of this call is to determine if a library is
    + a `library_marker_id`. The purpose of this call is to determine if a library is
    assert_eq!(format!("{foreign_plan:?}"), format!("{:?}", foreign_plan));

    // Verify different library markers still can produce identical properties
    let mut ffi_plan = FFI_PlanProperties::from(&props);
    ffi_plan.library_marker_id = crate::mock_foreign_marker_id;
    let foreign_plan: PlanProperties = ffi_plan.try_into()?;
    assert_eq!(format!("{foreign_plan:?}"), format!("{:?}", foreign_plan));
Copilot AI, Nov 20, 2025
The assertion compares foreign_plan to itself, which will always pass. This should likely compare against props to verify that the properties are preserved.
Suggested change:
    - assert_eq!(format!("{foreign_plan:?}"), format!("{:?}", foreign_plan));
    - // Verify different library markers still can produce identical properties
    - let mut ffi_plan = FFI_PlanProperties::from(&props);
    - ffi_plan.library_marker_id = crate::mock_foreign_marker_id;
    - let foreign_plan: PlanProperties = ffi_plan.try_into()?;
    - assert_eq!(format!("{foreign_plan:?}"), format!("{:?}", foreign_plan));
    + assert_eq!(format!("{foreign_plan:?}"), format!("{props:?}"));
    + // Verify different library markers still can produce identical properties
    + let mut ffi_plan = FFI_PlanProperties::from(&props);
    + ffi_plan.library_marker_id = crate::mock_foreign_marker_id;
    + let foreign_plan: PlanProperties = ffi_plan.try_into()?;
    + assert_eq!(format!("{foreign_plan:?}"), format!("{props:?}"));
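To illustrate why the self-comparison flagged in this comment cannot catch a regression, here is a small standalone sketch. `Props` and `lossy_round_trip` are hypothetical and deliberately simplified; they are not the `PlanProperties` or `FFI_PlanProperties` types from the crate.

```rust
/// Hypothetical stand-in for a properties struct.
#[derive(Debug, Clone, PartialEq)]
struct Props {
    partitions: usize,
}

/// Hypothetical round trip that (incorrectly) loses information.
fn lossy_round_trip(_props: &Props) -> Props {
    Props { partitions: 0 }
}

fn main() {
    let props = Props { partitions: 4 };
    let foreign = lossy_round_trip(&props);

    // Comparing a value with itself passes even though data was lost.
    assert_eq!(format!("{foreign:?}"), format!("{:?}", foreign));

    // Comparing against the original input exposes the difference.
    assert_ne!(format!("{foreign:?}"), format!("{props:?}"));
}
```

With the suggested change, the second form of comparison is what the test performs, so a conversion that drops or alters properties would fail the assertion.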
Which issue does this PR close?
Note: This is based on top of #18657. I will rebase after that one merges to remove the parts of the code that are not relevant to this PR.
Rationale for this change
This PR adds the concept of a `library_marker_id` to the FFI crate. The reason for this is described in the `README` text as part of the PR. In our current use of the FFI library we get into issues where we round trip FFI structs back to their original library that now have FFI wrappers when they are no longer needed.
What changes are included in this PR?
`From<> for Foreign` to `From<> for Arc<dyn ...>`. The actual use case is essentially always to use these structs as their implementation of the trait they back.
`FFI_ScalarUDF`
Are these changes tested?
`datafusion-python`. `main`:
Are there any user-facing changes?
This does change the API for the FFI functions.