toarrow and fromarrow #68
Comments
Most of the expected users of Awkward ↔ Arrow conversion use Python, not C++. Writing a converter in C++ is not a performance consideration but an accessibility one: it allows pure C++ programs to exchange structured data with other programs in the Arrow ecosystem. However, writing it would mean spinning off a separate package, since Awkward can't take on Arrow as a dependency; the Awkward-Arrow package would have to depend on both Awkward and Arrow, which is too much to deal with right now. For the time being, Awkward ↔ Arrow conversion will be a Python function. (That Python function can continue to exist as a fallback if the Awkward-Arrow package isn't accessible.)
On second thought, there's this: https://github.com/apache/arrow/blob/master/docs/source/format/CDataInterface.rst We may have a minimal-dependency way to consume and produce Arrow buffers after all. (Need to check on the status of that from Arrow.)
It looks promising.
My reading of this (Arrow JIRA ticket and pull request) is that this human-readable specification is the entirety of the C interface. There's no code other than what we see on the instructions page. We're supposed to copy its struct definitions into our project, populate them according to the rules on the page, and that's it: the in-memory buffer we've just made is an Arrow buffer. It would be nice to see an example of wrapping that buffer in
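As a rough illustration of "copy the struct definitions and populate them," here is a minimal sketch that exports a flat, non-nullable int32 array. The two struct definitions follow the C data interface specification; everything else (`Int32Holder`, `export_int32`, and the release callbacks) is a hypothetical name made up for this example, not part of Arrow or Awkward.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Struct definitions copied from the Arrow C data interface specification.
struct ArrowSchema {
  const char* format;
  const char* name;
  const char* metadata;
  int64_t flags;
  int64_t n_children;
  struct ArrowSchema** children;
  struct ArrowSchema* dictionary;
  void (*release)(struct ArrowSchema*);
  void* private_data;
};

struct ArrowArray {
  int64_t length;
  int64_t null_count;
  int64_t offset;
  int64_t n_buffers;
  int64_t n_children;
  const void** buffers;
  struct ArrowArray** children;
  struct ArrowArray* dictionary;
  void (*release)(struct ArrowArray*);
  void* private_data;
};

// Hypothetical holder: keeps the data alive until the consumer calls release.
struct Int32Holder {
  std::vector<int32_t> values;
  const void* buffers[2];
};

static void release_schema(ArrowSchema* schema) {
  // Only static strings were used, so there is nothing to free.
  schema->release = nullptr;
}

static void release_array(ArrowArray* array) {
  delete static_cast<Int32Holder*>(array->private_data);
  array->release = nullptr;  // marks the struct as released
}

// Hypothetical exporter for a flat int32 array without nulls.
void export_int32(std::vector<int32_t> values, ArrowSchema* schema, ArrowArray* array) {
  Int32Holder* holder = new Int32Holder();
  holder->values = std::move(values);
  holder->buffers[0] = nullptr;                // validity bitmap omitted: no nulls
  holder->buffers[1] = holder->values.data();  // data buffer

  schema->format = "i";  // format string for int32
  schema->name = "";
  schema->metadata = nullptr;
  schema->flags = 0;     // not nullable
  schema->n_children = 0;
  schema->children = nullptr;
  schema->dictionary = nullptr;
  schema->release = release_schema;
  schema->private_data = nullptr;

  array->length = static_cast<int64_t>(holder->values.size());
  array->null_count = 0;
  array->offset = 0;
  array->n_buffers = 2;  // primitive layout: validity + data
  array->n_children = 0;
  array->buffers = holder->buffers;
  array->children = nullptr;
  array->dictionary = nullptr;
  array->release = release_array;
  array->private_data = holder;
  assert(array->buffers[1] != nullptr || array->length == 0);
}
```

A consumer would read `array.buffers[1]` as `length` int32 values and call `array.release(&array)` and `schema.release(&schema)` when done. Nested types (lists, structs) would additionally need `children` populated, which this sketch leaves out.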
I can't make it an "assignment," but @trickarcher is actively working on this.
Similar to the Awkward ↔ Arrow conversions in Awkward0, except in C++ rather than Python.
It's a recursive if-elseif-elseif-...-else chain down the list of array node types, replacing each from one library with its equivalent in the other. Conversions from Arrow → Awkward can be zero-copy, now that BitMaskedArray exists, and conversions from Awkward → Arrow would involve one copy (to move the disparate buffers into Arrow's single buffer format).
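The shape of that recursive chain could look something like the sketch below. All of the node types here (`SrcNumeric`, `SrcList`, `DstPrimitive`, `DstList`) are hypothetical stand-ins with only two cases, not the real Awkward or Arrow classes, and the sketch copies buffers for simplicity rather than attempting the zero-copy path.

```cpp
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-ins for one library's array node types.
struct SrcNode {
  virtual ~SrcNode() = default;
};
struct SrcNumeric : SrcNode {
  std::vector<double> data;
};
struct SrcList : SrcNode {
  std::vector<int64_t> offsets;
  std::shared_ptr<SrcNode> content;
};

// Hypothetical stand-ins for the other library's equivalents.
struct DstNode {
  virtual ~DstNode() = default;
  virtual std::string type() const = 0;
};
struct DstPrimitive : DstNode {
  std::vector<double> values;
  std::string type() const override { return "float64"; }
};
struct DstList : DstNode {
  std::vector<int64_t> offsets;
  std::shared_ptr<DstNode> values;
  std::string type() const override { return "list<" + values->type() + ">"; }
};

// Recursive if / else-if chain: test the node type, build its equivalent,
// and recurse into any nested content.
std::shared_ptr<DstNode> convert(const std::shared_ptr<SrcNode>& node) {
  if (auto numeric = std::dynamic_pointer_cast<SrcNumeric>(node)) {
    auto out = std::make_shared<DstPrimitive>();
    out->values = numeric->data;  // copied here for simplicity
    return out;
  } else if (auto list = std::dynamic_pointer_cast<SrcList>(node)) {
    auto out = std::make_shared<DstList>();
    out->offsets = list->offsets;
    out->values = convert(list->content);  // recurse into the list's content
    return out;
  } else {
    throw std::invalid_argument("unrecognized array node type");
  }
}
```

A real converter would have one branch per node type in each library and would share buffers instead of copying wherever the two memory layouts agree.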