chore: polish ArrowDeviceArray #8023

Open
0ax1 wants to merge 6 commits into develop from ad/cudf
@0ax1 0ax1 commented May 19, 2026

Changes

  • Generate Arrow C Device ABI bindings from a vendored Arrow reference header
  • Fixed VarBin/Utf8 Arrow export layout by reporting the correct n_buffers = 3
  • Fixed Bool export to ensure buffers are resident on the CUDA device
  • Fixed struct child ArrowArray lifetime/release handling
  • Replaced unsupported-export panics with recoverable errors
  • Added schema normalization so exported schemas match the physical Arrow Device layout (Utf8View/BinaryView are exported as Utf8/Binary)
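
As an illustration of the "recoverable errors instead of panics" item above, here is a minimal sketch, not the PR's actual code. The `Encoding` and `ExportError` types and the `n_buffers_for` helper are hypothetical names invented for this example; only the buffer counts (2 for primitive layouts, 3 for Utf8/Binary) come from the Arrow layout rules.

```rust
use std::fmt;

// Hypothetical encodings an exporter might be asked to handle.
#[derive(Debug, PartialEq)]
enum Encoding {
    Primitive,
    Utf8,
    Dictionary, // assumed unsupported in this sketch
}

// Hypothetical error type returned instead of panicking.
#[derive(Debug, PartialEq)]
enum ExportError {
    Unsupported(String),
}

impl fmt::Display for ExportError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ExportError::Unsupported(what) => write!(f, "unsupported export: {what}"),
        }
    }
}

impl std::error::Error for ExportError {}

// Returns the Arrow buffer count, or an error the caller can handle.
fn n_buffers_for(encoding: &Encoding) -> Result<i64, ExportError> {
    match encoding {
        // Validity bitmap + values.
        Encoding::Primitive => Ok(2),
        // Validity bitmap + offsets + data bytes.
        Encoding::Utf8 => Ok(3),
        // Previously this kind of branch might have been `unimplemented!()`.
        other => Err(ExportError::Unsupported(format!("{other:?}"))),
    }
}
```

Returning a `Result` lets a caller fall back to a host-side path or surface the message, rather than aborting the process on an unsupported array.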

Notes

This PR is scoped to Arrow Device array export. The intent is to build cuDF support in a follow-up on top of this.

0ax1 added 5 commits May 19, 2026 15:14
Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
@0ax1 0ax1 added the changelog/chore A trivial change label May 19, 2026
@codspeed-hq
codspeed-hq Bot commented May 19, 2026

Merging this PR will not alter performance

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚡ 1 improved benchmark
❌ 1 regressed benchmark
✅ 1235 untouched benchmarks

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| Simulation | `chunked_varbinview_opt_canonical_into[(1000, 10)]` | 187.5 µs | 224.8 µs | -16.62% |
| Simulation | `chunked_varbinview_canonical_into[(100, 100)]` | 308 µs | 273.3 µs | +12.7% |

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.


Comparing ad/cudf (1060cc7) with develop (ba5064a)

@0ax1 0ax1 changed the title from "chore: polish ArrowDeviceArray impl" to "chore: polish ArrowDeviceArray" May 19, 2026
Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
@0ax1 0ax1 requested a review from robert3005 May 19, 2026 15:59
@0ax1 0ax1 marked this pull request as ready for review May 19, 2026 16:00
@0ax1 0ax1 requested review from a10y and onursatici May 19, 2026 16:02
/// In particular, Vortex string/binary arrays may convert to Arrow view types, but the current CUDA
/// exporter materializes them as standard variable-size Arrow `Utf8`/`Binary` arrays with null,
/// offsets, and data buffers. Struct children are normalized recursively.
fn normalize_device_data_type(data_type: &DataType) -> DataType {

Is this still the case? @0ax1 mentioned that cudf added the adapter layer for this in their code
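
For readers following this thread, the normalization the doc comment describes can be sketched as below. This is a simplified illustration, not the PR's implementation: the `DataType` enum is a toy stand-in for Arrow's type enum with only a few variants, and struct fields are modeled as `(name, type)` pairs, which is an assumption made to keep the example self-contained.

```rust
// Toy stand-in for Arrow's `DataType`; only the variants needed here.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Boolean,
    Int64,
    Utf8,
    Binary,
    Utf8View,
    BinaryView,
    // Struct fields as (name, type) pairs; an assumption for this sketch.
    Struct(Vec<(String, DataType)>),
}

/// Map a logical type to the physical layout the exporter is described
/// as emitting: view types become offset-based Utf8/Binary, and struct
/// children are normalized recursively.
fn normalize_device_data_type(data_type: &DataType) -> DataType {
    match data_type {
        DataType::Utf8View => DataType::Utf8,
        DataType::BinaryView => DataType::Binary,
        DataType::Struct(fields) => DataType::Struct(
            fields
                .iter()
                .map(|(name, dt)| (name.clone(), normalize_device_data_type(dt)))
                .collect(),
        ),
        // Everything else already matches its exported layout.
        other => other.clone(),
    }
}
```

If the exporter later emits view types directly, as the question above suggests may become possible, the first two match arms would be the ones to drop.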

Comment on lines -166 to 169
- // 1 (optional) buffer for nulls, one buffer for the data
- n_buffers: 2,
+ // Arrow Utf8/Binary layout: optional null bitmap, offsets, and data bytes.
+ n_buffers: 3,
  buffers: private_data.buffer_ptrs.as_mut_ptr(),

it's worth noting that i think cudf now supports varbinview


looks like it was integrated into the cudf 26.04 release

rapidsai/cudf@4db37f0
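
To make the `n_buffers: 3` change discussed in this thread concrete, the following host-side sketch builds the three buffers of Arrow's variable-size Utf8 layout: a validity bitmap, `i32` offsets, and the concatenated data bytes. The `Utf8Buffers` struct and `build_utf8_buffers` function are hypothetical names for illustration; the real exporter keeps these buffers on the CUDA device and passes raw pointers.

```rust
/// The three buffers of an Arrow Utf8 array (host-side illustration).
struct Utf8Buffers {
    validity: Vec<u8>, // 1 bit per element, least-significant bit first
    offsets: Vec<i32>, // len + 1 entries; element i spans offsets[i]..offsets[i + 1]
    data: Vec<u8>,     // concatenated UTF-8 bytes of all non-null values
}

fn build_utf8_buffers(values: &[Option<&str>]) -> Utf8Buffers {
    let mut validity = vec![0u8; (values.len() + 7) / 8];
    let mut offsets = Vec::with_capacity(values.len() + 1);
    let mut data = Vec::new();

    offsets.push(0i32);
    for (i, value) in values.iter().enumerate() {
        if let Some(s) = value {
            // Mark the element as valid and append its bytes.
            validity[i / 8] |= 1 << (i % 8);
            data.extend_from_slice(s.as_bytes());
        }
        // Nulls contribute an empty range: the offset simply repeats.
        offsets.push(data.len() as i32);
    }

    Utf8Buffers { validity, offsets, data }
}
```

Reporting only two buffers for this layout would leave a consumer without either the offsets or the data pointer, which is why the count matters even when the null bitmap is absent.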

3 participants