Skip to content

Conversation

@Kontinuation
Copy link
Member

@Kontinuation Kontinuation commented Nov 14, 2025

This optimization simply hoists per-column arrays in RasterStructArray, so that we don't need to extract individual arrays from the struct array each time we construct a RasterRefImpl value in RasterStructArray::get. We also marked short getters as inline so that the compiler could optimize away the construction of intermediate objects in simple metadata getter functions such as RS_Width.

I've benchmarked this using the rs_width bench in PR #297, the performance improvement is quite significant.

Baseline:

native-rs_width-Array(Raster(64, 64))
                        time:   [7.3810 ms 7.3912 ms 7.4034 ms]
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) high mild
  3 (3.00%) high severe

This PR:

native-rs_width-Array(Raster(64, 64))
                        time:   [474.81 µs 475.81 µs 476.75 µs]
                        change: [-93.593% -93.576% -93.560%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high severe

@Kontinuation Kontinuation requested a review from Copilot November 14, 2025 11:11
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR optimizes the RasterStructArray::get method by hoisting per-column array extractions from the struct array to avoid repeated work on each access. Instead of extracting arrays every time a RasterRefImpl is constructed, the arrays are now extracted once during RasterStructArray::new and stored as fields. The PR also adds #[inline(always)] attributes to short getter methods to enable better compiler optimizations. Benchmarks show a ~93.6% performance improvement (7.4ms → 475µs for rs_width operations).

Key Changes:

  • Refactored RasterStructArray to cache extracted arrays as fields instead of extracting them on every get() call
  • Added #[inline(always)] attributes to getter methods in MetadataRef, RasterRef implementations, and RasterStructArray
  • Modified RasterRefImpl::new to accept RasterStructArray reference instead of StructArray to leverage cached arrays

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@Kontinuation Kontinuation requested a review from jesspav November 14, 2025 11:13
@Kontinuation Kontinuation marked this pull request as ready for review November 14, 2025 11:13
Copy link
Member

@paleolimbot paleolimbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

We marked short getters as inline

For my education, is this always necessary? I had assumed that like C or C++ that if the compiler has access to the sources it will do that inlining automatically, but I truly have no idea.

@Kontinuation
Copy link
Member Author

For calling non-generic functions in the same crate, adding #[inline] or not does not matter too much, since LLVM inlines code aggressively. For calling non-generic functions in other crates and LTO is not enabled, adding #[inline] will enable cross-crate inline for annotated functions; adding #[inline] or not will result in a measurable performance difference in this case. Here is the benchmark result I got before adding #[inline]:

native-rs_width-Array(Raster(64, 64))
                        time:   [1.2776 ms 1.2867 ms 1.3022 ms]
                        change: [-82.709% -82.587% -82.409%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 14 outliers among 100 measurements (14.00%)
  6 (6.00%) high mild
  8 (8.00%) high severe

@Kontinuation Kontinuation merged commit 563e81d into apache:main Nov 14, 2025
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants