
perf: improve StringArray GetString decoding #334

Merged: CurtHagenlocher merged 1 commit into apache:main from InCerryGit:perf/string-array-getstring on Apr 27, 2026


@InCerryGit
Contributor

Summary

StringArray.GetString previously routed through GetBytes, which repeated bounds/null/offset work before decoding the returned byte span. This PR decodes directly from the array's offsets and value buffer while preserving the materialized-string fast path.
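For readers unfamiliar with the layout being decoded, here is a minimal, language-neutral sketch in Python of reading one string straight from the offsets and value buffers of an Arrow variable-length string array. It is not the code in this PR: the function names, the `array_offset` parameter (standing in for a slice's starting position), and the sample data are all hypothetical and exist only to illustrate the idea of a single bounds check, a single null check, and a single offset lookup per read.

```python
from typing import Optional, Sequence


def is_valid(validity: Optional[bytes], index: int) -> bool:
    """Arrow validity bitmaps are LSB-ordered: bit i lives in byte i // 8."""
    if validity is None:  # no bitmap means the array has no nulls
        return True
    return (validity[index >> 3] >> (index & 7)) & 1 == 1


def get_string(
    offsets: Sequence[int],
    values: bytes,
    validity: Optional[bytes],
    index: int,
    array_offset: int = 0,  # hypothetical: start of a sliced view
) -> Optional[str]:
    """Decode element `index` directly from the offsets and values buffers."""
    physical = index + array_offset
    # One bounds check: offsets holds (element count + 1) entries.
    if index < 0 or physical >= len(offsets) - 1:
        raise IndexError(index)
    # One null check against the validity bitmap.
    if not is_valid(validity, physical):
        return None
    # One offset lookup: element i spans values[offsets[i]:offsets[i + 1]].
    start, end = offsets[physical], offsets[physical + 1]
    return values[start:end].decode("utf-8")


# Sample data (made up for illustration): ["arrow", None, "", "héllo"]
values = "arrowhéllo".encode("utf-8")
offsets = [0, 5, 5, 5, 11]        # "héllo" is 6 bytes because é takes 2
validity = bytes([0b00001101])    # bit 1 is cleared, so element 1 is null
```

With this data, `get_string(offsets, values, validity, 3)` and `get_string(offsets, values, validity, 1, array_offset=2)` address the same physical element, which is the situation the GetStringFromSlice benchmark exercises.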

Benchmark

BenchmarkDotNet, StringArrayGetStringBenchmark, Count=1024:

| Method | Before | After |
| --- | --- | --- |
| GetString | 23.23 us / 48.08 KB | 18.04 us / 48.08 KB |
| GetStringFromSlice | 23.90 us / 48.00 KB | 18.05 us / 48.00 KB |

Validation

  • dotnet format Apache.Arrow.sln --include src/Apache.Arrow/Arrays/StringArray.cs test/Apache.Arrow.Tests/StringArrayTests.cs test/Apache.Arrow.Benchmarks/StringArrayGetStringBenchmark.cs --no-restore
  • dotnet test test/Apache.Arrow.Tests/Apache.Arrow.Tests.csproj -c Release --filter "FullyQualifiedName~Apache.Arrow.Tests.StringArrayTests"
  • dotnet build Apache.Arrow.sln -c Release

Contributor

@CurtHagenlocher CurtHagenlocher left a comment

Thanks! Can you please resolve the merge conflict?

Decode strings directly from the offsets and values buffers instead of routing through GetBytes, avoiding duplicate bounds/null/offset work on a common read path.

BenchmarkDotNet (StringArrayGetStringBenchmark, Count=1024): LegacyGetString 23.50 us / 48.08 KB; GetString 17.79 us / 48.08 KB; LegacyGetStringFromSlice 22.89 us / 48.00 KB; GetStringFromSlice 17.67 us / 48.00 KB.
@InCerryGit InCerryGit force-pushed the perf/string-array-getstring branch from e8296e2 to f7d6733 on April 27, 2026 03:20
@InCerryGit
Contributor Author

> Thanks! Can you please resolve the merge conflict?

done.

@CurtHagenlocher CurtHagenlocher merged commit ea58c94 into apache:main Apr 27, 2026
14 checks passed