Open
Labels
enhancement (New feature or request), performance (Make DataFusion faster)
Description
Is your feature request related to a problem or challenge?
There seem to be some opportunities for optimizing `ArrowBytesViewMap` with a bit more cleverness.
For example, in ClickBench query 5, more than 50% of CPU time is spent in `intern`.
Much of that time goes to fetching and comparing bytes from the buffers (`append_value`, `get_value`, `memcmp`, `makeview`, etc.).
Describe the solution you'd like
We should be able to avoid (re)creating views every time and comparing against slices, by storing/comparing the views directly, and avoiding the overhead of the GenericByteViewBuilder methods.
To do so, I think we need:
- Not use `values.iter()`, but use the view buffer and get the buffer index
- Compare against the original view (and the buffer at that index if needed)
- Update the new view with the new index (don't create it again).
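The steps above could look roughly like the following. This is only a minimal sketch, assuming the Arrow view layout (a 16-byte little-endian value: length in bits 0..32, then either up to 12 inline bytes, or a 4-byte prefix, buffer index, and offset). All function names here (`sketch_make_view`, `views_equal`, `with_location`, `long_bytes`) are hypothetical and are not existing DataFusion or arrow-rs APIs; plain `Vec<u8>` stands in for the real data buffers.

```rust
/// Hypothetical stand-in for building a view, used only to make the sketch
/// self-contained. Strings of <= 12 bytes are stored inline, zero padded.
fn sketch_make_view(data: &[u8], buffer_index: u32, offset: u32) -> u128 {
    let mut bytes = [0u8; 16];
    bytes[0..4].copy_from_slice(&(data.len() as u32).to_le_bytes());
    if data.len() <= 12 {
        bytes[4..4 + data.len()].copy_from_slice(data);
    } else {
        bytes[4..8].copy_from_slice(&data[0..4]); // 4-byte prefix
        bytes[8..12].copy_from_slice(&buffer_index.to_le_bytes());
        bytes[12..16].copy_from_slice(&offset.to_le_bytes());
    }
    u128::from_le_bytes(bytes)
}

/// Bytes referenced by a long (> 12 byte) view.
fn long_bytes(view: u128, buffers: &[Vec<u8>]) -> &[u8] {
    let len = view as u32 as usize;
    let buffer_index = (view >> 64) as u32 as usize;
    let start = (view >> 96) as u32 as usize;
    &buffers[buffer_index][start..start + len]
}

/// Compare two views directly, touching the data buffers only when the
/// cheap checks cannot decide.
fn views_equal(a: u128, a_bufs: &[Vec<u8>], b: u128, b_bufs: &[Vec<u8>]) -> bool {
    // The low 64 bits hold the length and the first 4 bytes of the value.
    if a as u64 != b as u64 {
        return false;
    }
    if a as u32 <= 12 {
        // Inline values: the whole view is the value, so one integer compare.
        return a == b;
    }
    // Long values with equal length and prefix: fall back to the buffers.
    long_bytes(a, a_bufs) == long_bytes(b, b_bufs)
}

/// Reuse an existing view, patching only the buffer index and offset
/// instead of rebuilding it from the bytes.
fn with_location(view: u128, buffer_index: u32, offset: u32) -> u128 {
    if view as u32 <= 12 {
        return view; // inline views carry no location
    }
    (view & u64::MAX as u128) | ((buffer_index as u128) << 64) | ((offset as u128) << 96)
}
```

With this shape, most mismatches are rejected by a single 64-bit compare, short strings never touch the data buffers, and inserting a new long value only rewrites the upper 64 bits of the incoming view.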
Describe alternatives you've considered
No response
Additional context
No response
alamb