[C++] Implementation of ExecuteScalarExpressionOverhead benchmarks without Arrow for comparison #20250

@asfimport

Description

The ExecuteScalarExpressionOverhead group of benchmarks currently gives us values that we can compare across batch sizes or across expressions. But it does not show how well Arrow does compared to what is possible in general.

The simple_expression (negate x) and complex_expression (x>0 and x<20) benchmarks, which perform an actual operation on data, can be implemented in pure C++ for comparison.
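
As a rough illustration of what such a pure C++ counterpart to simple_expression could look like, here is a minimal sketch. The function name NegateBaseline and the int64_t element type are assumptions made for this example; the issue does not state which type the benchmark uses.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical baseline for simple_expression (negate x): a single pass
// over the input with no Arrow involvement.
// Assumption: int64_t elements; `out` is already sized to match `in`.
void NegateBaseline(const std::vector<int64_t>& in, std::vector<int64_t>& out) {
  for (std::size_t i = 0; i < in.size(); ++i) {
    out[i] = -in[i];
  }
}
```

The caller owns both buffers, so the same function works whether buffers are reused or allocated per iteration.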

I implemented the complex_expression benchmark using technically unnecessary intermediate buffers for the > and < operator results, to match what happens in the Arrow expression.
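
The intermediate-buffer approach described above might be sketched as follows. This is not the code from the issue: the name ComplexBaseline, the int64_t input type, and the one-byte-per-value boolean buffers are assumptions for illustration (Arrow itself stores booleans bit-packed, so this is a simplification).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical baseline for complex_expression (x > 0 and x < 20).
// Each operator writes to its own buffer, mirroring how an expression
// evaluator materializes the result of every sub-expression.
// Assumption: all three output buffers are already sized to x.size().
void ComplexBaseline(const std::vector<int64_t>& x,
                     std::vector<uint8_t>& greater,
                     std::vector<uint8_t>& less,
                     std::vector<uint8_t>& out) {
  const std::size_t n = x.size();
  for (std::size_t i = 0; i < n; ++i) {
    greater[i] = static_cast<uint8_t>(x[i] > 0);   // intermediate: x > 0
  }
  for (std::size_t i = 0; i < n; ++i) {
    less[i] = static_cast<uint8_t>(x[i] < 20);     // intermediate: x < 20
  }
  for (std::size_t i = 0; i < n; ++i) {
    out[i] = static_cast<uint8_t>(greater[i] & less[i]);  // final: and
  }
}
```

A fused single loop computing `x[i] > 0 && x[i] < 20` directly would avoid the two intermediate buffers, but it would then measure something different from what the Arrow expression does.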

What may seem unfair is that I currently reuse the input/output/intermediate buffers across all iterations. I also tried using new and delete each time, but could not measure a difference in performance. Reusing the buffers allows the use of std::vector for slightly cleaner code. Re-creating a vector each time would result in a lot of overhead from initializing the vector values and is therefore not useful.
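
To make the buffer-reuse choice concrete, here is a minimal timing sketch that keeps the std::vector buffers outside the measured loop. It uses std::chrono rather than the benchmark framework, and the name TimeNegateReusingBuffers is hypothetical. A real benchmark would also need to stop the compiler from eliding the repeated work, which this sketch does not attempt.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical timing loop: the buffers are created (and value-initialized)
// once by the caller, so that cost stays out of the measured region.
// Returns the average time per iteration in nanoseconds.
double TimeNegateReusingBuffers(const std::vector<int64_t>& in,
                                std::vector<int64_t>& out,
                                int iterations) {
  const auto start = std::chrono::steady_clock::now();
  for (int it = 0; it < iterations; ++it) {
    for (std::size_t i = 0; i < in.size(); ++i) {
      out[i] = -in[i];  // same buffers on every iteration
    }
  }
  const auto end = std::chrono::steady_clock::now();
  const std::chrono::duration<double, std::nano> elapsed = end - start;
  return elapsed.count() / iterations;
}
```

Constructing `std::vector<int64_t> out(n)` inside the loop instead would zero-fill n elements on every iteration, which is the initialization overhead mentioned above.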

Example output: example-output-baseline.txt

Reporter: Tobias Zagorni / @zagto
Assignee: Tobias Zagorni / @zagto

Original Issue Attachments:

PRs and other links:

Note: This issue was originally created as ARROW-16599. Please see the migration documentation for further details.
