Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std.algorithm.sum and std.algorithm.reduce for fixed size arrays too #9925

Open
dlangBugzillaToGithub opened this issue Apr 17, 2012 · 0 comments

Comments

@dlangBugzillaToGithub
Copy link

bearophile_hugs reported this on 2012-04-17T14:13:46Z

Transfered from https://issues.dlang.org/show_bug.cgi?id=7934

Description

This is a second part of Issue 4725


The D type system supports both dynamic arrays and fixed-sized arrays. Despite Phobos regards fixed-sized arrays as second-class citizens of the language, they are quite important and useful in high-performance code, because they reduce the pressure on the garbage collector and allow for some extra optimizations.

An example: their length is known at compile time, so if such length is small, the compiler finds it simple to unroll loops. In scientific code loop unrolling is very important. Sometimes the JavaHotSpot is able to beat C++ in Scientific code on loops with bounds that aren't known at compile-time because the Just-in-time compiler is able to see that an array length known only at run-time is indeed constant for this run, so it's able to partially unroll the loop. I have verified this beats all C++ compilers in some number-crunching code.

A JIT is not needed with fixed-sized D arrays. Throwing away the length known at compile-time to turn them into dynamic arrays to make them ranges, is a waste of optimization opportunities.

If I see code:

int[5] a = foo();
auto s = sum(a);

I'd like that sum() to be replaced by an inlined unrolled loop, if the input array length is known at compile-time, and it's small.

(I think it's not hard to do. The test on the length is easy to do with a template constraint plus a compile-time function that essentially generates a "a[0] + a[1] + a[2] + a[3] + a[4]" mixin).

I'd like a specialization of std.algorithm.reduce too for small fixed-sized arrays.


Both sum and reduce call their normal dynamic array versions (slicing the input with []) if the fixed size inout array is long enough (like more than 8 or 16 items), because a full loop unroll is not good in this case (still, even in this case the back-end of the compiler will enjoy to know the array size at compile-time).
@LightBender LightBender removed the P4 label Dec 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants