
experiment with llvm vectorization passes #4786

Closed
StefanKarpinski opened this issue Nov 12, 2013 · 12 comments

Comments

@StefanKarpinski
Sponsor Member

This may be more applicable once we work with LLVM 3.4 (which may also depend on switching to using MCJIT), but there is now a fairly significant amount of support for autovectorization in LLVM.

@ArchRobison
Contributor

I'm interested in experimenting with adding an annotation akin to the OpenMP 4.0 "pragma omp simd". It would convey the information that exact sequential semantics are not required. For example, bounds checking would still happen, but a failed bounds check might terminate a loop earlier than if the sequential semantics were followed. Without that grant of permissiveness, autovectorizers are often thwarted.
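The proposed annotation might look like the following sketch (the `@simd` name and semantics here illustrate the proposal and its OpenMP inspiration; they were not a shipped feature at the time):

```julia
# Hypothetical sketch of the proposed annotation: the macro grants
# permission to relax exact sequential semantics, e.g. by reassociating
# the reduction below across vector lanes.
function relaxed_sum(a::Vector{Float64})
    s = 0.0
    @inbounds @simd for i in 1:length(a)
        s += a[i]    # order of additions may differ from sequential order
    end
    return s
end
```

With this grant, a failed bounds check (if checks were kept) could legally stop the loop a few iterations early relative to strict sequential execution.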

@StefanKarpinski
Sponsor Member Author

Also definitely worth considering, although I would, of course, prefer to avoid pragmas where possible.

@simonster
Member

I tried to enable the loop vectorizer in #3929, but I couldn't get it to work because of the way jl_value_t was getting handled in the instruction combining pass. This could use attention from someone who knows more about LLVM than I do. I also tried enabling the SLP vectorizer, but that made building the sysimg extremely slow; it seems like Julia sometimes ends up compiling functions with absurdly large numbers of variables.

@StefanKarpinski
Sponsor Member Author

> that made building the sysimg extremely slow

Might be worth it? Of course, the concern is more that it will make code compilation after building the system image very slow too.

@JeffBezanson
Sponsor Member

It'd probably be best to selectively enable it for non-huge functions.

@lindahua
Contributor

I agree with introducing some way to selectively enable LLVM auto-vectorization for a small set of functions for testing purposes.

If this works, micro-optimization like those in #5205 will no longer be needed.
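For instance, with vectorization working, a plain loop like this sketch could match hand-tuned code without manual unrolling (assuming the kind of `@simd` annotation discussed above):

```julia
# Sketch: a straightforward elementwise add that a working loop vectorizer
# could handle on its own, removing the need for manual micro-optimization.
function add!(c::Vector{Float64}, a::Vector{Float64}, b::Vector{Float64})
    @inbounds @simd for i in 1:length(c)
        c[i] = a[i] + b[i]
    end
    return c
end
```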

@ArchRobison
Contributor

I'm partway through implementing vectorization of loops that are marked by the programmer. The scheme is inspired by OpenMP 4.0's pragma omp simd. I was planning to create an issue describing what I have after the holidays. Here is a summary of what I'm trying to do:

  1. Have the programmer apply @inbounds to eliminate bounds checks. Maybe in the future we can make LLVM eliminate unnecessary checks. Item 2 should greatly help the necessary analysis. Or in the long term, just vectorize bounds checks too.
  2. Leverage type-based alias analysis in LLVM. I have this working. It greatly helps hoisting loop-invariant loads of fields like arrayptr and arraylen, and thus solves the "second issue" mentioned in WIP: Enable LLVM loop vectorizer (#3929). Alas, it's not much use without item 1, because code in the shadow of a bounds check is no longer guaranteed to be executed each iteration. I'd really like to find a way to teach LLVM to "speculatively" hoist such loads of fields in jl_array_t.
  3. Have the programmer mark the for loop with a @simd macro. The macro transforms the loop into an equivalent while loop that uses a loop test based on < instead of <=. That solves the "first issue" mentioned in WIP: Enable LLVM loop vectorizer (#3929) without having to set no-signed-wrap. (The < test will do the wrong thing if the original upper loop bound was INT_MAX. But so would no-signed-wrap.) Indeed items 1-3 are enough to vectorize simple loops that don't have memory dependence issues or for which LLVM can insert a run-time dependence test. (Sometimes it does, sometimes it doesn't. I haven't figured out its decision logic yet.)
  4. Have the @simd macro tell the LLVM loop vectorizer to ignore memory dependencies when considering whether to vectorize a loop. This is the reason that OpenMP 4.0 added the equivalent feature. I'm working on this part now. I was planning to attach LLVM Metadata to the "loop latch" BasicBlock, but found out today that LLVM currently does not allow attaching metadata to a BasicBlock (sigh). I'm going to try hacking something that attaches the metadata to an instruction in the block, and then figure out what I have to modify in LLVM to propagate the information.
    In lieu of step 4, my current prototype makes the loop vectorizer ignore memory dependences in any Julia function with a name beginning with banana. That's useful for experiments, but probably not production worthy :-)
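Steps 1 and 3 above can be sketched as a source-level transformation (the exact lowering the macro would emit is hypothetical here):

```julia
# Before: what the programmer writes (step 1 adds @inbounds, step 3 adds @simd)
#     @inbounds @simd for i = 1:n
#         y[i] += a * x[i]
#     end
#
# After: roughly what the @simd macro produces — a while loop whose test
# uses `<` instead of `<=`, so no-signed-wrap need not be assumed.
function axpy_lowered!(y::Vector{Float64}, a::Float64, x::Vector{Float64}, n::Int)
    i = 0
    @inbounds while i < n
        i += 1
        y[i] += a * x[i]
    end
    return y
end
```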

@StefanKarpinski
Sponsor Member Author

> In lieu of step 4, my current prototype makes the loop vectorizer ignore memory dependences in any Julia function with a name beginning with banana. That's useful for experiments, but probably not production worthy :-)

Seems like a fine interface to me.

@ViralBShah
Member

I like the idea of the banana interface too. :-)

@jiahao
Member

jiahao commented Apr 7, 2014

Presumably closed by PR above?

@simonster
Member

This isn't complete until we have the SLPVectorizer (#6271).
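Unlike the loop vectorizer, the SLP (superword-level parallelism) vectorizer targets straight-line code. A sketch of the kind of pattern it aims to pack into SIMD operations:

```julia
# Four independent, adjacent scalar stores that an SLP vectorizer may
# fuse into a single vector load/add/store sequence.
function add4!(c::Vector{Float64}, a::Vector{Float64}, b::Vector{Float64})
    @inbounds begin
        c[1] = a[1] + b[1]
        c[2] = a[2] + b[2]
        c[3] = a[3] + b[3]
        c[4] = a[4] + b[4]
    end
    return c
end
```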

@JeffBezanson
Sponsor Member

I think this is well underway and subsumed by more specific issues.

7 participants