The Thorin function foo computes the sequence 0, 1, ..., 99 as a pack «100; %core.I32» via a simple %affine.For counting loop. Initially, the pack ‹100; 0:%core.I32› is passed to this loop and updated as an accumulator acc between iterations. Inside the function bar, said pack is finally stored at some ptr: %mem.Ptr0.
However, this only produces the correct output when the scalerize pass runs, which flattens the pack inside the loop. If the pass is skipped (e.g. because %compile.scalerize_threshold is exceeded), the output contains garbage instead.
Looking at the generated LLVM IR, it seems acc is not updated correctly when jumping between the basic blocks (BBs) that form the loop.
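For comparison, the behaviour the report expects can be modelled in plain C. This is only a sketch of the semantics described above, not the original bug.c or the code Thorin generates: the names pack_t, step, bar_reference and holds_expected_sequence are hypothetical, and it assumes that iteration i writes the value i into slot i of the accumulator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define N 100

/* Hypothetical value type standing in for the 100-element I32 pack. */
typedef struct { int32_t elems[N]; } pack_t;

/* One loop iteration: takes the old accumulator by value and returns the
 * updated one (assumption: iteration i sets slot i to i). */
static pack_t step(int32_t i, pack_t acc) {
    acc.elems[i] = i;
    return acc;
}

/* Hypothetical reference for bar: runs the counting loop with the pack as
 * the loop-carried accumulator, then stores the final pack at ptr. */
static void bar_reference(int32_t *ptr) {
    assert(ptr != NULL);
    pack_t acc;
    memset(&acc, 0, sizeof acc);            /* initial pack: all zeros */
    for (int32_t i = 0; i < N; ++i)
        acc = step(i, acc);                 /* acc is replaced every iteration */
    memcpy(ptr, acc.elems, sizeof acc.elems);
}

/* Returns 1 if buf holds 0, 1, ..., N-1, and 0 otherwise. */
static int holds_expected_sequence(const int32_t *buf) {
    for (int32_t i = 0; i < N; ++i)
        if (buf[i] != i)
            return 0;
    return 1;
}
```

A caller can pass a buffer to the compiled bar and then run holds_expected_sequence on it. With the scalerize pass skipped, the report implies this check would fail.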
The problem is that the extractvalue instructions are placed in the wrong block, thereby violating a basic SSA property. I'm really astonished that clang silently accepts this garbage and removes it entirely. Maybe a debug build has more sanity checks.
So the problem seems to be that %affine.lower_for_pass creates a couple of things that the LLVM backend doesn't like. I think the best solution is to rewrite the pass to create a more straightforward loop. I'm on it. This could also speed up a couple of things if you use lots of nested %affine.For loops.
In addition, the C program bug.c calls bar with a pointer to an array and prints its contents to stdout.