-
Notifications
You must be signed in to change notification settings - Fork 420
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[GSoC] Add vectorization improvement tests #6548
Conversation
// This loop shouldn't be vectorized, because | ||
// LLVM backend cannot check whether A and B overlap | ||
// And runtime check for overlap was turned off | ||
// And --vectorize option too |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that 'vectorize' should be on by default for the LLVM backend. Therefore the .compots file for this test should have a --no-vectorize.
|
||
mv $OUTFILE $TMPFILE | ||
$FILECHECK --input-file $TMPFILE $TESTNAME.chpl 2> $OUTFILE | ||
rm $TMPFILE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you put this script it util/test
and then symbolically link to it, in the event we need to use it in more places? I'd still like to see FileCheck
as a 1st class alternative to a .good
file in the testing system but until then, let's use a symbolic link to a shared implementation of this file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this PR would be best merged after I add FileCheck to Chapel's test system and then refactor it not to use PREDIFF script, because current way we use FileCheck is actually a fast hack to see whether it makes sense to use this tool (it makes!).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd be comfortable merging this PR as-is or with the symbolic link. I think we should adjust the test system in a separate PR.
@@ -0,0 +1 @@ | |||
../PREDIFF |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one too can be a symbolic link to the script it util/test
start_loop3(); | ||
|
||
//CHECK: llvm.mem.parallel_loop_access ![[LOOP3:[0-9]+]] | ||
//CHECK-NOT: llvm.mem.parallel_loop_access ![[LOOP2]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have thought this access should be marked with parallel loop access for both loops? i.e. i'm surprised to see the CHECK-NOT for LOOP2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realized that it is probably not what we want by default. Here is one example of nested loops which are both parallel, but aren't a group of parallel loops:
for i in vectorizeOnly(1..n)
{
A[i] = A[i]*B[i];
for j in vectorizeOnly(A[i], B[i])
{
C[j] = 3*D[j];
}
}
There is a dependency, because result of inner loops depends on multiplication A[i] = A[i]*B[i]
, but once we know range the inner loop is obviously parallel. It's not a big deal to add this anyway, but keep in mind that according to example in documentation, only the innermost loop has metadata referring to the group of loops, meaning that this test is still valid for this case. My example might not be the best but in most general case the innermost loop might not be 'group parallel' and we shouldn't be marking code as such. I'm unable to test whether this will make difference, because as far as I know llvm currently does not use 'group parallel' metadata for anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you point me to some LLVM documentation explaining what you mean by group parallel
? As I understand llvm.mem.parallel_loop_access
, it indicates:
no loop carried memory dependence exist between it and other instructions denoted with the same loop identifier
In your example, the inner loop and the outer loop both have no loop-carried dependencies (that is, it would be a user error if the program depended upon the loop iteration order). There is a dependency between the first statement in the outer loop and the inner loop. If I understand correctly, my Allen & Kennedy textbook calls such a thing a loop-independent dependency.
@@ -0,0 +1 @@ | |||
--llvm --fast --vectorize --llvm-print-ir loop --llvm-print-ir-stage full --mllvm -force-vector-width=4 --mllvm -force-vector-interleave=1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GitHub is showing a funny line ending here, maybe, there's an extra newline or something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is in fact mark, that there is no newline character at the end.
hi @coodie - these are looking great, thanks! Besides my specific comments, my one request is to put a comment at the top of each of your tests to briefly describe what the purpose of the test is. I can tell you're trying to encode that in the test names (which is fine) but I think it'll help maintenance of these tests if a comment indicates what the intent was. Thanks! |
Passed test/llvm with CHPL_LLVM=llvm. @coodie - |
Passed test/llvm with CHPL_LLVM=llvm. |
This PR adds vectorization test testing changes related to #6533.
Tests are done using FileCheck. It's best if this PR is merged after FileCheck is integrated into chapel's test suite because current solution adds PREDIFF script which runs FileCheck, and for every test empty .good file has to be added.