Commit
tree-optimization/106081 - elide redundant permute
The following patch makes sure to elide a redundant permute that can be merged with existing splats represented as load permutations as we now do for non-grouped SLP loads. This is the last bit missing to fix this PR where the main fix was already done by r14-2117-gdd86a5a69cbda4.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

	PR tree-optimization/106081
	* tree-vect-slp.cc (vect_optimize_slp_pass::start_choosing_layouts):
	Assign layout -1 to splats.

	* gcc.dg/vect/pr106081.c: New testcase.
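As a rough illustration of why a permute applied to a splat is redundant, the sketch below models a 4-lane vector in plain C. It is not code from tree-vect-slp.cc; the names `LANES`, `splat`, `permute` and `permuted_splat_is_unchanged` are hypothetical helpers introduced only for this example. Since every lane of a splat holds the same value, selecting lanes in any order reproduces the same vector, which is presumably why such a node does not need to prefer one layout over another.

```c
#include <stddef.h>

/* Hypothetical lane count, chosen for illustration only.  */
enum { LANES = 4 };

/* Broadcast one scalar into every lane: a "splat".  */
static void
splat (double x, double out[LANES])
{
  for (size_t i = 0; i < LANES; i++)
    out[i] = x;
}

/* Single-input lane permutation: out[i] = in[sel[i]].
   Every sel[i] is assumed to be in the range [0, LANES).  */
static void
permute (const double in[LANES], const size_t sel[LANES], double out[LANES])
{
  for (size_t i = 0; i < LANES; i++)
    out[i] = in[sel[i]];
}

/* Return 1 if permuting a splat of X by SEL reproduces the splat,
   0 otherwise.  X is assumed not to be a NaN, since NaN != NaN.  */
int
permuted_splat_is_unchanged (double x, const size_t sel[LANES])
{
  double s[LANES], p[LANES];

  splat (x, s);
  permute (s, sel, p);
  for (size_t i = 0; i < LANES; i++)
    if (p[i] != s[i])
      return 0;
  return 1;
}
```

Calling `permuted_splat_is_unchanged` with any in-range selector, for example the lane reversal `{3, 2, 1, 0}`, returns 1 for a non-NaN input, so the permute step can be dropped without changing the result.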
1 parent c2d62cd, commit 76f66d4
Showing 2 changed files with 37 additions and 1 deletion.
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-ffast-math -fdump-tree-optimized" } */
+/* { dg-additional-options "-mavx2" { target x86_64-*-* i?86-*-* } } */
+/* { dg-require-effective-target vect_double } */
+/* { dg-require-effective-target vect_unpack } */
+/* { dg-require-effective-target vect_intdouble_cvt } */
+/* { dg-require-effective-target vect_perm } */
+
+struct pixels
+{
+  short a,b,c,d;
+} *pixels;
+struct dpixels
+{
+  double a,b,c,d;
+};
+
+double
+test(double *k)
+{
+  struct dpixels results={};
+  for (int u=0; u<1000*16;u++,k--)
+    {
+      results.a += *k*pixels[u].a;
+      results.b += *k*pixels[u].b;
+      results.c += *k*pixels[u].c;
+      results.d += *k*pixels[u].d;
+    }
+  return results.a+results.b*2+results.c*3+results.d*4;
+}
+
+/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" } } */
+/* { dg-final { scan-tree-dump-times "VEC_PERM" 4 "optimized" { target x86_64-*-* i?86-*-* } } } */