Commit
Enable 16-byte by pieces move [PR111449]
Hi,
  Patch 2 enables 16-byte by pieces move on rs6000. It also fixes the regression cases caused by the previous patch.

  For sra-17/18, the long array with 4 elements can be loaded by one 16-byte by pieces move on a 32-bit platform. As a result, the array is not constructed in LC0 and the SRA optimization cannot be applied. The "no-vsx" option is added for the 32-bit platform, as it sets MOVE_MAX_PIECES to 4 bytes there, so the array can't be loaded by a single by pieces move.

  Another regression is on P8 LE. The 16-byte memory to memory move is implemented by a TImode load/store pair. Each TImode load/store is finally split into two DImode load/stores on P8 LE, as it doesn't have unaligned vector load/store instructions. In fact, a 16-byte memory to memory move can be implemented by two V2DI reversed load/store instructions on P8 LE. The patch creates an insn_and_split pattern for this optimization.

  Bootstrapped and tested on x86 and powerpc64-linux BE and LE with no regressions. Is this OK for trunk?

Thanks
Gui Haochen

ChangeLog
rs6000: Enable 16-byte by pieces move

This patch enables 16-byte by pieces move. The 16-byte move is generated with TImode and finally implemented by vector instructions.

There are several regression cases after the enablement. A 16-byte TImode memory to memory move was originally implemented by two pairs of DImode load/store on P8 LE, as there is no unaligned VSX load/store on it. The patch fixes the problem by creating an insn_and_split pattern that converts the move to one pair of reversed load/store.

Two SRA cases lost the SRA optimization on the 32-bit platform, as the array can be loaded by one 16-byte move and so is not initialized in LC0. Fix them by adding the no-vsx option.

gcc/
	PR target/111449
	* config/rs6000/vsx.md (*vsx_le_mem_to_mem_mov_ti): New.

gcc/testsuite/
	PR target/111449
	* gcc.dg/tree-ssa/sra-17.c: Add no-vsx option for powerpc ilp32.
	* gcc.dg/tree-ssa/sra-18.c: Likewise.
	* gcc.target/powerpc/pr111449-1.c: New.

patch.diff
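For readers unfamiliar with the two situations described above, here is a minimal C sketch of the kind of source they refer to. It is an illustration only, not the content of pr111449-1.c or of the sra-17/18 tests; the function names `copy16` and `sum4` and the array values are hypothetical.

```c
#include <string.h>

/* Hypothetical helper: copy a fixed 16-byte block from memory to memory.
   Because the size is a compile-time constant, the compiler may expand
   the copy inline ("by pieces") instead of calling the library memcpy.  */
void
copy16 (void *dst, const void *src)
{
  memcpy (dst, src, 16);
}

/* Loosely modeled on the SRA situation: a local array of 4 longs.
   When long is 4 bytes (ilp32), the whole initializer is 16 bytes, so
   it can be covered by a single 16-byte by pieces move.  */
long
sum4 (void)
{
  long a[4] = { 7, 19, 11, 255 };
  return a[0] + a[1] + a[2] + a[3];
}
```

Compiling such a file for a powerpc target with `-O2 -S` and a suitable `-mcpu=` setting is one way to inspect which load/store instructions the move expands to; the exact output depends on the compiler version and options.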