extractps selected too eagerly

| Field | Value |
| --- | --- |
| Bugzilla Link | [2647](https://llvm.org/bz2647) |
| Resolution | FIXED |
| Resolved on | Oct 31, 2008 09:44 |
| Version | unspecified |
| OS | Windows NT |
| Reporter | LLVM Bugzilla Contributor |
| CC | @sunfishcode,@nlewycky |

## Extended Description 
The following LLVM IR compiles to suboptimal code on x86 CPUs with SSE4 support, but optimizes fine on older CPUs:

external global float, align 16		; <float*>:0 [#uses=2]

define internal void @""() {
	load float* @0, align 16		; <float>:1 [#uses=1]
	insertelement <4 x float> undef, float %1, i32 0		; <<4 x float>>:2 [#uses=1]
	call <4 x float> @llvm.x86.sse.rsqrt.ss( <4 x float> %2 )		; <<4 x float>>:3 [#uses=1]
	extractelement <4 x float> %3, i32 0		; <float>:4 [#uses=1]
	store float %4, float* @0, align 16
	ret void
}

declare <4 x float> @llvm.x86.sse.rsqrt.ss(<4 x float>) nounwind readnone

Here's the result on a Penryn CPU:

  push        ebp  
  mov         ebp,esp 
  and         esp,0FFFFFFF0h 
  rsqrtss     xmm0,dword ptr ds:[1762ED0h] 
  extractps   eax, xmm0
  movd        xmm0,eax 
  movss       dword ptr ds:[1762ED0h],xmm0 
  mov         esp,ebp 
  pop         ebp  
  ret      

And this is the lovable code I get on Conroe:

  rsqrtss     xmm0,dword ptr ds:[1762ED0h] 
  movss       dword ptr ds:[1762ED0h],xmm0 
  ret   

Ignoring the stack setup for now, it looks like extractps is selected too eagerly for an extractelement v4f32, 0.
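
For anyone who would rather reproduce the pattern from C than from hand-written IR, here is a minimal sketch. It assumes an x86-64 compiler that ships `<xmmintrin.h>`; the function name `rsqrt_in_place` is hypothetical and not part of the original report, and whether a given front end lowers it to exactly the IR above is also an assumption.

```c
#include <xmmintrin.h>  /* SSE1 intrinsics: _mm_load_ss, _mm_rsqrt_ss, _mm_cvtss_f32 */

/*
 * Hypothetical helper mirroring the IR in this report:
 * load float -> put it in lane 0 of a vector -> rsqrt.ss -> read lane 0 -> store.
 */
void rsqrt_in_place(float *p)
{
    /* Scalar load into lane 0. Unlike the IR's "undef", the upper lanes are zeroed. */
    __m128 v = _mm_load_ss(p);

    /* Approximate reciprocal square root of lane 0; upper lanes pass through. */
    __m128 r = _mm_rsqrt_ss(v);

    /* Read lane 0 back as a float (the "extractelement ..., i32 0" step) and store it. */
    *p = _mm_cvtss_f32(r);
}
```

Building this with and without SSE4.1 enabled (for example `-msse4.1` on GCC or Clang) and comparing the disassembly is one way to check whether the `extractps`/`movd` round trip shows up for the lane 0 read.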

P.S. To quickly test with and without SSE4 support, just force X86SSELevel to the desired value in X86Subtarget::AutoDetectSubtargetFeatures().
