Suboptimal code for struct of floats #1842

JohanEngelen · 2016-10-19T16:13:11Z

struct Vector { float x, y, z; }
float silly(Vector v) { return v.x * 5; }
float better(float x) { return x * 5; }

results in (-O3):

float example.silly(example.Vector):
        movd    %xmm0, %rax
        movd    %eax, %xmm0
        mulss   LCPI2_0(%rip), %xmm0
        retq

float example.better(float):
        mulss   LCPI3_0(%rip), %xmm0
        retq

Is this a missed optimization opportunity, or am I missing something?

kinke · 2016-10-19T17:14:40Z

First of all, this is most likely specific to the x86_64 System V ABI. I tend to blame DMD's toArgTypes() for this: it rewrites the 3 floats to 2 doubles so that the struct can be passed in XMM registers. I guess 4 floats would be more appropriate and would help LLVM. Unoptimized IR:

define float @_D7current5sillyFS7current6VectorZf({ double, double } %v_arg) #0 comdat {
  %.X86_64_C_struct_rewrite_dump = alloca { double, double }, align 4 ; [#uses = 2, size/byte = 16]
  store { double, double } %v_arg, { double, double }* %.X86_64_C_struct_rewrite_dump
  %v = bitcast { double, double }* %.X86_64_C_struct_rewrite_dump to %current.Vector* ; [#uses = 1]
  %1 = getelementptr inbounds %current.Vector, %current.Vector* %v, i32 0, i32 0 ; [#uses = 1, type = float*]
  %2 = load float, float* %1                      ; [#uses = 1]
  %3 = fmul float %2, 5.000000e+00                ; [#uses = 1]
  ret float %3
}

-O3:

define float @_D7current5sillyFS7current6VectorZf({ double, double } %v_arg) local_unnamed_addr #2 comdat {
  %v_arg.fca.0.extract = extractvalue { double, double } %v_arg, 0 ; [#uses = 1]
  %1 = bitcast double %v_arg.fca.0.extract to i64 ; [#uses = 1]
  %2 = trunc i64 %1 to i32                        ; [#uses = 1]
  %3 = bitcast i32 %2 to float                    ; [#uses = 1]
  %4 = fmul float %3, 5.000000e+00                ; [#uses = 1]
  ret float %4
}
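
For readers less familiar with LLVM IR, here is a rough D-level sketch of what the -O3 version still does: it takes the first double of the coerced argument, reinterprets it as a 64-bit integer, truncates to the low 32 bits, and reinterprets those as a float. The helper name `lowFloatOfCoerced` is made up for illustration and is not part of LDC; the sketch assumes a little-endian target such as x86_64 and only mimics the data movement, not the actual calling convention.

```d
import core.stdc.string : memcpy;

struct Vector { float x, y, z; }

// Hypothetical helper (not LDC code): mimics the coerced argument and the
// bitcast/trunc/bitcast chain seen in the -O3 IR.
float lowFloatOfCoerced(Vector v)
{
    // The 12-byte struct is treated as if it were { double, double } (16 bytes).
    double[2] coerced = 0.0;
    memcpy(coerced.ptr, &v, Vector.sizeof);

    // bitcast double -> i64
    ulong bits;
    memcpy(&bits, &coerced[0], bits.sizeof);

    // trunc i64 -> i32: keeps the low 32 bits, which hold v.x on little-endian
    uint low = cast(uint) bits;

    // bitcast i32 -> float
    float x;
    memcpy(&x, &low, x.sizeof);
    return x;
}

unittest
{
    assert(lowFloatOfCoerced(Vector(1.5f, 2.0f, 3.0f)) == 1.5f);
}
```

The 64-bit detour through `bits` is what shows up as the `movd %xmm0, %rax` / `movd %eax, %xmm0` pair in the assembly.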

dnadlinger · 2016-10-19T17:35:44Z

Clang rewrites it as (<2 x float> %f.coerce0, float %f.coerce1), completely flattening away the struct.
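
To illustrate why that split is natural: the System V classification looks at the struct in 8-byte chunks ("eightbytes"), and for this 12-byte struct x and y share the first chunk while z sits alone in the second. A minimal sketch, where `eightbyteIndex` is a hypothetical helper introduced only for this example:

```d
struct Vector { float x, y, z; }

// Hypothetical helper: index of the 8-byte chunk ("eightbyte") that contains
// the given byte offset.
size_t eightbyteIndex(size_t offset) { return offset / 8; }

// The struct is 12 bytes, so it spans two eightbytes.
static assert(Vector.sizeof == 12);

// x and y land in the first eightbyte (Clang's <2 x float>) ...
static assert(eightbyteIndex(Vector.x.offsetof) == 0);
static assert(eightbyteIndex(Vector.y.offsetof) == 0);

// ... and z lands in the second one (Clang's lone float).
static assert(eightbyteIndex(Vector.z.offsetof) == 1);
```

With that shape, reading v.x is a plain element extract from the first argument, with no integer round trip needed.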

JohanEngelen added the B-suboptimal-code label Oct 19, 2016

JohanEngelen mentioned this issue May 8, 2017 in: strange struct code gen #2094 (Open)
