[SLP] Suboptimal codegen for {x, y} ↦ {x+x, x+y} #40801

@dsprenkels

Bugzilla Link 41456
Version trunk
OS Linux
CC @topperc,@LebedevRI,@RKSimon,@rotateright,@vporpo

Extended Description

https://godbolt.org/z/2GE_vO


The code snippet

#include <xmmintrin.h>

__m128i example(const __m128i vec) {
    return (__m128i){2 * vec[0], vec[0] + vec[1]};
}

is compiled to code that unpacks the values into general-purpose registers and then repacks them into a vector register, which takes 7 instructions. However, the same result could easily be achieved using:

example:
    vpbroadcastq xmm1, xmm0
    vpaddq xmm0, xmm0, xmm1
    ret

Other instructions can be used in place of vpbroadcastq, such as vmovddup or vpermilpd.
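
For comparison, the broadcast-then-add pattern can be written by hand with SSE2 intrinsics. This is only a sketch of the intended data flow, not a statement about what the SLP vectorizer emits: the names example_broadcast and check_example_broadcast are hypothetical, and the instruction the compiler selects for the duplicate step (vpbroadcastq, vmovddup, punpcklqdq, ...) depends on the target features and compiler version.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2: __m128i, _mm_add_epi64, _mm_unpacklo_epi64 */

/* Hypothetical name. Computes {2 * vec[0], vec[0] + vec[1]} without
 * leaving the vector registers. */
__m128i example_broadcast(const __m128i vec) {
    /* Duplicate the low 64-bit element into both lanes: {vec[0], vec[0]}. */
    const __m128i lo = _mm_unpacklo_epi64(vec, vec);
    /* Lane-wise add: {vec[0] + vec[0], vec[1] + vec[0]}. */
    return _mm_add_epi64(vec, lo);
}

/* Hypothetical self-check for a single input. */
void check_example_broadcast(void) {
    long long out[2];
    /* _mm_set_epi64x takes the high element first: v[0] = 5, v[1] = 7. */
    const __m128i v = _mm_set_epi64x(7, 5);
    _mm_storeu_si128((__m128i *)out, example_broadcast(v));
    assert(out[0] == 10);  /* 2 * 5 */
    assert(out[1] == 12);  /* 5 + 7 */
}
```

Comparing the assembly generated for example_broadcast with that of the original example function shows how far the vectorized output is from the hand-written form.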
