[NewGVN][InstCombine] missed load elimination #112232

@nurmukhametov

Consider the following LLVM IR:

%v8_uniform_FVector4 = type { [4 x double] }

define void @foo(ptr %ret_mem, <4 x double> %a, <4 x double> %b) {
  %ptr = alloca %v8_uniform_FVector4, align 8
  store <4 x double> %a, ptr %ptr, align 8
  %e0 = load double, ptr %ptr, align 8
  %of1 = getelementptr inbounds %v8_uniform_FVector4, ptr %ptr, i64 0, i32 0, i64 1
  %e1 = load double, ptr %of1, align 8
  %o2 = getelementptr inbounds %v8_uniform_FVector4, ptr %ptr, i64 0, i32 0, i64 2
  %e2 = load double, ptr %o2, align 8
  %o3 = getelementptr inbounds %v8_uniform_FVector4, ptr %ptr, i64 0, i32 0, i64 3
  %e3 = load double, ptr %o3, align 8
  %vec0 = insertelement <4 x double> poison, double %e1, i64 0
  %vec1 = insertelement <4 x double> %vec0, double %e2, i64 1
  %vec2 = insertelement <4 x double> %vec1, double %e0, i64 2
  %vec3 = insertelement <4 x double> %vec2, double %e3, i64 3
  store <4 x double> %vec3, ptr %ret_mem, align 8
  ret void
}

The combination of NewGVN and InstCombine fails to eliminate the loads, whereas GVN and InstCombine eliminate them and produce the following:

define void @foo(ptr %ret_mem, <4 x double> %a, <4 x double> %b) {
  %vec3 = shufflevector <4 x double> %a, <4 x double> poison, <4 x i32> <i32 1, i32 2, i32 0, i32 3>
  store <4 x double> %vec3, ptr %ret_mem, align 8
  ret void
}

Compiler explorer link: https://godbolt.org/z/sbKv81a6s
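
As a sanity check that the two functions above compute the same result, here is a minimal Python sketch that models the reproducer. It is a toy model, not LLVM code: the function names `unoptimized` and `shuffled` are hypothetical, and memory is modeled as a plain list of four doubles, assuming each scalar load reads exactly one lane of the stored vector.

```python
# Toy model of the reproducer; not LLVM code. All names here are illustrative.

def unoptimized(a):
    """Mirror the original @foo: store %a, reload each lane, rebuild the vector."""
    mem = list(a)            # store <4 x double> %a, ptr %ptr
    e0, e1, e2, e3 = mem     # four scalar loads at element offsets 0..3
    return [e1, e2, e0, e3]  # insertelement chain filling lanes 0..3


def shuffled(a, mask=(1, 2, 0, 3)):
    """Mirror the optimized @foo: one shufflevector with the given mask."""
    return [a[i] for i in mask]


a = [10.0, 11.0, 12.0, 13.0]
assert unoptimized(a) == shuffled(a) == [11.0, 12.0, 10.0, 13.0]
```

The model only confirms that the shuffle mask `<1, 2, 0, 3>` matches the lane order of the insertelement chain; it says nothing about why NewGVN does not forward the stored vector to the loads.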

Labels: llvm:GVN (GVN and NewGVN stages (Global value numbering)), llvm:instcombine (Covers the InstCombine, InstSimplify and AggressiveInstCombine passes), missed-optimization
