[AMDGPU] Fix error in #88512. #92770

Merged: PeddleSpam merged 3 commits into llvm:main on May 20, 2024

PeddleSpam
Contributor

@PeddleSpam PeddleSpam commented May 20, 2024

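Fixes an error in the GlobalISel CTLZ_ZERO_UNDEF lowering introduced by #88512: the any-extended source must be shifted left (buildShl), not logically right (buildLShr), before the leading-zero count.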

@llvmbot

llvmbot commented May 20, 2024

@llvm/pr-subscribers-llvm-globalisel

@llvm/pr-subscribers-backend-amdgpu

Author: Leon Clark (PeddleSpam)

Changes
  • Reapply "[ctx_profile] Integration test (#92456)"
  • [Github] Revert accidental changes to dependabot config
  • Fix: remove wrongly pushed etime-function.mlir at toplevel (#92634)
  • [MCAsmParser] .macro/.rept/.irp/.irpc: remove excess \n after expansion
  • [flang][OpenMP] Re-enable tests when building OpenMP as a runtime (#89046)
  • [flang][OpenMP] Try to unify induction var privatization for OMP regions. (#91116)
  • [MCAsmParser] Improve .rept/.irp tests
  • [clang][ThreadSafety] Skip past implicit cast in translateAttrExpr
  • [clang][NFC] Further improvements to const-correctness
  • [GlobalIsel] Combine select to integer min max more (#92570)
  • [X86][CodeGen] Support flags copy lowering for CCMP/CTEST (#91849)
  • [mlir] Add operator<< for printing Block (#92550)
  • [flang][cuf] Add attr gen dependency to fix #92635
  • [nfc][ctx_profile] Fix printf - related -Wformat-pedantic
  • [NVPTX] support immediate values in st.param instructions (#91523)
  • [VPlan] Remove unused removeLastOperand (NFC).
  • [dsymutil] Use operator==(StringRef, StringRef) (NFC)
  • [DWARFLinker] Use an implicit conversion of SmallString to StringRef (NFC)
  • [DXIL] Use consistent SmallVector parameters
  • [DAG] Use copysign in frem power-2 fold. (#91751)
  • [VectorCombine] Don't transform single shuffles in shuffleToIdentity
  • update_test_checks: match IR basic block labels (#88979)
  • [ThinLTO]Sort imported GUIDs before cache key update (#92622)
  • [nfc][InstrFDO]Encapsulate header writes in a class member function (#90142)
  • Reformat
  • Quick fix for a waning in clang_rt.ctx_profile [-Wgnu-anonymous-struct]
  • [NewPM][AMDGPU] Add CodeGenPassBuilder (#91040)
  • [gn build] Port b4ba3fe
  • [GISel][RISCV] Legalize G_CONSTANT_FOLD_BARRIER (#89960)
  • [VectorCombine] Additional extend tests for shuffleToIdentity. NFC
  • [DAG] canCreateUndefOrPoison - merge INSERT_VECTOR_ELT/EXTRACT_VECTOR_ELT cases. NFC.
  • [ctx_profile] Pass lib path into test
  • [DAG] canCreateUndefOrPoison - only compute extract/index vector elt index knownbits when not poison
  • [DAG] visitAVG - rewrite "fold (avgfloor x, 0) -> x >> 1" to use SDPatternMatch
  • [DAG] visitABD - rewrite "(abs x, 0)" folds to use SDPatternMatch
  • Revert "[Bounds-Safety] Temporarily relax a counted_by attribute restriction on flexible array members"
  • Revert "[BoundsSafety] Allow 'counted_by' attribute on pointers in structs in C (#90786)"
  • Revert "[Bounds-Safety] Fix pragma-attribute-supported-attributes-list.test"
  • [Clang][CodeGen] Start migrating away from assuming the Default AS is 0 (#88182)
  • [CodeGen][SDAG] Skip preferred extend at O0 (#92643)
  • [CodeGen][SDAG] Track returntwice in lowering info (#92640)
  • [llvm] Add KnownBits implementations for avgFloor and avgCeil (#86445)
  • SimplifyLibCalls: Permit pow(2, x) -> ldexp(1, x) fold for vectors (#92532)
  • [VPlan] Simplify (X && Y) || (X && !Y) -> X. (#89386)
  • HLSL availability diagnostics design doc (#92207)
  • [DOCS] ORCv2.rst Typo (#89482)
  • [Clang][HLSL] Add environment parameter to availability attribute (#89809)
  • ValueTracking: Correct undef handling for constant FP vectors (#92557)
  • [BOLT] Fix preserved offset in fixDoubleJumps (#92485)
  • [AMDGPU] Use LSH for lowering ctlz_zero_undef.i8/i16 (#88512)
  • [TableGen] Avoid std::string copy. NFC
  • Update llvm-bugs.yml (#77243)
  • [llvm] Use operator==(StringRef, StringRef) (NFC) (#92705)
  • [clang-format][NFC] Clean up SortIncludesTest.cpp
  • [mlir] Use operator==(StringRef, StringRef) (NFC) (#92706)
  • [CallPromotionUtils]Implement conditional indirect call promotion with vtable-based comparison (#81378)
  • [clang] Use operator==(StringRef, StringRef) (NFC) (#92708)
  • [SDAG][X86] Extend SplitVecOp_VSETCC for STRICT_FSETCC. (#92509)
  • [llvm] Use StringRef::contains (NFC) (#92710)
  • [Serialization] Read the initializer for interesting static variables before consuming it (#92353)
  • [BOLT][NFC] Don't assign YAML profile to functions with no CFG (#92487)
  • [InstCombine] Fold pointer adding in integer to arithmetic add (#91596)
  • [AMDGPU] Use removeFnAttrFromReachable in lower-module-lds pass. (#92686)
  • [AMDGPU] Fix kernarg preloading crash with some types and alignments (#91625)
  • [ThinLTO] Populate declaration import status except for distributed ThinLTO under a default-off new option (#88024)
  • [NFC] Remove unused ASTWriter::getTypeID
  • [SCEV] Don't use non-deterministic constant folding for trip counts (#90942)
  • Revert "[ThinLTO] Populate declaration import status except for distributed ThinLTO under a default-off new option" (#92715)
  • [llvm] Use SmallString::str (NFC) (#92712)
  • [AMDGPU] Only set Info.memVT when not later overridden (#92670)
  • [MC] Make UseAssemblerInfoForParsing mostly true
  • MIPS: Support '%w' token in inline asm template for MSA (#91920)
  • Clang/MIPS: Add +fp64 if MSA and no explicit -mfp option (#91949)
  • MIPS/Clang: Use FP32 by default if CPU is mips1 (#92122)
  • [ELF] Support high address DW_EH_sdata4 for ELFCLASS32
  • [PowerPC]perform bitcast lowering only at 64 bit
  • [LoongArch] Select {DIV,MOD}.{W,WU} instruction to eliminate explicit sign extension (#92205)
  • [Clang] Fix __is_array returning true for zero-sized arrays (#86652)
  • [OpenCL] Add cl_khr_kernel_clock builtins (#91950)
  • [clang][ExtractAPI] Remove symbols defined in categories to external types unless requested (#92522)
  • [RISCV][CostModel] Remove cost of icmp inst in icmp+select with SFB. (#91158)
  • [DebugInfo][GVNSink] Fix #77415: GVNSink fails to optimize LLVM IR with debug info (#77602)
  • [AArch64] Add PreTest for optimizing MOV to ORR
  • [Driver][PS5] Set visibility option defaults (#92091)
  • [AArch64] Optimize MOV to ORR when load symmetric constants (#86249)
  • [Coverage] Rework !SystemHeadersCoverage (#91446)
  • [lldb][Windows] Fixed LibcxxChronoTimePointSecondsSummaryProvider() (#92701)
  • [ConstantFolding] Canonicalize constexpr GEPs to i8 (#89872)
  • InstSimplify: increase shufflevector test coverage (#92407)
  • [flang][HLFIR] Adapt SimplifyHLFIRIntrinsics to run on all top level ops (#92573)
  • movimm-expand-ldst.mir (d3d6565) requires asserts
  • [SLP] NFC. Use TreeEntry::getOperand if setOperandsInOrder is called (#92727)
  • [MLIR][OpenMP] NFC: Split OpenMP dialect definitions (#91741)
  • [mlir][irdl] Fix missing verifier in irdl.parametric (#92700)
  • [VPlan] Add commutative binary OR matcher, use in transform. (#92539)
  • [CloneFunction] Remove check that is no longer necessary (#92577)
  • [ValueTracking] Fix incorrect inferrence about the signbit of sqrt (#92510)
  • [LAA] Add tests with invariant accesses using vector types.
  • [clang] CTAD alias: Fix missing template arg packs during the transformation (#92535)
  • [TableGen] HasOneUse builtin predicate on PatFrags (#91578)
  • [clang] Make PS template DLL attribute propagation the same as MSVC (#92549)
  • [DebugInfo][NaryReassociate] Fix missing debug location updates (#92545)
  • [clang] Use SmallString::str (NFC) (#92717)
  • [libcxx] locale.cpp: Move build_name helper into unnamed namespace (#92461)
  • [Offload] Remove unused version script for plugins
  • [AMDGPU] Fix error in #88512.

Full diff: https://github.com/llvm/llvm-project/pull/92770.diff

1 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp (+1-1)
diff --git a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
index 15a4b6796880f..3523fcc7dbd50 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
@@ -4168,7 +4168,7 @@ bool AMDGPULegalizerInfo::legalizeCTLZ_ZERO_UNDEF(MachineInstr &MI,
 
   auto ShiftAmt = B.buildConstant(S32, 32u - NumBits);
   auto Extend = B.buildAnyExt(S32, {Src}).getReg(0u);
-  auto Shift = B.buildLShr(S32, {Extend}, ShiftAmt);
+  auto Shift = B.buildShl(S32, {Extend}, ShiftAmt);
   auto Ctlz = B.buildInstr(AMDGPU::G_AMDGPU_FFBH_U32, {S32}, {Shift});
   B.buildTrunc(Dst, Ctlz);
   MI.eraseFromParent();
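
To see why the one-line change matters, the following is a minimal standalone sketch that models the lowering with plain 32-bit integers. It is not LLVM code: `clz32`, `ctlzNarrowRef`, `ctlzViaShl` and `ctlzViaLShr` are hypothetical helper names, and `clz32` returning 32 for a zero input is a modelling assumption (the result for zero is undefined for CTLZ_ZERO_UNDEF). The upper bits of the extended value are treated as arbitrary, as they would be after an any-extend.

```cpp
#include <cassert>
#include <cstdint>

// Model of a 32-bit leading-zero count. Returning 32 for zero is an
// assumption of this sketch, not a statement about the hardware instruction.
inline unsigned clz32(uint32_t V) {
  unsigned N = 0;
  for (uint32_t Mask = 0x80000000u; Mask != 0 && (V & Mask) == 0; Mask >>= 1)
    ++N;
  return N;
}

// Reference answer: leading zeros of the low NumBits bits of Src.
inline unsigned ctlzNarrowRef(uint32_t Src, unsigned NumBits) {
  unsigned N = 0;
  for (unsigned Bit = NumBits; Bit-- > 0;) {
    if (Src & (1u << Bit))
      break;
    ++N;
  }
  return N;
}

// Lowering after this PR: shift the narrow value up to the top of the
// 32-bit register, which also discards the arbitrary upper bits.
inline unsigned ctlzViaShl(uint32_t Extended, unsigned NumBits) {
  return clz32(Extended << (32u - NumBits));
}

// Lowering before this PR: the right shift throws away the narrow value
// itself and keeps only the (arbitrary) upper bits.
inline unsigned ctlzViaLShr(uint32_t Extended, unsigned NumBits) {
  return clz32(Extended >> (32u - NumBits));
}

// A few spot checks for 8-bit and 16-bit inputs.
inline void selfCheck() {
  assert(ctlzViaShl(0x10u, 8) == ctlzNarrowRef(0x10u, 8));
  assert(ctlzViaShl(0xABCDEF10u, 8) == ctlzNarrowRef(0x10u, 8));
  assert(ctlzViaShl(0x0001u, 16) == ctlzNarrowRef(0x0001u, 16));
  assert(ctlzViaLShr(0x10u, 8) != ctlzNarrowRef(0x10u, 8));
}
```

In this model, the left-shift version matches the reference for every nonzero 8-bit or 16-bit input regardless of what the upper bits hold, while the right-shift version depends only on the upper bits and ignores the value being counted.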

@PeddleSpam PeddleSpam requested review from jayfoad and arsenm May 20, 2024 15:24
@jayfoad
Contributor

jayfoad commented May 20, 2024

This ought to require an update to the tests too.

@PeddleSpam PeddleSpam merged commit e1c06c3 into llvm:main May 20, 2024
3 of 4 checks passed