-
Notifications
You must be signed in to change notification settings - Fork 10.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[X86] Run X86FastPreTileConfigPass only with FastISel. #70754
Conversation
@llvm/pr-subscribers-backend-x86 Author: Evgenii Kudriashov (e-kud) ChangesCloses #64452 Full diff: https://github.com/llvm/llvm-project/pull/70754.diff 3 Files Affected:
diff --git a/llvm/lib/Target/X86/X86FastPreTileConfig.cpp b/llvm/lib/Target/X86/X86FastPreTileConfig.cpp
index ea942445a18193c..98621eae5861d87 100644
--- a/llvm/lib/Target/X86/X86FastPreTileConfig.cpp
+++ b/llvm/lib/Target/X86/X86FastPreTileConfig.cpp
@@ -667,7 +667,9 @@ bool X86FastPreTileConfig::runOnMachineFunction(MachineFunction &MFunc) {
bool HasVirtTileReg = false;
for (unsigned I = 0, E = NumVirtRegs; I != E; ++I) {
Register VirtReg = Register::index2VirtReg(I);
- if (MRI->getRegClass(VirtReg)->getID() == X86::TILERegClassID) {
+ const TargetRegisterClass *RC = MRI->getRegClassOrNull(VirtReg);
+ assert(RC && "A register must have a class after FastISel");
+ if (RC->getID() == X86::TILERegClassID) {
HasVirtTileReg = true;
break;
}
diff --git a/llvm/lib/Target/X86/X86TargetMachine.cpp b/llvm/lib/Target/X86/X86TargetMachine.cpp
index 5668b514d6dec07..4698123ff00fa77 100644
--- a/llvm/lib/Target/X86/X86TargetMachine.cpp
+++ b/llvm/lib/Target/X86/X86TargetMachine.cpp
@@ -531,10 +531,10 @@ void X86PassConfig::addPreRegAlloc() {
addPass(createX86FlagsCopyLoweringPass());
addPass(createX86DynAllocaExpander());
- if (getOptLevel() != CodeGenOptLevel::None)
- addPass(createX86PreTileConfigPass());
- else
+ if (TM->Options.EnableFastISel)
addPass(createX86FastPreTileConfigPass());
+ else
+ addPass(createX86PreTileConfigPass());
}
void X86PassConfig::addMachineSSAOptimization() {
diff --git a/llvm/test/CodeGen/X86/O0-pipeline.ll b/llvm/test/CodeGen/X86/O0-pipeline.ll
index 402645ed1e2e5d6..5943e3bca0d34d9 100644
--- a/llvm/test/CodeGen/X86/O0-pipeline.ll
+++ b/llvm/test/CodeGen/X86/O0-pipeline.ll
@@ -1,95 +1,111 @@
; When EXPENSIVE_CHECKS are enabled, the machine verifier appears between each
; pass. Ignore it with 'grep -v'.
; RUN: llc -mtriple=x86_64-- -O0 -debug-pass=Structure < %s -o /dev/null 2>&1 \
-; RUN: | grep -v 'Verify generated machine code' | FileCheck %s
+; RUN: | grep -v 'Verify generated machine code' | FileCheck %s --check-prefixes=COMMON,SDISEL,DEFAULT
+; RUN: llc -mtriple=x86_64-- -O0 -fast-isel=0 -debug-pass=Structure < %s -o /dev/null 2>&1 \
+; RUN: | grep -v 'Verify generated machine code' | FileCheck %s --check-prefixes=COMMON,SDISEL,NOFASTISEL
+; RUN: llc -mtriple=x86_64-- -O0 -global-isel -fast-isel=0 -debug-pass=Structure < %s -o /dev/null 2>&1 \
+; RUN: | grep -v 'Verify generated machine code' | FileCheck %s --check-prefixes=COMMON,GISEL,NOFASTISEL
; REQUIRES: asserts
-; CHECK-LABEL: Pass Arguments:
-; CHECK-NEXT: Target Library Information
-; CHECK-NEXT: Target Pass Configuration
-; CHECK-NEXT: Machine Module Information
-; CHECK-NEXT: Target Transform Information
-; CHECK-NEXT: Create Garbage Collector Module Metadata
-; CHECK-NEXT: Assumption Cache Tracker
-; CHECK-NEXT: Profile summary info
-; CHECK-NEXT: Machine Branch Probability Analysis
-; CHECK-NEXT: ModulePass Manager
-; CHECK-NEXT: Pre-ISel Intrinsic Lowering
-; CHECK-NEXT: FunctionPass Manager
-; CHECK-NEXT: Expand large div/rem
-; CHECK-NEXT: Expand large fp convert
-; CHECK-NEXT: Expand Atomic instructions
-; CHECK-NEXT: Lower AMX intrinsics
-; CHECK-NEXT: Lower AMX type for load/store
-; CHECK-NEXT: Module Verifier
-; CHECK-NEXT: Lower Garbage Collection Instructions
-; CHECK-NEXT: Shadow Stack GC Lowering
-; CHECK-NEXT: Lower constant intrinsics
-; CHECK-NEXT: Remove unreachable blocks from the CFG
-; CHECK-NEXT: Expand vector predication intrinsics
-; CHECK-NEXT: Scalarize Masked Memory Intrinsics
-; CHECK-NEXT: Expand reduction intrinsics
-; CHECK-NEXT: Expand indirectbr instructions
-; CHECK-NEXT: Exception handling preparation
-; CHECK-NEXT: Prepare callbr
-; CHECK-NEXT: Safe Stack instrumentation pass
-; CHECK-NEXT: Insert stack protectors
-; CHECK-NEXT: Module Verifier
-; CHECK-NEXT: Assignment Tracking Analysis
-; CHECK-NEXT: X86 DAG->DAG Instruction Selection
-; CHECK-NEXT: X86 PIC Global Base Reg Initialization
-; CHECK-NEXT: Argument Stack Rebase
-; CHECK-NEXT: Finalize ISel and expand pseudo-instructions
-; CHECK-NEXT: Local Stack Slot Allocation
-; CHECK-NEXT: X86 speculative load hardening
-; CHECK-NEXT: MachineDominator Tree Construction
-; CHECK-NEXT: X86 EFLAGS copy lowering
-; CHECK-NEXT: X86 DynAlloca Expander
-; CHECK-NEXT: Fast Tile Register Preconfigure
-; CHECK-NEXT: Eliminate PHI nodes for register allocation
-; CHECK-NEXT: Two-Address instruction pass
-; CHECK-NEXT: Fast Register Allocator
-; CHECK-NEXT: Fast Tile Register Configure
-; CHECK-NEXT: X86 Lower Tile Copy
-; CHECK-NEXT: Bundle Machine CFG Edges
-; CHECK-NEXT: X86 FP Stackifier
-; CHECK-NEXT: Remove Redundant DEBUG_VALUE analysis
-; CHECK-NEXT: Fixup Statepoint Caller Saved
-; CHECK-NEXT: Lazy Machine Block Frequency Analysis
-; CHECK-NEXT: Machine Optimization Remark Emitter
-; CHECK-NEXT: Prologue/Epilogue Insertion & Frame Finalization
-; CHECK-NEXT: Post-RA pseudo instruction expansion pass
-; CHECK-NEXT: X86 pseudo instruction expansion pass
-; CHECK-NEXT: Insert KCFI indirect call checks
-; CHECK-NEXT: Analyze Machine Code For Garbage Collection
-; CHECK-NEXT: Insert fentry calls
-; CHECK-NEXT: Insert XRay ops
-; CHECK-NEXT: Implement the 'patchable-function' attribute
-; CHECK-NEXT: X86 Indirect Branch Tracking
-; CHECK-NEXT: X86 vzeroupper inserter
-; CHECK-NEXT: Compressing EVEX instrs to VEX encoding when possibl
-; CHECK-NEXT: X86 Discriminate Memory Operands
-; CHECK-NEXT: X86 Insert Cache Prefetches
-; CHECK-NEXT: X86 insert wait instruction
-; CHECK-NEXT: Contiguously Lay Out Funclets
-; CHECK-NEXT: StackMap Liveness Analysis
-; CHECK-NEXT: Live DEBUG_VALUE analysis
-; CHECK-NEXT: Machine Sanitizer Binary Metadata
-; CHECK-NEXT: Lazy Machine Block Frequency Analysis
-; CHECK-NEXT: Machine Optimization Remark Emitter
-; CHECK-NEXT: Stack Frame Layout Analysis
-; CHECK-NEXT: X86 Speculative Execution Side Effect Suppression
-; CHECK-NEXT: X86 Indirect Thunks
-; CHECK-NEXT: X86 Return Thunks
-; CHECK-NEXT: Check CFA info and insert CFI instructions if needed
-; CHECK-NEXT: X86 Load Value Injection (LVI) Ret-Hardening
-; CHECK-NEXT: Pseudo Probe Inserter
-; CHECK-NEXT: Unpack machine instruction bundles
-; CHECK-NEXT: Lazy Machine Block Frequency Analysis
-; CHECK-NEXT: Machine Optimization Remark Emitter
-; CHECK-NEXT: X86 Assembly Printer
-; CHECK-NEXT: Free MachineFunction
+; COMMON-LABEL: Pass Arguments:
+; COMMON-NEXT: Target Library Information
+; COMMON-NEXT: Target Pass Configuration
+; COMMON-NEXT: Machine Module Information
+; COMMON-NEXT: Target Transform Information
+; COMMON-NEXT: Create Garbage Collector Module Metadata
+; COMMON-NEXT: Assumption Cache Tracker
+; SDISEL-NEXT: Profile summary info
+; COMMON-NEXT: Machine Branch Probability Analysis
+; COMMON-NEXT: ModulePass Manager
+; COMMON-NEXT: Pre-ISel Intrinsic Lowering
+; COMMON-NEXT: FunctionPass Manager
+; COMMON-NEXT: Expand large div/rem
+; COMMON-NEXT: Expand large fp convert
+; COMMON-NEXT: Expand Atomic instructions
+; COMMON-NEXT: Lower AMX intrinsics
+; COMMON-NEXT: Lower AMX type for load/store
+; COMMON-NEXT: Module Verifier
+; COMMON-NEXT: Lower Garbage Collection Instructions
+; COMMON-NEXT: Shadow Stack GC Lowering
+; COMMON-NEXT: Lower constant intrinsics
+; COMMON-NEXT: Remove unreachable blocks from the CFG
+; COMMON-NEXT: Expand vector predication intrinsics
+; COMMON-NEXT: Scalarize Masked Memory Intrinsics
+; COMMON-NEXT: Expand reduction intrinsics
+; COMMON-NEXT: Expand indirectbr instructions
+; COMMON-NEXT: Exception handling preparation
+; COMMON-NEXT: Prepare callbr
+; COMMON-NEXT: Safe Stack instrumentation pass
+; COMMON-NEXT: Insert stack protectors
+; COMMON-NEXT: Module Verifier
+; SDISEL-NEXT: Assignment Tracking Analysis
+; SDISEL-NEXT: X86 DAG->DAG Instruction Selection
+; SDISEL-NEXT: X86 PIC Global Base Reg Initialization
+; SDISEL-NEXT: Argument Stack Rebase
+; GISEL-NEXT: Analysis containing CSE Info
+; GISEL-NEXT: IRTranslator
+; GISEL-NEXT: Analysis containing CSE Info
+; GISEL-NEXT: Analysis for ComputingKnownBits
+; GISEL-NEXT: Legalizer
+; GISEL-NEXT: RegBankSelect
+; GISEL-NEXT: Analysis for ComputingKnownBits
+; GISEL-NEXT: InstructionSelect
+; GISEL-NEXT: ResetMachineFunction
+; COMMON-NEXT: Finalize ISel and expand pseudo-instructions
+; COMMON-NEXT: Local Stack Slot Allocation
+; COMMON-NEXT: X86 speculative load hardening
+; COMMON-NEXT: MachineDominator Tree Construction
+; COMMON-NEXT: X86 EFLAGS copy lowering
+; COMMON-NEXT: X86 DynAlloca Expander
+; DEFAULT-NEXT: Fast Tile Register Preconfigure
+; NOFASTISEL-NEXT: MachineDominator Tree Construction
+; NOFASTISEL-NEXT: Machine Natural Loop Construction
+; NOFASTISEL-NEXT: Tile Register Pre-configure
+; COMMON-NEXT: Eliminate PHI nodes for register allocation
+; COMMON-NEXT: Two-Address instruction pass
+; COMMON-NEXT: Fast Register Allocator
+; COMMON-NEXT: Fast Tile Register Configure
+; COMMON-NEXT: X86 Lower Tile Copy
+; COMMON-NEXT: Bundle Machine CFG Edges
+; COMMON-NEXT: X86 FP Stackifier
+; COMMON-NEXT: Remove Redundant DEBUG_VALUE analysis
+; COMMON-NEXT: Fixup Statepoint Caller Saved
+; COMMON-NEXT: Lazy Machine Block Frequency Analysis
+; COMMON-NEXT: Machine Optimization Remark Emitter
+; COMMON-NEXT: Prologue/Epilogue Insertion & Frame Finalization
+; COMMON-NEXT: Post-RA pseudo instruction expansion pass
+; COMMON-NEXT: X86 pseudo instruction expansion pass
+; COMMON-NEXT: Insert KCFI indirect call checks
+; COMMON-NEXT: Analyze Machine Code For Garbage Collection
+; COMMON-NEXT: Insert fentry calls
+; COMMON-NEXT: Insert XRay ops
+; COMMON-NEXT: Implement the 'patchable-function' attribute
+; COMMON-NEXT: X86 Indirect Branch Tracking
+; COMMON-NEXT: X86 vzeroupper inserter
+; COMMON-NEXT: Compressing EVEX instrs to VEX encoding when possibl
+; COMMON-NEXT: X86 Discriminate Memory Operands
+; COMMON-NEXT: X86 Insert Cache Prefetches
+; COMMON-NEXT: X86 insert wait instruction
+; COMMON-NEXT: Contiguously Lay Out Funclets
+; COMMON-NEXT: StackMap Liveness Analysis
+; COMMON-NEXT: Live DEBUG_VALUE analysis
+; COMMON-NEXT: Machine Sanitizer Binary Metadata
+; COMMON-NEXT: Lazy Machine Block Frequency Analysis
+; COMMON-NEXT: Machine Optimization Remark Emitter
+; COMMON-NEXT: Stack Frame Layout Analysis
+; COMMON-NEXT: X86 Speculative Execution Side Effect Suppression
+; COMMON-NEXT: X86 Indirect Thunks
+; COMMON-NEXT: X86 Return Thunks
+; COMMON-NEXT: Check CFA info and insert CFI instructions if needed
+; COMMON-NEXT: X86 Load Value Injection (LVI) Ret-Hardening
+; COMMON-NEXT: Pseudo Probe Inserter
+; COMMON-NEXT: Unpack machine instruction bundles
+; COMMON-NEXT: Lazy Machine Block Frequency Analysis
+; COMMON-NEXT: Machine Optimization Remark Emitter
+; COMMON-NEXT: X86 Assembly Printer
+; COMMON-NEXT: Free MachineFunction
define void @f() {
ret void
|
This is a revision from the phabricator https://reviews.llvm.org/D157458. It's not I intentionally do not fix comments in |
const TargetRegisterClass *RC = MRI->getRegClassOrNull(VirtReg); | ||
assert(RC && "A register must have a class after FastISel"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just use getRegClass instead of doing your own assert
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably should have a registers have a class function property to go with the others that the PM would enforce
This is not the kind of problem that can be correctly solved by changing the pass set. The MIR must follow a consistent set of rules at all points. All selectors must be bound to emit MIR that respects the rules. Shuffling around which passes run with which selector is expected will not work reliably, especially with fallbacks. Every selector should be guaranteeing no virtual registers without classes are passed through |
Agreed. Also I've realized that generally we want to keep this pass for GlobalISel as it allows a fast skip at O0. Closing in favor of the original issue that leaves registers without classes. |
Closes #64452