[AMDGPU] Fix getEUsPerCU for gfx10 in CU mode
Summary:
"Per CU" is a bit simplistic for gfx10, but I couldn't think of a better
name.

Reviewers: arsenm, rampitec, nhaehnle, dstuttard, tpr

Subscribers: kzhuravl, jvesely, wdng, yaxunl, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76861
jayfoad committed Mar 27, 2020
1 parent 152d14d commit a6dfd82
Showing 2 changed files with 566 additions and 0 deletions.
7 changes: 7 additions & 0 deletions llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
@@ -268,6 +268,13 @@ unsigned getLocalMemorySize(const MCSubtargetInfo *STI) {
}

unsigned getEUsPerCU(const MCSubtargetInfo *STI) {
  // "Per CU" really means "per whatever functional block the waves of a
  // workgroup must share". For gfx10 in CU mode this is the CU, which contains
  // two SIMDs.
  if (isGFX10(*STI) && STI->getFeatureBits().test(FeatureCuMode))
    return 2;
  // Pre-gfx10 a CU contains four SIMDs. For gfx10 in WGP mode the WGP contains
  // two CUs, so a total of four SIMDs.
  return 4;
}
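
The three cases described in the comments above (pre-gfx10, gfx10 in CU mode, gfx10 in WGP mode) can be checked in isolation with a small standalone model. This is only a sketch: `SubtargetModel` and `getEUsPerCUModel` are hypothetical names invented for illustration and are not part of LLVM; they stand in for the `isGFX10(*STI)` and `FeatureCuMode` queries used in the patch.

```cpp
// Hypothetical stand-in for the two subtarget properties the patch inspects.
// This is NOT the LLVM MCSubtargetInfo API.
struct SubtargetModel {
  bool IsGFX10; // models isGFX10(*STI)
  bool CuMode;  // models STI->getFeatureBits().test(FeatureCuMode)
};

// Mirrors the decision in getEUsPerCU: the number of SIMDs in the functional
// block that the waves of a workgroup must share.
constexpr unsigned getEUsPerCUModel(const SubtargetModel &ST) {
  // gfx10 in CU mode: the shared block is a single CU with two SIMDs.
  if (ST.IsGFX10 && ST.CuMode)
    return 2;
  // Pre-gfx10: a CU has four SIMDs.
  // gfx10 in WGP mode: the WGP has two CUs with two SIMDs each.
  return 4;
}

// Compile-time checks of the three cases named in the source comments.
static_assert(getEUsPerCUModel({/*IsGFX10=*/true, /*CuMode=*/true}) == 2,
              "gfx10, CU mode");
static_assert(getEUsPerCUModel({/*IsGFX10=*/true, /*CuMode=*/false}) == 4,
              "gfx10, WGP mode");
static_assert(getEUsPerCUModel({/*IsGFX10=*/false, /*CuMode=*/false}) == 4,
              "pre-gfx10");
```

The model treats a set CU-mode flag on a non-gfx10 target as irrelevant, which matches the `&&` in the patched condition.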
