Skip to content

Commit

Permalink
[clang-offload-bundler] Make Bundle Entry ID backward compatible
Browse files Browse the repository at this point in the history
Earlier BundleEntryID used to be <OffloadKind>-<Triple>-<GPUArch>.
This used to work because the clang-offload-bundler didn't need
GPUArch explicitly for any bundling/unbundling action. With
unbundleArchive it needs GPUArch to ensure compatibility between
device specific code objects. D93525 enforced triples to have
separators for all 4 components irrespective of number of
components, like "amdgcn-amd-amdhsa--". It was required to
to correctly parse a possible 4th environment component or a GPU.
But, this condition is breaking backward compatibility with
archive libraries compiled with compilers older than D93525.

This patch allows triples to have any number of components with
and without extra separator for empty environment field. Thus,
both the following bundle entry IDs are same:
openmp-amdgcn-amd-amdhsa--gfx906
openmp-amdgcn-amd-amdhsa-gfx906

Reviewed By: yaxunl, grokos

Differential Revision: https://reviews.llvm.org/D106809
  • Loading branch information
saiislam committed Sep 8, 2021
1 parent 7fb66d4 commit 9838076
Show file tree
Hide file tree
Showing 6 changed files with 73 additions and 56 deletions.
10 changes: 1 addition & 9 deletions clang/docs/ClangOffloadBundler.rst
Expand Up @@ -121,15 +121,7 @@ Where:
============= ==============================================================

**target-triple**
The target triple of the code object:

.. code::
<Architecture>-<Vendor>-<OS>-<Environment>
It is required to have all four components present, if target-id is present.
Components are hyphen separated. If a component is not specified then the
empty string must be used in its place.
The target triple of the code object.

**target-id**
The canonical target ID of the code object. Present only if the target
Expand Down
34 changes: 13 additions & 21 deletions clang/lib/Driver/ToolChains/Clang.cpp
Expand Up @@ -7686,16 +7686,12 @@ void OffloadBundler::ConstructJob(Compilation &C, const JobAction &JA,
});
}
Triples += Action::GetOffloadKindName(CurKind);
Triples += "-";
std::string NormalizedTriple = CurTC->getTriple().normalize();
Triples += NormalizedTriple;

if (CurDep->getOffloadingArch() != nullptr) {
// If OffloadArch is present it can only appear as the 6th hypen
// sepearated field of Bundle Entry ID. So, pad required number of
// hyphens in Triple.
for (int i = 4 - StringRef(NormalizedTriple).count("-"); i > 0; i--)
Triples += "-";
Triples += '-';
Triples += CurTC->getTriple().normalize();
if ((CurKind == Action::OFK_HIP || CurKind == Action::OFK_OpenMP ||
CurKind == Action::OFK_Cuda) &&
CurDep->getOffloadingArch()) {
Triples += '-';
Triples += CurDep->getOffloadingArch();
}
}
Expand Down Expand Up @@ -7768,17 +7764,13 @@ void OffloadBundler::ConstructJobMultipleOutputs(

auto &Dep = DepInfo[I];
Triples += Action::GetOffloadKindName(Dep.DependentOffloadKind);
Triples += "-";
std::string NormalizedTriple =
Dep.DependentToolChain->getTriple().normalize();
Triples += NormalizedTriple;

if (!Dep.DependentBoundArch.empty()) {
// If OffloadArch is present it can only appear as the 6th hypen
// sepearated field of Bundle Entry ID. So, pad required number of
// hyphens in Triple.
for (int i = 4 - StringRef(NormalizedTriple).count("-"); i > 0; i--)
Triples += "-";
Triples += '-';
Triples += Dep.DependentToolChain->getTriple().normalize();
if ((Dep.DependentOffloadKind == Action::OFK_HIP ||
Dep.DependentOffloadKind == Action::OFK_OpenMP ||
Dep.DependentOffloadKind == Action::OFK_Cuda) &&
!Dep.DependentBoundArch.empty()) {
Triples += '-';
Triples += Dep.DependentBoundArch;
}
}
Expand Down
20 changes: 17 additions & 3 deletions clang/test/Driver/clang-offload-bundler.c
Expand Up @@ -382,16 +382,30 @@
// Check archive unbundling
//
// Create few code object bundles and archive them to create an input archive
// RUN: clang-offload-bundler -type=o -targets=host-%itanium_abi_triple,openmp-amdgcn-amd-amdhsa--gfx906,openmp-amdgcn-amd-amdhsa--gfx908 -inputs=%t.o,%t.tgt1,%t.tgt2 -outputs=%t.simple.bundle
// RUN: clang-offload-bundler -type=o -targets=host-%itanium_abi_triple,openmp-amdgcn-amd-amdhsa-gfx906,openmp-amdgcn-amd-amdhsa--gfx908 -inputs=%t.o,%t.tgt1,%t.tgt2 -outputs=%t.simple.bundle
// RUN: clang-offload-bundler -type=o -targets=host-%itanium_abi_triple,openmp-amdgcn-amd-amdhsa--gfx903 -inputs=%t.o,%t.tgt1 -outputs=%t.simple1.bundle
// RUN: llvm-ar cr %t.input-archive.a %t.simple.bundle %t.simple1.bundle

// RUN: clang-offload-bundler -unbundle -type=a -targets=openmp-amdgcn-amd-amdhsa--gfx906,openmp-amdgcn-amd-amdhsa--gfx908 -inputs=%t.input-archive.a -outputs=%t-archive-gfx906-simple.a,%t-archive-gfx908-simple.a
// RUN: clang-offload-bundler -unbundle -type=a -targets=openmp-amdgcn-amd-amdhsa-gfx906,openmp-amdgcn-amd-amdhsa-gfx908 -inputs=%t.input-archive.a -outputs=%t-archive-gfx906-simple.a,%t-archive-gfx908-simple.a
// RUN: llvm-ar t %t-archive-gfx906-simple.a | FileCheck %s -check-prefix=GFX906
// GFX906: simple-openmp-amdgcn-amd-amdhsa--gfx906
// GFX906: simple-openmp-amdgcn-amd-amdhsa-gfx906
// RUN: llvm-ar t %t-archive-gfx908-simple.a | FileCheck %s -check-prefix=GFX908
// GFX908-NOT: {{gfx906}}

// Check for error if no compatible code object is found in the heterogeneous archive library
// RUN: not clang-offload-bundler -unbundle -type=a -targets=openmp-amdgcn-amd-amdhsa-gfx803 -inputs=%t.input-archive.a -outputs=%t-archive-gfx803-incompatible.a 2>&1 | FileCheck %s -check-prefix=INCOMPATIBLEARCHIVE
// INCOMPATIBLEARCHIVE: error: no compatible code object found for the target 'openmp-amdgcn-amd-amdhsa-gfx803' in heterogeneous archive library

// Check creation of empty archive if allow-missing-bundles is present and no compatible code object is found in the heterogeneous archive library
// RUN: clang-offload-bundler -unbundle -type=a -targets=openmp-amdgcn-amd-amdhsa-gfx803 -inputs=%t.input-archive.a -outputs=%t-archive-gfx803-empty.a -allow-missing-bundles
// RUN: cat %t-archive-gfx803-empty.a | FileCheck %s -check-prefix=EMPTYARCHIVE
// EMPTYARCHIVE: !<arch>

// Tests to check compatibility between Bundle Entry ID formats i.e. between presence/absence of extra hyphen in case of missing environment field
// RUN: clang-offload-bundler -unbundle -type=a -targets=openmp-amdgcn-amd-amdhsa--gfx906,openmp-amdgcn-amd-amdhsa-gfx908 -inputs=%t.input-archive.a -outputs=%t-archive-gfx906-simple.a,%t-archive-gfx908-simple.a -debug-only=CodeObjectCompatibility 2>&1 | FileCheck %s -check-prefix=BUNDLECOMPATIBILITY
// BUNDLECOMPATIBILITY: Compatible: Exact match: [CodeObject: openmp-amdgcn-amd-amdhsa-gfx906] : [Target: openmp-amdgcn-amd-amdhsa--gfx906]
// BUNDLECOMPATIBILITY: Compatible: Exact match: [CodeObject: openmp-amdgcn-amd-amdhsa--gfx908] : [Target: openmp-amdgcn-amd-amdhsa-gfx908]

// Some code so that we can create a binary out of this file.
int A = 0;
void test_func(void) {
Expand Down
8 changes: 4 additions & 4 deletions clang/test/Driver/hip-rdc-device-only.hip
Expand Up @@ -82,7 +82,7 @@
// COMMON-SAME: {{.*}} {{".*a.cu"}}

// COMMON: "{{.*}}clang-offload-bundler" "-type={{(bc|ll)}}"
// COMMON-SAME: "-targets=hip-amdgcn-amd-amdhsa--gfx803,hip-amdgcn-amd-amdhsa--gfx900"
// COMMON-SAME: "-targets=hip-amdgcn-amd-amdhsa-gfx803,hip-amdgcn-amd-amdhsa-gfx900"
// COMMON-SAME: "-outputs=a-hip-amdgcn-amd-amdhsa.{{(bc|ll)}}"

// COMMON: [[CLANG]] "-cc1" "-triple" "amdgcn-amd-amdhsa"
Expand Down Expand Up @@ -112,7 +112,7 @@
// COMMON-SAME: {{.*}} {{".*b.hip"}}

// COMMON: "{{.*}}clang-offload-bundler" "-type={{(bc|ll)}}"
// COMMON-SAME: "-targets=hip-amdgcn-amd-amdhsa--gfx803,hip-amdgcn-amd-amdhsa--gfx900"
// COMMON-SAME: "-targets=hip-amdgcn-amd-amdhsa-gfx803,hip-amdgcn-amd-amdhsa-gfx900"
// COMMON-SAME: "-outputs=b-hip-amdgcn-amd-amdhsa.{{(bc|ll)}}"

// SAVETEMP: [[CLANG:".*clang.*"]] "-cc1" "-triple" "amdgcn-amd-amdhsa" "-aux-triple" "x86_64-unknown-linux-gnu"
Expand Down Expand Up @@ -142,7 +142,7 @@
// SAVETEMP-SAME: {{.*}} "-o" {{"a.*.ll"}} "-x" "ir" [[A_GFX900_TMP_BC]]

// SAVETEMP: "{{.*}}clang-offload-bundler" "-type=ll"
// SAVETEMP-SAME: "-targets=hip-amdgcn-amd-amdhsa--gfx803,hip-amdgcn-amd-amdhsa--gfx900"
// SAVETEMP-SAME: "-targets=hip-amdgcn-amd-amdhsa-gfx803,hip-amdgcn-amd-amdhsa-gfx900"
// SAVETEMP-SAME: "-outputs=a-hip-amdgcn-amd-amdhsa.ll"

// SAVETEMP: [[CLANG]] "-cc1" "-triple" "amdgcn-amd-amdhsa" "-aux-triple" "x86_64-unknown-linux-gnu"
Expand Down Expand Up @@ -172,7 +172,7 @@
// SAVETEMP-SAME: {{.*}} "-o" {{"b.*.ll"}} "-x" "ir" [[B_GFX900_TMP_BC]]

// SAVETEMP: "{{.*}}clang-offload-bundler" "-type=ll"
// SAVETEMP-SAME: "-targets=hip-amdgcn-amd-amdhsa--gfx803,hip-amdgcn-amd-amdhsa--gfx900"
// SAVETEMP-SAME: "-targets=hip-amdgcn-amd-amdhsa-gfx803,hip-amdgcn-amd-amdhsa-gfx900"
// SAVETEMP-SAME: "-outputs=b-hip-amdgcn-amd-amdhsa.ll"

// FAIL: error: cannot specify -o when generating multiple output files
12 changes: 6 additions & 6 deletions clang/test/Driver/hip-toolchain-rdc-separate.hip
Expand Up @@ -44,7 +44,7 @@
// CHECK-SAME: {{.*}} [[A_SRC]]

// CHECK: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
// CHECK-SAME: "-targets=hip-amdgcn-amd-amdhsa--gfx803,hip-amdgcn-amd-amdhsa--gfx900,host-x86_64-unknown-linux-gnu"
// CHECK-SAME: "-targets=hip-amdgcn-amd-amdhsa-gfx803,hip-amdgcn-amd-amdhsa-gfx900,host-x86_64-unknown-linux-gnu"
// CHECK-SAME: "-outputs=[[A_O:.*a.o]]" "-inputs=[[A_BC1]],[[A_BC2]],[[A_OBJ_HOST]]"

// CHECK: [[CLANG]] "-cc1" "-triple" "amdgcn-amd-amdhsa"
Expand Down Expand Up @@ -79,7 +79,7 @@
// CHECK-SAME: {{.*}} [[B_SRC]]

// CHECK: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
// CHECK-SAME: "-targets=hip-amdgcn-amd-amdhsa--gfx803,hip-amdgcn-amd-amdhsa--gfx900,host-x86_64-unknown-linux-gnu"
// CHECK-SAME: "-targets=hip-amdgcn-amd-amdhsa-gfx803,hip-amdgcn-amd-amdhsa-gfx900,host-x86_64-unknown-linux-gnu"
// CHECK-SAME: "-outputs=[[B_O:.*b.o]]" "-inputs=[[B_BC1]],[[B_BC2]],[[B_OBJ_HOST]]"

// RUN: touch %T/a.o
Expand All @@ -91,22 +91,22 @@
// RUN: 2>&1 | FileCheck -check-prefix=LINK %s

// LINK: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
// LINK-SAME: "-targets=host-x86_64-unknown-linux-gnu,hip-amdgcn-amd-amdhsa--gfx803,hip-amdgcn-amd-amdhsa--gfx900"
// LINK-SAME: "-targets=host-x86_64-unknown-linux-gnu,hip-amdgcn-amd-amdhsa-gfx803,hip-amdgcn-amd-amdhsa-gfx900"
// LINK-SAME: "-inputs=[[A_O:.*a.o]]" "-outputs=[[A_OBJ_HOST:.*o]],{{.*o}},{{.*o}}"
// LINK: "-unbundle" "-allow-missing-bundles"

// LINK: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
// LINK-SAME: "-targets=host-x86_64-unknown-linux-gnu,hip-amdgcn-amd-amdhsa--gfx803,hip-amdgcn-amd-amdhsa--gfx900"
// LINK-SAME: "-targets=host-x86_64-unknown-linux-gnu,hip-amdgcn-amd-amdhsa-gfx803,hip-amdgcn-amd-amdhsa-gfx900"
// LINK-SAME: "-inputs=[[B_O:.*b.o]]" "-outputs=[[B_OBJ_HOST:.*o]],{{.*o}},{{.*o}}"
// LINK: "-unbundle" "-allow-missing-bundles"

// LINK: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
// LINK-SAME: "-targets=host-x86_64-unknown-linux-gnu,hip-amdgcn-amd-amdhsa--gfx803,hip-amdgcn-amd-amdhsa--gfx900"
// LINK-SAME: "-targets=host-x86_64-unknown-linux-gnu,hip-amdgcn-amd-amdhsa-gfx803,hip-amdgcn-amd-amdhsa-gfx900"
// LINK-SAME: "-inputs=[[A_O]]" "-outputs={{.*o}},[[A_BC1:.*o]],[[A_BC2:.*o]]"
// LINK: "-unbundle" "-allow-missing-bundles"

// LINK: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
// LINK-SAME: "-targets=host-x86_64-unknown-linux-gnu,hip-amdgcn-amd-amdhsa--gfx803,hip-amdgcn-amd-amdhsa--gfx900"
// LINK-SAME: "-targets=host-x86_64-unknown-linux-gnu,hip-amdgcn-amd-amdhsa-gfx803,hip-amdgcn-amd-amdhsa-gfx900"
// LINK-SAME: "-inputs=[[B_O]]" "-outputs={{.*o}},[[B_BC1:.*o]],[[B_BC2:.*o]]"
// LINK: "-unbundle" "-allow-missing-bundles"

Expand Down
45 changes: 32 additions & 13 deletions clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp
Expand Up @@ -14,6 +14,7 @@
///
//===----------------------------------------------------------------------===//

#include "clang/Basic/Cuda.h"
#include "clang/Basic/Version.h"
#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/SmallString.h"
Expand Down Expand Up @@ -134,21 +135,28 @@ static std::string BundlerExecutable;
/// * Offload Kind - Host, OpenMP, or HIP
/// * Triple - Standard LLVM Triple
/// * GPUArch (Optional) - Processor name, like gfx906 or sm_30
/// In presence of Proc, the Triple should contain separator "-" for all
/// standard four components, even if they are empty.

struct OffloadTargetInfo {
StringRef OffloadKind;
llvm::Triple Triple;
StringRef GPUArch;

OffloadTargetInfo(const StringRef Target) {
SmallVector<StringRef, 6> Components;
Target.split(Components, '-', 5);
Components.resize(6);
this->OffloadKind = Components[0];
this->Triple = llvm::Triple(Components[1], Components[2], Components[3],
Components[4]);
this->GPUArch = Components[5];
auto TargetFeatures = Target.split(':');
auto TripleOrGPU = TargetFeatures.first.rsplit('-');

if (clang::StringToCudaArch(TripleOrGPU.second) !=
clang::CudaArch::UNKNOWN) {
auto KindTriple = TripleOrGPU.first.split('-');
this->OffloadKind = KindTriple.first;
this->Triple = llvm::Triple(KindTriple.second);
this->GPUArch = Target.substr(Target.find(TripleOrGPU.second));
} else {
auto KindTriple = TargetFeatures.first.split('-');
this->OffloadKind = KindTriple.first;
this->Triple = llvm::Triple(KindTriple.second);
this->GPUArch = "";
}
}

bool hasHostKind() const { return this->OffloadKind == "host"; }
Expand Down Expand Up @@ -1063,9 +1071,10 @@ bool isCodeObjectCompatible(OffloadTargetInfo &CodeObjectInfo,

// Compatible in case of exact match.
if (CodeObjectInfo == TargetInfo) {
DEBUG_WITH_TYPE(
"CodeObjectCompatibility",
dbgs() << "Compatible: Exact match: " << CodeObjectInfo.str() << "\n");
DEBUG_WITH_TYPE("CodeObjectCompatibility",
dbgs() << "Compatible: Exact match: \t[CodeObject: "
<< CodeObjectInfo.str()
<< "]\t:\t[Target: " << TargetInfo.str() << "]\n");
return true;
}

Expand Down Expand Up @@ -1276,9 +1285,19 @@ static Error UnbundleArchive() {
} else if (!AllowMissingBundles) {
std::string ErrMsg =
Twine("no compatible code object found for the target '" + Target +
"' in heterogenous archive library: " + IFName)
"' in heterogeneous archive library: " + IFName)
.str();
return createStringError(inconvertibleErrorCode(), ErrMsg);
} else { // Create an empty archive file if no compatible code object is
// found and "allow-missing-bundles" is enabled. It ensures that
// the linker using output of this step doesn't complain about
// the missing input file.
std::vector<llvm::NewArchiveMember> EmptyArchive;
EmptyArchive.clear();
if (Error WriteErr = writeArchive(FileName, EmptyArchive, true,
getDefaultArchiveKindForHost(), true,
false, nullptr))
return WriteErr;
}
}

Expand Down

0 comments on commit 9838076

Please sign in to comment.