Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[OpenMP] Reorganize the initialization of PluginAdaptorTy #74397

Merged
merged 1 commit into from
Dec 6, 2023

Conversation

jdoerfert
Copy link
Member

This introduces checked errors into the creation and initialization of PluginAdaptorTy. We also allow the adaptor to "hide" devices from the user if the initialization failed. The new organization avoids the "initOnce" stuff but we still do not eagerly initialize the plugin devices (I think we should merge PluginAdaptorTy::initDevices into PluginAdaptorTy::init)

This introduces checked errors into the creation and initialization of
`PluginAdaptorTy`. We also allow the adaptor to "hide" devices from the
user if the initialization failed. The new organization avoids the
"initOnce" stuff but we still do not eagerly initialize the plugin
devices (I think we should merge `PluginAdaptorTy::initDevices` into
`PluginAdaptorTy::init`)
@jdoerfert jdoerfert added openmp openmp:libomptarget OpenMP offload runtime labels Dec 5, 2023
@llvmbot
Copy link
Collaborator

llvmbot commented Dec 5, 2023

@llvm/pr-subscribers-openmp

Author: Johannes Doerfert (jdoerfert)

Changes

This introduces checked errors into the creation and initialization of PluginAdaptorTy. We also allow the adaptor to "hide" devices from the user if the initialization failed. The new organization avoids the "initOnce" stuff but we still do not eagerly initialize the plugin devices (I think we should merge PluginAdaptorTy::initDevices into PluginAdaptorTy::init)


Full diff: https://github.com/llvm/llvm-project/pull/74397.diff

5 Files Affected:

  • (modified) openmp/libomptarget/include/PluginManager.h (+38-14)
  • (modified) openmp/libomptarget/include/device.h (+4-8)
  • (modified) openmp/libomptarget/plugins-nextgen/common/src/PluginInterface.cpp (+1-1)
  • (modified) openmp/libomptarget/src/PluginManager.cpp (+78-53)
  • (modified) openmp/libomptarget/src/device.cpp (+16-23)
diff --git a/openmp/libomptarget/include/PluginManager.h b/openmp/libomptarget/include/PluginManager.h
index bc71e5d70474b..0b0974709b525 100644
--- a/openmp/libomptarget/include/PluginManager.h
+++ b/openmp/libomptarget/include/PluginManager.h
@@ -34,13 +34,29 @@
 #include <mutex>
 #include <string>
 
+struct PluginManager;
+
+/// Plugin adaptors should be created via `PluginAdaptorTy::create` which will
+/// invoke the constructor and call `PluginAdaptorTy::init`. Eventual errors are
+/// reported back to the caller, otherwise a valid and initialized adaptor is
+/// returned.
 struct PluginAdaptorTy {
-  PluginAdaptorTy(const std::string &Name);
+  /// Try to create a plugin adaptor from a filename.
+  static llvm::Expected<std::unique_ptr<PluginAdaptorTy>>
+  create(const std::string &Name);
+
+  /// Initialize as many devices as possible for this plugin adaptor. Devices
+  /// that fail to initialize are ignored.
+  void initDevices(PluginManager &PM);
 
   bool isUsed() const { return DeviceOffset >= 0; }
 
-  /// Return the number of devices available to this plugin.
-  int32_t getNumDevices() const { return NumberOfDevices; }
+  /// Return the number of devices visible to the underlying plugin.
+  int32_t getNumberOfPluginDevices() const { return NumberOfPluginDevices; }
+
+  /// Return the number of devices successfully initialized and visible to the
+  /// user.
+  int32_t getNumberOfUserDevices() const { return NumberOfUserDevices; }
 
   /// Add all offload entries described by \p DI to the devices managed by this
   /// plugin.
@@ -51,9 +67,6 @@ struct PluginAdaptorTy {
   /// registered with this RTL.
   int32_t DeviceOffset = -1;
 
-  /// Number of devices this RTL deals with.
-  int32_t NumberOfDevices = -1;
-
   /// Name of the shared object file representing the plugin.
   std::string Name;
 
@@ -73,6 +86,22 @@ struct PluginAdaptorTy {
   // It is easier to enforce thread-safety at the libomptarget level,
   // so that developers of new RTLs do not have to worry about it.
   std::mutex Mtx;
+
+private:
+  /// Number of devices the underling plugins sees.
+  int32_t NumberOfPluginDevices = -1;
+
+  /// Number of devices exposed to the user. This can be less than the number of
+  /// devices for the plugin if some failed to initialize.
+  int32_t NumberOfUserDevices = 0;
+
+  /// Create a plugin adaptor for filename \p Name with a dynamic library \p DL.
+  PluginAdaptorTy(const std::string &Name,
+                  std::unique_ptr<llvm::sys::DynamicLibrary> DL);
+
+  /// Initialize the plugin adaptor, this can fail in which case the adaptor is
+  /// useless.
+  llvm::Error init();
 };
 
 /// Struct for the data required to handle plugins
@@ -150,20 +179,15 @@ struct PluginManager {
   int getNumUsedPlugins() const {
     int NCI = 0;
     for (auto &P : PluginAdaptors)
-      NCI += P.isUsed();
+      NCI += P->isUsed();
     return NCI;
   }
 
-  // Initialize \p Plugin if it has not been initialized.
-  void initPlugin(PluginAdaptorTy &Plugin);
-
   // Initialize all plugins.
   void initAllPlugins();
 
   /// Iterator range for all plugin adaptors (in use or not, but always valid).
-  auto pluginAdaptors() {
-    return llvm::make_range(PluginAdaptors.begin(), PluginAdaptors.end());
-  }
+  auto pluginAdaptors() { return llvm::make_pointee_range(PluginAdaptors); }
 
   /// Return the user provided requirements.
   int64_t getRequirements() const { return Requirements.getRequirements(); }
@@ -176,7 +200,7 @@ struct PluginManager {
   llvm::SmallVector<__tgt_bin_desc *> DelayedBinDesc;
 
   // List of all plugin adaptors, in use or not.
-  std::list<PluginAdaptorTy> PluginAdaptors;
+  llvm::SmallVector<std::unique_ptr<PluginAdaptorTy>> PluginAdaptors;
 
   /// Executable images and information extracted from the input images passed
   /// to the runtime.
diff --git a/openmp/libomptarget/include/device.h b/openmp/libomptarget/include/device.h
index 5146fc1444b44..fb9e4f572b47c 100644
--- a/openmp/libomptarget/include/device.h
+++ b/openmp/libomptarget/include/device.h
@@ -51,8 +51,6 @@ struct DeviceTy {
   PluginAdaptorTy *RTL;
   int32_t RTLDeviceID;
 
-  bool IsInit;
-  std::once_flag InitFlag;
   bool HasMappedGlobalData = false;
 
   /// Host data to device map type with a wrapper key indirection that allows
@@ -72,13 +70,16 @@ struct DeviceTy {
 
   std::mutex PendingGlobalsMtx;
 
-  DeviceTy(PluginAdaptorTy *RTL);
+  DeviceTy(PluginAdaptorTy *RTL, int32_t DeviceID, int32_t RTLDeviceID);
   // DeviceTy is not copyable
   DeviceTy(const DeviceTy &D) = delete;
   DeviceTy &operator=(const DeviceTy &D) = delete;
 
   ~DeviceTy();
 
+  /// Try to initialize the device and return any failure.
+  llvm::Error init();
+
   // Return true if data can be copied to DstDevice directly
   bool isDataExchangable(const DeviceTy &DstDevice);
 
@@ -145,8 +146,6 @@ struct DeviceTy {
   int associatePtr(void *HstPtrBegin, void *TgtPtrBegin, int64_t Size);
   int disassociatePtr(void *HstPtrBegin);
 
-  // calls to RTL
-  int32_t initOnce();
   __tgt_target_table *loadBinary(__tgt_device_image *Img);
 
   // device memory allocation/deallocation routines
@@ -234,9 +233,6 @@ struct DeviceTy {
   void dumpOffloadEntries();
 
 private:
-  // Call to RTL
-  void init(); // To be called only via DeviceTy::initOnce()
-
   /// Deinitialize the device (and plugin).
   void deinit();
 
diff --git a/openmp/libomptarget/plugins-nextgen/common/src/PluginInterface.cpp b/openmp/libomptarget/plugins-nextgen/common/src/PluginInterface.cpp
index 5a3fd140f27a3..85fc346207d18 100644
--- a/openmp/libomptarget/plugins-nextgen/common/src/PluginInterface.cpp
+++ b/openmp/libomptarget/plugins-nextgen/common/src/PluginInterface.cpp
@@ -1704,7 +1704,7 @@ int32_t __tgt_rtl_number_of_devices() { return Plugin::get().getNumDevices(); }
 
 int64_t __tgt_rtl_init_requires(int64_t RequiresFlags) {
   Plugin::get().setRequiresFlag(RequiresFlags);
-  return RequiresFlags;
+  return OFFLOAD_SUCCESS;
 }
 
 int32_t __tgt_rtl_is_data_exchangable(int32_t SrcDeviceId,
diff --git a/openmp/libomptarget/src/PluginManager.cpp b/openmp/libomptarget/src/PluginManager.cpp
index 931143ad2347d..7b50bb6167f21 100644
--- a/openmp/libomptarget/src/PluginManager.cpp
+++ b/openmp/libomptarget/src/PluginManager.cpp
@@ -15,6 +15,7 @@
 
 #include "llvm/Support/Error.h"
 #include "llvm/Support/ErrorHandling.h"
+#include <memory>
 
 using namespace llvm;
 using namespace llvm::sys;
@@ -30,27 +31,43 @@ static const char *RTLNames[] = {
     /* AMDGPU target        */ "libomptarget.rtl.amdgpu",
 };
 
-PluginAdaptorTy::PluginAdaptorTy(const std::string &Name) : Name(Name) {
+Expected<std::unique_ptr<PluginAdaptorTy>>
+PluginAdaptorTy::create(const std::string &Name) {
   DP("Attempting to load library '%s'...\n", Name.c_str());
 
   std::string ErrMsg;
-  LibraryHandler = std::make_unique<DynamicLibrary>(
+  auto LibraryHandler = std::make_unique<DynamicLibrary>(
       DynamicLibrary::getPermanentLibrary(Name.c_str(), &ErrMsg));
 
   if (!LibraryHandler->isValid()) {
     // Library does not exist or cannot be found.
-    DP("Unable to load library '%s': %s!\n", Name.c_str(), ErrMsg.c_str());
-    return;
+    return createStringError(inconvertibleErrorCode(),
+                             "Unable to load library '%s': %s!\n", Name.c_str(),
+                             ErrMsg.c_str());
   }
 
   DP("Successfully loaded library '%s'!\n", Name.c_str());
+  auto PluginAdaptor = std::unique_ptr<PluginAdaptorTy>(
+      new PluginAdaptorTy(Name, std::move(LibraryHandler)));
+  if (auto Err = PluginAdaptor->init())
+    return Err;
+  return PluginAdaptor;
+}
+
+PluginAdaptorTy::PluginAdaptorTy(const std::string &Name,
+                                 std::unique_ptr<llvm::sys::DynamicLibrary> DL)
+    : Name(Name), LibraryHandler(std::move(DL)) {}
+
+Error PluginAdaptorTy::init() {
 
 #define PLUGIN_API_HANDLE(NAME, MANDATORY)                                     \
   NAME = reinterpret_cast<decltype(NAME)>(                                     \
       LibraryHandler->getAddressOfSymbol(GETNAME(__tgt_rtl_##NAME)));          \
   if (MANDATORY && !NAME) {                                                    \
-    DP("Invalid plugin as necessary interface is not found.\n");               \
-    return;                                                                    \
+    return createStringError(inconvertibleErrorCode(),                         \
+                             "Invalid plugin as necessary interface function " \
+                             "(%s) was not found.\n",                          \
+                             NAME);                                            \
   }
 
 #include "Shared/PluginAPI.inc"
@@ -59,22 +76,25 @@ PluginAdaptorTy::PluginAdaptorTy(const std::string &Name) : Name(Name) {
   // Remove plugin on failure to call optional init_plugin
   int32_t Rc = init_plugin();
   if (Rc != OFFLOAD_SUCCESS) {
-    DP("Unable to initialize library '%s': %u!\n", Name.c_str(), Rc);
-    return;
+    return createStringError(inconvertibleErrorCode(),
+                             "Unable to initialize library '%s': %u!\n",
+                             Name.c_str(), Rc);
   }
 
   // No devices are supported by this RTL?
-  NumberOfDevices = number_of_devices();
-  if (!NumberOfDevices) {
-    DP("No devices supported in this RTL\n");
-    return;
+  NumberOfPluginDevices = number_of_devices();
+  if (!NumberOfPluginDevices) {
+    return createStringError(inconvertibleErrorCode(),
+                             "No devices supported in this RTL\n");
   }
 
-  DP("Registered '%s' with %d devices!\n", Name.c_str(), NumberOfDevices);
+  DP("Registered '%s' with %d plugin visible devices!\n", Name.c_str(),
+     NumberOfPluginDevices);
+  return Error::success();
 }
 
 void PluginAdaptorTy::addOffloadEntries(DeviceImageTy &DI) {
-  for (int32_t I = 0; I < NumberOfDevices; ++I) {
+  for (int32_t I = 0, E = getNumberOfUserDevices(); I < E; ++I) {
     auto DeviceOrErr = PM->getDevice(DeviceOffset + I);
     if (!DeviceOrErr)
       FATAL_MESSAGE(DeviceOffset + I, "%s",
@@ -92,45 +112,61 @@ void PluginManager::init() {
   // Attempt to open all the plugins and, if they exist, check if the interface
   // is correct and if they are supporting any devices.
   for (const char *Name : RTLNames) {
-    PluginAdaptors.emplace_back(std::string(Name) + ".so");
-    if (PluginAdaptors.back().getNumDevices() <= 0)
-      PluginAdaptors.pop_back();
+    auto PluginAdaptorOrErr =
+        PluginAdaptorTy::create(std::string(Name) + ".so");
+    if (!PluginAdaptorOrErr) {
+      [[maybe_unused]] std::string InfoMsg =
+          toString(PluginAdaptorOrErr.takeError());
+      DP("%s", InfoMsg.c_str());
+    } else {
+      PluginAdaptors.push_back(std::move(*PluginAdaptorOrErr));
+    }
   }
 
   DP("RTLs loaded!\n");
 }
 
-void PluginManager::initPlugin(PluginAdaptorTy &Plugin) {
-  // If this RTL is not already in use, initialize it.
-  if (Plugin.isUsed() || !Plugin.NumberOfDevices)
+void PluginAdaptorTy::initDevices(PluginManager &PM) {
+  if (isUsed())
     return;
 
+  // If this RTL is not already in use, initialize it.
+  assert(getNumberOfPluginDevices() > 0 &&
+         "Tried to initialize useless plugin adaptor");
+
   // Initialize the device information for the RTL we are about to use.
-  auto ExclusiveDevicesAccessor = getExclusiveDevicesAccessor();
-  const size_t Start = ExclusiveDevicesAccessor->size();
-  ExclusiveDevicesAccessor->reserve(Start + Plugin.NumberOfDevices);
-  for (int32_t DeviceId = 0; DeviceId < Plugin.NumberOfDevices; DeviceId++) {
-    ExclusiveDevicesAccessor->push_back(std::make_unique<DeviceTy>(&Plugin));
-    // global device ID
-    (*ExclusiveDevicesAccessor)[Start + DeviceId]->DeviceID = Start + DeviceId;
-    // RTL local device ID
-    (*ExclusiveDevicesAccessor)[Start + DeviceId]->RTLDeviceID = DeviceId;
-  }
+  auto ExclusiveDevicesAccessor = PM.getExclusiveDevicesAccessor();
 
   // Initialize the index of this RTL and save it in the used RTLs.
-  Plugin.DeviceOffset = Start;
+  DeviceOffset = ExclusiveDevicesAccessor->size();
 
   // If possible, set the device identifier offset in the plugin.
-  if (Plugin.set_device_offset)
-    Plugin.set_device_offset(Start);
+  if (set_device_offset)
+    set_device_offset(DeviceOffset);
+
+  int32_t NumPD = getNumberOfPluginDevices();
+  ExclusiveDevicesAccessor->reserve(DeviceOffset + NumPD);
+  for (int32_t PDevI = 0, UserDevId = DeviceOffset; PDevI < NumPD; PDevI++) {
+    auto Device = std::make_unique<DeviceTy>(this, UserDevId, PDevI);
+    if (auto Err = Device->init()) {
+      DP("Skip plugin known device %d: %s\n", PDevI,
+         toString(std::move(Err)).c_str());
+      continue;
+    }
+
+    ExclusiveDevicesAccessor->push_back(std::move(Device));
+    ++NumberOfUserDevices;
+    ++UserDevId;
+  }
 
-  DP("RTL " DPxMOD " has index %d!\n", DPxPTR(Plugin.LibraryHandler.get()),
-     Plugin.DeviceOffset);
+  DP("Plugin adaptor " DPxMOD " has index %d, exposes %d out of %d devices!\n",
+     DPxPTR(LibraryHandler.get()), DeviceOffset, NumberOfUserDevices,
+     NumberOfPluginDevices);
 }
 
 void PluginManager::initAllPlugins() {
   for (auto &R : PluginAdaptors)
-    initPlugin(R);
+    R->initDevices(*this);
 }
 
 static void registerImageIntoTranslationTable(TranslationTable &TT,
@@ -143,7 +179,8 @@ static void registerImageIntoTranslationTable(TranslationTable &TT,
 
   // Resize the Targets Table and Images to accommodate the new targets if
   // required
-  unsigned TargetsTableMinimumSize = RTL.DeviceOffset + RTL.NumberOfDevices;
+  unsigned TargetsTableMinimumSize =
+      RTL.DeviceOffset + RTL.getNumberOfUserDevices();
 
   if (TT.TargetsTable.size() < TargetsTableMinimumSize) {
     TT.TargetsImages.resize(TargetsTableMinimumSize, 0);
@@ -151,7 +188,7 @@ static void registerImageIntoTranslationTable(TranslationTable &TT,
   }
 
   // Register the image in all devices for this target type.
-  for (int32_t I = 0; I < RTL.NumberOfDevices; ++I) {
+  for (int32_t I = 0; I < RTL.getNumberOfUserDevices(); ++I) {
     // If we are changing the image we are also invalidating the target table.
     if (TT.TargetsImages[RTL.DeviceOffset + I] != Image) {
       TT.TargetsImages[RTL.DeviceOffset + I] = Image;
@@ -194,7 +231,7 @@ void PluginManager::registerLib(__tgt_bin_desc *Desc) {
       DP("Image " DPxMOD " is compatible with RTL %s!\n",
          DPxPTR(Img->ImageStart), R.Name.c_str());
 
-      PM->initPlugin(R);
+      R.initDevices(*this);
 
       // Initialize (if necessary) translation table for this library.
       PM->TrlTblMtx.lock();
@@ -263,7 +300,7 @@ void PluginManager::unregisterLib(__tgt_bin_desc *Desc) {
 
       // Execute dtors for static objects if the device has been used, i.e.
       // if its PendingCtors list has been emptied.
-      for (int32_t I = 0; I < FoundRTL->NumberOfDevices; ++I) {
+      for (int32_t I = 0; I < FoundRTL->getNumberOfUserDevices(); ++I) {
         auto DeviceOrErr = PM->getDevice(FoundRTL->DeviceOffset + I);
         if (!DeviceOrErr)
           FATAL_MESSAGE(FoundRTL->DeviceOffset + I, "%s",
@@ -337,17 +374,5 @@ Expected<DeviceTy &> PluginManager::getDevice(uint32_t DeviceNo) {
         "Device number '%i' out of range, only %i devices available", DeviceNo,
         ExclusiveDevicesAccessor->size());
 
-  DeviceTy &Device = *(*ExclusiveDevicesAccessor)[DeviceNo];
-
-  DP("Is the device %d (local ID %d) initialized? %d\n", DeviceNo,
-     Device.RTLDeviceID, Device.IsInit);
-
-  // Init the device if not done before
-  if (!Device.IsInit && Device.initOnce() != OFFLOAD_SUCCESS) {
-    return createStringError(inconvertibleErrorCode(),
-                             "Failed to init device %d\n", DeviceNo);
-  }
-
-  DP("Device %d is ready to use.\n", DeviceNo);
-  return Device;
+  return *(*ExclusiveDevicesAccessor)[DeviceNo];
 }
diff --git a/openmp/libomptarget/src/device.cpp b/openmp/libomptarget/src/device.cpp
index ad9563e04def4..003e2ff695abd 100644
--- a/openmp/libomptarget/src/device.cpp
+++ b/openmp/libomptarget/src/device.cpp
@@ -22,6 +22,7 @@
 #include "rtl.h"
 
 #include "Shared/EnvironmentVar.h"
+#include "llvm/Support/Error.h"
 
 #include <cassert>
 #include <climits>
@@ -62,8 +63,8 @@ int HostDataToTargetTy::addEventIfNecessary(DeviceTy &Device,
   return OFFLOAD_SUCCESS;
 }
 
-DeviceTy::DeviceTy(PluginAdaptorTy *RTL)
-    : DeviceID(-1), RTL(RTL), RTLDeviceID(-1), IsInit(false), InitFlag(),
+DeviceTy::DeviceTy(PluginAdaptorTy *RTL, int32_t DeviceID, int32_t RTLDeviceID)
+    : DeviceID(DeviceID), RTL(RTL), RTLDeviceID(RTLDeviceID),
       PendingCtorsDtors(), PendingGlobalsMtx() {}
 
 DeviceTy::~DeviceTy() {
@@ -528,14 +529,21 @@ int DeviceTy::deallocTgtPtrAndEntry(HostDataToTargetTy *Entry, int64_t Size) {
   return Ret;
 }
 
-/// Init device, should not be called directly.
-void DeviceTy::init() {
+llvm::Error DeviceTy::init() {
   // Make call to init_requires if it exists for this plugin.
+  int32_t Ret = 0;
   if (RTL->init_requires)
-    RTL->init_requires(PM->getRequirements());
-  int32_t Ret = RTL->init_device(RTLDeviceID);
+    Ret = RTL->init_requires(PM->getRequirements());
   if (Ret != OFFLOAD_SUCCESS)
-    return;
+    return llvm::createStringError(
+        llvm::inconvertibleErrorCode(),
+        "Failed to initialize requirements for device %d\n", DeviceID);
+
+  Ret = RTL->init_device(RTLDeviceID);
+  if (Ret != OFFLOAD_SUCCESS)
+    return llvm::createStringError(llvm::inconvertibleErrorCode(),
+                                   "Failed to initialize device %d\n",
+                                   DeviceID);
 
   // Enables recording kernels if set.
   BoolEnvar OMPX_RecordKernel("LIBOMPTARGET_RECORD", false);
@@ -548,22 +556,7 @@ void DeviceTy::init() {
                                   OMPX_ReplaySaveOutput, ReqPtrArgOffset);
   }
 
-  IsInit = true;
-}
-
-/// Thread-safe method to initialize the device only once.
-int32_t DeviceTy::initOnce() {
-  std::call_once(InitFlag, &DeviceTy::init, this);
-
-  // At this point, if IsInit is true, then either this thread or some other
-  // thread in the past successfully initialized the device, so we can return
-  // OFFLOAD_SUCCESS. If this thread executed init() via call_once() and it
-  // failed, return OFFLOAD_FAIL. If call_once did not invoke init(), it means
-  // that some other thread already attempted to execute init() and if IsInit
-  // is still false, return OFFLOAD_FAIL.
-  if (IsInit)
-    return OFFLOAD_SUCCESS;
-  return OFFLOAD_FAIL;
+  return llvm::Error::success();
 }
 
 // Load binary to device.

Copy link
Contributor

@shiltian shiltian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG with one nit.

if (Plugin.set_device_offset)
Plugin.set_device_offset(Start);
if (set_device_offset)
set_device_offset(DeviceOffset);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this function named this way? This function is not like some others that are actually function pointers of C library/plugin functions. Of course it is not this patch's issue.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is like the others. I moved it into the PluginAdaptorTy so it checks the function pointer and calls it, like other plugin functions.

@jdoerfert jdoerfert merged commit 68db7ae into llvm:main Dec 6, 2023
5 checks passed
@jdoerfert jdoerfert deleted the offload_prep9 branch December 6, 2023 00:04
searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Dec 12, 2023
This introduces checked errors into the creation and initialization of
`PluginAdaptorTy`. We also allow the adaptor to "hide" devices from the
user if the initialization failed. The new organization avoids the
"initOnce" stuff but we still do not eagerly initialize the plugin
devices (I think we should merge `PluginAdaptorTy::initDevices` into
`PluginAdaptorTy::init`)

Change-Id: Ieaaa3a745899c6468a9b062e91d8a9767c29b3fe
@dhruvachak
Copy link
Contributor

After this reorganization, all available devices are getting initialized eagerly through PluginAdaptorTy::initDevices. Before this patch, the device initialization would occur on demand through __tgt_target_kernel. While I don't know whether this leads to any measurable performance degradation, quite a bit of unnecessary work is being done now. For example, on an 8-GPU system, we are initializing all 8 devices now whereas previously we would initialize only 1 device (assuming that's all that was used by the program). Any reason why all devices are initialized eagerly?

Additionally, this could have implications for overheads incurred when OMPT-based tools are involved. According to the spec, the runtime shall invoke the device_initialize callback defined by the tool when a device is initialized. So after this patch, this callback will be invoked for all devices which in turn could do quite a bit of unnecessary metadata setup, etc.

I noticed another change because of this reorganization. A device is now getting initialized before main is called whereas previously it would be called from main through __tgt_target_kernel. This implies that the device initialization callback will now be called before main. While the spec does not say when this callback should be invoked, this change could cause problems for tools if static data structures were used by the tool. Perhaps this change is ok since the spec does not mandate any order. But again, is there a reason why this change was made?

@jdoerfert

shiltian added a commit to shiltian/llvm-project that referenced this pull request Dec 14, 2023
This patch fixed the following compile error caused by llvm#74397.

```
openmp/libomptarget/src/PluginManager.cpp: In static member function ‘static llvm::Expected<std::unique_ptr<PluginAdaptorTy> >
PluginAdaptorTy::create(const string&)’:
openmp/libomptarget/src/PluginManager.cpp:50:10: error: could not convert ‘PluginAdaptor’ from ‘std::unique_ptr<PluginAdaptorTy
>’ to ‘llvm::Expected<std::unique_ptr<PluginAdaptorTy> >’
   return PluginAdaptor;                                                                                                                                                                    ^~~~~~~~~~~~~
```
@doru1004
Copy link
Contributor

Would it be possible to have this behavior by default but also have the old behaviour optional?

shiltian added a commit that referenced this pull request Dec 15, 2023
This patch fixed the following compile error caused by #74397.

```
openmp/libomptarget/src/PluginManager.cpp: In static member function ‘static llvm::Expected<std::unique_ptr<PluginAdaptorTy> >
PluginAdaptorTy::create(const string&)’:
openmp/libomptarget/src/PluginManager.cpp:50:10: error: could not convert ‘PluginAdaptor’ from ‘std::unique_ptr<PluginAdaptorTy
>’ to ‘llvm::Expected<std::unique_ptr<PluginAdaptorTy> >’
   return PluginAdaptor;                                                                                                                                                                    ^~~~~~~~~~~~~
```
@@ -1704,7 +1704,7 @@ int32_t __tgt_rtl_number_of_devices() { return Plugin::get().getNumDevices(); }

int64_t __tgt_rtl_init_requires(int64_t RequiresFlags) {
Plugin::get().setRequiresFlag(RequiresFlags);
return RequiresFlags;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why has this changed? It looks unrelated.

@Thyre
Copy link

Thyre commented Dec 18, 2023

After this reorganization, all available devices are getting initialized eagerly through PluginAdaptorTy::initDevices. Before this patch, the device initialization would occur on demand through __tgt_target_kernel. While I don't know whether this leads to any measurable performance degradation, quite a bit of unnecessary work is being done now. For example, on an 8-GPU system, we are initializing all 8 devices now whereas previously we would initialize only 1 device (assuming that's all that was used by the program). Any reason why all devices are initialized eagerly?

Additionally, this could have implications for overheads incurred when OMPT-based tools are involved. According to the spec, the runtime shall invoke the device_initialize callback defined by the tool when a device is initialized. So after this patch, this callback will be invoked for all devices which in turn could do quite a bit of unnecessary metadata setup, etc.

I noticed another change because of this reorganization. A device is now getting initialized before main is called whereas previously it would be called from main through __tgt_target_kernel. This implies that the device initialization callback will now be called before main. While the spec does not say when this callback should be invoked, this change could cause problems for tools if static data structures were used by the tool. Perhaps this change is ok since the spec does not mandate any order. But again, is there a reason why this change was made?

@jdoerfert

I guess having all devices initialized early might help with ROCm/ROCm#2057. However, we moved away from using the returned value a while ago. So this doesn't matter for us.

As for potential additional overhead incurred with OMPT tools, I'll test how this MR affects Score-P. Looking at our code, we will initialize the device for measurements including locations (if tracing is available). Therefore, users will see empty locations for devices not being used. This isn't the case with other accelerator adapters.

Especially the initialization before main worries me a bit, since #69318 occurs when one tries to do certain initialization during OMPT init when offloading is used.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
openmp:libomptarget OpenMP offload runtime openmp
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants