
Dev/unified dispatching prototype #7724

Closed

Conversation

razdoburdin (Contributor)

In continuation of #5659 and #6212.
Here I present a way of dispatching between the various devices (CPU / CUDA device / oneAPI device).
This request contains only the changes that are relevant to all devices. The code for oneAPI device support is planned to be added later.

The main idea of the dispatching was discussed in #6212. A new global parameter called device_selector is added. This parameter determines the device on which the calculations will be made, as well as the specific kernel that will be executed. So if the user configures XGBoost with the following parameters:
clf = xgboost.XGBClassifier(... , objective='multi:softmax', tree_method='hist')
the CPU version of the library will be executed. But if the user adds device_selector='oneapi:gpu':
clf = xgboost.XGBClassifier(... , device_selector='oneapi:gpu', objective='multi:softmax', tree_method='hist')
the code specific to oneAPI GPUs will be used.

For CUDA, the corresponding logic is not implemented yet, so in that case device_selector is just an alternative way of setting gpu_id. To preserve backward compatibility with existing user code, gpu_id is given higher priority.

An additional feature added by this request is the independent specification of devices for fitting and prediction; a sketch of the intended usage follows below. If the user specifies device_selector='fit:oneapi:gpu; predict:cpu', the oneAPI GPU will be used for fitting and the CPU for prediction.
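
A minimal sketch of the proposed usage, following the syntax described in this PR (device_selector is the parameter proposed here, not a released option; the toy data and the other estimator arguments are placeholders):

```python
import numpy as np
import xgboost

# Toy data just to keep the example self-contained.
X, y = np.random.rand(100, 4), np.random.randint(0, 3, size=100)

# One device for both fitting and prediction, as proposed in this PR.
clf = xgboost.XGBClassifier(device_selector="oneapi:gpu",
                            objective="multi:softmax", tree_method="hist")

# Independent devices for fitting and prediction.
clf = xgboost.XGBClassifier(device_selector="fit:oneapi:gpu; predict:cpu",
                            objective="multi:softmax", tree_method="hist")

clf.fit(X, y)            # would run on the oneAPI GPU
preds = clf.predict(X)   # would run on the CPU
```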

@trivialfis trivialfis added this to 1.6 In Progress in 2.0 Roadmap via automation Mar 12, 2022
@trivialfis trivialfis moved this from 1.6 In Progress to 2.0 in 2.0 Roadmap Mar 12, 2022
trivialfis (Member)

Thank you for working on this! I also wrote a higher-level RFC #7308 for future device dispatching, which should be complementary to this PR.

I will look into this in more detail later.

dmitry.razdoburdin added 2 commits March 14, 2022 17:48
@@ -31,6 +32,22 @@ struct GenericParameter : public XGBoostParameter<GenericParameter> {
bool fail_on_invalid_gpu_id {false};
bool validate_parameters {false};

/* Device dispatcher object.
Member

Nice!

}

void DeviceSelector::Init(const std::string& user_input_device_selector) {
int fit_position = user_input_device_selector.find(fit_.Prefix());
Member

Do you think it's appropriate that we don't distinguish between predict and fit? Whatever device the user has specified, we will use it everywhere.

Contributor Author

Currently, a user can configure prediction on the CPU and fitting on the GPU by specifying predictor='cpu_predictor', right? The idea here is to provide the user with a unified way of selecting devices for both fitting and prediction.
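
For reference, a minimal sketch of that existing split using the pre-2.0 parameters mentioned here (gpu_hist and cpu_predictor; running it requires a CUDA-enabled build, and the toy data is a placeholder):

```python
import numpy as np
import xgboost

# Toy data just to keep the example self-contained.
X, y = np.random.rand(100, 4), np.random.randint(0, 2, size=100)

# Pre-2.0 parameters: train with the CUDA 'gpu_hist' updater, predict on the CPU.
clf = xgboost.XGBClassifier(tree_method="gpu_hist", predictor="cpu_predictor")
clf.fit(X, y)       # training runs on the GPU
clf.predict(X)      # prediction uses the CPU predictor
```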

Member

We will remove gpu_predictor and gpu_hist (hopefully in this release) as documented in #7308. The expected result is that we will have only one (global) parameter, device, to control the dispatch:

with xgb.config_context(device="sycl:0"):
   booster.predict(X)

Contributor Author

Oh, I see.
If you don't plan to support different devices for fitting and prediction, this feature is inappropriate. Fortunately, it can easily be reduced to a uniform device descriptor for both stages.

Member

That avoids some internal conflicts; it's difficult to configure the state with the current design. We have been working on using https://github.com/dmlc/xgboost/blob/master/include/xgboost/generic_parameters.h as the context object for XGBoost. Maybe we can integrate the device selector in this PR with it?

Member

I will make some progress on setting up the interface and keep you posted. Thank you for working on it.

Contributor Author

Hi @trivialfis,
is there any progress in this direction? Maybe some help from our side could be useful?

Member

@razdoburdin I have run some experiments on this recently; the problem is distributed and multi-threaded environments (like Python async). We need to share the device index between all workers and all threads, which requires some synchronization strategy.

We don't need any synchronization if the device ID is limited to the booster as a local variable. But if we were to extend it to DMatrix as well (for constructing a DMatrix from various sources of data), then the issue becomes a headache.
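
A minimal sketch of the contrast described above (the device_selector spelling follows this PR, the device key in config_context follows RFC #7308 as quoted earlier in the thread; neither is a released option, and the toy data is a placeholder):

```python
import numpy as np
import xgboost as xgb

# Toy data just to keep the sketch self-contained.
dtrain = xgb.DMatrix(np.random.rand(100, 4), label=np.random.randint(0, 2, 100))

# Booster-local: the device travels with each booster's own parameters, so
# concurrent trainings in different threads share no state and need no locks.
booster = xgb.train({"tree_method": "hist", "device_selector": "oneapi:gpu"}, dtrain)

# Process-global: config_context sets configuration visible to every thread in
# the process, so threads wanting different devices would have to coordinate.
with xgb.config_context(device="sycl:0"):
    booster.predict(dtrain)
```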

Contributor Author

Hi @trivialfis, is there a chance to implement this or a similar concept in XGBoost 2.0? Maybe you need some help with it?

Member

Yes, I'm still planning it as a major breaking change for 2.0. I got distracted during 1.7 by the new PySpark interface. Expect some progress next month. Sorry for the slow update.

@trivialfis trivialfis moved this from 2.0 TODO to 2.0 In Progress in 2.0 Roadmap Mar 20, 2023
trivialfis (Member)

This is mostly complete now. #7308 (comment)
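
For context, a minimal sketch of the unified interface this refers to, assuming the global device parameter described in #7308 (the toy data is a placeholder):

```python
import numpy as np
import xgboost

# Toy data just to keep the sketch self-contained.
X, y = np.random.rand(100, 4), np.random.randint(0, 2, size=100)

# A single `device` parameter selects where both training and prediction run,
# replacing gpu_id / gpu_hist / gpu_predictor.
clf = xgboost.XGBClassifier(tree_method="hist", device="cuda")
clf.fit(X, y)
clf.predict(X)
```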

@trivialfis trivialfis closed this Jul 20, 2023
@trivialfis trivialfis moved this from 2.0 In Progress to 2.0 Done in 2.0 Roadmap Jul 20, 2023
@razdoburdin razdoburdin deleted the dev/unified_dispatching_prototype branch May 21, 2024 10:28