
Slow prediction with dask #5729

Closed
RAMitchell opened this issue May 30, 2020 · 9 comments

@RAMitchell
Member

Prediction with the XGBoost Dask interface is much slower than expected, taking almost 5x the training time.

from dask_cuda import LocalCUDACluster
from dask.distributed import Client
from dask import array as da
import xgboost as xgb
from xgboost.dask import DaskDMatrix
import time


def main(client):
    m = 10000000
    n = 100
    X = da.random.random(size=(m, n), chunks=100)
    y = da.random.random(size=(m,), chunks=100)
    dtrain = DaskDMatrix(client, X, y)

    start = time.time()
    output = xgb.dask.train(client,
                            {'tree_method': 'gpu_hist'},
                            dtrain,
                            num_boost_round=500,
                            evals=[(dtrain, 'train')])
    print("Train time: {}".format(time.time() - start))
    bst = output['booster']
    start = time.time()
    prediction = xgb.dask.predict(client, bst, dtrain)
    prediction = prediction.compute()
    print("Predict time: {}".format(time.time() - start))
    return prediction


if __name__ == '__main__':
    with LocalCUDACluster() as cluster:
        with Client(cluster) as client:
            main(client)

Train time: 121s
Predict time: 502s

@trivialfis
Member

Let me take a look.

@trivialfis
Member

trivialfis commented May 30, 2020

Prediction is run for each partition/block so that no concatenation of the input data is needed (hence lower memory usage). When you set the chunk size to 100, you end up running prediction 100,000 times, given that you have 10,000,000 rows.
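
For illustration, here is a minimal sketch (not part of the original report) of how the chunk size translates into block counts, and how rechunking to larger row blocks reduces the number of per-partition prediction calls:

from dask import array as da

m, n = 10_000_000, 100

# chunks=100 chunks every axis at 100 elements, so the 10M rows become
# 100,000 row blocks -- and xgb.dask.predict runs once per block.
X_small = da.random.random(size=(m, n), chunks=100)
print(X_small.numblocks)   # (100000, 1)

# Larger row blocks that keep all columns together mean far fewer calls.
X_large = X_small.rechunk((1_000_000, n))
print(X_large.numblocks)   # (10, 1)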

@cdeotte

cdeotte commented May 30, 2020

I see two issues here. First, the chunk size is too small. If you change to chunks=1_000_000, then training takes 27.2 seconds and prediction takes 151.2 seconds.

The reason prediction is still 5.5x slower than training is that you need to add 'predictor': 'gpu_predictor'. If you add this parameter, then with chunks=1_000_000 training takes 27.2 seconds and prediction takes 11.7 seconds.
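
To show where the parameter goes, here is a sketch of the train call from the repro above with both suggested changes applied (the timings quoted are from the comment above, not re-measured here):

output = xgb.dask.train(
    client,
    {
        'tree_method': 'gpu_hist',
        'predictor': 'gpu_predictor',   # keep prediction on the GPU
    },
    dtrain,
    num_boost_round=500,
    evals=[(dtrain, 'train')],
)
prediction = xgb.dask.predict(client, output['booster'], dtrain)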

@trivialfis
Member

I think the predictor is automatically running on GPU in this example.

@cdeotte

cdeotte commented May 30, 2020

No, it is not. I just ran this on a DGX and posted my results above. By default XGBoost always uses the CPU for prediction. Even if you use 'tree_method': 'gpu_hist', prediction is still on the CPU. You must explicitly set 'predictor': 'gpu_predictor'.

@trivialfis
Member

Got it, I took another look at the code. The CPU predictor might be chosen because the data comes from the host. Thanks for correcting my mistake.

@cdeotte

cdeotte commented May 30, 2020

It is good you made your comment, trivialfis. Many people, including myself until last week, did not know this. As such, I suggest making GPU prediction the default when a user sets tree_method to gpu_hist.

(And yes, maybe the predictor depends on where the data comes from; I'm not sure.)

@trivialfis
Member

trivialfis commented May 30, 2020

It's a trade-off. We added some heuristics to avoid copying data from host to device. For training, the data is not copied even when you are using the GPU. Currently a DGX is not very accessible to the wider public, so we can't make this the default.
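
As a workaround, here is a sketch (assuming the XGBoost version in use accepts the predictor parameter) of forcing the GPU predictor on an already-trained booster, which bypasses the host-data heuristic described above:

# Force the GPU predictor on the trained booster before calling predict;
# this overrides the heuristic that falls back to the CPU for host data.
bst = output['booster']
bst.set_param({'predictor': 'gpu_predictor'})
prediction = xgb.dask.predict(client, bst, dtrain)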

@trivialfis
Member

The prediction speed has been improved quite significantly in recent PRs.
