LightGBM GPU not working with CUDA 10.0 on RHEL 7.x #2075
Comments
We have a case of successful compilation with Boost 1.69.0 and CUDA 10.0: #2081 (comment), but that was on Windows... ping @huanzhang12
Thanks @StrikerRUS
@huanzhang12, it actually works already!
[~]$ python36
... iris = load_iris()
@nikolayvoronchikhin Glad that your problem has been solved! And thanks a lot for sharing your workaround here.
Environment info
Operating System:
RHEL 7.5/7.6
CPU/GPU model:
NVIDIA Tesla P100-PCIE-16GB
C++/Python/R version:
Python 2.7 & Python 3.6
Microsoft R Open 3.4.3
LightGBM version or commit hash:
lightgbm==2.2.4
Error message for the LightGBM binary
[~]$ ./lightgbm config=lightgbm_gpu.conf data=higgs.train valid=higgs.test objective=binary metric=auc
[LightGBM] [Info] Finished loading parameters
[LightGBM] [Info] Finished loading data in 13.491848 seconds
[LightGBM] [Warning] Starting from the 2.1.2 version, default value for the "boost_from_average" parameter in "binary" objective is true.
This may cause significantly different results comparing to the previous versions of LightGBM.
Try to set boost_from_average=false, if your old models produce bad results
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 1535
[LightGBM] [Info] Number of data: 10500000, number of used features: 28
[LightGBM] [Info] Using requested OpenCL platform 0 device 0
[LightGBM] [Info] Using GPU Device: Tesla P100-PCIE-16GB, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 64 bins...
Segmentation fault
Error message for LightGBM in Python 3.6/2.7
[~]$ python36
Python 3.6.4 |Anaconda custom (64-bit)| (default, Jan 16 2018, 18:10:19)
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
... iris = load_iris()
... #print(y)
...
... train_data = lgb.Dataset(X, label=y)
... params = {
...     'objective': 'multiclass',
...     'feature_fraction': 1,
...     'bagging_fraction': 1,
...     'num_class': 3,
...     'verbose': -1,
...     'device': 'gpu'
... }
... gbm = lgb.train(params, train_data, num_boost_round=10)
Segmentation fault
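Because the segmentation fault kills the interpreter itself, it can help to run the GPU training in a child process and inspect its exit status from a parent that survives the crash. The following is a minimal Python 3 sketch of that idea, not part of the original report. The helper names `run_isolated` and `describe_exit` are hypothetical, and the import lines and the `X, y` assignment in the child script are assumptions, since the transcript above does not show them.

```python
import signal
import subprocess
import sys
import textwrap

# Child script mirroring the session above. The imports and the X, y
# assignment are assumptions: the transcript in the issue does not show them.
CHILD_SCRIPT = textwrap.dedent("""
    import lightgbm as lgb
    from sklearn.datasets import load_iris

    iris = load_iris()
    X, y = iris.data, iris.target
    train_data = lgb.Dataset(X, label=y)
    params = {
        'objective': 'multiclass',
        'feature_fraction': 1,
        'bagging_fraction': 1,
        'num_class': 3,
        'verbose': -1,
        'device': 'gpu',
    }
    lgb.train(params, train_data, num_boost_round=10)
""")


def run_isolated(code):
    """Run `code` in a fresh interpreter and return its exit status."""
    return subprocess.run([sys.executable, "-c", code]).returncode


def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable string."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative code means the child was killed by that signal.
        try:
            return "killed by " + signal.Signals(-returncode).name
        except ValueError:
            return "killed by signal %d" % -returncode
    return "exited with code %d" % returncode


if __name__ == "__main__":
    print("GPU training:", describe_exit(run_isolated(CHILD_SCRIPT)))
```

Changing `'device': 'gpu'` to `'device': 'cpu'` in the child script and running it again is a quick way to check whether the crash is specific to the GPU code path.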
Success for LightGBM in R
The following issue helped make LightGBM GPU work in RStudio:
#964
Steps to reproduce
But that still results in a segmentation fault for me.
Can you suggest any other changes that are needed, or is CUDA 10.0 not supported yet?