Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update CUDA treelearner according to changes introduced for linear trees #3750

Merged
merged 3 commits into from
Jan 15, 2021

Conversation

@@ -534,8 +534,8 @@ void CUDATreeLearner::InitGPU(int num_gpu) {
copyDenseFeature();
}

Tree* CUDATreeLearner::Train(const score_t* gradients, const score_t *hessians) {
Tree *ret = SerialTreeLearner::Train(gradients, hessians);
Tree* CUDATreeLearner::Train(const score_t* gradients, const score_t *hessians, bool is_first_tree) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/gha run cuda-builds

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is CUDA not being built automatically with each pull request?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. We didn't set it to be built automatically as it was constantly failing.
Now I think we can set it as ordinary test which will be required to be passed to merge new pull requests.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/gha run cuda-builds

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, the error is still here. But with fixes from #3748 SOME tests are now passed (🎉), while without it all tests were failing.
The error is (I believe the same as before)

[LightGBM] [Fatal] [CUDA] invalid argument /LightGBM/src/treelearner/cuda_tree_learner.cpp 414

Full test report (. - success, F - failure, s - skipped, x - expected to fail):

============================= test session starts ==============================
platform linux -- Python 3.8.2, pytest-6.2.1, py-1.10.0, pluggy-0.13.1
rootdir: /LightGBM
collected 238 items

../tests/c_api_test/test_.py ..                                          [  0%]
../tests/python_package_test/test_basic.py ...F..FF.....                 [  6%]
../tests/python_package_test/test_consistency.py ......                  [  8%]
../tests/python_package_test/test_dask.py FFFFFFFFFFFFFFFFFFFFFFFFFF..   [ 20%]
../tests/python_package_test/test_dual.py s                              [ 21%]
../tests/python_package_test/test_engine.py FF...F.....FFFF...FFF..FFFF. [ 32%]
.FFFF.FFFF.....F...FFFFFF...FF.FFF.FF.F                                  [ 49%]
../tests/python_package_test/test_plotting.py F...F                      [ 51%]
../tests/python_package_test/test_sklearn.py .F.F.F.FFFFFFFF.FF..FF...FF [ 62%]
FF.FF.xFF..F..FF...FF.F..F.FFF..F.F..FF.....FF.FxFF..F..FF...FF.F......F [ 92%]
.F..FF.....FF.F..                                                        [100%]
= 123 failed, 112 passed, 1 skipped, 2 xfailed, 69 warnings in 152.56s (0:02:32) =

https://github.com/microsoft/LightGBM/runs/1682722249?check_suite_focus=true

@guolinke @jameslamb @btrotta I think we need to merge this PR first to sync CUDA code with the rest codebase. Meantime I'll prepare new minimal reproducible example to show it @ChipKerchner .

UPD: all the rest failures can be fixed by #3450 (comment).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll provide a pull request to fix the above failures.

Copy link
Collaborator

@jameslamb jameslamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems ok to me. I definitely prefer to work like this, several small PRs that are incremental steps towards enabling CUDA tests automatically on every PR.

Copy link
Collaborator

@btrotta btrotta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine.

@StrikerRUS StrikerRUS merged commit a15a370 into master Jan 15, 2021
@StrikerRUS StrikerRUS deleted the cuda_fix branch January 15, 2021 13:20
@github-actions
Copy link

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 24, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants