Quantized Gradient Creates a Failed Check #6175

Closed
jamespinkerton opened this issue Nov 3, 2023 · 5 comments

Comments

@jamespinkerton

Hi. I'm training a large model that I've trained many times before. I wanted to turn on quantized gradient to speed it up, but it's creating an error.

Here's a stack trace of the error:

Traceback (most recent call last):
  File "/mnt/disks/condaman/mamba/lib/python3.11/site-packages/lightgbm/engine.py", line 266, in train
    booster.update(fobj=fobj)
  File "/mnt/disks/condaman/mamba/lib/python3.11/site-packages/lightgbm/basic.py", line 3557, in update
    _safe_call(_LIB.LGBM_BoosterUpdateOneIter(
  File "/mnt/disks/condaman/mamba/lib/python3.11/site-packages/lightgbm/basic.py", line 237, in _safe_call
    raise LightGBMError(_LIB.LGBM_GetLastError().decode('utf-8'))
lightgbm.basic.LightGBMError: Check failed: (best_split_info.left_count) > (0) at /home/conda/feedstock_root/build_artifacts/lightgbm_1689341180525/work/src/treelearner/serial_tree_learner.cpp, line 845 .

I'm using LightGBM 4.1.0, installed from conda-forge.

Thanks so much,
James

@jameslamb
Collaborator

Thanks for using LightGBM, and sorry about this bug.

In the future, please check the issues here before posting. Searching that error message (https://github.com/microsoft/LightGBM/issues?q=%22Check+failed%3A+%28best_split_info.left_count%29+%3E+%280%29%22+is%3Aissue), you'll see #5994 at the top of the list.

That links to related issue #5982, which shows this was fixed in #6092.

That change hasn't been released yet. We will try to get a release up soon.

For now, if you need to use quantized training, follow @shiyu1994's advice in #6134 (comment) and build the Python package from source.

@jamespinkerton
Author

My bad. I noticed the bug a while ago and checked at the time, and there wasn't an issue for it. When I finally got around to submitting it, I forgot to re-check. My fault, and thank you!

@jameslamb
Collaborator

No problem at all, thanks for using LightGBM and taking the time to report! Sorry we haven't gotten that fix out in a release yet, I'm hoping to put one up in the next few days.

@empowerVictor

@jameslamb I am still getting this error on 4.2.0.
Any ideas why? How can I help debug it?
Unfortunately I can't share my data.

@jameslamb
Collaborator

> How can I help debug it?

Post on #5982 with the following:

  1. environment information (operating system, architecture, version of Python, how you installed lightgbm)
  2. a reproducible example: exact, minimal code showing how you're using LightGBM (using fake data if you can't share the data you're using)
  3. any other details, like logs
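
As a rough starting point for items 1 and 2, the sketch below gathers the environment details with the standard library and sets up a small training run on random data with quantized gradients enabled. It is only an illustration under assumptions: the helper names (`environment_report`, `run_repro`), the data shape, and the parameter values are all made up here, and random data may well not trigger the failed check. How the package was installed (conda-forge, pip, from source) still has to be reported by hand.

```python
import platform
import sys
from importlib import metadata

# Hypothetical repro parameters. `use_quantized_grad` is LightGBM's switch for
# quantized training; every other value is an arbitrary choice for illustration.
REPRO_PARAMS = {
    "objective": "regression",
    "use_quantized_grad": True,
    "num_leaves": 31,
    "verbose": -1,
    "seed": 0,
}


def environment_report():
    """Collect the environment details asked for in item 1."""
    try:
        lgb_version = metadata.version("lightgbm")
    except metadata.PackageNotFoundError:
        lgb_version = "not installed"
    return {
        "os": platform.platform(),
        "architecture": platform.machine(),
        "python": sys.version.split()[0],
        "lightgbm": lgb_version,
    }


def run_repro(n_rows=10_000, n_features=20, num_boost_round=50):
    """Train on fake data with quantized gradients enabled (item 2)."""
    # Imported lazily so environment_report() works even without LightGBM.
    import numpy as np
    import lightgbm as lgb

    rng = np.random.default_rng(0)
    X = rng.normal(size=(n_rows, n_features))
    # Fake target: first feature plus a little noise.
    y = X[:, 0] + rng.normal(scale=0.1, size=n_rows)
    train_set = lgb.Dataset(X, label=y)
    return lgb.train(REPRO_PARAMS, train_set, num_boost_round=num_boost_round)


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
    # Call run_repro() separately once the fake data resembles the real data
    # (row count, feature count, categorical features, objective).
```

If `run_repro()` does not reproduce the error, adjusting the fake data and parameters toward the real training setup, one change at a time, is usually the quickest way to find a shareable example.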

This issue is marked duplicate, so I'm going to lock it to prevent further comments here.

@microsoft microsoft locked as resolved and limited conversation to collaborators Jan 17, 2024