runtime error: Unknown Label with lightgbm algorithm #13

Closed
mb52089 opened this issue Dec 5, 2019 · 2 comments

mb52089 commented Dec 5, 2019

I'm getting a RuntimeError ("Unknown label") when I use the LightGBM algorithm in certain circumstances, but not when I use the linear regression algorithm on the exact same data set. Here's the full error:
RuntimeError: Unknown label: Tue
from /Users/michaelburke/.rvm/gems/ruby-2.6.5@copient_health_rails6/bundler/gems/eps-509da754d6e9/lib/eps/label_encoder.rb:28:in `block in transform'
The label named in the error varies from model to model. Some of the LightGBM models build without error, but others fail every time, depending on which filter of the dataset I use to build the model.

Owner

ankane commented Dec 5, 2019

Hey @mb52089, that error message isn't great, so here's an explanation of what's going on:

Internally, Eps splits your data into a training set and a validation set to give you a better idea of performance. With LightGBM, categorical features are encoded as integers before being passed to the library. The mapping is generated from the training set and then applied to the validation set.

This error occurs when the validation set contains values that aren't present in the training set (for instance, if the training set only had Monday and Tuesday but the validation set also had Wednesday). In this case, there's no value to map it to, hence the error.
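
To make the failure mode concrete, here is a minimal sketch of an encoder that behaves as described above. The class name SketchLabelEncoder and its methods are hypothetical and simplified; this is not the actual Eps::LabelEncoder source.

```ruby
# Hypothetical, simplified encoder; not the actual Eps::LabelEncoder source.
class SketchLabelEncoder
  def initialize
    @labels = {}
  end

  # Build the label => integer mapping from the training values only.
  def fit(values)
    values.each do |v|
      @labels[v] ||= @labels.size
    end
    self
  end

  # Map values to integers; raise on any label that fit never saw.
  def transform(values)
    values.map do |v|
      @labels.fetch(v) { raise "Unknown label: #{v}" }
    end
  end
end

encoder = SketchLabelEncoder.new.fit(["Mon", "Tue", "Mon"])
encoder.transform(["Tue", "Mon"]) # => [1, 0]

begin
  encoder.transform(["Wed"]) # "Wed" was never in the training values
rescue RuntimeError => e
  puts e.message # prints "Unknown label: Wed"
end
```

Whether a given split hits this depends on which rows land in the validation set, which would be consistent with only some filters of the dataset failing.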

I'm hoping to automatically handle this in the future, but the best options now are either:

  1. Disable the validation set (Eps::Model.new(split: false))
  2. Pass your own validation set with no unseen values
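
As a rough sketch of the second option, the helper below filters a validation set down to rows whose categorical values all appear in the training set. The helper name drop_unseen and the sample data are hypothetical. The Eps calls in the trailing comments assume the target:, algorithm:, and validation_set: options from the Eps README, so check them against the version you have installed.

```ruby
# Hypothetical helper, not part of Eps: keeps only validation rows whose
# categorical values all appear somewhere in the training rows.
def drop_unseen(train, validation, columns)
  seen = columns.to_h { |c| [c, train.map { |row| row[c] }.uniq] }
  validation.select do |row|
    columns.all? { |c| seen[c].include?(row[c]) }
  end
end

train = [
  {day: "Mon", visits: 10},
  {day: "Tue", visits: 12}
]
validation = [
  {day: "Tue", visits: 11},
  {day: "Wed", visits: 9} # "Wed" never appears in train
]

safe_validation = drop_unseen(train, validation, [:day])
# safe_validation now holds only the "Tue" row

# With Eps loaded, the two workarounds would then look roughly like this
# (option names assumed from the Eps README):
#   Eps::Model.new(train + validation, target: :visits, algorithm: :lightgbm, split: false)
#   Eps::Model.new(train, target: :visits, algorithm: :lightgbm, validation_set: safe_validation)
```

Dropping rows shrinks the validation set, so the reported metrics will not reflect how the model behaves on unseen categories.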

Linear regression uses a different method of mapping categorical features which doesn't have this limitation.

ankane closed this as completed in 50bfcc9 Dec 5, 2019
Owner

ankane commented Dec 5, 2019

It looks like LightGBM can handle unseen values in the validation set, so I just pushed a fix.
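
The commit itself isn't shown in this thread, so the following is only one plausible shape for a more lenient transform, assuming unseen labels are encoded as nil and passed along as missing values. The function name lenient_transform is hypothetical, and the real change in Eps may differ.

```ruby
# Hypothetical lenient transform; the real fix in Eps may be implemented differently.
def lenient_transform(mapping, values)
  # Hash#[] returns nil for keys that were never seen during training
  # (assuming the hash has no default value or default proc).
  values.map { |v| mapping[v] }
end

mapping = {"Mon" => 0, "Tue" => 1} # built from the training set
lenient_transform(mapping, ["Tue", "Wed"]) # => [1, nil]
```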
