Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

avoid nan and inf in weight/label/init_score #2377

Merged
merged 2 commits into from Sep 7, 2019
Merged

Conversation

guolinke
Copy link
Collaborator

@guolinke guolinke commented Sep 4, 2019

to fix #2359

@StrikerRUS
Copy link
Collaborator

@guolinke I can try to add tests if you don't mind.

@StrikerRUS
Copy link
Collaborator

@guolinke During writing a test, I discovered that we need to update wrappers code (Python-package at least) as well.

import numpy as np
import lightgbm as lgb

from sklearn.datasets import load_boston

X, y = load_boston(True)
sequence = np.ones(y.shape[0])
sequence[0] = np.nan

lgb_data = lgb.Dataset(X, y).construct()
lgb_data.set_weight(sequence)

Now, get_field() and get_*() have inconsistent behavior: get_field() returns modified value, while get_*() - not.

lgb_data.get_weight()
>>> array([nan,  1.,  1.,  1.,  1., ...
lgb_data.get_field('weight')
>>> array([0.,  1.,  1.,  1.,  1., ...

@StrikerRUS
Copy link
Collaborator

I noticed that you used this PR to fix code style in metadata.cpp, so I replaced postfix increment with prefix one in b32988a.

@guolinke
Copy link
Collaborator Author

guolinke commented Sep 6, 2019

@StrikerRUS
Thanks, that is because .get_weight() will return the data in python, and .get_field will return data in cpp. This PR only fixes the invalid values in cpp side.

So maybe a better solution is to convert nan/inf in python side as well.

@StrikerRUS
Copy link
Collaborator

@guolinke OK! What do you think about the following plan? Let's merge this PR with cpp changes without tests then. And after that I'll propose Python side updates with tests.

Copy link
Collaborator

@StrikerRUS StrikerRUS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can approve because changes work locally: #2377 (comment).

@guolinke
Copy link
Collaborator Author

guolinke commented Sep 6, 2019

Sure! Sounds good

@StrikerRUS StrikerRUS merged commit 33d0378 into master Sep 7, 2019
@StrikerRUS StrikerRUS deleted the check-valid-field branch September 7, 2019 17:27
@lock lock bot locked as resolved and limited conversation to collaborators Mar 10, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Check weights for NANs
2 participants