[python] disabled split value histogram for categorical features #2045

Merged
merged 3 commits into from Mar 14, 2019

Conversation

StrikerRUS
Collaborator

@StrikerRUS StrikerRUS commented Mar 10, 2019

Replace the TypeError raised when get_split_value_histogram is called on a categorical feature with a more understandable LightGBMError. Current behavior:

import lightgbm as lgb
from sklearn.datasets import load_boston

data = lgb.Dataset(*load_boston(return_X_y=True), categorical_feature=[2])
booster = lgb.train({'verbose': -1}, data, num_boost_round=20)
booster.get_split_value_histogram(2)
TypeError                                 Traceback (most recent call last)
<ipython-input-4-46f3fc73cd0a> in <module>
----> 1 booster.get_split_value_histogram(2)

C:\Program Files\Anaconda3\lib\site-packages\lightgbm\basic.py in get_split_value_histogram(self, feature, bins, xgboost_style)
   2480             bins = max(min(n_unique, bins) if bins is not None else n_unique, 1)
   2481         print(values)
-> 2482         hist, bin_edges = np.histogram(values, bins=bins)
   2483         if xgboost_style:
   2484             ret = np.column_stack((bin_edges[1:], hist))

C:\Program Files\Anaconda3\lib\site-packages\numpy\lib\histograms.py in histogram(a, bins, range, normed, weights, density)
    700     a, weights = _ravel_and_check_weights(a, weights)
    701 
--> 702     bin_edges, uniform_bins = _get_bin_edges(a, bins, range, weights)
    703 
    704     # Histogram is an integer or a float array depending on the weights.

C:\Program Files\Anaconda3\lib\site-packages\numpy\lib\histograms.py in _get_bin_edges(a, bins, range, weights)
    353             raise ValueError('`bins` must be positive, when an integer')
    354 
--> 355         first_edge, last_edge = _get_outer_edges(a, range)
    356 
    357     elif np.ndim(bins) == 1:

C:\Program Files\Anaconda3\lib\site-packages\numpy\lib\histograms.py in _get_outer_edges(a, range)
    248         first_edge, last_edge = 0, 1
    249     else:
--> 250         first_edge, last_edge = a.min(), a.max()
    251         if not (np.isfinite(first_edge) and np.isfinite(last_edge)):
    252             raise ValueError(

C:\Program Files\Anaconda3\lib\site-packages\numpy\core\_methods.py in _amin(a, axis, out, keepdims, initial)
     30 def _amin(a, axis=None, out=None, keepdims=False,
     31           initial=_NoValue):
---> 32     return umr_minimum(a, axis, None, out, keepdims, initial)
     33 
     34 def _sum(a, axis=None, dtype=None, out=None, keepdims=False,

TypeError: cannot perform reduce with flexible type

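For the categorical feature, values at the line below holds strings of ||-joined categories rather than numbers, so NumPy cannot take their min/max:
https://github.com/Microsoft/LightGBM/blob/8d6666e0ffb7165096753dc392044af18ef2eae6/python-package/lightgbm/basic.py#L2481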
values = ['2||3||6||10||18', '2||3||6||10||18', '2||4||6||10||18', '2||3||6||10||18', '1||4||5||6', '1||2||4||10||21', '1||4||5||6', '10||18||19||21', '3||6||7||8||10||18||21', '3||5||6||9', '1||2||3||7||10||21', '1||2||3||8||10||21', '5||6||10||18', '5||6||9||18', '6||7||9||10||18', '2||4||6||7||9||18', '1||2||3||8||21']
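
The kind of guard this PR describes could look roughly like the sketch below. It is not the code merged here: SplitHistogramError is a hypothetical stand-in for LightGBMError, check_numerical_split_values is an illustrative helper name, and detecting a categorical feature by looking for string thresholds is an assumption based only on the values printed above.

```python
class SplitHistogramError(Exception):
    """Hypothetical stand-in for lightgbm's LightGBMError."""


def check_numerical_split_values(values, feature):
    """Return thresholds as floats, or fail early with a readable error.

    Assumption: categorical splits show up as strings such as
    '2||3||6||10||18', while numerical splits are plain numbers.
    """
    if any(isinstance(v, str) for v in values):
        raise SplitHistogramError(
            "Cannot compute split value histogram "
            "for the categorical feature {}".format(feature)
        )
    return [float(v) for v in values]


# Thresholds copied from the report above: the guard rejects them
# before they ever reach np.histogram.
categorical = ['2||3||6||10||18', '1||4||5||6']
try:
    check_numerical_split_values(categorical, 2)
except SplitHistogramError as err:
    message = str(err)

# Numerical thresholds pass through unchanged (as floats).
numerical = check_numerical_split_values([6.5, 7, 6.5], 0)
```

Raising before the histogram call keeps the error message about the feature itself instead of a NumPy reduction over a string array.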

@guolinke guolinke merged commit ffb134c into master Mar 14, 2019
@StrikerRUS StrikerRUS deleted the hist_hotfix branch March 14, 2019 09:51
@lock lock bot locked as resolved and limited conversation to collaborators Mar 11, 2020