catboost/libs/target/target_converter.cpp:64: Unknown class name: "0.6" #773

Palmik · 2019-04-08T17:47:21Z

Problem: The above exception is thrown for certain target values.
catboost version: 0.13.1
Operating System: Linux

How to reproduce:

import catboost as cb
import numpy as np

print(cb.__version__)

model = cb.CatBoostRegressor(
    iterations=1,
    depth=1,
    loss_function='RMSE',
    # If you change the eval metric to RMSE it works
    eval_metric='AUC:border={}'.format(0.5),
    train_dir='/tmp/cbtest2',
)

x = np.array([[1.5], [0.1]])
# If you change the following line to: y = np.array([0.6, 0.4]) it works
y = np.array([0.99, 0.4])
pool = cb.Pool(x, label=y)

x_valid = np.array([[0.33]])
y_valid = np.array([0.6])
pool_valid = cb.Pool(x_valid, label=y_valid)

model.fit(X=pool, eval_set=pool_valid, use_best_model=False)

Full output:

0.13.1

---------------------------------------------------------------------------
CatBoostError                             Traceback (most recent call last)
<ipython-input-81-d2333a747008> in <module>
     21 pool_valid = cb.Pool(x_valid, label=y_valid)
     22 
---> 23 model.fit(X=pool, eval_set=pool_valid, use_best_model=False)

~/.conda/envs/thehft-ml/lib/python3.7/site-packages/catboost/core.py in fit(self, X, y, cat_features, sample_weight, baseline, use_best_model, eval_set, verbose, logging_level, plot, column_description, verbose_eval, metric_period, silent, early_stopping_rounds, save_snapshot, snapshot_file, snapshot_interval)
   2699                          use_best_model, eval_set, verbose, logging_level, plot, column_description,
   2700                          verbose_eval, metric_period, silent, early_stopping_rounds,
-> 2701                          save_snapshot, snapshot_file, snapshot_interval)
   2702 
   2703     def predict(self, data, ntree_start=0, ntree_end=0, thread_count=-1, verbose=None):

~/.conda/envs/thehft-ml/lib/python3.7/site-packages/catboost/core.py in _fit(self, X, y, cat_features, pairs, sample_weight, group_id, group_weight, subgroup_id, pairs_weight, baseline, use_best_model, eval_set, verbose, logging_level, plot, column_description, verbose_eval, metric_period, silent, early_stopping_rounds, save_snapshot, snapshot_file, snapshot_interval)
   1171 
   1172         with log_fixup(), plot_wrapper(plot, self.get_params()):
-> 1173             self._train(train_pool, eval_sets, params, allow_clear_pool)
   1174 
   1175         if (not self._object._has_leaf_weights_in_model()) and allow_clear_pool:

~/.conda/envs/thehft-ml/lib/python3.7/site-packages/catboost/core.py in _train(self, train_pool, test_pool, params, allow_clear_pool)
    864 
    865     def _train(self, train_pool, test_pool, params, allow_clear_pool):
--> 866         self._object._train(train_pool, test_pool, params, allow_clear_pool)
    867         self._set_trained_model_attributes()
    868 

_catboost.pyx in _catboost._CatBoost._train()

_catboost.pyx in _catboost._CatBoost._train()

CatBoostError: catboost/libs/target/target_converter.cpp:64: Unknown class name: "0.6"


annaveronika · 2019-04-08T20:26:50Z

This should already be fixed in the code. You can try building from source and running the example. If it is indeed fixed, the fix will be on PyPI in the next version tomorrow. We'll check once more and get back to you.

annaveronika · 2019-04-09T08:54:40Z

This is fixed in the latest release, 0.14.

eccodolf · 2019-05-29T10:48:30Z

v0.15, error persists.

annaveronika · 2019-05-29T11:22:26Z

We cannot reproduce the error. It looks like you are still using the old version.

eccodolf · 2019-05-29T11:27:48Z

No, it's 0.15. I also tried 0.14, with the same result. The error appears when I pass a validation pool with categorical columns from a pandas dataset.

annaveronika · 2019-05-29T11:29:27Z

Could you please run print(catboost.__version__) just to make sure that you are right?

annaveronika · 2019-05-29T11:30:33Z

And if it reproduces, please create a new issue with the code that you are running. The code above runs correctly in 0.15

andrey-khropov · 2019-05-29T11:31:29Z

> No, it's 0.15. Also tried 0.14 - same result. Appears when I pass validation pool with categorical columns in pandas dataset.

Can you provide a new minimal failing example? The original example in #773 (comment) works without problems in 0.15.

eccodolf · 2019-05-29T16:41:43Z

Solved. The eval_set contained labels that the model had never seen.
My y has roughly 1500 categories. Removing the classes whose value count in y is 1 and stratifying the split by y solved the problem.
I suggest throwing a more detailed exception so that people stop reporting this as a bug.
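
The fix described above can be sketched roughly as follows. This is a minimal plain-Python illustration, not the commenter's actual code: the helper names `unseen_labels` and `drop_singleton_classes` are hypothetical, and the data is made up.

```python
from collections import Counter


def unseen_labels(train_labels, eval_labels):
    """Return the eval labels that never occur in the training labels."""
    return sorted(set(eval_labels) - set(train_labels))


def drop_singleton_classes(rows, labels):
    """Drop samples whose class occurs only once.

    A class with a single sample cannot appear in both the training
    and the validation part of a split.
    """
    counts = Counter(labels)
    kept = [(row, label) for row, label in zip(rows, labels) if counts[label] > 1]
    kept_rows = [row for row, _ in kept]
    kept_labels = [label for _, label in kept]
    return kept_rows, kept_labels


# Made-up data: class "c" has only one sample.
rows = [[1.0], [2.0], [3.0], [4.0], [5.0]]
labels = ["a", "a", "b", "b", "c"]

print(unseen_labels(train_labels=["a", "b"], eval_labels=["a", "c"]))  # ['c']

rows, labels = drop_singleton_classes(rows, labels)
print(labels)  # ['a', 'a', 'b', 'b']
```

Running `unseen_labels` on the training and validation labels before calling `fit` gives a quick way to see whether this is the cause of the error.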

annaveronika · 2019-06-04T19:37:20Z

Yes, we'll update the error, thanks for the suggestion!

JunpeiTakubo · 2019-11-19T04:56:30Z

I came across this error at version 0.18.

agcala · 2020-09-15T19:31:43Z

I came across this error at version 0.24. I used the class_names parameter to prevent it happening again.

tobianointing · 2020-10-02T02:21:59Z

> I came across this error at version 0.24. I used the class_names parameter to prevent it happening again.

Please, how did you do this?

gitpickle · 2021-01-07T12:39:27Z

Hi, am I understanding this issue correctly ... target category labels are being encountered in y that are not found in X, correct?

@eccodolf states "cleaning valuecounts for y =1 and stratifying split by y solved this problem". Does this mean that he removed the rows that contained labels not found in train, and if possible would someone post an example of how to achieve this?

I don't fully understand what he is saying and would greatly appreciate a pointer in the right direction. Thanks. Mike

gitpickle · 2021-01-07T12:48:25Z

Ah, I think I see what eccodolf is referring to: https://stackoverflow.com/questions/34842405/parameter-stratify-from-method-train-test-split-scikit-learn

It looks like we can split train/test in such a way that all target y labels are found in both sets in similar proportions.

Am I on the right track?

gitpickle · 2021-01-07T13:09:42Z

the error I am experiencing is similar to the one in this post:

CatBoostError: c:/program files (x86)/go agent/pipelines/buildmaster/catboost.git/catboost/private/libs/target/target_converter.cpp:228: Unknown class label: "57"

Evgueni-Petrov-aka-espetrov · 2021-01-11T07:11:12Z

> ah. I think I am seeing what eccodolf is referring to. https://stackoverflow.com/questions/34842405/parameter-stratify-from-method-train-test-split-scikit-learn.
>
> It looks like we can split train/test in such a manner that we make sure all target y labels are found in both sets with a similar %.
>
> Am I on a correct path?

Highly likely.
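
A stratified split of the kind discussed above might look like the sketch below. It uses scikit-learn's `train_test_split` with the `stratify` argument; the data is made up and the split sizes are arbitrary, so treat it as an illustration rather than a recipe.

```python
from collections import Counter

from sklearn.model_selection import train_test_split

# Made-up data: 3 classes with 4 samples each.
X = [[float(i)] for i in range(12)]
y = ["a"] * 4 + ["b"] * 4 + ["c"] * 4

# stratify=y keeps the class proportions roughly equal in both parts.
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

print(Counter(y_train))
print(Counter(y_valid))

# Every validation label should also be present in the training part.
assert set(y_valid) <= set(y_train)
```

Note that `train_test_split` raises a `ValueError` when `stratify` is used and some class has only one sample, which is why classes with a single sample have to be removed (or merged) before splitting.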

raffieeey · 2021-09-02T08:31:13Z

One way to solve this problem is to define class_names explicitly. You can do this as follows:

from catboost import CatBoostClassifier

catb_model = CatBoostClassifier(iterations=1000, learning_rate=0.05, loss_function='MultiClass',
                                class_names=["1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11"])
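
Instead of typing the class names by hand, the list could be collected from the labels themselves. The sketch below is only one possible way to do that: `collect_class_names` and `make_classifier` are hypothetical helpers, the labels are made up, and they are assumed to be strings, as in the example above. Whether `class_names` avoids the error in a given CatBoost version is something commenters in this thread report, not something this sketch verifies.

```python
def collect_class_names(*label_sets):
    """Return the sorted unique labels found across all given label collections."""
    names = set()
    for labels in label_sets:
        names.update(labels)
    return sorted(names)


def make_classifier(class_names):
    # Imported here so that collect_class_names can be tried without catboost installed.
    from catboost import CatBoostClassifier

    return CatBoostClassifier(loss_function='MultiClass', class_names=class_names)


# Made-up string labels; "4" never occurs in the training labels.
y_train = ["1", "2", "2", "3"]
y_valid = ["2", "4"]

class_names = collect_class_names(y_train, y_valid)
print(class_names)  # ['1', '2', '3', '4']
```

The resulting list would then be passed on with `model = make_classifier(class_names)`. String labels sort lexicographically (so "10" comes before "2"), which affects only the order of the classes.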

Evgueni-Petrov-aka-espetrov · 2021-09-02T18:07:59Z

Many thanks @raffieeey !

Ashebir07 · 2022-08-16T07:53:15Z

CatBoostError: catboost/cuda/cuda_lib/cuda_manager.cpp:201: Condition violated: `State == nullptr'

How can I solve this error? Please help me.

Ashebir07 · 2022-08-16T07:54:17Z

for i, (train_index, test_index) in enumerate(folds.split(X, y)):
    X_train, X_test = X.iloc[train_index], X.iloc[test_index]
    y_train, y_test = y[train_index], y[test_index]

    # Instantiate model
    # model = CatBoostClassifier(n_estimators=20000, task_type='GPU')
    model = CatBoostClassifier(max_depth=12, learning_rate=0.15, task_type='GPU',
                               grow_policy='Lossguide', n_estimators=1500)

    # Train model
    model.fit(X_train, y_train,
              eval_set=[(X_test, y_test)],
              early_stopping_rounds=200,
              verbose=1000,
              use_best_model=True)

Here is my code.

annaveronika closed this as completed Apr 9, 2019

annaveronika reopened this May 29, 2019

annaveronika closed this as completed May 29, 2019
