Multiclass classifier error during model training (random search) #1040

Closed
apokrif333 opened this issue Oct 18, 2019 · 1 comment

apokrif333 commented Oct 18, 2019

Problem: Multiclass classifier error during model training (random search)
catboost version: '0.17.5'
Operating System: Linux 3.10.0-957.21.3.el7.x86_64
CPU:
GPU:

I get the following error: 'catboost/private/libs/lapack/linear_system.cpp:31: System of linear equations is not positive definite'.

import numpy as np

train_set = np.array([
    [
        0.        , 0.        , 0.        , 0.        , 0.        ,
        0.        , 0.        , 0.        , 0.        , 0.        
    ],
    [
        0.00473934, 0.05      , 0.        , 0.        , 0.        ,
        0.        , 0.        , 0.        , 0.        , 0.        
    ],
    [
        0.04739336, 0.1       , 0.        , 0.        , 0.        ,
        0.03191489, 0.        , 0.        , 0.        , 0.        
    ],
    [
        0.        , 0.        , 0.        , 0.        , 0.00298507,
        0.09574468, 0.0195122 , 0.01492537, 0.00787402, 0.        
    ],
    [
        0.0521327 , 0.15      , 0.00480769, 0.07692308, 0.        ,
        0.        , 0.        , 0.        , 0.        , 0.        
    ]
])

y_train = np.array([1, 2, 4, 2, 1])

----------------------------------

from catboost import Pool, CatBoostClassifier, cv

cat_params = { 
    'iterations': 500,
    'random_state': 42,
    'loss_function': 'MultiClass',
    'eval_metric': 'TotalF1',
    'early_stopping_rounds': 30,
    'thread_count': -1
}
cat_model = CatBoostClassifier(**cat_params)

param_test ={
    'learning_rate': [0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05, 0.1, 0.3, 0.5],
    'l2_leaf_reg': [1e-20, 1e-3, 1e-2, 1e-1, 1, 2, 5, 7, 10, 50, 100],
    'depth': [2, 3, 4, 5, 6, 7],
    'border_count':[5, 10, 20, 32, 50],
}
randomized_search_result = cat_model.randomized_search(
    param_test,
    X=np.array(train_set)[:5, :10],
    y=np.array(y_train)[:5],
    n_iter=500,
    cv=3,
    plot=True,
    refit=True,
    verbose=0
)

-----------------------------------

@annaveronika, I came here from #1022.

@Evgueni-Petrov-aka-espetrov, thank you for your hint. If I remove 0 from l2_leaf_reg, it works. But if I set it to 1e-20, I still get the error.
@Evgueni-Petrov-aka-espetrov, regarding your second question: if I add a categorical feature:

['male', 'female', 'male', 'female', 'male']
train_pool = Pool(data=small_train, label=y_train[:5], cat_features=[5])

I still get the error.
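One plausible reading of why 1e-20 behaves like 0 here can be sketched with plain NumPy. This is not CatBoost's code. It assumes (an assumption, not something stated in this thread) that the regularizer is added to the diagonal of the linear system before a Cholesky-style solve. The helper `is_positive_definite` and the 2x2 matrix are hypothetical and chosen only for illustration.

```python
import numpy as np


def is_positive_definite(hessian, l2_reg):
    """Return True if (hessian + l2_reg * I) has a Cholesky factorization.

    Hypothetical helper: it only mimics the kind of check that would raise
    a "not positive definite" error, under the assumption described above.
    """
    regularized = hessian + l2_reg * np.eye(hessian.shape[0])
    try:
        np.linalg.cholesky(regularized)
        return True
    except np.linalg.LinAlgError:
        return False


# Hypothetical rank-deficient system (positive semi-definite but singular).
singular = np.array([[1.0, 1.0],
                     [1.0, 1.0]])

# In float64, adding 1e-20 to 1.0 is lost to rounding, so the regularized
# matrix is bit-for-bit the same as the unregularized one.
print("1.0 + 1e-20 == 1.0:", 1.0 + 1e-20 == 1.0)

for l2 in (0.0, 1e-20, 1e-3, 1.0):
    print(f"l2_reg={l2:g}: positive definite = {is_positive_definite(singular, l2)}")
```

In this toy setting a regularizer far below machine precision relative to the matrix entries cannot change the matrix at all, which is consistent with 1e-20 failing in the same way as 0.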

arcadia-devtools pushed a commit that referenced this issue Oct 18, 2019
… trace (#1040 )

ref:0f58b9065abb97d2657aeef2fbde9df3e102f0b6
@annaveronika
Contributor

Thank you very much for the report! We have fixed it! The fix is already on GitHub and will be on PyPI in the next release. Until then, please increase the value of l2_leaf_reg to make it work.
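A minimal sketch of that workaround, applied to the search grid from the report above. The cutoff `MIN_L2_LEAF_REG` is a hypothetical value picked for illustration; the thread does not say what the smallest working value is, so it may need to be raised.

```python
# Search grid copied from the report.
param_test = {
    'learning_rate': [0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05, 0.1, 0.3, 0.5],
    'l2_leaf_reg': [1e-20, 1e-3, 1e-2, 1e-1, 1, 2, 5, 7, 10, 50, 100],
    'depth': [2, 3, 4, 5, 6, 7],
    'border_count': [5, 10, 20, 32, 50],
}

# Hypothetical cutoff (an assumption, not a documented threshold):
# drop near-zero regularization values until the fixed release is installed.
MIN_L2_LEAF_REG = 1e-3

# Copy the grid, replacing only the l2_leaf_reg candidates.
safe_param_test = dict(
    param_test,
    l2_leaf_reg=[v for v in param_test['l2_leaf_reg'] if v >= MIN_L2_LEAF_REG],
)

print(safe_param_test['l2_leaf_reg'])
```

`safe_param_test` can then be passed to `cat_model.randomized_search(...)` in place of `param_test`, with the other arguments unchanged.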
