Different behaviour on python-exported model and C++ API predictor #2549

mkornaukhov03 · 2023-12-04T15:31:18Z


Catboost version: 1.2.2
CPU: Intel Core i5-8550U
OS: Ubuntu [20.04]

I trained a CatBoost model (see multi_class.py in the attachment) and saved it as
model.cbm and model.py. I also saved the input and the expected answer for one row
in input.1.json.
In C++ I have the following code:

auto wrapper = ModelCalcerWrapper("model.cbm");

// parse input.1.json

for (const InputRow &row : in.input) {
    auto pred = wrapper.CalcMulti(row.vector_float_features, row.vector_categorial_features);
    // check that difference (pred - expected) is small enough
}

and the answer matches.
If I look at the python-exported file, then the error is immediately visible there:

current_tree_leaf_values_index += (1 << current_tree_depth) * model.dimension

The * model.dimension factor shouldn't be there, as in the cpp-exported version; as written, it raises an
index-out-of-bounds exception.
I fixed it in model.fixed.py and added some code to run a prediction:

if __name__ == "__main__":
    input_file = "input.1.json"
    data = open(input_file)
    item = json.load(data)[0]
    ans = item['ans']
    float_feats = item['float_features']
    cat_feats = item['cat_features']
    resp = apply_catboost_model_multi(float_feats, cat_feats)
    print("expected = {}".format(ans))
    print("real     = {}".format(resp))

# expected = [-0.4315705250918395, -0.07602514583990287, 0.5075956709317426]
# real     = [-0.012774193043495252, -0.048760865913935775, 0.38847943878388946]

but the answer doesn't match. I suppose it's a problem. Or am I doing something wrong?
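For the leaf-offset line quoted above, a toy sketch may help show why the right stride depends on how leaf values are stored. This is not CatBoost's exported code: the names DIMENSION, TREE_DEPTHS, FLAT, NESTED, predict_flat and predict_nested are hypothetical, and both storage layouts are assumptions to check against the actual model.py.

```python
# Toy illustration only: this is NOT CatBoost's exported code.
# Two oblivious trees of depth 1 and 2, three output dimensions.
DIMENSION = 3
TREE_DEPTHS = [1, 2]
TOTAL_LEAVES = sum(1 << depth for depth in TREE_DEPTHS)

# Layout A (assumed): one flat list, DIMENSION consecutive values per leaf.
FLAT = [float(i) for i in range(TOTAL_LEAVES * DIMENSION)]
# Layout B (assumed): one DIMENSION-sized vector per leaf.
NESTED = [FLAT[i:i + DIMENSION] for i in range(0, len(FLAT), DIMENSION)]


def predict_flat(leaf_ids):
    """Sum leaf vectors from the flat layout; the per-tree stride includes DIMENSION."""
    result = [0.0] * DIMENSION
    offset = 0
    for depth, leaf in zip(TREE_DEPTHS, leaf_ids):
        start = offset + leaf * DIMENSION
        for dim in range(DIMENSION):
            result[dim] += FLAT[start + dim]
        offset += (1 << depth) * DIMENSION
    return result


def predict_nested(leaf_ids):
    """Sum leaf vectors from the nested layout; the per-tree stride is the leaf count."""
    result = [0.0] * DIMENSION
    offset = 0
    for depth, leaf in zip(TREE_DEPTHS, leaf_ids):
        for dim in range(DIMENSION):
            result[dim] += NESTED[offset + leaf][dim]
        offset += 1 << depth
    return result


print(predict_flat([1, 3]))    # [18.0, 20.0, 22.0]
print(predict_nested([1, 3]))  # [18.0, 20.0, 22.0]
```

In this toy, using the flat-layout stride with the nested layout runs past the end of the list, while using the nested stride with the flat layout silently reads the wrong leaves. Either mismatch between stride and layout therefore produces an error or a wrong answer.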

Attachment

https://gist.github.com/mkornaukhov03/5c5d9e394f17141cac4fa63d2b09e026


mkornaukhov03 · 2024-01-31T20:13:31Z

This problem is still reproducible; the snippet is the same. Please reopen the issue.

andrey-khropov · 2024-01-31T20:17:12Z

> This problem is still reproducible; the snippet is the same. Please reopen the issue.

What is the commit where it can be reproduced?

mkornaukhov03 · 2024-01-31T20:35:57Z

> > This problem is still reproducible; the snippet is the same. Please reopen the issue.

> What is the commit where it can be reproduced?

e10f9da

andrey-khropov · 2024-02-28T19:28:05Z

> > > This problem is still reproducible; the snippet is the same. Please reopen the issue.

> > What is the commit where it can be reproduced?

> e10f9da

I cannot reproduce it; the example in the description works for me.

Also, using catboost 1.2.3 (multi_class.py is from the attachment and check_result.py is a copy of the code in the description of this issue):

$ python --version
Python 3.11.0
$ python -m pip install catboost==1.2.3
...
$ python ./multi_class.py 



0:	learn: 0.9417331	total: 49.8ms	remaining: 448ms
1:	learn: 0.8421839	total: 50ms	remaining: 200ms
2:	learn: 0.6597822	total: 50.1ms	remaining: 117ms
3:	learn: 0.6028493	total: 50.2ms	remaining: 75.3ms
4:	learn: 0.4900112	total: 50.4ms	remaining: 50.4ms
5:	learn: 0.4076408	total: 50.5ms	remaining: 33.7ms
6:	learn: 0.3458205	total: 50.6ms	remaining: 21.7ms
7:	learn: 0.2982687	total: 50.8ms	remaining: 12.7ms
8:	learn: 0.2608927	total: 50.9ms	remaining: 5.65ms
9:	learn: 0.2309514	total: 51ms	remaining: 0us
[['USA']
 ['USA']
 ['UK']
 ['USA']]
[[0.20060959 0.2862616  0.51312881]
 [0.07388963 0.06071726 0.86539311]
 [0.27590481 0.46474219 0.259353  ]
 [0.2580995  0.1213261  0.6205744 ]]
[[-0.43157053 -0.07602515  0.50759567]
 [-0.75475564 -0.95110009  1.70585572]
 [-0.15318701  0.36823989 -0.21505288]
 [-0.04081236 -0.7956756   0.83648797]]
Input #1
	Ans = [-0.43157053 -0.07602515  0.50759567]
	Cat   features = ['winter']
	Float features = [1996, 197]
Input #2
	Ans = [-0.75475564 -0.95110009  1.70585572]
	Cat   features = ['winter']
	Float features = [1968, 37]
Input #3
	Ans = [-0.15318701  0.36823989 -0.21505288]
	Cat   features = ['summer']
	Float features = [2002, 77]
Input #4
	Ans = [-0.04081236 -0.7956756   0.83648797]
	Cat   features = ['summer']
	Float features = [1948, 59]
$ cat ./check_result.py 

import json

from model import apply_catboost_model_multi


if __name__ == "__main__":
    input_file = "input.1.json"
    data = open(input_file)
    item = json.load(data)[0]
    ans = item['ans']
    float_feats = item['float_features']
    cat_feats = item['cat_features']
    resp = apply_catboost_model_multi(float_feats, cat_feats)
    print("expected = {}".format(ans))
    print("real     = {}".format(resp))
$ python ./check_result.py 
expected = [-0.4315705250918395, -0.07602514583990287, 0.5075956709317426]
real     = [-0.4315705250918395, -0.07602514583990279, 0.5075956709317426]

andrey-khropov added eval_formula python C/C++ applier labels Dec 4, 2023

andrey-khropov self-assigned this Dec 27, 2023

andrey-khropov added the bug label Dec 28, 2023

robot-piglet closed this as completed in 70d7459 Jan 4, 2024

robot-piglet pushed a commit that referenced this issue Jan 4, 2024

Test model export to C++ and Python on Multiclass models.. #2549

298b782

andrey-khropov added the need info label Jan 31, 2024

Evgueni-Petrov-aka-espetrov reopened this Feb 5, 2024

andrey-khropov closed this as completed Feb 28, 2024
