Fix predict difference #6384

ShvetsKS · 2020-11-12T12:40:24Z

related issue: #6350

trivialfis

Could you please provide a brief explanation for what happened? Also a unittest. Lastly I see that you are adding an extra allocation, does it impact perf?

trivialfis · 2020-11-12T12:53:17Z

Ah, I saw your reply on the issue. Will continue there.

ShvetsKS · 2020-11-12T12:58:50Z

> Could you please provide a brief explanation for what happened? Also a unittest. Lastly I see that you are adding an extra allocation, does it impact perf?

The outputs differ because the sequence of floating-point operations changed:
Before the a4ce0ea optimizations, we accumulated the tree responses for each sample in a local variable psum initialized to 0, and only then added psum to out_pred[i].
With the a4ce0ea optimizations, we increment out_pred[i] directly (its initial value is usually 0.5), so the floating-point rounding error is slightly different.
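To illustrate the effect, here is a minimal Python sketch, not the actual predictor code. It emulates float32 rounding with the standard `struct` module and compares the two accumulation orders. The helper names (`f32`, `predict_local_sum`, `predict_in_place`) and the leaf values are hypothetical, chosen only so that each leaf is exactly half an ulp of 0.5 in float32.

```python
import struct


def f32(x):
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def predict_local_sum(base, leaves):
    """Old order: sum the leaves in a local psum starting at 0, then add to base."""
    psum = 0.0
    for leaf in leaves:
        psum = f32(psum + leaf)
    return f32(base + psum)


def predict_in_place(base, leaves):
    """New order: add each leaf directly to the output, which starts at base."""
    out = f32(base)
    for leaf in leaves:
        out = f32(out + leaf)
    return out


base = 0.5
# Hypothetical leaf values: each is half an ulp of 0.5 in float32.
leaves = [2.0 ** -25, 2.0 ** -25]

old = predict_local_sum(base, leaves)  # 0.5 + 2**-24: the leaves add up before meeting base
new = predict_in_place(base, leaves)   # 0.5: each leaf is rounded away individually
print(old == new)
```

With these inputs the two orders disagree by one float32 ulp, even though both are valid ways to compute the same sum. Real models show the same kind of last-bit difference, just with less contrived values.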

C++ tests for the CPU predictor (cpu_predictor->PredictBatch) already exist, and this PR makes no significant changes there. Or did you mean a test for ReduceResult?

I checked it on several benchmarks; there seems to be no performance impact.

ShvetsKS · 2020-11-13T08:26:04Z

https://xgboost-ci.net/blue/organizations/jenkins/xgboost-win64/detail/PR-6384/3/pipeline#step-89-log-1987
I'm not sure this is related to the current changes, since all tests in tests/python/test_with_sklearn.py pass locally.

@trivialfis should we restart Jenkins Win64: Test?

trivialfis · 2020-11-13T08:38:59Z

@ShvetsKS Let's pause this PR in light of this comment: #6350 (comment). We (with @RAMitchell and @hcho3) agreed that we should not establish a tradition of fixing floating-point changes across versions; as you stated, that would severely limit our development. We haven't decided how to document this, or whether we need to document it at all.

trivialfis · 2020-11-13T08:42:03Z

But to answer your question about the failing CI: yes, it happens all the time that an error is reproducible only on CI; I bump into that quite often. The error in this PR is new to me, though, so it might be specific to this PR rather than a CI glitch. You can see all the flaky tests we have found in our issues list.

ShvetsKS · 2020-11-13T08:45:32Z

@trivialfis it seems I should close this PR, as #6350 was resolved? :)

hcho3 · 2020-11-13T09:52:09Z

Let us continue our discussion in #6350.

ShvetsKS mentioned this pull request Nov 12, 2020

prediction different in v1.2.1 and master branch #6350

Closed

trivialfis reviewed Nov 12, 2020


trivialfis added the Blocking label Nov 12, 2020

ShvetsKS mentioned this pull request Nov 12, 2020

Disable HT for DMatrix creation #6386

Merged

fix difference

2e159d6

ShvetsKS force-pushed the FIX_PREDICT_DIFFERENCE branch from 79ced79 to 2e159d6 Compare November 13, 2020 06:28

fix multiclass

b568110

hcho3 closed this Nov 13, 2020
