
add init_score & test cpp and python result consistency #1007

Merged: 15 commits into microsoft:master from init_score, Nov 9, 2017

Conversation

wxchan (Contributor) commented Oct 22, 2017

@guolinke my newly added test cases always fail. Can you help check what goes wrong?

guolinke (Collaborator):

@wxchan sure, I will check it

guolinke (Collaborator):

@wxchan I think the reason is that the parameters are different.
I think you can simply use a toy example for the test, like this: https://github.com/Microsoft/LightGBM/blob/master/tests/python_package_test/test_engine.py#L213-L245

wxchan (Contributor, Author) commented Oct 23, 2017

@guolinke but I read the params from the train.conf files, so they should be the same. The strange thing is that only a few samples fail.

guolinke (Collaborator) commented Oct 23, 2017

@wxchan I found the trained model is exactly the same, but the predictions are not.
Maybe there is something wrong in prediction.

wxchan (Contributor, Author) commented Oct 23, 2017

@guolinke maybe it has something to do with those fields like init_score? It passed when I removed init_score.

guolinke (Collaborator):

@wxchan really? I think init_score should have no impact on prediction.

guolinke (Collaborator):

@wxchan
I found the problem is caused by Atof: https://github.com/Microsoft/LightGBM/blob/master/include/LightGBM/utils/common.h#L163-L251

If it is replaced with std::stod, everything seems fine.

inline static const char* Atof(const char* p, double* out) {
  *out = NAN;
  // Skip leading spaces.
  while (*p == ' ') {
    ++p;
  }
  // Find the end of the current token.
  size_t cnt = 0;
  while (*(p + cnt) != '\0' && *(p + cnt) != ' '
         && *(p + cnt) != '\t' && *(p + cnt) != ','
         && *(p + cnt) != '\n' && *(p + cnt) != '\r'
         && *(p + cnt) != ':') {
    ++cnt;
  }
  if (cnt > 0) {
    std::string str(p, cnt);
    p += cnt;
    std::transform(str.begin(), str.end(), str.begin(), Common::tolower);
    if (str == std::string("inf")) {
      *out = 1e308;
    } else if (str == std::string("-inf")) {
      *out = -1e308;
    } else if (str != std::string("na") && str != std::string("nan")) {
      // Delegate ordinary numbers to the correctly rounded std::stod.
      *out = std::stod(str);
    }
  }
  // Skip trailing spaces.
  while (*p == ' ') {
    ++p;
  }
  return p;
}

wxchan (Contributor, Author) commented Oct 24, 2017

@guolinke is it caused by some feature values near thresholds falling into different sub-trees because of the precision of the model?

guolinke (Collaborator):

@wxchan yeah, exactly. I am trying to update Atof.
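
A minimal sketch of the effect (hypothetical numbers, not from this PR): a one-ULP difference between two parses of the same token is enough to send a sample down a different sub-tree.

import numpy as np

# A hypothetical split threshold stored in the model file.
threshold = 0.3333333333333333

# Two parses of the same text that differ only in the last bit,
# e.g. a correctly rounded std::stod result vs. a fast custom Atof.
x_stod = np.float64("0.3333333333333333")  # correctly rounded
x_fast = np.nextafter(x_stod, 1.0)         # one ULP higher

print(x_stod <= threshold)  # True  -> goes to the left sub-tree
print(x_fast <= threshold)  # False -> goes to the right sub-tree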

guolinke (Collaborator) commented Oct 24, 2017

@wxchan can you try to replace the Atof with:

// Fast integer power by repeated squaring / cubing.
template<class T>
inline static double Pow(T base, int power) {
  if (power < 0) {
    return 1.0 / Pow(base, -power);
  } else if (power == 0) {
    return 1;
  } else if (power % 2 == 0) {
    return Pow(base*base, power / 2);
  } else if (power % 3 == 0) {
    return Pow(base*base*base, power / 3);
  } else {
    return base * Pow(base, power - 1);
  }
}

inline static const char* Atof(const char* p, double* out) {
  int frac;
  double sign, value, scale;
  *out = NAN;
  // Skip leading white space, if any.
  while (*p == ' ') {
    ++p;
  }

  // Get sign, if any.
  sign = 1.0;
  if (*p == '-') {
    sign = -1.0;
    ++p;
  } else if (*p == '+') {
    ++p;
  }

  // is a number
  if ((*p >= '0' && *p <= '9') || *p == '.' || *p == 'e' || *p == 'E') {
    // Get digits before decimal point or exponent, if any.
    for (value = 0.0; *p >= '0' && *p <= '9'; ++p) {
      value = value * 10.0 + (*p - '0');
    }

    // Get digits after decimal point, if any.
    if (*p == '.') {
      double right = 0.0;
      int nn = 0;
      ++p;
      while (*p >= '0' && *p <= '9') {
        right = (*p - '0') + right * 10.0;
        ++nn;
        ++p;
      }
      value += right / Pow(10.0, nn);
    }

    // Handle exponent, if any.
    frac = 0;
    scale = 1.0;
    if ((*p == 'e') || (*p == 'E')) {
      uint32_t expon;
      // Get sign of exponent, if any.
      ++p;
      if (*p == '-') {
        frac = 1;
        ++p;
      } else if (*p == '+') {
        ++p;
      }
      // Get digits of exponent, if any.
      for (expon = 0; *p >= '0' && *p <= '9'; ++p) {
        expon = expon * 10 + (*p - '0');
      }
      if (expon > 308) expon = 308;
      // Calculate scaling factor.
      while (expon >= 50) { scale *= 1E50; expon -= 50; }
      while (expon >= 8) { scale *= 1E8;  expon -= 8; }
      while (expon > 0) { scale *= 10.0; expon -= 1; }
    }
    // Return signed and scaled floating point result.
    *out = sign * (frac ? (value / scale) : (value * scale));
  } else {
    size_t cnt = 0;
    while (*(p + cnt) != '\0' && *(p + cnt) != ' '
      && *(p + cnt) != '\t' && *(p + cnt) != ','
      && *(p + cnt) != '\n' && *(p + cnt) != '\r'
      && *(p + cnt) != ':') {
      ++cnt;
    }
    if (cnt > 0) {
      std::string tmp_str(p, cnt);
      std::transform(tmp_str.begin(), tmp_str.end(), tmp_str.begin(), Common::tolower);
      if (tmp_str == std::string("na") || tmp_str == std::string("nan")) {
        *out = NAN;
      } else if (tmp_str == std::string("inf") || tmp_str == std::string("infinity")) {
        *out = sign * 1e308;
      } else {
        Log::Fatal("Unknown token %s in data file", tmp_str.c_str());
      }
      p += cnt;
    }
  }

  while (*p == ' ') {
    ++p;
  }

  return p;
}

guolinke (Collaborator):

strange, the binary classification test passes for me locally

guolinke (Collaborator):

@wxchan did you check the dtype from np.loadtxt? It is float64 on my local machine.

wxchan (Contributor, Author) commented Oct 24, 2017

@guolinke it's float64 on my machine too, but it still fails. It possibly has something to do with np.loadtxt, because loading with load_svmlight_file passes the test.
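
For reference, a quick way to compare the two loading paths (file names hypothetical); the same dtype does not guarantee the same last-bit values as the C++ parser.

import numpy as np
from sklearn.datasets import load_svmlight_file

# Dense text load: values go through numpy's float parser.
dense = np.loadtxt("binary.train")
print(dense.dtype)  # float64

# Sparse libsvm-style load: values go through sklearn's parser.
X, y = load_svmlight_file("binary.train.svm")
print(X.dtype)  # float64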

wxchan (Contributor, Author) commented Oct 24, 2017

@guolinke it still fails if I init the dataset with a file path string.

guolinke (Collaborator) commented Oct 25, 2017

@wxchan I can pass all these tests except multi-class on my local machine.

guolinke (Collaborator):

@wxchan refer to the test result: https://travis-ci.org/Microsoft/LightGBM/jobs/292447965
I think the difference is very minor, so maybe we can use another solution to fix the test.
BTW, I tested the speed of strtod; it is about 1.5x slower... So I think we should not switch to it.

guolinke (Collaborator) commented Oct 25, 2017

@wxchan I don't know why these tests are still failing: https://travis-ci.org/Microsoft/LightGBM/jobs/292509708

they all pass on my local machine

wxchan (Contributor, Author) commented Oct 25, 2017

@guolinke how about allowing some tolerance in the equality check? I think both results are reasonable.
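
A sketch of what a tolerant comparison could look like (the prediction vectors here are hypothetical):

import numpy as np

# Hypothetical predictions from the CLI and python-package runs.
cpp_pred = np.array([0.1234567, 0.7654321])
py_pred = cpp_pred * (1 + 1e-9)  # tiny floating-point drift

# Instead of exact equality, allow a small relative/absolute tolerance.
np.testing.assert_allclose(cpp_pred, py_pred, rtol=1e-5, atol=1e-8)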

guolinke (Collaborator):

@wxchan I think we can check the model, instead of the prediction result

wxchan (Contributor, Author) commented Oct 25, 2017

@guolinke then we can't check the consistency of the prediction code.

guolinke (Collaborator):

@wxchan we can check it by using the same model. I remember the python-package can predict from a file, which calls the same cpp function.
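
A sketch of that approach (file names hypothetical): Booster.predict accepts a path to a text file, so both sides run the same C++ parsing and prediction code.

import numpy as np
import lightgbm as lgb

# Load the model produced by the CLI run.
bst = lgb.Booster(model_file="LightGBM_model.txt")

# Predicting from the file path uses the same C++ parser as the CLI,
# so the results should match bit-for-bit.
pred_from_file = bst.predict("binary.test")

# Predicting from an in-memory array uses numpy's parser instead.
data = np.loadtxt("binary.test")[:, 1:]  # drop the label column
pred_from_array = bst.predict(data)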

wxchan (Contributor, Author) commented Oct 25, 2017

@guolinke I tried that; strangely, it can't pass the tests either.

guolinke (Collaborator):

still the floating point issue? Did you try std::atof?

guolinke (Collaborator):

@wxchan I think my PR on your branch should pass the tests.
You can try it with the model comparison.

* update atof

* fix bug

* fix tests.

* fix bug

* fix dtypes

* fix categorical feature override

* fix protobuf on vs build (microsoft#1004)

* [optional] support protobuf

* fix windows/LightGBM.vcxproj

* add doc

* fix doc

* fix vs support (#2)

* fix vs support

* fix cmake

* fix microsoft#1012

* [python] add network config api  (microsoft#1019)

* add network

* update doc

* add float tolerance in bin finder.

* fix a bug

* update tests

* add double tolerance on tree model

* fix tests

* simplify the double comparison

* fix lightsvm zero base

* move double tolerance to the bin finder.

* fix pylint

StrikerRUS (Collaborator) commented Nov 4, 2017

@wxchan I think that setting any parameter's default value to None without an appropriate description (except the case when None means not using this parameter) is very confusing.

Speaking of random_state, can we place sklearn-specific parameters into the cpp alias_table too?

wxchan (Contributor, Author) commented Nov 4, 2017

@StrikerRUS it makes no difference. The difference now is: the seed in the native API has no default value, so the C++ default *_seed values are used; the sklearn API has a default seed=0, so those *_seed values will be set to other values.

I think users just want reproducible behavior from a seed; they just need to know whether the result is fixed or random.

wxchan (Contributor, Author) commented Nov 4, 2017

Speaking of this, a totally different subject: there was a feature request about supporting numpy RandomState for random_state. I think it's unnecessary and didn't have a good idea for it, but perhaps we can support it if it's simple. #730

StrikerRUS (Collaborator):

@wxchan The first paragraph of my comment was more about max_bin and other params.

Your idea about ignoring random_state if it's None and using the default *_seed values looks good to me: set sklearn default random_state=None, and filter it out if it's None.
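
A minimal sketch of that rule (simplified; the parameter name "seed" here is illustrative, not the actual wrapper's mapping):

def build_params(random_state=None, **kwargs):
    # Translate sklearn-style arguments into native parameters.
    params = dict(kwargs)
    if random_state is not None:
        # Only override the C++ default *_seed values when the user
        # explicitly asks for a fixed seed.
        params["seed"] = random_state
    return params

print(build_params())                 # {} -> C++ defaults used
print(build_params(random_state=42))  # {'seed': 42}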

wxchan (Contributor, Author) commented Nov 4, 2017

@StrikerRUS how about just deleting it from __init__? It can still be accessed via kwargs.

StrikerRUS (Collaborator):

@wxchan I think it'll be confusing: random_state is one of the standard (I don't know how else to name it...) sklearn parameters, and users will think it's not present in LightGBM at all if they don't see it in the constructor's signature.

@@ -555,8 +555,8 @@ def __pred_for_csc(self, csc, num_iteration, predict_type):
 class Dataset(object):
     """Dataset in LightGBM."""
-    def __init__(self, data, label=None, max_bin=255, reference=None,
-                 weight=None, group=None, silent=False,
+    def __init__(self, data, label=None, max_bin=None, reference=None,
Collaborator:

May I ask why you've changed max_bin to None in the public interface when it's actually still 255? I wrote in my previous comment that such cases are confusing. If you worry that you'll miss some places while changing the default value in Parameters.rst, just search GitHub for param=old_value.

Contributor Author:

it's set by @guolinke

Collaborator:

@StrikerRUS This is to solve the parameter priority problem. A user can set max_bin (and other parameters) in both the parameter dict and the function arguments, which causes a problem: we don't know which one has higher priority when the user sets it in both places. As a result, we can set the default function argument to None: if the user didn't set the function argument, we use the value in the parameter dict; otherwise we use the function argument.

We can add something like default=None (255 in c_api) to the python doc, to avoid an additional search for that parameter.
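
A minimal sketch of the priority rule described above (helper name hypothetical):

def resolve_param(name, arg_value, params, c_api_default):
    # Prefer an explicit function argument, then the params dict,
    # then the C API default.
    if arg_value is not None:
        params[name] = arg_value
    elif name not in params:
        params[name] = c_api_default
    return params

print(resolve_param("max_bin", None, {}, 255))               # {'max_bin': 255}
print(resolve_param("max_bin", None, {"max_bin": 63}, 255))  # {'max_bin': 63}
print(resolve_param("max_bin", 127, {"max_bin": 63}, 255))   # {'max_bin': 127}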

Collaborator:

@guolinke
I suppose it's obvious to users that a function argument shouldn't be duplicated in the param dict, and that a parameter in the signature, which has a default value and its own description, has higher priority than a parameter in the dict.

Maybe then it's better to completely remove max_bin from the arguments (mark it as deprecated for 1 or 2 releases)? Because it's odd that None actually means 255...

Collaborator:

some users like to put all parameters into dicts (like me), some like to put them into function arguments. I think setting the default to None is reasonable, since it is consistent with the CLI version.

Collaborator:

I think we can remove them from arguments as well

@@ -0,0 +1 @@
+2.0.10
Collaborator:

Why do we need a duplicate of this?
https://github.com/Microsoft/LightGBM/blob/master/VERSION.txt

Contributor Author:

I added it to git accidentally.

wxchan (Contributor, Author) commented Nov 4, 2017

the GPU test fails. Is there some parameter inconsistency?

StrikerRUS (Collaborator):

@wxchan Maybe because of this?

  • Question 5: When using LightGBM GPU, I cannot reproduce results over several runs.
  • Solution 5: It is a normal issue; there is nothing we/you can do about it. You may try to use gpu_use_dp = true for reproducibility (see #560). You may also use the CPU version.
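
A sketch of enabling double precision on GPU (requires a GPU build; the data here is synthetic):

import numpy as np
import lightgbm as lgb

X = np.random.rand(100, 5)
y = np.random.randint(0, 2, 100)
train_data = lgb.Dataset(X, label=y)

params = {
    "objective": "binary",
    "device": "gpu",     # requires LightGBM compiled with GPU support
    "gpu_use_dp": True,  # double precision -> reproducible, but slower
}
bst = lgb.train(params, train_data, num_boost_round=10)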

guolinke (Collaborator) commented Nov 5, 2017

@wxchan maybe you can try running the GPU version from Python twice and check the consistency. If the results are not the same, we can remove the consistency check from the GPU test.

wxchan closed this Nov 5, 2017
wxchan reopened this Nov 5, 2017

wxchan (Contributor, Author) commented Nov 5, 2017

Fixed by gpu_use_dp=true. @guolinke @StrikerRUS

@@ -633,7 +636,8 @@ def _lazy_init(self, data, label=None, max_bin=255, reference=None,
         params = {} if params is None else params
         self.max_bin = max_bin
         self.predictor = predictor
-        params["max_bin"] = max_bin
+        if self.max_bin is not None:
+            params["max_bin"] = max_bin
Collaborator:

@wxchan Please add a deprecation warning here.

StrikerRUS (Collaborator): Maybe then it's better to completely remove max_bin from the arguments (mark it as deprecated for 1 or 2 releases)? Because it's odd that None actually means 255...

guolinke (Member): I think we can remove them from arguments as well

wxchan (Contributor, Author) commented Nov 6, 2017

I am out right now. If there are no other issues, can we merge this first and fix it in a follow-up commit? @guolinke @StrikerRUS

guolinke (Collaborator) commented Nov 6, 2017

@wxchan sure.
@StrikerRUS you can open a new issue to track the None and default value problem.

StrikerRUS (Collaborator):

@wxchan @guolinke OK, will do it.

wxchan (Contributor, Author) commented Nov 9, 2017

@guolinke merge this now?

guolinke merged commit bc0579c into microsoft:master Nov 9, 2017
wxchan deleted the init_score branch Nov 9, 2017
lock bot locked as resolved and limited conversation to collaborators Mar 11, 2020