Skip to content

Commit

Permalink
Fix parsing of non-finite values (#3942)
Browse files Browse the repository at this point in the history
* Fix index out-of-range exception generated by BaggingHelper on small datasets.

Prior to this change, the line "score_t threshold = tmp_gradients[top_k - 1];" would generate an exception, since tmp_gradients would be empty when the cnt input value to the function is zero.

* Update goss.hpp

* Update goss.hpp

* Add API method LGBM_BoosterPredictForMats which runs prediction on a data set given as of array of pointers to rows (as opposed to existing method LGBM_BoosterPredictForMat which requires data given as contiguous array)

* Fix incorrect upstream merge

* Add link to LightGBM.NET

* Fix indenting to 2 spaces

* Dummy edit to trigger CI

* Dummy edit to trigger CI

* remove duplicate functions from merge

* Fix parsing of non-finite values.  Current implementation silently returns zero when input string is "inf", "-inf", or "nan" when compiled with VS2017, so instead just explicitly check for these values and fail if there is no match.  No attempt to optimise string allocations in this implementation since it is usually rarely invoked.

* Dummy commit to trigger CI

* Also handle -nan in double parsing method

* Update include/LightGBM/utils/common.h

Remove trailing whitespace to pass linting tests

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

Co-authored-by: matthew-peacock <matthew.peacock@whiteoakam.com>
Co-authored-by: Guolin Ke <guolin.ke@outlook.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
  • Loading branch information
4 people committed Mar 16, 2021
1 parent b6e3076 commit 4e47c93
Showing 1 changed file with 13 additions and 5 deletions.
18 changes: 13 additions & 5 deletions include/LightGBM/utils/common.h
Expand Up @@ -1082,12 +1082,20 @@ struct __StringToTHelper<T, true> {
// Fast (common) path: For numeric inputs in RFC 7159 format:
const bool fast_parse_succeeded = fast_double_parser::parse_number(str.c_str(), &tmp);

// Rare path: Not in RFC 7159 format. Possible "inf", "nan", etc. Fallback to standard library:
// Rare path: Not in RFC 7159 format. Possible "inf", "nan", etc.
if (!fast_parse_succeeded) {
std::stringstream ss;
Common::C_stringstream(ss);
ss << str;
ss >> tmp;
std::string strlower(str);
std::transform(strlower.begin(), strlower.end(), strlower.begin(), [](int c) -> char { return static_cast<char>(::tolower(c)); });
if (strlower == std::string("inf"))
tmp = std::numeric_limits<double>::infinity();
else if (strlower == std::string("-inf"))
tmp = -std::numeric_limits<double>::infinity();
else if (strlower == std::string("nan"))
tmp = std::numeric_limits<double>::quiet_NaN();
else if (strlower == std::string("-nan"))
tmp = -std::numeric_limits<double>::quiet_NaN();
else
Log::Fatal("Failed to parse double: %s", str.c_str());
}

return static_cast<T>(tmp);
Expand Down

0 comments on commit 4e47c93

Please sign in to comment.