Improve conversion performance #60
Codecov Report
Base: 91.64% // Head: 88.15% // Project coverage decreases by 3.50%.
Coverage diff (main vs. #60)
Metric | main | #60 | +/- |
---|---|---|---|
Coverage | 91.64% | 88.15% | -3.50% |
Files | 5 | 6 | +1 |
Lines | 934 | 996 | +62 |
Hits | 856 | 878 | +22 |
Misses | 78 | 118 | +40 |
Force-pushed from cc594fd to 1140c95.
Performance charts before-and-after for the drop-in-replacement functions. Granted, some of the poor "before" performance came from bad decisions made in the C++ refactor, and the current C implementation is slightly better than those numbers, but the "after" is clearly much better than the current C implementation.
Python 3.7
Timing comparison of int functions
Before
Input type | builtin (ms) | fastnumbers (ms) |
---|---|---|
Small Int String | 14.588 ± 0.366 | 11.751 ± 0.390 |
Int String | 16.197 ± 2.065 | 16.573 ± 3.420 |
Large Int String | 23.700 ± 1.429 | 28.062 ± 0.614 |
Int | 9.575 ± 0.193 | 11.536 ± 3.479 |
Float | 20.762 ± 2.199 | 24.159 ± 0.449 |
After
Input type | builtin (ms) | fastnumbers (ms) |
---|---|---|
Small Int String | 14.186 ± 0.248 | 7.790 ± 0.138 |
Int String | 15.145 ± 0.093 | 10.171 ± 0.949 |
Medium Int String | 17.014 ± 0.271 | 10.367 ± 0.075 |
Large Int String | 22.850 ± 0.070 | 21.071 ± 0.489 |
Int | 9.258 ± 0.097 | 6.889 ± 0.079 |
Float | 19.593 ± 0.538 | 20.350 ± 0.188 |
Timing comparison of float functions
Before
Input type | builtin (ms) | fastnumbers (ms) |
---|---|---|
Small Int String | 12.193 ± 0.287 | 15.934 ± 2.573 |
Int String | 12.841 ± 2.024 | 17.251 ± 4.057 |
Large Int String | 38.674 ± 6.709 | 46.769 ± 2.007 |
Small Float String | 14.901 ± 4.136 | 15.891 ± 1.810 |
Float String | 31.982 ± 4.383 | 18.625 ± 1.801 |
Large Float String | 55.922 ± 3.676 | 64.327 ± 3.662 |
Int | 8.534 ± 0.885 | 11.937 ± 0.777 |
Float | 7.511 ± 0.249 | 9.281 ± 1.310 |
After
Input type | builtin (ms) | fastnumbers (ms) |
---|---|---|
Small Int String | 11.829 ± 0.092 | 9.585 ± 0.090 |
Int String | 12.384 ± 0.560 | 9.919 ± 0.055 |
Medium Int String | 13.359 ± 0.051 | 10.308 ± 0.167 |
Large Int String | 35.909 ± 7.434 | 14.476 ± 0.049 |
Small Float String | 12.384 ± 1.176 | 10.130 ± 0.288 |
Float String | 29.721 ± 0.108 | 11.757 ± 1.495 |
Large Float String | 52.927 ± 1.115 | 11.749 ± 0.270 |
Int | 8.040 ± 0.050 | 8.839 ± 0.059 |
Float | 7.213 ± 0.050 | 6.478 ± 0.057 |
Python 3.10
Timing comparison of int functions
Before
Input type | builtin (ms) | fastnumbers (ms) |
---|---|---|
Small Int String | 9.833 ± 1.317 | 11.301 ± 0.952 |
Int String | 10.943 ± 0.125 | 13.034 ± 0.129 |
Large Int String | 17.007 ± 0.399 | 24.291 ± 0.805 |
Int | 6.068 ± 0.049 | 10.130 ± 0.627 |
Float | 15.456 ± 0.386 | 22.657 ± 0.311 |
After
Input type | builtin (ms) | fastnumbers (ms) |
---|---|---|
Small Int String | 9.631 ± 0.476 | 7.466 ± 0.151 |
Int String | 10.730 ± 0.715 | 9.469 ± 0.185 |
Medium Int String | 11.573 ± 0.726 | 9.414 ± 0.098 |
Large Int String | 16.108 ± 0.168 | 18.211 ± 0.764 |
Int | 5.848 ± 0.814 | 6.317 ± 0.146 |
Float | 15.029 ± 0.244 | 19.011 ± 0.175 |
Timing comparison of float functions
Before
Input type | builtin (ms) | fastnumbers (ms) |
---|---|---|
Small Int String | 8.967 ± 0.509 | 14.056 ± 1.034 |
Int String | 9.648 ± 0.184 | 14.064 ± 0.207 |
Large Int String | 28.892 ± 1.026 | 40.313 ± 0.423 |
Small Float String | 8.720 ± 0.081 | 13.713 ± 0.017 |
Float String | 26.218 ± 0.220 | 15.497 ± 0.035 |
Large Float String | 50.403 ± 0.354 | 61.001 ± 0.382 |
Int | 4.739 ± 0.048 | 10.855 ± 0.017 |
Float | 3.968 ± 0.051 | 8.958 ± 0.493 |
After
Input type | builtin (ms) | fastnumbers (ms) |
---|---|---|
Small Int String | 8.512 ± 0.565 | 9.487 ± 0.746 |
Int String | 9.216 ± 0.227 | 9.715 ± 0.150 |
Medium Int String | 10.486 ± 0.162 | 10.028 ± 0.087 |
Large Int String | 28.977 ± 1.327 | 14.435 ± 0.182 |
Small Float String | 8.719 ± 0.304 | 9.946 ± 0.652 |
Float String | 26.140 ± 0.447 | 11.401 ± 0.158 |
Large Float String | 52.146 ± 1.277 | 12.464 ± 0.450 |
Int | 4.869 ± 0.045 | 9.016 ± 0.121 |
Float | 4.060 ± 0.064 | 6.736 ± 0.116 |
So I am removing those as a means of communicating timing data. Instead I will just generate markdown directly and store this in the repo.
The previous implementation had as its first priority not invoking the Python exception mechanism, as this is a bit expensive, especially if one is just going to unset it anyway. This optimized the "fail" path at the expense of the numeric path. The logic has been re-thought to optimize for the "happy" path first, and to assume that an acceptable trade-off for very fast conversions is to have the "fail" path take a little bit longer than before.
Scope of the changes
- Remove "is_likely" and "might_overflow" pre-checks as they are no longer needed
- Remove the Parser's as_int and as_float methods - we now always just return Python objects
- Remove Python's string to double conversion since the C++ parser is now robust enough
- Use the integer parser to examine the validity of a string, instead of using a separate full string checker
- Add some "peek" methods to the Parser to see if the contained value is of a certain type without having to do a full inspection
NOTE: THERE IS A LIE ABOVE!! This commit does not change the C++ string-to-double parser to the more robust version, so many tests fail. The next commit will implement that change.
Instead of relying 100% on the Python converter for these integers, use std::from_chars, which is fast and reports overflow. That report can be used to determine whether we must fall back on the Python converter.
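To illustrate the overflow feedback mentioned above, here is a minimal C++17 sketch. The names `IntParse` and `parse_int64` are hypothetical and are not taken from the fastnumbers source; the sketch only shows how `std::from_chars` distinguishes "valid", "overflowed", and "not an integer", so that only the overflow case needs the slower arbitrary-precision converter.

```cpp
#include <cassert>
#include <charconv>
#include <cstdint>
#include <string_view>
#include <system_error>

// Outcome of the fast path (hypothetical names, for illustration only).
enum class IntParse { Ok, Overflow, Invalid };

// Try to parse all of `text` as a base-10 signed 64-bit integer.
// Note: std::from_chars accepts a leading '-' but not '+' or whitespace,
// so a real caller would have to strip those beforehand.
inline IntParse parse_int64(std::string_view text, std::int64_t& out) {
    const char* first = text.data();
    const char* last = first + text.size();
    const auto [ptr, ec] = std::from_chars(first, last, out, 10);
    if (ec == std::errc::result_out_of_range) {
        // Digits were found but the value does not fit: this is the case
        // where one would fall back on the Python converter.
        return IntParse::Overflow;
    }
    if (ec != std::errc() || ptr != last) {
        return IntParse::Invalid;  // no digits at all, or trailing garbage
    }
    return IntParse::Ok;
}

inline void example_from_chars() {
    std::int64_t value = 0;
    assert(parse_int64("-42", value) == IntParse::Ok && value == -42);
    assert(parse_int64("99999999999999999999", value) == IntParse::Overflow);
    assert(parse_int64("12a", value) == IntParse::Invalid);
}
```

In this sketch an overflowing string with trailing garbage is still reported as `Overflow`; the fallback converter would then be responsible for rejecting it.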
Much of the character "parsing" code has been micro-optimized.
- When comparing against characters, instead of checking both upper- and lower-case, we use a bit-hack to force the character to lower-case and make just one comparison
- Testing for whitespace and digits is now done with lookup tables
- Conversion from character to digit is now done with a lookup table
- All these functions are now constexpr (doesn't help performance, but it makes me feel better :) )
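The micro-optimizations listed above could look roughly like the following sketch. This is not the PR's code: the table layout and the names `ByteTable`, `is_digit`, `to_digit`, `is_whitespace`, and `equals_ignore_case` are assumptions made for illustration. The case-insensitive comparison relies on the fact that in ASCII, setting bit `0x20` maps 'A'..'Z' onto 'a'..'z'.

```cpp
#include <cassert>
#include <cstdint>

// One entry per possible byte value.
struct ByteTable {
    std::int8_t value[256];
};

// Digit value for '0'..'9', and -1 for every other byte.
constexpr ByteTable make_digit_table() {
    ByteTable t{};
    for (int i = 0; i < 256; ++i) t.value[i] = -1;
    for (int d = 0; d < 10; ++d) t.value['0' + d] = static_cast<std::int8_t>(d);
    return t;
}

// 1 for ASCII whitespace, 0 for every other byte.
constexpr ByteTable make_space_table() {
    ByteTable t{};  // zero-initialized, i.e. "not whitespace"
    const char spaces[] = {' ', '\t', '\n', '\v', '\f', '\r'};
    for (char c : spaces) t.value[static_cast<unsigned char>(c)] = 1;
    return t;
}

constexpr ByteTable DIGIT_TABLE = make_digit_table();
constexpr ByteTable SPACE_TABLE = make_space_table();

constexpr bool is_digit(char c) {
    return DIGIT_TABLE.value[static_cast<unsigned char>(c)] >= 0;
}

// Returns 0..9 for a digit character and -1 otherwise.
constexpr int to_digit(char c) {
    return DIGIT_TABLE.value[static_cast<unsigned char>(c)];
}

constexpr bool is_whitespace(char c) {
    return SPACE_TABLE.value[static_cast<unsigned char>(c)] != 0;
}

// Single comparison instead of (c == 'e' || c == 'E').
// Only valid when `lower` is a lowercase ASCII letter.
constexpr bool equals_ignore_case(char c, char lower) {
    return (c | 0x20) == lower;
}

// Because everything is constexpr, the checks can run at compile time.
static_assert(to_digit('7') == 7, "digit lookup");
static_assert(equals_ignore_case('E', 'e'), "upper-case matches");

inline void example_char_helpers() {
    assert(is_digit('5') && !is_digit('a'));
    assert(is_whitespace('\t') && !is_whitespace('x'));
    assert(equals_ignore_case('e', 'e') && !equals_ignore_case('f', 'e'));
}
```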
After re-writing parse_int and parse_float, a lot of the helpers that had existed are no longer needed, so the parsing file has been cleaned up. As a plus, it was found that we were not being safe about parsing: we have to check that we are not at the end of the char array before we evaluate a character. So now parsing is safer.
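The end-of-array check described above can be sketched as follows. The helper names `consume_sign` and `consume_digits` are hypothetical; the point is only that `pos != end` is always tested before `*pos` is evaluated.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Consume an optional leading sign. Returns true if the sign was '-'.
// The bounds test comes first, so `*pos` is never read at the end.
inline bool consume_sign(const char*& pos, const char* end) {
    if (pos != end && (*pos == '-' || *pos == '+')) {
        const bool negative = (*pos == '-');
        ++pos;
        return negative;
    }
    return false;
}

// Advance past consecutive ASCII digits and report how many were seen.
inline std::size_t consume_digits(const char*& pos, const char* end) {
    const char* start = pos;
    while (pos != end && *pos >= '0' && *pos <= '9') {
        ++pos;
    }
    return static_cast<std::size_t>(pos - start);
}

inline void example_safe_parsing() {
    const std::string_view text = "-12";
    const char* pos = text.data();
    const char* end = pos + text.size();
    assert(consume_sign(pos, end));         // saw '-'
    assert(consume_digits(pos, end) == 2);  // saw "12"
    assert(pos == end);                     // nothing left, and no over-read
}
```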
Instead of creating a Python float object, checking whether it is int-like, and then creating a Python long object from it, the code now determines up front which of the two objects should be created.
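As a rough sketch of deciding the result type before constructing anything, the hypothetical `classify` function below inspects a parsed `double` and returns either an integer or a float alternative. It stands in for the choice between a Python long and a Python float; the real code works with Python objects, and integral values outside the 64-bit range would need a big-integer path that this sketch simply leaves as `double`.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <variant>

// Stand-in for "Python long or Python float" (assumption for illustration).
using Number = std::variant<std::int64_t, double>;

inline Number classify(double x) {
    constexpr double limit = 9223372036854775808.0;  // 2^63
    const bool int_like = std::isfinite(x) && std::floor(x) == x;
    // The range test keeps the cast below well-defined.
    if (int_like && x >= -limit && x < limit) {
        return static_cast<std::int64_t>(x);
    }
    return x;
}

inline void example_classify() {
    assert(std::holds_alternative<std::int64_t>(classify(3.0)));
    assert(std::holds_alternative<double>(classify(3.5)));
}
```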
Code has been added that can parse 8 characters at a time.
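For readers unfamiliar with the idea, one well-known way to handle 8 characters at once is the SWAR ("SIMD within a register") trick sketched below. Whether the PR uses exactly this formulation is an assumption; the function names are illustrative, and the sketch assumes a little-endian machine.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Load 8 bytes into one 64-bit word (little-endian layout assumed).
inline std::uint64_t load8(const char* p) {
    std::uint64_t word;
    std::memcpy(&word, p, sizeof(word));
    return word;
}

// True if every one of the 8 bytes is an ASCII digit ('0'..'9').
inline bool is_eight_digits(std::uint64_t word) {
    return ((word & 0xF0F0F0F0F0F0F0F0ULL) |
            (((word + 0x0606060606060606ULL) & 0xF0F0F0F0F0F0F0F0ULL) >> 4)) ==
           0x3333333333333333ULL;
}

// Convert 8 ASCII digits held in one word into their numeric value.
inline std::uint32_t eight_digits_to_value(std::uint64_t word) {
    const std::uint64_t mask = 0x000000FF000000FFULL;
    const std::uint64_t mul1 = 0x000F424000000064ULL;  // 100 + (1000000 << 32)
    const std::uint64_t mul2 = 0x0000271000000001ULL;  // 1 + (10000 << 32)
    word -= 0x3030303030303030ULL;       // '0'..'9' -> 0..9 in every byte
    word = (word * 10) + (word >> 8);    // combine neighbouring digit pairs
    word = (((word & mask) * mul1) + (((word >> 16) & mask) * mul2)) >> 32;
    return static_cast<std::uint32_t>(word);
}

// Parse exactly 8 digits starting at `p`; the caller must guarantee that
// at least 8 readable bytes are available.
inline bool parse_eight_digits(const char* p, std::uint32_t& out) {
    const std::uint64_t word = load8(p);
    if (!is_eight_digits(word)) {
        return false;
    }
    out = eight_digits_to_value(word);
    return true;
}

inline void example_eight_digits() {
    std::uint32_t value = 0;
    assert(parse_eight_digits("12345678", value) && value == 12345678u);
    assert(!parse_eight_digits("1234x678", value));
}
```

Strings shorter than 8 characters, or a tail of fewer than 8 digits, would still go through a character-by-character loop.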
This uses the "vector call" functionality to call C functions. It requires fewer memory allocations by Python to call the function and thus is a bit faster. Unsurprisingly, this gave a 1-2 microsecond speedup across the board in the performance tests I ran. Unfortunately, Python does not ship a parser for this, so I adapted the parser from the numpy project. This closes #59.
dig, max_exp, and min_exp no longer have any meaning since fast_float::from_chars was introduced. max_int_len will not have any meaning in a future commit.
Not a performance issue, but it still needed to be fixed.
Python uses long to store non-big integers internally, so originally that is what was being used to parse non-big integers. Some compilers (e.g. MSVC) still have a 32-bit long. So it is more efficient to always use a 64-bit int, so that our fast parser is used whenever possible, and then let Python decide how to represent the value internally.
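The practical difference between a 32-bit `long` and a 64-bit integer can be seen from how many decimal digits are guaranteed to fit. The sketch below uses `std::numeric_limits<T>::digits10` for this; `max_safe_digits` and `fits_fast_path` are hypothetical helpers, not names from the fastnumbers source, and the digit-count gate is only one possible way to decide when a fast parser can be used.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Number of decimal digits that always fit in T without overflow.
template <typename T>
constexpr std::size_t max_safe_digits() {
    return static_cast<std::size_t>(std::numeric_limits<T>::digits10);
}

// Hypothetical gate: a digit string this short cannot overflow int64_t.
constexpr bool fits_fast_path(std::size_t digit_count) {
    return digit_count <= max_safe_digits<std::int64_t>();
}

static_assert(max_safe_digits<std::int32_t>() == 9, "32-bit: 9 digits always fit");
static_assert(max_safe_digits<std::int64_t>() == 18, "64-bit: 18 digits always fit");

inline void example_safe_digits() {
    assert(fits_fast_path(18));
    assert(!fits_fast_path(19));
}
```

With a 32-bit `long`, only strings of up to 9 digits would be guaranteed to stay on the fast path, versus 18 digits for a 64-bit integer.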
The old and new functions have different allow_underscores defaults, so we have to normalize them in order to do the tests.
Force-pushed from 1140c95 to e8ef8c7.
In addition to #57 and #59, this PR optimizes performance across many aspects of fastnumbers. Some enhancements were algorithmic, some came from using more efficient libraries, and some were micro-optimizations and bithacks. This PR also solves #28 by using a more robust string to double conversion function (the one mentioned in #57).