Improve conversion performance #60

Merged
merged 15 commits into from Jan 31, 2023

Conversation

SethMMorton
Owner

In addition to #57, #59, this PR optimizes performance across many aspects of fastnumbers. Some enhancements were algorithmic, some were utilizing more efficient libraries, and some were micro-optimizations and bithacks.

This PR also solves #28 by using a more robust string to double conversion function (the one mentioned in #57).

@codecov

codecov bot commented Jan 30, 2023

Codecov Report

Base: 91.64% // Head: 88.15% // Decreases project coverage by -3.50% ⚠️

Coverage data is based on head (e8ef8c7) compared to base (c14959f).
Patch coverage: 85.93% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #60      +/-   ##
==========================================
- Coverage   91.64%   88.15%   -3.50%     
==========================================
  Files           5        6       +1     
  Lines         934      996      +62     
==========================================
+ Hits          856      878      +22     
- Misses         78      118      +40     
| Impacted Files | Coverage Δ |
|---|---|
| src/cpp/extractor.cpp | 89.74% <ø> (ø) |
| src/cpp/argparse.cpp | 71.24% <71.24%> (ø) |
| src/cpp/fastnumbers.cpp | 84.88% <96.92%> (-0.12%) ⬇️ |
| src/cpp/c_str_parsing.cpp | 99.36% <100.00%> (-0.15%) ⬇️ |
| src/cpp/parser.cpp | 93.50% <100.00%> (+1.19%) ⬆️ |


@SethMMorton SethMMorton force-pushed the improve-conversion-performance branch 11 times, most recently from cc594fd to 1140c95 Compare January 31, 2023 01:22
@SethMMorton
Owner Author

Before-and-after performance tables for the drop-in-replacement functions. Granted, some of the poor "before" performance came from bad decisions made in the C++ refactor, and the current C implementation is slightly better, but the "after" is still much better than the current C implementation.

Python 3.7

Timing comparison of int functions

Before

| Input type | builtin (ms) | fastnumbers (ms) |
|---|---|---|
| Small Int String | 14.588 ± 0.366 | 11.751 ± 0.390 |
| Int String | 16.197 ± 2.065 | 16.573 ± 3.420 |
| Large Int String | 23.700 ± 1.429 | 28.062 ± 0.614 |
| Int | 9.575 ± 0.193 | 11.536 ± 3.479 |
| Float | 20.762 ± 2.199 | 24.159 ± 0.449 |

After

| Input type | builtin (ms) | fastnumbers (ms) |
|---|---|---|
| Small Int String | 14.186 ± 0.248 | 7.790 ± 0.138 |
| Int String | 15.145 ± 0.093 | 10.171 ± 0.949 |
| Medium Int String | 17.014 ± 0.271 | 10.367 ± 0.075 |
| Large Int String | 22.850 ± 0.070 | 21.071 ± 0.489 |
| Int | 9.258 ± 0.097 | 6.889 ± 0.079 |
| Float | 19.593 ± 0.538 | 20.350 ± 0.188 |

Timing comparison of float functions

Before

| Input type | builtin (ms) | fastnumbers (ms) |
|---|---|---|
| Small Int String | 12.193 ± 0.287 | 15.934 ± 2.573 |
| Int String | 12.841 ± 2.024 | 17.251 ± 4.057 |
| Large Int String | 38.674 ± 6.709 | 46.769 ± 2.007 |
| Small Float String | 14.901 ± 4.136 | 15.891 ± 1.810 |
| Float String | 31.982 ± 4.383 | 18.625 ± 1.801 |
| Large Float String | 55.922 ± 3.676 | 64.327 ± 3.662 |
| Int | 8.534 ± 0.885 | 11.937 ± 0.777 |
| Float | 7.511 ± 0.249 | 9.281 ± 1.310 |

After

| Input type | builtin (ms) | fastnumbers (ms) |
|---|---|---|
| Small Int String | 11.829 ± 0.092 | 9.585 ± 0.090 |
| Int String | 12.384 ± 0.560 | 9.919 ± 0.055 |
| Medium Int String | 13.359 ± 0.051 | 10.308 ± 0.167 |
| Large Int String | 35.909 ± 7.434 | 14.476 ± 0.049 |
| Small Float String | 12.384 ± 1.176 | 10.130 ± 0.288 |
| Float String | 29.721 ± 0.108 | 11.757 ± 1.495 |
| Large Float String | 52.927 ± 1.115 | 11.749 ± 0.270 |
| Int | 8.040 ± 0.050 | 8.839 ± 0.059 |
| Float | 7.213 ± 0.050 | 6.478 ± 0.057 |

Python 3.10

Timing comparison of int functions

Before

| Input type | builtin (ms) | fastnumbers (ms) |
|---|---|---|
| Small Int String | 9.833 ± 1.317 | 11.301 ± 0.952 |
| Int String | 10.943 ± 0.125 | 13.034 ± 0.129 |
| Large Int String | 17.007 ± 0.399 | 24.291 ± 0.805 |
| Int | 6.068 ± 0.049 | 10.130 ± 0.627 |
| Float | 15.456 ± 0.386 | 22.657 ± 0.311 |

After

| Input type | builtin (ms) | fastnumbers (ms) |
|---|---|---|
| Small Int String | 9.631 ± 0.476 | 7.466 ± 0.151 |
| Int String | 10.730 ± 0.715 | 9.469 ± 0.185 |
| Medium Int String | 11.573 ± 0.726 | 9.414 ± 0.098 |
| Large Int String | 16.108 ± 0.168 | 18.211 ± 0.764 |
| Int | 5.848 ± 0.814 | 6.317 ± 0.146 |
| Float | 15.029 ± 0.244 | 19.011 ± 0.175 |

Timing comparison of float functions

Before

| Input type | builtin (ms) | fastnumbers (ms) |
|---|---|---|
| Small Int String | 8.967 ± 0.509 | 14.056 ± 1.034 |
| Int String | 9.648 ± 0.184 | 14.064 ± 0.207 |
| Large Int String | 28.892 ± 1.026 | 40.313 ± 0.423 |
| Small Float String | 8.720 ± 0.081 | 13.713 ± 0.017 |
| Float String | 26.218 ± 0.220 | 15.497 ± 0.035 |
| Large Float String | 50.403 ± 0.354 | 61.001 ± 0.382 |
| Int | 4.739 ± 0.048 | 10.855 ± 0.017 |
| Float | 3.968 ± 0.051 | 8.958 ± 0.493 |

After

| Input type | builtin (ms) | fastnumbers (ms) |
|---|---|---|
| Small Int String | 8.512 ± 0.565 | 9.487 ± 0.746 |
| Int String | 9.216 ± 0.227 | 9.715 ± 0.150 |
| Medium Int String | 10.486 ± 0.162 | 10.028 ± 0.087 |
| Large Int String | 28.977 ± 1.327 | 14.435 ± 0.182 |
| Small Float String | 8.719 ± 0.304 | 9.946 ± 0.652 |
| Float String | 26.140 ± 0.447 | 11.401 ± 0.158 |
| Large Float String | 52.146 ± 1.277 | 12.464 ± 0.450 |
| Int | 4.869 ± 0.045 | 9.016 ± 0.121 |
| Float | 4.060 ± 0.064 | 6.736 ± 0.116 |

So I am removing those as a means of communicating timing data. Instead
I will just generate markdown directly and store this in the repo.

The previous implementation had as its first priority not invoking
the Python exception mechanism, as this is a bit expensive, especially
if one is just going to unset it anyway. This optimized the "fail" path
at the expense of the numeric path.

The logic has been re-thought to optimize for the "happy" path first,
and assume that an acceptable trade-off for very fast conversions is to
have the "fail" path take a little bit longer than before.

Scope of the changes
- Remove "is_likely" and "might_overflow" pre-checks as they are no
  longer needed
- Remove the Parser's as_int and as_float methods - we now always
  return Python objects
- Remove Python's string to double conversion since the C++ parser is
  now robust enough
- Use the integer parser to examine the validity of a string, instead of
  using a separate full string checker
- Add some "peek" methods to the Parser to see if the contained value is
  of a certain type without having to do a full inspection

NOTE: THERE IS A LIE ABOVE!!

This commit does not change the C++ string-to-double parser to the more
robust version, so many tests fail. The next commit will implement that
change.

This is a very fast implementation of string to double conversion that
is also very accurate. WIN WIN.

This closes #57 and closes #28.

Instead of 100% relying on the Python converter for these integers, use
std::from_chars which is fast and provides feedback on overflow, which
can be used to determine if we must fall back on the Python converter.
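
As a rough illustration of the overflow feedback described above, here is a minimal sketch built on std::from_chars (C++17). The `IntParse` enum and the `parse_int64` helper are hypothetical names made up for this example, not code from this PR. Note that std::from_chars accepts a leading minus sign but not a leading plus sign or whitespace, so a real parser would have to handle those separately.

```cpp
#include <charconv>
#include <cstdint>
#include <string_view>
#include <system_error>

// Hypothetical result categories; these names are not from fastnumbers.
enum class IntParse { Ok, Overflow, Invalid };

// Try the fast path: parse the entire string as a base-10 64-bit integer.
inline IntParse parse_int64(std::string_view text, std::int64_t& out) {
    const char* first = text.data();
    const char* last = first + text.size();
    const auto result = std::from_chars(first, last, out, 10);
    if (result.ptr != last) {
        return IntParse::Invalid;   // no digits at all, or trailing junk
    }
    if (result.ec == std::errc::result_out_of_range) {
        return IntParse::Overflow;  // all digits, but too big for 64 bits
    }
    return result.ec == std::errc() ? IntParse::Ok : IntParse::Invalid;
}
```

A caller would only hand the string to the slower arbitrary-precision converter when the result is `IntParse::Overflow`.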
Much of the character "parsing" code has been micro-optimized.

- When comparing against characters, instead of looking at both upper-
  and lower-case, we use a bit-hack to force to lower and just make one
  comparison
- Testing for whitespace and digits is now done with lookup tables
- Conversion from character to a digit is now done with a lookup table
- All these functions are now constexpr (doesn't help performance, but
  it makes me feel better :) )
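
The micro-optimizations listed above could look roughly like the following sketch (C++17). All names here (`is_letter_ci`, `DIGIT_TABLE`, `to_digit`, `is_whitespace`) are hypothetical and chosen for illustration; the actual helpers in the PR may be structured differently.

```cpp
#include <array>
#include <cstdint>

// Setting bit 0x20 maps 'A'-'Z' onto 'a'-'z' and leaves 'a'-'z' unchanged,
// so one comparison covers both cases. `lower` must be a lowercase letter.
constexpr bool is_letter_ci(char c, char lower) noexcept {
    return (static_cast<unsigned char>(c) | 0x20u)
        == static_cast<unsigned char>(lower);
}

// 256-entry table: the digit value for '0'-'9' and -1 for everything else.
constexpr std::array<std::int8_t, 256> make_digit_table() noexcept {
    std::array<std::int8_t, 256> table{};
    for (auto& entry : table) { entry = -1; }
    for (int d = 0; d < 10; ++d) { table['0' + d] = static_cast<std::int8_t>(d); }
    return table;
}

// 256-entry table: true for the six ASCII whitespace characters.
constexpr std::array<bool, 256> make_space_table() noexcept {
    std::array<bool, 256> table{};
    for (char c : {' ', '\t', '\n', '\v', '\f', '\r'}) {
        table[static_cast<unsigned char>(c)] = true;
    }
    return table;
}

inline constexpr auto DIGIT_TABLE = make_digit_table();
inline constexpr auto SPACE_TABLE = make_space_table();

// Character-to-digit conversion and classification become single lookups.
constexpr int to_digit(char c) noexcept {
    return DIGIT_TABLE[static_cast<unsigned char>(c)];
}
constexpr bool is_digit(char c) noexcept { return to_digit(c) >= 0; }
constexpr bool is_whitespace(char c) noexcept {
    return SPACE_TABLE[static_cast<unsigned char>(c)];
}

// Because everything is constexpr, it can be checked at compile time.
static_assert(to_digit('7') == 7);
static_assert(is_letter_ci('E', 'e'));
```

The cast to `unsigned char` before indexing matters: a plain `char` may be signed, and a negative index would read outside the table.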
After re-writing parse_int and parse_float, a lot of the helpers that
had existed are no longer needed. The parsing file is now cleaned up.

As a plus, it was found that we were not being safe about parsing, and
we have to check we are not at the end of the char array when we
evaluate a character. So now parsing is safer.

Instead of creating a Python float object, then seeing if it is int-like
and then creating a Python long object from there, the code now checks
up front which of these two objects should be created.
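
A minimal sketch of that "decide first, create once" idea, assuming the parsed value is already available as a C++ double. `ObjectKind` and `kind_to_create` are hypothetical names for this example; the PR's own check may use different criteria.

```cpp
#include <cmath>

// Hypothetical tag for which Python object the caller should build.
enum class ObjectKind { Int, Float };

// Decide from the raw double alone, before any Python object exists.
// A value is int-like when it is finite and has no fractional part.
inline ObjectKind kind_to_create(double value) noexcept {
    const bool int_like = std::isfinite(value) && std::trunc(value) == value;
    return int_like ? ObjectKind::Int : ObjectKind::Float;
}
```

The caller then constructs exactly one Python object instead of allocating a float only to discard it in favor of an int.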
Code has been added that can parse 8 characters at a time.
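
For readers unfamiliar with the idea, one widely used way to handle 8 characters at once is a SWAR ("SIMD within a register") trick on a 64-bit word. The sketch below shows that general technique; it is not confirmed to be the exact code added in this PR, the function names are hypothetical, and `parse_eight_digits` assumes a little-endian machine.

```cpp
#include <cstdint>
#include <cstring>

// True if the 8 bytes starting at `chars` are all ASCII digits '0'-'9'.
inline bool is_eight_digits(const char* chars) noexcept {
    std::uint64_t val;
    std::memcpy(&val, chars, sizeof(val));
    // Each byte must have high nibble 3, and still have it after adding 6.
    return ((val & 0xF0F0F0F0F0F0F0F0ULL) |
            (((val + 0x0606060606060606ULL) & 0xF0F0F0F0F0F0F0F0ULL) >> 4))
        == 0x3333333333333333ULL;
}

// Convert 8 ASCII digits to their numeric value.
// Assumes little-endian byte order and that is_eight_digits(chars) is true.
inline std::uint32_t parse_eight_digits(const char* chars) noexcept {
    std::uint64_t val;
    std::memcpy(&val, chars, sizeof(val));
    // Combine neighbors: 8 digits -> 4 two-digit values.
    val = (val & 0x0F0F0F0F0F0F0F0FULL) * 2561 >> 8;
    // 4 two-digit values -> 2 four-digit values.
    val = (val & 0x00FF00FF00FF00FFULL) * 6553601 >> 16;
    // 2 four-digit values -> 1 eight-digit value.
    return static_cast<std::uint32_t>(
        (val & 0x0000FFFF0000FFFFULL) * 42949672960001ULL >> 32);
}
```

A parser would fall back to one-character-at-a-time processing whenever fewer than 8 bytes remain or `is_eight_digits` returns false.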
This uses the "vector call" functionality to call C functions. It
requires fewer memory allocations by Python to call the function and
thus is a bit faster. Unsurprisingly, this gave a 1-2 microsecond
speedup across the board in the performance tests I ran.

Unfortunately, Python does not ship a parser for this, so I adapted the
parser from the numpy project.

This closes #59.

dig, max_exp, and min_exp no longer have any meaning since
fast_float::from_chars was introduced. max_int_len will not have any
meaning in a future commit.

Not a performance issue, but still needed to be fixed.

Python uses long to store non-big integers internally, so originally
that is what was being used to parse non-big integers. Some compilers
(MSVC) still have a 32-bit long. So, it is more efficient to always use
a 64-bit int so the fast parser handles as many values as possible, then
let Python decide how to represent the result internally.
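
To make the width argument concrete, the sketch below compares how many decimal digits are guaranteed to fit in a 32-bit versus a 64-bit integer, using std::numeric_limits. `max_safe_digits` is a hypothetical helper for this example only; it is not the `max_int_len` value mentioned above.

```cpp
#include <cstdint>
#include <limits>

// Number of decimal digits that always fit in T without overflow.
template <typename T>
constexpr int max_safe_digits() noexcept {
    return std::numeric_limits<T>::digits10;
}

// A 32-bit integer (the width of `long` on MSVC) is only safe up to
// 9 digits, while a 64-bit integer is safe up to 18 digits on every platform.
static_assert(max_safe_digits<std::int32_t>() == 9);
static_assert(max_safe_digits<std::int64_t>() == 18);
```

With a fixed 64-bit type, strings of up to 18 digits can stay on the fast path regardless of how wide the compiler makes `long`.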
The old and new functions have different allow_underscores defaults, so
the tests have to normalize this setting before comparing them.
@SethMMorton SethMMorton merged commit d7e41fa into main Jan 31, 2023
@SethMMorton SethMMorton deleted the improve-conversion-performance branch January 31, 2023 04:26