Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize binary_to_integer/1 and friends #7426

Merged
merged 6 commits into from Jun 28, 2023

Conversation

bjorng
Copy link
Contributor

@bjorng bjorng commented Jun 21, 2023

No description provided.

When compiling 64-bit programs, goth gcc and clang support an 128-bit
unsigned integer type. Using it to define the `ErtsDoubleDigit` type
will speed up multiplication and division.
@bjorng bjorng self-assigned this Jun 21, 2023
@bjorng bjorng added team:VM Assigned to OTP team VM enhancement testing currently being tested, tag is used by OTP internal CI labels Jun 21, 2023
@bjorng bjorng force-pushed the bjorn/binary_to_integer/otp26/OTP-18659 branch 2 times, most recently from 550768e to 858c4f1 Compare June 21, 2023 08:05
@@ -37,8 +37,11 @@ typedef Uint16 ErtsHalfDigit;
#undef BIG_HAVE_DOUBLE_DIGIT
typedef Uint16 ErtsHalfDigit;

#elif (SIZEOF_VOID_P == 8) && defined(__GNUC__) && (__GNUC__ >= 4)
typedef __uint128_t ErtsDoubleDigit;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this universally available on all 64-bit targets?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to be. I tried all 64-bit gcc and clang targets I could find on godbolt.org.

# define BIG_ARITY_MAX ((1 << 16)-1)
#else
# define BIG_ARITY_MAX ((1 << 17)-1)
#endif
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason for this restriction? The commit doesn't say.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and it looks a bit suspicious that the limit is larger on 64b architectures than on 32b architectures.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the same in bytes though, as these values are in words (pointer-sized integers).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We wanted to decrease the maximum execution time for a single arithmetic operation. We chose a new limit that we hope don't affect real applications of large integer arithmetic, but that will produce a system_limit exception faster for accidental use of large integers.

@bjorng
Copy link
Contributor Author

bjorng commented Jun 21, 2023

I've pushed a fixup commit that simplifies the last part of the Karatsuba calculation.

%% Too large even for base 2.
erlang:exit(system_limit);
Size > 1262611, Base >= 10 ->
erlang:exit(system_limit);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are these limits chosen?
Actually, what I'd like to know is this: what's the longest it can take (on a typical computer), within these limits?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well under a second, at least on 64-bit systems.

On my slowest computer (MacBook Pro from 2013, 2.6 GHz, Intel Core i5) the conversion finishes in about 0.63 seconds. On a HP laptop running Linux (11th Gen Intel, i7-1185G7 @ 3.00 GHz) the time is 0.34 seconds.

Acc
end.

combine(L0, Radix) ->
Copy link

@rgrig rgrig Jun 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much for rewriting it this way! It's nice to have better asymptotics.

I wonder if it would be faster to revert to linear scan for relatively short integers. It has worse asymptotics but maybe better constants and more cache friendliness.

What I mean is this: instead of starting with single digits, start with numbers made out of N digits (where N can be tuned based on benchmarks, but probably in the region of 10 or 100). And to get these N-digit numbers, do the linear loop (with result:=10*result+digit), as it was before in C.

Copy link

@rgrig rgrig Jun 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I see you start not from single digits but from what fits in a word. I suppose I'm asking if it's not worth doing linear scan algo on those "small" integers, as long as the scan isn't too long, because constants&cache.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The helper BIF will convert decimal numbers with up to 17 digits efficiently. We think that will take care of the majority of real-world use cases. Therefore, I have optimized the helper BIFs for correctness and simplicity.

However, after discussions with @jhogberg and @garazdawi I now think that spawning a temporary process is a bad idea and I have now eliminated that extra process. Not spawning the process will improve performance for binaries with more than 17 decimal digits.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fwiw, i tried my proposal on some toy code, and it's not only more complicated but also slower

The helper BIF will convert decimal numbers with up to 17 digits efficiently. We think that will take care of the majority of real-world use cases.

I think 64b integers are a common case, especially when interacting with other languages. Probably not worth optimizing for it at this point.

Not spawning the process will improve performance [...]

I didn't understand why there's a process spawned there, but I assumed it's my lack of Erlang knowledge. (Was it perhaps to make sure that timeouts apply to the main process even if the parsing process somwhow gets stuck?? Just a wild guess.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think 64b integers are a common case, especially when interacting with other languages. Probably not worth optimizing for it at this point.

If someone can show real-world use cases where integers whose absolute value are greater than 99999999999999999 are frequently used we could optimize this in OTP 27.

I didn't understand why there's a process spawned there, but I assumed it's my lack of Erlang knowledge. (Was it perhaps to make sure that timeouts apply to the main process even if the parsing process somwhow gets stuck?? Just a wild guess.)

It was an attempt at optimization (reducing garbage collection times), but after some more thinking it became clear that it in most cases it would be a pessimization.

@@ -1371,15 +1543,52 @@ list_to_float(_String) ->
%% list_to_integer/1
-spec list_to_integer(String) -> integer() when
String :: string().
list_to_integer(_String) ->
erlang:nif_error(undefined).
list_to_integer(String) ->
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious: Why is this not implemented as list_to_integer(String, 10)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would result in a stacktrace pointing out the wrong function if it fails.

{-1,""} = test_to_integer("-1"),
{1,"="} = test_to_integer("1="),
{7,"F"} = test_to_integer("7F"),
{709,""} = test_to_integer("709"),
{709,"*2"} = test_to_integer("709*2"),
{0,"xAB"} = test_to_integer("0xAB"),
{16,"#FF"} = test_to_integer("16#FF"),

%% Test a bignum.
Big = 12385792987438978973984398348974398593,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Would be nice to have more than one number. Maybe some powers of 2, powers of 10, etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

string:list_to_integer/1 is implemented using binary_to_integer/1, which is thoroughly tested by num_bif_SUITE. Here I just test that handling of signs and trailing non-digits are handled correctly.

@bjorng bjorng force-pushed the bjorn/binary_to_integer/otp26/OTP-18659 branch 2 times, most recently from 2523e03 to 354cfe9 Compare June 22, 2023 12:04
list_to_integer(String) ->
Base = 10,
case erts_internal:list_to_integer(String, Base) of
{Int,[]} ->
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it would make sense to just return the integer here to avoid the overhead of the unconditional allocation. The string function could wrap the result in a tuple, so we only "pay" for the allocation where necessary.
Shouldn't be a big impact, but for such low-level functions IMO every little thing matters

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a change I want to make now, in case someone is using the BIF directly despite it being undocumented.

@bjorng bjorng force-pushed the bjorn/binary_to_integer/otp26/OTP-18659 branch from 4d6c483 to a769c1f Compare June 27, 2023 05:23
This new 4 Mbit limit should still be sufficiently large for practical
applications.
These will be needed in a future commit when we will rewrite some
exsisting BIFs in Erlang.
Use a divide-and-conquer algorithm to lower the complexity for
the BIFs that convert binaries or strings to integers:

* erlang:list_to_integer/1
* erlang:binary_to_integer/1
* erlang:binary_to_integer/2
* erlang:list_to_integer/2
* string:list_to_integer/1
@bjorng bjorng force-pushed the bjorn/binary_to_integer/otp26/OTP-18659 branch from a769c1f to 86fedb2 Compare June 27, 2023 06:12
@bjorng bjorng merged commit a37afd1 into erlang:maint Jun 28, 2023
@bjorng bjorng deleted the bjorn/binary_to_integer/otp26/OTP-18659 branch September 18, 2023 06:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement team:VM Assigned to OTP team VM testing currently being tested, tag is used by OTP internal CI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants