Original reporter: jessestimpson
Affected version: OTP-20.0
Fixed in version: OTP-20.0
Component: stdlib
Migrated from: https://bugs.erlang.org/browse/ERL-366
While profiling my application, which handles many large base64 binaries from an external entity (it's an SMTP server), I noticed that base64:mime_decode/1 is significantly slower than base64:decode/1.
On my local machine, I produced the following timings with a 400000-byte base64-encoded binary. I decoded it 1000 times with each function and computed the average.
Function                | Avg exec   | Throughput
------------------------+------------+---------------
base64:decode/1         | 0.017 s    | 22.502 MiB/s
base64:mime_decode/1    | 0.029 s    | 12.953 MiB/s
CPU: 2.6 GHz Intel Core i7
Memory: 16 GB 1600 MHz DDR3
OS: macOS 10.11.4
Erlang: Erlang/OTP 20 [DEVELOPMENT] [erts-8.2.2].
These numbers show base64:mime_decode/1 reaching only about 58% of the throughput of base64:decode/1. I've attached the module I used to produce these results.
I dove into the source and saw that mime_decode/1 isn't doing much more work than decode/1, but its implementation is structurally different. My hunch is that the compiler is unable to fully optimize the implementation as it is currently structured.
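The attached module is not reproduced in this thread. For readers who want to run a comparable measurement, here is a minimal sketch. The module name b64_bench and its functions are hypothetical, and the input is a repeated "abc" pattern rather than real mail data, so absolute numbers will differ from the ones reported above.

```erlang
-module(b64_bench).
-export([run/0, run/2, measure/3]).

%% Hypothetical harness; NOT the module attached to this report.
%% Defaults mirror the setup described above: a 400000-byte encoded
%% binary, decoded 1000 times per function.
run() ->
    run(400000, 1000).

run(EncodedSize, Iterations) ->
    %% 3 raw bytes encode to 4 base64 characters.
    Raw = binary:copy(<<"abc">>, EncodedSize div 4),
    Encoded = base64:encode(Raw),
    [{F, measure(F, Encoded, Iterations)} || F <- [decode, mime_decode]].

%% Returns {AverageSeconds, MiBPerSecond} for base64:Fun/1.
measure(Fun, Encoded, Iterations) ->
    {Micros, ok} = timer:tc(fun() -> loop(Fun, Encoded, Iterations) end),
    %% Clamp to 1 microsecond so very short runs cannot divide by zero.
    AvgSecs = max(Micros, 1) / Iterations / 1.0e6,
    MiBPerSec = byte_size(Encoded) / AvgSecs / (1024 * 1024),
    {AvgSecs, MiBPerSec}.

loop(_Fun, _Encoded, 0) ->
    ok;
loop(Fun, Encoded, N) ->
    _ = base64:Fun(Encoded),
    loop(Fun, Encoded, N - 1).
```

Calling b64_bench:run() returns a list with one {Function, {AvgSecs, MiBPerSec}} entry per decoder, which can be compared in the same way as the table above.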
I plan on submitting a Pull Request that includes a discussion of this implementation, and my suggestions, but I also wanted to include a short version here, so you can see the full picture in one place.
I was able to isolate the performance degradation to the pattern matching that occurs after the call to function tail_contains_more/2. When I restructured the implementation of mime_decode/1 to avoid this, I saw a dramatic improvement, using the same test module:
Function                | Avg exec   | Throughput
------------------------+------------+---------------
base64:decode/1         | 0.018 s    | 21.170 MiB/s
base64:mime_decode/1    | 0.016 s    | 23.519 MiB/s
This will be my first Pull Request to Erlang/OTP, so it may take me some time to format it correctly, run the tests, etc. Please bear with me as I get up to speed!
Thanks!
We will wait for your pull request.
A tip: You can use the compiler option bin_opt_info to verify whether optimizations of binary matching are applied or not (see the Efficiency Guide).
Indeed, compiling with bin_opt_info reveals the problem:
lib/stdlib/src/base64.erl:220: Warning: NOT OPTIMIZED: called function tail_contains_more/2 does not begin with a suitable binary matching instruction
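For anyone who wants to try the option on their own code, the sketch below enables bin_opt_info through a module attribute. The module bin_opt_demo and the function count_a/1 are hypothetical and unrelated to base64.erl; the same option can also be passed on the command line (erlc +bin_opt_info) or from the shell (c(Module, [bin_opt_info])). The exact wording of the compiler's messages depends on the OTP release, so none is shown here.

```erlang
-module(bin_opt_demo).
%% Ask the compiler to report on binary-matching optimizations.
-compile(bin_opt_info).
-export([count_a/1]).

%% Hypothetical example: count the $a bytes in a binary.
count_a(Bin) ->
    count_a(Bin, 0).

%% Every clause matches on the binary in the function head and passes
%% the remaining tail straight to the recursive call.
count_a(<<$a, Rest/binary>>, N) ->
    count_a(Rest, N + 1);
count_a(<<_, Rest/binary>>, N) ->
    count_a(Rest, N);
count_a(<<>>, N) ->
    N.
```

The messages are emitted as warnings or informational notes at compile time, so the module still compiles and runs normally unless warnings are treated as errors.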
Thanks so much for the tip! I will include this information in the PR.