
ERL-366: Given large binaries, base64:mime_decode/1 has 58% throughput of base64:decode/1 #3455

Closed
OTP-Maintainer opened this issue Feb 23, 2017 · 3 comments
Labels: bug (Issue is reported as a bug), priority:low, team:VM (Assigned to OTP team VM)

Original reporter: jessestimpson
Affected version: OTP-20.0
Fixed in version: OTP-20.0
Component: stdlib
Migrated from: https://bugs.erlang.org/browse/ERL-366


While profiling my application, which handles many large base64 binaries from an external entity (it's an SMTP server), I noticed that base64:mime_decode/1 is significantly slower than base64:decode/1.

On my local machine, I've produced the following timings from a 400000 byte base64-encoded binary. I decoded it 1000 times and computed the average.

Function                | Avg exec   |   Throughput
------------------------+------------+---------------
base64:decode/1         | 0.017 s    |   22.502 MiB/s
base64:mime_decode/1    | 0.029 s    |   12.953 MiB/s

CPU: 2.6 GHz Intel Core i7
Memory: 16 GB 1600 MHz DDR3
OS: macOS 10.11.4
Erlang: Erlang/OTP 20 [DEVELOPMENT] [erts-8.2.2].

You can see that mime_decode/1 runs at 58% of the throughput of decode/1. I've attached the module I used to produce these results.
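The attached module isn't reproduced in this migrated issue, so as a hypothetical reconstruction (module and function names are my own, not the reporter's), a benchmark along these lines could look like:

```erlang
-module(b64_bench).
-export([run/0, run/2]).

%% Hypothetical reconstruction of the benchmark described above:
%% decode a ~400000-byte base64 binary N times and report the average.
run() ->
    run(300000, 1000).  %% 300000 raw bytes encode to 400000 base64 bytes

run(RawSize, N) ->
    Data = base64:encode(crypto:strong_rand_bytes(RawSize)),
    bench("base64:decode/1", fun base64:decode/1, Data, N),
    bench("base64:mime_decode/1", fun base64:mime_decode/1, Data, N),
    ok.

bench(Name, Fun, Data, N) ->
    {Micros, ok} = timer:tc(fun() -> loop(Fun, Data, N) end),
    AvgSecs = Micros / N / 1.0e6,
    MiBPerSec = byte_size(Data) / (1024 * 1024) / AvgSecs,
    io:format("~-22s | ~.3f s | ~.3f MiB/s~n", [Name, AvgSecs, MiBPerSec]).

loop(_Fun, _Data, 0) -> ok;
loop(Fun, Data, N) ->
    _ = Fun(Data),
    loop(Fun, Data, N - 1).
```

Absolute numbers will of course vary by machine and OTP version; only the ratio between the two functions matters here.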

I dug into the source and saw that mime_decode/1 isn't doing much more work, but its implementation is structurally different. I have a hunch that the compiler is unable to fully optimize binary matching with the implementation structured this way.

I plan on submitting a Pull Request that includes a discussion of this implementation, and my suggestions, but I also wanted to include a short version here, so you can see the full picture in one place.

I was able to isolate the performance degradation to the pattern matching that occurs after the call to function tail_contains_more/2. When I restructured the implementation of mime_decode/1 to avoid this, I saw a dramatic improvement, using the same test module:

Function                | Avg exec   |   Throughput
------------------------+------------+---------------
base64:decode/1         | 0.018 s    |   21.170 MiB/s
base64:mime_decode/1    | 0.016 s    |   23.519 MiB/s
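To illustrate the kind of structural issue being described (this is a toy sketch, not the actual base64.erl code): the compiler's delayed sub-binary optimization applies only when every function on the recursion path begins with a binary match, so routing the tail through a helper that matches a plain variable forces a sub-binary to be built on each step.

```erlang
-module(binopt_demo).
-export([strip_eq_slow/1, strip_eq_fast/1]).
-compile(bin_opt_info).  %% ask the compiler to report on binary-match optimization

%% Toy example: strip leading '=' bytes from a binary.

%% Slow shape: the recursion goes through continue/1, whose clause does
%% not begin with a binary match, so the match context cannot be reused.
strip_eq_slow(<<$=, Rest/binary>>) -> continue(Rest);
strip_eq_slow(Bin) -> Bin.

continue(Rest) ->
    strip_eq_slow(Rest).

%% Fast shape: every clause on the recursion path begins with a binary
%% match, so the compiler can delay sub-binary creation.
strip_eq_fast(<<$=, Rest/binary>>) -> strip_eq_fast(Rest);
strip_eq_fast(Bin) -> Bin.
```

Both functions return the same result; the difference shows up only in the bin_opt_info output and in throughput on large binaries. Exact warning text depends on the OTP version.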

This will be my first Pull Request to Erlang/OTP, so it may take me some time to format it correctly, run the tests, etc. Please bear with me as I get up to speed!

bjorn said:

Thanks!

We will wait for your pull request.

A tip: You can use the option bin_opt_info to verify whether optimizations of binary matching are applied or not (see the Efficiency Guide).
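For reference (assuming a standard OTP toolchain), the option can be passed either on the erlc command line or as a module attribute; a minimal sketch:

```erlang
%% On the command line:   erlc +bin_opt_info mymod.erl
%% Or inside the module itself:
-module(mymod).
-compile(bin_opt_info).
-export([count/1]).

%% Any binary-matching function will do: the compiler emits a warning
%% line for each binary match it could not optimize.
count(Bin) -> count(Bin, 0).

count(<<_, Rest/binary>>, N) -> count(Rest, N + 1);
count(<<>>, N) -> N.
```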


jessestimpson said:

Indeed, compiling with bin_opt_info reveals the problem:

lib/stdlib/src/base64.erl:220: Warning: NOT OPTIMIZED: called function tail_contains_more/2 does not begin with a suitable binary matching instruction

Thanks so much for the tip! I will include this information in the PR.


jessestimpson said:

My PR can be found here:
https://github.com/erlang/otp/pull/1354

@OTP-Maintainer OTP-Maintainer added bug Issue is reported as a bug team:VM Assigned to OTP team VM priority:low labels Feb 10, 2021
@OTP-Maintainer OTP-Maintainer added this to the OTP-20.0 milestone Feb 10, 2021