
Optimized quoted-printable encoder #86

Merged
merged 1 commit into from Feb 2, 2019

Conversation

IceDragon200
Contributor

@IceDragon200 IceDragon200 commented Feb 2, 2019

Hello, I've returned with another optimization!

If you've ever used the quoted-printable encoder for anything over 4000 bytes, you've probably torn your hair out by now.

I learned the hard way when it nuked my RAM.

I've significantly reduced the memory usage and increased the throughput (up to 1000 times).

Here is the benchmark used:

len = 0xFFF
bin =
  (len * 2)
  |> :crypto.strong_rand_bytes()
  |> Base.encode64()
  |> String.slice(0, len)

encoded = Mail.Encoders.QuotedPrintable.encode(bin)

Benchee.run(%{
  "OldQuotedPrintable.encode/1" => fn ->
    Mail.Encoders.LegacyQuotedPrintable.encode(bin)
  end,
  "NewQuotedPrintable.encode/1" => fn ->
    Mail.Encoders.QuotedPrintable.encode(bin)
  end
}, time: 10, memory_time: 2)

Benchee.run(%{
  "OldQuotedPrintable.decode/1" => fn ->
    Mail.Encoders.LegacyQuotedPrintable.decode(encoded)
  end,
  "NewQuotedPrintable.decode/1" => fn ->
    Mail.Encoders.QuotedPrintable.decode(encoded)
  end
}, time: 10, memory_time: 2)
Compiling 1 file (.ex)
Operating System: Linux
CPU Information: Intel(R) Core(TM) i7-2710QE CPU @ 2.10GHz
Number of Available Cores: 8
Available memory: 15.58 GB
Elixir 1.8.0
Erlang 21.2

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 10 s
memory time: 2 s
parallel: 1
inputs: none specified
Estimated total run time: 28 s


Benchmarking NewQuotedPrintable.encode/1...
Benchmarking OldQuotedPrintable.encode/1...

Name                                  ips        average  deviation         median         99th %
NewQuotedPrintable.encode/1        865.14      0.00116 s    ±19.58%      0.00105 s      0.00180 s
OldQuotedPrintable.encode/1          0.91         1.09 s     ±2.28%         1.09 s         1.16 s

Comparison: 
NewQuotedPrintable.encode/1        865.14
OldQuotedPrintable.encode/1          0.91 - 946.62x slower

Memory usage statistics:

Name                           Memory usage
NewQuotedPrintable.encode/1         0.52 MB
OldQuotedPrintable.encode/1       640.28 MB - 1236.38x memory usage

**All measurements for memory usage were the same**
Operating System: Linux
CPU Information: Intel(R) Core(TM) i7-2710QE CPU @ 2.10GHz
Number of Available Cores: 8
Available memory: 15.58 GB
Elixir 1.8.0
Erlang 21.2

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 10 s
memory time: 2 s
parallel: 1
inputs: none specified
Estimated total run time: 28 s


Benchmarking NewQuotedPrintable.decode/1...
Benchmarking OldQuotedPrintable.decode/1...

Name                                  ips        average  deviation         median         99th %
NewQuotedPrintable.decode/1        1.54 K      647.39 μs    ±11.03%         622 μs     1058.47 μs
OldQuotedPrintable.decode/1        1.21 K      828.94 μs    ±10.90%         804 μs        1429 μs

Comparison: 
NewQuotedPrintable.decode/1        1.54 K
OldQuotedPrintable.decode/1        1.21 K - 1.28x slower

Memory usage statistics:

Name                           Memory usage
NewQuotedPrintable.decode/1       363.22 KB
OldQuotedPrintable.decode/1       711.98 KB - 1.96x memory usage

**All measurements for memory usage were the same**

Granted, I couldn't run the old encoder on anything over 4096 bytes and expect a result back in a timely manner.

So consider anything over that limit to be encoded soon™ (on the old encoder).

Tests

No test changes were needed; this is a drop-in replacement for the old module.

Changes proposed in this pull request

A faster, more memory-efficient quoted-printable encoder and decoder.

Notes

I've removed all the private helper functions and instead rely on tail-call optimization (TCO) by recursing on overloads of the same function name.

Some of the code is duplicated across some of the overloads; this was done to avoid extra utility functions.
I'm trying to keep the call stack as clean as possible to avoid extra memory allocations.
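To illustrate the technique (this is a simplified sketch with a hypothetical QPSketch module, not the PR's actual code, and it omits space handling and the 76-character soft line breaks required by real quoted-printable): every clause ends in a direct self-call with all state carried in arguments, so the BEAM reuses the same stack frame, and the output is accumulated as iodata rather than via binary concatenation.

```elixir
defmodule QPSketch do
  # Entry point; the second argument is the accumulator (reversed iodata).
  def encode(binary), do: encode(binary, [])

  # Done: flatten the accumulated iodata into a single binary once, at the end.
  def encode(<<>>, acc), do: IO.iodata_to_binary(Enum.reverse(acc))

  # Printable ASCII (except "=", byte 61) passes through untouched.
  def encode(<<c, rest::binary>>, acc) when c in 33..60 or c in 62..126 do
    encode(rest, [c | acc])
  end

  # Every other byte becomes an =XX escape sequence (uppercase hex).
  def encode(<<c, rest::binary>>, acc) do
    hex = c |> Integer.to_string(16) |> String.pad_leading(2, "0")
    encode(rest, [["=", hex] | acc])
  end
end
```

Because the recursion consumes the binary byte by byte and only prepends to a list, memory use stays proportional to the output size instead of growing with repeated binary copies.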

One may notice I use byte_size/1 instead of String.length/1; this is due to how they work:

byte_size/1 reports the actual number of bytes in the binary. Since the encoder works on 8-bit bytes, it makes sense to count the remaining bytes rather than the remaining characters. String.length/1 has to count each character manually, which causes a massive slowdown when it is invoked after every single character; this becomes very apparent with UTF-8 characters that span multiple bytes and must each be encoded to their escape sequences in the end.
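For concreteness, here is the difference on a small UTF-8 string (standard Elixir, not code from this PR): byte_size/1 is O(1) because binaries store their own size, while String.length/1 must walk the binary counting graphemes.

```elixir
s = "héllo"        # "é" (U+00E9) is two bytes in UTF-8

byte_size(s)       # => 6  (constant time: reads the binary's stored size)
String.length(s)   # => 5  (linear time: decodes and counts each grapheme)
```

Calling the O(1) function once per byte is cheap; calling the O(n) one once per character makes the whole pass quadratic.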

Fixed Bottlenecks:

* Stack thrashing
* String.length for ASCII binaries
* Excessive concatenations
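On the last point, a minimal sketch of the two accumulation styles (generic Elixir, not the PR's code): naively rebuilding a binary with <> on every step can copy the whole accumulated result each time, while collecting parts in a list and flattening once with IO.iodata_to_binary/1 stays linear.

```elixir
parts = ~w(a b c)

# Concatenation: each step may copy everything accumulated so far.
slow = Enum.reduce(parts, "", fn part, acc -> acc <> part end)

# Iodata: prepend (O(1)) per step, then one flatten at the end.
fast =
  parts
  |> Enum.reduce([], fn part, acc -> [part | acc] end)
  |> Enum.reverse()
  |> IO.iodata_to_binary()
```

Both produce the same binary; only the allocation behavior differs, which is what the memory numbers above reflect.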
@bcardarella
Member

Awesome!
