Hello, I've returned with another optimization!
If you've ever used the quoted-printable encoder for anything over 4000 bytes, you've probably ripped your hair out by now:
I learned the hard way when it nuked my RAM.
I've significantly reduced the memory usage and increased throughput (up to 1000×).
Here is the benchmark used:
Granted, I couldn't run the old encoder on anything over 4096 bytes and expect a result back in a timely manner, so consider anything over that limit to be encoded soon™ (on the old encoder).
Tests
Didn't need to change anything on that side; this is a drop-in replacement for the old module.
Changes proposed in this pull request
A faster, more memory-efficient quoted-printable encoder.
Notes
I've removed all the private helper functions and instead relied on tail-call optimization (TCO) by having the function recurse on itself under the same name.
Some code is duplicated across the overloads; this was to avoid adding extra utility functions.
I'm trying to keep the stack as clean as possible to avoid extra memory allocations.
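To illustrate the shape described above, here is a minimal sketch of a tail-recursive, accumulator-based encoder. This is not the PR's actual code; the module name, clause structure, and escaping rules are simplified for illustration (real quoted-printable also handles soft line breaks, spaces, and tabs).

```elixir
defmodule QP do
  # Hypothetical sketch: public entry point starts with an empty iodata accumulator.
  def encode(binary) when is_binary(binary), do: encode(binary, [])

  # All input consumed: materialize the accumulator into a binary once, at the end.
  defp encode(<<>>, acc), do: IO.iodata_to_binary(Enum.reverse(acc))

  # Printable ASCII (except "=") passes through unchanged.
  defp encode(<<c, rest::binary>>, acc) when c in ?!..?~ and c != ?= do
    encode(rest, [c | acc])
  end

  # Everything else becomes an =XX hex escape. The recursive call is in
  # tail position, so the BEAM reuses the current stack frame instead of
  # growing the stack — the property the note above relies on.
  defp encode(<<c, rest::binary>>, acc) do
    encode(rest, ["=" <> Base.encode16(<<c>>) | acc])
  end
end
```

Building the result as reversed iodata and flattening it once at the end avoids repeated binary concatenation, which is what keeps allocations low on large inputs.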
One may notice I use `byte_size/1` instead of `String.length/1`; this is due to how they work: `byte_size/1` reports the actual number of bytes in the binary. Since the encoder works byte-by-byte (quoted-printable encodes 8-bit data into 7-bit-safe ASCII), it makes sense to count the remaining bytes rather than the remaining characters (`String.length/1` has to count each character manually, causing a massive slowdown when it has to recount after every single character). This becomes very apparent when dealing with UTF-8 characters that span multiple bytes and need to be encoded to their escape sequences in the end.
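The difference between the two is easy to see on a string containing a multi-byte UTF-8 character:

```elixir
# byte_size/1 is O(1): a binary stores its size in its header.
# String.length/1 is O(n): it must walk every UTF-8 grapheme.
s = "héllo"      # "é" occupies two bytes in UTF-8

byte_size(s)      # 6 bytes
String.length(s)  # 5 graphemes
```

Calling the O(n) version after every consumed character, as the old loop effectively did, turns a linear pass into quadratic work.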