Skip to content

Conversation

@rv32ima
Copy link
Member

@rv32ima rv32ima commented Nov 15, 2021

This PR adds in a relatively simplistic Base64 (de|en)coder (#369)

It is incredibly simple by design, as well as being @nogc @safe pure (except for the functions that do not take in Appenders). I've only tested it with some small bits of data, but, it should easily scale up to handle an infinite amount of data.

I have not verified how this fares with systems that are big-endian -- would love more insight on that regard.

Let me know if this needs any more work. Thanks!

@codecov-commenter
Copy link

codecov-commenter commented Nov 15, 2021

Codecov Report

Merging #372 (217898e) into master (7152ba9) will increase coverage by 0.09%.
The diff coverage is 98.47%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #372      +/-   ##
==========================================
+ Coverage   92.89%   92.98%   +0.09%     
==========================================
  Files          64       65       +1     
  Lines       15679    15797     +118     
==========================================
+ Hits        14565    14689     +124     
+ Misses       1114     1108       -6     
Impacted Files Coverage Δ
source/mir/base64.d 98.47% <98.47%> (ø)
source/mir/rc/array.d 74.62% <0.00%> (-0.38%) ⬇️
source/mir/serde.d 72.50% <0.00%> (ø)
source/mir/math/stat.d 100.00% <0.00%> (ø)
source/mir/ndslice/topology.d 98.62% <0.00%> (ø)
source/mir/math/sum.d 96.40% <0.00%> (+0.32%) ⬆️
source/mir/series.d 94.69% <0.00%> (+0.42%) ⬆️
source/mir/format.d 95.70% <0.00%> (+1.17%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7152ba9...217898e. Read the comment docs.

@rv32ima
Copy link
Member Author

rv32ima commented Nov 17, 2021

Alright @9il, I believe I addressed all of your points (and more).

One point of issue was how to handle a Base64 string like QU======QUJD. It seems that implementations are split on this: JavaScript's atob would throw, Python's base64.b64decode would return A, Phobos would throw (not because it found the padding was invalid, but because the out contract would fail), and other online Base64 decoders would return garbage.

So, I've taken the safest route, in my opinion (while still being compliant to RFC4648), and decided to just throw if we meet one of two situations:

  • we encounter the pad character in a 4-byte chunk that is not the last chunk
  • we encounter the pad character in the beginning two characters of a chunk

This does mean that Base64 strings like QU====== (extra padding) or ==== (only padding) are invalid, but this is still within specification. This does avoid the concerns raised within the RFC (w.r.t. introducing potential side-channels), and actually improves our Ion test coverage (incidentally).

Let me know if you have any other concerns.

/++
Encode a ubyte array as Base64, placing the result onto an Appender.
+/
void encodeBase64(char PlusChar = '+', char SlashChar = '/', Appender)(scope const(ubyte)[] input,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No reason to make PlusChar and SlashChar generic params as well as CapitalCase. Make them runtime and camelCase.

@9il 9il merged commit 3f14886 into master Nov 21, 2021
@9il 9il deleted the base64-impl branch November 21, 2021 01:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants