Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to our own Base64 implementation #13

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

davidebeatrici
Copy link
Member

Written in modern C++, it strictly adheres to the Base64 specification.

This means that alternate alphabets and/or the lack of padding are rejected.

A benchmark would be nice to have, once we switch to a proper test framework.

@Krzmbrzl
Copy link
Member

A benchmark would be nice to have, once we switch to a proper test framework.

-> https://github.com/google/benchmark
I have used that one before and it is pretty amazing.

include/mumble/Base64.hpp Outdated Show resolved Hide resolved
include/mumble/Base64.hpp Outdated Show resolved Hide resolved
@davidebeatrici
Copy link
Member Author

Windows build failure
Error: D:\a\libmumble\libmumble\include\mumble/Base64.hpp(73): error C2672: 'std::min': no matching overloaded function found
C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.35.32215\include\utility(74): note: could be '_Ty std::min(std::initializer_list<_Elem>)'
D:\a\libmumble\libmumble\include\mumble/Base64.hpp(73): note: '_Ty std::min(std::initializer_list<_Elem>)': expects 1 arguments - 2 provided
C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.35.32215\include\utility(71): note: or       '_Ty std::min(std::initializer_list<_Elem>,_Pr)'
D:\a\libmumble\libmumble\include\mumble/Base64.hpp(73): note: '_Ty std::min(std::initializer_list<_Elem>,_Pr)': could not deduce template argument for 'std::initializer_list<_Elem>' from 'unsigned long'
C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.35.32215\include\utility(64): note: or       'const _Ty &std::min(const _Ty &,const _Ty &) noexcept(<expr>)'
D:\a\libmumble\libmumble\include\mumble/Base64.hpp(73): note: 'const _Ty &std::min(const _Ty &,const _Ty &) noexcept(<expr>)': template parameter '_Ty' is ambiguous
D:\a\libmumble\libmumble\include\mumble/Base64.hpp(73): note: could be 'gsl::span<uint8_t,18446744073709551615>::size_type'
D:\a\libmumble\libmumble\include\mumble/Base64.hpp(73): note: or       'unsigned long'
D:\a\libmumble\libmumble\include\mumble/Base64.hpp(73): note: 'const _Ty &std::min(const _Ty &,const _Ty &) noexcept(<expr>)': could not deduce template argument for 'const _Ty &' from 'gsl::span<uint8_t,18446744073709551615>::size_type'
C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.35.32215\include\utility(54): note: or       'const _Ty &std::min(const _Ty &,const _Ty &,_Pr) noexcept(<expr>)'
D:\a\libmumble\libmumble\include\mumble/Base64.hpp(73): note: 'const _Ty &std::min(const _Ty &,const _Ty &,_Pr) noexcept(<expr>)': expects 3 arguments - 2 provided
Error: D:\a\libmumble\libmumble\include\mumble/Base64.hpp(122): error C2672: 'std::min': no matching overloaded function found
C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.35.32215\include\utility(74): note: could be '_Ty std::min(std::initializer_list<_Elem>)'
D:\a\libmumble\libmumble\include\mumble/Base64.hpp(122): note: '_Ty std::min(std::initializer_list<_Elem>)': expects 1 arguments - 2 provided
C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.35.32215\include\utility(71): note: or       '_Ty std::min(std::initializer_list<_Elem>,_Pr)'
D:\a\libmumble\libmumble\include\mumble/Base64.hpp(122): note: '_Ty std::min(std::initializer_list<_Elem>,_Pr)': could not deduce template argument for 'std::initializer_list<_Elem>' from 'unsigned long'
C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.35.32215\include\utility(64): note: or       'const _Ty &std::min(const _Ty &,const _Ty &) noexcept(<expr>)'
D:\a\libmumble\libmumble\include\mumble/Base64.hpp(122): note: 'const _Ty &std::min(const _Ty &,const _Ty &) noexcept(<expr>)': template parameter '_Ty' is ambiguous
D:\a\libmumble\libmumble\include\mumble/Base64.hpp(122): note: could be 'gsl::span<const uint8_t,18446744073709551615>::size_type'
D:\a\libmumble\libmumble\include\mumble/Base64.hpp(122): note: or       'unsigned long'
D:\a\libmumble\libmumble\include\mumble/Base64.hpp(122): note: 'const _Ty &std::min(const _Ty &,const _Ty &) noexcept(<expr>)': could not deduce template argument for 'const _Ty &' from 'gsl::span<const uint8_t,18446744073709551615>::size_type'
C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.35.32215\include\utility(54): note: or       'const _Ty &std::min(const _Ty &,const _Ty &,_Pr) noexcept(<expr>)'
D:\a\libmumble\libmumble\include\mumble/Base64.hpp(122): note: 'const _Ty &std::min(const _Ty &,const _Ty &,_Pr) noexcept(<expr>)': expects 3 arguments - 2 provided

This is of course due to size_t being unsigned long long instead of unsigned long.

Unfortunately the proper literal was only introduced in C++23: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p0330r8.html

We can get away with size_t{ value } though.

Copy link
Member

@Krzmbrzl Krzmbrzl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Essentially all comments on the decode function also apply to the encode one.

include/mumble/Base64.hpp Show resolved Hide resolved
include/mumble/Base64.hpp Outdated Show resolved Hide resolved
include/mumble/Base64.hpp Outdated Show resolved Hide resolved
include/mumble/Base64.hpp Outdated Show resolved Hide resolved
include/mumble/Base64.hpp Outdated Show resolved Hide resolved
include/mumble/Base64.hpp Outdated Show resolved Hide resolved
include/mumble/Base64.hpp Outdated Show resolved Hide resolved
include/mumble/Base64.hpp Outdated Show resolved Hide resolved
include/mumble/Base64.hpp Outdated Show resolved Hide resolved
include/mumble/Base64.hpp Outdated Show resolved Hide resolved
@Krzmbrzl
Copy link
Member

Oh and another thing to consider: As you want to implement this header-only anyway, we could also template the functions to allow any type of the correct size (or anything that has at least the appropriate size). That way, we wouldn't have the uint8_t vs std::byte vs char vs ... problem

@davidebeatrici
Copy link
Member Author

The failing Windows builds are probably the due to a missing <stdexcept> include.

Oh and another thing to consider: As you want to implement this header-only anyway, we could also template the functions to allow any type of the correct size (or anything that has at least the appropriate size). That way, we wouldn't have the uint8_t vs std::byte vs char vs ... problem

Or rather, template wrappers for them: thanks to the spans we can get away with simple casts.

@Krzmbrzl
Copy link
Member

thanks to the spans we can get away with simple casts.

Doesn't seem like it: https://godbolt.org/z/5P5dWc5nj 👀

@davidebeatrici
Copy link
Member Author

Sorry, by "simple" I meant casting the element type and specifying the size accordingly.

@Krzmbrzl
Copy link
Member

Maybe we should define our own span_cast that does this under the hood. That will probably make our code a bit cleaner

@davidebeatrici
Copy link
Member Author

Indeed, I was actually thinking about it. There are quite a few places in our codebase where it would be useful.

include/mumble/Base64.hpp Outdated Show resolved Hide resolved
include/mumble/Base64.hpp Outdated Show resolved Hide resolved
include/mumble/Base64.hpp Outdated Show resolved Hide resolved
include/mumble/Base64.hpp Outdated Show resolved Hide resolved
include/mumble/Base64.hpp Outdated Show resolved Hide resolved
include/mumble/Base64.hpp Outdated Show resolved Hide resolved
include/mumble/Base64.hpp Outdated Show resolved Hide resolved
include/mumble/Base64.hpp Outdated Show resolved Hide resolved
include/mumble/Base64.hpp Outdated Show resolved Hide resolved
Copy link
Member

@Krzmbrzl Krzmbrzl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only some small stuff is left. Once this has been addressed, I think this implementation is ready to go 👍

include/mumble/Base64.hpp Outdated Show resolved Hide resolved
include/mumble/Base64.hpp Outdated Show resolved Hide resolved
include/mumble/Base64.hpp Outdated Show resolved Hide resolved
include/mumble/Base64.hpp Outdated Show resolved Hide resolved
Comment on lines 119 to 129
for (; !in.empty(); out = out.subspan(3), in = in.subspan(4)) {
out[0] = (charToByte(in[0]) << std::byte(2)) + ((charToByte(in[1]) & std::byte(0x30)) >> 4);

if (in[2] == '=') {
break;
}

out[1] =
((charToByte(in[1]) & std::byte(0x0f)) << std::byte(4)) + ((charToByte(in[2]) & std::byte(0x3c)) >> 2);

if (in[3] == '=') {
break;
}

out[2] = ((charToByte(in[2]) & std::byte(0x03)) << std::byte(6)) + charToByte(in[3]);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there should be a comment that textually describes how the decoding works

include/mumble/Base64.hpp Outdated Show resolved Hide resolved
include/mumble/Base64.hpp Outdated Show resolved Hide resolved
Comment on lines +149 to +165
for (; !in.empty(); out = out.subspan(4)) {
out[0] = byteToChar((in[0] & std::byte(0xfc)) >> 2);

Base64();
virtual ~Base64();
if (in.size() >= 2) {
out[1] = byteToChar(((in[0] & std::byte(0x03)) << 4) + ((in[1] & std::byte(0xf0)) >> 4));

virtual explicit operator bool();
if (in.size() >= 3) {
out[2] = byteToChar(((in[1] & std::byte(0x0f)) << 2) + ((in[2] & std::byte(0xc0)) >> 6));
out[3] = byteToChar(in[2] & std::byte(0x3f));
} else {
out[2] = byteToChar((in[1] & std::byte(0x0f)) << 2);
out[3] = PAD_CHAR;
}
} else {
out[1] = byteToChar((in[0] & std::byte(0x03)) << 4);
out[2] = PAD_CHAR;
out[3] = PAD_CHAR;
}

virtual size_t decode(const BufView out, const BufViewConst in);
static size_t encode(const BufView out, const BufViewConst in);
// All data blocks are guaranteed to be 3 bytes except the last one.
in = in.subspan(std::min(size_t{ 3 }, in.size()));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Analogous to the decode function, a comment textually explaining how this stuff works would be good.

Copy link
Member

@Krzmbrzl Krzmbrzl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh and of course: All functions require documentation before merging

I would suggest JavaDoc-style Doxygen comments as those are much easier to distinguish from regular comments than blocks of /// comments. E.g. like this

/**
 * Description of function goes here
 * 
 * @param in The buffer to decode
 */

include/mumble/Base64.hpp Outdated Show resolved Hide resolved
Written in modern C++, it strictly adheres to the Base64 specification.

This means that the lack of padding is rejected when decoding.
The alternate alphabet that is used for URLs is handled just fine.

A benchmark would be nice to have, once we switch to a proper test framework.
@Krzmbrzl
Copy link
Member

Krzmbrzl commented Apr 3, 2023

Just to be explicit: The only thing missing from my POV is the documentation (+ the mentioned in-source explanatory comments)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants