-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MINIFICPP-1026 - Added base64 encoder-decoder to StringUtils #690
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that we're leaning toward writing too much library code here. We build curl in most cases and in the cases where we can't we likely couldn't run anyway -- why not just rely on that which is already available versus writing our own.
* @param base64_length the length of base64 | ||
* @return true on success | ||
*/ | ||
inline static bool from_base64(uint8_t* data, size_t* data_length, const char* base64, size_t base64_length) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this your own impl or a copy and paste?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Own implementation, reasons explained in the main comment.
std::vector<uint8_t> decoded((base64_length / 4 + 1) * 3); | ||
size_t data_length = decoded.size(); | ||
if (!from_base64(decoded.data(), &data_length, base64, base64_length)) { | ||
throw std::runtime_error("Base64 encoded string is malformatted"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I always felt that runtime errors "are due to events beyond the scope of the program and can not be easily predicted." so when those are used versus internal or named exceptions they have different connotations. We catch them but this error seems recoverable from a programmatic perspective. I don't have a super strong opinion on this but it does have a bit of code smell worth mentioning
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fair, I'll think of something else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rewrote them to std::invalid_argument
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly looks good, the only concern I have is the placement of the code.
|
LGTM, I'm happy to proceed with this if CIs pass. |
This is debatable but a tested third party impl if one exists can be trusted. This is a lot of code with the potential for bugs. In your case you've written a lot of code, so I don't want to demean it but the code for an external third party with StringUtils seems much less risky with much lower tests -- unless we simply can't find a third party impl. What you say about cURL is a valid point. Did you attempt to find a third party example? If not I can take a look -- there may be other string related libraries that provide this and some of what we do with StringUtils.. |
|
||
TEST_CASE("TestStringUtils::testBase64EncodeDecode", "[test base64 encode decode]") { | ||
std::mt19937 gen(std::random_device { }()); | ||
for (size_t i = 0U; i < 1024U; i++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should add a test or scenario for arbitrarily large strings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about this, but after a few 4-base64-digit groups it doesn't involve different code paths, it would just test how large your memory is. What did you have in mind for size and what advantage would it have?
Partly agree, partly disagree. |
@phrocker There are a myriad of base64 implementations in C, the two I meet the most often are https://web.mit.edu/freebsd/head/contrib/wpa/src/utils/base64.c and https://opensource.apple.com/source/QuickTimeStreamingServer/QuickTimeStreamingServer-452/CommonUtilitiesLib/base64.c
The core part - that we would be able to substitute with a third-party implementation, albeit only after modifications - is composed of two look-up tables, a 30-line and a 65-line function. As for replacing the majority of StringUtils with some third party utility library: I don't know one that would fit our needs. There might well be, and in that I case I wouldn't be against it, but I don't want to do that integration work in the scope of this issue. |
@phrocker Are you -1 on this? If so, please elaborate your technical justification for it, otherwise I would like to go forward with this PR. |
Merging this as no better alternative was proposed in a timely manner and it's clearly an improvement compared to the current (flawed) implementation we have. Feel free to open follow-ups in case any related issue or a promising out-of-the-box alternative is found. |
Yep. I was -1. My gut suggestion was to use boost's base 64 since it's relatively isolated. If you look at the boost variant they mention portions come from this repo ( https://github.com/ReneNyffenegger/cpp-base64). I'll follow up on the JIRA I've created. |
The subjective precedent here should be clarified and defined clearly here : https://github.com/apache/nifi-minifi-cpp/blob/master/CONTRIB.md . |
Please clarify! |
Thank you for submitting a contribution to Apache NiFi - MiNiFi C++.
In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:
For all changes:
Is there a JIRA ticket associated with this PR? Is it referenced
in the commit message?
Does your PR title start with MINIFICPP-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
Has your PR been rebased against the latest commit within the target branch (typically master)?
Is your initial contribution a single, squashed commit?
For code changes:
For documentation related changes:
Note:
Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.