-
Notifications
You must be signed in to change notification settings - Fork 36.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
util: Don't depend on locale when trimming strings. Add tests. #17760
util: Don't depend on locale when trimming strings. Add tests. #17760
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the motivation for locale independent trimming?
See the developer notes: "Avoid using locale dependent functions if possible. You can use the provided Rationale: Unnecessary locale dependence can cause bugs that are very tricky to isolate and fix." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Developer notes are ultimately just notes, not a reason to do things the wrong way. Standard input from the user should be using locale-dependent functions.
@@ -794,7 +793,7 @@ static int CommandLineRawTx(int argc, char* argv[]) | |||
// param: hex-encoded bitcoin transaction | |||
std::string strHexTx(argv[1]); | |||
if (strHexTx == "-") // "-" implies standard input | |||
strHexTx = readStdin(); | |||
strHexTx = TrimString(readStdin()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one actually should be locale-dependent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@luke-jr I'm not sure I follow: could you explain why? Please include an example if possible where the new code would give bad results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure how important this specific is, but generally we do not use locale-dependent input handling in bitcoind nor utilities. We'd like to avoid differences in locale resulting in hard-to-reproduce bugs and unexpected behavior. Note that this tool is mainly for scripting use where deterministic behavior is important.
@@ -766,8 +767,6 @@ static std::string readStdin() | |||
if (ferror(stdin)) | |||
throw std::runtime_error("error reading stdin"); | |||
|
|||
boost::algorithm::trim_right(ret); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even though it does make sense to trim all whitespace from a hex-encoded transaction, it still also makes sense to trim newline characters here (which should be locale-dependent).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@luke-jr I'm not sure I follow: could you explain why? Please include an example if possible where the new code would give bad results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait, how are newline characters locale dependent?
Don't depend on locale when trimming strings.
Add
TrimString(…)
tests.Add
boost::algorithm::trim
andboost::trim
to list of locale dependent functions we want to avoid.