Minor optimizations to bech32::Decode(); add tests. #12881

murrayn · 2018-04-04T10:39:45Z

Just a few minor optimizations to bech32::Decode():

1) optimize the order and logic of the conditionals
2) get rid of subsequent '(c < 33 || c > 126)' check which is redundant (already performed above)
3) add a couple more bech32 tests (mixed-case)

practicalswift · 2018-04-04T11:05:50Z

src/bech32.cpp

-        if (c >= 'a' && c <= 'z') lower = true;
-        if (c >= 'A' && c <= 'Z') upper = true;
+        // Mixing upper and lowercase is not OK.
+        if (!lower && c >= 'a' && c <= 'z') { if(upper) return {}; lower = true; }


Nit: if (upper) instead of if(upper)

Not even a nit. Should be a compiler error IMO. I must be tired. :-)

practicalswift · 2018-04-04T11:10:57Z

src/bech32.cpp

    size_t pos = str.rfind('1');
    if (str.size() > 90 || pos == str.npos || pos == 0 || pos + 7 > str.size()) {
        return {};
    }
    data values(str.size() - 1 - pos);
    for (size_t i = 0; i < str.size() - 1 - pos; ++i) {
        unsigned char c = str[i + pos + 1];
-        int8_t rev = (c < 33 || c > 126) ? -1 : CHARSET_REV[c];
+        int8_t rev = CHARSET_REV[c];


Add assert(c >= 33 && c <= 126); on the line before int8_t rev = CHARSET_REV[c]; to make the assumption explicit?

@practicalswift If I were writing Decode() from scratch, I don't think it would occur to me to add an assert() there. Not convinced it adds any value.

Remember that we have asserts enabled in release builds; better not to add them in inner loops, especially if the goal is to 'tighten up' anything.
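For readers following this thread, here is a minimal sketch of why the second range check is redundant once an earlier pass has rejected every byte outside 33..126. It is an illustration under stated assumptions, not the code in src/bech32.cpp: the table is built at run time from the lowercase bech32 character set rather than being the hard-coded CHARSET_REV array, and BuildReverseTable, AllPrintable and AllInCharset are hypothetical names.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: 128-entry reverse lookup for the lowercase bech32
// character set. Entries for characters outside the set are -1.
std::array<std::int8_t, 128> BuildReverseTable()
{
    const std::string charset = "qpzry9x8gf2tvdw0s3jn54khce6mua7l";
    std::array<std::int8_t, 128> rev;
    rev.fill(-1);
    for (std::size_t i = 0; i < charset.size(); ++i) {
        rev[static_cast<unsigned char>(charset[i])] = static_cast<std::int8_t>(i);
    }
    return rev;
}

// First pass: reject any byte outside the printable ASCII range 33..126.
bool AllPrintable(const std::string& str)
{
    for (unsigned char c : str) {
        if (c < 33 || c > 126) {
            return false;
        }
    }
    return true;
}

// Second pass: once AllPrintable() has returned true, every byte is below
// 128, so indexing the table needs no further range check.
bool AllInCharset(const std::string& str, const std::array<std::int8_t, 128>& rev)
{
    if (!AllPrintable(str)) {
        return false;
    }
    for (unsigned char c : str) {
        if (rev[c] == -1) {
            return false;
        }
    }
    return true;
}
```

The lookup is only safe because the range check always runs first; that ordering is the assumption the proposed assert() would have documented.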

practicalswift · 2018-04-04T11:13:39Z

LowerCase() and std::tolower() are not equivalent. std::tolower() takes the currently installed C locale into account.

murrayn

Had a feeling there was a reason for LowerCase().
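To illustrate the point about locale, here is a minimal sketch of an ASCII-only lowercase helper. Unlike std::tolower(), it never consults the installed C locale, so bytes outside 'A'..'Z' always pass through unchanged. AsciiLower and AsciiLowerString are hypothetical names chosen for this example; the sketch is not claimed to match the LowerCase() helper in src/bech32.cpp.

```cpp
#include <string>

// Hypothetical helper: map 'A'..'Z' to 'a'..'z' and leave every other byte
// untouched. The result depends only on the byte value, never on the locale.
constexpr unsigned char AsciiLower(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? static_cast<unsigned char>(c - 'A' + 'a') : c;
}

// Apply AsciiLower() to every byte of a string.
std::string AsciiLowerString(const std::string& in)
{
    std::string out;
    out.reserve(in.size());
    for (unsigned char c : in) {
        out.push_back(static_cast<char>(AsciiLower(c)));
    }
    return out;
}
```

A byte such as 0xC4 is returned as-is here, whereas std::tolower() may map it to a different value depending on the active locale.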

sipa · 2018-04-05T00:44:08Z

src/bech32.cpp

-        if (c >= 'a' && c <= 'z') lower = true;
-        if (c >= 'A' && c <= 'Z') upper = true;
+        // Mixing upper and lowercase is not OK.
+        if (!lower && c >= 'a' && c <= 'z') { if (upper) return {}; lower = true; }


Please follow the coding style: any if whose then-body is anything but a single statement must use braces and indentation.

fanquake · 2018-04-05T01:06:54Z

@murrayn After fixing any nits, please also squash your commits.

promag

Please provide benchmark results. I suspect this makes the most common case worse. Maybe the suggestion below would improve performance for the most common case. An edge case doesn't have to perform better if the implementation makes the remaining cases worse.

promag · 2018-04-05T01:32:12Z

src/bech32.cpp

@@ -161,18 +161,25 @@ std::pair<std::string, data> Decode(const std::string& str) {
    for (size_t i = 0; i < str.size(); ++i) {
        unsigned char c = str[i];
        if (c < 33 || c > 126) return {};
-        if (c >= 'a' && c <= 'z') lower = true;
-        if (c >= 'A' && c <= 'Z') upper = true;


An else here would improve this a bit.

Maybe better:

if (c >= 'a') { if (c <= 'z') lower = true; } else if (c >= 'A') { if (c <= 'Z') upper = true; }

@promag Do you have an example in mind of a most common case?

The success case?

OK, in that case (let's assume a string consisting only of lowercase or uppercase letters) the existing code will do three comparisons and one assignment, per character. The proposed code will do three comparisons per character, with the added benefit of returning earlier in the case of a malformed string. Not sure why you would "suspect this makes the most common case worst".

Further to this, if we're going to assume the most common case is the success case (which is reasonable), it would probably be good to move the initial (c < 33 || c > 126) check into a final "else if" branch in the original code.

@murrayn probably, but can you post benchmark results?

laanwj · 2018-04-05T07:25:16Z

LowerCase() and std::tolower() are not equivalent. std::tolower() takes the currently installed C locale into account.

Thanks for catching this. We should be extremely careful to not introduce locale dependencies in the low-level string parsing functions. We've had serious problems with those in the past. This can result in country-specific bugs...

sipa · 2018-04-05T11:50:32Z

Please don't overthink this. Decoding addresses is hardly relevant (I don't think anyone would notice if they were 100x slower). My goal when writing this was more clarity and simplicity than speed, though I'm obviously not opposed to performance improvements if they don't conflict with those goals.

I am interested in whether the additional branches make performance worse, though. My gut feeling is that they impact performance more than comparisons.

murrayn · 2018-04-06T02:10:30Z

@sipa Thanks for the feedback. I've reworked the code again to reflect your input.

maflcko · 2018-04-11T19:38:20Z

To get rid of the merge commit, please squash your commits according to https://github.com/bitcoin/bitcoin/blob/master/CONTRIBUTING.md#squashing-commits

murrayn · 2018-04-20T09:22:46Z

Not sure if this PR is stalled due to the earlier comment about benchmarks... I didn't think benchmarks would be as interesting after my most recent commit, in which the code was more straightforward. Just in case, I have benchmarked:

if (c >= 'a' && c <= 'z') lower = 1;
else if (c >= 'A' && c <= 'Z') upper = 1;
else if (c < 33 || c > 126) return 0;

versus

if (c < 33 || c > 126) return 0;
if (c >= 'a' && c <= 'z') lower = 1;
if (c >= 'A' && c <= 'Z') upper = 1;

and my results show the former is significantly faster; however, with -O2 compiler optimization enabled they benchmark identically, which isn't surprising.
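The two orderings compared above can be sketched as standalone functions to confirm that they accept and reject the same inputs. This is an illustration only: ScanResult, ScanRangeFirst, ScanLettersFirst and OrderingsAgreeOnAllBytes are hypothetical names, and the sketch checks behavior, not speed.

```cpp
#include <string>

// Outcome of scanning a candidate string: whether every byte was in the
// printable range 33..126, and which letter cases were seen.
struct ScanResult {
    bool ok;
    bool lower;
    bool upper;
};

bool operator==(const ScanResult& a, const ScanResult& b)
{
    return a.ok == b.ok && a.lower == b.lower && a.upper == b.upper;
}

// Range check first, then both case checks on every character.
ScanResult ScanRangeFirst(const std::string& str)
{
    bool lower = false, upper = false;
    for (unsigned char c : str) {
        if (c < 33 || c > 126) return {false, false, false};
        if (c >= 'a' && c <= 'z') lower = true;
        if (c >= 'A' && c <= 'Z') upper = true;
    }
    return {true, lower, upper};
}

// Letter checks first; the range check is only reached for non-letters.
ScanResult ScanLettersFirst(const std::string& str)
{
    bool lower = false, upper = false;
    for (unsigned char c : str) {
        if (c >= 'a' && c <= 'z') {
            lower = true;
        } else if (c >= 'A' && c <= 'Z') {
            upper = true;
        } else if (c < 33 || c > 126) {
            return {false, false, false};
        }
    }
    return {true, lower, upper};
}

// Check that both orderings agree on every possible single-byte input.
bool OrderingsAgreeOnAllBytes()
{
    for (int b = 0; b < 256; ++b) {
        const std::string s(1, static_cast<char>(b));
        if (!(ScanRangeFirst(s) == ScanLettersFirst(s))) return false;
    }
    return true;
}
```

The reordering is safe because every ASCII letter already lies inside 33..126, so a byte that matches one of the letter branches can never be out of range.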

promag

utACK 60f61f9.

Change is slightly better because success checks come first, which is probably the most common case.

maflcko · 2018-04-26T11:47:25Z

Please also adjust the OP

laanwj · 2018-05-15T10:08:14Z

utACK 60f61f9

60f61f9 Tighten up bech32::Decode(); add tests. (murrayn)

Pull request description:

Just a few minor optimizations to bech32::Decode():

1) optimize the order and logic of the conditionals
2) get rid of subsequent '(c < 33 || c > 126)' check which is redundant (already performed above)
3) add a couple more bech32 tests (mixed-case)

Tree-SHA512: e41af834c8f6b7d34c22c28b724df42c60f72e00df616e70a12efbc4271d15d80627fe1bc36845caf29f615c238499a566298a863cbe119fef457287231053c8

…of locale dependence

698cfd0 docs: Mention lint-locale-dependence.sh in developer-notes.md (practicalswift)
0a4ea2f build: Add linter for checking accidental locale dependence (practicalswift)

Pull request description:

This linter will check for code accidentally introducing locale dependencies.

Unnecessary locale dependence can cause bugs that are very tricky to isolate and fix. We should avoid using locale dependent functions if possible.

Context: #12881 (comment)

Example output:

```
$ contrib/devtools/lint-locale-dependence.sh
The locale dependent function tolower(...) appears to be used:
src/init.cpp: if (s[0] == '0' && std::tolower(s[1]) == 'x') {

Unnecessary locale dependence can cause bugs that are very tricky to isolate and fix. Please avoid using locale dependent functions if possible.

Advice not applicable in this specific case? Add an exception by updating the ignore list in contrib/devtools/lint-locale-dependence.sh
```

**Note to reviewers:** What is the most appropriate `LOCALE_DEPENDENT_FUNCTIONS` function list? What should be added or removed?

Tree-SHA512: 14e448828804bb02bf59070647e38b52fce120c700c903a4a8472769a2cee5dd529bd3fc182386993cb8720482cf4250b63a0a477db61b941ae4babe5c65025f


fanquake added the Refactoring label Apr 4, 2018

practicalswift reviewed Apr 4, 2018

murrayn commented Apr 4, 2018

sipa reviewed Apr 5, 2018

promag reviewed Apr 5, 2018

murrayn force-pushed the bech32_decode branch from 163d794 to 3b7f9ab Compare April 5, 2018 03:12

Tighten up bech32::Decode(); add tests.

60f61f9

murrayn force-pushed the bech32_decode branch from c25a6ea to 60f61f9 Compare April 13, 2018 00:54

practicalswift mentioned this pull request Apr 20, 2018

build: Add linter checking for accidental introduction of locale dependence #13041

Merged

promag reviewed Apr 26, 2018

murrayn changed the title ~~Tighten up bech32::Decode(); add tests.~~ Minor optimizations to bech32::Decode(); add tests. Apr 28, 2018

laanwj merged commit 60f61f9 into bitcoin:master May 15, 2018

Bushstar mentioned this pull request May 16, 2018

commits from bitcoin/master FeatherCoin/Feathercoin#332

Merged

bitcoin locked as resolved and limited conversation to collaborators Sep 8, 2021
