-
Notifications
You must be signed in to change notification settings - Fork 354
parser.c: Optimize json_parse_digits
#887
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
aeb0e0e to
53991f0
Compare
|
Interestingly, this seem to fail on ubuntu/GCC, but it passes on clang. I'm not yet clear on why. |
ext/json/ext/parser/parser.c
Outdated
| // From: https://github.com/simdjson/simdjson/blob/32b301893c13d058095a07d9868edaaa42ee07aa/include/simdjson/generic/numberparsing.h#L333 | ||
| // Branchless version of: http://0x80.pl/articles/swar-digits-validate.html | ||
| uint64_t match = (next_8bytes & 0xF0F0F0F0F0F0F0F0) | (((next_8bytes + 0x0606060606060606) & 0xF0F0F0F0F0F0F0F0) >> 4); | ||
| size_t consecutive_digits = trailing_zeros64(match ^ 0x3333333333333333) / CHAR_BIT; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a google search result, __builtin_ctzll(0) seems to be an undefined behavior. I'm not sure it's really undefined though.
The trailing_zeros64 fallbacks code when __builtin_ctzll is not defined, actually returns 0 instead of 64 in this case. So at least trailing_zeros64 fallback code should be fixed.
# When this part is commented out:
# #if HAVE_BUILTIN_CTZLL
# return __builtin_ctzll(input);
# #else
JSON.parse "[12341234]" #=> JSON::ParserErrorThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes thank you I just figured this out:
int __builtin_clz (unsigned int x)
Returns the number of leading 0-bits in x, starting at the most significant bit position. If x is 0, the result is undefined.
https://gcc.gnu.org/onlinedocs/gcc/Bit-Operation-Builtins.html
I think GCC assume this case never happens so it removes that branch.
947902e to
7b1a77d
Compare
563a55e to
5bcc397
Compare
|
Now I need to review usage of |
5bcc397 to
82b030f
Compare
| static inline uint32_t trailing_zeros64(uint64_t input) | ||
| { | ||
| #ifdef JSON_DEBUG | ||
| assert(input > 0); // __builtin_ctz(0) is undefined behavior |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm doubting these asserts can actually fail. Not too sure what the compilation flags are on CI, but seems like they are optimized out even with JSON_DEBUG.
I'm unsure what's the best way to add assertions in ruby extensions. Perhaps I should just use RUBY_ASSERT and add a debug ruby in the CI matrix?
We can use
ctzbuiltin to get the number of consecutive digits.