New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[utf8-validator] eliminate unnecessary comparison from must_be_2_3_continuation #2113
[utf8-validator] eliminate unnecessary comparison from must_be_2_3_continuation #2113
Conversation
@Validark I think your assumption is correct. I am running test, and I expect to merge your PR. |
simd8<bool> is_third_byte = prev2 >= uint8_t(0xe0u); | ||
simd8<bool> is_fourth_byte = prev3 >= uint8_t(0xf0u); | ||
return is_third_byte ^ is_fourth_byte; | ||
simdjson_inline simd8<uint8_t> must_be_2_3_continuation(const simd8<uint8_t> prev2, const simd8<uint8_t> prev3) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure that the ARM change buys you something. Does it?
My expectation is that it is not going to affect the performance. Do you disagree?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on my use of analogous techniques in Zig, I would think the arm emit would have an equal number of instructions either way. I made the change to the arm implementation solely because I was thinking it has to have the same return type as the other implementations. I suppose I could have achieved that in a different way, but I'm not sure it makes a difference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good answer. I just wanted to make sure I understood the change.
@Validark I am pushing your proposal to another library (simdutf), and I am getting a nice boost: |
I am merging this. This is very nice. |
Hello, I just thought of this optimization while refactoring the code in simdjzon.
For Zig, which uses LLVM, this optimization does not occur automatically. For that, I have submitted an issue to LLVM.
However, this project compiles with multiple compilers which may or may not do this optimization automatically, so I have edited your code to reflect the optimization idea. It is very basic, and I did not test the modified C++ code. But it looks right to me. (I did test my Zig implementation)
Hope it works for you.
‒ Validark
P.S. I assumed
simd8<uint8_t> must23_80 = must23 & uint8_t(0x80);
splats the0x80
. Is that correct?