utf8_string::get_num_bytes_from_start returns incorrect value #33
Comments
Update: the quick workaround for both, and additional, glitches is to return
Hi Vadim, Happy New Year! 🍾 Thanks for pointing this out; it helped improve correctness and code safety a lot! Cheers!
Great, thank you very much for the prompt turnaround, Jakob! I'll test it tomorrow.
You're very welcome! 😃
Excellent. All the test cases worked, thank you!
Super 👍
Hi Jakob,
Happy New Year!
Looks like the issue you kept fixing strikes again.
It's pretty much the same pattern: mostly plain Western European text with one multibyte interloper.
Similar to #14, but at a different point, specifically `get_num_bytes_from_start`. I came across it when using `find_first_of`.
Another glitch, which may stem from the same piece of code, is that `substr` truncates the result. Having said that, if the block starting with `if( utf8_string::is_lut_active( lut_iter ) )...` under `get_num_bytes_from_start` is disabled, `find_first_of` returns a correct result.
Here is the sample snippet demonstrating both:
BTW, I see that we were talking about code reuse in #14. I am wondering if you could encapsulate the lut block that you use in several functions; it seems like it might save some effort in the future.
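For cross-checking, the byte count that `get_num_bytes_from_start` is expected to report for "mostly plain text with one multibyte interloper" can be computed with a plain linear scan that uses no lookup table. This is only a minimal sketch, not tiny-utf8 code: the helper name `num_bytes_for_codepoints` is hypothetical, and it assumes the input is well-formed UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical reference helper (NOT part of tiny-utf8): returns the number
// of bytes occupied by the first `cp_count` code points of `s`, using a
// plain linear scan. Assumes `s` holds well-formed UTF-8.
std::size_t num_bytes_for_codepoints(const std::string& s, std::size_t cp_count) {
    std::size_t i = 0;
    while (i < s.size() && cp_count > 0) {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        std::size_t len = 1;                          // 0xxxxxxx: single byte
        if ((lead & 0xE0) == 0xC0)      len = 2;      // 110xxxxx: 2-byte sequence
        else if ((lead & 0xF0) == 0xE0) len = 3;      // 1110xxxx: 3-byte sequence
        else if ((lead & 0xF8) == 0xF0) len = 4;      // 11110xxx: 4-byte sequence
        i += len;
        --cp_count;
    }
    // Clamp in case the last sequence was cut short by the end of the string.
    return i < s.size() ? i : s.size();
}
```

For the string `"na\xC3\xAFve text"` (11 bytes, 10 code points), the scan gives 2 bytes for the first two code points and 4 bytes for the first three, since the third code point takes two bytes. Comparing such values against the library's result with the lut block enabled and disabled may help narrow down where the offset goes wrong.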