Question: dealing with \u0000 in strings from jerry_string_to_utf8_char_buffer / jerry_string_to_char_buffer #2263

martijnthe · 2018-04-03T08:49:23Z

I realize this is not a JerryScript-specific question, but I was wondering how people deal with strings from JS that contain one or more \u0000 Unicode code points. Each one ends up encoded as a 0x00 byte, which also happens to be the "terminator" of a C string.

I can imagine this being a source of bugs, especially if the C code makes assumptions about the length given by jerry_get_(utf8_)string_length() versus the one given by strlen()/strnlen_s().
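To make the mismatch concrete, here is a minimal sketch in plain C. It does not call JerryScript at all: the buffer and its size stand in for whatever the engine copied out, and `bytes_hidden_by_nul` is a hypothetical helper name, not an existing API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of JerryScript): given a buffer of `size`
 * bytes, returns how many of those bytes a strlen()-style consumer would
 * never see because an embedded 0x00 comes first. */
static size_t bytes_hidden_by_nul(const unsigned char *buf, size_t size)
{
    const unsigned char *nul = memchr(buf, 0x00, size);
    if (nul == NULL) {
        return 0; /* no embedded 0x00: nothing is hidden */
    }
    /* Everything from the first 0x00 up to the reported size is invisible
     * to code that treats the buffer as a C string. */
    return size - (size_t)(nul - buf);
}
```

For the five bytes `'a', 'b', 0x00, 'c', 'd'` this reports 3 hidden bytes: the 0x00 itself plus the two characters after it.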

Some approaches I can think of to deal with this:

- Don't use anything from <string.h> that relies on a terminator; instead use a proper UTF-8 library that only takes string data in the form of a pointer + length.
- Truncate: just use the first 0x00 as the end and wipe all data after it just to be sure.
- ...
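The two approaches listed above could be sketched roughly as follows. This is only an illustration under stated assumptions: `byte_str_t`, `byte_str_equal` and `truncate_at_nul` are made-up names, the buffer is assumed to have already been filled by the engine, and `memcmp`/`memchr`/`memset` are used only because they take an explicit length rather than relying on a terminator.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pointer + length view of string data (illustrative name). */
typedef struct {
    const unsigned char *data;
    size_t size; /* number of bytes, as reported by the engine */
} byte_str_t;

/* Approach 1: always carry the length and compare by it, so an embedded
 * 0x00 is treated like any other byte. */
static bool byte_str_equal(byte_str_t a, byte_str_t b)
{
    return a.size == b.size
        && (a.size == 0 || memcmp(a.data, b.data, a.size) == 0);
}

/* Approach 2: truncate at the first 0x00 and wipe everything after it.
 * Returns the number of bytes kept. */
static size_t truncate_at_nul(unsigned char *buf, size_t size)
{
    unsigned char *nul = memchr(buf, 0x00, size);
    if (nul == NULL) {
        return size; /* nothing to truncate */
    }
    size_t kept = (size_t)(nul - buf);
    memset(nul, 0x00, size - kept); /* wipe the tail just to be sure */
    return kept;
}
```

The first approach preserves the JS string exactly but means every consumer has to accept a length. The second keeps ordinary C string handling working at the cost of silently dropping data, so it probably only fits cases where a \u0000 is never expected in valid input.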


rerobika · 2020-05-26T13:56:22Z

Closing the issue due to inactivity, please reopen if needed.

LaszloLango added the question Raised question label Apr 3, 2018

rerobika closed this as completed May 26, 2020
