Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Early termination of strings with NULL values in-between for JavascriptCore #34300

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

swittk
Copy link
Contributor

@swittk swittk commented Jul 29, 2022

Summary

createStringFromUtf8 currently calls JavascriptCore's JSStringCreateWithUTF8CString(const char *) method when retrieving strings from the native side, causing strings with NULL values in-between to be early-truncated (since it stops reading with the first NULL value).

Fixes #24129

Changelog

Changed jsi::String JSCRuntime::createStringFromUtf8(const uint8_t *str, size_t length) to use a parser to UTF-16 characters as appropriate for the JSChar * constructor, JSStringCreateWithCharacters(const JSChar* chars, size_t numChars), which allows specifying of length.

This also should provide a minor speed-up for longer strings since JSStringCreateWithUTF8CString internally calls strlen(), then does UTF-8 to UTF-16 conversions.

[General] [Fixed] - Early termination of strings with NULL values in JavascriptCore (#24129)

Test Plan

I executed calls to my level adapter, saving and loading strings.

await db.put('hi', 'hi\u0000world');
const gotRes = await db.get('hi'); 
console.log(gotRes)

Without this fix it would result in only 'hi'.
With this fix, you get the whole string, 'hi\u0000world'.

`createStringFromUtf8` currently calls JavascriptCore's `JSStringCreateWithUTF8CString(const char *)` method when retrieving strings from the native side, causing strings with NULL values in-between to be early-truncated.

Fixes facebook#24129
@facebook-github-bot facebook-github-bot added CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Shared with Meta Applied via automation to indicate that an Issue or Pull Request has been shared with the team. labels Jul 29, 2022
@analysis-bot
Copy link

Platform Engine Arch Size (bytes) Diff
android hermes arm64-v8a 7,829,829 +0
android hermes armeabi-v7a 7,220,851 +0
android hermes x86 8,142,240 +0
android hermes x86_64 8,120,936 +0
android jsc arm64-v8a 9,711,193 +5,010
android jsc armeabi-v7a 8,465,008 +4,158
android jsc x86 9,662,359 +5,576
android jsc x86_64 10,262,046 +6,960

Base commit: 1af2bea
Branch: main

@analysis-bot
Copy link

Platform Engine Arch Size (bytes) Diff
ios - universal n/a --

Base commit: 1af2bea
Branch: main

@github-actions
Copy link

This PR is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the Stale There has been a lack of activity on this issue and it may be closed soon. label Jan 26, 2023
@swittk
Copy link
Contributor Author

swittk commented Jan 26, 2023

not stale?

@github-actions github-actions bot removed the Stale There has been a lack of activity on this issue and it may be closed soon. label Jan 27, 2023
@javache
Copy link
Member

javache commented Mar 6, 2023

Thanks for the fix. The approach makes sense, but I'd cautious this may cause performance regressions, given the existing fast path that exists in JSC for all-ASCII stings. I'd also prefer if we could defer a system implementation here, or have an implementation of these methods outside of this file that reference their original source (e.g. https://github.com/WebKit/WebKit/blob/main/Source/WTF/wtf/unicode/UTF8Conversion.cpp)

@swittk
Copy link
Contributor Author

swittk commented May 6, 2023

Hi. Thanks for reviewing! (and sorry for the late reply)

Just wondering; do you have any suggestions on how I might import that function (the one that converts UTF-8 to UTF-16 strings over from the WebKit/JSC side?) over? It's implemented in CPP and I'm not sure how I might define extern function declarations or whatnot to make it work.

As for potential performance regressions, maybe we just have to benchmark in the end? Because of I understand correctly, eventually what the existing code path does is it needs to convert the strings over to UTF-16 internally anyway when on JSC.

Copy link

This PR is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the Stale There has been a lack of activity on this issue and it may be closed soon. label Jan 15, 2024
@swittk
Copy link
Contributor Author

swittk commented Jan 22, 2024

... any potentials for this?

@github-actions github-actions bot removed the Stale There has been a lack of activity on this issue and it may be closed soon. label Feb 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Shared with Meta Applied via automation to indicate that an Issue or Pull Request has been shared with the team.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

JavaScript strings with NULL character are not handled properly
5 participants