Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(ops): Fast zero copy string arguments #16777

Merged
merged 14 commits into from
Dec 2, 2022

Conversation

littledivy
Copy link
Member

@littledivy littledivy commented Nov 23, 2022

Uses SeqOneByteString optimization to do zero-copy &str arguments in fast calls.

#[op]
fn op_string_len(s: &str) -> u32 { 
  str.len() as u32 
}

@littledivy littledivy changed the title feat(ops): Add fast zero copy string arguments feat(ops): Fast zero copy string arguments Nov 23, 2022
@aapoalas
Copy link
Collaborator

aapoalas commented Nov 23, 2022

I recommend testing some Latin-1 (V8's OneByte) and UTF-8 clashing characters, eg. © ("\u00A9") is a good one.

EDIT: Not sure if that's actually a clashing character... But at least I've recently been bitten by presuming that V8's OneByteStrings are directly usable as UTF-8, specifically with the copyright symbol as the test character that broke.

@littledivy littledivy marked this pull request as ready for review December 1, 2022 04:15
@bartlomieju
Copy link
Member

I recommend testing some Latin-1 (V8's OneByte) and UTF-8 clashing characters, eg. © ("\u00A9") is a good one.

EDIT: Not sure if that's actually a clashing character... But at least I've recently been bitten by presuming that V8's OneByteStrings are directly usable as UTF-8, specifically with the copyright symbol as the test character that broke.

@littledivy did you add such a test?

@littledivy
Copy link
Member Author

@bartlomieju There are tests for that in the V8 patch: https://chromium-review.googlesource.com/c/v8/v8/+/4036884

fn op_encoding_encode_into(
scope: &mut v8::HandleScope,
input: serde_v8::Value,
input: Cow<'_, str>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not &str?

ext/web/lib.rs Outdated
Comment on lines 386 to 388
// ops guarantee this to be a SeqOneByteString.
Cow::Borrowed(v) => v[..boundary].len() as u32,
Cow::Owned(v) => v[..boundary].encode_utf16().count() as u32,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cow::Borrowed means SeqOneByteString and Cow::Owned means utf16? I think a more verbose comment here is warranted.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 please explain which variant should be used in which situation

ext/web/lib.rs Show resolved Hide resolved
Comment on lines +638 to +656
fn is_string(ty: impl ToTokens) -> Option<bool> {
let toks = tokens(ty);
if toks == "String" {
return Some(false);
}
if toks == "& str" {
return Some(true);
}
None
}

fn is_option_string(ty: impl ToTokens) -> bool {
tokens(ty) == "Option < String >"
}

fn is_cow_str(ty: impl ToTokens) -> bool {
tokens(&ty).starts_with("Cow <") && tokens(&ty).ends_with("str >")
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is_string shouldn't also include Cow strings? Maybe it should be called is_non_cow_string?

It would be nice to have unit tests for these little utility functions. Is that hard?

Copy link
Member

@ry ry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - some comments...

awesome work @littledivy - this is super cool

@littledivy littledivy enabled auto-merge (squash) December 2, 2022 05:02
@littledivy littledivy merged commit 9b2b8df into denoland:main Dec 2, 2022
bartlomieju added a commit to bartlomieju/deno that referenced this pull request Dec 15, 2022
dsherret pushed a commit that referenced this pull request Dec 15, 2022
littledivy added a commit that referenced this pull request Mar 3, 2023
Reland #16777

The codegen is disabled in async ops and when fallback to slow call is
possible (return type is a Result) to avoid hitting this V8 bug:
#17159
kt3k pushed a commit that referenced this pull request Mar 10, 2023
Reland #16777

The codegen is disabled in async ops and when fallback to slow call is
possible (return type is a Result) to avoid hitting this V8 bug:
#17159
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants