Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add warnings about UTF-16 vs UTF-8 strings #1416

Merged
merged 1 commit into from
Apr 5, 2019

Conversation

alexcrichton
Copy link
Contributor

This commit aims to address #1348 via a number of strategies:

  • Documentation is updated to warn about UTF-16 vs UTF-8 problems
    between JS and Rust. Notably documenting that as_string and handling
    of arguments is lossy when there are lone surrogates.

  • A JsString::is_valid_utf16 method was added to test whether
    as_string is lossless or not.

The intention is that most default behavior of wasm-bindgen will
remain, but where necessary bindings will use JsString instead of
str/String and will manually check for is_valid_utf16 as
necessary. It's also hypothesized that this is relatively rare and not
too performance critical, so an optimized intrinsic for is_valid_utf16
is not yet provided.

Closes #1348

@alexcrichton alexcrichton requested a review from Pauan April 1, 2019 18:19
guide/src/reference/types/str.md Outdated Show resolved Hide resolved
src/lib.rs Show resolved Hide resolved
crates/js-sys/src/lib.rs Show resolved Hide resolved
crates/js-sys/src/lib.rs Show resolved Hide resolved
Copy link
Contributor

@Pauan Pauan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I also still think as_string should be changed to into)

crates/js-sys/src/lib.rs Outdated Show resolved Hide resolved
guide/src/reference/types/str.md Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
guide/src/reference/types/str.md Show resolved Hide resolved
Copy link
Contributor

@Pauan Pauan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than a few nits, this is looking really good! I like how this turned out.

crates/js-sys/src/lib.rs Outdated Show resolved Hide resolved
crates/js-sys/src/lib.rs Outdated Show resolved Hide resolved
crates/js-sys/src/lib.rs Outdated Show resolved Hide resolved
guide/src/reference/types/str.md Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
@alexcrichton
Copy link
Contributor Author

👍 Thanks for the thorough review!

Copy link
Contributor

@Pauan Pauan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple more nits, and then this is ready to merge!

crates/js-sys/src/lib.rs Outdated Show resolved Hide resolved
This commit aims to address rustwasm#1348 via a number of strategies:

* Documentation is updated to warn about UTF-16 vs UTF-8 problems
  between JS and Rust. Notably documenting that `as_string` and handling
  of arguments is lossy when there are lone surrogates.

* A `JsString::is_valid_utf16` method was added to test whether
  `as_string` is lossless or not.

The intention is that most default behavior of `wasm-bindgen` will
remain, but where necessary bindings will use `JsString` instead of
`str`/`String` and will manually check for `is_valid_utf16` as
necessary. It's also hypothesized that this is relatively rare and not
too performance critical, so an optimized intrinsic for `is_valid_utf16`
is not yet provided.

Closes rustwasm#1348
@alexcrichton alexcrichton merged commit 16745ed into rustwasm:master Apr 5, 2019
@alexcrichton alexcrichton deleted the js-string-valid-utf16 branch April 5, 2019 15:12
@alexcrichton
Copy link
Contributor Author

Thanks @Pauan!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants