gltfio: enable escaped unicode for node name #7989

poweifeng · 2024-07-23T05:52:22Z

bejado · 2024-07-23T19:37:37Z

libs/gltfio/src/AssetLoader.cpp

+    size_t cur = 0, idx = 0;
+    std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> conv;
+    while ((idx = sImpl.find("\\u", cur)) != std::string::npos) {
+        auto endIdx = sImpl.find_first_not_of("0123456789abcdefABCDEF", idx + 2);


I think we need to limit each escape to six characters (\u plus four hex digits). Otherwise the scan also consumes any hex-digit characters that follow the escape, and input like this causes a std::range_error (wstring_convert: to_bytes error):

\u6211\u53EBBenjamin

(should be 我叫Benjamin)
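
To see why the unbounded scan fails on this input, the sketch below reproduces the hex-digit scan from the diff on the second escape. The helper name greedyEscapeValue is hypothetical and not part of gltfio; it only parses the digits and performs no UTF-8 conversion.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of gltfio): parses every hex digit that
// follows the "\u" at position idx, mirroring the unbounded scan in the diff.
// Assumes at least one hex digit follows the escape (std::stoul throws otherwise).
unsigned long greedyEscapeValue(const std::string& s, std::size_t idx) {
    const std::size_t start = idx + 2;  // skip the backslash and the 'u'
    std::size_t end = s.find_first_not_of("0123456789abcdefABCDEF", start);
    if (end == std::string::npos) end = s.size();
    return std::stoul(s.substr(start, end - start), nullptr, 16);
}
```

For "\u53EBBenjamin" the scan consumes "53EBBe", because the 'B' and 'e' of "Benjamin" are also hex digits. The parsed value is 0x53EBBE, which is above the largest Unicode code point (0x10FFFF). Stopping after four digits gives 0x53EB (叫).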

If the character is in the Basic
Multilingual Plane (U+0000 through U+FFFF), then it may be
represented as a six-character sequence: a reverse solidus, followed
by the lowercase letter u, followed by four hexadecimal digits that
encode the character's code point.

Quoted from RFC 4627; see https://www.ietf.org/rfc/rfc4627.txt
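
A minimal sketch of decoding under that rule, reading exactly four hex digits after each \u. This is not the code merged in the PR: the names decodeEscapedUnicode and appendUtf8 are hypothetical, and the sketch swaps std::wstring_convert for a hand-written UTF-8 encoder. It assumes BMP-only input, so surrogate escapes and malformed escapes are copied through unchanged rather than combined or rejected.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Appends the UTF-8 encoding of a BMP code point (U+0000..U+FFFF) to out.
void appendUtf8(std::string& out, unsigned cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: replaces each "\uXXXX" (exactly four hex digits)
// with the UTF-8 bytes of that code point.
std::string decodeEscapedUnicode(const std::string& in) {
    std::string out;
    std::size_t cur = 0;
    while (cur < in.size()) {
        const std::size_t idx = in.find("\\u", cur);
        if (idx == std::string::npos) break;
        out.append(in, cur, idx - cur);  // copy the text before the escape

        // The escape is well formed only if four hex digits follow "\u".
        bool ok = idx + 6 <= in.size();
        for (std::size_t i = idx + 2; ok && i < idx + 6; ++i) {
            ok = std::isxdigit(static_cast<unsigned char>(in[i])) != 0;
        }
        const unsigned cp =
                ok ? static_cast<unsigned>(std::stoul(in.substr(idx + 2, 4), nullptr, 16)) : 0;

        if (ok && (cp < 0xD800 || cp > 0xDFFF)) {
            appendUtf8(out, cp);
            cur = idx + 6;  // consume exactly six characters
        } else {
            out.append(in, idx, 2);  // keep the "\u" as written
            cur = idx + 2;
        }
    }
    out.append(in, cur, std::string::npos);  // copy the remaining tail
    return out;
}
```

With the fixed width, "\u6211\u53EBBenjamin" decodes to 我叫Benjamin, because the scan stops after "53EB" and leaves "Benjamin" as plain text.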

Thanks for the spec link. This makes things a lot easier to understand.

LGTM. One other note: to_bytes can throw a std::range_error if the Unicode code point is not valid. I'm not sure we need to handle that case, though, since it would mean the string is malformed.

I feel that's more of the user's responsibility. Let's discuss again if/when someone files a bug on it.
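
If that case does come up, one option would be to validate the parsed value before handing it to to_bytes. A minimal sketch with a hypothetical helper name; treating surrogate code points as invalid is an assumption of this sketch, not something the thread states.

```cpp
// Hypothetical helper (not part of gltfio): true if cp is a Unicode scalar
// value, i.e. at most U+10FFFF and outside the surrogate range.
constexpr bool isUnicodeScalarValue(char32_t cp) {
    return cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF);
}
```

A caller could leave the escape as written when the check fails, instead of letting the exception propagate out of the loader.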

Fixes #7846

poweifeng added the "internal" label (Issue/PR does not affect clients) Jul 23, 2024

poweifeng requested review from pixelflinger, z3moon and bejado July 23, 2024 05:52

poweifeng force-pushed the pf/gltfio-fix-unicode branch from bf221e9 to e976d63 Compare July 23, 2024 06:13

poweifeng added and removed the "internal" label (Issue/PR does not affect clients) Jul 23, 2024

poweifeng force-pushed the pf/gltfio-fix-unicode branch from e976d63 to 71c5031 Compare July 23, 2024 06:20

pixelflinger approved these changes Jul 23, 2024

bejado reviewed Jul 23, 2024

gltfio: enable unicode for node name

05a1c3a

Fixes #7846

poweifeng force-pushed the pf/gltfio-fix-unicode branch from 71c5031 to 05a1c3a Compare July 24, 2024 03:12

bejado approved these changes Jul 24, 2024

Merge branch 'main' into pf/gltfio-fix-unicode

1a9aa1f

poweifeng enabled auto-merge (squash) July 25, 2024 08:44

poweifeng disabled auto-merge July 25, 2024 08:45

poweifeng enabled auto-merge (squash) July 25, 2024 08:45

poweifeng changed the title ~~gltfio: enable unicode for node name~~ gltfio: enable escaped unicode for node name Jul 25, 2024

poweifeng merged commit 7441e87 into main Jul 25, 2024
11 checks passed

poweifeng deleted the pf/gltfio-fix-unicode branch July 25, 2024 09:06
