Add StringLiteral#to_utf16 #14676

Merged

@ysbaddaden ysbaddaden commented Jun 8, 2024

Implements the {{ "hello".to_utf16 }} proposal that merely exposes String#to_utf16 to macros.

Kept as draft until we decide which form we prefer: macro call or macro method.

closes #14670

@ysbaddaden ysbaddaden self-assigned this Jun 8, 2024
ysbaddaden commented Jun 8, 2024

I didn't run any benchmark. It's probably not so fast because we must transform the Slice(UInt16) into an ArrayLiteral(NumberLiteral) which means lots of allocations.

@BlobCodes

As written in #14670 (comment), almost all of the time required to convert a String to UTF-16 in macro land is caused by the parser.

Adding this macro method probably won't be a noticeable performance improvement.

Co-authored-by: Quinton Miller <nicetas.c@gmail.com>
ysbaddaden commented Jun 8, 2024

@BlobCodes Most likely, yes, but speed ain't my motive. Reusing stdlib means we only implement the algorithm once, and we can have symmetry:

  • "hello".to_utf16 (runtime)
  • {{ "hello".to_utf16 }} (compile-time)
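A minimal sketch of that symmetry, assuming Crystal 1.13.0 or later (the milestone this PR was merged into). The variable names are illustrative only, and the comparison assumes the macro method expands to an expression that evaluates to a `Slice(UInt16)` with the same code units as the runtime call:

```crystal
# Encoded when the program runs, using String#to_utf16 from the stdlib.
runtime = "hello".to_utf16

# Encoded while the compiler expands macros, using StringLiteral#to_utf16.
compiled = {{ "hello".to_utf16 }}

# Both are expected to hold the same UTF-16 code units.
p runtime == compiled

# Characters outside the Basic Multilingual Plane take two code units
# (a surrogate pair), in the macro method just as at runtime.
emoji = {{ "😐".to_utf16 }}
p emoji.size
```

Because the macro method reuses the stdlib encoder, surrogate pairs and other edge cases should be handled identically in both forms.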

Also, I had fun poking into macro methods: we can return any ASTNode 😈

ysbaddaden commented Jun 9, 2024

Let's benchmark a bit, generating a UTF-16 slice literal of "TEST 😐🐙 ±∀ の" 5000 times, once with this proposal and once with the proposed macro. The only significant difference is in the "Semantic (top level)" stage (old Intel Haswell i7):

  • Blank file: 83ms
  • String.utf16_literal macro: 540ms
  • StringLiteral#to_utf16 macro method: 240ms
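One way such a measurement could be reproduced is sketched below. The exact benchmark file used here was not posted, so the loop and the file name `bench.cr` are assumptions; the per-stage timings come from `crystal build --stats bench.cr`:

```crystal
# bench.cr (hypothetical file name)
# Expands the macro method 5000 times at the top level, so the cost shows up
# in the "Semantic (top level)" line of the --stats output.
{% for i in (0...5000) %}
  {{ "TEST 😐🐙 ±∀ の".to_utf16 }}
{% end %}
```

Comparing against an empty file built the same way gives the baseline for the blank-file figure.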

I didn't expect the macro method to be noticeably faster.

edit: and anyway we don't expect to encode 5000 static UTF-16 strings in a program, and even so the impact is barely noticeable compared to everything else (be it the macro or macro method).

@ysbaddaden ysbaddaden marked this pull request as ready for review June 10, 2024 08:06
@straight-shoota straight-shoota changed the title Add StringLiteral#to_utf16 Add StringLiteral#to_utf16 Jun 15, 2024
@straight-shoota straight-shoota added this to the 1.13.0 milestone Jun 24, 2024
@straight-shoota straight-shoota merged commit b08f4a2 into crystal-lang:master Jun 25, 2024
61 checks passed
@ysbaddaden ysbaddaden deleted the feature/string-literal-to-utf16 branch June 26, 2024 11:01
Successfully merging this pull request may close these issues.

UTF-16 string literals