@s3bk
Contributor

@s3bk s3bk commented Mar 2, 2021

write out the correct codepoints
@tafia
Owner

tafia commented Mar 4, 2021

Thanks!

I have 2 questions:

  • Why not use push_utf8 directly? It looks slightly less expensive than std::char::encode_utf8, since there is no need for an intermediate 4-byte buffer (although that buffer is very likely optimized away).
  • Ideally, we should hard-code the UTF-8 encoded characters in all the match branches.
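
To illustrate the second point, here is a minimal sketch of a match whose branches return pre-encoded UTF-8 directly. The function name `named_entity` and the handful of entities shown are assumptions made for illustration, not the code from this pull request.

```rust
/// Hypothetical lookup: maps an entity name (without `&` and `;`)
/// to its replacement text. Rust string literals are already UTF-8,
/// so every branch is a pre-encoded static slice and nothing is
/// encoded at runtime.
fn named_entity(name: &str) -> Option<&'static str> {
    match name {
        "lt" => Some("<"),
        "gt" => Some(">"),
        "amp" => Some("&"),
        "quot" => Some("\""),
        "apos" => Some("'"),
        "nbsp" => Some("\u{a0}"), // two bytes in UTF-8: C2 A0
        "copy" => Some("\u{a9}"), // two bytes in UTF-8: C2 A9
        _ => None,
    }
}
```

With this shape, `named_entity("copy")` yields a slice whose bytes are `[0xC2, 0xA9]`, and the caller can append it to the output without any per-character encoding step.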

@s3bk
Contributor Author

s3bk commented Mar 4, 2021

Yes, ideally each match branch should have a pre-computed slice.
I suppose your push_utf8 could be turned into a const fn. Let's see…

@s3bk
Contributor Author

s3bk commented Mar 4, 2021

I did not turn it into a const fn, however …

The majority of the time will be spent looking up the name (which is hopefully faster now).
Encoding one char as UTF-8 using the method in std is probably not even measurable, and it avoids duplicating the functionality.
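
For reference, encoding a single char with the std method looks roughly like the sketch below. The helper name `push_char` is hypothetical and not part of this change; only `char::encode_utf8` is the std API under discussion.

```rust
/// Hypothetical helper: appends `c` to `out` as UTF-8 using std.
fn push_char(out: &mut Vec<u8>, c: char) {
    // A char needs at most 4 bytes in UTF-8, so a small stack
    // buffer is enough; no heap allocation happens here.
    let mut buf = [0u8; 4];
    // encode_utf8 writes into `buf` and returns the written part as &mut str.
    let encoded: &str = c.encode_utf8(&mut buf);
    out.extend_from_slice(encoded.as_bytes());
}
```

The 4-byte buffer is the intermediate step mentioned earlier in the thread. It lives on the stack, which is why its cost is expected to be negligible next to the name lookup.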

@s3bk
Contributor Author

s3bk commented Mar 4, 2021

The alternative would have been to return slices from the match …

@s3bk
Contributor Author

s3bk commented Mar 4, 2021

@tafia I think this is now a pretty good solution.
char::encode_utf8 is only called in the case of a numeric character reference such as &#1234;.
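
The split described here could be sketched as follows: named entities map to static slices, and only numeric character references go through `char::encode_utf8`. The function name `push_entity`, its signature, and the entity list are assumptions for illustration, not the merged code, and the number parsing is deliberately lenient rather than a full validator.

```rust
/// Hypothetical sketch: appends the expansion of `entity` (the text
/// between `&` and `;`) to `out`. Returns false if it is not recognized.
fn push_entity(out: &mut String, entity: &str) -> bool {
    if let Some(num) = entity.strip_prefix('#') {
        // Numeric reference: decimal (`#1234`) or hexadecimal (`#x4D2`).
        let code = match num.strip_prefix('x') {
            Some(hex) => u32::from_str_radix(hex, 16),
            None => num.parse::<u32>(),
        };
        // Reject values that are not valid Unicode scalar values.
        match code.ok().and_then(char::from_u32) {
            Some(c) => {
                // The only path that encodes at runtime.
                let mut buf = [0u8; 4];
                out.push_str(c.encode_utf8(&mut buf));
                true
            }
            None => false,
        }
    } else {
        // Named entity: pre-encoded static slice, no encoding step.
        let s = match entity {
            "lt" => "<",
            "gt" => ">",
            "amp" => "&",
            "quot" => "\"",
            "apos" => "'",
            _ => return false,
        };
        out.push_str(s);
        true
    }
}
```

Under these assumptions, `push_entity(&mut s, "#1234")` appends U+04D2 (bytes `D3 92`), while `push_entity(&mut s, "amp")` appends `&` straight from the match.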

@tafia
Owner

tafia commented Mar 4, 2021

Thanks!!

@tafia tafia merged commit 2032228 into tafia:master Mar 4, 2021