Coerce symbols internal fstrings in UTF8 rather than ASCII #2242
Ref: https://bugs.ruby-lang.org/issues/15940
It's not uncommon for symbols to have literal string counterparts, e.g.
Since the default source encoding is UTF-8, and symbols coerce their
internal fstring to ASCII when possible, the above snippet will actually keep
two instances of `"name"` in the fstring registry: one in ASCII, the other in UTF-8.
Considering that UTF-8 is a strict superset of ASCII, storing the symbols'
fstrings as UTF-8 instead makes no real difference, but in most cases allows
the equivalent string literals to be reused.
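The encoding mismatch behind the duplicate entries can be observed from plain Ruby. The sketch below is illustrative only: `compare_with_literal` is a hypothetical helper, not part of the patch, and it does not inspect the fstring registry itself. It only shows that a symbol's string form and the equivalent literal hold the same content while possibly carrying different encodings.

```ruby
# Hypothetical helper for illustration only: compares a symbol's string form
# with an equivalent string literal.
def compare_with_literal(sym, literal)
  from_symbol = sym.to_s
  {
    symbol_encoding: from_symbol.encoding,  # US-ASCII before the patch, UTF-8 after
    literal_encoding: literal.encoding,     # follows the source encoding
    same_content: from_symbol == literal,   # ASCII-only strings compare equal across these encodings
    ascii_only: from_symbol.ascii_only?
  }
end

p compare_with_literal(:name, "name")
```

Because the content is ASCII-only, the two strings compare equal under either encoding, which is why storing the symbol's fstring as UTF-8 is not expected to change how such strings behave.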
The only notable behavioral change is `Symbol#to_s`. Previously `:name.to_s.encoding` would be `#<Encoding:US-ASCII>`. After this patch it's `#<Encoding:UTF-8>`. I can't foresee any significant compatibility impact of this change on existing code.
There are several Ruby specs asserting this behavior. If this specification is impossible to change, then we could consider changing the encoding of the String returned by `Symbol#to_s`, e.g. (Ruby pseudo code):
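A minimal sketch of such a fallback, assuming it would re-tag ASCII-only results as US-ASCII. The helper name `symbol_to_s_compat` is hypothetical, and this is an illustration written in plain Ruby rather than the patch's actual code:

```ruby
# Hypothetical helper, not part of the patch: mimics a Symbol#to_s that keeps
# returning US-ASCII strings even if the internal fstring were stored as UTF-8.
def symbol_to_s_compat(sym)
  str = sym.to_s.dup  # work on a private, mutable copy
  # Re-tagging is only safe when every byte is ASCII.
  str.force_encoding(Encoding::US_ASCII) if str.ascii_only?
  str
end

p symbol_to_s_compat(:name).encoding
p symbol_to_s_compat("\u00e9".to_sym).encoding
```

With this approach, ASCII-only symbols would keep reporting US-ASCII from `to_s`, while symbols containing non-ASCII characters would keep their original encoding.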