literals for invalid characters? #44646

Closed
stevengj opened this issue Mar 16, 2022 · 6 comments
Labels
domain:strings "Strings!" domain:unicode Related to unicode characters and encodings

Comments

@stevengj
Member

stevengj commented Mar 16, 2022

This seems weird:

julia> "\x94"[1]
'\x94': Malformed UTF-8 (category Ma: Malformed, bad data)

julia> '\x94'
ERROR: syntax: invalid empty character literal

Shouldn't you be able to enter '\x...' for such characters, so that the literal format matches the repr?

cc @StefanKarpinski, who designed Char in #24999.
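Until such a literal parses, the same value can be obtained without one. The following is a minimal sketch, not something proposed in this thread: it assumes the `Char` representation from #24999, where the UTF-8 bytes are stored left-aligned in a 32-bit value, and it uses the unexported helper `Base.ismalformed`.

```julia
# Two ways to get the Char for the lone byte 0x94 without a character literal.

# 1. Index into a string literal, which does accept "\x94".
c_from_string = "\x94"[1]

# 2. Reinterpret the raw bits: the single byte 0x94 occupies the top 8 bits
#    of the 32-bit Char, and the remaining bytes are zero.
c_from_bits = reinterpret(Char, 0x94000000)

println(c_from_string == c_from_bits)      # both routes yield the same Char
println(isvalid(c_from_string))            # not a valid Unicode character
println(Base.ismalformed(c_from_string))   # flagged as malformed UTF-8
```

Either route round-trips through `string(c)` to the original byte, which is the property a `'\x94'` literal would be expected to preserve.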

@stevengj stevengj added domain:unicode Related to unicode characters and encodings domain:strings "Strings!" labels Mar 16, 2022
@simeonschaub
Member

Duplicate of #25072?

@stevengj
Member Author

Thanks!

@stevengj stevengj reopened this Mar 16, 2022
@stevengj
Member Author

stevengj commented Mar 16, 2022

Actually, it's not quite the same. That issue is about overlong Char, which is a bit different from '\x94' (a Char representing a single byte of invalid UTF-8).
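To make the distinction concrete, here is a small sketch comparing the two kinds of invalid `Char`. The overlong example `"\xc0\xa0"` (a two-byte encoding of a space) is an illustrative choice, not taken from either issue, and `Base.isoverlong` and `Base.ismalformed` are unexported helpers.

```julia
# An overlong Char: a well-formed byte sequence that encodes U+0020
# in two bytes instead of the minimal one.
overlong = "\xc0\xa0"[1]

# A malformed Char: a lone continuation byte that encodes nothing.
malformed = "\x94"[1]

# Neither is valid, but they fail for different reasons.
println(isvalid(overlong), " ", isvalid(malformed))
println(Base.isoverlong(overlong), " ", Base.ismalformed(overlong))
println(Base.isoverlong(malformed), " ", Base.ismalformed(malformed))
```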

@simeonschaub
Member

Yeah, maybe we can keep both open, but it seems to be the same underlying issue

@StefanKarpinski
Sponsor Member

I did mention later in that issue supporting straight up invalid character literals. Jeff had some tips on where to start hacking it into the parser but I never got around to it. This would be nice to have.

@stevengj
Member Author

Fair enough, we can close this one.
