to!string doesn't throw on invalid UTF sequence #9906
Comments
andrej.mitrovich (@AndrejMitrovic) commented on 2016-08-27T21:55:57Z
-----
import std.conv;
import std.stdio;

void main()
{
    auto x = to!string(cast(char)255);
    writeln(x);
}
-----
Outputs:
[Decode error - output not utf-8]
I think the to!() routines should be UTF-safe, so the call to to!string above should throw an exception. Is this right, Andrei?
andrei (@andralex) commented on 2016-10-14T16:55:25Z
Well, since it doesn't throw, we may as well make it nothrow :o) and use the replacement char, or add an overload. I'll bootcamp this.
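For readers comparing the two options raised above (throwing versus substituting the replacement character), here is a minimal sketch of both behaviours using existing std.utf facilities. It is not the fix proposed for to!string; the helper names isWellFormed and sanitize are hypothetical and exist only for illustration.

```d
import std.array : array;
import std.stdio : writeln;
import std.utf : UTFException, byDchar, toUTF8, validate;

/// Hypothetical helper: true if s is well-formed UTF-8.
/// std.utf.validate throws UTFException on a malformed sequence.
bool isWellFormed(string s)
{
    try
    {
        validate(s);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

/// Hypothetical helper: decode s with byDchar, which substitutes
/// U+FFFD for malformed sequences, then re-encode as UTF-8.
string sanitize(string s)
{
    return toUTF8(s.byDchar.array);
}

void main()
{
    // "\xFF" is a single byte that can never appear in valid UTF-8.
    string bad = "a\xFFb";
    writeln(isWellFormed(bad));
    writeln(sanitize(bad));
}
```

A caller that wants the throwing behaviour can call validate directly after the conversion; a caller that prefers lossy but safe output can use the byDchar route.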
lucia.mcojocaru commented on 2016-11-21T13:36:26Z
Is this a Windows-specific bug?
I tested the following on Linux 64:
-----
import std.conv;
import std.stdio;
import std.utf;

void main()
{
    auto x = to!string(cast(char)191);
    auto z = toUTF8(x);
    writeln(x);

    foreach (y; 0 .. 16)
        foreach (r; 0 .. 16)
        {
            auto buffer = to!string(cast(char)(16 * r + y));
            auto b = toUTF8(buffer);
            writeln(b);
            // auto result = buffer.toUTF16z; // call to utf16z for the winapi
        }
}
-----
Only the commented-out line throws:
core.exception.UnicodeException@src/rt/util/utf.d(292): invalid UTF-8 sequence
bugzilla (@WalterBright) commented on 2019-12-11T14:18:40Z
The original bug isn't Windows-specific. I don't know if the example from Lucia Cojocaru can be considered the same bug...
andrej.mitrovich (@AndrejMitrovic) reported this on 2011-06-08T11:41:02Z
Transferred from https://issues.dlang.org/show_bug.cgi?id=6125
Description
I'm not sure if this is a bug or wanted behavior:
-----
auto x = to!string(cast(char)255);
-----
That won't throw. But this will:
-----
auto x = to!string(cast(char)255);  // or try 128
auto z = toUTF8(x);  // throws
-----
I've had this example code translated from C:
-----
foreach (y; 0 .. 16)
    foreach (x; 0 .. 16)
    {
        auto buffer = to!string(cast(char)(16 * x + y));
        auto result = buffer.toUTF16z;  // call to utf16z for the winapi
    }
-----
Essentially the code builds a table of characters that it prints out. But it doesn't seem to take into account invalid UTF-8 code points.

This leads me to another question: how does one iterate through valid UTF code points, starting from 0? Is there a Phobos function that does that?
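On the closing question, one possible approach (a sketch, not an answer given in this thread) is to loop over integer values, filter them with std.utf.isValidDchar, and encode each value as a dchar instead of casting it to a single char. The helpers countValid and encodeOne are hypothetical names introduced for illustration.

```d
import std.stdio : writeln;
import std.utf : encode, isValidDchar;

/// Hypothetical helper: number of valid code points in [first, last).
size_t countValid(uint first, uint last)
{
    size_t n = 0;
    foreach (cp; first .. last)
        if (isValidDchar(cast(dchar) cp))
            ++n;
    return n;
}

/// Hypothetical helper: UTF-8 encoding of a single code point.
/// std.utf.encode throws UTFException if c is not a valid code point.
string encodeOne(dchar c)
{
    char[4] buf;
    immutable len = encode(buf, c);
    return buf[0 .. len].idup;
}

void main()
{
    // Values 0 .. 255 are all valid code points when treated as dchar,
    // even though bytes 128 .. 255 are not valid UTF-8 on their own.
    writeln(countValid(0, 256));

    // The surrogate range is rejected by isValidDchar.
    writeln(countValid(0xD800, 0xE000));

    // U+00FF encoded as a two-byte UTF-8 string.
    writeln(encodeOne(cast(dchar) 255).length);
}
```

The string returned by encodeOne is well-formed UTF-8, so it can be passed on to toUTF16z for the WinAPI table without hitting the decode error.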