basics_basic-types #2
This was automatically imported from the base repo and has already been reviewed by @WebFreak001, but not finally merged. @WebFreak001, feel free to hit merge if you agree that this is ready ;-)
Er, I don't quite agree with the HTML indentation in there right now. I mean, the code should look good too, not only the website.
Or is that actually the commit? It looks very buggy, as if some of the GitHub website got mixed into it.
a8bd3ed to 462b52d
Ah damn, seems like my auto-migration had bugs :/
</table>

The prefix `u` denotes unsigned types.
A `char` is a UTF-8 character, `wchar` a UTF-16 character and `dchar`
Is this accurate though? A char is only 1 byte and represents a byte in a UTF-8 string, you might need multiple chars to represent 1 UTF-8 character. You have dchar for full characters without needing to have multiple of them.
This code, for example, works differently than you might expect:
    import std.stdio;

    void main() {
        string s = "Ω";
        // s[0] is only the first UTF-8 code unit of "Ω", not the whole character
        writefln("%s (%s)", s[0], cast(int) s[0]);
    }
Output: � (206)
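For comparison, a minimal sketch of how the full character can be read back. It assumes Phobos' `std.utf.decode` and the `foreach (dchar c; s)` decoding form; the variable names are made up for illustration and are not part of the tour page under review.

```d
import std.stdio;
import std.utf : decode;

void main()
{
    string s = "Ω";

    // .length counts UTF-8 code units (bytes), not characters.
    writeln(s.length); // prints 2

    // Asking foreach for a dchar makes it decode whole code points.
    foreach (dchar c; s)
        writefln("%s (U+%04X)", c, cast(uint) c); // prints Ω (U+03A9)

    // std.utf.decode does the same by hand and advances the index
    // past all code units that belong to the decoded code point.
    size_t i = 0;
    dchar first = decode(s, i);
    assert(first == 0x3A9 && i == 2);
}
```

So indexing a `string` yields code units, while decoding (implicitly via `foreach` or explicitly via `decode`) yields code points.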
> Is this accurate though?
I translated this literally:
> The prefix `u` denotes unsigned types. `char` translates to UTF-8 characters, `wchar` is used in UTF-16 strings and `dchar` in UTF-32 strings.
https://github.com/dlang-tour/english/blob/master/basics/basic-types.md
> A char is only 1 byte and represents a byte in a UTF-8 string, you might need multiple chars to represent 1 UTF-8 character. You have dchar for full characters without needing to have multiple of them.
The document is referring to code units here, e.g. from Wikipedia:
https://en.wikipedia.org/wiki/UTF-8
> The encoding is variable-length and uses 8-bit code units. It was designed for backward compatibility with ASCII and to avoid the complications of endianness and byte order marks in the alternative UTF-16 and UTF-32 encodings
https://en.wikipedia.org/wiki/UTF-16
> The encoding is variable-length, as code points are encoded with one or two 16-bit code units.
https://en.wikipedia.org/wiki/UTF-32
> It is a protocol to encode Unicode code points that uses exactly 32 bits per Unicode code point.
Maybe we should edit the base document to make it a bit clearer that code units are referred to?
FYI it's explained in more details on the strings page and since yesterday the DTour has a new gem on Unicode
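To make the code-unit point concrete, here is a rough sketch that counts how many code units the same text needs in each encoding. The helpers `utf8Units` and `utf16Units` are hypothetical names invented for this example; the transcoding relies on `std.conv.to`.

```d
import std.conv : to;
import std.stdio;

// Hypothetical helpers, not part of the tour: number of code units
// the text occupies when stored as UTF-8 (string) or UTF-16 (wstring).
size_t utf8Units(dstring text)  { return text.to!string.length; }
size_t utf16Units(dstring text) { return text.to!wstring.length; }

void main()
{
    // One code unit is 1, 2 and 4 bytes wide respectively.
    static assert(char.sizeof == 1 && wchar.sizeof == 2 && dchar.sizeof == 4);

    // A dstring holds one dchar per code point, so .length counts code points.
    foreach (dstring text; ["a"d, "Ω"d, "\U0001D11E"d])
        writefln("UTF-8: %s, UTF-16: %s, UTF-32: %s",
                 utf8Units(text), utf16Units(text), text.length);
    // prints:
    // UTF-8: 1, UTF-16: 1, UTF-32: 1
    // UTF-8: 2, UTF-16: 1, UTF-32: 1
    // UTF-8: 4, UTF-16: 2, UTF-32: 1
}
```

Only `dchar` is guaranteed to hold any single code point; `char` and `wchar` hold code units, which is the distinction the wording in the base document glosses over.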
Hm, we would need to change it in the English version too. Otherwise the translation here is done, gonna merge
> Hm, we would need to change it in the English version too. Otherwise the translation here is done, gonna merge
Opened it as an issue so that we don't forget:
Thanks @WebFreak001!
Oh ok, didn't know about that feature on the website
* Create foreach.md
* Create alias-strings.md
* Delete foreach.md
* Update alias-strings.md
* Update alias-strings.md
* Update alias-strings.md
* Update alias-strings.md