-
What is Unicode? ⌘
-
What is the Unicode Consortium? ⌘
-
What is the American Standard Code for Information Interchange (ASCII)? ⌘
-
What is the difference between ASCII and US-ASCII? ⌘
-
What is the difference between ASCII and Extended ASCII? ⌘
-
What is the American National Standards Institute (ANSI)? ⌘
-
What is the difference between ASCII and ANSI? ⌘
-
What is UCS-2? ⌘
-
What is UTF-8? ⌘
-
What is the max UTF-... available? ⌘
-
What is the difference between UTF-8 and Unicode? ⌘
-
What is the difference between UTF-8 and Extended ASCII? ⌘
-
What is a byte order mark (BOM)? ⌘
-
What is the difference between big-endian and little-endian? ⌘
-
What is the difference between line feed LF (0x0A) and \n, or between carriage return CR (0x0D) and \r? ⌘
-
What is the difference between NEXT LINE (NEL) (U+0085), Line Separator (LS) (U+2028), and End Of Line (EOL)? ⌘
-
What is the default End Of Line (EOL) for each of the following: Windows, Linux, OSX, Unix, older Mac? ⌘
-
What is the meaning of full stop (0x2E)? ⌘
-
What is iconv? ⌘
-
Why do some emails contain a stray "J", for example "RegardsJ"?
If you have Wingdings installed on your computer, the following character will appear as a smiley face. Otherwise, it will be the letter "J": J
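This is because the letter J represents a smiley face icon in the Wingdings font. Microsoft Outlook, a popular e-mail client, automatically converts the :) and :-) text emoticons into smiley face icons using the Wingdings font. Therefore, when Microsoft Outlook users type smiley faces in an e-mail message, they are sent as the letter J styled in Wingdings. A recipient whose mail client does not apply that font (for example, in a plain-text view or on a system without Wingdings) sees the underlying letter J instead of the smiley.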
Read more:
https://pc.net/helpcenter/answers/letter_j_in_email_messages
-
What is the difference between Collation and Character Set?
Collation: a set of rules for comparing characters in a character set.
Character set: a set of symbols together with their encodings. In MySQL each collation belongs to exactly one character set, so the character set can usually be inferred from the collation.
In MySQL, collations and character sets apply to string data.
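The distinction can be sketched outside MySQL. The following is a minimal Python illustration, not MySQL behavior: encoding a string to bytes stands in for the character set, and the sort key stands in for the collation. The sample words and variable names are made up for the example.

```python
# Character set: how a symbol is turned into bytes.
symbol = "é"
utf8_bytes = symbol.encode("utf-8")      # two bytes in UTF-8
latin1_bytes = symbol.encode("latin-1")  # one byte in Latin-1

# Collation: the rule used to compare and order strings.
names = ["apple", "Banana", "cherry"]    # hypothetical sample data
by_code_point = sorted(names)                       # raw code point order
case_insensitive = sorted(names, key=str.casefold)  # ignores letter case

print(utf8_bytes, latin1_bytes)
print(by_code_point)      # ['Banana', 'apple', 'cherry']
print(case_insensitive)   # ['apple', 'Banana', 'cherry']
```

The same text gives different bytes under different character sets, and the same strings give different orderings under different comparison rules.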
Read more:
https://medium.com/@manish_demblani/breaking-out-from-the-mysql-character-set-hell-24c6a306e1e5
https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/
-
How many levels of default collation and character set exist in MySQL? Note whether a collation and character set can be set specifically for a database, a table, or a column.
MySQL defines a default collation and character set for each database. Each table can have its own collation and character set, which may match or differ from the database's. For even more flexibility, each column can have its own collation and character set, which may match or differ from the table's. This gives a lot of flexibility but also makes handling data more complex.
Read more:
https://medium.com/@manish_demblani/breaking-out-from-the-mysql-character-set-hell-24c6a306e1e5
-
Let's say we have a MySQL database that supports UTF-8 (utf8mb3 by default, utf8mb4 if modified). The TINYTEXT data type stores text and is limited to 255 bytes. How many utf8mb3 and utf8mb4 characters can a TINYTEXT column store?
???
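utf8mb3
- 255 bytes / 3 bytes per character -> 85 characters
utf8mb4
- 255 bytes / 4 bytes per character -> 63 characters (63.75 rounded down)
Both figures are worst cases that assume every character takes the maximum number of bytes. Text made of shorter characters fits more: ASCII characters take 1 byte each, so up to 255 of them fit.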
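The arithmetic can be checked with a short Python sketch. It assumes that Python's "utf-8" codec produces the same byte lengths as utf8mb4 for these characters; the function names and the constant are hypothetical helpers, not part of MySQL.

```python
TINYTEXT_LIMIT_BYTES = 255  # size limit taken from the question above


def worst_case_chars(limit_bytes: int, max_bytes_per_char: int) -> int:
    """Characters guaranteed to fit when each one takes the maximum width."""
    return limit_bytes // max_bytes_per_char


def fits(text: str, limit_bytes: int = TINYTEXT_LIMIT_BYTES) -> bool:
    """Check whether the UTF-8 encoding of text stays within the byte limit."""
    return len(text.encode("utf-8")) <= limit_bytes


print(worst_case_chars(TINYTEXT_LIMIT_BYTES, 3))  # 85
print(worst_case_chars(TINYTEXT_LIMIT_BYTES, 4))  # 63
print(fits("€" * 85))    # True: 85 three-byte characters are 255 bytes
print(fits("😀" * 64))   # False: 64 four-byte characters are 256 bytes
print(fits("a" * 255))   # True: ASCII takes one byte per character
```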
Read more:
https://mathiasbynens.be/notes/mysql-utf8mb4
-
What is the difference between utf8_unicode_ci and utf8_general_ci?
In general, utf8_general_ci is faster than utf8_unicode_ci, but less correct.
For any Unicode character set, operations performed using the _general_ci collation are faster than those for the _unicode_ci collation. For example, comparisons for the utf8_general_ci collation are faster, but slightly less correct, than comparisons for utf8_unicode_ci. The reason for this is that utf8_unicode_ci supports mappings such as expansions; that is, when one character compares as equal to combinations of other characters. For example, in German and some other languages ß is equal to ss. utf8_unicode_ci also supports contractions and ignorable characters. utf8_general_ci is a legacy collation that does not support expansions, contractions, or ignorable characters. It can make only one-to-one comparisons between characters.
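The idea of an expansion can be illustrated with a rough Python analogy. This is not how MySQL implements either collation: here str.casefold (full case folding, which turns ß into ss) stands in for a comparison that supports expansions, and a character-by-character check stands in for a one-to-one comparison. Both function names are hypothetical.

```python
def one_to_one_equal(a: str, b: str) -> bool:
    """Compare position by position; one character can only match one character."""
    if len(a) != len(b):
        return False
    return all(x.lower() == y.lower() for x, y in zip(a, b))


def expansion_equal(a: str, b: str) -> bool:
    """Compare after full case folding, which may expand one character into several."""
    return a.casefold() == b.casefold()


print(one_to_one_equal("Hello", "HELLO"))     # True
print(one_to_one_equal("Straße", "STRASSE"))  # False: 6 characters vs 7
print(expansion_equal("Straße", "STRASSE"))   # True: ß folds to ss
```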
Read more:
https://stackoverflow.com/questions/2344118/utf-8-general-bin-unicode/2344130#2344130
https://dev.mysql.com/doc/refman/8.0/en/charset-unicode-sets.html
-
What is the difference between utf8_bin and utf8_general_ci?
utf8_bin compares the bits blindly. No case folding, no accent stripping.
utf8_general_ci compares one character with one character. It does case folding and accent stripping, but no 2-character comparisons: the ligature ĳ (U+0133) is not equal to the two letters ij in this collation.
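The contrast can be approximated in Python. This is a sketch under stated assumptions and not MySQL's implementation: comparing UTF-8 bytes stands in for a binary collation, and a hypothetical folded_key helper (canonical decomposition, removal of combining marks, lowercasing) stands in for one-to-one case folding and accent stripping.

```python
import unicodedata


def binary_equal(a: str, b: str) -> bool:
    """Compare the encoded bytes exactly, with no folding of any kind."""
    return a.encode("utf-8") == b.encode("utf-8")


def folded_key(s: str) -> str:
    """Strip accents and case, one character at a time (no ligature expansion)."""
    decomposed = unicodedata.normalize("NFD", s)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


def folded_equal(a: str, b: str) -> bool:
    """Compare two strings by their folded keys."""
    return folded_key(a) == folded_key(b)


print(binary_equal("Résumé", "resume"))  # False: the bytes differ
print(folded_equal("Résumé", "resume"))  # True: case and accents are ignored
print(folded_equal("\u0133", "ij"))      # False: the ligature stays one character
```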
Read more:
https://stackoverflow.com/questions/2344118/utf-8-general-bin-unicode/2344130