Encodings

What is Unicode? ⌘
What is Unicode Consortium? ⌘
What is American Standard Code for Information Interchange (ASCII)? ⌘
What is the difference between ASCII and US-ASCII? ⌘
What is the difference between ASCII and Extended ASCII? ⌘
What is American National Standards Institute (ANSI)? ⌘
What is the difference between ASCII and ANSI? ⌘
What is UCS-2? ⌘
What is UTF-8? ⌘
What is the max UTF-... available? ⌘
What is the difference between UTF-8 and Unicode? ⌘
What is the difference between UTF-8 and Extended ASCII? ⌘
What is Byte order mark (BOM)? ⌘
What is the difference between big-endian and little-endian? ⌘
What is the difference between line feed LF (0x0A) and \n, or between carriage return CR (0x0D) and \r? ⌘
What is the difference between NEXT LINE (NEL) (U+0085), Line Separator (LS) (U+2028), and End Of Line (EOL)? ⌘
What is the default End Of Line (EOL) for each of the following: Windows, Linux, OSX, Unix, older Mac? ⌘
What is the meaning of full stop (0x2E)? ⌘
What is iconv? ⌘
Why do some emails contain "J", for example "RegardsJ"?
If you have Wingdings installed on your computer, the following character will appear as a smiley face. Otherwise, it will be the letter "J": J
This is because the letter J represents a smiley face icon in the Wingdings font. Microsoft Outlook, a popular e-mail client, automatically converts the :) and :-) text emoticons into smiley face icons using the Wingdings font. Therefore, when Microsoft Outlook users type smiley faces in an e-mail message, they are sent as visual smiley face icons.
Read more:
https://pc.net/helpcenter/answers/letter_j_in_email_messages

MySQL

What is the difference between Collation and Character Set?
Collation: A collation is a set of rules for comparing characters in a character set.
Character-Set: A character set is a set of symbols and their encodings. In MySQL each collation belongs to exactly one character set, so naming a collation also implies its character set.
Collation and Character-Set in MySQL are meant for strings.
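The distinction can be sketched outside MySQL. The snippet below is only a Python analogy, not how MySQL implements either concept: encoding a string stands in for the character set, and the sort key stands in for the collation. The variable names are made up for illustration.

```python
# Illustrative analogy only: Python stand-ins for the two MySQL concepts.
words = ["apple", "Banana", "cherry"]

# "Character set": how a symbol is turned into bytes.
utf8_bytes = "é".encode("utf-8")      # two bytes in UTF-8
latin1_bytes = "é".encode("latin-1")  # one byte in Latin-1

# "Collation": the rule used to compare and order strings.
by_code_point = sorted(words)                       # roughly like a binary collation
case_insensitive = sorted(words, key=str.casefold)  # roughly like a _ci collation

print(utf8_bytes, latin1_bytes)
print(by_code_point)
print(case_insensitive)
```

The same three words come out in a different order depending on the comparison rule, while the stored symbols are unchanged.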
Read more:
https://medium.com/@manish_demblani/breaking-out-from-the-mysql-character-set-hell-24c6a306e1e5
https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/
How many levels of collation and character set settings exist in MySQL? Note whether a collation and character set can be set specifically for a database, a table, and a column.
MySQL sets a default collation and character set for each database. Each table can have its own collation and character set, which may match the database's or differ from it. For even more flexibility, each column can also have its own, again matching or differing from the table's. That gives three levels (database, table, column) below the server-wide default. This flexibility comes at the cost of more complexity in handling data.
Read more:
https://medium.com/@manish_demblani/breaking-out-from-the-mysql-character-set-hell-24c6a306e1e5
Let's say we have a MySQL database that supports UTF-8 (utf8mb3 by default, utf8mb4 if modified). TINYTEXT is a data type for storing text, and it is limited to 255 bytes. How many utf8mb3 and utf8mb4 characters can a TINYTEXT column store?
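In the worst case, when every character takes the maximum number of bytes:
utf8mb3 - 255 bytes / 3 bytes per character -> 85 characters
utf8mb4 - 255 bytes / 4 bytes per character -> 63 characters (63.75, rounded down)
Characters that encode to fewer bytes, such as ASCII at 1 byte each, take less space, so more of them fit.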
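The arithmetic for the 255-byte TINYTEXT limit can be checked with a short Python sketch. The helper names `guaranteed_chars` and `utf8_len` are made up for this example; the byte counts come from Python's own UTF-8 encoder, not from MySQL.

```python
TINYTEXT_BYTES = 255  # size limit stated in the question above

def guaranteed_chars(capacity_bytes: int, max_bytes_per_char: int) -> int:
    """Worst case: every character needs the maximum number of bytes."""
    return capacity_bytes // max_bytes_per_char

utf8mb3_chars = guaranteed_chars(TINYTEXT_BYTES, 3)
utf8mb4_chars = guaranteed_chars(TINYTEXT_BYTES, 4)

def utf8_len(text: str) -> int:
    """Number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))

# UTF-8 is variable-width: these characters need 1, 2, 3 and 4 bytes.
# The 4-byte emoji lies outside the range utf8mb3 can store.
sizes = {ch: utf8_len(ch) for ch in ["a", "é", "€", "😀"]}

print(utf8mb3_chars, utf8mb4_chars)
print(sizes)
```

Because the encoding is variable-width, 85 and 63 are guaranteed minimums rather than fixed capacities: a string of plain ASCII fills the same column with up to 255 characters.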
Read more:
https://mathiasbynens.be/notes/mysql-utf8mb4
What is the difference between utf8_unicode_ci and utf8_general_ci?
In general, utf8_general_ci is faster than utf8_unicode_ci, but less correct.
For any Unicode character set, operations performed using the _general_ci collation are faster than those for the _unicode_ci collation. For example, comparisons for the utf8_general_ci collation are faster, but slightly less correct, than comparisons for utf8_unicode_ci. The reason for this is that utf8_unicode_ci supports mappings such as expansions; that is, when one character compares as equal to combinations of other characters. For example, in German and some other languages ß is equal to ss. utf8_unicode_ci also supports contractions and ignorable characters. utf8_general_ci is a legacy collation that does not support expansions, contractions, or ignorable characters. It can make only one-to-one comparisons between characters.
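The idea of an expansion can be illustrated in Python, as an analogy rather than a reproduction of either MySQL collation. `str.lower()` keeps ß as a single character, while `str.casefold()` expands it to ss, so only the latter treats the two spellings as equal.

```python
a, b = "Straße", "Strasse"

# Per-character lowercasing: ß stays ß, so the strings still differ.
one_to_one_equal = a.lower() == b.lower()

# Case folding applies the expansion ß -> ss, so the strings match.
expansion_equal = a.casefold() == b.casefold()

print(one_to_one_equal, expansion_equal)
```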
Read more:
https://stackoverflow.com/questions/2344118/utf-8-general-bin-unicode/2344130#2344130
https://dev.mysql.com/doc/refman/8.0/en/charset-unicode-sets.html
What is the difference between utf8_bin and utf8_general_ci?
utf8_bin compares the bits blindly. No case folding, no accent stripping.
utf8_general_ci compares one character with one character. It does case folding and accent stripping, but no 2-character comparisons: ij is not equal to ĳ in this collation.
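A rough Python emulation can make the contrast concrete. This is a sketch under stated assumptions, not MySQL's actual weight tables: `binary_equal` and `general_ci_like_equal` are hypothetical helpers, and accent stripping is approximated with Unicode NFD decomposition.

```python
import unicodedata

def binary_equal(a: str, b: str) -> bool:
    """Compare the encoded bytes as-is, in the spirit of utf8_bin."""
    return a.encode("utf-8") == b.encode("utf-8")

def _fold(text: str) -> str:
    """Strip combining accents, then lowercase one character at a time."""
    decomposed = unicodedata.normalize("NFD", text)
    base = [c for c in decomposed if not unicodedata.combining(c)]
    # Keep the mapping one-to-one: skip lowercasings that would expand.
    return "".join(c.lower() if len(c.lower()) == 1 else c for c in base)

def general_ci_like_equal(a: str, b: str) -> bool:
    """Case- and accent-insensitive, but strictly character-for-character."""
    return _fold(a) == _fold(b)

print(binary_equal("Résumé", "resume"))           # bytes differ
print(general_ci_like_equal("Résumé", "resume"))  # case and accents ignored
print(general_ci_like_equal("\u0133", "ij"))      # ĳ is one character, ij is two
```

The last comparison fails because the single ligature character never matches a two-character sequence when comparison is strictly one-to-one.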
Read more:
https://stackoverflow.com/questions/2344118/utf-8-general-bin-unicode/2344130
