Add utf8mb4 charset hint to database documentation #315

Open
wants to merge 1 commit into
base: master

Conversation

da2x
Contributor

da2x commented Dec 2, 2021

utf8 is an alias for utf8mb3 in MySQL and MariaDB.
Some emoji take 4 bytes in UTF-8, so recommend utf8mb4.

@jacobwb
Owner

jacobwb commented Dec 2, 2021

Is there any reason not to also use utf8mb4 as the default in secrets.php?
I would like to support all emoji by default, unless there's a good reason not to.

@da2x
Contributor Author

da2x commented Dec 2, 2021

SQLite, PostgreSQL, and others treat utf8 as full UTF-8, which uses 1–4 bytes per character as per the Unicode standard. MySQL wanted to save RAM back in the day and settled on utf8 meaning at most 3 bytes per character instead, which is why you need to specify utf8mb4 to get full Unicode support. MariaDB inherited this legacy from MySQL. The other database defaults in the secrets file are for SQLite.

So … yeah. Do you want to default to the MySQL legacy workaround, or to the databases that have followed the Unicode standard without introducing issues for their users? That ambiguity is why I put it in the documentation. It’s a common issue, and you might end up breaking 4-byte emoji. But that’s kind of what you get when choosing MySQL/MariaDB.
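
To make the 3-byte versus 4-byte distinction concrete, here is a minimal Python sketch. It is not part of this PR, and the helper names `utf8_len` and `fits_utf8mb3` are hypothetical. It only measures how many bytes each character needs in UTF-8, which is the property that decides whether a utf8mb3 column can store it.

```python
def utf8_len(ch: str) -> int:
    """Number of bytes needed to encode a single character in UTF-8."""
    return len(ch.encode("utf-8"))


def fits_utf8mb3(text: str) -> bool:
    """True if every character needs at most 3 UTF-8 bytes.

    Assumption: this mirrors the utf8mb3 limit of 3 bytes per character;
    it does not talk to a database.
    """
    return all(utf8_len(ch) <= 3 for ch in text)


samples = [
    "a",           # U+0061, ASCII letter
    "\u00e9",      # U+00E9, e with acute accent
    "\u20ac",      # U+20AC, euro sign
    "\u263a",      # U+263A, white smiling face (an emoji inside the 3-byte range)
    "\U0001F600",  # U+1F600, grinning face (outside the 3-byte range)
]

for ch in samples:
    print(f"U+{ord(ch):04X}: {utf8_len(ch)} byte(s), fits utf8mb3: {fits_utf8mb3(ch)}")
```

Only the last sample needs 4 bytes, which is the case utf8mb4 exists to cover.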

2 participants