Unicode math italics/bold truncates text notes #7663
Comments
@fgnievinski, OJS should support the full Unicode set. Can you please include the following information:
As a note... I've had some encoding issues when using the
Thanks for the tip. The installation was recently restored after being broken into, so there are probably some loose ends. I'll change the issue description to focus on the reproducible problem.
Anyway, every application will show some kind of weird behavior when faced with the strings in this list https://github.com/minimaxir/big-list-of-naughty-strings, so perhaps you've found one of them :)
Just to tell a sad story: the reviewer submitted their review without noticing it had been truncated, so the author ended up missing half of the comments, and the reviewer was not very pleased. :-|
I was able to reproduce this on the OJS3 test drive install but not locally. Locally, I am running psql (PostgreSQL) 12.9 (Ubuntu 12.9-0ubuntu0.20.04.1) and this is the character encoding:
It seems this is related to the deployment in some way. If this requires a change to the default or recommended database configuration, can someone propose a change?
@NateWr I've just checked in my environment, and my previous comment is enough to address the issue. I just forgot to add that the charset must also be configured in the following settings:
[i18n]
connection_charset = utf8mb4
[database]
collation = utf8mb4_general_ci
As I've already created an issue to address this configuration, I'll close this one.
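A quick way to check those two settings is to parse them as INI text. The snippet below is only a sketch: the sample text contains just the two sections quoted above, the helper name `uses_utf8mb4` is made up for illustration, and a real OJS configuration file may contain lines that Python's `configparser` does not accept.

```python
import configparser

# Only the two settings quoted in the comment above; everything else
# about the real configuration file is left out (assumption).
SAMPLE = """
[i18n]
connection_charset = utf8mb4

[database]
collation = utf8mb4_general_ci
"""


def uses_utf8mb4(config_text: str) -> bool:
    """Return True if both settings point at a utf8mb4 charset/collation."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(config_text)
    charset = parser.get("i18n", "connection_charset", fallback="")
    collation = parser.get("database", "collation", fallback="")
    return charset == "utf8mb4" and collation.startswith("utf8mb4")


if __name__ == "__main__":
    print(uses_utf8mb4(SAMPLE))
```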
It's interesting, though, because these are my config settings and I didn't have the problem. Is it because I'm on PostgreSQL?
Also, I don't have
Yeah, it happens only in MySQL. Good old PostgreSQL supports 4-byte UTF-8 characters, but it will also fail if you try to insert an invalid UTF-8 sequence.
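To illustrate the 4-byte point: characters outside the Basic Multilingual Plane, such as the mathematical italic letters in this issue, take four bytes in UTF-8. The sketch below finds the first such character and mimics the truncation described in this thread. The function names are hypothetical, and cutting the value at the first 4-byte character is an assumption about how a 3-byte MySQL charset behaves in non-strict mode, not something verified against the OJS code.

```python
from typing import Optional

# "meters" in mathematical italic letters, written with escapes so the
# snippet does not depend on the editor's font or encoding.
MATH_ITALIC_METERS = (
    "\U0001D45A\U0001D452\U0001D461\U0001D452\U0001D45F\U0001D460"
)


def first_four_byte_index(text: str) -> Optional[int]:
    """Index of the first character that needs 4 bytes in UTF-8, or None."""
    for index, char in enumerate(text):
        if len(char.encode("utf-8")) == 4:
            return index
    return None


def simulate_three_byte_truncation(text: str) -> str:
    """Mimic a column that drops everything from the first 4-byte character on.

    This is a simulation under the assumption stated above, not a database call.
    """
    index = first_four_byte_index(text)
    return text if index is None else text[:index]


if __name__ == "__main__":
    note = "The length is given in " + MATH_ITALIC_METERS + ", please clarify."
    print(len(MATH_ITALIC_METERS[0].encode("utf-8")))  # bytes for one letter
    print(repr(simulate_three_byte_truncation(note)))
```

A check like `first_four_byte_index` could also be run over submitted text before saving, so that a reviewer is warned instead of silently losing the rest of the note.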
A reviewer copied content from an article (typeset externally), resulting in formatted text such as "𝑚𝑒𝑡𝑒𝑟𝑠" (contrast: meters and meters).
When such text is pasted in a TinyMCE box, it's displayed correctly, but saving the text is not possible:
I believe these Latin italic characters belong to the Unicode Mathematical Alphanumeric Symbols block.
Maybe the database doesn't support storing such extended Unicode characters?
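The guess about the block can be checked with Python's standard `unicodedata` module. The sketch below tests whether a character falls in the Mathematical Alphanumeric Symbols range (U+1D400 to U+1D7FF) and shows that NFKC normalization folds these letters back to plain ASCII. The helper names are made up for illustration, and NFKC also rewrites other compatibility characters (ligatures, full-width forms), so it is a possible workaround rather than a recommended fix.

```python
import unicodedata

# Range of the Mathematical Alphanumeric Symbols block.
MATH_ALNUM_START = 0x1D400
MATH_ALNUM_END = 0x1D7FF

# "meters" in mathematical italic letters, as in the issue description.
SAMPLE = "\U0001D45A\U0001D452\U0001D461\U0001D452\U0001D45F\U0001D460"


def in_math_alphanumeric_block(char: str) -> bool:
    """True if the single character lies in the Mathematical Alphanumeric block."""
    return MATH_ALNUM_START <= ord(char) <= MATH_ALNUM_END


def to_plain(text: str) -> str:
    """Fold compatibility characters (including math italics) via NFKC."""
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    print(unicodedata.name(SAMPLE[0]))
    print(all(in_math_alphanumeric_block(c) for c in SAMPLE))
    print(to_plain(SAMPLE))
```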
Slightly related issue: #2564