Use mb_substr() for correct abbreviation of non-ASCII characters #651

xalt7x · 2024-05-21T11:41:42Z

When substr() or another byte-based method is used to shorten a string to, or by, 1 byte, multi-byte UTF-8 characters are cut mid-sequence and displayed as �. Switching to mb_substr() fixes this.

xalt7x · 2024-05-21T11:48:43Z

The problem is easy to reproduce with Cyrillic/Ukrainian characters (e.g., "Джон Дое" as the User/Owner name, or the string "Навички обслуговування клієнтів" for "Key Skills").

Additional information:

If you’re working with strings encoded as UTF-8, you may corrupt characters when you extract part of a string with the PHP substr function. This happens because substr counts bytes, while UTF-8 characters are not restricted to one byte: each Unicode character is encoded with a variable length of 1 to 4 bytes.
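
A minimal sketch of the difference, assuming the mbstring extension is enabled. The helper names `abbreviateBytes` and `abbreviateChars` are hypothetical and are not taken from the OpenCATS codebase; they only illustrate byte-based versus character-based truncation.

```php
<?php
// Hypothetical helpers for illustration only; not part of OpenCATS.

// Byte-based abbreviation: substr() counts bytes, so it can cut a
// multi-byte UTF-8 character in the middle of its byte sequence.
function abbreviateBytes(string $text, int $length): string
{
    return substr($text, 0, $length);
}

// Character-based abbreviation: mb_substr() counts characters when
// given the UTF-8 encoding (requires the mbstring extension).
function abbreviateChars(string $text, int $length): string
{
    return mb_substr($text, 0, $length, 'UTF-8');
}

$name = 'Джон Дое';

$broken = abbreviateBytes($name, 1); // only the first byte of "Д"
$fixed  = abbreviateChars($name, 1); // the complete character "Д"

// The byte-based result is no longer valid UTF-8; the mb_ result is.
var_dump(mb_check_encoding($broken, 'UTF-8')); // bool(false)
var_dump(mb_check_encoding($fixed, 'UTF-8'));  // bool(true)
```

Each Cyrillic letter in this name takes 2 bytes in UTF-8, so a 1-byte cut always lands inside a character, which is why the initial renders as �.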

RussH · 2024-09-16T13:04:03Z

Thanks @xalt7x !

Use mb_substr() for correct abbreviation of non-ASCII characters

7d31c45


RussH merged commit e7c1ab1 into opencats:master Sep 16, 2024
2 checks passed
