UTF-8 characters encoded in 4 bytes not supported in filenames #598

Open
nvxos opened this issue Sep 1, 2023 · 4 comments

Comments

nvxos commented Sep 1, 2023

Filenames containing UTF-8 characters that are encoded in 4 bytes don't seem to be supported. Using one triggers a "file or folder does not exist" error on Windows, for example, and the same type of error on Linux.
The affected characters include some emojis such as "🔥" (https://apps.timwhitlock.info/unicode/inspect/hex/1F525). Characters encoded in 3 bytes or fewer, such as the emoji "❤️" (https://apps.timwhitlock.info/unicode/inspect?s=%E2%9D%A4%EF%B8%8F), don't seem to cause a problem.
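
To illustrate the 3-byte versus 4-byte distinction, here is a minimal userspace sketch (not ksmbd code; the helper name `utf8_seq_len` is made up for this example) that reads the length of a UTF-8 sequence from its lead byte. "🔥" (U+1F525) is encoded as `F0 9F 94 A5`, while "❤️" is two 3-byte sequences, `E2 9D A4` followed by the variation selector `EF B8 8F`.

```c
#include <stddef.h>

/*
 * Hypothetical helper: return the length in bytes of the UTF-8 sequence
 * that starts with `lead`, or 0 if `lead` cannot start a sequence
 * (for example, a continuation byte of the form 10xxxxxx).
 */
static int utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80)
        return 1;               /* 0xxxxxxx: ASCII */
    if ((lead & 0xE0) == 0xC0)
        return 2;               /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0)
        return 3;               /* 1110xxxx: rest of the BMP, e.g. U+2764 */
    if ((lead & 0xF8) == 0xF0)
        return 4;               /* 11110xxx: above U+FFFF, e.g. U+1F525 */
    return 0;
}
```

With this helper, the lead byte `0xF0` of "🔥" gives 4 and the lead byte `0xE2` of "❤️" gives 3, which matches the split between failing and working filenames described above.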

Some references I found on the subject, and other helpful links I used to pinpoint the issue:
https://en.wikipedia.org/wiki/Unicode#Code_planes_and_blocks
https://apps.timwhitlock.info/emoji/tables/unicode
https://apps.timwhitlock.info/unicode/inspect

For context, this issue occurred on a deployment of KSMBD on FreeboxOS, the OS of the Freebox router from the French ISP Free. After I filed a ticket on their bug tracker (https://dev.freebox.fr/bugs/task/38504), they asked me to open an issue here.

@mmakassikis

@namjaejeon
Member

@mmakassikis Do you have time to fix this? I guess we need to compare unicode.c in ksmbd with cifs_unicode.c. cifs.ko seems to use utf8_to_utf32 or utf8s_to_utf16s instead of ->char2uni, and char2uni doesn't fully support UTF-8.
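
As a rough illustration of why a conversion that yields one 16-bit unit per character falls short here: a code point above U+FFFF needs two UTF-16 units (a surrogate pair). The sketch below is plain userspace C, not the kernel implementation and not the patch discussed in this thread. The function names `utf8_decode` and `utf16_encode` are hypothetical, and for brevity the decoder does not reject overlong encodings.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: decode one UTF-8 sequence from s (at most len bytes)
 * into *cp. Returns the number of bytes consumed, or -1 if the input is
 * truncated or malformed. Overlong forms are NOT rejected in this sketch.
 */
static int utf8_decode(const unsigned char *s, size_t len, uint32_t *cp)
{
    uint32_t c;
    int n, i;

    if (len == 0)
        return -1;
    if (s[0] < 0x80) {
        *cp = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {
        n = 2; c = s[0] & 0x1F;
    } else if ((s[0] & 0xF0) == 0xE0) {
        n = 3; c = s[0] & 0x0F;
    } else if ((s[0] & 0xF8) == 0xF0) {
        n = 4; c = s[0] & 0x07;
    } else {
        return -1;              /* stray continuation or invalid lead byte */
    }
    if ((size_t)n > len)
        return -1;
    for (i = 1; i < n; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return -1;          /* continuation bytes must be 10xxxxxx */
        c = (c << 6) | (s[i] & 0x3F);
    }
    *cp = c;
    return n;
}

/*
 * Hypothetical helper: encode cp as UTF-16 into out. Returns the number of
 * 16-bit units written (1 or 2), or -1 if cp is not a valid scalar value.
 */
static int utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return -1;
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;  /* fits in a single unit */
        return 1;
    }
    cp -= 0x10000;
    out[0] = (uint16_t)(0xD800 | (cp >> 10));   /* high surrogate */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF)); /* low surrogate */
    return 2;
}
```

For "🔥", the bytes `F0 9F 94 A5` decode to U+1F525, which encodes as the two units `0xD83D 0xDD25`. An interface that can only return a single 16-bit unit per call has no way to express that result.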

@namjaejeon
Member

namjaejeon commented Oct 19, 2023

@nvxos Can you check whether the problem is improved with the following patch? (namjaejeon@f389804)

@mmakassikis

@namjaejeon
I have tested the patch, and it fixes renaming a file named "🔥", which previously didn't work.
I left a couple of comments on the patch.

@namjaejeon
Member

@mmakassikis Thanks for your review. I updated the patch (namjaejeon@8dffdce). Let me know if you find any issues.
