Sorting variable %fn is not UTF8 normalized #2858

Open
mgutt opened this issue May 12, 2024 · 2 comments
@mgutt

mgutt commented May 12, 2024

SABnzbd version

4.3.1

Operating system

Arch Linux (binhex Docker Container)

Using Docker image

Other

Description

How the problem occurred
After I copied and pasted a name into the name field of the NZB upload form, the downloaded file was renamed as follows (special characters shown as hex bytes):

/ M C3 B6 v i e  ( 1 9 5 0 ) / M 6F CC 88 v i e . 1 9 5 0 . 4 8 0 p . m k v

The renaming is based on this SABnzbd sorter rule:

%title (%y)/%fn.%ext

This means "%title" uses the normalized (precomposed, NFC) UTF-8 form of "ö" (C3 B6), while "%fn" uses the non-normalized (decomposed, NFD) form (6F CC 88, an "o" followed by a combining diaeresis). I don't know whether the string was already decomposed when I pasted it or whether something inside SABnzbd causes this. More information about the two representations:
https://stackoverflow.com/questions/12147410/different-utf-8-signature-for-same-diacritics-umlauts-2-binary-ways-to-write
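
To make the two byte sequences above concrete, here is a minimal Python sketch using only the standard library. The variable names are illustrative and not taken from SABnzbd.

```python
import unicodedata

composed = "M\u00f6vie"      # "ö" as a single code point, U+00F6 (NFC form)
decomposed = "Mo\u0308vie"   # "o" followed by U+0308 COMBINING DIAERESIS (NFD form)

# Show the UTF-8 bytes of both spellings.
print(composed.encode("utf-8").hex(" "))    # 4d c3 b6 76 69 65
print(decomposed.encode("utf-8").hex(" "))  # 4d 6f cc 88 76 69 65

# The strings look identical on screen but compare unequal until normalized.
print(composed == decomposed)                                # False
print(unicodedata.normalize("NFC", decomposed) == composed)  # True
```

The same `unicodedata.normalize("NFC", ...)` call is what `uconv -x any-nfc` does on the command line.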

I renamed this file manually as follows:

cd "/Mövie (1950)"
find . -type f -exec sh -c 'mv -- "$1" "$(printf "%s" "$1" | uconv -x any-nfc)"' _ {} \;

Why was this a problem?
Nextcloud does not support the non-normalized UTF8 representation:

docker exec --user 99 nextcloud php occ files:scan --path="/jogi/files/Movie"
Starting scan for user 1 out of 1 (jogi)
        Entry "Mövie (1950)/Mövie.1950.480p.mkv" will not be accessible due to incompatible encoding
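
To find affected files before a scan complains, a small check along these lines might help. It is a sketch that assumes the "incompatible encoding" message corresponds to names that are not in NFC form; `find_non_nfc_names` is a hypothetical helper, not part of Nextcloud or SABnzbd, and it needs Python 3.8 or newer for `unicodedata.is_normalized`.

```python
import os
import unicodedata


def find_non_nfc_names(root):
    """Return paths under root whose last component is not NFC-normalized."""
    offenders = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            # is_normalized() only inspects the string; nothing is renamed here.
            if not unicodedata.is_normalized("NFC", name):
                offenders.append(os.path.join(dirpath, name))
    return offenders
```

Calling `find_non_nfc_names("/Mövie (1950)")` would list the file from the example above without modifying anything.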

So it's not really a bug; it just has nasty side effects in special scenarios.

Question
Is it possible to influence the "%fn" variable with a pre or post script? Maybe something like this?

export SAB_FILENAME=$(echo "$SAB_FILENAME" | uconv -x any-nfc)

EDIT: This does not work, because the non-normalized filename seems to be the one stored inside the archive.
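
Since changing the variable has no effect, one possible workaround is a post-processing script that renames the files on disk after the job finishes. The following is a sketch under stated assumptions, not a tested SABnzbd recipe: it assumes the script runs after sorting, and that the job folder is available in the `SAB_COMPLETE_DIR` environment variable or as the first argument. `normalize_tree` is a hypothetical helper name.

```python
#!/usr/bin/env python3
import os
import sys
import unicodedata


def normalize_tree(root):
    """Rename every entry under root to its NFC form; return the renames done."""
    renamed = []
    # Walk bottom-up so files are renamed before their parent directories.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames + dirnames:
            target = unicodedata.normalize("NFC", name)
            if target == name:
                continue  # already normalized
            src = os.path.join(dirpath, name)
            dst = os.path.join(dirpath, target)
            if os.path.exists(dst):
                continue  # never overwrite an existing entry
            os.rename(src, dst)
            renamed.append((src, dst))
    return renamed


if __name__ == "__main__":
    # Assumption: the job folder comes from SAB_COMPLETE_DIR or the first argument.
    job_dir = os.environ.get("SAB_COMPLETE_DIR") or (
        sys.argv[1] if len(sys.argv) > 1 else None
    )
    if job_dir:
        for src, dst in normalize_tree(job_dir):
            print(f"renamed {src!r} -> {dst!r}")
```

The walk only renames entries below the given folder, so a job folder whose own name is decomposed would still need to be handled separately.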

@mgutt mgutt added the Bug label May 12, 2024
@thezoggy
Contributor

thezoggy commented May 12, 2024

It sounds like you're not setting the locale correctly for your environment?

Run locale -a and see what you're using. If it's not UTF-8, you probably need to set the relevant environment variables.

You can also try to guard against some of this by enabling:
Config > switches > Make Windows compatible

@Safihre
Member

Safihre commented May 16, 2024

I can imagine there's a difference, because %fn is also based on the disk listing, while %title is based on parsing.
I'd really have to do a deep dive to see where normalization should happen.

We also have related problems in other parts: #1633
