Files double unpacked due to different UTF-8 normalizations #1633
Comments
I'm not sure where to apply it so I haven't tested but maybe this will help:
False
I tried that indeed: I made a unicode-friendly
And how about this: hard-encode to pure ASCII?
Not nice if there are Unicode characters without an ASCII equivalent, but maybe acceptable?
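A minimal sketch of what such ASCII folding could look like, using only the standard library. The helper name to_ascii is hypothetical and not taken from the project. It decomposes the name with NFKD and then drops every character that has no ASCII form, which also shows the downside: names in scripts without an ASCII equivalent collapse to an empty string.

```python
import unicodedata


def to_ascii(name: str) -> str:
    """Fold a name to plain ASCII (hypothetical helper, for illustration only).

    NFKD splits accented letters into base letter + combining mark;
    encoding with errors="ignore" then drops everything non-ASCII.
    """
    decomposed = unicodedata.normalize("NFKD", name)
    return decomposed.encode("ascii", "ignore").decode("ascii")


# Escapes are used so the source file's own normalization cannot interfere.
print(to_ascii("fr\u00e8nch_german_dem\u00f6"))  # french_german_demo
print(repr(to_ascii("\u4e2d\u6587")))            # '' (nothing survives)
```

Two different names can also fold to the same ASCII string, so this would only be safe as a comparison key, not as the name written to disk.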
@sanderjo Indeed that fails for the Chinese-download test. Actually I did work on a solution, where we don't rely on the output reading of
From the Unicode Consortium Normalization FAQ
I personally use NFKD for internal comparisons and NFKC for output.
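A rough illustration of that split, assuming Python's standard unicodedata module. The function names compare_key and display_form are made up for this sketch. Note that the K (compatibility) forms also fold characters such as ligatures, which may or may not be wanted for file names.

```python
import unicodedata


def compare_key(name: str) -> str:
    """Key for internal comparisons: fully decomposed (NFKD)."""
    return unicodedata.normalize("NFKD", name)


def display_form(name: str) -> str:
    """Form for output: composed (NFKC)."""
    return unicodedata.normalize("NFKC", name)


composed = "dem\u00f6"      # o-umlaut as a single code point
decomposed = "demo\u0308"   # 'o' followed by a combining diaeresis

print(composed == decomposed)                            # False
print(compare_key(composed) == compare_key(decomposed))  # True
print(len(display_form(decomposed)))                     # 4
print(display_form("\ufb01le"))                          # file (ligature folded)
```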
For a few days the tests on Travis have been failing for macOS: for some reason test_download_unicode_made_on_windows was unpacking the resulting file twice.

After much debugging I found out that the string frènch_german_demö that is printed in the logs is actually represented in 2 different ways. This caused the unpacker not to detect that a set was already unpacked, basically because the two representations do not compare as equal.

The reason is that they are obtained from 2 different sources: the output of os.listdir and the output of the par2 files.
https://stackoverflow.com/a/26733055
Now I just need to find a way to fix this.
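One possible direction, sketched under assumptions: normalize both sides to the same form before checking whether a set is already on disk. The function is_already_unpacked is hypothetical and not the project's actual unpacker code. Which source yields the composed form and which the decomposed form is also assumed here purely for the demonstration.

```python
import unicodedata


def nfc(name: str) -> str:
    """Bring a name into composed form (NFC) before comparing."""
    return unicodedata.normalize("NFC", name)


def is_already_unpacked(expected_name: str, names_on_disk: list) -> bool:
    """Hypothetical check: compare names only after normalizing both sides."""
    on_disk = {nfc(n) for n in names_on_disk}
    return nfc(expected_name) in on_disk


# Assumed for illustration: one source gives composed, the other decomposed.
from_par2 = "fr\u00e8nch_german_dem\u00f6"        # composed, 18 code points
from_listdir = "fre\u0300nch_german_demo\u0308"   # decomposed, 20 code points

print(from_par2 == from_listdir)                       # False
print(len(from_par2), len(from_listdir))               # 18 20
print(is_already_unpacked(from_par2, [from_listdir]))  # True
```

In practice names_on_disk would come from os.listdir on the job folder. NFC is used here rather than the K forms so that compatibility characters in file names are left alone.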